CASSANALYTICS-52: Downgrade noisy TokenPartitioner partition/range maps to DEBUG #209
The partition map, reverse partition map, and initial-ranges log lines
in TokenPartitioner emit one entry per token; with the default 256
vnodes per node this produces tens of thousands of characters of log
output at INFO on every job startup, drowning out useful messages.
Lower these to DEBUG in both TokenPartitioner implementations
(cassandra-analytics-core bulk writer and cassandra-analytics-common
bulk reader). Also switch the two concatenated info strings in the
common variant to SLF4J placeholder form so the map's toString() is
not built when DEBUG is suppressed.
Scalar summaries ("Number of ranges", "Tasks to run",
"Calculated number of splits") remain at INFO.
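The point about placeholder form can be illustrated with a small sketch. To keep it runnable without dependencies it uses java.util.logging as a stand-in for SLF4J (an assumption of this sketch, not what the patch uses); `{0}` here plays the role of SLF4J's `{}`, and FINE plays the role of DEBUG. `LazyLogSketch` and `CountingMap` are hypothetical names.

```java
import java.util.TreeMap;
import java.util.logging.Level;
import java.util.logging.Logger;

class LazyLogSketch {
    // Counts how often the (potentially huge) map rendering is built.
    static int renderCount = 0;

    // Hypothetical stand-in for the partition map; toString() is the expensive part.
    static final class CountingMap extends TreeMap<Integer, String> {
        @Override
        public String toString() {
            renderCount++;
            return super.toString();
        }
    }

    // Returns {renderings caused by concatenation, renderings caused by the parameterized call}.
    static int[] renderingsWhenSuppressed() {
        Logger logger = Logger.getLogger("LazyLogSketch");
        logger.setLevel(Level.INFO);         // FINE (roughly DEBUG) is suppressed
        logger.setUseParentHandlers(false);  // keep the sketch quiet

        CountingMap map = new CountingMap();
        map.put(0, "range-0");
        renderCount = 0;

        // Concatenation: toString() runs before the logger can check the level.
        logger.fine("Partition map " + map);
        int eager = renderCount;

        // Parameterized form: the map is only rendered if the level is enabled.
        logger.log(Level.FINE, "Partition map {0}", map);
        int deferred = renderCount - eager;

        return new int[] {eager, deferred};
    }

    public static void main(String[] args) {
        int[] counts = renderingsWhenSuppressed();
        System.out.println("concatenation: " + counts[0] + ", parameterized: " + counts[1]);
    }
}
```

With the level suppressed, only the concatenated call pays for building the string; the parameterized call returns after the level check.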
Patch by Tejas Lodaya for CASSANALYTICS-52
LOGGER.info("Number of partitions {}", reversePartitionMap.size());
LOGGER.info("Partition map " + partitionMap);
LOGGER.info("Reverse partition map " + reversePartitionMap);
LOGGER.debug("Number of partitions {}", reversePartitionMap.size());
I have found this information useful when debugging jobs in the past. I do agree that these log messages are not needed in the majority of cases. I wonder if there's a better way to control when to log this information.
probably a JavaDoc comment on calculateTokenRangeMap()?
One option to reduce the verbosity is to log only on the driver instance. Currently all three messages are logged by every Spark task. To remove the log messages from the executor nodes, we can skip logging in the code path of org.apache.cassandra.spark.data.partitioner.TokenPartitioner.Serializer#read
I'd strongly suggest keeping the log messages at the info level. They are useful when debugging.
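A rough sketch of the driver-only idea, assuming a flag that the deserialization path sets to false. Everything here is hypothetical and simplified: `SketchPartitioner`, its `read` method, and the in-memory `LOG` list (a stand-in for a real logger) are illustrative names, not the actual TokenPartitioner API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DriverOnlyLoggingSketch {
    // Stand-in for a logger so the sketch runs without dependencies.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical, simplified partitioner; not the real TokenPartitioner.
    static final class SketchPartitioner {
        final Map<Integer, String> partitionMap = new TreeMap<>();

        // Driver-side entry point: builds the map and logs it once.
        SketchPartitioner(int partitions) {
            this(partitions, true);
        }

        // logMaps is false when the object is rebuilt from serialized state.
        private SketchPartitioner(int partitions, boolean logMaps) {
            for (int i = 0; i < partitions; i++) {
                partitionMap.put(i, "range-" + i);
            }
            if (logMaps) {
                LOG.add("Partition map " + partitionMap);
            }
        }

        // Hypothetical analogue of a Serializer#read path on an executor: no logging.
        static SketchPartitioner read(int partitions) {
            return new SketchPartitioner(partitions, false);
        }
    }

    public static void main(String[] args) {
        new SketchPartitioner(4);       // driver: logs once
        SketchPartitioner.read(4);      // executor task: silent
        SketchPartitioner.read(4);      // executor task: silent
        System.out.println("log lines: " + LOG.size());
    }
}
```

The message stays at INFO on the driver, while the number of copies no longer grows with the number of tasks.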
yeah, I second Yifan here. We should keep the log message. +1 to logging this from the driver
https://issues.apache.org/jira/browse/CASSANALYTICS-52