Skip to content

fix(spec): Fix incorrect totals when previous is omitted or invalid#2589

Open
dannycjones wants to merge 3 commits into
apache:mainfrom
dannycjones:fix-update-totals
Open

fix(spec): Fix incorrect totals when previous is omitted or invalid#2589
dannycjones wants to merge 3 commits into
apache:mainfrom
dannycjones:fix-update-totals

Conversation

@dannycjones
Copy link
Copy Markdown
Contributor

@dannycjones dannycjones commented Jun 4, 2026

Which issue does this PR close?

What changes are included in this PR?

This change prevents totals being incorrectly calculated where the total is omitted, or if any of the values associated with the calculation are not valid positive integers. Additionally, a warning trace will be emitted when invalid values are found for the previous total.

Of note: we will still calculate some of the statistics where we can, while others may be omitted.

This change also removes the table truncation metrics path which is currently unused / dead code. I do not believe that it was implemented correctly for partial overwrites, and so I've removed it for now. I think we'd be better suited letting the snapshot collector handle this behavior (which is yet to be implemented).

Are these changes tested?

New unit tests are added to cover these cases.

Comment thread crates/iceberg/src/spec/snapshot_summary.rs Outdated
@dannycjones
Copy link
Copy Markdown
Contributor Author

Please can you take a look, @mansi153 @CTTY?

return;
}
},
None => return,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should return 0 here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. There exists a previous summary however the total is omitted, which is fine since its optional. The way I'm interpreting the spec is that any one of these metrics could be missing. https://iceberg.apache.org/spec/#metrics

When the previous snapshot summary exists but there's no previous total, I don't think we can come up with a correct total for the new snapshot. The previous snapshot could have had data files (as an example).

I believe my change aligns with the Java implementation, which specifies no total if the previous one is missing or it fails to parse the integer:

https://github.com/apache/iceberg/blob/c00669fde813cac7e9d474c1a2c38fa8e4f75a95/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L926-L956

Comment thread crates/iceberg/src/spec/snapshot_summary.rs Outdated
.map_or(0, |value| value.parse::<u64>().unwrap())
});
let previous_total = match previous_summary {
Some(previous_summary) => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually more tedious than the original one, if we want to avoid unwrap, Result has a inspect method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see what I can do here to make it less tedious, though I think it depends on the outcome of other comment threads.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworked the snapshot collector a bit, which I think has made this logic of determining the previous total clearer.

if let Some(value) = summary.additional_properties.get(added_property) {
match value.parse::<u64>() {
Ok(v) => new_total += v,
Err(_) => return,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorret, we should either return error, or prints a warning and uses a default value.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should not swallow the error.

Would it make sense to use expect here? We just set this value, so we should expect it to be valid.

If not expect, I would propose to return an Iceberg error. I think a warning is the wrong choice here - if we cannot read our own properties, we're doing something wrong and should bail out of the commit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should consider rethinking this part of the code. I find it weird that we are parsing the number back from the string when we already had it as an integer. Instead, I think we should store the added and removed values as integers rather than strings, and then write it out to properties.

That being said, I'd prefer not to tackle in this PR if that's OK with you.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it would be worth rethinking now.

My thoughts are that this logic should live in the snapshot collector and not as an afterthought. I'd propose taking the previous snapshot into the build() method on the collector, such that the summaries are final once built.

let previous_snapshot = table_metadata.current_snapshot();
let mut additional_properties = summary_collector.build();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a change which updates build(&self) to build(&self, prev_summary: Option<&Summary>).

This allows us to access the collected metrics directly rather than serializing and deserializing them.

.map(|value| value.parse::<u64>().unwrap())
{
new_total -= value;
if let Some(value) = summary.additional_properties.get(removed_property) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Copy Markdown
Contributor Author

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick review, @blackmwk! I responded in comment threads.

Comment thread crates/iceberg/src/spec/snapshot_summary.rs Outdated
return;
}
},
None => return,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. There exists a previous summary however the total is omitted, which is fine since its optional. The way I'm interpreting the spec is that any one of these metrics could be missing. https://iceberg.apache.org/spec/#metrics

When the previous snapshot summary exists but there's no previous total, I don't think we can come up with a correct total for the new snapshot. The previous snapshot could have had data files (as an example).

I believe my change aligns with the Java implementation, which specifies no total if the previous one is missing or it fails to parse the integer:

https://github.com/apache/iceberg/blob/c00669fde813cac7e9d474c1a2c38fa8e4f75a95/core/src/main/java/org/apache/iceberg/SnapshotProducer.java#L926-L956

Comment thread crates/iceberg/src/spec/snapshot_summary.rs Outdated
if let Some(value) = summary.additional_properties.get(added_property) {
match value.parse::<u64>() {
Ok(v) => new_total += v,
Err(_) => return,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should not swallow the error.

Would it make sense to use expect here? We just set this value, so we should expect it to be valid.

If not expect, I would propose to return an Iceberg error. I think a warning is the wrong choice here - if we cannot read our own properties, we're doing something wrong and should bail out of the commit.

@dannycjones
Copy link
Copy Markdown
Contributor Author

FYI I've removed the overwrite truncation. It was dead code and I don't think it implemented overwrite correctly. I think we should address truncation as part of the standard summary stats collection and not have a special path for it.

Please do let me know if you think this is the wrong approach, happy to rework it. (FYI @blackmwk)

@dannycjones dannycjones requested a review from blackmwk June 5, 2026 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Snapshot summary totals appear to be calculated incorrectly when previous totals don't exist

2 participants