Skip to content

feat: support manifest compaction in table commits#400

Open
Aitozi wants to merge 5 commits into
apache:mainfrom
Aitozi:mwj/support-manifest-compaction
Open

feat: support manifest compaction in table commits#400
Aitozi wants to merge 5 commits into
apache:mainfrom
Aitozi:mwj/support-manifest-compaction

Conversation

@Aitozi

@Aitozi Aitozi commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Purpose

Support manifest compaction during table commits and add Java-compatible manifest compaction options.

Tests

  • cargo fmt --all
  • cargo test -p paimon manifest_compaction --features storage-memory
  • cargo test -p paimon table_commit::tests --features storage-memory
  • git diff --check

Comment thread crates/paimon/src/table/table_commit.rs Outdated
@Aitozi Aitozi force-pushed the mwj/support-manifest-compaction branch from 7768824 to eaf9177 Compare June 20, 2026 01:13
@JingsongLi JingsongLi closed this Jun 23, 2026
@JingsongLi JingsongLi reopened this Jun 23, 2026
.read_for_full_compaction(&manifest_file, &delete_identifiers)
.await?;
if read_result.require_change {
new_files.extend(self.write_compacted_entries(&read_result.entries).await?);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full compaction should feed all changed entries into one rolling writer instead of rewriting each input manifest independently. With the current loop, if a table has many small manifests and manifest.full-compaction-threshold-size is reached without deletes, every manifest is must_change, but each one is written back as its own new manifest and merge() returns before the minor compaction path can merge them. The result is N small manifests with new names, so full compaction does a lot of I/O while leaving the manifest count and small-file problem unchanged. Java keeps a single RollingFileWriter for the whole toBeMerged set and rolls only when manifest.target-file-size is reached; this path should do the same, or fall back to minor compaction when no entries were actually removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants