Skip to content

Parquet: Skip parquet conversion for blocks with too many labels#7524

Open
siddarth2810 wants to merge 10 commits into
cortexproject:masterfrom
siddarth2810:add-no-convert-marker
Open

Parquet: Skip parquet conversion for blocks with too many labels#7524
siddarth2810 wants to merge 10 commits into
cortexproject:masterfrom
siddarth2810:add-no-convert-marker

Conversation

@siddarth2810
Copy link
Copy Markdown
Contributor

@siddarth2810 siddarth2810 commented May 18, 2026

What this PR does:
If a TSDB block exceeds a configurable threshold of distinct label names, the converter writes a parquet-no-convert-mark.json marker and skips the block.

  • Added no-convert marker with read/write logic
  • Added parquet-converter.max-block-label-names limit

Which issue(s) this PR fixes:
Fixes #7195

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
- Add max-block-label-names limit, blocks exceeding it get a
  no-convert marker instead of being converted.

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
…correctly

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
…test

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
@siddarth2810 siddarth2810 marked this pull request as ready for review May 18, 2026 12:15
@dosubot dosubot Bot added go Pull requests that update Go code storage/blocks Blocks storage engine type/feature labels May 18, 2026
friedrichg

This comment was marked as resolved.

- Add a new cortex_parquet_converter_blocks_skipped_total counter with user and reason labels
- Extract "too_many_labels" to a constant to avoid string duplication

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>

This comment was marked as resolved.

Copy link
Copy Markdown
Member

@friedrichg friedrichg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one minor nit on the metrics that copilot suggested. pre-approving!

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label May 21, 2026
…ert marker exists

Signed-off-by: Siddarth Gundu <siddarthg0910@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go Pull requests that update Go code lgtm This PR has been approved by a maintainer size/L storage/blocks Blocks storage engine type/feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Parquet] Stop converting TSDB block to parquet if it has too many labels

3 participants