[BUGFIX] Ingester: release TSDB appender on early-return paths in Push#7528
Open
sandy2008 wants to merge 3 commits into
Open
[BUGFIX] Ingester: release TSDB appender on early-return paths in Push#7528sandy2008 wants to merge 3 commits into
sandy2008 wants to merge 3 commits into
Conversation
Push acquired a TSDB appender at the start of the function and, on early returns (e.g. out-of-order label set, hard append errors, Commit failure), could return without calling Commit or Rollback. This leaked TSDB head series references, mmap'd chunks, and pending appender state, observable as monotonic growth of cortex_ingester_tsdb_head_active_appenders. Defer Rollback right after the appender is acquired and skip it on the success path. `committed` is flipped to true before app.Commit() because Prometheus closes the appender even on Commit failure (internal self-rollback on WAL error), so the deferred Rollback must not run afterwards. The two now-redundant explicit Rollback calls inside the sample / histogram failure paths are removed and rely on the same defer. Add a regression test asserting cortex_ingester_tsdb_head_active_appenders stays at 0 after repeated out-of-order-label-set pushes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
Contributor
|
Hi, might be better to open an issue first and get maintainer confirmation before making code changes |
Address `make check-modernize` lint failure by using Go 1.22+ `for range N` syntax instead of `for n := 0; n < numLeakyPushes; n++`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Sandy Chen <Yuxuan.Chen@morganstanley.com>
Contributor
Author
@siddarth2810 made the issue: #7530 |
10 tasks
friedrichg
reviewed
May 21, 2026
Co-authored-by: Friedrich Gonzalez <1517449+friedrichg@users.noreply.github.com> Signed-off-by: Sandy Chen <sandy19890604@gmail.com>
Contributor
Author
@friedrichg thx, updated CHANGELOG. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does
Ingester.Pushacquires a TSDB appender viaapp := db.Appender(ctx)at the top of the request-processing block, then on certain early-return paths returns without callingCommitorRollback. Concrete leak path the issue this PR targets: theisLabelSetOutOfOrdercheck atpkg/ingester/ingester.go:1454-1456returns beforeAppend/Commit, leaving the appender dangling. The two existingapp.Rollback()blocks inside the per-sample / per-histogram failure loops would still leak if a future return path were added between them andCommit.Symptom: monotonic growth of
cortex_ingester_tsdb_head_active_appendersplus retained head series references, mmap'd chunks, and pending appender state per leaked request.Fix: register a
deferimmediately after the appender is acquired that callsapp.Rollback()unless a localcommittedflag istrue.committedis flipped totrueimmediately beforeapp.Commit()because Prometheus'sheadAppender.Commitcloses the appender even on failure (it self-rolls-back on WAL error) — callingRollbackagain would produce a spuriousErrAppenderClosedwarn log. The two now-redundant explicitapp.Rollback()calls inside the loop are removed in favor of the single defer.This was found as part of a wider audit of CPU / memory / goroutine leaks in this codebase.
Which issue(s) this PR fixes
Fixes #7530
Checklist
CHANGELOG.mdupdated —[BUGFIX] Ingester:entry undermaster / unreleased. PR number will be added once this PR is assigned one.make docnot required).TestIngester_Push_OutOfOrderLabels_AppenderNotLeakedassertscortex_ingester_tsdb_head_active_appenders == 0after repeated out-of-order-label-set pushes (the failing-without-fix behaviour was verified by reverting only the defer and observingcortex_ingester_tsdb_head_active_appenders = 5after 5 leaky pushes).Test plan
go build -tags "netgo slicelabels" ./pkg/ingester/...— cleango vet -tags "netgo slicelabels" ./pkg/ingester/...— cleangofmt -l/goimports -local github.com/cortexproject/cortex -l— cleango test -tags "netgo slicelabels" -run "^TestIngester_Push" ./pkg/ingester/... -count=1 -timeout 300s— PASSgo test -race -tags "netgo slicelabels" -run "TestIngester_Push|TestIngester_OutOfOrder|TestIngester_Append" ./pkg/ingester/... -count=1 -timeout 600s— PASS, 0 data racesgo test -tags "netgo slicelabels" -run TestIngester_Push_OutOfOrderLabels_AppenderNotLeaked ./pkg/ingester/... -count=3— PASS x3 (no flakiness)🤖 Generated with Claude Code