Skip to content

refactor leader election around DB-issued "terms"#1213

Open
bgentry wants to merge 1 commit intomasterfrom
bg/fix-elector-flakiness
Open

refactor leader election around DB-issued "terms"#1213
bgentry wants to merge 1 commit intomasterfrom
bg/fix-elector-flakiness

Conversation

@bgentry
Copy link
Copy Markdown
Contributor

@bgentry bgentry commented Apr 14, 2026

The old elector treated leadership as a lease held by one client ID and renewed over time. That kept the happy path simple, but it left too much implicit. One leadership term was not clearly separated from the next, so reelection and resignation were not scoped to a specific term. That made same-client reacquisition and stale cleanup harder to reason about than they needed to be.

This rewrite makes the database issue explicit leadership terms using the columns we already have. leader_id still identifies the client, while (leader_id, elected_at) identifies one specific term. Elect, reelect, and resign now all operate on that exact term and return the leader row directly from the database.

The elector still keeps a bounded local trust window for its last successful confirmation, but that window is now anchored to the attempt that produced it rather than to when a response happened to arrive. That keeps slow successful reelections from stretching leadership past the real lease budget, while still avoiding direct app-vs-database clock comparisons in the state machine.

This branch also adds an explicit Time.Now API at the base-service layer. The old shared API only exposed NowUTC, even for code doing local deadline and duration math. For the elector that was the wrong abstraction, because these comparisons need to preserve Go's monotonic time.Time reading rather than normalize through UTC. Making that API explicit lets the elector use the right local clock for trust-window calculations while keeping NowUTCOrNil for the existing test path that passes stubbed wall-clock timestamps through to database code.

The tests were rewritten around the final model instead of the intermediate mechanics. The shared driver suite now covers term-scoped elect, reelect, and resign behavior across PostgreSQL and both SQLite backends, including same-client term replacement and stale-term rejection. The elector tests focus on observable behavior such as leadership handoff, resign requests, ordered subscriber notifications, and stepping down cleanly when the trust window expires.

This also rolls up the branch's earlier flake investigation and keeps the original CI reference for the shared-schema failures that led to the redesign:
https://github.com/riverqueue/river/actions/runs/24406465152

@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch 5 times, most recently from c9df93e to 88d9ea3 Compare April 16, 2026 01:53
@bgentry bgentry changed the title Fix elector flakiness refactor leader election around DB-issued terms Apr 16, 2026
@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch from 88d9ea3 to 28a0c9f Compare April 16, 2026 01:58
@bgentry
Copy link
Copy Markdown
Contributor Author

bgentry commented Apr 16, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Nice work!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch 2 times, most recently from 23fa8fd to fce9b47 Compare April 16, 2026 02:11
@bgentry bgentry requested a review from brandur April 16, 2026 02:13
@bgentry
Copy link
Copy Markdown
Contributor Author

bgentry commented Apr 16, 2026

@brandur took a whole bunch of turns and iteration to get this to where I want it but I think it's pretty solid now! Pared back a lot of the added complexity from earlier just by going all in on "DB is source of truth including for time".

@bgentry bgentry marked this pull request as ready for review April 16, 2026 02:14
@bgentry bgentry changed the title refactor leader election around DB-issued terms refactor leader election around DB-issued "terms" Apr 16, 2026
@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch from fce9b47 to bddcf95 Compare April 16, 2026 03:36
@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch from bddcf95 to e35cdb6 Compare April 16, 2026 21:55
@bgentry bgentry changed the base branch from master to bg/rename-nowutc-now April 16, 2026 21:56
Base automatically changed from bg/rename-nowutc-now to master April 16, 2026 22:36
The old elector treated leadership as a lease held by one client ID and
renewed over time. That was simple, but it left too much implicit. One
leadership term was not clearly separated from the next, so reelection
and resignation were not scoped to a specific term. That made same-
client reacquisition harder to reason about, made it easy for stale work
or cleanup to target the wrong lease, and left the elector carrying more
responsibility for edge cases than it should have.

This change makes the database issue explicit leadership terms using the
columns we already have. `leader_id` remains the stable client ID, while
`(leader_id, elected_at)` identifies one specific term. Elect, reelect,
and resign now all operate on that exact term and return the leader row
from the database. The elector keeps a bounded local trust window for
its last successful confirmation, but that window is anchored to the
attempt that produced it, not to when the response happened to arrive.
That keeps slow successful reelections from stretching leadership past
its real lease budget while still avoiding direct app-vs-database clock
comparisons in the state machine.

The notification and test story is also clearer after the rewrite. Slow
subscribers now receive each leadership transition in order without
blocking the elector, resignation wakeups are coalesced safely, and the
poll-only coverage uses isolated fixtures so it can exercise real handoff
behavior without shared-schema flakiness. The shared driver suite now
covers term-scoped elect, reelect, and resign behavior across
PostgreSQL and both SQLite backends, including same-client term
replacement and stale-term rejection. The elector tests focus on the
observable behaviors that matter: gaining leadership, handing it off,
responding to resign requests, and stepping down cleanly when its trust
window expires.

This also rolls up the branch's earlier flake investigation and keeps
the original CI reference for the shared-schema failures that led to the
redesign:
https://github.com/riverqueue/river/actions/runs/24406465152
@bgentry bgentry force-pushed the bg/fix-elector-flakiness branch from e35cdb6 to 94f94fa Compare April 16, 2026 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant