Skip to content

fix(keychain): retry collection ops when gnome-keyring relocks#560

Closed
Benehiko wants to merge 2 commits into
mainfrom
fix/keychain-relock-retry
Closed

fix(keychain): retry collection ops when gnome-keyring relocks#560
Benehiko wants to merge 2 commits into
mainfrom
fix/keychain-relock-retry

Conversation

@Benehiko

Copy link
Copy Markdown
Member

The Linux keychain store dials a fresh D-Bus connection per operation and closes it on return. gnome-keyring scopes an unlock to the session that performed it, so closing a previous connection relocks the collection. That relock can land asynchronously mid-operation — after IsLocked reports the collection unlocked but before the call against the collection runs — surfacing an intermittent "Cannot create an item in a locked collection" error.

Add withRelockRetry, which reacts to the authoritative signal (the operation's own org.freedesktop.Secret.Error.IsLocked D-Bus error) by unlocking and retrying with bounded exponential backoff. Wrap the mutating and reading calls (CreateItem, DeleteItem, GetSecret) with it, and factor the unlock preamble into ensureUnlocked. A failed Unlock (e.g. dismissed auth prompt) aborts the loop immediately to avoid re-prompting.

Benehiko and others added 2 commits June 24, 2026 11:34
The Linux keychain store dials a fresh D-Bus connection per operation and
closes it on return. gnome-keyring scopes an unlock to the session that
performed it, so closing a previous connection relocks the collection. That
relock can land asynchronously mid-operation — after IsLocked reports the
collection unlocked but before the call against the collection runs — surfacing
an intermittent "Cannot create an item in a locked collection" error.

Add withRelockRetry, which reacts to the authoritative signal (the operation's
own org.freedesktop.Secret.Error.IsLocked D-Bus error) by unlocking and retrying
with bounded exponential backoff. Wrap the mutating and reading calls
(CreateItem, DeleteItem, GetSecret) with it, and factor the unlock preamble into
ensureUnlocked. A failed Unlock (e.g. dismissed auth prompt) aborts the loop
immediately to avoid re-prompting.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wrap the loadSecret signature across lines so it no longer exceeds the
200-char lll limit (the new collectionPath param pushed it to 207),
unblocking the lint job.

Review follow-ups, no behaviour change on any happy path:
- withRelockRetry now wraps the abort-on-failed-Unlock error as
  "unlock after relock: %w" so the cause is visible while errors.Is on
  the underlying Unlock error (e.g. a dismissed prompt) still holds.
- Correct the relockRetryMaxDelay doc comment: the 500ms cap first takes
  effect once maxRelockRetries reaches 6, not "past 6".
- Note that the sleepFn / newService test seams are unsynchronised
  package-level vars, so swapping them precludes t.Parallel().

Tests:
- TestFilterRetriesWhenCollectionRelocks covers the Filter->loadSecret
  read path through withRelockRetry (needs a configurable GetAttributes
  seam on the fake so the item is not skipped).
- TestRelockBackoffSchedule pins the documented 20/40/80/160/320ms
  backoff via a recording sleep seam.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Benehiko added a commit that referenced this pull request Jun 24, 2026
The real-keyring tests added by this PR fail intermittently in CI with
"Cannot create an item in a locked collection" and a duplicate backlog
that never collapses. Both are the gnome-keyring relock race: the store
dials a fresh D-Bus connection per operation, and a prior op's closing
connection relocks the collection asynchronously — after the unlock but
before the call against the collection runs. IsLocked cannot guard it
because the state changes between the check and the call.

React to the authoritative signal instead. Add withRelockRetry, which
retries an operation with bounded exponential backoff when it fails with
the org.freedesktop.Secret.Error.IsLocked D-Bus error, re-unlocking each
attempt and aborting immediately if Unlock itself fails (e.g. a dismissed
prompt).

Wrap the lock-gated mutations with it:
- Save: the create-when-absent CreateItem, the in-place SetItemSecret,
  and the best-effort duplicate-collapse DeleteItem loop. The collapse
  delete stays best-effort but is now relock-aware: a swallowed locked
  error would otherwise leave the duplicates this feature exists to drain
  (issue #446).
- Delete: DeleteItem.
Reads (Get/Filter GetSecret) are equally race-prone but are left to the
dedicated relock-retry change (#560); this commit covers the write path
the new tests exercise.

Also make the tests' own direct-daemon helpers relock-aware, since they
talk to the daemon outside the store: seedRealDuplicates' CreateItem and
purgeRealItems' DeleteItem now go through withRelockRetry.

Tests: fake-backed TestKeychainSaveRetriesWhenCreateRelocks,
TestKeychainSaveRetriesWhenSetSecretRelocks,
TestKeychainSaveCollapseRetriesWhenDeleteRelocks and
TestKeychainSaveStopsRetryingAfterMaxRelocks. Verified green against real
gnome-keyring on the fedora-43 and ubuntu-24 CI harnesses; lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Benehiko

Copy link
Copy Markdown
Member Author

Superseded by #557, which merged the relock-retry mechanism (withRelockRetry / isLockedDBusError) and wrapped the write-path mutations (Save's CreateItem/SetItemSecret/duplicate-collapse DeleteItem, and Delete). That makes this PR's helper, its Delete wrap, and its now-obsolete single-CreateItem Save change redundant/conflicting against main.

The one piece not yet on main is the read-path relock retry (Get and Filter/loadSecret's GetSecret). Opening a focused follow-up PR for just that, on top of the merged helper.

@Benehiko Benehiko closed this Jun 24, 2026
@Benehiko

Copy link
Copy Markdown
Member Author

Follow-up with the read-path relock retry: #561

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant