Skip to content

fix(audit): serialise log_event with a file lock so concurrent writers cannot fork the hash chain#263

Open
galuis116 wants to merge 1 commit into
vouchdev:testfrom
galuis116:fix/audit-log-event-lock
Open

fix(audit): serialise log_event with a file lock so concurrent writers cannot fork the hash chain#263
galuis116 wants to merge 1 commit into
vouchdev:testfrom
galuis116:fix/audit-log-event-lock

Conversation

@galuis116

Copy link
Copy Markdown
Contributor

What changed

audit.log_event (src/vouch/audit.py) now holds an exclusive cross-process lock for the duration of read-prev_hash → derive → append. The lock is held on a sibling audit.log.jsonl.lock file (so the audit log itself is never opened in a mode that could truncate it) via fcntl.flock on POSIX and msvcrt.locking on Windows. _audit_lock(kb_dir) is the single context manager; log_event's critical section is one with _audit_lock(...) block. Adds AUDIT_LOCKFILE constant.

Why

Fixes #262. Before this PR, log_event did:

  1. _last_hash(kb_dir) — read the whole log to get the last hash.
  2. derive the new event's hash from that prev_hash.
  3. open("a") and append.

The three steps were not atomic. Two concurrent writers both observed the same prev_hash, both chained off it, and verify_chain then reported "previous hash mismatch" at the second concurrent event — permanently breaking the tamper-evidence guarantee #244/#253 was added to provide. Reachable from vouch serve + concurrent CLI, multiple agents on JSONL, scripted backgrounded approvals (vouch approve a & vouch approve b), and scheduled vouch expire running while vouch sync-apply writes its own audit events. The README's headline use case (Claude Code + Cursor + a CI bot sharing one repo) is exactly this scenario.

The matching read surface (verify_chain) already documents that prev_hash must form a monotone chain — the writer just needed to actually produce one under concurrency. Same tighten the writer to match the existing reader invariant lane as #74 (per-file sha256 on import) / #123 (write-time graph refs vs fsck) / #149 (artifact-id containment vs read_under_root).

What might break

Nothing on-disk, schema, MCP/JSONL surface, or audit-log shape. Strictly additive serialisation.

  • Single-writer KBs: lock is uncontended; one extra os.open + flock per event. Negligible.
  • Multi-writer KBs: writers now serialise instead of racing. Throughput drops to ~one event per _last_hash scan, but correctness is preserved. (_last_hash's O(N) scan is a separate optimisation worth tackling in a follow-up — out of scope here.)
  • Existing forked chains stay forked on read (the past is the past); every future write is correct.
  • New audit.log.jsonl.lock file appears in .vouch/. Empty, used only for locking. Not committed (covered by the existing .vouch/.gitignore state.db-* style; if it surfaces in git it should be added to .gitignore in a follow-up).
  • Cross-platform: fcntl on POSIX, msvcrt on Windows. The dynamic __import__ pattern keeps mypy src clean on both (fcntl.LOCK_EX / msvcrt.LK_LOCK aren't visible to mypy when the other platform's stubs are loaded).

VEP

Not required — adds an internal correctness guard to match an invariant verify_chain already enforces; no surface change.

Tests

  • make check passes locally — ruff check src tests clean, mypy srcSuccess: no issues found in 71 source files, pytest tests/test_audit.py 8/8 green plus the broader affected modules (test_storage, test_health, test_sessions, test_cli, test_sync, test_verify) all green. (tests/test_bundle.py::test_import_apply_rejects_absolute_path fails on test independent of this change — pre-existing regex-vs-message mismatch, confirmed by re-running on the parent commit.)
  • New behaviour has tests in tests/test_audit.py:
    • test_audit_chain_survives_concurrent_writers — spawns four multiprocessing.spawn workers writing 20 events each on a shared KB, asserts verify_chain.ok is True and the full 4 × 20 = 80 events landed. Uses subprocesses, not threads — the GIL serialises Python code and hides the file-level race.
    • test_audit_lock_released_on_exception — monkey-patches os.fsync to raise mid-event, asserts the next log_event call succeeds (lock released, chain continues from the surviving prefix).
  • CHANGELOG.md updated under ## [Unreleased] ### Fixed.

Closes #262.

…s cannot fork the hash chain

without the lock, two log_event calls racing on the same kb both observe
the same prev_hash from _last_hash, both append to audit.log.jsonl, and
verify_chain reports "previous hash mismatch" at the second concurrent
entry — permanently breaking the tamper-evidence guarantee vouchdev#244/vouchdev#253
added. reachable from vouch serve handling concurrent agents, vouch serve
alongside cli commands on the same kb, scripted backgrounded approvals,
and scheduled expire / sync runs.

holds an exclusive lock on a sibling audit.log.jsonl.lock file for the
duration of read-prev-hash → derive → append. uses fcntl.flock on posix
and msvcrt.locking on windows. the audit log itself is never opened in a
mode that could truncate it. dynamic __import__ keeps mypy clean on both
platforms.

new regression spawns four multiprocessing.spawn workers writing 20
events each and asserts verify_chain.ok plus the full 80 events landed —
threads hide the race behind the gil. a second test asserts the lock is
released on an exception mid-write so the next call succeeds.

Fixes vouchdev#262
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 65acb70c-e06e-429e-8599-b02968ecaf0d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant