
test(assets): move cache cold-vs-hit timing out of the gate into a bench #21

Merged
guysenpai merged 1 commit into main from phase-0/assets/cache-diff-to-bench on Jun 5, 2026

Conversation

@guysenpai

Contributor

Why

tests/assets/cache_diff.zig (M0.6 / E4) asserted wall-clock thresholds inside the zig build test correctness gate: hit < 10 ms (absolute) plus miss > hit*20 (relative). Both are fragile across hosts and CPU load: a single cache-hit sample can spike on a page fault / AV scan / cold directory, which red-fails on slower or Windows CI runners. The spike also inflates the relative threshold (e.g. a 3 ms hit ⇒ needs miss > 60 ms, but a ReleaseSafe cold cook is only ~13 ms ⇒ fail).
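To make the arithmetic concrete, here is a minimal sketch of the two removed checks. It is a hypothetical reconstruction from the description above, not the original test source: the names `oldGatePasses`, `abs_hit_limit_ns`, and `min_ratio` are illustrative, and the 184 µs figure is the hit p50 from the measured evidence in this description.

```zig
const std = @import("std");

// Hypothetical reconstruction of the removed thresholds (not the original code).
const abs_hit_limit_ns: u64 = 10 * std.time.ns_per_ms; // hit < 10 ms
const min_ratio: u64 = 20; // miss > hit * 20

fn oldGatePasses(miss_ns: u64, hit_ns: u64) bool {
    const absolute_ok = hit_ns < abs_hit_limit_ns;
    const relative_ok = miss_ns > hit_ns * min_ratio;
    return absolute_ok and relative_ok;
}

test "a typical hit passes, a single spiked hit fails" {
    const cold_ns: u64 = 13 * std.time.ns_per_ms; // ~13 ms ReleaseSafe cold cook

    // Typical hit (~184 us): 184 us * 20 = 3.68 ms < 13 ms, so both checks pass.
    try std.testing.expect(oldGatePasses(cold_ns, 184 * std.time.ns_per_us));

    // Spiked hit (3 ms): 3 ms * 20 = 60 ms > 13 ms, so the relative check fails
    // even though 3 ms is well under the 10 ms absolute limit.
    try std.testing.expect(!oldGatePasses(cold_ns, 3 * std.time.ns_per_ms));
}
```

The point of the sketch is that the relative check fails on a spike long before the absolute one does, so loosening the 10 ms limit alone would not have fixed the flake.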

This was flagged as pre-existing M0.6 debt during M0.7 / E3 Windows validation — see briefs/M0.7-ipc-scm-rights-windows-fuzz.md § Acted deviations → "Known debt left untouched" — to be resolved separately by making the assertion tolerant or moving it out of the gate.

Measured evidence (idle dev Mac, ReleaseSafe, 16 MiB asset): cold p50 12.7 ms, hit p50 184 µs (69× speedup) — but hit max 2.98 ms. On a loaded Windows runner that single-sample spike breaks both the old absolute and relative checks.
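The gap between p50 and max is why a multi-sample summary is more robust than a single sample. The sketch below is illustrative only: `summarize` and `Summary` are hypothetical names, the sample values are made up around the figures quoted above, and none of this is taken from bench/asset_cache.zig.

```zig
const std = @import("std");

const Summary = struct { p50: u64, max: u64 };

// Sorts the samples in place (insertion sort, fine for small sample counts),
// then reads the median and the maximum.
fn summarize(samples: []u64) Summary {
    std.debug.assert(samples.len > 0);
    var i: usize = 1;
    while (i < samples.len) : (i += 1) {
        const key = samples[i];
        var j = i;
        while (j > 0 and samples[j - 1] > key) : (j -= 1) {
            samples[j] = samples[j - 1];
        }
        samples[j] = key;
    }
    return .{ .p50 = samples[samples.len / 2], .max = samples[samples.len - 1] };
}

test "the median ignores a single spike" {
    // Made-up hit timings in microseconds: four typical samples and one spike.
    var hits = [_]u64{ 180, 2980, 184, 190, 183 };
    const s = summarize(&hits);

    try std.testing.expectEqual(@as(u64, 184), s.p50);
    try std.testing.expectEqual(@as(u64, 2980), s.max);

    // Cold p50 of 12.7 ms (12_700 us) over a 184 us hit p50 gives the 69x speedup.
    try std.testing.expectEqual(@as(u64, 69), 12_700 / s.p50);
}
```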

What changed

Picked option 2 — move the timing to the bench suite, keep a deterministic functional assertion in the gate. This is the only option that makes the gate fully deterministic + cross-host, and it matches the repo's established methodology: every perf number lives in a separate bench-* step (ECS, IPC RTT, adler32, paeth, etch), archived and non-blocking, measured under the opposable protocol on the reference machine — never inside zig build test.

  • tests/assets/cache_diff.zig — stripped all wall-clock; the gate now asserts only deterministic facts: the miss → hit transition and a byte-identical cached artifact. Asset shrunk 16 MiB → 256 KiB (the large size only existed to make the cold cook expensive for timing, now the bench's job), keeping the gate fast on every host.
  • bench/asset_cache.zig (new) — zig build bench-asset-cache, multi-sample cold-vs-hit differential, --smoke for CI, writes the gitignored bench/out/asset_cache_<os>.md. Mirrors the adler32 / paeth / render bench conventions.
  • build.zig — registered the bench-asset-cache step alongside the other M0.6 benches.
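For illustration, the shape of a deterministic gate assertion (miss → hit transition plus a byte-identical artifact) can be sketched with a toy cache. This is a minimal sketch under stated assumptions: `OneSlotCache`, `Outcome`, and `getOrCook` are hypothetical stand-ins, since the real asset cache API is not shown in this PR.

```zig
const std = @import("std");

const Outcome = enum { miss, hit };

// Hypothetical single-entry stand-in for the asset cache (not the real API).
const OneSlotCache = struct {
    key: ?[]const u8 = null,
    artifact: []const u8 = "",

    // Returns .hit when the key is already cached; otherwise stores the
    // cooked bytes and returns .miss.
    fn getOrCook(self: *OneSlotCache, key: []const u8, cooked: []const u8) Outcome {
        if (self.key) |cached_key| {
            if (std.mem.eql(u8, cached_key, key)) return .hit;
        }
        self.key = key;
        self.artifact = cooked;
        return .miss;
    }
};

test "miss then hit, and the cached artifact is byte-identical" {
    var cache = OneSlotCache{};
    const cooked = "cooked-bytes";

    // No clocks involved: the assertions hold on any host under any load.
    try std.testing.expectEqual(Outcome.miss, cache.getOrCook("tex.png", cooked));
    try std.testing.expectEqual(Outcome.hit, cache.getOrCook("tex.png", cooked));
    try std.testing.expectEqualSlices(u8, cooked, cache.artifact);
}
```

Because nothing here reads a timer, the asset can be as small as the test likes, which is what allows the 16 MiB → 256 KiB reduction described above.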

Verification

| Gate | Result |
|---|---|
| zig build test (Debug) | exit 0 |
| zig build test -Doptimize=ReleaseSafe | exit 0 |
| zig build lint / zig build | exit 0 / exit 0 |
| zig fmt --check | clean |
| pre-push hook (build + test + test-release) | all ✔️ |

(The "failed command" lines that the macOS full-suite run prints for events/plugin_loader are pre-existing spurious output; the suites exit 0.)

Scope

No IPC / M0.7 code touched. Branched off main (this is an independent M0.6-debt fix, not stacked on the unmerged M0.7 work).

🤖 Generated with Claude Code

The M0.6 cache_diff test asserted wall-clock ratios (hit < 10 ms, plus
miss > hit*20) inside `zig build test`. A single cache-hit sample can
spike on a page fault / AV scan / cold directory — ~3 ms observed even on
an idle dev box — which red-fails on slower or Windows CI runners and also
inflates the relative threshold (3 ms hit * 20 = 60 ms > the cold cook).
Flagged as pre-existing M0.6 debt in the M0.7 brief (Acted deviations ->
"Known debt left untouched").

The correctness gate now asserts only deterministic, cross-host facts: the
miss -> hit transition and a byte-identical cached artifact. The host- and
load-dependent cold-vs-hit differential moves to a new
`zig build bench-asset-cache` (bench/asset_cache.zig), archived and
non-blocking like every other perf number, measured under the opposable
protocol on the reference machine.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@guysenpai guysenpai merged commit b5d5e9e into main Jun 5, 2026
7 checks passed
@guysenpai guysenpai deleted the phase-0/assets/cache-diff-to-bench branch June 5, 2026 14:27