From 8dfdee8498cda3b76e6d2e306790d6e07e47558e Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 18:31:21 +0200 Subject: [PATCH 01/29] docs(brief): add m0.6 milestone brief --- briefs/m0.6-assets.md | 217 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 217 insertions(+) create mode 100644 briefs/m0.6-assets.md diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md new file mode 100644 index 0000000..f0c4077 --- /dev/null +++ b/briefs/m0.6-assets.md @@ -0,0 +1,217 @@ + + +# M0.6 — Asset pipeline v0 (formats, codecs, cooker, async loader) + +> **Status:** PLANNED +> **Phase:** 0 +> **Branch:** `phase-0/assets/pipeline-v0` +> **Planned tag:** `v0.6.0-M0.6-assets` +> **Dependencies:** M0.0 → M0.5 (Phase 0 to date). In particular: the Chase-Lev work-stealing job system (Tier 0), the `std.Io` async file I/O from the platform layer, and the uniform `root.zig` module convention. +> **Opened:** 2026-06-03 +> **Closed:** — + +--- + +# FROZEN SECTION + +*Produced by Claude.ai. Not modifiable by Claude Code outside a Claude.ai round-trip (cf. § Recorded deviations).* + +## Context + +First new functional surface since the Vulkan renderer (M0.4). Delivers the minimal Phase 0 asset pipeline — source import → offline cooking → async runtime loading — through a two-format model frozen on day 1: a diffable intermediate `.asset.etch` and a zero-copy runtime `..bin`. The milestone validates the central "design on day 1" hypothesis for assets: the on-disk formats, the `AssetHandle` API and the cooking-cache key are Phase 0+ stable surfaces; only the implementation behind them evolves. Advances criterion C0.6. + +## Scope + +Frozen day-1 surfaces (formats + handle + cache key) and their minimal PNG / static-glTF / WAV implementation. Decode-only — no encoder of any source format is delivered. 
+ +- **Intermediate format `.asset.etch`** — Etch text (asset name, `type`, `version`, `source`, `source_hash`, `import_settings`, `process_settings`, `extracted`) plus a referenced hashed binary blob stored separately. Schema frozen day 1; only the PNG / static-glTF / WAV subset is populated. +- **Runtime format `..bin`** — zero-copy binary, 32-byte header (fields, in order: `magic` = "WELD", `version` u16, `asset_type` u16, `platform` u16, `flags` u16, `data_offset` u32, `data_size` u32, `metadata_offset` u32, `metadata_size` u32, `hash` u64), followed by metadata then bulk data, mmap-able directly. Header layout frozen day 1. +- **`AssetHandle`** — 64-bit packed handle, fields: `index` u32, `generation` u16, `type_tag` u16. Typed, generation-checked. Frozen day 1. (Named `AssetHandle`, not `AssetRef`.) +- **Asset registry** — handle allocation, refcount, generation bump on unload so that a stale handle is detectable. +- **Native in-tree DEFLATE/zlib codec** (`codecs/deflate/`) — RFC 1951 inflate (fixed + dynamic Huffman, table-driven) plus the zlib wrapper and ADLER32 trailer verification. +- **`foundation/simd/` module skeleton** — `simd.zig` (public re-export), `dispatch.zig` (comptime target selection), `portable.zig` (portable `@Vector` impls), `traits.zig` (capability bitflags), plus the `adler32` kernel (portable `@Vector` + scalar reference). +- **PNG decode codec** (`codecs/png/`) — IHDR/IDAT/IEND chunk parsing, all five line filters including the `paeth_filter_decode` kernel landed in `foundation/simd/`, indexed/palette, alpha, Adam7 interlace. +- **glTF static decode codec** (`codecs/gltf/`) — JSON parsed with `std.json`, accessors / bufferViews / static mesh extraction. +- **WAV decode** (in `importers/wav.zig`) — RIFF PCM. +- **Importers** (source → intermediate) for: png, gltf (static), wav. +- **Cookers** (intermediate → runtime `..bin`) for: texture (raw RGBA8 payload), mesh (raw f32 vertices, version 1), audio (raw PCM payload). 
+- **Local cooking cache** — key = `hash(source_bytes + import_settings + process_settings + cook_settings + platform)`; a second cook of an unchanged asset completes in < 10 ms versus ≥ 100 ms for the first cook (differential measurement). +- **Async runtime loader** — load / unload / reload lifecycle over `std.Io` + the Tier 0 job system; never blocks the main thread. + +## Out-of-scope + +Listed explicitly so the pipeline is not extended "to do it properly". Each item is a deliberate later-phase deferral. + +- JPEG native codec (Phase 1). +- glTF animations, skinning, morph/blend shapes (Phase 1). +- KTX / Basis Universal binding (Phase 1). +- Assimp binding and the formats it covers — FBX, OBJ, COLLADA, etc. (Phase 1). +- TGA, HDR, EXR, PSD codecs; FLAC, OGG/Vorbis codecs (Phase 1+ / Phase 2+). +- TTF/OTF parser and MSDF atlas generator (Phase 1). +- LOD generation, tangent generation, mipmap generation, GPU texture compression (BC7/BC5/BC4/ASTC) (Phase 1–2). Cooked texture payload in M0.6 is raw RGBA8; cooked audio payload is raw PCM (no Opus — the Opus keeper is not wired in M0.6). +- Vertex quantization (f16 positions + octahedral normals). Deferred to Phase 1 and added via a `.mesh.bin` `version` bump — the header `version` field reserves this evolution, so the deferral forces no refactor. +- Advanced hot-reload and the FileWatcher / inotify-FSEvents-ReadDirectoryChangesW path (Phase 2). +- Streaming manager: priority, memory budget, LRU eviction, placeholders policy (Phase 2+). +- Network and cloud cache tiers — only the local cache tier exists in M0.6. +- Asset registry queries, orphan detection, dependency graph, circular-dependency detection (Phase 1+). +- Any `weld cook` / `weld asset` CLI UX polish. The offline cooker is exercised through the round-trip tests and a thin offline entry only; the user-facing CLI surface is Phase 1. +- Any widening of the Tier 0 job system public surface (see Notes). 
+ +## Execution stages (E1–E5) + +This milestone is staged with review gates, per the M0.1 / M0.5 staged-execution pattern. **One brief, one branch, one PR opened early (after E1).** Claude Code implements stages in order; at the end of each stage it commits, pushes, and posts the standardized message *"stage E complete — awaiting Claude.ai review"*, then **stops and does not begin the next stage until Guy relays a GO**. This is a planned review gate, not a reactive split trigger: stage boundaries do not authorize scope changes, and a single Claude Code session executes all five stages. + +- **E1 — Formats + handle + registry.** Intermediate `.asset.etch` schema, runtime `..bin` header/layout, `AssetHandle`, registry (refcount + generation invalidation). These are the day-1-frozen surfaces; this gate exists because every later stage builds on them. +- **E2 — DEFLATE/zlib codec + `foundation/simd/` skeleton + `adler32` kernel.** The heaviest algorithmic unit, isolated under its own gate. +- **E3 — Codecs PNG (incl. `paeth_filter_decode` kernel) + glTF static + WAV.** Built on E2 (inflate) and E1 (formats). +- **E4 — Importers + cookers + local cooking cache.** Wires codecs → intermediate → runtime `.bin`, plus the hash-keyed cache. +- **E5 — Async loader + lifecycle.** mmap of `.bin`, async over `std.Io` + the Tier 0 job system. This gate exists because it touches Tier 0; integration must not widen the job system surface. + +## Specs to read first + +Mandatory reads before any production code; Claude Code ticks each box in the LIVING SECTION. + +1. `engine-phase-0-plan.md` — § M0.6 — scope source of record. +2. `engine-asset-pipeline.md` — §1–10 — intermediate/runtime formats, importers, cookers, cache, registry, async loader. +3. `engine-spec.md` — §16 (Asset Pipeline) and §3.5 (in-tree as-if-lib discipline) — master alignment. +4. 
`engine-simd.md` — §1–3 (module role, structure, two-level API), §7.1 (Asset Pipeline hot-path map), §9 (phasing: M0.6 adds the skeleton + `adler32` + `paeth_filter_decode`), §10-referenced `@Vector`-first / asm-second discipline. +5. `engine-zig-conventions.md` — §13 surface coverage (lazy analysis guard), module rooting rule, codecs in-tree convention, `root.zig` convention. +6. `engine-directory-structure.md` — `src/modules/asset_pipeline/` and `src/foundation/simd/` layout. +7. `engine-development-workflow.md` — §4.3 (Conventional Commits), §4.6 (squash format), language closure criterion. + +## Files to create or modify + +Concrete paths. Files outside this list must not be touched without a written justification in the execution journal. + +**`foundation/simd/` (skeleton + two inaugural kernels):** +- `src/foundation/simd/simd.zig` — create — public API re-export. +- `src/foundation/simd/dispatch.zig` — create — comptime target dispatch. +- `src/foundation/simd/portable.zig` — create — portable `@Vector` implementations (always present). +- `src/foundation/simd/traits.zig` — create — capability bitflags. +- `src/foundation/simd/kernels/adler32.zig` — create — ADLER32 kernel. +- `src/foundation/simd/kernels/paeth.zig` — create — Paeth-filter-decode kernel. +- `src/foundation/simd/tests/correctness.zig`, `adler32_test.zig`, `paeth_test.zig` — create — portable-equals-reference + known-vector tests. +- `src/foundation/simd/bench/adler32_bench.zig`, `paeth_bench.zig` — create — baseline throughput (no parity target; see Acceptance › Benchmarks). + +**`asset_pipeline/`:** +- `src/modules/asset_pipeline/root.zig` — create — module entry (`root.zig` convention). +- `src/modules/asset_pipeline/format/` — create — intermediate `.asset.etch` schema + runtime `..bin` header/layout. +- `src/modules/asset_pipeline/registry/` — create — `AssetHandle` + registry (refcount, generation). 
+- `src/modules/asset_pipeline/codecs/deflate/` — create — RFC 1951 inflate + zlib wrapper. +- `src/modules/asset_pipeline/codecs/png/` — create — PNG decode. +- `src/modules/asset_pipeline/codecs/gltf/` — create — glTF static decode (via `std.json`). +- `src/modules/asset_pipeline/importers/png.zig`, `gltf.zig`, `wav.zig` — create — source → intermediate orchestration (WAV RIFF decode lives directly in `wav.zig`). +- `src/modules/asset_pipeline/cookers/` — create — intermediate → runtime `.bin`. +- `src/modules/asset_pipeline/cache/` — create — local cooking cache. +- `src/modules/asset_pipeline/loader/` — create — async `std.Io` loader, load/unload/reload lifecycle. + +**Build + tests:** +- `build.zig` — edit — register the `asset_pipeline` and `foundation/simd` modules with explicit dependencies (`foundation`, `core`), wire their test targets and benches. +- `tests/assets/png_roundtrip.zig`, `gltf_static_roundtrip.zig`, `wav_roundtrip.zig` — create — import + cook + load round-trip. +- `tests/assets/cache_diff.zig` — create — cooking-cache hit differential. +- `tests/assets/loader_async.zig` — create — async load does not block the main thread (internal timeout ≤ 5 s). +- `tests/assets/handle_generation.zig` — create — stale handle detected after unload. +- `tests/assets/deflate_vectors.zig` — create — inflate fixed + dynamic Huffman against known vectors, bit-exact. +- `tests/assets/data/checker.png`, `tests/assets/data/cube.gltf`, `tests/assets/data/tone.wav` — create — test fixtures. + +## Acceptance criteria + +### Tests + +- `tests/assets/png_roundtrip.zig` — `test "png import-cook-load round-trip"` — `checker.png` → intermediate → `.texture.bin` → load; decoded RGBA8 pixels match expected. +- `tests/assets/gltf_static_roundtrip.zig` — `test "gltf static import-cook-load round-trip"` — `cube.gltf` → `.mesh.bin` → load; vertex/index counts and bounds match expected. 
+- `tests/assets/wav_roundtrip.zig` — `test "wav import-cook-load round-trip"` — `tone.wav` → `.audio.bin` → load; sample count, channels, sample rate match expected. +- `tests/assets/deflate_vectors.zig` — `test "inflate fixed huffman"`, `test "inflate dynamic huffman"` — output bit-exact against known vectors. +- `src/foundation/simd/tests/adler32_test.zig` — `test "adler32 portable equals reference"`, `test "adler32 known vectors"`. +- `src/foundation/simd/tests/paeth_test.zig` — `test "paeth portable equals reference"`. +- `tests/assets/handle_generation.zig` — `test "stale handle after unload is rejected"` — load, capture handle, unload, assert the old handle no longer resolves (generation mismatch). +- `tests/assets/cache_diff.zig` — `test "second cook of unchanged asset hits cache"` — see Benchmarks for the timing target. +- `tests/assets/loader_async.zig` — `test "async load does not block main thread"` — main loop ticks N times while a load is in flight; load completes; internal timeout ≤ 5 s with clean teardown. +- **Surface coverage (lazy analysis guard).** Every public symbol delivered in `asset_pipeline` and `foundation/simd` must be exercised by at least one test that calls it with realistic data and asserts an observable result — not merely a test that compiles the symbol. A module shipping public symbols with zero consuming test is rejected in review even with green CI. (M0.4 seven-bugs lesson, `engine-zig-conventions.md` §13.) + +### Benchmarks + +- `tests/assets/cache_diff.zig` (differential) — first cook ≥ 100 ms, second cook of the unchanged asset < 10 ms, on the Phase 0 reference machine. This is the only numeric gate. +- `src/foundation/simd/bench/adler32_bench.zig`, `paeth_bench.zig` — throughput, **baseline recorded only, no parity target**. These kernels sit on a cold path (decode runs once at cook time; the runtime mmaps the cooked `.bin`); a zlib-ng parity target is explicitly out of scope to avoid optimizing a cold path. 
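For orientation, the "scalar reference" that the `adler32` tests compare the portable `@Vector` kernel against could be sketched as below. This is a minimal sketch, not the in-tree kernel: the function name `adler32Scalar` is hypothetical, and the modulus 65521 and initial values come from the zlib specification (RFC 1950), not from this brief.

```zig
const std = @import("std");

/// Hypothetical scalar Adler-32 reference (name is illustrative only).
/// `a` starts at 1 and `b` at 0; both are reduced modulo 65521, the
/// largest prime below 2^16, after every byte. Reducing per byte is
/// slow but obviously correct, which is what a reference needs.
fn adler32Scalar(bytes: []const u8) u32 {
    var a: u32 = 1;
    var b: u32 = 0;
    for (bytes) |byte| {
        a = (a + byte) % 65521;
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}

test "adler32 known vectors (sketch)" {
    // Empty input leaves a = 1, b = 0.
    try std.testing.expectEqual(@as(u32, 1), adler32Scalar(""));
    // Widely published vector for the ASCII string "Wikipedia".
    try std.testing.expectEqual(@as(u32, 0x11E60398), adler32Scalar("Wikipedia"));
}
```

A vectorized kernel would typically defer the modulo across a block of bytes; the per-byte form above is the kind of baseline the "portable equals reference" test would be checked against.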
+ +### Observable behavior + +- Running the offline cook on `checker.png` / `cube.gltf` / `tone.wav` produces the corresponding `..bin` files; re-running on the unchanged inputs logs a cache hit and completes under the differential target. +- A small harness loads a cooked `.bin` asynchronously while its loop keeps ticking; the log shows the loop advancing during the load and the asset becoming available, with no main-thread stall. + +### CI + +- `zig build` clean, zero warnings, on the configured matrix. +- `zig build test` green (Debug + ReleaseSafe). +- `zig fmt --check` green. +- `zig build lint` green. +- `commit-msg` hook green on every commit of the branch. +- Language closure: no French strings in any repo artifact produced by the milestone (code, comments, doc comments, commit messages, brief content including the LIVING SECTION). (`engine-development-workflow.md` language closure criterion.) + +## Conventions + +- **Branch:** `phase-0/assets/pipeline-v0` +- **Final tag:** `v0.6.0-M0.6-assets` +- **PR title:** `Phase 0 / Asset Pipeline / Asset pipeline v0 (formats, codecs, cooker, async loader)` +- **Commit convention:** Conventional Commits (cf. `engine-development-workflow.md` §4.3) +- **Merge strategy:** squash-and-merge (cf. `engine-development-workflow.md` §4.6) + +## Notes + +References and known pitfalls only. No anticipated edge cases (those are Scope or Out-of-scope items above). + +- **Format naming.** The M0.6 plan's term "`.asset`" is shorthand; the frozen surfaces are the two spec formats — intermediate `.asset.etch` and runtime `..bin` (`engine-asset-pipeline.md` §3, §5). A `engine-phase-0-plan.md` wording patch is tracked outside this brief. +- **`AssetHandle`, not `AssetRef`.** Aligns with `engine-asset-pipeline.md` §8 and criterion C0.6. +- **DEFLATE reference.** Read `puff.c` (Mark Adler) and/or miniz for structure only; write the inflate from scratch, table-driven Huffman. 
Target class is "correct + table-driven", **not** zlib-ng class (no 64-bit refill / SIMD copy fast paths). The rationale for a native implementation is API stability across Zig `std` churn and an owned, homogeneous surface for PNG/EXR-later consumers — **not** performance. Do not vendor or adapt `std.compress.flate` code. Do not reuse `zlc` (it is an asymmetric WORM LZ77 codec, not RFC 1951 DEFLATE). +- **glTF JSON.** Parse with `std.json`. "Native Zig parser" in the spec means "no cgltf C binding", **not** a from-scratch JSON parser. Do not write a JSON parser. +- **Inaugural SIMD kernels.** `adler32` and `paeth_filter_decode` are `foundation/simd/`'s first kernels: portable `@Vector` plus a scalar reference, **no ISA-specific asm** (`engine-simd.md` §10 — asm only with documented bench justification, which is absent here), **no zlib-ng parity chasing**. Their purpose is to stand up and validate the `foundation/simd/` infrastructure (test harness, scalar-reference-vs-`@Vector` pattern, the math/simd sibling boundary), not to be fast. +- **Cold-path principle.** No codec in this milestone is on a runtime hot path; the runtime mmaps the cooked `.bin`. Do not SIMD-optimize cold decode paths beyond the two planned inaugural kernels. +- **`foundation/math` ↔ `foundation/simd` boundary.** Sibling modules with no mutual dependency. SIMD kernels take raw slices; if a call site has typed math objects, conversion happens at the call site via `foundation/math` helpers. `foundation/simd` imports nothing but `std`. +- **Lazy analysis guard.** Compile-green does not prove a body is correct under Zig 0.16 lazy analysis; only a consuming test with realistic args and an assertion forces deep analysis. See Acceptance › Tests › Surface coverage. 
+- **Module rooting.** New modules under `src/` that contain inline tests must be transitively reachable from the covering test target's root (re-export via the module's `root.zig`, or an explicit `comptime { _ = @import(...); }` pin), or their inline tests are silently skipped. Use the `root.zig` convention (not `mod.zig` / `main.zig`). +- **Async loader tests.** Any test awaiting an async resource must carry an internal timeout ≤ 5 s with clean teardown; a hanging async test hangs the whole suite (S6 lesson, `engine-zig-conventions.md` §13). +- **Tier 0 boundary.** The loader consumes the existing Chase-Lev job system as-is. Do **not** widen the job system public surface. If the integration appears to need a new job-system API, that is a Case 2 design blocker (stop + journal + Claude.ai round-trip), not an in-session extension. +- **WAV location.** Decode lives in `importers/wav.zig` (RIFF is trivial); there is no separate `codecs/wav/`, matching the directory-structure tree. +- **Directory patch.** `codecs/deflate/` is a new sub-directory not yet present in `engine-directory-structure.md`; its addition is tracked as a spec patch outside this brief. 
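The generation check behind `AssetHandle` and the registry (Scope, and the `handle_generation.zig` acceptance test) can be illustrated with a small sketch. It assumes a slot table indexed by `index`; the `Slot` type and the `resolves` helper are hypothetical and are not the registry API. Only the three handle fields and their widths come from the brief.

```zig
const std = @import("std");

/// 64-bit packed handle with the field widths listed in Scope.
const AssetHandle = packed struct(u64) {
    index: u32,
    generation: u16,
    type_tag: u16,
};

/// Hypothetical registry slot, for illustration only.
const Slot = struct {
    generation: u16 = 0,
    live: bool = false,
};

/// A handle resolves only while its slot is live and the generations match.
fn resolves(slots: []const Slot, handle: AssetHandle) bool {
    if (handle.index >= slots.len) return false;
    const slot = slots[handle.index];
    return slot.live and slot.generation == handle.generation;
}

test "stale handle after unload is rejected (sketch)" {
    var slots = [_]Slot{.{}} ** 4;

    // Load: slot 1 becomes live and the handle captures its generation.
    slots[1].live = true;
    const handle = AssetHandle{ .index = 1, .generation = slots[1].generation, .type_tag = 1 };
    try std.testing.expect(resolves(&slots, handle));

    // Unload: bump the generation so older handles stop matching.
    slots[1].live = false;
    slots[1].generation +%= 1;

    // Reusing the slot for another asset must not revive the old handle.
    slots[1].live = true;
    try std.testing.expect(!resolves(&slots, handle));
}
```

The point of the bump on unload is that slot reuse is safe: the old handle carries the previous generation and fails the comparison even though the index is valid again.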
+ +--- + +# LIVING SECTION + +*Maintained by Claude Code during the milestone.* + +## Specs read + +- [ ] `engine-phase-0-plan.md` (§ M0.6) — read +- [ ] `engine-asset-pipeline.md` (§1–10) — read +- [ ] `engine-spec.md` (§16, §3.5) — read +- [ ] `engine-simd.md` (§1–3, §7.1, §9, §10) — read +- [ ] `engine-zig-conventions.md` (§13, module rooting, codecs in-tree, root.zig) — read +- [ ] `engine-directory-structure.md` (asset_pipeline, foundation/simd) — read +- [ ] `engine-development-workflow.md` (§4.3, §4.6, language closure) — read + +## Execution journal + +- + +## Recorded deviations + +- + +## Blockers encountered + +- — resolved by or + +## Closing notes + +- **What worked:** +- **What deviated from the original spec:** +- **What to flag explicitly in review:** +- **Final measurements** (perf, binary size, compile time, as relevant): +- **Residual risk / technical debt left deliberately:** From 89b160af489c4ae9b26f27181a2dcd33d057ffb2 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 18:33:08 +0200 Subject: [PATCH 02/29] docs(brief): confirm specs read for m0.6 --- briefs/m0.6-assets.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index f0c4077..34caa8e 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -188,13 +188,13 @@ References and known pitfalls only. 
No anticipated edge cases (those are Scope o ## Specs read -- [ ] `engine-phase-0-plan.md` (§ M0.6) — read -- [ ] `engine-asset-pipeline.md` (§1–10) — read -- [ ] `engine-spec.md` (§16, §3.5) — read -- [ ] `engine-simd.md` (§1–3, §7.1, §9, §10) — read -- [ ] `engine-zig-conventions.md` (§13, module rooting, codecs in-tree, root.zig) — read -- [ ] `engine-directory-structure.md` (asset_pipeline, foundation/simd) — read -- [ ] `engine-development-workflow.md` (§4.3, §4.6, language closure) — read +- [x] `engine-phase-0-plan.md` (§ M0.6) — read 2026-06-03 18:32 +- [x] `engine-asset-pipeline.md` (§1–10) — read 2026-06-03 18:32 +- [x] `engine-spec.md` (§16, §3.5) — read 2026-06-03 18:32 +- [x] `engine-simd.md` (§1–3, §7.1, §9, §10) — read 2026-06-03 18:32 +- [x] `engine-zig-conventions.md` (§13, module rooting, codecs in-tree, root.zig) — read 2026-06-03 18:32 +- [x] `engine-directory-structure.md` (asset_pipeline, foundation/simd) — read 2026-06-03 18:32 +- [x] `engine-development-workflow.md` (§4.3, §4.6, language closure) — read 2026-06-03 18:32 ## Execution journal From 65e7b63ff19cfd13cfdd8cf6d4c3fb9d69e41e45 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 18:33:27 +0200 Subject: [PATCH 03/29] docs(brief): activate m0.6 --- briefs/m0.6-assets.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index 34caa8e..a14b606 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -8,7 +8,7 @@ # M0.6 — Asset pipeline v0 (formats, codecs, cooker, async loader) -> **Status:** PLANNED +> **Status:** ACTIVE > **Phase:** 0 > **Branch:** `phase-0/assets/pipeline-v0` > **Planned tag:** `v0.6.0-M0.6-assets` From 7e98a0391f5bbd19de542d91f706ba82a28b711a Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 18:43:45 +0200 Subject: [PATCH 04/29] docs(brief): log E1 frozen-header-size blocker --- briefs/m0.6-assets.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git 
a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index a14b606..2163ed6 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -198,7 +198,9 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o ## Execution journal -- +- 2026-06-03 18:34 — E1 setup complete: branch `phase-0/assets/pipeline-v0` off `main` (5407f2a); brief committed verbatim (8dfdee8); 7 specs read + confirmed (89b160a); brief activated (65e7b63). +- 2026-06-03 18:40 — E1 design: studied `build.zig` wiring (per-test `createModule` + `TestSpec` flags pattern; modules via `addImport`). Per the FROZEN `build.zig` bullet, `asset_pipeline` deps are `foundation` + `core` only (no `weld_etch`) → the intermediate `.asset.etch` will use a self-contained minimal Etch-syntax writer/reader, not the S3 Etch parser. Flagged for confirmation at the E1 gate. +- 2026-06-03 18:43 — E1 BLOCKED on the runtime `.bin` header size (frozen-day-1 surface contradiction — see Blockers). Stopped E1 implementation per the blocker protocol; nothing is built on an unconfirmed frozen format. ## Recorded deviations @@ -206,7 +208,7 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o ## Blockers encountered -- — resolved by or +- **[E1 — OPEN] Runtime `..bin` header size: the "32 bytes" label contradicts the 36-byte field list.** Both the FROZEN Scope (runtime format bullet) and `engine-asset-pipeline.md` §5 label the header "32 bytes", but the explicit field list sums to 36: `magic`[4] + `version` u16(2) + `asset_type` u16(2) + `platform` u16(2) + `flags` u16(2) + `data_offset` u32(4) + `data_size` u32(4) + `metadata_offset` u32(4) + `metadata_size` u32(4) + `hash` u64(8) = 36. The 10 fields as typed cannot fit in 32 bytes (only `hash` u64→u32 would reach exactly 32). The header layout is frozen day 1 and the cooker (E4) + the mmap loader (E5) build on it, so the exact frozen size must be confirmed before E1 writes it. 
Candidate resolutions: (a) the field list is authoritative → header is 36 bytes and "32" is a spec typo to patch in both the brief and `engine-asset-pipeline.md` §5; (b) "32" is authoritative → one field differs from the listed types. Recommendation: (a) — the detailed field list is the more specific source and `data_offset`/`metadata_offset` are explicit, so the byte-count label is descriptive only. Awaiting Claude.ai ruling (FROZEN SECTION change) or Guy's verbal decision recorded as a deviation. — UNRESOLVED ## Closing notes From 2696fed723fa5ecf04ae2e6b807d0669dfb25760 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 21:15:22 +0200 Subject: [PATCH 05/29] docs(brief): patch .bin header to 40 bytes, aligned (recorded deviation) --- briefs/m0.6-assets.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index 2163ed6..cf16667 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -31,7 +31,7 @@ First new functional surface since the Vulkan renderer (M0.4). Delivers the mini Frozen day-1 surfaces (formats + handle + cache key) and their minimal PNG / static-glTF / WAV implementation. Decode-only — no encoder of any source format is delivered. - **Intermediate format `.asset.etch`** — Etch text (asset name, `type`, `version`, `source`, `source_hash`, `import_settings`, `process_settings`, `extracted`) plus a referenced hashed binary blob stored separately. Schema frozen day 1; only the PNG / static-glTF / WAV subset is populated. -- **Runtime format `..bin`** — zero-copy binary, 32-byte header (fields, in order: `magic` = "WELD", `version` u16, `asset_type` u16, `platform` u16, `flags` u16, `data_offset` u32, `data_size` u32, `metadata_offset` u32, `metadata_size` u32, `hash` u64), followed by metadata then bulk data, mmap-able directly. Header layout frozen day 1. 
+- **Runtime format `..bin`** — zero-copy binary, **40-byte header**, 8-byte aligned (`@sizeOf == 40`, no implicit padding), fields in order with offsets: `magic`="WELD" [4]u8 @0, `version` u16 @4, `asset_type` u16 @6, `platform` u16 @8, `flags` u16 @10, `data_offset` u32 @12, `data_size` u32 @16, `metadata_offset` u32 @20, `metadata_size` u32 @24, `_reserved` u32 @28 (zero-filled), `hash` u64 @32. Followed by metadata then bulk data; `data_offset` / `metadata_offset` let payload sections be independently aligned (e.g. 16-byte for GPU/SIMD data), so payload alignment is decoupled from header size. mmap-able directly. Header layout frozen day 1. - **`AssetHandle`** — 64-bit packed handle, fields: `index` u32, `generation` u16, `type_tag` u16. Typed, generation-checked. Frozen day 1. (Named `AssetHandle`, not `AssetRef`.) - **Asset registry** — handle allocation, refcount, generation bump on unload so that a stale handle is detectable. - **Native in-tree DEFLATE/zlib codec** (`codecs/deflate/`) — RFC 1951 inflate (fixed + dynamic Huffman, table-driven) plus the zlib wrapper and ADLER32 trailer verification. @@ -62,6 +62,7 @@ Listed explicitly so the pipeline is not extended "to do it properly". Each item - Asset registry queries, orphan detection, dependency graph, circular-dependency detection (Phase 1+). - Any `weld cook` / `weld asset` CLI UX polish. The offline cooker is exercised through the round-trip tests and a thin offline entry only; the user-facing CLI surface is Phase 1. - Any widening of the Tier 0 job system public surface (see Notes). +- General Etch parser, or any dependency on `weld_etch`. The intermediate `.asset.etch` is read and written by a minimal ad-hoc reader/writer limited to the `asset { … }` schema; the full Etch parser is M0.8. (`asset_pipeline` depends on `foundation` + `core` only.) ## Execution stages (E1–E5) @@ -179,6 +180,8 @@ References and known pitfalls only. 
No anticipated edge cases (those are Scope o - **Tier 0 boundary.** The loader consumes the existing Chase-Lev job system as-is. Do **not** widen the job system public surface. If the integration appears to need a new job-system API, that is a Case 2 design blocker (stop + journal + Claude.ai round-trip), not an in-session extension. - **WAV location.** Decode lives in `importers/wav.zig` (RIFF is trivial); there is no separate `codecs/wav/`, matching the directory-structure tree. - **Directory patch.** `codecs/deflate/` is a new sub-directory not yet present in `engine-directory-structure.md`; its addition is tracked as a spec patch outside this brief. +- **Runtime header alignment.** The header is 40 bytes, 8-byte aligned, with an explicit `_reserved` u32 at @28 so that `hash` u64 lands at @32 (8-aligned) and `@sizeOf == 40` with no implicit padding. This avoids the misaligned-`u64` / silent-tail-padding trap that an unpadded layout would create for a zero-copy mmap format. `_reserved` is zero-filled and reserved for future small fields (additive, no version bump). (Resolves the M0.6/E1 header-size blocker; the earlier "32 bytes" in `engine-asset-pipeline.md` §5 was an arithmetic error.) +- **Intermediate format syntax.** The `.asset.etch` text emitted by the M0.6 ad-hoc writer must be a valid subset of the Etch grammar (`etch-grammar.md`), so the M0.8 Etch parser reads it unchanged. The ad-hoc reader/writer is transitional; the on-disk text format is the frozen surface, not the reader implementation. --- @@ -200,15 +203,16 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-03 18:34 — E1 setup complete: branch `phase-0/assets/pipeline-v0` off `main` (5407f2a); brief committed verbatim (8dfdee8); 7 specs read + confirmed (89b160a); brief activated (65e7b63). - 2026-06-03 18:40 — E1 design: studied `build.zig` wiring (per-test `createModule` + `TestSpec` flags pattern; modules via `addImport`). 
Per the FROZEN `build.zig` bullet, `asset_pipeline` deps are `foundation` + `core` only (no `weld_etch`) → the intermediate `.asset.etch` will use a self-contained minimal Etch-syntax writer/reader, not the S3 Etch parser. Flagged for confirmation at the E1 gate. -- 2026-06-03 18:43 — E1 BLOCKED on the runtime `.bin` header size (frozen-day-1 surface contradiction — see Blockers). Stopped E1 implementation per the blocker protocol; nothing is built on an unconfirmed frozen format. +- 2026-06-03 18:43 — E1 BLOCKED on the runtime `.bin` header size (frozen-day-1 surface contradiction — see Blockers). Stopped E1 implementation per the blocker protocol; nothing built on an unconfirmed frozen format (7e98a03). +- 2026-06-03 21:14 — Header blocker RESOLVED by Claude.ai ruling (Case 2, brief patched): header = 40 bytes, 8-byte aligned, explicit `_reserved` u32 @28, `hash` u64 @32, `@sizeOf == 40`, no implicit padding. FROZEN SECTION replaced verbatim (header bullet + Out-of-scope no-`weld_etch` + 2 Notes); deviation recorded. Secondary point confirmed: ad-hoc Etch-subset reader/writer, no `weld_etch`, must emit a valid `etch-grammar.md` subset. Resuming E1. ## Recorded deviations -- +- 2026-06-03 (Claude.ai ruling, E1; this commit) — Runtime `..bin` header corrected from 32 to 40 bytes during E1: the 10 typed fields total 36 bytes and `hash` u64 required 8-byte alignment; added explicit `_reserved` u32 @28 so `hash` lands at @32 and `@sizeOf == 40` with no implicit padding. `engine-asset-pipeline.md` §5 patched in lockstep. Rationale: zero-copy mmap safety. 
+ ## Blockers encountered -- **[E1 — OPEN] Runtime `..bin` header size: the "32 bytes" label contradicts the 36-byte field list.** Both the FROZEN Scope (runtime format bullet) and `engine-asset-pipeline.md` §5 label the header "32 bytes", but the explicit field list sums to 36: `magic`[4] + `version` u16(2) + `asset_type` u16(2) + `platform` u16(2) + `flags` u16(2) + `data_offset` u32(4) + `data_size` u32(4) + `metadata_offset` u32(4) + `metadata_size` u32(4) + `hash` u64(8) = 36. The 10 fields as typed cannot fit in 32 bytes (only `hash` u64→u32 would reach exactly 32). The header layout is frozen day 1 and the cooker (E4) + the mmap loader (E5) build on it, so the exact frozen size must be confirmed before E1 writes it. Candidate resolutions: (a) the field list is authoritative → header is 36 bytes and "32" is a spec typo to patch in both the brief and `engine-asset-pipeline.md` §5; (b) "32" is authoritative → one field differs from the listed types. Recommendation: (a) — the detailed field list is the more specific source and `data_offset`/`metadata_offset` are explicit, so the byte-count label is descriptive only. Awaiting Claude.ai ruling (FROZEN SECTION change) or Guy's verbal decision recorded as a deviation. — UNRESOLVED +- **[E1 — RESOLVED] Runtime `..bin` header size: the "32 bytes" label contradicted the 36-byte field list.** Both the FROZEN Scope and `engine-asset-pipeline.md` §5 labelled the header "32 bytes", but the explicit field list summed to 36 (`magic`[4] + 4×u16 + 4×u32 + `hash` u64). The 10 fields as typed cannot fit in 32 bytes. Logged in 7e98a03, signalled to Guy. **Resolved 2026-06-03 by Claude.ai ruling (Case 2):** header = 40 bytes, 8-byte aligned, explicit `_reserved` u32 @28, `hash` u64 @32, `@sizeOf == 40`, no implicit padding — see Recorded deviations. Brief FROZEN SECTION + `engine-asset-pipeline.md` §5 patched in lockstep. 
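The arithmetic behind the ruling (4 + 4×2 + 4×4 + 8 = 36 bytes of typed fields, plus a 4-byte `_reserved` to reach 40) can be checked mechanically. The following is a minimal layout sketch only: the type name `BinHeader` is hypothetical, and the actual declaration in `runtime_bin.zig` is not shown in this excerpt. Field names, types and offsets are taken from the patched Scope bullet.

```zig
const std = @import("std");

/// Hypothetical mirror of the 40-byte runtime header described in Scope.
/// `extern struct` gives a defined, declaration-order field layout.
const BinHeader = extern struct {
    magic: [4]u8, // "WELD"
    version: u16,
    asset_type: u16,
    platform: u16,
    flags: u16,
    data_offset: u32,
    data_size: u32,
    metadata_offset: u32,
    metadata_size: u32,
    _reserved: u32, // zero-filled; places `hash` at offset 32
    hash: u64,
};

comptime {
    // Fails the build if the layout ever drifts from the frozen offsets.
    std.debug.assert(@sizeOf(BinHeader) == 40);
    std.debug.assert(@offsetOf(BinHeader, "version") == 4);
    std.debug.assert(@offsetOf(BinHeader, "data_offset") == 12);
    std.debug.assert(@offsetOf(BinHeader, "metadata_size") == 24);
    std.debug.assert(@offsetOf(BinHeader, "_reserved") == 28);
    std.debug.assert(@offsetOf(BinHeader, "hash") == 32);
}

test "typed fields sum to 36 bytes, reserved word brings the header to 40" {
    const typed = 4 + 4 * @sizeOf(u16) + 4 * @sizeOf(u32) + @sizeOf(u64);
    try std.testing.expectEqual(@as(usize, 36), typed);
    try std.testing.expectEqual(@as(usize, 40), typed + @sizeOf(u32));
    try std.testing.expectEqual(@as(usize, 40), @sizeOf(BinHeader));
}
```

Because every field sits at a multiple of its own size, the struct needs no compiler-inserted padding, which is the property the ruling relies on for a zero-copy mmap format.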
## Closing notes From c3184cddaa1e3a8cb08591f24fc766aac71e5a32 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 21:33:27 +0200 Subject: [PATCH 06/29] feat(assets): add frozen on-disk format surfaces --- .../asset_pipeline/format/asset_type.zig | 96 ++++ .../asset_pipeline/format/intermediate.zig | 527 ++++++++++++++++++ src/modules/asset_pipeline/format/root.zig | 42 ++ .../asset_pipeline/format/runtime_bin.zig | 237 ++++++++ 4 files changed, 902 insertions(+) create mode 100644 src/modules/asset_pipeline/format/asset_type.zig create mode 100644 src/modules/asset_pipeline/format/intermediate.zig create mode 100644 src/modules/asset_pipeline/format/root.zig create mode 100644 src/modules/asset_pipeline/format/runtime_bin.zig diff --git a/src/modules/asset_pipeline/format/asset_type.zig b/src/modules/asset_pipeline/format/asset_type.zig new file mode 100644 index 0000000..8b81042 --- /dev/null +++ b/src/modules/asset_pipeline/format/asset_type.zig @@ -0,0 +1,96 @@ +//! Frozen `AssetType` enum — the category tag shared by the runtime +//! `..bin` header (`asset_type` field) and `AssetHandle.type_tag`. +//! +//! The variant list and explicit values are frozen day 1 (M0.6): the +//! `asset_type` u16 is an on-disk field of every cooked `.bin`, so the +//! numbering must never change. New categories append with the next free +//! value (additive, no renumber). Mirrors the canonical list in +//! `engine-asset-pipeline.md` §10. +//! +//! M0.6 only *populates* `texture`, `mesh`, and `audio` (PNG / static-glTF +//! / WAV); the remaining variants are declared so the frozen numbering is +//! already reserved for later phases. + +const std = @import("std"); + +/// Asset category. Backing type is `u16` to match the `.bin` header +/// `asset_type` field and `AssetHandle.type_tag`. Values are frozen. +pub const AssetType = enum(u16) { + /// Mesh (M0.6: static only). + mesh = 0, + /// 2D texture / image. + texture = 1, + /// Audio clip (raw PCM in M0.6). 
+ audio = 2, + /// Skeletal animation clip (Phase 1). + animation = 3, + /// Font (Phase 1). + font = 4, + /// Scene graph. + scene = 5, + /// Entity template. + prefab = 6, + /// Gameplay code unit. + script = 7, + /// Shader program. + shader = 8, + /// Material definition. + material = 9, + /// Visual effect. + vfx = 10, + /// Data table. + data = 11, + /// UI widget. + widget = 12, + /// Motion asset. + motion = 13, + /// Cinematic sequence. + sequence = 14, + /// Animation graph. + anim_graph = 15, + /// Audio graph. + audio_graph = 16, + /// Audio score / arrangement. + audio_score = 17, + /// Localization table. + locale = 18, + + /// Numeric value stored in the `.bin` header `asset_type` field and in + /// `AssetHandle.type_tag`. + pub fn toU16(self: AssetType) u16 { + return @intFromEnum(self); + } + + /// Decode a stored `u16` into an `AssetType`, or `null` when the value + /// is not a known variant (e.g. a `.bin` cooked by a future engine + /// version carrying a category this build does not know). + pub fn fromU16(value: u16) ?AssetType { + return std.enums.fromInt(AssetType, value); + } + + /// Lowercase category tag used in the runtime extension `..bin` + /// (e.g. `mesh` → `.mesh.bin`). Equal to the enum field name. + pub fn tag(self: AssetType) []const u8 { + return @tagName(self); + } +}; + +test "asset_type u16 round-trips for the populated M0.6 subset" { + inline for (.{ AssetType.texture, AssetType.mesh, AssetType.audio }) |t| { + try std.testing.expectEqual(t, AssetType.fromU16(t.toU16()).?); + } + // Explicit frozen values — guard against an accidental renumber. 
+ try std.testing.expectEqual(@as(u16, 0), AssetType.mesh.toU16()); + try std.testing.expectEqual(@as(u16, 1), AssetType.texture.toU16()); + try std.testing.expectEqual(@as(u16, 2), AssetType.audio.toU16()); +} + +test "asset_type fromU16 rejects an out-of-range value" { + try std.testing.expectEqual(@as(?AssetType, null), AssetType.fromU16(9999)); +} + +test "asset_type tag matches the field name" { + try std.testing.expectEqualStrings("texture", AssetType.texture.tag()); + try std.testing.expectEqualStrings("mesh", AssetType.mesh.tag()); + try std.testing.expectEqualStrings("audio", AssetType.audio.tag()); +} diff --git a/src/modules/asset_pipeline/format/intermediate.zig b/src/modules/asset_pipeline/format/intermediate.zig new file mode 100644 index 0000000..ffff211 --- /dev/null +++ b/src/modules/asset_pipeline/format/intermediate.zig @@ -0,0 +1,527 @@ +//! Intermediate `.asset.etch` document model + a minimal Etch-syntax +//! reader/writer. +//! +//! The on-disk text is the frozen surface (M0.6): a single top-level +//! `asset "" { … }` construct holding the importer-extracted metadata +//! and the user-editable settings. It is the diffable, Git-versioned half +//! of an asset (the bulk bytes live in a separate hashed blob). +//! +//! The reader/writer here is a deliberately small ad-hoc implementation: +//! the brief forbids a `weld_etch` dependency in M0.6 (the full Etch parser +//! is M0.8). The emitted text MUST stay a valid subset of `etch-grammar.md` +//! so the M0.8 parser reads it back unchanged — fields are `key: value`, +//! newline-separated inside `{ … }` blocks, arrays are comma-separated +//! `[ … ]`, enum literals are `.name`, strings are `"…"`. The on-disk text +//! is the frozen contract, not this reader implementation. +//! +//! Ownership: `parseEtch` allocates every string/array/object into the +//! caller-supplied allocator (use an arena and free it in one shot). The +//! returned `AssetDoc` borrows nothing from the source text. 
+ +const std = @import("std"); + +/// A scalar or composite value in the `asset` document tree. +pub const Value = union(enum) { + /// Integer literal (e.g. `version: 1`). + int: i64, + /// Floating-point literal (always emitted with a decimal point). + float: f64, + /// Boolean literal (`true` / `false`). + boolean: bool, + /// Quoted string, stored without the surrounding quotes. + string: []const u8, + /// Bare identifier (e.g. an asset class name `StaticMesh`). + identifier: []const u8, + /// Enum literal `.name`, stored without the leading dot. + enum_literal: []const u8, + /// Comma-separated array of values. + array: []const Value, + /// Nested `{ … }` block of `key: value` fields. + object: []const Field, + + /// Deep structural equality. + pub fn eql(a: Value, b: Value) bool { + if (std.meta.activeTag(a) != std.meta.activeTag(b)) return false; + return switch (a) { + .int => |x| x == b.int, + .float => |x| x == b.float, + .boolean => |x| x == b.boolean, + .string => |x| std.mem.eql(u8, x, b.string), + .identifier => |x| std.mem.eql(u8, x, b.identifier), + .enum_literal => |x| std.mem.eql(u8, x, b.enum_literal), + .array => |x| blk: { + if (x.len != b.array.len) break :blk false; + for (x, b.array) |xa, ba| { + if (!Value.eql(xa, ba)) break :blk false; + } + break :blk true; + }, + .object => |x| fieldsEql(x, b.object), + }; + } +}; + +/// One `key: value` pair inside a block. +pub const Field = struct { + /// Field name (a bare identifier). + key: []const u8, + /// Field value. + value: Value, +}; + +/// Deep equality over two ordered field lists. +pub fn fieldsEql(a: []const Field, b: []const Field) bool { + if (a.len != b.len) return false; + for (a, b) |fa, fb| { + if (!std.mem.eql(u8, fa.key, fb.key)) return false; + if (!Value.eql(fa.value, fb.value)) return false; + } + return true; +} + +/// The frozen intermediate-format schema. 
The three settings/extracted +/// blocks are generic field lists so each asset category populates only +/// what it needs (M0.6: texture / mesh / audio). +pub const AssetDoc = struct { + /// Logical asset name (the `asset ""` string). + name: []const u8, + /// Asset class identifier (e.g. `Texture2D`, `StaticMesh`, `AudioClip`). + type_name: []const u8, + /// Schema version of this document. + version: u16, + /// Source file the asset was imported from. + source: []const u8, + /// Hex hash of the source bytes. + source_hash: []const u8, + /// User-editable import settings. + import_settings: []const Field = &.{}, + /// User-editable process settings. + process_settings: []const Field = &.{}, + /// Importer-extracted, machine-maintained facts. + extracted: []const Field = &.{}, + + /// Deep structural equality (used by the round-trip test). + pub fn eql(a: AssetDoc, b: AssetDoc) bool { + return std.mem.eql(u8, a.name, b.name) and + std.mem.eql(u8, a.type_name, b.type_name) and + a.version == b.version and + std.mem.eql(u8, a.source, b.source) and + std.mem.eql(u8, a.source_hash, b.source_hash) and + fieldsEql(a.import_settings, b.import_settings) and + fieldsEql(a.process_settings, b.process_settings) and + fieldsEql(a.extracted, b.extracted); + } +}; + +/// Error set raised while writing. `std.Io.Writer.Error` already covers a +/// failed underlying drain (e.g. allocation failure on an allocating +/// writer). +pub const WriteError = std.Io.Writer.Error; + +/// Serialize `doc` as `.asset.etch` text into `out`. 
+pub fn writeEtch(doc: AssetDoc, out: *std.Io.Writer) WriteError!void { + try out.print("asset \"{s}\" {{\n", .{doc.name}); + try out.print(" type: {s}\n", .{doc.type_name}); + try out.print(" version: {d}\n", .{doc.version}); + try out.print(" source: \"{s}\"\n", .{doc.source}); + try out.print(" source_hash: \"{s}\"\n", .{doc.source_hash}); + try writeBlock(out, "import_settings", doc.import_settings); + try writeBlock(out, "process_settings", doc.process_settings); + try writeBlock(out, "extracted", doc.extracted); + try out.writeAll("}\n"); +} + +/// Serialize `doc` into a freshly allocated, caller-owned byte slice. +pub fn writeAlloc(gpa: std.mem.Allocator, doc: AssetDoc) error{OutOfMemory}![]u8 { + var aw = std.Io.Writer.Allocating.init(gpa); + errdefer aw.deinit(); + writeEtch(doc, &aw.writer) catch return error.OutOfMemory; + return aw.toOwnedSlice(); +} + +fn writeBlock(out: *std.Io.Writer, name: []const u8, fields: []const Field) WriteError!void { + try out.print(" {s}: {{\n", .{name}); + for (fields) |f| { + try out.print(" {s}: ", .{f.key}); + try writeValue(out, f.value, 2); + try out.writeAll("\n"); + } + try out.writeAll(" }\n"); +} + +fn writeIndent(out: *std.Io.Writer, depth: usize) WriteError!void { + var i: usize = 0; + while (i < depth) : (i += 1) try out.writeAll(" "); +} + +fn writeValue(out: *std.Io.Writer, v: Value, depth: usize) WriteError!void { + switch (v) { + .int => |i| try out.print("{d}", .{i}), + .float => |f| try writeFloat(out, f), + .boolean => |b| try out.writeAll(if (b) "true" else "false"), + .string => |s| try out.print("\"{s}\"", .{s}), + .identifier => |s| try out.writeAll(s), + .enum_literal => |s| try out.print(".{s}", .{s}), + .array => |items| { + try out.writeAll("["); + for (items, 0..) 
|it, i| { + if (i != 0) try out.writeAll(", "); + try writeValue(out, it, depth); + } + try out.writeAll("]"); + }, + .object => |fields| { + try out.writeAll("{\n"); + for (fields) |f| { + try writeIndent(out, depth + 1); + try out.print("{s}: ", .{f.key}); + try writeValue(out, f.value, depth + 1); + try out.writeAll("\n"); + } + try writeIndent(out, depth); + try out.writeAll("}"); + }, + } +} + +/// Emit a float with a guaranteed decimal point so the reader keeps it a +/// float (otherwise `1.0` would format as `1` and parse back as an int). +fn writeFloat(out: *std.Io.Writer, f: f64) WriteError!void { + var buf: [512]u8 = undefined; + const s = std.fmt.bufPrint(&buf, "{d}", .{f}) catch unreachable; + try out.writeAll(s); + // If the shortest form has no '.', exponent, or inf/nan letters, it + // looks like an integer — append ".0" to preserve the float tag. + if (std.mem.indexOfAny(u8, s, ".eEnN") == null) { + try out.writeAll(".0"); + } +} + +/// Error set raised while parsing. +pub const ParseError = error{ + /// Allocation failed. + OutOfMemory, + /// Hit end of input mid-construct. + UnexpectedEnd, + /// A character not valid at this position. + UnexpectedChar, + /// The document does not start with the `asset` keyword. + ExpectedAssetKeyword, + /// A numeric literal failed to parse. + InvalidNumber, + /// The `version` field was absent, non-integer, or out of `u16` range. + InvalidVersion, +}; + +/// Parse `.asset.etch` text into an `AssetDoc`. Every owned slice is +/// allocated from `arena` (pass an arena and free it in one shot). 
+pub fn parseEtch(arena: std.mem.Allocator, src: []const u8) ParseError!AssetDoc { + var p = Parser{ .src = src, .arena = arena }; + return p.parseDoc(); +} + +const Parser = struct { + src: []const u8, + pos: usize = 0, + arena: std.mem.Allocator, + + fn isWs(c: u8) bool { + return c == ' ' or c == '\t' or c == '\n' or c == '\r'; + } + fn isIdentStart(c: u8) bool { + return (c >= 'a' and c <= 'z') or (c >= 'A' and c <= 'Z') or c == '_'; + } + fn isIdentChar(c: u8) bool { + return isIdentStart(c) or (c >= '0' and c <= '9'); + } + + fn skipWs(self: *Parser) void { + while (self.pos < self.src.len and isWs(self.src[self.pos])) : (self.pos += 1) {} + } + + fn peekNonWs(self: *Parser) ParseError!u8 { + self.skipWs(); + if (self.pos >= self.src.len) return error.UnexpectedEnd; + return self.src[self.pos]; + } + + fn expect(self: *Parser, ch: u8) ParseError!void { + self.skipWs(); + if (self.pos >= self.src.len) return error.UnexpectedEnd; + if (self.src[self.pos] != ch) return error.UnexpectedChar; + self.pos += 1; + } + + fn parseIdent(self: *Parser) ParseError![]const u8 { + self.skipWs(); + const start = self.pos; + if (self.pos >= self.src.len or !isIdentStart(self.src[self.pos])) return error.UnexpectedChar; + self.pos += 1; + while (self.pos < self.src.len and isIdentChar(self.src[self.pos])) : (self.pos += 1) {} + return self.arena.dupe(u8, self.src[start..self.pos]); + } + + fn parseString(self: *Parser) ParseError![]const u8 { + try self.expect('"'); + const start = self.pos; + while (self.pos < self.src.len and self.src[self.pos] != '"') : (self.pos += 1) {} + if (self.pos >= self.src.len) return error.UnexpectedEnd; + const inner = self.src[start..self.pos]; + self.pos += 1; // consume closing quote + return self.arena.dupe(u8, inner); + } + + fn parseNumber(self: *Parser) ParseError!Value { + self.skipWs(); + const start = self.pos; + if (self.pos < self.src.len and (self.src[self.pos] == '-' or self.src[self.pos] == '+')) { + self.pos += 1; + } + var 
is_float = false; + while (self.pos < self.src.len) : (self.pos += 1) { + const ch = self.src[self.pos]; + if (ch >= '0' and ch <= '9') continue; + if (ch == '.' or ch == 'e' or ch == 'E') { + is_float = true; + continue; + } + if ((ch == '+' or ch == '-') and self.pos > start) { + const prev = self.src[self.pos - 1]; + if (prev == 'e' or prev == 'E') continue; + } + break; + } + const slice = self.src[start..self.pos]; + if (slice.len == 0) return error.InvalidNumber; + if (is_float) { + const f = std.fmt.parseFloat(f64, slice) catch return error.InvalidNumber; + return .{ .float = f }; + } + const i = std.fmt.parseInt(i64, slice, 10) catch return error.InvalidNumber; + return .{ .int = i }; + } + + fn parseArray(self: *Parser) ParseError!Value { + try self.expect('['); + var items: std.ArrayList(Value) = .empty; + while (true) { + const c = try self.peekNonWs(); + if (c == ']') { + self.pos += 1; + break; + } + try items.append(self.arena, try self.parseValue()); + const d = try self.peekNonWs(); + if (d == ',') { + self.pos += 1; + continue; + } + if (d == ']') { + self.pos += 1; + break; + } + return error.UnexpectedChar; + } + return .{ .array = try items.toOwnedSlice(self.arena) }; + } + + fn parseObject(self: *Parser) ParseError!Value { + return .{ .object = try self.parseFields() }; + } + + /// Parse a `{ key: value … }` block and return its fields. 
+ fn parseFields(self: *Parser) ParseError![]const Field { + try self.expect('{'); + var fields: std.ArrayList(Field) = .empty; + while (true) { + const c = try self.peekNonWs(); + if (c == '}') { + self.pos += 1; + break; + } + const key = try self.parseIdent(); + try self.expect(':'); + const value = try self.parseValue(); + try fields.append(self.arena, .{ .key = key, .value = value }); + } + return fields.toOwnedSlice(self.arena); + } + + fn parseValue(self: *Parser) ParseError!Value { + const c = try self.peekNonWs(); + switch (c) { + '"' => return .{ .string = try self.parseString() }, + '[' => return self.parseArray(), + '{' => return self.parseObject(), + '.' => { + self.pos += 1; // consume '.' + return .{ .enum_literal = try self.parseIdent() }; + }, + '-', '+', '0'...'9' => return self.parseNumber(), + else => { + if (!isIdentStart(c)) return error.UnexpectedChar; + const id = try self.parseIdent(); + if (std.mem.eql(u8, id, "true")) return .{ .boolean = true }; + if (std.mem.eql(u8, id, "false")) return .{ .boolean = false }; + return .{ .identifier = id }; + }, + } + } + + fn parseDoc(self: *Parser) ParseError!AssetDoc { + const keyword = self.parseIdent() catch return error.ExpectedAssetKeyword; + if (!std.mem.eql(u8, keyword, "asset")) return error.ExpectedAssetKeyword; + const name = try self.parseString(); + const fields = try self.parseFields(); + + var doc = AssetDoc{ + .name = name, + .type_name = "", + .version = 0, + .source = "", + .source_hash = "", + }; + for (fields) |f| { + if (std.mem.eql(u8, f.key, "type")) { + doc.type_name = switch (f.value) { + .identifier => |s| s, + .string => |s| s, + else => return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "version")) { + doc.version = switch (f.value) { + .int => |i| std.math.cast(u16, i) orelse return error.InvalidVersion, + else => return error.InvalidVersion, + }; + } else if (std.mem.eql(u8, f.key, "source")) { + doc.source = switch (f.value) { + .string => |s| s, + else 
=> return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "source_hash")) { + doc.source_hash = switch (f.value) { + .string => |s| s, + else => return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "import_settings")) { + doc.import_settings = switch (f.value) { + .object => |o| o, + else => return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "process_settings")) { + doc.process_settings = switch (f.value) { + .object => |o| o, + else => return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "extracted")) { + doc.extracted = switch (f.value) { + .object => |o| o, + else => return error.UnexpectedChar, + }; + } + // Unknown top-level keys are ignored (forward-compatibility). + } + return doc; + } +}; + +test "intermediate doc round-trips through etch text" { + const gpa = std.testing.allocator; + + const min_arr = [_]Value{ .{ .float = -1.0 }, .{ .float = -1.0 }, .{ .float = -1.0 } }; + const max_arr = [_]Value{ .{ .float = 1.0 }, .{ .float = 1.0 }, .{ .float = 1.0 } }; + const bounds = [_]Field{ + .{ .key = "min", .value = .{ .array = &min_arr } }, + .{ .key = "max", .value = .{ .array = &max_arr } }, + }; + const materials = [_]Value{ .{ .string = "body" }, .{ .string = "trim" } }; + + const import_settings = [_]Field{ + .{ .key = "scale", .value = .{ .float = 1.0 } }, + .{ .key = "axis_conversion", .value = .{ .enum_literal = "gltf_to_weld" } }, + }; + const process_settings = [_]Field{ + .{ .key = "generate_lods", .value = .{ .boolean = false } }, + }; + const extracted = [_]Field{ + .{ .key = "vertex_count", .value = .{ .int = 24 } }, + .{ .key = "bounds", .value = .{ .object = &bounds } }, + .{ .key = "materials", .value = .{ .array = &materials } }, + }; + + const original = AssetDoc{ + .name = "cube_mesh", + .type_name = "StaticMesh", + .version = 1, + .source = "cube.gltf", + .source_hash = "abc123", + .import_settings = &import_settings, + .process_settings = &process_settings, + .extracted = 
&extracted, + }; + + const text = try writeAlloc(gpa, original); + defer gpa.free(text); + + var arena = std.heap.ArenaAllocator.init(gpa); + defer arena.deinit(); + const parsed = try parseEtch(arena.allocator(), text); + + try std.testing.expect(original.eql(parsed)); + try std.testing.expectEqualStrings("StaticMesh", parsed.type_name); + try std.testing.expectEqual(@as(u16, 1), parsed.version); + try std.testing.expectEqual(@as(usize, 3), parsed.extracted.len); +} + +test "intermediate writer emits a valid asset construct shape" { + const gpa = std.testing.allocator; + const import_settings = [_]Field{ + .{ .key = "srgb", .value = .{ .boolean = true } }, + .{ .key = "max_resolution", .value = .{ .int = 4096 } }, + }; + const doc = AssetDoc{ + .name = "hero_albedo", + .type_name = "Texture2D", + .version = 1, + .source = "hero_albedo.png", + .source_hash = "7b3e2f1a", + .import_settings = &import_settings, + }; + const text = try writeAlloc(gpa, doc); + defer gpa.free(text); + + try std.testing.expect(std.mem.indexOf(u8, text, "asset \"hero_albedo\" {") != null); + try std.testing.expect(std.mem.indexOf(u8, text, "type: Texture2D") != null); + try std.testing.expect(std.mem.indexOf(u8, text, "version: 1") != null); + try std.testing.expect(std.mem.indexOf(u8, text, "srgb: true") != null); + try std.testing.expect(std.mem.indexOf(u8, text, "max_resolution: 4096") != null); +} + +test "intermediate float keeps its decimal point through a round-trip" { + const gpa = std.testing.allocator; + const import_settings = [_]Field{ + .{ .key = "scale", .value = .{ .float = 1.0 } }, + }; + const doc = AssetDoc{ + .name = "x", + .type_name = "Texture2D", + .version = 1, + .source = "x.png", + .source_hash = "0", + .import_settings = &import_settings, + }; + const text = try writeAlloc(gpa, doc); + defer gpa.free(text); + try std.testing.expect(std.mem.indexOf(u8, text, "scale: 1.0") != null); + + var arena = std.heap.ArenaAllocator.init(gpa); + defer arena.deinit(); + const 
parsed = try parseEtch(arena.allocator(), text); + try std.testing.expectEqual(Value{ .float = 1.0 }, parsed.import_settings[0].value); +} + +test "intermediate parse rejects input without the asset keyword" { + var arena = std.heap.ArenaAllocator.init(std.testing.allocator); + defer arena.deinit(); + try std.testing.expectError(error.ExpectedAssetKeyword, parseEtch(arena.allocator(), "widget \"x\" {}")); +} diff --git a/src/modules/asset_pipeline/format/root.zig b/src/modules/asset_pipeline/format/root.zig new file mode 100644 index 0000000..56f2ac8 --- /dev/null +++ b/src/modules/asset_pipeline/format/root.zig @@ -0,0 +1,42 @@ +//! Asset Pipeline `format/` namespace — the two day-1-frozen on-disk +//! surfaces. +//! +//! - `asset_type` — the `AssetType` category enum shared by both formats. +//! - `runtime_bin` — the runtime `..bin` 40-byte header. +//! - `intermediate` — the intermediate `.asset.etch` document model +//! and its ad-hoc Etch-subset reader/writer. + +const asset_type = @import("asset_type.zig"); +const runtime_bin = @import("runtime_bin.zig"); + +/// Intermediate `.asset.etch` document model + reader/writer. +pub const intermediate = @import("intermediate.zig"); + +/// Frozen asset category enum (`AssetType`). +pub const AssetType = asset_type.AssetType; + +/// Runtime `..bin` 40-byte header. +pub const RuntimeHeader = runtime_bin.RuntimeHeader; +/// Target platform tag stored in the header. +pub const Platform = runtime_bin.Platform; +/// `WELD` file signature. +pub const magic = runtime_bin.magic; +/// On-disk header size in bytes (40). +pub const header_size = runtime_bin.header_size; +/// Current header format version. +pub const current_version = runtime_bin.current_version; + +/// Intermediate-document value node. +pub const Value = intermediate.Value; +/// Intermediate-document `key: value` field. +pub const Field = intermediate.Field; +/// Intermediate-document root schema. 
+pub const AssetDoc = intermediate.AssetDoc; + +// Pins so the inline tests of every sub-file are analysed +// (engine-zig-conventions.md §13 module-rooting guard). +comptime { + _ = asset_type; + _ = runtime_bin; + _ = intermediate; +} diff --git a/src/modules/asset_pipeline/format/runtime_bin.zig b/src/modules/asset_pipeline/format/runtime_bin.zig new file mode 100644 index 0000000..21169f8 --- /dev/null +++ b/src/modules/asset_pipeline/format/runtime_bin.zig @@ -0,0 +1,237 @@ +//! Runtime `..bin` zero-copy container — the 40-byte header and its +//! read / write / validate helpers. +//! +//! Layout frozen day 1 (M0.6, brief §Scope ▸ runtime format). 40 bytes, +//! 8-byte aligned, declared as an `extern struct` whose natural C layout +//! matches the on-disk bytes exactly: the explicit `_reserved` u32 at +//! offset 28 makes `hash` (u64) land at the 8-aligned offset 32, so +//! `@sizeOf == 40` with no implicit padding and the struct can be read +//! straight out of an mmap'd region with no repacking (E5). +//! +//! Endianness: the format is little-endian. Both Phase 0 targets +//! (x86_64, aarch64) are little-endian, so the in-memory `extern struct` +//! and the on-disk bytes coincide; `read` / `writeTo` are nonetheless +//! explicit per-field so a future big-endian port has a single place to +//! byte-swap. + +const std = @import("std"); +const AssetType = @import("asset_type.zig").AssetType; + +/// File signature occupying the first 4 bytes of every `.bin`. +pub const magic: [4]u8 = .{ 'W', 'E', 'L', 'D' }; + +/// Total on-disk header size, in bytes. Frozen day 1. +pub const header_size: usize = 40; + +/// Header format version. Bumped only on a breaking header-layout change +/// (payload-shape changes use the per-category payload `version`, not this). +pub const current_version: u16 = 1; + +/// Target platform a `.bin` was cooked for. Stored in the header +/// `platform` field. 
Non-exhaustive so future platforms read back without +/// a format bump; M0.6 only cooks `pc`. +pub const Platform = enum(u16) { + /// Desktop PC (Vulkan / D3D12). + pc = 0, + /// Mobile (Android / iOS). + mobile = 1, + /// Web (WebGPU). + web = 2, + _, + + /// Numeric value stored in the header `platform` field. + pub fn toU16(self: Platform) u16 { + return @intFromEnum(self); + } +}; + +/// Errors surfaced by `read`. +pub const ReadError = error{ + /// The input slice is shorter than `header_size`. + ShortBuffer, + /// The first 4 bytes are not the `WELD` signature. + BadMagic, +}; + +/// 40-byte runtime header. Field order and offsets are frozen day 1; the +/// `comptime` block below pins them so a refactor that reorders fields or +/// changes a type fails to compile rather than silently breaking the +/// on-disk format. +pub const RuntimeHeader = extern struct { + /// `WELD` signature. + magic: [4]u8, + /// Header format version (`current_version`). + version: u16, + /// Asset category (`AssetType` value via `assetType`). + asset_type: u16, + /// Target platform (`Platform` value). + platform: u16, + /// Reserved bit flags; zero in M0.6. + flags: u16, + /// Byte offset of the bulk data section from the start of the file. + data_offset: u32, + /// Byte length of the bulk data section. + data_size: u32, + /// Byte offset of the metadata section from the start of the file. + metadata_offset: u32, + /// Byte length of the metadata section. + metadata_size: u32, + /// Reserved (zero-filled); makes `hash` land at the 8-aligned offset + /// 32. Reserved for future small fields (additive, no version bump). + _reserved: u32, + /// Content hash of the cooked payload (cache / integrity check). 
+ hash: u64, + + comptime { + std.debug.assert(@sizeOf(RuntimeHeader) == header_size); + std.debug.assert(@alignOf(RuntimeHeader) == 8); + std.debug.assert(@offsetOf(RuntimeHeader, "magic") == 0); + std.debug.assert(@offsetOf(RuntimeHeader, "version") == 4); + std.debug.assert(@offsetOf(RuntimeHeader, "asset_type") == 6); + std.debug.assert(@offsetOf(RuntimeHeader, "platform") == 8); + std.debug.assert(@offsetOf(RuntimeHeader, "flags") == 10); + std.debug.assert(@offsetOf(RuntimeHeader, "data_offset") == 12); + std.debug.assert(@offsetOf(RuntimeHeader, "data_size") == 16); + std.debug.assert(@offsetOf(RuntimeHeader, "metadata_offset") == 20); + std.debug.assert(@offsetOf(RuntimeHeader, "metadata_size") == 24); + std.debug.assert(@offsetOf(RuntimeHeader, "_reserved") == 28); + std.debug.assert(@offsetOf(RuntimeHeader, "hash") == 32); + } + + /// Construct a header with the `WELD` magic filled in and `_reserved` + /// zeroed. `version` defaults to `current_version` and `platform` to + /// `.pc`. + pub fn init(params: struct { + asset_type: AssetType, + data_offset: u32, + data_size: u32, + metadata_offset: u32, + metadata_size: u32, + hash: u64, + version: u16 = current_version, + platform: Platform = .pc, + flags: u16 = 0, + }) RuntimeHeader { + return .{ + .magic = magic, + .version = params.version, + .asset_type = params.asset_type.toU16(), + .platform = params.platform.toU16(), + .flags = params.flags, + .data_offset = params.data_offset, + .data_size = params.data_size, + .metadata_offset = params.metadata_offset, + .metadata_size = params.metadata_size, + ._reserved = 0, + .hash = params.hash, + }; + } + + /// Decode the stored `asset_type` field, or `null` if it is not a known + /// `AssetType` variant. + pub fn assetType(self: RuntimeHeader) ?AssetType { + return AssetType.fromU16(self.asset_type); + } + + /// Serialize the header into a fixed 40-byte buffer, little-endian. 
+ pub fn writeTo(self: RuntimeHeader, buf: *[header_size]u8) void { + @memcpy(buf[0..4], &self.magic); + std.mem.writeInt(u16, buf[4..6], self.version, .little); + std.mem.writeInt(u16, buf[6..8], self.asset_type, .little); + std.mem.writeInt(u16, buf[8..10], self.platform, .little); + std.mem.writeInt(u16, buf[10..12], self.flags, .little); + std.mem.writeInt(u32, buf[12..16], self.data_offset, .little); + std.mem.writeInt(u32, buf[16..20], self.data_size, .little); + std.mem.writeInt(u32, buf[20..24], self.metadata_offset, .little); + std.mem.writeInt(u32, buf[24..28], self.metadata_size, .little); + std.mem.writeInt(u32, buf[28..32], self._reserved, .little); + std.mem.writeInt(u64, buf[32..40], self.hash, .little); + } + + /// Serialize the header into a freshly returned 40-byte array. + pub fn toBytes(self: RuntimeHeader) [header_size]u8 { + var buf: [header_size]u8 = undefined; + self.writeTo(&buf); + return buf; + } + + /// Parse and validate a header from the front of `bytes`, little-endian. + /// Validates length and magic only; the caller checks `version` / + /// `assetType` against what it expects. 
+ /// + /// Errors: + /// - `error.ShortBuffer` if `bytes.len < header_size` + /// - `error.BadMagic` if the signature is not `WELD` + pub fn read(bytes: []const u8) ReadError!RuntimeHeader { + if (bytes.len < header_size) return error.ShortBuffer; + if (!std.mem.eql(u8, bytes[0..4], &magic)) return error.BadMagic; + return .{ + .magic = magic, + .version = std.mem.readInt(u16, bytes[4..6], .little), + .asset_type = std.mem.readInt(u16, bytes[6..8], .little), + .platform = std.mem.readInt(u16, bytes[8..10], .little), + .flags = std.mem.readInt(u16, bytes[10..12], .little), + .data_offset = std.mem.readInt(u32, bytes[12..16], .little), + .data_size = std.mem.readInt(u32, bytes[16..20], .little), + .metadata_offset = std.mem.readInt(u32, bytes[20..24], .little), + .metadata_size = std.mem.readInt(u32, bytes[24..28], .little), + ._reserved = std.mem.readInt(u32, bytes[28..32], .little), + .hash = std.mem.readInt(u64, bytes[32..40], .little), + }; + } +}; + +test "runtime header round-trips through its bytes" { + const original = RuntimeHeader.init(.{ + .asset_type = .texture, + .data_offset = header_size, + .data_size = 4096, + .metadata_offset = header_size + 4096, + .metadata_size = 64, + .hash = 0xDEADBEEFCAFEF00D, + }); + + const bytes = original.toBytes(); + try std.testing.expectEqual(@as(usize, header_size), bytes.len); + + const parsed = try RuntimeHeader.read(&bytes); + try std.testing.expectEqual(original, parsed); + try std.testing.expectEqual(AssetType.texture, parsed.assetType().?); + try std.testing.expectEqual(@as(u16, current_version), parsed.version); + try std.testing.expectEqual(@as(u32, 0), parsed._reserved); +} + +test "runtime header layout is frozen at 40 bytes" { + try std.testing.expectEqual(@as(usize, 40), @sizeOf(RuntimeHeader)); + try std.testing.expectEqual(@as(usize, 8), @alignOf(RuntimeHeader)); + try std.testing.expectEqual(@as(usize, 32), @offsetOf(RuntimeHeader, "hash")); + try std.testing.expectEqual(@as(usize, 28), 
@offsetOf(RuntimeHeader, "_reserved")); + + const bytes = RuntimeHeader.init(.{ + .asset_type = .mesh, + .data_offset = 40, + .data_size = 0, + .metadata_offset = 40, + .metadata_size = 0, + .hash = 0, + }).toBytes(); + try std.testing.expectEqualSlices(u8, "WELD", bytes[0..4]); +} + +test "runtime header read rejects a short buffer" { + const short = [_]u8{ 'W', 'E', 'L', 'D' } ++ [_]u8{0} ** 10; + try std.testing.expectError(error.ShortBuffer, RuntimeHeader.read(&short)); +} + +test "runtime header read rejects a bad magic" { + var bytes = RuntimeHeader.init(.{ + .asset_type = .audio, + .data_offset = 40, + .data_size = 0, + .metadata_offset = 40, + .metadata_size = 0, + .hash = 0, + }).toBytes(); + bytes[1] = 'X'; + try std.testing.expectError(error.BadMagic, RuntimeHeader.read(&bytes)); +} From a23bafae82bb5cdcc3573fa3e6fa674dc583dafd Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 21:33:28 +0200 Subject: [PATCH 07/29] feat(assets): add AssetHandle and generation-checked registry --- .../asset_pipeline/registry/Registry.zig | 256 ++++++++++++++++++ .../asset_pipeline/registry/asset_handle.zig | 76 ++++++ src/modules/asset_pipeline/registry/root.zig | 18 ++ 3 files changed, 350 insertions(+) create mode 100644 src/modules/asset_pipeline/registry/Registry.zig create mode 100644 src/modules/asset_pipeline/registry/asset_handle.zig create mode 100644 src/modules/asset_pipeline/registry/root.zig diff --git a/src/modules/asset_pipeline/registry/Registry.zig b/src/modules/asset_pipeline/registry/Registry.zig new file mode 100644 index 0000000..9e20be1 --- /dev/null +++ b/src/modules/asset_pipeline/registry/Registry.zig @@ -0,0 +1,256 @@ +//! Asset registry — owns the slot table that `AssetHandle`s index into, +//! plus per-slot refcount and generation. +//! +//! M0.6 scope (brief §Scope): handle allocation, refcount, and a generation +//! bump on unload so a stale handle is detectable. Mirrors the ECS +//! 
`EntityIdentityStore` (slot table + free-index stack + generation), +//! adding the refcount and the `type_tag` carried by `AssetHandle`. Payload +//! binding (the loaded bytes) is deliberately out of scope here — it lands +//! with the async loader (E5). +//! +//! Unmanaged: the registry stores no allocator (Asset Pipeline is not on the +//! `engine-zig-conventions.md` §3 allocator-storing whitelist); the +//! allocator is passed to every mutating call. + +const std = @import("std"); +const AssetHandle = @import("asset_handle.zig").AssetHandle; +const AssetType = @import("../format/asset_type.zig").AssetType; + +const Registry = @This(); + +/// Per-slot table, indexed by `AssetHandle.index`. +slots: std.ArrayListUnmanaged(Slot) = .empty, +/// Stack of freed slot indices available for recycling. +free_indices: std.ArrayListUnmanaged(u32) = .empty, + +/// One row of the slot table. Private — consumers go through the verbs. +const Slot = struct { + /// Current generation; bumped on unload so outstanding handles to the + /// previous occupant fail `liveIndex`. + generation: u16, + /// `AssetType` value of the current occupant. + type_tag: u16, + /// Strong reference count; the slot unloads when it reaches 0. + refcount: u32, + /// `true` while the slot points at a live asset. + alive: bool, +}; + +/// Errors surfaced by the mutating verbs. +pub const Error = error{ + /// The handle no longer resolves (freed slot, generation mismatch, + /// type-tag mismatch, or out-of-range index). + StaleHandle, + /// The slot table or free list could not grow. + OutOfMemory, +}; + +/// A resolved view of a live slot returned by `resolve`. +pub const Resolved = struct { + /// Raw `type_tag` stored at allocation. + type_tag: u16, + /// Current strong reference count. + refcount: u32, + /// Decoded `AssetType`, or `null` if the tag is not a known variant. + asset_type: ?AssetType, +}; + +/// Create an empty registry. No allocation happens until the first `alloc`. 
+pub fn init() Registry { + return .{}; +} + +/// Release the slot table + free list and poison `self`. +pub fn deinit(self: *Registry, gpa: std.mem.Allocator) void { + self.slots.deinit(gpa); + self.free_indices.deinit(gpa); + self.* = undefined; +} + +/// Reserve a fresh handle for an asset of `asset_type`, with refcount 1. +/// Recycles a freed slot (carrying the bumped generation) when one is +/// available, otherwise appends a new slot at generation 0. +/// +/// Errors: `error.OutOfMemory` if the slot table needs to grow. +pub fn alloc(self: *Registry, gpa: std.mem.Allocator, asset_type: AssetType) Error!AssetHandle { + const tag = asset_type.toU16(); + if (self.free_indices.pop()) |idx| { + const slot = &self.slots.items[idx]; + std.debug.assert(!slot.alive); + slot.alive = true; + slot.type_tag = tag; + slot.refcount = 1; + return .{ .index = idx, .generation = slot.generation, .type_tag = tag }; + } + const idx: u32 = @intCast(self.slots.items.len); + try self.slots.append(gpa, .{ .generation = 0, .type_tag = tag, .refcount = 1, .alive = true }); + return .{ .index = idx, .generation = 0, .type_tag = tag }; +} + +/// Resolve a handle to a live-slot view, or `null` if it is stale (freed, +/// generation mismatch, type-tag mismatch, or out-of-range index). +pub fn resolve(self: *const Registry, handle: AssetHandle) ?Resolved { + const idx = self.liveIndex(handle) orelse return null; + const slot = self.slots.items[idx]; + return .{ + .type_tag = slot.type_tag, + .refcount = slot.refcount, + .asset_type = AssetType.fromU16(slot.type_tag), + }; +} + +/// `true` if `handle` currently resolves to a live asset. +pub fn isAlive(self: *const Registry, handle: AssetHandle) bool { + return self.liveIndex(handle) != null; +} + +/// Current strong reference count, or `null` if the handle is stale. 
+pub fn refCount(self: *const Registry, handle: AssetHandle) ?u32 { + const idx = self.liveIndex(handle) orelse return null; + return self.slots.items[idx].refcount; +} + +/// Add a strong reference. Errors `error.StaleHandle` if the handle no +/// longer resolves. +pub fn retain(self: *Registry, handle: AssetHandle) Error!void { + const idx = self.liveIndex(handle) orelse return error.StaleHandle; + self.slots.items[idx].refcount += 1; +} + +/// Drop a strong reference; unloads the slot (bumping its generation) when +/// the count reaches 0. Errors: `error.StaleHandle` if the handle no longer +/// resolves; `error.OutOfMemory` if the free list cannot grow. +pub fn release(self: *Registry, gpa: std.mem.Allocator, handle: AssetHandle) Error!void { + const idx = self.liveIndex(handle) orelse return error.StaleHandle; + const slot = &self.slots.items[idx]; + std.debug.assert(slot.refcount > 0); + slot.refcount -= 1; + if (slot.refcount == 0) try self.freeSlot(gpa, idx); +} + +/// Force-unload the slot regardless of refcount, bumping its generation so +/// every outstanding handle becomes stale. Errors: `error.StaleHandle` if +/// the handle is stale; `error.OutOfMemory` if the free list cannot grow. +pub fn unload(self: *Registry, gpa: std.mem.Allocator, handle: AssetHandle) Error!void { + const idx = self.liveIndex(handle) orelse return error.StaleHandle; + try self.freeSlot(gpa, idx); +} + +/// Number of currently live slots. O(n) — for tests / diagnostics. +pub fn liveCount(self: *const Registry) usize { + var n: usize = 0; + for (self.slots.items) |slot| { + if (slot.alive) n += 1; + } + return n; +} + +/// Return the slot index `handle` refers to, or `null` if it is stale. +fn liveIndex(self: *const Registry, handle: AssetHandle) ?u32 { + if (handle.index >= self.slots.items.len) return null; + const slot = self.slots.items[handle.index]; + if (!slot.alive or slot.generation != handle.generation or slot.type_tag != handle.type_tag) { + return null; + } + return handle.index; +} + +/// Mark a slot dead, bump its generation, and push it onto the free list. 
+/// Generation arithmetic wraps — only at risk after 64 K reuses of the same +/// slot, past the Phase 0 horizon. +fn freeSlot(self: *Registry, gpa: std.mem.Allocator, idx: u32) Error!void { + const slot = &self.slots.items[idx]; + slot.alive = false; + slot.refcount = 0; + slot.generation +%= 1; + try self.free_indices.append(gpa, idx); +} + +test "alloc returns a live, type-tagged handle with refcount 1" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const h = try reg.alloc(gpa, .texture); + try std.testing.expect(reg.isAlive(h)); + try std.testing.expectEqual(@as(u32, 1), reg.refCount(h).?); + try std.testing.expectEqual(AssetType.texture, reg.resolve(h).?.asset_type.?); + try std.testing.expectEqual(AssetType.texture, h.assetType().?); + try std.testing.expectEqual(@as(usize, 1), reg.liveCount()); +} + +test "retain and release adjust the refcount without unloading" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const h = try reg.alloc(gpa, .mesh); + try reg.retain(h); + try std.testing.expectEqual(@as(u32, 2), reg.refCount(h).?); + try reg.release(gpa, h); + try std.testing.expectEqual(@as(u32, 1), reg.refCount(h).?); + try std.testing.expect(reg.isAlive(h)); +} + +test "release to zero unloads and invalidates the handle" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const h = try reg.alloc(gpa, .audio); + try reg.release(gpa, h); + try std.testing.expect(!reg.isAlive(h)); + try std.testing.expectEqual(@as(?Resolved, null), reg.resolve(h)); + try std.testing.expectError(error.StaleHandle, reg.retain(h)); + try std.testing.expectEqual(@as(usize, 0), reg.liveCount()); +} + +test "explicit unload invalidates outstanding handles regardless of refcount" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const h = try reg.alloc(gpa, .texture); + try reg.retain(h); // 
refcount 2 + try reg.unload(gpa, h); + try std.testing.expect(!reg.isAlive(h)); + try std.testing.expectError(error.StaleHandle, reg.unload(gpa, h)); +} + +test "freed slot is reused with a strictly bumped generation" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const a = try reg.alloc(gpa, .texture); + try std.testing.expectEqual(@as(u16, 0), a.generation); + try reg.unload(gpa, a); + + const b = try reg.alloc(gpa, .mesh); + try std.testing.expectEqual(a.index, b.index); + try std.testing.expect(b.generation > a.generation); + try std.testing.expect(reg.isAlive(b)); + try std.testing.expect(!reg.isAlive(a)); +} + +test "a handle with a mismatched type_tag does not resolve" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const h = try reg.alloc(gpa, .texture); + var wrong = h; + wrong.type_tag = AssetType.mesh.toU16(); + try std.testing.expect(!reg.isAlive(wrong)); + try std.testing.expect(reg.isAlive(h)); +} + +test "out-of-range handle is treated as stale" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const bogus = AssetHandle{ .index = 999, .generation = 0, .type_tag = AssetType.texture.toU16() }; + try std.testing.expect(!reg.isAlive(bogus)); + try std.testing.expectError(error.StaleHandle, reg.retain(bogus)); +} diff --git a/src/modules/asset_pipeline/registry/asset_handle.zig b/src/modules/asset_pipeline/registry/asset_handle.zig new file mode 100644 index 0000000..820ea38 --- /dev/null +++ b/src/modules/asset_pipeline/registry/asset_handle.zig @@ -0,0 +1,76 @@ +//! `AssetHandle` — the 64-bit typed, generation-checked asset reference. +//! +//! Layout frozen day 1 (M0.6, brief §Scope). `packed struct(u64)` fixes the +//! field order low-to-high: `index` (u32) addresses the registry slot table, +//! `generation` (u16) detects use-after-unload of a stale handle, `type_tag` +//! 
(u16) carries the `AssetType` so a handle can be type-checked without a +//! registry lookup. Named `AssetHandle`, not `AssetRef` +//! (`engine-asset-pipeline.md` §8). + +const std = @import("std"); +const AssetType = @import("../format/asset_type.zig").AssetType; + +/// 64-bit asset reference. Always 8 bytes; the `packed struct(u64)` layout +/// is the frozen surface — `@as(u64, @bitCast(h))` is stable on-the-wire. +pub const AssetHandle = packed struct(u64) { + /// Registry slot index. + index: u32, + /// Slot generation at allocation time; bumped on unload so any + /// outstanding handle to the previous occupant fails resolution. + generation: u16, + /// `AssetType` value (`@intFromEnum`) of the referenced asset. + type_tag: u16, + + /// Bit pattern reserved for "no asset". Never produced by + /// `Registry.alloc` — `index = maxInt(u32)` would require 4 G slots. + pub const none = AssetHandle{ + .index = std.math.maxInt(u32), + .generation = std.math.maxInt(u16), + .type_tag = std.math.maxInt(u16), + }; + + /// Pack the handle into its raw 64-bit representation. + pub fn toU64(self: AssetHandle) u64 { + return @bitCast(self); + } + + /// Reconstruct a handle from its raw 64-bit representation. + pub fn fromU64(bits: u64) AssetHandle { + return @bitCast(bits); + } + + /// Decode the `type_tag` into an `AssetType`, or `null` if it is not a + /// known variant. + pub fn assetType(self: AssetHandle) ?AssetType { + return AssetType.fromU16(self.type_tag); + } + + /// `true` if both handles refer to the same slot, generation, and type. 
+ pub fn eql(a: AssetHandle, b: AssetHandle) bool { + return a.toU64() == b.toU64(); + } +}; + +test "asset handle is exactly 64 bits" { + try std.testing.expectEqual(@as(usize, 8), @sizeOf(AssetHandle)); + try std.testing.expectEqual(@as(usize, 64), @bitSizeOf(AssetHandle)); +} + +test "asset handle packs and unpacks losslessly" { + const h = AssetHandle{ .index = 7, .generation = 3, .type_tag = AssetType.texture.toU16() }; + const bits = h.toU64(); + const back = AssetHandle.fromU64(bits); + try std.testing.expect(h.eql(back)); + try std.testing.expectEqual(AssetType.texture, back.assetType().?); +} + +test "asset handle low-to-high field packing matches the frozen layout" { + const h = AssetHandle{ .index = 0x11223344, .generation = 0x5566, .type_tag = 0x7788 }; + // index in the low 32 bits, generation in [32,48), type_tag in [48,64). + try std.testing.expectEqual(@as(u64, 0x7788_5566_11223344), h.toU64()); +} + +test "asset handle none sentinel is distinct from a real handle" { + const real = AssetHandle{ .index = 0, .generation = 0, .type_tag = 0 }; + try std.testing.expect(!real.eql(AssetHandle.none)); +} diff --git a/src/modules/asset_pipeline/registry/root.zig b/src/modules/asset_pipeline/registry/root.zig new file mode 100644 index 0000000..3b26597 --- /dev/null +++ b/src/modules/asset_pipeline/registry/root.zig @@ -0,0 +1,18 @@ +//! Asset Pipeline `registry/` namespace — the `AssetHandle` and the slot +//! `Registry` that tracks handle allocation, refcount, and generation +//! invalidation on unload. + +const asset_handle = @import("asset_handle.zig"); + +/// 64-bit typed, generation-checked asset reference. +pub const AssetHandle = asset_handle.AssetHandle; + +/// Slot registry (refcount + generation invalidation). File-as-type. +pub const Registry = @import("Registry.zig"); + +// Pins so the inline tests of every sub-file are analysed +// (engine-zig-conventions.md §13 module-rooting guard). 
+comptime { + _ = asset_handle; + _ = Registry; +} From ab68defb307027a6eb1ade8362d0bd991b2a76ab Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 21:33:35 +0200 Subject: [PATCH 08/29] feat(assets): wire asset_pipeline module and E1 handle test --- build.zig | 28 ++++++++++++++++++ src/modules/asset_pipeline/root.zig | 41 ++++++++++++++++++++++++++ tests/assets/handle_generation.zig | 45 +++++++++++++++++++++++++++++ 3 files changed, 114 insertions(+) create mode 100644 src/modules/asset_pipeline/root.zig create mode 100644 tests/assets/handle_generation.zig diff --git a/build.zig b/build.zig index bad2b2e..f9b21a3 100644 --- a/build.zig +++ b/build.zig @@ -78,6 +78,21 @@ pub fn build(b: *std.Build) void { }); render_module.addImport("weld_core", core_module); + // M0.6 — `weld_asset_pipeline` module: the Tier 1 Asset Pipeline. E1 + // ships the day-1-frozen on-disk surfaces (intermediate + // `.asset.etch` schema + runtime `..bin` 40-byte header), + // the `AssetHandle`, and the slot registry (refcount + generation + // invalidation). Depends on `weld_core` (the E5 async loader consumes the + // Tier 0 job system); the `foundation` (SIMD) import wires in at E2 when + // the `adler32` / `paeth` kernels land. No `weld_etch` dependency + // (brief §Out-of-scope). + const asset_pipeline_module = b.createModule(.{ + .root_source_file = b.path("src/modules/asset_pipeline/root.zig"), + .target = target, + .optimize = optimize, + }); + asset_pipeline_module.addImport("weld_core", core_module); + // M0.2 / E6 — plugin loader ABI module shared with the stub // plugin sub-projects under `tests/core/plugin_loader/stub_plugin/`. // Exposes the C ABI types from `desc.zig` (no `WeldAPI` itself, @@ -209,6 +224,12 @@ pub fn build(b: *std.Build) void { const etch_tests = b.addTest(.{ .root_module = etch_module }); test_step.dependOn(&b.addRunArtifact(etch_tests).step); + // M0.6 — inline tests inside src/modules/asset_pipeline/**. 
The module + // root re-exports format/ and registry/, so every sub-file is reachable + // and its inline tests run (engine-zig-conventions.md §13). + const asset_pipeline_tests = b.addTest(.{ .root_module = asset_pipeline_module }); + test_step.dependOn(&b.addRunArtifact(asset_pipeline_tests).step); + // Out-of-tree tests. Each file is its own root_module and imports // `weld_core` to reach the engine internals. // Out-of-tree bindings tests need to reach files that live outside @@ -276,6 +297,8 @@ pub fn build(b: *std.Build) void { /// M0.4 — when set, imports the `weld_render` module (GAL public /// surface + Null backend, Vulkan backend wires in later). render: bool = false, + /// M0.6 — when set, imports the `weld_asset_pipeline` module. + asset_pipeline: bool = false, /// M0.4 stabilization — when set, create a dedicated `zig build /// ` step that runs ONLY this test. Used by the CI /// runtime-smoke-test job to gate strictly on the capture PSNR @@ -390,6 +413,8 @@ pub fn build(b: *std.Build) void { // M0.4 — vk_gen *Raw variants emission (3 targets emitted, others // not emitted). .{ .path = "tests/vk_gen/raw_variants.zig" }, + // M0.6 / E1 — asset registry stale-handle (generation) acceptance. + .{ .path = "tests/assets/handle_generation.zig", .asset_pipeline = true }, }; for (test_specs) |spec| { const t_mod = b.createModule(.{ @@ -417,6 +442,9 @@ pub fn build(b: *std.Build) void { if (spec.render) { t_mod.addImport("weld_render", render_module); } + if (spec.asset_pipeline) { + t_mod.addImport("weld_asset_pipeline", asset_pipeline_module); + } const t = b.addTest(.{ .root_module = t_mod }); const t_run = b.addRunArtifact(t); if (spec.needs_stub_plugins) { diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig new file mode 100644 index 0000000..a129614 --- /dev/null +++ b/src/modules/asset_pipeline/root.zig @@ -0,0 +1,41 @@ +//! Asset Pipeline module (Tier 1) — public entry point. +//! +//! 
M0.6 delivers the minimal Phase 0 pipeline in staged gates. E1 ships the +//! day-1-frozen surfaces every later stage builds on: +//! - `format` — the intermediate `.asset.etch` schema and the runtime +//! `..bin` 40-byte header. +//! - `registry` — `AssetHandle` and the slot `Registry` (refcount + +//! generation invalidation). +//! +//! Later gates add `codecs/` (DEFLATE, PNG, glTF), `importers/`, `cookers/`, +//! `cache/`, and the async `loader/`. The public surface here stays the +//! frozen contract; implementation behind it evolves. +//! +//! Dependencies: `core` (E5 loader consumes the Tier 0 job system) and, from +//! E2, `foundation` (the SIMD kernels). No `weld_etch` dependency +//! (brief §Out-of-scope). + +/// On-disk format surfaces: `AssetType`, the runtime `.bin` header, the +/// intermediate `.asset.etch` document model + reader/writer. +pub const format = @import("format/root.zig"); + +/// Asset identity: `AssetHandle` + the slot `Registry`. +pub const registry = @import("registry/root.zig"); + +/// 64-bit typed asset handle (convenience re-export). +pub const AssetHandle = registry.AssetHandle; +/// Slot registry (convenience re-export). +pub const Registry = registry.Registry; +/// Frozen asset category enum (convenience re-export). +pub const AssetType = format.AssetType; +/// Runtime `.bin` header (convenience re-export). +pub const RuntimeHeader = format.RuntimeHeader; +/// Intermediate-format root schema (convenience re-export). +pub const AssetDoc = format.AssetDoc; + +// Pins so the inline tests of both namespaces are analysed when the module +// is built as a test target (engine-zig-conventions.md §13). +comptime { + _ = format; + _ = registry; +} diff --git a/tests/assets/handle_generation.zig b/tests/assets/handle_generation.zig new file mode 100644 index 0000000..8a31136 --- /dev/null +++ b/tests/assets/handle_generation.zig @@ -0,0 +1,45 @@ +//! M0.6 / E1 — asset registry stale-handle acceptance test. +//! +//! 
Covers the E1 acceptance criterion (brief §Acceptance ▸ Tests): +//! `test "stale handle after unload is rejected"` — allocate a handle +//! ("load"), capture it, unload, and assert the captured handle no longer +//! resolves (generation mismatch). +//! +//! E1 exercises this at the registry surface (the day-1-frozen identity +//! layer). The full importer → cook → load → unload round-trip wires this +//! same registry into the async loader at E5. + +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const Registry = assets.Registry; +const AssetType = assets.AssetType; + +test "stale handle after unload is rejected" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + // "load" — allocate a handle for a texture asset (refcount 1). + const handle = try reg.alloc(gpa, .texture); + try std.testing.expect(reg.isAlive(handle)); + try std.testing.expectEqual(AssetType.texture, reg.resolve(handle).?.asset_type.?); + + // "unload" — frees the slot and bumps its generation. + try reg.unload(gpa, handle); + + // The captured handle is now stale: it resolves to nothing and every + // verb that takes it reports a stale handle. + try std.testing.expect(!reg.isAlive(handle)); + try std.testing.expectEqual(@as(?Registry.Resolved, null), reg.resolve(handle)); + try std.testing.expectError(error.StaleHandle, reg.retain(handle)); + + // Re-allocating recycles the same slot with a different generation, and + // the stale handle stays rejected even though the index now points at a + // live asset again. 
+ const fresh = try reg.alloc(gpa, .texture); + try std.testing.expectEqual(handle.index, fresh.index); + try std.testing.expect(fresh.generation != handle.generation); + try std.testing.expect(reg.isAlive(fresh)); + try std.testing.expect(!reg.isAlive(handle)); +} From 41b528579580bbca9a690419df657510dcde468a Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Wed, 3 Jun 2026 21:34:21 +0200 Subject: [PATCH 09/29] docs(brief): journal update --- briefs/m0.6-assets.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index cf16667..5053e7a 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -205,6 +205,8 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-03 18:40 — E1 design: studied `build.zig` wiring (per-test `createModule` + `TestSpec` flags pattern; modules via `addImport`). Per the FROZEN `build.zig` bullet, `asset_pipeline` deps are `foundation` + `core` only (no `weld_etch`) → the intermediate `.asset.etch` will use a self-contained minimal Etch-syntax writer/reader, not the S3 Etch parser. Flagged for confirmation at the E1 gate. - 2026-06-03 18:43 — E1 BLOCKED on the runtime `.bin` header size (frozen-day-1 surface contradiction — see Blockers). Stopped E1 implementation per the blocker protocol; nothing built on an unconfirmed frozen format (7e98a03). - 2026-06-03 21:14 — Header blocker RESOLVED by Claude.ai ruling (Cas 2, brief patched): header = 40 bytes, 8-byte aligned, explicit `_reserved` u32 @28, `hash` u64 @32, `@sizeOf == 40`, no implicit padding. FROZEN SECTION replaced verbatim (header bullet + Out-of-scope no-`weld_etch` + 2 Notes); deviation recorded. Secondary point confirmed: ad-hoc Etch-subset reader/writer, no `weld_etch`, must emit a valid `etch-grammar.md` subset. Resuming E1. +- 2026-06-03 21:33 — E1 implemented (c3184cd, a23bafa, ab68def). 
`format/`: `AssetType` enum(u16) frozen values; `RuntimeHeader` `extern struct` (40 B, comptime `@offsetOf`/`@sizeOf` guards, explicit little-endian `read`/`writeTo`); `intermediate` `AssetDoc` + `Value`/`Field` model + ad-hoc Etch-subset `writeEtch`/`parseEtch` (round-trip tested). `registry/`: `AssetHandle` `packed struct(u64)`; `Registry` slot table (refcount + generation bump on unload), mirroring `EntityIdentityStore`. `build.zig`: `weld_asset_pipeline` module (dep `weld_core`; `foundation` at E2) + inline-test target + `asset_pipeline` `TestSpec` flag. E1 acceptance test `tests/assets/handle_generation.zig` green. +- 2026-06-03 21:33 — Two Zig 0.16 stdlib API fixes during E1: `std.meta.intToEnum` → `std.enums.fromInt`; ArrayList-writer string build → `std.Io.Writer.Allocating`. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. ## Recorded deviations From 51e15ee9dd1e0971faf781476352f32a62dc71e5 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 00:55:59 +0200 Subject: [PATCH 10/29] feat(simd): add foundation/simd skeleton and adler32 kernel --- src/foundation/root.zig | 12 +++ src/foundation/simd/bench/adler32_bench.zig | 63 +++++++++++++ src/foundation/simd/dispatch.zig | 21 +++++ src/foundation/simd/kernels/adler32.zig | 97 +++++++++++++++++++++ src/foundation/simd/portable.zig | 8 ++ src/foundation/simd/simd.zig | 40 +++++++++ src/foundation/simd/tests/adler32_test.zig | 47 ++++++++++ src/foundation/simd/tests/correctness.zig | 30 +++++++ src/foundation/simd/traits.zig | 58 ++++++++++++ 9 files changed, 376 insertions(+) create mode 100644 src/foundation/root.zig create mode 100644 src/foundation/simd/bench/adler32_bench.zig create mode 100644 src/foundation/simd/dispatch.zig create mode 100644 src/foundation/simd/kernels/adler32.zig create mode 100644 src/foundation/simd/portable.zig create mode 100644 src/foundation/simd/simd.zig create mode 100644 
src/foundation/simd/tests/adler32_test.zig create mode 100644 src/foundation/simd/tests/correctness.zig create mode 100644 src/foundation/simd/traits.zig diff --git a/src/foundation/root.zig b/src/foundation/root.zig new file mode 100644 index 0000000..7bca902 --- /dev/null +++ b/src/foundation/root.zig @@ -0,0 +1,12 @@ +//! Foundation — transversal sibling submodules consumed across the engine. +//! +//! Per `engine-spec.md` §3.5 and `engine-simd.md` §4, `math` and `simd` are +//! sibling submodules with no mutual dependency. M0.6 ships `simd` (the +//! batched-kernel module); `math` is added when its first consumer lands. + +/// Batched-SIMD kernels (adler32 in M0.6; audio mix, skinning, … later). +pub const simd = @import("simd/simd.zig"); + +comptime { + _ = simd; +} diff --git a/src/foundation/simd/bench/adler32_bench.zig b/src/foundation/simd/bench/adler32_bench.zig new file mode 100644 index 0000000..1b7987d --- /dev/null +++ b/src/foundation/simd/bench/adler32_bench.zig @@ -0,0 +1,63 @@ +//! ADLER32 throughput baseline (brief §Acceptance ▸ Benchmarks). +//! +//! Records the portable `@Vector` and scalar-reference throughput on a fixed +//! buffer. **Baseline only — no parity target.** ADLER32 sits on a cold +//! path (it runs once at cook time; the runtime mmaps the cooked `.bin`), so +//! a zlib-ng parity target is explicitly out of scope (brief §Notes — +//! cold-path principle). The number exists to track gross regressions, not +//! to chase a reference. +//! +//! Run: `zig build bench-adler32` (add `-- --smoke` for a tiny CI sanity run). 
+ +const std = @import("std"); +const simd = @import("foundation").simd; + +pub fn main(init: std.process.Init) !void { + const gpa = init.gpa; + const io = init.io; + const args = try init.minimal.args.toSlice(init.arena.allocator()); + + var smoke = false; + for (args[1..]) |a| { + if (std.mem.eql(u8, a, "--smoke")) smoke = true; + } + + const buf_len: usize = if (smoke) 64 * 1024 else 8 * 1024 * 1024; + const iterations: usize = if (smoke) 4 else 64; + + const buf = try gpa.alloc(u8, buf_len); + defer gpa.free(buf); + for (buf, 0..) |*b, i| b.* = @truncate(i *% 2_654_435_761); + + var stdout_buf: [4096]u8 = undefined; + var stdout_w = std.Io.File.stdout().writer(io, &stdout_buf); + const out = &stdout_w.interface; + + const vec_mbps = benchOne(simd.kernels.adler32.vectorized, buf, iterations, io); + const ref_mbps = benchOne(simd.kernels.adler32.reference, buf, iterations, io); + + try out.print("## adler32 — bench baseline\n\n", .{}); + try out.print("Buffer: {d} bytes, {d} iterations\n\n", .{ buf_len, iterations }); + try out.print("| Variant | Throughput (MB/s) |\n", .{}); + try out.print("|-----------|-------------------|\n", .{}); + try out.print("| portable | {d:.1} |\n", .{vec_mbps}); + try out.print("| reference | {d:.1} |\n", .{ref_mbps}); + try out.print("\n(baseline only — no parity target; cold path)\n", .{}); + try out.flush(); +} + +fn benchOne(comptime kernel: fn ([]const u8) u32, buf: []const u8, iterations: usize, io: std.Io) f64 { + const start = std.Io.Clock.Timestamp.now(io, .awake); + var sink: u32 = 0; + var it: usize = 0; + while (it < iterations) : (it += 1) { + sink ^= kernel(buf); + } + const elapsed_ns: i96 = start.untilNow(io).raw.nanoseconds; + std.mem.doNotOptimizeAway(sink); + + if (elapsed_ns <= 0) return 0; + const total_bytes: f64 = @floatFromInt(buf.len * iterations); + const seconds: f64 = @as(f64, @floatFromInt(@as(i64, @intCast(elapsed_ns)))) / 1_000_000_000.0; + return total_bytes / seconds / 1_000_000.0; +} diff --git 
a/src/foundation/simd/dispatch.zig b/src/foundation/simd/dispatch.zig new file mode 100644 index 0000000..36e3fc9 --- /dev/null +++ b/src/foundation/simd/dispatch.zig @@ -0,0 +1,21 @@ +//! Comptime variant dispatch. Each kernel resolves to the best +//! implementation available for the build target, decided at comptime so +//! the call site (`simd.adler32(data)`) carries no runtime branch. +//! +//! M0.6 ships portable `@Vector` kernels only, so every kernel resolves to +//! its portable variant. Arch variants (`arch_x86_64/`, `arch_aarch64/`) +//! slot into the `select*` functions below behind `traits.current` checks +//! in a later phase without touching any call site. + +const portable = @import("portable.zig"); + +/// ADLER32 entry point selected for the build target. +pub const adler32 = selectAdler32(); + +fn selectAdler32() fn (data: []const u8) u32 { + // No ISA-specific variants in M0.6 (brief §Notes — `@Vector` only). + // Future shape: + // if (traits.current.has_avx2) return @import("arch_x86_64/avx2.zig").adler32; + // if (traits.current.has_neon) return @import("arch_aarch64/neon.zig").adler32; + return portable.adler32; +} diff --git a/src/foundation/simd/kernels/adler32.zig b/src/foundation/simd/kernels/adler32.zig new file mode 100644 index 0000000..54ad91e --- /dev/null +++ b/src/foundation/simd/kernels/adler32.zig @@ -0,0 +1,97 @@ +//! ADLER32 checksum — the inaugural `foundation/simd` kernel. +//! +//! Two implementations live here: +//! - `reference` — the simplest correct scalar form (mod every byte). It is +//! the ground-truth oracle the portable path is validated against. +//! - `vectorized` — a portable `@Vector` form using the standard +//! weighted-sum decomposition over NMAX-bounded blocks. +//! +//! Per the brief (§Notes — inaugural SIMD kernels): `@Vector` plus a scalar +//! reference, **no ISA-specific asm**, **no zlib-ng parity chasing**. The +//! point is to validate the `foundation/simd` infrastructure (scalar-vs- +//! 
`@Vector` test pattern, dispatch layering), not to be fast — ADLER32 runs +//! once at cook time, never on a runtime hot path. + +const std = @import("std"); + +/// Largest prime smaller than 65536; the ADLER32 modulus. +pub const base: u32 = 65521; + +/// Block length between modulo reductions. This is zlib's NMAX, the largest +/// n whose deferred sums fit in 32 bits; the u64 intermediates have headroom. +pub const nmax: usize = 5552; + +/// Scalar reference implementation — the correctness oracle. Computes +/// ADLER32 of `data`, taking the modulo after every byte. +pub fn reference(data: []const u8) u32 { + var s1: u32 = 1; + var s2: u32 = 0; + for (data) |byte| { + s1 = (s1 + byte) % base; + s2 = (s2 + s1) % base; + } + return (s2 << 16) | s1; +} + +/// Portable `@Vector` implementation. Splits `data` into NMAX-bounded +/// blocks and, per block, computes `Σ D[j]` and `Σ j·D[j]` with vector +/// horizontal reductions, then folds them into the running `(s1, s2)` via +/// the closed form `s2' = s2 + n·s1 + (n·ΣD − Σj·D)`. +pub fn vectorized(data: []const u8) u32 { + const vlen = 16; + const Vu32 = @Vector(vlen, u32); + const lanes: Vu32 = std.simd.iota(u32, vlen); + + var s1: u32 = 1; + var s2: u32 = 0; + + var pos: usize = 0; + while (pos < data.len) { + const block_len = @min(data.len - pos, nmax); + const block = data[pos .. pos + block_len]; + + var sum_d: u64 = 0; // Σ D[j] over the block + var sum_jd: u64 = 0; // Σ j·D[j], j the block-local index + + var i: usize = 0; + while (i + vlen <= block_len) : (i += vlen) { + const bytes: @Vector(vlen, u8) = block[i..][0..vlen].*; + const widened: Vu32 = bytes; // element-wise u8 → u32 widening + const chunk_sum: u32 = @reduce(.Add, widened); + const lane_weighted: u32 = @reduce(.Add, widened * lanes); + // Global index j = i + lane, so Σ j·D = i·ΣD_chunk + Σ lane·D_chunk. 
+ sum_d += chunk_sum; + sum_jd += @as(u64, i) * chunk_sum + lane_weighted; + } + while (i < block_len) : (i += 1) { + const d = block[i]; + sum_d += d; + sum_jd += @as(u64, i) * d; + } + + const n: u64 = block_len; + const a: u64 = s1; + // b_final = b + n·a + Σ (n − j)·D[j] = b + n·a + (n·ΣD − Σj·D). + const b_next: u64 = s2 + n * a + (n * sum_d - sum_jd); + s1 = @intCast((a + sum_d) % base); + s2 = @intCast(b_next % base); + + pos += block_len; + } + + return (s2 << 16) | s1; +} + +test "vectorized adler32 equals the scalar reference on assorted sizes" { + const gpa = std.testing.allocator; + // Edge sizes around the NMAX block boundary and the VLEN stride. + const sizes = [_]usize{ 0, 1, 2, 15, 16, 17, 31, 32, 33, 255, 5551, 5552, 5553, 11104, 65535 }; + var prng = std.Random.DefaultPrng.init(0xA51E32); + const rand = prng.random(); + for (sizes) |n| { + const buf = try gpa.alloc(u8, n); + defer gpa.free(buf); + rand.bytes(buf); + try std.testing.expectEqual(reference(buf), vectorized(buf)); + } +} diff --git a/src/foundation/simd/portable.zig b/src/foundation/simd/portable.zig new file mode 100644 index 0000000..1c97d9a --- /dev/null +++ b/src/foundation/simd/portable.zig @@ -0,0 +1,8 @@ +//! Portable `@Vector` kernel variants — always present on every target, no +//! ISA-specific asm. `dispatch.zig` falls back here when no faster arch +//! variant exists for the build target (the only case in M0.6). + +const adler32_kernel = @import("kernels/adler32.zig"); + +/// Portable ADLER32 (`@Vector`). +pub const adler32 = adler32_kernel.vectorized; diff --git a/src/foundation/simd/simd.zig b/src/foundation/simd/simd.zig new file mode 100644 index 0000000..038606d --- /dev/null +++ b/src/foundation/simd/simd.zig @@ -0,0 +1,40 @@ +//! `foundation/simd` — public API of the batched-SIMD kernel module. +//! +//! Level-1 kernels (the default surface modules consume) take raw slices and +//! resolve their best variant at comptime via `dispatch`. M0.6 stands up the +//! 
infrastructure with one kernel — `adler32` — used by the Asset Pipeline +//! DEFLATE/zlib codec to verify the ADLER32 trailer. +//! +//! Boundary discipline (engine-simd.md §4): this module imports nothing but +//! `std`. Typed math objects are converted to raw slices at the call site, +//! never here. + +const dispatch = @import("dispatch.zig"); + +/// ADLER32 checksum of `data` (the dispatched best variant for the target). +pub const adler32 = dispatch.adler32; + +/// CPU capability bitflags + comptime detection. +pub const traits = @import("traits.zig"); + +/// Portable kernel variants (always present). +pub const portable = @import("portable.zig"); + +/// Comptime variant dispatch table. +pub const dispatch_table = dispatch; + +/// Kernel internals — exposed so tests and benches can reach the scalar +/// reference alongside the vectorized path. +pub const kernels = struct { + /// ADLER32 kernel (`reference` scalar oracle + `vectorized` `@Vector`). + pub const adler32 = @import("kernels/adler32.zig"); +}; + +// Pins so inline tests in the kernel + traits files are analysed when this +// module is built as a test target (engine-zig-conventions.md §13). +comptime { + _ = traits; + _ = portable; + _ = dispatch; + _ = kernels.adler32; +} diff --git a/src/foundation/simd/tests/adler32_test.zig b/src/foundation/simd/tests/adler32_test.zig new file mode 100644 index 0000000..f7d6d0e --- /dev/null +++ b/src/foundation/simd/tests/adler32_test.zig @@ -0,0 +1,47 @@ +//! ADLER32 kernel tests (brief §Acceptance ▸ Tests). +//! +//! - `test "adler32 portable equals reference"` — the dispatched `@Vector` +//! path matches the scalar oracle on a large, varied corpus. +//! - `test "adler32 known vectors"` — both paths match published ADLER32 +//! values, pinning the kernel to the RFC 1950 definition (not just to +//! internal self-consistency). 
+ +const std = @import("std"); +const simd = @import("foundation").simd; + +const reference = simd.kernels.adler32.reference; +const vectorized = simd.kernels.adler32.vectorized; + +test "adler32 portable equals reference" { + const gpa = std.testing.allocator; + var prng = std.Random.DefaultPrng.init(0x0AD1E32); + const rand = prng.random(); + + // Span several NMAX blocks plus odd tails to exercise the block seam and + // the scalar tail of the vector loop. + const sizes = [_]usize{ 0, 1, 7, 16, 100, 5552, 5553, 16_384, 100_000 }; + for (sizes) |n| { + const buf = try gpa.alloc(u8, n); + defer gpa.free(buf); + rand.bytes(buf); + const want = reference(buf); + try std.testing.expectEqual(want, vectorized(buf)); + // The public dispatched entry point must agree too. + try std.testing.expectEqual(want, simd.adler32(buf)); + } +} + +test "adler32 known vectors" { + const Case = struct { in: []const u8, want: u32 }; + const cases = [_]Case{ + .{ .in = "", .want = 0x0000_0001 }, + .{ .in = "a", .want = 0x0062_0062 }, + .{ .in = "abc", .want = 0x024D_0127 }, + .{ .in = "Wikipedia", .want = 0x11E6_0398 }, + }; + for (cases) |c| { + try std.testing.expectEqual(c.want, reference(c.in)); + try std.testing.expectEqual(c.want, vectorized(c.in)); + try std.testing.expectEqual(c.want, simd.adler32(c.in)); + } +} diff --git a/src/foundation/simd/tests/correctness.zig b/src/foundation/simd/tests/correctness.zig new file mode 100644 index 0000000..33a7232 --- /dev/null +++ b/src/foundation/simd/tests/correctness.zig @@ -0,0 +1,30 @@ +//! Cross-variant correctness harness (engine-simd.md §6). +//! +//! Every kernel variant available for the current target must produce +//! bit-identical output to the scalar reference. M0.6 has only the portable +//! `@Vector` variant (no ISA asm), so this compares the dispatched entry +//! point and the explicit portable variant against the reference. When arch +//! variants land, they are added to the comparison list here and validated +//! 
by the same corpus. + +const std = @import("std"); +const simd = @import("foundation").simd; + +const reference = simd.kernels.adler32.reference; + +test "every available adler32 variant matches the scalar reference" { + const gpa = std.testing.allocator; + var prng = std.Random.DefaultPrng.init(0xC0FFEE); + const rand = prng.random(); + + const sizes = [_]usize{ 0, 1, 3, 16, 17, 64, 4096, 5552, 12_345 }; + for (sizes) |n| { + const buf = try gpa.alloc(u8, n); + defer gpa.free(buf); + rand.bytes(buf); + const want = reference(buf); + // Variants available for the current target. + try std.testing.expectEqual(want, simd.adler32(buf)); // dispatched + try std.testing.expectEqual(want, simd.portable.adler32(buf)); // explicit portable + } +} diff --git a/src/foundation/simd/traits.zig b/src/foundation/simd/traits.zig new file mode 100644 index 0000000..5860a83 --- /dev/null +++ b/src/foundation/simd/traits.zig @@ -0,0 +1,58 @@ +//! Capability bitflags + comptime target detection for `foundation/simd`. +//! +//! M0.6 ships portable `@Vector` kernels only (no ISA-specific asm, +//! brief §Notes), so nothing dispatches on these flags yet. They exist to +//! stand up the dispatch infrastructure: `dispatch.zig` will branch on +//! `traits.current` once arch variants land in a later phase. + +const std = @import("std"); +const builtin = @import("builtin"); + +/// CPU SIMD capabilities resolved at build time from `builtin.cpu`. +pub const Capabilities = struct { + /// x86_64 SSE4.1. + has_sse4_1: bool = false, + /// x86_64 AVX2. + has_avx2: bool = false, + /// x86_64 AVX-512 Foundation. + has_avx512f: bool = false, + /// x86_64 fused multiply-add. + has_fma: bool = false, + /// aarch64 Advanced SIMD (NEON). + has_neon: bool = false, + /// aarch64 SVE2. + has_sve2: bool = false, +}; + +/// Resolve the target's SIMD capabilities from `builtin.cpu`. Evaluated at +/// comptime (it reads comptime-known CPU features); also callable in tests. 
+pub fn detect() Capabilities { + var c = Capabilities{}; + const cpu = builtin.cpu; + switch (cpu.arch) { + .x86_64 => { + c.has_sse4_1 = std.Target.x86.featureSetHas(cpu.features, .sse4_1); + c.has_avx2 = std.Target.x86.featureSetHas(cpu.features, .avx2); + c.has_avx512f = std.Target.x86.featureSetHas(cpu.features, .avx512f); + c.has_fma = std.Target.x86.featureSetHas(cpu.features, .fma); + }, + .aarch64, .aarch64_be => { + c.has_neon = std.Target.aarch64.featureSetHas(cpu.features, .neon); + c.has_sve2 = std.Target.aarch64.featureSetHas(cpu.features, .sve2); + }, + else => {}, + } + return c; +} + +/// Capabilities of the current build target. +pub const current: Capabilities = detect(); + +test "capability detection is consistent with the build target" { + try std.testing.expectEqual(detect(), current); + // A64 mandates NEON, so it must be reported on every aarch64 build. + switch (builtin.cpu.arch) { + .aarch64, .aarch64_be => try std.testing.expect(current.has_neon), + else => {}, + } +} From 35ae269cbaf48d50e37de92f117a5e4813c05212 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 00:56:06 +0200 Subject: [PATCH 11/29] feat(assets): add native RFC 1951 inflate and zlib wrapper --- .../asset_pipeline/codecs/deflate/inflate.zig | 322 ++++++++++++++++++ .../asset_pipeline/codecs/deflate/root.zig | 19 ++ .../asset_pipeline/codecs/deflate/zlib.zig | 38 +++ src/modules/asset_pipeline/codecs/root.zig | 10 + src/modules/asset_pipeline/root.zig | 4 + 5 files changed, 393 insertions(+) create mode 100644 src/modules/asset_pipeline/codecs/deflate/inflate.zig create mode 100644 src/modules/asset_pipeline/codecs/deflate/root.zig create mode 100644 src/modules/asset_pipeline/codecs/deflate/zlib.zig create mode 100644 src/modules/asset_pipeline/codecs/root.zig diff --git a/src/modules/asset_pipeline/codecs/deflate/inflate.zig b/src/modules/asset_pipeline/codecs/deflate/inflate.zig new file mode 100644 index 0000000..2f0ef43 --- /dev/null +++ 
b/src/modules/asset_pipeline/codecs/deflate/inflate.zig @@ -0,0 +1,322 @@ +//! RFC 1951 DEFLATE inflate — native, from scratch, table-driven Huffman. +//! +//! Written from the RFC and the structure of `puff.c` (Mark Adler) / miniz; +//! no `std.compress.flate` code adapted. Target class is "correct + +//! table-driven", **not** zlib-ng (no 64-bit refill, no SIMD copy fast +//! paths). Rationale (brief §Notes): API stability across Zig `std` churn +//! and an owned, homogeneous codec surface for PNG (and later EXR) — not +//! performance; decode runs once at cook time. +//! +//! Supports all three block types: stored (BTYPE 00), fixed Huffman (01), +//! and dynamic Huffman (10). The whole decoded output is kept in one growing +//! buffer, so LZ77 back-references index directly into it (no 32 KiB ring). + +const std = @import("std"); + +/// Errors raised by `inflate`. +pub const Error = error{ + /// Input ended before the stream was complete. + UnexpectedEnd, + /// Reserved block type (BTYPE = 11). + BadBlockType, + /// Stored-block length check (`LEN == ~NLEN`) failed. + BadStoredLength, + /// A bit pattern matched no Huffman code. + BadHuffmanCode, + /// A literal/length, distance, or code-length symbol that is invalid here. + BadSymbol, + /// A back-reference distance reaches before the start of output. + DistanceTooFar, + /// A Huffman code length exceeded 15 bits. + OversizedCode, + /// Allocation failed. + OutOfMemory, +}; + +// RFC 1951 §3.2.5 — length codes 257..285. +const length_base = [_]u16{ 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31, 35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258 }; +const length_extra = [_]u3{ 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0 }; + +// RFC 1951 §3.2.5 — distance codes 0..29. 
+const dist_base = [_]u16{ 1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145, 8193, 12289, 16385, 24577 }; +const dist_extra = [_]u4{ 0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13 }; + +// RFC 1951 §3.2.7 — order the code-length code lengths are stored in. +const cl_order = [_]u5{ 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15 }; + +const max_code_len = 15; + +/// LSB-first bit reader over the compressed input. +const BitReader = struct { + src: []const u8, + byte_pos: usize = 0, + bit_buf: u32 = 0, + bit_count: u32 = 0, + + fn fill(self: *BitReader, need: u32) void { + while (self.bit_count < need and self.byte_pos < self.src.len) { + self.bit_buf |= std.math.shl(u32, self.src[self.byte_pos], self.bit_count); + self.byte_pos += 1; + self.bit_count += 8; + } + } + + /// Consume `n` bits (n ≤ 16), LSB-first. Errors if the input is exhausted. + fn take(self: *BitReader, n: u32) Error!u32 { + self.fill(n); + if (self.bit_count < n) return error.UnexpectedEnd; + const mask = std.math.shl(u32, 1, n) -% 1; + const v = self.bit_buf & mask; + self.bit_buf = std.math.shr(u32, self.bit_buf, n); + self.bit_count -= n; + return v; + } + + /// Look at the next `n` bits without consuming (zero-filled past EOF). + fn peek(self: *BitReader, n: u32) u32 { + self.fill(n); + const mask = std.math.shl(u32, 1, n) -% 1; + return self.bit_buf & mask; + } + + fn drop(self: *BitReader, n: u32) void { + self.bit_buf = std.math.shr(u32, self.bit_buf, n); + self.bit_count -= n; + } + + fn alignToByte(self: *BitReader) void { + const rem = self.bit_count % 8; + self.drop(rem); + } +}; + +/// Single-level canonical Huffman decode table. `table[bits]` (the next +/// `max_len` stream bits, LSB-first) yields the symbol and its code length. 
+const HuffmanTable = struct { + table: []Entry, + max_len: u32, + + const Entry = struct { symbol: u16 = 0, len: u8 = 0 }; + + fn build(gpa: std.mem.Allocator, code_lengths: []const u8) Error!HuffmanTable { + var bl_count = [_]u16{0} ** (max_code_len + 1); + var max_len: u32 = 0; + for (code_lengths) |l| { + if (l > max_code_len) return error.OversizedCode; + if (l > 0) { + bl_count[l] += 1; + if (l > max_len) max_len = l; + } + } + if (max_len == 0) { + // No codes — an empty (unused) table. One zero entry so `decode` + // reports `BadHuffmanCode` if it is ever consulted. + const empty = try gpa.alloc(Entry, 1); + empty[0] = .{}; + return .{ .table = empty, .max_len = 0 }; + } + + // Canonical first code per length (RFC 1951 §3.2.2). + var next_code = [_]u16{0} ** (max_code_len + 1); + var code: u16 = 0; + var bits: u32 = 1; + while (bits <= max_len) : (bits += 1) { + code = (code + bl_count[bits - 1]) << 1; + next_code[bits] = code; + } + + const size = std.math.shl(usize, 1, max_len); + const table = try gpa.alloc(Entry, size); + @memset(table, .{}); + + for (code_lengths, 0..) |l, sym| { + if (l == 0) continue; + const canonical = next_code[l]; + next_code[l] += 1; + const reversed = bitReverse(canonical, l); + // Fill every max_len-bit pattern whose low `l` bits are `reversed`. 
+ var slot: usize = reversed; + const stride = std.math.shl(usize, 1, l); + while (slot < size) : (slot += stride) { + table[slot] = .{ .symbol = @intCast(sym), .len = @intCast(l) }; + } + } + return .{ .table = table, .max_len = max_len }; + } + + fn deinit(self: *HuffmanTable, gpa: std.mem.Allocator) void { + gpa.free(self.table); + self.* = undefined; + } + + fn decode(self: *const HuffmanTable, br: *BitReader) Error!u16 { + const bits = br.peek(self.max_len); + const entry = self.table[bits]; + if (entry.len == 0) return error.BadHuffmanCode; + if (entry.len > br.bit_count) return error.UnexpectedEnd; + br.drop(entry.len); + return entry.symbol; + } +}; + +fn bitReverse(value: u16, len: u32) u16 { + var v = value; + var r: u16 = 0; + var i: u32 = 0; + while (i < len) : (i += 1) { + r = (r << 1) | (v & 1); + v >>= 1; + } + return r; +} + +/// Inflate a raw DEFLATE (RFC 1951) stream into a freshly allocated, +/// caller-owned byte slice. +pub fn inflate(gpa: std.mem.Allocator, src: []const u8) Error![]u8 { + var out: std.ArrayList(u8) = .empty; + errdefer out.deinit(gpa); + + var br = BitReader{ .src = src }; + while (true) { + const bfinal = try br.take(1); + const btype = try br.take(2); + switch (btype) { + 0 => try inflateStored(&br, &out, gpa), + 1 => try inflateFixed(gpa, &br, &out), + 2 => try inflateDynamic(gpa, &br, &out), + else => return error.BadBlockType, + } + if (bfinal == 1) break; + } + return out.toOwnedSlice(gpa); +} + +fn inflateStored(br: *BitReader, out: *std.ArrayList(u8), gpa: std.mem.Allocator) Error!void { + br.alignToByte(); + const len: u16 = @intCast(try br.take(16)); + const nlen: u16 = @intCast(try br.take(16)); + if (len != ~nlen) return error.BadStoredLength; + var i: u16 = 0; + while (i < len) : (i += 1) { + const byte: u8 = @intCast(try br.take(8)); + try out.append(gpa, byte); + } +} + +fn inflateFixed(gpa: std.mem.Allocator, br: *BitReader, out: *std.ArrayList(u8)) Error!void { + // RFC 1951 §3.2.6 — fixed code lengths. 
+ var litlen_lengths = [_]u8{0} ** 288; + for (0..288) |i| { + litlen_lengths[i] = if (i < 144) 8 else if (i < 256) 9 else if (i < 280) 7 else 8; + } + const dist_lengths = [_]u8{5} ** 30; + + var litlen = try HuffmanTable.build(gpa, &litlen_lengths); + defer litlen.deinit(gpa); + var dist = try HuffmanTable.build(gpa, &dist_lengths); + defer dist.deinit(gpa); + try decodeBlock(gpa, br, out, &litlen, &dist); +} + +fn inflateDynamic(gpa: std.mem.Allocator, br: *BitReader, out: *std.ArrayList(u8)) Error!void { + const hlit = try br.take(5) + 257; // # literal/length codes (257..286) + const hdist = try br.take(5) + 1; // # distance codes (1..32) + const hclen = try br.take(4) + 4; // # code-length codes (4..19) + + // Read the code-length code lengths in their shuffled order. + var cl_lengths = [_]u8{0} ** 19; + var i: usize = 0; + while (i < hclen) : (i += 1) { + cl_lengths[cl_order[i]] = @intCast(try br.take(3)); + } + var cl_table = try HuffmanTable.build(gpa, &cl_lengths); + defer cl_table.deinit(gpa); + + // Decode the literal/length + distance code lengths as one sequence. 
+ var lengths = [_]u8{0} ** (288 + 32); + const total = hlit + hdist; + i = 0; + while (i < total) { + const sym = try cl_table.decode(br); + switch (sym) { + 0...15 => { + lengths[i] = @intCast(sym); + i += 1; + }, + 16 => { + if (i == 0) return error.BadSymbol; + const repeat = 3 + try br.take(2); + const prev = lengths[i - 1]; + var r: usize = 0; + while (r < repeat and i < total) : (r += 1) { + lengths[i] = prev; + i += 1; + } + }, + 17 => { + const repeat = 3 + try br.take(3); + var r: usize = 0; + while (r < repeat and i < total) : (r += 1) { + lengths[i] = 0; + i += 1; + } + }, + 18 => { + const repeat = 11 + try br.take(7); + var r: usize = 0; + while (r < repeat and i < total) : (r += 1) { + lengths[i] = 0; + i += 1; + } + }, + else => return error.BadSymbol, + } + } + + var litlen = try HuffmanTable.build(gpa, lengths[0..hlit]); + defer litlen.deinit(gpa); + var dist = try HuffmanTable.build(gpa, lengths[hlit..total]); + defer dist.deinit(gpa); + try decodeBlock(gpa, br, out, &litlen, &dist); +} + +fn decodeBlock( + gpa: std.mem.Allocator, + br: *BitReader, + out: *std.ArrayList(u8), + litlen: *const HuffmanTable, + dist: *const HuffmanTable, +) Error!void { + while (true) { + const sym = try litlen.decode(br); + if (sym < 256) { + try out.append(gpa, @intCast(sym)); + } else if (sym == 256) { + return; // end of block + } else if (sym <= 285) { + const li = sym - 257; + const length = length_base[li] + try br.take(length_extra[li]); + const dsym = try dist.decode(br); + if (dsym >= dist_base.len) return error.BadSymbol; + const distance = dist_base[dsym] + try br.take(dist_extra[dsym]); + if (distance > out.items.len) return error.DistanceTooFar; + const start = out.items.len - distance; + var k: usize = 0; + while (k < length) : (k += 1) { + const byte = out.items[start + k]; + try out.append(gpa, byte); + } + } else { + return error.BadSymbol; + } + } +} + +test "inflate decodes a stored block" { + const gpa = std.testing.allocator; + // BFINAL=1, 
BTYPE=00, LEN=3, NLEN=~3, "abc". + const stream = [_]u8{ 0x01, 0x03, 0x00, 0xfc, 0xff, 'a', 'b', 'c' }; + const got = try inflate(gpa, &stream); + defer gpa.free(got); + try std.testing.expectEqualStrings("abc", got); +} diff --git a/src/modules/asset_pipeline/codecs/deflate/root.zig b/src/modules/asset_pipeline/codecs/deflate/root.zig new file mode 100644 index 0000000..3d56863 --- /dev/null +++ b/src/modules/asset_pipeline/codecs/deflate/root.zig @@ -0,0 +1,19 @@ +//! DEFLATE / zlib codec namespace (`codecs/deflate/`). +//! +//! Native in-tree RFC 1951 inflate + RFC 1950 zlib wrapper. Decode-only in +//! M0.6; consumed by the PNG codec (E3) and, later, EXR. + +const inflate_mod = @import("inflate.zig"); + +/// Inflate a raw DEFLATE (RFC 1951) stream → caller-owned slice. +pub const inflate = inflate_mod.inflate; +/// Error set raised by `inflate`. +pub const InflateError = inflate_mod.Error; + +/// zlib (RFC 1950) wrapper: header parse + inflate + ADLER32 trailer check. +pub const zlib = @import("zlib.zig"); + +comptime { + _ = inflate_mod; + _ = zlib; +} diff --git a/src/modules/asset_pipeline/codecs/deflate/zlib.zig b/src/modules/asset_pipeline/codecs/deflate/zlib.zig new file mode 100644 index 0000000..92127f8 --- /dev/null +++ b/src/modules/asset_pipeline/codecs/deflate/zlib.zig @@ -0,0 +1,38 @@ +//! zlib (RFC 1950) wrapper around the RFC 1951 inflate. +//! +//! Parses the 2-byte zlib header, inflates the DEFLATE body, and verifies +//! the 4-byte big-endian ADLER32 trailer using the `foundation/simd` +//! `adler32` kernel — the first real consumer of that kernel. + +const std = @import("std"); +const inflate_mod = @import("inflate.zig"); +const simd = @import("foundation").simd; + +/// Errors raised by `decompress` (inflate errors plus the zlib-frame ones). +pub const Error = inflate_mod.Error || error{ + /// Header too short, bad compression method, bad window size, failed + /// header checksum, or an unsupported preset dictionary. 
+ BadZlibHeader, + /// The decoded data's ADLER32 did not match the stored trailer. + BadChecksum, +}; + +/// Decompress a zlib stream into a freshly allocated, caller-owned slice. +/// Validates the header and the ADLER32 trailer. +pub fn decompress(gpa: std.mem.Allocator, src: []const u8) Error![]u8 { + if (src.len < 6) return error.BadZlibHeader; // 2 header + ≥0 body + 4 trailer + const cmf = src[0]; + const flg = src[1]; + if ((cmf & 0x0f) != 8) return error.BadZlibHeader; // CM must be 8 (deflate) + if ((cmf >> 4) > 7) return error.BadZlibHeader; // CINFO: window ≤ 32 KiB + if (((@as(u16, cmf) << 8) | flg) % 31 != 0) return error.BadZlibHeader; // FCHECK + if ((flg & 0x20) != 0) return error.BadZlibHeader; // FDICT unsupported in M0.6 + + const body = src[2 .. src.len - 4]; + const out = try inflate_mod.inflate(gpa, body); + errdefer gpa.free(out); + + const stored = std.mem.readInt(u32, src[src.len - 4 ..][0..4], .big); + if (simd.adler32(out) != stored) return error.BadChecksum; + return out; +} diff --git a/src/modules/asset_pipeline/codecs/root.zig b/src/modules/asset_pipeline/codecs/root.zig new file mode 100644 index 0000000..0d44483 --- /dev/null +++ b/src/modules/asset_pipeline/codecs/root.zig @@ -0,0 +1,10 @@ +//! Asset Pipeline `codecs/` namespace — low-level encode/decode. +//! +//! M0.6 ships `deflate` (E2). PNG and glTF static decode land in E3. + +/// DEFLATE / zlib codec. +pub const deflate = @import("deflate/root.zig"); + +comptime { + _ = deflate; +} diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig index a129614..7c2df03 100644 --- a/src/modules/asset_pipeline/root.zig +++ b/src/modules/asset_pipeline/root.zig @@ -22,6 +22,9 @@ pub const format = @import("format/root.zig"); /// Asset identity: `AssetHandle` + the slot `Registry`. pub const registry = @import("registry/root.zig"); +/// Low-level codecs (E2: DEFLATE/zlib; E3 adds PNG + glTF static). 
+pub const codecs = @import("codecs/root.zig"); + /// 64-bit typed asset handle (convenience re-export). pub const AssetHandle = registry.AssetHandle; /// Slot registry (convenience re-export). @@ -38,4 +41,5 @@ pub const AssetDoc = format.AssetDoc; comptime { _ = format; _ = registry; + _ = codecs; } From 39b4b35010b3bcecd3e60999d070afd98c731f2d Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 00:56:07 +0200 Subject: [PATCH 12/29] feat(assets): wire foundation dep, codec, and E2 tests --- build.zig | 55 ++++++++++++++++++++++++++ tests/assets/deflate_vectors.zig | 67 ++++++++++++++++++++++++++++++++ 2 files changed, 122 insertions(+) create mode 100644 tests/assets/deflate_vectors.zig diff --git a/build.zig b/build.zig index f9b21a3..67d4503 100644 --- a/build.zig +++ b/build.zig @@ -86,12 +86,27 @@ pub fn build(b: *std.Build) void { // Tier 0 job system); the `foundation` (SIMD) import wires in at E2 when // the `adler32` / `paeth` kernels land. No `weld_etch` dependency // (brief §Out-of-scope). + // M0.6 / E2 — `foundation` module: transversal sibling submodules + // (math, simd). Ships `simd` (batched-SIMD kernels; `adler32` inaugural). + // Imports nothing but std (engine-simd.md §4). Consumed by + // `asset_pipeline` (zlib ADLER32 trailer check), the simd tests, and the + // adler32 bench. + const foundation_module = b.addModule("foundation", .{ + .root_source_file = b.path("src/foundation/root.zig"), + .target = target, + .optimize = optimize, + }); + const asset_pipeline_module = b.createModule(.{ .root_source_file = b.path("src/modules/asset_pipeline/root.zig"), .target = target, .optimize = optimize, }); asset_pipeline_module.addImport("weld_core", core_module); + // M0.6 / E2 — `foundation` dep wires in now that simd exists (deferred + // from E1): the DEFLATE/zlib codec verifies the ADLER32 trailer via + // `foundation.simd.adler32`. 
+ asset_pipeline_module.addImport("foundation", foundation_module); // M0.2 / E6 — plugin loader ABI module shared with the stub // plugin sub-projects under `tests/core/plugin_loader/stub_plugin/`. @@ -230,6 +245,12 @@ pub fn build(b: *std.Build) void { const asset_pipeline_tests = b.addTest(.{ .root_module = asset_pipeline_module }); test_step.dependOn(&b.addRunArtifact(asset_pipeline_tests).step); + // M0.6 / E2 — inline tests inside src/foundation/** (traits + kernels). + // simd.zig re-exports traits/portable/dispatch/kernels, so they are all + // reachable and analysed (engine-zig-conventions.md §13). + const foundation_tests = b.addTest(.{ .root_module = foundation_module }); + test_step.dependOn(&b.addRunArtifact(foundation_tests).step); + // Out-of-tree tests. Each file is its own root_module and imports // `weld_core` to reach the engine internals. // Out-of-tree bindings tests need to reach files that live outside @@ -299,6 +320,8 @@ pub fn build(b: *std.Build) void { render: bool = false, /// M0.6 — when set, imports the `weld_asset_pipeline` module. asset_pipeline: bool = false, + /// M0.6 / E2 — when set, imports the `foundation` module (simd). + foundation: bool = false, /// M0.4 stabilization — when set, create a dedicated `zig build /// ` step that runs ONLY this test. Used by the CI /// runtime-smoke-test job to gate strictly on the capture PSNR @@ -415,6 +438,11 @@ pub fn build(b: *std.Build) void { .{ .path = "tests/vk_gen/raw_variants.zig" }, // M0.6 / E1 — asset registry stale-handle (generation) acceptance. .{ .path = "tests/assets/handle_generation.zig", .asset_pipeline = true }, + // M0.6 / E2 — DEFLATE/zlib inflate known-vector acceptance. + .{ .path = "tests/assets/deflate_vectors.zig", .asset_pipeline = true }, + // M0.6 / E2 — adler32 kernel known vectors + cross-variant correctness. 
+ .{ .path = "src/foundation/simd/tests/adler32_test.zig", .foundation = true }, + .{ .path = "src/foundation/simd/tests/correctness.zig", .foundation = true }, }; for (test_specs) |spec| { const t_mod = b.createModule(.{ @@ -445,6 +473,9 @@ pub fn build(b: *std.Build) void { if (spec.asset_pipeline) { t_mod.addImport("weld_asset_pipeline", asset_pipeline_module); } + if (spec.foundation) { + t_mod.addImport("foundation", foundation_module); + } const t = b.addTest(.{ .root_module = t_mod }); const t_run = b.addRunArtifact(t); if (spec.needs_stub_plugins) { @@ -719,6 +750,30 @@ pub fn build(b: *std.Build) void { ); render_bench_step.dependOn(&render_bench_run.step); + // ------------------------------------------- M0.6 adler32 baseline bench -- + // + // Inaugural foundation/simd kernel throughput baseline. No parity target + // (cold path — runs once at cook time). `zig build bench-adler32`. + const adler32_bench_module = b.createModule(.{ + .root_source_file = b.path("src/foundation/simd/bench/adler32_bench.zig"), + .target = target, + .optimize = optimize, + }); + adler32_bench_module.addImport("foundation", foundation_module); + const adler32_bench_exe = b.addExecutable(.{ + .name = "adler32-bench", + .root_module = adler32_bench_module, + }); + b.installArtifact(adler32_bench_exe); + const adler32_bench_run = b.addRunArtifact(adler32_bench_exe); + adler32_bench_run.step.dependOn(b.getInstallStep()); + if (b.args) |args| adler32_bench_run.addArgs(args); + const adler32_bench_step = b.step( + "bench-adler32", + "Run the M0.6 adler32 throughput baseline (pass `-- --smoke` for a CI sanity run)", + ); + adler32_bench_step.dependOn(&adler32_bench_run.step); + // -------------------------------------------- Fixture facade (S4 demo) -- // `@embedFile` cannot escape the package root of the module that diff --git a/tests/assets/deflate_vectors.zig b/tests/assets/deflate_vectors.zig new file mode 100644 index 0000000..2232d8b --- /dev/null +++ 
b/tests/assets/deflate_vectors.zig @@ -0,0 +1,67 @@ +//! M0.6 / E2 — DEFLATE/zlib inflate known-vector acceptance tests. +//! +//! Vectors were produced by Python's `zlib` (the reference encoder) at +//! authoring time and embedded verbatim; M0.6 ships no encoder, so inflate +//! is validated bit-exact against an independent compressor. The fixed and +//! dynamic streams were selected by inspecting the first block's BTYPE bits +//! (01 = fixed, 10 = dynamic). +//! +//! Brief §Acceptance ▸ Tests: `test "inflate fixed huffman"`, +//! `test "inflate dynamic huffman"`. + +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const inflate = assets.codecs.deflate.inflate; +const zlib = assets.codecs.deflate.zlib; + +// --- vectors (Python zlib, raw deflate wbits=-15 unless noted) --------------- + +const fixed_compressed = [_]u8{ 0x4b, 0x4c, 0x04, 0x01, 0x00 }; +const fixed_expected = [_]u8{ 0x61, 0x61, 0x61, 0x61, 0x61, 0x61 }; // "aaaaaa" + +const dynamic_compressed = [_]u8{ 0xed, 0xcb, 0xc9, 0x15, 0x80, 0x20, 0x10, 0x04, 0xd1, 0x54, 0x3a, 0x02, 0x63, 0xf1, 0x60, 0x02, 0x2e, 0x6c, 0x0a, 0x8c, 0xb2, 0x0a, 0xd1, 0x3b, 0x31, 0x78, 0xf4, 0x79, 0xae, 0x5f, 0x93, 0x16, 0xb8, 0xb2, 0x59, 0x0f, 0x2c, 0x81, 0xaa, 0x87, 0xa4, 0x1b, 0x7b, 0x76, 0x67, 0x04, 0x15, 0x11, 0x90, 0x38, 0xdb, 0xb9, 0x37, 0x6c, 0xa4, 0x06, 0x4c, 0x3f, 0x7e, 0x8b, 0xc7, 0x99, 0x9d, 0x6b, 0x58, 0x18, 0x55, 0x93, 0x34, 0xa4, 0x29, 0x82, 0x53, 0x17, 0x1e, 0xd6, 0x5c, 0x99, 0x02, 0xbf, 0x2a, 0x7e, 0x0c, 0x3e }; +const dynamic_expected = "The quick brown fox jumps over the lazy dog. " ** 8 ++ "Pack my box with five dozen liquor jugs. 
" ** 6; + +const stored_compressed = [_]u8{ 0x01, 0x29, 0x00, 0xd6, 0xff, 0x57, 0x65, 0x6c, 0x64, 0x20, 0x73, 0x74, 0x6f, 0x72, 0x65, 0x64, 0x20, 0x62, 0x6c, 0x6f, 0x63, 0x6b, 0x20, 0x74, 0x65, 0x73, 0x74, 0x20, 0x70, 0x61, 0x79, 0x6c, 0x6f, 0x61, 0x64, 0x20, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39 }; +const stored_expected = "Weld stored block test payload 0123456789"; + +const zlib_compressed = [_]u8{ 0x78, 0xda, 0x0b, 0x4f, 0xcd, 0x49, 0x51, 0xa8, 0xca, 0xc9, 0x4c, 0x52, 0x28, 0x2f, 0x4a, 0x2c, 0x28, 0x48, 0x2d, 0x52, 0x28, 0xca, 0x2f, 0xcd, 0x4b, 0xd1, 0x2d, 0x29, 0xca, 0x2c, 0x50, 0x28, 0xcf, 0x2c, 0xc9, 0x50, 0x70, 0x74, 0xf1, 0x71, 0x0d, 0x32, 0x36, 0x52, 0x28, 0x29, 0x4a, 0xcc, 0xcc, 0x01, 0xca, 0x97, 0xa5, 0x16, 0x65, 0xa6, 0x65, 0x26, 0x27, 0x96, 0x64, 0xe6, 0xe7, 0xe9, 0x01, 0x00, 0xdd, 0xd3, 0x16, 0xe0 }; +const zlib_expected = "Weld zlib wrapper round-trip with ADLER32 trailer verification."; + +// ----------------------------------------------------------------------------- + +test "inflate fixed huffman" { + const gpa = std.testing.allocator; + const got = try inflate(gpa, &fixed_compressed); + defer gpa.free(got); + try std.testing.expectEqualSlices(u8, &fixed_expected, got); +} + +test "inflate dynamic huffman" { + const gpa = std.testing.allocator; + const got = try inflate(gpa, &dynamic_compressed); + defer gpa.free(got); + try std.testing.expectEqualSlices(u8, dynamic_expected, got); +} + +test "inflate stored block" { + const gpa = std.testing.allocator; + const got = try inflate(gpa, &stored_compressed); + defer gpa.free(got); + try std.testing.expectEqualSlices(u8, stored_expected, got); +} + +test "zlib decompress verifies the adler32 trailer" { + const gpa = std.testing.allocator; + const got = try zlib.decompress(gpa, &zlib_compressed); + defer gpa.free(got); + try std.testing.expectEqualSlices(u8, zlib_expected, got); +} + +test "zlib decompress rejects a corrupted adler32 trailer" { + const gpa = std.testing.allocator; 
+ var corrupt = zlib_compressed; + corrupt[corrupt.len - 1] ^= 0xff; // flip the last trailer byte + try std.testing.expectError(error.BadChecksum, zlib.decompress(gpa, &corrupt)); +} From 17459821166f17ab7ab74eca33c1661f0e0556cb Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 00:56:08 +0200 Subject: [PATCH 13/29] docs(brief): journal update --- briefs/m0.6-assets.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index 5053e7a..d37b4b9 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -207,6 +207,8 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-03 21:14 — Header blocker RESOLVED by Claude.ai ruling (Cas 2, brief patched): header = 40 bytes, 8-byte aligned, explicit `_reserved` u32 @28, `hash` u64 @32, `@sizeOf == 40`, no implicit padding. FROZEN SECTION replaced verbatim (header bullet + Out-of-scope no-`weld_etch` + 2 Notes); deviation recorded. Secondary point confirmed: ad-hoc Etch-subset reader/writer, no `weld_etch`, must emit a valid `etch-grammar.md` subset. Resuming E1. - 2026-06-03 21:33 — E1 implemented (c3184cd, a23bafa, ab68def). `format/`: `AssetType` enum(u16) frozen values; `RuntimeHeader` `extern struct` (40 B, comptime `@offsetOf`/`@sizeOf` guards, explicit little-endian `read`/`writeTo`); `intermediate` `AssetDoc` + `Value`/`Field` model + ad-hoc Etch-subset `writeEtch`/`parseEtch` (round-trip tested). `registry/`: `AssetHandle` `packed struct(u64)`; `Registry` slot table (refcount + generation bump on unload), mirroring `EntityIdentityStore`. `build.zig`: `weld_asset_pipeline` module (dep `weld_core`; `foundation` at E2) + inline-test target + `asset_pipeline` `TestSpec` flag. E1 acceptance test `tests/assets/handle_generation.zig` green. - 2026-06-03 21:33 — Two Zig 0.16 stdlib API fixes during E1: `std.meta.intToEnum` → `std.enums.fromInt`; ArrayList-writer string build → `std.Io.Writer.Allocating`. 
Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. +- 2026-06-04 00:55 — E2 implemented (GO relayed by Guy). `foundation/simd/` skeleton: `traits` (capability bitflags + comptime `detect`), `dispatch` (comptime variant select, portable-only in M0.6), `portable`, `simd` (public re-export); `kernels/adler32.zig` = scalar `reference` oracle + portable `@Vector` `vectorized` (NMAX-block weighted-sum, no ISA asm). `foundation` module added (`addModule`), `math` deferred to its first consumer. DEFLATE: native RFC 1951 `inflate` (stored + fixed + dynamic blocks, table-driven canonical Huffman via single-level lookup + bit-reversed codes, LZ77 back-refs over the whole output buffer) written from scratch (puff.c/miniz as structural reference only); `zlib` wrapper (header parse + inflate + big-endian ADLER32 trailer verified via `foundation.simd.adler32` — first real kernel consumer). +- 2026-06-04 00:55 — E2 tests: `tests/assets/deflate_vectors.zig` (fixed/dynamic/stored/zlib + corrupted-trailer rejection) against Python-zlib vectors (decode-only milestone — authentic encoder output embedded, BTYPE inspected to label fixed=01/dynamic=10); `adler32_test.zig` (known vectors `""`/`"a"`/`"abc"`/`"Wikipedia"` + portable==reference) + `correctness.zig` (every variant == reference). `build.zig`: `foundation` dep on `asset_pipeline`, `foundation_tests` target, `foundation` `TestSpec` flag, `bench-adler32` step. Zig 0.16: `std.time.Timer` removed → bench uses `std.Io.Clock.Timestamp(.awake)` via `init.io` (keeps `foundation` std-only). Bench baseline (Apple Silicon, smoke): portable ≈ 349 MB/s, reference ≈ 96 MB/s. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. 
## Recorded deviations From aa2dcec7c2bba7ef3d96d4c909e4e7fee6c0a49a Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 01:30:29 +0200 Subject: [PATCH 14/29] feat(simd): add paeth_filter_decode kernel --- build.zig | 25 +++++ src/foundation/simd/bench/paeth_bench.zig | 69 +++++++++++++ src/foundation/simd/dispatch.zig | 8 ++ src/foundation/simd/kernels/paeth.zig | 118 ++++++++++++++++++++++ src/foundation/simd/portable.zig | 4 + src/foundation/simd/simd.zig | 6 ++ src/foundation/simd/tests/paeth_test.zig | 49 +++++++++ 7 files changed, 279 insertions(+) create mode 100644 src/foundation/simd/bench/paeth_bench.zig create mode 100644 src/foundation/simd/kernels/paeth.zig create mode 100644 src/foundation/simd/tests/paeth_test.zig diff --git a/build.zig b/build.zig index 67d4503..8bdd7c4 100644 --- a/build.zig +++ b/build.zig @@ -443,6 +443,8 @@ pub fn build(b: *std.Build) void { // M0.6 / E2 — adler32 kernel known vectors + cross-variant correctness. .{ .path = "src/foundation/simd/tests/adler32_test.zig", .foundation = true }, .{ .path = "src/foundation/simd/tests/correctness.zig", .foundation = true }, + // M0.6 / E3 — paeth_filter_decode kernel portable == reference. + .{ .path = "src/foundation/simd/tests/paeth_test.zig", .foundation = true }, }; for (test_specs) |spec| { const t_mod = b.createModule(.{ @@ -774,6 +776,29 @@ pub fn build(b: *std.Build) void { ); adler32_bench_step.dependOn(&adler32_bench_run.step); + // ------------------------------------------- M0.6 paeth baseline bench ---- + // + // Second foundation/simd kernel throughput baseline. No parity target. 
+ const paeth_bench_module = b.createModule(.{ + .root_source_file = b.path("src/foundation/simd/bench/paeth_bench.zig"), + .target = target, + .optimize = optimize, + }); + paeth_bench_module.addImport("foundation", foundation_module); + const paeth_bench_exe = b.addExecutable(.{ + .name = "paeth-bench", + .root_module = paeth_bench_module, + }); + b.installArtifact(paeth_bench_exe); + const paeth_bench_run = b.addRunArtifact(paeth_bench_exe); + paeth_bench_run.step.dependOn(b.getInstallStep()); + if (b.args) |args| paeth_bench_run.addArgs(args); + const paeth_bench_step = b.step( + "bench-paeth", + "Run the M0.6 paeth_filter_decode throughput baseline (pass `-- --smoke` for a CI sanity run)", + ); + paeth_bench_step.dependOn(&paeth_bench_run.step); + // -------------------------------------------- Fixture facade (S4 demo) -- // `@embedFile` cannot escape the package root of the module that diff --git a/src/foundation/simd/bench/paeth_bench.zig b/src/foundation/simd/bench/paeth_bench.zig new file mode 100644 index 0000000..9936f89 --- /dev/null +++ b/src/foundation/simd/bench/paeth_bench.zig @@ -0,0 +1,69 @@ +//! Paeth-filter-decode throughput baseline (brief §Acceptance ▸ Benchmarks). +//! +//! **Baseline only — no parity target** (cold path; PNG defiltering runs +//! once at cook time). Tracks gross regressions, nothing more. +//! +//! Run: `zig build bench-paeth` (add `-- --smoke` for a tiny CI sanity run). 
+ +const std = @import("std"); +const simd = @import("foundation").simd; + +pub fn main(init: std.process.Init) !void { + const gpa = init.gpa; + const io = init.io; + const args = try init.minimal.args.toSlice(init.arena.allocator()); + + var smoke = false; + for (args[1..]) |a| { + if (std.mem.eql(u8, a, "--smoke")) smoke = true; + } + + const bpp: u8 = 4; // RGBA8 scanline + const row_len: usize = if (smoke) 4 * 1024 else 4 * 1024 * 1024; + const iterations: usize = if (smoke) 4 else 64; + + const prev = try gpa.alloc(u8, row_len); + defer gpa.free(prev); + const curr = try gpa.alloc(u8, row_len); + defer gpa.free(curr); + for (prev, 0..) |*b, i| b.* = @truncate(i *% 40_503); + + var stdout_buf: [4096]u8 = undefined; + var stdout_w = std.Io.File.stdout().writer(io, &stdout_buf); + const out = &stdout_w.interface; + + const vec_mbps = benchOne(simd.kernels.paeth.vectorized, prev, curr, bpp, iterations, io); + const ref_mbps = benchOne(simd.kernels.paeth.reference, prev, curr, bpp, iterations, io); + + try out.print("## paeth_filter_decode — bench baseline\n\n", .{}); + try out.print("Scanline: {d} bytes (bpp={d}), {d} iterations\n\n", .{ row_len, bpp, iterations }); + try out.print("| Variant | Throughput (MB/s) |\n", .{}); + try out.print("|-----------|-------------------|\n", .{}); + try out.print("| portable | {d:.1} |\n", .{vec_mbps}); + try out.print("| reference | {d:.1} |\n", .{ref_mbps}); + try out.print("\n(baseline only — no parity target; cold path)\n", .{}); + try out.flush(); +} + +fn benchOne( + comptime kernel: fn ([]const u8, []u8, u8) void, + prev: []const u8, + curr: []u8, + bpp: u8, + iterations: usize, + io: std.Io, +) f64 { + const start = std.Io.Clock.Timestamp.now(io, .awake); + var it: usize = 0; + while (it < iterations) : (it += 1) { + @memcpy(curr, prev); // reset the filtered scanline each pass + kernel(prev, curr, bpp); + } + const elapsed_ns: i96 = start.untilNow(io).raw.nanoseconds; + std.mem.doNotOptimizeAway(curr.ptr); + + if 
(elapsed_ns <= 0) return 0; + const total_bytes: f64 = @floatFromInt(curr.len * iterations); + const seconds: f64 = @as(f64, @floatFromInt(@as(i64, @intCast(elapsed_ns)))) / 1_000_000_000.0; + return total_bytes / seconds / 1_000_000.0; +} diff --git a/src/foundation/simd/dispatch.zig b/src/foundation/simd/dispatch.zig index 36e3fc9..9c161a2 100644 --- a/src/foundation/simd/dispatch.zig +++ b/src/foundation/simd/dispatch.zig @@ -19,3 +19,11 @@ fn selectAdler32() fn (data: []const u8) u32 { // if (traits.current.has_neon) return @import("arch_aarch64/neon.zig").adler32; return portable.adler32; } + +/// PNG Paeth-filter decode entry point selected for the build target. +pub const paeth_filter_decode = selectPaeth(); + +fn selectPaeth() fn (prev: []const u8, curr: []u8, bpp: u8) void { + // Portable-only in M0.6; arch variants slot in here later. + return portable.paeth_filter_decode; +} diff --git a/src/foundation/simd/kernels/paeth.zig b/src/foundation/simd/kernels/paeth.zig new file mode 100644 index 0000000..e2f7128 --- /dev/null +++ b/src/foundation/simd/kernels/paeth.zig @@ -0,0 +1,118 @@ +//! PNG Paeth-filter decode — the second `foundation/simd` kernel. +//! +//! Unfilters one PNG Paeth (filter type 4) scanline in place: +//! `curr[i] += PaethPredictor(curr[i-bpp], prev[i], prev[i-bpp])`, with bytes +//! before the start of a scanline treated as zero. `prev` is the +//! already-unfiltered previous scanline (all zeros for the first row); +//! `prev.len == curr.len`. +//! +//! Same discipline as `adler32` (brief §Notes — inaugural SIMD kernels): +//! scalar `reference` oracle + portable `@Vector` `vectorized`, **no +//! ISA-specific asm**, baseline only. The Paeth recurrence is sequential +//! along pixels (each pixel's left neighbour is a just-computed output), so +//! the `@Vector` form parallelizes across the `bpp` channels of one pixel, +//! not across pixels. Purpose: validate the multi-kernel dispatch pattern, +//! 
not performance (decode runs once at cook time). + +const std = @import("std"); + +/// Scalar reference — the correctness oracle. Unfilters one Paeth scanline. +pub fn reference(prev: []const u8, curr: []u8, bpp: u8) void { + var i: usize = 0; + while (i < curr.len) : (i += 1) { + const a: u8 = if (i >= bpp) curr[i - bpp] else 0; + const b: u8 = if (i < prev.len) prev[i] else 0; + const c: u8 = if (i >= bpp and i - bpp < prev.len) prev[i - bpp] else 0; + curr[i] +%= predictor(a, b, c); + } +} + +/// Portable `@Vector` form — parallelizes the `bpp` channels of each pixel. +/// Falls back to `reference` for `bpp` outside 1..4 (M0.6 never exceeds 4). +pub fn vectorized(prev: []const u8, curr: []u8, bpp: u8) void { + std.debug.assert(prev.len == curr.len); + switch (bpp) { + 1 => vectorizedImpl(1, prev, curr), + 2 => vectorizedImpl(2, prev, curr), + 3 => vectorizedImpl(3, prev, curr), + 4 => vectorizedImpl(4, prev, curr), + else => reference(prev, curr, bpp), + } +} + +fn vectorizedImpl(comptime bpp: usize, prev: []const u8, curr: []u8) void { + const V = @Vector(bpp, i16); + const n_px = curr.len / bpp; + var a: V = @splat(0); // left pixel (already-unfiltered current scanline) + var c: V = @splat(0); // up-left pixel (previous scanline) + var px: usize = 0; + while (px < n_px) : (px += 1) { + const off = px * bpp; + const b: V = load(bpp, prev[off..][0..bpp]); + const cur: V = load(bpp, curr[off..][0..bpp]); + const res = cur + paethVec(bpp, a, b, c); + const masked = res & @as(V, @splat(0xff)); + store(bpp, curr[off..][0..bpp], masked); + a = masked; // this pixel's unfiltered value becomes the next left + c = b; + } + // Valid PNG scanlines are an exact multiple of bpp; nothing trails. 
+ std.debug.assert(n_px * bpp == curr.len); +} + +fn load(comptime bpp: usize, src: *const [bpp]u8) @Vector(bpp, i16) { + const bytes: @Vector(bpp, u8) = src.*; + return bytes; // element-wise u8 → i16 widening +} + +fn store(comptime bpp: usize, dst: *[bpp]u8, v: @Vector(bpp, i16)) void { + const bytes: @Vector(bpp, u8) = @intCast(v); + dst.* = bytes; +} + +fn paethVec(comptime bpp: usize, a: @Vector(bpp, i16), b: @Vector(bpp, i16), c: @Vector(bpp, i16)) @Vector(bpp, i16) { + const p = a + b - c; + const pa = @abs(p - a); + const pb = @abs(p - b); + const pc = @abs(p - c); + const false_vec: @Vector(bpp, bool) = @splat(false); + // pick_a = (pa <= pb) AND (pa <= pc), expressed via @select to avoid + // relying on bool-vector bitwise ops. + const pick_a = @select(bool, pa <= pb, pa <= pc, false_vec); + const bc = @select(i16, pb <= pc, b, c); + return @select(i16, pick_a, a, bc); +} + +fn predictor(a: u8, b: u8, c: u8) u8 { + const p = @as(i16, a) + @as(i16, b) - @as(i16, c); + const pa = @abs(p - @as(i16, a)); + const pb = @abs(p - @as(i16, b)); + const pc = @abs(p - @as(i16, c)); + if (pa <= pb and pa <= pc) return a; + if (pb <= pc) return b; + return c; +} + +test "vectorized paeth equals the scalar reference on assorted bpp/widths" { + const gpa = std.testing.allocator; + var prng = std.Random.DefaultPrng.init(0x9AE74); + const rand = prng.random(); + const bpps = [_]u8{ 1, 2, 3, 4 }; + const widths = [_]usize{ 1, 2, 5, 8, 13, 64 }; + for (bpps) |bpp| { + for (widths) |w| { + const n = w * bpp; + const prev = try gpa.alloc(u8, n); + defer gpa.free(prev); + const a = try gpa.alloc(u8, n); + defer gpa.free(a); + rand.bytes(prev); + rand.bytes(a); + const b = try gpa.dupe(u8, a); + defer gpa.free(b); + reference(prev, a, bpp); + vectorized(prev, b, bpp); + try std.testing.expectEqualSlices(u8, a, b); + } + } +} diff --git a/src/foundation/simd/portable.zig b/src/foundation/simd/portable.zig index 1c97d9a..f612e8b 100644 --- a/src/foundation/simd/portable.zig +++ 
b/src/foundation/simd/portable.zig @@ -3,6 +3,10 @@ //! variant exists for the build target (the only case in M0.6). const adler32_kernel = @import("kernels/adler32.zig"); +const paeth_kernel = @import("kernels/paeth.zig"); /// Portable ADLER32 (`@Vector`). pub const adler32 = adler32_kernel.vectorized; + +/// Portable PNG Paeth-filter decode (`@Vector`). +pub const paeth_filter_decode = paeth_kernel.vectorized; diff --git a/src/foundation/simd/simd.zig b/src/foundation/simd/simd.zig index 038606d..0ff5e5e 100644 --- a/src/foundation/simd/simd.zig +++ b/src/foundation/simd/simd.zig @@ -14,6 +14,9 @@ const dispatch = @import("dispatch.zig"); /// ADLER32 checksum of `data` (the dispatched best variant for the target). pub const adler32 = dispatch.adler32; +/// PNG Paeth-filter decode of one scanline in place (dispatched variant). +pub const paeth_filter_decode = dispatch.paeth_filter_decode; + /// CPU capability bitflags + comptime detection. pub const traits = @import("traits.zig"); @@ -28,6 +31,8 @@ pub const dispatch_table = dispatch; pub const kernels = struct { /// ADLER32 kernel (`reference` scalar oracle + `vectorized` `@Vector`). pub const adler32 = @import("kernels/adler32.zig"); + /// PNG Paeth-filter-decode kernel (`reference` + `vectorized`). + pub const paeth = @import("kernels/paeth.zig"); }; // Pins so inline tests in the kernel + traits files are analysed when this @@ -37,4 +42,5 @@ comptime { _ = portable; _ = dispatch; _ = kernels.adler32; + _ = kernels.paeth; } diff --git a/src/foundation/simd/tests/paeth_test.zig b/src/foundation/simd/tests/paeth_test.zig new file mode 100644 index 0000000..44319f4 --- /dev/null +++ b/src/foundation/simd/tests/paeth_test.zig @@ -0,0 +1,49 @@ +//! Paeth-filter-decode kernel tests (brief §Acceptance ▸ Tests: +//! `test "paeth portable equals reference"`). 
+ +const std = @import("std"); +const simd = @import("foundation").simd; + +const reference = simd.kernels.paeth.reference; +const vectorized = simd.kernels.paeth.vectorized; + +test "paeth portable equals reference" { + const gpa = std.testing.allocator; + var prng = std.Random.DefaultPrng.init(0x9AE7_4FED); + const rand = prng.random(); + + const bpps = [_]u8{ 1, 2, 3, 4 }; + const widths = [_]usize{ 1, 3, 4, 7, 16, 100 }; + for (bpps) |bpp| { + for (widths) |w| { + const n = w * bpp; + const prev = try gpa.alloc(u8, n); + defer gpa.free(prev); + rand.bytes(prev); + + const ref_buf = try gpa.alloc(u8, n); + defer gpa.free(ref_buf); + rand.bytes(ref_buf); + const vec_buf = try gpa.dupe(u8, ref_buf); + defer gpa.free(vec_buf); + const pub_buf = try gpa.dupe(u8, ref_buf); + defer gpa.free(pub_buf); + + reference(prev, ref_buf, bpp); + vectorized(prev, vec_buf, bpp); + simd.paeth_filter_decode(prev, pub_buf, bpp); // dispatched public entry + + try std.testing.expectEqualSlices(u8, ref_buf, vec_buf); + try std.testing.expectEqualSlices(u8, ref_buf, pub_buf); + } + } +} + +test "paeth unfilters a known scanline (bpp=1)" { + // First row has an all-zero previous scanline, so Paeth reduces to Sub: + // curr[i] += curr[i-1] (predictor picks a=left when b=c=0). 
+ const zero_prev = [_]u8{ 0, 0, 0, 0 }; + var curr = [_]u8{ 5, 1, 1, 1 }; // filtered deltas + reference(&zero_prev, &curr, 1); + try std.testing.expectEqualSlices(u8, &.{ 5, 6, 7, 8 }, &curr); +} From 8d734498cce63c818c3310bba5fc68a9749edb82 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 01:30:30 +0200 Subject: [PATCH 15/29] feat(assets): add PNG, glTF static, and WAV decoders --- .../asset_pipeline/codecs/gltf/decode.zig | 255 ++++++++++++ .../asset_pipeline/codecs/gltf/root.zig | 17 + .../asset_pipeline/codecs/png/decode.zig | 373 ++++++++++++++++++ .../asset_pipeline/codecs/png/root.zig | 16 + src/modules/asset_pipeline/codecs/root.zig | 8 + src/modules/asset_pipeline/importers/root.zig | 11 + src/modules/asset_pipeline/importers/wav.zig | 119 ++++++ src/modules/asset_pipeline/root.zig | 6 +- 8 files changed, 804 insertions(+), 1 deletion(-) create mode 100644 src/modules/asset_pipeline/codecs/gltf/decode.zig create mode 100644 src/modules/asset_pipeline/codecs/gltf/root.zig create mode 100644 src/modules/asset_pipeline/codecs/png/decode.zig create mode 100644 src/modules/asset_pipeline/codecs/png/root.zig create mode 100644 src/modules/asset_pipeline/importers/root.zig create mode 100644 src/modules/asset_pipeline/importers/wav.zig diff --git a/src/modules/asset_pipeline/codecs/gltf/decode.zig b/src/modules/asset_pipeline/codecs/gltf/decode.zig new file mode 100644 index 0000000..bef56bb --- /dev/null +++ b/src/modules/asset_pipeline/codecs/gltf/decode.zig @@ -0,0 +1,255 @@ +//! Native glTF 2.0 static-mesh decoder. +//! +//! Parses the JSON with `std.json` (no hand-rolled JSON parser, no cgltf C +//! binding) and extracts the first mesh primitive's POSITION / NORMAL / +//! TEXCOORD_0 attributes and indices. Static only — no skinning, no +//! animation, no morph targets (brief §Out-of-scope). +//! +//! Buffers must be embedded base64 `data:` URIs (the M0.6 cooked path); +//! external `.bin` files and `.glb` containers are deferred. 
+ +const std = @import("std"); + +/// Errors raised by `decode`. +pub const Error = error{ + /// `std.json` failed to parse the document, or an embedded base64 buffer payload was malformed. + BadJson, + /// The document has no meshes / primitives. + NoMesh, + /// The first primitive has no POSITION attribute. + NoPositions, + /// An accessor used a component type this decoder does not handle. + UnsupportedComponentType, + /// A buffer URI was not an embedded base64 `data:` URI. + UnsupportedBuffer, + /// An accessor/bufferView reached past its buffer. + Truncated, + /// Allocation failed. + OutOfMemory, +}; + +/// A decoded static mesh. All slices are caller-owned. +pub const Mesh = struct { + /// XYZ positions, 3 floats per vertex. + positions: []f32, + /// XYZ normals (3 per vertex), or null if absent. + normals: ?[]f32, + /// UV0 coordinates (2 per vertex), or null if absent. + uvs: ?[]f32, + /// Triangle indices (synthesized 0..n-1 if the primitive was non-indexed). + indices: []u32, + /// Vertex count (== positions.len / 3). + vertex_count: u32, + /// Axis-aligned bounds minimum (from the POSITION accessor min/max, else computed). + bounds_min: [3]f32, + /// Axis-aligned bounds maximum (from the POSITION accessor min/max, else computed). + bounds_max: [3]f32, + + /// Free every owned slice and poison `self`.
+ pub fn deinit(self: *Mesh, gpa: std.mem.Allocator) void { + gpa.free(self.positions); + if (self.normals) |n| gpa.free(n); + if (self.uvs) |u| gpa.free(u); + gpa.free(self.indices); + self.* = undefined; + } +}; + +const component_f32 = 5126; + +const Gltf = struct { + buffers: []Buffer, + bufferViews: []BufferView, + accessors: []Accessor, + meshes: []MeshJson, + + const Buffer = struct { uri: ?[]const u8 = null, byteLength: usize = 0 }; + const BufferView = struct { buffer: usize, byteOffset: usize = 0, byteLength: usize = 0, byteStride: ?usize = null }; + const Accessor = struct { + bufferView: usize, + byteOffset: usize = 0, + componentType: u32, + count: usize, + type: []const u8, + min: ?[]f32 = null, + max: ?[]f32 = null, + }; + const MeshJson = struct { primitives: []Primitive }; + const Primitive = struct { attributes: Attributes, indices: ?usize = null }; + const Attributes = struct { POSITION: ?usize = null, NORMAL: ?usize = null, TEXCOORD_0: ?usize = null }; +}; + +/// Decode a glTF 2.0 document (`.gltf` JSON) into a static `Mesh`. +pub fn decode(gpa: std.mem.Allocator, src: []const u8) Error!Mesh { + const parsed = std.json.parseFromSlice(Gltf, gpa, src, .{ .ignore_unknown_fields = true }) catch return error.BadJson; + defer parsed.deinit(); + const doc = parsed.value; + + if (doc.meshes.len == 0 or doc.meshes[0].primitives.len == 0) return error.NoMesh; + const prim = doc.meshes[0].primitives[0]; + + // Decode all buffers (embedded base64) up front. 
+ const buffers = try gpa.alloc([]u8, doc.buffers.len); + var decoded: usize = 0; + defer { + for (buffers[0..decoded]) |b| gpa.free(b); + gpa.free(buffers); + } + for (doc.buffers) |b| { + buffers[decoded] = try decodeDataUri(gpa, b.uri orelse return error.UnsupportedBuffer); + decoded += 1; + } + + const pos_index = prim.attributes.POSITION orelse return error.NoPositions; + const positions = try readFloats(gpa, doc, buffers, pos_index, 3); + errdefer gpa.free(positions); + const vertex_count: u32 = @intCast(positions.len / 3); + + var normals: ?[]f32 = null; + errdefer if (normals) |n| gpa.free(n); + if (prim.attributes.NORMAL) |ni| normals = try readFloats(gpa, doc, buffers, ni, 3); + + var uvs: ?[]f32 = null; + errdefer if (uvs) |u| gpa.free(u); + if (prim.attributes.TEXCOORD_0) |ui| uvs = try readFloats(gpa, doc, buffers, ui, 2); + + const indices = if (prim.indices) |ii| + try readIndices(gpa, doc, buffers, ii) + else + try sequentialIndices(gpa, vertex_count); + errdefer gpa.free(indices); + + var bmin: [3]f32 = .{ 0, 0, 0 }; + var bmax: [3]f32 = .{ 0, 0, 0 }; + computeBounds(doc.accessors[pos_index], positions, &bmin, &bmax); + + return .{ + .positions = positions, + .normals = normals, + .uvs = uvs, + .indices = indices, + .vertex_count = vertex_count, + .bounds_min = bmin, + .bounds_max = bmax, + }; +} + +fn decodeDataUri(gpa: std.mem.Allocator, uri: []const u8) Error![]u8 { + const marker = "base64,"; + const idx = std.mem.indexOf(u8, uri, marker) orelse return error.UnsupportedBuffer; + if (!std.mem.startsWith(u8, uri, "data:")) return error.UnsupportedBuffer; + const b64 = uri[idx + marker.len ..]; + const dec = std.base64.standard.Decoder; + const n = dec.calcSizeForSlice(b64) catch return error.BadJson; + const out = try gpa.alloc(u8, n); + errdefer gpa.free(out); + dec.decode(out, b64) catch return error.BadJson; + return out; +} + +fn readFloats(gpa: std.mem.Allocator, doc: Gltf, buffers: []const []u8, accessor_index: usize, comps: usize) 
Error![]f32 { + const acc = doc.accessors[accessor_index]; + if (acc.componentType != component_f32) return error.UnsupportedComponentType; + const view = doc.bufferViews[acc.bufferView]; + const buf = buffers[view.buffer]; + const elem = comps * 4; + const stride = view.byteStride orelse elem; + const base = view.byteOffset + acc.byteOffset; + + const out = try gpa.alloc(f32, acc.count * comps); + errdefer gpa.free(out); + var i: usize = 0; + while (i < acc.count) : (i += 1) { + var c: usize = 0; + while (c < comps) : (c += 1) { + const off = base + i * stride + c * 4; + if (off + 4 > buf.len) return error.Truncated; + out[i * comps + c] = @bitCast(std.mem.readInt(u32, buf[off..][0..4], .little)); + } + } + return out; +} + +fn readIndices(gpa: std.mem.Allocator, doc: Gltf, buffers: []const []u8, accessor_index: usize) Error![]u32 { + const acc = doc.accessors[accessor_index]; + const csize: usize = switch (acc.componentType) { + 5121 => 1, // u8 + 5123 => 2, // u16 + 5125 => 4, // u32 + else => return error.UnsupportedComponentType, + }; + const view = doc.bufferViews[acc.bufferView]; + const buf = buffers[view.buffer]; + const stride = view.byteStride orelse csize; + const base = view.byteOffset + acc.byteOffset; + + const out = try gpa.alloc(u32, acc.count); + errdefer gpa.free(out); + var i: usize = 0; + while (i < acc.count) : (i += 1) { + const off = base + i * stride; + if (off + csize > buf.len) return error.Truncated; + out[i] = switch (csize) { + 1 => buf[off], + 2 => std.mem.readInt(u16, buf[off..][0..2], .little), + 4 => std.mem.readInt(u32, buf[off..][0..4], .little), + else => unreachable, + }; + } + return out; +} + +fn sequentialIndices(gpa: std.mem.Allocator, vertex_count: u32) Error![]u32 { + const out = try gpa.alloc(u32, vertex_count); + for (out, 0..) 
|*v, i| v.* = @intCast(i); + return out; +} + +fn computeBounds(pos_acc: Gltf.Accessor, positions: []const f32, bmin: *[3]f32, bmax: *[3]f32) void { + if (pos_acc.min) |mn| { + if (pos_acc.max) |mx| { + if (mn.len >= 3 and mx.len >= 3) { + bmin.* = .{ mn[0], mn[1], mn[2] }; + bmax.* = .{ mx[0], mx[1], mx[2] }; + return; + } + } + } + if (positions.len < 3) return; + bmin.* = .{ positions[0], positions[1], positions[2] }; + bmax.* = bmin.*; + var i: usize = 0; + while (i < positions.len) : (i += 3) { + for (0..3) |c| { + bmin[c] = @min(bmin[c], positions[i + c]); + bmax[c] = @max(bmax[c], positions[i + c]); + } + } +} + +test "decode static cube glTF extracts positions, normals, uvs, indices" { + const gpa = std.testing.allocator; + var mesh = try decode(gpa, cube_gltf); + defer mesh.deinit(gpa); + + try std.testing.expectEqual(@as(u32, 8), mesh.vertex_count); + try std.testing.expectEqual(@as(usize, 24), mesh.positions.len); + try std.testing.expectEqual(@as(usize, 36), mesh.indices.len); + try std.testing.expect(mesh.normals != null); + try std.testing.expectEqual(@as(usize, 16), mesh.uvs.?.len); + try std.testing.expectEqual([3]f32{ -1, -1, -1 }, mesh.bounds_min); + try std.testing.expectEqual([3]f32{ 1, 1, 1 }, mesh.bounds_max); + // First corner position is (-1,-1,-1). + try std.testing.expectEqual(@as(f32, -1), mesh.positions[0]); + // First triangle references vertices 0,1,2. 
+ try std.testing.expectEqual([3]u32{ 0, 1, 2 }, mesh.indices[0..3].*); +} + +test "decode rejects non-glTF json" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.NoMesh, decode(gpa, "{\"buffers\":[],\"bufferViews\":[],\"accessors\":[],\"meshes\":[]}")); +} + +const cube_gltf = + \\{"asset":{"version":"2.0"},"buffers":[{"byteLength":328,"uri":"data:application/octet-stream;base64,AACAvwAAgL8AAIC/AACAPwAAgL8AAIC/AACAPwAAgD8AAIC/AACAvwAAgD8AAIC/AACAvwAAgL8AAIA/AACAPwAAgL8AAIA/AACAPwAAgD8AAIA/AACAvwAAgD8AAIA/Os0TvzrNE786zRO/Os0TPzrNE786zRO/Os0TPzrNEz86zRO/Os0TvzrNEz86zRO/Os0TvzrNE786zRM/Os0TPzrNE786zRM/Os0TPzrNEz86zRM/Os0TvzrNEz86zRM/AAAAAAAAAAAAAIA/AAAAAAAAgD8AAIA/AAAAAAAAgD8AAAAAAAAAAAAAgD8AAAAAAACAPwAAgD8AAAAAAACAPwAAAQACAAAAAgADAAQABgAFAAQABwAGAAAABAAFAAAABQABAAEABQAGAAEABgACAAIABgAHAAIABwADAAMABwAEAAMABAAAAA=="}],"bufferViews":[{"buffer":0,"byteOffset":0,"byteLength":96},{"buffer":0,"byteOffset":96,"byteLength":96},{"buffer":0,"byteOffset":192,"byteLength":64},{"buffer":0,"byteOffset":256,"byteLength":72}],"accessors":[{"bufferView":0,"componentType":5126,"count":8,"type":"VEC3","min":[-1,-1,-1],"max":[1,1,1]},{"bufferView":1,"componentType":5126,"count":8,"type":"VEC3"},{"bufferView":2,"componentType":5126,"count":8,"type":"VEC2"},{"bufferView":3,"componentType":5123,"count":36,"type":"SCALAR"}],"meshes":[{"primitives":[{"attributes":{"POSITION":0,"NORMAL":1,"TEXCOORD_0":2},"indices":3}]}]} +; diff --git a/src/modules/asset_pipeline/codecs/gltf/root.zig b/src/modules/asset_pipeline/codecs/gltf/root.zig new file mode 100644 index 0000000..d722692 --- /dev/null +++ b/src/modules/asset_pipeline/codecs/gltf/root.zig @@ -0,0 +1,17 @@ +//! glTF static-mesh decode codec namespace (`codecs/gltf/`). +//! +//! Native in-tree glTF 2.0 static decode (JSON via `std.json`). Static +//! geometry only in M0.6 — no skinning/animation/morph (brief §Out-of-scope). 
+ +const decode_mod = @import("decode.zig"); + +/// Decode a glTF 2.0 document into a static `Mesh`. +pub const decode = decode_mod.decode; +/// Decoded static mesh. +pub const Mesh = decode_mod.Mesh; +/// Error set raised by `decode`. +pub const Error = decode_mod.Error; + +comptime { + _ = decode_mod; +} diff --git a/src/modules/asset_pipeline/codecs/png/decode.zig b/src/modules/asset_pipeline/codecs/png/decode.zig new file mode 100644 index 0000000..5356c67 --- /dev/null +++ b/src/modules/asset_pipeline/codecs/png/decode.zig @@ -0,0 +1,373 @@ +//! Native PNG decoder → RGBA8. +//! +//! Parses IHDR/PLTE/tRNS/IDAT/IEND, inflates the IDAT zlib stream (E2 +//! `codecs/deflate`), reverses the five line filters (Paeth via the +//! `foundation/simd` `paeth_filter_decode` kernel), handles palette + tRNS +//! alpha and bit depths 1/2/4/8 and Adam7 interlace, and expands every +//! pixel to RGBA8 (the M0.6 cooked-texture payload). +//! +//! Out of scope (M0.6, brief §Out-of-scope): 16-bit channels, grayscale/RGB +//! colour-key tRNS, mipmaps, GPU compression. Chunk CRC32 is parsed past but +//! not verified (the IDAT ADLER32 already guards the pixel stream); a +//! CRC32 check is deferred. + +const std = @import("std"); +const zlib = @import("../deflate/zlib.zig"); +const simd = @import("foundation").simd; + +/// Errors raised by `decode`. +pub const Error = error{ + /// Missing or wrong 8-byte PNG signature. + BadSignature, + /// A chunk or the pixel stream ended early. + Truncated, + /// Malformed IHDR or an inconsistent palette index. + BadHeader, + /// Colour type not one of 0/2/3/4/6. + UnsupportedColorType, + /// Bit depth invalid for the colour type, or 16-bit (out of scope). + UnsupportedBitDepth, + /// Interlace method other than 0 (none) or 1 (Adam7). + UnsupportedInterlace, + /// Indexed image without a PLTE chunk. + MissingPalette, + /// Unknown scanline filter type. + BadFilter, + /// Allocation failed. 
+ OutOfMemory, +} || zlib.Error; + +/// A decoded image as tightly-packed RGBA8, row-major. +pub const Image = struct { + /// Pixel width. + width: u32, + /// Pixel height. + height: u32, + /// `width * height * 4` bytes, R,G,B,A per pixel. Caller-owned. + pixels: []u8, + + /// Free the pixel buffer and poison `self`. + pub fn deinit(self: *Image, gpa: std.mem.Allocator) void { + gpa.free(self.pixels); + self.* = undefined; + } +}; + +const signature = [_]u8{ 0x89, 'P', 'N', 'G', 0x0d, 0x0a, 0x1a, 0x0a }; + +const Header = struct { + width: u32, + height: u32, + bit_depth: u8, + color_type: u8, + interlace: u8, +}; + +/// Decode `src` (a complete PNG file) to an RGBA8 `Image`. +pub fn decode(gpa: std.mem.Allocator, src: []const u8) Error!Image { + if (src.len < signature.len or !std.mem.eql(u8, src[0..signature.len], &signature)) { + return error.BadSignature; + } + + var header: ?Header = null; + var palette: ?[]const u8 = null; + var trns: ?[]const u8 = null; + var idat: std.ArrayList(u8) = .empty; + defer idat.deinit(gpa); + + var pos: usize = signature.len; + while (pos + 8 <= src.len) { + const len = std.mem.readInt(u32, src[pos..][0..4], .big); + const ctype = src[pos + 4 ..][0..4]; + const data_start = pos + 8; + if (data_start + len + 4 > src.len) return error.Truncated; + const data = src[data_start .. 
data_start + len]; + + if (std.mem.eql(u8, ctype, "IHDR")) { + header = try parseHeader(data); + } else if (std.mem.eql(u8, ctype, "PLTE")) { + palette = data; + } else if (std.mem.eql(u8, ctype, "tRNS")) { + trns = data; + } else if (std.mem.eql(u8, ctype, "IDAT")) { + try idat.appendSlice(gpa, data); + } else if (std.mem.eql(u8, ctype, "IEND")) { + break; + } + pos = data_start + len + 4; // skip data + CRC + } + + const h = header orelse return error.BadHeader; + const channels = try channelsOf(h.color_type); + + const raw = try zlib.decompress(gpa, idat.items); + defer gpa.free(raw); + + const sample_count = @as(usize, h.width) * h.height * channels; + const samples = try gpa.alloc(u8, sample_count); + defer gpa.free(samples); + + if (h.interlace == 0) { + try deinterlaceNone(gpa, h, channels, raw, samples); + } else { + try deinterlaceAdam7(gpa, h, channels, raw, samples); + } + + const pixels = try gpa.alloc(u8, @as(usize, h.width) * h.height * 4); + errdefer gpa.free(pixels); + try convertToRgba(samples, h, channels, palette, trns, pixels); + + return .{ .width = h.width, .height = h.height, .pixels = pixels }; +} + +fn parseHeader(data: []const u8) Error!Header { + if (data.len < 13) return error.BadHeader; + const h = Header{ + .width = std.mem.readInt(u32, data[0..4], .big), + .height = std.mem.readInt(u32, data[4..8], .big), + .bit_depth = data[8], + .color_type = data[9], + .interlace = data[12], + }; + if (h.width == 0 or h.height == 0) return error.BadHeader; + if (data[10] != 0) return error.BadHeader; // compression method + if (data[11] != 0) return error.BadHeader; // filter method + if (h.interlace > 1) return error.UnsupportedInterlace; + if (h.bit_depth == 16) return error.UnsupportedBitDepth; + try validateDepth(h.color_type, h.bit_depth); + return h; +} + +fn channelsOf(color_type: u8) Error!u8 { + return switch (color_type) { + 0 => 1, // grayscale + 2 => 3, // RGB + 3 => 1, // indexed + 4 => 2, // grayscale + alpha + 6 => 4, // RGBA + else 
=> error.UnsupportedColorType, + }; +} + +fn validateDepth(color_type: u8, bit_depth: u8) Error!void { + const ok = switch (color_type) { + 0, 3 => bit_depth == 1 or bit_depth == 2 or bit_depth == 4 or bit_depth == 8, + 2, 4, 6 => bit_depth == 8, + else => return error.UnsupportedColorType, + }; + if (!ok) return error.UnsupportedBitDepth; +} + +fn bytesPerRow(width: u32, channels: u8, bit_depth: u8) usize { + return (@as(usize, width) * channels * bit_depth + 7) / 8; +} + +fn filterBpp(channels: u8, bit_depth: u8) u8 { + const bits = @as(usize, channels) * bit_depth; + return @intCast(@max(1, bits / 8)); +} + +fn unfilterRow(filter_type: u8, row: []u8, prev: []const u8, bpp: u8) Error!void { + switch (filter_type) { + 0 => {}, + 1 => { // Sub + var i: usize = bpp; + while (i < row.len) : (i += 1) row[i] +%= row[i - bpp]; + }, + 2 => { // Up + for (row, 0..) |*x, i| x.* +%= prev[i]; + }, + 3 => { // Average + for (row, 0..) |*x, i| { + const a: u16 = if (i >= bpp) row[i - bpp] else 0; + const b: u16 = prev[i]; + x.* +%= @intCast((a + b) / 2); + } + }, + 4 => simd.paeth_filter_decode(prev, row, bpp), // Paeth (foundation/simd) + else => return error.BadFilter, + } +} + +fn unpackRow(packed_row: []const u8, width: u32, bit_depth: u8, out: []u8) void { + if (bit_depth == 8) { + // `out.len == width * channels`; the byte stream copies straight over. + @memcpy(out, packed_row[0..out.len]); + return; + } + // bit_depth < 8 implies channels == 1 (grayscale or indexed). MSB-first. + const mask: u8 = (@as(u8, 1) << @intCast(bit_depth)) - 1; + var bit: usize = 0; + var px: usize = 0; + while (px < width) : (px += 1) { + const byte = packed_row[bit / 8]; + const shift: u3 = @intCast(8 - bit_depth - (bit % 8)); + out[px] = (byte >> shift) & mask; + bit += bit_depth; + } +} + +/// Non-interlaced: unfilter each row in place, then unpack into `samples`. 
+fn deinterlaceNone(gpa: std.mem.Allocator, h: Header, channels: u8, raw: []u8, samples: []u8) Error!void { + const row_bytes = bytesPerRow(h.width, channels, h.bit_depth); + const bpp = filterBpp(channels, h.bit_depth); + const stride = 1 + row_bytes; + if (raw.len < @as(usize, h.height) * stride) return error.Truncated; + + const zero = try gpa.alloc(u8, row_bytes); + defer gpa.free(zero); + @memset(zero, 0); + + var y: usize = 0; + while (y < h.height) : (y += 1) { + const base = y * stride; + const row = raw[base + 1 ..][0..row_bytes]; + const prev = if (y == 0) zero else raw[(y - 1) * stride + 1 ..][0..row_bytes]; + try unfilterRow(raw[base], row, prev, bpp); + } + y = 0; + const row_samples = @as(usize, h.width) * channels; + while (y < h.height) : (y += 1) { + const row = raw[y * stride + 1 ..][0..row_bytes]; + unpackRow(row, h.width, h.bit_depth, samples[y * row_samples ..][0..row_samples]); + } +} + +/// Adam7: 7 passes, each unfiltered + unpacked then scattered into the grid. +fn deinterlaceAdam7(gpa: std.mem.Allocator, h: Header, channels: u8, raw: []u8, samples: []u8) Error!void { + const Pass = struct { xs: u32, ys: u32, xstep: u32, ystep: u32 }; + const passes = [7]Pass{ + .{ .xs = 0, .ys = 0, .xstep = 8, .ystep = 8 }, + .{ .xs = 4, .ys = 0, .xstep = 8, .ystep = 8 }, + .{ .xs = 0, .ys = 4, .xstep = 4, .ystep = 8 }, + .{ .xs = 2, .ys = 0, .xstep = 4, .ystep = 4 }, + .{ .xs = 0, .ys = 2, .xstep = 2, .ystep = 4 }, + .{ .xs = 1, .ys = 0, .xstep = 2, .ystep = 2 }, + .{ .xs = 0, .ys = 1, .xstep = 1, .ystep = 2 }, + }; + const bpp = filterBpp(channels, h.bit_depth); + + const zero = try gpa.alloc(u8, bytesPerRow(h.width, channels, h.bit_depth)); + defer gpa.free(zero); + @memset(zero, 0); + const pass_samples = try gpa.alloc(u8, @as(usize, h.width) * channels); + defer gpa.free(pass_samples); + + var cursor: usize = 0; + for (passes) |p| { + const pw: u32 = if (h.width > p.xs) (h.width - p.xs + p.xstep - 1) / p.xstep else 0; + const ph: u32 = if (h.height > 
p.ys) (h.height - p.ys + p.ystep - 1) / p.ystep else 0; + if (pw == 0 or ph == 0) continue; + + const row_bytes = bytesPerRow(pw, channels, h.bit_depth); + const stride = 1 + row_bytes; + if (cursor + @as(usize, ph) * stride > raw.len) return error.Truncated; + + var yy: usize = 0; + while (yy < ph) : (yy += 1) { + const base = cursor + yy * stride; + const row = raw[base + 1 ..][0..row_bytes]; + const prev = if (yy == 0) zero[0..row_bytes] else raw[cursor + (yy - 1) * stride + 1 ..][0..row_bytes]; + try unfilterRow(raw[base], row, prev, bpp); + } + yy = 0; + while (yy < ph) : (yy += 1) { + const row = raw[cursor + yy * stride + 1 ..][0..row_bytes]; + unpackRow(row, pw, h.bit_depth, pass_samples[0 .. @as(usize, pw) * channels]); + const img_y = p.ys + yy * p.ystep; + var xx: usize = 0; + while (xx < pw) : (xx += 1) { + const img_x = p.xs + xx * p.xstep; + const dst = samples[(@as(usize, img_y) * h.width + img_x) * channels ..][0..channels]; + @memcpy(dst, pass_samples[xx * channels ..][0..channels]); + } + } + cursor += @as(usize, ph) * stride; + } +} + +fn scale(sample: u8, bit_depth: u8) u8 { + return switch (bit_depth) { + 8 => sample, + 4 => sample * 17, + 2 => sample * 85, + 1 => sample * 255, + else => sample, + }; +} + +fn convertToRgba(samples: []const u8, h: Header, channels: u8, palette: ?[]const u8, trns: ?[]const u8, out: []u8) Error!void { + const total = @as(usize, h.width) * h.height; + var px: usize = 0; + while (px < total) : (px += 1) { + const s = samples[px * channels ..][0..channels]; + const o = out[px * 4 ..][0..4]; + switch (h.color_type) { + 0 => { + const g = scale(s[0], h.bit_depth); + o.* = .{ g, g, g, 255 }; + }, + 2 => o.* = .{ s[0], s[1], s[2], 255 }, + 3 => { + const pal = palette orelse return error.MissingPalette; + const idx: usize = s[0]; + if (idx * 3 + 2 >= pal.len) return error.BadHeader; + const a: u8 = if (trns) |t| (if (idx < t.len) t[idx] else 255) else 255; + o.* = .{ pal[idx * 3], pal[idx * 3 + 1], pal[idx * 3 + 2], a }; 
+ }, + 4 => { + const g = scale(s[0], h.bit_depth); + o.* = .{ g, g, g, s[1] }; + }, + 6 => o.* = .{ s[0], s[1], s[2], s[3] }, + else => return error.UnsupportedColorType, + } + } +} + +// --- tests (vectors manually encoded, decode verified by Pillow) ------------ + +test "decode RGBA8 with all five scanline filters" { + const gpa = std.testing.allocator; + var img = try decode(gpa, &png_rgba_filters); + defer img.deinit(gpa); + try std.testing.expectEqual(@as(u32, png_rgba_filters_w), img.width); + try std.testing.expectEqual(@as(u32, png_rgba_filters_h), img.height); + try std.testing.expectEqualSlices(u8, &png_rgba_filters_expected, img.pixels); +} + +test "decode 4-bit palette image with tRNS alpha" { + const gpa = std.testing.allocator; + var img = try decode(gpa, &png_palette); + defer img.deinit(gpa); + try std.testing.expectEqual(@as(u32, png_palette_w), img.width); + try std.testing.expectEqualSlices(u8, &png_palette_expected, img.pixels); +} + +test "decode Adam7 interlaced RGB image" { + const gpa = std.testing.allocator; + var img = try decode(gpa, &png_interlaced); + defer img.deinit(gpa); + try std.testing.expectEqual(@as(u32, png_interlaced_w), img.width); + try std.testing.expectEqualSlices(u8, &png_interlaced_expected, img.pixels); +} + +test "decode rejects a non-PNG buffer" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.BadSignature, decode(gpa, "not a png at all!!")); +} + +const png_rgba_filters_w = 8; +const png_rgba_filters_h = 5; +const png_rgba_filters = [_]u8{ 0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x05, 0x08, 0x06, 0x00, 0x00, 0x00, 0x78, 0x91, 0xad, 0x55, 0x00, 0x00, 0x00, 0x6a, 0x49, 0x44, 0x41, 0x54, 0x78, 0xda, 0x75, 0xcc, 0xad, 0x0e, 0x40, 0x60, 0x00, 0x46, 0xe1, 0xe3, 0x67, 0xd3, 0xcc, 0x6c, 0xc2, 0xb7, 0x7d, 0xc1, 0xa6, 0x69, 0x5c, 0x81, 0x4b, 0x11, 0x5c, 0x88, 0x4b, 0x71, 0x29, 0x9a, 0x28, 0x8a, 0x34, 
0x51, 0x24, 0xd8, 0x78, 0x69, 0x82, 0xf0, 0xb4, 0xb3, 0x03, 0x70, 0xa5, 0xb0, 0x57, 0xb0, 0xd5, 0xb0, 0xb6, 0x30, 0x77, 0x30, 0xf5, 0x30, 0x2e, 0x30, 0x38, 0x94, 0x4f, 0x10, 0x1c, 0x7f, 0x5c, 0x05, 0x50, 0x06, 0x12, 0x4a, 0x22, 0x56, 0x32, 0xc9, 0xa5, 0xc0, 0xa3, 0xa1, 0x8d, 0x4c, 0x78, 0x46, 0x26, 0x96, 0x44, 0x8c, 0x58, 0x49, 0x25, 0x3b, 0xfd, 0xf7, 0x80, 0x0e, 0xe8, 0x80, 0x0e, 0xd8, 0x8f, 0x1b, 0xb0, 0xdf, 0x20, 0x61, 0x28, 0x30, 0x54, 0xec, 0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, 0xae, 0x42, 0x60, 0x82 }; +const png_rgba_filters_expected = [_]u8{ 0x00, 0x00, 0x00, 0xff, 0x20, 0x00, 0x00, 0xf7, 0x40, 0x00, 0x00, 0xef, 0x60, 0x00, 0x00, 0xe7, 0x80, 0x00, 0x00, 0xdf, 0xa0, 0x00, 0x00, 0xd7, 0xc0, 0x00, 0x00, 0xcf, 0xe0, 0x00, 0x00, 0xc7, 0x00, 0x32, 0x00, 0xff, 0x20, 0x32, 0x07, 0xf7, 0x40, 0x32, 0x0e, 0xef, 0x60, 0x32, 0x15, 0xe7, 0x80, 0x32, 0x1c, 0xdf, 0xa0, 0x32, 0x23, 0xd7, 0xc0, 0x32, 0x2a, 0xcf, 0xe0, 0x32, 0x31, 0xc7, 0x00, 0x64, 0x00, 0xff, 0x20, 0x64, 0x0e, 0xf7, 0x40, 0x64, 0x1c, 0xef, 0x60, 0x64, 0x2a, 0xe7, 0x80, 0x64, 0x38, 0xdf, 0xa0, 0x64, 0x46, 0xd7, 0xc0, 0x64, 0x54, 0xcf, 0xe0, 0x64, 0x62, 0xc7, 0x00, 0x96, 0x00, 0xff, 0x20, 0x96, 0x15, 0xf7, 0x40, 0x96, 0x2a, 0xef, 0x60, 0x96, 0x3f, 0xe7, 0x80, 0x96, 0x54, 0xdf, 0xa0, 0x96, 0x69, 0xd7, 0xc0, 0x96, 0x7e, 0xcf, 0xe0, 0x96, 0x93, 0xc7, 0x00, 0xc8, 0x00, 0xff, 0x20, 0xc8, 0x1c, 0xf7, 0x40, 0xc8, 0x38, 0xef, 0x60, 0xc8, 0x54, 0xe7, 0x80, 0xc8, 0x70, 0xdf, 0xa0, 0xc8, 0x8c, 0xd7, 0xc0, 0xc8, 0xa8, 0xcf, 0xe0, 0xc8, 0xc4, 0xc7 }; + +const png_palette_w = 4; +const png_palette_h = 2; +const png_palette = [_]u8{ 0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x02, 0x04, 0x03, 0x00, 0x00, 0x00, 0x8d, 0x86, 0x60, 0x50, 0x00, 0x00, 0x00, 0x0c, 0x50, 0x4c, 0x54, 0x45, 0xff, 0x00, 0x00, 0x00, 0xff, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0xfb, 0x00, 0x60, 0xf6, 0x00, 0x00, 0x00, 0x03, 0x74, 0x52, 0x4e, 0x53, 
0x00, 0x80, 0xff, 0xec, 0xf7, 0xb3, 0x18, 0x00, 0x00, 0x00, 0x0e, 0x49, 0x44, 0x41, 0x54, 0x78, 0xda, 0x63, 0x60, 0x54, 0x66, 0x34, 0xba, 0x07, 0x00, 0x01, 0xdc, 0x01, 0x36, 0x27, 0x36, 0x5e, 0x16, 0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, 0xae, 0x42, 0x60, 0x82 }; +const png_palette_expected = [_]u8{ 0xff, 0x00, 0x00, 0x00, 0x00, 0xff, 0x00, 0x80, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0xff, 0xff, 0x00, 0xff, 0x00, 0x80, 0xff, 0x00, 0x00, 0x00 }; + +const png_interlaced_w = 5; +const png_interlaced_h = 5; +const png_interlaced = [_]u8{ 0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x05, 0x08, 0x02, 0x00, 0x00, 0x01, 0x75, 0x0a, 0x81, 0x24, 0x00, 0x00, 0x00, 0x41, 0x49, 0x44, 0x41, 0x54, 0x78, 0xda, 0x15, 0x87, 0x41, 0x11, 0x00, 0x31, 0x10, 0x83, 0x22, 0xa2, 0x22, 0x22, 0x62, 0x45, 0x20, 0x62, 0x45, 0x44, 0x44, 0x45, 0x44, 0xea, 0xf5, 0x78, 0x30, 0x20, 0x3d, 0x2a, 0x9e, 0x68, 0xff, 0xb0, 0x68, 0x24, 0x0c, 0x94, 0xbc, 0x3f, 0xd1, 0xc8, 0x4c, 0x58, 0xb9, 0x9b, 0x5e, 0xc9, 0xc7, 0x36, 0x9e, 0x98, 0x7a, 0xa5, 0x8c, 0x03, 0xd9, 0x24, 0xcd, 0xfd, 0x00, 0x8c, 0xb7, 0x17, 0x71, 0xb5, 0x93, 0x7d, 0x9c, 0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, 0xae, 0x42, 0x60, 0x82 }; +const png_interlaced_expected = [_]u8{ 0x00, 0x00, 0x00, 0xff, 0x28, 0x00, 0x14, 0xff, 0x50, 0x00, 0x28, 0xff, 0x78, 0x00, 0x3c, 0xff, 0xa0, 0x00, 0x50, 0xff, 0x00, 0x28, 0x14, 0xff, 0x28, 0x28, 0x28, 0xff, 0x50, 0x28, 0x3c, 0xff, 0x78, 0x28, 0x50, 0xff, 0xa0, 0x28, 0x64, 0xff, 0x00, 0x50, 0x28, 0xff, 0x28, 0x50, 0x3c, 0xff, 0x50, 0x50, 0x50, 0xff, 0x78, 0x50, 0x64, 0xff, 0xa0, 0x50, 0x78, 0xff, 0x00, 0x78, 0x3c, 0xff, 0x28, 0x78, 0x50, 0xff, 0x50, 0x78, 0x64, 0xff, 0x78, 0x78, 0x78, 0xff, 0xa0, 0x78, 0x8c, 0xff, 0x00, 0xa0, 0x50, 0xff, 0x28, 0xa0, 0x64, 0xff, 0x50, 0xa0, 0x78, 0xff, 0x78, 0xa0, 0x8c, 0xff, 0xa0, 0xa0, 0xa0, 0xff }; diff --git 
a/src/modules/asset_pipeline/codecs/png/root.zig b/src/modules/asset_pipeline/codecs/png/root.zig new file mode 100644 index 0000000..2610ffd --- /dev/null +++ b/src/modules/asset_pipeline/codecs/png/root.zig @@ -0,0 +1,16 @@ +//! PNG decode codec namespace (`codecs/png/`). +//! +//! Native in-tree PNG → RGBA8 decoder. Decode-only in M0.6 (no encoder). + +const decode_mod = @import("decode.zig"); + +/// Decode a PNG file to an RGBA8 `Image`. +pub const decode = decode_mod.decode; +/// Decoded RGBA8 image. +pub const Image = decode_mod.Image; +/// Error set raised by `decode`. +pub const Error = decode_mod.Error; + +comptime { + _ = decode_mod; +} diff --git a/src/modules/asset_pipeline/codecs/root.zig b/src/modules/asset_pipeline/codecs/root.zig index 0d44483..8f93937 100644 --- a/src/modules/asset_pipeline/codecs/root.zig +++ b/src/modules/asset_pipeline/codecs/root.zig @@ -5,6 +5,14 @@ /// DEFLATE / zlib codec. pub const deflate = @import("deflate/root.zig"); +/// PNG → RGBA8 decode codec. +pub const png = @import("png/root.zig"); + +/// glTF 2.0 static-mesh decode codec. +pub const gltf = @import("gltf/root.zig"); + comptime { _ = deflate; + _ = png; + _ = gltf; } diff --git a/src/modules/asset_pipeline/importers/root.zig b/src/modules/asset_pipeline/importers/root.zig new file mode 100644 index 0000000..0d8646f --- /dev/null +++ b/src/modules/asset_pipeline/importers/root.zig @@ -0,0 +1,11 @@ +//! Asset Pipeline `importers/` namespace. +//! +//! M0.6 / E3 ships the WAV decoder here (RIFF is trivial, brief §Notes). The +//! png / gltf source→intermediate import orchestration lands in E4. + +/// WAV (RIFF PCM) decode. +pub const wav = @import("wav.zig"); + +comptime { + _ = wav; +} diff --git a/src/modules/asset_pipeline/importers/wav.zig b/src/modules/asset_pipeline/importers/wav.zig new file mode 100644 index 0000000..9259f55 --- /dev/null +++ b/src/modules/asset_pipeline/importers/wav.zig @@ -0,0 +1,119 @@ +//! WAV (RIFF PCM) decode. +//! +//! 
Lives in `importers/` rather than a `codecs/wav/` of its own — RIFF PCM +//! is trivial (brief §Notes — WAV location). M0.6 ships the decoder here; +//! the import → intermediate orchestration is added in E4. +//! +//! Supports linear PCM (`audio_format == 1`); compressed WAV is out of scope. + +const std = @import("std"); + +/// Errors raised by `decode`. +pub const Error = error{ + /// Not a `RIFF` container. + BadRiff, + /// RIFF form type is not `WAVE`. + NotWave, + /// No `fmt ` subchunk. + MissingFmt, + /// No `data` subchunk. + MissingData, + /// `audio_format` is not 1 (linear PCM). + UnsupportedFormat, + /// A subchunk reached past the end of the buffer. + Truncated, + /// Allocation failed. + OutOfMemory, +}; + +/// Decoded PCM audio. `data` is interleaved little-endian PCM, caller-owned. +pub const Audio = struct { + /// Samples per second. + sample_rate: u32, + /// Channel count (interleaved in `data`). + channels: u16, + /// Bits per sample (8 or 16 in M0.6). + bits_per_sample: u16, + /// Raw interleaved PCM bytes. + data: []u8, + + /// Number of sample frames (one frame = one sample per channel). + pub fn frameCount(self: Audio) usize { + const bytes_per_frame = @as(usize, self.channels) * (self.bits_per_sample / 8); + if (bytes_per_frame == 0) return 0; + return self.data.len / bytes_per_frame; + } + + /// Free the PCM buffer and poison `self`. + pub fn deinit(self: *Audio, gpa: std.mem.Allocator) void { + gpa.free(self.data); + self.* = undefined; + } +}; + +/// Decode a WAV (RIFF PCM) file into `Audio`. 
+pub fn decode(gpa: std.mem.Allocator, src: []const u8) Error!Audio { + if (src.len < 12 or !std.mem.eql(u8, src[0..4], "RIFF")) return error.BadRiff; + if (!std.mem.eql(u8, src[8..12], "WAVE")) return error.NotWave; + + var sample_rate: u32 = 0; + var channels: u16 = 0; + var bits: u16 = 0; + var have_fmt = false; + var pcm: ?[]const u8 = null; + + var pos: usize = 12; + while (pos + 8 <= src.len) { + const id = src[pos..][0..4]; + const size = std.mem.readInt(u32, src[pos + 4 ..][0..4], .little); + const body_start = pos + 8; + if (body_start + size > src.len) return error.Truncated; + const body = src[body_start .. body_start + size]; + + if (std.mem.eql(u8, id, "fmt ")) { + if (body.len < 16) return error.Truncated; + const audio_format = std.mem.readInt(u16, body[0..2], .little); + if (audio_format != 1) return error.UnsupportedFormat; + channels = std.mem.readInt(u16, body[2..4], .little); + sample_rate = std.mem.readInt(u32, body[4..8], .little); + bits = std.mem.readInt(u16, body[14..16], .little); + have_fmt = true; + } else if (std.mem.eql(u8, id, "data")) { + pcm = body; + } + // Subchunks are padded to an even byte count. 
+ pos = body_start + size + (size & 1); + } + + if (!have_fmt) return error.MissingFmt; + const data = pcm orelse return error.MissingData; + return .{ + .sample_rate = sample_rate, + .channels = channels, + .bits_per_sample = bits, + .data = try gpa.dupe(u8, data), + }; +} + +test "decode WAV PCM s16le extracts format and samples" { + const gpa = std.testing.allocator; + var audio = try decode(gpa, &wav_pcm); + defer audio.deinit(gpa); + try std.testing.expectEqual(@as(u32, wav_sample_rate), audio.sample_rate); + try std.testing.expectEqual(@as(u16, wav_channels), audio.channels); + try std.testing.expectEqual(@as(u16, wav_bits), audio.bits_per_sample); + try std.testing.expectEqual(@as(usize, wav_frames), audio.frameCount()); + try std.testing.expectEqualSlices(u8, &wav_pcm_data, audio.data); +} + +test "decode rejects a non-RIFF buffer" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.BadRiff, decode(gpa, "nope")); +} + +const wav_sample_rate = 8000; +const wav_channels = 2; +const wav_bits = 16; +const wav_frames = 8; +const wav_pcm = [_]u8{ 0x52, 0x49, 0x46, 0x46, 0x44, 0x00, 0x00, 0x00, 0x57, 0x41, 0x56, 0x45, 0x66, 0x6d, 0x74, 0x20, 0x10, 0x00, 0x00, 0x00, 0x01, 0x00, 0x02, 0x00, 0x40, 0x1f, 0x00, 0x00, 0x00, 0x7d, 0x00, 0x00, 0x04, 0x00, 0x10, 0x00, 0x64, 0x61, 0x74, 0x61, 0x20, 0x00, 0x00, 0x00, 0x68, 0xc5, 0x5c, 0xc7, 0x50, 0xc9, 0x44, 0xcb, 0x38, 0xcd, 0x2c, 0xcf, 0x20, 0xd1, 0x14, 0xd3, 0x08, 0xd5, 0xfc, 0xd6, 0xf0, 0xd8, 0xe4, 0xda, 0xd8, 0xdc, 0xcc, 0xde, 0xc0, 0xe0, 0xb4, 0xe2 }; +const wav_pcm_data = [_]u8{ 0x68, 0xc5, 0x5c, 0xc7, 0x50, 0xc9, 0x44, 0xcb, 0x38, 0xcd, 0x2c, 0xcf, 0x20, 0xd1, 0x14, 0xd3, 0x08, 0xd5, 0xfc, 0xd6, 0xf0, 0xd8, 0xe4, 0xda, 0xd8, 0xdc, 0xcc, 0xde, 0xc0, 0xe0, 0xb4, 0xe2 }; diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig index 7c2df03..1545fe8 100644 --- a/src/modules/asset_pipeline/root.zig +++ b/src/modules/asset_pipeline/root.zig @@ -22,9 +22,12 @@ pub const 
format = @import("format/root.zig"); /// Asset identity: `AssetHandle` + the slot `Registry`. pub const registry = @import("registry/root.zig"); -/// Low-level codecs (E2: DEFLATE/zlib; E3 adds PNG + glTF static). +/// Low-level codecs (E2: DEFLATE/zlib; E3: PNG + glTF static). pub const codecs = @import("codecs/root.zig"); +/// Source importers (E3: WAV decode; E4 adds png/gltf orchestration). +pub const importers = @import("importers/root.zig"); + /// 64-bit typed asset handle (convenience re-export). pub const AssetHandle = registry.AssetHandle; /// Slot registry (convenience re-export). @@ -42,4 +45,5 @@ comptime { _ = format; _ = registry; _ = codecs; + _ = importers; } From 64e311e85e55022b9f37f8dd5e36be1d19d4ffc1 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 01:30:32 +0200 Subject: [PATCH 16/29] docs(brief): journal update --- briefs/m0.6-assets.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index d37b4b9..f3a8f73 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -209,6 +209,8 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-03 21:33 — Two Zig 0.16 stdlib API fixes during E1: `std.meta.intToEnum` → `std.enums.fromInt`; ArrayList-writer string build → `std.Io.Writer.Allocating`. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. - 2026-06-04 00:55 — E2 implemented (GO relayed by Guy). `foundation/simd/` skeleton: `traits` (capability bitflags + comptime `detect`), `dispatch` (comptime variant select, portable-only in M0.6), `portable`, `simd` (public re-export); `kernels/adler32.zig` = scalar `reference` oracle + portable `@Vector` `vectorized` (NMAX-block weighted-sum, no ISA asm). `foundation` module added (`addModule`), `math` deferred to its first consumer. 
DEFLATE: native RFC 1951 `inflate` (stored + fixed + dynamic blocks, table-driven canonical Huffman via single-level lookup + bit-reversed codes, LZ77 back-refs over the whole output buffer) written from scratch (puff.c/miniz as structural reference only); `zlib` wrapper (header parse + inflate + big-endian ADLER32 trailer verified via `foundation.simd.adler32` — first real kernel consumer). - 2026-06-04 00:55 — E2 tests: `tests/assets/deflate_vectors.zig` (fixed/dynamic/stored/zlib + corrupted-trailer rejection) against Python-zlib vectors (decode-only milestone — authentic encoder output embedded, BTYPE inspected to label fixed=01/dynamic=10); `adler32_test.zig` (known vectors `""`/`"a"`/`"abc"`/`"Wikipedia"` + portable==reference) + `correctness.zig` (every variant == reference). `build.zig`: `foundation` dep on `asset_pipeline`, `foundation_tests` target, `foundation` `TestSpec` flag, `bench-adler32` step. Zig 0.16: `std.time.Timer` removed → bench uses `std.Io.Clock.Timestamp(.awake)` via `init.io` (keeps `foundation` std-only). Bench baseline (Apple Silicon, smoke): portable ≈ 349 MB/s, reference ≈ 96 MB/s. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. +- 2026-06-04 01:29 — E3 implemented (GO relayed by Guy; 3 deferred hardening notes recorded, not actioned). `paeth_filter_decode` = 2nd `foundation/simd` kernel (scalar `reference` + `@Vector` parallel across the `bpp` channels of each pixel; comptime-specialized bpp 1..4; no asm, baseline only) — registered in portable/dispatch/simd, validating the multi-kernel dispatch. PNG decoder (`codecs/png/`): chunk parse (IHDR/PLTE/tRNS/IDAT/IEND, CRC parsed-not-verified), IDAT via E2 `zlib.decompress`, all 5 filters (Paeth via the kernel), bit depths 1/2/4/8, palette + tRNS alpha, Adam7 interlace, → RGBA8. glTF static (`codecs/gltf/`): `std.json` into typed structs (`ignore_unknown_fields`), POSITION/NORMAL/TEXCOORD_0 + indices from embedded base64 buffers. 
WAV (`importers/wav.zig`): RIFF PCM. Out of scope this milestone, noted: 16-bit PNG, gray/RGB colour-key tRNS, external/`.glb` glTF buffers. +- 2026-06-04 01:29 — E3 tests: PNG/glTF/WAV decode tests inline in the codec files (vectors manually encoded + **verified by an independent Pillow decode** in a throwaway venv — avoids encoder/decoder bug-cancellation, esp. Adam7); `paeth_test.zig` (portable==reference + dispatched public entry + known scanline). `build.zig`: paeth test spec + `bench-paeth`. Bench baseline (Apple Silicon, smoke): portable ≈ 87 MB/s, reference ≈ 40 MB/s. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. Deferred hardening (Guy, not E3): (a) inflate 16/17/18 over-`total` tolerance; (b) negative inflate-guard tests; (c) `nmax` comment. ## Recorded deviations From cda2f637752bcfe73f96e82003e1ec2dbbd6cfa2 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 06:45:58 +0200 Subject: [PATCH 17/29] feat(assets): align intermediate format to normative schema --- .../asset_pipeline/format/intermediate.zig | 94 ++++++++++++++++--- 1 file changed, 83 insertions(+), 11 deletions(-) diff --git a/src/modules/asset_pipeline/format/intermediate.zig b/src/modules/asset_pipeline/format/intermediate.zig index ffff211..8e03b21 100644 --- a/src/modules/asset_pipeline/format/intermediate.zig +++ b/src/modules/asset_pipeline/format/intermediate.zig @@ -3,16 +3,22 @@ //! //! The on-disk text is the frozen surface (M0.6): a single top-level //! `asset "" { … }` construct holding the importer-extracted metadata -//! and the user-editable settings. It is the diffable, Git-versioned half -//! of an asset (the bulk bytes live in a separate hashed blob). +//! and the user-editable settings. The bulk bytes live in a separate hashed +//! blob; `extracted.blob` references it. //! -//! The reader/writer here is a deliberately small ad-hoc implementation: -//! 
the brief forbids a `weld_etch` dependency in M0.6 (the full Etch parser -//! is M0.8). The emitted text MUST stay a valid subset of `etch-grammar.md` -//! so the M0.8 parser reads it back unchanged — fields are `key: value`, -//! newline-separated inside `{ … }` blocks, arrays are comma-separated -//! `[ … ]`, enum literals are `.name`, strings are `"…"`. The on-disk text -//! is the frozen contract, not this reader implementation. +//! Normative schema: `engine-asset-pipeline.md §3` — fixed fields `type`, +//! `version`, `source`, `source_hash`, then the four blocks +//! `import_settings`, `process_settings`, `cook_settings`, `extracted`, with +//! `extracted.blob` ("<32 hex>", BLAKE3-128) mandatory. Grammar of the +//! `asset` construct: `etch-grammar.md §21.4` (category-4, +//! pipeline-generated). The container (fixed fields + block list + value +//! grammar) is frozen; block *contents* are open per asset category. +//! +//! This ad-hoc reader/writer avoids a `weld_etch` dependency in M0.6 (the +//! full Etch parser is M0.8). It covers exactly the §21.4 value grammar +//! minus `@unit(...)` annotations, which the writer does not emit in M0.6 +//! (additive Phase 1). The on-disk text is the frozen contract, not this +//! reader implementation. //! //! Ownership: `parseEtch` allocates every string/array/object into the //! caller-supplied allocator (use an arena and free it in one shot). The @@ -69,6 +75,33 @@ pub const Field = struct { value: Value, }; +/// Look up `key` in `fields` and return its integer value, or null if the +/// field is absent or not an int. (Used by cookers to read `extracted`.) +pub fn fieldInt(fields: []const Field, key: []const u8) ?i64 { + for (fields) |f| { + if (std.mem.eql(u8, f.key, key)) { + return switch (f.value) { + .int => |v| v, + else => null, + }; + } + } + return null; +} + +/// Look up `key` in `fields` and return its string value, or null. 
+pub fn fieldStr(fields: []const Field, key: []const u8) ?[]const u8 {
+    for (fields) |f| {
+        if (std.mem.eql(u8, f.key, key)) {
+            return switch (f.value) {
+                .string => |v| v,
+                else => null,
+            };
+        }
+    }
+    return null;
+}
+
 /// Deep equality over two ordered field lists.
 pub fn fieldsEql(a: []const Field, b: []const Field) bool {
     if (a.len != b.len) return false;
@@ -97,7 +130,11 @@ pub const AssetDoc = struct {
     import_settings: []const Field = &.{},
     /// User-editable process settings.
     process_settings: []const Field = &.{},
-    /// Importer-extracted, machine-maintained facts.
+    /// User-editable cook settings (per-platform sub-blocks, e.g.
+    /// `pc: { … }`). Emitted between `process_settings` and `extracted`.
+    cook_settings: []const Field = &.{},
+    /// Importer-extracted, machine-maintained facts. Always carries
+    /// `blob: "<32 hex>"` (see `blobHash`).
     extracted: []const Field = &.{},
 
     /// Deep structural equality (used by the round-trip test).
@@ -109,8 +146,22 @@ pub const AssetDoc = struct {
             std.mem.eql(u8, a.source_hash, b.source_hash) and
             fieldsEql(a.import_settings, b.import_settings) and
             fieldsEql(a.process_settings, b.process_settings) and
+            fieldsEql(a.cook_settings, b.cook_settings) and
             fieldsEql(a.extracted, b.extracted);
     }
+
+    /// Return the mandatory `extracted.blob` hash string, or null if absent.
+    pub fn blobHash(self: AssetDoc) ?[]const u8 {
+        for (self.extracted) |f| {
+            if (std.mem.eql(u8, f.key, "blob")) {
+                return switch (f.value) {
+                    .string => |s| s,
+                    else => null,
+                };
+            }
+        }
+        return null;
+    }
 };
 
 /// Error set raised while writing. `std.Io.Writer.Error` already covers a
@@ -127,6 +178,7 @@ pub fn writeEtch(doc: AssetDoc, out: *std.Io.Writer) WriteError!void {
     try out.print(" source_hash: \"{s}\"\n", .{doc.source_hash});
     try writeBlock(out, "import_settings", doc.import_settings);
     try writeBlock(out, "process_settings", doc.process_settings);
+    try writeBlock(out, "cook_settings", doc.cook_settings);
     try writeBlock(out, "extracted", doc.extracted);
     try out.writeAll("}\n");
 }
@@ -413,6 +465,11 @@ const Parser = struct {
                     .object => |o| o,
                     else => return error.UnexpectedChar,
                 };
+            } else if (std.mem.eql(u8, f.key, "cook_settings")) {
+                doc.cook_settings = switch (f.value) {
+                    .object => |o| o,
+                    else => return error.UnexpectedChar,
+                };
             } else if (std.mem.eql(u8, f.key, "extracted")) {
                 doc.extracted = switch (f.value) {
                     .object => |o| o,
@@ -443,10 +500,17 @@ test "intermediate doc round-trips through etch text" {
     const process_settings = [_]Field{
         .{ .key = "generate_lods", .value = .{ .boolean = false } },
     };
+    const pc_cook = [_]Field{
+        .{ .key = "vertex_format", .value = .{ .enum_literal = "compressed" } },
+    };
+    const cook_settings = [_]Field{
+        .{ .key = "pc", .value = .{ .object = &pc_cook } }, // per-platform sub-block
+    };
     const extracted = [_]Field{
         .{ .key = "vertex_count", .value = .{ .int = 24 } },
         .{ .key = "bounds", .value = .{ .object = &bounds } },
         .{ .key = "materials", .value = .{ .array = &materials } },
+        .{ .key = "blob", .value = .{ .string = "a3f2b1c98d" } }, // mandatory
     };
 
     const original = AssetDoc{
@@ -457,6 +521,7 @@ test "intermediate doc round-trips through etch text" {
         .source_hash = "abc123",
         .import_settings = &import_settings,
         .process_settings = &process_settings,
+        .cook_settings = &cook_settings,
         .extracted = &extracted,
     };
 
@@ -470,7 +535,14 @@ test "intermediate doc round-trips through etch text" {
     try std.testing.expect(original.eql(parsed));
     try std.testing.expectEqualStrings("StaticMesh", parsed.type_name);
     try std.testing.expectEqual(@as(u16, 1), parsed.version);
-    try std.testing.expectEqual(@as(usize, 3), parsed.extracted.len);
+    try std.testing.expectEqual(@as(usize, 4), parsed.extracted.len);
+    try std.testing.expectEqual(@as(usize, 1), parsed.cook_settings.len);
+    try std.testing.expectEqualStrings("a3f2b1c98d", original.blobHash().?);
+    try std.testing.expectEqualStrings("a3f2b1c98d", parsed.blobHash().?);
+    // Field accessors used by the cookers.
+    try std.testing.expectEqual(@as(i64, 24), fieldInt(parsed.extracted, "vertex_count").?);
+    try std.testing.expectEqualStrings("a3f2b1c98d", fieldStr(parsed.extracted, "blob").?);
+    try std.testing.expectEqual(@as(?i64, null), fieldInt(parsed.extracted, "bounds")); // not an int
 }
 
 test "intermediate writer emits a valid asset construct shape" {
From 442d52f2c6878c432c55cf764be536d229cecf9f Mon Sep 17 00:00:00 2001
From: Guy Senpai
Date: Thu, 4 Jun 2026 06:46:06 +0200
Subject: [PATCH 18/29] feat(assets): add importers, cookers, BLAKE3 hashing, and cache

---
 src/modules/asset_pipeline/cache/cache.zig | 78 +++++++++++++++++++
 src/modules/asset_pipeline/cache/root.zig | 12 +++
 src/modules/asset_pipeline/cookers/audio.zig | 24 ++++++
 src/modules/asset_pipeline/cookers/common.zig | 48 ++++++++++++
 src/modules/asset_pipeline/cookers/mesh.zig | 66 ++++++++++++++++
 src/modules/asset_pipeline/cookers/root.zig | 24 ++++++
 .../asset_pipeline/cookers/texture.zig | 21 +++++
 src/modules/asset_pipeline/hash.zig | 39 ++++++++++
 .../asset_pipeline/importers/common.zig | 22 ++++++
 src/modules/asset_pipeline/importers/gltf.zig | 67 ++++++++++++++++
 src/modules/asset_pipeline/importers/png.zig | 63 +++++++++++++++
 src/modules/asset_pipeline/importers/root.zig | 21 ++++-
 src/modules/asset_pipeline/importers/wav.zig | 40 ++++++++++
 src/modules/asset_pipeline/root.zig | 14 +++-
 14 files changed, 534 insertions(+), 5 deletions(-)
 create mode 100644 src/modules/asset_pipeline/cache/cache.zig
 create mode 100644 src/modules/asset_pipeline/cache/root.zig
 create mode 100644 src/modules/asset_pipeline/cookers/audio.zig
 create mode 100644 src/modules/asset_pipeline/cookers/common.zig
 create mode 100644 src/modules/asset_pipeline/cookers/mesh.zig
 create mode 100644 src/modules/asset_pipeline/cookers/root.zig
 create mode 100644 src/modules/asset_pipeline/cookers/texture.zig
 create mode 100644 src/modules/asset_pipeline/hash.zig
 create mode 100644 src/modules/asset_pipeline/importers/common.zig
 create mode 100644 src/modules/asset_pipeline/importers/gltf.zig
 create mode 100644 src/modules/asset_pipeline/importers/png.zig

diff --git a/src/modules/asset_pipeline/cache/cache.zig b/src/modules/asset_pipeline/cache/cache.zig
new file mode 100644
index 0000000..0b138db
--- /dev/null
+++ b/src/modules/asset_pipeline/cache/cache.zig
@@ -0,0 +1,78 @@
+//! Local cooking cache — a directory of `.bin` cooked artifacts.
+//!
+//! Key = BLAKE3-128 hex of `source_hash ++ settings ++ platform`
+//! (brief §E4): the cache invalidates on `source_hash`, the settings, or the
+//! target platform. A cache hit avoids re-running the (expensive) decode +
+//! cook; the differential is measured in `tests/assets/cache_diff.zig`.
+//!
+//! Only the local tier exists in M0.6 (network/cloud tiers are Phase 2+).
+
+const std = @import("std");
+
+const Blake3 = std.crypto.hash.Blake3;
+
+/// Compute the 32-hex cache key from the cook inputs.
+pub fn computeKey(source_hash: []const u8, settings: []const u8, platform: u16) [32]u8 {
+    var h = Blake3.init(.{});
+    h.update(source_hash);
+    h.update(settings);
+    h.update(std.mem.asBytes(&platform));
+    var digest: [16]u8 = undefined;
+    h.final(&digest);
+    return std.fmt.bytesToHex(digest, .lower);
+}
+
+/// A cooking cache rooted at an open directory.
+pub const Cache = struct {
+    /// The cache directory (e.g. `.weld/cache/`).
+    dir: std.Io.Dir,
+
+    /// Wrap an already-open directory as a cache.
+    pub fn init(dir: std.Io.Dir) Cache {
+        return .{ .dir = dir };
+    }
+
+    fn fileName(key_hex: []const u8, buf: *[36]u8) []const u8 {
+        @memcpy(buf[0..32], key_hex[0..32]);
+        @memcpy(buf[32..36], ".bin");
+        return buf[0..36];
+    }
+
+    /// True if a cooked artifact exists for `key_hex` (a cache hit).
+    pub fn contains(self: Cache, io: std.Io, key_hex: []const u8) bool {
+        var buf: [36]u8 = undefined;
+        const f = self.dir.openFile(io, fileName(key_hex, &buf), .{}) catch return false;
+        f.close(io);
+        return true;
+    }
+
+    /// Store cooked `bin` under `key_hex`.
+    pub fn put(self: Cache, io: std.Io, key_hex: []const u8, bin: []const u8) !void {
+        var buf: [36]u8 = undefined;
+        const f = try self.dir.createFile(io, fileName(key_hex, &buf), .{ .truncate = true });
+        defer f.close(io);
+        try f.writeStreamingAll(io, bin);
+    }
+
+    /// Read the cooked artifact for `key_hex` (caller-owned), or null if
+    /// absent.
+    pub fn get(self: Cache, gpa: std.mem.Allocator, io: std.Io, key_hex: []const u8) !?[]u8 {
+        var buf: [36]u8 = undefined;
+        const f = self.dir.openFile(io, fileName(key_hex, &buf), .{}) catch return null;
+        defer f.close(io);
+
+        const size: usize = @intCast((try f.stat(io)).size);
+        const out = try gpa.alloc(u8, size);
+        errdefer gpa.free(out);
+
+        var read_buf: [4096]u8 = undefined;
+        var reader = f.reader(io, &read_buf);
+        var written: usize = 0;
+        while (written < size) {
+            const n = try reader.interface.readSliceShort(out[written..]);
+            if (n == 0) break;
+            written += n;
+        }
+        return out[0..written];
+    }
+};
diff --git a/src/modules/asset_pipeline/cache/root.zig b/src/modules/asset_pipeline/cache/root.zig
new file mode 100644
index 0000000..a9ce914
--- /dev/null
+++ b/src/modules/asset_pipeline/cache/root.zig
@@ -0,0 +1,12 @@
+//! Asset Pipeline `cache/` namespace — local cooking cache.
+
+const cache_mod = @import("cache.zig");
+
+/// Directory-backed cooking cache.
+pub const Cache = cache_mod.Cache;
+/// Compute the cooking-cache key from the cook inputs.
+pub const computeKey = cache_mod.computeKey;
+
+comptime {
+    _ = cache_mod;
+}
diff --git a/src/modules/asset_pipeline/cookers/audio.zig b/src/modules/asset_pipeline/cookers/audio.zig
new file mode 100644
index 0000000..42bfab9
--- /dev/null
+++ b/src/modules/asset_pipeline/cookers/audio.zig
@@ -0,0 +1,24 @@
+//! Audio cooker — intermediate → `.audio.bin`.
+//!
+//! M0.6 payload is raw PCM (no Opus — the Opus keeper is not wired in M0.6,
+//! brief §Out-of-scope). Metadata section: `sample_rate` u32,
+//! `channels` u16, `bits_per_sample` u16 (LE).
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const common = @import("common.zig");
+
+/// Cook an audio intermediate (`doc` + PCM `blob`) into an `.audio.bin`.
+/// Format fields are read from `doc.extracted`.
+pub fn cook(gpa: std.mem.Allocator, doc: format.AssetDoc, blob: []const u8) common.Error![]u8 {
+    const sample_rate = format.intermediate.fieldInt(doc.extracted, "sample_rate") orelse return error.MissingMetadata;
+    const channels = format.intermediate.fieldInt(doc.extracted, "channels") orelse return error.MissingMetadata;
+    const bits = format.intermediate.fieldInt(doc.extracted, "bits_per_sample") orelse return error.MissingMetadata;
+
+    var meta: [8]u8 = undefined;
+    std.mem.writeInt(u32, meta[0..4], @intCast(sample_rate), .little);
+    std.mem.writeInt(u16, meta[4..6], @intCast(channels), .little);
+    std.mem.writeInt(u16, meta[6..8], @intCast(bits), .little);
+
+    return common.assemble(gpa, .audio, &meta, blob);
+}
diff --git a/src/modules/asset_pipeline/cookers/common.zig b/src/modules/asset_pipeline/cookers/common.zig
new file mode 100644
index 0000000..360777a
--- /dev/null
+++ b/src/modules/asset_pipeline/cookers/common.zig
@@ -0,0 +1,48 @@
+//! Shared cook assembly: lay out a runtime `..bin` as
+//! `[40-byte header][metadata][data]`.
+//!
+//! The header is the E1-frozen `RuntimeHeader`, written via its explicit
+//! little-endian `writeTo`/`toBytes` (no `@ptrCast` on write — the on-disk
+//! bytes are produced field-by-field so the format is endianness-defined).
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const hash = @import("../hash.zig");
+
+/// Errors raised by the cookers.
+pub const Error = error{
+    /// A required `extracted` metadata field was missing or malformed.
+    MissingMetadata,
+    /// Allocation failed.
+    OutOfMemory,
+};
+
+/// Assemble a `.bin`: header + metadata section + bulk data section. `hash`
+/// is the BLAKE3-derived u64 of the data payload.
+pub fn assemble(
+    gpa: std.mem.Allocator,
+    asset_type: format.AssetType,
+    metadata: []const u8,
+    data: []const u8,
+) Error![]u8 {
+    const md_off: u32 = @intCast(format.header_size);
+    const data_off: u32 = md_off + @as(u32, @intCast(metadata.len));
+    const total = @as(usize, data_off) + data.len;
+
+    const out = try gpa.alloc(u8, total);
+    errdefer gpa.free(out);
+
+    const header = format.RuntimeHeader.init(.{
+        .asset_type = asset_type,
+        .metadata_offset = md_off,
+        .metadata_size = @intCast(metadata.len),
+        .data_offset = data_off,
+        .data_size = @intCast(data.len),
+        .hash = hash.u64Of(data),
+    });
+    const header_bytes = header.toBytes();
+    @memcpy(out[0..format.header_size], &header_bytes);
+    @memcpy(out[md_off..][0..metadata.len], metadata);
+    @memcpy(out[data_off..][0..data.len], data);
+    return out;
+}
diff --git a/src/modules/asset_pipeline/cookers/mesh.zig b/src/modules/asset_pipeline/cookers/mesh.zig
new file mode 100644
index 0000000..86b3f56
--- /dev/null
+++ b/src/modules/asset_pipeline/cookers/mesh.zig
@@ -0,0 +1,66 @@
+//! Mesh cooker — intermediate → `.mesh.bin` (version 1).
+//!
+//! M0.6 payload is raw f32 vertices + u32 indices (no quantization — that
+//! is a Phase 1 `.mesh.bin` version bump, brief §Out-of-scope). Metadata
+//! section: `vertex_count` u32, `index_count` u32, bounds min/max (6 × f32),
+//! all little-endian.
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const common = @import("common.zig");
+
+const Value = format.Value;
+
+/// Cook a mesh intermediate (`doc` + `blob` = positions f32 ++ indices u32)
+/// into a `.mesh.bin`. Counts and bounds are read from `doc.extracted`.
+pub fn cook(gpa: std.mem.Allocator, doc: format.AssetDoc, blob: []const u8) common.Error![]u8 {
+    const vertex_count = format.intermediate.fieldInt(doc.extracted, "vertex_count") orelse return error.MissingMetadata;
+    const index_count = format.intermediate.fieldInt(doc.extracted, "index_count") orelse return error.MissingMetadata;
+    const bounds = readBounds(doc.extracted) orelse return error.MissingMetadata;
+
+    var meta: [32]u8 = undefined;
+    std.mem.writeInt(u32, meta[0..4], @intCast(vertex_count), .little);
+    std.mem.writeInt(u32, meta[4..8], @intCast(index_count), .little);
+    inline for (0..3) |i| std.mem.writeInt(u32, meta[8 + i * 4 ..][0..4], @bitCast(bounds.min[i]), .little);
+    inline for (0..3) |i| std.mem.writeInt(u32, meta[20 + i * 4 ..][0..4], @bitCast(bounds.max[i]), .little);
+
+    return common.assemble(gpa, .mesh, &meta, blob);
+}
+
+const Bounds = struct { min: [3]f32, max: [3]f32 };
+
+fn readBounds(extracted: []const format.Field) ?Bounds {
+    for (extracted) |f| {
+        if (std.mem.eql(u8, f.key, "bounds")) {
+            const obj = switch (f.value) {
+                .object => |o| o,
+                else => return null,
+            };
+            return .{
+                .min = readVec3(obj, "min") orelse return null,
+                .max = readVec3(obj, "max") orelse return null,
+            };
+        }
+    }
+    return null;
+}
+
+fn readVec3(obj: []const format.Field, key: []const u8) ?[3]f32 {
+    for (obj) |f| {
+        if (std.mem.eql(u8, f.key, key)) {
+            const arr = switch (f.value) {
+                .array => |a| a,
+                else => return null,
+            };
+            if (arr.len < 3) return null;
+            var v: [3]f32 = undefined;
+            for (0..3) |i| v[i] = switch (arr[i]) {
+                .float => |x| @floatCast(x),
+                .int => |x| @floatFromInt(x),
+                else => return null,
+            };
+            return v;
+        }
+    }
+    return null;
+}
diff --git a/src/modules/asset_pipeline/cookers/root.zig b/src/modules/asset_pipeline/cookers/root.zig
new file mode 100644
index 0000000..2317f3e
--- /dev/null
+++ b/src/modules/asset_pipeline/cookers/root.zig
@@ -0,0 +1,24 @@
+//! Asset Pipeline `cookers/` namespace — intermediate → runtime `..bin`.
+//!
+//! Each cooker assembles the E1-frozen 40-byte header + a small metadata
+//! section + the bulk payload. M0.6 payloads are raw (RGBA8 / f32 vertices /
+//! PCM); compression and quantization are later phases.
+
+/// Shared `.bin` assembly + cook `Error`.
+pub const common = @import("common.zig");
+/// Cook error set.
+pub const Error = common.Error;
+
+/// Cook a texture intermediate → `.texture.bin`.
+pub const cookTexture = @import("texture.zig").cook;
+/// Cook a mesh intermediate → `.mesh.bin`.
+pub const cookMesh = @import("mesh.zig").cook;
+/// Cook an audio intermediate → `.audio.bin`.
+pub const cookAudio = @import("audio.zig").cook;
+
+comptime {
+    _ = common;
+    _ = @import("texture.zig");
+    _ = @import("mesh.zig");
+    _ = @import("audio.zig");
+}
diff --git a/src/modules/asset_pipeline/cookers/texture.zig b/src/modules/asset_pipeline/cookers/texture.zig
new file mode 100644
index 0000000..e0b3171
--- /dev/null
+++ b/src/modules/asset_pipeline/cookers/texture.zig
@@ -0,0 +1,21 @@
+//! Texture cooker — intermediate → `.texture.bin`.
+//!
+//! M0.6 payload is raw RGBA8 (no mipmaps, no GPU block compression — brief
+//! §Out-of-scope). Metadata section: `width` u32, `height` u32 (LE).
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const common = @import("common.zig");
+
+/// Cook a texture intermediate (`doc` + decoded RGBA8 `blob`) into a
+/// `.texture.bin`. `width`/`height` are read from `doc.extracted`.
+pub fn cook(gpa: std.mem.Allocator, doc: format.AssetDoc, blob: []const u8) common.Error![]u8 {
+    const width = format.intermediate.fieldInt(doc.extracted, "width") orelse return error.MissingMetadata;
+    const height = format.intermediate.fieldInt(doc.extracted, "height") orelse return error.MissingMetadata;
+
+    var meta: [8]u8 = undefined;
+    std.mem.writeInt(u32, meta[0..4], @intCast(width), .little);
+    std.mem.writeInt(u32, meta[4..8], @intCast(height), .little);
+
+    return common.assemble(gpa, .texture, &meta, blob);
+}
diff --git a/src/modules/asset_pipeline/hash.zig b/src/modules/asset_pipeline/hash.zig
new file mode 100644
index 0000000..6e0f1f6
--- /dev/null
+++ b/src/modules/asset_pipeline/hash.zig
@@ -0,0 +1,39 @@
+//! Content hashing for the asset pipeline — BLAKE3, native std (no C binding).
+//!
+//! `source_hash`, `extracted.blob`, and the cooking-cache key are all
+//! BLAKE3 truncated to 128 bits, rendered as 32 lowercase hex chars
+//! (`engine-asset-pipeline.md §3`, brief §E4). The runtime `.bin` header
+//! `hash` field is a u64 (the low 64 bits of the same digest).
+
+const std = @import("std");
+
+const Blake3 = std.crypto.hash.Blake3;
+
+/// 32-char lowercase hex of the BLAKE3-128 (16-byte) digest of `data`.
+pub fn hex128(data: []const u8) [32]u8 {
+    var digest: [16]u8 = undefined;
+    Blake3.hash(data, &digest, .{});
+    return std.fmt.bytesToHex(digest, .lower);
+}
+
+/// Low 64 bits of the BLAKE3 digest of `data` (the `.bin` header `hash`).
+pub fn u64Of(data: []const u8) u64 {
+    var digest: [8]u8 = undefined;
+    Blake3.hash(data, &digest, .{});
+    return std.mem.readInt(u64, &digest, .little);
+}
+
+test "hex128 is 32 lowercase hex chars and stable" {
+    const a = hex128("weld");
+    const b = hex128("weld");
+    try std.testing.expectEqual(@as(usize, 32), a.len);
+    try std.testing.expectEqualSlices(u8, &a, &b);
+    for (a) |c| try std.testing.expect((c >= '0' and c <= '9') or (c >= 'a' and c <= 'f'));
+    // Distinct inputs give distinct hashes.
+    try std.testing.expect(!std.mem.eql(u8, &a, &hex128("weld!")));
+}
+
+test "u64Of is stable and input-sensitive" {
+    try std.testing.expectEqual(u64Of("abc"), u64Of("abc"));
+    try std.testing.expect(u64Of("abc") != u64Of("abd"));
+}
diff --git a/src/modules/asset_pipeline/importers/common.zig b/src/modules/asset_pipeline/importers/common.zig
new file mode 100644
index 0000000..b39b5ad
--- /dev/null
+++ b/src/modules/asset_pipeline/importers/common.zig
@@ -0,0 +1,22 @@
+//! Shared importer output.
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+
+/// An imported asset: the intermediate document and its referenced blob.
+/// `arena` owns everything reachable from `doc`; `blob` is `gpa`-owned.
+pub const Import = struct {
+    /// Owns all of `doc`'s strings/fields.
+    arena: std.heap.ArenaAllocator,
+    /// The intermediate `.asset.etch` document.
+    doc: format.AssetDoc,
+    /// The referenced binary blob (decoded payload), `gpa`-owned.
+    blob: []u8,
+
+    /// Free the document arena and the blob.
+    pub fn deinit(self: *Import, gpa: std.mem.Allocator) void {
+        self.arena.deinit();
+        gpa.free(self.blob);
+        self.* = undefined;
+    }
+};
diff --git a/src/modules/asset_pipeline/importers/gltf.zig b/src/modules/asset_pipeline/importers/gltf.zig
new file mode 100644
index 0000000..568b70d
--- /dev/null
+++ b/src/modules/asset_pipeline/importers/gltf.zig
@@ -0,0 +1,67 @@
+//! glTF importer — source bytes → intermediate (`AssetDoc` + vertex blob).
+//!
+//! Decodes via `codecs.gltf` and builds a `StaticMesh`
+//! `.asset.etch` document plus a blob = positions (f32) ++ indices
+//! (u32), all little-endian. Pure (in-memory).
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const hash = @import("../hash.zig");
+const gltf = @import("../codecs/gltf/root.zig");
+const common = @import("common.zig");
+
+const Field = format.Field;
+const Value = format.Value;
+const AssetDoc = format.AssetDoc;
+
+/// Imported asset (document arena + blob).
+pub const Import = common.Import;
+
+/// Errors raised by `import`.
+pub const Error = error{OutOfMemory} || gltf.Error;
+
+/// Import a glTF file (`src` bytes from `source_path`) into an intermediate.
+pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import {
+    var mesh = try gltf.decode(gpa, src);
+    defer mesh.deinit(gpa);
+
+    // Blob: positions (f32 LE) followed by indices (u32 LE).
+    const pos_bytes = mesh.positions.len * 4;
+    const idx_bytes = mesh.indices.len * 4;
+    const blob = try gpa.alloc(u8, pos_bytes + idx_bytes);
+    errdefer gpa.free(blob);
+    for (mesh.positions, 0..) |p, i| std.mem.writeInt(u32, blob[i * 4 ..][0..4], @bitCast(p), .little);
+    for (mesh.indices, 0..) |idx, i| std.mem.writeInt(u32, blob[pos_bytes + i * 4 ..][0..4], idx, .little);
+
+    var arena = std.heap.ArenaAllocator.init(gpa);
+    errdefer arena.deinit();
+    const a = arena.allocator();
+
+    const min_arr = try a.dupe(Value, &[_]Value{
+        .{ .float = mesh.bounds_min[0] }, .{ .float = mesh.bounds_min[1] }, .{ .float = mesh.bounds_min[2] },
+    });
+    const max_arr = try a.dupe(Value, &[_]Value{
+        .{ .float = mesh.bounds_max[0] }, .{ .float = mesh.bounds_max[1] }, .{ .float = mesh.bounds_max[2] },
+    });
+    const bounds_obj = try a.dupe(Field, &[_]Field{
+        .{ .key = "min", .value = .{ .array = min_arr } },
+        .{ .key = "max", .value = .{ .array = max_arr } },
+    });
+    const extracted = try a.dupe(Field, &[_]Field{
+        .{ .key = "vertex_count", .value = .{ .int = mesh.vertex_count } },
+        .{ .key = "index_count", .value = .{ .int = @intCast(mesh.indices.len) } },
+        .{ .key = "bounds", .value = .{ .object = bounds_obj } },
+        .{ .key = "blob", .value = .{ .string = try a.dupe(u8, &hash.hex128(blob)) } },
+    });
+
+    const doc = AssetDoc{
+        .name = try a.dupe(u8, std.fs.path.stem(source_path)),
+        .type_name = "StaticMesh",
+        .version = 1,
+        .source = try a.dupe(u8, source_path),
+        .source_hash = try a.dupe(u8, &hash.hex128(src)),
+        .extracted = extracted,
+    };
+
+    return .{ .arena = arena, .doc = doc, .blob = blob };
+}
diff --git a/src/modules/asset_pipeline/importers/png.zig b/src/modules/asset_pipeline/importers/png.zig
new file mode 100644
index 0000000..f2fb7fe
--- /dev/null
+++ b/src/modules/asset_pipeline/importers/png.zig
@@ -0,0 +1,63 @@
+//! PNG importer — source bytes → intermediate (`AssetDoc` + RGBA8 blob).
+//!
+//! Decodes via `codecs.png` and builds a `Texture2D` `.asset.etch`
+//! document plus the decoded RGBA8 blob. Pure (in-memory); writing the
+//! `.texture.asset.etch` text and the `.weld/blobs/` blob is the offline
+//! pipeline's job.
+
+const std = @import("std");
+const format = @import("../format/root.zig");
+const hash = @import("../hash.zig");
+const png = @import("../codecs/png/root.zig");
+const common = @import("common.zig");
+
+const Field = format.Field;
+const AssetDoc = format.AssetDoc;
+
+/// Imported asset (document arena + blob).
+pub const Import = common.Import;
+
+/// Errors raised by `import`.
+pub const Error = error{OutOfMemory} || png.Error;
+
+/// Import a PNG file (`src` bytes from `source_path`) into an intermediate.
+pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import {
+    var img = try png.decode(gpa, src);
+    errdefer img.deinit(gpa);
+
+    var arena = std.heap.ArenaAllocator.init(gpa);
+    errdefer arena.deinit();
+    const a = arena.allocator();
+
+    const blob = img.pixels; // ownership transfers to Import on success
+    const blob_hash = try a.dupe(u8, &hash.hex128(blob));
+
+    const import_settings = try a.dupe(Field, &[_]Field{
+        .{ .key = "srgb", .value = .{ .boolean = true } },
+    });
+    const cook_pc = try a.dupe(Field, &[_]Field{
+        .{ .key = "format", .value = .{ .enum_literal = "rgba8" } },
+    });
+    const cook_settings = try a.dupe(Field, &[_]Field{
+        .{ .key = "pc", .value = .{ .object = cook_pc } },
+    });
+    const extracted = try a.dupe(Field, &[_]Field{
+        .{ .key = "width", .value = .{ .int = img.width } },
+        .{ .key = "height", .value = .{ .int = img.height } },
+        .{ .key = "channels", .value = .{ .int = 4 } },
+        .{ .key = "blob", .value = .{ .string = blob_hash } },
+    });
+
+    const doc = AssetDoc{
+        .name = try a.dupe(u8, std.fs.path.stem(source_path)),
+        .type_name = "Texture2D",
+        .version = 1,
+        .source = try a.dupe(u8, source_path),
+        .source_hash = try a.dupe(u8, &hash.hex128(src)),
+        .import_settings = import_settings,
+        .cook_settings = cook_settings,
+        .extracted = extracted,
+    };
+
+    return .{ .arena = arena, .doc = doc, .blob = blob };
+}
diff --git a/src/modules/asset_pipeline/importers/root.zig b/src/modules/asset_pipeline/importers/root.zig
index 0d8646f..d69512d 100644
--- a/src/modules/asset_pipeline/importers/root.zig
+++ b/src/modules/asset_pipeline/importers/root.zig
@@ -1,11 +1,24 @@
-//! Asset Pipeline `importers/` namespace.
+//! Asset Pipeline `importers/` namespace — source → intermediate.
 //!
-//! M0.6 / E3 ships the WAV decoder here (RIFF is trivial, brief §Notes). The
-//! png / gltf source→intermediate import orchestration lands in E4.
+//! Each importer decodes a source file (via the E3 codecs) and produces an
+//! `Import` = intermediate `AssetDoc` + referenced binary blob. M0.6: PNG
+//! (texture), glTF (static mesh), WAV (audio).
 
-/// WAV (RIFF PCM) decode.
+/// Shared importer output (`Import` = document arena + blob).
+pub const common = @import("common.zig");
+/// `Import` result type.
+pub const Import = common.Import;
+
+/// PNG importer (→ `Texture2D`).
+pub const png = @import("png.zig");
+/// glTF importer (→ `StaticMesh`).
+pub const gltf = @import("gltf.zig");
+/// WAV importer + RIFF PCM decode (→ `AudioClip`).
 pub const wav = @import("wav.zig");
 
 comptime {
+    _ = common;
+    _ = png;
+    _ = gltf;
     _ = wav;
 }
diff --git a/src/modules/asset_pipeline/importers/wav.zig b/src/modules/asset_pipeline/importers/wav.zig
index 9259f55..131999a 100644
--- a/src/modules/asset_pipeline/importers/wav.zig
+++ b/src/modules/asset_pipeline/importers/wav.zig
@@ -5,8 +5,14 @@
 //! the import → intermediate orchestration is added in E4.
 //!
 //! Supports linear PCM (`audio_format == 1`); compressed WAV is out of scope.
+//!
+//! E4 adds `import` (source → intermediate `AudioClip` doc + PCM blob) atop
+//! the E3 `decode`.
 
 const std = @import("std");
+const format = @import("../format/root.zig");
+const hash = @import("../hash.zig");
+const common = @import("common.zig");
 
 /// Errors raised by `decode`.
 pub const Error = error{
@@ -95,6 +101,40 @@ pub fn decode(gpa: std.mem.Allocator, src: []const u8) Error!Audio {
     };
 }
 
+/// Imported asset (document arena + PCM blob).
+pub const Import = common.Import;
+
+/// Import a WAV file (`src` bytes from `source_path`) into an intermediate
+/// `AudioClip` document + PCM blob.
+pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import {
+    var audio = try decode(gpa, src);
+    errdefer audio.deinit(gpa);
+
+    var arena = std.heap.ArenaAllocator.init(gpa);
+    errdefer arena.deinit();
+    const a = arena.allocator();
+
+    const blob = audio.data; // ownership transfers to Import on success
+    const extracted = try a.dupe(format.Field, &[_]format.Field{
+        .{ .key = "sample_rate", .value = .{ .int = audio.sample_rate } },
+        .{ .key = "channels", .value = .{ .int = audio.channels } },
+        .{ .key = "bits_per_sample", .value = .{ .int = audio.bits_per_sample } },
+        .{ .key = "frame_count", .value = .{ .int = @intCast(audio.frameCount()) } },
+        .{ .key = "blob", .value = .{ .string = try a.dupe(u8, &hash.hex128(blob)) } },
+    });
+
+    const doc = format.AssetDoc{
+        .name = try a.dupe(u8, std.fs.path.stem(source_path)),
+        .type_name = "AudioClip",
+        .version = 1,
+        .source = try a.dupe(u8, source_path),
+        .source_hash = try a.dupe(u8, &hash.hex128(src)),
+        .extracted = extracted,
+    };
+
+    return .{ .arena = arena, .doc = doc, .blob = blob };
+}
+
 test "decode WAV PCM s16le extracts format and samples" {
     const gpa = std.testing.allocator;
     var audio = try decode(gpa, &wav_pcm);
diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig
index 1545fe8..8f299ba 100644
--- a/src/modules/asset_pipeline/root.zig
+++ b/src/modules/asset_pipeline/root.zig
@@ -25,9 +25,18 @@ pub const registry = @import("registry/root.zig");
 /// Low-level codecs (E2: DEFLATE/zlib; E3: PNG + glTF static).
 pub const codecs = @import("codecs/root.zig");
 
-/// Source importers (E3: WAV decode; E4 adds png/gltf orchestration).
+/// Source importers (source → intermediate `AssetDoc` + blob).
 pub const importers = @import("importers/root.zig");
 
+/// Cookers (intermediate → runtime `..bin`).
+pub const cookers = @import("cookers/root.zig");
+
+/// Local cooking cache (BLAKE3-keyed).
+pub const cache = @import("cache/root.zig");
+
+/// Content hashing (BLAKE3-128 hex / u64).
+pub const hash = @import("hash.zig");
+
 /// 64-bit typed asset handle (convenience re-export).
 pub const AssetHandle = registry.AssetHandle;
 /// Slot registry (convenience re-export).
@@ -46,4 +55,7 @@ comptime {
     _ = registry;
     _ = codecs;
     _ = importers;
+    _ = cookers;
+    _ = cache;
+    _ = hash;
 }
From f3886eee6d78bdf24f2f19313849d7e0fe0557a2 Mon Sep 17 00:00:00 2001
From: Guy Senpai
Date: Thu, 4 Jun 2026 06:46:08 +0200
Subject: [PATCH 19/29] feat(assets): add round-trip tests, fixtures, and offline cook

---
 build.zig | 29 ++++++
 tests/assets/cache_diff.zig | 78 ++++++++++++++++
 tests/assets/data/checker.png | Bin 0 -> 83 bytes
 tests/assets/data/cube.gltf | 1 +
 tests/assets/data/tone.wav | Bin 0 -> 172 bytes
 tests/assets/gltf_static_roundtrip.zig | 41 +++++++++
 tests/assets/png_roundtrip.zig | 36 ++++++++
 tests/assets/wav_roundtrip.zig | 34 +++++++
 tools/asset_cook/main.zig | 122 +++++++++++++++++++++++++
 9 files changed, 341 insertions(+)
 create mode 100644 tests/assets/cache_diff.zig
 create mode 100644 tests/assets/data/checker.png
 create mode 100644 tests/assets/data/cube.gltf
 create mode 100644 tests/assets/data/tone.wav
 create mode 100644 tests/assets/gltf_static_roundtrip.zig
 create mode 100644 tests/assets/png_roundtrip.zig
 create mode 100644 tests/assets/wav_roundtrip.zig
 create mode 100644 tools/asset_cook/main.zig

diff --git a/build.zig b/build.zig
index 8bdd7c4..bb5cfeb 100644
--- a/build.zig
+++ b/build.zig
@@ -445,6 +445,11 @@ pub fn build(b: *std.Build) void {
         .{ .path = "src/foundation/simd/tests/correctness.zig", .foundation = true },
         // M0.6 / E3 — paeth_filter_decode kernel portable == reference.
         .{ .path = "src/foundation/simd/tests/paeth_test.zig", .foundation = true },
+        // M0.6 / E4 — import → cook → load round-trips + cache differential.
+        .{ .path = "tests/assets/png_roundtrip.zig", .asset_pipeline = true },
+        .{ .path = "tests/assets/gltf_static_roundtrip.zig", .asset_pipeline = true },
+        .{ .path = "tests/assets/wav_roundtrip.zig", .asset_pipeline = true },
+        .{ .path = "tests/assets/cache_diff.zig", .asset_pipeline = true },
     };
     for (test_specs) |spec| {
         const t_mod = b.createModule(.{
@@ -799,6 +804,30 @@ pub fn build(b: *std.Build) void {
     );
     paeth_bench_step.dependOn(&paeth_bench_run.step);
 
+    // ----------------------------------- M0.6 thin offline asset cook demo ----
+    //
+    // `zig build cook-demo` cooks the three M0.6 fixtures end-to-end through
+    // the cache and logs hits (brief §Observable behavior). The user-facing
+    // `weld cook` CLI is Phase 1.
+    const asset_cook_module = b.createModule(.{
+        .root_source_file = b.path("tools/asset_cook/main.zig"),
+        .target = target,
+        .optimize = optimize,
+    });
+    asset_cook_module.addImport("weld_asset_pipeline", asset_pipeline_module);
+    const asset_cook_exe = b.addExecutable(.{
+        .name = "asset_cook",
+        .root_module = asset_cook_module,
+    });
+    b.installArtifact(asset_cook_exe);
+    const asset_cook_run = b.addRunArtifact(asset_cook_exe);
+    if (b.args) |args| asset_cook_run.addArgs(args);
+    const asset_cook_step = b.step(
+        "cook-demo",
+        "Cook the M0.6 fixtures end-to-end (import → cook → cache; logs hits)",
+    );
+    asset_cook_step.dependOn(&asset_cook_run.step);
+
     // -------------------------------------------- Fixture facade (S4 demo) --
 
     // `@embedFile` cannot escape the package root of the module that
diff --git a/tests/assets/cache_diff.zig b/tests/assets/cache_diff.zig
new file mode 100644
index 0000000..86bd4a1
--- /dev/null
+++ b/tests/assets/cache_diff.zig
@@ -0,0 +1,78 @@
+//! M0.6 / E4 — cooking-cache hit differential (brief §Acceptance ▸ Benchmarks).
+//!
+//! A second cook of an unchanged asset hits the cache and skips the
+//! (expensive) cook entirely. The asset is sized so the first cook does real
+//! work (hash + write a large `.bin`); the hit is a directory lookup.
+
+const std = @import("std");
+const assets = @import("weld_asset_pipeline");
+
+// 2048×2048 RGBA8 = 16 MiB — large enough that the first cook (BLAKE3 over the
+// payload + writing the `.bin`) is clearly expensive vs a cache-hit lookup,
+// without bloating CI with a huge temp file.
+const width = 2048;
+const height = 2048;
+
+test "second cook of unchanged asset hits cache" {
+    const gpa = std.testing.allocator;
+    const io = std.testing.io;
+
+    var tmp = std.testing.tmpDir(.{});
+    defer tmp.cleanup();
+    const cache = assets.cache.Cache.init(tmp.dir);
+
+    const blob = try gpa.alloc(u8, width * height * 4);
+    defer gpa.free(blob);
+    for (blob, 0..) |*b, i| b.* = @truncate(i *% 2_654_435_761);
+
+    const source_hash = assets.hash.hex128(blob);
+    const extracted = [_]assets.format.Field{
+        .{ .key = "width", .value = .{ .int = width } },
+        .{ .key = "height", .value = .{ .int = height } },
+        .{ .key = "blob", .value = .{ .string = &source_hash } },
+    };
+    const doc = assets.AssetDoc{
+        .name = "big",
+        .type_name = "Texture2D",
+        .version = 1,
+        .source = "big.png",
+        .source_hash = &source_hash,
+        .extracted = &extracted,
+    };
+
+    const key = assets.cache.computeKey(&source_hash, "pc", 0);
+
+    // First cook — cache miss: cook the .bin and store it.
+    const t_miss = std.Io.Clock.Timestamp.now(io, .awake);
+    try std.testing.expect(!cache.contains(io, &key));
+    const bin = try assets.cookers.cookTexture(gpa, doc, blob);
+    defer gpa.free(bin);
+    try cache.put(io, &key, bin);
+    const miss_ns: i64 = @intCast(t_miss.untilNow(io).raw.nanoseconds);
+
+    // Second cook — cache hit: the artifact already exists, the cook is
+    // skipped entirely.
+    const t_hit = std.Io.Clock.Timestamp.now(io, .awake);
+    const hit = cache.contains(io, &key);
+    const hit_ns: i64 = @intCast(t_hit.untilNow(io).raw.nanoseconds);
+    try std.testing.expect(hit);
+
+    // The cached artifact is byte-identical to the fresh cook.
+    const cached = (try cache.get(gpa, io, &key)).?;
+    defer gpa.free(cached);
+    try std.testing.expectEqualSlices(u8, bin, cached);
+
+    const miss_ms = @divTrunc(miss_ns, std.time.ns_per_ms);
+    const hit_us = @divTrunc(hit_ns, std.time.ns_per_us);
+    std.debug.print("\n[cache_diff] first cook (miss) = {d} ms, second cook (hit) = {d} us\n", .{ miss_ms, hit_us });
+
+    // Differential gate. The hit is a directory lookup (< 10 ms) and avoids
+    // the cook entirely — a large speedup. The absolute first-cook wall-time
+    // is build-mode- and disk-dependent (here ~800 ms Debug, ~50 ms
+    // ReleaseSafe for 16 MiB); the brief's "≥ 100 ms first cook / < 10 ms
+    // second" is the reference-machine figure for a real decode-heavy asset,
+    // so the test asserts the robust differential rather than a flaky
+    // absolute wall-time (see Closing notes).
+    try std.testing.expect(@divTrunc(hit_ns, std.time.ns_per_ms) < 10); // hit < 10 ms
+    try std.testing.expect(miss_ns > hit_ns * 20); // cache hit ≫ 20× faster
+}
diff --git a/tests/assets/data/checker.png b/tests/assets/data/checker.png
new file mode 100644
index 0000000000000000000000000000000000000000..81cbae571120754a86e87075ef9a8bec7470fba7
GIT binary patch
literal 83
zcmeAS@N?(olHy`uVBq!ia0vp^93afW1|*O0@9PFqQl2i3Ar-fhfAF&~F){rwyvXxI
g#jznVVg>_)+~%c;xwm#b1FB;1boFyt=akR{0PA@cMF0Q*

literal 0
HcmV?d00001

diff --git a/tests/assets/data/cube.gltf b/tests/assets/data/cube.gltf
new file mode 100644
index 0000000..b29af89
--- /dev/null
+++ b/tests/assets/data/cube.gltf
@@ -0,0 +1 @@
+{"asset":{"version":"2.0"},"buffers":[{"byteLength":328,"uri":"data:application/octet-stream;base64,AACAvwAAgL8AAIC/AACAPwAAgL8AAIC/AACAPwAAgD8AAIC/AACAvwAAgD8AAIC/AACAvwAAgL8AAIA/AACAPwAAgL8AAIA/AACAPwAAgD8AAIA/AACAvwAAgD8AAIA/Os0TvzrNE786zRO/Os0TPzrNE786zRO/Os0TPzrNEz86zRO/Os0TvzrNEz86zRO/Os0TvzrNE786zRM/Os0TPzrNE786zRM/Os0TPzrNEz86zRM/Os0TvzrNEz86zRM/AAAAAAAAAAAAAIA/AAAAAAAAgD8AAIA/AAAAAAAAgD8AAAAAAAAAAAAAgD8AAAAAAACAPwAAgD8AAAAAAACAPwAAAQACAAAAAgADAAQABgAFAAQABwAGAAAABAAFAAAABQABAAEABQAGAAEABgACAAIABgAHAAIABwADAAMABwAEAAMABAAAAA=="}],"bufferViews":[{"buffer":0,"byteOffset":0,"byteLength":96},{"buffer":0,"byteOffset":96,"byteLength":96},{"buffer":0,"byteOffset":192,"byteLength":64},{"buffer":0,"byteOffset":256,"byteLength":72}],"accessors":[{"bufferView":0,"componentType":5126,"count":8,"type":"VEC3","min":[-1,-1,-1],"max":[1,1,1]},{"bufferView":1,"componentType":5126,"count":8,"type":"VEC3"},{"bufferView":2,"componentType":5126,"count":8,"type":"VEC2"},{"bufferView":3,"componentType":5123,"count":36,"type":"SCALAR"}],"meshes":[{"primitives":[{"attributes":{"POSITION":0,"NORMAL":1,"TEXCOORD_0":2},"indices":3}]}]} \ No newline at end of file diff --git a/tests/assets/data/tone.wav b/tests/assets/data/tone.wav new file mode 100644 index 0000000000000000000000000000000000000000..3e5928a0f8e294b1cf19b013e166bf27fc0dce08 GIT binary patch literal 172 zcmV;d08{@`Nk&Gb00012K~_a(ZFC?I000010096%9{>P=J^%m$01yCVVRT`D00000 z0CpP8F+M}ROesxvM5;5w9CQNQ=N`}szQVJ9vShl3%KG3L`^y#UEuBECOUF&HMm{$~ zAzBBM>=)F`!27favQ4-R$BEoF_6iZ)DZ4#QN)}F|NSZpMB{&RS@jTeh!+*Bmv46JC a!#vns@i+{lC7L>;NES{_O1nMWDGCvkjzcH_ literal 0 HcmV?d00001 diff --git a/tests/assets/gltf_static_roundtrip.zig b/tests/assets/gltf_static_roundtrip.zig new file mode 100644 index 0000000..2d67327 --- /dev/null +++ b/tests/assets/gltf_static_roundtrip.zig @@ -0,0 +1,41 @@ +//! M0.6 / E4 — glTF static import → cook → load round-trip (brief §Acceptance). 
+ +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const cube_gltf = @embedFile("data/cube.gltf"); + +test "gltf static import-cook-load round-trip" { + const gpa = std.testing.allocator; + + // Oracle: the E3 decoder. + var mesh = try assets.codecs.gltf.decode(gpa, cube_gltf); + defer mesh.deinit(gpa); + + // Import → cook. + var imp = try assets.importers.gltf.import(gpa, "cube.gltf", cube_gltf); + defer imp.deinit(gpa); + try std.testing.expectEqualStrings("StaticMesh", imp.doc.type_name); + + const bin = try assets.cookers.cookMesh(gpa, imp.doc, imp.blob); + defer gpa.free(bin); + + // Load: header + metadata (vertex/index counts, bounds). + const header = try assets.RuntimeHeader.read(bin); + try std.testing.expectEqual(assets.AssetType.mesh, header.assetType().?); + + const meta = bin[header.metadata_offset..][0..header.metadata_size]; + try std.testing.expectEqual(mesh.vertex_count, std.mem.readInt(u32, meta[0..4], .little)); + try std.testing.expectEqual(@as(u32, @intCast(mesh.indices.len)), std.mem.readInt(u32, meta[4..8], .little)); + const min_x: f32 = @bitCast(std.mem.readInt(u32, meta[8..12], .little)); + const max_x: f32 = @bitCast(std.mem.readInt(u32, meta[20..24], .little)); + try std.testing.expectEqual(mesh.bounds_min[0], min_x); + try std.testing.expectEqual(mesh.bounds_max[0], max_x); + + // Payload = positions (f32) ++ indices (u32). + const payload = bin[header.data_offset..][0..header.data_size]; + try std.testing.expectEqual(mesh.positions.len * 4 + mesh.indices.len * 4, payload.len); + // First position component round-trips bit-exact. + const p0: f32 = @bitCast(std.mem.readInt(u32, payload[0..4], .little)); + try std.testing.expectEqual(mesh.positions[0], p0); +} diff --git a/tests/assets/png_roundtrip.zig b/tests/assets/png_roundtrip.zig new file mode 100644 index 0000000..0f5dc5d --- /dev/null +++ b/tests/assets/png_roundtrip.zig @@ -0,0 +1,36 @@ +//! 
M0.6 / E4 — PNG import → cook → load round-trip (brief §Acceptance). + +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const checker_png = @embedFile("data/checker.png"); + +test "png import-cook-load round-trip" { + const gpa = std.testing.allocator; + + // Oracle: the E3 decoder gives the expected RGBA8. + var img = try assets.codecs.png.decode(gpa, checker_png); + defer img.deinit(gpa); + + // Import: source → intermediate doc + RGBA8 blob. + var imp = try assets.importers.png.import(gpa, "checker.png", checker_png); + defer imp.deinit(gpa); + try std.testing.expectEqualStrings("Texture2D", imp.doc.type_name); + try std.testing.expect(imp.doc.blobHash() != null); + try std.testing.expectEqualSlices(u8, img.pixels, imp.blob); + + // Cook: intermediate → .texture.bin. + const bin = try assets.cookers.cookTexture(gpa, imp.doc, imp.blob); + defer gpa.free(bin); + + // Load: parse the frozen header and verify the payload + metadata. + const header = try assets.RuntimeHeader.read(bin); + try std.testing.expectEqual(assets.AssetType.texture, header.assetType().?); + + const meta = bin[header.metadata_offset..][0..header.metadata_size]; + try std.testing.expectEqual(img.width, std.mem.readInt(u32, meta[0..4], .little)); + try std.testing.expectEqual(img.height, std.mem.readInt(u32, meta[4..8], .little)); + + const payload = bin[header.data_offset..][0..header.data_size]; + try std.testing.expectEqualSlices(u8, img.pixels, payload); +} diff --git a/tests/assets/wav_roundtrip.zig b/tests/assets/wav_roundtrip.zig new file mode 100644 index 0000000..22a4ca4 --- /dev/null +++ b/tests/assets/wav_roundtrip.zig @@ -0,0 +1,34 @@ +//! M0.6 / E4 — WAV import → cook → load round-trip (brief §Acceptance). + +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const tone_wav = @embedFile("data/tone.wav"); + +test "wav import-cook-load round-trip" { + const gpa = std.testing.allocator; + + // Oracle: the E3 RIFF PCM decoder. 
+ var audio = try assets.importers.wav.decode(gpa, tone_wav); + defer audio.deinit(gpa); + + // Import → cook. + var imp = try assets.importers.wav.import(gpa, "tone.wav", tone_wav); + defer imp.deinit(gpa); + try std.testing.expectEqualStrings("AudioClip", imp.doc.type_name); + + const bin = try assets.cookers.cookAudio(gpa, imp.doc, imp.blob); + defer gpa.free(bin); + + // Load: header + metadata (sample rate, channels, bits) + PCM payload. + const header = try assets.RuntimeHeader.read(bin); + try std.testing.expectEqual(assets.AssetType.audio, header.assetType().?); + + const meta = bin[header.metadata_offset..][0..header.metadata_size]; + try std.testing.expectEqual(audio.sample_rate, std.mem.readInt(u32, meta[0..4], .little)); + try std.testing.expectEqual(audio.channels, std.mem.readInt(u16, meta[4..6], .little)); + try std.testing.expectEqual(audio.bits_per_sample, std.mem.readInt(u16, meta[6..8], .little)); + + const payload = bin[header.data_offset..][0..header.data_size]; + try std.testing.expectEqualSlices(u8, audio.data, payload); +} diff --git a/tools/asset_cook/main.zig b/tools/asset_cook/main.zig new file mode 100644 index 0000000..3ea252d --- /dev/null +++ b/tools/asset_cook/main.zig @@ -0,0 +1,122 @@ +//! Thin offline asset cook entry (M0.6 / E4 — brief §Observable behavior). +//! +//! Cooks the three M0.6 fixtures (PNG / glTF / WAV) end-to-end: import → +//! intermediate `.asset.etch` + `.weld/blobs/.blob` → cook → +//! `.weld/cooked/pc/..bin`, driven through the local cooking +//! cache. Re-running logs a cache hit and skips the cook. This is the thin +//! offline surface; the user-facing `weld cook` CLI is Phase 1 +//! (brief §Out-of-scope). +//! +//! Usage: `zig build cook-demo [-- ]` (default out dir +//! `zig-out/cook-demo`). 
+ +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const Kind = enum { texture, mesh, audio }; +const Fixture = struct { path: []const u8, kind: Kind, type_tag: []const u8 }; + +const fixtures = [_]Fixture{ + .{ .path = "tests/assets/data/checker.png", .kind = .texture, .type_tag = "texture" }, + .{ .path = "tests/assets/data/cube.gltf", .kind = .mesh, .type_tag = "mesh" }, + .{ .path = "tests/assets/data/tone.wav", .kind = .audio, .type_tag = "audio" }, +}; + +pub fn main(init: std.process.Init) !void { + const gpa = init.gpa; + const io = init.io; + const args = try init.minimal.args.toSlice(init.arena.allocator()); + const out_path = if (args.len > 1) args[1] else "zig-out/cook-demo"; + + const cwd = std.Io.Dir.cwd(); + var out = try cwd.createDirPathOpen(io, out_path, .{}); + defer out.close(io); + var intermediate_dir = try out.createDirPathOpen(io, "intermediate", .{}); + defer intermediate_dir.close(io); + var blobs_dir = try out.createDirPathOpen(io, "blobs", .{}); + defer blobs_dir.close(io); + var cooked_dir = try out.createDirPathOpen(io, "cooked/pc", .{}); + defer cooked_dir.close(io); + var cache_dir = try out.createDirPathOpen(io, "cache", .{}); + defer cache_dir.close(io); + const cache = assets.cache.Cache.init(cache_dir); + + var stdout_buf: [4096]u8 = undefined; + var stdout_w = std.Io.File.stdout().writer(io, &stdout_buf); + const stdout = &stdout_w.interface; + try stdout.print("weld asset cook — {d} fixtures → {s}\n", .{ fixtures.len, out_path }); + + for (fixtures) |fx| { + const src = try readFile(gpa, io, cwd, fx.path); + defer gpa.free(src); + + var imp = try importOne(gpa, fx, src); + defer imp.deinit(gpa); + const doc = imp.doc; + + // Intermediate text + blob. 
+ const etch = try assets.format.intermediate.writeAlloc(gpa, doc); + defer gpa.free(etch); + var name_buf: [256]u8 = undefined; + const etch_name = try std.fmt.bufPrint(&name_buf, "{s}.{s}.asset.etch", .{ doc.name, fx.type_tag }); + try writeFile(io, intermediate_dir, etch_name, etch); + var blob_name_buf: [64]u8 = undefined; + const blob_name = try std.fmt.bufPrint(&blob_name_buf, "{s}.blob", .{doc.blobHash().?}); + try writeFile(io, blobs_dir, blob_name, imp.blob); + + // Cook through the cache. + const key = assets.cache.computeKey(doc.source_hash, "pc", 0); + var bin_name_buf: [256]u8 = undefined; + const bin_name = try std.fmt.bufPrint(&bin_name_buf, "{s}.{s}.bin", .{ doc.name, fx.type_tag }); + + if (cache.contains(io, &key)) { + try stdout.print(" {s:<24} cache HIT (cook skipped)\n", .{fx.path}); + } else { + const bin = try cookOne(gpa, fx, doc, imp.blob); + defer gpa.free(bin); + try cache.put(io, &key, bin); + try writeFile(io, cooked_dir, bin_name, bin); + try stdout.print(" {s:<24} cooked MISS → {s} ({d} bytes)\n", .{ fx.path, bin_name, bin.len }); + } + } + try stdout.flush(); +} + +fn importOne(gpa: std.mem.Allocator, fx: Fixture, src: []const u8) !assets.importers.Import { + return switch (fx.kind) { + .texture => try assets.importers.png.import(gpa, fx.path, src), + .mesh => try assets.importers.gltf.import(gpa, fx.path, src), + .audio => try assets.importers.wav.import(gpa, fx.path, src), + }; +} + +fn cookOne(gpa: std.mem.Allocator, fx: Fixture, doc: assets.AssetDoc, blob: []const u8) ![]u8 { + return switch (fx.kind) { + .texture => try assets.cookers.cookTexture(gpa, doc, blob), + .mesh => try assets.cookers.cookMesh(gpa, doc, blob), + .audio => try assets.cookers.cookAudio(gpa, doc, blob), + }; +} + +fn readFile(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8) ![]u8 { + const f = try dir.openFile(io, path, .{}); + defer f.close(io); + const size: usize = @intCast((try f.stat(io)).size); + const buf = try gpa.alloc(u8, 
size); + errdefer gpa.free(buf); + var read_buf: [4096]u8 = undefined; + var reader = f.reader(io, &read_buf); + var written: usize = 0; + while (written < size) { + const n = try reader.interface.readSliceShort(buf[written..]); + if (n == 0) break; + written += n; + } + return buf[0..written]; +} + +fn writeFile(io: std.Io, dir: std.Io.Dir, path: []const u8, bytes: []const u8) !void { + const f = try dir.createFile(io, path, .{ .truncate = true }); + defer f.close(io); + try f.writeStreamingAll(io, bytes); +} From a5ab2aaf5fb6ba5b840d26dc008756e11e3b430f Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 06:46:09 +0200 Subject: [PATCH 20/29] docs(brief): journal update --- briefs/m0.6-assets.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index f3a8f73..a0a070c 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -211,6 +211,10 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-04 00:55 — E2 tests: `tests/assets/deflate_vectors.zig` (fixed/dynamic/stored/zlib + corrupted-trailer rejection) against Python-zlib vectors (decode-only milestone — authentic encoder output embedded, BTYPE inspected to label fixed=01/dynamic=10); `adler32_test.zig` (known vectors `""`/`"a"`/`"abc"`/`"Wikipedia"` + portable==reference) + `correctness.zig` (every variant == reference). `build.zig`: `foundation` dep on `asset_pipeline`, `foundation_tests` target, `foundation` `TestSpec` flag, `bench-adler32` step. Zig 0.16: `std.time.Timer` removed → bench uses `std.Io.Clock.Timestamp(.awake)` via `init.io` (keeps `foundation` std-only). Bench baseline (Apple Silicon, smoke): portable ≈ 349 MB/s, reference ≈ 96 MB/s. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. - 2026-06-04 01:29 — E3 implemented (GO relayed by Guy; 3 deferred hardening notes recorded, not actioned). 
`paeth_filter_decode` = 2nd `foundation/simd` kernel (scalar `reference` + `@Vector` parallel across the `bpp` channels of each pixel; comptime-specialized bpp 1..4; no asm, baseline only) — registered in portable/dispatch/simd, validating the multi-kernel dispatch. PNG decoder (`codecs/png/`): chunk parse (IHDR/PLTE/tRNS/IDAT/IEND, CRC parsed-not-verified), IDAT via E2 `zlib.decompress`, all 5 filters (Paeth via the kernel), bit depths 1/2/4/8, palette + tRNS alpha, Adam7 interlace, → RGBA8. glTF static (`codecs/gltf/`): `std.json` into typed structs (`ignore_unknown_fields`), POSITION/NORMAL/TEXCOORD_0 + indices from embedded base64 buffers. WAV (`importers/wav.zig`): RIFF PCM. Out of scope this milestone, noted: 16-bit PNG, gray/RGB colour-key tRNS, external/`.glb` glTF buffers. - 2026-06-04 01:29 — E3 tests: PNG/glTF/WAV decode tests inline in the codec files (vectors manually encoded + **verified by an independent Pillow decode** in a throwaway venv — avoids encoder/decoder bug-cancellation, esp. Adam7); `paeth_test.zig` (portable==reference + dispatched public entry + known scanline). `build.zig`: paeth test spec + `bench-paeth`. Bench baseline (Apple Silicon, smoke): portable ≈ 87 MB/s, reference ≈ 40 MB/s. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. Deferred hardening (Guy, not E3): (a) inflate 16/17/18 over-`total` tolerance; (b) negative inflate-guard tests; (c) `nmax` comment. +- 2026-06-04 06:44 — E4 (GO relayed by Guy). FIRST aligned `intermediate.zig` to the now-normative schema (`engine-asset-pipeline.md §3` + `etch-grammar.md §21.4`): added `cook_settings` (4th editable block, per-platform sub-blocks, emitted between `process_settings` and `extracted`); `extracted.blob` mandatory + `blobHash()` accessor; `fieldInt`/`fieldStr` accessors; kept every `Value` variant (superset); writer does NOT emit `@unit` (Phase 1, additive); doc-comment now cites §3/§21.4 (not "valid subset"). 
Verified by the round-trip test + the cooked `.asset.etch` output (4 blocks in order, `cook_settings.pc`, `extracted.blob`). +- 2026-06-04 06:44 — E4 pipeline: `hash.zig` (BLAKE3-128 hex / u64, `std.crypto.hash.Blake3`, no C binding). Importers (`png`/`gltf`/`wav`.import → `Import{ arena doc, gpa blob }`, `source_hash` + `extracted.blob` = BLAKE3-128 hex). Cookers (`texture`/`mesh`/`audio` → `..bin` via the E1-frozen 40-byte `RuntimeHeader`, explicit LE `toBytes` — no `@ptrCast` on write; payloads raw RGBA8 / f32-verts+u32-idx / PCM). Local cache (`cache/`: `computeKey` = BLAKE3-128(source_hash++settings++platform); dir-backed `contains`/`put`/`get`; invalidates on source_hash). Thin offline entry `tools/asset_cook` (`zig build cook-demo`): MISS run writes `.asset.etch` + `.weld/blobs/` + `.bin`, HIT run skips — brief §Observable behavior. +- 2026-06-04 06:44 — E4 tests: `png_roundtrip`/`gltf_static_roundtrip`/`wav_roundtrip` (import→cook→load, self-checked against the E3 decoder oracle), `cache_diff`. Fixtures `tests/assets/data/{checker.png,cube.gltf,tone.wav}` (PNG PIL-verified). **cache_diff gate note:** asserts `hit < 10 ms` + `miss > 20× hit` + cached==cooked (robust differential), NOT a literal `miss ≥ 100 ms` — a 16 MiB raw-copy cook measured 105 ms Debug / 54 ms ReleaseSafe (build-mode/disk dependent), so the literal ≥100 ms is the reference-machine figure for a real decode-heavy asset, not portably assertable for M0.6's trivial raw-copy cook. Flagged for review. +- 2026-06-04 06:44 — Files added beyond the brief's explicit list (justified): `hash.zig` (shared BLAKE3 util for importers/cookers/cache); `importers/common.zig` + `cookers/common.zig` (shared `Import` / `assemble`); `tools/asset_cook/main.zig` (the thin offline entry the brief's Observable behavior + Out-of-scope call for). All within `asset_pipeline`/`tools`. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. 
## Recorded deviations From 67ff3d25fc9c5e7c645235db604b35bd5ef3b534 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 06:58:50 +0200 Subject: [PATCH 21/29] feat(assets): add stable UUIDv7 identity to intermediate format --- .../asset_pipeline/format/intermediate.zig | 18 ++- src/modules/asset_pipeline/importers/gltf.zig | 4 +- src/modules/asset_pipeline/importers/png.zig | 6 +- src/modules/asset_pipeline/importers/wav.zig | 6 +- .../asset_pipeline/registry/Registry.zig | 29 ++++- src/modules/asset_pipeline/root.zig | 4 + src/modules/asset_pipeline/uuid.zig | 103 ++++++++++++++++++ tests/assets/gltf_static_roundtrip.zig | 3 +- tests/assets/png_roundtrip.zig | 4 +- tests/assets/wav_roundtrip.zig | 3 +- tools/asset_cook/main.zig | 73 +++++++++---- 11 files changed, 224 insertions(+), 29 deletions(-) create mode 100644 src/modules/asset_pipeline/uuid.zig diff --git a/src/modules/asset_pipeline/format/intermediate.zig b/src/modules/asset_pipeline/format/intermediate.zig index 8e03b21..f2f75cd 100644 --- a/src/modules/asset_pipeline/format/intermediate.zig +++ b/src/modules/asset_pipeline/format/intermediate.zig @@ -118,6 +118,12 @@ pub fn fieldsEql(a: []const Field, b: []const Field) bool { pub const AssetDoc = struct { /// Logical asset name (the `asset ""` string). name: []const u8, + /// Stable identity — UUIDv7 canonical string, the first body field + /// (`uuid: "…"`). Generated once at first import and preserved across + /// re-imports (rename/move-safe); distinct from `source_hash`, which + /// changes with the source. Mirrors `entity "name" { uuid: … }` in + /// `.scene.etch`. + uuid: []const u8 = "", /// Asset class identifier (e.g. `Texture2D`, `StaticMesh`, `AudioClip`). type_name: []const u8, /// Schema version of this document. @@ -140,6 +146,7 @@ pub const AssetDoc = struct { /// Deep structural equality (used by the round-trip test). 
pub fn eql(a: AssetDoc, b: AssetDoc) bool { return std.mem.eql(u8, a.name, b.name) and + std.mem.eql(u8, a.uuid, b.uuid) and std.mem.eql(u8, a.type_name, b.type_name) and a.version == b.version and std.mem.eql(u8, a.source, b.source) and @@ -172,6 +179,7 @@ pub const WriteError = std.Io.Writer.Error; /// Serialize `doc` as `.asset.etch` text into `out`. pub fn writeEtch(doc: AssetDoc, out: *std.Io.Writer) WriteError!void { try out.print("asset \"{s}\" {{\n", .{doc.name}); + try out.print(" uuid: \"{s}\"\n", .{doc.uuid}); try out.print(" type: {s}\n", .{doc.type_name}); try out.print(" version: {d}\n", .{doc.version}); try out.print(" source: \"{s}\"\n", .{doc.source}); @@ -428,13 +436,19 @@ const Parser = struct { var doc = AssetDoc{ .name = name, + .uuid = "", .type_name = "", .version = 0, .source = "", .source_hash = "", }; for (fields) |f| { - if (std.mem.eql(u8, f.key, "type")) { + if (std.mem.eql(u8, f.key, "uuid")) { + doc.uuid = switch (f.value) { + .string => |s| s, + else => return error.UnexpectedChar, + }; + } else if (std.mem.eql(u8, f.key, "type")) { doc.type_name = switch (f.value) { .identifier => |s| s, .string => |s| s, @@ -515,6 +529,7 @@ test "intermediate doc round-trips through etch text" { const original = AssetDoc{ .name = "cube_mesh", + .uuid = "0190b3f0-1c2d-7e4a-8b6c-0123456789ab", .type_name = "StaticMesh", .version = 1, .source = "cube.gltf", @@ -533,6 +548,7 @@ test "intermediate doc round-trips through etch text" { const parsed = try parseEtch(arena.allocator(), text); try std.testing.expect(original.eql(parsed)); + try std.testing.expectEqualStrings("0190b3f0-1c2d-7e4a-8b6c-0123456789ab", parsed.uuid); try std.testing.expectEqualStrings("StaticMesh", parsed.type_name); try std.testing.expectEqual(@as(u16, 1), parsed.version); try std.testing.expectEqual(@as(usize, 4), parsed.extracted.len); diff --git a/src/modules/asset_pipeline/importers/gltf.zig b/src/modules/asset_pipeline/importers/gltf.zig index 568b70d..2907c98 100644 --- 
a/src/modules/asset_pipeline/importers/gltf.zig +++ b/src/modules/asset_pipeline/importers/gltf.zig @@ -21,7 +21,8 @@ pub const Import = common.Import; pub const Error = error{OutOfMemory} || gltf.Error; /// Import a glTF file (`src` bytes from `source_path`) into an intermediate. -pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import { +/// `uuid` is the caller-resolved stable identity (canonical UUIDv7 string). +pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8, uuid: []const u8) Error!Import { var mesh = try gltf.decode(gpa, src); defer mesh.deinit(gpa); @@ -56,6 +57,7 @@ pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) const doc = AssetDoc{ .name = try a.dupe(u8, std.fs.path.stem(source_path)), + .uuid = try a.dupe(u8, uuid), .type_name = "StaticMesh", .version = 1, .source = try a.dupe(u8, source_path), diff --git a/src/modules/asset_pipeline/importers/png.zig b/src/modules/asset_pipeline/importers/png.zig index f2fb7fe..8d97d17 100644 --- a/src/modules/asset_pipeline/importers/png.zig +++ b/src/modules/asset_pipeline/importers/png.zig @@ -21,7 +21,10 @@ pub const Import = common.Import; pub const Error = error{OutOfMemory} || png.Error; /// Import a PNG file (`src` bytes from `source_path`) into an intermediate. -pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import { +/// `uuid` is the stable identity (canonical UUIDv7 string) the caller +/// resolved — generated on first import, preserved from the existing +/// `.asset.etch` on re-import. 
+pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8, uuid: []const u8) Error!Import { var img = try png.decode(gpa, src); errdefer img.deinit(gpa); @@ -50,6 +53,7 @@ pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) const doc = AssetDoc{ .name = try a.dupe(u8, std.fs.path.stem(source_path)), + .uuid = try a.dupe(u8, uuid), .type_name = "Texture2D", .version = 1, .source = try a.dupe(u8, source_path), diff --git a/src/modules/asset_pipeline/importers/wav.zig b/src/modules/asset_pipeline/importers/wav.zig index 131999a..c9b665a 100644 --- a/src/modules/asset_pipeline/importers/wav.zig +++ b/src/modules/asset_pipeline/importers/wav.zig @@ -105,8 +105,9 @@ pub fn decode(gpa: std.mem.Allocator, src: []const u8) Error!Audio { pub const Import = common.Import; /// Import a WAV file (`src` bytes from `source_path`) into an intermediate -/// `AudioClip` document + PCM blob. -pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) Error!Import { +/// `AudioClip` document + PCM blob. `uuid` is the caller-resolved stable +/// identity (canonical UUIDv7 string). +pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8, uuid: []const u8) Error!Import { var audio = try decode(gpa, src); errdefer audio.deinit(gpa); @@ -125,6 +126,7 @@ pub fn import(gpa: std.mem.Allocator, source_path: []const u8, src: []const u8) const doc = format.AssetDoc{ .name = try a.dupe(u8, std.fs.path.stem(source_path)), + .uuid = try a.dupe(u8, uuid), .type_name = "AudioClip", .version = 1, .source = try a.dupe(u8, source_path), diff --git a/src/modules/asset_pipeline/registry/Registry.zig b/src/modules/asset_pipeline/registry/Registry.zig index 9e20be1..d3a8c19 100644 --- a/src/modules/asset_pipeline/registry/Registry.zig +++ b/src/modules/asset_pipeline/registry/Registry.zig @@ -34,6 +34,10 @@ const Slot = struct { refcount: u32, /// `true` while the slot points at a live asset. 
alive: bool, + /// Stable asset identity (UUIDv7 as u128). Stored only in M0.6 — + /// references resolve by path; uuid-based resolution is Phase 1+. 0 when + /// unknown (the runtime `.bin` carries no uuid in M0.6). + uuid: u128, }; /// Errors surfaced by the mutating verbs. @@ -53,6 +57,8 @@ pub const Resolved = struct { refcount: u32, /// Decoded `AssetType`, or `null` if the tag is not a known variant. asset_type: ?AssetType, + /// Stable asset identity (0 if unknown). + uuid: u128, }; /// Create an empty registry. No allocation happens until the first `alloc`. @@ -73,6 +79,12 @@ pub fn deinit(self: *Registry, gpa: std.mem.Allocator) void { /// /// Errors: `error.OutOfMemory` if the slot table needs to grow. pub fn alloc(self: *Registry, gpa: std.mem.Allocator, asset_type: AssetType) Error!AssetHandle { + return self.allocWithUuid(gpa, asset_type, 0); +} + +/// Like `alloc`, but records the asset's stable `uuid` in the slot. The uuid +/// is stored only (M0.6 resolves by path); use `alloc` when no uuid is known. 
+pub fn allocWithUuid(self: *Registry, gpa: std.mem.Allocator, asset_type: AssetType, uuid: u128) Error!AssetHandle { const tag = asset_type.toU16(); if (self.free_indices.pop()) |idx| { const slot = &self.slots.items[idx]; @@ -80,10 +92,11 @@ pub fn alloc(self: *Registry, gpa: std.mem.Allocator, asset_type: AssetType) Err slot.alive = true; slot.type_tag = tag; slot.refcount = 1; + slot.uuid = uuid; return .{ .index = idx, .generation = slot.generation, .type_tag = tag }; } const idx: u32 = @intCast(self.slots.items.len); - try self.slots.append(gpa, .{ .generation = 0, .type_tag = tag, .refcount = 1, .alive = true }); + try self.slots.append(gpa, .{ .generation = 0, .type_tag = tag, .refcount = 1, .alive = true, .uuid = uuid }); return .{ .index = idx, .generation = 0, .type_tag = tag }; } @@ -96,6 +109,7 @@ pub fn resolve(self: *const Registry, handle: AssetHandle) ?Resolved { .type_tag = slot.type_tag, .refcount = slot.refcount, .asset_type = AssetType.fromU16(slot.type_tag), + .uuid = slot.uuid, }; } @@ -254,3 +268,16 @@ test "out-of-range handle is treated as stale" { try std.testing.expect(!reg.isAlive(bogus)); try std.testing.expectError(error.StaleHandle, reg.retain(bogus)); } + +test "allocWithUuid stores the stable identity; alloc leaves it zero" { + const gpa = std.testing.allocator; + var reg = Registry.init(); + defer reg.deinit(gpa); + + const id: u128 = 0x0190b3f0_1c2d_7e4a_8b6c_0123456789ab; + const h = try reg.allocWithUuid(gpa, .mesh, id); + try std.testing.expectEqual(id, reg.resolve(h).?.uuid); + + const h2 = try reg.alloc(gpa, .texture); + try std.testing.expectEqual(@as(u128, 0), reg.resolve(h2).?.uuid); +} diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig index 8f299ba..e31d7a6 100644 --- a/src/modules/asset_pipeline/root.zig +++ b/src/modules/asset_pipeline/root.zig @@ -37,6 +37,9 @@ pub const cache = @import("cache/root.zig"); /// Content hashing (BLAKE3-128 hex / u64). 
pub const hash = @import("hash.zig"); +/// Stable per-asset identity (UUIDv7). +pub const uuid = @import("uuid.zig"); + /// 64-bit typed asset handle (convenience re-export). pub const AssetHandle = registry.AssetHandle; /// Slot registry (convenience re-export). @@ -58,4 +61,5 @@ comptime { _ = cookers; _ = cache; _ = hash; + _ = uuid; } diff --git a/src/modules/asset_pipeline/uuid.zig b/src/modules/asset_pipeline/uuid.zig new file mode 100644 index 0000000..f6ce12f --- /dev/null +++ b/src/modules/asset_pipeline/uuid.zig @@ -0,0 +1,103 @@ +//! UUIDv7 — the stable per-asset identity (RFC 9562). Pure Zig, no C binding. +//! +//! Used as the `uuid` field of `.asset.etch` (brief §E4 complement): +//! generated once at first import and preserved across re-imports +//! (rename/move-safe). Distinct from `source_hash`, which *changes* when the +//! source changes; the uuid is stable for life. In M0.6 the uuid is stored +//! only — cross-asset references still resolve by path (uuid-based +//! resolution / rename-propagation is Phase 1+). +//! +//! Layout (128 bits, big-endian on the wire): 48-bit unix-ms timestamp, +//! 4-bit version (0x7), 12-bit rand_a, 2-bit variant (0b10), 62-bit rand_b. +//! Both the timestamp and the random bits come from `io` (`Clock.real` + +//! `io.random`). + +const std = @import("std"); + +/// Generate a fresh UUIDv7. `io` supplies the wall-clock timestamp and the +/// random bits. 
+pub fn generateV7(io: std.Io) u128 { + const ns: i96 = std.Io.Clock.real.now(io).nanoseconds; + const ms: u64 = @intCast(@divFloor(@as(i128, ns), std.time.ns_per_ms)); + + var rnd: [10]u8 = undefined; + io.random(&rnd); + const rand_a: u128 = std.mem.readInt(u16, rnd[0..2], .little) & 0x0fff; + const rand_b: u128 = std.mem.readInt(u64, rnd[2..10], .little) & ((@as(u128, 1) << 62) - 1); + + var u: u128 = 0; + u |= @as(u128, ms & 0xffff_ffff_ffff) << 80; // 48-bit timestamp + u |= @as(u128, 0x7) << 76; // version 7 + u |= rand_a << 64; // 12 random bits + u |= @as(u128, 0b10) << 62; // variant + u |= rand_b; // 62 random bits + return u; +} + +/// Render a uuid as the canonical 36-char `8-4-4-4-12` lowercase string. +pub fn toString(u: u128) [36]u8 { + var bytes: [16]u8 = undefined; + std.mem.writeInt(u128, &bytes, u, .big); + const hex = std.fmt.bytesToHex(bytes, .lower); // [32]u8 + + var out: [36]u8 = undefined; + @memcpy(out[0..8], hex[0..8]); + out[8] = '-'; + @memcpy(out[9..13], hex[8..12]); + out[13] = '-'; + @memcpy(out[14..18], hex[12..16]); + out[18] = '-'; + @memcpy(out[19..23], hex[16..20]); + out[23] = '-'; + @memcpy(out[24..36], hex[20..32]); + return out; +} + +/// Parse a canonical 36-char UUID string into a u128, or null if malformed. +pub fn parse(s: []const u8) ?u128 { + if (s.len != 36) return null; + var hex: [32]u8 = undefined; + var hi: usize = 0; + for (s, 0..) |c, i| { + if (i == 8 or i == 13 or i == 18 or i == 23) { + if (c != '-') return null; + continue; + } + if (hi >= 32) return null; + hex[hi] = c; + hi += 1; + } + if (hi != 32) return null; + var bytes: [16]u8 = undefined; + _ = std.fmt.hexToBytes(&bytes, &hex) catch return null; + return std.mem.readInt(u128, &bytes, .big); +} + +/// Extract the 4-bit version field. 
+pub fn version(u: u128) u4 { + return @truncate(u >> 76); +} + +test "generateV7 sets version 7 and the RFC variant, and is unique" { + const io = std.testing.io; + const a = generateV7(io); + const b = generateV7(io); + try std.testing.expectEqual(@as(u4, 7), version(a)); + try std.testing.expectEqual(@as(u128, 0b10), (a >> 62) & 0b11); // variant + try std.testing.expect(a != b); +} + +test "uuid toString/parse round-trips" { + const io = std.testing.io; + const u = generateV7(io); + const s = toString(u); + try std.testing.expectEqual(@as(usize, 36), s.len); + try std.testing.expectEqual(@as(u8, '-'), s[8]); + try std.testing.expectEqual(@as(u8, '-'), s[23]); + try std.testing.expectEqual(u, parse(&s).?); +} + +test "uuid parse rejects malformed input" { + try std.testing.expectEqual(@as(?u128, null), parse("not-a-uuid")); + try std.testing.expectEqual(@as(?u128, null), parse("0123456789abcdef0123456789abcdef")); // no dashes +} diff --git a/tests/assets/gltf_static_roundtrip.zig b/tests/assets/gltf_static_roundtrip.zig index 2d67327..8d56958 100644 --- a/tests/assets/gltf_static_roundtrip.zig +++ b/tests/assets/gltf_static_roundtrip.zig @@ -13,9 +13,10 @@ test "gltf static import-cook-load round-trip" { defer mesh.deinit(gpa); // Import → cook. 
- var imp = try assets.importers.gltf.import(gpa, "cube.gltf", cube_gltf); + var imp = try assets.importers.gltf.import(gpa, "cube.gltf", cube_gltf, "0190b3f0-1c2d-7e4a-8b6c-001122334455"); defer imp.deinit(gpa); try std.testing.expectEqualStrings("StaticMesh", imp.doc.type_name); + try std.testing.expectEqualStrings("0190b3f0-1c2d-7e4a-8b6c-001122334455", imp.doc.uuid); const bin = try assets.cookers.cookMesh(gpa, imp.doc, imp.blob); defer gpa.free(bin); diff --git a/tests/assets/png_roundtrip.zig b/tests/assets/png_roundtrip.zig index 0f5dc5d..56070ba 100644 --- a/tests/assets/png_roundtrip.zig +++ b/tests/assets/png_roundtrip.zig @@ -13,9 +13,11 @@ test "png import-cook-load round-trip" { defer img.deinit(gpa); // Import: source → intermediate doc + RGBA8 blob. - var imp = try assets.importers.png.import(gpa, "checker.png", checker_png); + const imp_uuid = "0190b3f0-1c2d-7e4a-8b6c-aabbccddeeff"; + var imp = try assets.importers.png.import(gpa, "checker.png", checker_png, imp_uuid); defer imp.deinit(gpa); try std.testing.expectEqualStrings("Texture2D", imp.doc.type_name); + try std.testing.expectEqualStrings(imp_uuid, imp.doc.uuid); try std.testing.expect(imp.doc.blobHash() != null); try std.testing.expectEqualSlices(u8, img.pixels, imp.blob); diff --git a/tests/assets/wav_roundtrip.zig b/tests/assets/wav_roundtrip.zig index 22a4ca4..6e8d282 100644 --- a/tests/assets/wav_roundtrip.zig +++ b/tests/assets/wav_roundtrip.zig @@ -13,9 +13,10 @@ test "wav import-cook-load round-trip" { defer audio.deinit(gpa); // Import → cook. 
- var imp = try assets.importers.wav.import(gpa, "tone.wav", tone_wav); + var imp = try assets.importers.wav.import(gpa, "tone.wav", tone_wav, "0190b3f0-1c2d-7e4a-8b6c-665544332211"); defer imp.deinit(gpa); try std.testing.expectEqualStrings("AudioClip", imp.doc.type_name); + try std.testing.expectEqualStrings("0190b3f0-1c2d-7e4a-8b6c-665544332211", imp.doc.uuid); const bin = try assets.cookers.cookAudio(gpa, imp.doc, imp.blob); defer gpa.free(bin); diff --git a/tools/asset_cook/main.zig b/tools/asset_cook/main.zig index 3ea252d..44b6d12 100644 --- a/tools/asset_cook/main.zig +++ b/tools/asset_cook/main.zig @@ -3,12 +3,17 @@ //! Cooks the three M0.6 fixtures (PNG / glTF / WAV) end-to-end: import → //! intermediate `.asset.etch` + `.weld/blobs/.blob` → cook → //! `.weld/cooked/pc/..bin`, driven through the local cooking -//! cache. Re-running logs a cache hit and skips the cook. This is the thin -//! offline surface; the user-facing `weld cook` CLI is Phase 1 -//! (brief §Out-of-scope). +//! cache. Re-running logs a cache hit and skips the cook. //! -//! Usage: `zig build cook-demo [-- ]` (default out dir -//! `zig-out/cook-demo`). +//! The asset `uuid` is generated (UUIDv7) on the first cook and preserved +//! across re-runs by reading the existing `.asset.etch` — this is the +//! generate-once / preserve-forever identity policy (the pure importer +//! receives the already-resolved uuid; the fs-aware resolution lives here). +//! +//! This is the thin offline surface; the user-facing `weld cook` CLI is +//! Phase 1 (brief §Out-of-scope). +//! +//! Usage: `zig build cook-demo [-- ]` (default `zig-out/cook-demo`). 
const std = @import("std"); const assets = @import("weld_asset_pipeline"); @@ -47,18 +52,24 @@ pub fn main(init: std.process.Init) !void { try stdout.print("weld asset cook — {d} fixtures → {s}\n", .{ fixtures.len, out_path }); for (fixtures) |fx| { + const stem = std.fs.path.stem(fx.path); + var etch_name_buf: [256]u8 = undefined; + const etch_name = try std.fmt.bufPrint(&etch_name_buf, "{s}.{s}.asset.etch", .{ stem, fx.type_tag }); + + // Stable identity: reuse the existing .asset.etch's uuid, else generate. + var uuid_buf: [36]u8 = undefined; + const uuid_str = try resolveUuid(gpa, io, intermediate_dir, etch_name, &uuid_buf); + const src = try readFile(gpa, io, cwd, fx.path); defer gpa.free(src); - var imp = try importOne(gpa, fx, src); + var imp = try importOne(gpa, fx, src, uuid_str); defer imp.deinit(gpa); const doc = imp.doc; // Intermediate text + blob. const etch = try assets.format.intermediate.writeAlloc(gpa, doc); defer gpa.free(etch); - var name_buf: [256]u8 = undefined; - const etch_name = try std.fmt.bufPrint(&name_buf, "{s}.{s}.asset.etch", .{ doc.name, fx.type_tag }); try writeFile(io, intermediate_dir, etch_name, etch); var blob_name_buf: [64]u8 = undefined; const blob_name = try std.fmt.bufPrint(&blob_name_buf, "{s}.blob", .{doc.blobHash().?}); @@ -70,23 +81,23 @@ pub fn main(init: std.process.Init) !void { const bin_name = try std.fmt.bufPrint(&bin_name_buf, "{s}.{s}.bin", .{ doc.name, fx.type_tag }); if (cache.contains(io, &key)) { - try stdout.print(" {s:<24} cache HIT (cook skipped)\n", .{fx.path}); + try stdout.print(" {s:<28} cache HIT (uuid {s})\n", .{ fx.path, doc.uuid }); } else { const bin = try cookOne(gpa, fx, doc, imp.blob); defer gpa.free(bin); try cache.put(io, &key, bin); try writeFile(io, cooked_dir, bin_name, bin); - try stdout.print(" {s:<24} cooked MISS → {s} ({d} bytes)\n", .{ fx.path, bin_name, bin.len }); + try stdout.print(" {s:<28} cooked MISS → {s} ({d} B, uuid {s})\n", .{ fx.path, bin_name, bin.len, doc.uuid }); } } try 
stdout.flush(); } -fn importOne(gpa: std.mem.Allocator, fx: Fixture, src: []const u8) !assets.importers.Import { +fn importOne(gpa: std.mem.Allocator, fx: Fixture, src: []const u8, uuid: []const u8) !assets.importers.Import { return switch (fx.kind) { - .texture => try assets.importers.png.import(gpa, fx.path, src), - .mesh => try assets.importers.gltf.import(gpa, fx.path, src), - .audio => try assets.importers.wav.import(gpa, fx.path, src), + .texture => try assets.importers.png.import(gpa, fx.path, src, uuid), + .mesh => try assets.importers.gltf.import(gpa, fx.path, src, uuid), + .audio => try assets.importers.wav.import(gpa, fx.path, src, uuid), }; } @@ -98,21 +109,43 @@ fn cookOne(gpa: std.mem.Allocator, fx: Fixture, doc: assets.AssetDoc, blob: []co }; } -fn readFile(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8) ![]u8 { - const f = try dir.openFile(io, path, .{}); +/// Reuse the uuid of an existing intermediate `.asset.etch`, else generate a +/// fresh UUIDv7. The result is written into `buf` (lives across the import). 
+fn resolveUuid(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, etch_name: []const u8, buf: *[36]u8) ![]const u8 { + if (try readFileOpt(gpa, io, dir, etch_name)) |text| { + defer gpa.free(text); + var arena = std.heap.ArenaAllocator.init(gpa); + defer arena.deinit(); + if (assets.format.intermediate.parseEtch(arena.allocator(), text)) |doc| { + if (assets.uuid.parse(doc.uuid) != null) { + @memcpy(buf, doc.uuid[0..36]); + return buf; + } + } else |_| {} + } + @memcpy(buf, &assets.uuid.toString(assets.uuid.generateV7(io))); + return buf; +} + +fn readFileOpt(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8) !?[]u8 { + const f = dir.openFile(io, path, .{}) catch |err| switch (err) { error.FileNotFound => return null, else => return err }; defer f.close(io); const size: usize = @intCast((try f.stat(io)).size); - const buf = try gpa.alloc(u8, size); - errdefer gpa.free(buf); + const out = try gpa.alloc(u8, size); + errdefer gpa.free(out); var read_buf: [4096]u8 = undefined; var reader = f.reader(io, &read_buf); var written: usize = 0; while (written < size) { - const n = try reader.interface.readSliceShort(buf[written..]); + const n = try reader.interface.readSliceShort(out[written..]); if (n == 0) break; written += n; } - return buf[0..written]; + return out[0..written]; +} + +fn readFile(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8) ![]u8 { + return (try readFileOpt(gpa, io, dir, path)) orelse error.FileNotFound; } fn writeFile(io: std.Io, dir: std.Io.Dir, path: []const u8, bytes: []const u8) !void { From 6c6c9297cd951bba0612e862ac19db4285ec5c0d Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 06:58:52 +0200 Subject: [PATCH 22/29] docs(brief): journal update --- briefs/m0.6-assets.md | 1 + 1 file changed, 1 insertion(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index a0a070c..3e386cb 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -215,6 +215,7 @@ References and known pitfalls only. 
No anticipated edge cases (those are Scope o - 2026-06-04 06:44 — E4 pipeline: `hash.zig` (BLAKE3-128 hex / u64, `std.crypto.hash.Blake3`, no C binding). Importers (`png`/`gltf`/`wav`.import → `Import{ arena doc, gpa blob }`, `source_hash` + `extracted.blob` = BLAKE3-128 hex). Cookers (`texture`/`mesh`/`audio` → `..bin` via the E1-frozen 40-byte `RuntimeHeader`, explicit LE `toBytes` — no `@ptrCast` on write; payloads raw RGBA8 / f32-verts+u32-idx / PCM). Local cache (`cache/`: `computeKey` = BLAKE3-128(source_hash++settings++platform); dir-backed `contains`/`put`/`get`; invalidates on source_hash). Thin offline entry `tools/asset_cook` (`zig build cook-demo`): MISS run writes `.asset.etch` + `.weld/blobs/` + `.bin`, HIT run skips — brief §Observable behavior. - 2026-06-04 06:44 — E4 tests: `png_roundtrip`/`gltf_static_roundtrip`/`wav_roundtrip` (import→cook→load, self-checked against the E3 decoder oracle), `cache_diff`. Fixtures `tests/assets/data/{checker.png,cube.gltf,tone.wav}` (PNG PIL-verified). **cache_diff gate note:** asserts `hit < 10 ms` + `miss > 20× hit` + cached==cooked (robust differential), NOT a literal `miss ≥ 100 ms` — a 16 MiB raw-copy cook measured 105 ms Debug / 54 ms ReleaseSafe (build-mode/disk dependent), so the literal ≥100 ms is the reference-machine figure for a real decode-heavy asset, not portably assertable for M0.6's trivial raw-copy cook. Flagged for review. - 2026-06-04 06:44 — Files added beyond the brief's explicit list (justified): `hash.zig` (shared BLAKE3 util for importers/cookers/cache); `importers/common.zig` + `cookers/common.zig` (shared `Import` / `assemble`); `tools/asset_cook/main.zig` (the thin offline entry the brief's Observable behavior + Out-of-scope call for). All within `asset_pipeline`/`tools`. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. +- 2026-06-04 06:58 — E4 complement (Claude.ai): stable `uuid` (UUIDv7) identity. 
`AssetDoc.uuid` is the first body field, emitted `asset "" { uuid: "…" type: … }` (mirrors `entity uuid` in `.scene.etch`); writer/parser/`eql` updated. New `uuid.zig` — UUIDv7 generation (48-bit `Clock.real` ms + `io.random` bits, version 7 + RFC variant), `toString`/`parse`, pure Zig (no C binding). Importers take a resolved `uuid` param; the offline entry **generates on first cook and preserves on re-cook** by reading the existing `.asset.etch` (the fs-aware generate/preserve lives in the orchestration; the pure importer receives the resolved uuid). Registry: `Slot.uuid` u128 + `allocWithUuid` + `Resolved.uuid` — **stored only** (path-based resolution unchanged; uuid resolution / rename-propagation is Phase 1+; the M0.6 `.bin` carries no uuid so runtime `alloc` leaves it 0). `source_hash` (changes with source) and `uuid` (stable for life) kept distinct. cook-demo verified: run 1 generates 3 distinct UUIDv7, run 2 preserves them (cache HIT). Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. ## Recorded deviations From 52c47daf13be2d54430663910bab5d414447bc8b Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 07:01:10 +0200 Subject: [PATCH 23/29] docs(brief): record cache_diff gate ruling (recorded deviation) --- briefs/m0.6-assets.md | 1 + 1 file changed, 1 insertion(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index 3e386cb..bdeb4b1 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -220,6 +220,7 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o ## Recorded deviations - 2026-06-03 (Claude.ai ruling, E1; this commit) — Runtime `..bin` header corrected from 32 to 40 bytes during E1: the 10 typed fields total 36 bytes and `hash` u64 required 8-byte alignment; added explicit `_reserved` u32 @28 so `hash` lands at @32 and `@sizeOf == 40` with no implicit padding. `engine-asset-pipeline.md` §5 patched in lockstep. 
Rationale: zero-copy mmap safety. +- 2026-06-04 (Claude.ai ruling, E4) — cache_diff gate corrected. The FROZEN Acceptance ▸ Benchmarks "first cook ≥ 100 ms / second < 10 ms" was a brief calibration error: M0.6's cook is a trivial raw copy (RGBA8/f32/PCM), so ≥ 100 ms only holds for a Phase 1+ BC7/Opus cook. New gate: `cached == cooked` + `miss ≫ hit`, absolute timings logged but **not** asserted. `tests/assets/cache_diff.zig` already implements this (`hit < 10 ms` + `miss > 20× hit` + cached==cooked; logs absolutes). No dedicated bench, no larger fixture, no added round-trip. C0.6 and the brief are corrected Claude.ai-side; the "≥ 100 ms" wording still present in this committed copy's FROZEN Benchmarks is superseded by this entry (pending any patched-brief round-trip). ## Blockers encountered From fda3c171fc35655c871014adcafbaa30d05f5094 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 10:16:30 +0200 Subject: [PATCH 24/29] fix(assets): reject inflate length-repeat past total --- .../asset_pipeline/codecs/deflate/inflate.zig | 32 ++++++++----------- 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/src/modules/asset_pipeline/codecs/deflate/inflate.zig b/src/modules/asset_pipeline/codecs/deflate/inflate.zig index 2f0ef43..02aefef 100644 --- a/src/modules/asset_pipeline/codecs/deflate/inflate.zig +++ b/src/modules/asset_pipeline/codecs/deflate/inflate.zig @@ -245,29 +245,25 @@ fn inflateDynamic(gpa: std.mem.Allocator, br: *BitReader, out: *std.ArrayList(u8 }, 16 => { if (i == 0) return error.BadSymbol; - const repeat = 3 + try br.take(2); + const repeat: usize = 3 + try br.take(2); + // A repeat that overruns `total` is a corrupt stream, not a + // value to silently clamp. 
+ const end = @as(usize, i) + repeat; + if (end > total) return error.BadSymbol; const prev = lengths[i - 1]; - var r: usize = 0; - while (r < repeat and i < total) : (r += 1) { - lengths[i] = prev; - i += 1; - } + while (i < end) : (i += 1) lengths[i] = prev; }, 17 => { - const repeat = 3 + try br.take(3); - var r: usize = 0; - while (r < repeat and i < total) : (r += 1) { - lengths[i] = 0; - i += 1; - } + const repeat: usize = 3 + try br.take(3); + const end = @as(usize, i) + repeat; + if (end > total) return error.BadSymbol; + while (i < end) : (i += 1) lengths[i] = 0; }, 18 => { - const repeat = 11 + try br.take(7); - var r: usize = 0; - while (r < repeat and i < total) : (r += 1) { - lengths[i] = 0; - i += 1; - } + const repeat: usize = 11 + try br.take(7); + const end = @as(usize, i) + repeat; + if (end > total) return error.BadSymbol; + while (i < end) : (i += 1) lengths[i] = 0; }, else => return error.BadSymbol, } From c7d320a46cebcce2425d218eb4855fe7b2e9dfad Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 10:16:31 +0200 Subject: [PATCH 25/29] test(assets): add deflate negative-guard vectors --- tests/assets/deflate_vectors.zig | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/tests/assets/deflate_vectors.zig b/tests/assets/deflate_vectors.zig index 2232d8b..eb897f2 100644 --- a/tests/assets/deflate_vectors.zig +++ b/tests/assets/deflate_vectors.zig @@ -65,3 +65,32 @@ test "zlib decompress rejects a corrupted adler32 trailer" { corrupt[corrupt.len - 1] ^= 0xff; // flip the last trailer byte try std.testing.expectError(error.BadChecksum, zlib.decompress(gpa, &corrupt)); } + +// --- Negative vectors: one per inflate guard (hand-built, LSB-first). ------- + +// Stored block with LEN=3 but NLEN=0xFFFF (≠ ~3), violating LEN == ~NLEN. 
+const v_bad_stored_length = [_]u8{ 0x01, 0x03, 0x00, 0xff, 0xff, 0x61, 0x62, 0x63 }; + +// Fixed block: literal/length symbol 257 (length 3) + distance symbol 0 +// (distance 1) with no output yet — distance reaches before the buffer start. +const v_distance_too_far = [_]u8{ 0x03, 0x02 }; + +// Dynamic block whose code-length table defines only symbol 16 at length 1 +// (an incomplete table); the next code-length symbol reads a `1` bit, which +// matches no code. +const v_bad_huffman_code = [_]u8{ 0x05, 0x00, 0x02, 0x20 }; + +test "inflate rejects a stored block with bad LEN/NLEN" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.BadStoredLength, inflate(gpa, &v_bad_stored_length)); +} + +test "inflate rejects a back-reference distance past the output start" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.DistanceTooFar, inflate(gpa, &v_distance_too_far)); +} + +test "inflate rejects a bit pattern matching no Huffman code" { + const gpa = std.testing.allocator; + try std.testing.expectError(error.BadHuffmanCode, inflate(gpa, &v_bad_huffman_code)); +} From 2b502af17ee7511b582a77394a010c2a0d56a1fd Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 10:16:32 +0200 Subject: [PATCH 26/29] docs(assets): correct adler32 nmax and hash comments --- src/foundation/simd/kernels/adler32.zig | 7 +++++-- src/modules/asset_pipeline/hash.zig | 6 ++++-- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/src/foundation/simd/kernels/adler32.zig b/src/foundation/simd/kernels/adler32.zig index 54ad91e..ba14c77 100644 --- a/src/foundation/simd/kernels/adler32.zig +++ b/src/foundation/simd/kernels/adler32.zig @@ -17,8 +17,11 @@ const std = @import("std"); /// Largest prime smaller than 65536; the ADLER32 modulus. pub const base: u32 = 65521; -/// Largest block length such that the deferred-modulo accumulation cannot -/// overflow the u64 intermediates in `vectorized`. +/// Block length between modulo reductions. 
5552 is the classic ADLER32 +/// bound: the largest `n` for which `255*n*(n+1)/2 + (n+1)*(base-1)` stays +/// within an unsigned 32-bit accumulator. It is conservative for the u64 intermediates +/// used here (they would not overflow until far larger `n`); keeping 5552 +/// matches zlib's `NMAX` and the scalar `reference`. pub const nmax: usize = 5552; /// Scalar reference implementation — the correctness oracle. Computes diff --git a/src/modules/asset_pipeline/hash.zig b/src/modules/asset_pipeline/hash.zig index 6e0f1f6..1f45794 100644 --- a/src/modules/asset_pipeline/hash.zig +++ b/src/modules/asset_pipeline/hash.zig @@ -3,7 +3,7 @@ //! `source_hash`, `extracted.blob`, and the cooking-cache key are all //! BLAKE3 truncated to 128 bits, rendered as 32 lowercase hex chars //! (`engine-asset-pipeline.md §3`, brief §E4). The runtime `.bin` header -//! `hash` field is a u64 (the low 64 bits of the same digest). +//! `hash` field is a u64 — the first 8 bytes of the same digest, read LE. const std = @import("std"); @@ -16,7 +16,9 @@ pub fn hex128(data: []const u8) [32]u8 { return std.fmt.bytesToHex(digest, .lower); } -/// Low 64 bits of the BLAKE3 digest of `data` (the `.bin` header `hash`). +/// The first 8 bytes of the BLAKE3 digest of `data`, read little-endian as a +/// u64 (the `.bin` header `hash`). A truncation of the digest, not the low +/// bits of some wider integer. 
pub fn u64Of(data: []const u8) u64 { var digest: [8]u8 = undefined; Blake3.hash(data, &digest, .{}); From 87d0b28cda1a4eb56f5ddc2536415da557572553 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 10:16:33 +0200 Subject: [PATCH 27/29] feat(assets): add async runtime loader and lifecycle --- build.zig | 2 + src/modules/asset_pipeline/loader/Loader.zig | 241 +++++++++++++++++++ src/modules/asset_pipeline/loader/root.zig | 8 + src/modules/asset_pipeline/root.zig | 6 + tests/assets/loader_async.zig | 114 +++++++++ 5 files changed, 371 insertions(+) create mode 100644 src/modules/asset_pipeline/loader/Loader.zig create mode 100644 src/modules/asset_pipeline/loader/root.zig create mode 100644 tests/assets/loader_async.zig diff --git a/build.zig b/build.zig index bb5cfeb..cc0a6e6 100644 --- a/build.zig +++ b/build.zig @@ -438,6 +438,8 @@ pub fn build(b: *std.Build) void { .{ .path = "tests/vk_gen/raw_variants.zig" }, // M0.6 / E1 — asset registry stale-handle (generation) acceptance. .{ .path = "tests/assets/handle_generation.zig", .asset_pipeline = true }, + // M0.6 / E5 — async loader + lifecycle (internal 5 s watchdog). + .{ .path = "tests/assets/loader_async.zig", .asset_pipeline = true }, // M0.6 / E2 — DEFLATE/zlib inflate known-vector acceptance. .{ .path = "tests/assets/deflate_vectors.zig", .asset_pipeline = true }, // M0.6 / E2 — adler32 kernel known vectors + cross-variant correctness. diff --git a/src/modules/asset_pipeline/loader/Loader.zig b/src/modules/asset_pipeline/loader/Loader.zig new file mode 100644 index 0000000..d3a05e7 --- /dev/null +++ b/src/modules/asset_pipeline/loader/Loader.zig @@ -0,0 +1,241 @@ +//! Async runtime asset loader + lifecycle (M0.6 / E5). +//! +//! Loads a cooked `..bin` off the main thread and tracks its lifetime +//! through the E1 `Registry`. `beginLoad` spawns the file read on a worker +//! via `io.concurrent` (the §8 "worker thread + async I/O" path); the caller +//! 
polls `Pending.ready()` and ticks its own loop meanwhile — the load never +//! blocks the main thread. `finish` registers the result (main thread, the +//! registry is single-threaded). `load` is the blocking convenience. +//! +//! Lifecycle (brief §E5): `load → alloc` (uuid 0 at runtime — the `.bin` +//! carries no uuid in M0.6); `retain`/`release` for refcount (release at 0 +//! unloads + frees the payload); `unload` is the forced path reserved for +//! hot-reload / eviction; `reload` re-reads and swaps the payload in place. +//! +//! Byte order: the header is parsed with the portable `RuntimeHeader.read` +//! (explicit little-endian). M0.6 does NOT take the zero-copy `@ptrCast` / +//! mmap path, so `RuntimeHeader.read` remains the single byte-order site (the +//! E1 note holds). A future zero-copy mmap path would add a SECOND +//! byte-order-dependent path — correct on the LE Phase 0 targets, but it +//! means `read` is not the universal single byte-swap point once that path +//! exists. The two would coexist; neither is claimed as the sole authority. +//! +//! The Tier 0 Chase-Lev job system is deliberately NOT used: its public +//! surface is ECS-chunk-shaped and using it here would require widening it +//! (a Case 2 blocker per the brief). `std.Io` async is the §8-prescribed, +//! surface-neutral mechanism. + +const std = @import("std"); +const format = @import("../format/root.zig"); +const registry_mod = @import("../registry/root.zig"); + +const Loader = @This(); +const Registry = registry_mod.Registry; +const AssetHandle = registry_mod.AssetHandle; +const RuntimeHeader = format.RuntimeHeader; + +/// Base directory cooked `.bin` files are read from. +dir: std.Io.Dir, +/// Identity / refcount / generation table. +registry: Registry, +/// Loaded payloads keyed by `AssetHandle.index`. +payloads: std.AutoHashMapUnmanaged(u32, Payload), + +const Payload = struct { + header: RuntimeHeader, + /// The whole `.bin` (owned); `data`/`metadata` are slices into it. 
+ bin: []u8, +}; + +/// Errors from the async read task (a concrete set so it can ride a Future). +pub const LoadError = error{ + /// Opening / stat-ing / reading the file failed. + ReadFailed, + /// The `.bin` was shorter than the 40-byte header. + ShortBuffer, + /// The `.bin` did not start with the `WELD` magic. + BadMagic, + /// Allocation failed. + OutOfMemory, +}; + +/// Errors from registering a completed read into the registry. +pub const FinishError = error{ + /// The header `asset_type` is not a known `AssetType`. + UnknownAssetType, + /// Allocation failed. + OutOfMemory, +}; + +/// A read produced by the async task: the parsed header + the owned `.bin`. +pub const Raw = struct { + header: RuntimeHeader, + bin: []u8, +}; + +/// A load in flight. Poll `ready()`; collect with `wait()`; abandon with +/// `cancel()`. The `path` passed to `beginLoad` must outlive the `Pending`. +pub const Pending = struct { + future: std.Io.Future(LoadError!Raw), + done: *std.atomic.Value(bool), + gpa: std.mem.Allocator, + + /// Non-blocking: true once the worker has finished (success or error). + pub fn ready(self: *const Pending) bool { + return self.done.load(.acquire); + } + + /// Block until the read completes and return its result. Releases the + /// done flag. Call exactly once (mutually exclusive with `cancel`). + pub fn wait(self: *Pending, io: std.Io) LoadError!Raw { + const result = self.future.await(io); + self.gpa.destroy(self.done); + return result; + } + + /// Abandon the load (teardown path): await it, free any produced buffer, + /// release the done flag. + pub fn cancel(self: *Pending, io: std.Io) void { + if (self.future.cancel(io)) |raw| { + self.gpa.free(raw.bin); + } else |_| {} + self.gpa.destroy(self.done); + } +}; + +/// Create an empty loader reading from `dir`. +pub fn init(dir: std.Io.Dir) Loader { + return .{ .dir = dir, .registry = Registry.init(), .payloads = .empty }; +} + +/// Free all loaded payloads + the registry, and poison `self`. 
+pub fn deinit(self: *Loader, gpa: std.mem.Allocator) void { + var it = self.payloads.valueIterator(); + while (it.next()) |p| gpa.free(p.bin); + self.payloads.deinit(gpa); + self.registry.deinit(gpa); + self.* = undefined; +} + +/// Start an asynchronous read of `path` on a worker. Returns immediately — +/// the caller's loop keeps ticking. `path` must outlive the `Pending`. +pub fn beginLoad(self: *Loader, gpa: std.mem.Allocator, io: std.Io, path: []const u8) (std.Io.ConcurrentError || error{OutOfMemory})!Pending { + const done = try gpa.create(std.atomic.Value(bool)); + done.* = std.atomic.Value(bool).init(false); + errdefer gpa.destroy(done); + const future = try io.concurrent(loadTask, .{ gpa, io, self.dir, path, done }); + return .{ .future = future, .done = done, .gpa = gpa }; +} + +/// Register a completed read into the registry and take ownership of its +/// buffer. Returns the asset handle. (Main thread — the registry is +/// single-threaded.) +pub fn finish(self: *Loader, gpa: std.mem.Allocator, raw: Raw) FinishError!AssetHandle { + const asset_type = raw.header.assetType() orelse { + gpa.free(raw.bin); + return error.UnknownAssetType; + }; + const handle = self.registry.allocWithUuid(gpa, asset_type, 0) catch |err| switch (err) { + error.OutOfMemory => { + gpa.free(raw.bin); + return error.OutOfMemory; + }, + // `alloc` reserves a fresh slot; it never validates an existing handle. + error.StaleHandle => unreachable, + }; + self.payloads.put(gpa, handle.index, .{ .header = raw.header, .bin = raw.bin }) catch |err| { + gpa.free(raw.bin); + self.registry.unload(gpa, handle) catch {}; + return err; + }; + return handle; +} + +/// Blocking convenience: `beginLoad` + `wait` + `finish`. 
+pub fn load(self: *Loader, gpa: std.mem.Allocator, io: std.Io, path: []const u8) !AssetHandle { + var pending = try self.beginLoad(gpa, io, path); + const raw = try pending.wait(io); + return self.finish(gpa, raw); +} + +/// The bulk data slice of a loaded asset, or null if the handle is stale / +/// unknown. +pub fn get(self: *const Loader, handle: AssetHandle) ?[]const u8 { + if (!self.registry.isAlive(handle)) return null; + const p = self.payloads.get(handle.index) orelse return null; + return p.bin[p.header.data_offset..][0..p.header.data_size]; +} + +/// The parsed header of a loaded asset, or null if stale / unknown. +pub fn headerOf(self: *const Loader, handle: AssetHandle) ?RuntimeHeader { + if (!self.registry.isAlive(handle)) return null; + const p = self.payloads.get(handle.index) orelse return null; + return p.header; +} + +/// Add a strong reference. +pub fn retain(self: *Loader, handle: AssetHandle) Registry.Error!void { + return self.registry.retain(handle); +} + +/// Drop a strong reference; at refcount 0 the asset unloads and its payload +/// is freed. +pub fn release(self: *Loader, gpa: std.mem.Allocator, handle: AssetHandle) Registry.Error!void { + try self.registry.release(gpa, handle); + if (!self.registry.isAlive(handle)) self.freePayload(gpa, handle.index); +} + +/// Force-unload (hot-reload / eviction): drop the slot regardless of +/// refcount and free the payload. +pub fn unload(self: *Loader, gpa: std.mem.Allocator, handle: AssetHandle) Registry.Error!void { + try self.registry.unload(gpa, handle); + self.freePayload(gpa, handle.index); +} + +/// Hot-reload: re-read `path` and swap the payload in place. The handle, +/// generation, and refcount are preserved. 
+pub fn reload(self: *Loader, gpa: std.mem.Allocator, io: std.Io, handle: AssetHandle, path: []const u8) !void { + if (!self.registry.isAlive(handle)) return error.StaleHandle; + const raw = try readBin(gpa, io, self.dir, path); + if (self.payloads.getPtr(handle.index)) |p| { + gpa.free(p.bin); + p.* = .{ .header = raw.header, .bin = raw.bin }; + } else { + self.payloads.put(gpa, handle.index, .{ .header = raw.header, .bin = raw.bin }) catch |err| { + gpa.free(raw.bin); + return err; + }; + } +} + +fn freePayload(self: *Loader, gpa: std.mem.Allocator, index: u32) void { + if (self.payloads.fetchRemove(index)) |kv| gpa.free(kv.value.bin); +} + +fn loadTask(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8, done: *std.atomic.Value(bool)) LoadError!Raw { + defer done.store(true, .release); + return readBin(gpa, io, dir, path); +} + +fn readBin(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, path: []const u8) LoadError!Raw { + const file = dir.openFile(io, path, .{}) catch return error.ReadFailed; + defer file.close(io); + const stat = file.stat(io) catch return error.ReadFailed; + const size: usize = @intCast(stat.size); + + const bin = gpa.alloc(u8, size) catch return error.OutOfMemory; + errdefer gpa.free(bin); + + var read_buf: [8192]u8 = undefined; + var reader = file.reader(io, &read_buf); + var off: usize = 0; + while (off < size) { + const n = reader.interface.readSliceShort(bin[off..]) catch return error.ReadFailed; + if (n == 0) break; + off += n; + } + if (off != size) return error.ReadFailed; + + const header = try RuntimeHeader.read(bin); // portable LE parse + return .{ .header = header, .bin = bin }; +} diff --git a/src/modules/asset_pipeline/loader/root.zig b/src/modules/asset_pipeline/loader/root.zig new file mode 100644 index 0000000..1a4e21f --- /dev/null +++ b/src/modules/asset_pipeline/loader/root.zig @@ -0,0 +1,8 @@ +//! Asset Pipeline `loader/` namespace — async runtime loading + lifecycle. 
+ +/// Async runtime asset loader (file-as-type). +pub const Loader = @import("Loader.zig"); + +comptime { + _ = Loader; +} diff --git a/src/modules/asset_pipeline/root.zig b/src/modules/asset_pipeline/root.zig index e31d7a6..c732193 100644 --- a/src/modules/asset_pipeline/root.zig +++ b/src/modules/asset_pipeline/root.zig @@ -40,6 +40,11 @@ pub const hash = @import("hash.zig"); /// Stable per-asset identity (UUIDv7). pub const uuid = @import("uuid.zig"); +/// Async runtime loader + lifecycle (E5). +pub const loader = @import("loader/root.zig"); +/// Async runtime loader (convenience re-export). +pub const Loader = loader.Loader; + /// 64-bit typed asset handle (convenience re-export). pub const AssetHandle = registry.AssetHandle; /// Slot registry (convenience re-export). @@ -62,4 +67,5 @@ comptime { _ = cache; _ = hash; _ = uuid; + _ = loader; } diff --git a/tests/assets/loader_async.zig b/tests/assets/loader_async.zig new file mode 100644 index 0000000..1cb90c4 --- /dev/null +++ b/tests/assets/loader_async.zig @@ -0,0 +1,114 @@ +//! M0.6 / E5 — async loader + lifecycle acceptance. +//! +//! Brief §Acceptance ▸ Tests: `test "async load does not block main thread"` — +//! the main loop ticks while a load is in flight, the load completes, with an +//! internal 5 s watchdog and clean teardown (S6 hang lesson, +//! `engine-zig-conventions.md` §13). 
+ +const std = @import("std"); +const assets = @import("weld_asset_pipeline"); + +const Loader = assets.Loader; +const AssetType = assets.AssetType; +const fmt = assets.format; + +fn cookTextureBin(gpa: std.mem.Allocator, io: std.Io, dir: std.Io.Dir, name: []const u8, rgba: []const u8) !void { + const extracted = [_]fmt.Field{ + .{ .key = "width", .value = .{ .int = 2 } }, + .{ .key = "height", .value = .{ .int = 2 } }, + .{ .key = "blob", .value = .{ .string = "00" } }, + }; + const doc = fmt.AssetDoc{ + .name = "x", + .type_name = "Texture2D", + .version = 1, + .source = "x.png", + .source_hash = "0", + .extracted = &extracted, + }; + const bin = try assets.cookers.cookTexture(gpa, doc, rgba); + defer gpa.free(bin); + const file = try dir.createFile(io, name, .{ .truncate = true }); + defer file.close(io); + try file.writeStreamingAll(io, bin); +} + +test "async load does not block main thread" { + const gpa = std.testing.allocator; + const io = std.testing.io; + + var tmp = std.testing.tmpDir(.{}); + defer tmp.cleanup(); + + const rgba = [_]u8{ + 0xff, 0x00, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, + 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0x00, 0xff, + }; + try cookTextureBin(gpa, io, tmp.dir, "x.texture.bin", &rgba); + + var loader = Loader.init(tmp.dir); + defer loader.deinit(gpa); + + // Begin the load and keep ticking the main loop until it is ready. A + // 5 s wall-clock watchdog guarantees the suite cannot hang on a stuck + // load, with clean teardown via `pending.cancel`. 
+ var pending = try loader.beginLoad(gpa, io, "x.texture.bin"); + const start = std.Io.Clock.Timestamp.now(io, .awake); + var ticks: usize = 0; + while (!pending.ready()) { + ticks += 1; + std.mem.doNotOptimizeAway(ticks); + if (start.untilNow(io).raw.nanoseconds > 5 * std.time.ns_per_s) { + pending.cancel(io); + return error.LoadTimedOut; + } + } + try std.testing.expect(ticks >= 1); // the main loop advanced; the read ran off-thread + + const raw = try pending.wait(io); + const handle = try loader.finish(gpa, raw); + + try std.testing.expectEqual(AssetType.texture, handle.assetType().?); + try std.testing.expectEqual(AssetType.texture, loader.headerOf(handle).?.assetType().?); + try std.testing.expectEqualSlices(u8, &rgba, loader.get(handle).?); + try std.testing.expectEqual(@as(u32, 1), loader.registry.refCount(handle).?); + + // Lifecycle: retain bumps the count; release at 0 unloads + frees payload. + try loader.retain(handle); + try std.testing.expectEqual(@as(u32, 2), loader.registry.refCount(handle).?); + try loader.release(gpa, handle); + try std.testing.expect(loader.get(handle) != null); // still alive at refcount 1 + try loader.release(gpa, handle); + try std.testing.expectEqual(@as(?[]const u8, null), loader.get(handle)); + try std.testing.expect(!loader.registry.isAlive(handle)); +} + +test "loader reload swaps the payload, forced unload drops it" { + const gpa = std.testing.allocator; + const io = std.testing.io; + + var tmp = std.testing.tmpDir(.{}); + defer tmp.cleanup(); + + const red = [_]u8{ 0xff, 0, 0, 0xff } ** 4; + const blue = [_]u8{ 0, 0, 0xff, 0xff } ** 4; + try cookTextureBin(gpa, io, tmp.dir, "a.texture.bin", &red); + try cookTextureBin(gpa, io, tmp.dir, "b.texture.bin", &blue); + + var loader = Loader.init(tmp.dir); + defer loader.deinit(gpa); + + // Blocking convenience load. 
+ const handle = try loader.load(gpa, io, "a.texture.bin"); + try std.testing.expectEqualSlices(u8, &red, loader.get(handle).?); + + // Hot-reload swaps the payload; the handle (and refcount) are preserved. + try loader.reload(gpa, io, handle, "b.texture.bin"); + try std.testing.expectEqualSlices(u8, &blue, loader.get(handle).?); + try std.testing.expect(loader.registry.isAlive(handle)); + + // Forced unload drops the slot regardless of refcount. + try loader.unload(gpa, handle); + try std.testing.expect(!loader.registry.isAlive(handle)); + try std.testing.expectEqual(@as(?[]const u8, null), loader.get(handle)); +} From a7711523f0e6ed5109919cd64c338ae4dd7940a4 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 10:16:34 +0200 Subject: [PATCH 28/29] docs(brief): journal update --- briefs/m0.6-assets.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index bdeb4b1..ec5feb5 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -215,6 +215,8 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-04 06:44 — E4 pipeline: `hash.zig` (BLAKE3-128 hex / u64, `std.crypto.hash.Blake3`, no C binding). Importers (`png`/`gltf`/`wav`.import → `Import{ arena doc, gpa blob }`, `source_hash` + `extracted.blob` = BLAKE3-128 hex). Cookers (`texture`/`mesh`/`audio` → `..bin` via the E1-frozen 40-byte `RuntimeHeader`, explicit LE `toBytes` — no `@ptrCast` on write; payloads raw RGBA8 / f32-verts+u32-idx / PCM). Local cache (`cache/`: `computeKey` = BLAKE3-128(source_hash++settings++platform); dir-backed `contains`/`put`/`get`; invalidates on source_hash). Thin offline entry `tools/asset_cook` (`zig build cook-demo`): MISS run writes `.asset.etch` + `.weld/blobs/` + `.bin`, HIT run skips — brief §Observable behavior. 
- 2026-06-04 06:44 — E4 tests: `png_roundtrip`/`gltf_static_roundtrip`/`wav_roundtrip` (import→cook→load, self-checked against the E3 decoder oracle), `cache_diff`. Fixtures `tests/assets/data/{checker.png,cube.gltf,tone.wav}` (PNG PIL-verified). **cache_diff gate note:** asserts `hit < 10 ms` + `miss > 20× hit` + cached==cooked (robust differential), NOT a literal `miss ≥ 100 ms` — a 16 MiB raw-copy cook measured 105 ms Debug / 54 ms ReleaseSafe (build-mode/disk dependent), so the literal ≥100 ms is the reference-machine figure for a real decode-heavy asset, not portably assertable for M0.6's trivial raw-copy cook. Flagged for review. - 2026-06-04 06:44 — Files added beyond the brief's explicit list (justified): `hash.zig` (shared BLAKE3 util for importers/cookers/cache); `importers/common.zig` + `cookers/common.zig` (shared `Import` / `assemble`); `tools/asset_cook/main.zig` (the thin offline entry the brief's Observable behavior + Out-of-scope call for). All within `asset_pipeline`/`tools`. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. +- 2026-06-04 10:16 — E5 cleanups (E1/E2/E4 minor points, per Guy, this session): `inflate.zig` now rejects a 16/17/18 length-repeat that overruns `total` (computes `end = i + repeat`, `end > total` → `error.BadSymbol`; previously clamped silently via `and i < total`). 3 negative deflate tests (hand-built LSB-first vectors, one per guard): `BadStoredLength`, `DistanceTooFar`, `BadHuffmanCode`. `adler32.zig` `nmax` comment corrected (classic 32-bit ADLER32 bound, conservative for the u64 intermediates — not "u64 overflow"). `hash.zig` `u64Of` comment ("first 8 bytes of the digest", a truncation, not low bits of a u128). +- 2026-06-04 10:16 — E5 loader: `loader/Loader.zig` async runtime loader. 
`beginLoad` reads the `.bin` off-thread via `io.concurrent` (the §8 worker + async-I/O path; the ECS-shaped Chase-Lev job system is deliberately NOT used — using it would widen its surface, a Case 2 blocker). `Pending.ready()` non-blocking poll (atomic done flag) + `wait`/`cancel`; `finish` registers via `Registry.allocWithUuid` (uuid 0 at runtime — the `.bin` carries none). Lifecycle: `retain`/`release` (release at 0 unloads + frees payload), forced `unload` (hot-reload/eviction), `reload` (re-read + swap payload in place; handle/refcount preserved). Header parsed via the portable `RuntimeHeader.read` (explicit LE); M0.6 takes no mmap/`@ptrCast` zero-copy path, so `RuntimeHeader.read` stays the single byte-order site (E1 note holds) — a future zero-copy path would be a 2nd byte-order-dependent path (documented in `Loader`, neither claimed as sole authority). `loader_async.zig`: async load doesn't block the main loop (ticks + 5 s watchdog + clean teardown) + full lifecycle + reload + forced unload; non-flaky over ReleaseSafe + 3 Debug re-runs. No new frozen surface. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. - 2026-06-04 06:58 — E4 complement (Claude.ai): stable `uuid` (UUIDv7) identity. `AssetDoc.uuid` is the first body field, emitted `asset "" { uuid: "…" type: … }` (mirrors `entity uuid` in `.scene.etch`); writer/parser/`eql` updated. New `uuid.zig` — UUIDv7 generation (48-bit `Clock.real` ms + `io.random` bits, version 7 + RFC variant), `toString`/`parse`, pure Zig (no C binding). Importers take a resolved `uuid` param; the offline entry **generates on first cook and preserves on re-cook** by reading the existing `.asset.etch` (the fs-aware generate/preserve lives in the orchestration; the pure importer receives the resolved uuid). 
Registry: `Slot.uuid` u128 + `allocWithUuid` + `Resolved.uuid` — **stored only** (path-based resolution unchanged; uuid resolution / rename-propagation is Phase 1+; the M0.6 `.bin` carries no uuid so runtime `alloc` leaves it 0). `source_hash` (changes with source) and `uuid` (stable for life) kept distinct. cook-demo verified: run 1 generates 3 distinct UUIDv7, run 2 preserves them (cache HIT). Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. ## Recorded deviations From 34bb3e67fb7dfd9aba603324422ae3ad305ba1e8 Mon Sep 17 00:00:00 2001 From: Guy Senpai Date: Thu, 4 Jun 2026 11:17:37 +0200 Subject: [PATCH 29/29] docs(brief): close m0.6 --- briefs/m0.6-assets.md | 47 ++++++++++++++++++++++++++----------------- 1 file changed, 29 insertions(+), 18 deletions(-) diff --git a/briefs/m0.6-assets.md b/briefs/m0.6-assets.md index ec5feb5..5938ab9 100644 --- a/briefs/m0.6-assets.md +++ b/briefs/m0.6-assets.md @@ -8,13 +8,13 @@ # M0.6 — Asset pipeline v0 (formats, codecs, cooker, async loader) -> **Status:** ACTIVE +> **Status:** CLOSED > **Phase:** 0 > **Branch:** `phase-0/assets/pipeline-v0` > **Planned tag:** `v0.6.0-M0.6-assets` > **Dependencies:** M0.0 → M0.5 (Phase 0 to date). In particular: the Chase-Lev work-stealing job system (Tier 0), the `std.Io` async file I/O from the platform layer, and the uniform `root.zig` module convention. > **Opened:** 2026-06-03 -> **Closed:** — +> **Closed:** 2026-06-04 --- @@ -30,7 +30,7 @@ First new functional surface since the Vulkan renderer (M0.4). Delivers the mini Frozen day-1 surfaces (formats + handle + cache key) and their minimal PNG / static-glTF / WAV implementation. Decode-only — no encoder of any source format is delivered. -- **Intermediate format `.asset.etch`** — Etch text (asset name, `type`, `version`, `source`, `source_hash`, `import_settings`, `process_settings`, `extracted`) plus a referenced hashed binary blob stored separately. 
Schema frozen day 1; only the PNG / static-glTF / WAV subset is populated. +- **Intermediate format `.asset.etch`** — Etch text: asset name; `uuid` (stable UUIDv7, emitted as the first body field); `type`, `version`, `source`, `source_hash`; the four blocks `import_settings`, `process_settings`, `cook_settings`, `extracted`, where `extracted` carries a mandatory `blob: ""` referencing the separately-stored hashed binary blob. Schema frozen day 1; only the PNG / static-glTF / WAV subset is populated. Normative schema: `engine-asset-pipeline.md §3`; construct grammar: `etch-grammar.md §21.4`. - **Runtime format `..bin`** — zero-copy binary, **40-byte header**, 8-byte aligned (`@sizeOf == 40`, no implicit padding), fields in order with offsets: `magic`="WELD" [4]u8 @0, `version` u16 @4, `asset_type` u16 @6, `platform` u16 @8, `flags` u16 @10, `data_offset` u32 @12, `data_size` u32 @16, `metadata_offset` u32 @20, `metadata_size` u32 @24, `_reserved` u32 @28 (zero-filled), `hash` u64 @32. Followed by metadata then bulk data; `data_offset` / `metadata_offset` let payload sections be independently aligned (e.g. 16-byte for GPU/SIMD data), so payload alignment is decoupled from header size. mmap-able directly. Header layout frozen day 1. - **`AssetHandle`** — 64-bit packed handle, fields: `index` u32, `generation` u16, `type_tag` u16. Typed, generation-checked. Frozen day 1. (Named `AssetHandle`, not `AssetRef`.) - **Asset registry** — handle allocation, refcount, generation bump on unload so that a stale handle is detectable. @@ -41,7 +41,7 @@ Frozen day-1 surfaces (formats + handle + cache key) and their minimal PNG / sta - **WAV decode** (in `importers/wav.zig`) — RIFF PCM. - **Importers** (source → intermediate) for: png, gltf (static), wav. - **Cookers** (intermediate → runtime `..bin`) for: texture (raw RGBA8 payload), mesh (raw f32 vertices, version 1), audio (raw PCM payload). 
-- **Local cooking cache** — key = `hash(source_bytes + import_settings + process_settings + cook_settings + platform)`; a second cook of an unchanged asset completes in < 10 ms versus ≥ 100 ms for the first cook (differential measurement). +- **Local cooking cache** — key = `hash(source_bytes + import_settings + process_settings + cook_settings + platform)`; a re-cook of an unchanged asset hits the cache and skips the work. Gate = `cached == cooked` + `miss ≫ hit` (the hit avoids re-cooking); absolute cook times are logged, not asserted (see Benchmarks). - **Async runtime loader** — load / unload / reload lifecycle over `std.Io` + the Tier 0 job system; never blocks the main thread. ## Out-of-scope @@ -55,7 +55,7 @@ Listed explicitly so the pipeline is not extended "to do it properly". Each item - TGA, HDR, EXR, PSD codecs; FLAC, OGG/Vorbis codecs (Phase 1+ / Phase 2+). - TTF/OTF parser and MSDF atlas generator (Phase 1). - LOD generation, tangent generation, mipmap generation, GPU texture compression (BC7/BC5/BC4/ASTC) (Phase 1–2). Cooked texture payload in M0.6 is raw RGBA8; cooked audio payload is raw PCM (no Opus — the Opus keeper is not wired in M0.6). -- Vertex quantization (f16 positions + octahedral normals). Deferred to Phase 1 and added via a `.mesh.bin` `version` bump — the header `version` field reserves this evolution, so the deferral forces no refactor. +- Vertex quantization (f16 positions + octahedral normals). Deferred to Phase 1, gated by a **mesh payload-format version carried in the `.mesh.bin` metadata section** (added when quantization lands) — distinct from the runtime header `version` field, which versions the header layout itself. Quantization is therefore an additive metadata/payload change, not a header-layout change, and forces no refactor. - Advanced hot-reload and the FileWatcher / inotify-FSEvents-ReadDirectoryChangesW path (Phase 2). - Streaming manager: priority, memory budget, LRU eviction, placeholders policy (Phase 2+). 
- Network and cloud cache tiers — only the local cache tier exists in M0.6. @@ -79,12 +79,13 @@ This milestone is staged with review gates, per the M0.1 / M0.5 staged-execution Mandatory reads before any production code; Claude Code ticks each box in the LIVING SECTION. 1. `engine-phase-0-plan.md` — § M0.6 — scope source of record. -2. `engine-asset-pipeline.md` — §1–10 — intermediate/runtime formats, importers, cookers, cache, registry, async loader. -3. `engine-spec.md` — §16 (Asset Pipeline) and §3.5 (in-tree as-if-lib discipline) — master alignment. -4. `engine-simd.md` — §1–3 (module role, structure, two-level API), §7.1 (Asset Pipeline hot-path map), §9 (phasing: M0.6 adds the skeleton + `adler32` + `paeth_filter_decode`), §10-referenced `@Vector`-first / asm-second discipline. -5. `engine-zig-conventions.md` — §13 surface coverage (lazy analysis guard), module rooting rule, codecs in-tree convention, `root.zig` convention. -6. `engine-directory-structure.md` — `src/modules/asset_pipeline/` and `src/foundation/simd/` layout. -7. `engine-development-workflow.md` — §4.3 (Conventional Commits), §4.6 (squash format), language closure criterion. +2. `engine-asset-pipeline.md` — §1–10, especially §3 (the normative intermediate-format schema) — intermediate/runtime formats, importers, cookers, cache, registry, async loader. +3. `etch-grammar.md` — §21.4 — grammar of the `asset` construct (category 4, pipeline-generated). +4. `engine-spec.md` — §16 (Asset Pipeline) and §3.5 (in-tree as-if-lib discipline) — master alignment. +5. `engine-simd.md` — §1–3 (module role, structure, two-level API), §7.1 (Asset Pipeline hot-path map), §9 (phasing: M0.6 adds the skeleton + `adler32` + `paeth_filter_decode`), §10-referenced `@Vector`-first / asm-second discipline. +6. `engine-zig-conventions.md` — §13 surface coverage (lazy analysis guard), module rooting rule, codecs in-tree convention, `root.zig` convention. +7. 
`engine-directory-structure.md` — `src/modules/asset_pipeline/` and `src/foundation/simd/` layout. +8. `engine-development-workflow.md` — §4.3 (Conventional Commits), §4.6 (squash format), language closure criterion. ## Files to create or modify @@ -138,7 +139,7 @@ Concrete paths. Files outside this list must not be touched without a written ju ### Benchmarks -- `tests/assets/cache_diff.zig` (differential) — first cook ≥ 100 ms, second cook of the unchanged asset < 10 ms, on the Phase 0 reference machine. This is the only numeric gate. +- `tests/assets/cache_diff.zig` (differential) — gate: `cached == cooked` (bit-identical) and `miss ≫ hit` (the cache hit avoids re-cooking; the test asserts `miss > 20× hit` and `hit < 10 ms`). Absolute cook times are logged, not asserted: M0.6's cook is a trivial raw copy, so the original absolute "≥ 100 ms / < 10 ms" was a calibration error (it applies only to a Phase 1+ BC7/Opus cook). See Recorded deviations. - `src/foundation/simd/bench/adler32_bench.zig`, `paeth_bench.zig` — throughput, **baseline recorded only, no parity target**. These kernels sit on a cold path (decode runs once at cook time; the runtime mmaps the cooked `.bin`); a zlib-ng parity target is explicitly out of scope to avoid optimizing a cold path. ### Observable behavior @@ -192,7 +193,8 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o ## Specs read - [x] `engine-phase-0-plan.md` (§ M0.6) — read 2026-06-03 18:32 -- [x] `engine-asset-pipeline.md` (§1–10) — read 2026-06-03 18:32 +- [x] `engine-asset-pipeline.md` (§1–10, esp. 
§3 normative schema) — read 2026-06-03 18:32 +- [x] `etch-grammar.md` (§21.4 — `asset` construct grammar) — read 2026-06-04 06:44 - [x] `engine-spec.md` (§16, §3.5) — read 2026-06-03 18:32 - [x] `engine-simd.md` (§1–3, §7.1, §9, §10) — read 2026-06-03 18:32 - [x] `engine-zig-conventions.md` (§13, module rooting, codecs in-tree, root.zig) — read 2026-06-03 18:32 @@ -218,6 +220,7 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o - 2026-06-04 10:16 — E5 cleanups (E1/E2/E4 minor points, per Guy, this session): `inflate.zig` now rejects a 16/17/18 length-repeat that overruns `total` (computes `end = i + repeat`, `end > total` → `error.BadSymbol`; previously clamped silently via `and i < total`). 3 negative deflate tests (hand-built LSB-first vectors, one per guard): `BadStoredLength`, `DistanceTooFar`, `BadHuffmanCode`. `adler32.zig` `nmax` comment corrected (classic 32-bit ADLER32 bound, conservative for the u64 intermediates — not "u64 overflow"). `hash.zig` `u64Of` comment ("first 8 bytes of the digest", a truncation, not low bits of a u128). - 2026-06-04 10:16 — E5 loader: `loader/Loader.zig` async runtime loader. `beginLoad` reads the `.bin` off-thread via `io.concurrent` (the §8 worker + async-I/O path; the ECS-shaped Chase-Lev job system is deliberately NOT used — using it would widen its surface, a Case 2 blocker). `Pending.ready()` non-blocking poll (atomic done flag) + `wait`/`cancel`; `finish` registers via `Registry.allocWithUuid` (uuid 0 at runtime — the `.bin` carries none). Lifecycle: `retain`/`release` (release at 0 unloads + frees payload), forced `unload` (hot-reload/eviction), `reload` (re-read + swap payload in place; handle/refcount preserved). 
Header parsed via the portable `RuntimeHeader.read` (explicit LE); M0.6 takes no mmap/`@ptrCast` zero-copy path, so `RuntimeHeader.read` stays the single byte-order site (E1 note holds) — a future zero-copy path would be a 2nd byte-order-dependent path (documented in `Loader`, neither claimed as sole authority). `loader_async.zig`: async load doesn't block the main loop (ticks + 5 s watchdog + clean teardown) + full lifecycle + reload + forced unload; non-flaky over ReleaseSafe + 3 Debug re-runs. No new frozen surface. Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. - 2026-06-04 06:58 — E4 complement (Claude.ai): stable `uuid` (UUIDv7) identity. `AssetDoc.uuid` is the first body field, emitted `asset "" { uuid: "…" type: … }` (mirrors `entity uuid` in `.scene.etch`); writer/parser/`eql` updated. New `uuid.zig` — UUIDv7 generation (48-bit `Clock.real` ms + `io.random` bits, version 7 + RFC variant), `toString`/`parse`, pure Zig (no C binding). Importers take a resolved `uuid` param; the offline entry **generates on first cook and preserves on re-cook** by reading the existing `.asset.etch` (the fs-aware generate/preserve lives in the orchestration; the pure importer receives the resolved uuid). Registry: `Slot.uuid` u128 + `allocWithUuid` + `Resolved.uuid` — **stored only** (path-based resolution unchanged; uuid resolution / rename-propagation is Phase 1+; the M0.6 `.bin` carries no uuid so runtime `alloc` leaves it 0). `source_hash` (changes with source) and `uuid` (stable for life) kept distinct. cook-demo verified: run 1 generates 3 distinct UUIDv7, run 2 preserves them (cache HIT). Gates green: `zig build`, `zig build test` (Debug + ReleaseSafe), `zig fmt --check`, `zig build lint`. +- 2026-06-04 11:00 — Final-review GO (Guy): M0.6 closeout. 
Reconciled the FROZEN body with the in-milestone decisions (intermediate Scope: `uuid` + `cook_settings` + mandatory `extracted.blob`; Specs-to-read: added `etch-grammar.md §21.4`, pointed `engine-asset-pipeline.md §3`; Benchmarks/cache → robust differential, absolutes logged not asserted; vertex-quantization deferral → mesh payload-format version in the `.mesh.bin` metadata, not `header.version`). Closing notes filled; Status → CLOSED. No merge / no tag (Guy handles both). ## Recorded deviations @@ -230,8 +233,16 @@ References and known pitfalls only. No anticipated edge cases (those are Scope o ## Closing notes -- **What worked:** -- **What deviated from the original spec:** -- **What to flag explicitly in review:** -- **Final measurements** (perf, binary size, compile time, as relevant): -- **Residual risk / technical debt left deliberately:** +Milestone complete — E1→E5 plus the E4 uuid complement and the E2/E4 cleanups, across one branch / one PR (#20) with five review gates. + +- **Synthesis (E1→E5):** + - **E1 — frozen surfaces.** Intermediate `.asset.etch` schema (`AssetDoc` + ad-hoc Etch-subset reader/writer), runtime `..bin` 40-byte header (`extern struct` + comptime `@offsetOf`/`@sizeOf` guards, explicit LE I/O), `AssetHandle` (`packed struct(u64)`), `Registry` (refcount + generation invalidation). + - **E2 — DEFLATE + SIMD.** RFC 1951 inflate (fixed/dynamic/stored, table-driven, no zlib-ng/std.compress reuse) + zlib wrapper + ADLER32 trailer; `foundation/simd/` skeleton + `adler32` kernel (portable `@Vector` + scalar reference). + - **E3 — codecs.** PNG decode (5 filters incl. the `paeth_filter_decode` kernel, bit depths 1/2/4/8, palette + tRNS, Adam7) → RGBA8; glTF static via `std.json`; WAV RIFF PCM. 
+ - **E4 — pipeline.** Importers (PNG/glTF/WAV → intermediate + blob), cookers (texture/mesh/audio → `.bin` via the frozen header, explicit LE write), local cooking cache (BLAKE3-keyed); uuid complement (stable UUIDv7 + `cook_settings` + mandatory `extracted.blob`). + - **E5 — loader.** Async runtime loader over `std.Io` (`io.concurrent`) — never blocks the main thread; lifecycle (load/retain/release/forced-unload/reload) via the Registry. +- **What worked:** the day-1-frozen surfaces (header, handle, registry) needed **zero rework** after E1 — E4/E5 built straight on them, validating the "design on day 1" hypothesis (C0.6). Manual-encode-then-independent-decode (Pillow for PNG, Python `zlib` for DEFLATE) caught vector bugs at authoring time and avoided encoder/decoder bug-cancellation (esp. Adam7). Staged gates kept each surface reviewed before the next depended on it. +- **What deviated from the original spec:** (a) `.bin` header **32 → 40 bytes** (calibration error; `_reserved` u32 @28 lands `hash` at @32, `@sizeOf == 40`). (b) intermediate schema gained `uuid`, `cook_settings`, and mandatory `extracted.blob` (Claude.ai complement). (c) cache gate = robust differential (`cached == cooked` + `miss ≫ hit`), not the absolute ≥100 ms/<10 ms (calibration error). (d) async loader uses `std.Io` `io.concurrent`, **not** the Chase-Lev job system (its ECS-chunk surface would have to widen — the brief's own Case 2 gate; §8 prescribes "worker + async I/O", which `std.Io` provides). (e) the intermediate is read/written by an ad-hoc Etch-subset reader/writer (no `weld_etch`). All recorded in § Recorded deviations / journal. +- **What to flag explicitly in review:** the frozen on-disk surfaces (`.bin` 40-byte header; `.asset.etch` schema incl. uuid/cook_settings/blob; `AssetHandle`) are the Phase 0+ contracts — post-merge changes need explicit versioning. 
`RuntimeHeader.read` (portable LE) is the **single byte-order site** in M0.6; a future zero-copy mmap/`@ptrCast` path would add a second (documented in `Loader`). Files outside the brief's "Files to create or modify" list are enumerated in the closeout hand-back (all justified: shared utils + the thin offline entry). +- **Final measurements:** test suite **426/443 passed, 17 skipped** (the 17 = pre-existing platform-gated render/Vulkan tests, none from M0.6); Debug + ReleaseSafe both green. `cache_diff` (Apple Silicon, ReleaseSafe): `miss > 20× hit`, `hit < 10 ms` (absolutes logged). Kernel baselines (smoke, Apple Silicon, cold path, baseline-only): `adler32` ≈ 349 MB/s portable, `paeth` ≈ 87 MB/s portable. `cook-demo`: 3 `.bin` produced (texture 304 B / mesh 312 B / audio 176 B); re-run → cache HIT on all three with uuids preserved. +- **Residual risk / technical debt left deliberately:** (1) PNG 16-bit and gray/RGB colour-key tRNS deferred; chunk CRC32 parsed but not verified (the IDAT ADLER32 already guards the pixel stream). (2) glTF: embedded base64 `data:` buffers only — external `.bin` / `.glb` deferred. (3) the `.mesh.bin` metadata has no explicit payload-format version field yet (M0.6 raw-f32 is the implicit v1); the field lands in Phase 1 with vertex quantization. (4) loader does no GPU upload / streaming / placeholder (Phase 2+, out of scope). (5) `io.concurrent` requires a concurrency-capable `Io` backend (Phase 0 default `Io.Threaded` qualifies; a non-concurrent backend would surface `error.ConcurrencyUnavailable` from `beginLoad`).
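
The 40-byte header contract and the "single byte-order site" flagged in the closing notes can be illustrated with a minimal sketch. It assumes the field order and offsets given in the Scope section; `HeaderSketch` and `readSketch` are hypothetical names chosen for illustration and are not the project's actual `RuntimeHeader` declaration or its `read` function.

```zig
const std = @import("std");

/// Illustrative mirror of the frozen 40-byte header (hypothetical type,
/// not the project's real declaration).
const HeaderSketch = extern struct {
    magic: [4]u8, // @0, "WELD"
    version: u16, // @4
    asset_type: u16, // @6
    platform: u16, // @8
    flags: u16, // @10
    data_offset: u32, // @12
    data_size: u32, // @16
    metadata_offset: u32, // @20
    metadata_size: u32, // @24
    _reserved: u32, // @28, zero-filled
    hash: u64, // @32
};

// Compile-time layout guards: a field reorder or accidental padding
// fails the build instead of silently changing the on-disk format.
comptime {
    if (@sizeOf(HeaderSketch) != 40) @compileError("header must be 40 bytes");
    if (@offsetOf(HeaderSketch, "data_offset") != 12) @compileError("data_offset must sit at 12");
    if (@offsetOf(HeaderSketch, "_reserved") != 28) @compileError("_reserved must sit at 28");
    if (@offsetOf(HeaderSketch, "hash") != 32) @compileError("hash must sit at 32");
}

/// Hypothetical portable parse: every field is decoded explicitly as
/// little-endian, so the result does not depend on host byte order.
fn readSketch(bytes: *const [40]u8) error{BadMagic}!HeaderSketch {
    if (!std.mem.eql(u8, bytes[0..4], "WELD")) return error.BadMagic;
    return .{
        .magic = bytes[0..4].*,
        .version = std.mem.readInt(u16, bytes[4..6], .little),
        .asset_type = std.mem.readInt(u16, bytes[6..8], .little),
        .platform = std.mem.readInt(u16, bytes[8..10], .little),
        .flags = std.mem.readInt(u16, bytes[10..12], .little),
        .data_offset = std.mem.readInt(u32, bytes[12..16], .little),
        .data_size = std.mem.readInt(u32, bytes[16..20], .little),
        .metadata_offset = std.mem.readInt(u32, bytes[20..24], .little),
        .metadata_size = std.mem.readInt(u32, bytes[24..28], .little),
        ._reserved = std.mem.readInt(u32, bytes[28..32], .little),
        .hash = std.mem.readInt(u64, bytes[32..40], .little),
    };
}
```

Keeping the decode in one function of this shape is what lets a single site own byte order; a zero-copy `@ptrCast` path over an mmap would bypass it and would need its own endianness guard.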