diff --git a/.claude/board/AGENT_LOG.md b/.claude/board/AGENT_LOG.md index 238cd8a1..53fd8451 100644 --- a/.claude/board/AGENT_LOG.md +++ b/.claude/board/AGENT_LOG.md @@ -1,3 +1,7 @@ +## 2026-06-13 — SoaEnvelope binding for canonical NodeRow (the canon-as-substrate keystone) + +**bardioc cross-session.** Closes punchlist item §7.2 of the 2026-06-13 SoA migration diff resolution doc — the canonical row layout is now bound to the envelope ABI. New `NodeRowPacket<'a>` wrapper in `canonical_node.rs` zero-copy-views a `&[NodeRow]` (each row `#[repr(C, align(64))]` at 512 bytes) as a row-strided LE byte packet through `SoaEnvelope`. Three-column descriptor table (`NODE_ROW_COLUMNS`): key (16 × u8 at offset 0), edges (16 × u8 at offset 16), value (480 × u8 at offset 32) — sums to `NODE_ROW_STRIDE = 512`. Internal structure within each slot stays canon-described (`NodeGuid` for the key, `EdgeBlock` for the edges, registry `ClassView` for the value carve-out) — the envelope contract is at the row-stride level, not the field-decomposition level. `NodeRowColumn` enum exports the column ordinals as `pub enum { Key=0, Edges=1, Value=2 }` for type-safe `column_le` access. `as_le_bytes()` is unsafe-free at the API but uses `core::slice::from_raw_parts` internally with a documented SAFETY note (NodeRow `#[repr(C)]` + locked size + canon-LE field accessors). +9 tests covering column-table layout, empty-packet verification, single-row zero-copy (pointer equality), multi-row byte length, `row_le`/`column_le` LE byte ranges, canon-LE key end-to-end, and `LAYOUT_VERSION` parity. `cargo test -p lance-graph-contract --lib`: **603/603 green** (+9); `cargo clippy -p lance-graph-contract --all-targets -- -D warnings`: clean. **No public-API drift in existing code** — `NodeRowPacket`, `NodeRowColumn`, `NODE_ROW_COLUMNS`, `NODE_ROW_STRIDE` are pure additions. This is the keystone the BindSpace dissolution sequence S1-S4 has been blocked behind: Lance's columnar I/O can now read the canonical row packet directly. Next step: MailboxSoA migrating from its column-major `[T; N]` layout to a row-strided `[NodeRow; N]` backing store that impls `SoaEnvelope` through this wrapper. + ## 2026-06-13 — SoA migration diff resolution doc (catch-up audit + post-#490 supersession map) **bardioc cross-session.** Operator directive: *"the biggest goal would be to catch up on all SoA bindspace migration plans and resolve the diff."* Surveyed the SoA / BindSpace / identity plan family (9 plans + 4 board files + 5 code files + canon doc-locks) and produced a single resolution doc at `.claude/plans/soa-migration-diff-resolution-2026-06-13.md` that names every plan-vs-shipped diff post-#487/#489/#490. **Headline drifts named:** (1) `identity-architecture-exists-vs-needs-v1.md §N1`'s UUIDv8 layout fully superseded by OGAR/CLAUDE.md P0 canon — the namespace/entity_type/kind/niblepath_prefix/shape_hash/RFC ceremony framing did NOT ship; canon's classid·HEEL·HIP·TWIG·family·identity (no ceremony) won (PR #489/#490). (2) `bindspace-singleton-to-mailbox-soa-v1.md`'s `CollapseGateEmission` / `MailboxSoA::emit()` / Baton-as-type was retired in PR #487 tombstone; `last_emission_cycle → last_active_cycle` rename per #477 supersession. (3) `unified-soa-convergence-v1.md §4.2` stack pins drifted (`lance =7.0.0` / `lancedb =0.30.0`); 2026-05-29 addendum partially addressed. (4) `polyglot-container-query-membrane-v1.md` ratified research-only — self-describing-key convergence dissolved the membrane question. **D-MBX-A2 status:** still queued, still the gating gap; MailboxSoA has no Hamming columns. **S2-S4 status:** unshipped; `driver.rs:56` still has `pub(crate) bindspace: Arc`, both `bin/serve.rs:29` + `bin/grpc.rs:29` still call `BindSpace::zeros(4096)`. **SoaEnvelope status:** trait shipped (#477), zero real implementors — only `TestEnvelope` in tests; MailboxSoA does NOT impl it. **Staunen/Wisdom-as-entropy×energy-substrate-state correction with §6.1 alternative framings (Bugwelle / aerodynamics / event horizon / Friston FEP)** — added per operator relay 2026-06-13: (a) Staunen as the entropy *Bugwelle* (bow wave) of thinking-in-progress — Staunen is not static, it's the leading-edge entropy disturbance GENERATED by cognitive motion; (b) Aerodynamic / shock-wave analogy — as cognitive velocity through the substrate increases, the Bugwelle steepens, "sonic boom" = the *aha* breakthrough where entropy collapses abruptly; (c) Event-horizon / inertia — Wisdom's (low entropy × high energy) corner has gravitational properties, novelty needs escape velocity to break out of the well, the canon's "reserve don't reclaim" at classid==0/family==0 keeps the bootstrap basin always-escapable; (d) Friston Free Energy Principle as the scientific anchor — high FE = Staunen, FE-minimisation in progress = Confusion/Chaos quadrant, minimised FE = Wisdom; `consume_firing(row)` IS active inference (energy ≥ threshold ⇒ fire, in-place mark, integrate prediction error). The four framings stack: Bugwelle = shape; aerodynamics = velocity scaling; event-horizon = inertia/why Wisdom resists; FEP = drive function. Same underlying substrate dynamics, four vocabularies for reach. (handover §8 + operator image relays + operator framing relay 2026-06-13). Canonical DIKW = Data → Information → Knowledge → Wisdom, bridged by Processing → Cognition → Judgment; Wisdom IS the canonical DIKW apex rung. The operator's precise framing: **Staunen = high entropy × low energy** = "needs entropy work" marker = cognitive pressure + emerging insight (not yet crystallised). **Wisdom = low entropy × high energy** = crystalline knowledge with supporting plasticity + integrated insights, the substrate has invested heavily and locked it in. Diagonal opposites on the entropy×energy plane, NOT two ends of one axis. The other two quadrants: **Confusion / Chaos** (high entropy × high energy = in-progress climb state, substrate has invested energy but entropy hasn't yet collapsed) and **Boredom / Inert** (low entropy × low energy = ordered but not energised). Substrate column map: Energy = `MailboxSoA.energy: [f32; N]` (signed spatio-temporal accumulator); Plasticity = `MailboxSoA.plasticity_counter: [u8; N]` (saturating Hebbian counter = long-term investment); Entropy proxy = classid-prefix-resolved codebook hit-rate × local edge-neighbourhood density. Two-algebra rule maps onto the plane: entropy axis = signed side (`vsa_bind`), energy axis = magnitude side (`vsa_bundle`). The canon's 3×4 uniform cascade (HEEL · HIP · TWIG = three u16 tiers) shape-matches DIKW's three transitions + four layers — not coincidentally. NOT YET corrected in lance-graph CLAUDE.md (line ~120 still says "Magnitude = Contradiction depth from Staunen × Wisdom qualia") — flagged as `TD-CLAUDE-MD-STAUNEN-MISNAME` for a separate maintenance pass with three specific edits identified (line ~120 rewrite citing entropy×energy markers, §11.5 rephrasing, new DIKW-anchor sub-section under "The Click" mapping cascade tiers onto DIKW transitions + the entropy×energy quadrant diagram). **LE-contract violations still on the books:** `engine_bridge.rs` f32→i4 qualia re-encode, `Vsa16kF32` persisted as cross-boundary in singleton, DTO-as-owned-Vec sites — all dissolve at S2/S4. Errata stubs prepended to 4 affected plans (bindspace-singleton-to-mailbox-soa, identity-architecture-exists-vs-needs, unified-soa-convergence, polyglot-container-query-membrane) pointing at the resolution doc. Resolved punchlist §7 lists 9 follow-up PRs in priority order. Docs-only PR; no code touched. diff --git a/crates/lance-graph-contract/src/canonical_node.rs b/crates/lance-graph-contract/src/canonical_node.rs index 853868d5..359ca433 100644 --- a/crates/lance-graph-contract/src/canonical_node.rs +++ b/crates/lance-graph-contract/src/canonical_node.rs @@ -189,6 +189,129 @@ const _: () = assert!(core::mem::size_of::() == 16); const _: () = assert!(core::mem::size_of::() == 16); const _: () = assert!(core::mem::size_of::() == 512); +// ── SoaEnvelope binding for [NodeRow] ──────────────────────────────────────── + +use crate::soa_envelope::{ColumnDescriptor, ColumnKind, SoaEnvelope}; + +/// Stable column-id ordinals for [`NodeRow`]'s three top-level slots. +/// `name_id` in the [`ColumnDescriptor`] table; the registry-resolved value +/// carve-out (per `classid → ClassView`) lives *inside* `Value` and is not +/// surfaced as its own envelope column — the canon contract is at this level. +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] +#[repr(u16)] +pub enum NodeRowColumn { + Key = 0, + Edges = 1, + Value = 2, +} + +/// Canonical [`ColumnDescriptor`] table for [`NodeRow`]. +/// +/// Three columns, all `ColumnKind::U8` byte-arrays (their internal structure +/// is canon-described elsewhere — `NodeGuid` decomposes the key, `EdgeBlock` +/// the edges, registry `ClassView` carves the value side). The envelope +/// contract is at the row-stride level: bytes 0..16 are the key, 16..32 are +/// the edges, 32..512 are the class-resolved value slab. Sum = 512 = stride. +pub const NODE_ROW_COLUMNS: &[ColumnDescriptor] = &[ + ColumnDescriptor { + name_id: NodeRowColumn::Key as u16, + kind: ColumnKind::U8, + elems_per_row: 16, + row_offset: 0, + }, + ColumnDescriptor { + name_id: NodeRowColumn::Edges as u16, + kind: ColumnKind::U8, + elems_per_row: 16, + row_offset: 16, + }, + ColumnDescriptor { + name_id: NodeRowColumn::Value as u16, + kind: ColumnKind::U8, + elems_per_row: 480, + row_offset: 32, + }, +]; + +/// Row stride for [`NodeRow`] in bytes — equal to `size_of::()`. +pub const NODE_ROW_STRIDE: usize = 512; + +/// Zero-copy [`SoaEnvelope`] wrapper over a contiguous slice of [`NodeRow`]. +/// +/// `NodeRow` is `#[repr(C, align(64))]` with the locked 16/16/480 byte +/// layout, so a `&[NodeRow]` IS already a row-strided LE packet at stride +/// 512 — no allocation, no copy. This wrapper just attaches the cycle stamp +/// and exposes the slice through the [`SoaEnvelope`] trait so Lance's +/// columnar I/O reads it directly. +/// +/// The envelope's column table ([`NODE_ROW_COLUMNS`]) names the three +/// top-level slots (key / edges / value). Internal structure within each +/// slot is the canon's concern (`NodeGuid` for the key, `EdgeBlock` for the +/// edges, registry `ClassView` for the value carve-out). +#[derive(Clone, Copy)] +pub struct NodeRowPacket<'a> { + rows: &'a [NodeRow], + cycle: u32, +} + +impl<'a> core::fmt::Debug for NodeRowPacket<'a> { + fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { + f.debug_struct("NodeRowPacket") + .field("n_rows", &self.rows.len()) + .field("cycle", &self.cycle) + .field("row_stride", &NODE_ROW_STRIDE) + .finish() + } +} + +impl<'a> NodeRowPacket<'a> { + /// Wrap a contiguous slice of [`NodeRow`] with a cycle stamp. + #[inline] + pub const fn new(rows: &'a [NodeRow], cycle: u32) -> Self { + Self { rows, cycle } + } + + /// The underlying rows. + #[inline] + pub const fn rows(&self) -> &'a [NodeRow] { + self.rows + } +} + +impl<'a> SoaEnvelope for NodeRowPacket<'a> { + fn columns(&self) -> &[ColumnDescriptor] { + NODE_ROW_COLUMNS + } + fn row_stride(&self) -> usize { + NODE_ROW_STRIDE + } + fn n_rows(&self) -> usize { + self.rows.len() + } + fn cycle(&self) -> u32 { + self.cycle + } + fn as_le_bytes(&self) -> &[u8] { + // SAFETY: NodeRow is #[repr(C, align(64))] with size_of::() == + // 512 (checked by the const _: () asserts above). A &[NodeRow] is a + // contiguous array of #[repr(C)] structs; viewing it as &[u8] of + // length len * 512 is a standard column-store packing operation, and + // every byte position is valid for reads (no padding past size_of, + // alignment of NodeRow (64) ⊇ alignment of u8 (1)). + // + // The NodeGuid and EdgeBlock fields hold their bytes in canon-LE + // order (NodeGuid::new uses to_le_bytes; EdgeBlock is plain [u8;_]), + // so the resulting byte slice IS the envelope's LE packet — no + // translation needed at the boundary. + unsafe { + core::slice::from_raw_parts( + self.rows.as_ptr().cast::(), + self.rows.len() * NODE_ROW_STRIDE, + ) + } + } +} + #[cfg(test)] mod tests { use super::*; @@ -278,4 +401,167 @@ mod tests { let g = NodeGuid::local(0x00_00CD); assert_eq!(g.to_string(), "00000000-0000-0000-0000-0000000000cd"); } + + // ── SoaEnvelope binding for NodeRowPacket ──────────────────────────────── + + fn sample_row(classid: u32, identity: u32) -> NodeRow { + NodeRow { + key: NodeGuid::new(classid, 0x1111, 0x2222, 0x3333, 0x00_00AB, identity), + edges: EdgeBlock::default(), + value: [0u8; 480], + } + } + + #[test] + fn node_row_column_table_sums_to_row_stride() { + let total: usize = NODE_ROW_COLUMNS + .iter() + .map(|c| c.col_bytes_per_row()) + .sum(); + assert_eq!(total, NODE_ROW_STRIDE); + assert_eq!(NODE_ROW_STRIDE, core::mem::size_of::()); + } + + #[test] + fn node_row_column_table_is_in_offset_order_without_gaps() { + // The contract: columns are contiguous (key 0..16, edges 16..32, + // value 32..512) — no gaps, no overlap, in offset order. + let mut prev_end = 0usize; + for c in NODE_ROW_COLUMNS { + assert_eq!(c.row_offset as usize, prev_end, "no gap before {c:?}"); + prev_end = c.row_offset as usize + c.col_bytes_per_row(); + } + assert_eq!(prev_end, NODE_ROW_STRIDE); + } + + #[test] + fn empty_packet_verifies() { + let rows: &[NodeRow] = &[]; + let pkt = NodeRowPacket::new(rows, 0); + assert_eq!(pkt.n_rows(), 0); + assert_eq!(pkt.as_le_bytes().len(), 0); + assert!(pkt.verify_layout().is_ok(), "empty packet must verify"); + } + + #[test] + fn single_row_packet_verifies_and_byte_view_is_zero_copy() { + let rows = [sample_row(0xDEAD_BEEF, 0x00_00CD)]; + let pkt = NodeRowPacket::new(&rows, 7); + assert_eq!(pkt.n_rows(), 1); + assert_eq!(pkt.cycle(), 7); + assert_eq!(pkt.row_stride(), 512); + assert_eq!(pkt.as_le_bytes().len(), 512); + // Zero-copy: the byte view's pointer is the slice's pointer. + assert_eq!( + pkt.as_le_bytes().as_ptr() as usize, + rows.as_ptr() as usize, + "as_le_bytes must be zero-copy" + ); + assert!(pkt.verify_layout().is_ok()); + } + + #[test] + fn multi_row_packet_byte_length_is_stride_times_rows() { + let rows = [ + sample_row(0xDEAD_BEEF, 0x00_00CD), + sample_row(0xCAFE_BABE, 0x00_0001), + sample_row(0x0000_0000, 0x00_0042), + ]; + let pkt = NodeRowPacket::new(&rows, 42); + assert_eq!(pkt.n_rows(), 3); + assert_eq!(pkt.as_le_bytes().len(), 3 * 512); + assert!(pkt.verify_layout().is_ok()); + } + + #[test] + fn row_le_view_returns_one_full_row() { + let rows = [sample_row(1, 2), sample_row(3, 4), sample_row(5, 6)]; + let pkt = NodeRowPacket::new(&rows, 0); + for (i, row) in rows.iter().enumerate() { + let row_bytes = pkt.row_le(i).expect("row in range"); + assert_eq!(row_bytes.len(), 512); + // First 4 bytes are the classid in canon-LE order. + assert_eq!( + u32::from_le_bytes(row_bytes[..4].try_into().unwrap()), + row.key.classid() + ); + } + assert!(pkt.row_le(3).is_none(), "out of range"); + } + + #[test] + fn column_le_view_returns_the_named_slot() { + // Place a recognisable byte pattern in the value side; verify the + // value column-view picks it up at the right offset. + let mut row = sample_row(0xDEAD_BEEF, 0x00_00CD); + row.value[0] = 0xAB; + row.value[479] = 0xCD; + let rows = [row]; + let pkt = NodeRowPacket::new(&rows, 0); + let value_col = pkt + .column_le(0, &NODE_ROW_COLUMNS[NodeRowColumn::Value as usize]) + .expect("value column in range"); + assert_eq!(value_col.len(), 480); + assert_eq!(value_col[0], 0xAB); + assert_eq!(value_col[479], 0xCD); + // Key column is at offset 0, length 16 — first byte = LE byte 0 of + // classid = 0xEF (low byte of 0xDEAD_BEEF). + let key_col = pkt + .column_le(0, &NODE_ROW_COLUMNS[NodeRowColumn::Key as usize]) + .expect("key column in range"); + assert_eq!(key_col.len(), 16); + assert_eq!(key_col[0], 0xEF); + assert_eq!(key_col[3], 0xDE); + } + + #[test] + fn key_bytes_in_canon_le_order() { + // Round-trip: pack a NodeRow with known fields, read the bytes back + // through the envelope, parse each canon group by its LE byte range, + // confirm values match. Proves the SoA envelope view stays canon-LE + // end-to-end without any field-accessor intermediation. + let row = sample_row(0xDEAD_BEEF, 0x00_00CD); + let rows = [row]; + let pkt = NodeRowPacket::new(&rows, 0); + let bytes = pkt.as_le_bytes(); + // Per OGAR/CLAUDE.md P0: classid · HEEL · HIP · TWIG · family · identity. + assert_eq!( + u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]), + 0xDEAD_BEEF, + "classid at [0..4]" + ); + assert_eq!( + u16::from_le_bytes([bytes[4], bytes[5]]), + 0x1111, + "HEEL at [4..6]" + ); + assert_eq!( + u16::from_le_bytes([bytes[6], bytes[7]]), + 0x2222, + "HIP at [6..8]" + ); + assert_eq!( + u16::from_le_bytes([bytes[8], bytes[9]]), + 0x3333, + "TWIG at [8..10]" + ); + // family is u24 LE in bytes [10..13]: 0xAB, 0x00, 0x00. + assert_eq!(&bytes[10..13], &[0xAB, 0x00, 0x00], "family at [10..13]"); + // identity is u24 LE in bytes [13..16]: 0xCD, 0x00, 0x00. + assert_eq!(&bytes[13..16], &[0xCD, 0x00, 0x00], "identity at [13..16]"); + } + + #[test] + fn envelope_layout_version_matches_envelope_default() { + // The wrapper does not override LAYOUT_VERSION, so verify_layout + // checks against the envelope-crate default (ENVELOPE_LAYOUT_VERSION). + let rows = [sample_row(0, 1)]; + let pkt = NodeRowPacket::new(&rows, 0); + assert_eq!( + as SoaEnvelope>::LAYOUT_VERSION, + crate::soa_envelope::ENVELOPE_LAYOUT_VERSION + ); + // verify_layout exercises that gate. + assert!(pkt.verify_layout().is_ok()); + } }