Skip to content

perf: page preprocessed commitment#645

Open
nicole-graus wants to merge 3 commits into
mainfrom
page-preprocessed-commit
Open

perf: page preprocessed commitment#645
nicole-graus wants to merge 3 commits into
mainfrom
page-preprocessed-commit

Conversation

@nicole-graus
Copy link
Copy Markdown
Collaborator

Description

This PR optimizes the PAGE preprocessed commitment in two complementary ways:

  • Static zero-init page commitments: Zero-init pages (stack, heap, runtime —init_values.is_none()) have an INIT column of uniformly zero bytes and an OFFSET column of 0..page_size-1, neither depending on the program being proven.

  • Optional ELF-data-page commitments: verify_with_options and VmAirs::new now accept an optional page_commitments: Option<&[(u64, Commitment)]>. When supplied, the verifier looks up each ELF data page by page_base and uses the value directly, skipping the in-verifier FFT + Merkle build for that page.

Verifier cycles reduction

Measured on the recursion guest (the verifier compiled as a RISC-V program verifying an empty inner proof, blowup=2, 1 query):

Baseline (Bitwise + keccak_rc + decode optional, page recompute): 2,895 M
This PR (+ Page static + Page optional): 67 M (~43× reduction)

Each ELF data page passed via page_commitments saves the cost of one 256 KB page's FFT + Merkle build in the recursion guest (~2.8 B cycles per page, measured here).

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Codex Code Review

Findings

  • Medium (DoS/performance): zero-init PAGE fallback recomputes the same commitment once per zero page when the options miss the static table, e.g. non-3 coset or blowup outside [2,4,8]. prover/src/lib.rs calls page::preprocessed_commitment for every zero-init page, and prover/src/tables/page.rs falls back to a full PAGE LDE/Merkle recompute. Since zero-init commitments are identical for a given (page_size, blowup, coset), this can multiply a very expensive computation by the number of runtime pages. Cache the fallback per geometry/options, ideally locally while building VmAirs, or with an options-keyed cache.

No other issues found in the changed diff.

I could not run the targeted tests because rustup tried to write to /home/runner/.rustup/tmp, which is read-only in this sandbox.

@claude
Copy link
Copy Markdown

claude Bot commented Jun 3, 2026

Review: perf: page preprocessed commitment

Overview

This PR lands two complementary optimizations:

  1. Static zero-init page commitments — replaces the OnceLock-based runtime cache with compile-time constants for standard (DEFAULT_PAGE_SIZE, blowup) pairs. Strictly better: no heap allocation, no atomic sync, and drift is caught by tests.
  2. Optional ELF-data-page commitments — new page_commitments: Option<&[(u64, Commitment)]> parameter on verify_with_options / VmAirs::new lets callers supply precomputed commitments to skip the in-verifier FFT + Merkle build.

The ~43× recursion-guest cycle reduction is substantial and the design is clean.


Security

The trust model is sound and mirrors the existing decode_commitment pattern. page_commitments must come from the caller's compiled binary, never from the prover. A wrong value is rejected (not silently accepted) because it either mismatches the prover's committed preprocessed root or diverges the Fiat-Shamir transcript. The four new soundness tests (wrong_value_rejects, zero_bytes_rejects, some_matches_default_path, empty_list_matches_none) cover the critical cases.


Minor: linear scan in the ELF-data-page lookup

// prover/src/lib.rs ~line 440
let commitment = page_commitments
    .and_then(|list| {
        list.iter()
            .find(|(pb, _)| *pb == config.page_base)
            .map(|(_, c)| *c)
    })
    .unwrap_or_else(|| page::preprocessed_commitment(config, proof_options));

This is O(pages × list_len). With DEFAULT_PAGE_SIZE = 256 KB the number of ELF data pages per proof is tiny (single digits in practice), so the O(n²) cost is negligible today. No change needed, but worth a note if page counts ever grow.


Nit: naming inconsistency

The low-level function retained its old name compute_precomputed_commitment (marked #[doc(hidden)]) while the new public wrapper is preprocessed_commitment. The mismatch (precomputed vs preprocessed) is a leftover from before the terminology was standardised. Not blocking, but could confuse future readers.


Overall

LGTM. The approach is correct, the trust model is properly documented, and the test coverage is thorough. The static constant table for zero-init pages is a clean replacement for the old OnceLock mechanism.

Comment thread prover/src/lib.rs
list.iter()
.find(|(pb, _)| *pb == config.page_base)
.map(|(_, c)| *c)
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The find scan is O(pages × list_len). Fine today — with DEFAULT_PAGE_SIZE = 256 KB, typical ELF programs have only a handful of data pages so the total work is tiny. If the ELF-page count ever grows materially, a pre-built HashMap<u64, Commitment> outside the iterator would be the fix.

Comment thread prover/src/tables/page.rs
/// drift-detection tests in `static_commitments_tests`. Production callers
/// should go through [`preprocessed_commitment`] so the static const-table
/// shortcut is used when applicable.
#[doc(hidden)]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor naming nit: this function is compute_precomputed_commitment while the new public wrapper below is preprocessed_commitment — "precomputed" vs "preprocessed". The inconsistency is understandable (preserving the old name as #[doc(hidden)]), but a reader encountering both names for the first time may wonder if they compute different things. A brief one-liner comment on this function noting it's the raw computation behind preprocessed_commitment would make the relationship explicit.

@nicole-graus nicole-graus marked this pull request as ready for review June 3, 2026 13:10
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Codex Code Review

Found one issue:

Performance: significant fallback regression
prover/src/tables/page.rs:354

preprocessed_commitment now recomputes zero-init PAGE commitments on every zero page when (page_size, blowup, coset) is not in the static table, e.g. CLI-supported blowup 16 or a non-3 coset. The old OnceLock avoided repeated work, albeit with an unsafe keying bug. For a program with many stack/heap pages, this repeats a very large FFT/LDE/Merkle build per page. At DEFAULT_PAGE_SIZE and blowup 16, that is 2^22 LDE rows per zero page.

Actionable fix: keep the static fast path, but cache fallback zero-page commitments keyed by (page_size, blowup_factor, coset_offset) so non-static options compute once per geometry/options tuple.

No security vulnerabilities found in the changed verifier commitment path. Incorrect supplied page commitments appear to reject rather than silently pass.

I could not run cargo check -p lambda-vm-prover --tests because rustup tried to sync the pinned toolchain under read-only /home/runner/.rustup.

Comment thread prover/src/lib.rs
let commitment = page_commitments
.and_then(|list| {
list.iter()
.find(|(pb, _)| *pb == config.page_base)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low – undocumented first-match-wins behaviour

If page_commitments contains two entries with the same page_base (e.g. a caller accidentally duplicates an entry), .find() silently uses the first one. A wrong commitment will still cause Fiat-Shamir rejection, so this can't silently accept a bad proof, but it can cause a head-scratcher where a caller updates the second copy of a duplicate and wonders why nothing changes.

Consider documenting the contract ("each page_base must appear at most once") in the verify_with_options / VmAirs::new doc-comments, or adding a debug_assert! that the slice has no duplicate page_base values.

@claude
Copy link
Copy Markdown

claude Bot commented Jun 3, 2026

Review: perf: page preprocessed commitment

Summary

Good, focused optimization. The two changes (static const commitments for zero-init pages, caller-supplied commitments for ELF data pages) are well-separated, and the trust model is correctly documented and tested. The 43× cycle reduction for the recursion guest is significant.

Security properties hold: a wrong page_commitments entry feeds into the Fiat-Shamir transcript and causes the proof to be rejected (Ok(false)), never silently accepted. The explicit coset_offset == 3 guard in preprocessed_commitment ensures the baked-in constants are never used for a different LDE domain.

Issues

Low – duplicate page_base entries (inline comment): list.iter().find() silently uses the first match if the caller provides duplicate page_base keys. Wrong values are still rejected by Fiat-Shamir, so this is not a security bug, but the contract should be documented or enforced with a debug_assert!. See inline.

Low – missing partial-coverage test: page_commitments_some_matches_default_path only exercises the case where every ELF data page is in the list. The interesting correctness edge is when only a subset of pages is supplied — those pages use the caller's commitment, the rest fall through to compute_precomputed_commitment. That path is correct by construction, but a test would pin the behaviour:

// supply only the first page; the rest recompute from ELF
let partial = &list[..1];
let ok = verify_with_options(&vm_proof, &elf_bytes, &options, None, Some(partial))
    .expect("partial page_commitments should not error");
assert!(ok, "partial page_commitments must still verify");

Notes

  • Removing the OnceLock cache is the right call: the old cache was keyed only on page content, not on ProofOptions, so it was silently wrong for processes with multiple blowup/coset configurations. The new static lookup is both faster and correct.
  • The static_zero_page_commitment function is a clean pattern; the regeneration instructions and drift tests are thorough.
  • #[doc(hidden)] on compute_precomputed_commitment is a good nudge that production callers should go through preprocessed_commitment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants