fix(soundness): bound untrusted num_instances in native + recursion verifiers by spherel · Pull Request #1360 · scroll-tech/ceno

spherel · 2026-06-16T13:57:05Z

Summary

num_instances (per-shard, per-circuit instance counts) is read straight from
the untrusted proof and then flows into size/shift/field arithmetic in both
verifiers. Neither side bounded it, so a crafted proof could push these
quantities out of range — a verifier-side DoS and a field-order decoupling, i.e.
a soundness concern. This PR adds the matching upper bound on both the native
Rust verifier and the recursion verifier so they stay in lockstep.

Problem

In the native verifier, proof.num_instances feeds:

proof.num_instances.iter().sum() — can wrap (or panic in debug);
next_pow2_instance_padding and the 1 << log2_num_instances usize shifts —
can overflow / panic on attacker input (DoS);
the from_canonical_usize transcript casts — can exceed the base-field modulus
and wrap, decoupling the transcript-bound value from the real count.

The recursion verifier (verify_chip_proof_pre_main) is worse: it does not even
derive log2_num_instances/sum_num_instances — it reads them as untrusted
hints. An out-of-range log2_num_instances drives pow_2(log2_num_instances)
past the BabyBear field size (~2³¹), wrapping the next_pow2/next_pow2 - sum
offset used by the eq selectors, which would let the recursive proof verify a
different statement than the native verifier.

Fix

A valid proof's instance count is always bounded by the PCS setup size
(2^MAX_NUM_VARIABLES, with the witness committed at ≤ 2^MAX_NUM_VARIABLES), so
rejecting anything larger only drops malformed proofs earlier and does not change
the statement verified for any valid proof.

Native (ceno_zkvm/src/scheme/verifier.rs): reject any per-shard
num_instances entry (and their sum) > 2^MAX_NUM_VARIABLES, both where the
count is first summed and (authoritatively) in verify_chip_proof_pre_main,
right before the shift/cast it protects.
Recursion (ceno_recursion/src/zkvm_verifier/verifier.rs): in
verify_chip_proof_pre_main, before the hints feed pow_2(log2_num_instances)
and the offset, assert log2_num_instances ≤ MAX_NUM_VARIABLES and
sum_num_instances / each num_instances[i] ≤ 2^MAX_NUM_VARIABLES. The bound
lives in the verifier function (not the hint read(), which is not part of the
verified circuit).

2^MAX_NUM_VARIABLES = 2^24 is far below the BabyBear modulus, so the field-order
wrap is closed on both sides.

Test plan

cargo check / cargo clippy clean for both ceno_zkvm and ceno_recursion
Native: integration e2e subset (fibonacci + ceno_rt_alloc + keccak_syscall,
debug MOCK + release prove/verify, single + multi-shard) — 9/9 pass; every
release run exercises the new bound in verify_chip_proof_pre_main
Recursion: e2e_aggregate over multi-shard keccak_syscall verifies —
both leaf proofs run verify_chip_proof_pre_main (and thus the new asserts)
for every chip across 2 shards, then internal aggregation completes (final
height 1), confirming valid proofs are not rejected

Follow-up (out of scope)

This closes the bound (DoS / field-wrap). Fully binding the recursion's
log2_num_instances / bit-decomposition hints to be exactly consistent with
num_instances (rather than only bounding them) — which the native side gets for
free by deriving log2 — remains a separate hardening item.

Closes #1178, #1240.

🤖 Generated with Claude Code

…, #1240) `proof.num_instances` is read from the untrusted proof and flows into `next_pow2_instance_padding`, the `1 << log2_num_instances` usize shifts, and the `from_canonical_usize` transcript casts. With no upper bound, a crafted proof could: * make `proof.num_instances.iter().sum()` wrap (or panic in debug); * drive `log2_num_instances` past the usize bit-width so `1 << log2_num_instances` panics / wraps (a verifier DoS on attacker input, per the repo's panic-on- untrusted-input concern); and * exceed the base-field modulus so `from_canonical_usize` wraps, decoupling the transcript-bound value from the real count (the #1240 field-order concern). Fix: reject any per-shard `num_instances` entry (and their sum) larger than the PCS setup bound `2^MAX_NUM_VARIABLES` up front. A valid proof's instance count is always within this bound (the witness is committed at `<= 2^MAX_NUM_VARIABLES`), so this only rejects malformed proofs earlier and does not change the statement verified for any valid proof — keeping native/recursion lockstep intact. The guard is placed both where the count is first summed and (authoritatively) in `verify_chip_proof_pre_main`, right before the shift/cast it protects. `2^MAX_NUM_VARIABLES = 2^24` is far below the BabyBear modulus (~2^31), so the field-order wrap (#1240) is also closed. Recursion mirror: the recursion verifier reads `sum_num_instances`, `sum_num_instances_minus_one_bit_decomposition` and `log2_num_instances` as unconstrained hints (ceno_recursion/src/zkvm_verifier/binding.rs:585-587 — they are never bound to each other or to a field-order limit). Mirroring this guard and verifying the bit-decomposition/log2 derivation there is the matching follow-up but requires recursion-DSL changes that cannot be validated without an end-to-end recursion run, so it is intentionally left out of this commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Companion to the native-verifier bound. The native verifier rejects any num_instances entry (and their sum) exceeding 2^MAX_NUM_VARIABLES and derives log2_num_instances from those bounded counts. The recursion verifier reads sum_num_instances and log2_num_instances as untrusted hints without bounding them, so a crafted proof could drive pow_2(log2_num_instances) past the field size and wrap the next_pow2 - sum offset, verifying a different statement than the native verifier. verify_chip_proof_pre_main now bounds log2_num_instances <= MAX_NUM_VARIABLES and sum_num_instances / each num_instances[i] <= 2^MAX_NUM_VARIABLES before they are used. Valid proofs are always within this bound, so only malformed proofs are rejected; native and recursion stay in lockstep. Validated: cargo check / clippy clean; aggregation e2e over multi-shard keccak_syscall verifies (leaf proofs run the asserts for every chip, valid proofs are not rejected). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

spherel and others added 2 commits June 13, 2026 00:23

spherel force-pushed the fix/issue-1178-1240-num-instance-bounds branch from a3bf785 to 70d3710 Compare June 16, 2026 14:03

Merge branch 'master' into fix/issue-1178-1240-num-instance-bounds

6120684

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(soundness): bound untrusted num_instances in native + recursion verifiers#1360

fix(soundness): bound untrusted num_instances in native + recursion verifiers#1360
spherel wants to merge 3 commits into
masterfrom
fix/issue-1178-1240-num-instance-bounds

spherel commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

spherel commented Jun 16, 2026

Summary

Problem

Fix

Test plan

Follow-up (out of scope)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants