Status
Declined entirely — YAGNI.
Context
ce-code-review's Stage 5b is a validator pass that re-checks each surviving finding's file:line citation against the actual source. The pattern serves two distinct purposes:
- Stale-finding check (ce's primary motivation). After autofix mode applies safe fixes, line numbers shift; findings cited at pre-fix lines become stale. Stage 5b catches this drift. Only runs in ce's non-interactive (externalization) modes.
- Anti-hallucination check (secondary). LLM reviewers sometimes cite line numbers that don't exist, describe code that doesn't match what's there, or reference functions/variables not in the diff. Independent of autofix.
For /ba:review, only the second purpose is relevant: there is no autofix mode applying changes mid-flow, so the candidate's value is purely anti-hallucination.
Decision
Declined. User has not experienced hallucinated-citation pain in practice; building mitigation infrastructure before observing the problem is the textbook YAGNI failure.
Design captured (if revisited)
Two tiers were considered:
Tier 1 — Existence check
- Verify cited file exists in the diff or repo
- Verify cited line ≤ file length
- Cheap (file list and line counts are already in orchestrator context, no extra reads)
- Catches the most flagrant hallucinations (line numbers beyond file length, files not in the diff)
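A minimal sketch of what the Tier 1 check could look like, assuming the orchestrator already holds a mapping of file path to line count. The `Finding` shape, `citation_exists`, and `line_counts` are hypothetical names for illustration, not part of /ba:review:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding shape; the real /ba:review schema may differ."""
    file: str
    line: int
    description: str


def citation_exists(finding: Finding, line_counts: dict[str, int]) -> bool:
    """Tier 1: the cited file is known and the cited line falls inside it.

    `line_counts` maps file path -> number of lines and is assumed to be
    already available in orchestrator context, so no file is re-read here.
    """
    total = line_counts.get(finding.file)
    if total is None:
        return False  # cited file is not in the diff or repo
    return 1 <= finding.line <= total
```

Because the check is a dictionary lookup and a comparison, it costs nothing beyond the context the orchestrator already has, which matches the "always-on" framing.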
Tier 2 — Content match
- Re-read file around line ± N (likely N = 5)
- Verify reviewer's prose description plausibly matches the code at that location
- Higher token cost per finding (~few hundred tokens of re-read per check)
- Catches subtler hallucinations (described code differs from actual code at cited location)
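The mechanical half of Tier 2 (extracting the window) can be sketched as below. The plausibility judgment itself would likely be an LLM call; `identifiers_present` is only a crude stand-in that checks whether backticked names in the reviewer's prose appear in the window. Both function names are hypothetical:

```python
import re


def read_window(source_lines: list[str], line: int, n: int = 5) -> list[str]:
    """Return the lines in [line - n, line + n], clamped to the file.

    `line` is 1-indexed, matching how findings cite file:line.
    """
    start = max(1, line - n)
    end = min(len(source_lines), line + n)
    return source_lines[start - 1:end]


def identifiers_present(description: str, window: list[str]) -> bool:
    """Crude stand-in for the plausibility check (an assumption, not the
    real design): every `backticked` name in the description must appear
    somewhere in the window. Passes vacuously if nothing is backticked."""
    names = re.findall(r"`([^`]+)`", description)
    text = "\n".join(window)
    return all(name in text for name in names)
```

A string-containment heuristic like this would miss paraphrased descriptions, which is one reason the window size and the matching rule are framed as tunable.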
Other decisions if revisited
| Decision | Choice |
|---|---|
| Run order | After B6 dedup. Operate on consolidated findings, not per-reviewer outputs. |
| On failure | Demote confidence by one anchor + add *(evidence-match warning)* marker. Do NOT drop. |
| Always-on or mode-gated | Tier 1 always-on (cost negligible). Tier 2 opt-in / mode-gated. |
| Window for Tier 2 | line ± 5 initial guess; tunable on observed false-positive rate. |
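The "On failure" choice (demote, mark, never drop) might look roughly like this. The anchor scale `ANCHORS` and the `Finding` fields are assumptions made for the example; the actual confidence anchors used by /ba:review are not specified here:

```python
from dataclasses import dataclass, replace

# Hypothetical confidence anchors, lowest to highest; the real scale may differ.
ANCHORS = (0, 25, 50, 75, 100)
WARNING = "*(evidence-match warning)*"


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding shape for illustration only."""
    title: str
    confidence: int


def demote(finding: Finding) -> Finding:
    """Lower confidence by one anchor (floored at the lowest) and append the
    warning marker. The finding is always returned, never dropped.

    Raises ValueError if `confidence` is not one of ANCHORS.
    """
    idx = ANCHORS.index(finding.confidence)
    lowered = ANCHORS[max(0, idx - 1)]
    return replace(finding, title=f"{finding.title} {WARNING}", confidence=lowered)
```

Returning a modified copy keeps the failed finding visible to the user, so a false positive from the validator costs one confidence step instead of silently hiding a real issue.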
Partial absorption elsewhere
The file-existence check from Tier 1 is included in C2's validator pass (see the consolidation rework bundle issue) as part of schema validity. So a minimal form of "evidence match" effectively ships with the consolidation rework — but framed as structural schema validation, not as a separate anti-hallucination concept. If you want strict honoring of this decline, that file-existence check could be removed from C2's validator scope; the design decision currently includes it.
Trigger conditions for revisit
- User notices hallucinated citations in /ba:review output regularly (e.g., multiple findings per review pointing at lines that don't exist or describing code that's not there)
- An autofix mode is introduced (would re-motivate ce's original use case — line drift after auto-applied fixes)
- C2 (structured output) lands and we want to push more validation into the validator pass
References
- ce-code-review Stage 5b (evidence-match in externalization modes)
- C2 issue — already includes file-existence as part of schema validation