Consult request_state only for the question a resolver is asking#3019
Conversation
The resolver framework restored recorded outcomes from the client-echoed request_state before running the resolver body, so an entry under a resolver's wire key could stand in for the body's own computation on a branch that never asks, and a decline/cancel entry could abort a resolver with no Elicit arm at all. Restore now happens only inside _elicit, after the body returned an Elicit this round: the body's concrete return always wins, and a recorded answer satisfies exactly the question being asked, validated against the live Elicit.schema. That makes the statically-extracted elicit_schema unused, so it is removed - which also fixes resolvers annotated `-> Elicit[BaseModel]` (dynamic create_model schemas): restoring through the static schema called BaseModel.model_validate, whose PydanticUserError is not a ValidationError and hard-failed the call after the user had answered. A tool combining Resolve(...) parameters with a hand-rolled InputRequiredResult body flow shares the call's single input_responses/request_state channel; the flows overwrite each other's state and the call never converges. A declared InputRequiredResult return arm is now rejected at registration (InvalidSignature, seeing through Annotated wrappers, type aliases, and unions) and an undeclared one fails the call at runtime. The cost of body-wins is that an eliciting resolver's body re-runs on each round instead of being restored without running; the docs scope the once-per-call guarantee to the question accordingly.
There was a problem hiding this comment.
1 issue found across 6 files
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
There was a problem hiding this comment.
I didn't find any bugs in this change, but it reworks the resolver request_state trust model and changes resolver re-run semantics across rounds, so it deserves a human reviewer's judgment on the design rather than a bot approval.
Extended reasoning...
Overview
The PR moves the request_state restore gate from _resolve into _elicit in src/mcp/server/mcpserver/resolve.py, deletes the statically-extracted elicit_schema plumbing, adds a registration-time rejection (and a runtime guard) for tools that combine Resolve(...) parameters with an InputRequiredResult return in tools/base.py, and updates docs and the refund_desk story README to scope the run-once guarantee to the question rather than the resolver body. Five new regression tests plus updates to existing ones accompany the change.
Security risks
The change is security-positive on its face: a client-echoed request_state entry can no longer stand in for a server-side computed value or abort a non-eliciting resolver, since restore now only happens after the body returns an Elicit and is re-validated against the live schema. That said, this is exactly the kind of trust-boundary code (client-supplied state influencing server-side resolution) where subtle interactions matter — e.g. the new semantics that resolver bodies re-run on every round means resolver side effects now execute more times than before, which is a behavioral contract change downstream servers may rely on.
Level of scrutiny
This touches the core multi-round-trip elicitation flow of the server SDK and intentionally changes documented semantics (eliciting resolver bodies re-run each round; Resolve + InputRequiredResult is now rejected, which is technically a breaking restriction for any tool that previously combined them). Those are design decisions a maintainer should weigh in on, not something to shadow-approve regardless of implementation correctness.
Other factors
The implementation is coherent and well-tested: the new tests directly exercise the crafted-state attack paths, the dynamic Elicit[BaseModel] restore, and the channel-ownership rejection across several annotation spellings, and the bug-hunting pass found no issues. The diff is consistent with the PR description. My deferral is about the scope and semantics of the change, not about correctness concerns.
…heck Accessing `__value__` on a PEP 695 `type X = ...` alias evaluates it, so an alias naming TYPE_CHECKING-only imports raised NameError at registration for resolver-using tools that registered fine before the check existed. Such an alias declares no arm the check can see; the in-call guard still covers a body that returns an InputRequiredResult. Also pin the neighboring shapes that already work: a subscripted generic alias forwards `__value__` to its origin and is rejected, and a parameterized builtin generic return registers on every supported Python.
There was a problem hiding this comment.
I didn't find any bugs in this change, but it materially alters the resolver elicitation trust model and re-run semantics (client-echoed request_state handling, registration-time channel-ownership rejection), so it deserves a human reviewer's sign-off rather than an automated approval.
Extended reasoning...
Overview
The PR moves the request_state restore from _resolve into _elicit so a recorded outcome is only consulted after a resolver's body actually asks its question, deletes the statically-extracted elicit_schema plumbing (fixing the -> Elicit[BaseModel] restore crash), and adds a registration-time (InvalidSignature) plus runtime (ToolError) rejection of combining Resolve(...) parameters with a hand-rolled InputRequiredResult return. Docs and the refund_desk story README are updated to scope the run-once guarantee to the question rather than the resolver body, and five new regression tests cover the changed behaviors.
Security risks
The change is security-positive in intent: it closes a path where a client could craft a predictable-wire-key request_state entry to substitute or suppress a server-side computed value before the resolver body ran. The new _outcome_from_state always validates accepted entries against the live Elicit.schema, which is stricter than before. However, because this is exactly the trust boundary between client-echoed state and server-side computation, the correctness of the new restore placement and its interaction with res.persist/round convergence is the kind of thing a human maintainer should confirm. Integrity-sealing of request_state remains an acknowledged follow-up.
Level of scrutiny
This is core SDK request-handling logic with a deliberate semantic change (eliciting resolver bodies now re-run on every round the call passes through), not a mechanical or config-level change. Behavior changes of this kind can affect user resolvers with side effects, so a maintainer should weigh the documented trade-off even though the implementation looks consistent and well-tested.
Other factors
The bug-hunting system found no issues, the new tests are thorough (crafted state entries, dynamic schemas, five annotation spellings of the rejected combination, the unevaluable PEP 695 alias case), and the one external review comment (cubic) was investigated, answered with an empirical check, and the adjacent real issue fixed in a follow-up commit. Nothing outstanding remains in the timeline, but the scope and trust-model implications keep this above the bar for automated approval.
Summary
Post-merge review of #2986 surfaced three issues in the resolver
input_requiredflow; this PR fixes all three, each by removing the structure that made the issue possible rather than guarding around it.1.
request_stateentries could stand in for a resolver's own computation. The restore gate in_resolveconsulted the client-echoedrequest_statebefore running the resolver body. Wire keys are predictable (module:qualname), so a client could send an accept entry for a resolver branch that never asks, replacing a server-side computed value, and a decline/cancel entry could abort a resolver with noElicitarm at all. That contradicted #2986's stated trust model ("a fabricated entry grants the same capability as answering the live question dishonestly"): an entry honored before the body runs grants strictly more.Restore now happens only inside
_elicit, after the body has returned anElicitthis round. The body's concrete return always wins; a recorded answer satisfies exactly the question being asked, re-validated against the liveElicit.schema; an entry for a question the server is not asking is never consulted. The trust-model statement is now true by construction. (Integrity-sealingrequest_stateremains the separate follow-up, staged the same way as in the typescript-sdk.)2.
-> Elicit[BaseModel]crashed on restore.issubclass(BaseModel, BaseModel)is true, so a dynamic-schema resolver (the natural typing forcreate_model(...)questions) registeredBaseModelitself as its static restore schema; on round 3+,BaseModel.model_validate(...)raisesPydanticUserError(aTypeError, not aValidationError), which hard-failed the call after the user had already answered. With restore moved into_elicit, the statically-extracted schema has no remaining consumer, so that plumbing is deleted and the crash path with it: dynamic schemas restore against the live schema like everything else. (The registration-time rejection of multi-Elicit-arm annotations stays.)3. Resolver elicitation and a hand-rolled
InputRequiredResultbody never converge together. A call has oneinput_responses/request_statechannel. A tool that usesResolve(...)parameters and returns its ownInputRequiredResultoverwrites the resolver codec state with its opaque body state; the next round decodes that as "no progress" and re-asks the already-answered resolver question. Round 3 is wire-identical to round 1, looping until the client's round budget runs out. A declaredInputRequiredResultreturn arm on a resolver-using tool is now rejected at registration (InvalidSignature, seeing throughAnnotatedwrappers, type aliases, and unions), and an undeclared one fails the call with a clear error instead of silently fighting for the channel. The low-levelServerpath and tools without resolver parameters are unaffected.Semantics
The cost of body-wins is that an eliciting resolver's body re-runs on each round the call passes through (previously: once to ask, once to consume, then restored without running). This unifies the model: every resolver body may run on each round, each question is still asked exactly once, and only elicited outcomes persist. Wire shapes, round counts, and question batching are unchanged. The docs (
dependencies.md,multi-round-trip.md, the refund_desk story README) now scope the run-once guarantee to the question.Tests
Five new regression tests, each verified to fail on the pre-fix source: a crafted accept entry cannot replace a non-eliciting branch's value (and the body provably runs); a decline entry cannot abort a pure resolver; an
-> Elicit[BaseModel]chain completes across rounds; theResolve+InputRequiredResultcombination is rejected at registration across five annotation spellings; a dynamically returnedInputRequiredResultfrom a resolver-using tool errors cleanly. The existing suite is updated for the run-count change. 100% branch coverage.AI Disclaimer