Consult request_state only for the question a resolver is asking by maxisbey · Pull Request #3019 · modelcontextprotocol/python-sdk

maxisbey · 2026-06-29T14:27:03Z

Summary

Post-merge review of #2986 surfaced three issues in the resolver input_required flow; this PR fixes all three, each by removing the structure that made the issue possible rather than guarding around it.

1. request_state entries could stand in for a resolver's own computation. The restore gate in _resolve consulted the client-echoed request_state before running the resolver body. Wire keys are predictable (module:qualname), so a client could send an accept entry for a resolver branch that never asks, replacing a server-side computed value, and a decline/cancel entry could abort a resolver with no Elicit arm at all. That contradicted #2986's stated trust model ("a fabricated entry grants the same capability as answering the live question dishonestly"): an entry honored before the body runs grants strictly more.

Restore now happens only inside _elicit, after the body has returned an Elicit this round. The body's concrete return always wins; a recorded answer satisfies exactly the question being asked, re-validated against the live Elicit.schema; an entry for a question the server is not asking is never consulted. The trust-model statement is now true by construction. (Integrity-sealing request_state remains the separate follow-up, staged the same way as in the typescript-sdk.)

2. -> Elicit[BaseModel] crashed on restore. issubclass(BaseModel, BaseModel) is true, so a dynamic-schema resolver (the natural typing for create_model(...) questions) registered BaseModel itself as its static restore schema; on round 3+, BaseModel.model_validate(...) raises PydanticUserError (a TypeError, not a ValidationError), which hard-failed the call after the user had already answered. With restore moved into _elicit, the statically-extracted schema has no remaining consumer, so that plumbing is deleted and the crash path with it: dynamic schemas restore against the live schema like everything else. (The registration-time rejection of multi-Elicit-arm annotations stays.)

3. Resolver elicitation and a hand-rolled InputRequiredResult body never converge together. A call has one input_responses/request_state channel. A tool that uses Resolve(...) parameters and returns its own InputRequiredResult overwrites the resolver codec state with its opaque body state; the next round decodes that as "no progress" and re-asks the already-answered resolver question. Round 3 is wire-identical to round 1, looping until the client's round budget runs out. A declared InputRequiredResult return arm on a resolver-using tool is now rejected at registration (InvalidSignature, seeing through Annotated wrappers, type aliases, and unions), and an undeclared one fails the call with a clear error instead of silently fighting for the channel. The low-level Server path and tools without resolver parameters are unaffected.

Semantics

The cost of body-wins is that an eliciting resolver's body re-runs on each round the call passes through (previously: once to ask, once to consume, then restored without running). This unifies the model: every resolver body may run on each round, each question is still asked exactly once, and only elicited outcomes persist. Wire shapes, round counts, and question batching are unchanged. The docs (dependencies.md, multi-round-trip.md, the refund_desk story README) now scope the run-once guarantee to the question.

Tests

Five new regression tests, each verified to fail on the pre-fix source: a crafted accept entry cannot replace a non-eliciting branch's value (and the body provably runs); a decline entry cannot abort a pure resolver; an -> Elicit[BaseModel] chain completes across rounds; the Resolve + InputRequiredResult combination is rejected at registration across five annotation spellings; a dynamically returned InputRequiredResult from a resolver-using tool errors cleanly. The existing suite is updated for the run-count change. 100% branch coverage.

_{AI Disclaimer}

The resolver framework restored recorded outcomes from the client-echoed request_state before running the resolver body, so an entry under a resolver's wire key could stand in for the body's own computation on a branch that never asks, and a decline/cancel entry could abort a resolver with no Elicit arm at all. Restore now happens only inside _elicit, after the body returned an Elicit this round: the body's concrete return always wins, and a recorded answer satisfies exactly the question being asked, validated against the live Elicit.schema. That makes the statically-extracted elicit_schema unused, so it is removed - which also fixes resolvers annotated `-> Elicit[BaseModel]` (dynamic create_model schemas): restoring through the static schema called BaseModel.model_validate, whose PydanticUserError is not a ValidationError and hard-failed the call after the user had answered. A tool combining Resolve(...) parameters with a hand-rolled InputRequiredResult body flow shares the call's single input_responses/request_state channel; the flows overwrite each other's state and the call never converges. A declared InputRequiredResult return arm is now rejected at registration (InvalidSignature, seeing through Annotated wrappers, type aliases, and unions) and an undeclared one fails the call at runtime. The cost of body-wins is that an eliciting resolver's body re-runs on each round instead of being restored without running; the docs scope the once-per-call guarantee to the question accordingly.

cubic-dev-ai

1 issue found across 6 files

_{Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic}

claude

I didn't find any bugs in this change, but it reworks the resolver request_state trust model and changes resolver re-run semantics across rounds, so it deserves a human reviewer's judgment on the design rather than a bot approval.

Extended reasoning...

Overview

The PR moves the request_state restore gate from _resolve into _elicit in src/mcp/server/mcpserver/resolve.py, deletes the statically-extracted elicit_schema plumbing, adds a registration-time rejection (and a runtime guard) for tools that combine Resolve(...) parameters with an InputRequiredResult return in tools/base.py, and updates docs and the refund_desk story README to scope the run-once guarantee to the question rather than the resolver body. Five new regression tests plus updates to existing ones accompany the change.

Security risks

The change is security-positive on its face: a client-echoed request_state entry can no longer stand in for a server-side computed value or abort a non-eliciting resolver, since restore now only happens after the body returns an Elicit and is re-validated against the live schema. That said, this is exactly the kind of trust-boundary code (client-supplied state influencing server-side resolution) where subtle interactions matter — e.g. the new semantics that resolver bodies re-run on every round means resolver side effects now execute more times than before, which is a behavioral contract change downstream servers may rely on.

Level of scrutiny

This touches the core multi-round-trip elicitation flow of the server SDK and intentionally changes documented semantics (eliciting resolver bodies re-run each round; Resolve + InputRequiredResult is now rejected, which is technically a breaking restriction for any tool that previously combined them). Those are design decisions a maintainer should weigh in on, not something to shadow-approve regardless of implementation correctness.

Other factors

The implementation is coherent and well-tested: the new tests directly exercise the crafted-state attack paths, the dynamic Elicit[BaseModel] restore, and the channel-ownership rejection across several annotation spellings, and the bug-hunting pass found no issues. The diff is consistent with the PR description. My deferral is about the scope and semantics of the change, not about correctness concerns.

…heck Accessing `__value__` on a PEP 695 `type X = ...` alias evaluates it, so an alias naming TYPE_CHECKING-only imports raised NameError at registration for resolver-using tools that registered fine before the check existed. Such an alias declares no arm the check can see; the in-call guard still covers a body that returns an InputRequiredResult. Also pin the neighboring shapes that already work: a subscripted generic alias forwards `__value__` to its origin and is rejected, and a parameterized builtin generic return registers on every supported Python.

claude

I didn't find any bugs in this change, but it materially alters the resolver elicitation trust model and re-run semantics (client-echoed request_state handling, registration-time channel-ownership rejection), so it deserves a human reviewer's sign-off rather than an automated approval.

Extended reasoning...

Overview

The PR moves the request_state restore from _resolve into _elicit so a recorded outcome is only consulted after a resolver's body actually asks its question, deletes the statically-extracted elicit_schema plumbing (fixing the -> Elicit[BaseModel] restore crash), and adds a registration-time (InvalidSignature) plus runtime (ToolError) rejection of combining Resolve(...) parameters with a hand-rolled InputRequiredResult return. Docs and the refund_desk story README are updated to scope the run-once guarantee to the question rather than the resolver body, and five new regression tests cover the changed behaviors.

Security risks

The change is security-positive in intent: it closes a path where a client could craft a predictable-wire-key request_state entry to substitute or suppress a server-side computed value before the resolver body ran. The new _outcome_from_state always validates accepted entries against the live Elicit.schema, which is stricter than before. However, because this is exactly the trust boundary between client-echoed state and server-side computation, the correctness of the new restore placement and its interaction with res.persist/round convergence is the kind of thing a human maintainer should confirm. Integrity-sealing of request_state remains an acknowledged follow-up.

Level of scrutiny

This is core SDK request-handling logic with a deliberate semantic change (eliciting resolver bodies now re-run on every round the call passes through), not a mechanical or config-level change. Behavior changes of this kind can affect user resolvers with side effects, so a maintainer should weigh the documented trade-off even though the implementation looks consistent and well-tested.

Other factors

The bug-hunting system found no issues, the new tests are thorough (crafted state entries, dynamic schemas, five annotation spellings of the rejected combination, the unevaluable PEP 695 alias case), and the one external review comment (cubic) was investigated, answered with an empirical check, and the adjacent real issue fixed in a follow-up commit. Nothing outstanding remains in the timeline, but the scope and trust-model implications keep this above the bar for automated approval.

maxisbey marked this pull request as ready for review June 29, 2026 14:27

cubic-dev-ai Bot reviewed Jun 29, 2026

View reviewed changes

Comment thread src/mcp/server/mcpserver/resolve.py Outdated

claude Bot reviewed Jun 29, 2026

View reviewed changes

maxisbey added 2 commits June 29, 2026 15:16

Reword new doc passages to avoid dash asides

ee82f82

felixweinberger approved these changes Jun 29, 2026

View reviewed changes

maxisbey merged commit 8f2c97b into main Jun 29, 2026
35 of 36 checks passed

maxisbey deleted the resolver-restore-semantics branch June 29, 2026 15:44

claude Bot reviewed Jun 29, 2026

View reviewed changes

maxisbey mentioned this pull request Jun 29, 2026

Implement SEP-2322: Multi Round-Trip Requests #2797

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Consult request_state only for the question a resolver is asking#3019

Consult request_state only for the question a resolver is asking#3019
maxisbey merged 3 commits into
mainfrom
resolver-restore-semantics

maxisbey commented Jun 29, 2026 •

edited

Loading

Uh oh!

cubic-dev-ai Bot left a comment •

edited

Loading

Uh oh!

Uh oh!

claude Bot left a comment

Uh oh!

Uh oh!

claude Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

maxisbey commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Semantics

Tests

Uh oh!

cubic-dev-ai Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

maxisbey commented Jun 29, 2026 •

edited

Loading

cubic-dev-ai Bot left a comment •

edited

Loading