Drive resolver elicitation over the 2026-07-28 input_required flow (#2986)

Kludex · maxisbey · web-flow · commit c85836a0817f · 2026-06-29T14:39:43.000+01:00
Co-authored-by: Max Isbey <224885523+maxisbey@users.noreply.github.com>
diff --git a/docs/advanced/multi-round-trip.md b/docs/advanced/multi-round-trip.md
@@ -19,7 +19,7 @@ That's the whole protocol. Every leg is an ordinary request from the client to t
 
 ## The server side
 
-The high-level `@mcp.tool()` decorator has no sugar for this yet. Today you write it on the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:
+On `@mcp.tool()` you rarely build this by hand: declare a dependency that asks the user and the SDK returns the `InputRequiredResult` for you - that form is the **[Dependencies](../tutorial/dependencies.md)** tutorial. The manual form is the **low-level** `Server`, whose `on_call_tool` handler is allowed to return either result type:
 
 ```python title="server.py" hl_lines="44-47"
 --8<-- "docs_src/mrtr/tutorial001.py"
@@ -93,6 +93,6 @@ Drop to the underlying session, where `allow_input_required=True` hands you the
 * `input_requests` is what it needs. `request_state` is an opaque resume token only the server reads.
 * `Client` runs the retry loop for you: register `elicitation_callback` / `sampling_callback` / `list_roots_callback` and `call_tool` returns a plain `CallToolResult`. `input_required_max_rounds` (default 10) bounds it.
 * To inspect or persist rounds, use `client.session.call_tool(..., allow_input_required=True)` and own the `while isinstance(result, InputRequiredResult)` loop yourself.
-* The server side is the **low-level** `Server` only; `@mcp.tool()` has no sugar for this yet.
+* On `@mcp.tool()`, a dependency that asks the user produces this result for you (**[Dependencies](../tutorial/dependencies.md)**); the **low-level** `Server` is the manual form.
 
 This is the mechanism that replaces server-initiated sampling and the rest of the push-style back-channel; see **[Deprecated features](deprecated.md)**.
diff --git a/docs/migration.md b/docs/migration.md
@@ -786,6 +786,8 @@ Positional calls (`await ctx.info("hello")`) are unaffected.
 
 `Context.elicit()` (and `elicit_with_validation()`) now render the schema first and validate each property against the spec's `PrimitiveSchemaDefinition`, raising `TypeError` at the call site for anything outside it. `Optional[T]` fields render as `{"type": ...}` with the field omitted from `required` (previously the non-spec `anyOf` shape). A bare `list[str]` field is rejected because it renders without the required enum items; use `list[Literal[...]]` or `list[str]` with `json_schema_extra` supplying the items. Unions of multiple primitives (e.g. `int | str`) and nested models are rejected.
 
+A schema-mismatched *accepted* answer also fails differently: the call now raises `ValueError` with a stable message ("Received an accepted elicitation whose content does not match the requested schema") instead of letting pydantic's `ValidationError` escape with its internals. Code that caught `ValidationError` around `ctx.elicit()` should catch `ValueError` (or rely on the tool's error result).
+
 ### Replace `RootModel` by union types with `TypeAdapter` validation
 
 The following union types are no longer `RootModel` subclasses:
diff --git a/docs/tutorial/dependencies.md b/docs/tutorial/dependencies.md
@@ -116,11 +116,26 @@ And if the user won't answer at all - declines the question, or cancels it?
 
 That's the right default for a precondition: no answer, no order. When declining is an outcome your tool wants to handle - skip the backorder but still suggest another title - annotate `ElicitationResult[Backorder]` instead and the tool receives the full accept/decline/cancel outcome to branch on. **[Elicitation](elicitation.md)** shows that form, and everything else about asking: the schema rules, the three answers, the client's side of the conversation.
 
+!!! info
+    The framework picks the question's transport from the negotiated protocol version; the code
+    above is identical on both. On **2026-07-28** and later the question rides inside a
+    multi-round-trip `tools/call` - the server returns it, the client's `elicitation_callback`
+    answers it, and the `Client` retries the call for you (**[Multi-round-trip requests](../advanced/multi-round-trip.md)**). On
+    **2025-11-25** and earlier it is a synchronous elicitation request mid-call. Each question is
+    asked exactly once per call - a guarantee about the question, not the resolver. In the
+    multi-round-trip form an eliciting resolver runs again to consume its answer, so code before
+    its `return Elicit(...)` runs on the asking round and again on the answering one; a resolver
+    that answered *without* asking, like `check_stock`, may run again whenever the call resumes
+    after a question. When it resumes, each answer is matched back to its question, so an
+    eliciting resolver must derive its question deterministically from the tool's arguments and
+    earlier answers - a per-call generated value (a `default_factory` id, a timestamp) is
+    re-derived on each round and must not appear in a question the answer is meant to bind to.
+
 ## Recap
 
 * `Annotated[T, Resolve(fn)]` on a tool parameter: the SDK runs `fn` and injects its return value.
 * A resolved parameter is invisible to the model and cannot be supplied by a client. Values the model must not invent - prices, identities, permissions - belong here.
-* A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. The graph runs each resolver at most once per call.
+* A resolver's parameters are resolved the same way: the `Context`, another `Resolve(...)`, or a tool argument by name. The graph runs each resolver at most once per round, however many consumers it has; each question is asked exactly once, an eliciting resolver runs again to consume its answer, and a resolver that never asked may run again when a call resumes.
 * Bad graphs fail at registration with `InvalidSignature`, not mid-call.
 * Return `Elicit(message, Model)` to ask the user, only when you have to. Unwrapped annotations abort on decline; `ElicitationResult[T]` lets the tool branch.
 
diff --git a/docs/tutorial/elicitation.md b/docs/tutorial/elicitation.md
@@ -76,8 +76,8 @@ A refusal is not an error. The tool decides what declining means (here, no booki
 
 !!! tip
     The answer is validated against your model before your code sees it. A client that sends
-    `"maybe"` for a `bool` doesn't corrupt your booking: the call fails with the
-    `ValidationError`, your `if` never runs.
+    `"maybe"` for a `bool` doesn't corrupt your booking: the call fails with a
+    schema-mismatch error, your `if` never runs.
 
 ## Ask before the tool runs
 
diff --git a/examples/stories/legacy_elicitation/README.md b/examples/stories/legacy_elicitation/README.md
@@ -68,6 +68,6 @@ uv run python -m stories.legacy_elicitation.client --http --legacy --server serv
 ## See also
 
 `sampling/` (same push-request shape, deprecated per SEP-2577), `mrtr/`
-(planned — the 2026-era carrier), `error_handling/`
+(the 2026-era carrier), `error_handling/`
 (`UrlElicitationRequiredError`), `refund_desk/` (resolver DI rides this push
-mechanism today).
+mechanism on handshake-era connections).
diff --git a/examples/stories/manifest.toml b/examples/stories/manifest.toml
@@ -40,9 +40,8 @@ era    = "legacy"
 status = "legacy"
 
 [story.refund_desk]
-# Resolver DI rides push elicitation (ctx.elicit) today; era flips to "dual" once
-# the SDK carries resolver elicitation over the 2026 input_required round-trip.
-era      = "legacy"
+# Resolver elicitation picks its transport per era: input_required round-trips on
+# the modern leg, push elicitation (ctx.elicit) on the legacy one.
 lowlevel = false
 
 [story.sampling]
diff --git a/examples/stories/mrtr/README.md b/examples/stories/mrtr/README.md
@@ -46,7 +46,7 @@ uv run python -m stories.mrtr.client --http --server server_lowlevel
 
 ## Spec
 
-[Multi-round results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#multi-round-results)
+[Input required tool results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#input-required-tool-results)
 
 ## See also
 
diff --git a/examples/stories/refund_desk/README.md b/examples/stories/refund_desk/README.md
@@ -7,9 +7,10 @@ reason)` refunds what the order record says — `cents` is resolver-computed and
 does not appear in the input schema at all, so the model cannot supply or
 inflate the amount. Resolvers form a DAG (`load_order` → `refund_scope` →
 `refund_amount` / `ask_restock`), may return `Elicit[...]` to ask the human,
-and run at most once per call. A resolver's own plain parameters are filled
-from the tool's arguments by name — `load_order(order_id)` receives the
-`order_id` the model passed to `refund_order`.
+and ask each question at most once per call. A resolver's own plain
+parameters are filled from the tool's arguments by name —
+`load_order(order_id)` receives the `order_id` the model passed to
+`refund_order`.
 
 ## Run it
 
@@ -18,9 +19,9 @@ from the tool's arguments by name — `load_order(order_id)` receives the
 uv run python -m stories.refund_desk.client
 
 # HTTP — the client self-hosts the server on a free port, runs, then tears it
-# down (--legacy: resolver elicitation rides the push request today; the
-# manifest pins this era, so bare --http runs the same leg)
-uv run python -m stories.refund_desk.client --http --legacy
+# down (2026 protocol: the questions ride embedded input_required round-trips;
+# add --legacy to ride synchronous push elicitation instead)
+uv run python -m stories.refund_desk.client --http
 ```
 
 ## What to look at
@@ -47,21 +48,38 @@ uv run python -m stories.refund_desk.client --http --legacy
 
 ## Caveats
 
+- **Transport per era.** The framework picks the elicitation transport from
+  the negotiated protocol: at >= 2026-07-28 the questions ride embedded
+  `input_required` round-trips (a resolver that depends on another's answer is
+  asked in a later round); at <= 2025-11-25 each is a synchronous
+  `elicitation/create` push request mid-call. Author code is identical on
+  both — this client runs unchanged on either era.
 - **Decline order.** A declined unwrapped dependency aborts resolution in
   tool-signature order — `cents` resolves before `restock`, so `ask_restock`
   never runs. Don't rely on a later resolver's side effects after an earlier
   consumer can abort.
-- **Memoization scope.** Each resolver runs at most once per `tools/call`,
-  keyed by function identity; nothing is cached across calls or connections.
+- **Memoization scope.** Each question is asked at most once per call, and
+  within a round each resolver runs at most once, keyed by function identity.
+  Across 2026 rounds only *elicited* outcomes persist (in `requestState`); a
+  resolver that resolves without eliciting is pure and may re-run each round.
+  An eliciting resolver's body runs again too — once to ask, once more to
+  consume its answer.
+  An answer is matched back to its question when the call resumes, so an
+  eliciting resolver must derive its question deterministically from the
+  tool's arguments and earlier answers; a per-call generated value (a
+  `default_factory` id, a timestamp) is re-derived each round and must not
+  appear in a question the answer is meant to bind to. Nothing is cached
+  across calls or connections.
 - **Validate elicited values.** Elicited answers are human-typed; check them
   against your records (as `_scoped` does) before acting on them.
 
 ## Spec
 
-[Elicitation — client features](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation)
+[Elicitation — client features](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation),
+[Input required tool results — server features](https://modelcontextprotocol.io/specification/draft/server/tools#input-required-tool-results)
 
 ## See also
 
-`legacy_elicitation/` (the push mechanism resolver elicitation rides on today),
-`mrtr/` (the 2026 `input_required` carrier; resolver DI will ride it once the
-SDK wires them together).
+`mrtr/` (the 2026 `input_required` carrier these questions ride at
+>= 2026-07-28), `legacy_elicitation/` (the push mechanism they ride on
+handshake-era connections).
diff --git a/examples/stories/refund_desk/client.py b/examples/stories/refund_desk/client.py
@@ -41,7 +41,9 @@ async def on_elicit(context: ClientRequestContext, params: types.ElicitRequestPa
         assert counts == {"scope": 0, "restock": 0}, counts
 
         # Full refund of a three-line order. The scope question fires exactly ONCE even though
-        # both refund_amount and ask_restock consume it — memoized within the call.
+        # both refund_amount and ask_restock consume it — asked at most once per call on either
+        # era. ask_restock needs the scope ANSWER, so at 2026 the two questions land in
+        # successive rounds, never one concurrent batch: counts and order are era-independent.
         receipt = await client.call_tool("refund_order", {"order_id": "ORD-7002", "reason": "arrived broken"})
         assert receipt.structured_content == {
             "order_id": "ORD-7002",
@@ -53,7 +55,7 @@ async def on_elicit(context: ClientRequestContext, params: types.ElicitRequestPa
 
         # Declining restock still refunds: the tool keeps the ElicitationResult union for
         # `restock`, sees the decline, and just skips the restock. The scope counter moves
-        # again — the memo cache is per tools/call, not per connection.
+        # again — questions are deduped per call, not per connection.
         declines.add("restock")
         answers["scope"] = {"full": False, "sku": "canvas-tote"}
         receipt = await client.call_tool("refund_order", {"order_id": "ORD-7002", "reason": "wrong colour"})
diff --git a/src/mcp/server/elicitation.py b/src/mcp/server/elicitation.py
@@ -87,6 +87,18 @@ def _validate_rendered_properties(json_schema: dict[str, Any]) -> None:
             ) from None
 
 
+def render_elicitation_schema(schema: type[BaseModel]) -> dict[str, Any]:
+    """Render a model as the spec-valid `requested_schema` for an elicitation.
+
+    Raises:
+        TypeError: If a field renders as something the spec's
+            `PrimitiveSchemaDefinition` does not accept.
+    """
+    json_schema = schema.model_json_schema(schema_generator=_ElicitationJsonSchema)
+    _validate_rendered_properties(json_schema)
+    return json_schema
+
+
 async def elicit_with_validation(
     session: ServerSession,
     message: str,
@@ -102,27 +114,32 @@ async def elicit_with_validation(
     the user or automatically generating a response.
 
     For sensitive data like credentials or OAuth flows, use elicit_url() instead.
+
+    Raises:
+        ValueError: If the client accepted the elicitation without supplying
+            content, or with content that does not match the requested schema.
     """
-    json_schema = schema.model_json_schema(schema_generator=_ElicitationJsonSchema)
-    _validate_rendered_properties(json_schema)
+    json_schema = render_elicitation_schema(schema)
 
     result = await session.elicit_form(
         message=message,
         requested_schema=json_schema,
         related_request_id=related_request_id,
     )
 
-    if result.action == "accept" and result.content is not None:
-        # Validate and parse the content using the schema
-        validated_data = schema.model_validate(result.content)
+    if result.action == "accept":
+        if result.content is None:
+            raise ValueError("Received an accepted elicitation with no content")
+        try:
+            validated_data = schema.model_validate(result.content)
+        except ValidationError as e:
+            raise ValueError(
+                "Received an accepted elicitation whose content does not match the requested schema"
+            ) from e
         return AcceptedElicitation(data=validated_data)
-    elif result.action == "decline":
+    if result.action == "decline":
         return DeclinedElicitation()
-    elif result.action == "cancel":
-        return CancelledElicitation()
-    else:  # pragma: no cover
-        # This should never happen, but handle it just in case
-        raise ValueError(f"Unexpected elicitation action: {result.action}")
+    return CancelledElicitation()
 
 
 async def elicit_url(
diff --git a/src/mcp/server/mcpserver/context.py b/src/mcp/server/mcpserver/context.py
@@ -232,6 +232,11 @@ def request_id(self) -> str:
         """Get the unique ID for this request."""
         return str(self.request_context.request_id)
 
+    @property
+    def protocol_version(self) -> str | None:
+        """The negotiated protocol version, or `None` outside of an active request."""
+        return self._request_context.protocol_version if self._request_context is not None else None
+
     @property
     def input_responses(self) -> InputResponses | None:
         """Client responses to a prior `InputRequiredResult.input_requests`.
diff --git a/src/mcp/server/mcpserver/resolve.py b/src/mcp/server/mcpserver/resolve.py
diff --git a/src/mcp/server/mcpserver/tools/base.py b/src/mcp/server/mcpserver/tools/base.py
diff --git a/tests/docs_src/test_dependencies.py b/tests/docs_src/test_dependencies.py
diff --git a/tests/docs_src/test_elicitation.py b/tests/docs_src/test_elicitation.py
diff --git a/tests/server/mcpserver/test_resolve.py b/tests/server/mcpserver/test_resolve.py