Strip reasoning_content from chat history before sending to the LLM by jwzj720 · Pull Request #371 · plmbr/notebook-intelligence

jwzj720 · 2026-06-09T21:19:37Z

Problem

NBI stores assistant messages in chat history with a reasoning_content key and replays the full history back to the model on every subsequent turn. reasoning_content (and reasoning) are output-only fields. Strict-validating OpenAI-compatible endpoints reject them as input — for example Databricks model serving (pydantic extra="forbid") returns:

Error code: 400 - {"message":"messages.0.reasoning_content: Extra inputs are not permitted"}

Because the key is written unconditionally (even as an empty string when there is no reasoning), the first turn of a chat works but every follow-up turn against such an endpoint 400s.
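The first-turn-works, second-turn-fails pattern can be reproduced without any network call. The sketch below is only an illustration: `validate_messages` and `ALLOWED_KEYS` are hypothetical stand-ins for a strict server-side schema, not Databricks' or pydantic's actual validation code, and the index in the error depends on how the request is assembled.

```python
# Hypothetical allow-list standing in for a strict request schema.
ALLOWED_KEYS = {"role", "content"}


def validate_messages(messages):
    """Reject any message that carries a key outside the allow-list."""
    for index, message in enumerate(messages):
        extra = sorted(set(message) - ALLOWED_KEYS)
        if extra:
            raise ValueError(
                f"messages.{index}.{extra[0]}: Extra inputs are not permitted"
            )


history = []

# Turn 1: only the user message is sent, so validation passes.
history.append({"role": "user", "content": "first prompt"})
validate_messages(history)

# The assistant reply is stored with the key even when there is no reasoning.
history.append({"role": "assistant", "content": "answer", "reasoning_content": ""})

# Turn 2: the replayed history now carries the output-only key and is rejected.
history.append({"role": "user", "content": "second prompt"})
try:
    validate_messages(history)
    second_turn_error = None
except ValueError as exc:
    second_turn_error = str(exc)
```

Note that an empty string is enough to trigger the rejection: the validator objects to the presence of the key, not to its value.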

Where it comes from

The assistant message is stored with the key in extension.py (add_message(..., {"role": "assistant", "content": ..., "reasoning_content": ...}), both the buffered and streamed paths).
It is replayed via base_chat_participant.py (messages = [system] + request.chat_history → chat_model.completions(messages, ...)).
The LiteLLM and OpenAI-compatible providers forward the messages to the API unmodified (messages=messages.copy()), so the rejected field goes out on the wire.

Fix

Add a strip_reasoning_fields() helper to the LiteLLM-compatible and OpenAI-compatible providers and apply it in completions() immediately before the API call (replacing the prior messages.copy()). It returns a sanitized copy with reasoning_content and reasoning removed from each message dict, without mutating the caller's list or NBI's stored history — so reasoning stays available to the UI and only the outbound request is cleaned. Both streaming and non-streaming paths use it.

These two providers are the ones that hit strict OpenAI-compatible / LiteLLM validators (including Databricks).
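For readers without the diff open, a helper with the behavior described above might look roughly like this. Only the name and signature come from the PR; the body, the `REASONING_KEYS` constant, and the sample data are assumptions made for illustration, not the merged implementation.

```python
# Assumed constant: the two output-only keys named in the description.
REASONING_KEYS = ("reasoning_content", "reasoning")


def strip_reasoning_fields(messages: list[dict]) -> list[dict]:
    """Return a new list whose dict entries lack the reasoning keys.

    Each dict is rebuilt, so neither the input list nor its dicts are mutated.
    Non-dict entries are passed through unchanged.
    """
    sanitized = []
    for message in messages:
        if isinstance(message, dict):
            sanitized.append(
                {key: value for key, value in message.items() if key not in REASONING_KEYS}
            )
        else:
            sanitized.append(message)
    return sanitized


stored = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "hello", "reasoning_content": "", "reasoning": "thinking"},
]
outbound = strip_reasoning_fields(stored)
```

The copy here is per-dict and shallow: nested values such as a list-valued `content` are shared between the stored and outbound messages. That is sufficient as long as nothing downstream mutates them in place.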

Tests

Added to tests/test_openai_compatible_llm_provider.py:

strip_reasoning_fields() removes both keys without mutating its input, and leaves non-dict entries untouched;
the outbound API call (mocked client) receives messages with no reasoning fields, while the input/stored history is preserved.
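The second check could be written along the following lines. This is a sketch, not the test from the PR: the `completions` function is a hypothetical, simplified stand-in for the provider method, and the mocked client is assumed to expose an OpenAI-style `chat.completions.create(...)` call.

```python
from unittest.mock import MagicMock


def strip_reasoning_fields(messages):
    """Illustrative helper: drop the reasoning keys from each dict message."""
    drop = ("reasoning_content", "reasoning")
    return [
        {k: v for k, v in m.items() if k not in drop} if isinstance(m, dict) else m
        for m in messages
    ]


def completions(client, model, messages):
    """Hypothetical provider call: sanitize just before hitting the API client."""
    return client.chat.completions.create(
        model=model, messages=strip_reasoning_fields(messages)
    )


def test_outbound_request_has_no_reasoning_fields():
    client = MagicMock()  # no network traffic; records the call arguments
    history = [
        {"role": "user", "content": "first prompt"},
        {"role": "assistant", "content": "answer", "reasoning_content": "step by step"},
        {"role": "user", "content": "second prompt"},
    ]

    completions(client, "some-model", history)

    sent = client.chat.completions.create.call_args.kwargs["messages"]
    # The outbound request is clean...
    assert all("reasoning_content" not in m and "reasoning" not in m for m in sent)
    # ...while the stored history still carries the reasoning for the UI.
    assert history[1]["reasoning_content"] == "step by step"
```

Asserting on both the outbound arguments and the original list guards against a fix that cleans the request by mutating stored history.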

$ python -m pytest tests/test_openai_compatible_llm_provider.py -q
4 passed

Reported environment

The 400 was observed with:

NBI: 5.0.1
Provider: OpenAI-compatible / LiteLLM → Databricks model serving
Model: claude-sonnet-4-6 (Databricks-hosted, reasoning on by default)
Repro: ask a prompt, let it answer, then send a second prompt → server log shows 400 ... messages.0.reasoning_content: Extra inputs are not permitted.

The added test asserts the offending field is no longer present in the outbound request, which is exactly what the endpoint rejected.

NBI stores assistant messages in chat history with a `reasoning_content` key (unconditionally, even as an empty string), then replays the full history back to the LLM API on the next turn. `reasoning_content` (and `reasoning`) are output-only fields; strict-validating OpenAI-compatible endpoints reject them on input. For example, Databricks model serving (pydantic `extra="forbid"`) returns:

Error code: 400 - {"message":"messages.0.reasoning_content: Extra inputs are not permitted"}

Because the key is always present in stored history, every request after the first turn is rejected.

Fix: add a `strip_reasoning_fields()` helper to both the OpenAI-compatible and LiteLLM-compatible providers that returns a sanitized copy of the messages list with `reasoning_content` and `reasoning` removed from each message dict. It is applied in `completions()` right before the messages are passed to the API client, replacing the prior `messages.copy()`. The caller's list and NBI's stored history are left untouched (per-dict copy), so reasoning is still available for the UI; only the outbound request is sanitized. Both streaming and non-streaming paths use the sanitized list.

Adds focused unit tests asserting the helper removes the keys without mutating its input, and that the outbound API call receives messages without the reasoning fields.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mbektas · 2026-06-11T15:46:05Z

 DEFAULT_CONTEXT_WINDOW = 4096

+
+def strip_reasoning_fields(messages: list[dict]) -> list[dict]:


@jwzj720 can you move this function to util.py to prevent duplication? Otherwise it looks good to me.

mbektas reviewed Jun 11, 2026


mbektas added the blocked Blocked due to conflicts or no response from author label Jun 16, 2026

jwzj720 wants to merge 1 commit into plmbr:main from jwzj720:fix/strip-reasoning-content-from-request
