Skip to content

ref(chat): Align agent execution boundary with run/slice terminology and outcome contract#748

Draft
dcramer wants to merge 5 commits into
mainfrom
agent-run-outcome-model
Draft

ref(chat): Align agent execution boundary with run/slice terminology and outcome contract#748
dcramer wants to merge 5 commits into
mainfrom
agent-run-outcome-model

Conversation

@dcramer

@dcramer dcramer commented Jul 3, 2026

Copy link
Copy Markdown
Member

The chat executor previously signaled expected run endings — cooperative yield, continuable timeout, auth pause — by throwing CooperativeTurnYieldError and RetryableTurnError, so every caller caught and type-sniffed control flow, and the boundary mixed three vocabularies (reply/respond, legacy turn, and the spec-canonical run/slice). This lands the full contract cleanup from #746 in five commits, one per phase:

  1. executeAgentRun (né generateAssistantReply) returns a discriminated AgentRunOutcome (completed | failed | yielded | timed_out | awaiting_auth); genuine errors still throw. Outer boundaries that need the historical errors (queue worker yield routing, OAuth resume callbacks) construct them from the outcome instead of catching them.
  2. A small AgentRunner interface replaces five typeof-function injection slots; composition roots build it, and the silent production-impl fallbacks in dispatch/local/continuation runners are gone.
  3. The flat ~35-field request bag becomes six role groups (input/routing/policy/state/observers/durability), so AgentRunner.run takes one grouped request.
  4. Hard-cutover rename to run vocabulary (respond.tsagent-run.ts, no compat aliases), with spec references updated and specs/terminology.md now governing reply (reserved for destination-visible messages owned by delivery/policy layers).
  5. The ~1,450-line executor body decomposes into a chat/agent-run/ feature directory — the slice checkpointer owns resume/persistence state and translates expected endings into outcomes, so the catch block is a thin translation; tool wiring and prompt assembly are explicit-contract phase modules.

Behavior is unchanged throughout: persistence ordering, telemetry keys, delivery semantics, and the historical turn-named identifiers (AgentTurnSessionRecord, RetryableTurnError, …) are preserved per the terminology spec's grandfathering rules. A new component test pins the highest-risk invariant: a yielded slice must stay resumable — no failure persistence, no canvas-recovery hijack, mailbox requeued.

Review order: commits are independently green and reviewable in sequence; the rename commit (bffd9b70) is mechanical, and the decomposition commit (e0f64f8e) moves code without changing consumer import paths.

Fixes #746

🤖 Generated with Claude Code

dcramer and others added 5 commits July 2, 2026 17:39
generateAssistantReply previously signaled expected run endings — cooperative
yield, continuable timeout, and auth pause — by throwing
CooperativeTurnYieldError and RetryableTurnError, forcing every caller to
catch and type-sniff control flow. Introduce a discriminated AgentRunOutcome
union (completed | failed | yielded | timed_out | awaiting_auth) returned as
a value, and migrate the Slack reply executor, Slack resume, agent dispatch,
and local runners to switch on it.

Genuine errors (lost input commits, disabled auth flows, provider retry
errors, pre-commit failures) remain thrown. Outer boundaries that still
require the historical errors (queue worker yield routing, OAuth resume
callbacks) now construct them from the outcome value instead of catching
them from the executor. Session-record persistence for yield, timeout, and
auth pause is unchanged.

Adds a component test pinning that a yielded slice stays resumable: no
failure persistence, no canvas-recovery post, and requeued mailbox work.

Refs #746
Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Five call sites each declared their own `generateReply?: typeof
generateAssistantReply` injection slot, so every consumer and test double
carried the executor's full context-bag signature. Introduce a small
consumer-owned AgentRunner interface (run(messageText, context) ->
AgentRunOutcome) and construct it in composition roots, per the chat
architecture spec's service interface rules.

Fold withSandboxTracePropagation into createAgentRunner, preserving the
per-run override precedence. Remove the silent production-impl fallbacks
from the dispatch, local, and continuation runners: the runner is now a
required dependency wired by app/, handlers, and the CLI, so queue and
worker paths no longer reach the executor implicitly.

Refs #746
Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ReplyRequestContext was a flat ~35-field bag where nearly every field was
optional, so the type could not express which combinations were valid and
call sites gave no hint of a field's role. Group the request into input,
routing, policy, state, observers, and durability sub-objects, and make
AgentRunner.run take the single grouped request (messageText now lives in
input).

The executor body keeps operating on the historical flat shape via a
private flatten step; the flat type is an unexported intersection of the
group interfaces, so no field is declared twice. The internal rewrite is
deferred to the phase decomposition (#746 Phase 5), which consumes the
groups directly.

No behavior change, no field optionality changes; runtime destination and
requester invariant checks are unchanged.

Refs #746
Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Hard cutover of the executor boundary from reply/respond vocabulary to the
run/slice vocabulary that specs/terminology.md canonicalizes:

- generateAssistantReply -> executeAgentRun
- AssistantReply -> AgentRunResult
- ReplyRequestContext -> AgentRunRequest (group interfaces follow suit)
- ReplySteeringMessage -> AgentRunSteeringMessage
- respond.ts -> agent-run.ts, respond-helpers.ts -> agent-run-helpers.ts
- AgentRunOutcome completed/failed variants carry result, not reply

The module stays at the chat root because it composes tools, plugins, MCP,
sandbox, and capabilities, which runtime/ modules may not depend on under
the dependency-cruiser rules. The AssistantReplyRequestContext compat alias
and re-export layer are deleted.

Amend specs/terminology.md to govern reply: reserved for destination-visible
messages owned by delivery and reply-policy layers; executor-boundary names
use run vocabulary. Update spec references (chat-architecture rule 5 and
data flow, agent-prompt, context-compaction, task-execution, local-agent,
plugin-dispatch, harness-agent) to the new names.

Refs #746
Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
executeAgentRunInPrivacyContext was ~1,450 lines with a dozen mutable
closure variables that existed so the catch block could persist yield,
timeout, and auth-pause continuations. Split it into a chat/agent-run/
feature directory:

- checkpointer.ts owns safe-boundary persistence, durable input
  checkpointing, resume snapshots, and translates expected run endings
  into AgentRunOutcome values; the executor catch block is now a thin
  translation plus genuine-error guards
- tools.ts wires sandbox, MCP/plugin auth orchestration, and agent tools
  through an explicit params/return contract instead of closure capture
- prompt.ts owns prompt input, history trimming, and telemetry message
  assembly; request.ts holds the public request group types
- session-restore.ts, skills.ts, sandbox-workspace.ts, events.ts carry
  the remaining phases

agent-run.ts remains the composition root and execution loop (the
provider retry loop stays inline to preserve abort-settlement, usage,
and span ordering). The private flatten shim from the request-group
split is dissolved; phases consume the groups directly. No behavior
change; no consumer import paths changed.

Refs #746
Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@vercel

vercel Bot commented Jul 3, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
junior-docs Ready Ready Preview, Comment Jul 3, 2026 4:12am

Request Review

@dcramer

dcramer commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

this is too complex in some areas. it started as good intent but im not even sure i can review this as a human. will try to skim and feed in some pointed criticisms and see if gpt can clean it up. this was fable with gpt as subagents

dcramer added a commit that referenced this pull request Jul 4, 2026
The chat runtime now passes agent execution through a small
`AgentRunner` interface instead of handing around the full
reply-generator function signature. Slack replies, Slack resumes, local
turns, durable dispatch, continuation, handlers, and CLI composition
roots all receive an explicit runner dependency.

This pulls the second commit from #748 onto `main` after #750. The
conflict resolution keeps #750's minimal three-status `AgentRunOutcome`
contract, preserves sandbox trace propagation precedence in
`createAgentRunner`, and removes silent production fallback paths so
queue and worker code use the runner wired by their composition
boundary.

Refs #746

---------

Co-authored-by: GPT-5.5 Codex <noreply@openai.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Align chat runtime execution boundary with run/slice terminology and outcome-based contract

1 participant