ref(chat): Align agent execution boundary with run/slice terminology and outcome contract by dcramer · Pull Request #748 · getsentry/junior

dcramer · 2026-07-03T04:11:24Z

The chat executor previously signaled expected run endings — cooperative yield, continuable timeout, auth pause — by throwing CooperativeTurnYieldError and RetryableTurnError, so every caller caught and type-sniffed control flow, and the boundary mixed three vocabularies (reply/respond, legacy turn, and the spec-canonical run/slice). This lands the full contract cleanup from #746 in five commits, one per phase:

executeAgentRun (né generateAssistantReply) returns a discriminated AgentRunOutcome (completed | failed | yielded | timed_out | awaiting_auth); genuine errors still throw. Outer boundaries that need the historical errors (queue worker yield routing, OAuth resume callbacks) construct them from the outcome instead of catching them.
A small AgentRunner interface replaces five typeof-function injection slots; composition roots build it, and the silent production-impl fallbacks in dispatch/local/continuation runners are gone.
The flat ~35-field request bag becomes six role groups (input/routing/policy/state/observers/durability), so AgentRunner.run takes one grouped request.
Hard-cutover rename to run vocabulary (respond.ts → agent-run.ts, no compat aliases), with spec references updated and specs/terminology.md now governing reply (reserved for destination-visible messages owned by delivery/policy layers).
The ~1,450-line executor body decomposes into a chat/agent-run/ feature directory — the slice checkpointer owns resume/persistence state and translates expected endings into outcomes, so the catch block is a thin translation; tool wiring and prompt assembly are explicit-contract phase modules.

Behavior is unchanged throughout: persistence ordering, telemetry keys, delivery semantics, and the historical turn-named identifiers (AgentTurnSessionRecord, RetryableTurnError, …) are preserved per the terminology spec's grandfathering rules. A new component test pins the highest-risk invariant: a yielded slice must stay resumable — no failure persistence, no canvas-recovery hijack, mailbox requeued.

Review order: commits are independently green and reviewable in sequence; the rename commit (bffd9b70) is mechanical, and the decomposition commit (e0f64f8e) moves code without changing consumer import paths.

Fixes #746

🤖 Generated with Claude Code

generateAssistantReply previously signaled expected run endings (cooperative yield, continuable timeout, and auth pause) by throwing CooperativeTurnYieldError and RetryableTurnError, forcing every caller to catch and type-sniff control flow.

Introduce a discriminated AgentRunOutcome union (completed | failed | yielded | timed_out | awaiting_auth) returned as a value, and migrate the Slack reply executor, Slack resume, agent dispatch, and local runners to switch on it. Genuine errors (lost input commits, disabled auth flows, provider retry errors, pre-commit failures) remain thrown. Outer boundaries that still require the historical errors (queue worker yield routing, OAuth resume callbacks) now construct them from the outcome value instead of catching them from the executor.

Session-record persistence for yield, timeout, and auth pause is unchanged. Adds a component test pinning that a yielded slice stays resumable: no failure persistence, no canvas-recovery post, and requeued mailbox work.

Refs #746

Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
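To make the outcome contract concrete, here is a minimal sketch of a discriminated union with the five status names from this commit, plus a caller that switches on it. Everything other than the status names is an assumption for illustration: the payload fields (`result`, `provider`), the `nextAction` and `legacyErrorFor` helpers, and `LegacyYieldError`, which only stands in for the historical error class and is not the real one.

```typescript
// Hypothetical result payload; the real result type carries more than text.
interface AgentRunResult {
  text: string;
}

// Only the five status names come from the PR; the payload fields are assumed.
type AgentRunOutcome =
  | { status: "completed"; result: AgentRunResult }
  | { status: "failed"; result: AgentRunResult }
  | { status: "yielded" }
  | { status: "timed_out" }
  | { status: "awaiting_auth"; provider: string };

// A caller switches on the value instead of catching control-flow errors.
function nextAction(outcome: AgentRunOutcome): string {
  switch (outcome.status) {
    case "completed":
      return `deliver: ${outcome.result.text}`;
    case "failed":
      return `report failure: ${outcome.result.text}`;
    case "yielded":
      return "requeue mailbox work";
    case "timed_out":
      return "schedule continuation";
    case "awaiting_auth":
      return `wait for ${outcome.provider} authorization`;
    default: {
      // Compile-time exhaustiveness check: a new status fails to type-check here.
      const unhandled: never = outcome;
      throw new Error(`unhandled outcome: ${JSON.stringify(unhandled)}`);
    }
  }
}

// Stand-in for the historical yield error (not the real class).
class LegacyYieldError extends Error {}

// An outer boundary that still needs the old error builds it from the outcome.
function legacyErrorFor(outcome: AgentRunOutcome): Error | undefined {
  return outcome.status === "yielded"
    ? new LegacyYieldError("run yielded")
    : undefined;
}
```

The `never` assignment in the default branch is what replaces type-sniffing: adding a sixth status would break compilation at every switch until it is handled.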

Five call sites each declared their own `generateReply?: typeof generateAssistantReply` injection slot, so every consumer and test double carried the executor's full context-bag signature.

Introduce a small consumer-owned AgentRunner interface (run(messageText, context) -> AgentRunOutcome) and construct it in composition roots, per the chat architecture spec's service interface rules. Fold withSandboxTracePropagation into createAgentRunner, preserving the per-run override precedence.

Remove the silent production-impl fallbacks from the dispatch, local, and continuation runners: the runner is now a required dependency wired by app/, handlers, and the CLI, so queue and worker paths no longer reach the executor implicitly.

Refs #746

Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
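A rough sketch of the dependency shape this commit describes: a narrow runner interface that consumers require, with no default implementation to fall back on. The `run(messageText, context)` signature follows the commit message; the context fields, the simplified two-status outcome, `createLocalTurnHandler`, and `echoRunner` are hypothetical and exist only for illustration.

```typescript
// Simplified outcome for the sketch; the real union has more variants.
type AgentRunOutcome =
  | { status: "completed"; text: string }
  | { status: "yielded" };

// Hypothetical context; the real one is far larger.
interface AgentRunContext {
  threadId: string;
}

// The consumer-owned interface: one method, no executor signature leaking out.
interface AgentRunner {
  run(messageText: string, context: AgentRunContext): Promise<AgentRunOutcome>;
}

// A consumer takes the runner as a required dependency. There is no
// "deps.runner ?? productionRunner" fallback, so wiring mistakes surface
// at the composition root instead of silently reaching production code.
function createLocalTurnHandler(deps: { runner: AgentRunner }) {
  return async (text: string, threadId: string): Promise<string> => {
    const outcome = await deps.runner.run(text, { threadId });
    return outcome.status === "completed" ? outcome.text : "(paused)";
  };
}

// A test double only has to satisfy the small interface.
const echoRunner: AgentRunner = {
  async run(messageText: string): Promise<AgentRunOutcome> {
    return { status: "completed", text: `echo: ${messageText}` };
  },
};
```

A composition root would then call something like `createLocalTurnHandler({ runner: echoRunner })` in tests and pass the production runner in the application entry point.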

ReplyRequestContext was a flat ~35-field bag where nearly every field was optional, so the type could not express which combinations were valid and call sites gave no hint of a field's role.

Group the request into input, routing, policy, state, observers, and durability sub-objects, and make AgentRunner.run take the single grouped request (messageText now lives in input).

The executor body keeps operating on the historical flat shape via a private flatten step; the flat type is an unexported intersection of the group interfaces, so no field is declared twice. The internal rewrite is deferred to the phase decomposition (#746 Phase 5), which consumes the groups directly.

No behavior change, no field optionality changes; runtime destination and requester invariant checks are unchanged.

Refs #746

Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
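The grouping and the private flatten step could look roughly like the sketch below. The six group names and `messageText` come from the commit message; every other field name, the interface names, and the `flatten` helper are assumptions made up for illustration.

```typescript
// One hypothetical field per group; the real groups hold ~35 fields in total.
interface InputGroup {
  messageText: string;
}
interface RoutingGroup {
  channelId?: string;
}
interface PolicyGroup {
  maxSteps?: number;
}
interface StateGroup {
  sessionId?: string;
}
interface ObserversGroup {
  onStatus?: (status: string) => void;
}
interface DurabilityGroup {
  checkpointKey?: string;
}

// The grouped request that callers build.
interface GroupedRequest {
  input: InputGroup;
  routing: RoutingGroup;
  policy: PolicyGroup;
  state: StateGroup;
  observers: ObserversGroup;
  durability: DurabilityGroup;
}

// The flat shape is derived as an intersection, so no field is declared twice.
type FlatRequest = InputGroup &
  RoutingGroup &
  PolicyGroup &
  StateGroup &
  ObserversGroup &
  DurabilityGroup;

// Private shim: lets an existing body keep reading the flat shape.
function flatten(request: GroupedRequest): FlatRequest {
  return {
    ...request.input,
    ...request.routing,
    ...request.policy,
    ...request.state,
    ...request.observers,
    ...request.durability,
  };
}
```

Because the flat type is computed from the groups, moving a field between groups changes one interface and nothing else, which is what keeps a shim like this safe to delete later.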

Hard cutover of the executor boundary from reply/respond vocabulary to the run/slice vocabulary that specs/terminology.md canonicalizes:

- generateAssistantReply -> executeAgentRun
- AssistantReply -> AgentRunResult
- ReplyRequestContext -> AgentRunRequest (group interfaces follow suit)
- ReplySteeringMessage -> AgentRunSteeringMessage
- respond.ts -> agent-run.ts, respond-helpers.ts -> agent-run-helpers.ts
- AgentRunOutcome completed/failed variants carry result, not reply

The module stays at the chat root because it composes tools, plugins, MCP, sandbox, and capabilities, which runtime/ modules may not depend on under the dependency-cruiser rules. The AssistantReplyRequestContext compat alias and re-export layer are deleted.

Amend specs/terminology.md to govern reply: reserved for destination-visible messages owned by delivery and reply-policy layers; executor-boundary names use run vocabulary. Update spec references (chat-architecture rule 5 and data flow, agent-prompt, context-compaction, task-execution, local-agent, plugin-dispatch, harness-agent) to the new names.

Refs #746

Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

executeAgentRunInPrivacyContext was ~1,450 lines with a dozen mutable closure variables that existed so the catch block could persist yield, timeout, and auth-pause continuations. Split it into a chat/agent-run/ feature directory:

- checkpointer.ts owns safe-boundary persistence, durable input checkpointing, resume snapshots, and translates expected run endings into AgentRunOutcome values; the executor catch block is now a thin translation plus genuine-error guards
- tools.ts wires sandbox, MCP/plugin auth orchestration, and agent tools through an explicit params/return contract instead of closure capture
- prompt.ts owns prompt input, history trimming, and telemetry message assembly; request.ts holds the public request group types
- session-restore.ts, skills.ts, sandbox-workspace.ts, events.ts carry the remaining phases

agent-run.ts remains the composition root and execution loop (the provider retry loop stays inline to preserve abort-settlement, usage, and span ordering). The private flatten shim from the request-group split is dissolved; phases consume the groups directly.

No behavior change; no consumer import paths changed.

Refs #746

Co-Authored-By: GPT-5.5 Codex <noreply@openai.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
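As an illustration of the "thin translation" catch block, the sketch below has a checkpointer object own the resume state and map expected endings to outcomes, while anything it does not recognize is rethrown as a genuine error. This is not the code from checkpointer.ts: `ExpectedEnding`, `SliceCheckpointer`, `endingKind`, `executeRun`, and the in-memory `persisted` array are all hypothetical names standing in for the real signals and persistence.

```typescript
// Simplified outcome for the sketch.
type AgentRunOutcome =
  | { status: "completed"; text: string }
  | { status: "yielded" }
  | { status: "timed_out" }
  | { status: "awaiting_auth" };

type EndingKind = "yield" | "timeout" | "auth";

// Hypothetical internal signal for an expected ending (not a real class).
class ExpectedEnding extends Error {
  readonly kind: EndingKind;
  constructor(kind: EndingKind) {
    super(`expected ending: ${kind}`);
    this.kind = kind;
  }
}

// Structural check: returns the ending kind, or undefined for genuine errors.
function endingKind(error: unknown): EndingKind | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const kind = (error as { kind?: unknown }).kind;
  return kind === "yield" || kind === "timeout" || kind === "auth"
    ? kind
    : undefined;
}

// Owns resume state, so the executor needs no mutable closure variables.
class SliceCheckpointer {
  readonly persisted: EndingKind[] = [];

  translate(error: unknown): AgentRunOutcome | undefined {
    const kind = endingKind(error);
    if (kind === undefined) return undefined;
    this.persisted.push(kind); // stand-in for persisting a resumable record
    switch (kind) {
      case "yield":
        return { status: "yielded" };
      case "timeout":
        return { status: "timed_out" };
      case "auth":
        return { status: "awaiting_auth" };
    }
  }
}

async function executeRun(
  step: () => Promise<string>,
  checkpointer: SliceCheckpointer,
): Promise<AgentRunOutcome> {
  try {
    return { status: "completed", text: await step() };
  } catch (error) {
    // Thin translation: expected endings become values, the rest rethrows.
    const outcome = checkpointer.translate(error);
    if (outcome) return outcome;
    throw error;
  }
}
```

The design point is that the catch block holds no state of its own; whatever must survive a yield, timeout, or auth pause lives in the checkpointer.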

vercel · 2026-07-03T04:11:29Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
junior-docs	Ready	Preview, Comment	Jul 3, 2026 4:12am

dcramer · 2026-07-03T06:45:08Z

this is too complex in some areas. it started as good intent but im not even sure i can review this as a human. will try to skim and feed in some pointed criticisms and see if gpt can clean it up. this was fable with gpt as subagents

The chat runtime now passes agent execution through a small `AgentRunner` interface instead of handing around the full reply-generator function signature. Slack replies, Slack resumes, local turns, durable dispatch, continuation, handlers, and CLI composition roots all receive an explicit runner dependency.

This pulls the second commit from #748 onto `main` after #750. The conflict resolution keeps #750's minimal three-status `AgentRunOutcome` contract, preserves sandbox trace propagation precedence in `createAgentRunner`, and removes silent production fallback paths so queue and worker code use the runner wired by their composition boundary.

Refs #746

---------

Co-authored-by: GPT-5.5 Codex <noreply@openai.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

dcramer and others added 5 commits July 2, 2026 17:39

vercel Bot deployed to Preview – junior-docs July 3, 2026 04:12 View deployment

dcramer mentioned this pull request Jul 3, 2026

ref(chat): Replace reply-generator slots with AgentRunner #751

Merged

dcramer mentioned this pull request Jul 4, 2026

ref(chat): Split agent-run request context into role-scoped groups #752

Open

dcramer wants to merge 5 commits into main from agent-run-outcome-model
