ref(chat): Align agent execution boundary with run/slice terminology and outcome contract#748
Draft
dcramer wants to merge 5 commits into
Draft
ref(chat): Align agent execution boundary with run/slice terminology and outcome contract#748dcramer wants to merge 5 commits into
dcramer wants to merge 5 commits into
Conversation
generateAssistantReply previously signaled expected run endings — cooperative yield, continuable timeout, and auth pause — by throwing CooperativeTurnYieldError and RetryableTurnError, forcing every caller to catch and type-sniff control flow. Introduce a discriminated AgentRunOutcome union (completed | failed | yielded | timed_out | awaiting_auth) returned as a value, and migrate the Slack reply executor, Slack resume, agent dispatch, and local runners to switch on it. Genuine errors (lost input commits, disabled auth flows, provider retry errors, pre-commit failures) remain thrown. Outer boundaries that still require the historical errors (queue worker yield routing, OAuth resume callbacks) now construct them from the outcome value instead of catching them from the executor. Session-record persistence for yield, timeout, and auth pause is unchanged. Adds a component test pinning that a yielded slice stays resumable: no failure persistence, no canvas-recovery post, and requeued mailbox work. Refs #746 Co-Authored-By: GPT-5.5 Codex <noreply@openai.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Five call sites each declared their own `generateReply?: typeof generateAssistantReply` injection slot, so every consumer and test double carried the executor's full context-bag signature. Introduce a small consumer-owned AgentRunner interface (run(messageText, context) -> AgentRunOutcome) and construct it in composition roots, per the chat architecture spec's service interface rules. Fold withSandboxTracePropagation into createAgentRunner, preserving the per-run override precedence. Remove the silent production-impl fallbacks from the dispatch, local, and continuation runners: the runner is now a required dependency wired by app/, handlers, and the CLI, so queue and worker paths no longer reach the executor implicitly. Refs #746 Co-Authored-By: GPT-5.5 Codex <noreply@openai.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
ReplyRequestContext was a flat ~35-field bag where nearly every field was optional, so the type could not express which combinations were valid and call sites gave no hint of a field's role. Group the request into input, routing, policy, state, observers, and durability sub-objects, and make AgentRunner.run take the single grouped request (messageText now lives in input). The executor body keeps operating on the historical flat shape via a private flatten step; the flat type is an unexported intersection of the group interfaces, so no field is declared twice. The internal rewrite is deferred to the phase decomposition (#746 Phase 5), which consumes the groups directly. No behavior change, no field optionality changes; runtime destination and requester invariant checks are unchanged. Refs #746 Co-Authored-By: GPT-5.5 Codex <noreply@openai.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Hard cutover of the executor boundary from reply/respond vocabulary to the run/slice vocabulary that specs/terminology.md canonicalizes: - generateAssistantReply -> executeAgentRun - AssistantReply -> AgentRunResult - ReplyRequestContext -> AgentRunRequest (group interfaces follow suit) - ReplySteeringMessage -> AgentRunSteeringMessage - respond.ts -> agent-run.ts, respond-helpers.ts -> agent-run-helpers.ts - AgentRunOutcome completed/failed variants carry result, not reply The module stays at the chat root because it composes tools, plugins, MCP, sandbox, and capabilities, which runtime/ modules may not depend on under the dependency-cruiser rules. The AssistantReplyRequestContext compat alias and re-export layer are deleted. Amend specs/terminology.md to govern reply: reserved for destination-visible messages owned by delivery and reply-policy layers; executor-boundary names use run vocabulary. Update spec references (chat-architecture rule 5 and data flow, agent-prompt, context-compaction, task-execution, local-agent, plugin-dispatch, harness-agent) to the new names. Refs #746 Co-Authored-By: GPT-5.5 Codex <noreply@openai.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
executeAgentRunInPrivacyContext was ~1,450 lines with a dozen mutable closure variables that existed so the catch block could persist yield, timeout, and auth-pause continuations. Split it into a chat/agent-run/ feature directory: - checkpointer.ts owns safe-boundary persistence, durable input checkpointing, resume snapshots, and translates expected run endings into AgentRunOutcome values; the executor catch block is now a thin translation plus genuine-error guards - tools.ts wires sandbox, MCP/plugin auth orchestration, and agent tools through an explicit params/return contract instead of closure capture - prompt.ts owns prompt input, history trimming, and telemetry message assembly; request.ts holds the public request group types - session-restore.ts, skills.ts, sandbox-workspace.ts, events.ts carry the remaining phases agent-run.ts remains the composition root and execution loop (the provider retry loop stays inline to preserve abort-settlement, usage, and span ordering). The private flatten shim from the request-group split is dissolved; phases consume the groups directly. No behavior change; no consumer import paths changed. Refs #746 Co-Authored-By: GPT-5.5 Codex <noreply@openai.com> Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Member
Author
|
this is too complex in some areas. it started as good intent but im not even sure i can review this as a human. will try to skim and feed in some pointed criticisms and see if gpt can clean it up. this was fable with gpt as subagents |
dcramer
added a commit
that referenced
this pull request
Jul 4, 2026
The chat runtime now passes agent execution through a small `AgentRunner` interface instead of handing around the full reply-generator function signature. Slack replies, Slack resumes, local turns, durable dispatch, continuation, handlers, and CLI composition roots all receive an explicit runner dependency. This pulls the second commit from #748 onto `main` after #750. The conflict resolution keeps #750's minimal three-status `AgentRunOutcome` contract, preserves sandbox trace propagation precedence in `createAgentRunner`, and removes silent production fallback paths so queue and worker code use the runner wired by their composition boundary. Refs #746 --------- Co-authored-by: GPT-5.5 Codex <noreply@openai.com> Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The chat executor previously signaled expected run endings — cooperative yield, continuable timeout, auth pause — by throwing
CooperativeTurnYieldErrorandRetryableTurnError, so every caller caught and type-sniffed control flow, and the boundary mixed three vocabularies (reply/respond, legacyturn, and the spec-canonical run/slice). This lands the full contract cleanup from #746 in five commits, one per phase:executeAgentRun(négenerateAssistantReply) returns a discriminatedAgentRunOutcome(completed | failed | yielded | timed_out | awaiting_auth); genuine errors still throw. Outer boundaries that need the historical errors (queue worker yield routing, OAuth resume callbacks) construct them from the outcome instead of catching them.AgentRunnerinterface replaces fivetypeof-function injection slots; composition roots build it, and the silent production-impl fallbacks in dispatch/local/continuation runners are gone.input/routing/policy/state/observers/durability), soAgentRunner.runtakes one grouped request.respond.ts→agent-run.ts, no compat aliases), with spec references updated andspecs/terminology.mdnow governingreply(reserved for destination-visible messages owned by delivery/policy layers).chat/agent-run/feature directory — the slice checkpointer owns resume/persistence state and translates expected endings into outcomes, so the catch block is a thin translation; tool wiring and prompt assembly are explicit-contract phase modules.Behavior is unchanged throughout: persistence ordering, telemetry keys, delivery semantics, and the historical
turn-named identifiers (AgentTurnSessionRecord,RetryableTurnError, …) are preserved per the terminology spec's grandfathering rules. A new component test pins the highest-risk invariant: a yielded slice must stay resumable — no failure persistence, no canvas-recovery hijack, mailbox requeued.Review order: commits are independently green and reviewable in sequence; the rename commit (
bffd9b70) is mechanical, and the decomposition commit (e0f64f8e) moves code without changing consumer import paths.Fixes #746
🤖 Generated with Claude Code