Add sandbox-coding-agent example: Think orchestrating Claude Code in containers #1830
Merged
… Code in containers

A Think agent that orchestrates Claude Code coding agents, each running in its own Cloudflare Sandbox container, with live progress and diffs streamed into the chat.

Highlights:
- Agents-as-tools: the orchestrator delegates each task to a ClaudeCodeAgent facet (one isolated container per run via getSandbox + a derived sandbox id); delegate_parallel fans out across containers.
- Zero-token AI Gateway: the Sandbox subclass intercepts the container's api.anthropic.com egress (outboundByHost + interceptHttps) and routes it through env.AI.gateway(), so no Anthropic key or AI Gateway token lives in the container, only a plaintext GATEWAY_ID. Requires exporting the SDK's ContainerProxy from the Worker entry.
- Claude Code runs headless as root with IS_SANDBOX=1; its stream-json is mapped to AI SDK UIMessage chunks and stderr/exit/result errors are surfaced.
- beforeTurn restricts the orchestrator to its delegation tools so it can't wander into Think's built-in workspace tools.

Pins @cloudflare/sandbox to 0.12.1 (the 0.12.2 image failed to publish to Docker Hub, cloudflare/sandbox-sdk#792).

README documents the durability/recovery model (three lifecycles, the ephemeral container disk) and the deferred backup/restore + harness-migration upgrade paths (#1829).

Co-authored-by: Cursor <cursoragent@cursor.com>
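The stream-json mapping described above can be illustrated with a minimal sketch. It is not the example's actual mapper: the event and chunk types below are simplified assumptions modeled loosely on Claude Code's stream-json output and the AI SDK's UI message chunks, and `mapLine` and `mapExit` are hypothetical names. A real mapper would also emit the start/end chunks that frame each text part.

```typescript
// Simplified, ASSUMED shapes. Real Claude Code events and AI SDK chunks carry more fields.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type CliEvent =
  | { type: "assistant"; message: { content: ContentBlock[] } }
  | { type: "result"; is_error: boolean; result?: string };

type Chunk =
  | { type: "text-delta"; id: string; delta: string }
  | { type: "tool-input-available"; toolCallId: string; toolName: string; input: unknown }
  | { type: "error"; errorText: string };

// Hypothetical helper: turn one NDJSON line from the CLI into zero or more chunks.
function mapLine(line: string, textId: string): Chunk[] {
  const trimmed = line.trim();
  if (trimmed === "") return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch (err) {
    // Surface garbage instead of dropping it silently.
    return [{ type: "error", errorText: "unparseable stream-json line: " + trimmed.slice(0, 80) }];
  }
  if (typeof parsed !== "object" || parsed === null) return [];

  const event = parsed as CliEvent;
  const chunks: Chunk[] = [];
  if (event.type === "assistant") {
    for (const block of event.message.content) {
      if (block.type === "text") {
        chunks.push({ type: "text-delta", id: textId, delta: block.text });
      } else if (block.type === "tool_use") {
        chunks.push({
          type: "tool-input-available",
          toolCallId: block.id,
          toolName: block.name,
          input: block.input,
        });
      }
    }
  } else if (event.type === "result" && event.is_error) {
    chunks.push({ type: "error", errorText: event.result ?? "Claude Code reported an error" });
  }
  // Any other event type (for example an init event) is ignored in this sketch.
  return chunks;
}

// Hypothetical helper: a non-zero exit becomes an error chunk that includes stderr.
function mapExit(exitCode: number, stderr: string): Chunk[] {
  if (exitCode === 0) return [];
  const detail = stderr.trim() || "no stderr output";
  return [{ type: "error", errorText: `claude exited with code ${exitCode}: ${detail}` }];
}
```

Routing parse failures, error results, and non-zero exits through the same `error` chunk is one way to make a failed run visible in the chat rather than indistinguishable from a run that changed nothing.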
threepointone added a commit that referenced this pull request on Jun 28, 2026
* docs(design): add rfc-coding-agent (first-class CodingAgent for Think)

  Proposes promoting examples/sandbox-coding-agent (#1830) into @cloudflare/think as a supported `CodingAgent` class: a Think subclass that drives a CLI coding agent (Claude Code first) inside a Cloudflare Sandbox, exported per-CLI as `@cloudflare/think/claudecode`.

  Locks the surface before any core code moves: an internal TurnRuntime seam in Think (private, CodingAgent is its only consumer), a per-CLI adapter contract, tokenless AI Gateway egress + snapshot-based durability built in, DO-tuned recovery, and a conformance-test strategy for stream-json drift.

  Strategic stance: own the public interface, keep the engine swappable (a matured @ai-sdk/harness could become an impl detail behind the same class, see #1829).

  Status: proposed.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): expand rfc-coding-agent with config, threads, and seams

  Folds in the firm follow-ups from review:
  - §8 dynamic config (resolve-by-precedence, freeze-on-first-turn) + topology (standalone / Chats-child threads / orchestrated), with the "no top-level binding assumption" requirement.
  - §9 two seams designed in from day one: a filesystem backend interface (so the durable cloudflare/workspace VFS can supersede snapshots later) and run/preview by target (container dev server vs worker-bundler + env.LOADER).
  - §10 future-work pointer to a Workers-native runtime (Runtime B) behind the same TurnRuntime seam.
  - New alternative (adopt cloudflare/workspace now: rejected, preview-only) and expanded decision questions.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): drop first-class Chats; threads are a userland directory pattern

  The fixed chats_index/ChatSummary schema is the part consumers outgrow immediately: a coding directory needs repo/branch/status/lastDiff, etc. So:
  - rfc-coding-agent §8: reframe "threads" as a userland directory pattern (plain Agent + subAgent with domain-specific metadata), not a `Chats` base class. The shipped, load-bearing primitives it leans on are unchanged (subAgent + Props, parentAgent, RemoteContextProvider).
  - rfc-think-multi-session: record the third (now-leaning) answer to open question #1: don't ship a Chats base class; ship primitives + a thin client hook + an example.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): rewrite rfc-coding-agent — own package, AIChatAgent base, pluggable engine

  Adversarial review reshaped the design:
  - New @cloudflare/coding-agent package, NOT a Think subclass/subpath. Extends AIChatAgent, so onChatMessage is the seam and Think core is untouched (drops the riskiest piece, the turn-runtime refactor). Honors the AGENTS.md layering preference (containers don't belong in the chat base).
  - Pluggable engine: CliEngine ships first (lift the example's mapper), HarnessEngine is the goal (reuses HarnessAgent's tested stream-mapping + session lifecycle; gated on #1829). No speculative multi-CLI adapter interface; extract after codex.
  - Durability redesigned around two decoupled lifecycles (DO vs container have different shutdown behaviors); reconcile-on-wake; honest that claude -p can't resume a killed turn and that re-run can double-apply edits; bound snapshot cost.
  - Egress scoped honestly (per-provider, TLS-dependent, OAuth CLIs can't be tokenless).
  - Branch is mutable working state (git checkout), only repo identity is frozen.
  - Filesystem VFS, preview, git ops, HITL, and Runtime B moved to "Directions" (not v1 seams). Added a Testing & CI section.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): mark package name resolved in rfc-coding-agent decision

  Package name @cloudflare/coding-agent + /claude-code subpath confirmed. Engine default, snapshot policy, and first-PR scope remain deliberately open.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Sunil Pai <18808+threepointone@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
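The "resolve-by-precedence, freeze-on-first-turn" config idea named in the RFC commits could look roughly like the sketch below. The RFC text itself is not reproduced here, so the semantics are an assumption: later override layers win over earlier ones and over the defaults, and the resolved result is frozen when the first turn begins. `resolveByPrecedence`, `TurnConfig`, `applyOverride`, and `beginTurn` are hypothetical names, not the package's API.

```typescript
// Hypothetical: merge override layers over defaults; later layers win,
// and undefined values never clobber an earlier value.
function resolveByPrecedence<T extends object>(defaults: T, layers: Array<Partial<T>>): T {
  const resolved: T = { ...defaults };
  for (const layer of layers) {
    for (const key of Object.keys(layer) as Array<keyof T>) {
      const value = layer[key];
      if (value !== undefined) {
        resolved[key] = value as T[keyof T];
      }
    }
  }
  return resolved;
}

// Hypothetical: accepts overrides until the first turn, then serves one frozen snapshot.
class TurnConfig<T extends object> {
  private frozen: Readonly<T> | undefined = undefined;
  private pending: Array<Partial<T>> = [];
  private defaults: T;

  constructor(defaults: T) {
    this.defaults = defaults;
  }

  // Returns false (and ignores the layer) once the config has been frozen.
  applyOverride(layer: Partial<T>): boolean {
    if (this.frozen !== undefined) return false;
    this.pending.push(layer);
    return true;
  }

  // The first call resolves and freezes; every later call returns the same object.
  beginTurn(): Readonly<T> {
    if (this.frozen === undefined) {
      this.frozen = Object.freeze(resolveByPrecedence(this.defaults, this.pending));
    }
    return this.frozen;
  }
}

// Usage: the model override lands before the first turn; the repo change arrives too late.
const config = new TurnConfig({ repo: "org/app", model: "default-model" });
config.applyOverride({ model: "override-model" });
const firstTurn = config.beginTurn();
const accepted = config.applyOverride({ repo: "other/repo" });
```

The later rewrite notes that only repo identity is frozen while the branch stays mutable; under that design, a sketch like this would cover only the frozen subset of the configuration.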
Summary
A new example (`examples/sandbox-coding-agent`) demonstrating Think as an orchestrator over containerized coding agents: the Cloudflare-native take on the "agent harness" pattern, one level up. You chat with a `CodingOrchestrator` (Think); it delegates each concrete coding task to a Claude Code sub-agent running in its own Cloudflare Sandbox container, and streams each sub-agent's narration, tool calls, and final git diff back into the chat.

- The orchestrator delegates through `agentTool(ClaudeCodeAgent, …)`; each delegated task runs as a facet with its own isolated container (`getSandbox` + a hashed, DNS-safe sandbox id). `delegate_parallel` fans out across containers via `runAgentTool`.
- The `Sandbox` subclass intercepts the container's `api.anthropic.com` egress (`outboundByHost` + `interceptHttps`) and forwards it through `env.AI.gateway()`, which authenticates via your Cloudflare account. No Anthropic key or AI Gateway token lives in the container, only a plaintext `GATEWAY_ID`. (Requires exporting the SDK's `ContainerProxy` from the Worker entry.)
- Each sub-agent runs `claude -p` headless inside its container and maps `stream-json` to AI SDK `UIMessage` chunks.
- `beforeTurn` restricts the orchestrator to its delegation tools so it can't wander into Think's built-in workspace tools.
- CLI stderr/exit/result errors are surfaced into the delegate panel instead of silently showing "no changes".
- The README documents the durability/recovery model (three lifecycles, the ephemeral container disk, `sleepAfter`) with deferred upgrade paths (Sandbox backup/restore; harness-based mid-turn continuity, see #1829, "Add an @ai-sdk/sandbox-cloudflare HarnessAgent provider (during ai v7 migration)").

Pins `@cloudflare/sandbox` to `0.12.1`, because the `0.12.2` image failed to publish to Docker Hub.

Examples don't require a changeset.
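As an illustration of the "hashed, DNS-safe sandbox id" mentioned in the summary, here is a minimal sketch of one way to derive such an id from a parent agent id and a run id. It is not the example's implementation: `deriveSandboxId` and `shortHash` are hypothetical names, the `sb-` prefix is invented, and the hash is a simple non-cryptographic djb2-style stand-in for whatever the example actually uses.

```typescript
// Hypothetical djb2-style (xor variant) hash over UTF-16 code units, as 8 hex digits.
// Non-cryptographic; used here only to get a short deterministic suffix.
function shortHash(input: string): string {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    // hash stays below 2^32, so hash * 33 is exact in a double before the xor truncates it.
    hash = ((hash * 33) ^ input.charCodeAt(i)) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Hypothetical: build an id that is a valid DNS label (lowercase letters, digits,
// hyphens; alphanumeric at both ends; at most 63 characters).
function deriveSandboxId(parentId: string, runId: string): string {
  const slug = parentId
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything unsafe into single hyphens
    .replace(/^-+|-+$/g, "")     // no leading or trailing hyphen
    .slice(0, 40)                // leave room for the prefix and the hash
    .replace(/-+$/g, "");        // slicing may expose a trailing hyphen again
  const digest = shortHash(parentId + ":" + runId);
  return slug === "" ? "sb-" + digest : "sb-" + slug + "-" + digest;
}

// Usage: the same inputs always map to the same id; a new run id gives a new container id.
const idA = deriveSandboxId("Chat_Session #42!!", "run-1");
const idB = deriveSandboxId("Chat_Session #42!!", "run-2");
```

Keeping a readable slug next to the hash makes container ids easier to recognize in logs, while the hash keeps ids distinct when different inputs sanitize to the same slug.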
Test plan
`pnpm run check` (sherif + export checks + oxfmt + oxlint + typecheck; all 114 projects typecheck)

Made with Cursor