Add sandbox-coding-agent example: Think orchestrating Claude Code in containers by threepointone · Pull Request #1830 · cloudflare/agents

threepointone · 2026-06-28T20:08:01Z

Summary

A new example (examples/sandbox-coding-agent) demonstrating Think as an orchestrator over containerized coding agents — the Cloudflare-native take on the "agent harness" pattern, one level up. You chat with a CodingOrchestrator (Think); it delegates each concrete coding task to a Claude Code sub-agent running in its own Cloudflare Sandbox container, and streams each sub-agent's narration, tool calls, and final git diff back into the chat.

Agents-as-tools / sub-agents. The orchestrator exposes agentTool(ClaudeCodeAgent, …); each delegated task runs as a facet with its own isolated container (getSandbox + a hashed, DNS-safe sandbox id). delegate_parallel fans out across containers via runAgentTool.
Zero-token AI Gateway. The Sandbox subclass intercepts the container's api.anthropic.com egress (outboundByHost + interceptHttps) and forwards it through env.AI.gateway(), which authenticates via your Cloudflare account. No Anthropic key or AI Gateway token lives in the container — only a plaintext GATEWAY_ID. (Requires exporting the SDK's ContainerProxy from the Worker entry.)
Think owns planning; Claude Code owns coding. Orchestrator loop runs on Workers AI; each sub-agent drives claude -p headless inside its container and maps stream-json to AI SDK UIMessage chunks. beforeTurn restricts the orchestrator to its delegation tools so it can't wander into Think's built-in workspace tools. CLI stderr/exit/result errors are surfaced into the delegate panel instead of silently showing "no changes".
Docs. The README explains the no-token trick and the architecture, and adds a Durability & recovery section (three lifecycles; the container disk is ephemeral and resets across sleepAfter) with two deferred upgrade paths: Sandbox backup/restore, and harness-based mid-turn continuity (#1829, "Add an @ai-sdk/sandbox-cloudflare HarnessAgent provider (during ai v7 migration)").
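
To make the agents-as-tools bullet concrete, here is a minimal sketch of deriving a DNS-safe sandbox id per delegated task and fanning tasks out in parallel. Everything in it is an assumption for illustration: `sandboxIdFor`, `delegateParallel`, the `cc-` prefix, and the FNV-1a hash are hypothetical stand-ins, and `runTask` abstracts whatever the example does with getSandbox and runAgentTool. The PR text does not say which hash the example uses.

```typescript
// Hypothetical sketch: not the example's actual code.

// 32-bit FNV-1a over UTF-16 code units, rendered as 8 hex characters.
// A stand-in for whatever hash the example really uses.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Builds an id that is valid as a DNS label: lowercase letters, digits and
// hyphens only, starting with a letter, well under the 63-character limit.
function sandboxIdFor(parentId: string, taskKey: string): string {
  return `cc-${fnv1a(parentId)}-${fnv1a(taskKey)}`;
}

interface DelegateResult {
  sandboxId: string;
  ok: boolean;
  output: string;
}

// Assumed callback: runs one task in the container identified by sandboxId.
type RunTask = (sandboxId: string, task: string) => Promise<string>;

// Starts every task at once, one sandbox id per task. A failing task is
// reported in its own result instead of rejecting the whole batch.
function delegateParallel(
  parentId: string,
  tasks: string[],
  runTask: RunTask
): Promise<DelegateResult[]> {
  const runs = tasks.map(async (task, index): Promise<DelegateResult> => {
    const sandboxId = sandboxIdFor(parentId, `${index}:${task}`);
    try {
      return { sandboxId, ok: true, output: await runTask(sandboxId, task) };
    } catch (err) {
      return { sandboxId, ok: false, output: String(err) };
    }
  });
  return Promise.all(runs);
}
```

Catching per task keeps one failed container from hiding the results of the others, which matches the stated goal of surfacing errors in each delegate panel.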

Pins @cloudflare/sandbox to 0.12.1 — the 0.12.2 image failed to publish to Docker Hub.

Examples don't require a changeset.

Test plan

pnpm run check (sherif + export checks + oxfmt + oxlint + typecheck — all 114 projects typecheck)
Ran locally end-to-end: orchestrator delegates a single task and a parallel fan-out; Claude Code runs in-container, egress routes through AI Gateway with no token, diffs render in the delegate panels.
Reviewer: requires Docker locally (paid Workers plan to deploy) and an AI Gateway that can reach Anthropic without a per-request key (Unified Billing or BYOK).
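
The "egress routes through AI Gateway with no token" check can be pictured as a routing decision made outside the container. The sketch below models only that decision as a pure function. `planEgress`, the `Egress` type, and the header-stripping step are assumptions for illustration; the real example does this inside a Sandbox subclass using outboundByHost, interceptHttps, and env.AI.gateway(), whose signatures are not shown in this PR.

```typescript
// Hypothetical sketch: models the routing decision only, not the SDK calls.

type HeaderMap = Record<string, string>;

type Egress =
  | { via: "direct"; url: string; headers: HeaderMap }
  | { via: "gateway"; gatewayId: string; path: string; headers: HeaderMap };

const INTERCEPTED_HOST = "api.anthropic.com";

// Assumption: any credential header the CLI sends is dropped, because the
// gateway is expected to authenticate on the Worker side.
const CREDENTIAL_HEADERS = ["x-api-key", "authorization"];

function planEgress(url: string, headers: HeaderMap, gatewayId: string): Egress {
  const parsed = new URL(url);
  if (parsed.hostname !== INTERCEPTED_HOST) {
    // Other hosts are left untouched in this sketch.
    return { via: "direct", url, headers };
  }

  const forwarded: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    if (CREDENTIAL_HEADERS.indexOf(name.toLowerCase()) === -1) {
      forwarded[name] = headers[name];
    }
  }

  return {
    via: "gateway",
    gatewayId,
    path: parsed.pathname + parsed.search,
    headers: forwarded
  };
}
```

Only the plaintext gateway id is needed to make this decision, which is why no secret has to be placed in the container.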

Made with Cursor

… Code in containers

A Think agent that orchestrates Claude Code coding agents, each running in its own Cloudflare Sandbox container, with live progress and diffs streamed into the chat.

Highlights:
- Agents-as-tools: the orchestrator delegates each task to a ClaudeCodeAgent facet (one isolated container per run via getSandbox + a derived sandbox id); delegate_parallel fans out across containers.
- Zero-token AI Gateway: the Sandbox subclass intercepts the container's api.anthropic.com egress (outboundByHost + interceptHttps) and routes it through env.AI.gateway(), so no Anthropic key or AI Gateway token lives in the container, only a plaintext GATEWAY_ID. Requires exporting the SDK's ContainerProxy from the Worker entry.
- Claude Code runs headless as root with IS_SANDBOX=1; its stream-json is mapped to AI SDK UIMessage chunks and stderr/exit/result errors are surfaced.
- beforeTurn restricts the orchestrator to its delegation tools so it can't wander into Think's built-in workspace tools.

Pins @cloudflare/sandbox to 0.12.1 (the 0.12.2 image failed to publish to Docker Hub, cloudflare/sandbox-sdk#792).

README documents the durability/recovery model (three lifecycles, the ephemeral container disk) and the deferred backup/restore + harness-migration upgrade paths (#1829).

Co-authored-by: Cursor <cursoragent@cursor.com>
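
As a rough picture of the stream-json mapping, the sketch below turns one line of CLI output into zero or more chat chunks and surfaces failures as explicit error chunks rather than dropping them. The event shape is a simplified assumption about Claude Code's stream-json output, and `UiChunk` is a hypothetical local type, not the AI SDK's UIMessage chunk definition; `mapLine` is likewise an illustrative name.

```typescript
// Hypothetical sketch: simplified event and chunk shapes, not the example's mapper.

interface ContentBlock {
  type: string;
  text?: string;
  id?: string;
  name?: string;
  input?: unknown;
}

interface CliEvent {
  type?: string;
  message?: { content?: ContentBlock[] };
  is_error?: boolean;
  result?: string;
}

type UiChunk =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "error"; message: string }
  | { type: "finish" };

function mapLine(line: string): UiChunk[] {
  const trimmed = line.trim();
  if (trimmed === "") return [];

  let event: CliEvent | null;
  try {
    event = JSON.parse(trimmed) as CliEvent | null;
  } catch {
    // Non-JSON output (for example stray stderr text) is shown, not swallowed.
    return [{ type: "error", message: `unparseable CLI output: ${trimmed}` }];
  }
  if (event === null || typeof event !== "object") {
    return [{ type: "error", message: `unexpected CLI output: ${trimmed}` }];
  }

  if (event.type === "assistant") {
    const content = event.message?.content;
    const blocks = Array.isArray(content) ? content : [];
    const chunks: UiChunk[] = [];
    for (const block of blocks) {
      if (block.type === "text" && typeof block.text === "string") {
        chunks.push({ type: "text", text: block.text });
      } else if (block.type === "tool_use") {
        chunks.push({
          type: "tool-call",
          toolCallId: block.id ?? "",
          toolName: block.name ?? "unknown",
          input: block.input
        });
      }
    }
    return chunks;
  }

  if (event.type === "result") {
    return event.is_error
      ? [{ type: "error", message: event.result ?? "Claude Code reported an error" }]
      : [{ type: "finish" }];
  }

  // Other event types are ignored in this sketch.
  return [];
}
```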

changeset-bot · 2026-06-28T20:08:05Z

⚠️ No Changeset found

Latest commit: e7de6d0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

* docs(design): add rfc-coding-agent (first-class CodingAgent for Think)

  Proposes promoting examples/sandbox-coding-agent (#1830) into @cloudflare/think as a supported `CodingAgent` class: a Think subclass that drives a CLI coding agent (Claude Code first) inside a Cloudflare Sandbox, exported per-CLI as `@cloudflare/think/claudecode`.

  Locks the surface before any core code moves: an internal TurnRuntime seam in Think (private, CodingAgent is its only consumer), a per-CLI adapter contract, tokenless AI Gateway egress + snapshot-based durability built in, DO-tuned recovery, and a conformance-test strategy for stream-json drift.

  Strategic stance: own the public interface, keep the engine swappable (a matured @ai-sdk/harness could become an impl detail behind the same class; #1829).

  Status: proposed.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): expand rfc-coding-agent with config, threads, and seams

  Folds in the firm follow-ups from review:
  - §8 dynamic config (resolve-by-precedence, freeze-on-first-turn) + topology (standalone / Chats-child threads / orchestrated), with the "no top-level binding assumption" requirement.
  - §9 two seams designed in from day one: a filesystem backend interface (so the durable cloudflare/workspace VFS can supersede snapshots later) and run/preview by target (container dev server vs worker-bundler + env.LOADER).
  - §10 future-work pointer to a Workers-native runtime (Runtime B) behind the same TurnRuntime seam.
  - New alternative (adopt cloudflare/workspace now: rejected, preview-only) and expanded decision questions.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): drop first-class Chats; threads are a userland directory pattern

  The fixed chats_index/ChatSummary schema is the part consumers outgrow immediately: a coding directory needs repo/branch/status/lastDiff, etc. So:
  - rfc-coding-agent §8: reframe "threads" as a userland directory pattern (plain Agent + subAgent with domain-specific metadata), not a `Chats` base class. The shipped, load-bearing primitives it leans on are unchanged (subAgent + Props, parentAgent, RemoteContextProvider).
  - rfc-think-multi-session: record the third (now-leaning) answer to open question #1: don't ship a Chats base class; ship primitives + a thin client hook + an example.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): rewrite rfc-coding-agent: own package, AIChatAgent base, pluggable engine

  Adversarial review reshaped the design:
  - New @cloudflare/coding-agent package, NOT a Think subclass/subpath. Extends AIChatAgent, so onChatMessage is the seam and Think core is untouched (drops the riskiest piece, the turn-runtime refactor). Honors the AGENTS.md layering preference (containers don't belong in the chat base).
  - Pluggable engine: CliEngine ships first (lift the example's mapper), HarnessEngine is the goal (reuses HarnessAgent's tested stream-mapping + session lifecycle; gated on #1829). No speculative multi-CLI adapter interface; extract after codex.
  - Durability redesigned around two decoupled lifecycles (DO vs container have different shutdown behaviors); reconcile-on-wake; honest that claude -p can't resume a killed turn and that re-run can double-apply edits; bound snapshot cost.
  - Egress scoped honestly (per-provider, TLS-dependent, OAuth CLIs can't be tokenless).
  - Branch is mutable working state (git checkout), only repo identity is frozen.
  - Filesystem VFS, preview, git ops, HITL, and Runtime B moved to "Directions" (not v1 seams). Added a Testing & CI section.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(design): mark package name resolved in rfc-coding-agent decision

  Package name @cloudflare/coding-agent + /claude-code subpath confirmed. Engine default, snapshot policy, and first-PR scope remain deliberately open.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Sunil Pai <18808+threepointone@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
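
The config rules named across these commits (resolve-by-precedence, freeze-on-first-turn, with the branch left mutable and only repo identity frozen) could be sketched as follows. The RFC text itself is not reproduced here, so the `CodingConfig` fields, the `ConfigResolver` class, and the "later layer wins" precedence order are all assumptions for illustration.

```typescript
// Hypothetical sketch: names and precedence order are assumed, not from the RFC.

interface CodingConfig {
  repo: string;
  branch: string;
}

class ConfigResolver {
  private readonly defaults: CodingConfig;
  private frozenRepo: string | undefined = undefined;

  constructor(defaults: CodingConfig) {
    this.defaults = defaults;
  }

  // Resolve by precedence: start from defaults, then let each later layer
  // override the fields it sets. Once a repo is frozen it always wins.
  resolve(layers: Array<Partial<CodingConfig>>): CodingConfig {
    let repo = this.defaults.repo;
    let branch = this.defaults.branch;
    for (const layer of layers) {
      if (layer.repo !== undefined) repo = layer.repo;
      if (layer.branch !== undefined) branch = layer.branch;
    }
    if (this.frozenRepo !== undefined) repo = this.frozenRepo;
    return { repo, branch };
  }

  // Freeze on first turn: the first resolved repo becomes permanent, while
  // the branch stays mutable working state on every later turn.
  startTurn(layers: Array<Partial<CodingConfig>>): CodingConfig {
    const config = this.resolve(layers);
    if (this.frozenRepo === undefined) {
      this.frozenRepo = config.repo;
    }
    return config;
  }
}
```

In this sketch a later attempt to change the repo is silently ignored; a real implementation might prefer to reject it with an error.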

devin-ai-integration Bot reviewed Jun 28, 2026

threepointone merged commit 2351e5c into main Jun 28, 2026
4 checks passed

threepointone deleted the example/sandbox-coding-agent branch June 28, 2026 20:17

threepointone mentioned this pull request Jun 28, 2026

RFC: CodingAgent — a new @cloudflare/coding-agent package #1831

Merged

2 tasks

threepointone merged 1 commit into main from example/sandbox-coding-agent
