
Version Packages #1822

Merged: threepointone merged 3 commits into main from changeset-release/main on Jun 28, 2026

Conversation

github-actions (bot) commented on Jun 27, 2026

Contributor

This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine: whenever you add more changesets to main, this PR will be updated.

Releases

agents@0.17.1

Patch Changes

  • #1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

    When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

    This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.

    Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.

    Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.

  • #1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

    chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

    Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).
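As an illustration of the alarm-boundary circuit breaker described for agents, here is a minimal sketch of the classify-then-count logic. It is not the library's implementation: `looksLikeMemoryLimitReset`, `onAlarmError`, and the `StrikeDecision` shape are hypothetical names, the matching is assumed to be a plain case-sensitive substring test on the "exceeded its memory limit" fragment, and whether the seal lands on the third or the fourth reset is an assumption as well.

```typescript
// Hypothetical stand-in for the exported isDurableObjectMemoryLimitReset(error)
// predicate. Assumption: a simple substring match on the shared fragment.
const MEMORY_LIMIT_FRAGMENT = "exceeded its memory limit";

function looksLikeMemoryLimitReset(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return message.includes(MEMORY_LIMIT_FRAGMENT);
}

// What the outermost alarm frame decides to do with a caught error.
type StrikeDecision =
  | { action: "rethrow" } // anything that is not a memory-limit reset
  | { action: "backoff"; strikes: number } // tolerate it and delay the looping rows
  | { action: "seal"; reason: "out_of_memory" }; // give up and purge the looping rows

// `strikes` is the durable count recorded before this error.
// `maxStrikes` mirrors the documented default of 3 for maxAlarmMemoryLimitStrikes.
function onAlarmError(
  error: unknown,
  strikes: number,
  maxStrikes: number = 3
): StrikeDecision {
  if (!looksLikeMemoryLimitReset(error)) {
    return { action: "rethrow" };
  }
  const next = strikes + 1;
  // Assumption: the breaker trips once the count exceeds the limit.
  if (next > maxStrikes) {
    return { action: "seal", reason: "out_of_memory" };
  }
  return { action: "backoff", strikes: next };
}
```

In this sketch the first three memory-limit resets return a backoff decision and the next one seals, while every other error is re-thrown untouched, which matches the "Everything except memory-limit resets re-throws exactly as before" guarantee in the notes above.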

@cloudflare/ai-chat@0.9.1

Patch Changes

  • #1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

    When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

    This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.

    Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.

    Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.

  • #1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

    chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

    Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).
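To make the interplay between the two chat-recovery budgets concrete, the following sketch models the seal decision as a pure function. Only the option names (maxOomRetries, maxRecoveryWork), their defaults (3 and 1000), and the two seal reasons come from the notes above; `RecoveryState`, `RecoveryLimits`, `judgeRecovery`, the comparison operators, and the unit in which work is measured are assumptions made for illustration.

```typescript
// Hypothetical snapshot of the durable per-incident counters.
interface RecoveryState {
  oomAttempts: number; // bumped on each classified memory-limit reset
  work: number; // accrues from the first interruption until the turn completes
}

interface RecoveryLimits {
  maxOomRetries: number;
  maxRecoveryWork: number;
}

// Defaults as stated in the release notes.
const DEFAULT_LIMITS: RecoveryLimits = { maxOomRetries: 3, maxRecoveryWork: 1000 };

type Verdict =
  | { keepRecovering: true }
  | { keepRecovering: false; reason: "out_of_memory" | "work_budget_exceeded" };

function judgeRecovery(
  state: RecoveryState,
  limits: RecoveryLimits = DEFAULT_LIMITS
): Verdict {
  // Fast path: an attributable OOM loop seals after a handful of retries.
  if (state.oomAttempts > limits.maxOomRetries) {
    return { keepRecovering: false, reason: "out_of_memory" };
  }
  // Backstop: the work meter keeps climbing even when the other counters reset.
  if (state.work > limits.maxRecoveryWork) {
    return { keepRecovering: false, reason: "work_budget_exceeded" };
  }
  return { keepRecovering: true };
}
```

Passing `maxRecoveryWork: Infinity` makes the second comparison unreachable, which is how the sketch reproduces the previous fully unbounded behavior; in that configuration only the OOM budget (or a caller-supplied predicate such as shouldKeepRecovering) can stop a runaway.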

@cloudflare/think@0.11.1

Patch Changes

  • #1826 1bbd9bc Thanks @threepointone! - Add a tight, OOM-specific retry budget to chat recovery so a memory-limit crash loop seals fast and attributably (#1825).

    When a recovery turn hits a Durable Object memory-limit reset (the isolate exceeded its 128 MB limit), recovery now classifies it as a distinct, deterministic failure rather than a deploy-style transient. A memory reset re-OOMs on re-run (the turn's working set, not the platform, is the cause), so it must NOT be deferred and retried forever like a code-update/connection-lost transient. Each such crash bumps a durable per-incident oomAttempts counter; recovery retries a small number of times (new chatRecovery.maxOomRetries, default 3) — in case the OOM was a transient spike — then seals with reason="out_of_memory". This is far tighter than the generic maxRecoveryWork backstop because an OOM is attributable and each re-run re-runs the model.

    This complements the finite maxRecoveryWork default: the OOM budget is the fast path for memory resets that surface as catchable errors thrown from recovery bookkeeping (e.g. storage/SQL rejections after the reset), while maxRecoveryWork remains a backstop for the hard-kill case where no in-isolate code runs to record the OOM.

    Adds an alarm-boundary circuit breaker (agents) as the universal backstop for the case the in-DO budgets can't catch (#1825): a memory-limit reset that bypasses them entirely — thrown before the budget code runs (e.g. boot-time state hydration OOMs), or whose own small writes also OOM under memory pressure. Left unhandled, such an error propagates out of alarm() and the platform auto-retries the alarm forever, re-running the doomed, billable turn each cycle. Agent.alarm() now intercepts ONLY Durable Object memory-limit resets at the outermost frame — where the heavy turn has unwound and GC has reclaimed its footprint, so the seal/purge writes can land where mid-turn ones OOMed. A durable strike counter tolerates a few resets (new static options.maxAlarmMemoryLimitStrikes, default 3) — backing off the looping rows so the retry is not a hot loop — then seals the recovery (out_of_memory) and surgically purges only the looping schedule rows, leaving unrelated scheduled tasks intact. A new alarm:memory_limit_reset observability event is emitted. Everything except memory-limit resets re-throws exactly as before.

    Also broadens and exports the isDurableObjectMemoryLimitReset(error) predicate from agents (a sibling to isDurableObjectCodeUpdateReset / isPlatformTransientError): it now matches the shared "exceeded its memory limit" fragment so truncated/reworded surfacings (observed in real #1825 logs) still classify.

  • #1826 1bbd9bc Thanks @threepointone! - Fix never-ending chat-recovery retries when a Durable Object isolate runs out of memory mid-turn (#1825).

    chatRecovery.maxRecoveryWork now defaults to a generous finite backstop (1000) instead of Infinity. An isolate that exceeds its memory limit and is reset mid-stream has usually already streamed a little content, which bumps the durable progress counter. On the next wake recovery reads that as forward progress and resets both progress-keyed bounds — the attempt cap (maxAttempts) and the no-progress window (noProgressTimeoutMs) — and because each crash lands inside the alarm-debounce window the attempt counter is pinned too. With the work budget disabled (Infinity), no instrument could ever seal the turn, so recovery re-ran the turn (and its LLM calls) forever. The work meter is the one signal that keeps climbing across such a loop, so a finite default seals a runaway with reason="work_budget_exceeded" instead of looping.

    Work only accrues from the first interruption until the turn completes, so a normal interrupted turn never approaches the cap. A very long agentic turn that legitimately produces a large amount of content under heavy interruption can raise maxRecoveryWork (or set it to Infinity to restore the previous fully-unbounded behavior, ideally paired with a shouldKeepRecovering predicate that bounds the runaway via real token/cost accounting).

  • #1821 de6a695 Thanks @threepointone! - Add an opt-in, read-only HTTP fetch capability for Think agents via the new @cloudflare/think/tools/fetch export and a fetchTools property on Think.

    createFetchTools() generates a generic, allowlisted fetch_url tool plus one fetch_<name> tool per named service-binding/Fetcher target. It is GET-only with Workers-grounded SSRF defenses (private/loopback/link-local/*.internal blocking, URL normalization, credential rejection), separate download/model/workspace size limits (maxBytes, maxModelChars, response: "workspace" spill), an allowlist-aware redirect policy with cross-origin header stripping, a model header allowlist, and a tool:fetch observability event. Disabled by default.

  • #1823 b58b5a3 Thanks @threepointone! - Improve Think's tool-call lifecycle hooks (follow-ups from #1343):

    • Preserve preliminary streaming through beforeToolCall. Tools whose execute is an async generator (async function* execute(...)) now stream their preliminary tool-results to the model even though Think wraps execute to consult beforeToolCall first. Non-streaming tools keep a scalar wrapper, so they never emit a synthetic preliminary chunk. The non-canonical async () => makeIterator() form (a Promise<AsyncIterable>) still collapses to its last yielded value, matching the raw AI SDK.
    • Per-tool typing on the lifecycle contexts. When an explicit TOOLS generic is passed, narrowing on ctx.toolName now narrows ctx.input on beforeToolCall and — new — ctx.output on afterToolCall's success branch to that tool's inferred output type. Dynamic tools stay unknown. Behavior with the default ToolSet is unchanged.
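The per-tool typing described for the lifecycle contexts can be pictured as a discriminated union keyed on the tool name. The sketch below is self-contained and does not use Think's real types: `Tools`, `BeforeCtx`, `AfterCtx`, and the `success` / `error` field names are hypothetical, chosen only to show how narrowing on `toolName` can narrow `input` and, on a success branch, `output`.

```typescript
// Hypothetical tool map standing in for an explicit TOOLS generic.
type Tools = {
  getWeather: { input: { city: string }; output: { tempC: number } };
  add: { input: { a: number; b: number }; output: number };
};

// One union member per tool, so checking toolName narrows input.
type BeforeCtx<T> = {
  [K in keyof T]: {
    toolName: K;
    input: T[K] extends { input: infer I } ? I : never;
  };
}[keyof T];

// Success members carry the tool's output type; failure members carry an error.
type AfterCtx<T> = {
  [K in keyof T]:
    | { toolName: K; success: true; output: T[K] extends { output: infer O } ? O : never }
    | { toolName: K; success: false; error: unknown };
}[keyof T];

function describeCall(ctx: BeforeCtx<Tools>): string {
  if (ctx.toolName === "add") {
    // ctx.input is { a: number; b: number } here.
    return `add(${ctx.input.a}, ${ctx.input.b})`;
  }
  // Only getWeather remains, so ctx.input is { city: string }.
  return `getWeather(${ctx.input.city})`;
}

function summarize(ctx: AfterCtx<Tools>): string {
  if (!ctx.success) {
    return `${ctx.toolName} failed`;
  }
  if (ctx.toolName === "add") {
    // ctx.output is number on the success branch for "add".
    return `sum=${ctx.output.toFixed(1)}`;
  }
  return `temp=${ctx.output.tempC}C`;
}
```

A dynamic tool would not appear as a key in the map, which is consistent with the note that dynamic tools stay unknown.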

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.


@github-actions github-actions Bot force-pushed the changeset-release/main branch 4 times, most recently from 1f7a6e1 to 3dedab6, on June 28, 2026 11:12
@github-actions github-actions Bot force-pushed the changeset-release/main branch from 3dedab6 to bcc9f1a on June 28, 2026 11:15
@threepointone threepointone merged commit 688d722 into main Jun 28, 2026
6 checks passed
@threepointone threepointone deleted the changeset-release/main branch June 28, 2026 11:18