Adds retry support to the Amazon.Lambda.DurableExecution #2363

Draft
GarrettBeatty wants to merge 1 commit into GarrettBeatty/stack/2 from GarrettBeatty/stack/3

@GarrettBeatty GarrettBeatty commented May 12, 2026

Stacked PRs:


#2216

What

Adds retry support to the Amazon.Lambda.DurableExecution SDK on top of the foundation in #2360. After this PR a step that throws can be retried with configurable backoff and jitter; durable executions resume after the retry timer elapses without billing Lambda compute during the wait.

Public API introduced:

| Type | Purpose |
| --- | --- |
| IRetryStrategy | Decides whether a failed step should retry, and with what delay. |
| RetryDecision | Output of IRetryStrategy.ShouldRetry: a ShouldRetry flag plus a Delay. |
| RetryStrategy | Static factory: Default, Transient, None, Exponential(...), FromDelegate(...). |
| JitterStrategy | None / Half / Full for exponential backoff. |
| StepSemantics | AtLeastOncePerRetry (default) / AtMostOncePerRetry. |
| StepConfig.RetryStrategy, StepConfig.Semantics | Per-step retry configuration. |

Why

Real workflows fail. A step that calls a flaky downstream service or hits a transient throttle needs to retry without restarting the whole workflow. Durable execution makes service-mediated retries possible: the SDK checkpoints a RETRY operation with a NextAttemptDelaySeconds, suspends the Lambda, and the service re-invokes us when the timer fires. The user's compute isn't billed during the wait.

AtMostOncePerRetry semantics handle non-idempotent steps (e.g. charging a card): a START checkpoint is durably persisted before user code runs, so a Lambda crash mid-execution can be detected on replay and routed through the retry strategy rather than re-executing.

How

Retry control flow. When a step throws, StepOperation.HandleStepFailureAsync consults the configured IRetryStrategy.ShouldRetry(ex, attemptNumber). If the decision says retry, the SDK enqueues a RETRY checkpoint carrying NextAttemptDelaySeconds, then suspends via TerminationManager.SuspendAndAwait so RunAsync returns Pending to the service. On the next invocation, StepOperation.ReplayAsync sees Status == PENDING and either re-suspends (timer not yet elapsed) or re-executes (timer fired) with the carried-forward attempt counter.
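The decision tree above can be sketched as two small pure functions. This is a minimal illustration, not the code in StepOperation: the types below are hypothetical stand-ins modelled on the names in this description, FixedDelayStrategy exists only for the example, and rounding the delay up to whole seconds is an assumption.

```csharp
using System;

// Hypothetical shapes modelled on this description; the SDK's real signatures may differ.
public readonly record struct RetryDecision(bool ShouldRetry, TimeSpan Delay);

public interface IRetryStrategy
{
    RetryDecision ShouldRetry(Exception exception, int attemptNumber);
}

public enum FailureAction { CheckpointRetryAndSuspend, CheckpointFail }
public enum ReplayAction { Suspend, ReExecute }

public static class RetryFlowSketch
{
    // Failure path: consult the strategy, then either schedule a retry or fail the step.
    public static (FailureAction Action, int NextAttemptDelaySeconds) OnStepFailure(
        IRetryStrategy strategy, Exception ex, int attemptNumber)
    {
        if (strategy is null) return (FailureAction.CheckpointFail, 0);

        RetryDecision decision = strategy.ShouldRetry(ex, attemptNumber);
        if (!decision.ShouldRetry) return (FailureAction.CheckpointFail, 0);

        // Assumption: the delay is rounded up to whole seconds for the RETRY checkpoint.
        int seconds = (int)Math.Ceiling(decision.Delay.TotalSeconds);
        return (FailureAction.CheckpointRetryAndSuspend, seconds);
    }

    // Replay path for a PENDING step: wait out the timer, or run the next attempt.
    public static ReplayAction OnPendingReplay(DateTimeOffset nextAttemptAt, DateTimeOffset now)
        => now < nextAttemptAt ? ReplayAction.Suspend : ReplayAction.ReExecute;
}

// Example-only strategy: retry with a fixed delay until maxAttempts is reached.
public sealed class FixedDelayStrategy : IRetryStrategy
{
    private readonly int _maxAttempts;
    private readonly TimeSpan _delay;

    public FixedDelayStrategy(int maxAttempts, TimeSpan delay)
    {
        _maxAttempts = maxAttempts;
        _delay = delay;
    }

    public RetryDecision ShouldRetry(Exception exception, int attemptNumber)
        => attemptNumber < _maxAttempts
            ? new RetryDecision(true, _delay)
            : new RetryDecision(false, TimeSpan.Zero);
}
```

With new FixedDelayStrategy(3, TimeSpan.FromSeconds(1.2)), a failure on attempt 1 yields CheckpointRetryAndSuspend with a 2-second delay, and a failure on attempt 3 yields CheckpointFail. Keeping both paths free of I/O makes the branches easy to unit test in isolation.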

At-most-once semantics. For non-idempotent steps, Semantics = AtMostOncePerRetry writes a START checkpoint and blocks until the batcher flushes it before user code runs. If Lambda crashes between user code and the SUCCEED flush, replay sees STARTED with no terminal record and routes through HandleStepFailureAsync as a failed attempt instead of re-executing — the side effect runs at most once per attempt.
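To make the crash-recovery arm concrete, here is a small in-memory model of the START-before-execute rule. It is a sketch under stated assumptions rather than the SDK's implementation: the enums, the Checkpoints list (a stand-in for the durable checkpoint log), and the onFailedAttempt callback (a stand-in for routing through HandleStepFailureAsync) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model types; names are illustrative, not the SDK's.
public enum StepStatusSketch { NotStarted, Started, Succeeded }
public enum StepSemanticsSketch { AtLeastOncePerRetry, AtMostOncePerRetry }

public sealed class AtMostOnceSketch
{
    // Stand-in for the durable checkpoint log.
    public List<string> Checkpoints { get; } = new List<string>();

    // Returns true if user code ran during this invocation.
    public bool RunStep(
        StepStatusSketch replayedStatus,
        StepSemanticsSketch semantics,
        Action userCode,
        Action onFailedAttempt)
    {
        // A terminal record already exists: nothing to run again.
        if (replayedStatus == StepStatusSketch.Succeeded) return false;

        if (semantics == StepSemanticsSketch.AtMostOncePerRetry
            && replayedStatus == StepStatusSketch.Started)
        {
            // START was persisted but no terminal record followed, so a previous
            // invocation died mid-step. Count it as a failed attempt instead of
            // running the side effect a second time.
            onFailedAttempt();
            return false;
        }

        if (semantics == StepSemanticsSketch.AtMostOncePerRetry)
        {
            // In the real flow this write is flushed durably before user code starts.
            Checkpoints.Add("START");
        }

        userCode();
        Checkpoints.Add("SUCCEED");
        return true;
    }
}
```

On a fresh run the log reads START, SUCCEED and the side effect executes once. Replaying with a Started status skips the user code and invokes the failure handler, which is what bounds the side effect to at most one execution per attempt.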

Retry strategy contract. IRetryStrategy.ShouldRetry(Exception, int attemptNumber) returns a RetryDecision. ExponentialRetryStrategy supports configurable max attempts, initial/max delay, backoff rate, jitter (None/Half/Full), and exception filtering by type or message regex. The built-in factories are RetryStrategy.Default (6 attempts, 5s initial / 60s max delay, 2× backoff, full jitter), RetryStrategy.Transient (3 attempts, 1s initial / 5s max delay, half jitter), and RetryStrategy.None. RetryStrategy.FromDelegate(...) covers arbitrary policies.
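The backoff arithmetic can be illustrated with a short sketch. The jitter formulas here are an assumption based on the common convention (Full picks a delay uniformly between zero and the capped backoff, Half keeps half the backoff and randomizes the rest); this description does not spell out the exact definitions ExponentialRetryStrategy uses. JitterKind and BackoffSketch are hypothetical names, and the random sample is passed in so the math stays deterministic.

```csharp
using System;

// Hypothetical stand-in for illustration; not the SDK's JitterStrategy type.
public enum JitterKind { None, Half, Full }

public static class BackoffSketch
{
    // Delay before the next attempt. attemptNumber is the 1-based attempt that just failed.
    // unit is a random sample in [0, 1), injected by the caller.
    public static TimeSpan ComputeDelay(
        int attemptNumber,
        TimeSpan initialDelay,
        TimeSpan maxDelay,
        double backoffRate,
        JitterKind jitter,
        double unit)
    {
        // Exponential growth capped at maxDelay: initial * rate^(attempt - 1).
        double baseSeconds = Math.Min(
            maxDelay.TotalSeconds,
            initialDelay.TotalSeconds * Math.Pow(backoffRate, attemptNumber - 1));

        // Assumed jitter definitions (see the note above the block).
        double seconds = jitter switch
        {
            JitterKind.Full => baseSeconds * unit,
            JitterKind.Half => baseSeconds / 2 + (baseSeconds / 2) * unit,
            _ => baseSeconds,
        };

        return TimeSpan.FromSeconds(seconds);
    }
}
```

Under these assumptions, a 5s initial delay with a 60s cap and 2× backoff gives un-jittered delays of 5, 10, 20, 40, and 60 seconds after attempts 1 through 5; jitter then spreads each value so that many failing executions do not retry in lockstep.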

Key files:

  • Config/IRetryStrategy.cs — strategy interface + RetryDecision value type
  • Config/RetryStrategy.cs — built-in strategies, ExponentialRetryStrategy, JitterStrategy, StepSemantics, DelegateRetryStrategy
  • Config/StepConfig.cs — adds RetryStrategy and Semantics properties
  • Internal/StepOperation.cs — adds PENDING (retry timer) and STARTED (AtMostOnce crash recovery) replay arms; HandleStepFailureAsync decision tree
  • Internal/TerminationManager.cs — adds RetryScheduled reason

Testing

21 new unit tests in Amazon.Lambda.DurableExecution.Tests (130 total, up from 109 in #2360):

  • RetryStrategyTests (14 tests): exponential backoff math, jitter strategies, max-attempt exhaustion, exception-type and message-pattern filtering, delegate strategies
  • DurableContextTests retry block (7 tests): FailsWithRetryStrategy_CheckpointsRetryAndSuspends, FailsNoRetryStrategy_CheckpointsFail, RetryExhausted_CheckpointsFail, PendingWithFutureTimestamp_Suspends, PendingWithPastTimestamp_ReExecutes, AtMostOnce_FlushesStartBeforeExecution, AtMostOnce_StartedReplay_TriggersRetryHandler

Integration tests (Amazon.Lambda.DurableExecution.IntegrationTests) — RetrySucceeds and RetryExhausts end-to-end against the real durable-execution service.

Out of scope (follow-up PRs)

  • MapAsync / ParallelAsync / RunInChildContextAsync / WaitForConditionAsync
  • CallbackAsync, InvokeAsync
  • DefaultJsonCheckpointSerializer
  • DurableLogger replay-suppression (currently NullLogger)
  • Annotations source-generator integration / [DurableExecution] attribute
  • DurableTestRunner / Amazon.Lambda.DurableExecution.Testing package
  • dotnet new lambda.DurableFunction blueprint

GarrettBeatty added a commit that referenced this pull request May 12, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 711bf82 to 4f05fa9 Compare May 12, 2026 16:20
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 12, 2026 16:31
GarrettBeatty added a commit that referenced this pull request May 12, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 4f05fa9 to 54d18f9 Compare May 12, 2026 16:31
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 12, 2026 16:31
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 12, 2026 18:16
GarrettBeatty added a commit that referenced this pull request May 12, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 54d18f9 to 599445f Compare May 12, 2026 18:16
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 12, 2026 18:16
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 12, 2026 21:30
GarrettBeatty added a commit that referenced this pull request May 12, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 599445f to e7a85e4 Compare May 12, 2026 21:30
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 12, 2026 21:30
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 12, 2026 21:34
GarrettBeatty added a commit that referenced this pull request May 12, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from e7a85e4 to 8f23ebb Compare May 12, 2026 21:34
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 12, 2026 21:34
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 16:04
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 8f23ebb to e39e68e Compare May 13, 2026 16:04
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 16:04
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 16:21
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from e39e68e to 52055d3 Compare May 13, 2026 16:21
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 16:21
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 16:39
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from b431212 to 095c948 Compare May 13, 2026 19:57
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 19:57
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 20:13
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 095c948 to 81b9144 Compare May 13, 2026 20:13
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 20:13
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 21:24
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 81b9144 to 531cbbe Compare May 13, 2026 21:24
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 21:24
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 21:49
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 531cbbe to 31ea7e8 Compare May 13, 2026 21:49
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 21:49
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 22:20
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 31ea7e8 to ef44439 Compare May 13, 2026 22:20
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 22:20
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 22:31
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from ef44439 to 6bc97f2 Compare May 13, 2026 22:31
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 22:31
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 13, 2026 22:35
GarrettBeatty added a commit that referenced this pull request May 13, 2026
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 6bc97f2 to 85eae3e Compare May 13, 2026 22:35
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 13, 2026 22:35
stack-info: PR: #2363, branch: GarrettBeatty/stack/3
@GarrettBeatty GarrettBeatty changed the base branch from GarrettBeatty/stack/2 to feature/durablefunction May 14, 2026 01:24
@GarrettBeatty GarrettBeatty force-pushed the GarrettBeatty/stack/3 branch from 85eae3e to 0a32c0d Compare May 14, 2026 01:24
@GarrettBeatty GarrettBeatty changed the base branch from feature/durablefunction to GarrettBeatty/stack/2 May 14, 2026 01:25