fix(enhance/analyzer): retry 529-overloaded, single-shot checkpoint, exploitable filter + bounded agentic retry#95

Open
gadievron wants to merge 2 commits into master from fix/enhance-analyzer-retry-529-overloaded-single-shot-checkpoint
@gadievron
Collaborator

Four confirmed bugs across the enhance and analyzer paths; all fixes keep
existing callers' behavior (new params optional, default to prior behavior).

529-overloaded never retried: is_retryable_error()'s string branch omitted
"529"/"overloaded", and the analyzer detection path feeds str(e) (not a dict),
so Anthropic-overloaded errors were silently never auto-retried. Add
"529"/"overloaded" to the string-branch tuple, mirroring the dict branch's
status_code >= 500.
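
A minimal sketch of the two branches described above, not the actual implementation. The function name is_retryable_error_sketch and the tuple name are hypothetical, and the real string-branch tuple holds other entries this PR does not show; only "529"/"overloaded" and the dict branch's status_code >= 500 come from the description.

```python
# Hypothetical stand-in for the string-branch tuple; only the two entries
# added by this PR are listed here, the existing ones are not shown in it.
RETRYABLE_MARKERS = ("529", "overloaded")


def is_retryable_error_sketch(error):
    """Return True when the error looks like a transient server-side failure."""
    if isinstance(error, dict):
        # Dict branch: any 5xx status (529 included) counts as retryable.
        status = error.get("status_code")
        return isinstance(status, int) and status >= 500
    # String branch: this is what the analyzer path hits, because it passes str(e).
    text = str(error).lower()
    return any(marker in text for marker in RETRYABLE_MARKERS)
```

With the markers in place, the str(e) form and the dict form of the same overloaded error are classified the same way.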

exploitable filter dropped single-shot units: the analyzer exploitable filter
read only agent_context.security_classification, silently dropping every
single-shot unit (which writes llm_context). Add
_unit_security_classification() consumer fallback (agent_context ->
llm_context) plus a loud WARNING guard when a filter is set but no unit carries
any classification.
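
The fallback and the guard could look roughly like the following. This is a sketch under assumptions: units are treated as plain dicts, and filter_units_sketch and the warning text are made up for illustration; only the agent_context -> llm_context lookup order and the WARNING-level guard come from the description.

```python
import logging

logger = logging.getLogger(__name__)


def unit_security_classification_sketch(unit):
    """Read the classification from agent_context first, then llm_context."""
    for key in ("agent_context", "llm_context"):
        context = unit.get(key) or {}
        classification = context.get("security_classification")
        if classification:
            return classification
    return None


def filter_units_sketch(units, wanted):
    """Keep units whose classification equals `wanted`; None disables the filter."""
    if wanted is None:
        return list(units)
    if units and not any(unit_security_classification_sketch(u) for u in units):
        # Loud guard: a filter is set but nothing is classified at all.
        logger.warning("filter %r is set but no unit carries a classification", wanted)
    return [u for u in units if unit_security_classification_sketch(u) == wanted]
```

Reading both keys in the consumer means single-shot units survive the filter without changing what the single-shot path writes.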

single-shot had no checkpoint/resume: single-shot enhance_dataset could neither
save progress nor resume after an interruption. Add an optional checkpoint_path
(per-unit save + resume + KeyboardInterrupt handling); generalize the shared
checkpoint helpers with an optional context_key (default agent_context, so
agentic is unchanged); auto-derive + thread the checkpoint path for both modes
in core/enhancer.py.
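
To illustrate per-unit save, resume, and KeyboardInterrupt handling, here is a self-contained sketch. The JSON layout, the string unit ids, and every function name are assumptions for the example, not the helpers in this PR; the only parts taken from the description are the optional checkpoint_path and the context_key parameter with its agent_context default.

```python
import json
import os


def load_checkpoint_sketch(path, context_key="agent_context"):
    """Return {unit_id: context} saved under context_key, or {} if absent."""
    if not path or not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get(context_key, {})


def save_checkpoint_sketch(path, done, context_key="agent_context"):
    """Write to a temp file, then rename, so a crash cannot truncate the file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump({context_key: done}, fh)
    os.replace(tmp, path)


def enhance_units_sketch(units, enhance_one, checkpoint_path=None,
                         context_key="llm_context"):
    """Enhance each unit once, skipping units already in the checkpoint."""
    done = load_checkpoint_sketch(checkpoint_path, context_key)
    try:
        for unit in units:
            unit_id = unit["id"]  # assumed to be a string (JSON object key)
            if unit_id in done:
                unit[context_key] = done[unit_id]  # resume: reuse saved result
                continue
            unit[context_key] = enhance_one(unit)
            done[unit_id] = unit[context_key]
            if checkpoint_path:
                save_checkpoint_sketch(checkpoint_path, done, context_key)
    except KeyboardInterrupt:
        # Persist whatever finished before propagating the interrupt.
        if checkpoint_path:
            save_checkpoint_sketch(checkpoint_path, done, context_key)
        raise
    return units
```

With checkpoint_path left at None the loop behaves like a plain pass over the units, which matches the "default to prior behavior" constraint.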

agentic post-loop retry was single-pass: the transient-error retry that runs
after the agentic loop made only one pass. Wrap it in a bounded multi-round loop
(MAX_RETRY_ROUNDS=3) that stops early when no retryable units remain or a round
recovers nothing.
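
The bounded loop and its two early exits can be sketched as follows. MAX_RETRY_ROUNDS=3 and the stop conditions come from the description; the function name, the {unit_id: error_text} shape, and the retry_one/is_retryable callables are hypothetical.

```python
MAX_RETRY_ROUNDS = 3


def retry_failed_units_sketch(failed, retry_one, is_retryable,
                              max_rounds=MAX_RETRY_ROUNDS):
    """Retry transient failures for at most max_rounds; return what still fails.

    failed maps unit_id -> error text; retry_one(unit_id) raises on failure.
    """
    remaining = dict(failed)
    for _ in range(max_rounds):
        retryable = [uid for uid, err in remaining.items() if is_retryable(err)]
        if not retryable:
            break  # early exit: nothing left that is worth retrying
        recovered = 0
        for uid in retryable:
            try:
                retry_one(uid)
            except Exception as exc:
                remaining[uid] = str(exc)  # keep the latest error text
            else:
                del remaining[uid]
                recovered += 1
        if recovered == 0:
            break  # early exit: this round made no progress
    return remaining
```

Stopping when a round recovers nothing keeps a persistent outage from burning all three rounds on the same failing units.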

Tests: tests/test_enhance_resilience.py (9 tests). ruff + py_compile clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Coordination

Touches a file also modified by in-flight PR #51 (region-disjoint; textual merge only).

…exploitable filter + bounded agentic retry

… without a real credential

The resilience tests construct ContextEnhancer with client=None, which falls
back to a default AnthropicClient whose __init__ raises when ANTHROPIC_API_KEY
is unset (the CI condition). Add an autouse fixture that monkeypatch-sets a
dummy key. No network call is made: every test monkeypatches the actual
enhancement methods, so the SDK client is never used to issue a request and the
bounded-retry / checkpoint-resume assertions remain fully active.
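
A fixture of the kind described might look like this. It is a sketch: the fixture name and the dummy value are made up, and it assumes pytest's built-in monkeypatch fixture; only the ANTHROPIC_API_KEY variable and the autouse approach come from the commit message.

```python
import pytest

# Placeholder value, never sent anywhere; the name is hypothetical.
DUMMY_KEY = "test-key-not-a-real-credential"


@pytest.fixture(autouse=True)
def dummy_anthropic_key(monkeypatch):
    """Set a fake key for every test so client construction does not raise."""
    # monkeypatch restores the previous environment after each test.
    monkeypatch.setenv("ANTHROPIC_API_KEY", DUMMY_KEY)
```

Because the fixture is autouse, tests in the module need no changes to pick it up, and the environment is restored after each test.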

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>