perf: add adaptive reasoning effort and token-cost optimizations by andrelncampos · Pull Request #166 · lessweb/deepcode-cli

andrelncampos · 2026-06-05T20:47:20Z

This PR reduces token cost during long agent sessions by adding adaptive reasoning effort and related token-usage optimizations.

Main changes:

Adds RuntimeReasoningEffortManager to dynamically switch reasoning_effort between high and max.
Escalates to max after repeated tool failures or repeated identical tool-call loops.
Downgrades back to high after stable clean turns.
Adds cooldowns and anti-flapping behavior to avoid oscillation.
Integrates runtime effort changes into the session loop.
Reuses cached tool definitions during the loop.
Caches the system prompt per model.
Uses estimated context size for active token tracking instead of relying only on response usage.
Adds tests for escalation, downgrade, reset, cooldown and anti-flapping behavior.
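
For readers who want a concrete picture of the escalation and downgrade rules listed above, here is a minimal sketch. Only the class name `RuntimeReasoningEffortManager` and the `high`/`max` levels come from this PR description; the option names, thresholds, `recordTurn` method, and reset-on-transition behavior are assumptions made for illustration and may differ from the actual implementation.

```typescript
// Illustrative sketch only: field names, thresholds, and method names are
// assumptions, not the code in this PR.
type ReasoningEffort = "high" | "max";

interface TurnSignal {
  toolFailed: boolean;              // did a tool call fail in this turn?
  toolCallSignature: string | null; // e.g. tool name + serialized args; null if no tool call
}

interface EffortOptions {
  failureThreshold: number;      // consecutive failed turns before escalating
  loopThreshold: number;         // consecutive identical tool calls before escalating
  cleanTurnsToDowngrade: number; // consecutive clean turns before downgrading
  cooldownTurns: number;         // minimum turns between two transitions (anti-flapping)
}

class RuntimeReasoningEffortManager {
  private readonly opts: EffortOptions;
  private effort: ReasoningEffort = "high";
  private failures = 0;
  private repeats = 0;
  private cleanTurns = 0;
  private lastSignature: string | null = null;
  private turnsSinceChange = Number.POSITIVE_INFINITY; // no transition has happened yet

  constructor(opts: EffortOptions) {
    this.opts = opts;
  }

  get current(): ReasoningEffort {
    return this.effort;
  }

  // Feed one completed turn in; get back the effort to use for the next request.
  recordTurn(signal: TurnSignal): ReasoningEffort {
    this.turnsSinceChange += 1;
    this.failures = signal.toolFailed ? this.failures + 1 : 0;

    if (signal.toolCallSignature === null) {
      this.repeats = 0;
    } else if (signal.toolCallSignature === this.lastSignature) {
      this.repeats += 1;
    } else {
      this.repeats = 1;
    }
    this.lastSignature = signal.toolCallSignature;

    const failing = this.failures >= this.opts.failureThreshold;
    const looping = this.repeats >= this.opts.loopThreshold;
    this.cleanTurns = signal.toolFailed || looping ? 0 : this.cleanTurns + 1;

    const cooledDown = this.turnsSinceChange >= this.opts.cooldownTurns;
    if (this.effort === "high" && (failing || looping) && cooledDown) {
      this.transition("max");
    } else if (
      this.effort === "max" &&
      this.cleanTurns >= this.opts.cleanTurnsToDowngrade &&
      cooledDown
    ) {
      this.transition("high");
    }
    return this.effort;
  }

  // Switch level and clear counters so stale signals cannot trigger a second flip.
  private transition(next: ReasoningEffort): void {
    this.effort = next;
    this.turnsSinceChange = 0;
    this.failures = 0;
    this.repeats = 0;
    this.cleanTurns = 0;
    this.lastSignature = null;
  }
}

// Usage: two consecutive failed turns escalate from "high" to "max".
const manager = new RuntimeReasoningEffortManager({
  failureThreshold: 2,
  loopThreshold: 3,
  cleanTurnsToDowngrade: 3,
  cooldownTurns: 3,
});
manager.recordTurn({ toolFailed: true, toolCallSignature: "read:a" }); // "high"
manager.recordTurn({ toolFailed: true, toolCallSignature: "read:b" }); // "max"
```

In a session loop, the value returned by `recordTurn` (or `manager.current`) would be passed as `reasoning_effort` on the next model request. Clearing all counters on every transition is one simple way to get anti-flapping behavior: a new transition needs both fresh evidence and an elapsed cooldown.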

Why:
Using max reasoning effort for every turn is expensive. This change keeps the default effort lower and escalates only when runtime signals indicate that the model needs more reasoning depth.

Validation:

npm run format
npm run build
npm run check
npm test

Result:

421 passing tests
0 failing tests
8 skipped tests

fix: stabilize dynamic reasoning effort transitions and tokens use

306c204

andrelncampos wants to merge 1 commit into lessweb:main from andrelncampos:fix/dynamic-reasoning-effort
