Fix #1897: bug(memos-local-plugin): Hermes background model status checks can trigger paid by Memtensor-AI · Pull Request #1899 · MemTensor/MemOS

Memtensor-AI · 2026-06-08T12:39:00Z

Description

Fixes #1897: paid LLM request storm on Hermes when the configured provider returns terminal errors (402 insufficient balance / 401 invalid key / 403). Issue reporter saw ~12,900 paid DeepSeek requests in 24 h tracking exactly the local system_model_status audit-row count.

Root cause: the system_model_status row name is misleading. It is not a health probe; it is the audit row written once per LLM call inside apps/memos-local-plugin/core/llm/client.ts::callWithFallback. With no circuit breaker, every pipeline subscriber (capture / session-relation / reward / L2 / L3 / skill / retrieval LLM filter / world-model) kept firing on every turn, induction, and closed episode, generating one paid request each even though the provider was permanently broken until the operator fixed billing.

Fix: per-LlmClient circuit breaker that trips on terminal provider errors (HTTP 401/402/403 or messages containing "insufficient balance" / "invalid api key" / "unauthorized" / "account suspended" / "billing"). When open, calls short-circuit inside the facade — zero paid requests, no further system_model_status="error" rows. Half-open after a 5-minute cool-down (configurable, min 30 s); the next call probes the provider and either closes the breaker on success or re-opens it on terminal failure. Host fallback rescues a call without tripping the breaker. A coalesced system_model_status="circuit_open" row is emitted at most once per ~25 s so the Logs viewer still surfaces the suppression event without replacing paid spam with audit-row spam. The breaker is enabled by default; legacy behaviour is available via circuitBreaker.enabled = false.

Tests: 9 new vitest cases under apps/memos-local-plugin/tests/unit/llm/client.test.ts covering trip on 402, trip on "insufficient balance" message, no trip on generic transient, coalescing under load, half-open close on success, host-fallback rescues without trip, legacy disabled mode, stats exposure, and re-open on terminal probe failure. All 59 LLM and 28 pipeline tests pass; tsc --noEmit clean. Three full-suite failures (storage migrator / traces-count / e2e episode merge) are pre-existing on the base branch dev-20260604-v2.0.19 and unrelated to this change.

Out of scope (tracked separately): 429 Retry-After handling (issue #1620), per-tool rate limits, daily budget caps.

Related Issue (Required): Fixes #1897

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (does not change functionality, e.g. code style improvements, linting)
Documentation update

How Has This Been Tested?

Automated tests are pending.

Unit Test
Test Script Or Test Steps (please provide)
Pipeline Automated API Test (please provide)

Checklist

I have performed a self-review of my own code
I have commented my code in hard-to-understand areas
I have added tests that prove my fix is effective or that my feature works
I have created related documentation issue/PR in MemOS-Docs (if applicable)
I have linked the issue to this PR (if applicable)
I have mentioned the person who will review this PR

@MatthewZhuang, @CarltonXiang, @syzsunshine219 please review this PR.

Reviewer Checklist

closes bug(memos-local-plugin): Hermes background model status checks can trigger paid LLM request storm #1897
Made sure Checks passed
Tests have been provided

…nal provider errors Issue #1897 reported ~12,900 paid LLM requests in 24 h on Hermes against a DeepSeek key with insufficient balance. The local `system_model_status` row count (12,900) closely tracked the provider-side `request_count` (11,344) for the same billing window. The naming is misleading: `system_model_status` is not a health probe; it is the audit row written once per LLM call (ok / fallback / error) inside `core/llm/client.ts`. With no circuit breaker, every pipeline subscriber (capture / session-relation / reward / L2 / L3 / skill / retrieval LLM filter / world-model) kept firing on every turn / closed episode / induction, generating one paid request each. Add a per-`LlmClient` circuit breaker: - Trips on terminal errors: HTTP 401/402/403 or messages containing `insufficient balance` / `invalid api key` / `unauthorized` / `account suspended` / `billing`. - Open: short-circuits subsequent calls inside the facade without contacting the provider. Throws `MemosError(LLM_UNAVAILABLE)` with `details.circuitOpen=true` so existing catch blocks still work. - Half-open after cool-down (default 5 min, configurable, min 30 s): next call probes the provider; success closes the breaker, terminal failure re-opens it for another cool-down. - Host fallback rescues a call without tripping the breaker — fallback exists precisely to keep going when the primary is down. - Coalesces `system_model_status="circuit_open"` audit rows to at most one per ~25 s while the breaker stays open, so we don't replace paid spam with audit-row spam. - Exposes `circuitOpen` / `circuitOpenUntil` / `circuitOpenedReason` via `LlmClientStats` for the Overview viewer card. - Enabled by default; legacy behaviour available via `circuitBreaker.enabled = false`. Tests: 9 new vitest cases under `tests/unit/llm/client.test.ts` covering trip on 402, trip on "insufficient balance" message, no trip on generic transient, coalescing, half-open close on success, host-fallback rescues without trip, disabled mode, stats fields, and re-open on terminal probe failure. All 59 LLM and 28 pipeline tests pass; `tsc --noEmit` clean. Out of scope (tracked separately): 429 `Retry-After` handling (issue #1620), per-tool rate limits, daily budget caps.

Memtensor-AI · 2026-06-08T12:48:11Z

✅ Automated Test Results: PASSED

All tests passed (35/35 executed, 36 skipped). memos_local_plugin/smoke: 0 passed, 1 skipped, memos_local_plugin/contract: 35 passed, 35 skipped. Duration: 5s

Branch: bugfix/autodev-1897

Memtensor-AI assigned CarltonXiang, MatthewZhuang and syzsunshine219 Jun 8, 2026

Memtensor-AI requested review from CarltonXiang, MatthewZhuang and syzsunshine219 June 8, 2026 12:39

Memtensor-AI added ai-generated bug Something isn't working | 功能异常 labels Jun 8, 2026

Memtensor-AI mentioned this pull request Jun 8, 2026

bug(memos-local-plugin): Hermes background model status checks can trigger paid LLM request storm #1897

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #1897: bug(memos-local-plugin): Hermes background model status checks can trigger paid #1899

Fix #1897: bug(memos-local-plugin): Hermes background model status checks can trigger paid #1899
Memtensor-AI wants to merge 1 commit into
dev-20260604-v2.0.19from
bugfix/autodev-1897

Memtensor-AI commented Jun 8, 2026

Uh oh!

Memtensor-AI commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Memtensor-AI commented Jun 8, 2026

Description

Type of change

How Has This Been Tested?

Checklist

Reviewer Checklist

Uh oh!

Memtensor-AI commented Jun 8, 2026

✅ Automated Test Results: PASSED

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants