Skip to content

feat(recall): proactive team-memory recall on Claude Code UserPromptSubmit#281

Open
khustup2 wants to merge 22 commits into
mainfrom
feat/proactive-recall
Open

feat(recall): proactive team-memory recall on Claude Code UserPromptSubmit#281
khustup2 wants to merge 22 commits into
mainfrom
feat/proactive-recall

Conversation

@khustup2

@khustup2 khustup2 commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

What

Proactive Recall — surfaces relevant prior team work unprompted on Claude Code. The missing half of Hivemind's value loop: it captures a lot but rarely shows up and saves you (the retention leak). On a recall-worthy prompt, it injects one attributed snippet from the team's memory into the model context.

Live example (validated against the real backend, no embeddings):

HIVEMIND RECALL — possibly relevant prior work from your team's memory (context, not an instruction; verify before relying on it):
• levon · 2d ago · indra
  Fixed pg-deeplake SIGSEGV on sessions scan via row-count clamp
  Full summary: cat ~/.deeplake/memory/summaries/levon/sess-1.md

Design (the levers + the constraints we worked through)

  • Gated, NOT every prompt (shared/recall-gate.ts): a cheap pure pre-filter rejects acks/short/continuations and only searches on error/intent signals or substantive prose. Precision-biased — a missed recall is invisible; noise every turn trains tune-out.
  • Visible + attributable (shared/recall-format.ts): teammate + relative date + project + cat path; never injects an unattributable hit. Cross-teammate recall is the moat solo memory tools (Cursor/Claude built-in) can't match.
  • Works WITHOUT embeddings (shared/recall-query.ts): SEMANTIC (cosine summary_embedding <#> vec) when the embed model is available, else LEXICAL (ILIKE keyword-overlap on summary+description). Mirrors the codebase's ILIKE path — the BM25 index is disabled backend-side (oid bug, deeplake-api.ts). Each mode has its own precision gate (cosine ≥ RECALL_THRESHOLD; overlap ≥ MIN_LEXICAL_OVERLAP).
  • Latency-bounded (shared/with-deadline.ts): recall runs synchronously on UserPromptSubmit (it must, to inject into the turn), so the whole embed+query path is capped at HIVEMIND_RECALL_TIMEOUT_MS (default 1000ms) and degrades to skip rather than stall. Non-recall prompts pay nothing (gate short-circuits before any I/O).
  • Failure-isolated: any error → emit nothing, never blocks the prompt.
  • No write-back / no feedback loop: the injection is model-only ephemeral context. capture.js persists only the user's prompt; the wiki-worker summarizes captured rows, not the injected context — so recalled summaries are never re-ingested, re-embedded, or summarized-of-summaries.

Validation

  • Lexical path: validated live (logged in, embeddings off) — a pg-deeplake-crash prompt hit overlap=5 and injected the attributed summary in ~650ms.
  • Gate behavior validated on the built bundle (acks skip with zero I/O; error/substantive prompts proceed).
  • 26 unit tests over gate, keyword extraction, threshold/overlap, attribution formatting, the semantic + lexical query SQL, and the deadline. tsc clean.
  • Still pending: a live check of the semantic path (needs the embeddings model installed) to tune RECALL_THRESHOLD.

Scope / rollout

Claude Code only for now (its additionalContext is model-only → low risk). Cursor next pending a beforeSubmitPrompt probe; Hermes/Pi feasible; Codex deliberately held out (its per-prompt additionalContext is user-visible — would clobber the TUI every prompt). Opt-out: HIVEMIND_RECALL=false.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features
    • Added proactive “recall” context injection on user prompt submission, using semantic recall when available and keyword-based fallback otherwise.
    • Added always-on recall telemetry to ~/.deeplake/recall-events.jsonl for injections, misses, timeouts, errors, and unattributable cases.
  • Bug Fixes
    • Improved reliability by canceling recall searches and retry waits promptly when operations are aborted, including deadline-bounded timeouts.
  • Documentation
    • Updated README with proactive recall configuration options (enable/disable, lexical overlap, and timeout).
  • Tests
    • Expanded and hardened recall unit and end-to-end test coverage.

…ubmit

Surfaces relevant prior work *unprompted* — the missing piece that turns
Hivemind from 'captures memory' into 'saves me, repeatedly'. On a
recall-worthy prompt only, semantic-search the team's summaries and, if
the top hit clears a relevance threshold, inject ONE attributed snippet
('recalled from <teammate> · <date>') into the model context.

Consolidates the three levers we agreed on:
- (1) proactive per-prompt recall — NOT on every prompt: a cheap pure
  gate (shared/recall-gate.ts) rejects acks/continuations and only
  searches on error/intent signals or substantive prose. Precision-biased
  (a missed recall is invisible; noise every turn trains tune-out).
- (2) visible + attributable, leaning on the team moat solo tools lack:
  shared/recall-format.ts surfaces the teammate + date + project and a
  cat-path to the full summary; never injects an unattributable hit.
- (5) measurable: a structured 'recall' log event (searched/hit/below/
  injected + score) so value is provable, not an abstract token counter.

Mechanics: shared/recall-query.ts runs a focused cosine query
((summary_embedding <#> vec) AS score, threshold-gated); recall.ts wires
gate -> embed -> query -> threshold -> inject, failure-isolated so it
never blocks a prompt. additionalContext on Claude Code is model-only
(invisible to the user). Opt-out via HIVEMIND_RECALL=false.

18 unit tests cover the gate, formatter, and query. Claude Code first
(model-only channel = low risk); Cursor/Codex follow once validated.

NOTE: the live embed+query path needs an integration check against a real
backend (not unit-testable in this harness) before un-drafting.
@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a recall.ts Claude Code hook that fires on UserPromptSubmit to conditionally inject attributed team memory context into the model. New shared modules implement precision gating, keyword extraction, semantic and lexical SQL query functions, a deadline-bounded promise helper, a context formatter, and structured telemetry recording. The hook and its bundle entry are wired into the build config and harness documentation. The Deeplake API is extended with AbortSignal support to enable fine-grained latency control.

Changes

Proactive Recall Hook

Layer / File(s) Summary
Deeplake API AbortSignal support
src/deeplake-api.ts, tests/shared/deeplake-api.test.ts
DeeplakeApi.query now accepts an optional caller-provided AbortSignal and passes it to _queryWithRetry. The retry logic builds composite signals using AbortSignal.any() so that an external abort cancels in-flight requests as well as retry backoff waits. Tests verify that an already-aborted signal causes immediate rejection without calling fetch, and that a live signal is passed through to the fetch call options.
Utility primitives: withDeadline and serializeFloat4Array export
src/hooks/shared/with-deadline.ts, src/shell/grep-core.ts, tests/shared/recall.test.ts
withDeadline races a promise against a timer and resolves to fallback on timeout, with settled-flag guarding and timer unref() to prevent holding the event loop. serializeFloat4Array is re-exported for embedding serialization in recall queries. Unit tests cover fast resolution, timeout fallback, rejection propagation, and non-positive-deadline passthrough.
Recall gating logic and keyword extraction
src/hooks/shared/recall-gate.ts, tests/shared/recall.test.ts
proactiveRecallDisabled checks env-based opt-out flags. shouldRecall applies layered length, acknowledgement-regex, signal-pattern, and word-count gates with structured reason codes. passesThreshold enforces finite cosine-score validation. extractKeywords tokenizes prompts for lexical fallback with stopword filtering, deduplication, and count capping. Exports RECALL_THRESHOLD and MIN_LEXICAL_OVERLAP constants. Unit tests cover gate precision, threshold boundaries, NaN handling, and keyword extraction behavior.
RecallHit schema and formatRecallContext
src/hooks/shared/recall-format.ts, tests/shared/recall.test.ts
Defines RecallHit and FormatRecallInput types. Implements parseSummaryPath to extract author/session from paths, daysAgo to compute relative day differences, and relativeDay to convert day deltas into labels. formatRecallContext produces an attributed HIVEMIND RECALL block with "you" vs. teammate phrasing, relative-date rendering, description normalization, and safe filesystem reference. Tests validate attribution, relative-date buckets, unattributable path rejection, and output structure.
Semantic and lexical recall SQL queries
src/hooks/shared/recall-query.ts, tests/shared/recall.test.ts
recallTopHit executes cosine-ranked semantic search with optional project scoping and path exclusion. recallTopHitLexical provides embedding-free fallback via ILIKE keyword overlap ranking. mapTopRow coerces result rows into RecallHit with safe field coercion and score normalization. Tests assert SQL shape, field mapping, null returns on empty results, and early exit for non-finite embeddings.
Recall telemetry recording
src/hooks/shared/recall-events.ts, tests/shared/recall.test.ts
Implements recordRecallEvent to append always-on structured JSONL logs to ~/.deeplake/recall-events.jsonl with event kind (injected, below, none, timeout, error, no-config, unattributable) and optional metadata. Creates parent directory as needed and never throws on filesystem errors. Tests validate JSONL append, one-line-per-event persistence, and non-throwing behavior on unwritable paths.
Hook orchestration: main(), findHit(), emit()
src/hooks/recall.ts, tests/claude-code/recall-hook.test.ts
Core entrypoint applies layered skip gates (env flag, wiki-worker mode, plugin toggle, shouldRecall), loads config token (recording no-config telemetry if missing), dispatches findHit() under withDeadline with AbortController signal, checks mode-aware threshold with hitPasses(), formats context via formatRecallContext, records structured recall event, and emits hookSpecificOutput JSON. Errors caught and isolated with process.exit(0). Comprehensive integration tests validate guard paths (env, plugin, config), lexical/semantic modes, threshold injection, semantic-to-lexical fallback, session-own-summary exclusion, and failure isolation across all event kinds.
Build config, harness wiring, and coverage thresholds
esbuild.config.mjs, harnesses/claude-code/hooks/hooks.json, vitest.config.ts
recall.js added to esbuild ccHooks bundle. The UserPromptSubmit hook sequence gains recall.js command entry with 2-second timeout, positioned after capture.js. Vitest coverage thresholds set to 90% for recall hook and shared helper modules.
Configuration and feature documentation
README.md
New environment variables for proactive recall control (disable flag, minimum lexical overlap, synchronous timeout) documented in config table. New "Proactive recall" section explains trigger conditions, snippet source (attributed team memory), semantic/lexical fallback behavior, relevance filtering, default-on status, and structured telemetry recording.

Sequence Diagram(s)

sequenceDiagram
  participant Claude as Claude Code
  participant recall as recall.ts
  participant gate as shouldRecall
  participant findHit as findHit()
  participant db as Deeplake SQL
  participant fmt as formatRecallContext
  participant emit as emit()

  Claude->>recall: UserPromptSubmit (stdin JSON)
  recall->>recall: env HIVEMIND_PROACTIVE_RECALL enabled?
  alt disabled
    recall-->>Claude: exit 0 (no output)
  end
  recall->>recall: plugin enabled?
  alt disabled
    recall-->>Claude: exit 0
  end
  recall->>gate: shouldRecall(prompt)
  alt recall: false
    gate-->>recall: skip reason
    recall-->>Claude: exit 0
  else recall: true
    recall->>findHit: findHit(config, prompt, now)<br/>(wrapped in withDeadline)
    findHit->>db: recallTopHit (semantic cosine)<br/>or recallTopHitLexical (ILIKE overlap)
    db-->>findHit: top row or empty
    findHit-->>recall: {kind:'hit'} | {kind:'none'} | timeout fallback
    recall->>recall: hitPasses(hit.score, hit.mode)?
    alt below threshold or unattributable
      recall-->>Claude: exit 0
    else passes threshold
      recall->>fmt: formatRecallContext(hit, currentUser, now)
      fmt-->>recall: attributed HIVEMIND RECALL block
      recall->>emit: emit(additionalContext)
      emit-->>Claude: console.log hookSpecificOutput JSON
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • activeloopai/hivemind#246: Adds isHivemindPluginEnabled() gating that the new recall hook depends on for its plugin-disabled skip path.

Suggested reviewers

  • efenocchi
  • kaghni

Poem

🐇 A memory stirs in the warren below,
The recall hook fires — should the context flow?
Semantic cosines or lexical match,
A deadline ticks, but we won't miss the catch.
"HIVEMIND RECALL" now echoes the day,
The rabbit remembered what others would say! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(recall): proactive team-memory recall on Claude Code UserPromptSubmit' is concise, specific, and clearly describes the main feature addition—proactive recall triggering on Claude Code's UserPromptSubmit event.
Description check ✅ Passed The description comprehensively covers what proactive recall does, design decisions, validation approach, and scope. However, the Version Bump section of the template is not addressed in the provided description.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/proactive-recall

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@github-actions

github-actions Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status Category Percentage Covered / Total
🟢 Lines 99.85% (🎯 90%) 668 / 669
🟢 Statements 98.30% (🎯 90%) 810 / 824
🟢 Functions 98.35% (🎯 90%) 119 / 121
🟢 Branches 92.66% (🎯 90%) 606 / 654
File Coverage — 8 files changed
File Stmts Branches Functions Lines
src/deeplake-api.ts 🟢 97.9% 🔴 88.8% 🟢 98.0% 🟢 99.6%
src/hooks/recall.ts 🟢 100.0% 🟢 92.3% 🟢 100.0% 🟢 100.0%
src/hooks/shared/recall-events.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/hooks/shared/recall-format.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/hooks/shared/recall-gate.ts 🟢 100.0% 🟢 95.2% 🟢 100.0% 🟢 100.0%
src/hooks/shared/recall-query.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/hooks/shared/with-deadline.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/shell/grep-core.ts 🟢 97.5% 🟢 92.8% 🟢 97.4% 🟢 100.0%

Generated for commit d8fbe49.

khustup2 added 2 commits June 21, 2026 07:06
recall runs synchronously on UserPromptSubmit (it must, to inject context
into the turn), so on a recall-worthy prompt it blocks the agent's
response until it finishes. Previously the SQL query was only bounded by
the 10s hook timeout — a slow/cold backend could stall the turn.

Add withDeadline(): race the whole embed+query path against
HIVEMIND_RECALL_TIMEOUT_MS (default 1000ms); on timeout, skip (emit
nothing) instead of stalling. Worst-case added latency on a recall-worthy
prompt is now ~1s, not up to 10s; non-recall prompts still pay nothing
(the gate short-circuits before any I/O). Tightened the hooks.json
backstop 10s -> 5s. 3 withDeadline tests (beats / exceeds / rejects).
…t embeddings

Recall previously required the embeddings model (semantic-only); it now
falls back to lexical search when embeddings are unavailable/disabled, so
it works on a plain login with no 600MB model install.

- extractKeywords(): salient tokens from the prompt (drop stopwords/short,
  keep identifiers), de-duped + capped.
- recallTopHitLexical(): ranks summaries by distinct-keyword overlap via
  ILIKE on summary+description (mirrors the codebase's ILIKE search; the
  BM25 index is currently disabled backend-side, deeplake-api.ts). score =
  overlap count, gated by MIN_LEXICAL_OVERLAP (default 2) for precision.
- recall.ts: semantic when embeddings yield a vector, else lexical (also
  falls back when summaries aren't embedded yet). Mode-aware gate +
  mode-tagged telemetry.

Validated live (logged in, embeddings off): a pg-deeplake-crash prompt hit
overlap=5 and injected the attributed summary in ~650ms. 26 unit tests.
@khustup2 khustup2 marked this pull request as ready for review June 21, 2026 08:23

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
tests/shared/recall.test.ts (1)

104-112: ⚡ Quick win

Prefer exact assertions for recall block content instead of broad substrings.

For this structured output, asserting the full expected lines (especially the summary path line and instruction-framing line) will catch regressions that toContain(...) can miss.

As per coding guidelines, "tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."

Also applies to: 125-128

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/recall.test.ts` around lines 104 - 112, The test for
formatRecallContext is using overly broad toContain assertions which can miss
regressions in structured output. Replace the individual toContain calls with
exact assertions on the complete expected lines, particularly for the
instruction-framing line containing the full path and the summary information
line. Focus especially on asserting the complete path like "cat
~/.deeplake/memory/summaries/levon/sess-1.md" as a full line rather than just
the substring. This same refactoring should also be applied to the similar test
case mentioned at lines 125-128. Use exact line-matching assertions instead of
generic substring checks to ensure the recall block output structure is
preserved across changes.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/hooks/recall.ts`:
- Around line 45-50: The current parsing of EMBED_TIMEOUT_MS and
RECALL_BUDGET_MS environment variables using Number() can result in invalid
values like NaN, 0, or negative numbers, which would cause timeouts to expire
instantly or fail silently. Add validation for both constants to ensure the
parsed value is a positive number, and fall back to the default value if the
parsed number is not valid (check for NaN, zero, or negative values). This
ensures only sensible timeout values are used and prevents recall behavior from
being disabled due to invalid timing values.

In `@src/hooks/shared/recall-format.ts`:
- Around line 62-76: The return statement in the formatHivemindRecall function
is constructing a shell command by directly interpolating hit.path from the
database into a cat command, which is a security risk even after parsing.
Instead of using the raw hit.path in the shell command string, reconstruct the
path from the validated safe segments returned by parseSummaryPath (the parsed
object containing author and other fields). Build the printed path by
reassembling the known safe components rather than interpolating the unvalidated
database-derived hit.path directly into the command template.

In `@src/hooks/shared/recall-query.ts`:
- Around line 48-52: The pgvector `<#>` operator returns the negative inner
product, which means the current `ORDER BY score DESC` clause is sorting by
least similar results first instead of most similar. In the SQL query
construction where `summary_embedding <#> ${vecLit}` is used as the score,
negate the operator result (e.g., by prefixing it with a negation operator) so
that positive similarity scores are produced and `ORDER BY score DESC` correctly
ranks by relevance with the most similar results first. Additionally, update the
inaccurate file comment on line 8 that claims `<#>` is "normalized similarity
(0..1, higher = closer)" to correctly reflect that `<#>` returns the negative
inner product.

In `@src/hooks/shared/with-deadline.ts`:
- Line 8: The early return of promise `p` when `ms` is non-positive bypasses the
rejection-to-fallback degradation behavior, allowing rejections to escape
without proper handling. Remove or modify the condition that returns `p`
directly for non-positive ms values to ensure that all promises go through the
rejection-to-fallback wrapping mechanism, maintaining consistent error handling
behavior regardless of the deadline value passed to the withDeadline function.

---

Nitpick comments:
In `@tests/shared/recall.test.ts`:
- Around line 104-112: The test for formatRecallContext is using overly broad
toContain assertions which can miss regressions in structured output. Replace
the individual toContain calls with exact assertions on the complete expected
lines, particularly for the instruction-framing line containing the full path
and the summary information line. Focus especially on asserting the complete
path like "cat ~/.deeplake/memory/summaries/levon/sess-1.md" as a full line
rather than just the substring. This same refactoring should also be applied to
the similar test case mentioned at lines 125-128. Use exact line-matching
assertions instead of generic substring checks to ensure the recall block output
structure is preserved across changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 497b3ce1-58b1-4623-882a-bf03d32ae3db

📥 Commits

Reviewing files that changed from the base of the PR and between 626e860 and 77586b1.

📒 Files selected for processing (9)
  • esbuild.config.mjs
  • harnesses/claude-code/hooks/hooks.json
  • src/hooks/recall.ts
  • src/hooks/shared/recall-format.ts
  • src/hooks/shared/recall-gate.ts
  • src/hooks/shared/recall-query.ts
  • src/hooks/shared/with-deadline.ts
  • src/shell/grep-core.ts
  • tests/shared/recall.test.ts

Comment thread src/hooks/recall.ts Outdated
Comment thread src/hooks/shared/recall-format.ts Outdated
Comment thread src/hooks/shared/recall-query.ts Outdated
Comment thread src/hooks/shared/with-deadline.ts Outdated
The pure helpers were tested but recall.ts itself — the branching where
bugs hide — wasn't. Add tests driving main() with mocked boundaries:
- guards: HIVEMIND_RECALL=false, plugin disabled, ack prompt (no I/O),
  no-config
- lexical: inject above overlap floor, skip below, skip <2 keywords,
  excludes the current session's own summary
- semantic: inject above cosine threshold, skip below, fall back to
  lexical when no embedded rows / when the embed daemon is down
- latency budget: skip (no emit) when the search exceeds the budget
- failure isolation: query error never throws / never emits
- no-rows: emit nothing

41 recall tests total (26 pure + 15 orchestration); tsc clean.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/claude-code/recall-hook.test.ts (1)

128-129: ⚡ Quick win

Prefer exact log assertions instead of substring matches.

These paths are deterministic in this suite, so expect.stringContaining(...) is broader than needed and can hide regressions in mode/score/threshold formatting.

As per coding guidelines, "Prefer asserting on specific values (paths, messages) over generic substrings."

Also applies to: 135-136, 161-162, 169-170, 198-199, 211-212

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/claude-code/recall-hook.test.ts` around lines 128 - 129, Replace the
substring matching approach using expect.stringContaining with exact string
assertions for the debugLogMock calls throughout the test file. Instead of
checking if the log contains "injected mode=lexical", determine what the
complete deterministic log message should be for each test case and assert on
the exact full message. This applies to all instances of expect.stringContaining
in the debugLogMock assertions at lines 128-129, 135-136, 161-162, 169-170,
198-199, and 211-212. For each assertion, identify the exact expected log output
(including mode, score, and threshold values as appropriate) and change from
expect.stringContaining() to an exact equality assertion.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/claude-code/recall-hook.test.ts`:
- Around line 98-105: The test "skips an acknowledgement prompt before any I/O"
has prompts that are too short and trigger the "too-short" gate before reaching
the intended acknowledgement logic, resulting in false coverage. Additionally,
the assertion uses a generic substring match instead of asserting on the
specific expected value. Fix this by making the stdinMock prompt long enough to
pass the minimum length gate so the test exercises the actual acknowledgement
branch, and replace the generic stringContaining assertion with a specific
assertion matching the exact expected skip gate message value.

---

Nitpick comments:
In `@tests/claude-code/recall-hook.test.ts`:
- Around line 128-129: Replace the substring matching approach using
expect.stringContaining with exact string assertions for the debugLogMock calls
throughout the test file. Instead of checking if the log contains "injected
mode=lexical", determine what the complete deterministic log message should be
for each test case and assert on the exact full message. This applies to all
instances of expect.stringContaining in the debugLogMock assertions at lines
128-129, 135-136, 161-162, 169-170, 198-199, and 211-212. For each assertion,
identify the exact expected log output (including mode, score, and threshold
values as appropriate) and change from expect.stringContaining() to an exact
equality assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bb8e8f11-1d61-4157-974d-b892f2380ed7

📥 Commits

Reviewing files that changed from the base of the PR and between 77586b1 and 89149d1.

📒 Files selected for processing (1)
  • tests/claude-code/recall-hook.test.ts

Comment thread tests/claude-code/recall-hook.test.ts
khustup2 added 2 commits June 22, 2026 05:23
Recall previously only logged via HIVEMIND_DEBUG, so usage/hit-rate was
unanswerable without the flag. Add an always-on sink: one JSON line per
recall-worthy invocation to ~/.deeplake/recall-events.jsonl, independent
of debug, failure-isolated.

- recall-events.ts: recordRecallEvent(), append-only JSONL, never throws.
- recall.ts: records every post-gate outcome (injected/below/none/
  timeout/no-config/unattributable) with mode/score/author/teammate/
  project/session.
- tests: sink unit tests + orchestration assertions per outcome.
Make proactive recall (the aggressive auto-search-and-inject behavior)
cleanly optional, on by default, with a dedicated env var distinct from
capture and from the agent's own reactive recall.

- proactiveRecallDisabled() in recall-gate.ts: on by default; disable via
  HIVEMIND_PROACTIVE_RECALL_DISABLED=1 (matches HIVEMIND_AUTOPULL_DISABLED)
  or HIVEMIND_PROACTIVE_RECALL=0 (matches HIVEMIND_CAPTURE). Case-insensitive,
  whitespace-tolerant, pure/env-injectable.
- recall.ts: gate on it (replaces the narrow HIVEMIND_RECALL===false check),
  documented in the header.
- README: added the var to the Configuration table (+ MIN_OVERLAP /
  TIMEOUT_MS tuning knobs) and a 'Proactive recall' section explaining the
  feature, the on-by-default posture, and the exact disable commands.
- Tests: opt-out unit cases (all on/off + disabled forms, default-enabled)
  and orchestration cases for both env paths.
@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

khustup2 added 2 commits June 23, 2026 22:15
codex (P2):
- Scope recall to the CURRENT project (projectNameFromCwd) — keeps the
  cross-teammate value, drops cross-repo false positives.
- shouldRecall evaluates ack + SIGNAL_RES BEFORE the length gate, so short
  high-signal prompts ('TypeError in auth', 'how did we fix X?') recall
  instead of being rejected as too-short.

coderabbit:
- parsePositive() guards numeric env overrides (NaN/0/negative → fallback)
  for the embed timeout, recall budget, and lexical-overlap threshold, so a
  bad value can't instant-timeout or disable recall.
- formatRecallContext prints a path POINTER (not a 'cat' shell command),
  built only from validated path segments — DB-derived values can't produce
  unsafe command text if copied.
- withDeadline degrades a rejection to fallback even when ms<=0; dropped the
  always-true unref guard.
(coderabbit <#>-ordering comment rebutted separately: Deeplake's <#> is
normalized similarity 0..1, higher=closer, so DESC is correct — mirrors
the production grep-core path.)
…gister thresholds

- coderabbit: the ack / keywords<2 orchestration tests were short enough to
  hit the gate's too-short branch instead of the branch they claimed; use a
  signal prompt with <2 keywords ('TypeError?') and rely on the gate reorder
  so the ack test exercises ack.
- New coverage: low-signal gate branch, all relativeDay buckets
  (today/yesterday/days/weeks/months/unparseable), unsafe-path-segment
  omission, empty-description, parsePositive edge cases, withDeadline ms<=0,
  recallTopHit no-opts + non-numeric-score, HIVEMIND_WIKI_WORKER guard, and
  the 'unattributable' outcome.
- Register all six recall files in vitest.config thresholds at 90/90/90/90
  (new-files convention).
@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/hooks/shared/recall-format.ts`:
- Around line 71-77: The safeSeg regex validation pattern allows dot characters,
which enables directory traversal attacks through `..` sequences in path
segments. Modify the safeSeg regex pattern to prevent this vulnerability by
either removing the dot (.) character from the allowed character class entirely,
or by explicitly excluding the `.` and `..` patterns. This ensures that
database-derived author and session values cannot be used to escape the intended
~/.deeplake/memory/summaries/ subdirectory path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f851aad1-2e08-4020-a914-9a483798303a

📥 Commits

Reviewing files that changed from the base of the PR and between e97adae and 2dd44f1.

📒 Files selected for processing (7)
  • src/hooks/recall.ts
  • src/hooks/shared/recall-format.ts
  • src/hooks/shared/recall-gate.ts
  • src/hooks/shared/with-deadline.ts
  • tests/claude-code/recall-hook.test.ts
  • tests/shared/recall.test.ts
  • vitest.config.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/claude-code/recall-hook.test.ts
  • src/hooks/shared/recall-gate.ts
  • src/hooks/shared/with-deadline.ts
  • tests/shared/recall.test.ts
  • src/hooks/recall.ts

Comment thread src/hooks/shared/recall-format.ts Outdated
…e 2)

codex [P2] — a below-threshold semantic hit no longer suppresses the
lexical path: findHit now prefers a semantic hit that clears the threshold,
otherwise falls through to lexical (exact keyword/identifier match), the
same hybrid blend the grep path uses. Identifier-heavy prompts (stack
traces, exact error names) recall even when the best embedding hit is
unrelated. Test: 'a below-threshold semantic hit does not suppress a
passing lexical match'.

codex [P1] — bound the worst-case turn block. embed self-caps at 500ms and
query at its own AbortSignal.timeout, but in-flight I/O isn't cancelled, so
the hook process can linger past the 1s budget on a wedged backend.
Tightened the recall hook's hard timeout 5s -> 2s so Claude Code terminates
it well before then. Full in-flight cancellation (threading an
AbortController through DeeplakeApi.query's retry loop) is a focused
shared-infra follow-up.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-2 responses (b475a4e)

[P2] Hybrid lexical fallback → fixed. findHit now returns a semantic hit only when it clears RECALL_THRESHOLD; otherwise it falls through to the lexical path and injects a lexical hit that clears the overlap floor. A below-threshold semantic hit no longer suppresses an exact keyword/identifier match. Covered by the new 'a below-threshold semantic hit does not suppress a passing lexical match' test.

[P1] Wall-clock bound → partially addressed now, full fix tracked. embed() self-caps at 500ms and DeeplakeApi.query() at its own AbortSignal.timeout, but the in-flight sockets aren't cancelled, so the hook process can linger past the 1s budget on a wedged backend. I tightened the recall hook's hard timeout 5s→2s so Claude Code terminates it well before then (bounding the turn block). The complete fix — threading an AbortController through DeeplakeApi.query's retry loop so the budget truly cancels in-flight I/O — touches shared infra used by every hook; I'm doing it as a focused follow-up with its own tests rather than blind-shipping it here (local test runner is currently wedged, so CI is my only verification this cycle).

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…codex cycle 3)

codex [P1] — the budget now cancels in-flight I/O: thread an optional
AbortController signal through DeeplakeApi.query()'s retry loop
(AbortSignal.any with the per-attempt fetch timeout; abort-check between
retries). recall main() aborts on timeout so the query socket is cancelled
rather than running to its own timeout.

codex [P3] — backend failures no longer mislabeled as timeouts: withDeadline
is a pure deadline (propagates rejections); findHit is failure-isolated and
reports a distinct 'error' result, so recall-events records 'error' vs
'timeout' accurately.

Also: a too-weak lexical hit (overlap < floor) records 'none', not 'below'.

Tests: query abort/forward-signal, withDeadline propagation, findHit error,
hybrid below-then-lexical. tsc clean.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-3 responses (d1a4771)

[P1] Cancel in-flight I/O on budget expiry → fixed properly. DeeplakeApi.query() now accepts an optional AbortSignal (combined with its per-attempt fetch timeout via AbortSignal.any, with an abort-check between retries). recall passes the budget controller's signal and calls controller.abort() on timeout — so the query socket is actually cancelled, not left running to its own ~10s timeout. Tests: 'throws without fetching when an already-aborted signal is passed' + 'forwards a live caller signal into the fetch call'.

[P3] Don't label failures as timeouts → fixed. withDeadline is now a pure deadline (returns the fallback only on the timer; propagates resolutions/rejections), and findHit is failure-isolated, returning a distinct error result. recall-events.jsonl now records error vs timeout accurately. Tests: 'records error (not timeout) and never emits when the query fails' + the withDeadline propagation tests.

Local v8-coverage instrumentation is wedged in my environment so CI is verifying the full suite + 90% gate this cycle; tsc is clean and the plain test run was green.

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/deeplake-api.ts`:
- Around line 239-253: The _queryWithRetry method checks the externalSignal at
the start of the retry loop but the retry delays (sleep/backoff between
attempts) do not respect signal abortion, causing the query promise to remain
alive until the backoff completes even after the external signal fires. Locate
the sleep or delay call used between retry attempts and modify it to accept the
externalSignal parameter so that it can be interrupted when the signal aborts.
This ensures that if the caller's signal (e.g., recall's latency budget) fires
while waiting between retries, the delay is immediately cancelled rather than
blocking the entire operation.

In `@tests/shared/deeplake-api.test.ts`:
- Around line 38-47: The abort assertions in the test are too generic and don't
sufficiently verify the abort behavior. For the first test checking the rejected
query call with AbortSignal.abort(), replace the regex pattern matching in the
toThrow assertion with an exact error message check to verify the specific abort
error. For the second test checking that a live caller signal is forwarded into
the fetch call, after calling api.query with ctrl.signal, abort the controller
and then assert that the signal captured in mockFetch.mock.calls[0][1].signal is
actually aborted (check its aborted property) rather than just verifying it is
an instance of AbortSignal. This ensures the implementation properly forwards
the caller's signal and doesn't drop it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a0025753-d000-4720-9537-1653aa22b473

📥 Commits

Reviewing files that changed from the base of the PR and between b475a4e and d1a4771.

📒 Files selected for processing (7)
  • src/deeplake-api.ts
  • src/hooks/recall.ts
  • src/hooks/shared/recall-events.ts
  • src/hooks/shared/with-deadline.ts
  • tests/claude-code/recall-hook.test.ts
  • tests/shared/deeplake-api.test.ts
  • tests/shared/recall.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/hooks/shared/recall-events.ts
  • tests/claude-code/recall-hook.test.ts
  • tests/shared/recall.test.ts

Comment thread src/deeplake-api.ts
Comment on lines +38 to +47
await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i);
expect(mockFetch).not.toHaveBeenCalled();
});

it("forwards a live caller signal into the fetch call", async () => {
mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] }));
const api = makeApi();
const ctrl = new AbortController();
await api.query("SELECT id FROM t", ctrl.signal);
expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Assert exact abort behavior, not just a generic signal.

Line 38 can assert the exact message, and Line 47 should prove the captured fetch signal aborts when ctrl aborts; otherwise an implementation that drops the caller signal still passes.

Tighten the abort assertions
-    await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i);
+    await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow("Query aborted");
     expect(mockFetch).not.toHaveBeenCalled();
   });
@@
     const ctrl = new AbortController();
     await api.query("SELECT id FROM t", ctrl.signal);
-    expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal);
+    const fetchSignal = mockFetch.mock.calls[0][1].signal as AbortSignal;
+    expect(fetchSignal).toBeInstanceOf(AbortSignal);
+    expect(fetchSignal.aborted).toBe(false);
+    ctrl.abort();
+    expect(fetchSignal.aborted).toBe(true);
   });

As per path instructions, tests/**: “Prefer asserting on specific values (paths, messages) over generic substrings.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i);
expect(mockFetch).not.toHaveBeenCalled();
});
it("forwards a live caller signal into the fetch call", async () => {
mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] }));
const api = makeApi();
const ctrl = new AbortController();
await api.query("SELECT id FROM t", ctrl.signal);
expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal);
await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow("Query aborted");
expect(mockFetch).not.toHaveBeenCalled();
});
it("forwards a live caller signal into the fetch call", async () => {
mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] }));
const api = makeApi();
const ctrl = new AbortController();
await api.query("SELECT id FROM t", ctrl.signal);
const fetchSignal = mockFetch.mock.calls[0][1].signal as AbortSignal;
expect(fetchSignal).toBeInstanceOf(AbortSignal);
expect(fetchSignal.aborted).toBe(false);
ctrl.abort();
expect(fetchSignal.aborted).toBe(true);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/deeplake-api.test.ts` around lines 38 - 47, The abort assertions
in the test are too generic and don't sufficiently verify the abort behavior.
For the first test checking the rejected query call with AbortSignal.abort(),
replace the regex pattern matching in the toThrow assertion with an exact error
message check to verify the specific abort error. For the second test checking
that a live caller signal is forwarded into the fetch call, after calling
api.query with ctrl.signal, abort the controller and then assert that the signal
captured in mockFetch.mock.calls[0][1].signal is actually aborted (check its
aborted property) rather than just verifying it is an instance of AbortSignal.
This ensures the implementation properly forwards the caller's signal and
doesn't drop it.

Source: Path instructions

…ex cycle 4)

codex [P1] — the budget now also short-circuits the retry BACKOFF, not just
the in-flight fetch: sleep(ms, signal) clears its timer and rejects on abort,
and DeeplakeApi.query's two retry sleeps pass the caller signal. So a
transient retryable failure can't keep the hook process alive through the
exponential backoff past the budget.

codex [P2] — recall no longer drops LEGACY summary rows. formatRecallContext
attributes from the row's own 'author' column (the query already selects it)
instead of requiring the path to parse as /summaries/<author>/<session>;
non-canonical paths still inject, just without the path pointer. Only a row
with no author at all is 'unattributable'.

Tests: legacy-path injection, no-author → unattributable, format teammate/you
paths unchanged. tsc clean.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-4 responses (5a49fa8)

[P1] Abortable retry backoff → fixed. sleep(ms, signal) now clears its timer and rejects on abort, and DeeplakeApi.query's two retry sleeps pass the caller signal. So a transient retryable failure can't keep the process alive through the exponential backoff past the budget — the abort short-circuits the sleep too, not just the fetch.

[P2] Legacy summary paths → fixed. formatRecallContext now attributes from the row's own author column (the query selects it) rather than requiring the path to parse as /summaries/<author>/<session>. Legacy/non-canonical paths still inject (just without the path pointer); only a row with no author at all is unattributable. Tests: 'injects a LEGACY row using the row's author' + 'unattributable only when there is no author'.

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…ex cycle 5)

codex [P2]: `project` scoping uses the cwd basename, so repos sharing a
leaf dir name (…/foo/api vs …/bar/api) collide. This is inherent to the
stored data — capture tags summary rows with the same basename `project`
(projectNameFromCwd), and the memory table has no more-unique key
(project_key exists only on the codebase table). Recall filters as
precisely as the data allows, and basename scoping is strictly better than
the org-wide search it replaced. Full disambiguation needs a unique
project key on summary rows (capture/schema change + migration), tracked
as a follow-up. Documented inline.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-5 response (c5eb009)

[P2] basename project-scoping collision — documented as a known, bounded limitation rather than 'fixed', because the proper fix isn't in recall:

  • project is the cwd basename, set by projectNameFromCwd — the SAME value capture tags every summary row with. So recall filters exactly as precisely as the stored data; it can't disambiguate …/foo/api from …/bar/api because the summaries themselves are both tagged project='api'.
  • The memory/summaries table has no more-unique key (project_key exists only on the codebase table).
  • Truly disambiguating colliding basenames requires a unique project key on summary rows — a capture + schema change with data migration, out of scope for this recall PR and tracked as a follow-up.
  • Basename scoping is still strictly better than the org-wide search you asked me to replace in cycle 1, and the collision only bites within one org's memory when two repos share a leaf name. Documented inline at recall.ts.

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…ex cycle 6)

codex [P2] — the injected 'Full summary:' pointer now uses the configured
memory root (config.memoryPath / HIVEMIND_MEMORY_PATH) instead of a literal
~/.deeplake/memory, so it points at the real file in non-default installs.
formatRecallContext takes memoryRoot; recall passes config.memoryPath.

codex [P2] — the embed step is now budget-bound: EMBED_TIMEOUT_MS is clamped
to <= RECALL_BUDGET_MS, so EmbedClient.embed() (which has no abort hook)
can't outlast the recall budget even if HIVEMIND_SEMANTIC_EMBED_TIMEOUT_MS
is set higher.

Tests: custom-memory-root pointer; CONFIG mock carries memoryPath. tsc clean.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-6 responses (d9e84bf)

[P2] Custom memory root → fixed. formatRecallContext now builds the 'Full summary:' pointer from the configured memory root (config.memoryPath / HIVEMIND_MEMORY_PATH, trailing slash normalized) instead of a literal ~/.deeplake/memory. Test: 'builds the summary pointer from the configured memory root'.

[P2] Embed not budget-bound → fixed. EMBED_TIMEOUT_MS is clamped to <= RECALL_BUDGET_MS, so EmbedClient.embed() (no abort hook) can't outlast the recall budget even when HIVEMIND_SEMANTIC_EMBED_TIMEOUT_MS is set higher than the budget.

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…ng (codex cycle 7)

codex [P1] — both recall queries now filter `path LIKE '/summaries/%'`, so
notes/goals/user files in the memory table can't outrank and get injected as
'prior work' (the formatter and README both assume summary hits).

codex [P2] — dropped cwd-basename project scoping. Summaries are tagged with
the capture-time cwd basename, so basename scoping (a) collides across repos
sharing a leaf name and (b) WORSE, silently drops valid same-repo history
when prompting from a subdirectory (session tagged 'repo', prompt from
'repo/src' → 'src'). Precision now comes from the summaries-only filter +
the relevance threshold; robust project-key scoping is a follow-up.

Tests: summary-row filter asserted on both queries; orchestration asserts
summaries-only and no basename project filter. tsc clean.
@khustup2

Copy link
Copy Markdown
Contributor Author

Codex cycle-7 responses (4f03e9d)

[P1] Restrict to summary rows → fixed. Both the semantic and lexical queries now filter path LIKE '/summaries/%', so notes/goals/user memory files can't outrank real session summaries and get injected as 'prior work'.

[P2] Basename scoping → fixed by removing it. You're right that it's worse than I framed: beyond the cross-repo collision, basename scoping silently drops valid same-repo history when prompting from a subdirectory (session tagged repo, later prompt from repo/src scopes to src → no match) — a common cd workflow. Since summaries are tagged with the capture-time cwd basename, no basename-derived filter reliably matches. Precision now comes from the /summaries/% filter + the relevance threshold; robust project-aware scoping needs a stable project key on summary rows (capture/schema change), tracked as a follow-up.

@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…n (codex cycle 8)

Codex P1: the recalled summary excerpt is AI-generated from prior
sessions and may carry user-controlled / injected text; injecting it
verbatim into model context let a single poisoned row smuggle live
instructions into unrelated sessions.

Render it inert before injection: sanitizeSnippet() neutralizes all
line terminators (LINE_TERMINATOR_RE, incl. U+2028/U+2029), replaces
backticks/double-quotes so it can't break the quoted frame or open a
fake code fence, collapses whitespace, and hard-caps at 240 chars.
The block header now explicitly frames the quoted text as untrusted
DATA, never an instruction. Covered by a new security test that feeds
an injection payload and asserts the excerpt is single-line, fence-free
and capped.
@khustup2

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…e 9)

Codex P2: the question/how-to phrasing in SIGNAL_RES fired BEFORE the
length gate, so bare follow-ups like 'which folder' or "what's the
cap?" matched a generic question word and triggered a synchronous
memory search + snippet injection — exactly the conversational
back-and-forth the gate is meant to skip, hurting precision/latency.

Split signals by strength: STRONG markers (errors, file:line,
ErrorName, recall/continuity intent) still win regardless of length,
but a generic question word is now a WEAK signal that only upgrades a
prompt which ALSO clears the substantive-length bar (>=24 chars,
>=6 words) to a 'signal' recall. Short question follow-ups fall through
to too-short/low-signal and are skipped. Covered by new gate tests.
@khustup2

Copy link
Copy Markdown
Contributor Author

Addressed latest codex review findings:

  • [P1] Treat recalled summary text as untrusted input — fixed in 4300ed54. sanitizeSnippet() now renders the excerpt inert before injection: neutralizes all line terminators (incl. U+2028/U+2029), strips backticks/quotes so it can't break the quoted frame or open a fake fence, collapses whitespace, hard-caps at 240 chars; header explicitly frames the quoted text as untrusted DATA, not an instruction. New security test feeds an injection payload and asserts single-line / fence-free / capped.
  • [P2] Short generic question follow-ups shouldn't recall — fixed in 408143b9. Split signals by strength: STRONG markers (errors, file:line, ErrorName, recall-intent) still win regardless of length, but a bare question word is now a WEAK signal that only recalls when the prompt also clears the substantive-length bar (≥24 chars, ≥6 words). "which folder" / "what's the cap?" now skip. New gate tests cover it.

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@khustup2 Great iterative work across these cycles — the prompt-injection hardening and the STRONG/WEAK signal split both look well-reasoned. Triggering the review now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Codex P2: on a Claude Code marketplace auto-upgrade, the new versioned
cache dir lacks the node_modules symlink that 'hivemind embeddings
install' created. capture.js repairs it via ensurePluginNodeModulesLink(),
but recall and capture are independent async UserPromptSubmit hooks —
recall can run first, so the first prompt after an upgrade silently lost
SEMANTIC recall (fell back to lexical/null) even with embeddings
installed.

recall.ts now runs the same best-effort self-heal right before building
the EmbedClient, inside the existing SEMANTIC_ENABLED branch (no new
gate branch; lexical-only runs never touch it). Failure is non-fatal —
we degrade to lexical. Tests mock self-heal, assert it runs before the
embed call (invocationCallOrder) and that a throwing repair still
recalls.
@khustup2

Copy link
Copy Markdown
Contributor Author

[P2] Repair plugin embed deps before semantic recall — fixed in 02980716. On a marketplace auto-upgrade, recall could build the EmbedClient before capture.js restored the new version's node_modules symlink, silently dropping semantic recall on the first post-upgrade prompt. recall.ts now runs the same best-effort ensurePluginNodeModulesLink() self-heal right before the EmbedClient (inside the existing semantic branch — no new gate branch; lexical-only runs are untouched). New tests assert the repair runs before the embed call and that a throwing repair still recalls.

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

@khustup2 The ensurePluginNodeModulesLink() self-heal placement sounds solid — running it inside the semantic branch right before the EmbedClient ensures lexical-only paths are untouched and that a broken symlink on post-upgrade doesn't silently downgrade semantic recall. Triggering the review now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

khustup2 added 2 commits June 23, 2026 23:59
…cle 11)

Codex P2: SessionStart/Capture/SessionEnd all skip non-interactive
entrypoints (sdk-py/sdk-ts/sdk-cli) when HIVEMIND_CAPTURE_ONLY_CLI=true,
but the new UserPromptSubmit recall hook did not. In CLI-only setups it
would still query Deeplake and inject hidden context into Agent SDK /
`claude -p` runs, perturbing scripted/headless output the user
explicitly scoped to interactive CLI.

main() now calls entrypointPassesOnlyCliGate() right after the
plugin-enabled check (before any stdin/I/O) and returns when it fails.
Tests: a sdk-cli session under the gate skips before any I/O; a bare
'cli' session still recalls.
…ycle 11)

Codex P2: the lexical fallback ordered only by overlap count — a small
integer that ties often when summaries share the same 2-3 keywords.
Since recall injects only the top row, Postgres was free to return an
arbitrary tied summary, making recall nondeterministic and able to
surface a STALE fix instead of the newest relevant one for the same
prompt.

Add a stable secondary sort 'last_update_date DESC, path ASC' (the
TIE_BREAK constant) to BOTH the lexical and semantic queries: on equal
score, prefer the most recently updated summary, then path as a final
total order so the same prompt always recalls the same row. Tests
assert the full ORDER BY on both query builders.
@khustup2

Copy link
Copy Markdown
Contributor Author

Addressed codex cycle 11 findings:

  • [P2] Honor HIVEMIND_CAPTURE_ONLY_CLI — fixed in 40c22e65. main() now calls entrypointPassesOnlyCliGate() right after the plugin-enabled check (before any I/O), so recall skips sdk-py/sdk-ts/sdk-cli sessions in CLI-only setups just like the other hooks. Tests: sdk-cli session skips before I/O; bare cli still recalls.
  • [P2] Deterministic score-tie break — fixed in a30128a9. Both the lexical and semantic queries now ORDER BY score DESC, last_update_date DESC, path ASC (TIE_BREAK), so equal-overlap ties resolve to the newest summary (then path as a total order) — no more arbitrary/stale row on repeated prompts. Tests assert the full ORDER BY on both builders.

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown

@khustup2 Both fixes look well-placed:

  • entrypointPassesOnlyCliGate() inserted before I/O is exactly the right gating order — sdk-py/sdk-ts/sdk-cli sessions are skipped without touching config loading or the embed stack.
  • Three-column ORDER BY score DESC, last_update_date DESC, path ASC gives a complete total order: relevance first, recency as the first tie-breaker, then a lexicographic stable key so equal-overlap repeated prompts always surface the same row.

Triggering the review now.

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

khustup2 added 3 commits June 24, 2026 00:02
…inter (coderabbit)

CodeRabbit: safeSeg allowed '.' chars, so a bare '..' segment in a
DB-derived summary path passed validation and the printed pointer could
traverse out of summaries/ (e.g. /summaries/../config.md →
~/.deeplake/memory/config.md). safeSeg now also rejects all-dots
segments (/^\.+$/), so '.'/'..' are refused while normal names and
versioned segments (v1.2) still pass. Traversal-y rows still recall —
just without the unsafe pointer. Covered by a new test.
The 'never throws when the path is unwritable' test pointed fake-home at
/proc/nonexistent/cannot-write, so recordRecallEvent ran
mkdirSync('/proc/nonexistent/cannot-write/.deeplake', {recursive:true}).
Recursive mkdir on a non-existent procfs path HANGS on Linux (the
syscall never returns) instead of erroring — wedging the entire vitest
run, which is why CI's 'Run tests with coverage' step never completed
(every run sat in_progress until the job timed out).

Use a deterministic unwritable path instead: point home at an existing
FILE so creating '<file>/.deeplake' fails fast with ENOTDIR. Same
assertion (recordRecallEvent must not throw, no file written), no hang.
Full suite now completes in ~30s.
With the hang fixed, the coverage step ran for the first time and
flagged three per-file gaps (previously invisible because the step
never completed):

- with-deadline.ts (branches 75%): the `settled` flag's `if (settled)
  return` guard was unreachable (we clearTimeout on settle + Promises
  settle once), so that branch could never be covered. Dropped the
  redundant bookkeeping — equivalent behavior, branches now 100%.
- recall-query.ts (branches 78%): added tests for the lexical `project`
  filter + explicit limit, and for mapTopRow defaulting every missing
  row field to '' (the ?? '' branches). 100%.
- recall.ts (functions 87.5%): added a test that makes main() reject
  (readStdin throws) to exercise the top-level fatal `.catch` arrow —
  logs 'fatal' and exits 0. Functions 100%.

All recall files now clear the 90% per-file thresholds; coverage runs
cleanly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants