feat(recall): proactive team-memory recall on Claude Code UserPromptSubmit#281
feat(recall): proactive team-memory recall on Claude Code UserPromptSubmit#281khustup2 wants to merge 22 commits into
Conversation
…ubmit
Surfaces relevant prior work *unprompted* — the missing piece that turns
Hivemind from 'captures memory' into 'saves me, repeatedly'. On a
recall-worthy prompt only, semantic-search the team's summaries and, if
the top hit clears a relevance threshold, inject ONE attributed snippet
('recalled from <teammate> · <date>') into the model context.
Consolidates the three levers we agreed on:
- (1) proactive per-prompt recall — NOT on every prompt: a cheap pure
gate (shared/recall-gate.ts) rejects acks/continuations and only
searches on error/intent signals or substantive prose. Precision-biased
(a missed recall is invisible; noise every turn trains tune-out).
- (2) visible + attributable, leaning on the team moat solo tools lack:
shared/recall-format.ts surfaces the teammate + date + project and a
cat-path to the full summary; never injects an unattributable hit.
- (5) measurable: a structured 'recall' log event (searched/hit/below/
injected + score) so value is provable, not an abstract token counter.
Mechanics: shared/recall-query.ts runs a focused cosine query
((summary_embedding <#> vec) AS score, threshold-gated); recall.ts wires
gate -> embed -> query -> threshold -> inject, failure-isolated so it
never blocks a prompt. additionalContext on Claude Code is model-only
(invisible to the user). Opt-out via HIVEMIND_RECALL=false.
18 unit tests cover the gate, formatter, and query. Claude Code first
(model-only channel = low risk); Cursor/Codex follow once validated.
NOTE: the live embed+query path needs an integration check against a real
backend (not unit-testable in this harness) before un-drafting.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a ChangesProactive Recall Hook
Sequence Diagram(s)sequenceDiagram
participant Claude as Claude Code
participant recall as recall.ts
participant gate as shouldRecall
participant findHit as findHit()
participant db as Deeplake SQL
participant fmt as formatRecallContext
participant emit as emit()
Claude->>recall: UserPromptSubmit (stdin JSON)
recall->>recall: env HIVEMIND_PROACTIVE_RECALL enabled?
alt disabled
recall-->>Claude: exit 0 (no output)
end
recall->>recall: plugin enabled?
alt disabled
recall-->>Claude: exit 0
end
recall->>gate: shouldRecall(prompt)
alt recall: false
gate-->>recall: skip reason
recall-->>Claude: exit 0
else recall: true
recall->>findHit: findHit(config, prompt, now)<br/>(wrapped in withDeadline)
findHit->>db: recallTopHit (semantic cosine)<br/>or recallTopHitLexical (ILIKE overlap)
db-->>findHit: top row or empty
findHit-->>recall: {kind:'hit'} | {kind:'none'} | timeout fallback
recall->>recall: hitPasses(hit.score, hit.mode)?
alt below threshold or unattributable
recall-->>Claude: exit 0
else passes threshold
recall->>fmt: formatRecallContext(hit, currentUser, now)
fmt-->>recall: attributed HIVEMIND RECALL block
recall->>emit: emit(additionalContext)
emit-->>Claude: console.log hookSpecificOutput JSON
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Coverage ReportScope: files changed in this PR. Enforced threshold: 90% per metric (per file via
File Coverage — 8 files changed
Generated for commit d8fbe49. |
recall runs synchronously on UserPromptSubmit (it must, to inject context into the turn), so on a recall-worthy prompt it blocks the agent's response until it finishes. Previously the SQL query was only bounded by the 10s hook timeout — a slow/cold backend could stall the turn. Add withDeadline(): race the whole embed+query path against HIVEMIND_RECALL_TIMEOUT_MS (default 1000ms); on timeout, skip (emit nothing) instead of stalling. Worst-case added latency on a recall-worthy prompt is now ~1s, not up to 10s; non-recall prompts still pay nothing (the gate short-circuits before any I/O). Tightened the hooks.json backstop 10s -> 5s. 3 withDeadline tests (beats / exceeds / rejects).
…t embeddings Recall previously required the embeddings model (semantic-only); it now falls back to lexical search when embeddings are unavailable/disabled, so it works on a plain login with no 600MB model install. - extractKeywords(): salient tokens from the prompt (drop stopwords/short, keep identifiers), de-duped + capped. - recallTopHitLexical(): ranks summaries by distinct-keyword overlap via ILIKE on summary+description (mirrors the codebase's ILIKE search; the BM25 index is currently disabled backend-side, deeplake-api.ts). score = overlap count, gated by MIN_LEXICAL_OVERLAP (default 2) for precision. - recall.ts: semantic when embeddings yield a vector, else lexical (also falls back when summaries aren't embedded yet). Mode-aware gate + mode-tagged telemetry. Validated live (logged in, embeddings off): a pg-deeplake-crash prompt hit overlap=5 and injected the attributed summary in ~650ms. 26 unit tests.
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
tests/shared/recall.test.ts (1)
104-112: ⚡ Quick winPrefer exact assertions for recall block content instead of broad substrings.
For this structured output, asserting the full expected lines (especially the summary path line and instruction-framing line) will catch regressions that
toContain(...)can miss.As per coding guidelines, "
tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."Also applies to: 125-128
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/shared/recall.test.ts` around lines 104 - 112, The test for formatRecallContext is using overly broad toContain assertions which can miss regressions in structured output. Replace the individual toContain calls with exact assertions on the complete expected lines, particularly for the instruction-framing line containing the full path and the summary information line. Focus especially on asserting the complete path like "cat ~/.deeplake/memory/summaries/levon/sess-1.md" as a full line rather than just the substring. This same refactoring should also be applied to the similar test case mentioned at lines 125-128. Use exact line-matching assertions instead of generic substring checks to ensure the recall block output structure is preserved across changes.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/hooks/recall.ts`:
- Around line 45-50: The current parsing of EMBED_TIMEOUT_MS and
RECALL_BUDGET_MS environment variables using Number() can result in invalid
values like NaN, 0, or negative numbers, which would cause timeouts to expire
instantly or fail silently. Add validation for both constants to ensure the
parsed value is a positive number, and fall back to the default value if the
parsed number is not valid (check for NaN, zero, or negative values). This
ensures only sensible timeout values are used and prevents recall behavior from
being disabled due to invalid timing values.
In `@src/hooks/shared/recall-format.ts`:
- Around line 62-76: The return statement in the formatHivemindRecall function
is constructing a shell command by directly interpolating hit.path from the
database into a cat command, which is a security risk even after parsing.
Instead of using the raw hit.path in the shell command string, reconstruct the
path from the validated safe segments returned by parseSummaryPath (the parsed
object containing author and other fields). Build the printed path by
reassembling the known safe components rather than interpolating the unvalidated
database-derived hit.path directly into the command template.
In `@src/hooks/shared/recall-query.ts`:
- Around line 48-52: The pgvector `<#>` operator returns the negative inner
product, which means the current `ORDER BY score DESC` clause is sorting by
least similar results first instead of most similar. In the SQL query
construction where `summary_embedding <#> ${vecLit}` is used as the score,
negate the operator result (e.g., by prefixing it with a negation operator) so
that positive similarity scores are produced and `ORDER BY score DESC` correctly
ranks by relevance with the most similar results first. Additionally, update the
inaccurate file comment on line 8 that claims `<#>` is "normalized similarity
(0..1, higher = closer)" to correctly reflect that `<#>` returns the negative
inner product.
In `@src/hooks/shared/with-deadline.ts`:
- Line 8: The early return of promise `p` when `ms` is non-positive bypasses the
rejection-to-fallback degradation behavior, allowing rejections to escape
without proper handling. Remove or modify the condition that returns `p`
directly for non-positive ms values to ensure that all promises go through the
rejection-to-fallback wrapping mechanism, maintaining consistent error handling
behavior regardless of the deadline value passed to the withDeadline function.
---
Nitpick comments:
In `@tests/shared/recall.test.ts`:
- Around line 104-112: The test for formatRecallContext is using overly broad
toContain assertions which can miss regressions in structured output. Replace
the individual toContain calls with exact assertions on the complete expected
lines, particularly for the instruction-framing line containing the full path
and the summary information line. Focus especially on asserting the complete
path like "cat ~/.deeplake/memory/summaries/levon/sess-1.md" as a full line
rather than just the substring. This same refactoring should also be applied to
the similar test case mentioned at lines 125-128. Use exact line-matching
assertions instead of generic substring checks to ensure the recall block output
structure is preserved across changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 497b3ce1-58b1-4623-882a-bf03d32ae3db
📒 Files selected for processing (9)
esbuild.config.mjsharnesses/claude-code/hooks/hooks.jsonsrc/hooks/recall.tssrc/hooks/shared/recall-format.tssrc/hooks/shared/recall-gate.tssrc/hooks/shared/recall-query.tssrc/hooks/shared/with-deadline.tssrc/shell/grep-core.tstests/shared/recall.test.ts
The pure helpers were tested but recall.ts itself — the branching where bugs hide — wasn't. Add tests driving main() with mocked boundaries: - guards: HIVEMIND_RECALL=false, plugin disabled, ack prompt (no I/O), no-config - lexical: inject above overlap floor, skip below, skip <2 keywords, excludes the current session's own summary - semantic: inject above cosine threshold, skip below, fall back to lexical when no embedded rows / when the embed daemon is down - latency budget: skip (no emit) when the search exceeds the budget - failure isolation: query error never throws / never emits - no-rows: emit nothing 41 recall tests total (26 pure + 15 orchestration); tsc clean.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/claude-code/recall-hook.test.ts (1)
128-129: ⚡ Quick winPrefer exact log assertions instead of substring matches.
These paths are deterministic in this suite, so
expect.stringContaining(...)is broader than needed and can hide regressions in mode/score/threshold formatting.As per coding guidelines, "Prefer asserting on specific values (paths, messages) over generic substrings."
Also applies to: 135-136, 161-162, 169-170, 198-199, 211-212
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/claude-code/recall-hook.test.ts` around lines 128 - 129, Replace the substring matching approach using expect.stringContaining with exact string assertions for the debugLogMock calls throughout the test file. Instead of checking if the log contains "injected mode=lexical", determine what the complete deterministic log message should be for each test case and assert on the exact full message. This applies to all instances of expect.stringContaining in the debugLogMock assertions at lines 128-129, 135-136, 161-162, 169-170, 198-199, and 211-212. For each assertion, identify the exact expected log output (including mode, score, and threshold values as appropriate) and change from expect.stringContaining() to an exact equality assertion.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/claude-code/recall-hook.test.ts`:
- Around line 98-105: The test "skips an acknowledgement prompt before any I/O"
has prompts that are too short and trigger the "too-short" gate before reaching
the intended acknowledgement logic, resulting in false coverage. Additionally,
the assertion uses a generic substring match instead of asserting on the
specific expected value. Fix this by making the stdinMock prompt long enough to
pass the minimum length gate so the test exercises the actual acknowledgement
branch, and replace the generic stringContaining assertion with a specific
assertion matching the exact expected skip gate message value.
---
Nitpick comments:
In `@tests/claude-code/recall-hook.test.ts`:
- Around line 128-129: Replace the substring matching approach using
expect.stringContaining with exact string assertions for the debugLogMock calls
throughout the test file. Instead of checking if the log contains "injected
mode=lexical", determine what the complete deterministic log message should be
for each test case and assert on the exact full message. This applies to all
instances of expect.stringContaining in the debugLogMock assertions at lines
128-129, 135-136, 161-162, 169-170, 198-199, and 211-212. For each assertion,
identify the exact expected log output (including mode, score, and threshold
values as appropriate) and change from expect.stringContaining() to an exact
equality assertion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: bb8e8f11-1d61-4157-974d-b892f2380ed7
📒 Files selected for processing (1)
tests/claude-code/recall-hook.test.ts
Recall previously only logged via HIVEMIND_DEBUG, so usage/hit-rate was unanswerable without the flag. Add an always-on sink: one JSON line per recall-worthy invocation to ~/.deeplake/recall-events.jsonl, independent of debug, failure-isolated. - recall-events.ts: recordRecallEvent(), append-only JSONL, never throws. - recall.ts: records every post-gate outcome (injected/below/none/ timeout/no-config/unattributable) with mode/score/author/teammate/ project/session. - tests: sink unit tests + orchestration assertions per outcome.
Make proactive recall (the aggressive auto-search-and-inject behavior) cleanly optional, on by default, with a dedicated env var distinct from capture and from the agent's own reactive recall. - proactiveRecallDisabled() in recall-gate.ts: on by default; disable via HIVEMIND_PROACTIVE_RECALL_DISABLED=1 (matches HIVEMIND_AUTOPULL_DISABLED) or HIVEMIND_PROACTIVE_RECALL=0 (matches HIVEMIND_CAPTURE). Case-insensitive, whitespace-tolerant, pure/env-injectable. - recall.ts: gate on it (replaces the narrow HIVEMIND_RECALL===false check), documented in the header. - README: added the var to the Configuration table (+ MIN_OVERLAP / TIMEOUT_MS tuning knobs) and a 'Proactive recall' section explaining the feature, the on-by-default posture, and the exact disable commands. - Tests: opt-out unit cases (all on/off + disabled forms, default-enabled) and orchestration cases for both env paths.
|
@coderabbitai review |
✅ Action performedReview finished.
|
codex (P2):
- Scope recall to the CURRENT project (projectNameFromCwd) — keeps the
cross-teammate value, drops cross-repo false positives.
- shouldRecall evaluates ack + SIGNAL_RES BEFORE the length gate, so short
high-signal prompts ('TypeError in auth', 'how did we fix X?') recall
instead of being rejected as too-short.
coderabbit:
- parsePositive() guards numeric env overrides (NaN/0/negative → fallback)
for the embed timeout, recall budget, and lexical-overlap threshold, so a
bad value can't instant-timeout or disable recall.
- formatRecallContext prints a path POINTER (not a 'cat' shell command),
built only from validated path segments — DB-derived values can't produce
unsafe command text if copied.
- withDeadline degrades a rejection to fallback even when ms<=0; dropped the
always-true unref guard.
(coderabbit <#>-ordering comment rebutted separately: Deeplake's <#> is
normalized similarity 0..1, higher=closer, so DESC is correct — mirrors
the production grep-core path.)
…gister thresholds
- coderabbit: the ack / keywords<2 orchestration tests were short enough to
hit the gate's too-short branch instead of the branch they claimed; use a
signal prompt with <2 keywords ('TypeError?') and rely on the gate reorder
so the ack test exercises ack.
- New coverage: low-signal gate branch, all relativeDay buckets
(today/yesterday/days/weeks/months/unparseable), unsafe-path-segment
omission, empty-description, parsePositive edge cases, withDeadline ms<=0,
recallTopHit no-opts + non-numeric-score, HIVEMIND_WIKI_WORKER guard, and
the 'unattributable' outcome.
- Register all six recall files in vitest.config thresholds at 90/90/90/90
(new-files convention).
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/hooks/shared/recall-format.ts`:
- Around line 71-77: The safeSeg regex validation pattern allows dot characters,
which enables directory traversal attacks through `..` sequences in path
segments. Modify the safeSeg regex pattern to prevent this vulnerability by
either removing the dot (.) character from the allowed character class entirely,
or by explicitly excluding the `.` and `..` patterns. This ensures that
database-derived author and session values cannot be used to escape the intended
~/.deeplake/memory/summaries/ subdirectory path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f851aad1-2e08-4020-a914-9a483798303a
📒 Files selected for processing (7)
src/hooks/recall.tssrc/hooks/shared/recall-format.tssrc/hooks/shared/recall-gate.tssrc/hooks/shared/with-deadline.tstests/claude-code/recall-hook.test.tstests/shared/recall.test.tsvitest.config.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- tests/claude-code/recall-hook.test.ts
- src/hooks/shared/recall-gate.ts
- src/hooks/shared/with-deadline.ts
- tests/shared/recall.test.ts
- src/hooks/recall.ts
…e 2) codex [P2] — a below-threshold semantic hit no longer suppresses the lexical path: findHit now prefers a semantic hit that clears the threshold, otherwise falls through to lexical (exact keyword/identifier match), the same hybrid blend the grep path uses. Identifier-heavy prompts (stack traces, exact error names) recall even when the best embedding hit is unrelated. Test: 'a below-threshold semantic hit does not suppress a passing lexical match'. codex [P1] — bound the worst-case turn block. embed self-caps at 500ms and query at its own AbortSignal.timeout, but in-flight I/O isn't cancelled, so the hook process can linger past the 1s budget on a wedged backend. Tightened the recall hook's hard timeout 5s -> 2s so Claude Code terminates it well before then. Full in-flight cancellation (threading an AbortController through DeeplakeApi.query's retry loop) is a focused shared-infra follow-up.
Codex cycle-2 responses (b475a4e)[P2] Hybrid lexical fallback → fixed. [P1] Wall-clock bound → partially addressed now, full fix tracked. |
|
@coderabbitai review |
✅ Action performedReview finished.
|
…codex cycle 3) codex [P1] — the budget now cancels in-flight I/O: thread an optional AbortController signal through DeeplakeApi.query()'s retry loop (AbortSignal.any with the per-attempt fetch timeout; abort-check between retries). recall main() aborts on timeout so the query socket is cancelled rather than running to its own timeout. codex [P3] — backend failures no longer mislabeled as timeouts: withDeadline is a pure deadline (propagates rejections); findHit is failure-isolated and reports a distinct 'error' result, so recall-events records 'error' vs 'timeout' accurately. Also: a too-weak lexical hit (overlap < floor) records 'none', not 'below'. Tests: query abort/forward-signal, withDeadline propagation, findHit error, hybrid below-then-lexical. tsc clean.
Codex cycle-3 responses (d1a4771)[P1] Cancel in-flight I/O on budget expiry → fixed properly. [P3] Don't label failures as timeouts → fixed. Local v8-coverage instrumentation is wedged in my environment so CI is verifying the full suite + 90% gate this cycle; |
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/deeplake-api.ts`:
- Around line 239-253: The _queryWithRetry method checks the externalSignal at
the start of the retry loop but the retry delays (sleep/backoff between
attempts) do not respect signal abortion, causing the query promise to remain
alive until the backoff completes even after the external signal fires. Locate
the sleep or delay call used between retry attempts and modify it to accept the
externalSignal parameter so that it can be interrupted when the signal aborts.
This ensures that if the caller's signal (e.g., recall's latency budget) fires
while waiting between retries, the delay is immediately cancelled rather than
blocking the entire operation.
In `@tests/shared/deeplake-api.test.ts`:
- Around line 38-47: The abort assertions in the test are too generic and don't
sufficiently verify the abort behavior. For the first test checking the rejected
query call with AbortSignal.abort(), replace the regex pattern matching in the
toThrow assertion with an exact error message check to verify the specific abort
error. For the second test checking that a live caller signal is forwarded into
the fetch call, after calling api.query with ctrl.signal, abort the controller
and then assert that the signal captured in mockFetch.mock.calls[0][1].signal is
actually aborted (check its aborted property) rather than just verifying it is
an instance of AbortSignal. This ensures the implementation properly forwards
the caller's signal and doesn't drop it.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: a0025753-d000-4720-9537-1653aa22b473
📒 Files selected for processing (7)
src/deeplake-api.tssrc/hooks/recall.tssrc/hooks/shared/recall-events.tssrc/hooks/shared/with-deadline.tstests/claude-code/recall-hook.test.tstests/shared/deeplake-api.test.tstests/shared/recall.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- src/hooks/shared/recall-events.ts
- tests/claude-code/recall-hook.test.ts
- tests/shared/recall.test.ts
| await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i); | ||
| expect(mockFetch).not.toHaveBeenCalled(); | ||
| }); | ||
|
|
||
| it("forwards a live caller signal into the fetch call", async () => { | ||
| mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] })); | ||
| const api = makeApi(); | ||
| const ctrl = new AbortController(); | ||
| await api.query("SELECT id FROM t", ctrl.signal); | ||
| expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal); |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Assert exact abort behavior, not just a generic signal.
Line 38 can assert the exact message, and Line 47 should prove the captured fetch signal aborts when ctrl aborts; otherwise an implementation that drops the caller signal still passes.
Tighten the abort assertions
- await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i);
+ await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow("Query aborted");
expect(mockFetch).not.toHaveBeenCalled();
});
@@
const ctrl = new AbortController();
await api.query("SELECT id FROM t", ctrl.signal);
- expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal);
+ const fetchSignal = mockFetch.mock.calls[0][1].signal as AbortSignal;
+ expect(fetchSignal).toBeInstanceOf(AbortSignal);
+ expect(fetchSignal.aborted).toBe(false);
+ ctrl.abort();
+ expect(fetchSignal.aborted).toBe(true);
});As per path instructions, tests/**: “Prefer asserting on specific values (paths, messages) over generic substrings.”
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow(/abort/i); | |
| expect(mockFetch).not.toHaveBeenCalled(); | |
| }); | |
| it("forwards a live caller signal into the fetch call", async () => { | |
| mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] })); | |
| const api = makeApi(); | |
| const ctrl = new AbortController(); | |
| await api.query("SELECT id FROM t", ctrl.signal); | |
| expect(mockFetch.mock.calls[0][1].signal).toBeInstanceOf(AbortSignal); | |
| await expect(api.query("SELECT 1", AbortSignal.abort())).rejects.toThrow("Query aborted"); | |
| expect(mockFetch).not.toHaveBeenCalled(); | |
| }); | |
| it("forwards a live caller signal into the fetch call", async () => { | |
| mockFetch.mockResolvedValueOnce(jsonResponse({ columns: ["id"], rows: [["1"]] })); | |
| const api = makeApi(); | |
| const ctrl = new AbortController(); | |
| await api.query("SELECT id FROM t", ctrl.signal); | |
| const fetchSignal = mockFetch.mock.calls[0][1].signal as AbortSignal; | |
| expect(fetchSignal).toBeInstanceOf(AbortSignal); | |
| expect(fetchSignal.aborted).toBe(false); | |
| ctrl.abort(); | |
| expect(fetchSignal.aborted).toBe(true); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/shared/deeplake-api.test.ts` around lines 38 - 47, The abort assertions
in the test are too generic and don't sufficiently verify the abort behavior.
For the first test checking the rejected query call with AbortSignal.abort(),
replace the regex pattern matching in the toThrow assertion with an exact error
message check to verify the specific abort error. For the second test checking
that a live caller signal is forwarded into the fetch call, after calling
api.query with ctrl.signal, abort the controller and then assert that the signal
captured in mockFetch.mock.calls[0][1].signal is actually aborted (check its
aborted property) rather than just verifying it is an instance of AbortSignal.
This ensures the implementation properly forwards the caller's signal and
doesn't drop it.
Source: Path instructions
…ex cycle 4) codex [P1] — the budget now also short-circuits the retry BACKOFF, not just the in-flight fetch: sleep(ms, signal) clears its timer and rejects on abort, and DeeplakeApi.query's two retry sleeps pass the caller signal. So a transient retryable failure can't keep the hook process alive through the exponential backoff past the budget. codex [P2] — recall no longer drops LEGACY summary rows. formatRecallContext attributes from the row's own 'author' column (the query already selects it) instead of requiring the path to parse as /summaries/<author>/<session>; non-canonical paths still inject, just without the path pointer. Only a row with no author at all is 'unattributable'. Tests: legacy-path injection, no-author → unattributable, format teammate/you paths unchanged. tsc clean.
Codex cycle-4 responses (5a49fa8)[P1] Abortable retry backoff → fixed. [P2] Legacy summary paths → fixed. |
|
@coderabbitai review |
✅ Action performedReview finished.
|
…ex cycle 5) codex [P2]: `project` scoping uses the cwd basename, so repos sharing a leaf dir name (…/foo/api vs …/bar/api) collide. This is inherent to the stored data — capture tags summary rows with the same basename `project` (projectNameFromCwd), and the memory table has no more-unique key (project_key exists only on the codebase table). Recall filters as precisely as the data allows, and basename scoping is strictly better than the org-wide search it replaced. Full disambiguation needs a unique project key on summary rows (capture/schema change + migration), tracked as a follow-up. Documented inline.
Codex cycle-5 response (c5eb009)[P2] basename project-scoping collision — documented as a known, bounded limitation rather than 'fixed', because the proper fix isn't in recall:
|
|
@coderabbitai review |
✅ Action performedReview finished.
|
…ex cycle 6) codex [P2] — the injected 'Full summary:' pointer now uses the configured memory root (config.memoryPath / HIVEMIND_MEMORY_PATH) instead of a literal ~/.deeplake/memory, so it points at the real file in non-default installs. formatRecallContext takes memoryRoot; recall passes config.memoryPath. codex [P2] — the embed step is now budget-bound: EMBED_TIMEOUT_MS is clamped to <= RECALL_BUDGET_MS, so EmbedClient.embed() (which has no abort hook) can't outlast the recall budget even if HIVEMIND_SEMANTIC_EMBED_TIMEOUT_MS is set higher. Tests: custom-memory-root pointer; CONFIG mock carries memoryPath. tsc clean.
Codex cycle-6 responses (d9e84bf)[P2] Custom memory root → fixed. [P2] Embed not budget-bound → fixed. |
|
@coderabbitai review |
✅ Action performedReview finished.
|
…ng (codex cycle 7) codex [P1] — both recall queries now filter `path LIKE '/summaries/%'`, so notes/goals/user files in the memory table can't outrank and get injected as 'prior work' (the formatter and README both assume summary hits). codex [P2] — dropped cwd-basename project scoping. Summaries are tagged with the capture-time cwd basename, so basename scoping (a) collides across repos sharing a leaf name and (b) WORSE, silently drops valid same-repo history when prompting from a subdirectory (session tagged 'repo', prompt from 'repo/src' → 'src'). Precision now comes from the summaries-only filter + the relevance threshold; robust project-key scoping is a follow-up. Tests: summary-row filter asserted on both queries; orchestration asserts summaries-only and no basename project filter. tsc clean.
Codex cycle-7 responses (4f03e9d)[P1] Restrict to summary rows → fixed. Both the semantic and lexical queries now filter [P2] Basename scoping → fixed by removing it. You're right that it's worse than I framed: beyond the cross-repo collision, basename scoping silently drops valid same-repo history when prompting from a subdirectory (session tagged |
|
@coderabbitai review |
✅ Action performedReview finished.
|
…n (codex cycle 8) Codex P1: the recalled summary excerpt is AI-generated from prior sessions and may carry user-controlled / injected text; injecting it verbatim into model context let a single poisoned row smuggle live instructions into unrelated sessions. Render it inert before injection: sanitizeSnippet() neutralizes all line terminators (LINE_TERMINATOR_RE, incl. U+2028/U+2029), replaces backticks/double-quotes so it can't break the quoted frame or open a fake code fence, collapses whitespace, and hard-caps at 240 chars. The block header now explicitly frames the quoted text as untrusted DATA, never an instruction. Covered by a new security test that feeds an injection payload and asserts the excerpt is single-line, fence-free and capped.
|
@coderabbitai review |
✅ Action performedReview finished.
|
…e 9) Codex P2: the question/how-to phrasing in SIGNAL_RES fired BEFORE the length gate, so bare follow-ups like 'which folder' or "what's the cap?" matched a generic question word and triggered a synchronous memory search + snippet injection — exactly the conversational back-and-forth the gate is meant to skip, hurting precision/latency. Split signals by strength: STRONG markers (errors, file:line, ErrorName, recall/continuity intent) still win regardless of length, but a generic question word is now a WEAK signal that only upgrades a prompt which ALSO clears the substantive-length bar (>=24 chars, >=6 words) to a 'signal' recall. Short question follow-ups fall through to too-short/low-signal and are skipped. Covered by new gate tests.
|
Addressed latest codex review findings:
@coderabbitai review |
|
✅ Action performedReview finished.
|
Codex P2: on a Claude Code marketplace auto-upgrade, the new versioned cache dir lacks the node_modules symlink that 'hivemind embeddings install' created. capture.js repairs it via ensurePluginNodeModulesLink(), but recall and capture are independent async UserPromptSubmit hooks — recall can run first, so the first prompt after an upgrade silently lost SEMANTIC recall (fell back to lexical/null) even with embeddings installed. recall.ts now runs the same best-effort self-heal right before building the EmbedClient, inside the existing SEMANTIC_ENABLED branch (no new gate branch; lexical-only runs never touch it). Failure is non-fatal — we degrade to lexical. Tests mock self-heal, assert it runs before the embed call (invocationCallOrder) and that a throwing repair still recalls.
|
[P2] Repair plugin embed deps before semantic recall — fixed in @coderabbitai review |
|
✅ Action performedReview finished.
|
…cle 11) Codex P2: SessionStart/Capture/SessionEnd all skip non-interactive entrypoints (sdk-py/sdk-ts/sdk-cli) when HIVEMIND_CAPTURE_ONLY_CLI=true, but the new UserPromptSubmit recall hook did not. In CLI-only setups it would still query Deeplake and inject hidden context into Agent SDK / `claude -p` runs, perturbing scripted/headless output the user explicitly scoped to interactive CLI. main() now calls entrypointPassesOnlyCliGate() right after the plugin-enabled check (before any stdin/I/O) and returns when it fails. Tests: a sdk-cli session under the gate skips before any I/O; a bare 'cli' session still recalls.
…ycle 11) Codex P2: the lexical fallback ordered only by overlap count — a small integer that ties often when summaries share the same 2-3 keywords. Since recall injects only the top row, Postgres was free to return an arbitrary tied summary, making recall nondeterministic and able to surface a STALE fix instead of the newest relevant one for the same prompt. Add a stable secondary sort 'last_update_date DESC, path ASC' (the TIE_BREAK constant) to BOTH the lexical and semantic queries: on equal score, prefer the most recently updated summary, then path as a final total order so the same prompt always recalls the same row. Tests assert the full ORDER BY on both query builders.
|
Addressed codex cycle 11 findings:
@coderabbitai review |
|
Triggering the review now. ✅ Action performedReview finished.
|
…inter (coderabbit) CodeRabbit: safeSeg allowed '.' chars, so a bare '..' segment in a DB-derived summary path passed validation and the printed pointer could traverse out of summaries/ (e.g. /summaries/../config.md → ~/.deeplake/memory/config.md). safeSeg now also rejects all-dots segments (/^\.+$/), so '.'/'..' are refused while normal names and versioned segments (v1.2) still pass. Traversal-y rows still recall — just without the unsafe pointer. Covered by a new test.
The 'never throws when the path is unwritable' test pointed fake-home at
/proc/nonexistent/cannot-write, so recordRecallEvent ran
mkdirSync('/proc/nonexistent/cannot-write/.deeplake', {recursive:true}).
Recursive mkdir on a non-existent procfs path HANGS on Linux (the
syscall never returns) instead of erroring — wedging the entire vitest
run, which is why CI's 'Run tests with coverage' step never completed
(every run sat in_progress until the job timed out).
Use a deterministic unwritable path instead: point home at an existing
FILE so creating '<file>/.deeplake' fails fast with ENOTDIR. Same
assertion (recordRecallEvent must not throw, no file written), no hang.
Full suite now completes in ~30s.
With the hang fixed, the coverage step ran for the first time and flagged three per-file gaps (previously invisible because the step never completed): - with-deadline.ts (branches 75%): the `settled` flag's `if (settled) return` guard was unreachable (we clearTimeout on settle + Promises settle once), so that branch could never be covered. Dropped the redundant bookkeeping — equivalent behavior, branches now 100%. - recall-query.ts (branches 78%): added tests for the lexical `project` filter + explicit limit, and for mapTopRow defaulting every missing row field to '' (the ?? '' branches). 100%. - recall.ts (functions 87.5%): added a test that makes main() reject (readStdin throws) to exercise the top-level fatal `.catch` arrow — logs 'fatal' and exits 0. Functions 100%. All recall files now clear the 90% per-file thresholds; coverage runs cleanly.
What
Proactive Recall — surfaces relevant prior team work unprompted on Claude Code. The missing half of Hivemind's value loop: it captures a lot but rarely shows up and saves you (the retention leak). On a recall-worthy prompt, it injects one attributed snippet from the team's memory into the model context.
Live example (validated against the real backend, no embeddings):
Design (the levers + the constraints we worked through)
shared/recall-gate.ts): a cheap pure pre-filter rejects acks/short/continuations and only searches on error/intent signals or substantive prose. Precision-biased — a missed recall is invisible; noise every turn trains tune-out.shared/recall-format.ts): teammate + relative date + project +catpath; never injects an unattributable hit. Cross-teammate recall is the moat solo memory tools (Cursor/Claude built-in) can't match.shared/recall-query.ts): SEMANTIC (cosinesummary_embedding <#> vec) when the embed model is available, else LEXICAL (ILIKE keyword-overlap on summary+description). Mirrors the codebase's ILIKE path — the BM25 index is disabled backend-side (oid bug,deeplake-api.ts). Each mode has its own precision gate (cosine ≥RECALL_THRESHOLD; overlap ≥MIN_LEXICAL_OVERLAP).shared/with-deadline.ts): recall runs synchronously onUserPromptSubmit(it must, to inject into the turn), so the whole embed+query path is capped atHIVEMIND_RECALL_TIMEOUT_MS(default 1000ms) and degrades to skip rather than stall. Non-recall prompts pay nothing (gate short-circuits before any I/O).capture.jspersists only the user's prompt; the wiki-worker summarizes captured rows, not the injected context — so recalled summaries are never re-ingested, re-embedded, or summarized-of-summaries.Validation
tscclean.RECALL_THRESHOLD.Scope / rollout
Claude Code only for now (its
additionalContextis model-only → low risk). Cursor next pending abeforeSubmitPromptprobe; Hermes/Pi feasible; Codex deliberately held out (its per-promptadditionalContextis user-visible — would clobber the TUI every prompt). Opt-out:HIVEMIND_RECALL=false.🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
~/.deeplake/recall-events.jsonlfor injections, misses, timeouts, errors, and unattributable cases.