Give Claude Code semantic memory of your entire codebase. Every prompt you send is automatically enriched with the most relevant chunks from your code, docs, and Claude memory files — no manual context-pasting required.
Powered by Google Gemini embeddings (free tier) + Claude Code hooks.
Claude Code is great, but it doesn't know your codebase until you tell it. You end up:
- Copy-pasting file content into every prompt
- Re-explaining the same architecture decisions
- Watching Claude search the same files over and over
This CLI fixes that. It indexes your project once, then injects the top 3 most relevant chunks into every Claude Code prompt automatically via the UserPromptSubmit hook.
You ask: "how does auth work in this project?" Claude receives: your question + the actual auth-related code chunks, before generating a response.
- Semantic search over your code, docs, and Claude memory files
- Auto-injection into every Claude Code prompt (via hook)
- Auto-reindexing when you edit files (debounced, runs in background)
- Multi-project safe: each project has its own hook, no conflicts
- Free: Google Gemini embeddings free tier (1500 req/min)
- Local-only: all embeddings are stored in `chunks.json`; nothing leaves your machine except the embedding API call
- Dashboard: browse your indexed chunks in a local web UI
- Node.js 22+
- A Google AI Studio API key (free) — get one at https://aistudio.google.com/apikey
- Claude Code installed
git clone https://github.com/<your-username>/rag-cli.git
cd rag-cli
npm install
npm run build
npm link

Verify the install:

rag --help

In any project you want Claude Code to "remember":
cd /path/to/your-project
# 1. Scaffold the rag/ folder (auto-detects src/, lib/, components/, etc.)
rag init
# 2. Paste your Google AI API key into rag/.env
echo "GOOGLE_AI_API_KEY=your_key_here" > rag/.env
# 3. Index your codebase + Claude memory
rag index
# 4. Wire the auto-inject + auto-reindex hooks into Claude Code
rag install-hooks

That's it. Open Claude Code in this project and every prompt now has semantic context.
| Command | Description |
|---|---|
| `rag init` | Create rag/ folder, auto-detect project structure |
| `rag index` | Index all matching files into embeddings |
| `rag query "..."` | Run a semantic search manually |
| `rag install-hooks` | Add RAG hooks to ~/.claude/settings.json |
| `rag status` | Show index stats (files, chunks, disk size) |
| `rag reset` | Delete index and rebuild from scratch |
| `rag files` | List every indexed file |
| `rag config` | Show current patterns and memory path |
| `rag config add "pattern"` | Add a glob pattern (e.g. `"docs/**/*.md"`) |
| `rag config rm "pattern"` | Remove a glob pattern |
| `rag dashboard` | Open the chunk browser in your browser |
You ask Claude: "what's the rate limit on the session endpoint?"
│
▼
UserPromptSubmit hook fires
│
▼
Your question → Gemini embedding → 768-dim vector
│
▼
Cosine similarity against all chunks in rag/chunks.json
│
▼
Top 3 chunks (>= 70% similarity) injected into Claude's context
│
▼
Claude responds with the actual answer from your code
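
The ranking step in the flow above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `Chunk` shape, the function names, and the defaults (`k = 3`, `threshold = 0.7`) are assumptions chosen to mirror the numbers quoted in this README.

```typescript
// Hypothetical chunk shape; the real layout of rag/chunks.json may differ.
interface Chunk {
  file: string;
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  chunk: Chunk;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query vector, drop those below the
// threshold, and keep the k best in descending score order.
function topChunks(
  query: number[],
  chunks: Chunk[],
  k: number = 3,
  threshold: number = 0.7
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .filter((scored) => scored.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A brute-force scan like this is linear in the number of chunks, which is usually fine for an index of a few hundred to a few thousand chunks held in a single JSON file.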
When you edit a file, the PostToolUse hook reindexes the changed paths in the background (60s debounce, so rapid edits don't thrash the API).
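
One way to picture the debounce is a queue that collects changed paths and releases them only after a quiet period. The sketch below is an assumption about how such a queue could look, not the contents of `reindex-hook.ts`; the class name `ReindexQueue` and its methods are hypothetical. Timestamps are passed in explicitly so the logic can be exercised without real timers.

```typescript
// Collects changed paths and releases them once no new edit has arrived
// for `quietMs` milliseconds.
class ReindexQueue {
  private pending = new Set<string>();
  private lastEditAt = 0;
  private quietMs: number;

  constructor(quietMs: number = 60 * 1000) {
    this.quietMs = quietMs;
  }

  // Record an edit; every new edit restarts the quiet period.
  add(path: string, now: number): void {
    this.pending.add(path);
    this.lastEditAt = now;
  }

  // Returns the sorted, de-duplicated paths to reindex if the quiet period
  // has elapsed; otherwise returns an empty list and keeps waiting.
  flushIfDue(now: number): string[] {
    if (this.pending.size === 0 || now - this.lastEditAt < this.quietMs) {
      return [];
    }
    const paths = Array.from(this.pending).sort();
    this.pending.clear();
    return paths;
  }
}
```

Because repeated edits to the same file collapse into one entry, a burst of saves costs a single reindex of each touched path rather than one API round per save.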
After rag init + rag install-hooks, each project gets:
your-project/
├── rag/
│ ├── .env # Google AI key (auto-gitignored)
│ ├── .gitignore # ignores .env + chunks.json
│ ├── config.json # patterns + memory directory
│ ├── chunks.json # embedding index (auto-gitignored)
│ ├── hook.ts # UserPromptSubmit handler
│ └── reindex-hook.ts # PostToolUse handler
Nothing sensitive is committed — .env and chunks.json are auto-gitignored.
rag install-hooks is idempotent and project-scoped. Each project's hooks are identified by absolute path in ~/.claude/settings.json, so:
- Installing in project A doesn't overwrite project B's hooks
- Each hook detects the current `cwd` and skips if it's not its own project
- You can have 10 projects with RAG enabled simultaneously; they don't interfere
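
The per-project guard described above could look roughly like this. It is a sketch under assumptions: it presumes the hook receives a JSON payload on stdin that carries a `cwd` field, and the helper names (`stripTrailingSlash`, `isInsideProject`, `shouldHandle`) are hypothetical. It handles POSIX-style paths only.

```typescript
// Fields this sketch assumes are present in the JSON payload a hook receives.
interface HookPayload {
  cwd: string;
  prompt?: string;
}

// Strip trailing slashes so "/work/app/" and "/work/app" compare equal.
function stripTrailingSlash(p: string): string {
  const trimmed = p.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// True when cwd is the project root or a directory inside it.
// Appending "/" before the prefix check keeps "/work/app-two" from
// matching the root "/work/app".
function isInsideProject(cwd: string, projectRoot: string): boolean {
  const dir = stripTrailingSlash(cwd);
  const root = stripTrailingSlash(projectRoot);
  if (root === "/") return true;
  return dir === root || dir.startsWith(root + "/");
}

// Parse the raw payload and decide whether this project's hook should run.
function shouldHandle(rawPayload: string, projectRoot: string): boolean {
  const payload = JSON.parse(rawPayload) as HookPayload;
  return typeof payload.cwd === "string" && isInsideProject(payload.cwd, projectRoot);
}
```

A hook that gets `false` here would simply exit without printing anything, leaving the prompt untouched for projects it does not own.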
| Score | Meaning |
|---|---|
| >= 80% | Strong match: the answer is in this chunk |
| 70–80% | Good match: relevant context |
| 55–70% | Weak match: may or may not help |
| < 55% | Discarded: not injected |
The injection threshold is 70% by default. Adjust it in `rag/hook.ts` if needed.
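
For reference, the score bands can be expressed as a small helper. This is an illustrative sketch; `classifyScore` and `isInjected` are hypothetical names, and scores are assumed to be cosine similarities in the range 0 to 1. Note that with the default 70% threshold only the "strong" and "good" bands would reach the prompt.

```typescript
type MatchBand = "strong" | "good" | "weak" | "discarded";

// Map a similarity score (0..1) to the bands in the table above.
function classifyScore(score: number): MatchBand {
  if (score >= 0.8) return "strong";
  if (score >= 0.7) return "good";
  if (score >= 0.55) return "weak";
  return "discarded";
}

// Whether a chunk with this score would pass the injection threshold.
function isInjected(score: number, threshold: number = 0.7): boolean {
  return score >= threshold;
}
```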
| Item | Cost |
|---|---|
| Gemini embedding API | Free (1500 req/min quota) |
| Indexing 300 files | ~$0 (within free tier) |
| Per-prompt query | ~$0 (one embedding call) |
| Storage | Local JSON, ~25 MB per 600 chunks |
| Issue | Fix |
|---|---|
| `API key not valid` | Recheck rag/.env, ensure no trailing spaces |
| `models/... not found` | The model is `gemini-embedding-001`; update if you forked |
| Hook not firing | Verify absolute path in ~/.claude/settings.json matches your project |
| `chunks.json` empty | Check rag/config.json patterns match real files |
| Windows notification missing | Ensure PowerShell is in your PATH |
| `Not initialized` | Run `rag init` first |
PRs welcome. Some ideas:
- Support for additional embedding providers (OpenAI, Voyage, local models)
- Cross-platform notifications (macOS, Linux)
- Smarter chunking for specific languages (AST-based instead of line-based)
- Incremental reindexing (only re-embed changed chunks, not entire files)
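
As a starting point for the incremental reindexing idea, a contributor might detect changed chunks by hashing their content. The sketch below is only one possible approach and nothing in it exists in the project today: `fnv1a`, `chunkLines`, `changedChunks`, and the 40-line chunk size are all hypothetical. It uses a simple non-cryptographic hash, which is enough for change detection.

```typescript
// 32-bit FNV-1a hash computed over UTF-16 code units, returned as 8 hex digits.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Split a file into fixed-size, line-based chunks.
function chunkLines(source: string, linesPerChunk: number): string[] {
  const lines = source.split("\n");
  const chunks: string[] = [];
  for (let i = 0; i < lines.length; i += linesPerChunk) {
    chunks.push(lines.slice(i, i + linesPerChunk).join("\n"));
  }
  return chunks;
}

// Return only the chunks whose hash is not already in the index,
// i.e. the ones that would need a new embedding call.
function changedChunks(
  source: string,
  knownHashes: Set<string>,
  linesPerChunk: number = 40
): string[] {
  return chunkLines(source, linesPerChunk).filter(
    (chunk) => !knownHashes.has(fnv1a(chunk))
  );
}
```

One limitation of fixed-size line chunks is that inserting a line near the top of a file shifts every later chunk boundary, so all following chunks hash differently. Content-defined or AST-based boundaries would avoid that.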
Open an issue first if you want to discuss a larger change.
MIT