ghosthunterbug/rag-cli


RAG CLI for Claude Code

Give Claude Code semantic memory of your entire codebase. Every prompt you send is automatically enriched with the most relevant chunks from your code, docs, and Claude memory files — no manual context-pasting required.

Powered by Google Gemini embeddings (free tier) + Claude Code hooks.


Why

Claude Code is great, but it doesn't know your codebase until you tell it. You end up:

  • Copy-pasting file content into every prompt
  • Re-explaining the same architecture decisions
  • Watching Claude search the same files over and over

This CLI fixes that. It indexes your project once, then automatically injects up to 3 of the most relevant chunks (those at or above the similarity threshold) into every Claude Code prompt via the UserPromptSubmit hook.

You ask: "how does auth work in this project?" Claude receives: your question + the actual auth-related code chunks, before generating a response.
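As a rough illustration of the injection step, here is a minimal sketch of how retrieved chunks could be turned into the text the hook hands to Claude Code. The `RetrievedChunk` shape, the `buildInjectedContext` name, and the output layout are all assumptions made for this example, not the actual contents of rag/hook.ts.

```typescript
// Hypothetical shape of a retrieved chunk; the real index may store more fields.
interface RetrievedChunk {
  file: string;
  text: string;
  score: number; // cosine similarity in the range 0..1
}

// Build the text block that would accompany the user's prompt.
// Returns an empty string when nothing was retrieved, so the prompt passes through unchanged.
function buildInjectedContext(chunks: RetrievedChunk[]): string {
  if (chunks.length === 0) return "";
  const sections = chunks.map((c) => {
    const pct = Math.round(c.score * 100);
    return `--- ${c.file} (${pct}% match) ---\n${c.text}`;
  });
  return ["Relevant context from the project index:", ...sections].join("\n\n");
}
```

Labelling each chunk with its file path and score lets Claude cite where an answer came from and weigh weaker matches accordingly.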


Features

  • Semantic search over your code, docs, and Claude memory files
  • Auto-injection into every Claude Code prompt (via hook)
  • Auto-reindexing when you edit files (debounced, runs in background)
  • Multi-project safe — each project has its own hook, no conflicts
  • Free — Google Gemini embeddings free tier (1500 req/min)
  • Local-only — all embeddings stored in chunks.json, nothing leaves your machine except the embedding API call
  • Dashboard — browse your indexed chunks in a local web UI

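Requirements

  • Node.js with npm (used by the install and build steps below)
  • Claude Code
  • A Google AI API key (the Gemini embeddings free tier is enough)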

Installation

git clone https://github.com/<your-username>/rag-cli.git
cd rag-cli
npm install
npm run build
npm link

Verify the install:

rag --help

Quick Start

In any project you want Claude Code to "remember":

cd /path/to/your-project

# 1. Scaffold the rag/ folder (auto-detects src/, lib/, components/, etc.)
rag init

# 2. Paste your Google AI API key into rag/.env
echo "GOOGLE_AI_API_KEY=your_key_here" > rag/.env

# 3. Index your codebase + Claude memory
rag index

# 4. Wire the auto-inject + auto-reindex hooks into Claude Code
rag install-hooks

That's it. Open Claude Code in this project and every prompt now has semantic context.


Commands

| Command | Description |
| --- | --- |
| rag init | Create rag/ folder, auto-detect project structure |
| rag index | Index all matching files into embeddings |
| rag query "..." | Run a semantic search manually |
| rag install-hooks | Add RAG hooks to ~/.claude/settings.json |
| rag status | Show index stats (files, chunks, disk size) |
| rag reset | Delete index and rebuild from scratch |
| rag files | List every indexed file |
| rag config | Show current patterns and memory path |
| rag config add "pattern" | Add a glob pattern (e.g. "docs/**/*.md") |
| rag config rm "pattern" | Remove a glob pattern |
| rag dashboard | Open the chunk browser in your browser |

How It Works

You ask Claude: "what's the rate limit on the session endpoint?"
        │
        ▼
UserPromptSubmit hook fires
        │
        ▼
Your question → Gemini embedding → 768-dim vector
        │
        ▼
Cosine similarity against all chunks in rag/chunks.json
        │
        ▼
Top 3 chunks (>= 70% similarity) injected into Claude's context
        │
        ▼
Claude responds with the actual answer from your code
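
The retrieval step in the diagram above can be sketched as follows. This is a minimal illustration, assuming each entry in chunks.json carries its embedding as a plain number array. The `Chunk` shape and the `cosineSimilarity` and `topMatches` names are hypothetical, and the defaults (k = 3, threshold = 0.7) simply mirror the numbers quoted in this README.

```typescript
// Hypothetical chunk shape; the real chunks.json schema may differ.
interface Chunk {
  file: string;
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  chunk: Chunk;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop those under the threshold, and keep the best k.
function topMatches(
  query: number[],
  chunks: Chunk[],
  k: number = 3,
  threshold: number = 0.7
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .filter((m) => m.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is usually fine for an index of a few hundred or a few thousand chunks. A larger index might call for an approximate nearest-neighbour structure instead.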

When you edit a file, the PostToolUse hook reindexes the changed paths in the background (60s debounce, so rapid edits don't thrash the API).
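
One way to picture the debounce is a small queue that collects edited paths and releases them only after a quiet period. The sketch below is an assumption about the general technique, not the code in rag/reindex-hook.ts. The `ReindexDebouncer` class is hypothetical, it takes the current time as an argument so it stays easy to test, and it assumes a trailing debounce where each new edit restarts the 60s window.

```typescript
// Collects changed paths and releases them only after a quiet period.
class ReindexDebouncer {
  private pending = new Set<string>();
  private lastChangeMs = 0;
  private quietMs: number;

  constructor(quietMs: number = 60000) {
    this.quietMs = quietMs;
  }

  // Record an edited path; every new edit restarts the quiet period.
  touch(path: string, nowMs: number): void {
    this.pending.add(path);
    this.lastChangeMs = nowMs;
  }

  // Return the batch to reindex if the quiet period has elapsed, otherwise an empty list.
  flushIfQuiet(nowMs: number): string[] {
    if (this.pending.size === 0 || nowMs - this.lastChangeMs < this.quietMs) {
      return [];
    }
    const batch = Array.from(this.pending);
    this.pending.clear();
    return batch;
  }
}
```

Because paths are held in a set, editing the same file ten times in a minute still results in a single re-embedding of that file.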


File Layout

After rag init + rag install-hooks, each project gets:

your-project/
├── rag/
│   ├── .env              # Google AI key (auto-gitignored)
│   ├── .gitignore        # ignores .env + chunks.json
│   ├── config.json       # patterns + memory directory
│   ├── chunks.json       # embedding index (auto-gitignored)
│   ├── hook.ts           # UserPromptSubmit handler
│   └── reindex-hook.ts   # PostToolUse handler

Nothing sensitive is committed — .env and chunks.json are auto-gitignored.


Multi-Project Safety

rag install-hooks is idempotent and project-scoped. Each project's hooks are identified by absolute path in ~/.claude/settings.json, so:

  • Installing in project A doesn't overwrite project B's hooks
  • Each hook detects the current cwd and skips if it's not its own project
  • You can have 10 projects with RAG enabled simultaneously — they don't interfere
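
The cwd check described above could look roughly like this. It is a sketch under stated assumptions: `normalizePath` and `isInsideProject` are hypothetical helpers, the comparison is a plain string prefix test, and it does not handle symlinks or case-insensitive filesystems.

```typescript
// Normalize separators and strip trailing slashes so paths compare reliably.
function normalizePath(p: string): string {
  const unified = p.replace(/\\/g, "/").replace(/\/+$/, "");
  return unified === "" ? "/" : unified;
}

// True when cwd is the project root itself or any directory beneath it.
function isInsideProject(cwd: string, projectRoot: string): boolean {
  const dir = normalizePath(cwd);
  const root = normalizePath(projectRoot);
  if (root === "/") return true;
  // Appending "/" prevents /home/u/app from matching /home/u/app2.
  return dir === root || dir.startsWith(root + "/");
}
```

A hook for project A would call something like `isInsideProject(cwd, "/abs/path/to/A")` and exit without output when it returns false, leaving the prompt untouched for other projects.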

Similarity Scores

| Score | Meaning |
| --- | --- |
| >= 80% | Strong match: the answer is in this chunk |
| 70–80% | Good match: relevant context |
| 55–70% | Weak match: may or may not help |
| < 55% | Discarded, not injected |

The injection threshold defaults to 70%, so with the default settings only chunks in the top two bands are injected. Tune it in rag/hook.ts if needed.
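
To make the bands concrete, here is a small sketch that maps a similarity score to the labels used in this section and applies a configurable injection threshold. The `Band` type and both function names are hypothetical, and the real hook may structure this differently.

```typescript
type Band = "strong" | "good" | "weak" | "discarded";

// Map a cosine similarity (0..1) to the bands listed in this section.
function classifyScore(score: number): Band {
  if (score >= 0.8) return "strong";
  if (score >= 0.7) return "good";
  if (score >= 0.55) return "weak";
  return "discarded";
}

// A chunk is injected only when it meets the threshold (0.7 by default here).
function shouldInject(score: number, threshold: number = 0.7): boolean {
  return score >= threshold;
}
```

Lowering the threshold to 0.55 in this sketch would let "weak" chunks through, trading precision for recall.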


Cost

| Item | Cost |
| --- | --- |
| Gemini embedding API | Free (1500 req/min quota) |
| Indexing 300 files | ~$0 (within free tier) |
| Per-prompt query | ~$0 (one embedding call) |
| Storage | Local JSON, ~25 MB per 600 chunks |

Troubleshooting

| Issue | Fix |
| --- | --- |
| API key not valid | Recheck rag/.env and make sure there are no trailing spaces |
| models/... not found | The model is gemini-embedding-001; update it if you forked |
| Hook not firing | Verify that the absolute path in ~/.claude/settings.json matches your project |
| chunks.json empty | Check that the patterns in rag/config.json match real files |
| Windows notification missing | Ensure PowerShell is in your PATH |
| Not initialized | Run rag init first |

Contributing

PRs welcome. Some ideas:

  • Support for additional embedding providers (OpenAI, Voyage, local models)
  • Cross-platform notifications (macOS, Linux)
  • Smarter chunking for specific languages (AST-based instead of line-based)
  • Incremental reindexing (only re-embed changed chunks, not entire files)

Open an issue first if you want to discuss a larger change.


License

MIT
