A paper-aware chat client where every cited sentence traces back to its source.
Multi-agent tool routing · in-repo RAG knowledge base · agentic per-paper retrieval · a Citation Canvas that links every [chunk] back to the exact passage in the paper.
PaperHub is built UX-first. Every retrieved chunk has a clickable provenance trail, every generation step writes an audit row, and every chat turn is reconstructible from SQLite alone. A single chat interface routes each turn to the right specialist agent — paper search, paper Q&A, NL→SQL library stats, memory curation, or slide generation.
- 🔎 Agentic paper retrieval. Ask a question across your enabled papers and a per-paper subagent navigates each paper by its section table-of-contents (list_sections → read_section) rather than blind cosine-similarity top-k, then a flagship model synthesises across papers over the raw cited chunks.
- 🧷 Citation Canvas. Inline [chunk:N] markers in every answer link back to the exact passage: click one to open a side-by-side reading panel that scrolls to and highlights the cited chunk, in both the LaTeX-rendered HTML and the source PDF. Multi-chunk markers ([chunk:a, b]) are each clickable. No ungrounded claims.
- 🌍 Answers in your language. The router detects the language of your question, so a question asked in Chinese is answered in Chinese, with citation markers and paper titles preserved. A remembered "always reply in X" preference overrides per-turn detection across every agent.
- 📊 Ask stats about your library. "How many papers do I have?", "list my sessions" → a library_stats agent translates the question to read-only SQL over a deterministic table allowlist (a separate in-process sqlite MCP server), self-repairs a failed query, and answers with the numbers + the exact SQL it ran.
- 🧠 Session + global memory. Tell it to remember a fact or preference and it persists, either session-scoped (this chat) or global (everywhere). A rule-based safety gate refuses secrets, an LLM detects conflicts and supersedes the stale note (active/superseded history kept), and only active memories are recalled into answers. A Memory Manager panel lets you view, edit, (de)activate, and delete entries, even in an empty chat (global only).
- 🧭 Visible routing + tracing. A routing badge shows which agent + model handled each turn; an expandable trace panel lists every model/MCP/pipeline step with latency and status. The full DAG replays from SQLite.
- 🌐 Discovery via web + Semantic Scholar. paper_search decomposes into Parser → Discover (no-key multi-engine web search) → Resolve (Semantic Scholar) → Synthesize, so even vague references ("that diffusion paper everyone cites") resolve to a citable hit.
- 📎 Bring your own papers. Attach by arXiv ID, paste a URL, or upload a PDF. Content is deduplicated and cached, so re-importing the same paper into another session is instant.
- ➗ Math renders. LaTeX in answers ($…$, $$…$$) renders as real equations via KaTeX.
- 💾 Pick up on any device. Sessions and their full chat record live in the backend, not the browser: open the app anywhere and your conversations, paper-search cards, and references are all there. Deleting a chat removes it everywhere (with Undo); empty scratch chats are cleaned up automatically.
- 🔌 MCP-native. The agent's own tools are served over MCP (in-process FastMCP at /mcp); external clients (Claude Desktop, Cursor) can reach the same surface.
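The Citation Canvas item above describes inline [chunk:N] markers, including multi-chunk forms such as [chunk:a, b]. As a rough illustration only, the sketch below shows one way such markers could be pulled out of an answer in Python. The regex and the name extract_chunk_ids are assumptions made for this example, not PaperHub's actual parser, and it assumes chunk ids are plain integers.

```python
import re

# Assumed marker grammar: "[chunk:" followed by one or more
# comma-separated integers, then "]". This is a guess at the format.
CHUNK_MARKER = re.compile(r"\[chunk:\s*(\d+(?:\s*,\s*\d+)*)\s*\]")


def extract_chunk_ids(answer: str) -> list[list[int]]:
    """Return one list of chunk ids per marker, in order of appearance."""
    markers = []
    for match in CHUNK_MARKER.finditer(answer):
        # int() tolerates surrounding whitespace, so "3, 7" splits cleanly.
        ids = [int(part) for part in match.group(1).split(",")]
        markers.append(ids)
    return markers


if __name__ == "__main__":
    text = "Attention scales quadratically [chunk:12]. Two works agree [chunk:3, 7]."
    print(extract_chunk_ids(text))
```

Keeping the ids grouped per marker (rather than flattening them) preserves enough structure for a UI to make each id in a multi-chunk marker separately clickable.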
| Area | Choice |
|---|---|
| Backend | Python 3.11 · FastAPI · LangGraph · LiteLLM · SQLite (aiosqlite) · Pydantic v2 |
| Frontend | TypeScript · React 19 · Vite · Tailwind · Zustand · react-markdown + KaTeX |
| RAG | Chroma · BAAI/bge-small-en-v1.5 embedder · ms-marco-MiniLM cross-encoder (hosted in a sibling model-server process) |
| LLM | Gemini by default (any LiteLLM provider — small-tier subagents, flagship finalizer) |
| Tooling | uv · pytest · ruff · mypy --strict · Vitest · ESLint · Conventional Commits |
Local-only, single-user. No auth surface — point it at your own LLM key and run it on your machine.
Prerequisites: Python 3.11 + uv, Node 18+, and an LLM API key (Gemini by default).
git clone https://github.com/whats2000/PaperHub.git
cd PaperHub
# Install both halves
cd backend && uv sync # Python deps from uv.lock
cd ../frontend && npm install # JS deps from package-lock.json
Configure your LLM key:
cd backend
cp .env.example .env # then fill in GEMINI_API_KEY (or your provider's key)
Recommended (Windows, one command): scripts/start.ps1 orchestrates all the sibling processes. It brings up the external MCP daemons (open-websearch) via paperhub-mcp-up, then the model-server, then the backend with hot-reload:
# Terminal 1: backend stack (MCP daemons + model-server + FastAPI on :8000)
cd backend
.\scripts\start.ps1
# Terminal 2: frontend (Vite + React, hot-reload, :5173)
cd frontend
npm run dev
Open http://localhost:5173 and start chatting.
Lower-level: run uvicorn directly
The model-server auto-spawns on first backend boot, so the minimum is:
cd backend
uv run uvicorn paperhub.app:app --reload --reload-dir src --port 8000
Note: this path does not start the web-search daemon for you. On Windows, uvicorn --reload runs on a SelectorEventLoop, so the in-worker autostart falls back gracefully (papers-only). Bring web search up yourself with uv run paperhub-mcp-up (or use scripts/start.ps1, which does it). See the web-search note under Configuration.
No API key handy? Exercise the chat plumbing with mocked LLMs (PowerShell):
$env:PAPERHUB_ROUTER_MOCK = '{"intent":"chitchat","model_tier":"small","confidence":0.9,"reasoning":"dev"}'
$env:PAPERHUB_CHITCHAT_MOCK = "Hello from PaperHub!"
uv run uvicorn paperhub.app:app --reload --reload-dir src --port 8000
All settings live in backend/.env (grouped by function in .env.example). The ones you'll likely touch:
| Variable | Purpose | Default |
|---|---|---|
| GEMINI_API_KEY | LLM provider credential (or OPENAI_API_KEY / ANTHROPIC_API_KEY) | — |
| PAPERHUB_PAPER_QA_MODEL | Flagship finalizer (cross-paper synthesis) | gemini/gemini-2.5-pro |
| PAPERHUB_PAPER_QA_SUBAGENT_MODEL | Per-paper section navigator (lightweight) | gemini/gemini-3.1-flash-lite |
| PAPERHUB_DEVICE | Embedder/reranker device (auto/cpu/cuda/mps) | auto |
| PAPERHUB_SEMANTIC_SCHOLAR_API_KEY | Higher Semantic Scholar rate limit (optional) | — |
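To illustrate how the defaults in the table combine with overrides from backend/.env, here is a minimal sketch. The load_settings helper and the DEFAULTS mapping are hypothetical and exist only for this example; PaperHub's real settings loader may work differently (the stack lists Pydantic v2, which this sketch does not use).

```python
import os
from collections.abc import Mapping

# Defaults copied from the table above; variables without a default are omitted.
DEFAULTS = {
    "PAPERHUB_PAPER_QA_MODEL": "gemini/gemini-2.5-pro",
    "PAPERHUB_PAPER_QA_SUBAGENT_MODEL": "gemini/gemini-3.1-flash-lite",
    "PAPERHUB_DEVICE": "auto",
}
ALLOWED_DEVICES = {"auto", "cpu", "cuda", "mps"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Resolve each setting from env, falling back to the documented default."""
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    device = settings["PAPERHUB_DEVICE"]
    if device not in ALLOWED_DEVICES:
        raise ValueError(f"PAPERHUB_DEVICE must be one of {sorted(ALLOWED_DEVICES)}, got {device!r}")
    return settings


if __name__ == "__main__":
    print(load_settings({"PAPERHUB_DEVICE": "cpu"}))
```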
GPU (optional). torch installs CPU-only by default. For CUDA boxes: uv sync --extra cu124 / --extra cu126 / --extra cu130.
Web-search discovery (optional). paper_search / paper_suggest gain a no-key multi-engine discovery step when an open-websearch daemon is reachable on :3000. You don't install it by hand — scripts/start.ps1 (or uv run paperhub-mcp-up) reads mcp_servers.toml and launches every launch-declaring MCP server for you via npx -y, which fetches the package on first run (~25s, one-time):
cd backend
uv run paperhub-mcp-up # launches open-websearch on :3000 (skips if already up)
When it's up, the backend's MCP registry auto-exposes web.search / web.fetch. When it's down, the agent falls back to a papers-only flow, with no config needed. Spawned daemons are detached so they survive backend --reload; explicit teardown is start.ps1's job (otherwise they clear at reboot). Requires Node 18+ on PATH. (The paperhub-papers MCP surface ships in-process at /mcp; no install required.)
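Whether web search is available comes down to whether something is listening on :3000. A plain TCP probe is one simple way to check that from Python. The helper name is_port_open is hypothetical, and how PaperHub's MCP registry actually detects the daemon is not documented here, so treat this as a debugging aid rather than a description of the backend.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 3000, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable (papers-only fallback)"
    print(f"open-websearch on :3000 is {state}")
```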
┌─────────────────┐ SSE ┌───────────────────────────────────────────┐
│ React shell │ ◄───────────── │ FastAPI · POST /chat │
│ - Composer │ │ ┌─────────────────────────────────────┐ │
│ - Routing badge│ │ │ LangGraph turn │ │
│ - Trace panel │ │ │ Router ─► chitchat | paper_qa | │ │
│ - Citation │ │ │ paper_search | slides | │ │
│ Canvas │ │ │ library_stats │ │
└─────────────────┘ │ └─────────────────────────────────────┘ │
│ │ │
│ ▼ paper_qa: fan out one subagent │
│ per paper → section nav → │
│ flagship finalizer over raw chunks │
│ ┌─────────┐ ┌──────────┐ ┌────────────┐ │
│ │ LiteLLM │ │ Chroma │ │ SQLite │ │
│ │ adapter │ │ (RAG) │ │ (audit + │ │
│ │ │ │ │ │ schema) │ │
│ └─────────┘ └──────────┘ └────────────┘ │
│ ▲ embedder + reranker in a sibling │
│ model-server process (:8001) │
└───────────────────────────────────────────┘
Every model call, MCP call, and pipeline step writes a tool_calls row before returning — enough state to reconstruct the full agent context from SELECT * FROM tool_calls WHERE run_id = ? alone. Paper content is deduplicated: one paper_content row + one cache dir + one set of chunks/vectors per unique paper, regardless of how many sessions reference it.
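To make the replay claim concrete, the sketch below rebuilds a run's step list from an in-memory SQLite table. Only the table name tool_calls and the run_id column come from the text above; every other column (step, kind, name, latency_ms, status) is an assumption made for this example, and the real schema is defined in the SRS.

```python
import sqlite3


def replay_run(conn: sqlite3.Connection, run_id: int) -> list[str]:
    """Return one human-readable line per recorded step of a run, in step order."""
    rows = conn.execute(
        "SELECT step, kind, name, latency_ms, status "
        "FROM tool_calls WHERE run_id = ? ORDER BY step",
        (run_id,),
    ).fetchall()
    return [
        f"{step}. [{kind}] {name} {latency_ms}ms {status}"
        for step, kind, name, latency_ms, status in rows
    ]


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway database with a hypothetical tool_calls schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tool_calls (run_id INTEGER, step INTEGER, kind TEXT, "
        "name TEXT, latency_ms INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO tool_calls VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 2, "mcp", "read_section", 40, "ok"),
            (1, 1, "model", "router", 120, "ok"),
            (2, 1, "model", "router", 95, "ok"),
        ],
    )
    return conn


if __name__ == "__main__":
    for line in replay_run(demo_connection(), run_id=1):
        print(line)
```

Because the query filters on run_id and orders by a step column, rows inserted out of order (as in the demo data) still replay in sequence, and rows from other runs are ignored.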
Full architecture lives in the SRS.
| Plan | Scope | State |
|---|---|---|
| A | Backend foundation + Router-only chat | ✅ complete |
| B | Frontend foundation (React shell, SSE, routing badge, trace panel) | ✅ complete |
| C | Paper Pipeline + Research Agent (ingest, RAG, paper_search, agentic paper_qa, MCP layer, model-server, PDF upload) | ✅ complete — merged (SRS v2.10) |
| D | Search results + Reference Sources + Citation Canvas (HTML + PDF passage highlighting) | ✅ complete — merged (SRS v2.13) |
| E | SQL Agent + library_stats (sqlite MCP) + session/global memory governance (gate, conflict-supersede, Memory Manager UI) | ✅ complete — merged (SRS v2.17) |
| F | Slide Pipeline + Report Agent | 🔜 planned |
| G | Compare view + filesystem / paperhub.* MCP | 🔜 planned |
Each plan ships working, testable software on its own. Plans live under docs/superpowers/plans/.
PaperHub is built spec → plan → TDD, with subagent-driven implementation and per-task spec-compliance + code-quality review.
Backend gates (from backend/):
uv run pytest # 602 tests, hermetic
uv run ruff check src tests
uv run mypy src # --strict
Frontend gates (from frontend/):
npm test # Vitest + RTL + MSW (217 tests)
npm run typecheck # tsc --strict
npm run lint # ESLint flat config
npm run build # Vite production build
Replay any past chat turn from SQLite (debugging the agent flow):
cd backend
uv run paperhub-replay --run-id 1
Contributing AI agents: read CLAUDE.md first. It carries the conventions, the fix-now policy, and the agent-flow observability rules.
.
├── backend/
│ ├── src/paperhub/ # FastAPI app · agents · pipelines · rag · mcp · modelserver · tracer
│ ├── tests/ # pytest suite (602 tests, hermetic)
│ └── pyproject.toml # uv project · mypy --strict · ruff
├── frontend/ # React 19 + Vite + Tailwind + Zustand
├── docs/superpowers/
│ ├── specs/ # SRS — authoritative architecture document
│ └── plans/ # implementation plans, one per sub-project
├── reference/ # copied source from paper2slides-plus + Intro2GenAI-hw1
├── CLAUDE.md # AI-agent orientation for this repo
└── README.md
workspace/ (gitignored) holds runtime state — the SQLite database, the papers cache, and the Chroma index.
- System Requirements Specification — authoritative architecture, schema, scope, and acceptance criteria (currently v2.17).
- Implementation plans — one per sub-project, each executed via TDD.
- Backend developer docs — backend-specific notes.
Apache License 2.0 — © PaperHub contributors. You may use, modify, and distribute this software under the terms of the license, which includes an express grant of patent rights from contributors.