An open-source tool-calling LLM agent for constructing VEuPathDB search strategies.
How Underspecified Prompts Shape Tool-Calling LLM Agents in Scientific Workflows
PathFinder’s goal is to make complex query/strategy construction easier, faster, and more reliable by combining:
- Unified agent (a single agent that researches, plans, and executes as needed per turn)
- Execution with real tools (build/edit a real strategy graph via validated tool calls)
- Catalog grounding (live WDK catalog + optional Qdrant RAG for fast discovery and examples)
This project is intended to be integrated with VEuPathDB systems in the future once the research prototype is sufficiently mature.
This repo is organized as:
- apps/api/: FastAPI backend (“Pathfinder API”)
  - SSE chat endpoint (/api/v1/chat) streams agent output and tool events.
  - A single unified agent that can research, plan, and execute tool calls -- the model decides which capability to use on each turn.
- apps/web/: Next.js UI
  - Chat UI with strategy graph visualization, step editing, and result panes.
  - Workbench for gene set management and multi-panel analysis (enrichment, distributions, cross-validation).
  - Proxies API routes via Next rewrites (see apps/web/next.config.js).
- packages/shared-ts/: shared TypeScript types (and OpenAPI tooling)
  - The web app imports types via TS path mapping to packages/shared-ts/src (see apps/web/tsconfig.json).
- packages/shared-py/: shared Pydantic models (Python)
- packages/spec/: OpenAPI spec (packages/spec/openapi.yaml)
The API also includes: gene set management, evaluation engine (metrics, cross-validation, enrichment), export tools, model catalog with token metrics, and workbench chat.
PathFinder uses a single unified agent that has access to all tools (research, planning, and execution) and decides which to invoke on each turn. The model uses its judgment to:
- Research: explore the catalog, clarify ambiguous goals, discover record types / searches / parameters
- Plan: save planning artifacts (markdown summaries, assumptions, parameter choices), reason about strategy structure
- Execute: create/update strategy graph steps via tool calls, validate parameters against WDK search specs, run multi-step builds using delegation (sub-agent orchestration)
The API streams Server-Sent Events (SSE) for:
- assistant deltas and final messages
- tool call start/end (including tool results)
- “derived” UI events emitted from tool results (e.g., planning artifacts, citations, graph snapshots)
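A client consuming this stream has to split it into discrete events. The following is a minimal sketch of parsing the standard SSE wire format (`event:` and `data:` fields, with a blank line ending each event). The event names in the sample (`assistant_delta`, `tool_call_start`) and the JSON payload shapes are illustrative assumptions, not the names the API actually emits.

```python
import json
from typing import Iterator


def parse_sse(raw: str) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from raw SSE text.

    Only the `event:` and `data:` fields are handled. Multi-line data is
    joined with newlines, as the SSE format specifies.
    """
    event_name = "message"  # SSE default when no `event:` field is sent
    data_lines: list[str] = []
    for line in raw.splitlines():
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith("event:"):
            event_name = line[len("event:"):].removeprefix(" ")
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].removeprefix(" "))


# Hypothetical sample stream; real event names may differ.
sample = (
    "event: assistant_delta\n"
    'data: {"text": "Looking up"}\n'
    "\n"
    "event: tool_call_start\n"
    'data: {"tool": "search_catalog"}\n'
    "\n"
)
events = list(parse_sse(sample))
```

An event that is not terminated by a blank line is dropped, which matches how SSE clients treat an incomplete trailing event.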
Key entrypoints:
- API app: apps/api/src/veupath_chatbot/main.py
- Chat orchestration: apps/api/src/veupath_chatbot/services/chat/orchestrator.py
- SSE streaming: apps/api/src/veupath_chatbot/transport/http/streaming.py
- Unified tool registry: apps/api/src/veupath_chatbot/ai/tools/unified_registry.py
- Graph step creation + validation: apps/api/src/veupath_chatbot/ai/tools/strategy_tools/step_ops.py
PathFinder can discover catalog/search metadata via:
- Live WDK calls (authoritative)
- Qdrant RAG (fast semantic retrieval; may be stale/incomplete)
RAG is controlled by a single setting named rag_enabled in apps/api/config.toml (see apps/api/src/veupath_chatbot/platform/config.py).
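One way a caller could combine the two sources is to treat RAG hits as hints and keep only those the live catalog confirms. The sketch below illustrates that idea under stated assumptions: `discover_searches`, the two lookup callables, and the search names are hypothetical and do not describe PathFinder's actual discovery code.

```python
from typing import Callable

# A lookup takes a free-text query and returns candidate search names.
Lookup = Callable[[str], list[str]]


def discover_searches(
    query: str,
    rag_enabled: bool,
    rag_lookup: Lookup,
    wdk_lookup: Lookup,
) -> list[str]:
    """Return candidate search names, trusting only the live catalog.

    RAG results may be stale, so a RAG hit is kept only if the live
    WDK lookup also returns it.
    """
    live = wdk_lookup(query)
    if not rag_enabled:
        return live
    live_set = set(live)
    confirmed = [name for name in rag_lookup(query) if name in live_set]
    # Confirmed RAG hits come first (semantic ranking), then the rest.
    return confirmed + [name for name in live if name not in confirmed]


# Stub lookups with made-up search names, for illustration only.
def fake_rag(query: str) -> list[str]:
    return ["search_stale", "search_b"]


def fake_wdk(query: str) -> list[str]:
    return ["search_a", "search_b"]


with_rag = discover_searches("kinases", True, fake_rag, fake_wdk)
without_rag = discover_searches("kinases", False, fake_rag, fake_wdk)
```

Here the stale RAG hit is dropped because the live lookup does not return it, and disabling RAG simply returns the live results unchanged.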
- Docker (recommended for Postgres, Qdrant, and the full stack)
- Python 3.14+
- Node.js 24+
Enable local formatting/linting hooks so issues are caught before push:
cd apps/api
uv sync --extra dev
cd ../..
yarn install
uv run pre-commit install --hook-type pre-commit --hook-type pre-push

There are two configuration sources for the API:
- TOML: apps/api/config.toml (checked in)
- Environment: .env (not checked in; examples exist)
Examples:
- apps/api/.env.example
- apps/web/.env.example
Docker Compose will pick up variables from a repo-root .env file (if present) and/or your shell environment. In practice your .env should contain at least:
- API
  - API_SECRET_KEY (32+ chars)
  - At least one LLM provider key:
    - OPENAI_API_KEY: default provider (gpt-4.1); also required for RAG embeddings (text-embedding-3-small)
    - ANTHROPIC_API_KEY: use with chat_provider=anthropic (default model: claude-sonnet-4-6)
    - GEMINI_API_KEY: use with chat_provider=gemini (default model: gemini-2.5-pro)
    - Ollama (local): no key needed; set OLLAMA_BASE_URL and add models to ollama_models.yaml
- Web
  - NEXT_PUBLIC_API_URL=http://localhost:8000
- Optional / common
  - DATABASE_URL (defaults to PostgreSQL on localhost:5432 if unset)
  - QDRANT_URL / QDRANT_API_KEY (only needed if you’re not using the docker-compose defaults)
  - OLLAMA_BASE_URL (default http://localhost:11434/v1; use http://host.docker.internal:11434/v1 when running the API inside Docker)
  - Startup-ingestion tuning is configured in apps/api/config.toml (keys: rag_startup_*)
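The minimum requirements above can be checked before starting the stack. This is a small sketch of such a pre-flight check. `check_env` is a hypothetical helper, not part of PathFinder, and treating an explicitly set OLLAMA_BASE_URL as satisfying the provider requirement is an assumption.

```python
from collections.abc import Mapping

PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def check_env(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the env looks OK."""
    problems: list[str] = []
    if len(env.get("API_SECRET_KEY", "")) < 32:
        problems.append("API_SECRET_KEY must be at least 32 characters")
    has_provider_key = any(env.get(key) for key in PROVIDER_KEYS)
    # Assumption: an explicit OLLAMA_BASE_URL signals a local-only setup.
    has_ollama = bool(env.get("OLLAMA_BASE_URL"))
    if not (has_provider_key or has_ollama):
        problems.append("set at least one LLM provider key or OLLAMA_BASE_URL")
    return problems


good = check_env({"API_SECRET_KEY": "x" * 32, "OPENAI_API_KEY": "sk-test"})
bad = check_env({})
```

Passing `os.environ` to `check_env` applies the same check to the current shell environment.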
PathFinder supports local LLMs via Ollama. To add local models:
- Install and start Ollama (ollama serve).
- Pull any models you want (e.g. ollama pull qwen3:8b).
- Copy the example config and edit it:
cp ollama_models.yaml.example ollama_models.yaml

Each entry in ollama_models.yaml specifies:
| Field | Required | Description |
|---|---|---|
| model | yes | Ollama model name (e.g. qwen3:8b, llama3) |
| name | no | Display name in the UI (defaults to model name) |
| thinking | no | Whether the model supports reasoning (default false) |
| context_size | no | Max context window in tokens (default 4096) |
Example:
models:
  - model: qwen3:8b
    name: Qwen 3 8B
    thinking: true
    context_size: 40960
  - model: llama3
    name: Llama 3
    context_size: 8192

When running the API inside Docker, set OLLAMA_BASE_URL=http://host.docker.internal:11434/v1 in your .env so the container can reach Ollama on the host.
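To make the documented defaults concrete, here is a sketch of how one already-parsed entry from ollama_models.yaml could be normalized. `OllamaModel` and `normalize_entry` are illustrative names, not PathFinder's actual loader, and the YAML parsing step itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OllamaModel:
    model: str
    name: str
    thinking: bool = False
    context_size: int = 4096


def normalize_entry(entry: dict) -> OllamaModel:
    """Apply the documented defaults to one parsed `models:` entry."""
    if "model" not in entry:
        raise ValueError("each entry requires a 'model' field")
    return OllamaModel(
        model=entry["model"],
        name=entry.get("name", entry["model"]),  # defaults to model name
        thinking=entry.get("thinking", False),
        context_size=entry.get("context_size", 4096),
    )


# Entries as they would look after YAML parsing.
entries = [
    {"model": "qwen3:8b", "name": "Qwen 3 8B", "thinking": True, "context_size": 40960},
    {"model": "llama3"},
]
models = [normalize_entry(e) for e in entries]
```

The second entry shows the defaults in action: with only `model` given, the display name falls back to `llama3`, `thinking` to false, and `context_size` to 4096.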
From repo root:
docker compose up --build
- Web: http://localhost:3000
- API: http://localhost:8000
  - Docs: http://localhost:8000/docs
  - Health: http://localhost:8000/health
Notes:
- Compose includes Postgres, Redis, and Qdrant by default.
PathFinder supports RAG for:
- WDK catalog ingestion (record types + searches)
- Example plans ingestion (public strategies → searchable examples)
By default, ingestion runs automatically in the API at startup (in the background) when:
- rag_enabled=true (default)
- OPENAI_API_KEY is set (required for embeddings)
Manual ingestion is usually unnecessary unless you want to reset/rebuild the collections.
# Full reset + rebuild of Qdrant collections (WDK + example plans)
docker compose --profile ingest run --rm rag_reindex

Notes:
- Both jobs require OPENAI_API_KEY (embeddings).
- The manual reindex writes a JSONL report under apps/api/ingest_reports/ (gitignored).
API:
cd apps/api
uv sync --extra dev
uv run uvicorn veupath_chatbot.main:app --reload --host 0.0.0.0 --port 8000

If you’re not running the full stack via Docker Compose, you still need local services:
docker compose up -d db redis qdrant

Web:
cd apps/web
yarn install
yarn dev

cd apps/api
uv run pytest
uv run ruff check .
uv run mypy src

cd apps/web
yarn lint
yarn typecheck
yarn test

API docs: veupathdb-pathfinder.readthedocs.io
API documentation is built with Sphinx and covers architecture, agents, tools, and modules. A .readthedocs.yaml config is included for hosting on Read the Docs.
cd apps/api
uv sync --extra dev
uv run sphinx-build -b html docs docs/_build/html

Open apps/api/docs/_build/html/index.html in a browser.
- OpenAPI spec: packages/spec/openapi.yaml
- Generate/update shared TS types from the spec:
cd packages/shared-ts
yarn install
yarn generate:openapi

The web app also uses path-based imports for shared TS types (see apps/web/tsconfig.json) and Next transpilation settings (apps/web/next.config.js).
PathFinder is a research-driven prototype. These are the biggest gaps you should expect today:
- CD (deployment pipelines): CI (.github/workflows/ci.yml) and a security scan workflow exist, but there is no continuous deployment pipeline yet.
- Contribution docs: no CONTRIBUTING.md, no governance/release process.
- Production hardening: no documented deployment path (containers, reverse proxy, secrets management).
- Database migrations: Alembic is set up with 4 migrations, but schema creation still relies on SQLAlchemy create_all; Alembic is not yet used as the primary migration workflow.
- Evaluation (thesis): an evaluation framework exists in thesis/eval/ (gold strategies, prompts, analysis scripts), but reproducible experiment packaging and benchmarks are still in progress.
PathFinder is built around the idea that ambiguous or underspecified requests are normal when humans describe complex strategies. The system therefore emphasizes:
- integrated planning (artifacts, structured reasoning, and delegation -- all within a single agent rather than a separate mode)
- catalog grounding (reduce hallucinated tool names/parameters)
- validation and error shaping (turn tool failures into actionable, structured feedback)
- decomposition + delegation (break complex goals into smaller strategy subproblems)
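The validation and error-shaping idea can be illustrated with a short sketch: check a step's parameters against a search spec and return structured feedback the agent can act on, rather than a bare failure. `ParamSpec`, `validate_step`, the error codes, and the parameter names are all assumptions made for illustration. They do not mirror the real WDK spec format or PathFinder's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParamSpec:
    name: str
    required: bool = True
    allowed: Optional[set[str]] = None  # None means any value is accepted


def validate_step(params: dict[str, str], spec: list[ParamSpec]) -> dict:
    """Check `params` against `spec` and return structured feedback."""
    known = {p.name for p in spec}
    errors: list[dict] = []
    for p in spec:
        if p.name not in params:
            if p.required:
                errors.append({"param": p.name, "code": "missing"})
        elif p.allowed is not None and params[p.name] not in p.allowed:
            # Include the allowed values so the caller can self-correct.
            errors.append(
                {"param": p.name, "code": "invalid_value", "allowed": sorted(p.allowed)}
            )
    for name in params:
        if name not in known:
            errors.append({"param": name, "code": "unknown_param", "known": sorted(known)})
    return {"ok": not errors, "errors": errors}


# Hypothetical spec and a call with a misspelled value and a made-up parameter.
spec = [
    ParamSpec("organism", allowed={"P. falciparum 3D7"}),
    ParamSpec("min_score", required=False),
]
result = validate_step({"organism": "P. falciprum", "limit": "10"}, spec)
```

Each error carries the information needed for a retry (the allowed values, or the known parameter names), which is what makes the feedback actionable for a tool-calling model.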
PathFinder builds on:
- VEuPathDB / WDK concepts and APIs (strategy graphs, searches, parameter specs)
- FastAPI (API) and Next.js (web UI)
- Kani for tool-calling agent orchestration