A lightweight local proxy that sits between Claude Code and LM Studio. It enables fully local, zero-cloud agentic coding and automatically routes each request to either a coder model or a supporting vision-capable model.
- Text/code requests → qwen/qwen3-coder-next (primary coder)
- Image/vision requests → mistralai/devstral-small-2-2512 (vision + coding)
Once images are analyzed and cached, all subsequent turns automatically route back to the coder model — with image descriptions injected as text so the coder retains full visual context without ever receiving raw image data.
IMPORTANT: The image description cache is in-memory and is lost when the router restarts. Start a new Claude Code session after restarting the router, or tell Claude Code to recheck the images because they have "changed".
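The caching and substitution behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the router's actual code: the function names, the cache key (media type plus the base64 payload), and the use of a plain `Map` are all assumptions. The message shape follows the Anthropic image content block format.

```javascript
// Hypothetical sketch: names and cache key are assumptions, not the router's real code.
const descriptionCache = new Map(); // in-memory only, gone when the process exits

// Assumed key: media type plus the base64 payload (a real router might hash this).
function imageKey(block) {
  return `${block.source.media_type}:${block.source.data}`;
}

// Store the vision model's text description for one image block.
function cacheDescription(block, description) {
  descriptionCache.set(imageKey(block), description);
}

// Replace image blocks with text so the coder model never receives raw image data.
function substituteImages(content) {
  return content.map((block) => {
    if (block.type !== "image") return block;
    const description = descriptionCache.get(imageKey(block));
    const text = description
      ? `<<IMAGE DESCRIPTION: ${description}>>`
      : "<<IMAGE: binary image file>>"; // fallback when nothing was cached
    return { type: "text", text };
  });
}
```

Because the `Map` lives in the router process, a restart empties it, which is why a fresh Claude Code session is needed afterwards.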
    Claude Code
         │
         ▼
    claude-router :3000
         ├── last user message has uncached images? ──► Devstral (vision)
         │                                                   │
         │                                          caches description
         │                                                   │
         └── text / cached images ◄──────────────────────────┘
                      │
                      ▼  (images replaced with <<IMAGE DESCRIPTION: ...>>)
              Qwen3 Coder Next
                      │
                      ▼
               LM Studio :1234
                      │
                      ▼
                your machine
                 zero cloud
The router applies several transforms to work around prompt template limitations of local models:
| Transform | Purpose |
|---|---|
| rewriteForVision | Extracts images from tool_result blocks, injects synthetic assistant/user turn |
| substituteImagesWithDescriptions | Replaces raw images in coder-bound requests with cached text descriptions |
| normalizeAssistantMessages | Merges interleaved text+tool_use blocks (required by Devstral's jinja template) |
| fixMessageAlternation | Merges consecutive same-role messages to enforce strict alternation |
| Strip thinking / betas | Removes Claude-specific params unsupported by local models |
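To illustrate the alternation transform from the table, here is a minimal sketch of merging consecutive same-role messages. It is a hypothetical re-implementation for illustration only: `mergeConsecutiveRoles` and `toBlocks` are assumed names, and the router's own fixMessageAlternation may handle more cases.

```javascript
// Hypothetical sketch; the router's own fixMessageAlternation may differ.

// Anthropic-style content can be a plain string or an array of blocks.
function toBlocks(content) {
  return typeof content === "string" ? [{ type: "text", text: content }] : content;
}

// Merge runs of same-role messages so roles strictly alternate.
function mergeConsecutiveRoles(messages) {
  const merged = [];
  for (const message of messages) {
    const previous = merged[merged.length - 1];
    if (previous && previous.role === message.role) {
      // Same role as the previous message: append blocks instead of adding a turn.
      previous.content = [...previous.content, ...toBlocks(message.content)];
    } else {
      merged.push({ role: message.role, content: [...toBlocks(message.content)] });
    }
  }
  return merged;
}
```

Normalizing every message to a block array first keeps the merge step simple, since string and array content never have to be combined directly.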
- Node.js 18+
- LM Studio 0.4.1+ (for Anthropic-compatible /v1/messages endpoint)
- Claude Code installed globally
- Models loaded in LM Studio:
  - qwen/qwen3-coder-next
  - mistralai/devstral-small-2-2512
Optional convenience setup. Otherwise, just run the router directly as described in Usage, step 2, "Start the router" (node ~/bin/claude-router.mjs).
# 1. Place the router in your personal bin
mkdir -p ~/bin
cp claude-router.mjs ~/bin/claude-router.mjs
chmod +x ~/bin/claude-router.mjs
# 2. Make sure ~/bin is on your PATH (add to ~/.zshrc if needed)
export PATH="$HOME/bin:$PATH"

Load both models in LM Studio:

lms load qwen/qwen3-coder-next --gpu max
lms load mistralai/devstral-small-2-2512 --gpu max
# Verify both are running
lms ps

Or just do it in the LM Studio GUI.
Tip: Devstral defaults to 4096 context. Bump it for longer sessions:
lms load mistralai/devstral-small-2-2512 --context-length 32768 --gpu max
Start the router:

node ~/bin/claude-router.mjs

Expected output:
🔀 claude-router on :3000
coder → qwen/qwen3-coder-next
vision → mistralai/devstral-small-2-2512
The launcher setup is admittedly crude.
Create a launcher script ~/bin/claude-local:
#!/bin/bash
unset ANTHROPIC_API_KEY
ANTHROPIC_BASE_URL="http://127.0.0.1:3000" \
ANTHROPIC_AUTH_TOKEN="lmstudio" \
ANTHROPIC_API_KEY="" \
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
claude "$@"

chmod +x ~/bin/claude-local

Then use claude-local instead of claude for local sessions:
claude-local # uses LM Studio via router
claude # unchanged, still uses Anthropic servers

Edit the constants at the top of claude-router.mjs:
const CODER_MODEL = "qwen/qwen3-coder-next"; // model for code tasks
const VISION_MODEL = "mistralai/devstral-small-2-2512"; // model for image tasks
const PROXY_PORT = 3000; // router port

LM Studio is assumed to be running on 127.0.0.1:1234. To change this, update the forward() function:
hostname: "127.0.0.1",
port: 1234,

| Condition | Model |
|---|---|
| Last user message contains uncached images | VISION_MODEL |
| Last user message contains only cached images | CODER_MODEL + descriptions injected |
| Last user message is text only | CODER_MODEL |
Images are considered "cached" once Devstral has analyzed them and returned a text description. The cache persists for the lifetime of the router process (in-memory, per session).
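The routing table above boils down to one decision, sketched below. The names `lastUserImages` and `pickModel` are hypothetical, and `isCached` is an assumed caller-supplied predicate. The sketch only inspects top-level image blocks in the last user message, whereas the real router also extracts images from tool_result blocks.

```javascript
// Hypothetical sketch of the routing rule; function names are illustrative only.
const CODER_MODEL = "qwen/qwen3-coder-next";
const VISION_MODEL = "mistralai/devstral-small-2-2512";

// Collect top-level image blocks from the most recent user message.
function lastUserImages(messages) {
  const lastUser = [...messages].reverse().find((m) => m.role === "user");
  if (!lastUser || typeof lastUser.content === "string") return [];
  return lastUser.content.filter((block) => block.type === "image");
}

// isCached(image) is an assumed predicate that reports whether a description exists.
function pickModel(messages, isCached) {
  const hasUncached = lastUserImages(messages).some((image) => !isCached(image));
  // Any uncached image goes to the vision model; everything else goes to the coder.
  return hasUncached ? VISION_MODEL : CODER_MODEL;
}
```

Text-only messages and messages whose images are all cached both fall through to the coder model, matching the last two rows of the table.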
→ coder [qwen/qwen3-coder-next]
→ coder [qwen/qwen3-coder-next]
→ vision [mistralai/devstral-small-2-2512]
cached description for 3 image(s)
→ coder [qwen/qwen3-coder-next]
→ coder [qwen/qwen3-coder-next]
| Component | Version |
|---|---|
| LM Studio | 0.4.10 |
| Coder model | qwen/qwen3-coder-next (44.86 GB, 131072 ctx) |
| Vision model | mistralai/devstral-small-2-2512 (14.12 GB) |
| Node.js | 25.9.0 |
- Image description cache is in-memory — lost when the router restarts. Start a new Claude Code session after restarting the router.
- Streaming responses from Devstral may not be parseable as JSON for caching. If no description is cached, images fall back to <<IMAGE: binary image file>> in coder history.
- Not a production proxy: no TLS, no auth, no rate limiting. Intended for local single-user use only.
- Model name must match exactly what lms ps reports. A mismatch silently falls back to whatever LM Studio has loaded as default.
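Regarding the streaming limitation above, one possible way to recover a description from a streamed response is to accumulate the text deltas from the server-sent events. The sketch below is not part of the router. It assumes LM Studio's /v1/messages endpoint emits Anthropic-style `content_block_delta` events with `text_delta` payloads, and `collectStreamedText` is a hypothetical helper name.

```javascript
// Hypothetical helper, assuming Anthropic-style SSE events from the upstream model.
function collectStreamedText(sseBody) {
  let text = "";
  for (const line of sseBody.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip "event:" lines and blanks
    const payload = line.slice(5).trim();
    let event;
    try {
      event = JSON.parse(payload);
    } catch {
      continue; // ignore non-JSON payloads instead of failing the whole stream
    }
    if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
      text += event.delta.text;
    }
  }
  return text;
}
```

The accumulated string could then be cached as the image description in place of a parsed JSON body.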
MIT