Feature(LiteLLM): Wire LiteLLM Docker profile with application config overrides (provider and embedder) #533
Code Review
This pull request introduces LiteLLM-specific configuration files (embedder.litellm.json and generator.litellm.json) and mounts them as overrides in docker-compose-litellm.yml. Feedback is provided to address two critical configuration issues: first, the num_ctx parameter should be removed from the litellm provider models in generator.litellm.json to prevent a runtime TypeError when calling the OpenAI-based client; second, the initialize_kwargs block in embedder.litellm.json should be removed to avoid passing literal placeholder strings when environment variables are unset, allowing the client to safely fall back to its native environment lookups.
"qwen3:1.7b": {
  "temperature": 0.7,
  "top_p": 0.8,
  "num_ctx": 32000
},
"llama3:8b": {
  "temperature": 0.7,
  "top_p": 0.8,
  "num_ctx": 8000
},
"qwen3:8b": {
  "temperature": 0.7,
  "top_p": 0.8,
  "num_ctx": 32000
}
The litellm provider models include the num_ctx parameter. Since LiteLLMClient inherits from OpenAIClient and utilizes the standard openai Python SDK under the hood, passing non-standard parameters like num_ctx directly in model_kwargs will cause a TypeError (unexpected keyword argument) client-side when calling chat.completions.create.
Additionally, the context window size (num_ctx) is typically configured on the LiteLLM server side rather than per-request. Removing num_ctx and correcting the indentation to match the rest of the file (10 spaces for properties) resolves this runtime risk and keeps the configuration clean.
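The failure mode described in this comment can be sketched without the real SDK. In the minimal example below, `fake_create` is a hypothetical stand-in for a method with a fixed keyword signature, and `STANDARD_PARAMS` is an assumed, incomplete allow-list; neither is taken from the project or from the openai package. It only illustrates why an extra key such as `num_ctx` raises a `TypeError` when a config dict is unpacked into such a call, and how the key could be filtered out beforehand.

```python
from typing import Any

# Assumed, incomplete allow-list of per-request parameters (illustration only).
STANDARD_PARAMS = {"temperature", "top_p", "max_tokens", "stream"}


def split_model_kwargs(
    model_kwargs: dict[str, Any],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate allow-listed request parameters from provider-specific extras."""
    standard = {k: v for k, v in model_kwargs.items() if k in STANDARD_PARAMS}
    extras = {k: v for k, v in model_kwargs.items() if k not in STANDARD_PARAMS}
    return standard, extras


def fake_create(
    *, model: str, temperature: float = 1.0, top_p: float = 1.0
) -> dict[str, Any]:
    """Hypothetical stand-in for an SDK method with a fixed keyword signature."""
    return {"model": model, "temperature": temperature, "top_p": top_p}


# Per-model settings as they appear in the diff under review.
config = {"temperature": 0.7, "top_p": 0.8, "num_ctx": 32000}

try:
    fake_create(model="qwen3:8b", **config)
    raised = False
except TypeError:
    # num_ctx is not part of the signature, so Python rejects the call.
    raised = True

# Filtering first lets the same config be used safely.
standard, extras = split_model_kwargs(config)
result = fake_create(model="qwen3:8b", **standard)
```

Removing `num_ctx` from the JSON, as suggested, avoids the need for any such filtering in the first place.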
"qwen3:1.7b": {
  "temperature": 0.7,
  "top_p": 0.8
},
"llama3:8b": {
  "temperature": 0.7,
  "top_p": 0.8
},
"qwen3:8b": {
  "temperature": 0.7,
  "top_p": 0.8
}

"client_class": "LiteLLMClient",
"initialize_kwargs": {
  "api_key": "${LITELLM_API_KEY}",
  "base_url": "${LITELLM_BASE_URL}"
},
"batch_size": 10,
The initialize_kwargs block explicitly passes api_key and base_url using environment variable placeholders. However, if these environment variables are not set, the configuration loader (replace_env_placeholders in api/config.py) will leave the literal placeholder strings "${LITELLM_API_KEY}" and "${LITELLM_BASE_URL}" intact. This will cause LiteLLMClient to initialize with these invalid literal strings instead of falling back to its built-in defaults or environment variable lookups.
Since LiteLLMClient already natively handles retrieving LITELLM_API_KEY and LITELLM_BASE_URL from the environment (with sensible fallbacks like "dummy" and "http://localhost:4000"), you can safely remove the initialize_kwargs block entirely to make the configuration more robust and less redundant.
"client_class": "LiteLLMClient",
"batch_size": 10,
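To make the placeholder concern concrete, here is a minimal sketch of a loader that behaves the way the comment describes: `${VAR}` is substituted when the variable is set and left as a literal string otherwise. `substitute_env` and `resolve_kwargs` are hypothetical helpers written for this illustration. They are not the project's `replace_env_placeholders`, whose actual implementation may differ.

```python
import re

# Matches ${NAME} placeholders with upper-case names (assumed format).
PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")


def substitute_env(value: str, env: dict[str, str]) -> str:
    """Replace ${VAR} with env[VAR]; leave the placeholder as-is when VAR is unset."""
    return PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)


def resolve_kwargs(kwargs: dict[str, str], env: dict[str, str]) -> dict[str, str]:
    """Substitute placeholders, then drop entries that are still unresolved."""
    resolved = {k: substitute_env(v, env) for k, v in kwargs.items()}
    return {k: v for k, v in resolved.items() if not PLACEHOLDER.search(v)}


init_kwargs = {
    "api_key": "${LITELLM_API_KEY}",
    "base_url": "${LITELLM_BASE_URL}",
}
# LITELLM_API_KEY is deliberately left unset in this example environment.
env = {"LITELLM_BASE_URL": "http://localhost:4000"}

# Naive substitution passes the literal placeholder through as the key.
literal = substitute_env(init_kwargs["api_key"], env)

# Dropping unresolved entries lets the client apply its own fallback.
safe = resolve_kwargs(init_kwargs, env)
```

Dropping unresolved entries is one alternative to deleting `initialize_kwargs` outright; the review's suggestion is simpler because the client is said to read both variables itself.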
Summary
This PR is part of a 4-PR integration effort to add LiteLLM support to DeepWiki-Open while maintaining full backward compatibility.
PR Series
Summary (this PR)
This PR completes the deployment-layer integration for LiteLLM by wiring the Docker Compose setup to application-level provider and embedding configurations.
It ensures that when LiteLLM is enabled via Docker, the correct provider configuration and model registry are automatically applied.
🔧 Changes
Configuration Layer
generator.litellm.json with full provider registry for:
embedder.litellm.json for LiteLLM-based embedding configuration
Docker Integration
docker-compose-litellm.yml
  generator.litellm.json → generator.json
  embedder.litellm.json → embedder.json
Provider Alignment
🧠 Design Notes
🔄 Compatibility
This change is fully backward compatible:
🧪 Testing
Tested with:
📝 Notes for Reviewers
This PR focuses strictly on deployment + configuration wiring.
A documentation PR will follow to explain usage and setup flows.