LCORE-1853: Add relevance cutoff score to inline BYOK RAG#1720
Conversation
|
Too many files changed? Review this PR in Change Stack to see how the pieces fit before you dive in. Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (11)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
🧰 Additional context used📓 Path-based instructions (5)src/**/*.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/**/configuration.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/constants.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
tests/**/*.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
src/models/**/*.py📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧠 Learnings (5)📓 Common learnings📚 Learning: 2026-06-03T11:15:41.987ZApplied to files:
📚 Learning: 2026-01-12T10:58:40.230ZApplied to files:
📚 Learning: 2026-02-25T07:46:33.545ZApplied to files:
📚 Learning: 2026-03-17T11:34:53.242ZApplied to files:
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txtdocs/openapi.json[error] 1-1: docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json (diff -u detected mismatch against /tmp/openapi-generated.json) 🪛 GitHub Actions: OpenAPI (Spectral) / spectraldocs/openapi.json[error] 1-1: OpenAPI schema is out of date. CI diff check failed for docs/openapi.json vs generated schema. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json 🪛 markdownlint-cli2 (0.22.1)docs/byok_guide.md[warning] 93-93: Blank line inside blockquote (MD028, no-blanks-blockquote) [warning] 167-167: Headings should be surrounded by blank lines (MD022, blanks-around-headings) [warning] 175-175: Headings should be surrounded by blank lines (MD022, blanks-around-headings) [warning] 409-409: Blank line inside blockquote (MD028, no-blanks-blockquote) 🛑 Comments failed to post (4)
🔇 Additional comments (14)
WalkthroughAdds per-source BYOK RAG relevance_cutoff_score (default 0.3), exposes AppConfig.relevance_cutoff_mapping, threads per-store score_threshold into vector search calls for Inline RAG, updates OpenAPI and docs, and adds tests covering defaults, validation, config dumping, and vector query propagation/filtering. ChangesBYOK RAG Relevance Cutoff Configuration
🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: The OpenAPI schema file docs/openapi.json is out of
sync and should be regenerated rather than edited; run the provided generator
script (uv run scripts/generate_openapi_schema.py docs/openapi.json) to recreate
the schema, then verify the relevance_cutoff_score field in the regenerated JSON
has the correct default value that matches
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE from src/constants.py and commit the
regenerated file.
In `@src/utils/vector_search.py`:
- Around line 29-34: Update the docstring for _relevance_cutoff_for_vector_store
to follow Google Python conventions: add a brief description line, then a
Parameters section documenting vector_store_id (str) and what it represents, and
a Returns section describing the returned float (either the matched
brag.relevance_cutoff_score or
constants.DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE). If the function can raise
any exceptions, add a Raises section; otherwise omit Raises. Ensure the
docstring text references the function name and types only (no code) and matches
the existing one-line summary.
In `@tests/unit/models/config/test_dump_configuration.py`:
- Line 11: Replace the top-level import "import constants" with explicit
from-imports for the names actually used in this test (e.g., "from constants
import FOO, BAR" — substitute the real constant names referenced in this file)
and update usages that refer to constants.<NAME> (including the reference around
line 996) to use the direct names (FOO, BAR, etc.) so the import style matches
the existing "from X import Y" pattern used elsewhere in the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f0c9945a-ead4-4245-9d4b-99ea45582ed7
📒 Files selected for processing (8)
docs/byok_guide.mddocs/openapi.jsonsrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: black
- GitHub Check: unit_tests (3.13)
- GitHub Check: Pylinter
- GitHub Check: radon
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/models/config.pysrc/constants.pysrc/utils/vector_search.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
constants.pyfor shared constants with descriptive comments and type hints usingFinal[type]
Files:
src/constants.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_dump_configuration.pytests/unit/utils/test_vector_search.pytests/unit/models/config/test_byok_rag.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: OpenAPI schema out of date. diff detected changes between docs/openapi.json and /tmp/openapi-generated.json. Regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: OpenAPI schema is out of date. 'diff -u docs/openapi.json /tmp/openapi-generated.json' failed; regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'.
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md
[warning] 89-89: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🔇 Additional comments (15)
src/constants.py (1)
196-198: LGTM! Well-defined constant for BYOK RAG relevance filtering.The constant follows the established pattern with proper type hints and descriptive comment. The default value of 0.3 is reasonable for similarity-based filtering.
src/models/config.py (1)
1636-1643: LGTM! Proper Pydantic field definition for relevance cutoff.The field is correctly configured with:
- Appropriate type annotation and default value from constants
gt=0validation ensuring positive cutoff values- Clear description specifying "raw similarity score" (before multipliers)
The validation constraint requiring values > 0 prevents disabling the cutoff entirely, which aligns with the PR requirement for a "positive float."
tests/unit/models/config/test_dump_configuration.py (1)
996-996: LGTM! Test correctly validates default value serialization.The test properly verifies that the new
relevance_cutoff_scorefield is included in the dumped configuration with the correct default value from constants.tests/unit/models/config/test_byok_rag.py (3)
7-7: LGTM!The import of
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCOREis appropriate for testing the default value behavior.
39-39: LGTM!The test assertions properly validate both the default and custom
relevance_cutoff_scorevalues. The choice of 0.72 as a test value provides good coverage.Also applies to: 59-59, 68-68
208-220: LGTM!The validation test correctly verifies the
gt=0constraint onrelevance_cutoff_score. Testing the boundary case with0.0provides good coverage.src/utils/vector_search.py (2)
186-213: LGTM!The function correctly accepts and forwards the
score_thresholdparameter toclient.vector_io.query. The docstring is properly updated to document the new parameter.
418-427: LGTM!The implementation correctly retrieves the per-vector-store
relevance_cutoff_scoreand passes it to the query function. Each store in the parallel queries will use its own configured cutoff threshold.docs/byok_guide.md (4)
82-82: LGTM!The documentation clearly explains the
relevance_cutoff_scorefeature, including its purpose, default value, and how it's applied during Inline RAG retrieval.
297-298: LGTM!The YAML example clearly demonstrates how to configure
relevance_cutoff_score, including helpful comments about the default value and scope.
304-306: LGTM!The clarification about score space and per-store tuning is valuable guidance for users configuring this feature.
333-336: LGTM!The comparison table clearly shows that
relevance_cutoff_scoreapplies only to Inline RAG (BYOK), providing a helpful reference for users choosing between modes.tests/unit/utils/test_vector_search.py (3)
430-432: LGTM!The mock configurations are consistently updated to include
relevance_cutoff_score, using the default constant value appropriately.Also applies to: 472-474, 516-518, 522-524, 752-754
500-504: LGTM!The assertion correctly verifies that
score_thresholdis passed tovector_io.queryalong with the other expected parameters.
561-599: LGTM!The new test explicitly verifies that a custom
relevance_cutoff_scoreconfiguration is correctly passed tovector_io.queryasscore_threshold. This provides comprehensive coverage for the feature.
7187d0b to
8e966b0
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/byok_guide.md`:
- Around line 84-94: Remove the empty line between the two adjacent blockquote
note blocks so they are back-to-back (i.e., merge the `> [!NOTE]` block about
OKP/BYOK `score_multiplier` and chunk constants with the following `> [!NOTE]`
block about `relevance_cutoff_score`) to satisfy MD028; locate the two `>
[!NOTE]` paragraphs in the byok guide and delete the blank line separating them
so markdownlint no longer reports the blockquote-blank rule.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: dc0b1607-f348-4610-b49a-bace5ce7b17c
📒 Files selected for processing (8)
docs/byok_guide.mddocs/openapi.jsonsrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: spectral
- GitHub Check: Pylinter
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_dump_configuration.pytests/unit/models/config/test_byok_rag.pytests/unit/utils/test_vector_search.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/constants.pysrc/models/config.pysrc/utils/vector_search.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
constants.pyfor shared constants with descriptive comments and type hints usingFinal[type]
Files:
src/constants.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md
[warning] 89-89: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🔇 Additional comments (4)
src/constants.py (1)
196-198: Good addition of shared BYOK relevance cutoff constant.
Typed withFinal[float], clearly named, and placed in the correct constants section.tests/unit/models/config/test_dump_configuration.py (2)
11-11: Import update looks correct and consistent with this test module.
1028-1028: Expected dump payload correctly includesrelevance_cutoff_score.
This strengthens coverage for the new BYOK RAG config field serialization path.docs/openapi.json (1)
11810-11815: Default value is correct.The default value of
0.3matches theDEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCOREconstant fromsrc/constants.py. The schema definition withexclusiveMinimum: 0.0correctly enforces positive values and the description accurately reflects the filtering behavior.
8e966b0 to
fee7578
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/byok_guide.md (1)
89-89:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRemove the blank blockquote line to satisfy markdownlint MD028.
The empty blockquote line between the two adjacent
> [!NOTE]blocks still triggers theMD028 (no-blanks-blockquote)rule violation that was previously flagged.📝 Proposed fix
> context, set the `BYOK_RAG_MAX_CHUNKS` and `OKP_RAG_MAX_CHUNKS` constants in `src/constants.py` > (defaults: 10 and 5 respectively). For Tool RAG, use `TOOL_RAG_MAX_CHUNKS` (default: 10). -> > [!NOTE] > `relevance_cutoff_score` applies to Inline RAG only. When the model uses Tool RAG (`file_search`),🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/byok_guide.md` at line 89, Remove the empty blockquote line between the two adjacent blockquote markers so MD028 is satisfied: find the consecutive lines starting with "> [!NOTE]" and delete the blank line (the standalone ">"/empty blockquote) separating them so the two note blocks are directly adjacent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/openapi.json`:
- Around line 11809-11815: Regenerate the out-of-sync OpenAPI schema file
docs/openapi.json by running the project script
scripts/generate_openapi_schema.py (via the project's runner e.g., `uv run`)
instead of editing the JSON manually; after regeneration verify the
relevance_cutoff_score default in docs/openapi.json matches the
DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE constant defined in src/constants.py,
then commit the regenerated docs/openapi.json.
---
Duplicate comments:
In `@docs/byok_guide.md`:
- Line 89: Remove the empty blockquote line between the two adjacent blockquote
markers so MD028 is satisfied: find the consecutive lines starting with ">
[!NOTE]" and delete the blank line (the standalone ">"/empty blockquote)
separating them so the two note blocks are directly adjacent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d4d39bf3-18f3-4c52-9c75-cdb4a6f534b2
📒 Files selected for processing (8)
docs/byok_guide.mddocs/openapi.jsonsrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: build-pr
- GitHub Check: integration_tests (3.12)
- GitHub Check: Pylinter
- GitHub Check: unit_tests (3.12)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/constants.pysrc/models/config.pysrc/utils/vector_search.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
constants.pyfor shared constants with descriptive comments and type hints usingFinal[type]
Files:
src/constants.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/utils/test_vector_search.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: docs/openapi.json is out of date. Step checks openapi schema via uv run python scripts/generate_openapi_schema.py /tmp/openapi-generated.json and diff -u docs/openapi.json /tmp/openapi-generated.json; diff failed. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json.
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: docs/openapi.json is out of date. Diff detected between docs/openapi.json and generated /tmp/openapi-generated.json. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json
🔇 Additional comments (11)
src/constants.py (1)
196-197: LGTM!src/models/config.py (1)
1721-1727: LGTM!tests/unit/models/config/test_dump_configuration.py (1)
11-11: LGTM!Also applies to: 1028-1028
tests/unit/models/config/test_byok_rag.py (1)
7-7: LGTM!Also applies to: 39-39, 59-59, 68-68, 208-220
src/utils/vector_search.py (2)
29-44: LGTM!
196-223: LGTM!Also applies to: 434-434
docs/byok_guide.md (1)
82-82: LGTM!Also applies to: 297-298, 303-306, 333-336, 586-586
tests/unit/utils/test_vector_search.py (4)
3-5: LGTM!Also applies to: 26-26, 31-54
460-462: LGTM!Also applies to: 502-504, 530-535, 546-548, 552-554, 882-884
591-629: LGTM!
631-674: LGTM!Also applies to: 676-729
|
/ok-to-test |
|
Related to #1525 |
are-ces
left a comment
There was a problem hiding this comment.
LGTM overall TYVM!
I have a couple of nits which you can ignore but I would like to see some integration test for this functionality
| vector_store_id, | ||
| query, | ||
| score_multiplier_mapping.get(vector_store_id, 1.0), | ||
| _relevance_cutoff_for_vector_store(vector_store_id), |
There was a problem hiding this comment.
Tiny nit from SWE perspective. Just wondering: since score multiplier and relevance score have the same data structure, would it make sense to retrieve cutoff score the same way?
| ) | ||
|
|
||
|
|
||
| def test_byok_rag_configuration_relevance_cutoff_must_be_positive() -> None: |
There was a problem hiding this comment.
Nit, do we also want to test behavior if it is negative?
fee7578 to
51e6d31
Compare
|
@are-ces PTAL |
87c28b5 to
f6deca8
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/utils/vector_search.py (2)
568-573: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueStale
max_chunksentry in_fetch_solr_ragdocstring.The Args section documents a
max_chunksparameter (Lines 572-573) that does not exist in the signature(client, query, solr). Remove it to keep the docstring consistent with the function.📝 Proposed docstring fix
solr: Structured Solr inline RAG request from the API (optional). - max_chunks: Maximum number of chunks to return. If None, uses - constants.OKP_RAG_MAX_CHUNKS.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/vector_search.py` around lines 568 - 573, The docstring for _fetch_solr_rag incorrectly documents a non-existent parameter max_chunks; update the Args section of the _fetch_solr_rag docstring to remove the max_chunks entry so it matches the function signature (client, query, solr) and keep the existing descriptions for client, query, and solr unchanged.
248-282:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix BYOK RAG docstring parameter order in
_query_store_for_byok_rag
- The docstring lists
max_chunksbeforescore_threshold, but the signature is(..., score_threshold: float, max_chunks: int = ...). Swap theArgs:entries to match.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/vector_search.py` around lines 248 - 282, The docstring for _query_store_for_byok_rag lists the parameters in the wrong order (shows max_chunks before score_threshold) which doesn't match the function signature; update the Args: section of the _query_store_for_byok_rag docstring so score_threshold appears before max_chunks and adjust their descriptions to match the signature and default (constants.BYOK_RAG_MAX_CHUNKS).
♻️ Duplicate comments (1)
docs/openapi.json (1)
1-19424:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRegenerate
docs/openapi.jsonfrom source-of-truth instead of editing this artifact manually.This file is out of sync with the generated schema, and CI is currently failing on the spectral gate. Please regenerate and commit the generated output, otherwise this PR cannot merge.
🔧 Required fix
-# Manually edited OpenAPI artifact (drifted) +# Regenerated OpenAPI artifact from codeuv run scripts/generate_openapi_schema.py docs/openapi.json🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/openapi.json` around lines 1 - 19424, The openapi.json file is out of sync with the generated schema and manual edits to it have caused CI failures. To fix this, regenerate the openapi.json file from the source-of-truth schema using the provided generation script. Specifically, run the command `uv run scripts/generate_openapi_schema.py docs/openapi.json` to produce an up-to-date and consistent openapi.json artifact, then commit the regenerated file to the PR.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 229-231: Revert the repo-wide pylint relaxation in pyproject.toml
by removing or restoring the [tool.pylint.design] max-args and
max-positional-arguments overrides so the global "too-many-arguments" rules
remain strict, and instead add a targeted suppression or refactor for the BYOK
RAG helper: open src/utils/vector_search.py and either refactor async function
_query_store_for_byok_rag(...) to reduce its parameters or add a local "#
pylint: disable=too-many-arguments" (or "# pylint:
disable=too-many-positional-arguments") immediately above that function
declaration to allow the exception only there. Ensure no other global changes
are made so existing local suppressions (e.g., in response_generator,
_check_shield_moderation, prepare_tools, build_turn_summary,
create_transcript_metadata) remain intact.
In `@src/models/config.py`:
- Around line 2351-2387: The validator currently only checks self.rag.inline for
OKP (constants.OKP_RAG_ID) but the docstring/comments say inline OR tool should
trigger auto-enable; update the predicate by changing the has_okp computation in
validate_reranker_auto_enable to consider both self.rag.inline and self.rag.tool
(e.g., check constants.OKP_RAG_ID in either collection) and adjust the log
message/comment to say "inline or tool" as needed; keep the existing
protected-access and pylint disables around reranker._explicitly_configured
unchanged.
- Around line 2018-2023: The Field default for the "model" attribute in the
config schema is hardcoded to "cross-encoder/ms-marco-MiniLM-L6-v2" which
duplicates the shared constant; update the default to reference the shared
constant DEFAULT_CROSS_ENCODER_MODEL from the constants module (import it at the
top if necessary) so the schema default and runtime default stay in sync, i.e.,
replace the literal string used in the Field default with
constants.DEFAULT_CROSS_ENCODER_MODEL and ensure the import for
DEFAULT_CROSS_ENCODER_MODEL is present where the Field is defined.
In `@src/utils/vector_search.py`:
- Around line 107-119: The docstring in the Solr vector_io query builder in
src/utils/vector_search.py contains two Returns sections; collapse them into a
single Google-style Returns section that describes the returned dict and its
keys (mode, filters, and any other params) and remove the duplicated/malformed
Returns block so the docstring follows Google conventions for the function that
builds query parameters for vector_io.query.
---
Outside diff comments:
In `@src/utils/vector_search.py`:
- Around line 568-573: The docstring for _fetch_solr_rag incorrectly documents a
non-existent parameter max_chunks; update the Args section of the
_fetch_solr_rag docstring to remove the max_chunks entry so it matches the
function signature (client, query, solr) and keep the existing descriptions for
client, query, and solr unchanged.
- Around line 248-282: The docstring for _query_store_for_byok_rag lists the
parameters in the wrong order (shows max_chunks before score_threshold) which
doesn't match the function signature; update the Args: section of the
_query_store_for_byok_rag docstring so score_threshold appears before max_chunks
and adjust their descriptions to match the signature and default
(constants.BYOK_RAG_MAX_CHUNKS).
---
Duplicate comments:
In `@docs/openapi.json`:
- Around line 1-19424: The openapi.json file is out of sync with the generated
schema and manual edits to it have caused CI failures. To fix this, regenerate
the openapi.json file from the source-of-truth schema using the provided
generation script. Specifically, run the command `uv run
scripts/generate_openapi_schema.py docs/openapi.json` to produce an up-to-date
and consistent openapi.json artifact, then commit the regenerated file to the
PR.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a791573e-b3ab-45c7-ba42-5465d7a60429
📒 Files selected for processing (11)
docs/byok_guide.mddocs/openapi.jsonpyproject.tomlsrc/configuration.pysrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/test_configuration.pytests/unit/utils/test_vector_search.py
💤 Files with no reviewable changes (3)
- tests/unit/test_configuration.py
- tests/unit/models/config/test_dump_configuration.py
- tests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/constants.pysrc/configuration.pysrc/models/config.pysrc/utils/vector_search.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
constants.pyfor shared constants with descriptive comments and type hints usingFinal[type]
Files:
src/constants.py
src/**/configuration.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/configuration.py: All config models must extendConfigurationBasewithextra="forbid"to reject unknown fields
Use@field_validatorand@model_validatorfor custom validation in Pydantic models
Files:
src/configuration.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_byok_rag.py
🧠 Learnings (2)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: OpenAPI schema is out of date. docs/openapi.json differs from the newly generated schema. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: OpenAPI schema is out of date. Pipeline diff failed between 'docs/openapi.json' and generated '/tmp/openapi-generated.json'. Regenerate with: 'uv run scripts/generate_openapi_schema.py docs/openapi.json'
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md
[warning] 92-92: Blank line inside blockquote
(MD028, no-blanks-blockquote)
[warning] 93-93: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🔇 Additional comments (8)
src/utils/vector_search.py (2)
270-279: LGTM!
501-523: ⚡ Quick winEnsure
relevance_cutoff_mappinguses the same keyspace as the vector store lookup ID.
src/utils/vector_search.pylooks uprelevance_cutoff_mapping.get(vector_store_id, DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE)wherevector_store_idcomes fromvector_store_ids_to_query; confirmrelevance_cutoff_mappingis keyed by the samevector_db_idnamespace. If it’s keyed by user-facingrag_idinstead, the configured per-store cutoff won’t be applied and will always fall back to the default.tests/unit/models/config/test_byok_rag.py (1)
39-39: LGTM!Also applies to: 59-68, 194-226
src/constants.py (1)
3-3: LGTM!Also applies to: 161-163, 184-205, 295-300
src/models/config.py (1)
231-240: LGTM!Also applies to: 470-527, 617-625, 704-727, 1817-1823, 2176-2180, 2237-2247
src/configuration.py (1)
17-17: LGTM!Also applies to: 31-31, 463-489, 534-550
docs/byok_guide.md (2)
92-93: MD028 blockquote spacing issue still present.This is the same markdownlint issue already reported earlier: blank lines inside adjacent blockquote notes.
82-82: LGTM!Also applies to: 89-91, 94-98, 302-303, 309-311, 338-341, 591-591
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/utils/vector_search.py (2)
568-573: 🧹 Nitpick | 🔵 Trivial | 💤 Low valueStale
max_chunksentry in_fetch_solr_ragdocstring.The Args section documents a
max_chunksparameter (Lines 572-573) that does not exist in the signature(client, query, solr). Remove it to keep the docstring consistent with the function.📝 Proposed docstring fix
solr: Structured Solr inline RAG request from the API (optional). - max_chunks: Maximum number of chunks to return. If None, uses - constants.OKP_RAG_MAX_CHUNKS.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/vector_search.py` around lines 568 - 573, The docstring for _fetch_solr_rag incorrectly documents a non-existent parameter max_chunks; update the Args section of the _fetch_solr_rag docstring to remove the max_chunks entry so it matches the function signature (client, query, solr) and keep the existing descriptions for client, query, and solr unchanged.
248-282:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix BYOK RAG docstring parameter order in
_query_store_for_byok_rag
- The docstring lists
max_chunksbeforescore_threshold, but the signature is(..., score_threshold: float, max_chunks: int = ...). Swap theArgs:entries to match.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/vector_search.py` around lines 248 - 282, The docstring for _query_store_for_byok_rag lists the parameters in the wrong order (shows max_chunks before score_threshold) which doesn't match the function signature; update the Args: section of the _query_store_for_byok_rag docstring so score_threshold appears before max_chunks and adjust their descriptions to match the signature and default (constants.BYOK_RAG_MAX_CHUNKS).
♻️ Duplicate comments (1)
docs/openapi.json (1)
1-19424:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRegenerate
docs/openapi.jsonfrom source-of-truth instead of editing this artifact manually.This file is out of sync with the generated schema, and CI is currently failing on the spectral gate. Please regenerate and commit the generated output, otherwise this PR cannot merge.
🔧 Required fix
-# Manually edited OpenAPI artifact (drifted) +# Regenerated OpenAPI artifact from codeuv run scripts/generate_openapi_schema.py docs/openapi.json🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/openapi.json` around lines 1 - 19424, The openapi.json file is out of sync with the generated schema and manual edits to it have caused CI failures. To fix this, regenerate the openapi.json file from the source-of-truth schema using the provided generation script. Specifically, run the command `uv run scripts/generate_openapi_schema.py docs/openapi.json` to produce an up-to-date and consistent openapi.json artifact, then commit the regenerated file to the PR.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 229-231: Revert the repo-wide pylint relaxation in pyproject.toml
by removing or restoring the [tool.pylint.design] max-args and
max-positional-arguments overrides so the global "too-many-arguments" rules
remain strict, and instead add a targeted suppression or refactor for the BYOK
RAG helper: open src/utils/vector_search.py and either refactor async function
_query_store_for_byok_rag(...) to reduce its parameters or add a local "#
pylint: disable=too-many-arguments" (or "# pylint:
disable=too-many-positional-arguments") immediately above that function
declaration to allow the exception only there. Ensure no other global changes
are made so existing local suppressions (e.g., in response_generator,
_check_shield_moderation, prepare_tools, build_turn_summary,
create_transcript_metadata) remain intact.
In `@src/models/config.py`:
- Around line 2351-2387: The validator currently only checks self.rag.inline for
OKP (constants.OKP_RAG_ID) but the docstring/comments say inline OR tool should
trigger auto-enable; update the predicate by changing the has_okp computation in
validate_reranker_auto_enable to consider both self.rag.inline and self.rag.tool
(e.g., check constants.OKP_RAG_ID in either collection) and adjust the log
message/comment to say "inline or tool" as needed; keep the existing
protected-access and pylint disables around reranker._explicitly_configured
unchanged.
- Around line 2018-2023: The Field default for the "model" attribute in the
config schema is hardcoded to "cross-encoder/ms-marco-MiniLM-L6-v2" which
duplicates the shared constant; update the default to reference the shared
constant DEFAULT_CROSS_ENCODER_MODEL from the constants module (import it at the
top if necessary) so the schema default and runtime default stay in sync, i.e.,
replace the literal string used in the Field default with
constants.DEFAULT_CROSS_ENCODER_MODEL and ensure the import for
DEFAULT_CROSS_ENCODER_MODEL is present where the Field is defined.
In `@src/utils/vector_search.py`:
- Around line 107-119: The docstring in the Solr vector_io query builder in
src/utils/vector_search.py contains two Returns sections; collapse them into a
single Google-style Returns section that describes the returned dict and its
keys (mode, filters, and any other params) and remove the duplicated/malformed
Returns block so the docstring follows Google conventions for the function that
builds query parameters for vector_io.query.
---
Outside diff comments:
In `@src/utils/vector_search.py`:
- Around line 568-573: The docstring for _fetch_solr_rag incorrectly documents a
non-existent parameter max_chunks; update the Args section of the
_fetch_solr_rag docstring to remove the max_chunks entry so it matches the
function signature (client, query, solr) and keep the existing descriptions for
client, query, and solr unchanged.
- Around line 248-282: The docstring for _query_store_for_byok_rag lists the
parameters in the wrong order (shows max_chunks before score_threshold) which
doesn't match the function signature; update the Args: section of the
_query_store_for_byok_rag docstring so score_threshold appears before max_chunks
and adjust their descriptions to match the signature and default
(constants.BYOK_RAG_MAX_CHUNKS).
---
Duplicate comments:
In `@docs/openapi.json`:
- Around line 1-19424: The openapi.json file is out of sync with the generated
schema and manual edits to it have caused CI failures. To fix this, regenerate
the openapi.json file from the source-of-truth schema using the provided
generation script. Specifically, run the command `uv run
scripts/generate_openapi_schema.py docs/openapi.json` to produce an up-to-date
and consistent openapi.json artifact, then commit the regenerated file to the
PR.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a791573e-b3ab-45c7-ba42-5465d7a60429
📒 Files selected for processing (11)
docs/byok_guide.mddocs/openapi.jsonpyproject.tomlsrc/configuration.pysrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/test_configuration.pytests/unit/utils/test_vector_search.py
💤 Files with no reviewable changes (3)
- tests/unit/test_configuration.py
- tests/unit/models/config/test_dump_configuration.py
- tests/unit/utils/test_vector_search.py
📜 Review details
🔇 Additional comments (8)
src/utils/vector_search.py (2)
270-279: LGTM!
501-523: ⚡ Quick winEnsure
relevance_cutoff_mappinguses the same keyspace as the vector store lookup ID.
src/utils/vector_search.pylooks uprelevance_cutoff_mapping.get(vector_store_id, DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE)wherevector_store_idcomes fromvector_store_ids_to_query; confirmrelevance_cutoff_mappingis keyed by the samevector_db_idnamespace. If it’s keyed by user-facingrag_idinstead, the configured per-store cutoff won’t be applied and will always fall back to the default.tests/unit/models/config/test_byok_rag.py (1)
39-39: LGTM!Also applies to: 59-68, 194-226
src/constants.py (1)
3-3: LGTM!Also applies to: 161-163, 184-205, 295-300
src/models/config.py (1)
231-240: LGTM!Also applies to: 470-527, 617-625, 704-727, 1817-1823, 2176-2180, 2237-2247
src/configuration.py (1)
17-17: LGTM!Also applies to: 31-31, 463-489, 534-550
docs/byok_guide.md (2)
92-93: MD028 blockquote spacing issue still present.This is the same markdownlint issue already reported earlier: blank lines inside adjacent blockquote notes.
82-82: LGTM!Also applies to: 89-91, 94-98, 302-303, 309-311, 338-341, 591-591
🛑 Comments failed to post (4)
pyproject.toml (1)
229-231:
⚠️ Potential issue | 🟠 Major | ⚡ Quick win🧩 Analysis chain
🏁 Script executed:
#!/bin/bash set -euo pipefail python - <<'PY' from pathlib import Path import ast def counts(node): args = node.args positional = len(args.posonlyargs) + len(args.args) total = positional + len(args.kwonlyargs) return total, positional for path in sorted(Path("src").rglob("*.py")): try: tree = ast.parse(path.read_text()) except Exception as exc: print(f"PARSE_ERROR {path}: {exc}") continue for node in ast.walk(tree): if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): total, positional = counts(node) if total > 5 or positional > 5: kind = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def" print(f"{path}:{node.lineno}: {kind} {node.name} total_args={total} positional_args={positional}") PYRepository: lightspeed-core/lightspeed-stack
Length of output: 1585
🏁 Script executed:
#!/bin/bash set -euo pipefail # 1) Show the exact pylint design config in pyproject.toml around the flagged lines echo "=== pyproject.toml: tool.pylint.design context ===" python - <<'PY' from pathlib import Path p = Path("pyproject.toml") lines = p.read_text().splitlines() start, end = 200, 260 for i in range(start, min(end, len(lines))): # line numbers are 1-based print(f"{i+1}:{lines[i]}") PY # 2) Find BYOK RAG / relevance-cutoff related identifiers in src/ echo echo "=== Search: byok / rag / relevance cutoff identifiers ===" rg -n --hidden --glob '!.venv/**' --glob '!.git/**' \ -S '(byok|byo?k|rag|relevance|cutoff|threshold)' src || true # 3) Count args in the function(s) likely touched by relevance-cutoff work (best-effort via grep of nearby matches) echo echo "=== Best-effort: show function defs around matches ===" rg -n --hidden --glob '!.venv/**' --glob '!.git/**' \ -S 'def |async def ' src/app src/utils src/models | head -n 200 || trueRepository: lightspeed-core/lightspeed-stack
Length of output: 50388
🏁 Script executed:
#!/bin/bash set -euo pipefail python - <<'PY' from pathlib import Path import ast def arg_counts(fn: ast.AST): # pylint max-args / max-positional-arguments behavior is close to counting these. args = fn.args posonly = len(getattr(args, "posonlyargs", [])) positional = posonly + len(args.args) kwonly = len(args.kwonlyargs) # vararg / kwarg exist in ast.arguments vararg = 1 if args.vararg is not None else 0 kwarg = 1 if args.kwarg is not None else 0 total = positional + kwonly + vararg + kwarg return total, positional def scan(path: Path): txt = path.read_text(encoding="utf-8") tree = ast.parse(txt) out = [] for node in ast.walk(tree): if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): total, positional = arg_counts(node) out.append((str(path), node.lineno, node.name, isinstance(node, ast.AsyncFunctionDef), total, positional)) return out roots = [Path("src")] rows = [] parse_errors = [] for root in roots: for path in sorted(root.rglob("*.py")): try: rows.extend(scan(path)) except SyntaxError as e: parse_errors.append((str(path), e.lineno, e.msg)) except Exception as e: parse_errors.append((str(path), None, repr(e))) # Focus: what exceeds 5 (pylint default commonly) and what exceeds 8 (your new limit). exceed_5_total = [r for r in rows if r[4] > 5] exceed_8_total = [r for r in rows if r[4] > 8] exceed_5_pos = [r for r in rows if r[5] > 5] exceed_8_pos = [r for r in rows if r[5] > 8] print(f"functions total_params>5: {len(exceed_5_total)}") print(f"functions total_params>8: {len(exceed_8_total)}") print(f"functions positional_params>5: {len(exceed_5_pos)}") print(f"functions positional_params>8: {len(exceed_8_pos)}") # BYOK relevance-cutoff likely lives in vector_search.py vs_rows = [r for r in rows if r[0].endswith("src/utils/vector_search.py")] print(f"\nvector_search.py function count: {len(vs_rows)}") print("vector_search.py with total_params>5 (show):") for r in sorted([r for r in vs_rows if r[4] > 5], key=lambda x:(x[0],x[1])): print(f"{r[0]}:{r[1]} {('async ' if r[3] else '')}{r[2]} total={r[4]} positional={r[5]}") print("\nTop 20 by total_params (to gauge breadth):") for r in sorted(rows, key=lambda x:x[4], reverse=True)[:20]: print(f"{r[0]}:{r[1]} {('async ' if r[3] else '')}{r[2]} total={r[4]} positional={r[5]}") print("\nParse errors (first 20):") for pe in parse_errors[:20]: print(pe) print(f"Parse errors: {len(parse_errors)}") PYRepository: lightspeed-core/lightspeed-stack
Length of output: 2206
🏁 Script executed:
#!/bin/bash set -euo pipefail # Show how pyproject.toml:229-231 changed in this PR/branch git diff -U3 -- pyproject.toml | sed -n '1,200p'Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
#!/bin/bash set -euo pipefail # Check whether we have parent commits to diff against git rev-parse HEAD~1 >/dev/null 2>&1 || { echo "NO_PARENT_COMMIT"; exit 0; } echo "=== Current pyproject.toml: tool.pylint.design ===" rg -n --context 2 '^\\[tool\\.pylint\\.design\\]|^max-args\\s*=|^max-positional-arguments\\s*=' pyproject.toml echo echo "=== Previous commit (HEAD~1): tool.pylint.design ===" git show HEAD~1:pyproject.toml | rg -n --context 2 '^\\[tool\\.pylint\\.design\\]|^max-args\\s*=|^max-positional-arguments\\s*=' || true echo echo "=== Previous commit: snippet around tool.pylint.design section (if present) ===" git show HEAD~1:pyproject.toml | python - <<'PY' import sys, re txt=sys.stdin.read().splitlines() for i,line in enumerate(txt): if line.strip()=="[tool.pylint.design]": for j in range(max(0,i-5), min(len(txt), i+12)): print(f"{j+1}:{txt[j]}") break else: print("No [tool.pylint.design] section in HEAD~1") PYRepository: lightspeed-core/lightspeed-stack
Length of output: 93
🏁 Script executed:
#!/bin/bash set -euo pipefail echo "=== local pylint disables for argument-count ===" rg -n 'pylint:\s*disable=.*too-many-(arguments|positional-arguments)' src || true echo echo "=== _query_store_for_byok_rag snippet ===" python - <<'PY' from pathlib import Path p=Path("src/utils/vector_search.py") lines=p.read_text().splitlines() for i,l in enumerate(lines): if "async def _query_store_for_byok_rag" in l: for j in range(max(0,i-20), min(len(lines), i+90)): print(f"{j+1}:{lines[j]}") break PY echo echo "=== functions with total args >5 and <=8 (approx mapping to pylint design limits) ===" python - <<'PY' from pathlib import Path import ast def arg_counts(fn: ast.AST): args=fn.args posonly=len(getattr(args,'posonlyargs',[])) positional=posonly+len(args.args) kwonly=len(args.kwonlyargs) vararg=1 if args.vararg else 0 kwarg=1 if args.kwarg else 0 total=positional+kwonly+vararg+kwarg return total,positional rows=[] for path in Path("src").rglob("*.py"): try: tree=ast.parse(path.read_text(encoding="utf-8")) except SyntaxError: continue for node in ast.walk(tree): if isinstance(node,(ast.FunctionDef,ast.AsyncFunctionDef)): total,positional=arg_counts(node) if total>5 and total<=8: kind="async def" if isinstance(node,ast.AsyncFunctionDef) else "def" rows.append((str(path),node.lineno,kind,node.name,total,positional)) rows=sorted(rows, key=lambda r:(r[0],r[1])) print(f"affected_count={len(rows)}") for r in rows: path,ln,kind,name,total,pos=r print(f"{path}:{ln} {kind} {name} total={total} positional={pos}") PYRepository: lightspeed-core/lightspeed-stack
Length of output: 6930
Avoid repo-wide pylint
max-argsrelaxation for BYOK RAG
pyproject.tomlsets[tool.pylint.design] max-args = 8andmax-positional-arguments = 8(lines 229-231), which relaxestoo-many-arguments/too-many-positional-argumentsacross the whole repo. The scan shows 9 functions insrc/with total args in the 6–8 range; onlysrc/utils/vector_search.py:248 async _query_store_for_byok_rag(...)is BYOK RAG-related, while others includeresponse_generator,_check_shield_moderation,prepare_tools,build_turn_summary, andcreate_transcript_metadata. Since other outliers are already handled via local# pylint: disable=too-many-argumentssuppressions, keep the stricter global limits and apply a targeted suppression/refactor specifically for_query_store_for_byok_rag.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pyproject.toml` around lines 229 - 231, Revert the repo-wide pylint relaxation in pyproject.toml by removing or restoring the [tool.pylint.design] max-args and max-positional-arguments overrides so the global "too-many-arguments" rules remain strict, and instead add a targeted suppression or refactor for the BYOK RAG helper: open src/utils/vector_search.py and either refactor async function _query_store_for_byok_rag(...) to reduce its parameters or add a local "# pylint: disable=too-many-arguments" (or "# pylint: disable=too-many-positional-arguments") immediately above that function declaration to allow the exception only there. Ensure no other global changes are made so existing local suppressions (e.g., in response_generator, _check_shield_moderation, prepare_tools, build_turn_summary, create_transcript_metadata) remain intact.src/models/config.py (2)
2018-2023: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Use the shared reranker-model constant here.
This default duplicates
constants.DEFAULT_CROSS_ENCODER_MODEL, so the schema default and runtime default can drift on the next model change. Reference the shared constant instead of hardcoding the string.♻️ Proposed fix
model: str = Field( - default="cross-encoder/ms-marco-MiniLM-L6-v2", + default=constants.DEFAULT_CROSS_ENCODER_MODEL, title="Reranker model", description="Cross-encoder model name for reranking RAG chunks. " - "Defaults to 'cross-encoder/ms-marco-MiniLM-L6-v2' from sentence-transformers.", + f"Defaults to '{constants.DEFAULT_CROSS_ENCODER_MODEL}' from sentence-transformers.", )As per coding guidelines, "Check
constants.pyfor shared constants before defining new ones".📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.model: str = Field( default=constants.DEFAULT_CROSS_ENCODER_MODEL, title="Reranker model", description="Cross-encoder model name for reranking RAG chunks. " f"Defaults to '{constants.DEFAULT_CROSS_ENCODER_MODEL}' from sentence-transformers.", )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/models/config.py` around lines 2018 - 2023, The Field default for the "model" attribute in the config schema is hardcoded to "cross-encoder/ms-marco-MiniLM-L6-v2" which duplicates the shared constant; update the default to reference the shared constant DEFAULT_CROSS_ENCODER_MODEL from the constants module (import it at the top if necessary) so the schema default and runtime default stay in sync, i.e., replace the literal string used in the Field default with constants.DEFAULT_CROSS_ENCODER_MODEL and ensure the import for DEFAULT_CROSS_ENCODER_MODEL is present where the Field is defined.
2351-2387:
⚠️ Potential issue | 🟠 Major | ⚡ Quick win
validate_reranker_auto_enableonly honors inline OKP.The validator’s docstring and comments say OKP in inline or tool strategies should trigger auto-enable, but Line 2368 only checks
self.rag.inline. A config with BYOK plusokponly inrag.toolwill skip this path. Either includeself.rag.toolin the predicate or narrow the documented contract to inline-only behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/models/config.py` around lines 2351 - 2387, The validator currently only checks self.rag.inline for OKP (constants.OKP_RAG_ID) but the docstring/comments say inline OR tool should trigger auto-enable; update the predicate by changing the has_okp computation in validate_reranker_auto_enable to consider both self.rag.inline and self.rag.tool (e.g., check constants.OKP_RAG_ID in either collection) and adjust the log message/comment to say "inline or tool" as needed; keep the existing protected-access and pylint disables around reranker._explicitly_configured unchanged.src/utils/vector_search.py (1)
107-119: 🧹 Nitpick | 🔵 Trivial | 💤 Low value
Duplicate/malformed
Returns:section in docstring.This docstring has two
Returns:blocks: the first (Lines 113-116) actually documentsparamskeys rather than the return value, and the second (Lines 118-119) is the real return description. Collapse into a single, correctReturns:section to follow Google docstring conventions.As per coding guidelines: "Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises".📝 Proposed docstring fix
Args: solr: Optional structured Solr request (mode and filters from the API). k: Optional number of results to return. If not provided, uses default. Returns: - - mode: Solr search mode (semantic, hybrid, lexical) - - filters: Solr filter payload, may contain structured metadata filters - - Returns: - Parameter dictionary for ``vector_io.query`` with extracted filters at top level. + Parameter dictionary for ``vector_io.query`` with extracted filters at top + level (keys include ``mode`` and, when present, ``filters``). """📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements."""Build query parameters for Solr vector_io search. Args: solr: Optional structured Solr request (mode and filters from the API). k: Optional number of results to return. If not provided, uses default. Returns: Parameter dictionary for ``vector_io.query`` with extracted filters at top level (keys include ``mode`` and, when present, ``filters``). """🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/vector_search.py` around lines 107 - 119, The docstring in the Solr vector_io query builder in src/utils/vector_search.py contains two Returns sections; collapse them into a single Google-style Returns section that describes the returned dict and its keys (mode, filters, and any other params) and remove the duplicated/malformed Returns block so the docstring follows Google conventions for the function that builds query parameters for vector_io.query.
f6deca8 to
8443691
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
docs/openapi.json (1)
11855-11861:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winPipeline failure: regenerate the OpenAPI schema using the project script.
The OpenAPI (Spectral) CI check is failing with a definitive error:
[error] docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.jsonThis file is a generated artifact and should not be manually edited. Even though the
relevance_cutoff_scoreproperty definition appears semantically correct (thedefault: 0.3matchesDEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCOREandexclusiveMinimum: 0.0correctly maps thegt=0constraint), the schema must be regenerated using the project's generation script to pass CI.Run the regeneration command:
#!/bin/bash uv run scripts/generate_openapi_schema.py docs/openapi.jsonAfter regeneration, verify the output includes
relevance_cutoff_scorewith the expected default and constraints, then commit the regenerated file.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/openapi.json` around lines 11855 - 11861, This OpenAPI file is a generated artifact and is out of date; run the project generation script to regenerate docs/openapi.json so CI passes. Execute the generator uv run scripts/generate_openapi_schema.py docs/openapi.json, then confirm the resulting schema contains the relevance_cutoff_score property with default 0.3 and exclusiveMinimum 0.0 (matching DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE and gt=0 constraint), and commit the regenerated docs/openapi.json.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/byok_guide.md`:
- Around line 167-184: Add a blank line immediately after the "#### Option 1:
Use rag-content Download Script (Optional)" heading and another blank line
immediately after the "#### Option 2: Manual Download and Configuration" heading
in docs/byok_guide.md so each heading is followed by an empty line (to satisfy
markdownlint MD022); edit the two heading lines shown in the diff to insert a
single newline after each heading.
- Around line 202-211: Add a new row for the relevance_cutoff_score field to the
field reference table: document the field name `relevance_cutoff_score`, mark
whether it is required (No), state the default (e.g., `0.0` or the actual
default used in code/config), and provide a short description like "Minimum
similarity score threshold for considering vector matches in Inline RAG; results
below this value are ignored." Update the table near the other Inline RAG fields
(`rag_id`, `rag_type`, `embedding_model`, `embedding_dimension`, `vector_db_id`,
`db_path`, `score_multiplier`) so the new row appears consistently with
formatting and alignment.
- Around line 406-413: Remove the blank line separating the two adjacent
blockquote blocks ([!NOTE] and [!TIP]) so they are back-to-back (or
alternatively insert non-blockquote content between them); edit the markdown
around the [!NOTE] and [!TIP] markers to ensure no empty line exists between the
closing line of the NOTE block and the opening line of the TIP block to satisfy
MD028.
In `@pyproject.toml`:
- Around line 231-233: The project raised pylint's
max-args/max-positional-arguments to 8 which masks functions that should be
refactored; instead, collapse large argument lists (e.g., functions like
_query_store_for_byok_rag) into a single typed parameter object/dataclass
(suggest name VectorQueryParams) that holds related fields (vector_db_id, query,
top_k, score_threshold, filters, limits, etc.), update all call sites to
construct and pass that object, add validation/defaults on the dataclass and
adjust function signatures to accept the dataclass (and update type
hints/annotations), and revert or keep pylintrc limits stricter to encourage
this refactor rather than relaxing max-args globally.
---
Duplicate comments:
In `@docs/openapi.json`:
- Around line 11855-11861: This OpenAPI file is a generated artifact and is out
of date; run the project generation script to regenerate docs/openapi.json so CI
passes. Execute the generator uv run scripts/generate_openapi_schema.py
docs/openapi.json, then confirm the resulting schema contains the
relevance_cutoff_score property with default 0.3 and exclusiveMinimum 0.0
(matching DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE and gt=0 constraint), and
commit the regenerated docs/openapi.json.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9d96bbb8-fd49-4b92-b7eb-7f3fcf7c782a
📒 Files selected for processing (11)
docs/byok_guide.mddocs/openapi.jsonpyproject.tomlsrc/configuration.pysrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/test_configuration.pytests/unit/utils/test_vector_search.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: build-pr
- GitHub Check: Pylinter
- GitHub Check: unit_tests (3.12)
- GitHub Check: unit_tests (3.13)
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/configuration.pysrc/constants.pysrc/models/config.pysrc/utils/vector_search.py
src/**/configuration.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/configuration.py: All config models must extendConfigurationBasewithextra="forbid"to reject unknown fields
Use@field_validatorand@model_validatorfor custom validation in Pydantic models
Files:
src/configuration.py
src/constants.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
constants.pyfor shared constants with descriptive comments and type hints usingFinal[type]
Files:
src/constants.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_dump_configuration.pytests/unit/test_configuration.pytests/unit/models/config/test_byok_rag.pytests/unit/utils/test_vector_search.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1843
File: src/utils/vector_search.py:472-484
Timestamp: 2026-06-03T11:15:41.987Z
Learning: In `src/utils/vector_search.py`, the `_fetch_byok_rag` function's `vector_store_ids` parameter always contains user-facing `rag_id`s, never raw llama-stack `vector_db_id`s. The API contract enforces rag_ids only, so filtering `vector_store_ids` against `configuration.configuration.rag.retrieval.inline.sources` (also rag_ids) before calling `resolve_vector_store_ids` is intentional and correct.
📚 Learning: 2026-06-03T11:15:41.987Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1843
File: src/utils/vector_search.py:472-484
Timestamp: 2026-06-03T11:15:41.987Z
Learning: In `src/utils/vector_search.py`, the `_fetch_byok_rag` function's `vector_store_ids` parameter always contains user-facing `rag_id`s, never raw llama-stack `vector_db_id`s. The API contract enforces rag_ids only, so filtering `vector_store_ids` against `configuration.configuration.rag.retrieval.inline.sources` (also rag_ids) before calling `resolve_vector_store_ids` is intentional and correct.
Applied to files:
tests/unit/models/config/test_dump_configuration.pytests/unit/models/config/test_byok_rag.pysrc/utils/vector_search.pydocs/byok_guide.mdtests/unit/utils/test_vector_search.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
📚 Learning: 2026-03-17T11:34:53.242Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1335
File: docs/config.md:411-411
Timestamp: 2026-03-17T11:34:53.242Z
Learning: In the lightspeed-stack project (`src/models/config.py`, `docs/config.md`), the internal Solr filter `is_chunk:true` (defined as `SOLR_CHUNK_FILTER_QUERY` in `src/constants.py`) is always injected by the system for OKP searches and is intentionally hidden from users. The `chunk_filter_query` field in `OkpConfiguration` is user-facing and additive-only, but the documentation must NOT mention the internal `is_chunk:true` behavior — this is a deliberate design decision by the maintainers.
Applied to files:
docs/byok_guide.md
🪛 GitHub Actions: OpenAPI (Spectral) / 0_spectral.txt
docs/openapi.json
[error] 1-1: docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json (diff -u detected mismatch against /tmp/openapi-generated.json)
🪛 GitHub Actions: OpenAPI (Spectral) / spectral
docs/openapi.json
[error] 1-1: OpenAPI schema is out of date. CI diff check failed for docs/openapi.json vs generated schema. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.json
🪛 markdownlint-cli2 (0.22.1)
docs/byok_guide.md
[warning] 93-93: Blank line inside blockquote
(MD028, no-blanks-blockquote)
[warning] 167-167: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 175-175: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 409-409: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🔇 Additional comments (14)
src/constants.py (1)
204-205: LGTM!src/models/config.py (1)
1817-1823: LGTM!src/configuration.py (1)
534-550: LGTM!src/utils/vector_search.py (1)
253-254: LGTM!Also applies to: 264-277, 502-518
tests/unit/models/config/test_byok_rag.py (1)
7-12: LGTM!Also applies to: 39-40, 59-60, 68-69, 194-226
tests/unit/models/config/test_dump_configuration.py (1)
44-224: LGTM!Also applies to: 1247-1247, 1501-1574
tests/unit/test_configuration.py (1)
1198-1276: LGTM!tests/unit/utils/test_vector_search.py (1)
4-5: LGTM!Also applies to: 32-32, 37-60, 572-574, 614-616, 639-643, 661-664, 698-839, 995-997, 1035-1037, 1089-1091
docs/byok_guide.md (6)
18-19: LGTM!
82-82: LGTM!
94-97: LGTM!
232-238: LGTM!
267-270: LGTM!
302-305: LGTM!
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 4
♻️ Duplicate comments (1)
docs/openapi.json (1)
11855-11861:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winPipeline failure: regenerate the OpenAPI schema using the project script.
The OpenAPI (Spectral) CI check is failing with a definitive error:
[error] docs/openapi.json is out of date. Regenerate with: uv run scripts/generate_openapi_schema.py docs/openapi.jsonThis file is a generated artifact and should not be manually edited. Even though the
relevance_cutoff_scoreproperty definition appears semantically correct (thedefault: 0.3matchesDEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCOREandexclusiveMinimum: 0.0correctly maps thegt=0constraint), the schema must be regenerated using the project's generation script to pass CI.Run the regeneration command:
#!/bin/bash uv run scripts/generate_openapi_schema.py docs/openapi.jsonAfter regeneration, verify the output includes
relevance_cutoff_scorewith the expected default and constraints, then commit the regenerated file.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/openapi.json` around lines 11855 - 11861, This OpenAPI file is a generated artifact and is out of date; run the project generation script to regenerate docs/openapi.json so CI passes. Execute the generator uv run scripts/generate_openapi_schema.py docs/openapi.json, then confirm the resulting schema contains the relevance_cutoff_score property with default 0.3 and exclusiveMinimum 0.0 (matching DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE and gt=0 constraint), and commit the regenerated docs/openapi.json.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/byok_guide.md`:
- Around line 167-184: Add a blank line immediately after the "#### Option 1:
Use rag-content Download Script (Optional)" heading and another blank line
immediately after the "#### Option 2: Manual Download and Configuration" heading
in docs/byok_guide.md so each heading is followed by an empty line (to satisfy
markdownlint MD022); edit the two heading lines shown in the diff to insert a
single newline after each heading.
- Around line 202-211: Add a new row for the relevance_cutoff_score field to the
field reference table: document the field name `relevance_cutoff_score`, mark
whether it is required (No), state the default (e.g., `0.0` or the actual
default used in code/config), and provide a short description like "Minimum
similarity score threshold for considering vector matches in Inline RAG; results
below this value are ignored." Update the table near the other Inline RAG fields
(`rag_id`, `rag_type`, `embedding_model`, `embedding_dimension`, `vector_db_id`,
`db_path`, `score_multiplier`) so the new row appears consistently with
formatting and alignment.
- Around line 406-413: Remove the blank line separating the two adjacent
blockquote blocks ([!NOTE] and [!TIP]) so they are back-to-back (or
alternatively insert non-blockquote content between them); edit the markdown
around the [!NOTE] and [!TIP] markers to ensure no empty line exists between the
closing line of the NOTE block and the opening line of the TIP block to satisfy
MD028.
In `@pyproject.toml`:
- Around line 231-233: The project raised pylint's
max-args/max-positional-arguments to 8 which masks functions that should be
refactored; instead, collapse large argument lists (e.g., functions like
_query_store_for_byok_rag) into a single typed parameter object/dataclass
(suggest name VectorQueryParams) that holds related fields (vector_db_id, query,
top_k, score_threshold, filters, limits, etc.), update all call sites to
construct and pass that object, add validation/defaults on the dataclass and
adjust function signatures to accept the dataclass (and update type
hints/annotations), and revert or keep pylintrc limits stricter to encourage
this refactor rather than relaxing max-args globally.
---
Duplicate comments:
In `@docs/openapi.json`:
- Around line 11855-11861: This OpenAPI file is a generated artifact and is out
of date; run the project generation script to regenerate docs/openapi.json so CI
passes. Execute the generator uv run scripts/generate_openapi_schema.py
docs/openapi.json, then confirm the resulting schema contains the
relevance_cutoff_score property with default 0.3 and exclusiveMinimum 0.0
(matching DEFAULT_BYOK_RAG_RELEVANCE_CUTOFF_SCORE and gt=0 constraint), and
commit the regenerated docs/openapi.json.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9d96bbb8-fd49-4b92-b7eb-7f3fcf7c782a
📒 Files selected for processing (11)
docs/byok_guide.mddocs/openapi.jsonpyproject.tomlsrc/configuration.pysrc/constants.pysrc/models/config.pysrc/utils/vector_search.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/test_configuration.pytests/unit/utils/test_vector_search.py
📜 Review details
🔇 Additional comments (14)
src/constants.py (1)
204-205: LGTM!src/models/config.py (1)
1817-1823: LGTM!src/configuration.py (1)
534-550: LGTM!src/utils/vector_search.py (1)
253-254: LGTM!Also applies to: 264-277, 502-518
tests/unit/models/config/test_byok_rag.py (1)
7-12: LGTM!Also applies to: 39-40, 59-60, 68-69, 194-226
tests/unit/models/config/test_dump_configuration.py (1)
44-224: LGTM!Also applies to: 1247-1247, 1501-1574
tests/unit/test_configuration.py (1)
1198-1276: LGTM!tests/unit/utils/test_vector_search.py (1)
4-5: LGTM!Also applies to: 32-32, 37-60, 572-574, 614-616, 639-643, 661-664, 698-839, 995-997, 1035-1037, 1089-1091
docs/byok_guide.md (6)
18-19: LGTM!
82-82: LGTM!
94-97: LGTM!
232-238: LGTM!
267-270: LGTM!
302-305: LGTM!
🛑 Comments failed to post (4)
docs/byok_guide.md (3)
167-184:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd blank lines after headings to satisfy markdownlint.
Lines 167 and 175 have headings that should be followed by a blank line for consistent markdown formatting (MD022).
📝 Proposed fix
At line 167, add a blank line after the heading:
#### Option 1: Use rag-content Download Script (Optional) + You can use the embedding generation step mentioned in the rag-content repo:At line 175, add a blank line after the heading:
#### Option 2: Manual Download and Configuration + Alternatively, you can download your own embedding model and update the path in your YAML configuration:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.#### Option 1: Use rag-content Download Script (Optional) You can use the embedding generation step mentioned in the rag-content repo:🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 167-167: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
[warning] 175-175: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/byok_guide.md` around lines 167 - 184, Add a blank line immediately after the "#### Option 1: Use rag-content Download Script (Optional)" heading and another blank line immediately after the "#### Option 2: Manual Download and Configuration" heading in docs/byok_guide.md so each heading is followed by an empty line (to satisfy markdownlint MD022); edit the two heading lines shown in the diff to insert a single newline after each heading.
202-211:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd
relevance_cutoff_scorerow to the field reference table.The new
relevance_cutoff_scoreparameter is documented in line 82, included in the YAML example at line 233, and appears in the comparison table at lines 267-270, but is missing from this field reference table. For completeness, add a row documenting this field.📝 Proposed fix
| `db_path` | Yes | — | Path to the vector database file | | `score_multiplier` | No | `1.0` | Weight for Inline RAG ranking (values > 1.0 boost; < 1.0 reduce) | +| `relevance_cutoff_score` | No | `0.3` | Minimum raw similarity score for Inline RAG chunks (BYOK only; not used in Tool RAG) |📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.| Field | Required | Default | Description | |-----------------------|----------|-------------------------------------------|-------------------------------------------------------------------------------------------| | `rag_id` | Yes | — | Unique identifier for the knowledge source | | `rag_type` | No | `inline::faiss` | Vector store provider type | | `embedding_model` | No | `sentence-transformers/all-mpnet-base-v2` | Embedding model identifier or path | | `embedding_dimension` | No | `768` | Embedding vector dimensionality | | `vector_db_id` | Yes | — | Vector store ID generated by rag-content (e.g. `vs_8c94967b-81cc-4028-a294-9cfac6fd9ae2`) | | `db_path` | Yes | — | Path to the vector database file | | `score_multiplier` | No | `1.0` | Weight for Inline RAG ranking (values > 1.0 boost; < 1.0 reduce) | | `relevance_cutoff_score` | No | `0.3` | Minimum raw similarity score for Inline RAG chunks (BYOK only; not used in Tool RAG) |🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/byok_guide.md` around lines 202 - 211, Add a new row for the relevance_cutoff_score field to the field reference table: document the field name `relevance_cutoff_score`, mark whether it is required (No), state the default (e.g., `0.0` or the actual default used in code/config), and provide a short description like "Minimum similarity score threshold for considering vector matches in Inline RAG; results below this value are ignored." Update the table near the other Inline RAG fields (`rag_id`, `rag_type`, `embedding_model`, `embedding_dimension`, `vector_db_id`, `db_path`, `score_multiplier`) so the new row appears consistently with formatting and alignment.
406-413:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRemove blank line between adjacent blockquotes to satisfy markdownlint.
Line 409 has a blank line between two adjacent blockquote blocks (NOTE and TIP), which triggers MD028. Either merge them into a single blockquote or add non-blockquote content between them.
📝 Proposed fix
> [!NOTE] > For pgvector, ensure your PostgreSQL credentials are available via environment variables > (e.g., `POSTGRES_PASSWORD`). - > [!TIP] > A complete working example combining BYOK and OKP is available at > [`examples/lightspeed-stack-byok-okp-rag.yaml`](../examples/lightspeed-stack-byok-okp-rag.yaml).📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.> [!NOTE] > For pgvector, ensure your PostgreSQL credentials are available via environment variables > (e.g., `POSTGRES_PASSWORD`). > [!TIP] > A complete working example combining BYOK and OKP is available at > [`examples/lightspeed-stack-byok-okp-rag.yaml`](../examples/lightspeed-stack-byok-okp-rag.yaml).🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 409-409: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/byok_guide.md` around lines 406 - 413, Remove the blank line separating the two adjacent blockquote blocks ([!NOTE] and [!TIP]) so they are back-to-back (or alternatively insert non-blockquote content between them); edit the markdown around the [!NOTE] and [!TIP] markers to ensure no empty line exists between the closing line of the NOTE block and the opening line of the TIP block to satisfy MD028.pyproject.toml (1)
231-233: 🧹 Nitpick | 🔵 Trivial | 🏗️ Heavy lift
Consider refactoring functions instead of raising complexity limits.
Increasing
max-argsandmax-positional-argumentsfrom the default 5 to 8 is a significant, project-wide relaxation of code quality constraints. While this unblocks the current PR (likely due to addingscore_thresholdparameters in the Vector Search Integration layer), it permits higher complexity for all future code.Functions requiring 8+ parameters often signal an opportunity to introduce parameter objects or data classes. For example, if
_query_store_for_byok_ragor similar functions now have many related parameters (e.g., query text, filters, thresholds, limits), consider grouping them into a typed configuration object.♻️ Alternative approach using a parameter object
Instead of:
def _query_store_for_byok_rag( vector_db_id: str, query: str, top_k: int, score_threshold: float, param5: type, param6: type, # ... more params ) -> Result: passRefactor to:
`@dataclass` class VectorQueryParams: """Parameters for BYOK RAG vector store queries.""" vector_db_id: str query: str top_k: int score_threshold: float # ... other related params def _query_store_for_byok_rag(params: VectorQueryParams) -> Result: passThis improves maintainability by:
- Grouping related parameters with clear ownership
- Making it easier to add new parameters without signature changes
- Enabling parameter validation and defaults in one place
- Reducing cognitive load when reading function calls
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pyproject.toml` around lines 231 - 233, The project raised pylint's max-args/max-positional-arguments to 8 which masks functions that should be refactored; instead, collapse large argument lists (e.g., functions like _query_store_for_byok_rag) into a single typed parameter object/dataclass (suggest name VectorQueryParams) that holds related fields (vector_db_id, query, top_k, score_threshold, filters, limits, etc.), update all call sites to construct and pass that object, add validation/defaults on the dataclass and adjust function signatures to accept the dataclass (and update type hints/annotations), and revert or keep pylintrc limits stricter to encourage this refactor rather than relaxing max-args globally.
Description
In the Lightspeed Stack configuration, byok_rag section, added a "relevance_cutoff_score" configuraiton variable, a positive float.
When a RAG chunk is retrieved from a BYOK database and its relevance score is less than the cutoff, the chunk is dropped from further consideration. Specifically, this cutoff filtering happens before score multipliers are applied.
Only added handling of relevance_cutoff_score to the inline BYOK RAG for now as adding it to the tool BYOK RAG is calling for changes in Llama Stack.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Documentation
API
Tests