LCORE-2437: Support pgvector in byok_rag enrichment script#1856
Conversation
|
Warning Review limit reached
More reviews will be available in 12 minutes and 43 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (8)
WalkthroughThis PR enables pgvector as a configurable remote RAG backend in BYOK (Bring Your Own Knowledge) configurations. The Changespgvector BYOK RAG integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Warning Review ran into problems🔥 ProblemsStopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/models/config.py (1)
1775-1870:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestrict
rag_typeto supported values to prevent late enrichment failures.Line 1775 accepts any non-empty string, but unsupported values fail later in
src/llama_stack_configuration.pyduring provider generation. Validate this at model parse time (or useLiteral) so bad config is rejected early with a clear error.Suggested fix
-from typing import Any, Literal, Optional, Self +from typing import Any, Literal, Optional, Self @@ - rag_type: str = Field( + rag_type: Literal["inline::faiss", "remote::pgvector"] = Field( constants.DEFAULT_RAG_TYPE, min_length=1, title="RAG type", description="Type of RAG database (e.g. 'inline::faiss', 'remote::pgvector').", ) @@ - if self.rag_type == "inline::faiss": + if self.rag_type == "inline::faiss": if not self.db_path: raise ValueError("db_path is required when rag_type is 'inline::faiss'") elif self.rag_type == "remote::pgvector": @@ if getattr(self, field_name) is None: object.__setattr__(self, field_name, default_value) + else: + raise ValueError( + "Unsupported rag_type. Expected one of: 'inline::faiss', 'remote::pgvector'" + ) return self🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/models/config.py` around lines 1775 - 1870, The rag_type Field currently allows any non-empty string (rag_type) which defers errors until provider generation; restrict it to supported values (e.g. use typing.Literal[...] or an Enum) and validate early in the model: change the type of rag_type to a Literal/Enum of the supported options (e.g. "inline::faiss", "remote::pgvector", etc.) and/or add a check at the start of validate_rag_type_fields to raise ValueError if self.rag_type not in the allowed set, so invalid configs are rejected during model parsing; update the Field description accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/byok_guide.md`:
- Around line 303-306: Update the field-reference table so the `db_path` row no
longer lists it as always required; change its requirement to "required for
`inline::faiss` only" and ensure the pgvector section stays consistent with the
note that connection fields (`host`, `port`, `db`, `user`, `password`) default
to `${env.POSTGRES_*}` when omitted; update any caption or remark in the table
that currently implies `db_path` is mandatory to reference `inline::faiss`
specifically instead.
In `@src/llama_stack_configuration.py`:
- Around line 342-344: When applying extra_fields from template into config,
treat explicit None/null or empty-string values in brag as missing so defaults
are used; replace the current assignment config[field] = brag.get(field,
default) with logic that reads val = brag.get(field, None) and if val is None or
val == "" sets val = default before assigning to config[field], ensuring fields
like host: (YAML null/empty) get the template default rather than literal
null/empty in the provider config.
In `@src/models/config.py`:
- Around line 1845-1850: The password field in src/models/config.py is declared
as password: Optional[str] using Field which allows plain-text serialization;
change its type to Optional[SecretStr] (from pydantic) and update the Field
usage accordingly so the value is treated as a secret (e.g., use SecretStr for
Postgres password handling consistent with other DB secrets). Add/adjust the
import for SecretStr, update any type hints or callers that access password
(e.g., password.get_secret_value() where raw string is required), and ensure
default behavior and env-derived fallback remain intact.
In `@tests/unit/models/config/test_byok_rag.py`:
- Around line 230-245: The test function
test_byok_rag_pgvector_custom_connection_fields sets custom connection values
for user and password but only asserts host, port, and db; update this test to
also assert the overridden credentials by adding assertions for store.user ==
"admin" and store.password == "secret" within the
test_byok_rag_pgvector_custom_connection_fields function so regressions in
mapping user/password are caught.
---
Outside diff comments:
In `@src/models/config.py`:
- Around line 1775-1870: The rag_type Field currently allows any non-empty
string (rag_type) which defers errors until provider generation; restrict it to
supported values (e.g. use typing.Literal[...] or an Enum) and validate early in
the model: change the type of rag_type to a Literal/Enum of the supported
options (e.g. "inline::faiss", "remote::pgvector", etc.) and/or add a check at
the start of validate_rag_type_fields to raise ValueError if self.rag_type not
in the allowed set, so invalid configs are rejected during model parsing; update
the Field description accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4d96cc64-eda3-4d81-bf28-c9b67ecaee54
📒 Files selected for processing (7)
docs/byok_guide.mddocs/rag_guide.mdsrc/llama_stack_configuration.pysrc/models/config.pytests/unit/models/config/test_byok_rag.pytests/unit/models/config/test_dump_configuration.pytests/unit/test_llama_stack_configuration.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: unit_tests (3.12)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/models/config/test_dump_configuration.pytests/unit/models/config/test_byok_rag.pytests/unit/test_llama_stack_configuration.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/models/config.pysrc/llama_stack_configuration.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Pydantic models must use
@model_validatorand@field_validatorfor validation and complete type annotations for all attributes, avoidingAnytype
Files:
src/models/config.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1843
File: src/utils/vector_search.py:472-484
Timestamp: 2026-06-03T11:15:41.987Z
Learning: In `src/utils/vector_search.py`, the `_fetch_byok_rag` function's `vector_store_ids` parameter always contains user-facing `rag_id`s, never raw llama-stack `vector_db_id`s. The API contract enforces rag_ids only, so filtering `vector_store_ids` against `configuration.configuration.rag.retrieval.inline.sources` (also rag_ids) before calling `resolve_vector_store_ids` is intentional and correct.
📚 Learning: 2026-06-03T11:15:41.987Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 1843
File: src/utils/vector_search.py:472-484
Timestamp: 2026-06-03T11:15:41.987Z
Learning: In `src/utils/vector_search.py`, the `_fetch_byok_rag` function's `vector_store_ids` parameter always contains user-facing `rag_id`s, never raw llama-stack `vector_db_id`s. The API contract enforces rag_ids only, so filtering `vector_store_ids` against `configuration.configuration.rag.retrieval.inline.sources` (also rag_ids) before calling `resolve_vector_store_ids` is intentional and correct.
Applied to files:
docs/rag_guide.mdsrc/models/config.pytests/unit/models/config/test_byok_rag.pytests/unit/test_llama_stack_configuration.pydocs/byok_guide.mdsrc/llama_stack_configuration.py
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.
Applied to files:
docs/rag_guide.md
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
docs/byok_guide.md
📚 Learning: 2026-05-20T08:09:43.391Z
Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: src/llama_stack_configuration.py:651-683
Timestamp: 2026-05-20T08:09:43.391Z
Learning: In `src/llama_stack_configuration.py`, the `apply_high_level_inference` function currently emits `provider_id: p_type` (underscore form, e.g. `sentence_transformers`) directly from the high-level type key, which collides with Llama Stack's hyphenated provider IDs (e.g. `sentence-transformers`). This is a known PoC divergence documented in the spike doc ("Findings discovered during PoC") and tracked in the implementation JIRA "Unified llama_stack.config schema + synthesizer". Decision S5 mandates that each backend-specific synthesizer translates LCORE's canonical type Literal vocabulary to the target backend's expected shape (hyphenated provider_id for Llama Stack; model-string prefixes for Pydantic AI). The PoC code will be removed before merge; the fix belongs in the implementation ticket.
Applied to files:
src/llama_stack_configuration.py
d0aced5 to
14bfe24
Compare
Add pgvector as a supported rag_type in the BYOK RAG enrichment pipeline. When rag_type is set to remote::pgvector, the enrichment script generates a pgvector provider config with PostgreSQL connection fields instead of the FAISS-style kv_sqlite storage backend. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
14bfe24 to
94b5cf4
Compare
Description
Add pgvector as a supported
rag_typein the BYOK RAG enrichment pipeline. Whenrag_type: remote::pgvectoris set in abyok_ragentry, the enrichment script now generates a valid pgvector provider config with PostgreSQL connection fields (host,port,db,user,password) instead of the FAISS-stylekv_sqlitestorage backend.Key changes:
ByokRagmodel: Add optional pgvector connection fields with${env.POSTGRES_*}defaults, makedb_pathoptional (not needed for pgvector), add model validator for backend-specific field requirements.VECTOR_IO_TEMPLATESmap so adding new backends requires only a new map entry.construct_storage_backends_sectionskips backends that don't need custom storage._build_vector_io_configbuilds provider configs from templates.byok_ragconfig examples.Type of change
Tools used to create PR
Related Tickets & Documents
Checklist before requesting a review
Testing
lightspeed-stack.yamlwith a pgvector BYOK entry:uv run python src/llama_stack_configuration.py -c lightspeed-stack.yaml -i run.yaml -o /tmp/enriched.yamlremote::pgvectorprovider withpersistence.backend: kv_defaultand connection fields.uv run make test-unit(2528 passed).Summary by CodeRabbit
New Features
Documentation
Tests