Conversation
- rebuild goal-scan from the latest main branch - retain code-scan audit docs and resolved issue ledger - preserve response error contracts and redaction behavior - keep production smoke tests on namespaced fixtures
- make RAG request client_config scoped and restored after use - avoid eager request redaction when debug logging is disabled - preserve explicit response validation errors and simplify HTTPError flow - clean admin error control flow and cross-platform path assertion - expose empty Gremlin payloads instead of silently returning None
- reject malformed Gremlin response payloads explicitly - avoid leaking raw KeyError from ResponseData construction - keep None response handling as the existing no-result path - cover empty payload behavior with a focused regression test
- remove request-wide graph config lock from RAG API - roll back provider type settings when config apply fails - let graph import apply typed defaults for missing properties - clarify scan report wording for post-scan fixes
- add TODO for replacing temporary huge_settings mutation - document the request-scoped flow/operator context direction - keep behavior unchanged for the current PR
- stop tracking .workflow/code-scan execution records - ignore local code-scan scratch artifacts - keep durable code-scan docs under docs/
There was a problem hiding this comment.
Pull request overview
This PR strengthens core reliability/security contracts across hugegraph-python-client and hugegraph-llm (response parsing, error typing, sensitive-log redaction, admin log path hardening), while also restructuring/expanding tests and adding a restartable code-scan ledger/reporting set under .workflow/code-scan/.
Changes:
- Hardened security boundaries (admin
/logspath validation; redaction of password/token-like fields in client debug/error logs). - Tightened client error/response contracts (typed
ServerError/ResponseParseError, stricter Gremlin response payload validation). - Replaced/augmented integration & contract tests to exercise production flows/operators and newly defined contracts; added code-scan plan/spec/report artifacts.
Reviewed changes
Copilot reviewed 42 out of 43 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| hugegraph-python-client/src/tests/api/test_schema.py | Adds targeted FIXME markers for missing schema-builder contract coverage. |
| hugegraph-python-client/src/tests/api/test_response_validation.py | Adds contract tests for parse errors, status typing, and log redaction behavior. |
| hugegraph-python-client/src/tests/api/test_gremlin.py | Updates expected Gremlin error typing and adds contract tests for exception preservation and payload validation. |
| hugegraph-python-client/src/tests/api/test_graph.py | Adds FIXME markers and highlights current integration test state isolation gaps. |
| hugegraph-python-client/src/tests/api/test_auth_routing.py | Adds contract tests for debug redaction in HGraphSession.request() (plus FIXME about real resolver coverage). |
| hugegraph-python-client/src/pyhugegraph/utils/util.py | Implements redact_sensitive_data(), improves response parsing/typing, and raises ResponseParseError for malformed successes. |
| hugegraph-python-client/src/pyhugegraph/utils/huge_requests.py | Redacts sensitive kwargs in debug logs and avoids eager f-string logging. |
| hugegraph-python-client/src/pyhugegraph/utils/exceptions.py | Adds ResponseParseError and ServerError exception types. |
| hugegraph-python-client/src/pyhugegraph/api/gremlin.py | Stops blanket re-wrapping; validates response payload shape and preserves upstream exception types. |
| hugegraph-llm/src/tests/operators/llm_op/test_property_graph_extract.py | Updates expectations to preserve typed values and filter unknown properties. |
| hugegraph-llm/src/tests/operators/index_op/test_vector_index_query.py | Adds FIXME for flow→operator parameter pass-through contract coverage. |
| hugegraph-llm/src/tests/operators/hugegraph_op/test_commit_to_hugegraph.py | Adds FIXME markers for missing real failure-branch coverage. |
| hugegraph-llm/src/tests/operators/hugegraph_op/test_commit_to_hugegraph_load_into_graph.py | Adds an extraction→commit round-trip typed-value preservation contract test and additional FIXME markers. |
| hugegraph-llm/src/tests/models/llms/test_openai_client.py | Adds FIXME markers for provider failure-propagation contract gaps. |
| hugegraph-llm/src/tests/integration/test_rag_pipeline.py | Replaces test-local stand-ins with production operators and deterministic fakes for smoke coverage. |
| hugegraph-llm/src/tests/integration/test_kg_construction.py | Replaces mock KG pipeline with production flow/operator smoke tests using deterministic fake LLM. |
| hugegraph-llm/src/tests/integration/test_graph_rag_pipeline.py | Replaces test-local pipeline scaffolding with production flow build smoke tests. |
| hugegraph-llm/src/tests/indices/test_faiss_vector_index.py | Adds FIXME markers for cross-backend threshold semantics/contract coverage. |
| hugegraph-llm/src/tests/config/test_prompt_config.py | Adds FIXME markers for runtime YAML prompt contract coverage. |
| hugegraph-llm/src/tests/config/test_config.py | Adds FIXME markers for .env sync typing and side-effect contract coverage. |
| hugegraph-llm/src/tests/api/test_rag_api.py | Adds contract tests for config rollback, supported-provider validation, and request-scoped graph overrides. |
| hugegraph-llm/src/tests/api/test_admin_api.py | Adds contract tests for /logs traversal rejection, insecure token rejection, and valid streaming. |
| hugegraph-llm/src/hugegraph_llm/operators/llm_op/property_graph_extract.py | Preserves typed property values and filters to schema-defined fields. |
| hugegraph-llm/src/hugegraph_llm/api/rag_api.py | Adds request-scoped config snapshot/restore, provider validation/rollback, and request handling refactors. |
| hugegraph-llm/src/hugegraph_llm/api/models/rag_requests.py | Constrains provider types via Literal to reject unsupported providers at validation time. |
| hugegraph-llm/src/hugegraph_llm/api/admin_api.py | Hardens /logs via secure token checks and strict path validation. |
| docs/specs/2026-05-31-hugegraph-ai-code-scan-design.md | Adds code-scan design spec and invariants/priority model. |
| docs/plans/2026-05-31-hugegraph-ai-code-scan.md | Adds code-scan execution plan and deliverables/ledger structure. |
| .workflow/code-scan/README.md | Introduces scan ledger rules and pointers to spec/plan. |
| .workflow/code-scan/code-scan-state.json | Captures scan progress/state, touched files, and reported commands/results. |
| .workflow/code-scan/checkpoints/00-setup.md | Records setup checkpoint for scan process. |
| .workflow/code-scan/checkpoints/01-client-transport.md | Records client transport lane checkpoint results. |
| .workflow/code-scan/checkpoints/02-client-api-structure.md | Records client API/structure lane checkpoint results. |
| .workflow/code-scan/checkpoints/03-llm-api-config.md | Records LLM API/config lane checkpoint results. |
| .workflow/code-scan/checkpoints/04-llm-flow-operator.md | Records LLM flow/operator lane checkpoint results. |
| .workflow/code-scan/checkpoints/05-llm-index-model.md | Records LLM index/model lane checkpoint results. |
| .workflow/code-scan/checkpoints/06-test-quality.md | Records test-quality lane checkpoint results. |
| .workflow/code-scan/checkpoints/07-synthesis.md | Records cross-module synthesis checkpoint results. |
| .workflow/code-scan/reports/issues.md | Adds prioritized issue ledger with evidence and statuses. |
| .workflow/code-scan/reports/module-map.md | Adds a module/layer scan coverage map. |
| .workflow/code-scan/reports/test-quality-ledger.md | Adds a test-quality ledger summarizing gaps, FIXMEs, and coverage notes. |
| .workflow/code-scan/reports/final-code-scan-report.md | Adds final scan report summary and follow-up fix progress section. |
| .gitignore | Ignores local PR review scratch directories under .workflow/. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- remove run-specific checklist state from scan plan - document scratch ledger as optional local state - avoid agent-specific wording in reusable docs
imbajin
left a comment
There was a problem hiding this comment.
I found a few remaining correctness/security gaps around failed config rollback, Admin UI log access, Gremlin contracts, and flow-test coverage. I also +1'd the existing Copilot comment for the request-scoped graph config race instead of duplicating it.
|
Follow-up note on The current snapshot/restore is only a transitional guard. It reduces permanent global config leakage, but it does not provide true concurrent request isolation because I do not think we should fix this by adding a global lock in this PR. That critical section covers potentially long-running RAG/Text2Gremlin execution, including graph queries, vector retrieval, and LLM calls; locking it would effectively serialize concurrent requests. Suggested scope for this PR:
Expected follow-up direction: pass graph config through request-scoped flow/operator context and create per-request HugeGraph clients instead of reading/writing |
- replace transitional TODO with explicit FIXME - document why global locking is not acceptable - point follow-up toward request-scoped graph config
- redact sensitive response bodies in client error logs - normalize Gremlin malformed success payload errors - guard Gradio admin log access with secure token checks - roll back concrete provider config fields on failures - cover edge property filtering and RAG flow smoke contracts
Short Summary
Full Summary by CR
Bug Fixes
New Features
Documentation
Tests