feat: Refactor JSON parsing into shared utility for CodeAgent, JudgeAgent, and SnippetAgent#178
feat: Refactor JSON parsing into shared utility for CodeAgent, JudgeAgent, and SnippetAgent#178vulam1808 wants to merge 2 commits into
Conversation
…gent, and SnippetAgent Fixes XortexAI#141 - Bounty solution
There was a problem hiding this comment.
Code Review
This pull request centralizes LLM JSON response parsing by introducing a shared utility module, src/utils/json_parse.py, and refactoring CodeAgent, JudgeAgent, and SnippetAgent to use it. Feedback highlights a critical issue in src/agents/judge.py where removing the json import will cause a NameError due to a remaining reference in an exception handler. Additionally, it is recommended to ensure extract_json_from_response always returns a dictionary to prevent AttributeError in calling code, and to address unused helper functions introduced in the new utility module.
|
|
||
| import asyncio | ||
| import json | ||
| from typing import Any, Callable, Dict, List, Optional |
There was a problem hiding this comment.
Removing the json import here will cause a NameError in the _parse_response method (specifically at line 513) because json.JSONDecodeError is still referenced in an except block. To fix this, you should either restore the import or update the except block to catch Exception, which would be consistent with the refactoring applied to CodeAgent and SnippetAgent.
| from typing import Any, Callable, Dict, List, Optional | |
| import json | |
| from typing import Any, Callable, Dict, List, Optional |
| if len(parts) > 1: | ||
| cleaned = parts[1].split("```", 1)[0] | ||
|
|
||
| return json.loads(cleaned.strip()) |
There was a problem hiding this comment.
The json.loads function can return types other than a dictionary (e.g., lists, strings, or booleans). Since this utility is type-hinted to return a dict and is used by agents that immediately call .get() on the result, a non-dictionary return value will trigger an AttributeError. It is safer to verify that the parsed data is a dictionary before returning it.
| return json.loads(cleaned.strip()) | |
| data = json.loads(cleaned.strip()) | |
| return data if isinstance(data, dict) else {} |
| def parse_list_field( | ||
| raw: str, | ||
| field_name: str, | ||
| *, | ||
| model_cls: type[T], | ||
| logger: Optional[logging.Logger] = None, | ||
| ) -> list[T]: | ||
| """Parse a list of Pydantic models from a JSON field in an LLM response. | ||
|
|
||
| Args: | ||
| raw: The raw LLM output string. | ||
| field_name: The key in the JSON object whose value is a list. | ||
| model_cls: Pydantic model class to instantiate for each item. | ||
| logger: Optional logger for error reporting. | ||
|
|
||
| Returns: | ||
| A list of parsed model instances (empty list on any failure). | ||
| """ | ||
| log = logger or logging.getLogger("xmem.utils.json_parse") | ||
| try: | ||
| data = extract_json_from_response(raw) | ||
| items_data = data.get(field_name, []) | ||
| if not items_data: | ||
| return [] | ||
|
|
||
| results: list[T] = [] | ||
| for item_dict in items_data: | ||
| if not isinstance(item_dict, dict): | ||
| continue | ||
| try: | ||
| results.append(model_cls(**item_dict)) | ||
| except Exception as exc: | ||
| log.warning("Failed to parse item in %s: %s", field_name, exc) | ||
|
|
||
| return results | ||
|
|
||
| except Exception as exc: | ||
| log.error("Failed to parse %s from response: %s", field_name, exc) | ||
| log.debug("Raw response: %s", raw[:500]) | ||
| return [] | ||
|
|
||
|
|
||
| def parse_json_response( | ||
| raw: str, | ||
| result_cls: type[T], | ||
| *, | ||
| expected_field: Optional[str] = None, | ||
| logger: Optional[logging.Logger] = None, | ||
| ) -> T: | ||
| """Parse an LLM response into a Pydantic result model. | ||
|
|
||
| Args: | ||
| raw: The raw LLM output string. | ||
| result_cls: Pydantic model class for the result object. | ||
| expected_field: If provided, extract the list from this field and | ||
| pass it as a constructor argument named "field_name". | ||
| The constructor must accept the field as a list of dicts. | ||
| logger: Optional logger for error reporting. | ||
|
|
||
| Returns: | ||
| An instance of result_cls, populated or empty on failure. | ||
| """ | ||
| log = logger or logging.getLogger("xmem.utils.json_parse") | ||
| try: | ||
| data = extract_json_from_response(raw) | ||
|
|
||
| if expected_field: | ||
| field_data = data.get(expected_field, []) | ||
| return result_cls(**{expected_field: field_data}) | ||
|
|
||
| return result_cls(**data) | ||
|
|
||
| except Exception as exc: | ||
| log.error("Failed to parse response into %s: %s", result_cls.__name__, exc) | ||
| log.debug("Raw response: %s", raw[:500]) | ||
| return result_cls() No newline at end of file |
There was a problem hiding this comment.
The functions parse_list_field and parse_json_response are defined but not currently used by any of the agents refactored in this pull request. To maintain a clean codebase and avoid dead code, these should either be integrated into the agents (e.g., to replace manual mapping loops) or removed until they are actually needed.
…gent, and SnippetAgent
|
@vulam1808 this looks like a bot PR. we advice to first discuss your approach on the code agent. closing this for now |
Bounty Solution for #141
Issue: Redesign, Audit and Revamp Code & Snippet Agents + Judge Logic
Bounty Amount: $20
Changes Made:
src/utils/json_parse.pywith sharedextract_json_from_response()utilityCodeAgent._parse_response()to use shared utility (removed 10 lines of duplicate code)JudgeAgent._parse_judgement()to use shared utility (removed ~10 lines of duplicate code)SnippetAgentsimilarly (removed ~10 lines of duplicate code)This refactoring addresses the architectural debt identified in the issue by extracting common JSON parsing logic into a shared, testable utility that all agents can use.
Auto-generated by Bounty Hunter Pipeline