feat: support Claude Code transcripts by LoikStyle · Pull Request #168 · XortexAI/XMem

LoikStyle · 2026-05-11T05:08:16Z

Summary

- Add Claude Code JSONL transcript parsing to the context import pipeline
- Extract only conversational text from user/assistant turns and ignore tool-only blocks
- Include focused regression coverage for Claude Code transcript uploads

Test Plan

- python3 -m pytest tests/test_claude_code_transcript.py -q -o addopts=''
- python3 -m py_compile src/api/routes/memory.py server.py

Fixes #156

gemini-code-assist

Code Review

This pull request implements a parser for Claude Code JSONL transcripts, adding _content_to_text and _parse_claude_code_transcript functions to server.py and src/api/routes/memory.py, and includes a new test file. Feedback indicates that the parsing logic is duplicated and should be moved to a shared module to reduce maintenance overhead. Additionally, a performance optimization was suggested to include a heuristic check for JSON content before attempting to parse the transcript lines.

gemini-code-assist · 2026-05-11T05:13:28Z

+def _content_to_text(content: Any) -> str:
+    """Extract readable text from Claude Code message content blocks."""
+    if isinstance(content, str):
+        return content.strip()
+    if isinstance(content, list):
+        chunks: list[str] = []
+        for item in content:
+            if isinstance(item, str):
+                chunks.append(item)
+            elif isinstance(item, dict) and item.get("type") == "text":
+                chunks.append(str(item.get("text", "")))
+        return "\n".join(chunk.strip() for chunk in chunks if chunk.strip()).strip()
+    return ""


The logic for _content_to_text and _parse_claude_code_transcript is duplicated between server.py and src/api/routes/memory.py. This increases maintenance overhead and the risk of inconsistencies as the parsing logic evolves. Consider moving these utilities to a shared module (e.g., src/utils/transcripts.py) that both files can import from.

gemini-code-assist · 2026-05-11T05:13:28Z

+    current_user_query: str | None = None
+    assistant_chunks: list[str] = []
+
+    for raw_line in text.splitlines():


The current implementation of _parse_claude_code_transcript iterates through every line of the input text and attempts to parse it as JSON. This can be inefficient for large non-JSON transcripts (e.g., standard markdown files that don't match Cursor or Antigravity formats). Since Claude Code transcripts are JSONL files, adding a quick heuristic check at the beginning of the function can avoid unnecessary processing.

Suggested change

-    for raw_line in text.splitlines():
+    if not text.strip().startswith("{"):
+        return []
+    for raw_line in text.splitlines():
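
The guard in the suggestion can also be written as a small predicate. The sketch below is an illustration, not code from the PR: the name `looks_like_jsonl` is hypothetical, and it scans for the first non-whitespace character rather than calling `text.strip()`, which can copy a large upload just to inspect one character.

```python
def looks_like_jsonl(text: str) -> bool:
    """Cheap pre-check: is the first non-whitespace character an opening brace?

    Hypothetical helper; it only rules out obvious non-JSONL input and does
    not validate that any line is well-formed JSON.
    """
    for ch in text:
        if not ch.isspace():
            return ch == "{"
    # Empty or whitespace-only input cannot be a transcript.
    return False


if __name__ == "__main__":
    print(looks_like_jsonl('\n  {"type": "user"}'))   # leading whitespace is skipped
    print(looks_like_jsonl("# Meeting notes\n- item"))  # markdown is rejected
```

Like the suggested `startswith("{")` check, this rejects input that begins with a byte-order mark, because `str.isspace()` does not treat U+FEFF as whitespace. Whether uploads can carry a BOM is not stated in the thread.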

gemini-code-assist · 2026-05-11T05:13:29Z

+    current_user_query: str | None = None
+    assistant_chunks: List[str] = []
+
+    for raw_line in text.splitlines():


The current implementation of _parse_claude_code_transcript iterates through every line of the input text and attempts to parse it as JSON. This can be inefficient for large non-JSON transcripts. Adding a quick heuristic check at the beginning of the function can avoid unnecessary processing for files that are clearly not in JSONL format.

Suggested change

-    for raw_line in text.splitlines():
+    if not text.strip().startswith("{"):
+        return []
+    for raw_line in text.splitlines():

Ankit-Kotnala · 2026-05-11T18:33:23Z

+    current_user_query: str | None = None
+    assistant_chunks: List[str] = []
+
+    for raw_line in text.splitlines():


Good call. Since Claude Code transcripts are JSONL, the shared parser should first reject obvious non-JSONL input before iterating through every line. This should be fixed in the shared parser rather than separately in both files.

Ankit-Kotnala · 2026-05-11T18:33:31Z



+def _content_to_text(content: Any) -> str:
+    """Extract readable text from Claude Code message content blocks."""


Agree. Since this parser is used by both the standalone server and the production memory route, please move the Claude transcript parsing into src/utils/transcripts.py and have both server.py and src/api/routes/memory.py import the shared parser from there.
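
As a rough picture of what the requested shared module could look like, here is a minimal sketch of `src/utils/transcripts.py`. Only the body of `content_to_text` and the names `current_user_query`, `assistant_chunks`, and `raw_line` come from the diff. Everything else is an assumption made for illustration: the public function names, the record layout (`message.role` and `message.content` on each JSONL line), and the return type of (user query, assistant reply) pairs.

```python
import json
from typing import Any, Optional


def content_to_text(content: Any) -> str:
    """Extract readable text from message content blocks (logic from the diff)."""
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
        chunks: list[str] = []
        for item in content:
            if isinstance(item, str):
                chunks.append(item)
            elif isinstance(item, dict) and item.get("type") == "text":
                chunks.append(str(item.get("text", "")))
        return "\n".join(chunk.strip() for chunk in chunks if chunk.strip()).strip()
    return ""


def parse_claude_code_transcript(text: str) -> list[tuple[str, str]]:
    """Return (user_query, assistant_reply) pairs; [] for obvious non-JSONL input.

    The record shape and the pair-based return value are assumptions.
    """
    # Early rejection requested in review: JSONL objects start with "{".
    if not text.strip().startswith("{"):
        return []

    pairs: list[tuple[str, str]] = []
    current_user_query: Optional[str] = None
    assistant_chunks: list[str] = []

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the import
        message = record.get("message") if isinstance(record, dict) else None
        if not isinstance(message, dict):
            continue
        body = content_to_text(message.get("content"))
        if not body:
            continue  # tool-only turn: nothing conversational to keep
        role = message.get("role")
        if role == "user":
            # A new user turn closes the previous exchange, if it had a reply.
            if current_user_query and assistant_chunks:
                pairs.append((current_user_query, "\n".join(assistant_chunks)))
            current_user_query, assistant_chunks = body, []
        elif role == "assistant" and current_user_query:
            assistant_chunks.append(body)

    if current_user_query and assistant_chunks:
        pairs.append((current_user_query, "\n".join(assistant_chunks)))
    return pairs
```

With a module along these lines, `server.py` and `src/api/routes/memory.py` would each import the parser rather than define their own copy, so the non-JSONL guard would live in one place.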

Ankit-Kotnala

The feature is good, but @LoikStyle should centralize the parser and clean up the test before merge.

feat: support Claude Code transcripts

dc401c6

LoikStyle requested review from ishaanxgupta and ved015 as code owners May 11, 2026 05:08

github-actions Bot added tests api labels May 11, 2026

LoikStyle mentioned this pull request May 11, 2026

add support of claude code transcript in /context page #156

Open

gemini-code-assist Bot reviewed May 11, 2026

ishaanxgupta requested review from Ankit-Kotnala May 11, 2026 08:37

Ankit-Kotnala reviewed May 11, 2026

LoikStyle wants to merge 1 commit into XortexAI:main from LoikStyle:loikstyle/claude-code-transcript-156
