A command-line tool to convert Claude Code session logs to Parquet format for data analysis and AI applications.
npm install -g claude2parquet

# Export Claude Code logs for current directory to claude_code_<project>.parquet
claude2parquet
# Export logs from all projects
claude2parquet --all
# Export to custom filename
claude2parquet --output logs.parquet
# Export logs for a specific project directory
claude2parquet --project ~/code/myapp

The generated Parquet file contains the following columns:
- project (STRING): Project name derived from the session directory
- session_id (STRING): Unique session identifier
- uuid (STRING): Unique message identifier
- timestamp (STRING): Message timestamp in ISO format
- type (STRING): Message type (user or assistant)
- role (STRING): Message role
- model (STRING): Model used for assistant messages
- content (STRING): Flattened message content
- version (STRING): Claude Code version
- cwd (STRING): Working directory at time of message
- git_branch (STRING): Active git branch at time of message
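As a rough illustration of how these columns might be used, the following Python sketch counts messages per project and message type. It relies only on the column names listed above. The helper names `load_rows` and `count_messages` are hypothetical, and loading assumes that pandas and a Parquet engine such as pyarrow are installed; neither is part of claude2parquet.

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping


def load_rows(path: str) -> List[Dict]:
    """Load a Parquet file into a list of row dicts.

    Assumption: pandas and a Parquet engine (for example pyarrow) are installed.
    """
    import pandas as pd  # imported lazily so count_messages works without pandas

    return pd.read_parquet(path).to_dict("records")


def count_messages(rows: Iterable[Mapping[str, object]]) -> Counter:
    """Count messages per (project, type) pair using the documented columns."""
    counts: Counter = Counter()
    for row in rows:
        counts[(row["project"], row["type"])] += 1
    return counts
```

For a file exported with `claude2parquet --output logs.parquet`, calling `count_messages(load_rows("logs.parquet")).most_common()` would list the busiest project and message-type pairs first.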
- Node.js
- Claude Code must be installed, with session logs in ~/.claude/projects/
Claude Code deletes session logs older than 30 days by default. To retain more history, set cleanupPeriodDays in ~/.claude/settings.json:
{ "cleanupPeriodDays": 365 }

- --output <file>, -o <file>: Output Parquet filename (default: claude_code_<project>.parquet, or claude_code.parquet with --all)
- --project <path>: Filter logs to a specific project directory
- --all: Export logs from all projects
- --help, -h: Show help message
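If ~/.claude/settings.json already holds other settings, the cleanupPeriodDays value mentioned earlier can be merged in rather than written over the file. The sketch below is one possible way to do that in Python, assuming the file is plain JSON; the function name `set_cleanup_period` is hypothetical and is not part of claude2parquet or Claude Code.

```python
import json
from pathlib import Path


def set_cleanup_period(settings_path: Path, days: int) -> dict:
    """Set cleanupPeriodDays in a JSON settings file, keeping any other keys."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():  # treat an empty file as having no settings
            settings = json.loads(text)
    settings["cleanupPeriodDays"] = days
    # Create the parent directory if it does not exist yet.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

A call such as `set_cleanup_period(Path.home() / ".claude" / "settings.json", 365)` would apply the one-year retention shown above. Note that the file is rewritten with two-space indentation, so any custom formatting is not preserved.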
- Analyzing Claude Code usage patterns across projects
- Training ML models on human-AI coding interactions
- Creating datasets for software engineering research
- Building usage dashboards and productivity metrics
Hyperparam is a tool for exploring and curating AI datasets, such as those produced by claude2parquet.