Experimental claude skill for puzzletron algoritgm by danielkorzekwa · Pull Request #1769 · NVIDIA/Model-Optimizer

danielkorzekwa · 2026-06-18T11:20:52Z

What does this PR do?

Type of change: new feature

Experimental claude skill for puzzletron compression algorithm. See .agents/skills/puzzletron/README.md for details

Usage

see .agents/skills/puzzletron/README.md

Testing

Please test it manually before approving MR.
Propose automated way of testing

Before your PR is "Ready for review"

Is this change backward compatible?: ✅
Did you write any new necessary tests?: tested manually only
Did you update Changelog?: ✅
Did you get Claude approval on this PR?: ✅ / ❌ / N/A

Summary by CodeRabbit

Release Notes

New Features
- Added an experimental Puzzletron repo-local agent skill with /puzzletron mip and /puzzletron all to run the MIP step or the full pipeline.
- Added real-time /puzzletron mip progress and /puzzletron all progress reporting with per-step (and per-rate for MIP) elapsed time, completion state, estimated remaining time, and results path when available.
Documentation
- Expanded Puzzletron skill docs with command syntax, progress/ETA behavior, and example outputs.
- Updated example guidance for using Puzzletron with AI agents.
Tests
- Added deterministic tests covering progress reporting and skill execution behavior.
Chores
- Linked the Claude skill directory to the updated Puzzletron skill docs.

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

coderabbitai · 2026-06-18T11:25:07Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 368df67c-eaf0-41f0-b918-f69c394fb5d6

📥 Commits

Reviewing files that changed from the base of the PR and between 91cb775 and 62109c6.

📒 Files selected for processing (2)

.agents/skills/puzzletron/all_progress.py
.agents/skills/puzzletron/mip_progress.py

🚧 Files skipped from review as they are similar to previous changes (2)

.agents/skills/puzzletron/all_progress.py
.agents/skills/puzzletron/mip_progress.py

📝 Walkthrough

Walkthrough

Adds an experimental /puzzletron skill with all and mip commands, progress-reporting scripts, deterministic tests, and updated user-facing docs. It also wires the skill into .claude/skills/ and records the feature in the changelog.

Changes

Puzzletron Agent Skill

Layer / File(s)	Summary
Slash-command routing spec `.agents/skills/puzzletron/SKILL.md`	Defines skill metadata, command routing, argument validation, `all`/`mip` execution payloads, and progress subcommand delegation.
Full-pipeline progress script `.agents/skills/puzzletron/all_progress.py`	Reads `log.txt`, parses step events and timestamps, estimates remaining time from in-flight signals or observed durations, and prints the 8-step status summary.
MIP progress script `.agents/skills/puzzletron/mip_progress.py`	Reads `log.txt`, validates compression-rate sweeps, reports sweep-disabled or per-rate progress, computes elapsed and remaining time, and extracts results paths and solver timing details.
Tests, README, and wiring `.agents/skills/puzzletron/tests/test_skill.py`, `.agents/skills/puzzletron/README.md`, `examples/puzzletron/README.md`, `.claude/skills/puzzletron`, `CHANGELOG.rst`	Adds deterministic tests, updates the skill and example READMEs, records the feature in the changelog, and wires the skill into `.claude/skills/` via symlink.

Estimated code review effort: 3 (Moderate) | ~20 minutes

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: an experimental Claude skill for Puzzletron.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Anti-Patterns	✅ Passed	No prohibited patterns found; the PR only adds agent docs/scripts/tests, with no modelopt/examples Python or dependency changes.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch dkorzekwa/puzzletron_claude_skill

_{Comment @coderabbitai help to get the list of available commands.}

codecov · 2026-06-18T11:30:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.10%. Comparing base (b0ee953) to head (62109c6).
⚠️ Report is 1 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (b0ee953) and HEAD (62109c6). Click for more details.

HEAD has 2 uploads less than BASE

Flag BASE (b0ee953) HEAD (62109c6)

unit 2 1

examples 12 11

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1769      +/-   ##
==========================================
- Coverage   75.20%   66.10%   -9.10%     
==========================================
  Files         515      516       +1     
  Lines       57245    57270      +25     
==========================================
- Hits        43050    37861    -5189     
- Misses      14195    19409    +5214

Flag	Coverage Δ
examples	`42.96% <ø> (-0.17%)`	⬇️
unit	`54.91% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.agents/skills/puzzletron/all_progress.py:
- Around line 80-84: The variables `cur_b` and `total_b` are only defined inside
the elif block when `batch_matches` is truthy, but they are used later in the
code (around line 100) regardless of which conditional branch executes. When the
if condition on line 80 evaluates to true (sol_done is not None and sol_total is
truthy), the elif block is skipped entirely, leaving `cur_b` and `total_b`
undefined. Extract the batch data unpacking logic (extracting pct, cur_b, and
total_b from batch_matches[-1]) before the if-elif conditional block to ensure
these variables are always defined when batch_matches is non-empty, preventing
NameError when they are referenced later in the code.

In @.agents/skills/puzzletron/mip_progress.py:
- Around line 53-59: Replace the hardcoded source line number markers with
content-based semantic markers to make detection robust to code refactoring. In
the completion detection block around line 57, replace the condition checking
for "sweep.py:292" with a check for "Results written to:" which is the actual
completion message. In the related detection block around lines 109-114 that
currently guards on "sweep.py:258", remove the line number check entirely and
instead use unconditional regex matching on the "compression_rate=" pattern
which is already a proven approach used at line 99 for results detection.

In @.agents/skills/puzzletron/SKILL.md:
- Around line 34-40: The specification lacks numeric validation for the
nproc_per_node parameter before it is interpolated into shell commands, creating
a security vulnerability for shell injection attacks. Add an explicit validation
rule to both the "all" and "local" command sections in the skill specification
that checks whether nproc_per_node matches the pattern of a positive integer
(^[0-9]+$). Insert this validation check after the "value not found" check and
before the "Otherwise use the parsed value" instruction in both sections. If the
value is not strictly numeric, the specification should instruct to ask the user
"nproc_per_node must be a positive integer." and STOP before any shell command
execution occurs.
- Around line 46-53: The shell pipeline using torchrun piped to tee piped to
grep does not properly propagate exit codes because without pipefail, the
pipeline only returns the exit code of the rightmost command (grep). When
torchrun fails but grep successfully finds the "Puzzletron Progress" pattern,
the pipeline reports success even though the actual torchrun command failed. To
fix this, add set -o pipefail before or at the beginning of the script block
containing the torchrun command to ensure that the pipeline returns a non-zero
exit code when any command in the pipeline fails, allowing accurate exit code
reporting as mentioned in the instructions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f507f804-2357-44dd-934e-633f88d0cd06

📥 Commits

Reviewing files that changed from the base of the PR and between 769ea5f and afb6a71.

📒 Files selected for processing (7)

.agents/skills/puzzletron/README.md
.agents/skills/puzzletron/SKILL.md
.agents/skills/puzzletron/all_progress.py
.agents/skills/puzzletron/mip_progress.py
.claude/skills/puzzletron
CHANGELOG.rst
examples/puzzletron/README.md

…re always defined Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

…l and mip commands Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

…ection in mip_progress.py Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

…t masked by grep exit code Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

kevalmorabia97 · 2026-06-22T09:47:37Z

+Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:
+
+```bash
+set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \


Not all users would have modelopt mounted at /workspace/Model-Optimizer and we shouldnt need to modify PYTHONPATH. Instead, can we have users install puzzletron dependencies as per example readme as a pre-requisite to running this skill? then we just need to run the torchrun ... command here

Also how common is it for everyone to have claude code installed inside their docker container? Generally its there on the local machine but not in docker

kevalmorabia97 · 2026-06-22T10:21:51Z

/claude review

claude · 2026-06-22T10:29:59Z

+rates_match = re.search(r"Compression rates: \[(.*?)\]", text)
+all_rates = [norm(r.strip()) for r in rates_match.group(1).split(",")] if rates_match else []
+
+# Detect completion via step 8 marker or sweep.py:292


[SUGGESTION] Stale comment: it still references sweep.py:292, but the code below no longer keys off any source line number — it detects completion via the content markers "Results written to:" and "Puzzletron Progress 8/8". A comment that names a marker the code doesn't use is misleading to the next reader. Suggest aligning it with the actual logic:

Suggested change

# Detect completion via step 8 marker or sweep.py:292

# Detect completion via the "Results written to:" line or the step 8/8 marker

claude

Claude review passed — no blocking issues found. LGTM

Summary

Findings: CRITICAL: 0 · IMPORTANT: 0 · SUGGESTION: 1

This PR adds an experimental /puzzletron Claude Code agent skill: docs (README.md, SKILL.md), two pure-stdlib log-parsing progress scripts (all_progress.py, mip_progress.py), a relative symlink wiring .claude/skills/puzzletron → .agents/skills/puzzletron, plus CHANGELOG and example-README updates. No modelopt/ source is touched, so the mode/state-composability, export-compatibility, and algorithm-correctness categories don't apply here. All 7 changed files were reviewed.

Prior CodeRabbit findings — all resolved in HEAD:

set -o pipefail is present in both the mip and all torchrun pipelines, so a torchrun failure is no longer masked by the trailing grep.
nproc_per_node is validated against ^[0-9]+$ before being interpolated into the shell command (both commands).
cur_b/total_b are now unpacked from batch_matches[-1] before the if/elif block in all_progress.py, fixing the prior NameError.
Completion detection in mip_progress.py uses content markers ("Results written to:", "Puzzletron Progress 8/8") instead of brittle source line numbers.

Verified: the symlink target is correct (mode 120000, relative), and the hardcoded config path examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml exists.

Remaining (non-blocking):

1 SUGGESTION: a stale code comment in mip_progress.py:53 still references sweep.py:292, a marker the code no longer uses.

Risk: Low. Self-contained, experimental, opt-in tooling with no impact on the core optimization library or existing user workflows.

Signed-off-by: Johannes Rausch <jrausch@nvidia.com>

j-rausch · 2026-07-03T13:34:07Z

I've re-run the PR branch and added some changes to resolve friction that I've run into here: https://github.com/NVIDIA/Model-Optimizer/tree/jrausch/pr1769-review

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

danielkorzekwa added 13 commits June 17, 2026 07:52

Add a dummy puzzletron skill

8ab8f7a

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add progress comnand for puzzletron mip_sweep

9a6d716

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

update puzzletron readme

1e79463

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

fix typo in SKILL.md

f99c01f

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Add PYTHONPATH to mip_sweep skill

131296c

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

add skill for puzzletron all

c5c2015

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

update progress bar

1a62799

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Rename mip_sweep to mip command

fa197c2

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Move python scripts for puzzletron claude command to py scripts.

03ceee3

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

update progress bar

02ec52e

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

code clean up

dd6fbdb

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

code clean up

c9f1fe1

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

add changelog for puzzletron clade skill

afb6a71

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

danielkorzekwa requested review from a team as code owners June 18, 2026 11:20

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread .agents/skills/puzzletron/all_progress.py

Comment thread .agents/skills/puzzletron/mip_progress.py Outdated

Comment thread .agents/skills/puzzletron/SKILL.md

Comment thread .agents/skills/puzzletron/SKILL.md

danielkorzekwa mentioned this pull request Jun 19, 2026

When creating a claude skill for puzzletron, quantization tests fails slowing down development process #1772

Open

danielkorzekwa added 4 commits June 19, 2026 00:21

NameError: unpack batch_matches before if/elif so cur_b and total_b a…

37d7132

…re always defined Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

dd nproc_per_node integer validation to prevent shell injection in al…

417ae87

…l and mip commands Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

replace hardcoded sweep.py line-number markers with content-based det…

708cf7b

…ection in mip_progress.py Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

add set -o pipefail to torchrun pipelines so torchrun failures are no…

ca9d8b4

…t masked by grep exit code Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

coderabbitai Bot approved these changes Jun 19, 2026

View reviewed changes

danielkorzekwa assigned chochowski, Separius, j-rausch, grzegorz-k-karch and kevalmorabia97 Jun 22, 2026

danielkorzekwa mentioned this pull request Jun 22, 2026

Create adding_new_model_tutorial.md #1784

Open

kevalmorabia97 reviewed Jun 22, 2026

View reviewed changes

kevalmorabia97 requested review from Separius, chochowski, grzegorz-k-karch and j-rausch June 22, 2026 10:21

kevalmorabia97 unassigned Separius, grzegorz-k-karch, chochowski, kevalmorabia97 and j-rausch Jun 22, 2026

claude Bot reviewed Jun 22, 2026

View reviewed changes

claude Bot approved these changes Jun 22, 2026

View reviewed changes

j-rausch added 3 commits July 3, 2026 13:46

Merge current main into Puzzletron skill branch

0aa20ba

Signed-off-by: Johannes Rausch <jrausch@nvidia.com>

Harden Puzzletron skill execution and progress handling

775fd25

Signed-off-by: Johannes Rausch <jrausch@nvidia.com>

Test Puzzletron skill execution and progress handling

91cb775

Signed-off-by: Johannes Rausch <jrausch@nvidia.com>

Update puzzletron skill

62109c6

Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Experimental claude skill for puzzletron algoritgm#1769

Experimental claude skill for puzzletron algoritgm#1769
danielkorzekwa wants to merge 21 commits into
mainfrom
dkorzekwa/puzzletron_claude_skill

danielkorzekwa commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

Walkthrough

Changes

Uh oh!

codecov Bot commented Jun 18, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kevalmorabia97 Jun 22, 2026

Uh oh!

kevalmorabia97 Jun 22, 2026

Uh oh!

kevalmorabia97 commented Jun 22, 2026

Uh oh!

claude Bot Jun 22, 2026

Uh oh!

claude Bot left a comment

Uh oh!

j-rausch commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

	# Detect completion via step 8 marker or sweep.py:292
	# Detect completion via the "Results written to:" line or the step 8/8 marker

Uh oh!

Conversation

danielkorzekwa commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Usage

Testing

Before your PR is "Ready for review"

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

codecov Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kevalmorabia97 Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

kevalmorabia97 Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

kevalmorabia97 commented Jun 22, 2026

Uh oh!

claude Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Summary

Uh oh!

j-rausch commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

danielkorzekwa commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

codecov Bot commented Jun 18, 2026 •

edited

Loading