Experimental Claude skill for Puzzletron algorithm #1769
danielkorzekwa wants to merge 21 commits into
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID:
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 Walkthrough
Adds an experimental
Changes
Puzzletron Agent Skill
Estimated code review effort: 3 (Moderate) | ~20 minutes
🚥 Pre-merge checks: ✅ 6 passed
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1769      +/-   ##
==========================================
- Coverage   75.20%   66.10%   -9.10%
==========================================
  Files         515      516       +1
  Lines       57245    57270      +25
==========================================
- Hits        43050    37861    -5189
- Misses      14195    19409    +5214
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.agents/skills/puzzletron/all_progress.py:
- Around line 80-84: The variables `cur_b` and `total_b` are only defined inside
the elif block when `batch_matches` is truthy, but they are used later in the
code (around line 100) regardless of which conditional branch executes. When the
if condition on line 80 evaluates to true (sol_done is not None and sol_total is
truthy), the elif block is skipped entirely, leaving `cur_b` and `total_b`
undefined. Extract the batch data unpacking logic (extracting pct, cur_b, and
total_b from batch_matches[-1]) before the if-elif conditional block to ensure
these variables are always defined when batch_matches is non-empty, preventing
NameError when they are referenced later in the code.
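The fix described above can be sketched as follows. This is a minimal illustration, not the code in `all_progress.py`: the function name `summarize`, the tuple layout of a batch entry, and the output strings are all hypothetical. Only the variable names `sol_done`, `sol_total`, `batch_matches`, `pct`, `cur_b`, and `total_b` come from the review comment.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical shape of one parsed batch entry: (percent, current_batch, total_batches).
Batch = Tuple[int, int, int]


def summarize(sol_done: Optional[int], sol_total: Optional[int],
              batch_matches: Sequence[Batch]) -> str:
    # Unpack the latest batch entry before branching, so pct, cur_b and
    # total_b are bound on every path rather than only inside the elif.
    pct = cur_b = total_b = None
    if batch_matches:
        pct, cur_b, total_b = batch_matches[-1]

    if sol_done is not None and sol_total:
        headline = f"solutions {sol_done}/{sol_total}"
    elif batch_matches:
        headline = f"{pct}% of batches"
    else:
        headline = "no progress yet"

    # A later use like this one is what would raise NameError if the
    # unpacking lived only inside the elif branch and the if branch ran.
    if cur_b is not None:
        headline += f" (batch {cur_b}/{total_b})"
    return headline
```

Initializing the three names to `None` also covers the case where `batch_matches` is empty, which the review comment does not address explicitly.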
In @.agents/skills/puzzletron/mip_progress.py:
- Around line 53-59: Replace the hardcoded source line number markers with
content-based semantic markers to make detection robust to code refactoring. In
the completion detection block around line 57, replace the condition checking
for "sweep.py:292" with a check for "Results written to:" which is the actual
completion message. In the related detection block around lines 109-114 that
currently guards on "sweep.py:258", remove the line number check entirely and
instead use unconditional regex matching on the "compression_rate=" pattern
which is already a proven approach used at line 99 for results detection.
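A rough sketch of content-based detection, assuming the log is available as one string. The marker `"Results written to:"` and the `compression_rate=` pattern come from the comment above; `"Puzzletron Progress 8/8"` is the step marker quoted later in this thread. The helper names and the exact numeric format of the rate are assumptions, and this is not the code in `mip_progress.py`.

```python
import re
from typing import List

# Markers quoted in the review; the surrounding log layout is assumed.
DONE_MARKERS = ("Results written to:", "Puzzletron Progress 8/8")
# Assumes rates are logged as plain decimals, e.g. compression_rate=0.5
RATE_RE = re.compile(r"compression_rate=([0-9]+(?:\.[0-9]+)?)")


def is_complete(log_text: str) -> bool:
    """True once any completion marker appears, wherever it was logged from."""
    return any(marker in log_text for marker in DONE_MARKERS)


def seen_rates(log_text: str) -> List[str]:
    """Rates mentioned in the log, in first-seen order, without duplicates."""
    return list(dict.fromkeys(RATE_RE.findall(log_text)))
```

Because neither helper looks at a `file.py:NNN` prefix, moving or reformatting the logging call in `sweep.py` would not change the result as long as the message text stays the same.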
In @.agents/skills/puzzletron/SKILL.md:
- Around line 34-40: The specification lacks numeric validation for the
nproc_per_node parameter before it is interpolated into shell commands, creating
a security vulnerability for shell injection attacks. Add an explicit validation
rule to both the "all" and "local" command sections in the skill specification
that checks whether nproc_per_node matches the pattern of a positive integer
(^[0-9]+$). Insert this validation check after the "value not found" check and
before the "Otherwise use the parsed value" instruction in both sections. If the
value is not strictly numeric, the specification should instruct to ask the user
"nproc_per_node must be a positive integer." and STOP before any shell command
execution occurs.
- Around line 46-53: The shell pipeline using torchrun piped to tee piped to
grep does not properly propagate exit codes because without pipefail, the
pipeline only returns the exit code of the rightmost command (grep). When
torchrun fails but grep successfully finds the "Puzzletron Progress" pattern,
the pipeline reports success even though the actual torchrun command failed. To
fix this, add set -o pipefail before or at the beginning of the script block
containing the torchrun command to ensure that the pipeline returns a non-zero
exit code when any command in the pipeline fails, allowing accurate exit code
reporting as mentioned in the instructions.
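The exit-code masking described in the second comment can be reproduced without torchrun. The sketch below assumes `bash` is on the PATH and uses `false` as a stand-in for a failing torchrun and `cat` as a stand-in for the `tee`/`grep` stages; the helper name `pipeline_status` is hypothetical.

```python
import subprocess


def pipeline_status(script: str) -> int:
    """Run a script under bash and return the shell's exit status."""
    return subprocess.run(["bash", "-c", script]).returncode


# Without pipefail the pipeline reports the status of the last command (cat).
without_pipefail = pipeline_status("false | cat")

# With pipefail the failing first stage determines the pipeline status.
with_pipefail = pipeline_status("set -o pipefail; false | cat")

print(without_pipefail, with_pipefail)  # prints "0 1"
```

One consequence worth keeping in mind: with `pipefail` enabled, a `grep` that finds no match exits with status 1, so the pipeline is also reported as failed when torchrun succeeds but the pattern never appears.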
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f507f804-2357-44dd-934e-633f88d0cd06
📒 Files selected for processing (7)
.agents/skills/puzzletron/README.md
.agents/skills/puzzletron/SKILL.md
.agents/skills/puzzletron/all_progress.py
.agents/skills/puzzletron/mip_progress.py
.claude/skills/puzzletron
CHANGELOG.rst
examples/puzzletron/README.md
…re always defined Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…l and mip commands Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…ection in mip_progress.py Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
…t masked by grep exit code Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:

```bash
set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \
```
Not all users would have modelopt mounted at /workspace/Model-Optimizer, and we shouldn't need to modify PYTHONPATH. Instead, can we have users install the Puzzletron dependencies per the example README as a prerequisite to running this skill? Then we just need to run the torchrun ... command here.
Also, how common is it for everyone to have Claude Code installed inside their Docker container? Generally it's on the local machine but not in Docker.
/claude review
rates_match = re.search(r"Compression rates: \[(.*?)\]", text)
all_rates = [norm(r.strip()) for r in rates_match.group(1).split(",")] if rates_match else []

# Detect completion via step 8 marker or sweep.py:292
[SUGGESTION] Stale comment: it still references sweep.py:292, but the code below no longer keys off any source line number — it detects completion via the content markers "Results written to:" and "Puzzletron Progress 8/8". A comment that names a marker the code doesn't use is misleading to the next reader. Suggest aligning it with the actual logic:
- # Detect completion via step 8 marker or sweep.py:292
+ # Detect completion via the "Results written to:" line or the step 8/8 marker
Claude review passed — no blocking issues found. LGTM
Summary
Findings: CRITICAL: 0 · IMPORTANT: 0 · SUGGESTION: 1
This PR adds an experimental /puzzletron Claude Code agent skill: docs (README.md, SKILL.md), two pure-stdlib log-parsing progress scripts (all_progress.py, mip_progress.py), a relative symlink wiring .claude/skills/puzzletron → .agents/skills/puzzletron, plus CHANGELOG and example-README updates. No modelopt/ source is touched, so the mode/state-composability, export-compatibility, and algorithm-correctness categories don't apply here. All 7 changed files were reviewed.
Prior CodeRabbit findings — all resolved in HEAD:
- `set -o pipefail` is present in both the `mip` and `all` torchrun pipelines, so a torchrun failure is no longer masked by the trailing `grep`.
- `nproc_per_node` is validated against `^[0-9]+$` before being interpolated into the shell command (both commands).
- `cur_b`/`total_b` are now unpacked from `batch_matches[-1]` before the `if`/`elif` block in `all_progress.py`, fixing the prior `NameError`.
- Completion detection in `mip_progress.py` uses content markers (`"Results written to:"`, `"Puzzletron Progress 8/8"`) instead of brittle source line numbers.
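The `nproc_per_node` check listed above lives in the skill specification rather than in Python, but the same rule can be sketched in code. The function name `parse_nproc_per_node` is hypothetical, and rejecting `0` is an addition that goes beyond the `^[0-9]+$` pattern, which would accept it.

```python
import re


def parse_nproc_per_node(raw: str) -> int:
    """Return raw as an int, or raise ValueError with the message from the spec."""
    # fullmatch anchors both ends, so a trailing newline or an appended
    # shell fragment such as "8; rm -rf /" is rejected outright.
    if not re.fullmatch(r"[0-9]+", raw) or int(raw) == 0:
        raise ValueError("nproc_per_node must be a positive integer.")
    return int(raw)
```

Using `re.fullmatch` instead of `re.match` with `^...$` matters in Python, where `$` also matches just before a trailing newline, so `"8\n"` would otherwise pass.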
Verified: the symlink target is correct (mode 120000, relative), and the hardcoded config path examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml exists.
Remaining (non-blocking):
- 1 SUGGESTION: a stale code comment in `mip_progress.py:53` still references `sweep.py:292`, a marker the code no longer uses.
Risk: Low. Self-contained, experimental, opt-in tooling with no impact on the core optimization library or existing user workflows.
Signed-off-by: Johannes Rausch <jrausch@nvidia.com>
I've re-run the PR branch and added some changes to resolve friction that I ran into here: https://github.com/NVIDIA/Model-Optimizer/tree/jrausch/pr1769-review
Signed-off-by: Daniel Korzekwa <dkorzekwa@nvidia.com>
What does this PR do?
Type of change: new feature
Experimental Claude skill for the Puzzletron compression algorithm. See `.agents/skills/puzzletron/README.md` for details.
Usage
See `.agents/skills/puzzletron/README.md`.
Testing
Before your PR is "Ready for review"
Summary by CodeRabbit
Release Notes
New Features
- `/puzzletron mip` and `/puzzletron all` to run the MIP step or the full pipeline.
- `/puzzletron mip progress` and `/puzzletron all progress` reporting with per-step (and per-rate for MIP) elapsed time, completion state, estimated remaining time, and results path when available.
Documentation
Tests
Chores