[Fix] Make h1_locomotion demo runnable on main: migrate deprecated RSL-RL config and checkpoint by hujc7 · Pull Request #5903 · isaac-sim/IsaacLab

hujc7 · 2026-06-01T19:14:07Z

1. Summary

Fixes NVBug 6120938 (P0): ./isaaclab.sh -p scripts/demos/h1_locomotion.py crashes on main (Isaac Sim 5.1.0, rsl-rl-lib==5.0.1). Two sequential incompatibilities with rsl-rl 5.x; both fixed here.
Crash 1 — KeyError: 'class_name' at OnPolicyRunner construction: the H1 agent config uses the deprecated policy field; rsl-rl 5.x expects per-model actor/critic configs. Fixed by handle_deprecated_rsl_rl_cfg in the demo (as train.py/play.py already do).
Crash 2 — KeyError: 'actor_state_dict' at runner.load(): the published H1 checkpoint (Isaac 5.1 assets) was saved in the pre-5.0 layout (single combined model_state_dict), while rsl-rl 5.x expects separate actor_state_dict/critic_state_dict. Fixed by a new handle_deprecated_rsl_rl_checkpoint helper that remaps the old layout, applied before loading.

2. Why both are needed

The cfg migration alone unblocks construction but the demo then fails loading the old-format published checkpoint. develop's demo avoids the second crash only because its newer (Isaac 6.0) assets ship the checkpoint in the new layout — there is no conversion code there. On main (5.1 assets) the checkpoint must be converted.

3. Changes

source/isaaclab_rl/isaaclab_rl/rsl_rl/utils.py: add handle_deprecated_rsl_rl_checkpoint — remaps actor.N→actor_state_dict.mlp.N, critic.N→critic_state_dict.mlp.N, std→actor_state_dict.distribution.std_param; new-layout checkpoints and rsl-rl < 5.0 pass through unchanged; raises on unrecognized keys.
scripts/demos/h1_locomotion.py: migrate the agent config, then convert the checkpoint before ppo_runner.load.
scripts/reinforcement_learning/rsl_rl/play.py: apply the same converter to the resolved checkpoint before runner.load (covers --use_pretrained_checkpoint, which loads the same old published checkpoint; no-op for locally-trained checkpoints).
source/isaaclab_rl/test/test_rsl_rl_cfg_deprecation.py: unit tests for the converter.
isaaclab_rl version bump + changelog.

4. Verification

In an Isaac Sim 5.1.0 + rsl-rl-lib==5.0.1 container (the demo's GUI viewport needs a display the CI host lacks; the bug is upstream of rendering, so validated headless on the real code paths):

Demo, unmodified → KeyError: 'class_name'; cfg-only → KeyError: 'actor_state_dict'; with both fixes → the real h1_locomotion module's __init__ completes (config migrates, checkpoint converts via strict-matching load_state_dict, runner.load succeeds) and the policy steps with finite joint targets.
play.py --use_pretrained_checkpoint: unmodified → KeyError: 'actor_state_dict' at runner.load; with the converter → checkpoint converts, loads, and the policy runs.
Converter unit tests: 4 passed. ./isaaclab.sh -f clean.

5. Test plan

Container repro of both demo crashes (unmodified / cfg-only).
Real h1_locomotion module runs headless end-to-end with both fixes (init completes + inference).
play.py --use_pretrained_checkpoint FAIL→PASS with the converter.
Converter unit tests + pre-commit clean.

The H1 rough agent config defines the policy via the deprecated `policy` field (RslRlPpoActorCriticCfg). With rsl-rl-lib 5.0.1, which expects per-model `actor`/`critic` configs carrying `class_name`, constructing OnPolicyRunner raises KeyError: 'class_name' because the demo fed the un-migrated config straight to the runner. Migrate the parsed config via handle_deprecated_rsl_rl_cfg before building the runner, matching scripts/reinforcement_learning/rsl_rl train/play. The helper converts `policy` into `actor`/`critic` (class_name=MLPModel), eliminating the KeyError.

isaaclab-review-bot

🤖 Isaac Lab Review Bot — PR #5903 (Updated)

📋 Summary

This PR now addresses both deprecation issues that arise when running h1_locomotion.py with rsl-rl >= 5.0:

Config migration (original fix): Converts deprecated policy field into actor/critic configs via handle_deprecated_rsl_rl_cfg.
Checkpoint migration (new): Converts pre-5.0 combined model_state_dict checkpoints into the split actor_state_dict/critic_state_dict layout via the new handle_deprecated_rsl_rl_checkpoint.

🔬 Isaac Lab Expert

Verdict: ✅ Well-implemented checkpoint migration

The new handle_deprecated_rsl_rl_checkpoint() in isaaclab_rl/rsl_rl/utils.py:

Correctly gates on rsl-rl version (only converts when >= 5.0)
Detects old layout by checking for model_state_dict key and absence of actor_state_dict
Remaps keys properly: actor.* → mlp.* in actor dict, critic.* → mlp.* in critic dict, std → distribution.std_param
Raises ValueError for unrecognized keys (defensive, avoids silent corruption)
Writes converted checkpoint to a sibling file (_rslrl5 suffix) rather than mutating the original
Carries optimizer_state_dict, iter, and infos for completeness

The demo script integration is clean — calls handle_deprecated_rsl_rl_checkpoint(checkpoint, metadata.version("rsl-rl-lib")) right after fetching the published checkpoint, before the runner uses it.

🧪 Test Coverage Analyzer

Excellent test coverage added. TestHandleDeprecatedRslRlCheckpoint covers:

✅ Converts combined state dict with correct key remapping and value preservation
✅ Passes through new-layout checkpoints untouched
✅ Passes through when rsl-rl < 5.0 (no conversion needed)
✅ Raises ValueError on unrecognized keys

All test cases use tmp_path fixture properly and test both the path return value and the loaded content.

📦 Release Hygiene

Version bump to 0.5.2 ✅
CHANGELOG entry with clear description ✅
__init__.py exports the new public function ✅

🔍 Minor Observations (non-blocking)

The weights_only=False in torch.load is intentional (rsl-rl checkpoints may contain non-tensor objects). This matches the pattern used by rsl-rl's own OnPolicyRunner.load().
The conversion only supports MLP actor-critic architectures (documented in the ValueError message). RNN/Transformer checkpoints would need separate handling, but this matches the scope of the H1 demo which uses MLP.

🏁 Final Verdict: LGTM ✅

Comprehensive fix that addresses the full stack of deprecation issues (config + checkpoint). Clean implementation, good error handling, proper tests, correct release hygiene. Ready to merge.

📝 Update (bf96706): New commit extends handle_deprecated_rsl_rl_checkpoint to scripts/reinforcement_learning/rsl_rl/play.py. This is correct — play.py --use_pretrained_checkpoint loads published (pre-5.0) checkpoints through the same OnPolicyRunner.load path and would hit the same KeyError: 'actor_state_dict'. The fix imports the converter and applies it to resume_path just before runner.load(), with a no-op guard for new-layout checkpoints. Consistent with the pattern already used in the demo script. Previous assessment stands — LGTM ✅.

📝 Update (5f1d13d): Branch merged with main — brings in the daqp 0.8.5 pin bump (#5902, version 0.54.4, CHANGELOG, setup.py). These changes are from an unrelated already-merged PR and do not affect this PR's logic. No new concerns — LGTM ✅.

greptile-apps · 2026-06-01T19:15:57Z

Greptile Summary

This PR fixes a crash in the H1 locomotion demo caused by the H1RoughPPORunnerCfg using the deprecated policy field from rsl-rl 4.x, which OnPolicyRunner in rsl-rl 5.x no longer accepts — raising a KeyError: 'class_name' when it tries to pop that key from the absent actor dict.

Adds handle_deprecated_rsl_rl_cfg(agent_cfg, metadata.version("rsl-rl-lib")) immediately after parse_rsl_rl_cfg, exactly mirroring the steps already performed in scripts/reinforcement_learning/rsl_rl/train.py and play.py.
The import style (from importlib import metadata) differs slightly from train.py/play.py (import importlib.metadata as metadata) but is functionally identical.
No logic changes beyond the migration call; the fix is minimal and well-scoped to the demo.

Confidence Score: 5/5

Safe to merge — the change is a single, well-understood migration call that already works correctly in train.py and play.py.

The fix is minimal (3 lines), directly mirrors tested patterns from the sibling training and play scripts, and handle_deprecated_rsl_rl_cfg has dedicated unit tests for rsl-rl 5.x migration. No new logic is introduced and no existing behavior is altered.

No files require special attention.

Important Files Changed

Filename	Overview
scripts/demos/h1_locomotion.py	Adds handle_deprecated_rsl_rl_cfg call after config parsing to migrate the legacy policy field into actor/critic sub-configs, matching the pattern already used by train.py and play.py; fixes the KeyError crash with rsl-rl-lib 5.x.

Sequence Diagram

sequenceDiagram
    participant Script as h1_locomotion.py
    participant CLI as cli_args
    participant Handle as handle_deprecated_rsl_rl_cfg
    participant Runner as OnPolicyRunner (rsl-rl 5.x)

    Script->>CLI: parse_rsl_rl_cfg(TASK, args_cli)
    CLI-->>Script: agent_cfg (legacy policy field)
    Note over Script: NEW: migrate config
    Script->>Handle: handle_deprecated_rsl_rl_cfg(agent_cfg, "5.0.1")
    Handle-->>Script: agent_cfg (actor + critic with class_name)
    Script->>Runner: OnPolicyRunner(env, agent_cfg.to_dict(), ...)
    Runner-->>Script: ppo_runner (no KeyError)

_{Reviews (1): Last reviewed commit: "Fix KeyError 'class_name' in h1_locomoti..." | Re-trigger Greptile}

The cfg migration alone unblocked OnPolicyRunner construction but the demo then failed at runner.load() with KeyError: 'actor_state_dict': the published H1 checkpoint (Isaac 5.1 assets) was saved in the pre-5.0 rsl-rl layout (single combined model_state_dict), while rsl-rl 5.x expects separate actor_state_dict/critic_state_dict. Add handle_deprecated_rsl_rl_checkpoint to isaaclab_rl, which remaps the old combined state dict (actor.*/critic.*/std) into the actor/critic layout, and call it in the demo before loading. Verified end-to-end in an Isaac Sim 5.1.0 + rsl-rl 5.0.1 container: convert -> load -> policy inference produces finite actions. Adds unit tests for the converter.

play.py --use_pretrained_checkpoint loads the published (pre-5.0) checkpoint via the same OnPolicyRunner.load path, so it hit the same KeyError: 'actor_state_dict'. Apply handle_deprecated_rsl_rl_checkpoint to the resolved checkpoint before loading (no-op for new-layout checkpoints, so locally-trained checkpoints are unaffected). Verified: baseline play.py --use_pretrained_checkpoint crashes at actor_state_dict; with the converter it loads and runs.

hujc7 requested a review from ooctipus as a code owner June 1, 2026 19:14

github-actions Bot added bug Something isn't working isaac-lab Related to Isaac Lab team labels Jun 1, 2026

isaaclab-review-bot Bot reviewed Jun 1, 2026

View reviewed changes

hujc7 requested review from ClemensSchwarke and Mayankm96 as code owners June 1, 2026 20:33

hujc7 changed the title ~~[Fix] Migrate deprecated RSL-RL config in h1_locomotion.py demo to fix KeyError: 'class_name'~~ [Fix] Make h1_locomotion demo runnable on main: migrate deprecated RSL-RL config and checkpoint Jun 1, 2026

kellyguo11 approved these changes Jun 2, 2026

View reviewed changes

Merge branch 'main' into jichuanh/fix-h1-locomotion-class-name

5f1d13d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Fix] Make h1_locomotion demo runnable on main: migrate deprecated RSL-RL config and checkpoint#5903

[Fix] Make h1_locomotion demo runnable on main: migrate deprecated RSL-RL config and checkpoint#5903
hujc7 wants to merge 4 commits into
isaac-sim:mainfrom
hujc7:jichuanh/fix-h1-locomotion-class-name

hujc7 commented Jun 1, 2026 •

edited

Loading

Uh oh!

isaaclab-review-bot Bot left a comment •

edited

Loading

Uh oh!

greptile-apps Bot commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

hujc7 commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1. Summary

2. Why both are needed

3. Changes

4. Verification

5. Test plan

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

🤖 Isaac Lab Review Bot — PR #5903 (Updated)

📋 Summary

🔬 Isaac Lab Expert

🧪 Test Coverage Analyzer

📦 Release Hygiene

🔍 Minor Observations (non-blocking)

🏁 Final Verdict: LGTM ✅

Uh oh!

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

hujc7 commented Jun 1, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading