feat(configurator): add GymnasiumAdapter for CloudAI envs#930
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds a pluggable Encoding abstraction for env-parameter observation leaves, wires structured observations through CloudAIGymEnv, introduces a new GymnasiumAdapter wrapping BaseGym as a gymnasium.Env, adds a gymnasium optional dependency with lazy import support, updates public exports, and fixes a traceback re-raise in DSE job handling. ChangesDSE env-param encoding and Gymnasium integration
Estimated code review effort: 4 (Complex) | ~60 minutes Related PRs: None identified. Suggested labels: enhancement, rl, configurator Suggested reviewers: None identified. 🐰 A gym for trials, params take flight, 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)
146-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure branch breaks per-step artifact contract and observation shape consistency.
At Line 171, the early return skips
write_trajectory(...)andobserver.after_step(...)even thoughbefore_step(...)already ran at Lines 146-147. This can leaveenv.csvwith a step that is missing intrajectory.csv. It also returns a fixed[-1.0], which mismatches the dynamic observation shape introduced at Line 103 whenagent_metricshas more than one metric.🔧 Proposed fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + failed_observation = [self.rewards.metric_failure] * max(len(self.test_run.test.agent_metrics), 1) + failed_reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=failed_reward, + observation=failed_observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, failed_observation, failed_reward) + return failed_observation, failed_reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 146 - 172, The constraint-failure early return at the end of the diff skips both write_trajectory and observer.after_step calls even though before_step was already invoked, breaking the per-step artifact contract and leaving the trajectory file inconsistent with env.csv. Additionally, the hardcoded [-1.0] observation return value does not match the dynamic observation shape determined by agent_metrics. To fix this, when the constraint_check fails: create a TrajectoryEntry with the current step, action, the constraint_failure reward, and the current observation from self.test_run, call write_trajectory with this entry, invoke observer.after_step with the test_run, current observation, and constraint_failure reward, then return the current observation (not the hardcoded [-1.0]), the constraint_failure reward, and the appropriate done flag.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/env_params.py`:
- Line 2: The copyright headers in both files use an invalid year format that
fails the repository's copyright header validation test. In
src/cloudai/configurator/env_params.py at line 2, change the copyright header
from "Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights
reserved." to "Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights
reserved." by removing the year range and keeping only 2026. Apply the identical
change to tests/test_env_params.py at line 2, changing from the 2024-2026 format
to just 2026 to match the repository's required copyright-year formatting policy
enforced by tests/test_check_copyright_headers.py.
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 251-253: The current implementation trusts the keys returned by
encode_observation() when building the output dictionary, which can cause
KeyError for extra keys or silently produce incomplete observations if keys are
missing. Fix this by first validating that the set of keys from the encoded
observation matches the set of keys in the descriptors dictionary, then
materialize the output by iterating through descriptors keys instead of
encoded.items(), ensuring all required descriptor keys are present and properly
coerced without relying on the encode_observation() output to have the correct
keys.
- Around line 206-207: The step() method's action parameter is typed as
dict[str, int] but the implementation and tests show it needs to accept
dict[str, Any] to handle both integer and continuous numpy array values that are
passed to decode_action(). Change the type annotation of the action parameter in
the step() method signature from dict[str, int] to dict[str, Any] to match what
decode_action() expects and what the tests actually pass to it.
In `@tests/test_action_space.py`:
- Around line 43-50: These negative-validation tests intentionally pass invalid
arguments to verify runtime validation rejects them, but this causes type
checker errors. Use typing.cast() to suppress these violations at the affected
sites. In tests/test_action_space.py lines 43-50, wrap the invalid dtype literal
"double" with cast(Any, "double") in the
test_continuous_space_rejects_unknown_dtype function, and wrap the entire
ContinuousSpace constructor call with cast(dict[str, Any], {...}) to suppress
the extra step parameter in test_continuous_space_forbids_extra_fields.
Similarly, in tests/test_env_params.py lines 142-149, apply cast(Any,
"categorical") for the invalid kind literal and cast(dict[str, Any], {...}) for
the constructor call containing the unexpected extra field.
---
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 146-172: The constraint-failure early return at the end of the
diff skips both write_trajectory and observer.after_step calls even though
before_step was already invoked, breaking the per-step artifact contract and
leaving the trajectory file inconsistent with env.csv. Additionally, the
hardcoded [-1.0] observation return value does not match the dynamic observation
shape determined by agent_metrics. To fix this, when the constraint_check fails:
create a TrajectoryEntry with the current step, action, the constraint_failure
reward, and the current observation from self.test_run, call write_trajectory
with this entry, invoke observer.after_step with the test_run, current
observation, and constraint_failure reward, then return the current observation
(not the hardcoded [-1.0]), the constraint_failure reward, and the appropriate
done flag.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: b6827577-13bc-423b-ab43-f10df882a769
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (17)
pyproject.tomlsrc/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/base_agent.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/configurator/gymnasium_adapter.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_gymnasium_adapter_contract.pytests/test_handlers.pytests/test_test_scenario.py
49f5b43 to
707c038
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (1)
169-171:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure path breaks step-alignment and observation-shape contracts
At Line 169, the early return bypasses trajectory writing and
after_stepcallbacks (afterbefore_stepalready persisted env params), which can desynchronizeenv.csvandtrajectory.csv. It also returns a fixed[-1.0], which mismatches the new metric-sized observation shape whenagent_metricshas length > 1.💡 Suggested localized fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + observation = [-1.0] * max(len(self.test_run.test.agent_metrics), 1) + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 169 - 171, The constraint-failure path in the step function contains an early return that bypasses trajectory writing and after_step callbacks, which creates desynchronization between env.csv and trajectory.csv, and also returns a fixed observation shape of [-1.0] that does not match the expected observation size when agent_metrics has length greater than one. Instead of returning early when the constraint_check fails, set the appropriate constraint_failure reward and done flag, then allow the function to continue to the normal step completion flow to ensure trajectory writing and after_step callbacks are executed, and construct the observation array to match the correct shape based on the actual agent_metrics size.src/cloudai/configurator/base_agent.py (1)
91-92:⚠️ Potential issue | 🟠 MajorFix
select_actionreturn type to align with therun()loop's termination contract.Line 144 checks
if result is None:to break the loop, but the abstract signature on Line 91 declaresselect_actionreturnstuple[int, dict[str, Any]](non-optional). This contract mismatch violates the expected termination protocol: implementations that follow the strict signature will never return None, but the loop expects them to.Suggested fix
- def select_action(self, observation: list[float] | None = None) -> tuple[int, dict[str, Any]]: + def select_action( + self, observation: list[float] | None = None + ) -> tuple[int, dict[str, Any]] | None:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/base_agent.py` around lines 91 - 92, The abstract method select_action on line 91 declares a return type of tuple[int, dict[str, Any]] (non-optional), but the run() method's loop on line 144 checks if result is None to break, creating a contract mismatch. Update the return type annotation of the select_action method to be tuple[int, dict[str, Any]] | None to allow implementations to return None as a termination signal, aligning the abstract signature with the loop's termination protocol.
♻️ Duplicate comments (2)
src/cloudai/configurator/gymnasium_adapter.py (2)
251-252:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEnforce structured-observation key parity before materialization.
The structured path currently trusts
encode_observation()keys. Extra keys can throwKeyError; missing keys can silently produce partial observations. Validate key sets first and build output from descriptor keys.Proposed fix
env = cast(StructuredObservation, self._env) encoded = env.encode_observation(list(obs)) - return {name: self._leaf_to_value(descriptors[name], leaf) for name, leaf in encoded.items()} + self._assert_keys(encoded.keys(), set(descriptors), "encoded observation") + return {name: self._leaf_to_value(descriptors[name], encoded[name]) for name in descriptors}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 251 - 252, The current implementation iterates over the keys returned by encode_observation() without validating that they match the expected descriptor keys, which can cause KeyError if extra keys are present or silently produce partial observations if keys are missing. Validate that the keys in the encoded result match the keys in the descriptors dictionary before materializing the output, then build the return dictionary by iterating over descriptor keys (rather than encoded keys) to ensure all required keys are present and handled correctly in the _leaf_to_value call.
206-207:⚠️ Potential issue | 🟠 Major | ⚡ Quick winWiden
step()action typing to match actual accepted payloads.
step()is typed asdict[str, int], but this method forwards todecode_action()which accepts continuous Box payloads (e.g., numpy arrays). The current signature is narrower than real behavior and will keep type-checking failures on valid call sites.Proposed fix
- def step(self, action: dict[str, int]) -> tuple[Any, float, bool, bool, dict[str, Any]]: + def step(self, action: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]: params = {**self._fixed_params, **self.decode_action(action)} return self._step_with_params(params)#!/bin/bash set -euo pipefail # Verify the current step signature. rg -nP 'def step\(self,\s*action:\s*dict\[str,\s*int\]\)' src/cloudai/configurator/gymnasium_adapter.py # Verify continuous payload usage in tests (numpy array passed to adapter.step()). rg -n -C2 'adapter\.step\(\{' tests/test_gymnasium_adapter_contract.py rg -n -C2 'np\.array\(' tests/test_gymnasium_adapter_contract.py🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 206 - 207, The step() method signature has an action parameter typed as dict[str, int], which is too restrictive. The method actually forwards to decode_action() which accepts continuous Box payloads including numpy arrays, but the current typing prevents valid callers from passing these payloads without type-checking errors. Widen the action parameter type annotation in the step() method to accept the broader range of payload types that decode_action() actually handles, such as numpy arrays and other gymnasium-compatible action formats.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/cli/handlers.py`:
- Around line 166-180: When re-raising an exception outside its except block
using `raise run_error` on line 180, Python rebinds the traceback context to the
new raise site, obscuring the original error frame. To preserve the original
traceback, restructure the code to use a bare `raise` statement inside the
except block where the exception is caught, or if the code structure requires
deferred raising, save the exception with its traceback using `sys.exc_info()`
and restore it when re-raising to maintain the original crash context during
debugging.
---
Outside diff comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 91-92: The abstract method select_action on line 91 declares a
return type of tuple[int, dict[str, Any]] (non-optional), but the run() method's
loop on line 144 checks if result is None to break, creating a contract
mismatch. Update the return type annotation of the select_action method to be
tuple[int, dict[str, Any]] | None to allow implementations to return None as a
termination signal, aligning the abstract signature with the loop's termination
protocol.
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 169-171: The constraint-failure path in the step function contains
an early return that bypasses trajectory writing and after_step callbacks, which
creates desynchronization between env.csv and trajectory.csv, and also returns a
fixed observation shape of [-1.0] that does not match the expected observation
size when agent_metrics has length greater than one. Instead of returning early
when the constraint_check fails, set the appropriate constraint_failure reward
and done flag, then allow the function to continue to the normal step completion
flow to ensure trajectory writing and after_step callbacks are executed, and
construct the observation array to match the correct shape based on the actual
agent_metrics size.
---
Duplicate comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 251-252: The current implementation iterates over the keys
returned by encode_observation() without validating that they match the expected
descriptor keys, which can cause KeyError if extra keys are present or silently
produce partial observations if keys are missing. Validate that the keys in the
encoded result match the keys in the descriptors dictionary before materializing
the output, then build the return dictionary by iterating over descriptor keys
(rather than encoded keys) to ensure all required keys are present and handled
correctly in the _leaf_to_value call.
- Around line 206-207: The step() method signature has an action parameter typed
as dict[str, int], which is too restrictive. The method actually forwards to
decode_action() which accepts continuous Box payloads including numpy arrays,
but the current typing prevents valid callers from passing these payloads
without type-checking errors. Widen the action parameter type annotation in the
step() method to accept the broader range of payload types that decode_action()
actually handles, such as numpy arrays and other gymnasium-compatible action
formats.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 06fc2119-7c97-4dff-8fe0-9bcea50b5f90
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (17)
pyproject.tomlsrc/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/base_agent.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/configurator/gymnasium_adapter.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_gymnasium_adapter_contract.pytests/test_handlers.pytests/test_test_scenario.py
|
@coderabbitai review |
✅ Action performedReview finished.
|
707c038 to
0dfac69
Compare
|
@coderabbitai review |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)
146-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConstraint-failure early return breaks
env.csv↔trajectory.csvstep alignment
observer.before_step(...)runs at Line 147 (andEnvParamsObserverwritesenv.csv), but on Line 169 the constraint-failure branch returns at Line 171 without writing a trajectory row or firingafter_step. This creates orphanenv.csvrows for failed trials and breaks the 1:1 step-merge contract.💡 Proposed fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + observation = [-1.0] + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 146 - 172, The constraint-failure return path does not maintain symmetry with the successful step path: while observer.before_step() is called at the start, the early return when constraint_check() fails skips both writing a TrajectoryEntry and firing observer.after_step(), creating orphan entries in the env.csv file. To fix this, in the constraint-failure branch (after the constraint_check call), add a write_trajectory() call with a TrajectoryEntry containing the current step, action, reward (use self.rewards.constraint_failure), observation, and env_params, and then call observer.after_step() with the appropriate parameters before returning, mirroring the pattern used in the cached_result branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 146-172: The constraint-failure return path does not maintain
symmetry with the successful step path: while observer.before_step() is called
at the start, the early return when constraint_check() fails skips both writing
a TrajectoryEntry and firing observer.after_step(), creating orphan entries in
the env.csv file. To fix this, in the constraint-failure branch (after the
constraint_check call), add a write_trajectory() call with a TrajectoryEntry
containing the current step, action, reward (use
self.rewards.constraint_failure), observation, and env_params, and then call
observer.after_step() with the appropriate parameters before returning,
mirroring the pattern used in the cached_result branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: d5a94f42-f6db-481a-bc1b-808fd6d07a4e
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (13)
pyproject.tomlsrc/cloudai/_core/action_space.pysrc/cloudai/_core/test_scenario.pysrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/configurator/gymnasium_adapter.pysrc/cloudai/core.pysrc/cloudai/models/workload.pytests/test_action_space.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_gymnasium_adapter_contract.py
✅ Action performedReview finished.
|
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
17bea13 to
29aaabe
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)
169-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHandle constraint-failure trials through the same recording path.
Line 171 returns a hardcoded one-element observation and exits before
write_trajectory/observer.after_step. With env params enabled,before_stephas already writtenenv.csv, so this path breaksenv.csv↔trajectory.csvstep alignment and can return an observation shape inconsistent withdefine_observation_space()when multiple metrics are configured.💡 Proposed fix
if not self.test_run.test.constraint_check(self.test_run, self.runner.system): logging.info("Constraint check failed. Skipping step.") - return [-1.0], self.rewards.constraint_failure, True, {} + observation = [self.rewards.metric_failure] * max(len(self.test_run.test.agent_metrics), 1) + reward = self.rewards.constraint_failure + self.write_trajectory( + TrajectoryEntry( + step=self.test_run.step, + action=action, + reward=reward, + observation=observation, + env_params=dict(self.test_run.current_env_params), + ) + ) + for observer in self.observers: + observer.after_step(self.test_run, observation, reward) + return observation, reward, True, {}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 169 - 172, The constraint-failure early return at line 171 bypasses the trajectory recording path (write_trajectory and observer.after_step), causing misalignment between env.csv and trajectory.csv when environment parameters are enabled. Additionally, the hardcoded observation [-1.0] may not match the shape defined by define_observation_space() when multiple metrics are configured. Instead of returning early when test_run.test.constraint_check() fails, route this case through the same recording and observation logic as successful steps by calling write_trajectory and observer.after_step before returning, and ensure the returned observation matches the shape defined by define_observation_space() rather than using a hardcoded single-element list.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 191-195: The _decode_continuous method silently truncates
multi-value inputs by flattening the array and taking only the first element at
index zero, which can cause incorrect parameter values to be processed. Add
validation after reshaping the input to check that it contains exactly one
element, and raise a ValueError with a descriptive message if the array size is
not one. This validation should occur before the clamping logic to fail fast on
malformed inputs.
---
Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 169-172: The constraint-failure early return at line 171 bypasses
the trajectory recording path (write_trajectory and observer.after_step),
causing misalignment between env.csv and trajectory.csv when environment
parameters are enabled. Additionally, the hardcoded observation [-1.0] may not
match the shape defined by define_observation_space() when multiple metrics are
configured. Instead of returning early when test_run.test.constraint_check()
fails, route this case through the same recording and observation logic as
successful steps by calling write_trajectory and observer.after_step before
returning, and ensure the returned observation matches the shape defined by
define_observation_space() rather than using a hardcoded single-element list.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: c94cbeeb-1753-4223-b398-49305c0a7939
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
pyproject.tomlsrc/cloudai/_core/action_space.pysrc/cloudai/cli/handlers.pysrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/configurator/gymnasium_adapter.pysrc/cloudai/core.pytests/test_action_space.pytests/test_env_params.pytests/test_gymnasium_adapter_contract.pytests/test_handlers.py
de02b99 to
67c7178
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
6634765 to
d9c9e14
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
d9c9e14 to
77e79a3
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
77e79a3 to
c19ae2a
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
c19ae2a to
463da25
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
75b0f5a to
d914d74
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
d914d74 to
0417e79
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
0417e79 to
4f479dd
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
4f479dd to
6247722
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
6247722 to
49c1bbf
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
49c1bbf to
3b72447
Compare
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
- Drop the speculative `cloudai_env` property: it had no in-repo consumer (every adapter holder also holds the raw env), and exposing the adaptee leaks the abstraction. A narrow forwarding method can be added later if a real need appears. - Rename the `StructuredObservation` protocol to `StructuredObservationProducer`: it describes an env that *produces* structured observations, not an observation. Narrow with `isinstance` against the `@runtime_checkable` protocol at both adapter sites instead of an opaque `cast`; the runtime guarantee is now visible at the call site. No inheritance is added to `CloudAIGymEnv` (it is not an "is-a" StructuredObservationProducer). - Switch the adapter's sibling imports (`BaseGym`, `StructuredObservationProducer`) to relative imports, matching the configurator package convention and avoiding a cycle (the adapter is imported by `cloudai.core`).
3b72447 to
f783e97
Compare
step() increments test_run.step before it samples/runs, so the next trial index is step + 1. Extract that offset into a read-only CloudAIGymEnv.upcoming_trial property and use it in step(), so the "+1" is defined in exactly one place. Behavior-preserving.
CloudAIGymEnv keeps its flat (metrics) observation and reports the per-trial env_param regime on the Gym info dict under info["env_params"] (the key is present only when a regime was applied, so its presence alone signals a non-empty, valid regime). reset() peeks upcoming_trial to report the regime step() will apply next. Adds the producer hooks a consumer needs to build a structured observation: encode_env_params() and structured_observation_descriptors(), plus the Encoding/CategoricalEncoding stack and the StructuredObservation protocol. The GymnasiumAdapter (separate PR) composes the flat metrics with the queried regime into a spaces.Dict, so every RL impl gets the structured obs.
…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.
- Drop the speculative `cloudai_env` property: it had no in-repo consumer (every adapter holder also holds the raw env), and exposing the adaptee leaks the abstraction. A narrow forwarding method can be added later if a real need appears. - Rename the `StructuredObservation` protocol to `StructuredObservationProducer`: it describes an env that *produces* structured observations, not an observation. Narrow with `isinstance` against the `@runtime_checkable` protocol at both adapter sites instead of an opaque `cast`; the runtime guarantee is now visible at the call site. No inheritance is added to `CloudAIGymEnv` (it is not an "is-a" StructuredObservationProducer). - Switch the adapter's sibling imports (`BaseGym`, `StructuredObservationProducer`) to relative imports, matching the configurator package convention and avoiding a cycle (the adapter is imported by `cloudai.core`).
f783e97 to
5e6a87c
Compare
Wrap a CloudAI BaseGym as a gymnasium.Env-shaped object: a spaces.Dict of Discrete (list params) and Box (ContinuousSpace) actions over the tunable params with fixed (single-value) params injected each step; observations as either a flat float32 Box or, when the env opts in via the structured-obs hooks, a spaces.Dict of per-leaf ObsLeafDescriptor subspaces. Continuous dtype="int" params are quantized (rounded/clamped) at decode_action so the trajectory cache key collapses float jitter. The adapter is a pure pass-through over test_run.step (never mutates it), so contextual-bandit rollouts that reset() per trial keep a monotonic trial index. gymnasium is an optional dependency lazy-imported behind the new [rl] extra (also added to dev); CloudAIGymEnv.define_observation_space() now returns one slot per agent metric so adapters get the right Box shape. Exported via cloudai.core. Caller-contract tests pin the step-monotonicity, observation pass-through, continuous-quantization, and structured-obs invariants.
f81c387 to
3c4e130
Compare
|
@coderabbitai full review |
✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 60-62: The same gymnasium~=1.2 dependency is pinned in both the
dev and rl extras, creating duplication that can drift out of sync. Update
pyproject.toml so the gymnasium pin is defined in one place and reused by both
extras, and verify the dev and rl extra definitions stay aligned through their
shared dependency reference.
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 90-95: The action-space parsing in GymnasiumAdapter currently
drops any raw_action_space entries that are not lists, so future non-list
parameter types can disappear silently before _step_with_params. Update the
initializer logic around _discrete_params and _fixed_params to explicitly handle
or reject non-list entries, or add a clear guard/comment in GymnasiumAdapter so
unsupported types are surfaced intentionally instead of being omitted without
notice.
- Around line 238-245: The structured observation path in the gymnasium adapter
is directly indexing info["env_params"], which can raise an opaque KeyError
instead of following the file’s descriptive validation pattern. In the method
that builds the observation/context payload, first validate that info contains
"env_params" using the existing _assert_keys helper (or equivalent descriptive
check), then only call StructuredObservationProducer.encode_env_params and
continue with the existing descriptor/key validation. Keep the behavior
consistent with the surrounding configurator logic by surfacing a clear
ValueError for missing env_params.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: a8e4dce0-93ce-476f-abdb-345080f8466c
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (11)
pyproject.tomlsrc/cloudai/cli/handlers.pysrc/cloudai/configurator/__init__.pysrc/cloudai/configurator/cloudai_gym.pysrc/cloudai/configurator/env_params.pysrc/cloudai/configurator/gymnasium_adapter.pysrc/cloudai/core.pysrc/cloudai/util/lazy_imports.pytests/test_cloudaigym.pytests/test_env_params.pytests/test_gymnasium_adapter_contract.py
Issue
gymnasium.Env-shaped view of a CloudAIBaseGym; there is no upstream adapter, and a flat[0.0]observation gives adapters the wrongBoxshape.Fix
GymnasiumAdapter(configurator):spaces.DictofDiscrete(list) +Box(ContinuousSpace) actions with fixed params injected per step; flat-Boxor structuredspaces.Dict(per-leafObsLeafDescriptor) observations;dtype="int"continuous actions quantized atdecode_action. Pure pass-through overtest_run.step(never mutated) so contextual-banditreset()-per-trial keeps a monotonic trial index.gymnasiumis lazy-imported behind a new[rl]extra;define_observation_space()now sizes by agent metrics. Exported viacloudai.core.Testing
tests/test_gymnasium_adapter_contract.py: caller-contract tests for step-monotonicity (within/across episodes), observation pass-through, continuous quantization/clamping, and the structured-obs gate. ruff + pyright + vulture + import-linter clean; 108 related tests pass.Stack: #901 ← #927 (ContinuousSpace) ← #928 (ObsLeafDescriptor) ← this. Final cloudai-side PR of the gymnasium-adapter upstreaming; consumes both primitives.