fix: ModelBuilder.deploy() should expose DataCacheConfig and other CreateInferenceComponent parameters (5750) #5753
Conversation
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds imports and two helper methods for resolving DataCacheConfig and ContainerSpecification, but is critically incomplete — the deploy() and _deploy_core_endpoint() methods are never actually modified to accept or wire through the new parameters. The helper methods also lack type annotations and the overall change doesn't achieve the stated goal.
```python
return None
```

```python
from sagemaker.core.shapes import InferenceComponentDataCacheConfig
```
Minor: The dict-to-object conversion silently defaults enable_caching to False if the key is missing. Consider raising a ValueError if the dict doesn't contain the required enable_caching key, since the error message already states it must have that key. Silent defaults can mask user mistakes, which violates the core tenet of making it hard for users to make mistakes.
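A minimal sketch of the stricter conversion the comment asks for. The real helper is a method on ModelBuilder; the standalone function and dataclass below are illustrative stand-ins, not the PR's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class InferenceComponentDataCacheConfig:
    # Stand-in for the sagemaker.core.shapes class of the same name.
    enable_caching: bool


def resolve_data_cache_config(config):
    """Accept a shape object, a dict, or None; reject dicts missing the key."""
    if config is None or isinstance(config, InferenceComponentDataCacheConfig):
        return config
    if isinstance(config, dict):
        if "enable_caching" not in config:
            # Fail loudly instead of silently defaulting to False.
            raise ValueError(
                "data_cache_config dict must contain an 'enable_caching' key"
            )
        return InferenceComponentDataCacheConfig(
            enable_caching=config["enable_caching"]
        )
    raise TypeError(f"Unsupported data_cache_config type: {type(config)!r}")
```

Raising here surfaces a misspelled or missing key at the call site rather than deploying with caching quietly disabled.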
🤖 Iteration #1 — Review Comments Addressed
Description: Expose additional…
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName parameters to ModelBuilder.deploy(). The resolver methods and tests are well-structured, but there are several issues: missing tests for the core wiring logic in _deploy_core_endpoint, a line length violation, missing from __future__ import annotations, and the DataCacheConfig dict serialization doesn't forward all possible fields from the shape.
```python
container_dict = {}
if hasattr(resolved_container, "image") and resolved_container.image:
    container_dict["Image"] = resolved_container.image
if hasattr(resolved_container, "artifact_url") and resolved_container.artifact_url:
```
Using hasattr checks on a Pydantic BaseModel (which InferenceComponentContainerSpecification likely is) is unnecessary — the attributes are always present (possibly None). Simplify to:
```python
container_dict = {}
if resolved_container.image:
    container_dict["Image"] = resolved_container.image
if resolved_container.artifact_url:
    container_dict["ArtifactUrl"] = resolved_container.artifact_url
if resolved_container.environment:
    container_dict["Environment"] = resolved_container.environment
```

This is cleaner and more idiomatic for Pydantic models.
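The point generalizes beyond Pydantic: for any model type whose declared fields always exist as attributes (possibly None), hasattr is unconditionally True and only truthiness matters. A stdlib-only demonstration with a dataclass stand-in (field names mirror the PR, the class itself is illustrative):

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ContainerSpec:
    # Illustrative stand-in for InferenceComponentContainerSpecification;
    # like a Pydantic model, every declared field exists as an attribute.
    image: Optional[str] = None
    artifact_url: Optional[str] = None
    environment: Optional[Dict[str, str]] = None


spec = ContainerSpec(image="123.dkr.ecr.us-west-2.amazonaws.com/my-image:latest")

# hasattr is always True for declared fields, so truthiness checks suffice.
assert all(hasattr(spec, f) for f in ("image", "artifact_url", "environment"))

container_dict = {}
if spec.image:
    container_dict["Image"] = spec.image
if spec.artifact_url:
    container_dict["ArtifactUrl"] = spec.artifact_url
if spec.environment:
    container_dict["Environment"] = spec.environment
```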
🤖 Iteration #2 — Review Comments Addressed
Description: Expose additional…
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). The approach is reasonable but has several issues: line length violations, inconsistent handling of variant_name=None vs not provided, missing type annotations on the deploy method's return path, and the resolver methods are placed in the wrong file (model_builder_utils.py) but used in model_builder.py without proper wiring.
```python
resolved_cache_config = self._resolve_data_cache_config(ic_data_cache_config)
if resolved_cache_config is not None:
    cache_dict = {"EnableCaching": resolved_cache_config.enable_caching}
    # Forward any additional fields from the shape as they become available
```
The DataCacheConfig is being manually serialized to a dict ({"EnableCaching": ...}), but the spec dict already uses PascalCase API keys. Consider whether create_inference_component expects the Pydantic shape object directly (as sagemaker-core typically handles serialization) rather than a manually constructed dict. If the session method handles serialization, passing the resolved InferenceComponentDataCacheConfig object directly would be more robust and future-proof as new fields are added to the shape.
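A stdlib-only sketch of why object pass-through is more future-proof. A generic snake_case-to-PascalCase serializer (roughly what sagemaker-core does for shape objects) picks up new fields automatically, whereas the hand-built `{"EnableCaching": ...}` dict would silently drop them. The `cache_ttl_seconds` field is hypothetical, added only to show the effect:

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataCacheConfig:
    # Stand-in for InferenceComponentDataCacheConfig; field names are assumptions.
    enable_caching: bool
    # A hypothetical field added to the shape later:
    cache_ttl_seconds: Optional[int] = None


def to_api_dict(shape):
    """Generic snake_case -> PascalCase serialization over declared fields."""
    out = {}
    for f in fields(shape):
        value = getattr(shape, f.name)
        if value is not None:
            out["".join(part.title() for part in f.name.split("_"))] = value
    return out


cfg = DataCacheConfig(enable_caching=True, cache_ttl_seconds=300)
# The generic path forwards the new field; a hard-coded
# {"EnableCaching": cfg.enable_caching} dict would drop it.
print(to_api_dict(cfg))  # → {'EnableCaching': True, 'CacheTtlSeconds': 300}
```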
🤖 Iteration #3 — Review Comments Addressed
Description: Expose additional…
Changes Made
sagemaker-bot
left a comment
🤖 AI Code Review
This PR adds support for DataCacheConfig, BaseInferenceComponentName, Container, and VariantName parameters to ModelBuilder.deploy(). While the feature is useful, there are several issues: a large block of validation logic was accidentally deleted from _deploy_core_endpoint, there's significant code duplication between deploy and update paths, the new parameters in deploy() are added before **kwargs breaking convention, and line length violations exist.
```python
mock_pv.assert_called_once()
pv_kwargs = mock_pv.call_args
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg
```
The test file ends abruptly with an incomplete assertion and a backslash continuation. This line:
```python
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant" or \
    (len(pv_kwargs.args) > 0 and False)  # variant_name is always a kwarg
```

The `or` branch always evaluates to False, making it dead code. Simplify to:

```python
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant"
```

Also, the test file appears to be missing tests for _update_inference_component wiring, and there are no tests for the deploy() method's parameter forwarding logic (the kwargs population at line 4213).
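A self-contained illustration of why the simplified assertion is sufficient (not the PR's actual test; the mock call stands in for the patched deploy path). When the target is invoked with `variant_name` as a keyword, `call_args.kwargs` is where it lands and `call_args.args` stays empty:

```python
from unittest.mock import MagicMock

# Stand-in for the patched downstream call in the real test.
mock_pv = MagicMock()
mock_pv(endpoint_name="ep-name", variant_name="CustomVariant")

mock_pv.assert_called_once()
pv_kwargs = mock_pv.call_args
assert pv_kwargs.kwargs.get("variant_name") == "CustomVariant"
# Positional args are empty, so the original `or` branch could never be taken.
assert pv_kwargs.args == ()
```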
🤖 Iteration #4 — Review Comments Addressed
Description: Expose additional…
Changes Made
🤖 Iteration #5 — Review Comments Addressed
Description: Expose additional…
Changes Made
Two fixes needed:
- Bug: `variant_name` always overrides the model customization default. In `deploy()`, `kwargs["variant_name"] = variant_name or "AllTraffic"` always sets the key, so `_deploy_model_customization` never sees `None` and its backward-compat default of `endpoint_name` is dead code. Fix: only forward `variant_name` when explicitly provided:
```python
# Replace:
kwargs["variant_name"] = variant_name or "AllTraffic"
# With:
if variant_name is not None:
    kwargs["variant_name"] = variant_name
```

Each downstream path already has its own default — `_deploy_core_endpoint` defaults to `"AllTraffic"` via `kwargs.get("variant_name", "AllTraffic")`, and `_deploy_model_customization` defaults to `endpoint_name` via `variant_name or endpoint_name or "AllTraffic"`.
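The bug can be reproduced with a stdlib-only sketch. The function names mirror the PR's call chain but the bodies are illustrative; `downstream` plays the role of `_deploy_model_customization` with its backward-compat default:

```python
def downstream(endpoint_name="my-endpoint", variant_name=None):
    # Mirrors _deploy_model_customization's backward-compat default chain.
    return variant_name or endpoint_name or "AllTraffic"


def deploy_buggy(variant_name=None, **kwargs):
    kwargs["variant_name"] = variant_name or "AllTraffic"  # always sets the key
    return downstream(**kwargs)


def deploy_fixed(variant_name=None, **kwargs):
    if variant_name is not None:  # forward only when explicitly provided
        kwargs["variant_name"] = variant_name
    return downstream(**kwargs)


print(deploy_buggy())  # AllTraffic  (endpoint_name default is dead code)
print(deploy_fixed())  # my-endpoint (downstream default applies)
```

With the fix, an explicit `variant_name` still wins in both paths; only the unset case changes.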
- Drop the second integ test (test_deploy_with_data_cache_config_via_model_customization_path). The model customization path requires ml.g5.4xlarge which has a non-adjustable account quota of 2 instances. When CI runs tests in parallel, this test competes with the existing test_model_customization_deployment.py for the same quota, causing flaky InsufficientInstanceCapacity failures. The model customization path's data_cache_config and variant_name wiring is already covered by unit tests. Keep only the first integ test (test_deploy_with_data_cache_config_and_variant_name_via_ic_path) which uses ml.g5.2xlarge.
Also remove the TRAINING_JOB_NAME constant since it's no longer needed.
Do not worry about CI failures! Removing the second integ test will fix one failure; the other failures are due to flakiness.
🤖 Iteration #1 — Review Comments Addressed
Description: Expose additional…
Changes Made
Description
The issue requests exposing additional CreateInferenceComponent API parameters through ModelBuilder.deploy(), primarily DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName. The _deploy_core_endpoint method in model_builder.py builds InferenceComponentSpecification but does not pass through these parameters. The sagemaker.core.shapes module already has InferenceComponentDataCacheConfig and related shapes. The fix requires: (1) adding new optional parameters to the deploy() method and _deploy_core_endpoint(), (2) wiring those parameters into the InferenceComponentSpecification and InferenceComponent.create() call, and (3) making variant_name configurable instead of hardcoded to 'AllTraffic'. The deploy wrappers in model_builder_servers.py pass **kwargs through to _deploy_core_endpoint, so they require no changes.
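The required wiring can be sketched with stand-in classes (stdlib only; the real shapes live in sagemaker.core.shapes, and real field names may differ). The new parameters flow from the deploy call into the specification and the create kwargs, with `variant_name` no longer hardcoded:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DataCacheConfig:
    # Stand-in for sagemaker.core.shapes.InferenceComponentDataCacheConfig.
    enable_caching: bool


@dataclass
class InferenceComponentSpec:
    # Stand-in for InferenceComponentSpecification.
    model_name: str
    container: Optional[Dict[str, str]] = None
    data_cache_config: Optional[DataCacheConfig] = None


def deploy_core_endpoint(
    model_name: str,
    *,
    variant_name: Optional[str] = None,  # previously hardcoded to "AllTraffic"
    data_cache_config: Optional[DataCacheConfig] = None,
    base_inference_component_name: Optional[str] = None,
    container: Optional[Dict[str, str]] = None,
):
    spec = InferenceComponentSpec(
        model_name=model_name,
        container=container,
        data_cache_config=data_cache_config,
    )
    create_kwargs = {
        "specification": spec,
        # Core-endpoint default, as _deploy_core_endpoint does.
        "variant_name": variant_name or "AllTraffic",
    }
    if base_inference_component_name is not None:
        create_kwargs["base_inference_component_name"] = base_inference_component_name
    # In the real fix these kwargs feed InferenceComponent.create(...).
    return create_kwargs
```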
Related Issue
Related issue: 5750
Changes Made
sagemaker-serve/src/sagemaker/serve/model_builder.py
sagemaker-serve/src/sagemaker/serve/model_builder_utils.py

AI-Generated PR
This PR was automatically generated by the PySDK Issue Agent.
Merge Checklist
`prefix: description` format