Skip to content

[core] feat: refactor validate dependency#609

Open
sayakpaul wants to merge 4 commits into
mainfrom
dep-validation-refactor
Open

[core] feat: refactor validate dependency#609
sayakpaul wants to merge 4 commits into
mainfrom
dep-validation-refactor

Conversation

@sayakpaul
Copy link
Copy Markdown
Member

This PR refactors how we validate dependencies:

  • Fetches the metadata.json before doing the expensive snapshot_download(). This allows us to error out early so that the user doesn't have to wait for the error after the build variant download is completed.
  • Propagates related changes to get_local_kernel() as well.

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Coverage report — kernels/

Measured on: Python 3.10 / Torch 2.12.0.
Other CI configurations are not included in this number.
Hardware-gated code paths (ROCm/XPU/NPU/Darwin/Windows) are excluded or unreachable on the Linux+CUDA runner.

Total coverage: 83.4% — threshold: 80% — ✅

Per-file breakdown
Name Stmts Miss Cover Missing
src/kernels/__init__.py 10 0 100%
src/kernels/_system.py 6 1 83% 10
src/kernels/_versions.py 63 7 89% 46, 49, 52-53, 56-57, 100
src/kernels/backends.py 194 55 72% 40, 44, 48-51, 68, 90, 108, 117, 121, 125-127, 148, 170, 181, 188-191, 201, 205-225, 233, 256-276
src/kernels/compat.py 8 1 88% 5
src/kernels/deps.py 54 4 93% 58-59, 79, 95
src/kernels/layer/__init__.py 6 0 100%
src/kernels/layer/_interval_tree.py 103 4 96% 23, 52, 147, 150
src/kernels/layer/device.py 48 14 71% 42, 47-49, 91, 96-98, 101, 149, 152, 155-157
src/kernels/layer/func.py 93 7 92% 72, 100, 154, 257, 263, 272, 290
src/kernels/layer/globals.py 5 0 100%
src/kernels/layer/kernelize.py 73 8 89% 255, 273, 281-282, 288, 292, 308-310
src/kernels/layer/layer.py 174 15 91% 166, 209, 215, 228, 320-321, 333, 342, 350, 361, 390, 394, 407, 460, 490
src/kernels/layer/mode.py 14 0 100%
src/kernels/layer/repos.py 130 34 74% 27, 33, 36-41, 61-62, 68, 71-74, 88, 92, 101-102, 108, 111-114, 121-122, 128, 131-134, 141-142, 148, 151-154, 235
src/kernels/lockfile.py 69 44 36% 37-98, 102-125
src/kernels/status.py 49 2 96% 23, 81
src/kernels/utils.py 298 60 80% 59, 71-75, 81-82, 211, 215, 218, 280, 288, 325-326, 364, 393, 398, 432, 609-614, 640, 643, 645, 651, 664-665, 686-695, 699-706, 714, 718-728, 732-739, 777, 781, 800, 802
src/kernels/variants.py 262 19 93% 56, 87, 108, 138, 247-248, 289, 291, 371-378, 384-390, 421-427, 439-445, 534-536
TOTAL 1659 275 83%

Updated by the Test kernels workflow on commit de5f1b1803b725adab7a301115d312bcda26c545.

@sayakpaul sayakpaul marked this pull request as ready for review June 2, 2026 08:27
Copy link
Copy Markdown
Member

@danieldk danieldk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what the solution is without downloading, but the dependency verification needs to be at load/import time. The issue is that downloads, getting kernel paths, etc. can be done for other reasons (e.g. predownloading a kernel, populating a Docker container, kernel validation, etc.) where we cannot assume that the dependencies are in the environment (yet).

Comment thread kernels/src/kernels/utils.py Outdated
Comment on lines +284 to +293
# Validate Python dependencies before downloading the variant.
metadata_path = api.hf_hub_download(
repo_id,
repo_type="kernel",
filename=f"build/{variant.variant_str}/metadata.json",
cache_dir=CACHE_DIR,
revision=revision,
local_files_only=False,
)
_validate_variant_dependencies(Path(metadata_path).parent)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be in install_kernel, because it's also used to predownload kernels to e.g. bake into Docker containers (e.g. when using kernel locking), so we cannot assume that it's run in the final environment where it has access to the final set of dependencies.

Copy link
Copy Markdown
Member

@danieldk danieldk Jun 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about it, maybe in get_kernel/get_local_kernel/get_locked_kernel, since these are the two functions that actually end up loading kernels and in get_kernel we can put it before any other downloads?

Or maybe even better and closer to your PR, we can add a validate_dependencies=False option to install_kernel and then set it to True in the calls where we need validation?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commented with something similar.

Comment thread kernels/src/kernels/utils.py Outdated
)
)
return _find_kernel_in_repo_path(repo_path, variant=variant, variant_locks=variant_locks)
variant_path = _find_kernel_in_repo_path(repo_path, variant=variant, variant_locks=variant_locks)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can't be in these functions, since we want to support getting a kernel directory without actually loading it.

@sayakpaul
Copy link
Copy Markdown
Member Author

I can remove them from the functions you mentioned but I would like to discuss the after steps before I do so. Specifically, for stuff like

I am not sure what the solution is without downloading, but the dependency verification needs to be at load/import time. The issue is that downloads, getting kernel paths, etc. can be done for other reasons (e.g. predownloading a kernel, populating a Docker container, kernel validation, etc.)

We cannot assume the users are ready to use them already and hence, it would not make sense to hard raise on validation. So, I propose the following:

  • Log a warning to the users that they don't have the requirements installed or
  • We do it exclusively from get_kernel() since it does more than just downloading a kernel.

I prefer the second as warnings tend to get ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants