feat(mac-bridge): pin workload Python interpreter + import gate (Layers B/C) + reusable skill#150
Merged
cursor[bot] merged 2 commits intoJun 18, 2026
Conversation
… prompt Captures the diagnosis+fix for the post-reboot ModuleNotFoundError (mlx_lm) on the kakeya-mac-m4 runner: lightweight env-probe diagnosis, 3-layer fix (pin venv on the runner agent PATH via .path/.env|launchd|systemd; resolve a pinned interpreter in the workflow/executor instead of bare python3; fail-fast import gate), reboot-inclusive verification, and the Cloud-VM-vs-runner distinction (Mac-only deps belong on the runner, not the Linux Cloud Agent env). Includes a ready-to-paste setup-agent prompt; generalized for any Claude/Codex agent. Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
…eck gate (Layer C) Reboots can repoint the runner's default python3 to one without mlx_lm, which broke every full-engine preset with a deep ModuleNotFoundError. Make the workload interpreter explicit and verified: - inference_engine/bridge/runner_python.py (NEW, pure + 100% unit-tested): workload_python_candidates (pin KAKEYA_MAC_PYTHON -> venvs -> PATH), resolve_workload_python (first interpreter that can import mlx_lm; else fallback), preset_requires_gate (mlx-/k3- engine presets, minus env-probe/ upgrade), substitute_python, gate_error_message. - scripts/mac_bridge/run_preset.py: resolve the pinned interpreter, rewrite bare python3 argv0 to it, export KAKEYA_MAC_PYTHON to the subprocess, and FAIL FAST (exit 90 + ::error::) when a gated preset has no mlx_lm-capable interpreter. - scripts/run_kakeya_mac.sh: honor KAKEYA_MAC_PYTHON; preflight asserts mlx+mlx_lm. CI enforcement: the resolution/gate logic lives in the unit-tested, 100%-coverage library (runner_python.py), so every PR exercises it on the Linux gate. See docs/skills/pin-selfhosted-runner-python-env-skill.md. Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
This was referenced Jun 18, 2026
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Makes the mac-bridge resilient to the post-reboot
ModuleNotFoundError: No module named 'mlx_lm'regression (the runner's defaultpython3flipped to a Python without the ML stack), and captures the pattern as a reusable skill. CI-enforced via a 100%-covered unit-tested library.Layer B — pinned interpreter resolution (no more bare
python3)inference_engine/bridge/runner_python.py(new, pure, dependency-injected):workload_python_candidates— ordered candidates:KAKEYA_MAC_PYTHON→ venv paths →python3.13→python3.resolve_workload_python— pick the first interpreter that canimport mlx_lm; fall back to the first existing candidate otherwise.substitute_python— rewrite a leading barepython3argv to the resolved interpreter.scripts/mac_bridge/run_preset.pyresolves the interpreter, rewritespython3argv, and exportsKAKEYA_MAC_PYTHONto the subprocess.scripts/run_kakeya_mac.shhonorsKAKEYA_MAC_PYTHON(and its preflight now assertsmlx+mlx_lm).Layer C — import self-check gate (fail fast)
For
mlx-/k3-engine presets (minusmlx-env-probe/mlx-upgrade, which exist to diagnose/repair the env), if no interpreter can importmlx_lmthe executor exits90with a clear::error::pointing at the skill — instead of a deepModuleNotFoundErrorminutes in.CI enforcement
The resolution + gate logic lives entirely in
runner_python.py, unit-tested at 100% coverage (tests/inference_engine/bridge/test_runner_python.py, 10 tests), so it's exercised on every Linux gate run.run_preset.py(coverage-exempt CLI) is a thin caller.Skill
docs/skills/pin-selfhosted-runner-python-env-skill.md— repo-agnostic SOP (diagnose → 3-layer fix → reboot-inclusive verify → Cloud-VM-vs-runner distinction → anti-patterns) with a ready-to-paste setup-agent prompt; linked fromkakeyainferenceenginebuildskill.md.Validation
mlx_lm(…/.venv-mac/bin/python3.13,mlx_lm_ok=True) — even withpinned=None— and substituted it for barepython3; the preset ran on it (exit 0).mac_bridge_layerBC_verification.txt
pytest tests/inference_engine/bridge/(41 passed);runner_python.py100% coverage.bash -n scripts/run_kakeya_mac.sh; launcher dry-run honorsKAKEYA_MAC_PYTHON.Note
For maximum durability, also pin the venv on the runner agent's
PATH(Layer A — host side, in the skill); setKAKEYA_MAC_PYTHONas a repo/runner variable to make the choice explicit rather than discovered. This PR makes the bridge robust even without those, and fails fast with guidance if the env is truly broken.To show artifacts inline, enable in settings.