test(mlx-fused): refresh stale fused-loop test fakes and expectations by FluffyAIcode · Pull Request #147 · FluffyAIcode/Kakeya-LLM-Inference-engine

FluffyAIcode · 2026-06-17T14:58:24Z

Summary

Fixes 4 pre-existing failing tests in tests/backends/mlx/test_fused_specdecode.py that had gone stale against the current fused_specdecode_generate loop. Test-only change — no production code touched. (Surfaced as the "⚠️ pre-existing failures" noted in PR #146.)

Why they were failing (verified red on `origin/main`)

The fused loop evolved but its unit-test fakes/expectations did not:

1. Missing fake method. The loop now calls adapter.last_aux_torch_slice(...) (aux hidden states are captured lazily in MX and bridged to torch on demand), but _FakeAdapter lacked it → AttributeError in all 3 test_fused_loop_* tests.
2. Full-acceptance no longer appends a correction. On full acceptance the loop reuses block_logits[-1] as the next distribution instead of forwarding a correction token, so committed sequences shifted:
   - test_fused_loop_full_acceptance: appends == [104] (was [103, 105]).
   - test_fused_loop_stops_on_eos: the EOS token now arrives as block 2's bonus → blocks == 2 (was 1); tokens unchanged ([100,101,102,103]).
3. Lazy aux bridging. forward_block now leaves _last_aux = None and exposes the bridged aux via last_aux_torch_slice(); test_adapter_prefill_forward_commit now asserts that contract.
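The full-acceptance change described above can be illustrated with a toy greedy verifier. This is a minimal sketch, not the code in fused_specdecode_generate: the function name verify_block, the logits layout (one row per draft token plus one trailing row), and the rejection behavior are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Logits = List[float]


def argmax(row: Logits) -> int:
    """Index of the largest logit (the first one wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def verify_block(
    draft: List[int], block_logits: List[Logits]
) -> Tuple[List[int], Optional[Logits]]:
    """Toy greedy verification of one drafted block (hypothetical helper).

    Assumed layout: block_logits has len(draft) + 1 rows. Row i scores
    draft[i]; the last row is the distribution after the final draft token.
    Returns (committed_tokens, carried_logits).
    """
    for i, token in enumerate(draft):
        expected = argmax(block_logits[i])
        if expected != token:
            # Rejection (assumed behavior): keep the accepted prefix and
            # append the verifier's own choice as a correction token.
            return draft[:i] + [expected], None
    # Full acceptance: no correction token is appended. The last row is
    # carried forward and used as the next distribution.
    return list(draft), block_logits[-1]


logits = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
full = verify_block([1, 2], logits)      # every draft token matches
rejected = verify_block([2, 2], logits)  # first draft token mismatches
```

In this sketch the fully accepted block commits only the draft tokens and hands back block_logits[-1], which is why a test that used to expect an extra correction token in its appends would see a shorter, shifted sequence.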

Changes

Add last_aux_torch_slice to _FakeAdapter (mirrors MLXRestoredIncrementalVerifier).
Re-derive expected token sequences / appends / blocks for the 3 loop tests against current behavior (full traces in the test comments).
Update test_adapter_prefill_forward_commit to the lazy-aux contract (_last_aux is None; bridged via last_aux_torch_slice()).
Add test_fused_loop_greedy_fallback_on_low_acceptance covering the low-acceptance → greedy-fallback branch (previously untested).
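The fake-adapter and lazy-aux changes above might look roughly like the following. This is a sketch under stated assumptions: StaleFakeAdapter, RefreshedFakeAdapter, loop_step, and the _aux_rows attribute are hypothetical names, the real last_aux_torch_slice arguments are not shown in this PR description (so the sketch accepts anything), and plain lists stand in for torch tensors.

```python
from typing import Any, List


class StaleFakeAdapter:
    """Hypothetical fake that predates the lazy-aux change."""

    def forward_block(self, tokens: List[int]) -> List[List[float]]:
        # One dummy logits row per token.
        return [[0.0] for _ in tokens]


class RefreshedFakeAdapter(StaleFakeAdapter):
    """Hypothetical fake that follows the lazy-aux contract."""

    def __init__(self) -> None:
        self._last_aux = None
        self._aux_rows: List[List[float]] = []

    def forward_block(self, tokens: List[int]) -> List[List[float]]:
        # Capture aux lazily; nothing is bridged yet, so _last_aux stays None.
        self._aux_rows = [[float(t)] for t in tokens]
        self._last_aux = None
        return super().forward_block(tokens)

    def last_aux_torch_slice(self, *args: Any, **kwargs: Any) -> List[List[float]]:
        # Bridge on demand. The real signature is unknown here, so the
        # arguments are ignored; a list stands in for a torch tensor.
        return list(self._aux_rows)


def loop_step(adapter: Any, tokens: List[int]) -> List[List[float]]:
    """Stand-in for the part of the loop that asks for the bridged aux."""
    adapter.forward_block(tokens)
    return adapter.last_aux_torch_slice()


refreshed = RefreshedFakeAdapter()
aux = loop_step(refreshed, [1, 2])

try:
    loop_step(StaleFakeAdapter(), [1, 2])
    stale_error = None
except AttributeError as exc:
    stale_error = exc
```

With the stale fake, the call fails with AttributeError as soon as the loop asks for the bridged aux; with the refreshed fake, _last_aux remains None after forward_block and the aux is only available through last_aux_torch_slice().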

Testing

✅ pytest tests/backends/mlx/test_fused_specdecode.py → 14 passed (was 9 passed / 4 failed on main)
Note: full-module 100% coverage for inference_engine.backends.mlx.* is enforced on the Mac integration run (the MLX-real paths can't import on the Linux CI host, per .coveragerc); this PR raises the Linux-exercisable loop coverage and fixes the red tests.

…ainst current behavior

The control-flow tests in test_fused_specdecode.py were stale against the current fused_specdecode_generate loop, failing on origin/main:

- _FakeAdapter lacked last_aux_torch_slice, which the loop now calls (aux is captured lazily in MX and bridged on demand) -> AttributeError. Add it.
- Full acceptance no longer appends a correction token (it reuses block_logits[-1] as the next distribution), so the committed token sequence shifted: test_fused_loop_full_acceptance appends == [104] (not [103,105]); test_fused_loop_stops_on_eos reaches EOS as block-2's bonus -> blocks == 2.
- test_adapter_prefill_forward_commit: forward_block now leaves _last_aux None and exposes the bridged aux via last_aux_torch_slice(); assert accordingly.
- Add test_fused_loop_greedy_fallback_on_low_acceptance to cover the low-acceptance greedy-fallback branch (previously untested).

All 14 tests pass; no production code changed.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>

FluffyAIcode marked this pull request as ready for review June 17, 2026 15:05

FluffyAIcode merged commit f13594d into main Jun 17, 2026
8 checks passed

FluffyAIcode merged 1 commit into main from AgentMemory/refresh-fused-specdecode-test-fakes-2815

2 participants
