test(mlx-fused): refresh stale fused-loop test fakes and expectations #147

Merged
FluffyAIcode merged 1 commit into main from AgentMemory/refresh-fused-specdecode-test-fakes-2815
Jun 17, 2026

@FluffyAIcode
Owner

Summary

Fixes 4 pre-existing failing tests in tests/backends/mlx/test_fused_specdecode.py that had gone stale against the current fused_specdecode_generate loop. Test-only change — no production code touched. (Surfaced as the "⚠️ pre-existing failures" noted in PR #146.)

Why they were failing (verified red on origin/main)

The fused loop evolved but its unit-test fakes/expectations did not:

  1. Missing fake method. The loop now calls adapter.last_aux_torch_slice(...) (aux hidden states are captured lazily in MX and bridged to torch on demand), but _FakeAdapter lacked it → AttributeError in all 3 test_fused_loop_* tests.
  2. Full-acceptance no longer appends a correction. On full acceptance the loop reuses block_logits[-1] as the next distribution instead of forwarding a correction token, so committed sequences shifted:
    • test_fused_loop_full_acceptance: appends == [104] (was [103, 105]).
    • test_fused_loop_stops_on_eos: the EOS token now arrives as block 2's bonus → blocks == 2 (was 1); tokens unchanged ([100,101,102,103]).
  3. Lazy aux bridging. forward_block now leaves _last_aux = None and exposes the bridged aux via last_aux_torch_slice(); test_adapter_prefill_forward_commit now asserts that contract.
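
The full-acceptance change in item 2 can be illustrated with a minimal sketch. This is not the repository's `fused_specdecode_generate`; the helper names (`argmax`, `verify_block`), the plain-list logits, and the greedy accept rule are all assumptions made for illustration. The only behavior taken from the description above is that full acceptance hands back `block_logits[-1]` as the next distribution, while a rejection commits a correction token.

```python
from typing import List, Optional, Sequence, Tuple

Logits = Sequence[float]


def argmax(row: Logits) -> int:
    """Index of the largest logit (the first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])


def verify_block(
    draft: Sequence[int],
    next_logits: Logits,
    block_logits: Sequence[Logits],
) -> Tuple[List[int], Optional[Logits]]:
    """Greedy verification of one drafted block (hypothetical helper).

    Assumptions: next_logits is the distribution carried over for the first
    draft position, and block_logits[i] is the distribution that follows
    draft[i]. Returns the committed tokens plus the distribution to carry
    forward, or None when a correction token was committed and would still
    need a forward pass.
    """
    committed: List[int] = []
    dist = next_logits
    for i, tok in enumerate(draft):
        target = argmax(dist)
        if tok != target:
            # Rejection: commit the verifier's token as the correction.
            committed.append(target)
            return committed, None
        committed.append(tok)
        dist = block_logits[i]
    # Full acceptance: no correction token; dist is block_logits[-1].
    return committed, dist
```

Under these assumptions a fully accepted block commits exactly the drafted tokens and nothing more, which is the kind of shift that changes the expected `appends` and `blocks` values in the tests.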

Changes

  • Add last_aux_torch_slice to _FakeAdapter (mirrors MLXRestoredIncrementalVerifier).
  • Re-derive expected token sequences / appends / blocks for the 3 loop tests against current behavior (full traces in the test comments).
  • Update test_adapter_prefill_forward_commit to the lazy-aux contract (_last_aux is None; bridged via last_aux_torch_slice()).
  • Add test_fused_loop_greedy_fallback_on_low_acceptance covering the low-acceptance → greedy-fallback branch (previously untested).
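
For readers unfamiliar with the lazy-aux contract, a rough sketch of a fake that satisfies it follows. It is not the `_FakeAdapter` from the test module: the class name `SketchFakeAdapter`, the `start`/`stop` parameters, the `_pending_aux` and `bridge_calls` attributes, and the use of plain lists in place of MX arrays and torch tensors are all hypothetical. Only the method names `forward_block` and `last_aux_torch_slice` and the `_last_aux is None` expectation come from this PR.

```python
from typing import List, Optional


class SketchFakeAdapter:
    """Illustrative stand-in for a test fake (not the repository's class)."""

    def __init__(self) -> None:
        # Under the lazy contract this stays None after forward_block.
        self._last_aux: Optional[List[List[float]]] = None
        # Hypothetical holder for the not-yet-bridged aux capture.
        self._pending_aux: List[List[float]] = []
        # Hypothetical counter so a test can see when bridging happened.
        self.bridge_calls = 0

    def forward_block(self, tokens: List[int]) -> List[List[float]]:
        # Record aux per token but do not bridge it yet.
        self._pending_aux = [[float(t)] for t in tokens]
        self._last_aux = None
        # Placeholder logits, one row per input token.
        return [[0.0] for _ in tokens]

    def last_aux_torch_slice(self, start: int, stop: int) -> List[List[float]]:
        # Bridge on demand; the (start, stop) signature is an assumption.
        self.bridge_calls += 1
        return [row[:] for row in self._pending_aux[start:stop]]


adapter = SketchFakeAdapter()
adapter.forward_block([100, 101, 102])
assert adapter._last_aux is None          # nothing bridged eagerly
aux = adapter.last_aux_torch_slice(1, 3)  # bridged only when requested
```

Counting bridge calls is one way a test could assert that aux is materialized only on request; the real tests may check this differently.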

Testing

  • pytest tests/backends/mlx/test_fused_specdecode.py → 14 passed (was 9 passed / 4 failed on main)
  • Note: full-module 100% coverage for inference_engine.backends.mlx.* is enforced on the Mac integration run (the MLX-real paths can't import on the Linux CI host, per .coveragerc); this PR raises the Linux-exercisable loop coverage and fixes the red tests.

…ainst current behavior

The control-flow tests in test_fused_specdecode.py were stale against the
current fused_specdecode_generate loop, failing on origin/main:

- _FakeAdapter lacked last_aux_torch_slice, which the loop now calls (aux is
  captured lazily in MX and bridged on demand) -> AttributeError. Add it.
- Full acceptance no longer appends a correction token (it reuses
  block_logits[-1] as the next distribution), so the committed token sequence
  shifted: test_fused_loop_full_acceptance appends == [104] (not [103,105]);
  test_fused_loop_stops_on_eos reaches EOS as block-2's bonus -> blocks == 2.
- test_adapter_prefill_forward_commit: forward_block now leaves _last_aux None
  and exposes the bridged aux via last_aux_torch_slice(); assert accordingly.
- Add test_fused_loop_greedy_fallback_on_low_acceptance to cover the
  low-acceptance greedy-fallback branch (previously untested).

All 14 tests pass; no production code changed.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
FluffyAIcode marked this pull request as ready for review on June 17, 2026, 15:05
FluffyAIcode merged commit f13594d into main on Jun 17, 2026
8 checks passed