Experiment with proposer KV full-attn restoration on Mac by FluffyAIcode · Pull Request #108 · FluffyAIcode/Kakeya-LLM-Inference-engine

FluffyAIcode · 2026-06-11T10:42:53Z

Summary

Adds a disabled Mac S5 experiment flag, --s5-f-theta-restored-full-attn, that builds full-attention restored K/V from proposer K/V via f_theta instead of running an extra verifier prompt forward.
Captures ctx70 Mac evidence for both KL ON and KL OFF, showing the fixed build_restoration_s cost drops to about 2s.
Marks this as draft because the current f_theta_v5_s5_sliding checkpoint was trained with S5 full-attn layers excluded, and the experiment regresses recall to 0/1 while slowing both attach and decode.
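A minimal sketch of how an off-by-default experiment flag like this is typically wired with argparse. Only the flag name comes from this PR; the parser description, help text, and `build_parser` helper are hypothetical and do not reflect the actual contents of scripts/research/k3_integrated_niah_eval_mac.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only the flag name is taken from the PR."""
    parser = argparse.ArgumentParser(description="Illustrative eval CLI sketch")
    parser.add_argument(
        "--s5-f-theta-restored-full-attn",
        action="store_true",  # off unless passed explicitly
        help=(
            "Experimental: build full-attention restored K/V from proposer "
            "K/V via f_theta instead of an extra verifier prompt forward."
        ),
    )
    return parser


# argparse maps dashes to underscores for the attribute name.
args_default = build_parser().parse_args([])
args_enabled = build_parser().parse_args(["--s5-f-theta-restored-full-attn"])
print(args_default.s5_f_theta_restored_full_attn)
print(args_enabled.s5_f_theta_restored_full_attn)
```

Using `store_true` keeps the existing verifier-forward path as the default, so the experiment only runs when requested.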

Evaluation

Compile check: python3 -m py_compile scripts/research/k3_integrated_niah_eval_mac.py
Unit checks: pytest -q tests/inference_engine/v04/test_f_theta.py tests/backends/mlx/test_cache.py
KL ON ctx70: build_restoration_s=1.962, prefill_attach_s=24.096, decode_s=42.455, recall_cross_model=0.0, recall_oracle=1.0
KL OFF ctx70: build_restoration_s=2.154, prefill_attach_s=21.971, decode_s=27.862, recall_cross_model=0.0, recall_oracle=1.0
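The two ctx70 result lines above can be compared side by side with a small parser. This is a sketch, not part of the PR: `parse_metrics` and `timed_total` are hypothetical helpers, and summing the three `_s` fields assumes they are sequential, non-overlapping phases, which the PR does not state.

```python
KL_ON = (
    "KL ON ctx70: build_restoration_s=1.962, prefill_attach_s=24.096, "
    "decode_s=42.455, recall_cross_model=0.0, recall_oracle=1.0"
)
KL_OFF = (
    "KL OFF ctx70: build_restoration_s=2.154, prefill_attach_s=21.971, "
    "decode_s=27.862, recall_cross_model=0.0, recall_oracle=1.0"
)


def parse_metrics(line: str) -> dict[str, float]:
    """Turn 'label: k=v, k=v' into {k: float(v)}."""
    _, _, payload = line.partition(": ")
    pairs = (item.split("=") for item in payload.split(", "))
    return {key: float(value) for key, value in pairs}


def timed_total(metrics: dict[str, float]) -> float:
    """Sum the fields that end in '_s' (assumed to be sequential phases)."""
    return round(sum(v for k, v in metrics.items() if k.endswith("_s")), 3)


kl_on = parse_metrics(KL_ON)
kl_off = parse_metrics(KL_OFF)
print(timed_total(kl_on), timed_total(kl_off))
```

Under that assumption the timed phases add up to about 68.5s with KL ON and about 52.0s with KL OFF; build_restoration_s contributes roughly 2s in both runs, and recall_cross_model is 0.0 in both.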

Interpretation

This confirms the architectural diagnosis: avoiding the extra verifier capture forward can remove most of the fixed restoration-build latency. It is not a usable optimization with the current checkpoint, though. f_theta restoration of the full-attn layers is not trained well enough for NIAH recall, and the inaccurate restored cache also degrades decode.

Next Step

Train or fine-tune an f_theta checkpoint that includes full-attn S5 layers, then rerun this same flag as the latency/recall gate.
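One way to make the latency/recall gate explicit is a small pass/fail check over the reported metrics. This is a sketch under stated assumptions: the metric names come from the evaluation output in this PR, but `GateThresholds`, `passes_gate`, and both threshold values are hypothetical and would need to be chosen by the author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # Both values are hypothetical placeholders, not taken from the PR.
    max_build_restoration_s: float = 5.0
    min_recall_cross_model: float = 1.0


def passes_gate(
    metrics: dict[str, float],
    thresholds: GateThresholds = GateThresholds(),
) -> bool:
    """Pass only if the build stays cheap AND cross-model recall holds."""
    latency_ok = metrics["build_restoration_s"] <= thresholds.max_build_restoration_s
    recall_ok = metrics["recall_cross_model"] >= thresholds.min_recall_cross_model
    return latency_ok and recall_ok


# KL ON ctx70 numbers from this PR: fast build, but recall regressed.
current_kl_on = {"build_restoration_s": 1.962, "recall_cross_model": 0.0}
print(passes_gate(current_kl_on))
```

With the current checkpoint this gate fails on recall even though the build latency is low, which matches the draft status of the PR. A fuller gate would probably also bound prefill_attach_s and decode_s against a baseline run.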

Made with Cursor

Adds a disabled S5 experiment that builds full-attention restored K/V from proposer K/V via f_theta instead of an extra verifier capture forward. The Mac ctx70 evidence shows the fixed build cost drops to about 2s, but recall regresses to 0/1 and decode slows, so this is evaluation evidence for retraining/next-step design rather than a merge-ready optimization.

Co-authored-by: Cursor <cursoragent@cursor.com>

FluffyAIcode wants to merge 1 commit into AgentMemory/v04-pr-k3-block-c-f-theta-v2-trainer-fix-recall-8e7f from AgentMemory/k3-proposer-kv-restoration-fastpath-8e7f
