Skip to content

[CANCELLED] eval(omlx): oMLX parallel-inference evaluation (abandoned)#151

Draft
FluffyAIcode wants to merge 2 commits into
mainfrom
AgentMemory/omlx-parallel-inference-eval-2815
Draft

[CANCELLED] eval(omlx): oMLX parallel-inference evaluation (abandoned)#151
FluffyAIcode wants to merge 2 commits into
mainfrom
AgentMemory/omlx-parallel-inference-eval-2815

Conversation

@FluffyAIcode

@FluffyAIcode FluffyAIcode commented Jun 18, 2026

Copy link
Copy Markdown
Owner

CANCELLED — please close this PR. Per the user's request, the oMLX evaluation is abandoned and the head branch (AgentMemory/omlx-parallel-inference-eval-2815) has been deleted, so there is nothing to merge. GitHub did not auto-close it; one click on Close pull request will finish it off.

(Original scope: a read-only omlx-env-probe found oMLX was not installed on the Mac runner, and a parallel-inference bench was prepared but never run. All of it lived only on the now-deleted branch.)

Open in Web Open in Cursor 

cursoragent and others added 2 commits June 18, 2026 07:13
… capture its launch CLI

Prereq for evaluating whether oMLX (jundot/omlx) continuous-batching can do the
Gemma-4 parallel inference vllm-mlx could not. Read-only: detects CLI/app
bundle/brew/pip and dumps --help/serve|launch help; no server, no model load.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
Drives an already-running oMLX OpenAI server (OMLX_BASE_URL/OMLX_MODEL) with N
unique-needle requests serially then concurrently; reports errors, per-request
correctness (no cross-request contamination), and wall speedup — the exact
Gemma-4 parallel case vllm-mlx crashed on (shared_kv TypeError). Stdlib-only
(urllib+threads). Ready to run once oMLX is installed + serving on the Mac.

Co-authored-by: FluffyAIcode <FluffyAIcode@users.noreply.github.com>
@cursor cursor Bot deleted the AgentMemory/omlx-parallel-inference-eval-2815 branch June 18, 2026 07:23
@cursor cursor Bot changed the title eval(omlx): probe + parallel-inference bench to test oMLX continuous batching on Gemma-4 [CANCELLED] eval(omlx): oMLX parallel-inference evaluation (abandoned) Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants