
[AMD] Add MiniMax-M3-FP8 MI355X ATOMESH update 0623 #1930

Open
seungrokj wants to merge 5 commits into main from amd/m3_atom_pd_fp8_0623

Conversation

@seungrokj (Collaborator) commented Jun 25, 2026

Summary

  • Eliminate all hardcoded MODEL_NAME == "DeepSeek-V4-Pro" / per-model checks from server_atom.sh
  • All model-specific configuration (env vars, parallel flags, MTP flags, KV cache flags, HF overrides) now driven from models_atom.yaml using the same python3 yaml.safe_load pattern as server_vllm.sh
  • Add MiniMax-M3-MXFP4 and MiniMax-M3-MXFP8 entries to models_atom.yaml with EAGLE3 MTP flags
  • Image bump for minimaxm3-fp8-mi355x-atom-disagg: rocm/atom-dev:MiniMax-M3-20260622 → rocm/atom-dev:MiniMax-M3-20260623
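
The move from hardcoded per-model checks to a lookup keyed by model name could be sketched roughly as below. This is a minimal illustration, not the code in server_atom.sh: the `MODELS` dict stands in for what `yaml.safe_load` would return for models_atom.yaml, the top-level layout (one mapping per model name) is an assumption, and `get_model_field` and `EXAMPLE_VAR` are hypothetical names.

```python
def get_model_field(models: dict, model_name: str, field: str, default: str = "") -> str:
    """Return one config field for a model, or `default` when the field is absent."""
    if model_name not in models:
        raise KeyError(f"no entry for model {model_name!r}")
    value = models[model_name].get(field)
    return default if value is None else str(value)


# Stand-in for the parsed YAML file; the layout and all values are placeholders.
MODELS = {
    "MiniMax-M3-MXFP8": {"env": "EXAMPLE_VAR=1"},
    "DeepSeek-V4-Pro": {},
}

# The same call works for every model, so no branch on the model name is needed.
print(get_model_field(MODELS, "MiniMax-M3-MXFP8", "env"))
print(repr(get_model_field(MODELS, "DeepSeek-V4-Pro", "env")))
```

A model with no value for a field falls back to the default, which is what lets the calling script drop its `MODEL_NAME == ...` branches.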

Fields added to models_atom.yaml

| Field | Purpose |
| --- | --- |
| env | Space-separated KEY=VALUE pairs exported unconditionally |
| tp_dp_flags | Parallel flags for TP+DPA mode |
| tp_dp_env | Env vars exported only in TP+DPA mode |
| ep_dp_flags | Parallel flags for EP+DPA mode |
| ep_dp_env | Env vars exported only in EP+DPA mode |
| mtp_flags | Flags prepended to SPEC_ARGS before $DECODE_MTP_SIZE |
| kv_cache_flags | Full --kv_cache_dtype flag string |
| hf_overrides | JSON string passed to --hf-overrides |
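
One way the fields above might combine into a launch configuration is sketched below. It is a hedged illustration under stated assumptions: `resolve_launch_config` is a hypothetical helper, the mode names `tp_dp` and `ep_dp` simply mirror the field prefixes, the ordering of the final argument list is a guess, and every value in `EXAMPLE_ENTRY` is a placeholder rather than a real setting from models_atom.yaml.

```python
import json
import shlex


def resolve_launch_config(entry: dict, mode: str, decode_mtp_size=None) -> dict:
    """Turn one model entry into exported env vars and server arguments.

    `entry` stands in for one model's mapping as returned by yaml.safe_load.
    `mode` is "tp_dp" or "ep_dp", mirroring the field prefixes in the table.
    """
    if mode not in ("tp_dp", "ep_dp"):
        raise ValueError(f"unknown mode: {mode!r}")

    def field(name: str) -> str:
        return entry.get(name) or ""

    # Unconditional pairs first, then the pairs for the selected mode only.
    env = {}
    for pair in (field("env") + " " + field(f"{mode}_env")).split():
        key, sep, value = pair.partition("=")
        if not key or not sep:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key] = value

    args = shlex.split(field(f"{mode}_flags"))

    # mtp_flags come first and the MTP size is appended right after them.
    if field("mtp_flags") and decode_mtp_size is not None:
        args += shlex.split(field("mtp_flags")) + [str(decode_mtp_size)]

    args += shlex.split(field("kv_cache_flags"))

    if field("hf_overrides"):
        json.loads(field("hf_overrides"))  # fail early on malformed JSON
        args += ["--hf-overrides", field("hf_overrides")]

    return {"env": env, "args": args}


# Placeholder entry: every value here is hypothetical.
EXAMPLE_ENTRY = {
    "env": "EXAMPLE_VAR=1",
    "tp_dp_flags": "--example-tp-flag 8",
    "tp_dp_env": "EXAMPLE_TP_ONLY=1",
    "ep_dp_flags": "--example-ep-flag",
    "ep_dp_env": "EXAMPLE_EP_ONLY=1",
    "mtp_flags": "--example-spec-flag",
    "kv_cache_flags": "--kv_cache_dtype example_dtype",
    "hf_overrides": '{"example_key": true}',
}

config = resolve_launch_config(EXAMPLE_ENTRY, "tp_dp", decode_mtp_size=3)
print(config["env"])
print(shlex.join(config["args"]))
```

Keeping the mode-specific env vars in separate fields means the EP+DPA settings never leak into a TP+DPA launch, and validating `hf_overrides` as JSON surfaces a malformed entry before the server starts.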

PR Review Checklist

  • Verified that as of the moment of typing this, this is the latest version of PR_REVIEW_CHECKLIST.md
  • Verified that the general code quality meets the InferenceX standard and does not make the code quality any worse.
  • Verified that this PR has passed PR validation. Please link to GitHub Action workflow that shows this.
  • Verified that this PR passes evals. Please link to GitHub Action workflow that shows this.
  • Verified that speculative decoding PRs use chat templates to align the AL distribution with real-world usage
  • If a company claims that they support vLLM/SGLang as first class LLM inference engines on their hardware, I have verified that the respective vLLM/SGLang submission has been made before additional frameworks (TRT-LLM, ATOM, etc.). The only exceptions are for new hardware, such as MI455X UALoE72, Vera Rubin NVL72, Rubin NVL8, etc., and for new model architectures where there is an actual reason why vLLM/SGLang does not fundamentally support them yet.
  • Verified that the single-node recipes are similar to the official vLLM recipes and/or the SGLang cookbook:
    • If they are not, I have verified that a PR has been opened in vLLM recipe repo or SGLang repo and linked it below in the additional detail section:
  • If any of the above criteria cannot reasonably be satisfied, I have provided additional reasoning below.

🤖 Generated with Claude Code

seungrokj and others added 2 commits June 25, 2026 14:39
…els_atom.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…om.yaml-driven)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@seungrokj changed the title from "[AMD] refactor server_atom.sh: drive model-specific config from models_atom.yaml" to "[AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623" Jun 25, 2026
@seungrokj added the AMD, all-evals (Expand eval selection to every fixed-sequence config), evals-only (Suppress throughput and run only eval jobs; combine with all-evals to expand selection), and full-sweep-enabled labels Jun 25, 2026
Comment thread benchmarks/multi_node/amd_utils/models_atom.yaml
Comment thread benchmarks/multi_node/amd_utils/server_atom.sh
Comment thread benchmarks/multi_node/amd_utils/server_atom.sh
@github-actions (Contributor)

@functionstackx (Collaborator)

@Oseltamivir can u review this? tho it seems like evals r potentially failing

…ingFace path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@seungrokj changed the title from "[AMD] Add MiniMax-M3-FP4 MI355X ATOMESH update 0623" to "[AMD] Add MiniMax-M3-FP8 MI355X ATOMESH update 0623" Jun 26, 2026
@seungrokj (Collaborator, Author)

@functionstackx @Oseltamivir let me first check something and will ping when it is ready!

@seungrokj removed the all-evals (Expand eval selection to every fixed-sequence config) label Jun 26, 2026

Labels

AMD, evals-only (Suppress throughput and run only eval jobs; combine with all-evals to expand selection)

2 participants