Pull requests: ROCm/aiter
re-tune flydsl bf16 gemm tuned config
#3434
opened May 30, 2026 by
XiaobingSuper
Contributor
1 task
Align aiter prebuild jobs with build runner CPU limit
#3433
opened May 30, 2026 by
gyohuangxin
Member
Fix FlyDSL bf16 GEMM tuned CSV kernel names
#3432
opened May 30, 2026 by
coderfeli
Collaborator
Add Kimi-K2 Thinking entry to per-model kernel benchmark sweep
#3431
opened May 30, 2026 by
okorzh-amd
Contributor
Add native integer all-gather dtype support and optimize gfx942 custom all-gather
#3430
opened May 29, 2026 by
hubertlu-tw
Contributor
5 of 6 tasks
Fuse dynamic_per_tensor_quant_fp8_i8 into one launch for the decode regime
#3429
opened May 29, 2026 by
JohnQinAMD
Contributor
2 of 3 tasks
add MX_FP4_A8 tuned configs and dispatch for moe_gemm_a8w4
#3428
opened May 29, 2026 by
xiaohuguo2023
Member
[FLYDSL] Measure impact of new stream-k a16w16 gemm
#3427
opened May 29, 2026 by
xytpai
Contributor
Add bf16 qh8 qseqlen=2 MLA decode kernel (gfx950)
#3426
opened May 29, 2026 by
alexioslyrakis-amd
Contributor
3 tasks done
[triton-mha] hint head-stride div-by-8 for vectorized global load
#3424
opened May 29, 2026 by
mgehre-amd
Contributor
Draft
[triton-mha] add gfx1151 tuning config
#3423
opened May 29, 2026 by
mgehre-amd
Contributor
Draft
2 of 3 tasks
Add GLM-5.1 FP8 blockscale GEMM/FMoE tunings for gfx942 (MI300X/MI325)
#3422
opened May 29, 2026 by
akii96
Contributor
[gfx1151] [flash_attn_triton_amd]: Tune for gfx1151 / ViT
#3419
opened May 29, 2026 by
mgehre-amd
Contributor
Draft
Add PER_TOKEN_HEAD FP8 quantization and P-scale for mha_batch_prefill
#3418
opened May 29, 2026 by
msaffari-amd
1 task done
[Bugfix] Enable MXFP4 MoE at TP=4/8 via CKTile a4w4 kernels and quant fixes
#3412
opened May 29, 2026 by
haoyangli0109
Contributor
[Bugfix]Fix varctx scheduling in pa_mqa_logits benchmark.
#3410
opened May 29, 2026 by
charlieguo1106
[Triton] Add New Features and Performance Improvement for GMM Kernel
ci:triton-300x
ci:triton-355
enhancement
triton
#3407
opened May 28, 2026 by
brunomazzottiamd
Contributor
4 tasks done
[Triton-Gluon-MLA-GFX950] add mla_decode_gluon_bh16_dcp kernel wrapper for DCP MLA decode
#3402
opened May 28, 2026 by
Dewei-Wang-sh
Contributor