Commit f5c9dfb

feat(moe/gluon): warp-decode MoE for gfx950 small-M decode
Squashed warp-decode work (coop-LDS stage1 + per-M split-K stage2, interleave/K-tail/scale fixes) for rebase onto the PR lightseekorg#374 MoE-API refactor. Full per-commit history preserved in backup/gptoss-warp-decode-moe-*.

Signed-off-by: Sanket Pandit <sanket.pandit@amd.com>
1 parent 38b3a35 commit f5c9dfb

1 file changed

Lines changed: 524 additions & 208 deletions
