Pull requests: lightseekorg/tokenspeed
[WIP] perf(gdn): remove gdn prefill unnecessary h_state copy
#409
opened Jun 10, 2026 by
minedec
Contributor
[CI] Use new label for gfx950 benchmark jobs
#408
opened Jun 10, 2026 by
antiagainst
Member
fix(runtime): use scattered token counts for MoE RSAG
#407
opened Jun 10, 2026 by
Williams500
Add small-M Gluon warp-decode MoE path for GPT-OSS
#403
opened Jun 9, 2026 by
panditsa
Contributor
perf(deepseek-v4): pre-compile deep_gemm JIT kernels at startup
#398
opened Jun 9, 2026 by
dongjiyingdjy
Contributor
3 of 4 tasks
feat(engine): Sleep / Wake Up API (release/resume_memory_occupation)
#393
opened Jun 9, 2026 by
HJSang
Collaborator
[WIP] refactor(spec-decode): simplify Llama Eagle3 attention path for #217 (1/3)
#390
opened Jun 9, 2026 by
rjzhb
Contributor
perf(gdn): fuse causal_conv1d and QKV split for GDN prefill
#382
opened Jun 8, 2026 by
elwhyjay
Contributor
fix(scheduler): publish prefix to radix tree during prefill for non-hybrid models
#381
opened Jun 8, 2026 by
qywu
Collaborator
[WIP] feat(kernel): introduce moe kernel api
#374
opened Jun 7, 2026 by
borontion
Contributor
fix(cache): Coarsely fence the compute stream behind the host loadback stream on.
#370
opened Jun 6, 2026 by
LorrinWWW
Contributor
[Bugfix] fix smg command argument cannot be used multiple times
#366
opened Jun 6, 2026 by
lengrongfu
feat(mla): decode-context-parallel (DCP) support in the MLA decode kernel
#364
opened Jun 5, 2026 by
RomaA2000
feat(video) Generalize multimodal runtime support and add Qwen3.5 video
#354
opened Jun 4, 2026 by
yechank-nvidia
Collaborator
Draft
feat(logprobs): vLLM-style output logprobs (LogprobParams), spec-decode support
#337
opened Jun 2, 2026 by
HJSang
Collaborator