Pull requests: vllm-project/semantic-router
Pull requests list
[Router] Fix selection embeddings for non-qwen3 model_type
#2192
opened Jun 15, 2026 by
WUKUNTAI-0211
Collaborator
[Bindings] Refactor onnx multimodal output extraction
#2190
opened Jun 15, 2026 by
WUKUNTAI-0211
Collaborator
[CI/Build] refresh precommit-local image before running
#2179
opened Jun 14, 2026 by
xiaotian-yu
Contributor
3 of 4 tasks
[CLI] add 'vllm-sr rag list' to list ingested vector stores
#2178
opened Jun 13, 2026 by
WUKUNTAI-0211
Collaborator
[Router] fix deepseek official thinking mode
#2175
opened Jun 13, 2026 by
brelance
Contributor
[Router] Optimize CountTokensApprox to avoid constructing intermediate slices
#2171
opened Jun 12, 2026 by
cryo-zd
Collaborator
4 tasks
[CI/Build] Add agent memory review CI gates
#2169
opened Jun 12, 2026 by
nagisa-kunhah
Draft
5 tasks done
[Router] Dedup query-embedding inference on the cache-miss path (in-memory backend)
#2162
opened Jun 12, 2026 by
theohsiung
Contributor
3 tasks done
[Bindings] Implement causal masking in the chunked-SDPA kernel
#2159
opened Jun 11, 2026 by
Peterren
Contributor
[Docs][CI/Build] Sync zh-Hans translation status checks
#2156
opened Jun 11, 2026 by
wilsonwu
Contributor
4 tasks done
[Router] fix router_replay history 413 on large requests (bound list response + safer default)
#2155
opened Jun 11, 2026 by
siloteemu
Contributor
[Router] preserve thinking and tool_choice.disable_parallel_tool_use on Anthropic→Anthropic path
#2153
opened Jun 11, 2026 by
siloteemu
Contributor
[Dashboard] Improve signal config enum and domain selection controls
#2152
opened Jun 11, 2026 by
wilsonwu
Contributor
4 tasks done
[Router] Preserve structured system content in RAG system_prompt injection
#2149
opened Jun 11, 2026 by
theohsiung
Contributor
3 tasks done
[Router] Add RAG result-count, truncation, and typed-error metrics
#2146
opened Jun 10, 2026 by
Peterren
Contributor
[Router] Re-normalize truncated FFI embeddings
#2145
opened Jun 10, 2026 by
WUKUNTAI-0211
Collaborator
[Router] Validate semantic_cache similarity_threshold range
#2143
opened Jun 10, 2026 by
theohsiung
Contributor
3 tasks done
[Router] Preserve reasoning_content when caching streaming responses
#2141
opened Jun 10, 2026 by
theohsiung
Contributor
3 tasks done
[Router] Escape user id in Milvus memory filter expressions (injection fix)
#2139
opened Jun 10, 2026 by
theohsiung
Contributor
2 of 3 tasks
[Router] Validate /api/v1/similarity input: reject blank text and out-of-range priority
#2137
opened Jun 10, 2026 by
WUKUNTAI-0211
Collaborator
[Router] Enforce hard user-scope isolation in semantic cache (cross-tenant leak fix)
#2135
opened Jun 10, 2026 by
theohsiung
Contributor
4 of 5 tasks
[Router] Do not cache non-2xx upstream responses (cache poisoning fix)
#2131
opened Jun 10, 2026 by
theohsiung
Contributor
5 tasks done