Releases: XidaoApi/local-llm-router
Releases · XidaoApi/local-llm-router
Release list
v1.0.0 — Local + Cloud LLM Routing
v1.0.0 — Initial Release
Route prompts between local and cloud LLMs based on task complexity. Save 80%+ on AI costs while keeping simple tasks private.
Features
- Smart routing — classify prompt complexity before sending to a model
- Local-first — defaults to Ollama/llama.cpp, falls back to cloud only when needed
- 3 scoring strategies — keyword+length (fastest), classifier (accurate), LLM judge (most accurate)
- OpenAI-compatible proxy — run as a drop-in replacement for any OpenAI API client
- Cost tracking — JSON logs with latency, cost, and routing decisions per query
- YAML config — define models, thresholds, and fallback chains in one file
- CLI —
llm-router init,route,serve,statscommands
Typical Results
| Metric | Value |
|---|---|
| Local routing rate | 80-90% |
| Cost savings vs cloud-only | 85-95% |
| Scoring overhead | <50ms |
| Supported providers | Ollama, llama.cpp, vLLM, OpenAI, XiDao, any OpenAI-compatible |
Installation
pip install local-llm-routerQuick Start
llm-router init
llm-router route "Summarize this article" --file article.txt
llm-router serve --port 8080