Releases · XidaoApi/local-llm-router

v1.0.0 — Initial Release

Route prompts between local and cloud LLMs based on task complexity. Save 80%+ on AI costs while keeping simple tasks private.

Smart routing — classify prompt complexity before sending to a model
Local-first — defaults to Ollama/llama.cpp, falls back to cloud only when needed
3 scoring strategies — keyword+length (fastest), classifier (accurate), LLM judge (most accurate)
OpenAI-compatible proxy — run as a drop-in replacement for any OpenAI API client
Cost tracking — JSON logs with latency, cost, and routing decisions per query
YAML config — define models, thresholds, and fallback chains in one file
CLI — llm-router init, route, serve, stats commands

Metric	Value
Local routing rate	80-90%
Cost savings vs cloud-only	85-95%
Scoring overhead	<50ms
Supported providers	Ollama, llama.cpp, vLLM, OpenAI, XiDao, any OpenAI-compatible

pip install local-llm-router

llm-router init
llm-router route "Summarize this article" --file article.txt
llm-router serve --port 8080