Skip to content

Releases: XidaoApi/local-llm-router

v1.0.0 — Local + Cloud LLM Routing

Choose a tag to compare

@XidaoApi XidaoApi released this 14 May 14:05

v1.0.0 — Initial Release

Route prompts between local and cloud LLMs based on task complexity. Save 80%+ on AI costs while keeping simple tasks private.

Features

  • Smart routing — classify prompt complexity before sending to a model
  • Local-first — defaults to Ollama/llama.cpp, falls back to cloud only when needed
  • 3 scoring strategies — keyword+length (fastest), classifier (accurate), LLM judge (most accurate)
  • OpenAI-compatible proxy — run as a drop-in replacement for any OpenAI API client
  • Cost tracking — JSON logs with latency, cost, and routing decisions per query
  • YAML config — define models, thresholds, and fallback chains in one file
  • CLIllm-router init, route, serve, stats commands

Typical Results

Metric Value
Local routing rate 80-90%
Cost savings vs cloud-only 85-95%
Scoring overhead <50ms
Supported providers Ollama, llama.cpp, vLLM, OpenAI, XiDao, any OpenAI-compatible

Installation

pip install local-llm-router

Quick Start

llm-router init
llm-router route "Summarize this article" --file article.txt
llm-router serve --port 8080