Supported LLM Providers

veritail works with cloud LLM APIs and any OpenAI-compatible local model server. This page covers provider setup, local model configuration, and model quality guidance.

Requirements

Python >= 3.9
An LLM provider -- one of:
- OpenAI API key (included with base install)
- Anthropic API key (pip install veritail[anthropic])
- Google Gemini API key (pip install veritail[gemini])
- A running OpenAI-compatible local model server (no extra install needed -- see Local models below)

Cloud providers (recommended)

Provider	Example `--llm-model`	API key env var	Install
OpenAI	`gpt-4o`, `gpt-4o-mini`, `o3-mini`	`OPENAI_API_KEY`	included
Anthropic (Claude)	`claude-sonnet-4-5`, `claude-haiku-4-5`	`ANTHROPIC_API_KEY`	`pip install veritail[anthropic]`
Google Gemini	`gemini-2.5-flash`, `gemini-2.5-pro`	`GEMINI_API_KEY` or `GOOGLE_API_KEY`	`pip install veritail[gemini]`

Cloud models provide the highest evaluation quality and are recommended for production use.

Provider detection

veritail selects the provider based on the model name passed to --llm-model:

Names starting with claude use the Anthropic API.
Names starting with gemini use the Google Gemini API.
All other names use the OpenAI API (this also covers OpenAI-compatible local servers).

Batch API support

OpenAI, Anthropic, and Gemini all support batch evaluation via veritail run --batch. Batch mode submits all judgments in a single API call and polls for results, which can reduce costs and rate-limit pressure. For OpenAI, batch mode is only available when using the default OpenAI endpoint (not when --llm-base-url is set).

Local models via OpenAI-compatible servers

veritail connects to any server that exposes the OpenAI chat completions API (POST /v1/chat/completions). Pass --llm-base-url to point at a local endpoint:

# Ollama
ollama pull qwen3:14b
veritail run \
  --queries queries.csv \
  --adapter my_adapter.py \
  --llm-model qwen3:14b \
  --llm-base-url http://localhost:11434/v1 \
  --llm-api-key not-needed

# vLLM
veritail run \
  --queries queries.csv \
  --adapter my_adapter.py \
  --llm-model meta-llama/Llama-4-Scout \
  --llm-base-url http://localhost:8000/v1 \
  --llm-api-key not-needed

# LM Studio
veritail run \
  --queries queries.csv \
  --adapter my_adapter.py \
  --llm-model local-model \
  --llm-base-url http://localhost:1234/v1 \
  --llm-api-key lm-studio

--llm-base-url and --llm-api-key also work with veritail generate-queries.

Environment variable alternative

You can set environment variables instead of passing CLI flags:

export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=not-needed
veritail run --queries queries.csv --adapter my_adapter.py --llm-model qwen3:14b

Tested local servers

Server	Default port	Docs
Ollama	`11434`	OpenAI compatibility
vLLM	`8000`	OpenAI-compatible server
LM Studio	`1234`	API docs
LocalAI	`8080`	Features
llama.cpp server	`8080`	Server docs
SGLang	varies	Docs

Model quality guidance

veritail computes aggregate IR metrics (NDCG, MRR, MAP) from LLM relevance scores. The reliability of these metrics depends on the LLM's ability to follow instructions and produce consistent judgments.

Model tier	Examples	Metric reliability
Frontier cloud models	Claude Sonnet/Opus, GPT-4o, GPT-o3	High -- recommended for production evaluation
Large local models (70B+)	Llama 4 Maverick, Qwen 3 72B, DeepSeek V3	Good -- comparable to cloud models with sufficient hardware
Mid-size local models (14B-30B)	Qwen 3 14B/30B, Phi-4 14B, Mistral 7x8B	Adequate -- some scoring noise; suitable for rapid iteration
Small local models (<=8B)	Llama 3.2 3B, Phi-4-mini, Gemma 3 4B	Noisy -- scores may be inconsistent and affect metric reliability

For reliable metrics that can inform production search decisions, we recommend frontier cloud models or 70B+ parameter local models. Smaller models are useful for fast, low-cost iteration during development but their scores should be interpreted with caution.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Supported LLM Providers

Requirements

Cloud providers (recommended)

Provider detection

Batch API support

Local models via OpenAI-compatible servers

Environment variable alternative

Tested local servers

Model quality guidance

See also

FilesExpand file tree

supported-llm-providers.md

Latest commit

History

supported-llm-providers.md

File metadata and controls

Supported LLM Providers

Requirements

Cloud providers (recommended)

Provider detection

Batch API support

Local models via OpenAI-compatible servers

Environment variable alternative

Tested local servers

Model quality guidance

See also