Repo: https://github.com/Perseus-Computing-LLC/plutus (public, MIT)
Local: /opt/data/webui/minions/.minions-data/workspace/plutus/
Status: Live and running. Monitoring + auto-balancing active.
Named for the Greek god of wealth. Plutus is a credit & spend monitor for Hermes Agent that watches the money draining out of every LLM provider you use — and automatically rebalances model routing toward whichever provider has the most runway. It sits alongside Perseus / Mimir / Mneme / Minions in the toolset.
It tracks exactly three providers (the only ones the user cares about): deepseek, anthropic, google.
Fuses two data sources per provider:
- Live balance API where one exists. DeepSeek:
GET /user/balance→ real dollars. (Anthropic/Google have no usable balance API with the current keys.) - Local spend ledger for the rest, read from Hermes
state.db→sessionstable (billing_provider,estimated_cost_usd/actual_cost_usd, token counts). Gives spend today/7d/30d/all, burn rate, and projected days-left.
For no-balance providers, the user supplies a budget and Plutus computes
remaining = budget - ledger_spend. --calibrate back-solves the budget from
the real console balance so estimates self-correct over time.
Outputs: CLI table (default), --json, --html <path> dashboard, --snapshot
(append burn-rate history).
Ranks the three providers by projected days-left, then rewrites Hermes routing:
- Primary = flagship model of the highest-runway provider
- Delegation (subtasks) = fast model of the best non-primary provider
- Fallbacks = the other two providers, flagship first, fast model second
Every config write: backs up config.yaml first, refuses to write if any
provider block or API key would be lost, re-verifies the round-trip after
writing, and a no-op guard skips writes when the decision hasn't changed.
All model IDs are verified live against each provider's /models endpoint
before promotion (this caught gemini-3-pro-preview being 404/retired —
gemini-3.1-pro-preview is the live flagship).
- Primary: google / gemini-3.1-pro-preview (~4000+ days runway, barely used)
- Delegation: anthropic / claude-sonnet-4-5-20250929
- Fallbacks: anthropic/claude-opus-4-8 → deepseek/deepseek-v4-pro → anthropic/claude-sonnet-4-5 → deepseek/deepseek-v4-flash
- DeepSeek (live $57.76, ~6 days runway) pushed out of the hot path so it stops bleeding. It was the bottleneck.
- deepseek: ~$57.76 (LIVE from API), ~6 days
- anthropic: ~$52.63 (estimate: $74.46 budget − spend), ~16 days
- google: ~$93.44 (estimate: $93.59 budget − spend), effectively unlimited
| Job | ID | Schedule | What |
|---|---|---|---|
| Plutus credit refresh | p1u7u5cred01 |
every 60m | plutus-refresh.sh: regen dashboard, snapshot history, re-run the balancer |
| Plutus balance check-in | p1u7u5cal1b8 |
every 3 days | Asks the user for the real Anthropic/Google balances, then recalibrates |
Cron config: /opt/data/webui/minions-hermes-config/cron/jobs.json
Refresh script: /opt/data/webui/minions-hermes-config/cron/plutus-refresh.sh
The system is autonomous now. Things a future session may be asked to do:
- Recalibrate when the user replies to a 3-day check-in with real balances:
python3 plutus.py --calibrate anthropic=NN.NN --calibrate google=NN.NN(back-solves budget; dashboard auto-refreshes). - Watch Google's runway fall. Google now absorbs primary traffic, so its ~4000-day figure will drop fast. The hourly balancer will re-rank and reroute automatically — verify it's behaving as Google approaches Anthropic's runway.
- Optional adds the user floated: CI workflow (lint +
plutus.py --jsonsmoke test), dashboard screenshot/asciinema in README, low-balance email alert via Himalaya (DeepSeek < threshold or days-left < N). - Add a live-balance provider: write a fetcher returning
{"balance_usd":..,"ok":True}, register inBALANCE_FETCHERS+LEDGER_ALIASES.
| File | Purpose | Committed? |
|---|---|---|
plutus.py |
Monitor + calibrate | yes |
plutus_route.py |
Balancing arm | yes |
plutus-refresh.sh |
Cron driver | yes |
plutus.budgets.example.json |
Budget template | yes |
README.md, LICENSE, .gitignore |
Package | yes |
plutus.budgets.json |
Real budgets ($74.46/$93.59) | NO — gitignored |
plutus.html, *.jsonl, *.log, *.bak |
Runtime artifacts | NO — gitignored |
- DeepSeek is the only LIVE balance. Anthropic/Google are estimate-based and only as accurate as the last calibration. The 3-day check-in keeps them honest.
- No
ghCLI here. GitHub ops go through REST API + git over HTTPS, with the token read at runtime inexecute_code()from/opt/data/webui/minions-hermes-config/cache/bws_cache.json(secrets.GITHUB_TOKEN). Credential redaction mangles tokens in any tool ARGUMENT — never inline them. - Config edits are high-stakes. User has zero tolerance for sloppy config
serialization. plutus_route.py's backup + verify + round-trip checks exist for
this reason. A rollback backup is at
config.yaml.plutus-bak-<timestamp>if routing ever needs reverting. - Azure is DEAD. Standing directive: "drop Azure, all-in AWS." Any recalled memory walking through Azure portal / pasting an Azure key is STALE history, archived 2026-06-19. Do not act on it. Plutus tracks only deepseek/anthropic/google.
- New sessions pick up the Google-primary routing immediately; the session that applied it keeps its old model until restart.