CLAUDE.md

Guidance for Claude Code when working in this repository.

See README.md for user-facing docs, docs/ARCH.md for architecture detail, and docs/SERVER.md for the HTTP server (aictl-server). This file is the compact reference for code changes.

Build & Run

cargo build                    # debug build (workspace, all members)
cargo build --release          # release build
cargo run --bin aictl -- <args> # run the CLI binary with arguments
cargo lint                     # clippy pedantic (alias in .cargo/config.toml; lints default-members — desktop excluded)
cargo fmt                      # format
cargo test                     # run tests across the workspace

Workspace layout

Four-crate Cargo workspace; aictl-desktop is excluded from default-members so a bare cargo build / cargo lint / cargo test keeps working without Tauri's deps. CI builds the desktop separately on macOS only.

  • crates/aictl-core/ — package aictl-core, library crate (lib name aictl_core). Hosts the agent loop, providers, tools, security, sessions, audit log, MCP/plugin/hook systems, and the aictl_core::ui::AgentUI trait that frontends implement. Does not link any terminal library; every side effect routes through AgentUI (or aictl_core::ui::warn_global for runtime warnings).
  • crates/aictl-cli/ — package aictl-cli, binary crate ([[bin]] name = "aictl"). Hosts the REPL, slash-command UI, status banner, and the PlainUI / InteractiveUI impls of AgentUI (crossterm + indicatif + termimad + rustyline live here). Re-exports aictl-core's modules under crate::* for legacy import paths.
  • crates/aictl-server/ — package aictl-server, binary crate ([[bin]] name = "aictl-server"). OpenAI-compatible HTTP LLM proxy: POST /v1/chat/completions, POST /v1/completions, POST /v1/messages, GET /v1/models, GET /v1/stats, GET /healthz. Pure proxy — no agent loop, no tool dispatch, no agents/skills/sessions. The OpenAI-shaped routes translate to the engine's Vec<Message> and dispatch via aictl_core::llm::call_<provider>. POST /v1/messages is dual-mode (see messages.rs): Anthropic models always pass through messages::passthrough verbatim to api.anthropic.com/v1/messages (tool_use / tool_result blocks untouched, prompt caching / extended thinking / anthropic-beta features intact); non-Anthropic models are rejected with 400 model_not_found unless AICTL_SERVER_MESSAGES_CROSS_PROVIDER=true, in which case they route through messages::translator — a per-provider HTTP round-trip (not aictl_core::llm::call_* — those use the engine's XML tool format) that translates the Anthropic request shape into OpenAI / Gemini / Ollama native shapes and back. The translator owns its own dispatch, native tools[] / tool_calls[] survive the round-trip, and provider streams bridge to Anthropic's structured SSE event sequence (message_start / content_block_start / content_block_delta / content_block_stop / message_delta / message_stop) via state machines under messages/translator/stream/. Unsupported Anthropic features (cache_control, thinking, PDF blocks, URL images on Gemini/Ollama) flow through messages::translator::feature_gate with strip / warn / reject modes (AICTL_SERVER_MESSAGES_FEATURE_GATE); GGUF / MLX are rejected on the cross-provider path. Reuses aictl_core::run::redact_outbound and aictl_core::security::detect_prompt_injection on every path; passthrough logs gateway:anthropic, translation logs gateway:messages:<provider>, plus feature_dropped:<provider> entries when fields are stripped. 
Master-key gate on every authenticated route; master_key::resolve reads AICTL_SERVER_MASTER_KEY through keys::get_secret (keyring-first, plain config fallback) and persists a generated key via keys::set_secret on first launch — so CLI-side /keys lock/unlock cycles round-trip the entry without restarting the server. axum 0.8 + tower-http live here only — they never enter aictl-core. See docs/SERVER.md for the full reference.
  • crates/aictl-desktop/ — package aictl-desktop, Tauri-based macOS desktop app (work-in-progress, unreleased). Excluded from default-members; only built explicitly with cargo build -p aictl-desktop. Reuses the aictl-core engine; no functional drift from the CLI.
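
The strip / warn / reject modes of the cross-provider feature gate can be pictured with a small sketch. This is not the code in messages::translator::feature_gate: GateMode, GateOutcome, and apply_gate are hypothetical names, and the reading of warn as "drop the field and report it" is an assumption.

```rust
/// Hypothetical gate modes; the source only names them strip / warn / reject.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GateMode {
    Strip,
    Warn,
    Reject,
}

/// Illustrative result of gating one request.
#[derive(Debug, PartialEq, Eq)]
pub struct GateOutcome {
    /// Features forwarded to the target provider.
    pub kept: Vec<String>,
    /// Features removed from the request.
    pub dropped: Vec<String>,
    /// Messages produced in Warn mode (assumed behavior).
    pub warnings: Vec<String>,
}

/// Apply the gate to the features a request uses, given the set the
/// target provider cannot honor.
pub fn apply_gate(
    mode: GateMode,
    requested: &[&str],
    unsupported: &[&str],
) -> Result<GateOutcome, String> {
    let mut out = GateOutcome {
        kept: Vec::new(),
        dropped: Vec::new(),
        warnings: Vec::new(),
    };
    for &feature in requested {
        if !unsupported.contains(&feature) {
            out.kept.push(feature.to_string());
            continue;
        }
        match mode {
            // Reject fails the whole request on the first unsupported feature.
            GateMode::Reject => return Err(format!("unsupported feature: {feature}")),
            // Warn drops the feature and records a message.
            GateMode::Warn => {
                out.warnings
                    .push(format!("dropping unsupported feature: {feature}"));
                out.dropped.push(feature.to_string());
            }
            // Strip drops the feature silently.
            GateMode::Strip => out.dropped.push(feature.to_string()),
        }
    }
    Ok(out)
}
```

In this sketch the dropped list is what a caller would turn into feature_dropped:<provider> audit entries.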

Cargo features (gguf, mlx, redaction-ner) live on the aictl-core crate; the CLI and server declare them as aictl-core/<feature> passthroughs.

Module map

Submodule trees: llm/ (providers) and tools/ (tool impls) live under crates/aictl-core/src/. commands/ (slash-command handlers) lives under crates/aictl-cli/src/.

  • crates/aictl-cli/src/main.rs — CLI (clap), config + security + session init, agent loop driver, single-shot vs REPL
  • crates/aictl-core/src/run.rs — run_agent_turn loop, tool-call dispatch, outbound redaction, stream suspend wiring; also exposes Provider, with_esc_cancel, Interrupted, build_system_prompt, run_agent_single
  • agents.rs (+ agents/remote.rs) — agent prompts in ~/.aictl/agents/ (per-user catalogue) plus project-local overrides at <cwd>/.aictl/agents/ or <cwd>/.claude/agents/ as a legacy fallback (the presence of .aictl/ skips .claude/ entirely — see config::local_config_root). Both bare <name> and <name>.md filenames are accepted (the .md convention matches the remote catalogue and the project-local directories). read_agent resolves local-first; list_agents merges both with local entries winning on name collision. Each AgentEntry carries an Origin (Global / Local / LocalClaude) that drives the badge in /agent, --list-agents, and the entry-aware delete_agent_entry / save_agent_entry so menu actions land at the correct file. Loaded agent is appended to the system prompt. Optional YAML frontmatter (name, description, source, category) — source: aictl-official renders an [official] badge alongside the origin tag. agents/remote.rs fetches the live catalogue from .aictl/agents/ in the project repo via GitHub's trees API and pulls a single .md on demand (REPL browse entry or --pull-agent <name> + --force) — pulled files always land in the per-user ~/.aictl/agents/. The frontmatter is stripped before the body is injected into the system prompt.
  • audit.rs — per-session JSONL tool-call log under ~/.aictl/audit/<session-id>. set_file_override(path) (wired from --audit-file <PATH>) redirects logging to an explicit file and force-enables the subsystem so single-shot runs (no session id) can capture an audit trail.
  • commands.rs + commands/ — slash commands (/agent, /balance, /behavior, /clear, /compact, /config, /context, /copy, /exit, /gguf, /help, /history, /hooks, /info, /keys, /mcp, /memory, /mlx, /model, /ping, /plugins, /remember, /retry, /roadmap, /security, /session, /skills, /stats, /tools, /undo, /uninstall, /update, /version); any other /<name> falls through to a user-defined skill lookup.
  • config.rs — ~/.aictl/config loader (OnceLock<RwLock<HashMap>>), constants, load_prompt_file, local_config_root (resolves <cwd>/.aictl/ or the legacy <cwd>/.claude/ root used by agents and skills for project-local overrides)
  • keys.rs — keyring-backed API key storage with plain-text fallback; use get_secret(name) not config_get for keys
  • security.rs — SecurityPolicy: shell/path validation, CWD jail, env scrub, output sanitization, prompt-injection guard
  • security/redaction.rs (+ redaction/ner.rs) — redactor used at two seams: network-boundary (redact_outbound in run.rs) and persistence-boundary (redact_for_persistence, called from session::save_messages and the REPL's add_redacted_history). Three layers — A: regex, B: entropy, C: optional NER. redact_for_persistence treats Block mode as Redact (the network call has already aborted; we still want placeholders on disk).
  • memory.rs — long-term memory store at ~/.aictl/memory.json. Two write seams: the save_memory tool the agent calls when it spots a fact worth remembering, and the CLI /remember <fact> slash command. Reads happen in run::build_system_prompt via memory::prompt_block, which appends a # Memory section listing every saved fact. enabled() reads AICTL_MEMORY_ENABLED (default true); session::is_incognito() is the stronger kill-switch — when on, add returns Disabled, load_for_prompt returns empty, and the prompt block is suppressed so a temporary chat never leaks into the long-term store and never sees prior memories. List capped at 200 entries (MAX_ENTRIES); each entry capped at 1000 chars (MAX_ENTRY_LEN); over-cap writes drop the oldest entry first. The CLI surfaces management via /memory (toggle / view / delete one / clear all) and the non-interactive --list-memories / --remember <FACT> flags. The desktop app exposes the same surface under Settings → Memory via the memory_status / memory_set_enabled / memory_add / memory_remove / memory_clear Tauri commands.
  • session.rs — session persistence + incognito toggle
  • skills.rs (+ skills/remote.rs) — ~/.aictl/skills/<name>/SKILL.md CRUD + frontmatter parse, with project-local overrides at <cwd>/.aictl/skills/<name>/SKILL.md or <cwd>/.claude/skills/<name>/SKILL.md as a legacy fallback (same .aictl/ > .claude/ precedence as agents). find resolves local-first; list merges both with local entries winning on name collision. Each SkillEntry carries an Origin (Global / Local / LocalClaude) that drives the badge in /skills and --list-skills and is consumed by the entry-aware delete_entry so menu actions target the correct directory. Skills are one-turn-scoped markdown playbooks: for one run::run_agent_turn call the skill body is concatenated onto messages[0].content (not inserted as a separate System message — Anthropic/Gemini only keep the last System they see) and never written into session history. Invoked via /<skill-name>, --skill <name>, or the /skills menu. AICTL_SKILLS_DIR overrides the per-user default directory (local overrides are unaffected). Optional YAML frontmatter (name, description, source, category) — source: aictl-official renders an [official] badge alongside the origin tag. skills/remote.rs fetches the live catalogue from .aictl/skills/<name>/SKILL.md in the project repo via GitHub's trees API and pulls a single SKILL.md on demand (REPL browse entry or --pull-skill <name> + --force) — pulled directories always land in the per-user catalogue.
  • stats.rs — usage stats under ~/.aictl/stats
  • tools.rs + tools/ — XML parsing, dispatch, duplicate guard, per-tool impls (35 tools, including save_memory which writes through memory::add, view_map which the desktop app intercepts to render OpenStreetMap/Esri pins, and draw_chart which the desktop app intercepts to render a Chart.js canvas that re-themes when the app theme flips). Tool names starting with mcp__ route to mcp::call_tool; everything else unknown falls through to plugins::find() so user-installed plugin tools dispatch through the same gate. Duplicate-call key normalizes JSON bodies for mcp__* calls so whitespace differences don't create distinct cache entries.
  • hooks.rs — user-defined lifecycle hooks loaded from ~/.aictl/hooks.json (override via AICTL_HOOKS_FILE). Events: SessionStart, SessionEnd, UserPromptSubmit (can block or rewrite the prompt), PreToolUse (can block or pre-approve a tool), PostToolUse (observe + add context), Stop (after final answer), PreCompact, Notification. Each hook is { matcher, command, timeout, enabled }; the matcher is a glob with */?/| alternation against the tool name (or * for non-tool events). Hooks receive a JSON payload on stdin (event, session_id, cwd, tool, prompt, …) and may return JSON on stdout: {"decision":"block","reason":"..."} aborts the action; {"decision":"approve","reason":"..."} skips the user confirm; {"additionalContext":"..."} injects a <hook_context> block into the next turn; {"rewrittenPrompt":"..."} (UserPromptSubmit only) replaces the user message before the agent sees it. Plain-text stdout becomes additionalContext. Exit code 2 is short-hand for block with stderr as the reason. Hooks are harness behavior — --unrestricted does not bypass them. /hooks REPL menu and --list-hooks flag manage the catalogue. Default per-hook timeout: 60s.
  • mcp.rs (+ mcp/{config,protocol,transport,stdio,http,sse}.rs) — Model Context Protocol client. Servers declared in ~/.aictl/mcp.json (override via AICTL_MCP_CONFIG) in a Claude Desktop-compatible shape. Stdio entries: { command, args, env, enabled, timeout_secs }. Remote entries set transport: "http" (modern Streamable HTTP) or "sse" (legacy HTTP+SSE) and supply { url, headers, enabled, timeout_secs }. Both env and headers values support ${keyring:NAME} substitution that pulls a secret from keys::get_secret(NAME) at parse time rather than checking it in. The mcp::transport::Transport trait (boxed-future, object-safe via Arc<dyn Transport>) is the shared dispatch surface for StdioClient / HttpClient / SseClient — the call-site in mcp::call_tool doesn't care which transport a server uses. JSON-RPC 2.0 framing is hand-rolled (no extra deps): init_with() parses the config, connects each enabled server in parallel, completes the initialize handshake under AICTL_MCP_STARTUP_TIMEOUT (default 10s), calls tools/list, stores the catalogue in a OnceLock<Vec<McpServer>>. Per-server failures land in ServerState::Failed(reason) and never abort startup. Tools surface as mcp__<server>__<tool>; the catalogue (with input schemas) is appended to the system prompt by run::build_system_prompt. tools.rs::execute_tool routes any name starting with mcp__ to mcp::call_tool, which canonicalizes the JSON body, sends tools/call, and concatenates content[] text blocks. Whole subsystem gated behind AICTL_MCP_ENABLED=true (default off) — third-party processes do not auto-spawn. Remote-transport gate: security::validate_mcp_url enforces hostname allow/deny and HTTPS-by-default — AICTL_MCP_ALLOW_HOSTS=api.example.com,*.foo.com (whitelist; empty/unset = allow-any beyond the deny list), AICTL_MCP_DENY_HOSTS=bad.example.com (blacklist; always wins), AICTL_MCP_ALLOW_HTTP=true (opt-in for plaintext http://). 
The check runs at config-parse time so a denied URL fails fast, and the HTTP/SSE clients re-validate at every dispatch as defense-in-depth — outbound network calls otherwise bypass the CWD jail. AICTL_MCP_DENY_SERVERS=foo,bar blanket-blocks servers at the tool-call security gate; AICTL_MCP_DISABLED=foo skips them at init time. --mcp-server <name> restricts a single process to one server without persisting the disable list. mcp::shutdown() runs on every exit path; stdio Child spawns with kill_on_drop(true) and the SSE reader task is aborted on Drop as a backstop. --list-mcp prints the catalogue; /mcp is the REPL menu. Reference smoke server at examples/mcp/tiny_add/server.py and example config at examples/mcp.json.
  • plugins.rs — discovery + execution of user-installed plugin tools under ~/.aictl/plugins/<name>/ (override via AICTL_PLUGINS_DIR). Each plugin pairs a plugin.toml manifest (name/description/entrypoint/optional requires_confirmation/timeout_secs/schema_hint) with an executable. init() walks the directory, parses the manifest with a hand-rolled mini-TOML parser (subset: strings, bools, ints, triple-quoted strings), validates the entrypoint stays inside the plugin dir (symlink-aware, rejects collisions with built-in tool names), and stores survivors. execute_plugin spawns the entrypoint directly (no shell), pipes the tool body in on stdin, returns stdout — or [exit N] <stderr> on non-zero exit. Pinned to security::working_dir() with scrubbed_env() and the plugin's manifest timeout (falling back to security::shell_timeout). The whole subsystem is gated behind AICTL_PLUGINS_ENABLED=true (default off) — third-party code must not auto-load. --list-plugins prints the catalogue; /plugins is the REPL menu.
  • crates/aictl-core/src/ui.rs — AgentUI trait, ToolApproval, ProgressHandle + ProgressBackend, the WarningSink / set_warning_sink / warn_global global-warn surface. No terminal-library types in scope.
  • crates/aictl-cli/src/ui.rs — PlainUI (single-shot, pipe-friendly) + InteractiveUI (REPL: termimad markdown, crossterm tool-confirm selector, indicatif progress backend, raw-mode Esc cancel listener). PlainUI carries an OutputFormat (md default = pass through raw markdown source, streamed when stdout is a TTY; text = strip markdown markers via strip_markdown regex pass and emit plain prose; json = emit a one-line {"answer", "model", "provider"} envelope on stdout). Stream chunks for text/json are swallowed because formatting markers can split across deltas — show_answer is the single emission point on those paths. text/json also suppress reasoning / auto-tool / tool-result chatter (json on stderr too) so stdout stays clean for piping. Re-exports aictl_core::ui types so legacy crate::ui::AgentUI paths keep resolving.
  • llm.rs + llm/ — TokenUsage, MODELS catalog, provider calls (OpenAI, Anthropic, Gemini, Grok, Mistral, DeepSeek, Kimi, Z.ai, Ollama, GGUF, MLX). llm/gguf.rs::CATALOG and llm/mlx.rs::CATALOG expose the curated starter lists (label / spec / size_label) the CLI's /gguf+/mlx menus and the desktop's "Local Models" Settings tab both consume — single source of truth so the two frontends stay in sync. llm/balance.rs exposes per-provider credit/quota probes used by /balance and --balance / --list-balances: real fetchers for DeepSeek (GET /user/balance) and Kimi (GET /v1/users/me/balance — base URL via LLM_KIMI_BASE_URL for the .cn endpoint); every other cloud provider returns Unknown with a billing-dashboard hint. Local providers are not probed. When aictl-server routing is active (AICTL_CLIENT_HOST + AICTL_CLIENT_MASTER_KEY), fetch_all short-circuits per-provider probes and instead pulls the server's /v1/stats aggregate (per-provider rows surface as Unknown with a hint that the server tracks dispatch counts, not upstream balances).
  • llm/server_proxy.rs — aictl-server upstream client. When config::active_server() returns Some((url, key)) and the resolved provider is non-local (!Provider::is_local()), run::run_agent_turn routes the LLM call through ${url}/v1/chat/completions with Authorization: Bearer ${key} instead of dispatching to a per-provider module. Reuses llm/openai's pub(crate) request/response shapes verbatim — the server speaks the OpenAI shape so duplicating structs would let them drift. Streaming reuses llm::stream::drive_openai_compatible_stream. Maps the server's {"error":{"code":..,"message":..}} envelope into AictlError::{Auth,Injection,Redaction,Provider}. Performs a once-per-process GET /healthz probe before the first proxied request — non-2xx warns and proceeds, network failure warns and proceeds (the next chat call surfaces the real error). Provider::is_local() returns true for Ollama / Gguf / Mlx only.
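
The hook matcher described under hooks.rs (a glob with * / ? and | alternation, tested against the tool name) could be sketched as follows. This is an illustration under assumptions, not the matcher in hooks.rs: glob_match and matcher_matches are hypothetical names, and details such as escaping or whitespace handling around | are not covered by the source.

```rust
/// Match one glob alternative against `text`.
/// `*` matches any run of characters (including none), `?` matches exactly one.
fn glob_match(pattern: &[char], text: &[char]) -> bool {
    match pattern.split_first() {
        // Pattern exhausted: succeed only if the text is exhausted too.
        None => text.is_empty(),
        // Let `*` absorb 0..=text.len() characters and try the rest each time.
        Some((&'*', rest)) => (0..=text.len()).any(|skip| glob_match(rest, &text[skip..])),
        // `?` needs one character of input.
        Some((&'?', rest)) => !text.is_empty() && glob_match(rest, &text[1..]),
        // Literal character: must equal the next input character.
        Some((&c, rest)) => text.first() == Some(&c) && glob_match(rest, &text[1..]),
    }
}

/// True when any `|`-separated alternative of `matcher` matches `tool_name`.
pub fn matcher_matches(matcher: &str, tool_name: &str) -> bool {
    let text: Vec<char> = tool_name.chars().collect();
    matcher.split('|').any(|alternative| {
        let pattern: Vec<char> = alternative.chars().collect();
        glob_match(&pattern, &text)
    })
}
```

With this sketch, a matcher of read_*|write_file selects read_file and write_file but not edit_file, and a bare * selects every name. The recursive form is easy to read but backtracks on patterns with many stars, which is acceptable for short tool names.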
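
The ${keyring:NAME} substitution that mcp.rs applies to env and headers values can be sketched with an injected lookup in place of keys::get_secret. The function name substitute_keyring and both error cases (unknown secret, unterminated placeholder) are assumptions; the source does not say how the real parser reports them.

```rust
/// Replace every `${keyring:NAME}` in `input` with `lookup(NAME)`.
/// `lookup` stands in for keys::get_secret; it is injected so the sketch
/// stays self-contained.
pub fn substitute_keyring<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    const OPEN: &str = "${keyring:";
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find(OPEN) {
        // Copy the literal text in front of the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        // Assumed behavior: a placeholder without a closing brace is an error.
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder in {input:?}"))?;
        let name = &after[..end];
        // Assumed behavior: an unknown secret name is an error.
        let secret = lookup(name).ok_or_else(|| format!("no secret named {name}"))?;
        out.push_str(&secret);
        rest = &after[end + 1..];
    }
    // Copy whatever follows the last placeholder.
    out.push_str(rest);
    Ok(out)
}
```

Resolving at parse time, as the text describes, means the config file on disk only ever holds the placeholder, never the secret itself.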

Key behaviors (non-obvious)

  • Config: ~/.aictl/config only — no .env, no system env vars for program parameters. CLI args override. config_set / config_unset write through to disk and cache. config_overlay mutates the in-memory cache only (used by ephemeral CLI flag overrides like --client-url); keys::override_secret is the matching mechanism for secrets — its value beats both keyring and plain config in keys::get_secret lookups. config::Role (Cli / Server) + config::set_role mark which binary loaded the engine; config::config_get_scoped(server_key, cli_key) reads server_key first when role=Server (falling back to cli_key if unset) and ignores server_key entirely when role=Cli — used by security::load_policy, redaction::load_policy, and audit::enabled. The CLI never sets a role (defaults to Cli); aictl-server's main calls set_role(Role::Server) immediately after load_config.
  • Server-scoped security/redaction flags: every CLI flag that has a meaning in a pure HTTP proxy has a paired AICTL_SERVER_* form: AICTL_SERVER_SECURITY, AICTL_SERVER_SECURITY_INJECTION_GUARD, AICTL_SERVER_SECURITY_AUDIT_LOG, AICTL_SERVER_SECURITY_REDACTION, AICTL_SERVER_SECURITY_REDACTION_LOCAL, AICTL_SERVER_REDACTION_DETECTORS, AICTL_SERVER_REDACTION_EXTRA_PATTERNS, AICTL_SERVER_REDACTION_ALLOW, AICTL_SERVER_REDACTION_NER, AICTL_SERVER_REDACTION_NER_MODEL. Tool-dispatch knobs (CWD jail, shell allow/block lists, blocked env vars, disabled tools, max-write byte cap, shell timeout) are intentionally not mirrored — the server does not run tools.
  • aictl-server routing: opt-in only. The proxy is reached only when Provider::AictlServer is the active provider (via --provider aictl-server, AICTL_PROVIDER=aictl-server, or the aictl-server: section in /model). Picking any other provider goes straight to that provider's API regardless of whether AICTL_CLIENT_HOST is set — the user's chosen provider wins. Setting AICTL_CLIENT_HOST / AICTL_CLIENT_MASTER_KEY is not required to use the CLI; both keys are inert until the user explicitly picks the aictl-server provider. When that provider is selected, dispatch goes through ${AICTL_CLIENT_HOST}/v1/chat/completions with Authorization: Bearer ${AICTL_CLIENT_MASTER_KEY}; missing config surfaces a clear AictlError::Other from the agent loop. AICTL_CLIENT_MASTER_KEY follows the standard /keys lifecycle (plain config → keyring on lock); it is not the same secret as AICTL_SERVER_MASTER_KEY (which lives on the server side), but both keys participate in the same /keys lock/unlock/clear flow so a co-located host can move them together. The server resolves its key through keys::get_secret so a locked entry is found in the keyring transparently. --client-url <URL> and --client-master-key <KEY> override for one launch without persisting. The /model menu fetches ${url}/v1/models when active_server() resolves and shows the catalogue under an aictl-server: section — failures (server down, wrong key) just produce an empty list rather than aborting the menu. /ping probes /healthz then exercises the master key against /v1/models. /balance always probes the upstream provider endpoints directly with the operator's local keys (the server's /v1/stats reports dispatch counts, not balances). The dispatch branch lives in run::run_agent_turn only (CI gate: grep -rE 'server_proxy::call' crates/aictl-core/src/ | grep -v 'run.rs\|server_proxy.rs' must be empty).
  • Prompt file: AICTL.md in CWD is appended to system prompt (override via AICTL_PROMPT_FILE). Falls back to CLAUDE.md then AGENTS.md unless AICTL_PROMPT_FALLBACK=false.
  • Security gate: every tool call passes through security::validate_tool() before exec and output sanitization on return. --unrestricted bypasses validation; audit + redaction keep running. The mcp__* arm enforces a body-size cap (max_file_write_bytes) and AICTL_MCP_DENY_SERVERS; the CWD jail does not apply to MCP tools because the server runs in its own process with its own privileges. Workspace carve-out for blocked paths: security::check_path_with skips the blocked-paths rejection when the active working_dir is itself anchored inside that blocked tree and the target is inside the workspace — so the desktop default ~/.aictl/workspace/ (under the blocked ~/.aictl/) is usable while siblings like ~/.aictl/keys and ~/.aictl/audit stay off-limits.
  • CWD jail root: security::load_policy reads std::env::current_dir() for paths.working_dir, which is the jail root and the spawn dir for every tool subprocess. apply_cwd_override (in aictl-cli/src/main.rs) resolves the working directory in this order: --cwd <PATH> flag, then AICTL_WORKING_DIR_CLI config key (canonical CLI-specific anchor, parallel to AICTL_WORKING_DIR_DESKTOP), then AICTL_WORKING_DIR (legacy unsuffixed fallback — kept working for existing configs; _CLI wins when both are set), then the process launch dir. The chosen path is canonicalized (handles ~ and relative inputs), verified to be a directory, and made the process working directory via set_current_dir before any subsystem reads the launch dir — so the same anchor flips the jail root, config::load_prompt_file, and config::local_config_root together. Bad paths exit loud rather than silently falling back. CLI-only — apply_cwd_override is not called from aictl-server (the server has no tool dispatch and no jail).
  • Redaction: two seams. run::redact_outbound runs right before provider dispatch — local providers (Ollama/GGUF/MLX) skipped unless AICTL_SECURITY_REDACTION_LOCAL=true, and Block mode aborts the turn there. redaction::redact_for_persistence runs at write-time in session::save_messages (CLI + desktop) and repl::add_redacted_history (CLI rustyline buffer + ~/.aictl/history on exit); both treat Block as Redact so the offending message lands as [REDACTED:<KIND>] on disk rather than verbatim. The in-memory Vec<Message> is never mutated by either seam.
  • Streaming: call_X(..., on_token) — Some → streaming path, None → buffered. StreamState in crates/aictl-core/src/llm/stream.rs holds back anything that could prefix <tool name=" so tool XML never hits the UI. Auto-disables under --quiet, compaction, agent-prompt generation, non-TTY stdout. The transport-level streaming flag in run_agent_single is not aware of --format, so text / json modes still receive deltas; PlainUI::stream_chunk discards them on those paths and show_answer does the emission. Skips termimad markdown re-render.
  • Agent loop: up to 20 iterations. Every provider call wrapped in tokio::time::timeout (AICTL_LLM_TIMEOUT, default 30s; 0 disables).
  • Key storage: keys::get_secret(name) checks keyring first, falls back to plain config. keyring v3 needs apple-native + sync-secret-service features or it silently uses a mock store.
  • CLI flags: long-form only. The only short flags are -v / -h.
  • Cargo features (default off): gguf (llama-cpp-2), mlx (macOS+aarch64), redaction-ner (gline-rs). Model management CLI paths compile on every build; only the inference call is feature-gated and returns a rebuild hint when missing.
  • Coding-agent mode (experimental, default off): base-prompt override gated by AICTL_CODING_AGENT (default false). When on (and Role is not Server), run::build_system_prompt returns SYSTEM_PROMPT_CODING instead of SYSTEM_PROMPT — same XML tool spec and tool catalogue, prose adds the Explore → Plan → Code → Review → Test discipline. Phase 2 tool surface (universal — applies in non-coding mode too, but the coding prompt is where the model is steered to actually use it): edit_file accepts multiple <<< … === … >>> blocks per call applied top-to-bottom and atomic (any block failure aborts the write — no partial state on disk), each optionally scoped by @N / @N-M to a 1-based inclusive line range; on a zero-hit exact match the tool retries with whitespace normalized per line (runs of spaces/tabs collapsed, trimmed) and applies the model's new text verbatim if exactly one fuzzy candidate exists, otherwise surfaces an "N candidates" error rather than guessing. search_files and find_files shell out to rg when it is on PATH (respecting .gitignore, supporting --regex / --type / --case / --context / --max / --no-ignore on search and --type on find) and fall back to the existing pure-Rust path otherwise — probe cached in a OnceLock<bool> per process, override with AICTL_TEST_FORCE_RG_FALLBACK=1. read_file takes an optional second-line --lines [N|N-M] flag that returns the requested slice with NNNNN: line-number prefixes (bare --lines numbers the whole file; end-of-range past EOF clamps with a trailing (end of file at line N) note; start past EOF returns (file ends at line N, no content)). All grammars stay additive: the single-block edit_file, the bare pattern\ndir search_files, and the unflagged read_file path keep working unchanged. The security gate's search_files / find_files dir extraction was updated to skip --flag lines so the policy still sees the actual directory under the new grammar. 
For production coding work prefer dedicated tools (Claude Code, OpenAI Codex CLI, opencode); aictl's mode is for quick edits and exploration. coding_agent_enabled() in aictl-core::config short-circuits to false under Role::Server so a server reading the shared config never adopts the coding prompt. The CLI exposes --coding-agent / --no-coding-agent (one-launch overlay), the /coding on|off|toggle|status slash command (persisted to disk; legacy /coding-agent and /coding_agent still route through), /skip [review|test] for phase shortcuts, a dim [phase] prefix in the REPL prompt, and a coding: line in --info. The desktop exposes the same master switch via the coding_agent_status / coding_agent_set_enabled Tauri commands, surfaced in Settings → General → Coding agent and as a chevrons-in-square composer-toolbar icon (slotted between memory and auto-accept); the composer is a single toolbar row (model picker + icon cluster ending in the ⌘↵ Send button, slotted right of the mic), so tauri.conf.json minWidth and App.tsx::CHAT_MIN_WIDTH need to fit that row (locked together at 860). Phase tracker, /skip, and the [phase] indicator are CLI-only in v1 — the desktop benefits from the prompt steering but doesn't render phase UI. WorkflowPhase enum + auto-detect helpers live in aictl-core::coding. Auto-detection in coding::detect_linter / detect_test_cmd covers Rust, Node, Python, Go, Gradle, Maven, CMake, and Make; project-level commands prefer wrappers (gradlew, mvnw) when present and prefer clang-tidy for C/C++ when compile_commands.json exists, falling back to cppcheck. The per-file lint_file registry adds .java (google-java-format → checkstyle → javac -Xlint) and .kt/.kts (ktlint → ktfmt) alongside the existing groups.
Phase 3 (the test tool is callable in non-coding mode; retry loop and Review hook fire only when coding_agent_enabled()): dedicated test tool (tools/test.rs + tools/test/parsers.rs, TOOL_COUNT 36) shells the auto-detected runner (cargo / npm / pytest / go / gradle / maven / ctest / make), parses pass / fail / skipped counts plus per-failure detail, and stores a structured TestSummary on a private async slot. Body grammar: empty (auto-detect), <filter> (threaded through cargo test <f> / pytest -k <f> / go test -run <f> / npm test -- <f> / ./gradlew test --tests <f>), or --cmd <command> escape hatch. In coding-agent mode the agent loop drains the summary after every test dispatch and, on failed > 0, injects a <test_failure> user turn carrying the structured failures (capped at 25 / 400 char message) so the model can plan a fix; the host caps at AICTL_CODING_TEST_RETRIES re-loops and switches to <test_failure_terminal> once exhausted. A <repo_context> block (branch, last 5 commits, dirty files, top-level layout depth 2, detected build / lint / test commands) is appended to SYSTEM_PROMPT_CODING; cached per working dir in coding::collect_repo_context, busted by coding::invalidate_repo_context after every write_file / edit_file / remove_file / create_directory so the dirty-files list stays current. coding::detect_build_cmd joins detect_linter / detect_test_cmd. When the model emits a no-tool-call response in coding-agent mode and the session has touched at least one file, the host runs a structured Review hook (coding::run_structured_review) before releasing the answer: the project build + lint_file on each changed path. On Pass the answer ships with a [review: clean — …] banner prepended; on Fail the host pushes a <review_result> user turn carrying the build / lint output tails and continues the loop, capped at AICTL_CODING_REVIEW_RETRIES (default 2) before releasing with a [review: N attempt(s); failures may remain] banner. 
New config keys: AICTL_CODING_BUILD_CMD, AICTL_CODING_REVIEW_RETRIES, AICTL_CODING_REPO_CONTEXT / _TREE_DEPTH / _TREE_MAX, AICTL_CODING_TEST_FILTER_DEFAULT. CLI gains /coding refresh (busts the repo-context cache + clears the changed-paths tracker) and three new lines under --info (build:, lint:, test:) when coding-agent mode is on; /coding status shows the resolved commands plus the test / review retry budgets and the per-session changed-files count. Desktop gains three Tauri commands (coding_agent_build_cmd / coding_agent_lint_cmd / coding_agent_test_cmd) and a read-only "Resolved commands" section under Settings → Coding agent. Phase 4 (universal — the multi-tool grammar applies in non-coding mode too): tools::parse_tool_calls returns Vec<ToolCall> so one model response can carry multiple <tool> blocks. Read-only batches (every name returning true from tools::is_parallelizable — read_file / list_directory / search_files / find_files / git status|log|blame|diff / lint_file / check_port / system_info / fetch_url / extract_website / read_document / read_image / json_query / csv_query / calculate / fetch_datetime / fetch_geolocation / clipboard read / diff_files / checksum / list_processes) dispatch concurrently via tokio::task::JoinSet in run::handle_tool_batch, chunked by AICTL_CODING_PARALLEL_TOOLS_MAX (default 4, clamped to 16; 0 disables and forces serial — the loop runs only the first call and pushes a single <tool_results> rejection envelope for the rest). Side-effect classification lives in tools::SIDE_EFFECT_TOOLS plus body inspection for git commit (vs read subcommands) and clipboard write (vs read); MCP and plugin tools are conservatively non-parallel.
When a batch mixes side-effects with reads, the host short-circuits: only the first side-effect dispatches (serially, through the unchanged handle_tool_call path so its security gate / hooks / audit / Review-hook seams all run); the read-only siblings land in a single <tool_results> envelope as rejection notes so the model re-emits them next turn. Per-call <tool_result> blocks join in source order regardless of completion order; per-call hooks, per-call audit, and the redaction Block seam all fire independently. The duplicate-call guard is single-call-only (each in-batch call still has its own check inside tools::execute_tool). Approval is bundled — *auto auto-approves the whole batch; otherwise the user confirms once on the first call and the decision applies to all. --info gains a parallel: line (showing the cap or "disabled"). Mid-stream streaming wires: StreamState scans full for <phase>NAME</phase> and emits StreamEvent::PhaseChange(WorkflowPhase) on the same mpsc channel that already carries deltas / suspend; AgentUI::on_phase_change is a default-no-op the CLI's InteractiveUI overrides to store the latest phase in a Mutex<Option<WorkflowPhase>>, drained by the REPL via take_latest_phase after each turn — so the prompt prefix flips to a model-claimed phase even when the tag fired in an intermediate LLM call rather than the final answer. CI gate: grep -rE 'parse_tool_calls|dispatch_parallel|handle_tool_batch|PhaseChange' crates/aictl-server/src/ must be empty (Phase 4 lives in the engine + CLI; the server has no agent loop).
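
The scoped lookup from the Config bullet (server_key first when the role is Server, falling back to cli_key; server_key ignored when the role is Cli) can be written out as a pure function. The real config::config_get_scoped takes only the two key names and reads the global config and role; passing the map and role explicitly here is an assumption made to keep the sketch self-contained.

```rust
use std::collections::HashMap;

/// Which binary loaded the engine (mirrors the Cli / Server roles in the text).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Role {
    Cli,
    Server,
}

/// Sketch of the scoped read: Server tries `server_key` and falls back to
/// `cli_key`; Cli never looks at `server_key`.
pub fn config_get_scoped<'a>(
    config: &'a HashMap<String, String>,
    role: Role,
    server_key: &str,
    cli_key: &str,
) -> Option<&'a str> {
    let server_value = match role {
        Role::Server => config.get(server_key),
        Role::Cli => None,
    };
    server_value
        .or_else(|| config.get(cli_key))
        .map(String::as_str)
}
```

The fallback lets a server that shares ~/.aictl/config with a CLI inherit the CLI's settings until an AICTL_SERVER_* key is set explicitly.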
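
The hold-back rule from the Streaming bullet (never emit text that could still grow into <tool name=") can be sketched as a suffix check on the pending buffer. HoldBack and split_safe are hypothetical names; the real StreamState in llm/stream.rs also handles suspend and phase events, which this sketch leaves out.

```rust
/// Opening marker of a tool call in the engine's XML tool format.
const TOOL_MARKER: &str = "<tool name=\"";

/// Split `buffer` into (safe to emit, must hold back).
/// Held back: everything from a full marker onward, or the longest
/// buffer suffix that is a prefix of the marker.
pub fn split_safe(buffer: &str) -> (&str, &str) {
    if let Some(pos) = buffer.find(TOOL_MARKER) {
        return buffer.split_at(pos);
    }
    let max = TOOL_MARKER.len().min(buffer.len());
    for len in (1..=max).rev() {
        let start = buffer.len() - len;
        // Skip offsets that would cut a multi-byte character in half.
        if buffer.is_char_boundary(start) && TOOL_MARKER.starts_with(&buffer[start..]) {
            return buffer.split_at(start);
        }
    }
    (buffer, "")
}

/// Minimal streaming wrapper: feed deltas in, get UI-safe text out.
#[derive(Default)]
pub struct HoldBack {
    pending: String,
}

impl HoldBack {
    /// Append a delta and return the part that can be shown right away.
    pub fn push(&mut self, chunk: &str) -> String {
        self.pending.push_str(chunk);
        let (emit, hold) = split_safe(&self.pending);
        let emit = emit.to_string();
        let hold = hold.to_string();
        self.pending = hold;
        emit
    }
}
```

A delta ending in "<to" is held until the next delta shows whether it continues as a tool call or as ordinary text such as "<today>", in which case the held text is released.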
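
The edit_file fallback from the coding-agent bullet (retry with per-line whitespace normalization, accept only a unique candidate) can be sketched like this. normalize and fuzzy_find are hypothetical names, the 0-based return value is a choice made for this sketch, and the real tool's error wording may differ from the "N candidates" strings used here.

```rust
/// Collapse runs of spaces/tabs to one space and trim both ends.
fn normalize(line: &str) -> String {
    line.split(|c: char| c == ' ' || c == '\t')
        .filter(|part| !part.is_empty())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Find the single place where `search` matches `file` once whitespace is
/// normalized per line. Returns the 0-based index of the first matching
/// line, or an "N candidates" error when the match is missing or ambiguous.
pub fn fuzzy_find(file: &str, search: &str) -> Result<usize, String> {
    let file_lines: Vec<String> = file.lines().map(normalize).collect();
    let needle: Vec<String> = search.lines().map(normalize).collect();
    // `windows(0)` would panic, so an empty needle is reported as no match.
    if needle.is_empty() || needle.len() > file_lines.len() {
        return Err("0 candidates".to_string());
    }
    let hits: Vec<usize> = file_lines
        .windows(needle.len())
        .enumerate()
        .filter(|(_, window)| *window == needle.as_slice())
        .map(|(index, _)| index)
        .collect();
    match hits.as_slice() {
        // Exactly one candidate: safe to apply the replacement there.
        [only] => Ok(*only),
        // Zero or several candidates: refuse rather than guess.
        other => Err(format!("{} candidates", other.len())),
    }
}
```

Refusing on more than one hit matches the stated goal of surfacing an error rather than guessing which block the model meant.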

Conventions

  • Rust edition 2024, default rustfmt and clippy settings.
  • After finishing work, run cargo lint and fix any warnings, then run cargo fmt to re-format code if needed.
  • Commit messages follow .claude/skills/commit/SKILL.md — no AI attribution, imperative mood, short for small changes.
  • After implementing a feature or fixing a bug, check ROADMAP.md — remove the item if resolved.
  • Claude Code skills live in .claude/skills/ — /commit, /update-docs, /evaluate-rust-quality, /evaluate-rust-security, /evaluate-rust-performance. Evaluation reports land in .claude/reports/.