Execution guide for autonomous coding agents in this repo. Prefer code truth over docs; keep patches small and validation scoped.
- `pyproject.toml`: packaging, deps, Ruff, Pyright.
- `README.md`: user-facing install and usage.
- `CLAUDE.md`: concise project memory for agents.
- Read the touched module and adjacent tests before changing behavior.
- There are no Cursor rules in `.cursor/rules/` or `.cursorrules`.
- There is no `.github/copilot-instructions.md`.
- There is no `Makefile`, `tox.ini`, `pytest.ini`, `ruff.toml`, or `.editorconfig`.
- Run commands from repo root: `/Users/terminals/Documents/agents/models/carl-studio`.
- Prefer surgical edits over broad cleanup.
- Do not fix unrelated lint/type debt unless asked.
- Update docs only when code truth changed or docs are misleading.
- Validate only the touched surface unless the change crosses modules.
pip install -e ".[dev]"
pip install -e ".[all]"

- Use `.[all]` only when touching optional backends, training extras, MCP, or observe UI.
- After any editable-package refactor, re-run `pip install -e ".[dev]"`. Otherwise the installed metadata lags the source tree and you will see spurious `ImportError` / "module not found" / test-collection errors for freshly added modules (e.g., the v0.9.0 heartbeat / constitutional / optimizer_state false alarms). When in doubt after pulling, run `pip install -e ".[dev]"` first, then run the tests.
pip install build
python -m build

- Build backend is `hatchling`.
- `.github/workflows/publish.yml` also uses `python -m build`.
pytest tests/ -q --tb=short
pytest packages/carl-core/tests/ -q --tb=short
pytest tests/test_release_version.py -q --tb=short
pytest tests/test_release_version.py::test_manual_release_tag_wins_when_higher -q --tb=short
pytest tests/ -k "marketplace and not network" -q --tb=short
pytest --lf -q --tb=short

- Run pytest from repo root. `tests/conftest.py` reads `src/carl_studio/__init__.py` via repo-relative path.
- Pytest is configured with `importlib` import mode (in `pyproject.toml`), so tests under `packages/carl-core/tests/` and `tests/` both resolve without `__init__.py` collisions.
- For fast feedback, run a single file or node ID first, then broaden if needed.
- Test baseline: 3088 tests (v0.15 /simplify) across `tests/` + `packages/carl-core/tests/`. v0.9.0 EML work adds further coverage (count refresh pending T12 validation).
ruff check src/carl_studio tests
ruff check path/to/changed_file.py
ruff format src/carl_studio tests
ruff format path/to/changed_file.py

- Ruff target version is Python 3.11; line length is 100.
- Repo-wide Ruff currently reports pre-existing issues. Prefer targeted runs on changed files.
pyright src/carl_studio/types/ src/carl_studio/tier.py src/carl_studio/settings.py
pyright src/carl_studio/<changed_module>.py

- Pyright runs in `strict` mode.
- The baseline targeted command currently reports existing issues in `settings.py`, `tier.py`, and some type modules.
- Treat those as pre-existing debt unless your change is in that area.
- `packages/carl-core/src/carl_core/`: primitive layer covering errors, retry/backoff, safepath sandbox, content hashing, tier gating, coherence math, and interaction chains. `py.typed` marker present; pyright needs the editable install to resolve the package.
- `src/carl_studio/cli/`: modular Typer CLI package entrypoint.
- `src/carl_studio/cli/init.py`: `carl init` / `carl camp init` wizard. First-run marker lives at `~/.carl/.initialized`.
- `src/carl_studio/cli/flow.py`: `carl flow "/a /b /c"` operation chainer.
- `src/carl_studio/cli/operations.py`: flow op registry.
- `src/carl_studio/cli/contract.py`: v0.9.0. Adds `constitution` subcommand (genesis | verify | evaluate | status). Mounted at top level via `cli/__init__.py` as `contract_app`. Requires `[constitutional]` extra.
- `src/carl_studio/settings.py`: layered settings from env, `~/.carl/config.yaml`, and `carl.yaml`.
- `src/carl_studio/tier.py`: FREE/PAID feature gating (thin shim over `carl_core.tier`).
- `src/carl_studio/types/config.py`: Pydantic training config.
- `src/carl_studio/training/`: trainer, pipeline, rewards, cascade. v0.9.0 adds `training/rewards/eml.py::EMLCompositeReward` as third reward_class; factory branch in `composite.py:381-389` dispatches on `reward_class="eml"`.
- `src/carl_studio/eval/runner.py`: eval runner and eval sandbox.
- `src/carl_studio/compute/`: backend registry and compute backends.
- `src/carl_studio/mcp/server.py`: FastMCP server.
- `src/carl_studio/db.py`: SQLite persistence under `~/.carl/carl.db`.
- `src/carl_studio/admin.py`: hardware-gated private runtime access.
- `src/carl_studio/fsm_ledger.py`: v0.9.0. `FSMState`, `ConstitutionalGatePredicate`, `evaluate_action`. Wires constitutional ledger into the gating surface.
- `src/carl_studio/ttt/eml_head.py`: v0.9.0. Public opaque handle; actual `fit` runs in `terminals-runtime` when the admin gate resolves.
- `src/carl_studio/freshness.py`: typed `FreshnessReport` / `FreshnessIssue` primitive.
- `src/carl_studio/skills/`, `a2a/`, `credits/`, `marketplace.py`, and `curriculum.py` are live modules.
- `from carl_core.errors import CARLError, ValidationError, NetworkError, ConfigError, CredentialError, BudgetError, PermissionError, CARLTimeoutError`: all fatal paths use these; each carries a stable `code` under the `carl.<namespace>` convention. `to_dict()` auto-redacts secret-shaped keys. v0.9.0 adds `carl.eml.depth_exceeded`, `carl.eml.domain_error`, `carl.eml.decode_error`, `carl.eml.signature_mismatch`.
- `from carl_core.retry import retry, async_retry, RetryPolicy, CircuitBreaker, CircuitState, poll`: exponential backoff with circuit-breaker state machine.
- `from carl_core.safepath import safe_resolve, within, SandboxedPath, PathEscape`: enforces `resolved == workdir or startswith(workdir + os.sep)`.
- `from carl_core.hashing import canonical_json, content_hash, content_hash_bytes`: deterministic SHA-256 content hashing.
- `from carl_core.tier import Tier, FEATURE_TIERS, tier_allows, feature_tier, TierGateError`: canonical FREE/PAID enum.
- `from carl_core.interaction import InteractionChain, Step, ActionType`: structured interaction trace primitive threading through training, eval, x402, and the agent loop. v0.9.0: `Step.eml_tree: dict | None = None` (optional witness tree; legacy wire format preserved when unset).
- `from carl_core.eml import EMLNode, EMLTree, eml`: v0.9.0. Tree-structured symbolic witness primitive; depth-bounded, canonically encoded.
- `from carl_core.resonant import Resonant, compose_resonants`: v0.9.0. Composable typed entities; `MAX_DEPTH=4` guard on composition.
- `from carl_core.heartbeat import ...`: v0.9.0. Pure-functional heartbeat loop; Standing Wave Theorem in docstring.
- `from carl_core.optimizer_state import ...`: v0.9.0. Durable Adam `(m, v)` at `~/.carl/optimizer_states/`.
- `from carl_core.constitutional import ConstitutionalPolicy, LedgerBlock, ConstitutionalLedger, encode_action_features`: v0.9.0. Hash-chained append-only constitutional ledger; 25-dim action features.
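
To make "deterministic SHA-256 content hashing" concrete, here is a minimal stdlib sketch of the general technique: serialize with sorted keys and fixed separators, then hash the UTF-8 bytes. The `*_sketch` function names and the exact serialization settings are assumptions for illustration; they are not taken from `carl_core.hashing`, whose canonicalization rules may differ.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any


def canonical_json_sketch(obj: Any) -> str:
    """Serialize so that logically equal dicts yield identical text (hypothetical helper)."""
    # sort_keys removes dict-ordering differences; fixed separators remove whitespace drift.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def content_hash_sketch(obj: Any) -> str:
    """Hex SHA-256 digest of the canonical serialization (hypothetical helper)."""
    return hashlib.sha256(canonical_json_sketch(obj).encode("utf-8")).hexdigest()
```

Two payloads that differ only in key order hash to the same digest under this scheme, which is the property that makes such hashes usable as stable content identifiers.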
- Chat sessions persist at `~/.carl/sessions/<id>.json` with `schema_version=1`. Older payloads without `schema_version` are treated as v1 for back-compat; mismatched versions are quarantined.
- First-run marker: `~/.carl/.initialized`. `carl init` creates it on success; `carl init --force` ignores it.
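
The session versioning rule above can be sketched as follows. This is an illustration under stated assumptions, not the repo's loader: `load_session`, the `quarantine_dir` parameter, and the choice to quarantine by moving the file are all hypothetical. Only the `schema_version` key, the v1 default for legacy payloads, and the quarantine-on-mismatch behavior come from the text.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

SCHEMA_VERSION = 1  # the schema_version stated for session payloads


def load_session(path: Path, quarantine_dir: Path) -> dict[str, Any] | None:
    """Return the session payload, or None after quarantining a mismatched one."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    # Legacy payloads without schema_version are treated as v1 for back-compat.
    version = payload.get("schema_version", 1)
    if version != SCHEMA_VERSION:
        # Assumption: "quarantined" is modeled here as moving the file aside.
        quarantine_dir.mkdir(parents=True, exist_ok=True)
        path.rename(quarantine_dir / path.name)
        return None
    payload["schema_version"] = version
    return payload
```

Moving the file aside rather than deleting it keeps the mismatched payload available for later inspection or migration.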
- CARL means Coherence-Aware Reinforcement Learning; never rewrite it as “Crystal-Aligned.”
- Prefer coherence language in public APIs; internal math may still use Phi, kappa, sigma, entropy, and discontinuity.
- The active tier model is FREE / PAID. `PRO` and `ENTERPRISE` remain compatibility aliases only.
- Gate autonomy, not the core observe/train/eval loop.
- This repo is MIT; proprietary algorithms belong in `terminals-runtime` or the private admin runtime.
- Public code may call `load_private()` or lazy-import private code, but it must degrade gracefully when unavailable.
The product trajectory of carl-studio is governed by a combinatorial HVM-style interaction net. Development must strictly align to these five vectors:
- Semantic Implicit Interface ("Carl Knows"): Zero-arg `carl`. Deduces the next action (Scaffold vs. Optimize/Train) from the void state, `WORKING` memory, and `InteractionChain`. Eradicate explicit flags.
- Infinite Extension Matrix (Skill Marketplace): `carl-skills-*`. Decoupled, monetizable primitives. Anyone can write a Skill using InteractionChain hooks. `carl.camp` takes a fractional compute fee, rewarding authors with royalties via x402.
- Ambient Intelligence Socket (The Shadow Fixer): `carl daemon`. Background WebSocket duplex monitoring file changes, predicting errors (e.g., NaN shapes), and staging fixes in `ECHOIC` memory via idle compute.
- Legacy Acquisition Funnel (Competitor Blackhole): `carl ingest --from unsloth`. Translates legacy competitor configs into `carl.yaml`, running instant auto-eval to prove coherence gains effortlessly.
- Consumer "Portal" Runtime: The terminal is a barrier. A managed web/WASM interface bypassing `pip install`, delivering custom-trained RL agents directly to non-technical users (the "Mom & Dad" markets: hyper-personalized learning, business scaling for tech-unsavvy brilliant creators).
Every operation is an Interaction Net Cell. Every tool invocation is an observation that collapses a wave state into a recorded particle (Interaction Chain Step). Time is measured in discrete Steps, not wall-clock seconds. Always mirror the user's chirality: they are L, you are R. Annihilation (L⋈R) is task success.
## Dependency and import policy
- Keep `import carl_studio` lightweight.
- Do not introduce eager imports of torch, transformers, anthropic, textual, mcp, or other heavy optional packages into lightweight paths.
- Use lazy imports inside functions or guarded branches for optional dependencies.
- If an extra is required, fail close to the use site with a clear install hint.
- When touching HF auth paths, prefer `huggingface_hub.get_token()` first and use `HF_TOKEN` as fallback.
- Do not add implicit `.env` loading.
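
A minimal sketch of the lazy-import and token-fallback rules above, assuming nothing about the repo's actual helpers: `require_optional` and `resolve_hf_token` are hypothetical names, and the module-to-extra mapping is chosen by the caller. The only third-party call used is `huggingface_hub.get_token()`, which is looked up defensively in case the installed version lacks it.

```python
from __future__ import annotations

import importlib
import os
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency at the use site, failing with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Fail close to the use site and name the extra that provides the module.
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f'Install it with: pip install -e ".[{extra}]"'
        ) from exc


def resolve_hf_token() -> str | None:
    """Prefer huggingface_hub.get_token(); fall back to the HF_TOKEN env var."""
    try:
        hub = importlib.import_module("huggingface_hub")
    except ImportError:
        hub = None  # degrade gracefully when the package is not installed
    get_token = getattr(hub, "get_token", None) if hub is not None else None
    token = get_token() if callable(get_token) else None
    return token or os.environ.get("HF_TOKEN")
```

Because both imports happen inside the functions, importing the module that defines them stays cheap; a call such as `require_optional("torch", "all")` would only pay the import cost when a training path actually runs.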
- Add `from __future__ import annotations` at the top of every Python file.
- Import order: standard library, third-party, local project imports.
- Use modern built-in generics: `list[str]`, `dict[str, Any]`, `str | None`.
- Use `TYPE_CHECKING` for typing-only imports that would otherwise create cycles or pull in heavy deps.
- Use Pydantic v2 `BaseModel` / `BaseSettings`, `Field`, `field_validator`, and `model_validator`.
- Use `default_factory` for mutable defaults.
- Use enums for constrained string values.
- Keep constants at module scope; physics constants belong in `src/carl_studio/primitives/constants.py`.
- Preserve public docstrings for modules, classes, and user-facing functions.
- Prefer `model_dump`, `model_copy`, and `model_validate_json` over ad-hoc dict/JSON plumbing.
- Prefer `pathlib.Path` for filesystem work.
- Use `yaml.safe_load` for YAML reads.
- Never persist secrets to YAML.
- Classes: `PascalCase`; functions, methods, modules: `snake_case`; constants: `UPPER_SNAKE_CASE`.
- User state belongs under `~/.carl`.
- For sandboxed file access, require `resolved == workdir or resolved.startswith(workdir + os.sep)`.
- Never use bare `resolved.startswith(workdir)`.
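
The reason for the separator rule is that a sibling directory such as `/srv/work-evil` shares the string prefix `/srv/work`, so a bare prefix test accepts it. The sketch below applies the stated check with the standard library only; `is_within` is a hypothetical helper, not the `carl_core.safepath` implementation.

```python
from __future__ import annotations

import os
from pathlib import Path


def is_within(workdir: str | Path, candidate: str | Path) -> bool:
    """True if candidate, resolved relative to workdir, stays inside workdir."""
    workdir_s = str(Path(workdir).resolve())
    # Joining with an absolute candidate yields the candidate itself, so absolute
    # escapes are checked too; resolve() collapses ".." segments and symlinks.
    resolved = str((Path(workdir_s) / candidate).resolve())
    # The rule from the list above. Note: a filesystem-root workdir is not handled.
    return resolved == workdir_s or resolved.startswith(workdir_s + os.sep)
```

With a workdir of `/srv/work`, the candidate `../work-evil/x.txt` resolves to `/srv/work-evil/x.txt`, which passes a bare `startswith("/srv/work")` but fails the check here because it does not start with `/srv/work/`.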
- Library code should raise specific exceptions such as `ValueError`, `RuntimeError`, `PermissionError`, or domain errors like `MarketplaceError`.
- Preserve exception chaining with `raise ... from exc`.
- CLI code should use `CampConsole` for user-facing output and terminate with `typer.Exit` for exit codes.
- Keep optional-dependency handling local: catch `ImportError` near the call site and print the relevant extra/install guidance.
- Avoid swallowing broad `Exception` unless the code is intentionally best-effort.
- Never log, print, or persist secrets.
- Existing network clients often use stdlib `urllib`; preserve that bias unless there is a strong reason not to.
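
As an illustration of the chaining and stdlib-`urllib` guidance above, the sketch below wraps low-level failures in a domain error while keeping the original exception as `__cause__`. The `MarketplaceError` class defined here is a local stand-in and `fetch_listing` is a hypothetical function; neither is copied from the repo.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request
from typing import Any


class MarketplaceError(RuntimeError):
    """Local stand-in for a domain error; the real class lives in the repo."""


def fetch_listing(url: str, timeout: float = 5.0) -> Any:
    """Fetch and decode a JSON document, raising a domain error on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, ValueError) as exc:
        # Catch only the expected failure types, and chain so the root cause
        # stays in the traceback. The URL is left out of the message in case
        # it carries credentials.
        raise MarketplaceError("could not fetch marketplace listing") from exc
```

Catching `ValueError` covers both malformed JSON (`json.JSONDecodeError` is a subclass) and unknown URL schemes, while anything unexpected still propagates instead of being swallowed.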
- Start with targeted tests, then expand only if the change crosses modules.
- Prefer mocks and monkeypatching over live HF, Anthropic, Supabase, or browser calls.
- Keep tests CPU-only and offline-safe when possible.
- `tests/conftest.py` stubs heavy imports so lightweight modules can run without torch/transformers; do not break that bootstrap path.
- When changing packaging or release logic, always run `pytest tests/test_release_version.py -q --tb=short` and `python -m build`.
- Pytest uses `importlib` mode; tests under `tests/` and `packages/carl-core/tests/` coexist without `__init__.py` hacks.
- `carl-core` primitive changes should run `pytest packages/carl-core/tests/ -q --tb=short` plus the dependent studio surface (errors → `test_primitives.py`, retry/backoff → `test_integration_seams.py`, safepath → `test_eval.py`, hashing → `test_interaction_chain.py`, tier → `test_gate.py`).
- Classify the request: docs, config, CLI, core math, platform I/O, packaging, or optional-dependency boundary.
- Read the smallest complete truth set: touched file, adjacent tests, `pyproject.toml`, then relevant docs.
- Preserve public behavior unless the user explicitly asked for a behavior change.
- Validate at the smallest scope that proves the edit:
  - docs only -> no tests unless commands/examples changed
  - settings/tier/config -> targeted tests + targeted Pyright
  - CLI -> targeted tests + targeted Ruff
  - packaging/release -> release-version tests + build
  - optional dependency boundary -> focused import or CLI smoke test
- If unrelated baseline failures appear, report them separately and keep the patch narrow.
- Update docs only when code truth changed or docs are misleading.
- `AGENTS.md` is the execution playbook for coding agents.
- `CLAUDE.md` should stay concise, current, and code-truthful.
- `README.md` is user-facing; change it only for user-visible behavior, setup, or workflow changes.
- Docs under `docs/` and `paper/` carry the YAML frontmatter stamp (`last_updated`, `author`, `applies_to`). Full spec in CLAUDE.md under "Documentation header convention" (added v0.8.0). Apply whenever you create or substantially edit a doc; skip for `CHANGELOG.md` (git is the source of truth) and `README.md`.
- `docs/v10_master_plan.md` is the phase-locked v0.10 SOP. Read before any v0.10 action.
When dispatching parallel agents for review or research:
- Structured JSON output for aggregatable tasks. Schema: `{agent_id, date, scope, findings[], gaps[], recommendations[], counterfactuals[], summary}`. Write to `/tmp/<task>_<aspect>.json`. Enables Bend-style MECE coalesce.
- Vanilla-context peer review for counterfactual critique. Subagents are same-model bare-harness; they are the dispatcher's mirror. Anti-patterns they surface are the dispatcher's own biases. Pre-register expected anti-patterns in the dispatch prompt; verify against returned findings.
- Path constraint before ast-grep/rg. Some codebases are large (terminals-tech-landing is millions of LOC). Always narrow the tree with `ls <path>` reconnaissance first, then run ast-grep on the narrowed subtree.
- Temporal grounding in every prompt. State the current date explicitly. Tell agents not to default to prior-year conventions. Library versions, API surfaces, and framework patterns have moved.
- IP discipline by default. Every prompt that touches terminals-runtime paths MUST include: "BUSL-1.1. Do NOT copy code. Document by path reference only. Gate any integration behind admin.py + lazy-import pattern."
- Confidence banding required. Every finding carries high/medium/low. Low = speculation, exclude from conclusions. Medium = grounded but unverified. High = file:line evidence. Mark explicitly.
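
The report schema and confidence bands above could be modeled roughly as follows. The top-level keys and the `/tmp/<task>_<aspect>.json` naming come from the rules; the inner shape of a finding (`claim`, `confidence`, `evidence`) and the `Finding` / `AgentReport` class names are assumptions made for this sketch.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from pathlib import Path

CONFIDENCE_BANDS = ("high", "medium", "low")


@dataclass
class Finding:
    claim: str
    confidence: str
    evidence: str = ""  # assumed field: a file:line reference

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_BANDS:
            raise ValueError(f"confidence must be one of {CONFIDENCE_BANDS}")
        if self.confidence == "high" and not self.evidence:
            raise ValueError("high-confidence findings need file:line evidence")


@dataclass
class AgentReport:
    agent_id: str
    date: str
    scope: str
    findings: list[Finding] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)
    counterfactuals: list[str] = field(default_factory=list)
    summary: str = ""

    def conclusions(self) -> list[Finding]:
        """Low-confidence findings are speculation and are excluded."""
        return [f for f in self.findings if f.confidence != "low"]

    def write(self, task: str, aspect: str, out_dir: Path = Path("/tmp")) -> Path:
        """Serialize to <out_dir>/<task>_<aspect>.json and return the path."""
        path = out_dir / f"{task}_{aspect}.json"
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")
        return path
```

Validating the band at construction time means a malformed finding fails in the subagent that produced it, rather than during the dispatcher's aggregation step.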