Closes the long-deferred carl.camp ↔ carl-studio parity arc. End-to-end signed remote tier verification, AXON event forwarding, managed slime training submit, constitutional ledger forward path, and a refresh of zero-rl-pipeline's tier-gate consumer.
Companion releases:
- carl-core 0.2.0: adds 9 new error subclasses + the `set_global_forwarder` seam in `interaction.py`. carl-studio 0.20.0 requires carl-core ≥ 0.2.0.
- carl.camp backend (private repo): see `docs/v10_remote_entitlements_spec.md` for the wire contract.
- `EntitlementsClient` (`src/carl_studio/entitlements.py`): Python verifier of carl.camp's signed entitlements JWT. pynacl-backed Ed25519 verify, JWKS pin-on-first-use, 15-min cache + 24h offline-grace at `~/.carl/entitlements_cache.json` (mode 0600, atomic write). 5 new error codes under `carl.entitlements.*` + `carl.gate.tier_remote_mismatch`.
- `tier_gate(..., verify_remote=False)` extension (`src/carl_studio/tier.py`): local-fast-path-then-async ladder per AP-1. Local check runs first; on allow, fires `fetch_remote_async` background verify; on local-deny + paid feature, consults the cache for a matching grant within offline-grace.
- `carl entitlements show [--json] [--refresh]` (`src/carl_studio/cli/entitlements_cmd.py`): inspect cached v0.10 entitlements + age + key_id. Status taxonomy: `fresh` / `stale` / `offline_grace` / `offline_grace_expired` / `missing` / `corrupt` / `unavailable`.
- AXON HTTP forwarder (`src/carl_studio/telemetry/axon.py`): `AxonForwarder` thread-bound queue + daemon flush, opt-in via `consent.telemetry` + `AXON_FORWARD_DISABLED` env. Maps `Step.action` → 5-signal taxonomy (`skill.training_started`, `skill.crystallized`, `coherence.update`, `interaction.created`, `action.dispatched`); secondary `coherence.update` event fires when `phi` / `kuramoto_r` / `channel_coherence` is populated. Idempotency via `sha256(step_id:signal_type)`. Payload runs through `carl_core.errors._redact` before forward.
- `set_global_forwarder` seam (`packages/carl-core/src/carl_core/interaction.py`): module-level registration hook; carl-core stays HTTP-free, carl-studio registers itself. `record(...)` invokes the forwarder synchronously after `self.steps.append(step)`. Errors swallowed + logged at WARN.
- `SlimeSubmitClient` (`src/carl_studio/adapters/slime_submit.py`): managed slime training submit + status polling. `assert_no_user_hf_token_leak(slime_args)` guard with three detection paths: env-token exact match, key-name regex (`hf_token` / `huggingface_token`, case-insensitive), value regex (`^hf_[A-Za-z0-9_-]{34,}$`). Recursive scan covers nested envelopes.
- `carl train --managed` + `carl slime-schema` (`src/carl_studio/cli/training.py`): opt-in flag routes through `SlimeSubmitClient` (gated by `tier_gate(PAID, feature="train.slime.managed", verify_remote=True)`); schema export prints `SlimeArgs.model_json_schema()` JSON.
- `SlimeRolloutBridge.finalize_resonant(slime_run_id=...)` (`src/carl_studio/training/slime_bridge.py`): propagates the slime run id into resonant metadata for the carl.camp side's `X-Carl-Slime-Run-Id` header path.
- `ConstitutionalForwarder` (`src/carl_studio/fsm_ledger_forward.py`): always persists signed `LedgerBlock`s to `~/.carl/constitutional_ledger.jsonl` (mode 0600, 10MB rotation, max 3 archives). HTTP forward to carl.camp's `/api/ledger/append` is opt-in via `consent.telemetry`. `replay_pending()` retries unacked entries.
- `evaluate_action(forward=...)` kwarg (`src/carl_studio/fsm_ledger.py`): wires the forwarder into the FSM evaluation path.
- 9 new error subclasses (`packages/carl-core/src/carl_core/errors.py`):
  - `carl.gate.tier_remote_mismatch` (`RemoteEntitlementError`)
  - `carl.entitlements.network_unavailable` (`EntitlementsNetworkError`)
  - `carl.entitlements.signature_invalid` (`EntitlementsSignatureError`)
  - `carl.entitlements.cache_corrupt` (`EntitlementsCacheError`)
  - `carl.entitlements.jwks_stale` (`JWKSStaleError`)
  - `carl.slime.hf_token_leak` (`SlimeHfTokenLeakError`)
  - `carl.slime.managed_submit_failed` (`SlimeManagedSubmitFailedError`)
  - `carl.slime.run_not_found` (`SlimeRunNotFoundError`)
  - `carl.constitutional.forward_failed` (`ConstitutionalForwardFailedError`)
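The three detection paths of the HF-token leak guard can be sketched as follows. This is an illustrative sketch only, not the shipped `assert_no_user_hf_token_leak`: the function name `find_hf_token_leaks`, the path notation, and the return shape are assumptions; only the two regexes and the exact-match rule come from the entry above.

```python
import re
from typing import Any, List, Optional

# Patterns quoted from the changelog entry; everything else here is hypothetical.
KEY_NAME_RE = re.compile(r"hf_token|huggingface_token", re.IGNORECASE)
VALUE_RE = re.compile(r"^hf_[A-Za-z0-9_-]{34,}$")


def find_hf_token_leaks(payload: Any, env_token: Optional[str] = None, path: str = "$") -> List[str]:
    """Return the paths of keys or values that look like a leaked HF token."""
    hits: List[str] = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}"
            if isinstance(key, str) and KEY_NAME_RE.search(key):
                hits.append(child)  # path 2: suspicious key name
                continue            # already flagged, no need to inspect the value
            hits.extend(find_hf_token_leaks(value, env_token, child))
    elif isinstance(payload, (list, tuple)):
        for index, item in enumerate(payload):
            hits.extend(find_hf_token_leaks(item, env_token, f"{path}[{index}]"))
    elif isinstance(payload, str):
        if env_token and payload == env_token:
            hits.append(path)  # path 1: exact match with the caller's own token
        elif VALUE_RE.match(payload):
            hits.append(path)  # path 3: value shaped like an HF token
    return hits
```

A guard built on this would raise when the returned list is non-empty; recursing through dicts, lists, and tuples is what lets it cover nested envelopes.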
- `respx>=0.21` added to `[dev]` extras for httpx mock-driven entitlements tests.
- `carl doctor` payload gains an `entitlements: {status, age_s, key_id}` block surfacing remote-tier health.
- `carl resonant publish` accepts an optional `--slime-run-id <uuid>` flag → adds the `X-Carl-Slime-Run-Id` HTTP header.
- CLI routing table in `CLAUDE.md` updated with the three new commands; `Release history` notes v0.20.0.
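As a rough illustration of how a cached entitlements blob could map onto the `{status, age_s, key_id}` block, here is a minimal sketch. It makes several assumptions: the cache field names `fetched_at` and `key_id` are hypothetical, the grace window is measured from the fetch time, and `unavailable` is left out because its trigger is not described here.

```python
import json
from typing import Optional

CACHE_TTL_S = 15 * 60        # 15-minute cache window (from the changelog)
OFFLINE_GRACE_S = 24 * 3600  # 24-hour offline grace (from the changelog)


def classify_cache(raw: Optional[str], now: float, online: bool) -> dict:
    """Map a cached entitlements document to a status block (assumed semantics)."""
    if raw is None:
        return {"status": "missing", "age_s": None, "key_id": None}
    try:
        doc = json.loads(raw)
        age_s = now - float(doc["fetched_at"])  # hypothetical field name
        key_id = doc["key_id"]                  # hypothetical field name
    except (ValueError, KeyError, TypeError):
        return {"status": "corrupt", "age_s": None, "key_id": None}
    if age_s <= CACHE_TTL_S:
        status = "fresh"
    elif online:
        status = "stale"                  # refresh is due and possible
    elif age_s <= OFFLINE_GRACE_S:
        status = "offline_grace"          # still honoured while offline
    else:
        status = "offline_grace_expired"
    return {"status": status, "age_s": age_s, "key_id": key_id}
```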
- 11 new pytest stubs in `tests/test_entitlements.py`
- 15 new in `tests/test_axon_forwarder.py`
- 22 new in `tests/test_slime_submit.py`
- 14 new in `tests/test_constitutional_forward.py`
- 8 new in `tests/test_tier_resolver.py`
- 5 new in `tests/test_cli.py`
- 4 new in `packages/carl-core/tests/test_interaction_forwarder.py`
- 1 import-cheapness regression test (`carl_studio.entitlements` doesn't pull pynacl eagerly)
Suite baseline (post-release): ~135 v10-surface tests green in addition to the existing ~3770. ruff + pyright clean on changed files via direct CLI runs.
- `zero-rl-pipeline/platform/cli/gate.py` rewrite: try-import `EntitlementsClient` from carl-studio first, falls back to legacy `check-tier` Edge Function with deprecation warning. Preserves the public `Tier`, `get_tier`, `is_paid`, `is_free`, `requires_paid` API surface. Private repo, separate cadence.
- `resonance` package: boundary verified, zero changes required. See `docs/private/v10_resonance_boundary.md`.
- Plan: `docs/superpowers/plans/put-together-a-plan-sleepy-crystal.md` (or symlinked equivalent). 11 phases A → K, 30 tracked tasks, dispatched via `superpowers:subagent-driven-development`.
- Security review caught + fixed in same arc: cross-org idempotency-key disclosure in carl.camp's `slime/submit` route.
- Unified entry-point router + sessions + trust + journey matrix (v0.18 / v0.18.1, 2026-04-22 → 2026-04-24).
  - `src/carl_studio/cli/entry.py` owns the decision ladder BEFORE Typer dispatch: bare → REPL (+ first-run wizard + trust precheck); `carl "<prompt>"` → REPL with first turn; `carl -p "<q>"` → one-shot `ask_cmd` with trust-precheck intentionally bypassed; `carl <verb>` → Typer. Fifty-three registered subcommands in a frozen `REGISTERED_SUBCOMMANDS` snapshot.
  - `src/carl_studio/cli/trust.py` + `src/carl_studio/trust.py`: bare-entry trust pre-check registry persisted at `~/.carl/trust.yaml`. Commands: `trust status` / `acknowledge` / `enable` / `disable` / `reset`. `acknowledge` replaces prior root with a visible eviction notice.
  - `src/carl_studio/cli/session_cmd.py` + `src/carl_studio/cli_session.py`: project-aware session CLI. `_resolve_project_root` walks up via `project_context.current`, so `carl session list` works from any subdir.
  - `src/carl_studio/project_context.py`: `.carl/` anchor detection with a home-guard: `$HOME` can never be a project root. Test fixtures that pin HOME must place the project at `tmp_path / "proj"`.
  - `carl init --json`: probe-only fast-path (v0.18.1). Seven stable probe keys: `first_run_complete`, `camp_session`, `llm_provider_detected`, `training_extras_healthy`, `project_config_present`, `consent_set`, `context_present`. Never prompts on piped stdin; contract locked by `tests/journeys/test_journeys_v18.py`.
  - Journey matrix at `tests/journeys/JOURNEYS.md`: 12 journeys, 48 transitions, 172 tests green on the v0.18 surface.
  - Parallel UAT batch spec at `tests/journeys/BATCHES.md`: 8 offline batches + 6 online batches (online gated on explicit user authorization).
  - Provenance doc: `docs/v18_journey_coverage.md`.
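The `.carl/` anchor walk with its home-guard can be illustrated with a short sketch. It assumes a free function named `find_project_root` that takes the home directory as an explicit argument; the real `project_context` API may look different, and whether the walk continues above `$HOME` is also an assumption.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, home: Path) -> Optional[Path]:
    """Walk up from `start` to the nearest directory holding `.carl/`, never returning `home`."""
    start = start.resolve()
    home = home.resolve()
    for candidate in (start, *start.parents):
        if candidate == home:
            continue  # home-guard: ~/.carl holds global state, not a project
        if (candidate / ".carl").is_dir():
            return candidate
    return None
```

This is also why a fixture that pins HOME to `tmp_path` has to put the project in a subdirectory such as `tmp_path / "proj"`: a project placed directly at the pinned HOME would be skipped by the guard.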
- Dependency probe + auto-heal UX (v0.17.1, 2026-04-22). `carl init` no longer dies on sibling-dep metadata corruption (the huggingface-hub / transformers `Unable to compare versions ... found=None` class of failure). New `carl_core.dependency_probe` module classifies every optional-dep into one of seven states (`ok`, `missing`, `import_error`, `import_value_error`, `metadata_missing`, `metadata_corrupt`, `version_mismatch`) with a concrete `repair_command` hint. `_offer_extras` gains a consent-gated auto-heal branch that runs `pip install --force-reinstall --no-deps <target>` after user approval. Never silent. `extract_corrupt_sibling` parses `dependency_versions_check`-style errors so when `transformers` raises about `huggingface-hub`, auto-heal targets the sibling, not the symptom package. Doctrine: `docs/v17_dep_probe_doctrine.md`.
- Arrow-key CLI facade (v0.17.1). New `src/carl_studio/cli/ui.py` wraps `questionary` (industry-standard arrow-key UX matching `gh`, `vite`, `claude-code`, `codex`) with `typer` fallback on non-TTY or when the `[cli]` extra is absent. Four public functions: `select` / `confirm` / `text` / `path`. `Choice` dataclass with `value` / `label` / `hint` / `badge` / `disabled` fields. First option is the default by convention; numeric keys jump focus but never commit without Enter (eliminates the "pressed 1 meant 2" mistype class). Doctrine: `docs/v17_cli_ux_doctrine.md`.
- New `[cli]` extra in `pyproject.toml` pulling `questionary>=2.0,<3`. Included in `[all]` per the extras-coverage policy.
- `carl.freshness.dep_corrupt` issue code in `freshness.py`: emitted as a `SEVERITY_ERROR` by `_check_packages` whenever a probe surfaces a corruption state. The remediation string carries the probe's exact `repair_command`. Surfaced automatically in `carl doctor` output (red-highlighted, top of report).
- 20 new tests (~90 LOC of regression coverage): 15 unit tests in `test_dependency_probe.py` (every status + normalization + sibling parsing + never-raises), 4 integration tests in `test_init_auto_heal.py` (fast path, auto-heal fan-out, decline, fresh-install), 15 tests in `test_cli_ui.py` (fallback + modern + Ctrl-C + validation loops + password masking), 1 regression in `test_freshness.py` for the HF scenario end-to-end.
- `carl init`'s LLM-provider menu + carl.camp-sign-in prompt now use arrow-key selection with first-is-default. The sign-in flow menu (`[sign in with browser]` / `[create account]` / `[skip]`) routes to the existing `login_cmd` gh-style local-callback flow.
- `carl env`'s 7-question wizard routes every choice question through `ui.select` (arrow-keys); free-text questions route through `ui.text`. Sequential progression preserved: each question is an arrow-key prompt, not a numbered list.
- `carl camp config init` migrates preset + tier menus + all text prompts to the `ui.*` facade.
- `carl project init` migrates method + compute menus (now with descriptive hints per option) and all free-text prompts.
- Remaining prompts in `cli/lab.py`, `cli/consent.py`, `cli/startup.py`, `cli/chat.py`, `cli/prompt.py` migrated to `ui.*`.
- CLI crash on `carl init` when `huggingface_hub` has stale dist-info (the root trigger for v0.17.1). The naive `except ImportError:` probe in `_training_extras_installed()` missed the `ValueError`-class failure that `transformers.dependency_versions_check` raises when `importlib.metadata.version("huggingface-hub")` returns `None` (corrupt/empty METADATA file). Fixed by routing through `dependency_probe.probe()` which catches the full exception surface and classifies for user-consumable remediation.
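To make the difference from a bare `except ImportError:` concrete, here is a minimal sketch of a never-raising probe. It is not the `carl_core.dependency_probe` implementation: the signature and the string return type are assumptions, and the `version_mismatch` state and `repair_command` hint are left out for brevity.

```python
import importlib
import importlib.metadata as md
from typing import Optional


def probe(module_name: str, dist_name: Optional[str] = None) -> str:
    """Classify an optional dependency into a status string without ever raising."""
    dist = dist_name or module_name
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # The package itself is absent, or one of its own imports is.
        return "missing" if exc.name == module_name else "import_error"
    except ImportError:
        return "import_error"
    except ValueError:
        return "import_value_error"  # e.g. a version check choking on found=None
    except Exception:
        return "import_error"
    try:
        version = md.version(dist)
    except md.PackageNotFoundError:
        return "metadata_missing"
    except Exception:
        return "metadata_corrupt"
    if not version:
        return "metadata_corrupt"    # empty or None version from a damaged METADATA file
    return "ok"
```

The key point is the separate `ValueError` branch: the failure described above is raised at import time by a version check, so a probe that only catches `ImportError` lets it escape.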
- `slime` training adapter (`src/carl_studio/adapters/slime_adapter.py`, `src/carl_studio/adapters/slime_translator.py`). Routes `carl train --backend slime` to THUDM/slime (Apache-2.0), the RL stack behind Z.ai's GLM-5 / 4.7 / 4.6 / 4.5 and the only verified OSS framework for RL training on 100B+ MoE. Megatron-LM + SGLang are user-installed; the `carl-studio[slime]` extra pulls only the thin rollout-side dep (`sglang`). Registered in `_BUILTIN_ADAPTERS`; `list_adapters()` now returns six entries.
- `SlimeRolloutBridge` (`src/carl_studio/training/slime_bridge.py`): wires slime's rollout + training callbacks into `carl_core.interaction.InteractionChain`. Custom rewards (`EMLCompositeReward` / `PhaseAdaptiveCARLReward` / `CARLReward`) plug into slime's reward hook via `bridge.as_slime_reward()`. A `CompletionTraceAdapter` shim lets the existing `score_from_trace` surface run unchanged when slime provides only raw text + logprobs.
- Five tier feature keys in `FEATURE_TIERS` (`packages/carl-core/src/carl_core/tier.py`):
  - `train.slime`: FREE (BYOK adapter)
  - `train.slime.rollout_bridge`: FREE (coherence bridge, capability not autonomy per the tier philosophy at `tier.py:16-24`)
  - `train.slime.managed`: PAID (carl.camp orchestration)
  - `train.slime.moe_presets`: PAID (GLM-5 / DeepSeek-V3 / Qwen3-MoE)
  - `train.slime.async_disaggregated`: PAID (async PD disaggregation)
- `docs/adapters/slime.md`: tier split, BYOK install steps, `carl.yaml` example, bridge wiring snippet, troubleshooting table.
- Tests: `tests/test_slime_adapter.py` (22 cases) and `tests/test_slime_bridge.py` (9 cases). Mocks `slime` / `sglang` / `megatron` via `importlib.util.find_spec` patches so the adapter's availability + translation + submission paths exercise without touching a real GPU stack.
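The `find_spec` patching approach mentioned for the tests looks roughly like this. The availability function below is a hypothetical stand-in for `SlimeAdapter.available()`, which may check more than module discoverability.

```python
import importlib.util
from typing import Iterable
from unittest import mock

REQUIRED_MODULES = ("slime", "sglang", "megatron")


def slime_stack_available() -> bool:
    """Hypothetical availability check: every required module must be discoverable."""
    return all(importlib.util.find_spec(name) is not None for name in REQUIRED_MODULES)


def available_with(present: Iterable[str]) -> bool:
    """Run the check while find_spec reports only `present` as installed."""
    installed = set(present)

    def fake_find_spec(name, package=None):
        # A sentinel is enough: the check only tests for "not None".
        return mock.sentinel.spec if name in installed else None

    with mock.patch("importlib.util.find_spec", fake_find_spec):
        return slime_stack_available()
```

Because the check looks up `importlib.util.find_spec` at call time, patching the attribute on the module is sufficient and no real GPU stack is imported.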
- `pyproject.toml`: added `slime = ["sglang>=0.4"]` optional extra and rolled it into `[all]`. Intentionally not in `[training]`: slime's full install still requires CUDA/ROCm-specific Megatron-LM + slime source builds that cannot be covered by a wheel. `SlimeAdapter.available()` returns False until the user finishes the source-build steps.
- `carl_core.data_handles` (new module): `DataRef`, `DataVault`, `DataKind`, `DataError`. Zero-dep primitives mirroring the `SecretRef` / `SecretVault` shape for arbitrary payloads (bytes / file / stream / query / url / derived). Lazy fingerprinting + sha256 on file-backed refs; offset+length addressable reads; TTL self-revoke at resolve time.
- `carl_core.resource_handles` (new module): `ResourceRef`, `ResourceVault`, `ResourceError`. The handle runtime for long-lived external resources (browser pages, subprocesses, MCP sessions, rollout engines). Caller-supplied `closer(backend)` runs at revoke, so lifecycles stay local to the toolkit that owns the backend type.
- `carl_studio.handles.data.DataToolkit`: agent-callable layer wrapping `DataVault` with audit emission (`DATA_OPEN` / `DATA_READ` / `DATA_TRANSFORM` / `DATA_PUBLISH`). Methods: `open_file`, `open_bytes`, `open_url`, `read`, `read_text`, `read_json`, `transform` (head / tail / gzip / gunzip / digest), `publish_to_file`, `fingerprint`, `sha256`, `describe`, `list_handles`. Preview cap (default 64 KB) + hard upper bound (default 16 MB) keep accidental whole-file slurps visible in the audit trail.
- `carl_studio.cu.browser.BrowserToolkit`: Playwright automation with vault-mediated pages. Agent gets `ResourceRef` ref_ids; pages never cross a tool-call boundary. Methods: `open_page`, `navigate`, `click`, `type_text`, `type_from_secret` (value resolved inside the toolkit), `press_key`, `scroll`, `screenshot` + `extract_text` (both route output through the shared `DataToolkit`), `close_page`, `list_pages`. Playwright lazy-imported; `available()` reports honestly.
- `carl_studio.cu.anthropic_compat.CUDispatcher` + `COMPUTER_USE_TOOL_SCHEMA`: Anthropic `computer_20250124` tool schema mapping. `bind_page(ref_id)` + `dispatch({"action": "left_click", "coordinate": [x, y]})` → routes to `BrowserToolkit.page_from_id(...)` + the page's low-level mouse API. Screenshots return a `DataRef` descriptor; drag / mouse-up-down / hold-key are documented in the schema but rejected with `carl.cu.unsupported_action` (agent should fall back to selector-level browser methods).
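The two lifecycle rules named above, TTL self-revoke at resolve time and a caller-supplied closer that runs at revoke, can be shown with a toy vault. `ToyVault`, `HandleError`, and the injectable `clock` are hypothetical names for illustration; they are not the `DataVault` / `ResourceVault` API.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


class HandleError(Exception):
    """Raised when a ref is unknown, revoked, or expired (hypothetical name)."""


@dataclass
class _Entry:
    value: Any
    expires_at: Optional[float]
    closer: Optional[Callable[[Any], None]]


@dataclass
class ToyVault:
    clock: Callable[[], float] = time.monotonic
    _entries: Dict[str, _Entry] = field(default_factory=dict)

    def put(self, value: Any, ttl_s: Optional[float] = None,
            closer: Optional[Callable[[Any], None]] = None) -> str:
        """Store a value and hand back an opaque ref id instead of the value itself."""
        ref_id = uuid.uuid4().hex
        expires_at = None if ttl_s is None else self.clock() + ttl_s
        self._entries[ref_id] = _Entry(value, expires_at, closer)
        return ref_id

    def resolve(self, ref_id: str) -> Any:
        """Return the value, revoking the ref first if its TTL has passed."""
        entry = self._entries.get(ref_id)
        if entry is None:
            raise HandleError(f"unknown or revoked ref: {ref_id}")
        if entry.expires_at is not None and self.clock() >= entry.expires_at:
            self.revoke(ref_id)  # expiry is enforced lazily, at resolve time
            raise HandleError(f"expired ref: {ref_id}")
        return entry.value

    def revoke(self, ref_id: str) -> None:
        """Drop the ref and let the owning toolkit's closer release the backend."""
        entry = self._entries.pop(ref_id, None)
        if entry is not None and entry.closer is not None:
            entry.closer(entry.value)
```

Enforcing expiry lazily at `resolve` avoids a background reaper thread, at the cost that an expired resource stays open until someone next touches its ref.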
- `carl_studio.cu.privacy`: regex-based content redaction (`redact_text`, `redact_preview_spans`) for email / phone / SSN / credit-card / IPv4 / DOB. Conservative defaults; openadapt's ML-assisted redactor can plug in later.
- Ten new `ActionType` values in `carl_core.interaction`: `DATA_OPEN`, `DATA_READ`, `DATA_TRANSFORM`, `DATA_PUBLISH`, `RESOURCE_OPEN`, `RESOURCE_ACT`, `RESOURCE_CLOSE`, plus the four secret-op types (`SECRET_MINT`, `SECRET_RESOLVE`, `SECRET_REVOKE`, `CLIPBOARD_WRITE`) from the v0.16 secrets toolkit.
- Seven new `FEATURE_TIERS` keys: `data.open`, `data.read`, `data.transform`, `data.publish`, `resource.open`, `resource.act`, `resource.close`. All FREE: the handle runtime is how Carl reasons about values it shouldn't see, and gating it would break Carl as a viable agent (gate on autonomy, not capability).
- `docs/v16_handle_runtime.md`: unifying doctrine. One grammar across secrets / data / resource / computer-use; capability-security rationale; CARLAgent wiring example; end-to-end "Carl logs in without seeing the password" walkthrough.
- `docs/v16_utils_inventory.md`: best-in-class Python utility picks with version + license + handle-fit rationale (15 categories + skip list). Backs future toolkit extensions.
- `carl_studio.handles.subprocess.SubprocessToolkit`: capability-constrained subprocess lifecycle. `spawn(argv: list[str])` (argv-only, shell strings rejected at the type level) / `poll` / `wait` / `terminate` / `read_stdout` / `read_stderr` / `list_processes`. Default TTL 300s prevents orphan processes. stdout / stderr captured into `DataVault` so byte payloads never stream through agent context. Error codes under `carl.subprocess.*`.
- `carl_studio.handles.bundle.HandleRuntimeBundle`: one-call construction of the full handle runtime. `build(chain)` wires every vault + toolkit against the supplied `InteractionChain`; `register_all(dispatcher)` registers 25 agent-callable tools (data toolkit × 6, browser × 11, subprocess × 7, `computer`) via a `make_handler()` shim that converts toolkit methods (kwargs → dict) to the `ToolDispatcher` (dict → (str, bool)) contract. `anthropic_tools()` returns the flat schema list for the Anthropic `tools=` API param. `tool_catalog()` describes the full surface for a "what can you do?" meta-tool.
- Tests (~148 new cases total; the v0.16.1 line closes with):
  - `packages/carl-core/tests/test_data_handles.py` (21)
  - `packages/carl-core/tests/test_resource_handles.py` (11)
  - `tests/test_data_toolkit.py` (25)
  - `tests/test_browser_toolkit.py` (10, incl. fake-Playwright fixture)
  - `tests/test_cu_dispatcher.py` (11)
  - `tests/test_cu_privacy.py` (11)
  - `tests/test_subprocess_toolkit.py` (14, real Popen against trivial Python children)
  - `tests/test_handle_bundle.py` (10)
Tool-loop extraction release. `chat_agent.py`'s tool-use loop body
collapses from 100 LOC of inlined pre-hook / DENY / validate /
dispatch / post-hook / recording logic into a single
`self._dispatcher.execute_block(...)` call plus a small event-fan-out
+ outcome-recording tail. Closes the longest-running god-class decomposition deferral (3 re-deferrals tracked; this is the landing).
- `chat_agent.py` tool-use loop body: ~100 LOC → ~40 LOC. The per-block lifecycle delegates to `ToolDispatcher.execute_block` (landed in v0.14). Outcome semantics identical: `ok` / `denied` / `schema_error` / `error` still produce the same AgentEvent stream in the same order, InteractionChain `TOOL_CALL` steps still record with matching outcome strings and duration_ms, `turn_denied` counter still tracks DENY returns for the all-denied terminal guard. Pre/post hook exceptions still surface as `AgentEvent(kind="error", code="carl.hook_failed")`.
- `ToolPermission` canonicalized: `chat_agent.py` now re-imports the enum from `carl_studio.tool_dispatcher`, dropping its duplicate. Back-compat preserved: `from carl_studio.chat_agent import ToolPermission` continues to work.
- `tests/test_chat_agent_witness.py` fixture updated: tool-dispatch stubbing moved from `CARLAgent._dispatch_tool_safe` to `ToolDispatcher.dispatch_safe` (class-level) to match the post-extraction path. All 6 witness tests pass.
- `tests/test_chat_agent_robustness.py`: unchanged. The 85-test robustness suite validates that nothing observable changed from the caller's perspective.
- `tests/test_tool_dispatcher_execute_block.py`: unchanged; the 8 execute_block tests pin the contract the agent now depends on.
- Tests: 3088 pass / 0 fail (same count as v0.14; the extraction is purely a migration, no new features).
- Build: 0.15.0 wheel clean.
- Import time + cold-start behavior unchanged (measured via `python -c "import time; t=time.perf_counter(); import carl_studio; ..."`).
v0.14.0 → 93% of V_max; v0.15.0 → 94%. Architectural debt continues to decline. `chat_agent` is still the largest single file (~2,280 LOC after the tool-loop shrink) but its remaining bulk is domain logic (knowledge store, memory, constitution, one-shot inference, prompt building) rather than inline orchestration. Future extractions would pay diminishing returns.
- carl.camp marketplace search / discovery endpoints (backend).
- AXON signal emission via HTTP to carl.camp (Fano v0.10 V7 follow-up).
- HVM/py2bend integration (major, separate effort — v1.x territory).
- One-shot inference path (`_one_shot_text`, `_build_system_prompt`) extraction from chat_agent: optional, lower-priority.
Tool-dispatch API extension + carl-env expansion. Clears the tool_dispatcher prerequisite that was blocking the full chat_agent tool-loop extraction and fleshes out carl-env with the 3 remaining questions from the original design.
- `ToolDispatcher.execute_block()`: full per-block lifecycle (pre-hook → schema validation → dispatch → post-hook → outcome). Returns `(ToolOutcome, list[ToolEvent])`. Consolidates what was previously inlined in `chat_agent.py`'s tool-use loop, giving the chat agent a single delegation point. The extraction of the loop body itself is v0.15 scope.
- `ToolOutcome` + `ToolEvent`: frozen dataclasses capturing the outcome state (`{tool_use_id, name, input, result, is_error, outcome, duration_ms}`) and agent-visible events (`{kind, name, content, code}`).
- `ToolPermission` enum: migrated from chat_agent.py to tool_dispatcher.py so the permission contract lives with the dispatcher that consumes it. chat_agent keeps its existing import for back-compat.
- carl env expanded questions (Q5/Q6/Q7):
  - `reward`: GRPO reward shape (static CARL composite / phase_adaptive / custom / none). Only asked when method is grpo or cascade.
  - `cascade_stages`: 2 (SFT→GRPO) or 3 (SFT→DPO→GRPO). Only asked when method is cascade.
  - `eval_gate`: none / metric / crystallization. BITC-aware admission policy.
- `EnvState.reward`, `EnvState.cascade_stages`, `EnvState.eval_gate` fields. Renderer emits them when set; omitted when `eval_gate == "none"` so the generated yaml stays clean.
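The per-block lifecycle (pre-hook → schema validation → dispatch → post-hook → outcome) can be sketched as a single function. The dataclass fields follow the entry above; the function signature, the hook shapes, the `required` mapping standing in for schema validation, and the event `kind` strings are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ToolEvent:
    kind: str
    name: str
    content: str
    code: Optional[str] = None


@dataclass(frozen=True)
class ToolOutcome:
    tool_use_id: str
    name: str
    input: Dict[str, Any]
    result: str
    is_error: bool
    outcome: str  # "ok" | "denied" | "schema_error" | "error"
    duration_ms: float


def execute_block(
    tool_use_id: str,
    name: str,
    args: Dict[str, Any],
    *,
    tools: Dict[str, Callable[..., Any]],
    required: Dict[str, Sequence[str]],
    pre_hook: Optional[Callable[[str, Dict[str, Any]], bool]] = None,
    post_hook: Optional[Callable[[str, Dict[str, Any], str], None]] = None,
) -> Tuple[ToolOutcome, List[ToolEvent]]:
    events: List[ToolEvent] = []
    start = time.perf_counter()

    def finish(result: str, outcome: str) -> Tuple[ToolOutcome, List[ToolEvent]]:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        done = ToolOutcome(tool_use_id, name, args, result, outcome != "ok", outcome, elapsed_ms)
        return done, events

    # 1. Pre-hook may deny the call outright.
    if pre_hook is not None and not pre_hook(name, args):
        events.append(ToolEvent("tool_blocked", name, "denied by pre-hook"))
        return finish("denied", "denied")
    # 2. Minimal stand-in for schema validation: required keys must be present.
    missing = [key for key in required.get(name, ()) if key not in args]
    if missing:
        return finish(f"missing arguments: {missing}", "schema_error")
    # 3. Dispatch; any exception becomes an "error" outcome instead of propagating.
    try:
        result, outcome = str(tools[name](**args)), "ok"
    except Exception as exc:
        result, outcome = f"{type(exc).__name__}: {exc}", "error"
    # 4. Post-hook observes the result, then the outcome is returned.
    if post_hook is not None:
        post_hook(name, args, result)
    events.append(ToolEvent("tool_result", name, result))
    return finish(result, outcome)
```

With this shape, the caller's loop reduces to iterating over tool-use blocks, calling `execute_block`, yielding the returned events, and recording the outcome.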
- Tests: 3088 pass / 0 fail (+18 since v0.13.0 — 8 execute_block, 10 expanded env). All v0.13 surfaces unchanged. ToolDispatcher regression tests untouched; new execute_block tests are additive.
- Build: 0.14.0 wheel clean.
- Full tool-loop extraction from `chat_agent.py` (now unblocked: the execute_block API landed in v0.14; the loop body can migrate to a single for-loop calling execute_block + recording outcomes). Deferred because the delegate call fan-out into yields + chain recording is load-bearing and wants a dedicated review session.
- Marketplace search / discovery endpoints (carl.camp side).
- HVM/py2bend integration (separate major effort).
Agent-marketplace activation. carl.camp backend endpoints are live
(POST /api/agents/register + POST /api/sync/agent-cards with
envelope + rate-limit + migration 021 on a2a_agents), so carl-studio
cuts over to the real surface. Production @coherence_gate wiring
lands on the publish path — the first concrete call-site adoption.
- `carl agent register <name>`: MIT-clean CLI command. Writes locally always; pushes to carl.camp when a bearer token is present (env var `CARL_CAMP_TOKEN` or `~/.carl/camp_token`). `--local-only` skips the network path; `--org <id>` targets a specific org. Flow: `POST /api/agents/register` mints the recipe-shell UUID → carl-studio replaces the local placeholder agent_id → calls `POST /api/sync/agent-cards` to publish.
- `carl agent publish`: pushes all locally-stored cards (or one specific via `--agent-id`) to carl.camp. Coherence-gated via `@coherence_gate(min_R=0.5, feature="agent.publish")`: denies when recent success rate is below threshold. Uses `success_rate_probe` as the default endogenous probe. First production call site for the v0.11 CoherenceGate primitive; Fano V7 realization now ~90%.
- `carl agent list [--limit N]`: enumerate locally-stored cards.
- `CampSyncClient.register_recipe_shell()`: Python method mirroring the backend `POST /api/agents/register` contract. Returns typed `RegisterResult` with `{agent_id, org_id, lifecycle_state, created_at}` or structured error.
- `SyncResult.envelope_ok`: captures the backend's `{ok: true/false, ...}` envelope so call sites can distinguish transport success from business-logic success.
- `AgentCardStore._conn_ctx()` normalizes both LocalDB connection shapes (context-manager + raw). Enables use from both carl-studio's real LocalDB and test-helper wrappers.
- Tests: 3070 pass / 0 fail (+10 marketplace-flow covering register_recipe_shell happy/4xx/429/transport, envelope handling, CLI local-only + missing-token paths, content_hash required).
- Build: 0.13.0 wheel clean.
- Backend integration verified against contract shape (envelope + error codes + rate-limit headers) via mocked transport; live-endpoint smoke requires `CARL_CAMP_TOKEN`.
- Tool-dispatch extraction from `chat_agent.py` (blocked on `tool_dispatcher.py` API extension; non-trivial).
- `carl env` expanded to full 7-question design (reward · cascade · eval questions).
Decomposition + wizard release. First cut of the long-deferred
chat_agent.py god-class decomposition, plus the carl env
progressive-disclosure wizard MVP.
- `carl env`: new top-level CLI command. 4-question wizard (mode · method · dataset · compute) that builds a `carl.yaml` training config. Resume-capable via `~/.carl/last_env_state.json`. Flags: `--resume`, `--auto`, `--json`, `--dry-run`, `--output`. Functor-composed questions so answer order doesn't matter when fields are disjoint.
- `src/carl_studio/env_setup/` new package: `state.py` (`EnvState` Pydantic model), `questions.py` (registry + `next_question`), `render.py` (YAML emission).
- `SessionStore` extracted from `chat_agent.py` to a new `src/carl_studio/sessions.py` module. First cut of the multi-session god-class decomposition (was 3x deferred → auto-P1 per Anti-Deferral Protocol). `chat_agent.py` re-imports the extracted names for back-compat; all existing callers continue to work unchanged.
- `chat_agent.py` shrinks ~170 LOC.
- Tool-dispatch loop extraction (`chat_agent.py:1280-1475`) → candidate for v0.13 once `tool_dispatcher.py` gains the needed API. Coherence probe lives here.
- One-shot inference path (`_one_shot_text`, `_build_system_prompt`) → v0.13.
- Remaining `CARLAgent` class (~1700 LOC) → expected to settle naturally as tool-dispatch and prompt-building extract.
- Tests: 3060 pass / 0 fail (+19 carl-env; 3041 → 3060). All v0.11 surfaces unchanged. SessionStore tests pass via both the new import path (`carl_studio.sessions`) and the legacy path (`carl_studio.chat_agent`).
- Build: 0.12.0 wheel clean.
Fano-followthrough release. Closes the two P1-P2 findings that v0.10.0
left open + ships the first v0.9-designed feature (carl update).
- `Step.probe_call` audit trail (Fano V5 witnessability). When a registered coherence probe populates phi/kuramoto_r/channel_coherence, the Step records `{probe_name, inputs_sha256, output_sha256, populated}`. These are 12-hex digests, not full payloads, to preserve BITC axiom 1 bounded support. Serialized via `Step.to_dict()`.
- `success_rate_probe` in `carl_core.presence`. A default endogenous probe: reads the chain's own tail of same-action steps and returns `{kuramoto_r: success_rate}`. Pairs with `@coherence_gate` to close the IRE "G" realization end-to-end (Fano V7 45% → ~75%). Exported from `carl_core.__init__`.
- `carl update` command + `carl_studio.update` package. Surfaces recent git commits, PyPI dep-version deltas, and positive-framed blast-radius summary. `--dry-run` skips network; `--json` emits machine-readable; `--summary-only` for one-liner; `--detailed` for full lists. Consent-gated for network egress.
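A rough sketch of the probe and its audit record follows. Steps are modelled as plain dicts with `action` and `success` keys, the window size is arbitrary, and `populated` is assumed to list the coherence fields the probe filled in; none of these details are confirmed by the entry above, which only fixes the record's key names and the 12-hex digest length.

```python
import hashlib
import json
from typing import Any, Dict, List


def success_rate_probe(steps: List[Dict[str, Any]], action: str, window: int = 20) -> Dict[str, float]:
    """Use the success rate over the tail of same-action steps as kuramoto_r."""
    tail = [step for step in steps if step["action"] == action][-window:]
    if not tail:
        return {}  # nothing to measure, so populate nothing
    successes = sum(1 for step in tail if step["success"])
    return {"kuramoto_r": successes / len(tail)}


def short_digest(payload: Any) -> str:
    """First 12 hex characters of sha256 over a sorted-keys JSON rendering."""
    blob = json.dumps(payload, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


def probe_call_record(probe_name: str, inputs: Any, output: Dict[str, float]) -> Dict[str, Any]:
    """Build the bounded audit record: digests only, never the full payloads."""
    return {
        "probe_name": probe_name,
        "inputs_sha256": short_digest(inputs),
        "output_sha256": short_digest(output),
        "populated": sorted(output),  # assumed meaning: which fields were filled
    }
```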
- `Step` schema gained optional `probe_call` field (additive, backward-compatible).
- `carl_core.__init__` exports `success_rate_probe` alongside existing presence helpers.
- Tests: 3041 pass / 0 fail (3026 → 3041, +15 for `carl update`).
- Zero feature regression. All v0.10 surfaces unchanged.
- Build: 0.11.0 wheel + sdist clean.
Architecture-completion release. Closes the four gaps the four-agent vanilla peer review flagged against v0.8.0, plus shipping the initial marketplace agent-card client, the coherence-gated routing primitive, and the presence-report query helper. Validated by a Fano-plane (K_7) consensus pass across seven axes: boundedness, recurrence, endogenous measurability, contrastive coherence, witnessability, manifold integrity, gate realization.
v0.9 was skipped as a release tag — all v0.9-design work
(carl-update, carl-env) ships in v0.10 alongside the v0.10-A
primitives.
- CoherenceGate primitive (`carl_studio.gating`). Closes the `G` in IRE's `(M, I, Φ, G)` tuple. `CoherenceGatePredicate` reads tail-window Kuramoto R from the active chain; `@coherence_gate(min_R=...)` decorator raises `CoherenceError(code="carl.gate.coherence_insufficient")` when R is below threshold. Opt-in; stacks with `consent_gate` / `tier_gate`. `CoherenceSnapshot.is_degenerate` + `variance` field flag constant-probe signals without forcing deny.
- Coherence auto-attach on `InteractionChain.record()` (`carl_core.interaction`). Opt-in `register_coherence_probe(fn)` callback invoked at record time for `LLM_REPLY` / `TOOL_CALL` / `TRAINING_STEP` / `EVAL_PHASE` / `REWARD` action types when no explicit coherence kwargs are passed. Probe exceptions swallowed; non-dict returns ignored. Explicit kwargs always override the probe.
- PresenceReport + compose_presence_report (`carl_core.presence`). Thin composition helper, NOT a new primitive. Returns a frozen dataclass with R, psi, crystallization, Deutsch-Marletto `constructive` flag, recent action types, and a human-readable note. Registered as MCP tool `carl.presence.self` for agent self-introspection.
- Marketplace agent cards (`carl_studio.a2a.marketplace`). `MarketplaceAgentCard` Pydantic model aligned with the carl.camp `POST /api/sync/agent-cards` contract; `AgentCardStore` local SQLite persistence with paginated `list_all(limit, offset)`; `CampSyncClient` HTTP push with pluggable transport + 429 + batch-limit handling. `content_hash` canonicalization (sha256 over sorted-keys JSON). Distinct from the existing `CARLAgentCard` (running-instance manifest).
- Tool-call witness completeness (`chat_agent.py`). Every tool dispatch (ok, denied, schema_error, error) records an `ActionType.TOOL_CALL` step on the InteractionChain with `{outcome, result}` payload and measured `duration_ms`. Closes the pre-v0.10 gap where CLI + memory were logged but tool calls were not. Fire-and-forget recording; chain persistence failures never propagate.
- packages/carl-core/LICENSE: MIT text mirrored from repo root. carl-core ships as a separate wheel and now carries its own license file.
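The `content_hash` canonicalization (sha256 over sorted-keys JSON) can be illustrated in a few lines. The compact separators, UTF-8 encoding, and the exclusion of an existing `content_hash` field from its own input are assumptions of this sketch; the backend contract may pin these details differently.

```python
import hashlib
import json
from typing import Any, Dict


def content_hash(card: Dict[str, Any]) -> str:
    """sha256 hex digest over a sorted-keys JSON rendering of the card."""
    body = {key: value for key, value in card.items() if key != "content_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys makes the digest independent of dict insertion order, so two clients that build the same card in a different field order agree on the hash.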
- `emit_gate_event` extended with optional `gate_code` parameter that surfaces in the step output dict for downstream filtering. Back-compat: default `None` preserves v0.8 behavior for existing callers.
- `docs/private_integration.md` now documents the `load_private()` three-layer fallback contract (hardware-HMAC → `terminals-runtime` → HF private dataset → MIT-safe stub) + non-obligations (no pre-check required, no caching required).
- Fano-plane peer-review pattern (`AGENTS.md`). Dispatching 7 vanilla-context agents aligned to BITC/IRE axes (N=7 = K_7, complete mutual observation per BITC §6.1) before any major release tag. Each writes JSON-DAG findings; MECE coalesce produces consensus. Anti-patterns flagged directly feed `CLAUDE.md` for future-session filtering.
- Step schema extension for probe audit trail (`step.probe_call` sub-field): Fano V5 witnessability finding.
- Typed context manifold on InteractionChain: Fano V6 forward.
- Applying `@coherence_gate` to production call sites (training admission, marketplace publish, etc.): Fano V7 flagged zero production call sites today. The primitive is demonstrated end-to-end via `tests/test_fano_consensus_fixes.py` but live wiring is explicit v0.11 scope.
- `chat_agent.py` further decomposition (2,443 LOC): auto-promotes to P1 if re-deferred per Anti-Deferral Protocol.
- Tests: 3009 pass / 0 fail (2923 v0.8 core → 3009 now, +86 new).
- Peer review: two waves (4-agent v0.10 review + 7-agent Fano consensus K_7). All findings addressed or explicitly deferred with rationale.
- Build: `python -m build` produces clean 0.10.0 wheel + sdist.
- IP boundary: MIT carl-studio unchanged; no BUSL methodology copied; admin-gate + lazy-import seam preserved.
- κ: `KAPPA = 64 / 3` unchanged per Tej's ruling (exact from early Desai papers; terminals.tech's 21.37 is downstream calibration).
Consolidation release. No new product surfaces — four crystallization tracks collapse duplicated patterns from the v0.5→v0.7.1 arc into typed primitives, expose named plug-points for private-runtime extension, and publish the follow-up paper series justified by shipped work. Grounded in a four-agent review (isomorphism map · IP boundary · paper series · integration seams).
- `BaseGate[P: GatingPredicate]` in `carl_studio.gating`: shared generic owning the predicate → emit → raise loop. `consent_gate` and `tier_gate` delegate to it internally; public signatures, error codes, and decorator shapes are unchanged.
- `ConfigRegistry[T: BaseModel]` in `carl_studio.config_registry`: typed wrapper over `LocalDB.get_config` / `set_config` with Pydantic v2 validation. Auto-derived `namespace.modelname` keys; schema mismatch raises `CARLError(code="carl.config.schema_mismatch")`. `LocalDB` gained a `.config_registry(cls, *, namespace, key=None)` factory. `SpendTracker` migrated to persist `SpendState` under `carl.x402.spendstate`; the legacy two-key format is auto-migrated on first read (opt out via `CARL_CONFIG_MIGRATE=skip`).
- `BreakAndRetryStrategy` in `carl_core.resilience`: composes `RetryPolicy` and `CircuitBreaker` behind one `.run()` / `.run_async()` call. Raises `CircuitOpenError(code="carl.resilience.circuit_open")` when the breaker is open. x402 facilitator calls gained a strategy binding alongside the existing breaker (additive).
- `carl_studio.x402.register_confirm_callback(name, cb)`: named registry so `X402Config.confirm_payment_cb: str | Callable | None` can be resolved at execute time. Private runtimes persist a callback name via carl.camp settings; the direct `Callable` path is unchanged.
- `carl_studio.metrics.public_registry()` + `register_external_collector(collector)`: shared `CollectorRegistry` accessible to private dashboards; external collectors surface on `carl metrics serve` automatically (same registry).
- `carl_studio.tier.register_tier_resolver(fn)`: pluggable tier source. `TierPredicate._effective()` checks the resolver before falling back to `detect_effective_tier()`. Errors wrap as `CARLError(code="carl.tier.resolver_error")`.
- Paper series. `paper/carl-paper.md` → `paper/01-main-carl.md`. Added `02-phase-adaptive-methods.md`, `03-coherence-trap-technical-note.md`, `04-interaction-chains-witness-logs.md`, plus `docs/paper_series.md` index. All cross-references verified against v0.7.1 symbols.
- `docs/private_integration.md`: examples for the three plug-points.
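
The retry-plus-breaker composition described for `BreakAndRetryStrategy` can be sketched roughly as follows. This is a minimal illustration, not the shipped implementation: `SimpleBreaker`, `run_with_strategy`, and all parameter names are hypothetical stand-ins, and only the error code string is taken from the entry above.

```python
import time


class CircuitOpenError(RuntimeError):
    """Hypothetical stand-in for the error raised while the breaker is open."""
    code = "carl.resilience.circuit_open"


class SimpleBreaker:
    """Opens after N consecutive failures, allows a probe after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a probe call through once the cool-down has elapsed.
        return time.monotonic() - self.opened_at >= self.reset_after_s

    def record(self, ok):
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()


def run_with_strategy(fn, breaker, max_attempts=3, base_delay_s=0.0):
    """Retry fn with exponential backoff, consulting the breaker before each try."""
    last_exc = None
    for attempt in range(max_attempts):
        if not breaker.allow():
            raise CircuitOpenError("circuit open")
        try:
            result = fn()
        except Exception as exc:
            breaker.record(False)
            last_exc = exc
            if attempt < max_attempts - 1:
                time.sleep(base_delay_s * (2 ** attempt))
        else:
            breaker.record(True)
            return result
    raise last_exc
```

The point of the single entry point is that callers no longer have to decide whether the breaker wraps the retry loop or the other way round; in this sketch the breaker is checked on every attempt, so a tripped breaker ends the retry loop early.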
- `consent_gate` / `tier_gate` internal shape. Both are now thin delegates over `BaseGate`. External contract (signatures, error classes, error codes, decorator metadata) unchanged. 18 new `tests/test_gating_base.py` tests pin the shared primitive.
- `SpendTracker` persistence format. Legacy `carl.x402.spend_today` / `carl.x402.daily_reset_at` keys are replaced by `SpendState` JSON at `carl.x402.spendstate`. Idempotent migration on first read.
- `X402Config.confirm_payment_cb` type. Widened to `str | ConfirmPaymentCallback | None`. Existing Callable users see no behavior change.
- `PhaseTransitionGate` moved from inline in `carl_studio/__init__.py` to `carl_studio/training/gates.py`. Re-exported via lazy `__getattr__`; import time stays under 200ms. Seed-first resolution preserved.
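
An idempotent first-read migration of the kind described for `SpendTracker` might look like the sketch below. The three config keys come from the entry above; the `SpendState` field names, the plain-`dict` store, and the function name are assumptions made for illustration.

```python
import json

# Keys named in the changelog entry.
LEGACY_SPEND_KEY = "carl.x402.spend_today"
LEGACY_RESET_KEY = "carl.x402.daily_reset_at"
STATE_KEY = "carl.x402.spendstate"


def migrate_spend_state(config: dict) -> dict:
    """Fold the two legacy keys into one JSON blob; safe to call repeatedly."""
    if STATE_KEY in config:
        # Already migrated: just read the consolidated state back.
        return json.loads(config[STATE_KEY])
    state = {
        # Field names are hypothetical; the real SpendState model is not shown here.
        "spend_today": float(config.pop(LEGACY_SPEND_KEY, 0.0)),
        "daily_reset_at": config.pop(LEGACY_RESET_KEY, None),
    }
    config[STATE_KEY] = json.dumps(state, sort_keys=True)
    return state
```

Because the function returns early once the consolidated key exists, a second call reads back the same state instead of overwriting it with defaults.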
- `_ConsentFlagKeyShim` and module `__getattr__` deprecation trampoline in `consent.py` (marked for v0.8 removal since v0.6.3). Call sites must use the `ConsentKey` Literal directly.
- `TierError = TierGateError` alias in `carl_studio/agent/tier_gate.py`. Import `TierGateError` from the canonical location.
- Stale filesystem artifacts: `src/carl_studio/cli.py` (pre-CLI-collapse monolith), `src/carl_studio/x402_sdk.py` (moved to `x402_connection.py` in v0.5.0), `src/carl_studio/primitives/` (removed in v0.5.0 but Finder-restored as cruft). `tests/test_x402_sdk.py::test_x402_sdk_module_gone` now passes.
See docs/paper_series.md for the full index and cross-reference table
(paper ↔ shipped code path). Four-paper series covering main framework,
phase-adaptive methods, the coherence trap, and interaction-chain
witness logs.
Phase-2b close-out. Hardens v0.7.0's surfaces with multi-tenant MCP session isolation, an on-platform Prometheus scrape endpoint, structured budget caps for the x402 rail, and a trajectory-delta CLI for two runs.
- x402 spend caps. `SpendTracker` enforces a daily (`CARL_X402_DAILY_CAP_USD`) and session (`CARL_X402_SESSION_CAP_USD`) cap synchronously before any network call. Breaches raise `BudgetError` with stable codes `carl.budget.daily_cap_exceeded` / `carl.budget.session_cap_exceeded`. Daily rolling window persists through `LocalDB.config`.
- `confirm_payment` hook. `X402Client.execute` accepts an optional `confirm_payment_cb`: interactive or policy-based approval that fires after the budget check but before the consent gate. Denials raise `BudgetError(code="carl.budget.confirm_denied")` without recording a contract witness.
- MCP per-request session state. `MCPServerConnection.session` replaces the module-level `_session: dict`. Multi-tenant deployments now isolate auth per connection. FastMCP `Context` DI is supported on authenticated tools (`authenticate`, `get_tier_status`, `run_skill`, `dispatch_a2a_task`, `sync_data`); when bound it is preferred over the module-bound connection. See `docs/mcp_multitenant.md`.
- `carl metrics serve`. Thin Typer wrapper around `prometheus_client.start_http_server`; binds `127.0.0.1:9464/metrics` by default. Heartbeat daemon auto-hosts when `CARL_METRICS_PORT` is set. Private `CollectorRegistry` avoids polluting the global default.
- `carl run diff <a> <b> [--steps]`. Trajectory delta between two training runs: phi_mean / q_hat / crystallization_count / contraction_holds / first-divergence-step. Renders via `CampConsole`.
- Shared gating primitives. `carl_studio.gating.GatingPredicate` Protocol + `carl.gate.*` error-code namespace. `ConsentPredicate` and `TierPredicate` now emit identical `GATE_CHECK` steps on the active `InteractionChain`.
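
The ordering of the budget check, the confirm hook, and the network call can be illustrated with a small sketch. Only the three error codes are taken from the entries above; `SpendCaps`, `execute_payment`, and the stubbed `send` callable are hypothetical, and the consent gate that sits between the confirm hook and the network call is omitted.

```python
class BudgetError(Exception):
    """Stand-in carrying a stable error code."""

    def __init__(self, message, code):
        super().__init__(message)
        self.code = code


class SpendCaps:
    def __init__(self, daily_cap_usd=None, session_cap_usd=None):
        self.daily_cap_usd = daily_cap_usd
        self.session_cap_usd = session_cap_usd
        self.daily_spent = 0.0
        self.session_spent = 0.0

    def check(self, amount_usd):
        """Raise before any network call if the payment would breach a cap."""
        if (self.daily_cap_usd is not None
                and self.daily_spent + amount_usd > self.daily_cap_usd):
            raise BudgetError("daily cap exceeded", "carl.budget.daily_cap_exceeded")
        if (self.session_cap_usd is not None
                and self.session_spent + amount_usd > self.session_cap_usd):
            raise BudgetError("session cap exceeded", "carl.budget.session_cap_exceeded")

    def record(self, amount_usd):
        self.daily_spent += amount_usd
        self.session_spent += amount_usd


def execute_payment(caps, amount_usd, confirm_cb=None, send=lambda amount: "paid"):
    caps.check(amount_usd)                                  # 1. budget check
    if confirm_cb is not None and not confirm_cb(amount_usd):
        raise BudgetError("payment denied", "carl.budget.confirm_denied")  # 2. confirm hook
    result = send(amount_usd)                               # 3. network call (stubbed)
    caps.record(amount_usd)                                 # 4. only successful sends count
    return result
```

A denied confirmation raises before `send` runs, so nothing is recorded against either cap.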
- Heartbeat maintenance is wrapped in `RetryPolicy(max_attempts=3)` for transient sqlite / IO blips; failures still emit a structured log step but no longer tear down the daemon loop.
- `CARL_HOME` is now honoured uniformly across `db.py`, `settings.py`, `wallet_store.py`, and `llm.py` via the shared `carl_studio.settings.carl_home()` helper. The previous "partial override" caveat in `docs/operations.md` has been lifted.
- Unused `_SPEND_SESSION_KEY` reservation in `x402.py`.
- Unused private aliases `_marshal_sdk_result`, `_arg_is_missing`, `_estimate_cost`, `_marshal_sdk_response` in the MCP elicitation / sampling modules (the public helpers remain canonical).
The resonant-heartbeat release. Four-wave execution against a 48-ticket MECE
backlog synthesized from five parallel review teams. 2637 tests pass (+161 new).
Ships the target UX vision: bare carl greets with a pre-coded intro, extracts
JIT context from the first input, forks a resonant heartbeat loop over an
async sticky-note queue, and closes the loop with training-feedback proposals.
Fixes the framework's central reward/eval isomorphism: EvalGate now requires
coherence floor (φ ≥ SIGMA) in addition to primary metric, cascade gates on
crystallization events (not reward volume), and PhaseAdaptiveCARLReward
shifts weights by detected Kuramoto phase.
- `_tool_dispatch_cli` now passes `_scrubbed_subprocess_env()` to the child process (was leaking ANTHROPIC_API_KEY / HF_TOKEN / OPENAI_API_KEY / OPENROUTER_API_KEY into model-invoked subprocesses). Tracks REV-001.
- `CARLSettings.save()` excludes `openrouter_api_key` + `openai_api_key` from the YAML dump (previously persisted plaintext to `~/.carl/config.yaml`). Tracks REV-003.
- `EvalSandbox.execute_code` uses `sys.executable` instead of bare `"python"`, preventing phantom tool failures in docker / CI / multi-venv environments (REV-005).
- `CARLAgent._one_shot_text` accumulates `_total_cost_usd` / token counters on every API call. `max_budget_usd` was previously bypassable via `suggest_learnings()` auxiliary calls (REV-006).
- `TrainingPipeline._check_gate` returns `False` on exception (was silently PASSing OOM / network / missing-dataset failures, allowing potentially broken models to reach the Hub) (REV-007).
- `CascadeRewardManager.__init__` guards `warmup_steps = max(1, warmup_steps)` against `ZeroDivisionError` at `get_stage_weight` (REV-009).
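
The `_check_gate` fix is an instance of failing closed: an evaluation that crashes must not count as a pass. A minimal sketch of that pattern, with hypothetical names (`check_gate`, `evaluate`, `threshold`) rather than the pipeline's real signature:

```python
import logging

logger = logging.getLogger("gate_sketch")


def check_gate(evaluate, threshold):
    """Return True only when evaluation completes and clears the threshold."""
    try:
        score = evaluate()
    except Exception:
        # Fail closed: a crashed evaluation proves nothing about the model.
        logger.exception("gate evaluation failed; treating as not passed")
        return False
    return score >= threshold
```

The earlier behavior corresponds to returning `True` from the `except` branch, which is what let OOM or network failures look like a passing gate.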
- Deleted `carl_studio.primitives` compatibility shim (slated for v0.5.0 removal per CLAUDE.md, still present at v0.5.0 ship). Zero in-tree consumers verified. `import carl_studio.primitives` now raises `ModuleNotFoundError` (SIMP-001).
- Renamed `carl_studio.agent.CARLAgent` (FSM autonomy agent) → `AutonomyAgent` to resolve the collision with the canonical `carl_studio.chat_agent.CARLAgent` used by all CLI paths. Module-level `__getattr__` emits `DeprecationWarning` on legacy import; removal scheduled for v0.7 (SIMP-002).
- `.gitignore` hardened against `/fix_*.py` and `/patch_*.py` scratch scripts (SIMP-008).
- `ActionType.HEARTBEAT_CYCLE` + `ActionType.STICKY_NOTE` on `InteractionChain` (ARC-008).
- `sticky_notes` SQLite table + `src/carl_studio/sticky.py` module with `StickyNote` (Pydantic v2) and `StickyQueue` supporting priority-ordered append/dequeue/complete/archive/get/status (ARC-004).
- `src/carl_studio/jit_context.py`: `JITContext` model, `TaskIntent` enum (EXPLORE/TRAIN/EVAL/STICKY/FREE), `extract()` with move-key short-circuit + regex classification, `WorkFrame`-aware `frame_patch` builder (ARC-002).
- `src/carl_studio/cli/intro.py`: env-baked pre-coded intro with 4 keyed moves `[e]xplore` `[t]rain` `e[v]aluate` `[s]ticky` + free-form. Rendered before first `input()` in `chat_cmd`. Zero-latency, no I/O. `parse_intro_selection()` accepts single-letter or full-word form. Rich markup escape pins UAT-052 regression (ARC-001 + SIMP-009).
- `carl queue` CLI sub-app: `add` / `list` / `status` / `clear` backed by `StickyQueue`. Doctor gains a "Queue" stanza reporting pending count (ARC-007).
- `src/carl_studio/heartbeat/`: package with `HeartbeatPhase` enum (INTERVIEW → EXPLORE → RESEARCH → PLAN → EXECUTE → EVALUATE → RECOMMEND → AWAIT), `HeartbeatLoop` daemon-thread async loop draining the sticky queue, `HeartbeatConnection(AsyncBaseConnection)` participating in the Connection registry with full FSM lifecycle. Thread-safe sqlite via fresh `LocalDB` per worker thread (ARC-003 + ARC-006).
- `src/carl_studio/feedback.py`: `FeedbackEngine` + `EvalBaseline` + `TrainingProposal` Pydantic models. `cli/training.py` gains a `--from-queue` flag that loads a pending proposal. Proposals are persisted to `LocalDB.config` under stable keys (ARC-005).
- Bootstrap phase: after turn-1, `ctx.frame_patch` is applied to `agent._frame` via `model_copy(update=...)` (only when frame is inactive, preserving `--frame` overrides) (JRN-002).
- Greeting gate uses `self._turn_count <= 1` (was `len(self._messages) <= 1`), so resumed sessions don't re-greet (REV-004).
- `_tool_frame` invalidates the `_constitution_prompt` cache so frame-adapted rules re-compile for the new domain (REV-010).
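
The move-key short-circuit followed by regex classification can be sketched as below. The enum members and the four move keys (`e`, `t`, `v`, `s`) follow the entries above; the regex patterns and the `classify` function name are illustrative assumptions, since the shipped patterns are not part of this changelog.

```python
import re
from enum import Enum


class TaskIntent(Enum):
    EXPLORE = "explore"
    TRAIN = "train"
    EVAL = "eval"
    STICKY = "sticky"
    FREE = "free"


# Single-letter and full-word forms of the four intro moves.
MOVE_KEYS = {
    "e": TaskIntent.EXPLORE, "explore": TaskIntent.EXPLORE,
    "t": TaskIntent.TRAIN, "train": TaskIntent.TRAIN,
    "v": TaskIntent.EVAL, "evaluate": TaskIntent.EVAL,
    "s": TaskIntent.STICKY, "sticky": TaskIntent.STICKY,
}

# Hypothetical patterns, checked in order; first match wins.
PATTERNS = [
    (re.compile(r"\b(train|fine-?tune|grpo|sft)\b", re.I), TaskIntent.TRAIN),
    (re.compile(r"\b(eval|evaluate|benchmark)\b", re.I), TaskIntent.EVAL),
    (re.compile(r"\b(remind|note|later|queue)\b", re.I), TaskIntent.STICKY),
    (re.compile(r"\b(explore|browse|inspect)\b", re.I), TaskIntent.EXPLORE),
]


def classify(text):
    key = text.strip().lower()
    if key in MOVE_KEYS:
        # Short-circuit: an exact move key needs no regex work.
        return MOVE_KEYS[key]
    for pattern, intent in PATTERNS:
        if pattern.search(text):
            return intent
    return TaskIntent.FREE
```

Checking the exact keys first keeps the common path (a user pressing one of the intro letters) cheap and unambiguous; free-form text only falls through to the regexes when it is not one of those keys.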
- `EvalGate.check` requires both primary metric AND coherence floor (`phi_mean ≥ SIGMA`, `0.3 ≤ discontinuity ≤ 0.7`). Restores the reward/eval isomorphism the package has always claimed. Legacy construction preserved for backward compat; new gate active when `EvalConfig.require_coherence_gate=True` (default) (SEM-001).
- `CascadeRewardManager(gate_mode="crystallization")` fires on `sum(trace.n_crystallizations) >= N` over a configurable window: a phase-transition signature, not a reward-volume percentile. Metric mode preserved (SEM-006).
- `PhaseAdaptiveCARLReward(CARLReward)` reads Kuramoto-R from `_last_traces` and shifts `(w_mc, w_cq, w_disc)` by detected phase: gaseous rewards commitment, liquid balances, crystalline rewards stability (SEM-010).
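
The combined gate condition from SEM-001 can be written out as a small predicate. This is a sketch under assumptions: `EvalResult`, `gate_passes`, and `metric_threshold` are hypothetical names, and `sigma` is left as a parameter because the value of SIGMA is not stated in this changelog.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    primary_metric: float
    phi_mean: float
    discontinuity: float


def gate_passes(result, metric_threshold, sigma, require_coherence_gate=True):
    """Pass only if the primary metric AND the coherence conditions hold."""
    metric_ok = result.primary_metric >= metric_threshold
    if not require_coherence_gate:
        return metric_ok                      # legacy behavior: metric only
    coherence_ok = result.phi_mean >= sigma   # coherence floor
    discontinuity_ok = 0.3 <= result.discontinuity <= 0.7
    return metric_ok and coherence_ok and discontinuity_ok
```

With the coherence gate enabled, a run that scores well on the primary metric but sits below the coherence floor is rejected, which is the case the metric-only gate used to let through.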
- `carl init` persists `default_chat_model` so bare `carl` just works post-init (JRN-004).
- `carl doctor` prints a "Next steps" guide block (gated by first-run marker age) (JRN-005).
- Optional sample-project scaffold in `carl init` (JRN-006).
- Post-init celebration + next-step pathways (JRN-008).
- `session_theme` persisted with agent state (`move:explore/train/evaluate/sticky` or free-form) (JRN-009).
- Optional GitHub repo / HF model context-gathering (JRN-010).
- `_pump_events` helper shared between `chat_cmd` and `run_one_shot_agent` (SIMP-003).
- `parse_flags()` replaces 12 handrolled arg loops in `operations.py` (-69 LOC) (SIMP-006).
- `_PROMPT_OPS` dict consolidates 6 prompt-template macro ops (SIMP-010).
- Operation descriptions appear in `carl flow --list` (JRN-007).
- 7 error classes (`ContractError` / `ConsentError` / `MarketplaceError` / `SyncError` / `X402Error` / `BillingError` / `CreditError`) migrated onto the `CARLError` hierarchy with stable codes (`carl.contract` etc.). Network failures now use multiple inheritance with `NetworkError` (`MarketplaceNetworkError`, `CreditNetworkError`). Secret redaction via `to_dict()` is now active for all of them (SIMP-004).
- `x402_sdk.py` deleted; `X402SDKClient` folded into `x402_connection.py` (SIMP-005).
- `CARLSettings.SETTABLE_FIELDS` derived dynamically from `cls.model_fields`; `load()` uses `local_data.keys() & model_fields.keys()` instead of a hardcoded allow-list (SIMP-007).
- `LocalDB._connect` commits on clean exit, rolls back on exception, and serializes via `threading.Lock`. `sqlite3.connect(check_same_thread=False)` so the lock can actually do its job (REV-002).
- `LocalDB.__init__` calls `self.close()` on `_init_schema` failure so a corrupt DB doesn't leak a half-open connection (REV-008).
- MCP `_session` documented + single-tenant banner on server startup; per-request migration path documented for v0.7 (UAT-049).
- A2A `send` CLI validates that `--inputs` is a JSON object and that `skill` is in the `BUILTIN_SKILLS` registry (UAT-050).
- MCP `sync_data` logs JWT cache-write failures at DEBUG instead of silently swallowing them (UAT-051).
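
The commit-on-success, rollback-on-exception, lock-serialized connection pattern from REV-002 can be sketched with the standard library alone. `TinyDB` and its `kv` table are hypothetical; only the use of `threading.Lock` and `check_same_thread=False` mirrors the entry above.

```python
import sqlite3
import threading
from contextlib import contextmanager


class TinyDB:
    def __init__(self, path=":memory:"):
        # check_same_thread=False lets other threads use this connection;
        # the lock below is then what actually serializes access.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._connect() as conn:
            conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")

    @contextmanager
    def _connect(self):
        with self._lock:
            try:
                yield self._conn
            except Exception:
                self._conn.rollback()   # undo partial writes from the failed block
                raise
            else:
                self._conn.commit()     # persist only on clean exit
```

Without `check_same_thread=False`, sqlite3 raises as soon as a second thread touches the connection, so the lock would never get the chance to serialize anything.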
- `carl_core.connection.coherence.ChannelCoherence`: per-transaction observable (phi_mean / cloud_quality / success_rate / latency_ms) that any channel can publish. `channel_coherence_diff()` + `channel_coherence_distance()` make the 1P/3P isomorphism claim measurable. `BaseConnection` gains a `channel_coherence()` reader + `publish_channel_coherence()` setter; `to_dict()` surfaces it (SEM-002).
- `Step` gains optional `phi` / `kuramoto_r` / `channel_coherence` fields. `InteractionChain.coherence_trajectory()` returns the phi-vs-step series across all channels; the chain is now a witness, not just a log (SEM-007).
- `carl_core.dynamics.ContractionProbe`: records trajectory, fits contraction constant q_hat via log-ratio OLS, fires `contraction_violation` on divergence from Banach-style contraction. Opt-in via `CARL_CONTRACTION_PROBE=1` through `ResonanceLRCallback`; logs `dynamics/q_hat` (SEM-003).
- `test_conservation_law.py`: smoke test for KAPPA · SIGMA = 4 and ∫(1-phi_t) ≈ SIGMA · T_STAR within 10× tolerance on a synthetic cooling trajectory (`@pytest.mark.slow`) (SEM-008).
- `CoherenceProbe.measure_multi_layer(hidden_states, logits, token_ids)` returns `LayeredTrace` with per-layer residual cosine + optional attention entropy. Gated by `CARL_LAYER_PROBE=1`; fast logits-only path preserved (SEM-004).
- `GRPOReflectionCallback.on_log` computes `tau` from CARL reward's `_last_traces` via Kuramoto-R in the public training path. Publishes `witness/tau` + `witness/kuramoto_R` to the logs stream. Previously the crystalline gate was permanently closed for public users; TTT micro-update now fires on real phase transitions (SEM-009).
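
One plausible reading of "fits contraction constant q_hat via log-ratio OLS" is a least-squares fit on the log of successive step sizes: if d_t ≈ d_0 · q^t, then log d_t is linear in t with slope log q. The sketch below implements that reading only; the function names, the input format, and the violation tolerance are assumptions, and the shipped `ContractionProbe` may fit q_hat differently.

```python
import math


def fit_q_hat(step_sizes):
    """Estimate q from d_t ~ d_0 * q**t by OLS on the points (t, log d_t)."""
    pts = [(t, math.log(d)) for t, d in enumerate(step_sizes) if d > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive step sizes")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return math.exp(sxy / sxx)   # slope is log q, so exponentiate


def contraction_violation(step_sizes, tol=1.0):
    """A fitted q_hat at or above tol means the steps are not shrinking."""
    return fit_q_hat(step_sizes) >= tol
```

For a Banach-style contraction the step sizes shrink geometrically, so q_hat stays below 1; a fitted value at or above 1 is what this sketch treats as a violation.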
- `carl_studio.primitives` compatibility shim (see WAVE-0 above).
- `carl_studio.x402_sdk` module: consolidated into `x402_connection` (see WAVE-2 above).
0.4.1 — 2026-04-18
Security + correctness hotfix on top of 0.4.0. Driven by an ultrareview pass across the 7-commit 0.4.0 window. Closes 2 P0 security holes, 9 P1 correctness issues, and 3 P2 tech-debt items. Adds 53 new tests (1864 → 1917).
- `list_files` sandbox bypass (`chat_agent.py`): the `_tool_list` handler called `Path(path).glob(pattern)` directly, letting a prompt-injected model enumerate `/`, `/etc`, etc. Now routes through `_resolve_safe_path` and rejects absolute/traversal globs; every match is re-verified inside workdir.
- `run_shell` in eval sandbox (`eval/runner.py`): `subprocess.run(cmd, shell=True, ...)` on model-generated strings allowed shell metacharacters (`;`, `|`, `$(`, backticks, `>`, `<`) to escape the tempdir. Now hard-rejects metacharacters, tokenizes via `shlex.split`, and runs with `shell=False`. Eval datasets that relied on pipelines must migrate to `execute_code`.
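
The `run_shell` hardening combines three steps: reject metacharacters, tokenize with `shlex.split`, and run without a shell. A minimal sketch of that sequence follows; the function name and the exact rejection list are illustrative (the list mirrors the characters named above), and the real handler may check more.

```python
import shlex
import subprocess

# Metacharacters named in the changelog entry.
SHELL_METACHARS = (";", "|", "$(", "`", ">", "<")


def run_shell_safe(cmd, cwd=None, timeout=30):
    """Run a plain command without a shell; refuse shell metacharacters."""
    for token in SHELL_METACHARS:
        if token in cmd:
            raise ValueError(f"shell metacharacter {token!r} is not allowed")
    argv = shlex.split(cmd)          # quote-aware tokenization
    if not argv:
        raise ValueError("empty command")
    # shell=False: argv[0] is executed directly, nothing is re-parsed by a shell.
    return subprocess.run(argv, shell=False, cwd=cwd, timeout=timeout,
                          capture_output=True, text=True)
```

Because the argument vector is passed straight to the OS, anything that survived the metacharacter check is treated as a literal argument rather than shell syntax.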
- Infinite denial loop: when the permission hook denied every tool in a turn, the agent retried forever. `_MAX_CONSECUTIVE_ALL_DENIED=5` terminates with a `carl.all_tools_denied` error event; the counter resets on any allowed tool.
- `asyncio.get_event_loop()` inside async fn (`trainer.py:_watch_loop`): would raise `RuntimeError` on Python 3.12+. Switched to `asyncio.get_running_loop()`.
- `run_analysis` environment leak (`chat_agent.py`): child subprocess inherited ANTHROPIC_API_KEY, HF_TOKEN, CARL_WALLET_PASSPHRASE, etc. Now scrubs sensitive env vars (substring match on KEY/TOKEN/SECRET/PASSWORD/PASSPHRASE/AUTHORIZATION/BEARER/API_KEY) and uses `sys.executable` instead of bare `"python"`.
- `_resolve_safe_path` TOCTOU (`chat_agent.py`): was passing `follow_symlinks=True`, defeating the protection that `carl_core.safepath`'s default provides. Now `follow_symlinks=False`; legitimate symlink use requires explicit opt-in.
- Session quarantine silent data loss (`chat_agent.py` + `cli/chat.py`): corrupted sessions were moved to `.quarantine/` with no user-visible warning. Now logs a warning with the destination path, exposes `_last_load_quarantined` on CARLAgent, and surfaces a visible CLI message on resume.
- `_knowledge` list unbounded (`chat_agent.py`): every `ingest_source` appended without a cap. Added `_KNOWLEDGE_MAX_CHUNKS=2000` with LRU eviction and a configurable `max_knowledge_chunks` kwarg. One-warning-per-session policy.
- `MemoryStore.decay_pass` races with `write` (`carl_core/memory.py`): the tmp-replace pattern dropped concurrent appends. Added a per-instance `threading.RLock` guarding both `write` and `decay_pass`.
- TinkerAdapter zombie state (`adapters/tinker_adapter.py`): `submit()` persisted PENDING state then unconditionally raised "not yet implemented", leaking state files nobody could observe. Now raises immediately with `carl.adapter.tinker_not_implemented` after translation validation, with no state written.
- UnslothAdapter silent sys.exit(3) (`adapters/unsloth_adapter.py`): the entrypoint template handled only `sft` / `grpo` but the allowlist accepted `dpo` / `kto` / `orpo`. Now validates the method at translation time and raises `carl.adapter.method_unsupported` before subprocess spawn.
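
The environment scrub used for child processes amounts to a substring filter over variable names. A minimal sketch, assuming the marker list from the `run_analysis` entry above; the function name and the case-insensitive comparison are illustrative choices:

```python
import os

# Markers listed in the changelog entry; "API_KEY" is already covered by "KEY"
# but is kept here to mirror the source list.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "PASSPHRASE",
                     "AUTHORIZATION", "BEARER", "API_KEY")


def scrubbed_env(env=None):
    """Copy the environment, dropping variables whose names look sensitive."""
    source = os.environ if env is None else env
    return {
        name: value
        for name, value in source.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }
```

The result would be passed as `env=` to `subprocess.run`, so the child sees `PATH`, `HOME`, and similar variables but none of the credentials. Substring matching is deliberately broad: it may drop a harmless variable such as one containing `KEY` in its name, which is the safer failure mode.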
- `CircuitBreaker` counted programming errors: `AttributeError` / `TypeError` tripped the breaker. Added a `tracked_exceptions` tuple (default `(Exception,)` for back-compat); callers can scope to `(NetworkError, TimeoutError, ...)`. The x402 facilitator breaker is now scoped to infrastructure failures only.
- Unsloth quantization double-flag: `load_in_4bit=True` + `load_in_8bit=True` were both passed to FastLanguageModel. Rewrote as a mutually exclusive precedence chain.
- `api_key=""` pattern (`cli/hypothesize.py`, `cli/commit.py`): passing an empty string blocked the Anthropic SDK's env-var fallback. Now `api_key=None` per CLAUDE.md convention.
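
Scoping a breaker to infrastructure failures can be sketched as follows. `ScopedBreaker` is a hypothetical stand-in; only the `tracked_exceptions` parameter name and its `(Exception,)` default come from the entry above.

```python
class ScopedBreaker:
    """Counts only exceptions in tracked_exceptions toward tripping.

    Untracked exceptions (for example a TypeError from a bug) propagate
    to the caller without changing the failure count.
    """

    def __init__(self, failure_threshold=3, tracked_exceptions=(Exception,)):
        self.failure_threshold = failure_threshold
        self.tracked_exceptions = tracked_exceptions
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def call(self, fn, *args, **kwargs):
        if self.is_open:
            raise RuntimeError("circuit open")
        try:
            result = fn(*args, **kwargs)
        except self.tracked_exceptions:
            self.failures += 1    # infrastructure failure: counts toward tripping
            raise
        self.failures = 0         # success resets the count
        return result
```

With the default `(Exception,)` the breaker behaves as before; narrowing the tuple keeps a programming error from masking itself as an outage.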
- Adapter shared boilerplate (`adapters/_common.py`): extracted `status_common`, `logs_common`, `cancel_common`, `require_str`. Each adapter's status/logs/cancel is now a one-line delegate (~80 LOC removed across 5 adapters).
- `trainer._watch_loop` duplicate branches: collapsed retryable vs non-retryable except blocks into a single `isinstance` dispatch (~35 → 18 LOC).
- `trainer._save_carl_checkpoint` nested try/except: extracted a `_safe_capture(label, fn)` helper; five nested blocks → one-liners.
- `constitution.py` overlay error wrap: narrowed the catch-all so precise inner `ConfigError` / `ValidationError` codes (`bad_yaml`, `bad_rule`) propagate instead of being overwritten by the coarser `bad_user_overlay`.
- `AGENTS.md` test baseline updated to current reality (1864 → 1917 post-fix).
- v0.4.0: 1864 passing
- v0.4.1: 1917 passing (+53 new covering every fix above)
- Eval datasets using `run_shell` with pipes/redirection/substitution must migrate to `execute_code` (Python). Plain commands (`ls`, `cat`, `python script.py`, `echo hello`) continue to work.
- Symlinks inside a chat agent workdir are now rejected by file tools by default.
0.4.0 — 2026-04-18
The "intelligence loop" release. carl-studio is now a proper research hub with
typed error codes, retry/backoff/circuit-breaker primitives, layered memory,
constitutional rules, a hypothesize→eval→infer→commit fractal, and adapters
that let you drive Unsloth, Axolotl, Tinker, and Atropos from the same
`carl train --backend X` surface.
Every workflow composes from:
- `carl chat`: the meta-loop (bare `carl` opens chat)
- `carl hypothesize "<statement>"`: translate a hypothesis to carl.yaml
- `carl train --backend <trl|unsloth|axolotl|tinker|atropos>`: run the experiment
- `carl infer --propose-hypothesis`: observe and propose the next step
- `carl commit "<learning>"`: promote working memory to constitutional memory
- `carl flow "/a /b /c"`: compose chains
- `carl_core.errors`: typed hierarchy with stable codes, auto-redacted secrets
- `carl_core.retry`: retry + exponential backoff + circuit breaker
- `carl_core.safepath`: symlink-escape-proof path sandboxing
- `carl_core.hashing`: canonical-JSON `content_hash`
- `carl_core.tier`: `Tier` enum + `FEATURE_TIERS` registry
- `carl_core.interaction`: typed `InteractionChain` with 11 `ActionType`s including `MEMORY_READ` / `MEMORY_WRITE`
- `carl_core.interaction_store`: JSONL trace persistence with flock
- `carl_core.memory`: 6-layer memory store (ICONIC/ECHOIC/SHORT/WORKING/LONG/CRYSTAL) with resonance-driven recall, decay, promotion
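
A canonical-JSON content hash is typically built by serializing with a fixed key order and fixed separators before hashing. The sketch below shows that general technique; the choice of SHA-256, the `json.dumps` options, and the function signature are assumptions, not a description of what `carl_core.hashing.content_hash` actually does.

```python
import hashlib
import json


def content_hash(obj):
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys and pinning separators means two dicts with the same content hash identically regardless of insertion order, which is what makes the digest usable as a stable identifier.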
- `carl_studio.constitution`: `Constitution.load()` merges CLAUDE.md + AGENTS.md + `~/.carl/constitution.yaml`; `compile_system_prompt(topics=...)` filters by resonance tags; `append()` persists new rules
- `carl hypothesize`: CARLAgent translates NL → carl.yaml
- `carl infer --propose-hypothesis`: reads eval report + coherence trace, proposes next experiment
- `carl commit`: writes to `~/.carl/constitution.yaml`; `--from-session <id>` extracts durable learnings from a saved session
- CARLAgent injects constitution into system prompt; recalls from WORKING/LONG memory on every turn; emits MEMORY_READ/MEMORY_WRITE steps
- Session-end auto-commit: agent proposes rules to promote when a session has ≥3 turns
carl_studio.adapters.{trl,unsloth,axolotl,tinker,atropos} — each implements
the UnifiedBackend protocol. Honest available() checks. CARL gate +
coherence rewards layer on top regardless of backend.
- Runtime (`chat_agent.py`): streaming try/finally with partial-cost persistence, session corruption quarantine + schema_version, budget pre-check with BudgetError code, per-tool timeouts (30/60/120s), tool arg JSON-schema validation, `dispatch_cli` tool whitelist-gated via OPERATIONS registry + tier
- Training (`trainer.py`, `rewards/*`, `callbacks.py`): `.carl_checkpoint.pt` on crash with full RNG state, `.watch()` retry via `carl_core.retry` + exp backoff + 5-consecutive-failure abort, `_clamp_reward` floor (NaN/inf→0, |x|>100 clipped), logits-shape guards with `ValidationError`, callback body exception isolation
- Eval (`eval/runner.py`): sandbox via `carl_core.safepath.safe_resolve` (blocks symlink escape + traversal), Phase 2' per-turn GPU tensor cleanup in try/finally, empty-results and zero-tool-call branches emit typed metrics instead of NaN
- x402 (`x402.py`): retry with jittered exp backoff, module-level `CircuitBreaker`, wallet balance pre-check, strict header parsing, InteractionChain PAYMENT steps
- Contract/consent (`contract.py`, `consent.py`): verify-on-load raises `ContractBroken`, `sign()` consent gate without swallow, `ConsentManager.sync_with_profile` uses timestamp precedence, flock-serialized updates
- camp (`camp.py`): 24h TTL cache with 7× stale-serve window, JWT 401 → `refresh_token` exchange, tier-change signal to LocalDB
- Credits (`trainer.py`, `credits/*`): synchronous deduction (no try-and-ignore), `--skip-credits` escape hatch, refund on post-submission failure
- Marketplace (`cli/marketplace.py`): idempotency keys, local publish cache, backend 409 recovery, `--force` bypass
- CLI: 11→23 flow ops, `carl flow --json`, `--no-continue-on-failure`, unified `error_with_hint` formatter, `carl camp init` / `camp flow` registered, `carl hypothesize` / `carl commit` top-level
- Wallet (`wallet_store.py`): Fernet + PBKDF2-HMAC-SHA256 (600k iters) at `~/.carl/wallet.enc` mode 0o600, OS keyring fallback, `WalletLocked` / `WalletCorrupted` typed
- Freshness (`freshness.py`): `FreshnessReport` + `FreshnessIssue` with stable `carl.freshness.*` codes, 24h TTL
- E2E UAT: 14 new failure-path scenarios exercising every typed code
- Property tests: 25 Hypothesis properties (hashing, safepath, retry, coherence, frame, x402)
- Pytest `importlib` mode + explicit `pythonpath` for `tests/` + `packages/carl-core/tests/` coexistence
- `py.typed` marker on carl-core
- Test baseline: 1103 (v0.3.0) → 1864 (v0.4.0)
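
The reward floor described for `_clamp_reward` (NaN/inf mapped to 0, magnitudes above 100 clipped) can be expressed in a few lines. This is a sketch of the stated behavior with a hypothetical public name and a `limit` parameter added for illustration:

```python
import math


def clamp_reward(value, limit=100.0):
    """Map NaN/inf to 0.0 and clip finite rewards to [-limit, limit]."""
    if not math.isfinite(value):
        return 0.0
    return max(-limit, min(limit, value))
```

Treating non-finite rewards as zero keeps a single bad sample from poisoning a batch average, while the clip bounds the influence of any one outlier.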
- `from carl_studio.primitives import X` still works via shim in 0.4.x, but prefer `from carl_core.X import ...`. (The shim is removed in the next release; see the Unreleased section.)
- `carl lab chat` removed in 0.3.0; use `carl lab repl` or `carl chat`.
- First-run wizard: `carl init` walks through signup + extras + consent + project in under a minute.
Inspired by the old Oppenheimer/Einstein-watching-Demis-and-Karpathy analogy: a small crystallized constitution (the polymath elders) oversees a swarm of session-scoped agents (the frontier researchers) whose working memory decays but whose durable learnings promote upward. Every experiment is a hypothesis; every hypothesis carries a predicted metric; every result can propose the next hypothesis. The loop closes itself.