Commit 95f62fe

hanamorix, Hana, and claude authored
Brain Health PR-2: wire helpers + heartbeat anomaly aggregation + nell health CLI (#19)
* feat(health): wire attempt_heal/save_with_backup into config + state files - Health T9

  PersonaConfig, UserPreferences, HeartbeatConfig._load_internal, and HeartbeatState all now route through attempt_heal() (corrupt → quarantine + .bak restore) and save_with_backup() (atomic .bak rotation). Each class gains a load_with_anomaly classmethod returning (instance, BrainAnomaly|None); existing load() delegates and logs at WARNING on anomaly. HeartbeatConfig gains _load_internal_with_anomaly parallel to _load_internal. 8 new tests cover quarantine-and-reset + restore-from-bak paths. 578→586 passing.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(health): wire attempt_heal/save_with_backup into identity files - Health T10

  InterestSet, ReflexArcSet, ReflexLog, load_persona_vocabulary, growth scheduler's _read_current_vocabulary_names and _append_to_vocabulary all now route through attempt_heal() (corrupt → quarantine + .bak restore) and save_with_backup() (atomic .bak rotation). Each affected class gains a load_with_anomaly classmethod returning (instance, BrainAnomaly|None); existing load() delegates and logs at WARNING on anomaly. 9 new tests cover quarantine-and-restore-from-bak + corrupt-no-bak-reset-to-default paths. 586→595 passing.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(health): generalize read_jsonl_skipping_corrupt to all log readers - Health T11

  - brain/growth/log.py: read_growth_log now delegates the inline splitlines+json.loads loop to read_jsonl_skipping_corrupt, then calls _event_from_dict per entry. Removes ~15 lines of duplicate parse logic; schema-error on a well-formed JSON line is still caught and logged via the module logger. Test updated to match the shared helper's warning text ("malformed jsonl line").
  - brain/engines/research.py: ResearchLog.load now routes through attempt_heal with a schema validator; corrupt files are quarantined and the most recent .bak is restored automatically. save() upgraded from bare tmp+os.replace to save_with_backup, feeding the adaptive-treatment layer. Unused json/os imports removed.
  - brain/engines/reflex.py: ReflexLog.load gets the same attempt_heal treatment; a WARNING with anomaly kind/action/file is emitted on corruption, matching the ReflexArcSet pattern already in place.
  - heartbeat.py / dream.py: confirmed no-op: two json.loads calls in heartbeat are post-write verification, not log readers; dream engine has no reader.
  - 6 new tests added (3 reflex, 3 research): corrupt-quarantines-warns, heals-from-bak, and load-missing-returns-empty for both log types. Full suite 601/601 green; ruff clean.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(health): heartbeat anomaly aggregation + audit log + cross-file walk + compact CLI - Health T12

  - HeartbeatResult gains anomalies: tuple[BrainAnomaly, ...] and pending_alarms_count: int
  - HeartbeatConfig.load() refactored through load_with_anomaly() so the merge logic is shared
  - run_tick collects anomalies from config + state loads via _with_anomaly variants
  - >=2 anomalies per tick triggers walk_persona() cross-file scan (deduped by file+kind)
  - compute_pending_alarms() called every tick; count written to audit log + HeartbeatResult
  - audit log entries always carry "anomalies": [...] and "pending_alarms_count": int
  - compact CLI: banner above engine status when pending_alarms_count > 0; 🩹 line when self-healed anomalies exist but no pending alarms
  - 7 new tests (4 heartbeat engine, 3 CLI); 608/608 passing; ruff clean

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(health): nell health show/check/acknowledge CLI - Health T13

  Three read-only / audit-append subcommands wired to brain.health:
  - `nell health show --persona X` prints pending alarms + recent self-treatments from the heartbeats audit log (last 7 days).
  - `nell health check --persona X` runs walk_persona, prints per-file ✅/⚠️/❌ status, exits 2 on unhealable alarms.
  - `nell health acknowledge --persona X [--file F|--all]` appends a user_acknowledged entry to the audit log; no destructive writes.

  No restore/add/delete actions wired (SystemExit on unknown actions enforced by argparse). 7 new tests, 615/615 passing, ruff clean.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: ruff format brain-health PR-2 (whitespace only)

---------

Co-authored-by: Hana <hana@nanoclaw.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent edbb39c commit 95f62fe

22 files changed

Lines changed: 1665 additions & 215 deletions

brain/cli.py

Lines changed: 220 additions & 0 deletions
@@ -20,12 +20,16 @@
 from brain.engines.heartbeat import HeartbeatEngine
 from brain.engines.reflex import ReflexEngine
 from brain.engines.research import ResearchEngine
+from brain.health.alarm import compute_pending_alarms
+from brain.health.jsonl_reader import read_jsonl_skipping_corrupt
+from brain.health.walker import walk_persona
 from brain.memory.hebbian import HebbianMatrix
 from brain.memory.store import MemoryStore
 from brain.migrator.cli import build_parser as _build_migrate_parser
 from brain.paths import get_persona_dir
 from brain.persona_config import PersonaConfig
 from brain.search.factory import get_searcher
+from brain.utils.time import iso_utc
 
 
 def _resolve_routing(persona_dir: Path, args: argparse.Namespace) -> tuple[str, str]:
@@ -185,6 +189,15 @@ def _heartbeat_handler(args: argparse.Namespace) -> int:
         )
     else:
         verbose = getattr(args, "verbose", False)
+
+        # Health banner — printed BEFORE engine status lines so it's visible
+        # at the top. Only shown when there are unacknowledged alarms.
+        if result.pending_alarms_count > 0:
+            print(
+                f"⚠️ Brain alarm — needs your attention. "
+                f"Run `nell health show --persona {args.persona}` for details."
+            )
+
         print(f"Heartbeat tick complete ({args.trigger}).")
         print(f" elapsed: {result.elapsed_seconds / 3600:.2f}h")
         print(f" decayed: {result.memories_decayed} memories, pruned {result.edges_pruned} edges")
@@ -216,6 +229,21 @@ def _heartbeat_handler(args: argparse.Namespace) -> int:
             print(f" interests bumped: {result.interests_bumped}")
         elif verbose:
             print(" interests bumped: 0")
+
+        # Self-treatment line — shown when anomalies were healed but no pending alarms.
+        # Appears after engine status, before the trailing newline.
+        if result.anomalies and result.pending_alarms_count == 0:
+            distinct_files = list(dict.fromkeys(a.file for a in result.anomalies))
+            count = len(distinct_files)
+            if count <= 2:
+                files_desc = ", ".join(distinct_files)
+            else:
+                files_desc = ", ".join(distinct_files[:2]) + ", ..."
+            print(
+                f" \U0001fa79 brain self-treated {count} file"
+                f"{'s' if count != 1 else ''} ({files_desc})"
+                f" — see `nell health show` for details"
+            )
     return 0
 
 
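
The self-treatment line in the hunk above dedupes file names while keeping first-seen order, then shows at most two of them. A small standalone sketch of that formatting step follows; the function name `self_treatment_line` and the `max_shown` parameter are hypothetical, and the emoji and trailing hint from the real message are left out.

```python
def self_treatment_line(files: list[str], max_shown: int = 2) -> str:
    """Build a one-line summary of files that were healed during a tick."""
    # dict.fromkeys drops duplicates while preserving first-seen order.
    distinct = list(dict.fromkeys(files))
    count = len(distinct)
    shown = ", ".join(distinct[:max_shown])
    if count > max_shown:
        shown += ", ..."
    plural = "" if count == 1 else "s"
    return f"brain self-treated {count} file{plural} ({shown})"


print(self_treatment_line(["interests.json", "interests.json"]))
print(self_treatment_line(["a.json", "b.json", "a.json", "c.json"]))
```

Using `dict.fromkeys` rather than `set` keeps the output stable across runs, which matters when the line is asserted on in CLI tests.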
@@ -396,6 +424,160 @@ def _growth_log_handler(args: argparse.Namespace) -> int:
     return 0
 
 
+def _health_show_handler(args: argparse.Namespace) -> int:
+    """Dispatch `nell health show` — pending alarms + recent self-treatments (read-only)."""
+    persona_dir = get_persona_dir(args.persona)
+    if not persona_dir.exists():
+        raise FileNotFoundError(
+            f"No persona directory at {persona_dir}. Persona {args.persona!r} does not exist."
+        )
+
+    from datetime import UTC, datetime, timedelta
+
+    audit_path = persona_dir / "heartbeats.log.jsonl"
+    cutoff = datetime.now(UTC) - timedelta(days=7)
+
+    # Collect recent anomaly records from the audit log.
+    recent_treatments: list[dict] = []
+    for entry in read_jsonl_skipping_corrupt(audit_path):
+        try:
+            from brain.utils.time import parse_iso_utc
+
+            ts = parse_iso_utc(entry["timestamp"])
+        except (KeyError, ValueError, TypeError):
+            continue
+        if ts < cutoff:
+            continue
+        for a in entry.get("anomalies") or []:
+            if isinstance(a, dict):
+                recent_treatments.append(a)
+
+    alarms = compute_pending_alarms(persona_dir)
+
+    print(f"Health for persona {args.persona!r}:\n")
+
+    print(f" Pending alarms: {len(alarms)}")
+    for alarm in alarms:
+        date_str = alarm.first_seen_at.strftime("%Y-%m-%d")
+        print(
+            f" {alarm.file}: {alarm.kind} {date_str} "
+            f"({alarm.occurrences_in_window} occurrences in window)"
+        )
+
+    print(f" Recent self-treatments (last 7 days): {len(recent_treatments)}")
+    for t in recent_treatments:
+        try:
+            from brain.utils.time import parse_iso_utc
+
+            ts = parse_iso_utc(t["timestamp"])
+            ts_str = iso_utc(ts)
+        except (KeyError, ValueError, TypeError):
+            ts_str = "unknown"
+        f = t.get("file", "?")
+        action = t.get("action", "?")
+        cause = t.get("likely_cause", "unknown")
+        qpath = t.get("quarantine_path")
+        print(f" {ts_str} {f} {action} (cause: {cause})")
+        if qpath:
+            print(f" forensic: {qpath}")
+
+    return 0
+
+
+def _health_check_handler(args: argparse.Namespace) -> int:
+    """Dispatch `nell health check` — run walk_persona, print per-file status."""
+    persona_dir = get_persona_dir(args.persona)
+    if not persona_dir.exists():
+        raise FileNotFoundError(
+            f"No persona directory at {persona_dir}. Persona {args.persona!r} does not exist."
+        )
+
+    anomalies = walk_persona(persona_dir)
+
+    # Classify anomalies: unhealable vs self-treated.
+    unhealable = [a for a in anomalies if a.action == "alarmed_unrecoverable"]
+    healed = [a for a in anomalies if a.action != "alarmed_unrecoverable"]
+
+    # Gather all checked file names (from anomalies only — healthy files print OK below).
+    anomaly_files = {a.file for a in anomalies}
+
+    # Print per-file status for anomalies.
+    for a in healed:
+        print(f"⚠️ {a.file}: {a.action} ({a.kind})")
+    for a in unhealable:
+        print(f"❌ {a.file}: {a.kind} — unrecoverable")
+
+    # Healthy files: any JSON file in the walker's default set not in anomalies.
+    walker_files = [
+        "user_preferences.json",
+        "persona_config.json",
+        "heartbeat_config.json",
+        "heartbeat_state.json",
+        "interests.json",
+        "reflex_arcs.json",
+        "emotion_vocabulary.json",
+        "memories.db",
+        "hebbian.db",
+    ]
+    for fname in walker_files:
+        if fname not in anomaly_files and (persona_dir / fname).exists():
+            print(f"✅ {fname}: OK")
+
+    n_healed = len(healed)
+    n_unhealable = len(unhealable)
+    is_alarming = n_unhealable > 0
+    state = "alarming" if is_alarming else "healthy"
+    print(f"\n{n_healed} file(s) healed, {n_unhealable} unhealable. Brain is {state}.")
+
+    return 2 if is_alarming else 0
+
+
+def _health_acknowledge_handler(args: argparse.Namespace) -> int:
+    """Dispatch `nell health acknowledge` — append user_acknowledged entry to audit log."""
+    from datetime import UTC, datetime
+
+    persona_dir = get_persona_dir(args.persona)
+    if not persona_dir.exists():
+        raise FileNotFoundError(
+            f"No persona directory at {persona_dir}. Persona {args.persona!r} does not exist."
+        )
+
+    # Validate: at most one of --file / --all may be set. Default to --all.
+    has_file = getattr(args, "ack_file", None) is not None
+    has_all = getattr(args, "ack_all", False)
+
+    if has_file and has_all:
+        print("Error: --file and --all are mutually exclusive.", file=sys.stderr)
+        return 1
+
+    if has_file:
+        files_to_ack = [args.ack_file]
+    else:
+        # Default: --all — acknowledge every pending alarm.
+        alarms = compute_pending_alarms(persona_dir)
+        files_to_ack = [alarm.file for alarm in alarms]
+
+    import json
+
+    entry = {
+        "timestamp": iso_utc(datetime.now(UTC)),
+        "user_acknowledged": files_to_ack,
+    }
+    audit_path = persona_dir / "heartbeats.log.jsonl"
+    with audit_path.open("a", encoding="utf-8") as fh:
+        fh.write(json.dumps(entry) + "\n")
+
+    n = len(files_to_ack)
+    if n == 0:
+        print("No pending alarms to acknowledge.")
+    else:
+        desc = ", ".join(files_to_ack[:3])
+        if n > 3:
+            desc += f", ... ({n} total)"
+        print(f"Acknowledged {n} alarm(s): {desc}")
+    return 0
+
+
 def _build_parser() -> argparse.ArgumentParser:
     """Construct the top-level argparse parser with all stub subcommands."""
     parser = argparse.ArgumentParser(
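
`_health_show_handler` in the hunk above relies on two project helpers whose source is not shown here, read_jsonl_skipping_corrupt and parse_iso_utc. As a rough standard-library approximation of the same flow (read a JSONL audit log, skip unreadable lines, keep anomaly records from the last seven days), one might write something like the following. The helper names `iter_jsonl_tolerant` and `recent_anomalies` are hypothetical, and treating naive timestamps as UTC is an assumption.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Iterator


def iter_jsonl_tolerant(path: Path) -> Iterator[dict]:
    """Yield each JSON object in `path`, skipping blank or malformed lines."""
    if not path.exists():
        return
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # one bad line should not hide the rest of the log
            if isinstance(obj, dict):
                yield obj


def recent_anomalies(path: Path, now: datetime, days: int = 7) -> list[dict]:
    """Collect anomaly dicts from entries newer than `now - days`."""
    cutoff = now - timedelta(days=days)
    found: list[dict] = []
    for entry in iter_jsonl_tolerant(path):
        try:
            # Note: before Python 3.11, fromisoformat rejects a trailing "Z".
            ts = datetime.fromisoformat(entry["timestamp"])
        except (KeyError, ValueError, TypeError):
            continue
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        if ts < cutoff:
            continue
        found.extend(a for a in entry.get("anomalies") or [] if isinstance(a, dict))
    return found
```

Passing `now` in as a parameter, instead of calling `datetime.now()` inside, keeps the window logic testable with fixed timestamps.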
@@ -605,6 +787,44 @@ def _build_parser() -> argparse.ArgumentParser:
     )
     g_log.set_defaults(func=_growth_log_handler)
 
+    # nell health show/check/acknowledge — read-only inspection + append-only audit.
+    # Per Task 13: no restore/add/delete/approve/reject actions are wired.
+    h_sub = subparsers.add_parser(
+        "health",
+        help="Inspect and acknowledge brain health alarms (read-only + audit-append).",
+    )
+    h_actions = h_sub.add_subparsers(dest="action", required=True)
+
+    h_show = h_actions.add_parser("show", help="Print pending alarms + recent self-treatments.")
+    h_show.add_argument("--persona", required=True, help="Persona name (required).")
+    h_show.set_defaults(func=_health_show_handler)
+
+    h_check = h_actions.add_parser(
+        "check", help="Run a full file integrity walk; exit 2 if unhealable alarms exist."
+    )
+    h_check.add_argument("--persona", required=True, help="Persona name (required).")
+    h_check.set_defaults(func=_health_check_handler)
+
+    h_ack = h_actions.add_parser(
+        "acknowledge",
+        help="Acknowledge pending alarms (appends to audit log; no destructive changes).",
+    )
+    h_ack.add_argument("--persona", required=True, help="Persona name (required).")
+    h_ack.add_argument(
+        "--file",
+        dest="ack_file",
+        default=None,
+        help="Acknowledge a specific file. Mutually exclusive with --all.",
+    )
+    h_ack.add_argument(
+        "--all",
+        dest="ack_all",
+        action="store_true",
+        default=False,
+        help="Acknowledge all pending alarms (default if neither --file nor --all is given).",
+    )
+    h_ack.set_defaults(func=_health_acknowledge_handler)
+
     return parser
 
 
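
The commit checks the `--file` / `--all` conflict by hand inside the acknowledge handler. A possible alternative, shown here only as a sketch and not as what the commit does, is to let argparse enforce it with `add_mutually_exclusive_group()`. The standalone parser and the `build_ack_parser` name are hypothetical; the `dest` names match the diff.

```python
import argparse


def build_ack_parser() -> argparse.ArgumentParser:
    """Standalone parser for the acknowledge flags, with the conflict enforced by argparse."""
    parser = argparse.ArgumentParser(prog="nell health acknowledge")
    parser.add_argument("--persona", required=True, help="Persona name (required).")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--file", dest="ack_file", default=None)
    group.add_argument("--all", dest="ack_all", action="store_true")
    return parser


args = build_ack_parser().parse_args(["--persona", "demo", "--file", "interests.json"])
print(args.ack_file, args.ack_all)
```

One behavioral difference to weigh: argparse reports the conflict itself and exits with status 2, whereas the handler in this commit prints its own message and returns 1.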
brain/emotion/persona_loader.py

Lines changed: 61 additions & 23 deletions
@@ -9,16 +9,70 @@
 
 from __future__ import annotations
 
-import json
 import logging
 from pathlib import Path
+from typing import TYPE_CHECKING
 
 from brain.emotion import vocabulary
 from brain.emotion.vocabulary import Emotion
 from brain.memory.store import MemoryStore
 
+if TYPE_CHECKING:
+    from brain.health.anomaly import BrainAnomaly
+
 logger = logging.getLogger(__name__)
 
+_DEFAULT_VOCAB: dict = {"version": 1, "emotions": []}
+
+
+def _default_vocab_factory() -> dict:
+    return {"version": 1, "emotions": []}
+
+
+def _vocab_schema_validator(data: object) -> None:
+    if not isinstance(data, dict) or not isinstance(data.get("emotions"), list):
+        raise ValueError("emotion_vocabulary schema invalid: missing 'emotions' list")
+
+
+def load_persona_vocabulary_with_anomaly(
+    path: Path,
+    *,
+    store: MemoryStore | None = None,
+) -> tuple[int, BrainAnomaly | None]:
+    """Load persona vocabulary with self-healing from .bak rotation if corrupt.
+
+    Returns (registered_count, anomaly_or_None).
+    - Missing file → 0, no anomaly.
+    - Corrupt file → quarantine, restore from .bak1/.bak2/.bak3 or reset to
+      empty default. Anomaly set.
+    - Healthy file → counts of newly registered emotions, no anomaly.
+    """
+    from brain.health.attempt_heal import attempt_heal
+
+    if not path.exists():
+        if store is not None:
+            _warn_on_referenced_but_unregistered(store)
+        return 0, None
+
+    data, anomaly = attempt_heal(
+        path, _default_vocab_factory, schema_validator=_vocab_schema_validator
+    )
+
+    if anomaly is not None:
+        logger.warning(
+            "emotion_vocabulary anomaly detected: %s action=%s file=%s",
+            anomaly.kind,
+            anomaly.action,
+            anomaly.file,
+        )
+
+    registered = _register_from_data(data)
+
+    if store is not None:
+        _warn_on_referenced_but_unregistered(store)
+
+    return registered, anomaly
+
 
 def load_persona_vocabulary(
     path: Path,
@@ -32,34 +86,22 @@ def load_persona_vocabulary(
     this twice for the same persona returns 0 the second time.
 
     Missing `path` → returns 0 silently. Fresh personas don't need a file.
-    Corrupt JSON → returns 0, logs a warning.
+    Corrupt JSON → quarantine + heal from .bak, log a warning.
     Per-entry validation failure → that entry skipped + warning,
     other entries proceed.
 
     If `store` is provided, after registration the loader scans memories
     for emotion names not in the registry and logs a one-time warning
     per missing name pointing the user at `nell migrate --force`.
     """
-    if not path.exists():
-        if store is not None:
-            _warn_on_referenced_but_unregistered(store)
-        return 0
-
-    try:
-        data = json.loads(path.read_text(encoding="utf-8"))
-    except json.JSONDecodeError as exc:
-        logger.warning("emotion_vocabulary at %s could not be parsed: %.200s", path, exc)
-        return 0
+    count, _anomaly = load_persona_vocabulary_with_anomaly(path, store=store)
+    return count
 
-    if not isinstance(data, dict) or not isinstance(data.get("emotions"), list):
-        logger.warning(
-            "emotion_vocabulary at %s has invalid schema (missing 'emotions' list)",
-            path,
-        )
-        return 0
 
+def _register_from_data(data: dict) -> int:
+    """Register emotions from a parsed vocab dict; return count of newly registered."""
     registered = 0
-    for entry in data["emotions"]:
+    for entry in data.get("emotions", []):
         try:
             emotion = _entry_to_emotion(entry)
         except (KeyError, ValueError, TypeError) as exc:
@@ -75,10 +117,6 @@ def load_persona_vocabulary(
 
         vocabulary.register(emotion)
         registered += 1
-
-    if store is not None:
-        _warn_on_referenced_but_unregistered(store)
-
     return registered
 
 