Memory Compounding Engine (Weekly Learning + Outcomes) Implementation Plan

Status note (2026-03-08): Active roadmap order, blockers, and contributor priority live in the Engram Feature Roadmap. This file is historical design context and may describe scope that is only partially shipped.

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Make Engram-powered agents learn week-over-week from approvals/rejections and outcomes, producing durable institutional knowledge (mistakes, rubrics, shared rules) that changes future behavior.

Architecture: Append-only feedback/outcome logs + a weekly synthesis job that produces curated artifacts (mistakes, rubrics, promoted memories). Runtime recall reads these artifacts as constraints and soft biases.

Tech Stack: TypeScript (Engram plugin), shared-context filesystem artifacts, OpenClaw cron for weekly synthesis, optional LLM usage via the same provider selection as Engram extraction (including robust local LLM).


Requirements (Decisions Locked)

  • Ship in one phase with outcomes support from day one.
  • Provider/model selection: reuse Engram extraction provider selection and failover rules.
  • Storage must be append-only for raw feedback/outcomes.
  • Every synthesized rule must be auditable: link back to source feedback entry IDs and original artifact(s).

Data Model

Raw Feedback Entry (Append-Only JSONL)

File:

  • shared-context/feedback/inbox.jsonl

Each line JSON:

  • id (uuid)
  • ts (ISO8601)
  • agent (string)
  • artifact (object)
  • artifact.kind (agent_output | recommendation | memory_recall | other)
  • artifact.ref (path or stable identifier)
  • decision (approved | rejected | approved_with_feedback)
  • reason (string, required)
  • learning (string, optional)
  • outcomes (object, optional)
  • outcomes.* (numeric/string fields like deal_value_usd, conversion_lift_pct, time_saved_minutes)
  • tags (string[], optional)

Rules:

  • Strict schema validation on write.
  • No secrets.
  • Prefer numeric outcomes with explicit units in the key name.
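
The field list and rules above could be checked on write roughly as follows. This is a minimal sketch in plain TypeScript with no schema library (the execution-grade tasks call for Zod); `validateFeedbackEntry`, `isFeedbackEntry`, and the error strings are hypothetical names, and the timestamp check is deliberately loose.

```typescript
// Hypothetical types mirroring the field list above; not the shipped schema.
type ArtifactKind = "agent_output" | "recommendation" | "memory_recall" | "other";
type Decision = "approved" | "rejected" | "approved_with_feedback";

interface FeedbackEntry {
  id: string; // uuid
  ts: string; // ISO8601
  agent: string;
  artifact: { kind: ArtifactKind; ref: string };
  decision: Decision;
  reason: string; // required
  learning?: string;
  outcomes?: Record<string, number | string>;
  tags?: string[];
}

const KINDS: readonly string[] = ["agent_output", "recommendation", "memory_recall", "other"];
const DECISIONS: readonly string[] = ["approved", "rejected", "approved_with_feedback"];
const ALLOWED_KEYS = new Set([
  "id", "ts", "agent", "artifact", "decision", "reason", "learning", "outcomes", "tags",
]);

function isNonEmptyString(v: unknown): v is string {
  return typeof v === "string" && v.trim().length > 0;
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of problems; an empty list means the entry is valid.
function validateFeedbackEntry(input: unknown): string[] {
  if (!isPlainObject(input)) return ["entry must be an object"];
  const errors: string[] = [];
  // Strict: unknown top-level keys are rejected rather than silently kept.
  for (const key of Object.keys(input)) {
    if (!ALLOWED_KEYS.has(key)) errors.push(`unknown key: ${key}`);
  }
  const { id, ts, agent, artifact, decision, reason, learning, outcomes, tags } = input;
  if (!isNonEmptyString(id)) errors.push("id is required");
  // Loose check: Date.parse accepts more than strict ISO8601.
  if (!isNonEmptyString(ts) || Number.isNaN(Date.parse(ts))) errors.push("ts must be a parseable timestamp");
  if (!isNonEmptyString(agent)) errors.push("agent is required");
  if (!isPlainObject(artifact)) {
    errors.push("artifact is required");
  } else {
    const { kind, ref } = artifact;
    if (typeof kind !== "string" || !KINDS.includes(kind)) errors.push("artifact.kind is invalid");
    if (!isNonEmptyString(ref)) errors.push("artifact.ref is required");
  }
  if (typeof decision !== "string" || !DECISIONS.includes(decision)) errors.push("decision is invalid");
  if (!isNonEmptyString(reason)) errors.push("reason is required");
  if (learning !== undefined && typeof learning !== "string") errors.push("learning must be a string");
  if (outcomes !== undefined) {
    if (!isPlainObject(outcomes)) {
      errors.push("outcomes must be an object");
    } else {
      for (const [k, v] of Object.entries(outcomes)) {
        const ok = typeof v === "string" || (typeof v === "number" && Number.isFinite(v));
        if (!ok) errors.push(`outcomes.${k} must be a finite number or a string`);
      }
    }
  }
  if (tags !== undefined && !(Array.isArray(tags) && tags.every((t) => typeof t === "string"))) {
    errors.push("tags must be string[]");
  }
  return errors;
}

// Type guard built on the validator above.
function isFeedbackEntry(input: unknown): input is FeedbackEntry {
  return validateFeedbackEntry(input).length === 0;
}
```

Returning a list of problems instead of throwing makes it easy to build the concise error message the writer is expected to surface.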

Weekly Rollup Artifacts

Output paths:

  • shared-context/feedback/weekly/<YYYY-WW>.md
  • shared-context/feedback/weekly/<YYYY-WW>.json
  • shared-context/mistakes.json
  • shared-context/rubrics/<agent>.md
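
The `<YYYY-WW>` segment needs a concrete week definition. The sketch below assumes ISO 8601 week numbering computed in UTC, which the plan does not state explicitly; `isoWeekKey` and `weeklyArtifactPaths` are hypothetical helper names.

```typescript
// Assumption: <YYYY-WW> is the ISO 8601 week-year and week number, in UTC.
function isoWeekKey(date: Date): string {
  // Work on a UTC-midnight copy so the input is not mutated.
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const dayOfWeek = d.getUTCDay() || 7; // Monday=1 ... Sunday=7
  // Move to the Thursday of the same ISO week; its calendar year is the ISO week-year.
  d.setUTCDate(d.getUTCDate() + 4 - dayOfWeek);
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86_400_000 + 1) / 7);
  return `${d.getUTCFullYear()}-${String(week).padStart(2, "0")}`;
}

// Hypothetical helper: joins with "/" for readability instead of using path utilities.
function weeklyArtifactPaths(sharedContextDir: string, date: Date): { md: string; json: string } {
  const base = `${sharedContextDir}/feedback/weekly/${isoWeekKey(date)}`;
  return { md: `${base}.md`, json: `${base}.json` };
}
```

Under this definition the week-year can differ from the calendar year near January 1, so the key should always come from one helper rather than being assembled from `getFullYear()` by hand.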

Weekly JSON should include:

  • week
  • stats (counts by decision, top tags)
  • top_mistakes (with provenance)
  • top_rubric_updates (with provenance)
  • outcome_summary (aggregations)
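
One possible shape for the weekly JSON, plus a deterministic `stats` computation, is sketched below. The top-level keys come from the list above; the inner field shapes, `computeStats`, and the `topN` default are assumptions for illustration.

```typescript
// Inner shapes are hypothetical; only the top-level keys come from the plan.
interface WeeklyStats {
  byDecision: Record<string, number>;
  topTags: Array<{ tag: string; count: number }>;
}

interface WeeklyRollup {
  week: string;
  stats: WeeklyStats;
  top_mistakes: Array<{ rule: string; provenance: string[] }>;
  top_rubric_updates: Array<{ agent: string; update: string; provenance: string[] }>;
  outcome_summary: Record<string, { sum: number; count: number; avg: number }>;
}

interface StatsInput {
  decision: string;
  tags?: string[];
}

function computeStats(entries: StatsInput[], topN = 5): WeeklyStats {
  const byDecision: Record<string, number> = {};
  const tagCounts = new Map<string, number>();
  for (const e of entries) {
    byDecision[e.decision] = (byDecision[e.decision] ?? 0) + 1;
    for (const tag of e.tags ?? []) {
      tagCounts.set(tag, (tagCounts.get(tag) ?? 0) + 1);
    }
  }
  // Sort by count descending, then by tag name, so ties never reorder between runs.
  const topTags = Array.from(tagCounts.entries())
    .map(([tag, count]) => ({ tag, count }))
    .sort((a, b) => b.count - a.count || (a.tag < b.tag ? -1 : a.tag > b.tag ? 1 : 0))
    .slice(0, topN);
  return { byDecision, topTags };
}
```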

Mistakes (Do-Not-Repeat)

File:

  • shared-context/mistakes.json

Structure:

  • version
  • updatedAt
  • patterns: array of:
    • patternId
    • scope (all-agents or agent name)
    • rule (imperative, concise)
    • rationale (1-2 lines)
    • provenance (array of feedback entry IDs)
    • outcomeEvidence (optional summary)
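
A sketch of the structure above and of an add/merge step that preserves provenance follows. It assumes `version` is a number and that a merge keeps the existing rule text while unioning provenance IDs; the plan does not specify either, and `mergePatterns` is a hypothetical name.

```typescript
interface MistakePattern {
  patternId: string;
  scope: string; // "all-agents" or an agent name
  rule: string;
  rationale: string;
  provenance: string[]; // feedback entry IDs
  outcomeEvidence?: string;
}

interface MistakesFile {
  version: number; // assumption: numeric version
  updatedAt: string;
  patterns: MistakePattern[];
}

// Pure merge: returns a new file object and never mutates its inputs.
function mergePatterns(file: MistakesFile, incoming: MistakePattern[], now: Date): MistakesFile {
  const byId = new Map<string, MistakePattern>();
  for (const p of file.patterns) {
    byId.set(p.patternId, { ...p, provenance: [...p.provenance] });
  }
  for (const p of incoming) {
    const existing = byId.get(p.patternId);
    if (existing === undefined) {
      byId.set(p.patternId, { ...p, provenance: [...p.provenance] });
    } else {
      // Keep the existing rule text; extend the audit trail and refresh the evidence.
      existing.provenance = existing.provenance.concat(p.provenance);
      if (p.outcomeEvidence !== undefined) existing.outcomeEvidence = p.outcomeEvidence;
    }
  }
  // Deduplicate and sort everything so repeated runs produce byte-identical output.
  const patterns = Array.from(byId.values())
    .map((p) => ({ ...p, provenance: Array.from(new Set(p.provenance)).sort() }))
    .sort((a, b) => (a.patternId < b.patternId ? -1 : a.patternId > b.patternId ? 1 : 0));
  return { version: file.version, updatedAt: now.toISOString(), patterns };
}
```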

Rubrics (What “Good” Looks Like)

Files:

  • shared-context/rubrics/<agent>.md

Structure:

  • short checklist
  • examples of approved patterns
  • anti-patterns from rejections
  • optional scoring rubric (ICE/confidence)

Runtime Behavior

Pre-Action Injection

On each agent run (or Engram recall hook):

  • Read shared-context/mistakes.json and the relevant rubric.
  • Inject as:
    • hard constraints (“DO NOT ...”)
    • formatting requirements (“Always include confidence score ...”)
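
The injection step might render the two artifacts into a single capped text block along these lines. This is a sketch only: `buildInjection`, the cap names, and the section headings are hypothetical, and the real hook may format the prompt differently.

```typescript
interface InjectablePattern {
  patternId: string;
  scope: string; // "all-agents" or an agent name
  rule: string;
}

// Hypothetical caps; the plan only says injection happens under strict caps.
interface InjectionCaps {
  maxRules: number;
  maxRubricChars: number;
}

function buildInjection(
  agent: string,
  patterns: InjectablePattern[],
  rubricMd: string,
  caps: InjectionCaps,
): string {
  // Only rules scoped to every agent or to this agent apply.
  const rules = patterns
    .filter((p) => p.scope === "all-agents" || p.scope === agent)
    .slice(0, caps.maxRules)
    .map((p) => `- ${p.rule}`);
  const rubric = rubricMd.trim().slice(0, caps.maxRubricChars);

  const sections: string[] = [];
  if (rules.length > 0) {
    sections.push(["Hard constraints (do not repeat these mistakes):", ...rules].join("\n"));
  }
  if (rubric.length > 0) {
    sections.push(`Rubric for ${agent}:\n${rubric}`);
  }
  // An empty string means there is nothing to inject.
  return sections.join("\n\n");
}
```

Capping by rule count and by rubric length keeps the injected block bounded regardless of how large the artifacts grow.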

Soft Bias

  • If retrieval/rerank is active, apply a small boost to rubric-linked memories and a penalty to repeated anti-patterns.
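
As an illustration of a soft bias, the sketch below adjusts rerank scores additively. The candidate fields, the `0.05` boost and penalty values, and the cap on penalized hits are all assumptions chosen for the example, not values from the plan.

```typescript
interface Candidate {
  id: string;
  score: number; // base retrieval/rerank score
  rubricLinked: boolean; // memory is referenced by a rubric
  antiPatternHits: number; // times this memory matched a known anti-pattern
}

// Hypothetical tuning values; small on purpose so the bias nudges rather than overrides.
interface BiasOptions {
  boost: number;
  penalty: number;
  maxPenaltyHits: number;
}

const DEFAULT_BIAS: BiasOptions = { boost: 0.05, penalty: 0.05, maxPenaltyHits: 3 };

function applySoftBias(
  candidates: Candidate[],
  opts: BiasOptions = DEFAULT_BIAS,
): Array<Candidate & { adjusted: number }> {
  return candidates
    .map((c) => ({
      ...c,
      adjusted:
        c.score +
        (c.rubricLinked ? opts.boost : 0) -
        opts.penalty * Math.min(c.antiPatternHits, opts.maxPenaltyHits),
    }))
    // Highest adjusted score first; ties broken by id for stable output.
    .sort((a, b) => b.adjusted - a.adjusted || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

Because the adjustment is bounded, a clearly stronger base match still ranks first even when it has no rubric link.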

Weekly Synthesis Job

Inputs

  • Recent feedback entries from shared-context/feedback/inbox.jsonl
  • Recent roundtable outputs (optional)
  • Relevant Engram memories and entities (optional): use QMD similarity search to cluster feedback by entity/topic before synthesis.

Outputs

  • Weekly rollup markdown + JSON
  • Updated mistakes.json (add/merge patterns)
  • Updated rubrics
  • Optional promotion into shared memory namespace (when v3/v4 are available)

LLM Usage

  • Optional but recommended for:
    • clustering rejections by reason
    • distilling “learning” rules
    • summarizing outcome evidence
  • Must be timeboxed and fail-open.
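
"Timeboxed and fail-open" can be expressed as a small wrapper that races the LLM call against a timer and falls back to the deterministic result on timeout or error. `withTimebox` is a hypothetical helper, not an existing Engram API.

```typescript
// Runs `task`, but resolves with `fallback` if it takes longer than `timeoutMs`
// or if it throws. Note: the underlying task is not cancelled, only ignored.
async function withTimebox<T>(
  task: () => Promise<T>,
  timeoutMs: number,
  fallback: T,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    return await Promise.race([task(), timeout]);
  } catch {
    // Fail-open: an LLM error must never break the weekly job.
    return fallback;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A caller would pass the deterministic rollup as `fallback`, so the weekly artifacts are still written when the provider is slow or unavailable.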

Implementation Tasks (High Level)

Task 1: Define config surface

Files:

  • Modify: extensions/openclaw-engram/src/types.ts
  • Modify: extensions/openclaw-engram/src/config.ts (if present)
  • Modify: extensions/openclaw-engram/README.md

Add flags:

  • sharedContextEnabled
  • sharedContextDir (optional override)
  • compoundingEnabled (default false)
  • compoundingWeeklyCronEnabled (default false)
  • compoundingSemanticEnabled (default false, timeboxed)

Task 2: Add tool(s) to record feedback + outcomes

Files:

  • Modify: extensions/openclaw-engram/src/tools.ts
  • Create: extensions/openclaw-engram/src/compounding/feedback.ts
  • Test: extensions/openclaw-engram/tests/*

Tool:

  • shared_feedback_record (or memory_feedback_record): append-only, schema validated.

Task 3: Add weekly synthesis runner

Files:

  • Create: extensions/openclaw-engram/src/compounding/synthesizer.ts
  • Modify: extensions/openclaw-engram/src/index.ts
  • Modify: extensions/openclaw-engram/src/orchestrator.ts

Provide:

  • a callable function for the cron agentTurn.
  • the function writes weekly artifacts and updates mistakes/rubrics.

Task 4: Runtime injection of mistakes/rubrics

Files:

  • Modify: extensions/openclaw-engram/src/index.ts (before_agent_start)

Task 5: Tests

  • schema validation
  • append-only enforcement
  • weekly rollup generation on fixed input
  • injection formatting and caps

Implementation Tasks (Execution-Grade)

Task 1: Config + Paths (Shared-Context Dependency)

Dependency: v4.0 shared-context path resolution (SharedContextPaths).

Files:

  • Modify: src/types.ts
  • Create: src/compounding/paths.ts (thin wrapper around shared-context paths)
  • Test: tests/compounding-paths.test.ts

Step 1: Write failing tests

  • compoundingEnabled=false means no file IO on startup/recall.
  • When enabled, compounding paths resolve under shared-context/ and never escape.
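
The "never escape" requirement could be enforced with a containment check like the one below. `resolveCompoundingPath` is a hypothetical stand-in for the `SharedContextPaths` wrapper, and the check is lexical only: it does not resolve symlinks.

```typescript
import * as path from "node:path";

// Resolves `segments` under `sharedContextDir` and rejects any result outside it.
function resolveCompoundingPath(sharedContextDir: string, ...segments: string[]): string {
  const root = path.resolve(sharedContextDir);
  const target = path.resolve(root, ...segments);
  const rel = path.relative(root, target);
  // Outside the root if the relative path climbs upward or lands on another drive.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes shared-context: ${segments.join("/")}`);
  }
  return target;
}
```

Comparing via `path.relative` avoids the prefix pitfall where `/data/shared-context-old` would pass a naive `startsWith("/data/shared-context")` test.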

Step 2: Implement config

  • compoundingEnabled (default false)
  • compoundingWeeklyCronEnabled (default false)
  • compoundingSemanticEnabled (default false)
  • compoundingSynthesisTimeoutMs (timebox)
  • compoundingInjectEnabled (default true when compounding enabled)
  • Provider selection: reuse Engram extraction provider selection/failover rules.
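
The defaults listed above might be resolved as follows. The flag names come from the plan; `resolveCompoundingConfig` and the 30-second timeout default are assumptions, since the plan names the timebox setting without giving a value.

```typescript
interface CompoundingConfig {
  compoundingEnabled: boolean;
  compoundingWeeklyCronEnabled: boolean;
  compoundingSemanticEnabled: boolean;
  compoundingSynthesisTimeoutMs: number;
  compoundingInjectEnabled: boolean;
}

// Assumption: the plan does not specify a default timebox; 30s is a placeholder.
const DEFAULT_SYNTHESIS_TIMEOUT_MS = 30_000;

function resolveCompoundingConfig(raw: Partial<CompoundingConfig> = {}): CompoundingConfig {
  const enabled = raw.compoundingEnabled ?? false;
  return {
    compoundingEnabled: enabled,
    compoundingWeeklyCronEnabled: raw.compoundingWeeklyCronEnabled ?? false,
    compoundingSemanticEnabled: raw.compoundingSemanticEnabled ?? false,
    compoundingSynthesisTimeoutMs: raw.compoundingSynthesisTimeoutMs ?? DEFAULT_SYNTHESIS_TIMEOUT_MS,
    // Injection defaults to on only when compounding itself is on.
    compoundingInjectEnabled: raw.compoundingInjectEnabled ?? enabled,
  };
}
```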

Task 2: Feedback Entry Schema + Append-Only Writer

Files:

  • Create: src/compounding/schema.ts
  • Create: src/compounding/writer.ts
  • Test: tests/compounding-writer.test.ts

Step 1: Write failing tests

  • Valid entry writes one JSONL line.
  • Invalid entry is rejected with a concise error.
  • Append-only enforcement: writer never truncates the file.

Step 2: Implement

  • Zod schema for feedback entries (exact fields in this plan).
  • Writer:
    • ensures directory exists
    • appends a single line with newline
    • strips/blocks obviously sensitive keys if present (defense-in-depth).
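
The writer behavior above can be sketched with Node's synchronous `fs` calls. `appendFeedbackLine`, `stripSensitive`, and the sensitive-key pattern are hypothetical; schema validation is assumed to have happened before this function is called.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical pattern for "obviously sensitive" keys; tune to the real threat model.
const SENSITIVE_KEY = /(secret|token|password|api[_-]?key|authorization)/i;

// Recursively drops keys that look sensitive (defense-in-depth, not a guarantee).
function stripSensitive(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(stripSensitive);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      if (!SENSITIVE_KEY.test(k)) out[k] = stripSensitive(v);
    }
    return out;
  }
  return value;
}

function appendFeedbackLine(inboxPath: string, entry: Record<string, unknown>): void {
  fs.mkdirSync(path.dirname(inboxPath), { recursive: true });
  // JSON.stringify never emits raw newlines, so one entry is exactly one line.
  const line = JSON.stringify(stripSensitive(entry)) + "\n";
  // appendFileSync opens the file in append mode; existing content is never truncated.
  fs.appendFileSync(inboxPath, line, "utf8");
}

// Usage against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "compounding-"));
const demoInbox = path.join(demoDir, "feedback", "inbox.jsonl");
appendFeedbackLine(demoInbox, { id: "1", reason: "first" });
appendFeedbackLine(demoInbox, { id: "2", reason: "second", outcomes: { api_key: "x", time_saved_minutes: 5 } });
const demoLines = fs.readFileSync(demoInbox, "utf8").trimEnd().split("\n");
```

Keeping the writer to a single `appendFileSync` call per entry makes the append-only test straightforward: the file length may only grow between calls.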

Task 3: Tool Surface To Record Feedback + Outcomes

Files:

  • Modify: src/tools.ts
  • Create: src/compounding/tool.ts
  • Test: tests/compounding-tool.test.ts

Tool

  • shared_feedback_record
  • Params:
    • agent
    • artifactKind
    • artifactRef
    • decision
    • reason
    • learning?
    • outcomes? (object)
    • tags?

Behavior

  • Validates + appends to shared-context/feedback/inbox.jsonl.
  • Returns NO_REPLY-style minimal output to avoid chat spam when used from automations.
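
The flat tool params map onto the nested entry shape from the data model. The sketch below shows that mapping with injected dependencies so it can be tested without touching disk; `recordFeedback`, `RecordDeps`, and the literal `"NO_REPLY"` return value are assumptions, and the real tool registration API is not shown.

```typescript
interface RecordParams {
  agent: string;
  artifactKind: string;
  artifactRef: string;
  decision: string;
  reason: string;
  learning?: string;
  outcomes?: Record<string, number | string>;
  tags?: string[];
}

// Hypothetical dependency bundle: lets tests supply a fake writer, id, and clock.
interface RecordDeps {
  appendLine: (line: string) => void;
  newId: () => string;
  now: () => Date;
}

function recordFeedback(params: RecordParams, deps: RecordDeps): string {
  if (params.reason.trim().length === 0) {
    throw new Error("reason is required");
  }
  const entry = {
    id: deps.newId(),
    ts: deps.now().toISOString(),
    agent: params.agent,
    // Flat tool params become the nested artifact object of the stored entry.
    artifact: { kind: params.artifactKind, ref: params.artifactRef },
    decision: params.decision,
    reason: params.reason,
    learning: params.learning,
    outcomes: params.outcomes,
    tags: params.tags,
  };
  // JSON.stringify omits optional fields that are undefined.
  deps.appendLine(JSON.stringify(entry));
  // Minimal output so automations do not spam the chat.
  return "NO_REPLY";
}
```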

Task 4: Weekly Synthesizer (Deterministic Baseline)

Files:

  • Create: src/compounding/synthesizer.ts
  • Create: src/compounding/rollup.ts
  • Test: tests/compounding-weekly.test.ts

Step 1: Write failing tests

  • Given a fixture JSONL with mixed decisions + outcomes:
    • produces <YYYY-WW>.md and <YYYY-WW>.json
    • updates mistakes.json deterministically (stable ordering)
    • writes/updates rubrics/<agent>.md

Step 2: Implement deterministic synthesis

  • Group by agent + tag + decision.
  • Extract “mistakes”:
    • take repeated rejection reasons and normalize into short imperative rules.
  • Summarize outcomes:
    • basic aggregations (sum/count/avg for numeric fields; keep keys stable).
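
A deterministic baseline for the two extraction steps might look like this. It groups rejections by agent and a normalized reason string, and aggregates numeric outcomes with sorted keys. The normalization rule, the `minRepeats` threshold, and all function names are assumptions; rewriting a reason into an imperative rule is left out because the plan does not define that transformation.

```typescript
interface SynthEntry {
  id: string;
  agent: string;
  decision: string;
  reason: string;
  outcomes?: Record<string, number | string>;
}
interface MistakeCandidate { scope: string; reason: string; provenance: string[] }
interface Aggregate { sum: number; count: number; avg: number }

// Locale-independent comparison keeps ordering identical on every machine.
const cmp = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0);

// Assumed normalization: lowercase, collapse punctuation and whitespace.
const normalizeReason = (reason: string): string =>
  reason.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();

function extractMistakeCandidates(entries: SynthEntry[], minRepeats = 2): MistakeCandidate[] {
  const groups = new Map<string, MistakeCandidate>();
  for (const e of entries) {
    if (e.decision !== "rejected") continue;
    const reason = normalizeReason(e.reason);
    const key = JSON.stringify([e.agent, reason]);
    const group = groups.get(key) ?? { scope: e.agent, reason, provenance: [] };
    group.provenance.push(e.id);
    groups.set(key, group);
  }
  return Array.from(groups.values())
    .filter((g) => g.provenance.length >= minRepeats) // only repeated reasons
    .map((g) => ({ ...g, provenance: [...g.provenance].sort(cmp) }))
    .sort((a, b) => cmp(a.scope, b.scope) || cmp(a.reason, b.reason));
}

function summarizeOutcomes(entries: SynthEntry[]): Record<string, Aggregate> {
  const totals = new Map<string, { sum: number; count: number }>();
  for (const e of entries) {
    for (const [key, value] of Object.entries(e.outcomes ?? {})) {
      if (typeof value !== "number" || !Number.isFinite(value)) continue; // skip strings
      const t = totals.get(key) ?? { sum: 0, count: 0 };
      t.sum += value;
      t.count += 1;
      totals.set(key, t);
    }
  }
  const summary: Record<string, Aggregate> = {};
  for (const key of Array.from(totals.keys()).sort(cmp)) {
    const t = totals.get(key)!;
    summary[key] = { sum: t.sum, count: t.count, avg: t.sum / t.count };
  }
  return summary;
}
```

Every candidate carries the IDs of the entries that produced it, which is what makes the provenance requirement in the next step cheap to satisfy.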

Step 3: Auditing/provenance

  • Every mistake/rubric update must include feedback entry IDs.

Task 5: Optional Semantic Enhancer (Default-Off, Same Provider As Extraction)

Files:

  • Modify: src/compounding/synthesizer.ts
  • Test: tests/compounding-semantic.test.ts

Behavior

  • If enabled:
    • propose candidate clusters deterministically first
    • call LLM to refine:
      • merge duplicate rules
      • produce clearer “learning” phrasing
      • generate short rationales and outcome evidence summaries
  • Timeboxed + fail-open.

Task 6: Runtime Injection (Mistakes + Rubrics)

Files:

  • Modify: src/index.ts
  • Create: src/compounding/inject.ts
  • Test: tests/compounding-inject.test.ts

Behavior

  • If enabled:
    • inject relevant mistakes.json and rubrics/<agent>.md under strict caps.
  • Must not block message processing; fail-open.

Task 7: Cron Entry Point (No Auto-Register By Default)

Files:

  • Modify: src/tools.ts
  • Modify: README.md

Behavior

  • Add a tool shared_compound_weekly (or similar) that runs synthesis for the current week.
  • Document recommended cron snippet (agentTurn, delivery none).

Task 8: Docs + Verification

Files:

  • Modify: README.md
  • Create: docs/compounding.md

Verification Run:

npm test
npm run check-types
npm run build

Expected: PASS.
