Canonical source of truth. This file (
AGENTS.md) is the single source of truth for AI coding assistant context.CLAUDE.md(Claude Code) and.cursorrules(Cursor) are symlinks to it. Pattern adopted from Mirascope and scikit-learn.
If you are an AI assistant (Claude Code / Cursor / Copilot / Codex / …) onboarding to this repository, read this file first, then docs/concepts/architecture.md for the full architecture before touching code.
EverAlgo is an algorithm library for memory extraction and retrieval — not a service, not a framework.
- Algorithm-only. All memory extraction / fusion / re-ranking strategies live here. The library is stateless: it does not connect to databases, does not read or write the filesystem, does not own any business state.
- Two paths. Every operator belongs to one of two read/write paths whose contracts are symmetric (stateless, in-memory I/O):
- Extract — write path. Input: structured units (e.g.
MemCell). Output: structured memories (Episode/Profile/Case/Skill/ …). - Retrieve — read path. Input: query + caller-injected
RetrieveFn/RerankFncallables. Output: a ranked memory list. Theeveralgo-rankpackage serves as both the retrieval facade and the underlying ranking toolkit: it exposes four strategies — hybrid (dual-route RRF), agentic (LLM-guided sufficiency + multi/refined query wrapper over any base), cluster (cluster-based recall expansion), and maxsim (MaxSim nearest-neighbour reranking) — alongside the lower-level ranking primitives (rrf / lr / vector_anchored fusion, weight helpers, LLM-based rerank). Caller binds storage / model clients inside itsRetrieveFn/RerankFn; algo never touches persistence.
- Extract — write path. Input: structured units (e.g.
- Orchestration is upstream. When to call, in what order, with what concurrency, persistence to the markdown filesystem — all owned by EverOS. EverAlgo does not care whether the caller is open-source or cloud commercial; both paths share this code.
For the rationale and deeper background, read docs/concepts/architecture.md
everalgo/ # monorepo, uv virtual workspace
├── pyproject.toml # workspace root, [tool.uv] package = false
├── uv.lock # generated by `uv sync` — do not hand-edit
├── AGENTS.md ← you are here # CLAUDE.md and .cursorrules are symlinks
├── README.md
├── LICENSE # Apache-2.0
├── .gitignore .gitlab-ci.yml cliff.toml .pre-commit-config.yaml
├── docs/
│ ├── concepts/ # high-level architecture notes
│ └── api/ # API reference (per-distribution)
├── examples/ # runnable quickstart scripts (01–07, use FakeLLMClient)
├── packages/
│ ├── everalgo-core/ # types, llm (+ providers), prompts, testing
│ ├── everalgo-boundary/ # detect_boundaries + DetectionResult + workspace stub
│ ├── everalgo-clustering/ # cluster_by_geometry / cluster_by_llm over list[Cluster]
│ ├── everalgo-rank/ # 4 rankers + fusion / weight / rerank toolkit
│ ├── everalgo-parser/ # multimodal raw-file → ParsedContent (EXPERIMENTAL stub)
│ ├── everalgo-user-memory/ # BoundaryDetector + Episode / Foresight / AtomicFact / Profile
│ ├── everalgo-agent-memory/ # AgentBoundaryDetector + AgentCase / AgentSkill / AgentProfile
│ └── everalgo-knowledge/ # KnowledgeExtractor + aclassify_category (file-based knowledge extraction)
├── benchmarks/ # internal LoCoMo benchmark suite ([tool.uv] package = false, not published)
└── tests/
Eight distributions share the everalgo.* namespace through PEP 420 native namespace packages: every packages/*/src/everalgo/ directory deliberately omits __init__.py, while subpackages (everalgo/<subpkg>/__init__.py) are regular packages. This is the PyPA-recommended layout for Py3-only + pip-installed projects, and it lets from everalgo.user_memory import EpisodeExtractor work even when everalgo-user-memory and everalgo-boundary are installed from different distributions. Industrial precedents: google-cloud-* (100+ dists sharing google.cloud.*) and sphinxcontrib-* (6 official Sphinx-extension dists sharing sphinxcontrib.*).
The dev workflow is built on a uv virtual workspace ([tool.uv] package = false at the root, members under packages/*). Same shape: Apache Airflow (100+ workspace members, single root lockfile) and pydantic-ai. Note these two projects are uv-workspace references only — Airflow's airflow.providers.* is pkgutil-style legacy namespace, not PEP 420; pydantic-ai uses three independent namespaces, not one shared. LangChain and LlamaIndex are referenced for the monorepo layout only — neither uses uv workspace itself; they keep per-package venvs and lockfiles.
Dependency topology (see docs/concepts/architecture.md for the full graph and rationale):
everalgo-core
▲
┌────────────┬────────────┬──┴───────────┐
│ │ │ │
boundary clustering rank parser
▲ ▲ ▲
│ │ │
user-memory agent-memory knowledge
Edges (arrow → dependency; every package also depends on core):
user-memory → boundary
agent-memory → boundary, clustering
knowledge → parser
# Prerequisites: Python 3.12+ and uv (https://docs.astral.sh/uv/).
git clone git@github.com:EverMind-AI/EverAlgo.git
cd everalgo
# Install all workspace packages editable into a shared venv (includes dev tools).
uv sync --all-packages --group dev
# Run tests across the workspace.
uv run pytest
# Lint + format checks.
uv run ruff check .
uv run ruff format --check .
# Type check (both checkers — they catch different things).
uv run mypy .
uv run pyrightWorking on a single package? Sync only that package's dependencies:
uv sync --package everalgo-clustering
uv run pytest packages/everalgo-clustering/tests/Try any operator offline using the bundled examples (no API key required):
uv run python examples/01_boundary_chat.py # Chat → MemCell
uv run python examples/03_user_memory_episode.py # MemCell → Episode
uv run python examples/04_agent_memory_case.py # Agent trajectory → AgentCase
uv run python examples/06_full_user_memory_pipeline.py # Full pipelineThe repo ships a .pre-commit-config.yaml that runs ruff check --fix + ruff format + a set of standard sanitisers (trailing whitespace, EOF newline, merge-conflict markers, large files, line endings, YAML / TOML syntax) on every commit. This matches the workflow used by sklearn, pydantic, dspy, langchain, pandas, numpy.
Install + verify after every clone (this is per-clone state, NOT stored in the repo — every new clone / new dev machine starts with hooks disabled):
uv sync --all-packages --group dev # pulls pre-commit into the workspace venv
uv run pre-commit install # creates .git/hooks/pre-commit
ls -la .git/hooks/pre-commit # MUST exist and be executableIf the third command shows No such file or directory, the install step silently failed and every git commit will silently bypass lint. Fix before doing any work.
uv run pre-commit run --all-files is a manual invocation. It validates that the hook configuration is healthy but says nothing about whether git commit will actually trigger it. The hook only fires automatically when .git/hooks/pre-commit exists and is executable.
This trap is real: running --all-files and seeing "9/9 Passed" can mask a missing hook for an entire sprint, while every git commit quietly bypasses lint and surfaces only when CI rejects a violation that the hook would have caught locally. Always ls .git/hooks/pre-commit after install to verify.
Run against the whole tree before opening an MR (catches anything you might have committed before the hook was installed):
uv run pre-commit run --all-filesUpdate pinned hook versions periodically:
uv run pre-commit autoupdatemypy/pyright— strict type-checks over the 8-package PEP 420 workspace each take several seconds per run and would make commit feel sluggish; enforced by CI instead (pydantic / sklearn / openai-python / anthropic-sdk-python do the same).pytest— same reason. CI is the gate.
Pre-commit fires at commit time. For per-keystroke feedback, install the ruff editor plugin too:
- VSCode / Cursor: install the Ruff extension. Enable format-on-save so the editor runs
ruff check --fixandruff formatautomatically. - PyCharm / IntelliJ: install the Ruff plugin.
- Vim / Neovim: configure ruff through your LSP setup (e.g.
ruff-lspor built-in LSP vianvim-lspconfig).
The CI pipeline (.gitlab-ci.yml) re-runs ruff check . + ruff format --check . + mypy . + pyright on every MR as the load-bearing fallback. Both type-checkers run because they catch slightly different things — same dual-checker setup used by openai-python and anthropic-sdk-python. Pre-commit and editor coverage are about latency of feedback; CI is the gate.
| Action | Command |
|---|---|
| Install workspace (editable) | uv sync --all-packages --group dev |
| Run all tests | uv run pytest |
| Run a specific test | uv run pytest path/to/test.py::test_name -v |
| Lint | uv run ruff check . |
| Format | uv run ruff format . |
| Type-check (mypy) | uv run mypy . |
| Type-check (pyright) | uv run pyright |
| Build a single distribution | cd packages/everalgo-core && uv build |
| Add a runtime dep to one package | uv add --package everalgo-clustering numpy |
| Add a dev tool to the workspace | uv add --group dev pytest-asyncio |
Reference: uv workspace documentation.
The full rationale lives in docs/concepts/architecture.md. Hard rules:
- Naming contract —
aprefix means async. Methods namedaextract/arank/adetect/aparseare native async (do real I/O — LLM, network, …); call them withawait. Methods without theaprefix (rank,extract,count_tokens,rrf, …) are sync (pure compute, no I/O); call them directly. Same convention asdspy.acall/litellm.acompletion/instructor.AsyncInstructor. The one exception isLLMClient.chat(a caller-injected client Protocol, not an operator): it is async without theaprefix, mirroring the OpenAI SDK client interface. - I/O operators: async-first + sync bridge. Native async via
asyncio; sync version is derived throughasgiref.async_to_syncfor non-event-loop callers (CLI scripts, plain unit tests). Never call the sync bridge from inside a running event loop. - Pure-compute operators: sync only. No async wrapper for
fusion.rrf,_tokenize.count_tokens, clustering distances, etc. Mirrors numpy / scipy / sklearn / pandas conventions. - Prompts as Python string modules. Concrete prompt strings live in
<subpkg>/prompts/{en,zh}/<name>.pyas module-level constants. Editing a prompt = editing a.pyfile. No external.md/.yaml/.tomlprompt stores. Caller customisation: per-callprompt=argument (fine-grained) or monkey-patching the module constant at startup (coarse-grained). Protocolfor typing, notABC. EverAlgo operators are stateless; implementations do not need to subclass anything.- No dependency injection in algorithm code. Module-level functions + global config + monkeypatch in tests. Algorithm authors should be one keystroke away from running their code; do not impose framework ceremony.
- Sync bridge for I/O operators: write
extract = async_to_sync(aextract)one-liner; do not introduce aDualInterfacemixin. This keeps type inference predictable, avoids metaclass magic. Theasync_to_synchelper comes fromasgiref.sync. - Lint configuration. Workspace-wide ruff is configured in the root
pyproject.toml(line-length = 120, target version inferred fromrequires-python = ">=3.12", rule set derived from the pytorch + pydantic-ai intersection). Google-style docstrings — aligns with Google Python Style Guide.Args:/Returns:/Raises:sections, no type repetition in the body (type annotations in the signature are authoritative). - Logging discipline. On the LLM / I/O path use
logger = logging.getLogger(__name__)with lazy%-format (logger.debug("count=%d", n)— never f-strings inside log calls) andlogger.exception(...)insideexceptblocks. For user-behaviour problems and deprecations usewarnings.warn(..., stacklevel=2); for pure-algorithm errorsraise ValueError(...)with a detailed message (numpy style —shapes (3,4) and (5,6) not aligned, etc.). Every public subpackage__init__.pyalready attaches aNullHandler;everalgo.llmcarries a default-onSensitiveHeadersFilter. Forbidden in library code:logging.basicConfig,addHandler(anything butNullHandler),setLevel, explicitpropagate = True/False, and any module-levellogging.warning(...)/logging.error(...)/logging.getLogger()(no-arg) /logging.root.*— these all target the root logger and are an application's job. Forbidden in DEBUG logs: request / response bodies, prompt text, model outputs (the Filter only redacts headers; bodies leak PII the Filter cannot see). Performance timing is the user's job (cProfile/line_profiler/%timeit); the library does not log durations. ruff rule setsG+LOG+TRYenforce these at lint time. - Use the full
line-length = 120budget when hand-wrapping. For Python comments / docstrings and TOML/YAML comments, fill each line to roughly 100–115 characters before wrapping — do not pre-wrap at 70 / 79 / 80 / 88 / 100 out of habit.E501is in the ignore list, and ruff never flags lines that are too short, so this is a writer's discipline rather than a lint check. A 3-line comment that collapses cleanly into 2 lines at 120 should be 2 lines. Exceptions: bullet lists, code blocks, and any line where a natural break aids comprehension. Markdown files in this repo deliberately use the one-paragraph-per-line (no hard-wrap) style — the same convention Prettier emits by default and GitHub renders cleanly — so.mdprose is exempt from the 100–115 rule entirely; rely on editor soft-wrap. - English only in code, config, and commit messages. All Python code, comments, identifiers,
pyproject.tomlcomments, CI files, and commit messages must be English. The same rule that EverOS enforces with a pre-commit hook applies here. Content underdocs/must be English as well.
Branching: trunk-based (see DSPy / scikit-learn / instructor / pydantic — the four reference Python algorithm libraries all do this; no GitFlow).
mainis the only long-lived branch. It is GitLab-protected (Settings → Repository → Protected branches): direct push is denied for everyone; the only path to land changes onmainis via Merge Request.- Feature work happens on short-lived branches:
feat/<topic>,fix/<bug>,docs/<topic>,refactor/<topic>. Open an MR → squash-merge intomain. - Release = tag on
mainusing SemVer per distribution:everalgo-clustering/v0.2.0. Each distribution has its own version cadence — the independent-versioning model used bygoogle-cloud-python(many distributions sharinggoogle.cloud.*, each on its own tag) and Apache Airflow providers; seedocs/concepts/architecture.mdandREADME.md"Cutting a release". - Maintenance branches (
0.1.X-fixes) are introduced only when a published version needs back-ports; not by default.
Commit messages: Gitmoji + Conventional Commits. Format: <emoji> <type>(<scope>): <description>.
✨ feat(clustering): add cluster_by_llm decision prompt zh variant
🐛 fix(boundary): correct token count for emoji-only chat segments
♻️ refactor(rank): extract shared fusion helper from case / skill rankers
✅ test(user-memory): cover EpisodeExtractor tail-merge edge case
📝 docs(design): clarify cluster_previews shape
Allowed types: feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert.
MR title is load-bearing. GitLab is configured (Settings → Merge Requests → Squash commit template = %{title}) so the MR title lands verbatim as the squash commit on main. MR titles must match the format above, because the release-notes generator (git cliff, see cliff.toml + README.md "Cutting a release") parses these messages to assemble per-distribution CHANGELOGs.
Scope = distribution name without the everalgo- prefix. Use clustering / rank / core / boundary / parser / user-memory / agent-memory / knowledge. For cross-cutting changes (CI, monorepo tooling, root docs), use ci / release / repo / design / docs as the scope or omit the scope entirely.
Squashing matters for per-distribution filtering. git cliff --include-path 'packages/everalgo-<name>/**' filters commits by changed paths. Squash merges keep one commit = one MR = one scoped Conventional-Commit message, which is the unit git-cliff groups by.
CHANGELOG [Unreleased] entry is part of the MR. Every MR that adds, changes, or removes user-visible behaviour must include a one-line entry in packages/everalgo-<dist>/CHANGELOG.md under ## [Unreleased]. Written by the MR author (the person with the most context), not reconstructed at release time. See docs/releasing.md "Keeping [Unreleased] up to date" for format and scope rules.
Follow this checklist when introducing a new extractor / ranker / clusterer:
- Pick the subpackage. Decide which
packages/everalgo-<dist>/src/everalgo/<subpkg>/the operator lives in based on its product axis (user_memory / agent_memory / knowledge) or tool axis (boundary / clustering / rank / parser). When in doubt, readdocs/concepts/architecture.md - Create the module.
<subpkg>/<operator>.py— module-level functions or one stateless class. Operators need not subclass anything; if an operator consumes an injected client, type it against the relevant Protocol where that Protocol is defined (e.g.everalgo.llm.protocols.LLMClient,everalgo.rank.protocols). - Write the prompt(s). If the operator calls an LLM, drop prompt strings as module-level constants in
<subpkg>/prompts/en/<operator>.py(andzh/<operator>.pyfor the Chinese variant when applicable). - Re-export the public surface. If the operator is part of the public API of its facade subpackage, add it to
<subpkg>/__init__.py's re-export block and__all__. Seedocs/concepts/architecture.mdfor the re-export pattern. - Wire dependencies. If the new code requires a new third-party library, add it via
uv add --package everalgo-<dist> <library>, which updates the rightpyproject.toml. - Write tests. Use
everalgo.testing.FakeLLMClientto avoid real API calls; useeveralgo.testing.assert_*_shapefor structural memory checks. - Update the CHANGELOG. Add a one-line entry under
## [Unreleased]inpackages/everalgo-<dist>/CHANGELOG.mddescribing the new operator (subsection### Added). Seedocs/releasing.mdfor format. - Run lint + format + type-check + tests locally before raising the MR (
uv run ruff check . && uv run ruff format --check . && uv run mypy . && uv run pyright && uv run pytest).
Providers live inside everalgo-core's everalgo/llm/providers/<provider>/ (per ADR 004 — providers are nested in llm, not a separate distribution; the convention follows litellm / instructor / dspy / llama-index).
- Create
everalgo/llm/providers/<provider>.py. - Implement the
LLMClientProtocol fromeveralgo.llm.protocols— a singleasync def chat(...) -> ChatResponsemethod (no sync variant, no streaming). - Wire the provider into
everalgo/llm/factory.py::build_client(it currently constructsOpenAICompatClientdirectly; add provider selection there — there is no separaterouting.py). - Map provider-native exceptions onto
LLMError(chain viaraise LLMError(...) from original). - Add per-provider prompts only if the provider needs special formatting (rare — most providers are OpenAI-compatible).
- Add tests under
packages/everalgo-core/tests/llm/providers/test_<provider>.py. No mocks at the HTTP layer when a real key is available in CI; otherwise userespxto record fixtures. - Update the CHANGELOG. Add an entry under
## [Unreleased]inpackages/everalgo-core/CHANGELOG.md(subsection### Added). Seedocs/releasing.mdfor format. - Update
docs/concepts/architecture.mdandAGENTS.mdif the public surface changes.
asyncio_mode = "auto"is set workspace-wide (see rootpyproject.toml); plainasync def test_*()works without decorators.- Use
everalgo.testing.fake_llmfor deterministic LLM replays. Do not stub at the HTTP layer in unit tests — that breaks when the provider tweaks its protocol. - Cross-package integration smoke tests belong in the workspace-root
tests/directory. Per-distribution unit tests should colocate underpackages/everalgo-<name>/tests/once a distribution grows enough mass (mirroring pydantic-ai'spydantic_ai_slim/tests/). - No real network calls in default
pytest. Mark provider-network tests with@pytest.mark.integrationand gate them behind an env var.
| Subject | Where |
|---|---|
| Architecture (definitive) | docs/concepts/architecture.md |
| High-level architecture notes | docs/concepts/ |
| Runnable operator examples | examples/ — use FakeLLMClient, no API key needed |
| Source of EverOS contract | Confluence (internal) |
| uv workspace concepts | https://docs.astral.sh/uv/concepts/projects/workspaces/ |
| PEP 420 namespace packages | https://peps.python.org/pep-0420/ |
| PEP 8 (style) / 257 (docstrings) / 484 (type hints) | https://peps.python.org/ |
| Conventional Commits | https://www.conventionalcommits.org/ |
| Gitmoji | https://gitmoji.dev/ |
This file is the contract between human engineers and AI assistants on this repo. When you change it, please:
- Keep it the canonical copy.
CLAUDE.mdand.cursorrulesshould remain symlinks. - Cite a source for every concrete decision — a
docs/concepts/architecture.mdsection, an ADR, a public spec / star-project URL. No groundless claims. - Whenever the repository structure or workflow changes, update (layout) and-§4 (commands) in the same MR.