Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog 1.1.0, and this project adheres to Semantic Versioning 2.0.0.

While the project is pre-1.0, breaking changes may land on minor version bumps and will be called out under ### Changed with a migration note.

Unreleased

Added (v0 KG-hybrid foundation — shipped 2026-05-19 → 2026-05-22)

Retrieval — IRCoT 2-round (--ircot). Iterative retrieve-and-reason: round 1 retrieves + reader emits CoT; round 2 re-retrieves with the thought as augmented query; both rounds' chunks fused before the final reader call. +0.09 F1 / +0.07 EM on MuSiQue at gpt-4o-mini reader, n=200. Opt-in, default off. Code: _answer_with_reader_raw, _extract_thought_span, IRCoT path in benchmarks/runner.py. Promoted to library API in v0.1.0.
Retrieval — KG-hybrid path (--kg-retrieval). RRF-fused triple-vector ANN match + hub-weighted two-stage PPR + multi-hop confidence-decayed beam search, resolved to source chunks via the mutual chunk_facts index, RRF-fused with hybrid chunks, reranked, deduped. Lives in benchmarks/retrieval.py:kg_hybrid_neighbors.
Two-stage PPR (PropRAG) in engram.core.kg_retrieval.two_stage_ppr_facts. Stage 1 broad spread at α=0.75 from hub-weighted seeds; Stage 2 narrow focus at α=0.45 re-seeded from Stage 1's top entities. +0.032 F1 vs single-stage on synth_off / no-IRCoT. Replaces single-stage ppr_facts in production kg_hybrid_neighbors.
Multi-hop beam search in engram.core.kg_retrieval.beam_search_facts. Confidence-decayed walk over the in-memory fact graph: per-hop fan-out cap, min edge confidence, hub-aware fan-out for high-degree nodes, frontier expansion gated by path_confidence_floor. Ports Vrin's find_facts_multi_hop to networkx.
Triple-vector ANN match via MemoryBackend.neighbors_facts — second hnswlib index over (s, p, o) triple embeddings, populated automatically by upsert_facts.
KG storage layer in MemoryBackend — 6 new LMDB sub-databases (entity_by_name, entity_aliases, entity_degree_index, chunk_facts, fact_vectors, fact_vector_label), second hnswlib index for fact triples, in-memory networkx.MultiDiGraph mirror of stored facts (only active facts; rebuilt at backend open; updated incrementally on every upsert/update/supersede). Detail in docs/kg-internals.md.
Entity resolution at write (MemoryBackend.resolve_or_create_entity, MemoryBackend.resolve_entities_batch). Normalize → exact LMDB lookup → entity_aliases redirect → fuzzy SequenceMatcher ≥ 0.95 → create new. Ingest-time threshold is strict (0.95) since false positives merge entities permanently; query-time fallback uses 0.80.
EntityRecord Pydantic model in engram.core.models: canonical_name, aliases, entity_type, mention_count, first_seen_at.
Alias persistence — entity_aliases LMDB sub-db; LLM-emitted aliases on ExtractedEntity are persisted automatically.
Pronoun / coreference resolution in extraction prompts. _ENTITY_SYSTEM_PROMPT explicitly forbids pronouns / generic refs as entities. _FACT_SYSTEM_PROMPT has explicit pronoun→entity resolution instructions with two worked examples (Example 6: "She announced..." → "Sarah Martinez announced..."; Example 7: "It stands..." and "The landmark..." both → "Eiffel Tower"). Pure prompt approach — no regex coref.
Literal value sink (engram.core.entities.is_literal_value). Drops facts whose object is a number / year / percentage / date / ratio before they reach the KG. Prevents hundreds of facts sharing object "2024" or "15%" from collapsing onto a shared node and blowing up PPR propagation.
Mutual chunk↔fact indexing. MemoryBackend.upsert_facts auto-mirrors Fact.source_chunk_ids into the chunk_facts LMDB sub-db using composite keys (chunk_id|fact_id). Reverse direction is a single cursor sweep. Read API: MemoryBackend.get_facts_for_chunk.
Fact triple embedding (Phase 4.9). MemoryBackend.upsert_facts embeds each fact's (s, p, o) triple with the same embedder used for chunks and writes to fact_vectors LMDB + the second hnswlib + fact_vector_label LMDB. Cold-path Pass C restructured to batch the upsert post-gather so one embedding API call covers all new facts per ingest cycle.
Phase 1 Jaccard dedup post-rerank. engram.core.scoring.deduplicate_chunks is now wired into hybrid_neighbors and graph_aware_neighbors after Cohere rerank, before the context budget cap. Free quality improvement; uses rerank position as the dedup tiebreaker.
Cohere Rerank 3.5 integration via AWS Bedrock (cohere.rerank-v3-5:0). Default on (--rerank); credentials from the boto3 chain.
enable_synthesis flag. Opt out of the per-chunk synthesis hot path via Engram(enable_synthesis=False) or --disable-synthesis. Saves ~$0.30/1K chunks + ~4 min/1K. Per ablation: synthesis contributes +0.04 F1 only when KG retrieval is also enabled; otherwise pure overhead.
8 entity / retrieval helpers in engram.core.entities and engram.core.retrieval: normalize_entity_name, entities_match_fuzzy, case_variants, is_literal_value, TraversalConfig, extract_frontier_entities, merge_fact_strategies (RRF fact fusion), dynamic_chunk_cutoff (CAR cluster-gap).
Confidence floor tightened from 0.6 to 0.7 (MIN_FACT_CONFIDENCE) — Vrin parity, prevents low-confidence facts from polluting PPR propagation.
Pinned MuSiQue benchmark fixtures — benchmarks/fixtures/musique_n100_seed1_ids.json and musique_n200_seed1_ids.json for reproducible eval.
scipy>=1.11 added to the memory extra (required by networkx.pagerank for sparse-matrix power iteration). networkx>=3.2 was already added in the Phase 3 commit.
Documentation — full docs/ tree with architecture.md, benchmarks.md, configuration.md, kg-internals.md, llm-provider.md, and conceptual deep-dives for IRCoT, KG retrieval, synthesis + extraction, and bi-temporal supersession.
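
Illustration of the write-time entity-resolution cascade listed above (normalize, exact lookup, alias redirect, fuzzy match at 0.95 or higher, create new). This is a minimal in-memory sketch, not the MemoryBackend implementation. ToyEntityIndex, normalize, and the ent-N id scheme are hypothetical stand-ins; only the step order, the use of SequenceMatcher, and the 0.95 ingest threshold come from the entry.

```python
from difflib import SequenceMatcher
from typing import Optional

# Strict write-time threshold quoted in the entity-resolution entry.
INGEST_FUZZY_THRESHOLD = 0.95


def normalize(name: str) -> str:
    """Hypothetical normalizer: lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


class ToyEntityIndex:
    """In-memory stand-in for the entity_by_name and entity_aliases sub-dbs."""

    def __init__(self) -> None:
        self.by_name: dict[str, str] = {}  # normalized name -> entity id
        self.aliases: dict[str, str] = {}  # normalized alias -> entity id

    def resolve_or_create(
        self, name: str, threshold: float = INGEST_FUZZY_THRESHOLD
    ) -> str:
        key = normalize(name)
        # 1. Exact lookup on the normalized name.
        if key in self.by_name:
            return self.by_name[key]
        # 2. Alias redirect.
        if key in self.aliases:
            return self.aliases[key]
        # 3. Fuzzy match: keep the best known name at or above the threshold.
        best_id: Optional[str] = None
        best_ratio = 0.0
        for known, entity_id in self.by_name.items():
            ratio = SequenceMatcher(None, key, known).ratio()
            if ratio >= threshold and ratio > best_ratio:
                best_id, best_ratio = entity_id, ratio
        if best_id is not None:
            return best_id
        # 4. Nothing matched: create a new entity.
        entity_id = f"ent-{len(self.by_name) + 1}"
        self.by_name[key] = entity_id
        return entity_id
```

In this sketch a near-duplicate such as "Sarah Martinezz" resolves to the existing "Sarah Martinez" entity, while an unrelated name creates a new one. Passing threshold=0.80 would mimic the looser query-time fallback mentioned in the same entry.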

Removed / not shipped (ablated and proven net-negative)

These were implemented on feature branches, ablated against the production v1 config (baseline + IRCoT), and removed because they regressed F1:

Sufficiency judge: Self-RAG / SURE-RAG / L-MARS Judge Agent pattern. F1 regressed by 0.04 to 0.13 when stacked on baseline + IRCoT.
CRAG-style chunk filter: Corrective RAG (arXiv:2401.15884) post-rerank LLM relevance filter. F1 regressed by 0.13 when combined with the sufficiency judge.
Tested but kept opt-in and off by default: --multi-query and --decompose, both measured to regress on top of IRCoT at n=100 and n=200.

Original Phase 1 scaffold (pre-2026-05-19)

Initial repository scaffold: pyproject.toml (hatchling), ruff and pytest configuration, source tree under src/engram/ with sub-packages for core, dialogue, backends, llm, caching, and observability.
Core data models in engram.core.models: Chunk, EnrichedChunk, Fact, Contradiction, CrossReference, and the SourceType, FactType, and RelationKind literal aliases.
Core protocols in engram.core.protocol: Enricher, CorpusBackend, LLMProvider, VectorStore, all runtime_checkable and async-first.
GitHub Actions CI: lint and test matrix on Python 3.11 and 3.12 plus one macOS cell; PyPI publish on v* tags via the trusted-publisher OIDC flow.
Apache 2.0 license, contributor docs (CONTRIBUTING.md, CODE_OF_CONDUCT.md, SECURITY.md), issue and PR templates, and CITATION.cff.
engram.dialogue.temporal: bi-temporal conflict detection ported from the Vrin temporal_consistency_manager. Public surface: detect_conflict, batch_detect_conflicts, select_valid_at, apply_resolution, and the FactConflict dataclass.
engram.dialogue.inference: ReasoningChain and CrossDocumentPattern Pydantic models seeding the multi-hop inference output. LLM-driven chain construction lands in a later release.
engram.core.scoring: jaccard_similarity, deduplicate_chunks, and score_chunk_relevance ported from Vrin's chunk-filter pipeline.
engram.core.protocol.Embedder: split out of VectorStore so callers can mix local Ollama embeddings with a hosted vector index. Takes a kind="document"|"query" parameter for asymmetric models like Cohere v3 and Voyage. VectorStore is now pure ANN over precomputed vectors.
engram.llm.embedders.LiteLLMEmbedder: routes through LiteLLM for OpenAI, Cohere, Voyage, Bedrock, Anthropic, and any other provider in the LiteLLM matrix. Maps kind to the right provider-specific input_type automatically.
engram.llm.embedders.OllamaEmbedder: hits a local Ollama server via the official ollama Python client for fully-local mode.
engram.backends.memory.MemoryBackend: LMDB + hnswlib-backed CorpusBackend. The zero-config default. Persists to disk; composite-key indexes on the s, p, sp, o, and st axes; async methods wrap the sync implementation via asyncio.to_thread. Reseats hnsw labels from zero on reopen.
engram.llm.litellm_provider.LiteLLMProvider: single LLMProvider composing LiteLLM for routing plus Instructor for Pydantic-validated structured extraction plus tenacity for retries. First-class Anthropic prompt-cache support via the cache_breakpoints parameter.
engram.observability.tracing: OpenTelemetry + OpenInference span scaffolding. backend_span and llm_span async context managers with TOOL / LLM OpenInference kinds, engram.cache_breakpoint_count attribute, and a no-op tracer fallback when opentelemetry-api is not installed.
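
Illustration of the Embedder split described above: a runtime_checkable, async-first protocol whose kind parameter is mapped to a provider-specific input_type. This is a sketch under assumptions, not the engram API. The method name embed, the ToyEmbedder class, and its fake vectors are hypothetical; the "search_document" / "search_query" values follow Cohere v3's input_type convention and are used here only as an example mapping.

```python
import asyncio
from typing import Literal, Protocol, Sequence, runtime_checkable

EmbedKind = Literal["document", "query"]


@runtime_checkable
class Embedder(Protocol):
    """Assumed protocol shape; the real method name may differ."""

    async def embed(
        self, texts: Sequence[str], kind: EmbedKind = "document"
    ) -> list[list[float]]: ...


class ToyEmbedder:
    """Hypothetical embedder that maps kind to a provider-style input_type."""

    _INPUT_TYPE = {"document": "search_document", "query": "search_query"}

    def _embed_sync(self, texts: list[str], input_type: str) -> list[list[float]]:
        # Stand-in for a blocking provider call: [text length, query flag].
        flag = 1.0 if input_type == "search_query" else 0.0
        return [[float(len(t)), flag] for t in texts]

    async def embed(
        self, texts: Sequence[str], kind: EmbedKind = "document"
    ) -> list[list[float]]:
        input_type = self._INPUT_TYPE[kind]
        # Run the blocking call off the event loop, as MemoryBackend does
        # for its sync internals.
        return await asyncio.to_thread(self._embed_sync, list(texts), input_type)
```

Because the protocol is runtime_checkable, isinstance(ToyEmbedder(), Embedder) holds without inheritance. Note that this check only verifies that the method exists, not its signature.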

Changed

CorpusBackend.find_facts is now keyword-only and accepts a status filter (defaults to "active"). Pass status=None to include every lifecycle state.
CorpusBackend gains update_fact(fact_id, properties) for in-place patches that should not create supersession history.
LLMProvider.complete and LLMProvider.extract accept a cache_breakpoints sequence of message indices. Providers that don't support prompt caching ignore the field.
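
Migration illustration for the find_facts change above. This toy backend only demonstrates the keyword-only call shape and the status default; ToyFact, ToyBackend, the subject parameter, and the "superseded" status value are hypothetical, and the real CorpusBackend method is async rather than sync.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyFact:
    fact_id: str
    subject: str
    status: str = "active"


class ToyBackend:
    """In-memory stand-in used only to show the call shape."""

    def __init__(self, facts: list[ToyFact]) -> None:
        self._facts = list(facts)

    def find_facts(
        self, *, subject: Optional[str] = None, status: Optional[str] = "active"
    ) -> list[ToyFact]:
        # The bare * makes every argument keyword-only.
        # status=None disables the lifecycle filter entirely.
        return [
            f
            for f in self._facts
            if (subject is None or f.subject == subject)
            and (status is None or f.status == status)
        ]


backend = ToyBackend(
    [ToyFact("f1", "paris"), ToyFact("f2", "paris", status="superseded")]
)
active_only = backend.find_facts(subject="paris")            # default: active facts
all_states = backend.find_facts(subject="paris", status=None)  # every lifecycle state
```

Positional calls such as backend.find_facts("paris") raise TypeError under this signature, so existing call sites need to switch to keyword arguments.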
