All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Daemon Teardown: Refactored the forceful SIGTERM daemon teardown logic to toggle via `force_exit=True` rather than a blanket catch-all. Makes programmatic `daemon.stop()` safer without blocking terminal UI interactions.
- Graceful Shutdown Hang: Fixed an edge case where `synap stop` would time out and forcefully terminate because `concurrent.futures` threads prevented immediate Python process exit when handling embeddings. The CLI now safely flushes databases and exits instantly.
- CLI Interactivity Enhancement: Wrapped all long-running `synap` CLI commands (`init`, `start`, `stop`, `rollback`, `repair`) with interactive rich status spinners so users never stare at an empty screen.
- Incremental Indexing Progress: Added a rich progress status to the `_incremental_index` background loop to provide visibility into parsing tasks during cache hits.
- CRITICAL: Daemon Thread Explosion: Patched a concurrency edge case in `SynapRuntime` where large repository indexing would trigger unbounded `threading.Thread` loops for LLM embeddings. A bounded `ThreadPoolExecutor` has been introduced to prevent OOMs, OS thread starvation, and rate-limit hits during first-run indexing.
- Daemon Lifecycle: Ensure `embedding_executor` shuts down cleanly without leaking background threads when the daemon loop receives SIGTERM.
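
The bounded-executor fix described above can be illustrated with a minimal sketch. The worker cap, `embed_chunk`, and `embed_all` are hypothetical names for illustration; the changelog does not state the actual limit or function names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cap; the real limit is not stated in this changelog.
MAX_EMBED_WORKERS = 4


def embed_chunk(text: str) -> list[float]:
    """Stand-in for a real embedding call (hypothetical)."""
    return [float(len(text))]


def embed_all(chunks: list[str]) -> list[list[float]]:
    # A fixed-size pool caps concurrent threads no matter how many chunks arrive,
    # unlike spawning one threading.Thread per chunk.
    executor = ThreadPoolExecutor(max_workers=MAX_EMBED_WORKERS, thread_name_prefix="embed")
    try:
        # map() preserves input order in its results.
        return list(executor.map(embed_chunk, chunks))
    finally:
        # Wait for in-flight work and drop anything still queued (Python 3.9+).
        executor.shutdown(wait=True, cancel_futures=True)
```

Shutting the pool down in a `finally` block is one way to avoid leaking worker threads when the caller is interrupted.
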
- CRITICAL: MCP StdIO Stream Corruption: Fixed a bug where `synap mcp start` and `signal_low_context` would accidentally print progress bars or notifications to `sys.stdout` instead of `sys.stderr`, which corrupted the MCP protocol stream and disconnected IDEs like Cursor/Windsurf.
- UX Improvements: Improved the silence logic in CLI initialization to prevent unnecessary rich formatting in JSON/MCP environments.
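
As a rough illustration of the stdout/stderr split described above, the sketch below keeps human-facing output on `sys.stderr` and reserves `sys.stdout` for protocol messages. `notify` and `send_response` are hypothetical helpers, and one JSON message per line is assumed.

```python
import json
import sys


def notify(message: str) -> None:
    # Human-facing output goes to stderr so stdout stays a clean protocol stream.
    print(message, file=sys.stderr)


def send_response(payload: dict) -> None:
    # Only protocol messages are written to stdout, one JSON document per line.
    sys.stdout.write(json.dumps(payload) + "\n")
    sys.stdout.flush()
```
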
- CLI `search` command: New user-facing command for executing hybrid structural searches directly from the terminal.
- `embed_provider` configuration: Added explicit control over the vector embedding provider via `SYNAP_EMBED_PROVIDER` or `config.toml`.
- CRITICAL: Metadata Nullability Bug: Resolved a type mismatch in `CodeSymbol` where `metadata=None` caused background worker crashes. Enforced strict `dict` contract with `default_factory`.
- CRITICAL: Exception Swallowing: Eradicated 26 instances of silent failure (`except Exception: pass`) across all core subsystems (API, Indexer, Storage, Retrieval, MCP, Providers). Replaced with structured, traceback-aware logging.
- CRITICAL: SQLite Schema Integrity: Fixed `NOT NULL` constraint failure in `llm_calls` table by aligning schema with insertion logic and adding `cost_usd` support.
- HIGH: Repository-Local Logging: Redirected daemon logs from a global shared directory to repository-local `.synap/logs` to prevent cross-repository log corruption and lock contention.
- HIGH: Rollback Atomicity: Reordered `synap rollback` operations to ensure Git state is successfully restored before purging index checkpoints.
- MEDIUM: Provider Resiliency: Added explicit handling for `NotImplementedError` in Anthropic and OpenRouter providers, preventing worker panics when embeddings are misconfigured.
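
The `CodeSymbol` metadata fix noted above might look roughly like the following, assuming `CodeSymbol` is a standard-library dataclass. The `name` field and the `TypeError` check are assumptions added for illustration; only `metadata` and `default_factory` come from the entry.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CodeSymbol:
    # `name` is a hypothetical field; only `metadata` is named in the changelog.
    name: str
    # default_factory gives every instance its own empty dict instead of None.
    metadata: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed enforcement of the "strict dict contract": reject None and other types early.
        if not isinstance(self.metadata, dict):
            raise TypeError(f"metadata must be a dict, got {type(self.metadata).__name__}")
```

Failing at construction time surfaces a bad `metadata=None` at the call site rather than later inside a background worker.
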
- `.synapignore` Support: Implemented context-aware filtering to prevent traversing symlinks, binary files, and files listed in `.synapignore` or `.gitignore`.
- FTS5 Integration: Shifted semantic full-text search directly to SQLite FTS5 with bounding limits (`LIMIT 50`) for `O(1)` lexical retrieval scaling on 10,000+ file monorepos.
- LLM Hangs in Test Suite: Isolated the testing environment by strictly bypassing real LLM provider configs (`SYNAP_LLM_PROVIDER=""`), preventing CI test suite hangs.
- Repository Path Scoping in Hybrid Engine: Fixed a bug where `repo_path` initialization mistakenly truncated the path to `/tmp` in tests, failing contextual snippet aggregation.
- Unbounded BM25 Expansion: Bounded full-text search token extraction and BM25 results, preventing out-of-memory cascading in CTE traversal during monorepo indexing.
- Architecture Documentation Consolidation: Merged duplicate `docs/architecture.md` into the root-level `ARCHITECTURE.md` to ensure a single, consistent source of architectural truth.
- Release Version Sync: Bumped package version in `src/synap_git/__init__.py` to `1.2.3` and updated configuration metadata to align with the release tag validation check in the pipeline.
- Technical Documentation (`docs/`): Added detailed manuals for architecture, CLI reference, configuration fields, MCP tool schemas, two-path indexing, wiki queues, and memory lifecycles.
- Main README: Completely rewrote the `README.md` from scratch to align strictly with the implemented features, CLI commands, configuration options, and integration paths.
- Obsolete Docs: Purged stale `quickstart.md`, `retrieval.md`, `roadmap.md`, `diagnostics.md`, `performance.md`, `benchmarks.md`, and `security.md` documents.
- Draft Files: Removed unreferenced root files `SYNAPSE_AUDIT_REPORT.md` and `SYNAPSE_AGENT_PROMPT.md`.
- CRITICAL: N+1 query loop during edge resolution: Migrated structural edge resolution to bulk FETCH queries, eliminating thousands of database calls per indexing run.
- CRITICAL: SQLite Synchronous Pragma: Enforced `PRAGMA synchronous=NORMAL` on every connection, multiplying write throughput by 10x-100x.
- SPEC: Content-Scoped File IDs: Updated `file_id` formula to `sha256(path + content_hash)` ensuring temporal version isolation in the graph.
- HIGH: Wiki Generation Resiliency: Implemented exponential backoff retries for LLM wiki generation to prevent data loss on transient network errors.
- HIGH: Single Read Principle: Optimized pipeline to read each file exactly once, halving I/O overhead.
- MEDIUM: Automated Lesson Pruning: Daemon now automatically prunes expired memory lessons hourly.
- MEDIUM: Memory Bounded Indexing: First-run indexing now processes in memory-bounded batches to prevent OOM on large repos.
- LOW: Checkpoint Validation: MCP `create_checkpoint` tool now validates all input fields to prevent malformed data.
- Interactive Review Flow: New CLI command `synap lessons review` for interactive management of agent-proposed lessons.
- Context Monitoring: New MCP tool `signal_low_context` for proactive agent context window monitoring.
- Configurable Maintenance: Added `checkpoint_threshold` and `lesson_expiry_days` to `config.toml`.
- Improved Doctor: `synap doctor` now checks for Git and GitHub CLI availability.
- Onboarding Guidance: `synap init` now provides explicit next steps for starting the system.
- Mock LLM Mode: `MockLLMProvider` removed from the production codebase to maintain strict operational integrity.
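
The content-scoped `file_id` formula listed above, `sha256(path + content_hash)`, can be sketched as follows. Computing `content_hash` as a plain SHA-256 of the file bytes is an assumption made here; the project may derive it differently (for example from Git blob hashes).

```python
import hashlib


def content_hash(data: bytes) -> str:
    # Assumed content hash: SHA-256 of the raw file bytes.
    return hashlib.sha256(data).hexdigest()


def file_id(path: str, data: bytes) -> str:
    # sha256(path + content_hash): the ID changes when either the path or the content changes.
    return hashlib.sha256((path + content_hash(data)).encode("utf-8")).hexdigest()
```

Under this scheme two empty `__init__.py` files in different packages get different IDs, and each new version of a file gets a new ID.
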
- Split Two-Path Indexing: Separated initialization and incremental indexing into `_first_run_index` (full scan, CPU-parallelized) and `_incremental_index` (Git delta change detector).
- Asynchronous Wiki Generation Queue: Decoupled slow, non-deterministic LLM wiki generation from structural indexing using a persistent database queue (`wiki_queue`) processed asynchronously by a daemon worker.
- Lazy Wiki Caching: Added synchronous wiki generation fallback to CLI (`wiki show`), Web API, and MCP tools to dynamically build missing or stale pages on demand.
- Process Pool Parallel Parsing: Parallelized Tree-sitter parsing on first run across all CPU cores using process-based concurrency with independent parser instances.
- SQLite Performance Hardening:
  - WAL mode and NORMAL synchronous configuration enabled during writes.
  - Multi-row symbol and edge inserts batched into a single transaction via `executemany`.
  - Dot-separated `module_key` pre-computation and indexing for $O(1)$ module resolution.
  - SQLite FTS5 index integration for fast sub-millisecond symbol searches, avoiding full-table scans.
- Web API Lazy Refreshes: Updated the `/wiki/{filepath}` GET endpoint to perform lazy refreshes on stale or missing pages before returning content.
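
The SQLite hardening items above (WAL mode, `synchronous=NORMAL`, batched `executemany` inserts, an indexed `module_key`) can be sketched with the standard `sqlite3` module. The table layout, index name, and function names below are hypothetical.

```python
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active; NORMAL reduces fsync calls.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


def insert_symbols(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> None:
    # Hypothetical schema: only `symbols` and `module_key` are named in the changelog.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS symbols (name TEXT NOT NULL, module_key TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_symbols_module_key ON symbols (module_key)"
    )
    with conn:  # one transaction for the whole batch, committed on success
        conn.executemany("INSERT INTO symbols (name, module_key) VALUES (?, ?)", rows)
```

Note that WAL applies to file-backed databases; an in-memory database reports its journal mode as `memory`.
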
- FTS5 Cascade Delete: Added database trigger `tgr_symbols_delete` to automatically clean up virtual `symbols_fts` entries when parent symbols are deleted.
- Duplicate File ID Collision: Handled unique, path-scoped file identifier generation ensuring files with identical content (like empty `__init__.py`) do not conflict.
- `synap rollback --commit <ref>` option: directly target a commit by hash/reference without interactive selection prompt.
- `synap rollback --yes` / `-y` option: suppress confirmation prompt for non-interactive and scripted rollback flows.
- Non-interactive guard in `synap rollback`: fails fast with a clear error when used in piped/CI contexts without `--commit` or `--yes`.
- `synap rollback` invalid commit detection: validates commit reference via `git rev-parse --verify` and rejects unknown refs with a clear message.
- SQLite migration short-circuit bug: legacy un-versioned databases (`user_version = 0`) incorrectly skipped `CREATE TABLE IF NOT EXISTS` execution, leaving the `symbols` table and others uninitialized. The premature short-circuit is removed; all schema tables are now created correctly before bumping to version 1.
- Python `import_from_statement` missing symbol extraction: Tree-sitter AST parser only extracted the module identifier from `from X import Y` statements, discarding `Y`. Now correctly emits `module:symbol` pairs for all imported names, aliases, and grouped imports.
- Namespace-aware call edge resolution: Pass 2 import resolver now splits `module:symbol` import entries to narrow edge targets to the correct module file, eliminating false-positive dependency edges to duplicate class names in sibling namespaces.
- FastAPI app version hardcoded: `create_app()` used a hardcoded version string `"0.2.0"` instead of the canonical `__version__`. Now dynamically imported from `synap_git.__init__`.
- Streaming generator cancellation safety: confirmed `httpx` stream connections are closed cleanly on partial consumption (no socket leaks).
- Degraded mode retry logic: confirmed 2-stage exponential backoff and graceful structural fallback under fully-offline and timeout conditions.
- Daemon resilience test (`test_daemon_resilience.py`) hardened against a race condition where the `SIGKILL` test read a stale PID from a prior run that had already exited.
- CLI usage management: `synap usage show` (displays Rich aggregated usage table and summary panel) and `synap usage clear`.
- CLI wiki management: `synap wiki list` and `synap wiki show <filepath>` (renders page in terminal via Rich Markdown).
- LLM call database logging: records `prompt_tokens`, `completion_tokens` for retrieval and wiki generation passes.
- Real-time daemon state: heartbeats integrated into `synap status`, `synap doctor`, and the Web UI status endpoints.
- Premium Web UI dashboard polish: dual L3 memory (Approved vs Pending) view, real-time LLM usage analytics, and active daemon PID badge.
- Defensive GHA release pipeline: `.github/workflows/release.yml` automates TestPyPI and PyPI publishing, tag alignment checking, and draft release generation.
- Clean Typer execution wrapper: intercepts configuration and credential exceptions to output actionable suggestions (e.g. `synap setup`) instead of tracebacks.
- Deterministic MCP JSON envelope: every tool response carries `ok`, `data`, `warnings`, `trace_id`, `dirty_tree`.
- Structured error objects with `code`, `message`, and `suggestion` fields for all failure paths.
- `dirty_tree` propagation: agents are warned when the working tree is ahead of the index.
- `get_approved_memory()` and `get_pending_memory()` MCP tools exposing lesson trust status.
- `synap mcp verify` command to assert full protocol contract compliance.
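
The response envelope and error object described above could be shaped roughly like this. The field names come from the entries; where the error object is attached (an `error` key here), the `trace_id` format, and the helper names are assumptions.

```python
import uuid
from typing import Any, Optional


def make_error(code: str, message: str, suggestion: str) -> dict[str, str]:
    # Structured error object with the three fields named in the changelog.
    return {"code": code, "message": message, "suggestion": suggestion}


def make_envelope(
    data: Any = None,
    *,
    warnings: Optional[list[str]] = None,
    dirty_tree: bool = False,
    error: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    envelope: dict[str, Any] = {
        "ok": error is None,
        "data": data,
        "warnings": list(warnings or []),
        "trace_id": uuid.uuid4().hex,  # assumed format: random hex string
        "dirty_tree": dirty_tree,
    }
    if error is not None:
        envelope["error"] = error  # hypothetical key for the failure path
    return envelope
```

Keeping the same five keys on every response lets a client check `ok` before touching `data`.
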
- Formal `LessonStatus` enum enforcing explicit state machine: `PENDING → APPROVED/REJECTED`, `APPROVED → EXPIRED`.
- Retrieval gating: only `APPROVED`, non-expired lessons are injected into LLM context as `# APPROVED SYSTEM MEMORY`.
- `prune_expired_lessons()` transitions stale lessons to `EXPIRED` state on demand.
- `approval_actor` field on lessons for full human-governance provenance.
- `synap memory status`: counts of pending, approved, expired lessons.
- `synap memory prune`: forces expiry evaluation and prunes dead memory.
- `synap memory verify`: checks approved lessons' `files_affected` against current repo state; reports dangling references.
- `synap lessons approve <id>` and `synap lessons reject <id>`: explicit human governance over pending lessons.
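
The lesson state machine listed above (`PENDING → APPROVED/REJECTED`, `APPROVED → EXPIRED`) can be sketched as an enum plus a transition table. The enum values, the `ALLOWED` table, and the `transition` helper are hypothetical; only the state names and permitted transitions come from the entry.

```python
from enum import Enum


class LessonStatus(Enum):
    # String values are assumptions; the changelog only names the states.
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXPIRED = "expired"


# Permitted transitions; REJECTED and EXPIRED are treated as terminal here.
ALLOWED: dict[LessonStatus, set[LessonStatus]] = {
    LessonStatus.PENDING: {LessonStatus.APPROVED, LessonStatus.REJECTED},
    LessonStatus.APPROVED: {LessonStatus.EXPIRED},
    LessonStatus.REJECTED: set(),
    LessonStatus.EXPIRED: set(),
}


def transition(current: LessonStatus, target: LessonStatus) -> LessonStatus:
    # Reject any move that is not explicitly listed in the table.
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```
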
- `GitIgnoreSpec` glob-to-regex parser in `RepositoryScanner` respects `.gitignore` patterns.
- Auto-protection: `SynapRuntime.bootstrap()` automatically adds `.synap/` to `.gitignore`.
- Enhanced binary file detection via extension blocklist + control-character ratio analysis.
- Symlink traversal prevention in `RepositoryScanner` (path-containment enforcement).
- `TraceStore` writes structured operational traces to `.synap/trace_latest.json`.
- Full hybrid retrieval tracing: latency timeline, token allocation, truncation explanations, structural hops, dirty-tree warnings.
- `/api/v1/trace/latest` REST endpoint exposing the latest trace.
- Diagnostic Web UI with visual latency timeline, source provenance table, and dirty-tree badge.
- Daemon heartbeat file (`.synap/daemon_heartbeat.json`) with PID, uptime, and recovery metrics.
- Daemon self-healing: SQLite corruption detected via `PRAGMA quick_check` triggers a wipe + re-bootstrap.
- `synap rollback`: interactive rollback to a previous git commit with lesson preservation.
- `synap recover`: explicit manual DB corruption recovery flow.
- `py.typed` marker for PEP 561 compliance.
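
The corruption check behind the self-healing entry above can be approximated with `PRAGMA quick_check`, which returns a single row containing `ok` for a healthy database. The `is_healthy` helper is a hypothetical name, and the wipe and re-bootstrap step is left out because the changelog does not describe it.

```python
import sqlite3


def is_healthy(db_path: str) -> bool:
    """Return True when PRAGMA quick_check reports no problems."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA quick_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a valid SQLite database.
        return False
    return rows == [("ok",)]
```
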
- CI pipeline split into 4 focused jobs: `lint`, `test`, `benchmark` (main only), `release-validation`.
- Release validation now runs `synap init` before `synap doctor` to ensure a valid Synap DB context.
- `pytest` configured with `asyncio_default_fixture_loop_scope = "function"` to eliminate deprecation warnings.
- Benchmark tests gated behind `benchmark` marker; skipped in fast PR CI passes.
- `synap doctor` in CI release validation step previously ran against an uninitialized directory.
- `ruff format` drift in `cli/main.py` and `indexer/daemon.py` resolved.
- Core deterministic indexing engine using Tree-sitter and Git content hashes.
- 4-stage hybrid retrieval pipeline (Temporal, Structural, Lexical, Semantic).
- Model Context Protocol (MCP) server for IDE integration.
- "Why-This-Context" retrieval tracing system.
- Secure secret management via `python-keyring`.
- `synap doctor` for system validation.
- Diagnostic UI dashboard.
- Refactored entire architecture from event-sourcing to deterministic Git projections.
- Consolidated storage into unified SQLite schema with Recursive CTE support.
- Upgraded documentation to production infrastructure standards.
- Legacy "cognitive OS" and "graph operating system" abstractions.
- Speculative async priority queues and replay engines.
- Brittle regex-based parsers.