Releases: shinpr/mcp-local-rag
Release: v0.15.2
Bug Fixes
- `delete_file` now reports what was actually removed. Responses (MCP and CLI) include `removedChunks` and `existed`, instead of always returning `deleted: true`.
- `delete_file` now fails on raw-data deletion errors. If a file's raw data cannot be removed, the operation reports an error rather than silently succeeding.
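As an illustration of how a caller might consume the new response, here is a minimal sketch. Only the field names `removedChunks` and `existed` come from the notes above; the `DeleteFileResult` interface, the field types, and the `describeDeletion` helper are assumptions made for this example, not the project's actual types.

```typescript
// Hypothetical response shape: only the two field names are taken from the release notes.
interface DeleteFileResult {
  removedChunks: number; // assumed: number of chunks actually deleted
  existed: boolean; // assumed: whether the target was present before the call
}

// Hypothetical helper: turn the response into a human-readable summary.
function describeDeletion(result: DeleteFileResult): string {
  if (!result.existed) {
    return "Nothing to delete: the target was not found.";
  }
  return `Removed ${result.removedChunks} chunk(s).`;
}
```

With the old `deleted: true` response these two cases were indistinguishable.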
Contributors
- Thanks to @rudi193-cmd for the `delete_file` response fix (#152).
Release: v0.15.1
Added
- Configurable embedding dtype via `RAG_DTYPE`. Choose the embedding quantization (`fp32`, `fp16`, `q8`, `int8`, …) with an opt-in environment variable. The default is `fp32`, so existing setups are unchanged. If the chosen model does not provide the requested dtype, the server fails with an error that lists the dtypes the model does provide. Changing `RAG_DTYPE` (or `RAG_DEVICE`) changes the embedding space, so re-ingest your documents after changing it.
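The dtype check described above could look roughly like the following sketch. The function name `resolveDtype`, the error wording, and the way the list of available dtypes is obtained are all assumptions; the server's real implementation may differ.

```typescript
// Hypothetical helper, not the server's actual code.
// `available` stands in for whatever dtypes the chosen model ships with.
function resolveDtype(requested: string | undefined, available: readonly string[]): string {
  // fp32 is the documented default when RAG_DTYPE is unset or blank.
  const dtype = requested !== undefined && requested.trim() !== "" ? requested.trim() : "fp32";
  if (available.indexOf(dtype) === -1) {
    // Fail loudly and list what the model does provide.
    throw new Error(
      `Model does not provide dtype "${dtype}". Available dtypes: ${available.join(", ")}`
    );
  }
  return dtype;
}
```

Failing at startup, rather than falling back to another dtype, avoids silently mixing embeddings from different embedding spaces in one database.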
Changed
- Clearer error reporting. Errors now keep their real cause instead of being masked. For example, a dtype/model misconfiguration during PDF ingestion now reports the actual problem rather than a generic “Failed to parse PDF”. Full diagnostic detail (cause chain) is written to stderr logs; MCP clients receive a clean, controlled message.
- A failed re-ingest whose rollback also fails now reports a distinct error stating that existing data may not have been restored.
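One way to keep the real cause while still returning a controlled message is to attach the original error as a `cause` and walk the chain only when logging. The sketch below assumes this pattern; `wrapError`, `causeChain`, and `toClientMessage` are hypothetical names and are not taken from the project.

```typescript
// Error with an optional `cause`, declared locally so the sketch does not depend on ES2022 typings.
type WithCause = Error & { cause?: unknown };

// Hypothetical helper: wrap a lower-level error without discarding it.
function wrapError(message: string, cause: unknown): WithCause {
  const err: WithCause = new Error(message);
  err.cause = cause;
  return err;
}

// Collect the messages along the cause chain (capped to avoid cycles).
function causeChain(err: unknown): string[] {
  const messages: string[] = [];
  let current: unknown = err;
  while (current instanceof Error && messages.length < 10) {
    messages.push(current.message);
    current = (current as WithCause).cause;
  }
  return messages;
}

// Full detail goes to stderr; the client only sees the top-level message.
function toClientMessage(err: unknown): string {
  console.error("[diagnostic]", causeChain(err).join(" <- caused by: "));
  return err instanceof Error ? err.message : "Unknown error";
}
```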
Release: v0.15.0
Fixes
- Ingest rollback: a failed re-ingest now restores the full prior chunk set with its real embeddings. Previously a rollback could lose chunks and leave dummy vectors behind, corrupting search results for that document.
- `list` / `list_files`: a file is now correctly reported as ingested when its base directory is reached through a symlink or directory alias, so there are no more spurious `ingested: false` results or duplicate orphaned entries. Paths are shown in their configured (non-resolved) form. Existing databases keep working; no migration.
- Input validation: MCP tools now reject malformed input with a clear `InvalidParams` error: `query_documents` (`query`, `limit` 1–20), `ingest_data` (`content`, `metadata.source` / `format`), and `delete_file` (missing target).
- Error handling: delete failures are no longer silently swallowed, and a transient full-text-search failure degrades only that single query instead of disabling hybrid search until restart.
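To make the `query_documents` validation concrete, here is a minimal sketch of such a check. The `limit` range of 1–20 comes from the notes above; the `InvalidParamsError` class, the default `limit` of 10, and the exact messages are assumptions for illustration only.

```typescript
// Hypothetical error type standing in for the MCP InvalidParams error.
class InvalidParamsError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "InvalidParamsError";
  }
}

interface QueryInput {
  query: string;
  limit: number;
}

// Hypothetical validator for query_documents input.
function validateQueryInput(input: { query?: unknown; limit?: unknown }): QueryInput {
  const query = input.query;
  // Assumption: a missing limit falls back to 10; the real default is not stated in the notes.
  const limit = input.limit === undefined ? 10 : input.limit;
  if (typeof query !== "string" || query.trim() === "") {
    throw new InvalidParamsError("query must be a non-empty string");
  }
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > 20) {
    throw new InvalidParamsError("limit must be an integer between 1 and 20");
  }
  return { query, limit };
}
```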
Performance
- Faster embedding via true batched inference (one forward pass per batch).
- Lighter `list` / `status` using row counts and projected scans instead of loading full rows.
Dependencies
- `@lancedb/lancedb` 0.29 → 0.30 (existing databases keep working; no migration).
Release: v0.14.2
Added
- `BASE_DIRS`: JSON array env var to configure multiple document roots, e.g. `BASE_DIRS='["/a","/b"]'`.
- Repeatable `--base-dir` for `ingest` and `list`. CLI roots replace env roots.
- Precedence: CLI `--base-dir` > `BASE_DIRS` > `BASE_DIR` > `cwd`.
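The precedence rule above can be sketched as a single resolution function. This is an illustration only: `resolveBaseDirs` and `RootEnv` are hypothetical names, and treating an empty array as invalid is an assumption the notes do not confirm.

```typescript
// Hypothetical view of the two relevant environment variables.
interface RootEnv {
  BASE_DIRS?: string;
  BASE_DIR?: string;
}

// Resolve document roots: CLI --base-dir > BASE_DIRS > BASE_DIR > cwd.
function resolveBaseDirs(cliBaseDirs: string[], env: RootEnv, cwd: string): string[] {
  if (cliBaseDirs.length > 0) {
    return cliBaseDirs; // CLI roots replace env roots entirely
  }
  if (env.BASE_DIRS !== undefined) {
    // Invalid JSON throws here instead of silently falling back to BASE_DIR.
    const parsed: unknown = JSON.parse(env.BASE_DIRS);
    const valid =
      Array.isArray(parsed) &&
      parsed.length > 0 && // assumption: an empty list counts as invalid
      parsed.every((d: unknown) => typeof d === "string");
    if (!valid) {
      throw new Error("BASE_DIRS must be a JSON array of strings");
    }
    return parsed as string[];
  }
  if (env.BASE_DIR !== undefined) {
    return [env.BASE_DIR];
  }
  return [cwd];
}
```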
Changed
- Configuration warnings (e.g. `BASE_DIRS is set; BASE_DIR is ignored.`) now appear in MCP tool responses, not only stderr.
- `list_files` returns `baseDirs: string[]` and per-file `baseDir`. Legacy `baseDir` (first effective root) is preserved.
- Invalid `BASE_DIRS` puts the server in degraded mode: `status` stays callable for diagnosis, root-dependent tools return a structured error. No silent fallback.
Security
- Sensitive paths (`/etc`, `/usr`, `~/.ssh`, ...) are rejected at both CLI and MCP server startup, including their realpath canonical forms (`/private/etc` on macOS).
- The `raw-data` ingest fast path is now gated by a realpath boundary check, closing traversal and symlink-escape vectors.
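A realpath boundary check of the kind described above generally resolves both the root and the target to canonical paths and then verifies containment. The sketch below shows that general technique using Node's `fs.realpathSync` and `path.relative`; `isWithinRoot` and `assertInsideRoot` are hypothetical names, and this is not the project's actual implementation.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Containment check on paths that are already canonical (symlinks resolved).
function isWithinRoot(canonicalRoot: string, canonicalTarget: string): boolean {
  const rel = path.relative(canonicalRoot, canonicalTarget);
  if (rel === "") return true; // target is the root itself
  // Escapes show up as ".." or "../..."; a different drive yields an absolute path.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Resolve symlinks first, then check the boundary. Throws if either path does not exist.
function assertInsideRoot(root: string, target: string): string {
  const canonicalRoot = fs.realpathSync(root);
  const canonicalTarget = fs.realpathSync(target);
  if (!isWithinRoot(canonicalRoot, canonicalTarget)) {
    throw new Error(`Path escapes base directory: ${target}`);
  }
  return canonicalTarget;
}
```

Comparing with `path.relative` rather than a plain string prefix avoids accepting sibling directories such as `/data/docs-private` for a root of `/data/docs`.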
Release: v0.14.1
Changes
- Visual ingest now offers two quality profiles (opt-in). When `visual: true` (MCP) or `--visual` (CLI) is set, a new `visualQuality` parameter (`visualQuality: 'fast' | 'quality'` on MCP `ingest_file`; `--visual-quality fast|quality` on CLI) selects the VLM. `fast` (default) keeps the v0.14.0 model `HuggingFaceTB/SmolVLM-256M-Instruct`. `quality` selects `onnx-community/Qwen2.5-VL-3B-Instruct-ONNX` for figures where in-image text (axis labels, panel sub-labels, flowchart nodes) needs to be captured reliably. Default behavior is unchanged; the `quality` profile is opt-in per ingest call.
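The profile-to-model mapping described above can be summarized as a small lookup. The two profile names and model IDs are taken from the notes; the `VLM_MODELS` constant and `selectVlm` function are hypothetical names used only for this sketch.

```typescript
type VisualQuality = "fast" | "quality";

// Model IDs as listed in the release notes.
const VLM_MODELS: Record<VisualQuality, string> = {
  fast: "HuggingFaceTB/SmolVLM-256M-Instruct",
  quality: "onnx-community/Qwen2.5-VL-3B-Instruct-ONNX",
};

// Hypothetical selector: no VLM unless visual ingest is requested; "fast" is the default profile.
function selectVlm(visual: boolean, visualQuality: VisualQuality = "fast"): string | null {
  return visual ? VLM_MODELS[visualQuality] : null;
}
```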
Release: v0.14.0
Changes
- Visual ingest for PDFs (opt-in). `visual: true` (MCP) or `--visual` (CLI) inlines a short caption from a local VLM (`HuggingFaceTB/SmolVLM-256M-Instruct`) into the text chunks for each page with figures, tables, or diagrams. Captions are auxiliary text: not image search, not OCR, and not a faithful transcription of the figure. Default ingest is unchanged. See the README's Ingesting PDFs with figures section for usage and the security note.
- GPU acceleration. Embedding runs on WebGPU when available, with a CPU fallback. Override with `RAG_DEVICE=cpu`.
- Bug fix (Windows paths). File metadata fields (file name, extension) now extract correctly from backslash-style paths. Previously these fields could be empty or contain the full path when ingesting from a Windows path.
- MCP server is env-only. Configuration comes from environment variables only; passing CLI flags to `npx mcp-local-rag` now fails fast with a clear error instead of being silently ignored.
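For the Windows path fix in this release, the underlying idea is to treat both `/` and `\` as separators when extracting the file name and extension. The following is a minimal sketch of that idea, assuming a hypothetical `fileMetadata` helper; it is not the project's actual code.

```typescript
// Hypothetical helper: extract file name and extension from POSIX- or Windows-style paths.
function fileMetadata(filePath: string): { fileName: string; extension: string } {
  // Split on either separator so "C:\\docs\\a.pdf" and "/docs/a.pdf" behave the same.
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  // A leading dot (e.g. ".gitignore") is treated as part of the name, not an extension.
  const extension = dot > 0 ? fileName.slice(dot + 1) : "";
  return { fileName, extension };
}
```

One trade-off of this approach is that a backslash inside a POSIX file name would also be treated as a separator.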
Contributors
- Thanks to @mickey-mikey for GPU support (#128), the env-only MCP server fix (#123), and the Windows path-handling fix (#118).
Release: v0.13.2
Patch release improving read_chunk_neighbors input validation.
Fixes
- `read_chunk_neighbors` now treats an empty or whitespace-only `filePath` / `source` as not provided. Passing `source: ""` alongside a valid `filePath` returns the document window instead of resolving to an empty raw-data path and returning nothing.
- The `filePath` / `source` validation error now matches the actual situation: "Provide either filePath or source, not both" when both are given, "Either filePath or source must be provided" when neither is.
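The two fixes above amount to normalizing blank strings before checking which identifier was supplied. A minimal sketch of that logic follows; the error messages are quoted from the notes, while `normalizeOptional`, `resolveTarget`, and the `Target` type are hypothetical names for illustration.

```typescript
// Treat empty or whitespace-only strings as "not provided".
function normalizeOptional(value: string | undefined): string | undefined {
  return value !== undefined && value.trim() !== "" ? value : undefined;
}

type Target = { kind: "filePath"; value: string } | { kind: "source"; value: string };

// Hypothetical resolver for the filePath / source pair.
function resolveTarget(input: { filePath?: string; source?: string }): Target {
  const filePath = normalizeOptional(input.filePath);
  const source = normalizeOptional(input.source);
  if (filePath !== undefined && source !== undefined) {
    throw new Error("Provide either filePath or source, not both");
  }
  if (filePath !== undefined) return { kind: "filePath", value: filePath };
  if (source !== undefined) return { kind: "source", value: source };
  throw new Error("Either filePath or source must be provided");
}
```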
Maintenance
- Upgraded dev dependencies (`@biomejs/biome`, `@types/node`, `knip`, `lint-staged`, `dpdm`) to their latest non-major releases.
Contributors
- Thanks to @dburner for the initial work on empty `filePath` / `source` handling.
Release: v0.13.1
Patch release for dependency upgrades.
Dependencies
- `@huggingface/transformers` ^4.0.0 → ^4.2.0
  - Minor version bump in the embedding library. Embedding output may differ slightly from v0.13.0; re-ingest existing documents if you observe behavior changes.
- `@lancedb/lancedb` ^0.26.2 → ^0.27.2
- `@modelcontextprotocol/sdk` ^1.28.0 → ^1.29.0
- `jsdom` ^27.4.0 → ^29.1.1
- `turndown` 7.2.2 → 7.2.4
Transitive
Release: v0.13.0
What's New
New tool: read_chunk_neighbors — Expand a search result by reading the chunks immediately before and after it in the same document.
After finding a relevant chunk via query_documents, pass its chunkIndex and filePath (or source) to retrieve surrounding context in a single call. Useful when a hit answers a question only partially.
MCP Tool
read_chunk_neighbors({ filePath: "/path/to/doc.md", chunkIndex: 5 })
CLI
npx mcp-local-rag read-neighbors --file-path /path/to/doc.md --chunk-index 5
Key details
- Asymmetric window: `--before` and `--after` control each direction independently (default 2, max 50)
- Target marking: The requested chunk is included with `isTarget: true`
- Lenient boundaries: Out-of-range indices return only existing chunks (no error)
- Dual input: Accepts `filePath` (from `ingest_file`) or `source` (from `ingest_data`)
- Skills docs updated: Agent Skills now include context-expansion guidance
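The window behavior listed above can be illustrated with a small index calculation. This sketch assumes zero-based, contiguous chunk indices and clamps `before` / `after` to the documented maximum of 50 (the tool may reject larger values instead); `neighborWindow` and `NeighborChunk` are hypothetical names.

```typescript
interface NeighborChunk {
  chunkIndex: number;
  isTarget: boolean;
}

// Compute which chunk indices fall in the window around `chunkIndex`.
function neighborWindow(
  totalChunks: number,
  chunkIndex: number,
  before: number = 2,
  after: number = 2
): NeighborChunk[] {
  // Assumption: values outside 0..50 are clamped rather than rejected.
  const clamp = (n: number): number => Math.min(Math.max(n, 0), 50);
  const start = Math.max(0, chunkIndex - clamp(before));
  const end = Math.min(totalChunks - 1, chunkIndex + clamp(after));
  const result: NeighborChunk[] = [];
  // Lenient boundaries: indices outside the document are skipped, not errors.
  for (let i = start; i <= end; i++) {
    result.push({ chunkIndex: i, isTarget: i === chunkIndex });
  }
  return result;
}
```

For a 10-chunk document, `neighborWindow(10, 5)` covers chunks 3 through 7 with chunk 5 marked as the target, while `neighborWindow(10, 0)` returns only chunks 0 through 2.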
Release: v0.12.0
Added
- `CHUNK_MIN_LENGTH` environment variable and `--chunk-min-length` CLI flag to configure minimum chunk length in characters (range: 1–10,000, default: 50)
- `CHUNK_MIN_LENGTH` entry in MCP server manifest (`server.json`)
Changed
- Minimum chunk length default (50) is now exported as a single source of truth from the chunker module
- Error message for zero-chunk ingestion now reflects the actual configured value instead of hardcoded "50"
- Warning and error messages across all env var parsers and CLI flags now truncate user input to 100 characters
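The parsing rules in this release (range 1–10,000, default 50, user input truncated to 100 characters in messages) might be combined as in the sketch below. Whether an invalid value throws or falls back with a warning is not stated in the notes; throwing is an assumption here, and `parseChunkMinLength` and `truncateForMessage` are hypothetical names.

```typescript
// Default taken from the release notes.
const DEFAULT_CHUNK_MIN_LENGTH = 50;

// Keep error messages bounded even when the user input is very long.
function truncateForMessage(input: string, max: number = 100): string {
  return input.length > max ? `${input.slice(0, max)}...` : input;
}

// Hypothetical parser for CHUNK_MIN_LENGTH / --chunk-min-length.
function parseChunkMinLength(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_CHUNK_MIN_LENGTH;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 1 || value > 10000) {
    // Assumption: invalid input is an error rather than a silent fallback.
    throw new Error(
      `Invalid CHUNK_MIN_LENGTH "${truncateForMessage(raw)}": expected an integer from 1 to 10000`
    );
  }
  return value;
}
```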