Scalable data preprocessing and curation toolkit for LLMs
Updated Jun 12, 2026 - Python
Fast Multimodal Semantic Deduplication & Filtering
Local multi-agent execution with middleware-level deduplication.
Public technical microsite for WDC-Engine, a middleware architecture for semantic deduplication and shared execution of agent-generated enterprise tasks.
Manage Chrome bookmarks with this local-first extension to search, organize, backup, and clean your browser links using AI-enhanced syntax.
Memory-as-a-Service for AI Agents & LLMs. Add persistent memory, pgvector-based semantic search, and automatic semantic deduplication with 3 simple REST API endpoints. Comes with an LRU embedding cache and a developer analytics dashboard.