Commit 2949dae

Update
1 parent f6e5f48 commit 2949dae

16 files changed

Lines changed: 791 additions & 182 deletions

.github/workflows/ci.yml

Lines changed: 5 additions & 0 deletions
@@ -37,6 +37,11 @@ jobs:
           POMAI_SKIP_VULKAN_TESTS: 1
         run: ctest --test-dir build -L bench --output-on-failure -C Release --timeout 600
 
+      - name: Perf Regression Gate
+        env:
+          POMAI_PERF_MAX_REGRESSION_PCT: 15
+        run: ./tools/perf_gate.sh --build-dir=build --dataset=small
+
       - name: Package ci-fast artifact
         shell: bash
         run: |
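The body of `tools/perf_gate.sh` is not part of this view, so how it applies `POMAI_PERF_MAX_REGRESSION_PCT` is unknown. As a minimal sketch, assuming the gate compares a throughput figure against a stored baseline, a percentage check could look like the following. `RegressionPct` and `PassesGate` are hypothetical names, not functions from the repository.

```cpp
#include <cassert>

// Hypothetical helper: percentage by which `current` throughput fell below
// `baseline`. Positive means slower than baseline, negative means faster.
double RegressionPct(double baseline, double current) {
  assert(baseline > 0.0);
  return (baseline - current) / baseline * 100.0;
}

// Hypothetical gate: passes while the regression stays within the limit.
bool PassesGate(double baseline, double current, double max_regression_pct) {
  return RegressionPct(baseline, current) <= max_regression_pct;
}
```

With a limit of 15, a drop from 1000 to 900 messages per second (10%) would pass and a drop to 800 (20%) would fail.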

CLAUDE.md

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What Makes PomaiDB Unique
+
+PomaiDB is designed as a **multimodal embedded database for edge devices** with several distinctive features:
+
+- **True Multimodal Storage**: Unlike traditional vector databases, PomaiDB supports 12+ membrane types (vectors, RAG, graphs, text, time series, key-value, metadata, sketches, spatial data, meshes, sparse vectors, bitsets), allowing storage and querying of diverse data modalities in a single database.
+
+- **Edge-Native Single-Threaded Architecture**: Designed for deterministic latency on constrained hardware with no mutexes, no lock-free queues, and no race conditions, similar to the Redis/Node.js event loop but optimized for flash storage longevity.
+
+- **Zero-OOM Guarantee**: Integrated with palloc for arena-style allocation with hard memory limits, combined with backpressure mechanisms to prevent out-of-memory crashes on edge devices.
+
+- **Offline-First Edge RAG**: Complete retrieval-augmented generation pipeline that runs entirely on-device (ingest → chunk → embed → store → retrieve) without external APIs, featuring zero-copy chunking and pluggable embedding providers.
+
+- **Multimodal Query Orchestration**: Hybrid search across different membrane types (vector + lexical + graph traversal) with heuristic execution ordering and bounded frontier RAM.
+
+- **Flash-Optimized Storage**: Append-only, log-structured design with tombstone-based deletion that minimizes random writes and extends SD/eMMC card lifespan.
+
+- **Built-in Edge Features**: HTTP endpoints, MQTT/WebSocket-style ingestion, hardware wear-aware maintenance, encryption-at-rest, and mini-OLAP analytical aggregates.
+
+## Development Commands
+
+### Build
+```bash
+# Standard release build
+mkdir build && cd build
+cmake .. -DCMAKE_BUILD_TYPE=Release
+make -j$(nproc)
+
+# Build with tests
+cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DPOMAI_BUILD_TESTS=ON
+cmake --build build -j$(nproc)
+
+# Edge-optimized build (smaller binary, less debug)
+cmake .. -DCMAKE_BUILD_TYPE=Release -DPOMAI_EDGE_BUILD=ON
+```
+
+### Tests
+```bash
+# Run full test suite
+cd build
+ctest --test-dir . --output-on-failure
+
+# Run a single test (replace TestName with actual test name)
+cd build
+ctest -R TestName --output-on-failure
+
+# Run tests with verbose output
+cd build
+ctest --verbose
+```
+
+### Benchmarks
+```bash
+# Build benchmarks
+cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DPOMAI_BUILD_BENCH=ON
+cmake --build build -j$(nproc)
+
+# Run all benchmarks
+./scripts/run_benchmarks_one_by_one.sh
+
+# Run benchmarks registered with ctest (narrow the -R pattern to pick one, e.g. bench_kernel_dispatch)
+cd build
+ctest -R bench_ --output-on-failure
+```
+
+### Code Quality
+```bash
+# Check for malloc/new usage (should use palloc)
+./scripts/check_no_malloc_new.sh
+
+# Verify API contract
+./scripts/check_api_contract.sh
+```
+
+### Quick Start Examples
+See `examples/` directory for language-specific examples:
+- C++: `examples/quickstart_cpp/`
+- C: `examples/quickstart_c/`
+- Python: `examples/quickstart_python/`
+- RAG: `examples/rag_quickstart/`
+
+## Architecture Overview
+
+### What Makes PomaiDB a Multimodal Embedded Database
+
+PomaiDB's architecture is specifically designed to be a **multimodal embedded database for edge devices**, combining several unique capabilities:
+
+- **Unified Multimodal Storage**: Single database engine that handles vectors, text, graphs, time series, spatial data, 3D meshes, and more through its membrane system, enabling true multimodal AI applications on edge devices.
+
+- **Deterministic Edge Performance**: Single-threaded event loop eliminates concurrency complexity while providing predictable latency crucial for real-time edge applications.
+
+- **Flash-Optimized for Longevity**: Append-only, log-structured storage minimizes write amplification on SD/eMMC storage, extending device lifespan.
+
+- **Hardware-Aware Resource Management**: Integration with palloc provides hard memory caps and arena-style allocation, preventing OOM crashes on memory-constrained devices.
+
+- **Complete On-Device AI Pipeline**: Offline-first RAG with zero-copy chunking and pluggable embedding providers enables local AI without cloud dependencies.
+
+### Core Design Principles
+- **Single-threaded event loop**: All operations (ingest, search, freeze, flush) run to completion in order, providing deterministic latency and trivial concurrency reasoning.
+- **Shared-nothing architecture**: One logical thread, one storage engine, one logical database per process.
+- **Zero-OOM philosophy**: Bounded memtable size, backpressure (auto-freeze when over threshold), and optional integration with palloc for arena-style allocation and hard memory caps.
+- **Edge-native storage**: Append-only, log-structured storage with tombstone-based deletion, designed for SD-card and eMMC longevity.
+- **Virtual File System (VFS)**: Storage and environment operations go through abstract `Env` and file interfaces. Default backend is POSIX; an in-memory backend supports tests and non-POSIX targets.
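The "Zero-OOM philosophy" item pairs arena-style allocation with a hard memory cap and backpressure. palloc's actual API is not shown in this commit, so the following is only an illustrative sketch of the idea: a bump allocator that refuses to grow past its cap and reports failure so the caller can react (for example by freezing a memtable). `BoundedArena` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical bump allocator with a hard byte cap. This is NOT palloc's API.
class BoundedArena {
 public:
  explicit BoundedArena(std::size_t cap)
      : buf_(new std::uint8_t[cap]), cap_(cap) {}

  // Returns nullptr instead of growing when the cap would be exceeded, so the
  // caller can apply backpressure. `align` must be a power of two; alignment
  // is computed relative to the start of the arena buffer.
  void* Allocate(std::size_t bytes,
                 std::size_t align = alignof(std::max_align_t)) {
    const std::size_t start = (used_ + align - 1) & ~(align - 1);
    if (start > cap_ || bytes > cap_ - start) return nullptr;
    used_ = start + bytes;
    return buf_.get() + start;
  }

  // O(1) release of everything allocated so far.
  void Reset() { used_ = 0; }

  std::size_t used() const { return used_; }
  std::size_t capacity() const { return cap_; }

 private:
  std::unique_ptr<std::uint8_t[]> buf_;
  std::size_t cap_;
  std::size_t used_ = 0;
};
```

Because a failed `Allocate` leaves the arena unchanged, the caller can flush, call `Reset()`, and retry rather than crash.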
+
+### Key Components
+- **DbImpl**: Main database implementation handling core operations (Put, Get, Search, Flush, Freeze, Close).
+- **MembraneManager**: Manages logical collections (membranes) with separate dimensions, sharding, and indexes. Supported kinds include kVector, kRag, kGraph, kText, kTimeSeries, kKeyValue, kMeta, kSketch, kSpatial, kMesh, kSparse, kBitset.
+- **QueryPlanner/QueryOrchestrator**: Plans and executes hybrid/multimodal searches across membranes with heuristic execution ordering, bounded frontier RAM, and metadata partition hints.
+- **Storage Engine**: Log-structured, append-only storage with sequential flush of in-memory buffer to disk. Uses WAL for crash recovery.
+- **RAG Pipeline**: Zero-copy chunking (`std::string_view`), `EmbeddingProvider` interface, and unified `RagPipeline` with `IngestDocument` and `RetrieveContext` methods.
+- **Memory Management**: Optional palloc (mmap-backed or custom allocator) for O(1) arena-style allocation and hard memory limits. Core and C API can use palloc for control structures and large buffers.
+- **I/O Layer**: Sequential write-behind; zero-copy reads (mmap where available via VFS, or buffered I/O). Designed for SD-card and eMMC longevity first, NVMe-friendly by construction.
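The RAG Pipeline item mentions zero-copy chunking via `std::string_view`. The repository's chunker is not included in this commit, so here is a minimal sketch of the general technique, assuming fixed-size chunks: each chunk is a view into the original buffer and no bytes are copied. `ChunkFixed` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical fixed-size chunker. Every returned chunk aliases `doc`, so the
// document buffer must outlive the chunks.
std::vector<std::string_view> ChunkFixed(std::string_view doc,
                                         std::size_t max_len) {
  std::vector<std::string_view> chunks;
  if (max_len == 0) return chunks;
  for (std::size_t pos = 0; pos < doc.size(); pos += max_len) {
    // substr clamps the length at the end of the view.
    chunks.push_back(doc.substr(pos, max_len));
  }
  return chunks;
}
```

The trade-off of this approach is lifetime coupling: the views are cheap, but they dangle if the source buffer is freed or reallocated before the chunks are consumed.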
+
+### Membrane Types (Multimodal Capabilities)
+Each membrane kind enables specific multimodal capabilities:
+- `kVector`: Vector storage with ANN search (IVF, HNSW) - for embeddings and similarity search
+- `kRag`: Retrieval-Augmented Generation pipeline storage - for document retrieval and context augmentation
+- `kGraph`: Graph storage for relationships and linkages - for knowledge graphs and entity relationships
+- `kText`: Raw text storage - for storing and querying unstructured text
+- `kTimeSeries`: Time-series data storage - for sensor data, metrics, and temporal analysis
+- `kKeyValue`: Simple key-value store - for configuration and metadata storage
+- `kMeta`: Metadata storage - for flexible schema-less data tagging
+- `kSketch`: Probabilistic data structures (HyperLogLog, CountMinSketch) - for approximate counting and frequency estimation
+- `kSpatial`: Geospatial data storage - for location-based services and mapping applications
+- `kMesh`: 3D mesh storage with LOD management - for AR/VR, robotics, and spatial computing
+- `kSparse`: Sparse vector storage - for efficient storage of high-dimensional sparse data
+- `kBitset`: Bitmask operations and filtering - for fast set operations and feature flags
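The `kSketch` entry names Count-Min Sketch as one of the probabilistic structures. PomaiDB's implementation is not shown here; the sketch below only illustrates the standard technique with a fixed-size table and a per-row seeded FNV-1a hash. The class name `CountMin` and the hashing choice are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>

// Illustrative Count-Min Sketch: Depth rows of Width counters in fixed memory.
// Estimates never undercount; hash collisions can only inflate them.
template <std::size_t Depth, std::size_t Width>
class CountMin {
 public:
  void Add(std::string_view key, std::uint32_t count = 1) {
    for (std::size_t row = 0; row < Depth; ++row) {
      table_[row][Slot(key, row)] += count;
    }
  }

  // The minimum across rows is the least-contaminated counter for the key.
  std::uint32_t Estimate(std::string_view key) const {
    std::uint32_t best = std::numeric_limits<std::uint32_t>::max();
    for (std::size_t row = 0; row < Depth; ++row) {
      best = std::min(best, table_[row][Slot(key, row)]);
    }
    return best;
  }

 private:
  // FNV-1a with a per-row seed (an assumption; any hash family would do).
  static std::size_t Slot(std::string_view key, std::size_t row) {
    std::uint64_t h = 14695981039346656037ull ^
                      (static_cast<std::uint64_t>(row) * 0x9E3779B97F4A7C15ull);
    for (unsigned char c : key) {
      h ^= c;
      h *= 1099511628211ull;
    }
    return static_cast<std::size_t>(h % Width);
  }

  std::array<std::array<std::uint32_t, Width>, Depth> table_{};
};
```

Memory use is fixed at `Depth * Width` counters regardless of how many distinct keys are added, which is the property that makes this family of structures attractive on constrained devices.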
+
+### Important Directories
+- `src/`: Core implementation (C++)
+- `include/`: Public headers
+- `sdk/`: Language bindings (Python, etc.)
+- `tests/`: GoogleTest unit and integration tests
+- `benchmarks/`: Performance benchmarks
+- `scripts/`: Utility scripts for building, testing, and benchmarking
+- `examples/`: Quickstart examples in multiple languages
+- `docs/`: Detailed documentation (edge release, deployment, failure semantics, etc.)
+
+## Common Development Tasks
+
+### Adding a New Membrane Type (Extending Multimodal Capabilities)
+1. Define the membrane kind in `include/pomai/types.h` (add to `MembraneKind` enum)
+2. Implement the membrane class in `src/core/membranes/` (follow existing patterns like `VectorMembrane` or `RagMembrane`)
+3. Register the membrane in `MembraneManager::CreateMembrane()` and `MembraneManager::GetMembrane()`
+4. Add appropriate tests in `tests/membranes/` (test basic operations, edge cases, and integration with query orchestrator)
+5. Update documentation if needed (mention in membrane types overview)
+6. Consider if the new membrane type should participate in hybrid queries (update `QueryOrchestrator` if needed)
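Steps 1 to 3 above follow a common enum-plus-factory pattern. The real `MembraneKind`, membrane base class, and `MembraneManager` signatures are not visible in this commit, so the types below are simplified stand-ins, and `kAudio` / `AudioMembrane` are invented purely to show where a new kind would be wired in.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in for the enum in include/pomai/types.h (step 1).
enum class MembraneKind { kVector, kText, kAudio /* hypothetical new kind */ };

// Stand-in base class; the real interface is not shown in this commit.
class Membrane {
 public:
  virtual ~Membrane() = default;
  virtual std::string Name() const = 0;
};

class VectorMembrane final : public Membrane {
 public:
  std::string Name() const override { return "vector"; }
};

class TextMembrane final : public Membrane {
 public:
  std::string Name() const override { return "text"; }
};

// Step 2: the new membrane class (hypothetical).
class AudioMembrane final : public Membrane {
 public:
  std::string Name() const override { return "audio"; }
};

// Step 3: registration, modelled here as a free factory function.
std::unique_ptr<Membrane> CreateMembrane(MembraneKind kind) {
  switch (kind) {
    case MembraneKind::kVector: return std::make_unique<VectorMembrane>();
    case MembraneKind::kText:   return std::make_unique<TextMembrane>();
    case MembraneKind::kAudio:  return std::make_unique<AudioMembrane>();
  }
  return nullptr;  // Only reached for values outside the enum.
}
```

Switching over the enum without a `default` case lets most compilers warn when a newly added kind has not been registered.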
+
+### Developing for Edge Devices
+1. **Memory Optimization**: Use `palloc` for arena-style allocation; avoid unbounded growth
+2. **Flash Longevity**: Favor sequential writes; minimize random I/O operations
+3. **Deterministic Latency**: Avoid blocking operations; keep operations bounded and predictable
+4. **Power Efficiency**: Profile CPU usage; leverage SIMD instructions when available
+5. **Testing on Target Hardware**: Use scripts in `benchmarks/` to validate performance on actual edge devices
+6. **Verify No Raw `malloc`/`new`**: Run `./scripts/check_no_malloc_new.sh` to confirm allocations go through palloc
+
+### Modifying Storage Layer (Edge-Optimized)
+1. Changes typically affect `src/core/storage/`
+2. Ensure WAL consistency and crash-recovery properties are maintained (critical for power-loss resilience)
+3. Run soak/power-loss tests (see `tests/storage/`)
+4. Verify append-only property and tombstone handling (no random writes)
+5. Test with `./scripts/edge_release_print_sizes.sh` to check binary footprint
+6. Validate performance characteristics with `./scripts/run_benchmarks_one_by_one.sh`
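Step 4 refers to the append-only property and tombstone handling. As a rough in-memory illustration of that contract (not the on-disk format, which this commit does not show), a delete appends a tombstone record instead of mutating earlier data, and a read replays the log so the latest record for a key wins. `LogRecord` and `AppendOnlyLog` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: either a value or a tombstone marking deletion.
struct LogRecord {
  std::uint64_t key;
  bool tombstone;
  std::string value;
};

class AppendOnlyLog {
 public:
  void Put(std::uint64_t key, std::string value) {
    records_.push_back({key, false, std::move(value)});
  }

  // Deletion appends a tombstone; earlier records are never rewritten.
  void Delete(std::uint64_t key) { records_.push_back({key, true, {}}); }

  // Replays the log in order; the last record for the key decides the result.
  std::optional<std::string> Get(std::uint64_t key) const {
    std::optional<std::string> result;
    for (const auto& r : records_) {
      if (r.key != key) continue;
      if (r.tombstone) {
        result.reset();
      } else {
        result = r.value;
      }
    }
    return result;
  }

  std::size_t size() const { return records_.size(); }

 private:
  std::vector<LogRecord> records_;
};
```

A test for the append-only property can assert that the record count only ever grows, including across deletes.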
+
+### Working with RAG Pipeline (On-Device AI)
+1. Core logic in `src/core/rag/`
+2. Embedding provider interface in `include/pomai/rag/embedding_provider.h` (implement for local models)
+3. Chunking strategies in `src/core/rag/chunking/` (optimize for zero-copy on edge)
+4. Test with `scripts/rag_smoke.py`
+5. Verify memory limits are respected (palloc integration)
+6. Test end-to-end pipeline: ingest → embed → store → retrieve → generate
+
+### Building Language Bindings for Edge
+- Python: Located in `sdk/python/` - uses ctypes by default (minimal footprint)
+- For richer APIs on edge, consider minimal pybind11 or custom C-compatible wrappers
+- Follow existing patterns in `sdk/` for new bindings
+- Consider creating lightweight bindings for microcontrollers if needed
+
+### Adding Edge-Specific Features
+1. **Connectivity**: HTTP endpoints in `src/edge_connectivity/` (health, metrics, ingestion)
+2. **Authentication**: Token-based auth mechanisms in edge connectivity layer
+3. **Hardware Features**: Utilize platform-specific optimizations (NEON, AVX2, etc.)
+4. **Wear Leveling**: Extend write-byte counters in storage layer for flash endurance
+
+## Testing Guidelines
+- Unit tests: GoogleTest framework in `tests/`
+- Integration tests: Focus on cross-membrane operations and recovery scenarios
+- Benchmark tests: Located in `benchmarks/` - measure throughput and latency
+- Sanitizer builds: Enable ASan/UBSan/TSan via CMake for CI-like testing locally
+- Always run `./scripts/check_no_malloc_new.sh` to ensure proper memory allocation practices
+
+## Logging and Diagnostics
+- Structured logging in `src/util/logging.*`
+- Log levels: DEBUG, INFO, WARN, ERROR
+- Use `POMAI_LOG(level)` macros for conditional logging
+- Inspect builds: `src/core/inspect.cc` provides runtime introspection
+- Health checks: Built-in HTTP endpoint (`/health`) when Edge connectivity features are enabled

CMakeLists.txt

Lines changed: 12 additions & 0 deletions
@@ -432,6 +432,10 @@ if (POMAI_BUILD_TESTS)
   pomai_setup_test(edge_platform_test)
   pomai_add_labeled_test(edge_platform_test "unit")
 
+  add_executable(kernel_hardening_test tests/unit/kernel_hardening_test.cc)
+  pomai_setup_test(kernel_hardening_test)
+  pomai_add_labeled_test(kernel_hardening_test "unit")
+
   add_executable(graph_rag_test tests/unit/graph_rag_test.cc)
   pomai_setup_test(graph_rag_test)
   pomai_add_labeled_test(graph_rag_test "unit")
@@ -476,6 +480,7 @@ if (POMAI_BUILD_TESTS)
   pomai_setup_test(no_train_dispatch_test)
   pomai_add_labeled_test(no_train_dispatch_test "unit")
 
+
   add_executable(edge_ai_core_enhancements_test tests/unit/edge_ai_core_enhancements_test.cc)
   pomai_setup_test(edge_ai_core_enhancements_test)
   pomai_add_labeled_test(edge_ai_core_enhancements_test "unit")
@@ -796,6 +801,10 @@ add_executable(vulkan_transfer_bench benchmarks/vulkan_transfer_bench.cc)
 target_link_libraries(vulkan_transfer_bench PRIVATE ${POMAI_EXE_DEPS})
 target_include_directories(vulkan_transfer_bench PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include ${CMAKE_CURRENT_SOURCE_DIR}/src)
 
+add_executable(kernel_dispatch_bench benchmarks/kernel_dispatch_bench.cc)
+target_link_libraries(kernel_dispatch_bench PRIVATE ${POMAI_EXE_DEPS})
+target_include_directories(kernel_dispatch_bench PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/include ${CMAKE_CURRENT_SOURCE_DIR}/src)
+
 # Bounded benchmark smoke tests (avoid ctest default 1500s timeout on heavy workloads).
 if (POMAI_REGISTER_BENCH_CTEST)
   set(POMAI_BENCH_TIMEOUT_LONG 600)
@@ -849,6 +858,9 @@ if (POMAI_REGISTER_BENCH_CTEST)
 
   add_test(NAME bench_vulkan_transfer COMMAND $<TARGET_FILE:vulkan_transfer_bench>)
   set_tests_properties(bench_vulkan_transfer PROPERTIES LABELS "bench" TIMEOUT ${POMAI_BENCH_TIMEOUT_SHORT})
+
+  add_test(NAME bench_kernel_dispatch COMMAND $<TARGET_FILE:kernel_dispatch_bench> 100000)
+  set_tests_properties(bench_kernel_dispatch PROPERTIES LABELS "bench" TIMEOUT ${POMAI_BENCH_TIMEOUT_SHORT})
 endif()
 
 # =========================
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+#include <chrono>
+#include <cstdint>
+#include <cstdlib>
+#include <iostream>
+#include <memory>
+
+#include "core/kernel/micro_kernel.h"
+
+namespace {
+class BenchPod final : public pomai::core::Pod {  // no-op pod: isolates dispatch cost
+ public:
+  void Handle(pomai::core::Message&& msg) override {
+    if (msg.status_ptr) *msg.status_ptr = pomai::Status::Ok();
+  }
+  pomai::core::PodId Id() const override { return pomai::core::PodId::kIndex; }
+  std::string Name() const override { return "BenchPod"; }
+  pomai::core::MemoryQuota GetQuota() const override { return {}; }
+};
+}  // namespace
+
+int main(int argc, char** argv) {
+  const uint32_t n = (argc > 1) ? static_cast<uint32_t>(std::strtoul(argv[1], nullptr, 10)) : 200000u;
+  pomai::core::MicroKernel kernel;
+  auto st = kernel.RegisterPod(std::make_unique<BenchPod>());
+  if (!st.ok()) {
+    std::cerr << "register failed: " << st.message() << "\n";
+    return 1;
+  }
+  pomai::Status op_st = pomai::Status::Ok();  // hoisted: must outlive ProcessAll(), which writes through status_ptr
+  const auto t0 = std::chrono::steady_clock::now();
+  for (uint32_t i = 0; i < n; ++i) {
+    auto msg = pomai::core::Message::Create(pomai::core::PodId::kIndex, pomai::core::Op::kFlush);
+    msg.status_ptr = &op_st;
+    kernel.Enqueue(std::move(msg));
+  }
+  kernel.ProcessAll();  // timed region covers both enqueue and drain
+  const auto t1 = std::chrono::steady_clock::now();
+  const double sec = std::chrono::duration_cast<std::chrono::duration<double>>(t1 - t0).count();
+  const double mps = sec > 0.0 ? static_cast<double>(n) / sec : 0.0;
+  std::cout << "kernel_dispatch_bench n=" << n << " sec=" << sec << " msg_per_sec=" << mps << "\n";
+  return 0;
+}
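The benchmark above measures an enqueue-then-drain pattern: messages are queued first and `ProcessAll()` later runs each handler to completion on the calling thread. `MicroKernel` itself is not part of this diff, so the following is only a simplified stand-in for that pattern, using `std::function` tasks in place of `Message` objects. `MiniKernel` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded run-to-completion queue (not MicroKernel).
class MiniKernel {
 public:
  void Enqueue(std::function<void()> task) {
    queue_.push_back(std::move(task));
  }

  // Runs every queued task in FIFO order on the calling thread and returns
  // how many tasks ran. Tasks enqueued by a running task are drained too.
  std::size_t ProcessAll() {
    std::size_t ran = 0;
    while (!queue_.empty()) {
      auto task = std::move(queue_.front());
      queue_.pop_front();
      task();
      ++ran;
    }
    return ran;
  }

 private:
  std::deque<std::function<void()>> queue_;
};
```

Since handlers run after `Enqueue` returns, anything a task captures by reference or pointer has to stay alive until `ProcessAll()` finishes, which is the same lifetime concern that applies to `status_ptr` in the benchmark.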

data.f

Whitespace-only changes.

gdb_out.txt

Lines changed: 0 additions & 35 deletions
This file was deleted.
