See also EDGE_RELEASE.md for the membrane capability matrix, release SLOs, crash-test gate, and foreign-binding roadmap.
PomaiDB is designed first for embedded / edge workloads: single-process, local storage, constrained memory, and frequent power loss. This guide summarizes recommended configuration presets and what happens on failure so you can reason about behavior on devices like Raspberry Pi, Jetson, or custom ARM boards.
This document focuses on the embedded pomai::Database API (single-instance engine) and the sharded pomai::DB API where relevant.
- Edge build profile (size-optimized):
  - Configure CMake with:
    - `-DPOMAI_EDGE_BUILD=ON` (enables `-Os -g0` and other size-focused flags)
    - `-DCMAKE_BUILD_TYPE=Release`
  - Recommended for production firmware images and containers where binary size and cold-start latency matter more than debug info.
- Strict warnings for development and CI:
  - Enable:
    - `-DPOMAI_STRICT=ON`
  - This turns most compiler warnings into errors for PomaiDB’s own code while keeping vendored dependencies (HNSW, SIMD kernels) lenient.
  - Safe to combine with `POMAI_EDGE_BUILD` once your toolchain is stable; it helps surface misconfigurations early.
PomaiDB stores all data under a single directory on local storage (e.g., SD card, eMMC, SSD).
- Filesystem & mount:
  - Prefer ext4 or another journaling filesystem with barriers enabled.
  - Avoid network filesystems for embedded use; PomaiDB assumes low-latency local I/O.
- Durability via `FsyncPolicy`:
  - For the sharded `pomai::DB` API (`pomai::DBOptions`):
    - `FsyncPolicy::kNever`:
      - Best for cache-like or reconstructible data.
      - Power loss may drop recent writes still in OS buffers, but on-disk data remains self-consistent.
    - `FsyncPolicy::kAlways`:
      - Recommended when data must survive power loss and write rates are modest.
      - Every WAL / manifest commit is fsynced; expect higher latency but strong durability.
  - For the embedded `pomai::Database` API (`pomai::EmbeddedOptions`):
    - Use `EmbeddedOptions::fsync` in the same way.
    - On intermittently powered devices, prefer `kAlways` for critical logs and `kNever` where data can be rebuilt.
- Flush vs. Freeze:
  - `Flush()` ensures the WAL is pushed to disk according to `FsyncPolicy`.
  - `Freeze()` moves the current memtable into an on-disk segment and updates manifests.
  - On edge devices, a common pattern from an event loop or watchdog is:
    - Periodically call `Flush()` and `Freeze()` on a timer (e.g., every N seconds) or after M ingests.
    - On clean shutdown, issue `Flush()` and `Freeze()` before `Close()`.
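The "every N seconds or after M ingests" pattern can be kept out of the ingestion path with a small scheduling helper. The following is a minimal sketch, not part of PomaiDB: `MaintenanceTimer` and its members are hypothetical names, and the caller is expected to invoke the engine's `Flush()` and `Freeze()` itself when `Due()` returns true.

```cpp
#include <chrono>
#include <cstdint>

// Decides when an event loop should run Flush() + Freeze(): once a fixed
// interval has elapsed or a fixed number of ingests has accumulated,
// whichever comes first. Hypothetical helper, not a PomaiDB type.
class MaintenanceTimer {
 public:
  using Clock = std::chrono::steady_clock;

  MaintenanceTimer(std::chrono::seconds interval, std::uint64_t max_ingests,
                   Clock::time_point now)
      : interval_(interval), max_ingests_(max_ingests), last_run_(now) {}

  // Call once per successful ingest.
  void OnIngest() { ++ingests_since_run_; }

  // True when either the time budget or the ingest budget is used up.
  bool Due(Clock::time_point now) const {
    return ingests_since_run_ >= max_ingests_ || now - last_run_ >= interval_;
  }

  // Call after Flush() and Freeze() have both succeeded.
  void MarkRun(Clock::time_point now) {
    last_run_ = now;
    ingests_since_run_ = 0;
  }

 private:
  std::chrono::seconds interval_;
  std::uint64_t max_ingests_;
  Clock::time_point last_run_;
  std::uint64_t ingests_since_run_ = 0;
};
```

Passing the current time in explicitly, rather than reading the clock inside the helper, keeps it easy to unit-test and lets a single-threaded loop reuse one clock reading per iteration. Calling `MarkRun()` only after both operations succeed means a failed freeze is retried on the next iteration.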
For the detailed atomic commit protocol and WAL / manifest guarantees, see docs/FAILURE_SEMANTICS.md.
pomai::Database exposes explicit backpressure controls in EmbeddedOptions:
- Key fields:
  - `max_memtable_mb`:
    - Hard cap for the memtable (in MiB).
    - `0` = use environment or default:
      - Default is tuned for edge and may differ between low-memory and normal builds.
  - `pressure_threshold_percent`:
    - Soft threshold (percent of `max_memtable_mb`) where pressure handling kicks in.
    - `0` = default (typically 80%).
  - `auto_freeze_on_pressure`:
    - If `true`, when the memtable exceeds the pressure threshold, the engine will call `Freeze()` internally rather than returning an error.
  - `memtable_flush_threshold_mb`:
    - Absolute size in MiB where `auto_freeze_on_pressure` triggers, overriding the percentage.
    - `0` = derive from `pressure_threshold_percent`.
- Recommended presets for edge:
  - Tiny devices (≤ 256 MiB RAM):
    - `max_memtable_mb = 32–64`
    - `pressure_threshold_percent = 70–80`
    - `auto_freeze_on_pressure = true`
    - `memtable_flush_threshold_mb = 32` (optional override)
  - Moderate devices (512 MiB – 1 GiB RAM):
    - `max_memtable_mb = 128–256`
    - `pressure_threshold_percent = 80`
    - `auto_freeze_on_pressure = true` (recommended) or `false` if you want manual control via `TryFreezeIfPressured()`.
- Environment overrides:
  - The embedded engine also honors:
    - `POMAI_MAX_MEMTABLE_MB` – caps memtable size if `max_memtable_mb` is `0`.
    - `POMAI_MEMTABLE_PRESSURE_THRESHOLD` – overrides `pressure_threshold_percent` for defaults.
    - `POMAI_BENCH_LOW_MEMORY` – switches to lower default memtable sizes for benchmarks / tests.
- Operational pattern:
  - In a single-threaded event loop, the typical pattern is:
    - Call `AddVector()` / `AddVectorBatch()` for ingestion.
    - Periodically call `TryFreezeIfPressured()` to keep memory use bounded.
    - Inspect `GetMemTableBytesUsed()` for metrics / logging.
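To make the interaction between the threshold fields and the ingestion loop concrete, here is a minimal, self-contained sketch. It makes several assumptions: `PressureConfig` is a local mirror of the fields described above (not the real `EmbeddedOptions`), the 64 MiB fallback cap is purely illustrative, `bool` stands in for the library's `Status`, and the member signatures on `Engine` are guesses based only on the method names in this guide. `FakeEngine` exists solely to exercise the loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local mirror of the threshold-related fields described above.
struct PressureConfig {
  std::uint32_t max_memtable_mb = 0;
  std::uint32_t pressure_threshold_percent = 0;
  std::uint32_t memtable_flush_threshold_mb = 0;
};

// Assumed fallbacks for fields left at 0: 80% is the "typical" default quoted
// above; the 64 MiB cap is an illustrative placeholder.
constexpr std::uint64_t kAssumedDefaultCapMb = 64;
constexpr std::uint64_t kAssumedDefaultPercent = 80;

// Memtable size in bytes at which pressure handling would start.
inline std::uint64_t PressureThresholdBytes(const PressureConfig& c) {
  const std::uint64_t kMiB = 1024ull * 1024ull;
  if (c.memtable_flush_threshold_mb != 0) {
    return c.memtable_flush_threshold_mb * kMiB;  // absolute override wins
  }
  const std::uint64_t cap_mb =
      c.max_memtable_mb != 0 ? c.max_memtable_mb : kAssumedDefaultCapMb;
  const std::uint64_t percent = c.pressure_threshold_percent != 0
                                    ? c.pressure_threshold_percent
                                    : kAssumedDefaultPercent;
  return cap_mb * kMiB * percent / 100;
}

// One ingestion step for a single-threaded event loop. `Engine` is any type
// providing the three members used below.
template <typename Engine>
bool IngestStep(Engine& db, std::uint64_t id, const std::vector<float>& vec,
                std::size_t* bytes_used_out) {
  if (!db.AddVector(id, vec)) return false;      // ingestion failed
  if (!db.TryFreezeIfPressured()) return false;  // freeze attempt failed
  *bytes_used_out = db.GetMemTableBytesUsed();   // export as a metric
  return true;
}

// In-memory stand-in used only to exercise the loop; not part of PomaiDB.
struct FakeEngine {
  std::uint64_t pressure_bytes = 1024;
  std::size_t used = 0;
  int freezes = 0;

  bool AddVector(std::uint64_t /*id*/, const std::vector<float>& v) {
    used += v.size() * sizeof(float);
    return true;
  }
  bool TryFreezeIfPressured() {
    if (used >= pressure_bytes) {  // "freeze": empty the memtable
      used = 0;
      ++freezes;
    }
    return true;
  }
  std::size_t GetMemTableBytesUsed() const { return used; }
};
```

For example, `max_memtable_mb = 64` with `pressure_threshold_percent = 75` gives a threshold of 48 MiB, while additionally setting `memtable_flush_threshold_mb = 32` lowers it to 32 MiB regardless of the percentage.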
PomaiDB’s IndexParams exposes presets tuned for edge workloads:
- Use `IndexParams::ForEdge()` wherever possible:
  - In `EmbeddedOptions`:
    - `opt.index_params = pomai::IndexParams::ForEdge();`
  - This preset reduces:
    - IVF list count (`nlist`), probes (`nprobe`),
    - HNSW degree / ef parameters,
    - and other memory-heavy knobs.
  - The goal is to keep index RAM usage predictable while still providing reasonable recall.
- Distance metric:
  - For most embedding-style workloads on edge devices:
    - Use `MetricType::kL2` (squared L2) with SQ8 or FP16 quantization for compact storage.
    - `MetricType::kInnerProduct` is also supported but may be more sensitive to quantization.
- Quantization knobs (when applicable):
  - Prefer SQ8 or FP16 quantization where your model tolerates some loss, especially for:
    - Large corpora on devices with ≤ 512 MiB RAM.
    - Scenarios where on-disk size is heavily constrained (e.g., SD cards with many tenants).
PomaiDB is built to fail closed rather than risking silent corruption. High-level behaviors (see docs/FAILURE_SEMANTICS.md for details):
- On `Open()` (embedded `Database::Open` / sharded `DB::Open`):
  - Invalid configuration (e.g., `dim == 0`, empty `path`) returns:
    - `Status::InvalidArgument`.
  - Filesystem errors (permissions, missing dirs that cannot be created) return:
    - `Status::IOError`.
  - WAL or manifest corruption:
    - The engine attempts to replay or recover.
    - If recovery is not possible, `Open()` returns a non-OK `Status` (e.g., `Corruption`, `Aborted`, or `Internal` depending on context) and does not start the engine.
- During ingestion / search:
  - Backpressure (embedded engine):
    - If the memtable exceeds `max_memtable_mb` and `auto_freeze_on_pressure` is `false`:
      - `AddVector` / `AddVectorBatch` will return `Status::ResourceExhausted` with a message instructing callers to `Freeze()` or `TryFreezeIfPressured()`.
    - If `auto_freeze_on_pressure` is `true`:
      - The engine attempts to `Freeze()` internally once pressure is detected.
      - If freeze fails (e.g., I/O error), the operation returns the corresponding failure `Status`.
  - I/O failures (ENOSPC, EIO, etc.):
    - Write failures on WAL / segments propagate as:
      - `Status::IOError` or `Status::Aborted` / `Status::Internal`, depending on the layer.
    - After a serious I/O error, affected shards / the embedded engine will refuse further operations until reopened, to avoid compounding corruption.
- Crash and restart behavior:
  - On restart, both APIs:
    - Re-open WALs and attempt replay up to the last valid record.
    - Validate manifests and segment files; fall back from `manifest.current` to `manifest.prev` if needed.
  - Tests such as `recovery_test`, `manifest_corruption_test`, and WAL corruption scenarios validate the following guarantees:
    - No silent acceptance of corrupted manifests or WAL segments.
    - Either recover to a consistent state (possibly losing a tail of recent writes) or fail to open with a non-OK `Status`.
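When `auto_freeze_on_pressure` is `false`, the caller is responsible for reacting to `ResourceExhausted` during ingestion. One way to structure that is a single freeze-and-retry, sketched below. The `Code` enum is a local stand-in for the status codes named in this guide, and `IngestWithBackpressure` is a hypothetical helper; the two callables would wrap the real `AddVector` and `Freeze()` (or `TryFreezeIfPressured()`) calls.

```cpp
// Minimal stand-in for the status codes named in this guide; real code would
// inspect the library's Status object instead.
enum class Code { kOk, kResourceExhausted, kIOError };

// Tries an ingest; on ResourceExhausted, freezes once and retries once.
// `ingest` and `freeze` are caller-supplied callables returning Code.
template <typename IngestFn, typename FreezeFn>
Code IngestWithBackpressure(IngestFn ingest, FreezeFn freeze) {
  const Code first = ingest();
  if (first != Code::kResourceExhausted) return first;  // OK or a hard error

  const Code frozen = freeze();
  if (frozen != Code::kOk) return frozen;  // e.g. I/O error writing the segment

  return ingest();  // single retry; a second failure is reported to the caller
}
```

Limiting the retry to one attempt avoids spinning on a device whose storage is full or failing; any remaining error surfaces to the caller so it can be logged and acted on.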
- Choose a failure policy per device class:
  - For sensor nodes with upstream replicas:
    - Prefer `FsyncPolicy::kNever`, small `max_memtable_mb`, and `auto_freeze_on_pressure = true`.
    - Rely on upstream for long-term durability.
  - For gateway / aggregation devices:
    - Prefer `FsyncPolicy::kAlways` for critical data.
    - Use `IndexParams::ForEdge()` and conservative `max_memtable_mb` to bound RAM.
- Integrate health checks:
  - Treat any non-OK `Status` from `Open()` as a signal to:
    - Log and raise an alert.
    - Potentially rotate to a new storage path or device.
  - Monitor:
    - `GetMemTableBytesUsed()`
    - Open / search error codes (e.g., `ResourceExhausted`, `IOError`, `Corruption`).
- Test on your actual target:
  - Run the existing integration, TSAN, and crash tests on:
    - Your device type, filesystem, and kernel.
  - Perform your own chaos test:
    - Ingest + `Flush()` / `Freeze()` loop.
    - Physically cut power or kill the process.
    - Verify that:
      - `Open()` either succeeds with intact historical data or fails with a clear error code.
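For the health-check guidance above, a supervisor can reduce the error codes it observes to a small set of responses. The mapping below is one possible policy, offered as a sketch: `StatusCode` is a local stand-in listing the codes named in this guide, while `HealthAction` and `ActionFor` are hypothetical names, and the choice of action per code is an assumption rather than documented PomaiDB behavior.

```cpp
// Stand-in for the status codes named in this guide; real code would inspect
// the library's Status object instead.
enum class StatusCode {
  kOk,
  kInvalidArgument,
  kIOError,
  kCorruption,
  kResourceExhausted,
  kAborted,
  kInternal,
};

// Hypothetical responses a device supervisor might take.
enum class HealthAction {
  kNone,                      // healthy, nothing to do
  kFreezeAndRetry,            // memory pressure: freeze, then retry the ingest
  kFixConfiguration,          // caller bug: bad options, alerting will not help
  kAlertAndConsiderRotation,  // log, alert, and possibly switch storage path
};

// Maps an observed status code to a supervisor response.
inline HealthAction ActionFor(StatusCode code) {
  switch (code) {
    case StatusCode::kOk:
      return HealthAction::kNone;
    case StatusCode::kResourceExhausted:
      return HealthAction::kFreezeAndRetry;
    case StatusCode::kInvalidArgument:
      return HealthAction::kFixConfiguration;
    case StatusCode::kIOError:
    case StatusCode::kCorruption:
    case StatusCode::kAborted:
    case StatusCode::kInternal:
      return HealthAction::kAlertAndConsiderRotation;
  }
  // Unknown codes are treated as serious, mirroring the fail-closed stance.
  return HealthAction::kAlertAndConsiderRotation;
}
```

Counting how often each action fires, alongside `GetMemTableBytesUsed()`, gives a compact set of metrics to export from the device.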
These guidelines are intentionally conservative: they aim to keep your edge deployments safe even under frequent power loss and tight memory budgets.