
test(pool): concurrent-client HTTP load test for the DB connection pool #225

Merged: tcconnally merged 1 commit into main from test/pool-concurrency-loadtest-223 on Jun 26, 2026

Conversation

@tcconnally

Collaborator

Closes #223.

Follow-up to the connection pool (#210, shipped in #221). The existing unit tests (pooled_database_shared_across_threads, concurrent_reader_writer_no_locks) exercise Database directly rather than the HTTP/SSE transport, and the pool knobs were hard-coded, so the pool's durability and throughput under sustained concurrent client load were unverified.

Changes

1. Make the pool tunable via env (src/db.rs)

  • MIMIR_POOL_MAX_SIZE (default 16)
  • MIMIR_BUSY_TIMEOUT_MS (default 5000)

Defaults preserve prior behavior. This lets operators size the pool to their workload and lets the load test sweep the knobs the issue asks about.
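A minimal sketch of how env-driven pool settings like these could be read. The `PoolConfig` struct, its field names, and the helper `parse_or_default` are hypothetical, and falling back to the default when a value fails to parse is an assumption; the actual code in src/db.rs may differ. Only the two variable names and their defaults come from this PR.

```rust
use std::env;
use std::str::FromStr;
use std::time::Duration;

/// Pool settings. The struct and field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct PoolConfig {
    pub max_size: u32,
    pub busy_timeout: Duration,
}

/// Parse an optional raw value, falling back to `default` when the value
/// is missing or does not parse (this fallback policy is an assumption).
fn parse_or_default<T: FromStr>(raw: Option<&str>, default: T) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

impl PoolConfig {
    /// Build the config from already-fetched values, so the parsing can be
    /// tested without touching the process environment.
    pub fn from_values(max_size: Option<&str>, busy_timeout_ms: Option<&str>) -> Self {
        PoolConfig {
            max_size: parse_or_default(max_size, 16),
            busy_timeout: Duration::from_millis(parse_or_default(busy_timeout_ms, 5000)),
        }
    }

    /// Read the two variables named in this PR from the environment.
    pub fn from_env() -> Self {
        let max_size = env::var("MIMIR_POOL_MAX_SIZE").ok();
        let busy_timeout_ms = env::var("MIMIR_BUSY_TIMEOUT_MS").ok();
        Self::from_values(max_size.as_deref(), busy_timeout_ms.as_deref())
    }
}
```

Keeping the parsing in `from_values` means a sweep can be unit-tested with plain strings, while `from_env` stays a thin wrapper over `std::env::var`.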

2. Add an #[ignore]d load test (src/transport.rs)
Drives the real HTTP transport — the same init_transport_state + build_transport_router + axum::serve path main.rs uses — with N concurrent ureq clients interleaving:

  • writes: mimir_remember with unique high-entropy bodies (so each is a real create, not a near-duplicate dedup — otherwise persisted == issued wouldn't test durability)
  • reads: mimir_recall + mimir_context
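The interleaved client loop described above might look roughly like the following std-only sketch. It is not the test in src/transport.rs: `run_clients` and `unique_body` are hypothetical names, the `send` callback stands in for the real ureq HTTP call, the hash-based body is only one way to make each write distinct, and issuing both read tools per read step is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;
use std::thread;

/// Build a body that differs for every (client, seq) pair. A real test
/// would likely mix in random bytes; this deterministic form is an assumption.
fn unique_body(client: usize, seq: usize) -> String {
    let mut hasher = DefaultHasher::new();
    (client, seq).hash(&mut hasher);
    format!("{:016x}-{client}-{seq}", hasher.finish())
}

/// Spawn `clients` threads. Each interleaves `writes` write calls with
/// `reads` read steps through `send(tool, body)`, a stand-in for the HTTP
/// request. Returns every write body that was issued.
fn run_clients<F>(clients: usize, writes: usize, reads: usize, send: F) -> Vec<String>
where
    F: Fn(&str, &str) + Send + Sync + 'static,
{
    let send = Arc::new(send);
    let handles: Vec<thread::JoinHandle<Vec<String>>> = (0..clients)
        .map(|client| {
            let send = Arc::clone(&send);
            thread::spawn(move || {
                let mut written = Vec::with_capacity(writes);
                for i in 0..writes.max(reads) {
                    if i < writes {
                        let body = unique_body(client, i);
                        (*send)("mimir_remember", &body);
                        written.push(body);
                    }
                    if i < reads {
                        // Assumption: one read step calls both read tools.
                        (*send)("mimir_recall", "query");
                        (*send)("mimir_context", "query");
                    }
                }
                written
            })
        })
        .collect();

    // Joining every handle is the "no deadlock" check: a stuck client
    // would block here instead of returning.
    handles
        .into_iter()
        .flat_map(|h| h.join().expect("client thread panicked"))
        .collect()
}
```

The returned list of issued bodies gives the caller the "writes issued" side of a persisted-versus-issued comparison.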

It asserts the four properties #223 calls out:

  • ✅ no "database is locked" / SQLITE_BUSY errors surface after the busy_timeout
  • ✅ no lost writes — rows persisted (counted via an independent connection) == writes issued
  • ✅ no deadlock — all client threads join
  • ✅ reports p50 / p99 / max latency + throughput so the operator can judge tail behavior
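For the latency report, one common approach is a nearest-rank percentile over the sorted per-request durations. The sketch below illustrates that approach only; `percentile`, `summarize`, and `LatencySummary` are hypothetical names, and the PR does not say which percentile method the test uses.

```rust
use std::time::Duration;

/// Nearest-rank percentile over an already sorted slice; `pct` is 1..=100.
fn percentile(sorted: &[Duration], pct: usize) -> Option<Duration> {
    if sorted.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    // Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    let rank = (pct * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// Summary of one run. The struct and field names are hypothetical.
#[derive(Debug, PartialEq)]
struct LatencySummary {
    p50: Duration,
    p99: Duration,
    max: Duration,
    throughput_rps: f64,
}

/// Sort the samples and derive p50 / p99 / max plus requests per second.
/// Returns `None` when there are no samples.
fn summarize(mut samples: Vec<Duration>, wall_clock: Duration) -> Option<LatencySummary> {
    samples.sort_unstable();
    let throughput_rps = samples.len() as f64 / wall_clock.as_secs_f64();
    Some(LatencySummary {
        p50: percentile(&samples, 50)?,
        p99: percentile(&samples, 99)?,
        max: *samples.last()?,
        throughput_rps,
    })
}
```

Nearest-rank always returns an observed sample rather than an interpolated value, which keeps the p99 figure tied to a request that actually happened.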

Why #[ignore]

It's a load/soak test, not a CI correctness gate — the contention characteristics "can't be proven by CI" (per the issue). Run it explicitly and sweep:

cargo test --release pool_load_test_http_transport -- --ignored --nocapture

# sweep the pool knobs
MIMIR_POOL_MAX_SIZE=4 MIMIR_BUSY_TIMEOUT_MS=5000 MIMIR_LOADTEST_CLIENTS=32 \
  cargo test --release pool_load_test_http_transport -- --ignored --nocapture

Load-test tunables (env, defaults in parentheses): MIMIR_LOADTEST_CLIENTS (16), MIMIR_LOADTEST_WRITES (25 per client), MIMIR_LOADTEST_READS (75 per client).

Verification (x86_64-pc-windows-msvc, debug)

Default run (16 clients, pool size 16): 2800 requests, 400/400 writes persisted, 0 lock errors, 0 other errors, p50≈1ms / p99≈27ms. The default cargo test run shows the load test as ignored (it does not gate CI); existing pool/transport tests still pass.
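As a sanity check on the reported counts, the arithmetic can be written out. The sketch assumes each read step issues two requests (one mimir_recall and one mimir_context); that reading is an inference that happens to reproduce the 2800 figure with the default tunables, not something the PR states outright. The function names are hypothetical.

```rust
/// Total requests for a run, given how many HTTP calls one read step makes.
fn expected_requests(clients: u64, writes: u64, reads: u64, calls_per_read: u64) -> u64 {
    clients * (writes + reads * calls_per_read)
}

/// Total writes issued, which is what the persisted row count is compared to.
fn expected_writes(clients: u64, writes: u64) -> u64 {
    clients * writes
}
```

With the defaults, 16 clients × 25 writes gives 400 writes, and 16 × (25 + 75 × 2) gives 2800 requests.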

🤖 Generated with Claude Code

Follow-up to the connection pool (#210, shipped in #221). Unit coverage existed
(pooled_database_shared_across_threads, concurrent_reader_writer_no_locks) but
exercised the Database directly — not the HTTP/SSE transport — and the pool
knobs were hard-coded, so the durability/throughput characteristics under
sustained concurrent client load were unverified.

Two changes:

1. Make the pool tunable via env (db.rs): MIMIR_POOL_MAX_SIZE (default 16) and
   MIMIR_BUSY_TIMEOUT_MS (default 5000). Defaults preserve prior behavior; this
   lets operators size the pool to their workload and lets the load test sweep
   the knobs.

2. Add an #[ignore]d load test (transport.rs) that drives the REAL HTTP
   transport — the same init_transport_state + build_transport_router +
   axum::serve path main.rs uses — with N concurrent ureq clients interleaving
   writes (mimir_remember, unique high-entropy bodies so each is a real create,
   not a dedup) and reads (mimir_recall + mimir_context). It asserts the four
   properties #223 calls out: no `database is locked` / SQLITE_BUSY after the
   busy_timeout, no lost writes (rows persisted == writes issued), no deadlock
   (all clients join), and reports p50/p99/max latency + throughput.

It is #[ignore]d on purpose — a load/soak test, not a CI correctness gate (the
contention characteristics "can't be proven by CI"). Run it explicitly and
sweep:

    cargo test --release pool_load_test_http_transport -- --ignored --nocapture
    MIMIR_POOL_MAX_SIZE=4 MIMIR_LOADTEST_CLIENTS=32 cargo test ... -- --ignored

Verified on x86_64-pc-windows-msvc (default 16x16 pass; pool=2 sweep pass).

Closes #223

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@tcconnally tcconnally merged commit 5050f28 into main Jun 26, 2026
4 checks passed
@tcconnally tcconnally deleted the test/pool-concurrency-loadtest-223 branch June 26, 2026 15:17

Development

Successfully merging this pull request may close these issues.

test: concurrent-client load test for the DB connection pool (#210 follow-up)
