Commit 37c7b54

fix(backend): tune pool/clustering to the RIG Postgres cap of 20 connections per user
The shared RIG Postgres (rig-db) caps every project DB user at 20 connections in total (CONNECTION LIMIT 20; max_connections 250, reserved 10). Under clustering, the previous defaults (one worker per core × pool of 10) would far exceed that budget and cause connections to be refused (too many connections for role).
- WEB_CONCURRENCY defaults to 1 (clustering is opt-in instead of one worker per core). The app is I/O-bound, so a single process with a healthy pool is the best balance.
- DB_POOL_MAX defaults to 15, with a ceiling of 20 (the per-user cap).
- README: the calculation replicas × WEB_CONCURRENCY × DB_POOL_MAX ≤ ~18, plus PgBouncer as the route to real scale.
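
To make the overrun concrete, here is a minimal sketch of the arithmetic. The 4-core machine is an assumed example value (the old README example also used 4 workers), and `totalConnections` is a hypothetical helper that does not exist in the backend.

```typescript
// Hypothetical helper: worst-case number of connections one DB user can open.
const PER_USER_CAP = 20; // CONNECTION LIMIT 20 on the shared rig-db

function totalConnections(replicas: number, workers: number, poolMax: number): number {
  return replicas * workers * poolMax;
}

// Old defaults on an assumed 4-core machine: one worker per core, pool of 10.
const oldTotal = totalConnections(1, 4, 10);
// New defaults: a single worker with a pool of 15.
const newTotal = totalConnections(1, 1, 15);

console.log(oldTotal, oldTotal <= PER_USER_CAP); // 40 false
console.log(newTotal, newTotal <= PER_USER_CAP); // 15 true
```

Note that the new defaults fit only at one replica: two replicas at the defaults would already need 30 connections.
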
1 parent f60abe1 commit 37c7b54

4 files changed

Lines changed: 41 additions & 28 deletions

apps/boekhouding-backend/README.md

Lines changed: 19 additions & 10 deletions
@@ -27,28 +27,37 @@ Tests run against a real Postgres. Set `TEST_DATABASE_URL`, or let the defa
 | `CORS_ORIGIN` / `PUBLIC_HOST` | `http://localhost:5174` | Allowed origin(s); a comma-separated list is possible |
 | `TRUST_PROXY` | `1` | Number of proxy hops (for `req.ip` / rate limiting) |
 | `EXPOSE_API_DOCS` | `false` | Swagger UI + `/api/openapi.json` |
-| **`WEB_CONCURRENCY`** | number of CPU cores | Number of worker processes (clustering). `1` = off. Clamped to `[1, 64]` |
-| **`DB_POOL_MAX`** | `10` | Postgres pool size **per worker**. Clamped to `[1, 100]` |
+| **`WEB_CONCURRENCY`** | `1` | Number of worker processes (clustering). Defaults to 1 (off); opt in with a value `> 1`. Clamped to `[1, 64]` |
+| **`DB_POOL_MAX`** | `15` | Postgres pool size **per worker**. Clamped to `[1, 20]` (the per-user cap) |
 | **`DB_CONNECT_TIMEOUT`** | `10` | Seconds before a new DB connection attempt fails |
 | **`DB_IDLE_TIMEOUT`** | `30` | Seconds before an idle DB connection is closed |
 | **`RATE_LIMIT_MAX`** | `300` | Requests per IP per minute (cluster-wide; see below) |

 Invalid or missing values fall back safely to the default.

-## Scaling (clustering)
+## Scaling and the connection limit

-By default the server runs one worker per CPU core (`node:cluster`), so that the
-available CPU is used. Migrations run once before the workers start
-(container CMD: `migrate && index`).
+The shared RIG Postgres (`rig-db`) is set to `max_connections: 250` with
+`reserved_connections: 10`, and, binding for us, **every project DB user is
+capped at 20 connections** (`CONNECTION LIMIT 20`, set after an incident in which
+one project swallowed all slots and broke Keycloak). That figure of **20 is therefore the
+total budget across all pods, replicas and workers combined**.

-**Pool calculation (important):** every worker has its own pool. Make sure that
+Because the app is I/O-bound (low CPU load) and DB work runs on the DB server,
+**one worker with a healthy pool** gives the best balance, not many workers with mini-pools.
+Hence: `WEB_CONCURRENCY=1` (default), `DB_POOL_MAX=15` → 15 connections at 1 replica,
+comfortably under 20.
+
+**Calculation (guard the budget):**

 ```
-WEB_CONCURRENCY × DB_POOL_MAX ≤ Postgres max_connections (default 100) − headroom
+replicas × WEB_CONCURRENCY × DB_POOL_MAX ≤ ~18 (20 minus headroom)
 ```

-(headroom for migrations, Keycloak and administration). Example: 4 workers × 10 = 40 → well
-within 100. Too high a combination can exhaust the database (self-DoS).
+If you do want to use multiple cores (clustering), set `WEB_CONCURRENCY > 1` and
+lower `DB_POOL_MAX` accordingly (e.g. 6 workers × 3 = 18). If you really want to scale to many
+concurrent users, a **connection pooler (PgBouncer)** is the
+right route (already planned as a rig-cluster-*future*) rather than a larger per-worker pool.

 The in-memory rate limit is per worker; the entrypoint therefore divides `RATE_LIMIT_MAX`
 by the number of workers, so that the cluster-wide limit stays approximately the same.
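
The README's budget rule and rate-limit split can be sketched as two small functions. Both helpers are hypothetical and exist only for illustration. The budget of 18 comes from the README; rounding down and the minimum of 1 request per worker are assumptions of this sketch, since the entrypoint's exact rounding is not shown in the diff.

```typescript
// Hypothetical helpers illustrating the README's budget rule; not part of the backend.
const CONNECTION_BUDGET = 18; // 20-connection per-user cap minus headroom

// Largest DB_POOL_MAX per worker that keeps
// replicas × WEB_CONCURRENCY × DB_POOL_MAX within the budget.
function maxPoolPerWorker(
  replicas: number,
  webConcurrency: number,
  budget: number = CONNECTION_BUDGET,
): number {
  if (![replicas, webConcurrency].every((n) => Number.isInteger(n) && n >= 1)) {
    throw new RangeError('replicas and webConcurrency must be positive integers');
  }
  return Math.floor(budget / (replicas * webConcurrency));
}

// Per-worker share of RATE_LIMIT_MAX so the cluster-wide limit stays roughly constant.
// Assumption: round down, but never below 1 request per worker.
function rateLimitPerWorker(rateLimitMax: number, workers: number): number {
  return Math.max(1, Math.floor(rateLimitMax / workers));
}

console.log(maxPoolPerWorker(1, 6)); // 3 (6 workers × 3 = 18)
console.log(maxPoolPerWorker(2, 1)); // 9 (2 replicas × 1 worker × 9 = 18)
console.log(rateLimitPerWorker(300, 6)); // 50
```

A result of 0 from `maxPoolPerWorker` means the combination cannot fit within the budget at all, which is the point at which a connection pooler becomes the only option.
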

apps/boekhouding-backend/src/config.ts

Lines changed: 8 additions & 5 deletions
@@ -49,12 +49,15 @@ export const config = {
   exposeApiDocs: process.env.EXPOSE_API_DOCS === 'true',
   trustProxy: parseTrustProxy(),
   databaseUrl: process.env.DATABASE_SERVER_FULL || 'postgresql://parassessment:parassessment@localhost:5432/parassessment',
-  // Postgres connection pool. Defaults match postgres.js but are now explicit and
-  // tunable per deployment. With clustering, size DB_POOL_MAX so that
-  // workers × DB_POOL_MAX stays below the server's max_connections (default 100),
-  // leaving headroom for migrations and other clients.
+  // Postgres connection pool, PER worker process. The RIG shared Postgres caps
+  // each project DB user at 20 connections total (see README), so the total
+  // across all workers AND replicas must stay under that:
+  //   replicas × WEB_CONCURRENCY × DB_POOL_MAX ≤ ~18 (20 minus headroom).
+  // The app is I/O-bound, so the default is a single worker with a healthy pool;
+  // the ceiling is the per-user cap. Raise the pool / add workers only within
+  // that budget, or put a connection pooler (PgBouncer) in front.
   db: {
-    max: parsePositiveInt(process.env.DB_POOL_MAX, 10, 100),
+    max: parsePositiveInt(process.env.DB_POOL_MAX, 15, 20),
     connectTimeout: parsePositiveInt(process.env.DB_CONNECT_TIMEOUT, 10, 300),
     idleTimeout: parsePositiveInt(process.env.DB_IDLE_TIMEOUT, 30, 86400),
   },
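
The body of `parsePositiveInt` is not part of this diff. The following is a sketch of a clamping parser that is consistent with the behaviour the tests in `config.cov.test.ts` assert (fall back for unset, non-numeric or below-1 values; clamp above the ceiling). The name `parsePositiveIntSketch` is hypothetical, and the real implementation may differ, for example in how it treats input such as `'12abc'`.

```typescript
// Sketch only: the real parsePositiveInt in config.ts is not shown in the diff.
function parsePositiveIntSketch(raw: string | undefined, fallback: number, max: number): number {
  const value = Number.parseInt(raw ?? '', 10);
  // Unset, non-numeric, or below 1: fall back to the default.
  if (!Number.isFinite(value) || value < 1) return fallback;
  // Above the ceiling: clamp instead of rejecting.
  return Math.min(value, max);
}

console.log(parsePositiveIntSketch(undefined, 15, 20)); // 15
console.log(parsePositiveIntSketch('12', 15, 20)); // 12
console.log(parsePositiveIntSketch('abc', 15, 20)); // 15
console.log(parsePositiveIntSketch('0', 15, 20)); // 15
console.log(parsePositiveIntSketch('500', 15, 20)); // 20
```
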

apps/boekhouding-backend/src/index.ts

Lines changed: 6 additions & 5 deletions
@@ -1,12 +1,13 @@
 import cluster from 'node:cluster'
-import { availableParallelism } from 'node:os'
 import { buildApp } from './app.js'
 import { config } from './config.js'

-// One worker per CPU core by default, so we actually use the available CPU.
-// WEB_CONCURRENCY pins the count (1 disables clustering; lower it when scaling
-// horizontally via replicas instead). The value is clamped in config.ts.
-const workers = config.webConcurrency ?? availableParallelism()
+// Single worker by default. The app is I/O-bound (low CPU), and the shared
+// Postgres caps this DB user at 20 connections total, so each extra worker
+// multiplies connection pressure (workers × DB_POOL_MAX). Opt into clustering
+// by setting WEB_CONCURRENCY > 1, and then lower DB_POOL_MAX so that
+// WEB_CONCURRENCY × DB_POOL_MAX stays within the budget (see README/config.ts).
+const workers = config.webConcurrency ?? 1

 if (workers > 1 && cluster.isPrimary) {
   // Migrations already ran once before this process started (the container CMD
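
How the worker count resolves can be sketched without forking any processes. `parseWebConcurrencySketch` is a hypothetical stand-in for the `parseWebConcurrency` function in `config.ts`, which this diff does not show. The sketch assumes that unset or invalid input yields `null` (the tests confirm only the unset case) and that valid input is clamped to `[1, 64]` as the README states.

```typescript
// Hypothetical stand-in for parseWebConcurrency in config.ts (not shown in the diff).
// Assumption: unset or invalid input yields null; valid input is clamped to [1, 64].
function parseWebConcurrencySketch(raw: string | undefined): number | null {
  if (raw === undefined) return null;
  const value = Number.parseInt(raw, 10);
  if (!Number.isFinite(value) || value < 1) return null;
  return Math.min(value, 64);
}

// Mirrors `config.webConcurrency ?? 1` in index.ts: null means a single worker.
function resolveWorkers(raw: string | undefined): number {
  return parseWebConcurrencySketch(raw) ?? 1;
}

console.log(resolveWorkers(undefined)); // 1
console.log(resolveWorkers('4')); // 4
console.log(resolveWorkers('999')); // 64
```

Keeping `null` as the "unset" signal in the config and applying the default in the entry point is what lets this commit change the default from the CPU count to 1 with a one-line edit.
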

apps/boekhouding-backend/test/cov/config.cov.test.ts

Lines changed: 8 additions & 8 deletions
@@ -187,38 +187,38 @@ describe('config — parseTrustProxy', () => {
 describe('config — db pool (parsePositiveInt with clamping)', () => {
   it('uses safe defaults when the pool env vars are unset', async () => {
     const config = await loadConfig()
-    expect(config.db).toEqual({ max: 10, connectTimeout: 10, idleTimeout: 30 })
+    expect(config.db).toEqual({ max: 15, connectTimeout: 10, idleTimeout: 30 })
   })

   it('accepts a valid override within range', async () => {
-    process.env.DB_POOL_MAX = '20'
+    process.env.DB_POOL_MAX = '12'
     process.env.DB_CONNECT_TIMEOUT = '5'
     process.env.DB_IDLE_TIMEOUT = '120'
     const config = await loadConfig()
-    expect(config.db).toEqual({ max: 20, connectTimeout: 5, idleTimeout: 120 })
+    expect(config.db).toEqual({ max: 12, connectTimeout: 5, idleTimeout: 120 })
   })

   it('falls back to the default for a non-numeric value', async () => {
     process.env.DB_POOL_MAX = 'abc'
     const config = await loadConfig()
-    expect(config.db.max).toBe(10)
+    expect(config.db.max).toBe(15)
   })

   it('falls back to the default for a value below 1 (e.g. 0)', async () => {
     process.env.DB_POOL_MAX = '0'
     const config = await loadConfig()
-    expect(config.db.max).toBe(10)
+    expect(config.db.max).toBe(15)
   })

-  it('clamps a value above the maximum (pool capped at 100)', async () => {
+  it('clamps a value above the per-user cap (pool capped at 20)', async () => {
     process.env.DB_POOL_MAX = '500'
     const config = await loadConfig()
-    expect(config.db.max).toBe(100)
+    expect(config.db.max).toBe(20)
   })
 })

 describe('config — webConcurrency (parseWebConcurrency)', () => {
-  it('returns null when WEB_CONCURRENCY is unset (entry point defaults to CPU count)', async () => {
+  it('returns null when WEB_CONCURRENCY is unset (entry point defaults to 1 worker)', async () => {
     const config = await loadConfig()
     expect(config.webConcurrency).toBeNull()
   })
