Production-oriented, local-first reference architecture for real-time fraud detection using:
- Redpanda for event transport (Kafka API)
- RisingWave for continuous SQL and materialized views
- Grafana for operational dashboards
- Python producer for synthetic banking event generation
Designed for platform engineering and fraud analytics teams who need a runnable streaming baseline with clear lineage, observability, and extensible detection rules.
This repository simulates a banking fraud environment end-to-end:
- Producer emits transactions, logins, card events, alerts, and KYC updates.
- Redpanda topics buffer and distribute events.
- RisingWave consumes topics into source tables and computes staged + detection + risk views.
- Grafana queries RisingWave directly for live fraud operations dashboards.
The pipeline includes signal logic for:
- Velocity bursts
- Geographic impossibility
- Account takeover indicators
- Card-not-present spikes
- Brute-force login storms
- Structuring / smurfing
- Correlated multi-alert bursts
Domain-separated input topics feed the pipeline. Malformed events are routed to a Dead Letter Queue (dlq.malformed_events) rather than dropped.
Four SQL layers transform raw Redpanda events into investigation-ready insights: Sources → Staging MVs (enrichment & KYC joins) → Signal MVs (business logic & windowed aggregations) → Risk/Case MVs (per-account risk scores and open case alerts).
flowchart LR
subgraph Producer["Python Producer"]
E1[transactions]
E2[login_events]
E3[card_events]
E4[alert_events]
E5[kyc_profile_events]
end
subgraph Broker["Redpanda"]
T[(Kafka Topics)]
DLQ[(fraud_dlq)]
end
subgraph Stream["RisingWave"]
S1[Sources]
S2[Staging MVs]
S3[Fraud Signal MVs]
S4[Risk + Case MVs]
end
subgraph Observe["Grafana"]
D[Fraud Ops Dashboards]
end
E1 --> T
E2 --> T
E3 --> T
E4 --> T
E5 --> T
T --> S1 --> S2 --> S3 --> S4 --> D
T -. publish errors .-> DLQ
For deeper system design, see:
docs/architecture.mddocs/data_flow.mddocs/components.md
.
├── docker-compose.yml # Service orchestration
├── Makefile # Developer workflow targets
├── fraud_detection/ # dbt project for RisingWave models
│ ├── models/ # dbt models (sources, staging, signals, risk, cases)
│ ├── dbt_project.yml # dbt project configuration
│ └── profiles.yml # RisingWave connection profile
├── producers/ # Python event producer
│ ├── generators/ # Synthetic event generation modules
│ ├── tests/ # Unit tests for generators + producer logic
│ └── main.py # Multi-threaded producer runtime
├── grafana/ # Provisioned datasource + dashboards
├── scripts/ # Health check and utility scripts
└── docs/ # Architecture + ops documentation
- Docker Engine + Docker Compose v2
make
git clone https://github.com/alwyndsouza/rp-dbt-rw-fraud-monitor
cd rp-dbt-rw-fraud-monitor
make upWait 90–120 seconds, then validate:
make validate- Redpanda Console: http://localhost:8080
- RisingWave Dashboard: http://localhost:5691
- Grafana: http://localhost:3000 (
admin/adminby default) - Producer health: http://localhost:8001/health
make status # container health + key view counts
make logs # follow all logs
make logs-producer # producer logs only
make kpis # latest KPI rows
make risk # critical risk accounts
make cases # open fraud investigation cases
make psql # SQL shell to RisingWaveEasiest - run all CI checks at once:
make ciIndividual commands:
# Run pytest (from producers directory with uv)
cd producers && uv run pytest tests -q && cd ..
# Run ruff linting
uv run ruff check producers
# Run SQL validation checks
python3 scripts/ci_sql_checks.py
# Run dbt tests (requires Docker + RisingWave running)
docker run --rm \
--network rp-dbt-rw-fraud-monitor_fraud-net \
-v $(pwd)/fraud_detection:/dbt \
-e RISINGWAVE_HOST=risingwave \
python:3.11-slim \
sh -c 'pip install dbt-risingwave==1.9.7 --quiet 2>&1 >/dev/null && cd /dbt && dbt test --profiles-dir .'Prerequisites:
- Install dependencies:
uv sync --group dev(root) andcd producers && uv sync --extra dev - Ensure Docker network
rp-dbt-rw-fraud-monitor_fraud-netexists (runmake upfirst)
Configuration is environment-variable driven (via .env and compose overrides).
Common knobs:
FRAUD_RATE(default0.10)TRANSACTION_RATE(default20)LOGIN_RATE,CARD_RATE,ALERT_RATE,KYC_RATECUSTOMER_POOL_SIZE(default500)STRUCTURING_THRESHOLD(default10000)MINUTE_TICK_ENABLEDand minute tick batch sizesALERT_WORKERSandALERT_QUEUE_SIZEfor bounded reactive alert processingREDPANDA_SECURITY_PROTOCOL,REDPANDA_SASL_*for secured broker auth in shared/prod environments
Use .env.example as baseline. make up will create .env if missing.
This repo is optimized for local and sandbox environments; for production:
- Pin versions instead of
:latestimages indocker-compose.yml. - Externalize secrets (Grafana admin credentials, broker auth).
- Enable broker security (TLS/SASL) and network segmentation.
- Use managed storage + backups for Redpanda/RisingWave/Grafana state.
- Set resource limits / autoscaling based on event throughput SLOs.
- Run CI gates (lint/tests + SQL checks) on every PR.
A practical migration path is Kubernetes with:
- StatefulSets for broker/stream DB
- sealed-secrets/external-secrets
- Prometheus + alerting for lag, MV freshness, and DLQ growth
- Producer liveness endpoint:
/health - End-to-end smoke test:
scripts/check_pipeline.sh(make validate) - Data-quality guardrails:
make dq(freshness, duplicates, null-key checks) - Grafana dashboards pre-provisioned from
grafana/dashboards/
Recommended SLO indicators:
- event ingest lag
- materialized-view freshness
- fraud signal latency
- DLQ publish failure rate
- critical-risk case backlog
docs/architecture.md— detailed system architecturedocs/data_flow.md— lineage, transformations, and model layersdocs/components.md— component-by-component referencedocs/troubleshooting.md— common failure modes and fixesdocs/runbook.md— operational proceduresdocs/fraud_patterns.md— business + regulatory context for each signal
See CONTRIBUTING.md for:
- coding standards
- branch/PR strategy
- review checklist
GitHub Actions CI runs:
ruffandpyteston Python 3.11- dbt model compilation and validation
- smoke checks for compose config and data-quality artifacts
Workflow file: .github/workflows/ci.yml.

