Real-Time Fraud Detection Streaming Reference

Production-oriented, local-first reference architecture for real-time fraud detection using:

Redpanda for event transport (Kafka API)
RisingWave for continuous SQL and materialized views
Grafana for operational dashboards
Python producer for synthetic banking event generation

Designed for platform engineering and fraud analytics teams who need a runnable streaming baseline with clear lineage, observability, and extensible detection rules.

1) Project Overview

This repository simulates a banking fraud environment end-to-end:

Producer emits transactions, logins, card events, alerts, and KYC updates.
Redpanda topics buffer and distribute events.
RisingWave consumes topics into source tables and computes staging, detection, and risk views.
Grafana queries RisingWave directly for live fraud operations dashboards.
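
As a rough illustration of the first step, a producer-style generator might look like the sketch below. The field names, the `make_transaction` helper, and the way `fraud_rate` is applied are assumptions made for illustration; they are not taken from the repository's generator modules.

```python
import random
import uuid
from datetime import datetime, timezone


def make_transaction(rng: random.Random,
                     customer_pool_size: int = 500,
                     fraud_rate: float = 0.10) -> dict:
    """Build one synthetic transaction event (hypothetical schema)."""
    return {
        # Derive the UUID from the seeded RNG so runs are reproducible.
        "event_id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
        # Pick a customer from a fixed-size pool.
        "customer_id": f"cust_{rng.randrange(customer_pool_size):05d}",
        "amount": round(rng.uniform(1.0, 5000.0), 2),
        # Label roughly `fraud_rate` of events as fraudulent.
        "is_fraud": rng.random() < fraud_rate,
        "event_time": datetime.now(timezone.utc).isoformat(),
    }
```

Passing a seeded `random.Random` keeps the sketch deterministic apart from the timestamp, which makes generator logic easier to unit test.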

The pipeline includes signal logic for:

Velocity bursts
Geographic impossibility
Account takeover indicators
Card-not-present spikes
Brute-force login storms
Structuring / smurfing
Correlated multi-alert bursts
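
To make one of these signals concrete: geographic impossibility is commonly checked by computing the travel speed implied by two consecutive events on the same account. The sketch below is a minimal Python illustration only. The repository expresses its signals as RisingWave SQL, and the 900 km/h threshold and function names here are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumed threshold, roughly airliner cruise speed


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_geo_impossible(prev, curr, max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """prev and curr are (lat, lon, epoch_seconds) tuples for one account."""
    dist_km = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600.0
    if hours <= 0:
        # Same (or out-of-order) timestamp: any distance at all is impossible.
        return dist_km > 0
    return dist_km / hours > max_kmh
```

For example, a Sydney event followed an hour later by a London event implies a speed far above any plausible threshold, while Sydney to Melbourne in two hours does not.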

2) High-Level Architecture

Redpanda Topic Structure & Data Flow

Domain-separated input topics feed the pipeline. Malformed events are routed to a Dead Letter Queue (dlq.malformed_events) rather than dropped.
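
The routing decision described above can be sketched as a pure function. This is an illustration under stated assumptions: the `REQUIRED_FIELDS` set and the `route_event` helper are hypothetical, and where in the pipeline the real check runs is not specified here. Only the DLQ topic name comes from the text.

```python
import json

DLQ_TOPIC = "dlq.malformed_events"  # topic name as given in the text above
REQUIRED_FIELDS = {"event_id", "event_time"}  # hypothetical minimal schema


def _to_dlq(raw: bytes, reason: str) -> tuple[str, dict]:
    """Wrap the original bytes with a reason so nothing is silently dropped."""
    return DLQ_TOPIC, {"reason": reason, "raw": raw.decode("utf-8", "replace")}


def route_event(raw: bytes, target_topic: str) -> tuple[str, dict]:
    """Return (topic, payload); malformed input is redirected to the DLQ."""
    try:
        event = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        return _to_dlq(raw, f"invalid JSON: {exc}")
    if not isinstance(event, dict):
        return _to_dlq(raw, "payload is not a JSON object")
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        return _to_dlq(raw, f"missing fields: {sorted(missing)}")
    return target_topic, event
```

Keeping the raw payload and a reason alongside each rejected event makes DLQ contents easier to inspect and replay later.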

RisingWave Materialized View Pipeline

Four SQL layers transform raw Redpanda events into investigation-ready insights: Sources → Staging MVs (enrichment & KYC joins) → Signal MVs (business logic & windowed aggregations) → Risk/Case MVs (per-account risk scores and open case alerts).

flowchart LR
    subgraph Producer["Python Producer"]
      E1[transactions]
      E2[login_events]
      E3[card_events]
      E4[alert_events]
      E5[kyc_profile_events]
    end

    subgraph Broker["Redpanda"]
      T[(Kafka Topics)]
      DLQ[(fraud_dlq)]
    end

    subgraph Stream["RisingWave"]
      S1[Sources]
      S2[Staging MVs]
      S3[Fraud Signal MVs]
      S4[Risk + Case MVs]
    end

    subgraph Observe["Grafana"]
      D[Fraud Ops Dashboards]
    end

    E1 --> T
    E2 --> T
    E3 --> T
    E4 --> T
    E5 --> T
    T --> S1 --> S2 --> S3 --> S4 --> D
    T -. publish errors .-> DLQ
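
The Signal MV layer relies on windowed aggregations. As a plain-Python analogue (not the repository's SQL), the sketch below counts events per account inside a sliding time window and flags a velocity burst once a threshold is exceeded. The 60-second window, the threshold of 5, and the `VelocityDetector` class are assumptions.

```python
from collections import defaultdict, deque


class VelocityDetector:
    """Sliding-window event counter per account (illustrative only)."""

    def __init__(self, window_s: float = 60.0, max_events: int = 5):
        self.window_s = window_s
        self.max_events = max_events
        self._events = defaultdict(deque)  # account_id -> recent timestamps

    def observe(self, account_id: str, ts: float) -> bool:
        """Record an event at `ts` (epoch seconds, non-decreasing per account).

        Returns True when the account has more than `max_events` events
        inside the trailing window.
        """
        q = self._events[account_id]
        q.append(ts)
        # Evict timestamps that have fallen out of the window.
        while q and q[0] <= ts - self.window_s:
            q.popleft()
        return len(q) > self.max_events
```

A streaming database maintains the equivalent state incrementally inside a materialized view; the deque here plays the role of that window state.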

For deeper system design, see:

docs/architecture.md
docs/data_flow.md
docs/components.md

3) Repository Structure

.
├── docker-compose.yml             # Service orchestration
├── Makefile                       # Developer workflow targets
├── fraud_detection/               # dbt project for RisingWave models
│   ├── models/                    # dbt models (sources, staging, signals, risk, cases)
│   ├── dbt_project.yml            # dbt project configuration
│   └── profiles.yml               # RisingWave connection profile
├── producers/                     # Python event producer
│   ├── generators/                # Synthetic event generation modules
│   ├── tests/                     # Unit tests for generators + producer logic
│   └── main.py                    # Multi-threaded producer runtime
├── grafana/                       # Provisioned datasource + dashboards
├── scripts/                       # Health check and utility scripts
└── docs/                          # Architecture + ops documentation

4) Quick Start (Local)

Prerequisites

Docker Engine + Docker Compose v2
make

Bring up stack

git clone https://github.com/alwyndsouza/rp-dbt-rw-fraud-monitor
cd rp-dbt-rw-fraud-monitor
make up

Wait 90–120 seconds, then validate:

make validate

Service URLs

Redpanda Console: http://localhost:8080
RisingWave Dashboard: http://localhost:5691
Grafana: http://localhost:3000 (admin / admin by default)
Producer health: http://localhost:8001/health
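
Instead of waiting a fixed interval, the producer health URL can be polled until it responds. The sketch below uses only the standard library. It assumes that any 2xx response means healthy, since the response body format is not documented here, and the helper names are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        return False


def wait_until_healthy(url: str, attempts: int = 24, delay_s: float = 5.0,
                       probe=http_ok, sleep=time.sleep) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay_s` between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Calling `wait_until_healthy("http://localhost:8001/health")` with the defaults polls for roughly two minutes, which matches the upper end of the startup window mentioned above. The `probe` and `sleep` parameters exist so the loop can be tested without a network.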

5) Development Workflow

Core commands

make status         # container health + key view counts
make logs           # follow all logs
make logs-producer  # producer logs only
make kpis           # latest KPI rows
make risk           # critical risk accounts
make cases          # open fraud investigation cases
make psql           # SQL shell to RisingWave

Python tests / lint

The easiest option is to run all CI checks at once:

make ci

Individual commands:

# Run pytest (from producers directory with uv)
(cd producers && uv run pytest tests -q)

# Run ruff linting
uv run ruff check producers

# Run SQL validation checks
python3 scripts/ci_sql_checks.py

# Run dbt tests (requires Docker + RisingWave running)
docker run --rm \
  --network rp-dbt-rw-fraud-monitor_fraud-net \
  -v "$(pwd)/fraud_detection:/dbt" \
  -e RISINGWAVE_HOST=risingwave \
  python:3.11-slim \
  sh -c 'pip install dbt-risingwave==1.9.7 --quiet 2>&1 >/dev/null && cd /dbt && dbt test --profiles-dir .'

Prerequisites:

Install dependencies: run uv sync --group dev in the repository root, then cd producers && uv sync --extra dev
Ensure Docker network rp-dbt-rw-fraud-monitor_fraud-net exists (run make up first)

6) Configuration

Configuration is environment-variable driven (via .env and compose overrides).

Common knobs:

FRAUD_RATE (default 0.10)
TRANSACTION_RATE (default 20)
LOGIN_RATE, CARD_RATE, ALERT_RATE, KYC_RATE
CUSTOMER_POOL_SIZE (default 500)
STRUCTURING_THRESHOLD (default 10000)
MINUTE_TICK_ENABLED and minute tick batch sizes
ALERT_WORKERS and ALERT_QUEUE_SIZE for bounded reactive alert processing
REDPANDA_SECURITY_PROTOCOL, REDPANDA_SASL_* for secured broker auth in shared/prod environments

Use .env.example as baseline. make up will create .env if missing.
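
A minimal sketch of reading these knobs with their documented defaults is shown below. The `ProducerConfig` dataclass and `load_config` function are hypothetical names, and only the variables whose defaults are listed above are included; the repository's own loading code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProducerConfig:
    fraud_rate: float
    transaction_rate: int
    customer_pool_size: int
    structuring_threshold: float


def load_config(env: Mapping[str, str] = os.environ) -> ProducerConfig:
    """Read settings from an environment mapping, falling back to defaults."""
    cfg = ProducerConfig(
        fraud_rate=float(env.get("FRAUD_RATE", "0.10")),
        transaction_rate=int(env.get("TRANSACTION_RATE", "20")),
        customer_pool_size=int(env.get("CUSTOMER_POOL_SIZE", "500")),
        structuring_threshold=float(env.get("STRUCTURING_THRESHOLD", "10000")),
    )
    # Fail fast on values that cannot be interpreted as a probability.
    if not 0.0 <= cfg.fraud_rate <= 1.0:
        raise ValueError("FRAUD_RATE must be between 0 and 1")
    return cfg
```

Accepting the environment as a mapping keeps the function easy to test: pass a plain dict instead of mutating `os.environ`.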

7) Deployment Notes (Production)

This repo is optimized for local and sandbox environments. For production use:

Pin image versions instead of using :latest tags in docker-compose.yml.
Externalize secrets (Grafana admin credentials, broker auth).
Enable broker security (TLS/SASL) and network segmentation.
Use managed storage + backups for Redpanda/RisingWave/Grafana state.
Set resource limits / autoscaling based on event throughput SLOs.
Run CI gates (lint/tests + SQL checks) on every PR.

A practical migration path is Kubernetes with:

StatefulSets for broker/stream DB
sealed-secrets/external-secrets
Prometheus + alerting for lag, MV freshness, and DLQ growth

8) Observability & Operations

Producer liveness endpoint: /health
End-to-end smoke test: scripts/check_pipeline.sh (make validate)
Data-quality guardrails: make dq (freshness, duplicates, null-key checks)
Grafana dashboards pre-provisioned from grafana/dashboards/

Recommended SLO indicators:

event ingest lag
materialized-view freshness
fraud signal latency
DLQ publish failure rate
critical-risk case backlog
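
As one way to make an indicator such as materialized-view freshness measurable, the sketch below compares the newest row timestamp in a view against the current time. The 30-second budget and the function names are assumptions, not values defined by this repository.

```python
from datetime import datetime, timedelta, timezone


def freshness_seconds(latest_row_time: datetime, now: datetime) -> float:
    """Age of the newest row in seconds, clamped at zero for clock skew."""
    return max(0.0, (now - latest_row_time).total_seconds())


def breaches_slo(latest_row_time: datetime, now: datetime,
                 max_staleness: timedelta = timedelta(seconds=30)) -> bool:
    """True when the view is staler than the allowed budget."""
    return freshness_seconds(latest_row_time, now) > max_staleness.total_seconds()
```

In practice `latest_row_time` would come from a query such as the maximum event timestamp of a view, and the result would feed an alert rule or dashboard panel.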

9) Documentation Index

docs/architecture.md — detailed system architecture
docs/data_flow.md — lineage, transformations, and model layers
docs/components.md — component-by-component reference
docs/troubleshooting.md — common failure modes and fixes
docs/runbook.md — operational procedures
docs/fraud_patterns.md — business + regulatory context for each signal

10) Contributing

See CONTRIBUTING.md for:

coding standards
branch/PR strategy
review checklist

11) CI Pipeline

GitHub Actions CI runs:

ruff and pytest on Python 3.11
dbt model compilation and validation
smoke checks for compose config and data-quality artifacts

Workflow file: .github/workflows/ci.yml.
