Generate a local monitoring snapshot after building the DuckDB marts:
uv run python -m src.monitoring.snapshot --snapshot-date 2025-06-30
Default output:
artifacts/monitoring/snapshot_date=2025-06-30/monitoring_snapshot.json
artifacts/monitoring/snapshot_date=2025-06-30/monitoring_snapshot.md
The snapshot checks:
- DuckDB mart availability.
- Activation rate and activation mart freshness.
- Experiment support, complaint, and app-crash guardrails.
- Pricing exposure coverage, net margin, complaint rate, and human-review load.
- Pricing recommendation coverage.
- Activation batch score extract availability.
- API contract file readiness.
fail means the release should stop. warn means the project can still run, but
the result needs human review before a public release or ramp-up.
The Streamlit dashboard also includes a Monitoring tab that computes the same snapshot against the current DuckDB path and shows the overall status, status counts, attention items, and full check table.
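As a rough illustration of how an overall status and status counts could be derived from individual checks, here is a minimal sketch. The check names, the dictionary shape, and the rule that an empty check list counts as warn are assumptions for the example, not the snapshot module's actual schema.

```python
from collections import Counter

# Assumed severity order: fail outranks warn, which outranks pass.
SEVERITY = {"pass": 0, "warn": 1, "fail": 2}


def overall_status(checks: list) -> str:
    """Return the most severe status found across all checks."""
    if not checks:
        # Assumption: no evidence is a warning, not a silent pass.
        return "warn"
    return max((c["status"] for c in checks), key=SEVERITY.__getitem__)


def status_counts(checks: list) -> dict:
    """Count checks per status, always reporting all three statuses."""
    counts = Counter(c["status"] for c in checks)
    return {status: counts.get(status, 0) for status in SEVERITY}


# Hypothetical check results, for illustration only.
checks = [
    {"name": "duckdb_mart_availability", "status": "pass"},
    {"name": "activation_mart_freshness", "status": "warn"},
    {"name": "api_contract_files", "status": "pass"},
]

print(overall_status(checks))  # warn
print(status_counts(checks))   # {'pass': 2, 'warn': 1, 'fail': 0}
```

Taking the maximum severity means a single fail always surfaces at the top level, which matches the "fail means stop" rule above.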
The repository includes a GitHub Actions workflow at
.github/workflows/monitoring-snapshot.yml that can run manually or on a weekly
schedule. It rebuilds a deterministic synthetic warehouse, trains the activation
model artifact, generates batch activation scores, writes the product monitoring
snapshot, writes the activation model monitoring report, and uploads the
monitoring/scoring outputs as workflow artifacts.
Run it from GitHub Actions with Monitoring Snapshot > Run workflow before a portfolio refresh or public demo review. The workflow is intentionally synthetic: it proves the operational path without requiring real customer data, secrets, or cloud warehouse credentials.
On a first run, the activation model monitoring report can return warn for
score-distribution drift because no previous score extract exists yet. In a live
setup, the previous successful artifact or warehouse score table would be passed
as the reference extract.
After generating daily activation scores, create a model monitoring report:
uv run python -m src.monitoring.model_report --report-date 2025-06-30
For score-distribution drift, pass a previous score extract as the reference:
uv run python -m src.monitoring.model_report `
--score-path artifacts/scoring/activation/score_date=2025-06-30/customer_scores_daily.parquet `
--reference-score-path artifacts/scoring/activation/score_date=2025-06-23/customer_scores_daily.parquet `
--report-date 2025-06-30
Default output:
artifacts/monitoring/model_activation/report_date=2025-06-30/activation_model_monitoring.json
artifacts/monitoring/model_activation/report_date=2025-06-30/activation_model_monitoring.md
The model report checks probability bounds, score volume, targeting rate,
vulnerable-customer review load, threshold validity, and score-distribution PSI.
Use fail as a release stop, and use warn as a human-review trigger before a
rollout or public demo refresh.
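For readers unfamiliar with PSI (population stability index), the sketch below shows one common way to compute it for probability scores. The equal-width binning on [0, 1], the bin count, and the epsilon floor are assumptions for illustration; the report module may bin and threshold differently.

```python
import math


def psi(reference: list, current: list, bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index over equal-width bins on [0, 1]."""
    if not reference or not current:
        raise ValueError("both score lists must be non-empty")

    def shares(scores):
        counts = [0] * bins
        for s in scores:
            # Clamp so that 1.0 (and any stray out-of-range value) stays in range.
            idx = max(0, min(int(s * bins), bins - 1))
            counts[idx] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(scores), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic scores, for illustration only.
reference_scores = [i / 100 for i in range(100)]
shifted_scores = [min(1.0, s + 0.3) for s in reference_scores]

print(psi(reference_scores, reference_scores))  # 0.0
print(psi(reference_scores, shifted_scores) > 0)  # True
```

A widely used rule of thumb reads PSI below 0.1 as stable and above 0.25 as a significant shift; treat those cut-offs as a convention, not as this project's configured thresholds.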
After loading neobank_ml.customer_scores_daily, render the warehouse-side score
monitoring query:
uv run python -m src.cloud.bigquery_score_monitoring_plan `
--score-date 2025-06-30 `
--project neobank-growth-platform-ross `
--dataset neobank_ml `
--location EU `
--min-rows 5000
The query checks scored-user volume, duplicate users, model-version count,
targeting rate, vulnerable-customer review load, probability bounds, and score
quantiles directly in BigQuery. Treat monitoring_status = fail as a release
stop and monitoring_status = warn as a human-review trigger.
The demo GCP score monitoring query was exercised on 2026-05-31 for score date
2025-06-30 and returned monitoring_status = pass: 5,000 scored users, 5,000
unique users, 1 model version, 1,390 targeted users, 27.80% targeting rate, 191
vulnerable-review users, 3.82% vulnerable-review rate, and activation
probabilities bounded from 0.0000 to 1.0000.
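The reported rates follow directly from the counts. A short check of the arithmetic, using only the figures quoted above (variable names are illustrative):

```python
# Counts quoted from the demo run for score date 2025-06-30.
scored_users = 5000
unique_users = 5000
targeted_users = 1390
vulnerable_review_users = 191

duplicate_users = scored_users - unique_users
targeting_rate = targeted_users / scored_users
vulnerable_review_rate = vulnerable_review_users / scored_users

print(duplicate_users)                  # 0
print(f"{targeting_rate:.2%}")          # 27.80%
print(f"{vulnerable_review_rate:.2%}")  # 3.82%
```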
For scheduled cloud monitoring, render Cloud Scheduler triggers for the Cloud Run Jobs. Deploy the runnable job image and Cloud Run Jobs first:
uv run python -m src.cloud.cloud_run_job_deploy_plan `
--project neobank-growth-platform-ross `
--project-number 319492039091 `
--region europe-west2 `
--bucket neobank-growth-platform-ross-raw `
--bq-location EU `
--bq-ml-dataset neobank_ml `
--bq-monitoring-dataset neobank_monitoring `
--score-date 2025-06-30 `
--users 5000 `
--months 6
Omit --score-date for a rolling daily schedule; keep it for the reproducible
portfolio demo run.
Then render the scheduler triggers:
uv run python -m src.cloud.cloud_run_scheduler_plan `
--project neobank-growth-platform-ross `
--project-number 319492039091 `
--run-region europe-west2 `
--scheduler-region europe-west2 `
--service-account-email neobank-scheduler@neobank-growth-platform-ross.iam.gserviceaccount.com
The default monitoring cadence runs scoring at 06:00 Europe/London and score monitoring at 06:30 Europe/London. The two triggers fire independently, so enforce the dependency in practice: schedule monitoring after the scoring job's usual completion window, and treat a missing or low-row score partition as a monitoring failure.
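The missing-or-low-row rule could look something like the sketch below. The function name and return shape are hypothetical, and the default threshold simply mirrors the --min-rows 5000 value used earlier; the actual monitoring job may implement this check in SQL instead.

```python
from typing import Optional, Tuple


def score_partition_status(row_count: Optional[int], min_rows: int = 5000) -> Tuple[str, str]:
    """Return (status, reason) for the score partition of a given date."""
    if row_count is None or row_count == 0:
        # Scoring has not finished (or failed), so monitoring must not pass.
        return "fail", "score partition is missing or empty"
    if row_count < min_rows:
        return "fail", f"score partition has {row_count} rows, expected at least {min_rows}"
    return "pass", f"score partition has {row_count} rows"


print(score_partition_status(None))  # ('fail', 'score partition is missing or empty')
print(score_partition_status(5000))  # ('pass', 'score partition has 5000 rows')
```

Failing on a missing partition is what makes the 06:30 job safe to run on a fixed schedule: if scoring overruns, monitoring reports the gap instead of silently checking yesterday's data.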
The demo GCP schedules were resumed and verified on 2026-05-31:
gcloud scheduler jobs list --location=europe-west2
Expected active schedules:
neobank-daily-activation-scoring 0 6 * * * (Europe/London) ENABLED
neobank-daily-score-monitoring 30 6 * * * (Europe/London) ENABLED
Enabled log-based alert policies watch Cloud Run Job and private API service error logs:
gcloud monitoring policies list --format="table(displayName,enabled)"
Expected alert policies:
Neobank Cloud Run job failure alert True
Neobank API service failure alert True
The scheduled job alert filter is:
resource.type="cloud_run_job"
resource.labels.job_name=~"neobank-(activation-score-load|score-monitoring)"
severity>=ERROR
The private API service alert filter is:
resource.type="cloud_run_revision"
resource.labels.service_name="neobank-api"
severity>=ERROR
A project budget alert is configured as the cost-control guardrail for the demo GCP project. Budget alerts are notifications, not hard spend caps, so pause the schedules if an alert fires unexpectedly.
After D7 outcomes have matured for a scored cohort, generate the calibration report by joining score extracts to realised activation labels:
uv run python -m src.monitoring.calibration_report `
--score-path artifacts/scoring/activation/score_date=2025-06-30/customer_scores_daily.parquet `
--db neobank.duckdb `
--report-date 2025-07-07
You can also provide a label extract with --label-path when labels are exported
from a warehouse table. The file must contain user_id and activated_d7.
Default output:
artifacts/monitoring/model_activation_calibration/report_date=2025-07-07/activation_calibration_monitoring.json
artifacts/monitoring/model_activation_calibration/report_date=2025-07-07/activation_calibration_monitoring.md
The calibration report checks matched label coverage, sample size, expected calibration error, Brier score, portfolio prediction bias, and the largest segment calibration gap across income segment, signup channel, and region. Run this after the prediction window closes; before then, use the score-distribution report as the early-warning signal.
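To make the calibration metrics concrete, here is a minimal sketch of the Brier score, expected calibration error (ECE), and portfolio prediction bias. The equal-width binning and the bin count are assumptions; the report module may use different binning, and the sample data is synthetic.

```python
def brier_score(probs: list, labels: list) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs: list, labels: list, bins: int = 10) -> float:
    """Weighted mean gap between predicted and observed rate per score bin."""
    counts = [0] * bins
    prob_sums = [0.0] * bins
    label_sums = [0.0] * bins
    for p, y in zip(probs, labels):
        idx = max(0, min(int(p * bins), bins - 1))  # keep 1.0 in the last bin
        counts[idx] += 1
        prob_sums[idx] += p
        label_sums[idx] += y
    n = len(probs)
    ece = 0.0
    for count, ps, ls in zip(counts, prob_sums, label_sums):
        if count:  # skip empty bins
            ece += (count / n) * abs(ps / count - ls / count)
    return ece


def prediction_bias(probs: list, labels: list) -> float:
    """Mean predicted probability minus observed activation rate."""
    return sum(probs) / len(probs) - sum(labels) / len(labels)


# Synthetic example: predicted D7 activation probability vs. activated_d7.
probs = [0.1, 0.1, 0.9, 0.9]
labels = [0, 0, 1, 1]

print(round(brier_score(probs, labels), 4))                 # 0.01
print(round(expected_calibration_error(probs, labels), 4))  # 0.1
print(round(prediction_bias(probs, labels), 4))             # 0.0
```

In this example the portfolio bias is zero even though every bin is off by 0.1, which is why the report tracks ECE and segment gaps alongside the aggregate bias.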
Use this lightweight release gate before refreshing public screenshots or ramping a synthetic rollout:
- Run dbt build.
- Generate batch activation scores.
- Run the monitoring snapshot.
- Run the score-distribution report against a recent reference extract.
- After D7 labels mature, run the calibration report.
Stop the release when any report returns fail. Review the affected check,
regenerate upstream data only if the failure is caused by stale local artifacts,
and document the decision before continuing. Treat warn as a human-review
state: acceptable for a demo when explained, but not for an unattended rollout.
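The gate logic can be summarised in a few lines. This is a sketch under assumptions: the report names, the function name, and the three outcomes ("stop", "review", "proceed") are illustrative and not part of the repository.

```python
def release_decision(report_statuses: dict, unattended: bool) -> str:
    """Map per-report statuses to a release decision."""
    statuses = set(report_statuses.values())
    if not statuses <= {"pass", "warn", "fail"}:
        # Assumption: an unrecognised status is handled conservatively.
        return "stop"
    if "fail" in statuses:
        return "stop"
    if "warn" in statuses:
        # warn is acceptable for an explained demo, never for an unattended rollout.
        return "stop" if unattended else "review"
    return "proceed"


# Hypothetical report statuses, for illustration only.
reports = {
    "monitoring_snapshot": "pass",
    "score_distribution": "warn",
    "calibration": "pass",
}

print(release_decision(reports, unattended=False))  # review
print(release_decision(reports, unattended=True))   # stop
```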