
feat: built-in alert de-dup / cooldown for webhook channels (kill-switch-cf) #20

Description

@mikeumus

Priority: HIGH.

With a 5-minute cron and daily-cumulative thresholds (requests, neurons, and rows accumulate over the UTC day), a threshold that has been crossed stays crossed for the rest of the day, so the same alert re-fires on every run: once every 5 minutes, up to 288 times per day. PagerDuty deduplicates on its own via dedup_key, but Discord, Slack, and custom webhooks do not, and we got an all-day Slack alert stream.
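To make the arithmetic concrete, here is a rough sketch of how many times one crossed threshold would alert in a day without any cooldown. The function name and the idea of indexing cron runs from 0 at UTC midnight are illustrative assumptions, not code from kill-switch-cf.

```typescript
// Assumption: the cron fires every 5 minutes, starting at UTC midnight.
const CRON_INTERVAL_MIN = 5;
const RUNS_PER_DAY = (24 * 60) / CRON_INTERVAL_MIN; // 288 runs per UTC day

// Hypothetical helper: number of alerts sent for a single threshold when it is
// first crossed at cron run `firstCrossedRun` (0-based) and nothing suppresses
// repeats. Every later run that day sees the same cumulative total still above
// the threshold, so every later run alerts again.
function alertsWithoutCooldown(firstCrossedRun: number): number {
  return Math.max(0, RUNS_PER_DAY - firstCrossedRun);
}
```

A threshold crossed on the first run of the day gives the worst case of 288 alerts; one crossed at noon UTC (run 144) still produces 144.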

The managed monitoring-engine already has a 6h cooldown; the self-hosted kill-switch-cf should ship the same. We implemented it with a KV namespace keyed by a digit-stripped signature of the violation set (so that the changing numbers in alert messages don't defeat the key), with TTL = cooldown and PagerDuty exempt.
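A minimal sketch of that approach, for discussion only. It is not the code from the reference implementation: `KVLike`, `violationSignature`, `shouldSend`, the `cooldown:` key prefix, and the `"pagerduty"` channel name are all hypothetical, and the digit-stripping regex is one possible choice. The only KV behavior assumed is `get` plus `put` with `expirationTtl` (in seconds).

```typescript
// Minimal subset of the Workers KV interface used here; a real KVNamespace
// binding offers these methods, but this local type keeps the sketch self-contained.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// 6h, chosen to match the cooldown of the managed monitoring-engine.
const COOLDOWN_SECONDS = 6 * 60 * 60;

// Hypothetical signature: drop every number (including separators such as
// "1,200" or "12.5"), normalize whitespace, and sort, so the key depends on
// which thresholds are violated and not on the current counts.
function violationSignature(messages: string[]): string {
  return messages
    .map((m) => m.replace(/\d[\d.,]*/g, "").replace(/\s+/g, " ").trim())
    .sort()
    .join("|");
}

// Returns true if the alert should be sent, and records the cooldown marker.
async function shouldSend(
  kv: KVLike,
  channel: string,
  messages: string[],
): Promise<boolean> {
  // PagerDuty is exempt: it already deduplicates via dedup_key.
  if (channel === "pagerduty") return true;

  const key = `cooldown:${channel}:${violationSignature(messages)}`;
  if ((await kv.get(key)) !== null) return false; // still cooling down

  // The marker expires on its own, so no cleanup job is needed.
  await kv.put(key, "1", { expirationTtl: COOLDOWN_SECONDS });
  return true;
}
```

Because the key is per channel and per violation set, a new kind of violation still alerts immediately while repeats of an existing one are suppressed until the TTL expires. KV keys are length-limited, so a long signature would probably need to be hashed before use.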

Reference implementation (built + running in prod at Divinci): Divinci-AI/cloudflare-billing-kill-switch.
From FEEDBACK-from-divinci-deployment.md — real-world findings from the Divinci self-hosted deployment, 2026-06-17.

Labels: area:alerting, enhancement, priority:high
