v2.1.0 — opt-in public-GitHub-activity liveness signal

@oedokumaci oedokumaci released this 08 Jun 16:04

Highlights

Adds an opt-in second-chance liveness signal. When CHECK_PUBLIC_ACTIVITY=true and a heartbeat interval has elapsed, the switch consults your public GitHub activity feed before advancing state — recent non-bot events on any repo other than this one keep the switch ALIVE for one more cycle. Off by default; zero behaviour change unless you explicitly enable it.
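As a rough illustration of the check described above, the sketch below scans an already-fetched events list for a recent, non-bot event on a repo other than the switch's own. The function name, its parameters, and the default `[bot]`-suffix filter are assumptions made for illustration; only the `created_at`, `repo.name`, and `actor.login` fields come from GitHub's events payload, and the real logic in dead_mans_switch.py may differ.

```python
from datetime import datetime, timedelta, timezone


def has_recent_external_activity(events, own_repo, window, now=None, is_bot=None):
    """Return True if any event counts as a liveness signal (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Assumed default filter: GitHub App accounts have logins ending in "[bot]".
    is_bot = is_bot or (lambda login: login.endswith("[bot]"))
    cutoff = now - window
    for ev in events:
        try:
            # GitHub timestamps look like "2024-06-08T12:00:00Z".
            created = datetime.fromisoformat(ev["created_at"].replace("Z", "+00:00"))
            repo = ev["repo"]["name"]
            login = ev["actor"]["login"]
        except (KeyError, TypeError, ValueError, AttributeError):
            continue  # malformed event: skip it and keep scanning
        if created < cutoff or repo == own_repo:
            continue  # too old, or activity on the switch's own repo
        # The filter call sits outside the try block so that a bug in the
        # filter surfaces as an exception instead of being swallowed.
        if is_bot(login):
            continue
        return True
    return False
```

Called with the events list, the switch repo's `owner/name`, and a `timedelta` matching the heartbeat interval, it returns `True` as soon as one qualifying event is found.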

Implements docs/PLAN_liveness.md. Went through 4 rounds of multi-agent review (14 reviewers total) before merge.

Why

If you commit frequently to other repos but rarely to the DMS repo, the v2.0 commit-only liveness check fires false-positive warnings. The new override consults your public activity feed and rescues the switch when you're alive-but-quiet-here.

Setup (when you want it on)

  1. Create a fine-grained PAT with a 1-year expiry. Leave every permission checkbox at No access; the endpoint is the public events feed.
  2. Add it as repo secret GH_ACTIVITY_TOKEN.
  3. Add repo variable CHECK_PUBLIC_ACTIVITY=true.
  4. (Optional) GH_USERNAME, BOT_AUTHOR_PATTERNS, BOT_MESSAGE_PATTERNS.
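The variables from the steps above might be read along the following lines. This is a minimal sketch, not the project's actual code: `parse_strict_bool` and `load_config` are hypothetical names, the assumption that only the exact strings `true` and `false` are accepted is inferred from the strict-parsing note under Security posture, and the `github_pat_` prefix check is only one plausible form of token-shape validation.

```python
import os


def parse_strict_bool(name, raw):
    """Accept only the exact strings 'true' and 'false' (assumed rule)."""
    if raw == "true":
        return True
    if raw == "false":
        return False
    raise ValueError(f"{name} must be exactly 'true' or 'false', got {raw!r}")


def load_config(env):
    """Build a config dict from an environment-like mapping (hypothetical helper)."""
    enabled = parse_strict_bool(
        "CHECK_PUBLIC_ACTIVITY", env.get("CHECK_PUBLIC_ACTIVITY", "false")
    )
    token = env.get("GH_ACTIVITY_TOKEN", "")
    # Assumed shape check: fine-grained PATs start with "github_pat_".
    if enabled and not token.startswith("github_pat_"):
        raise ValueError("GH_ACTIVITY_TOKEN is missing or not a fine-grained PAT")
    return {
        "enabled": enabled,
        "token": token,
        "username": env.get("GH_USERNAME"),  # optional; None when unset
    }


if __name__ == "__main__":
    print(load_config(os.environ)["enabled"])
```

Rejecting near-misses such as `TRUE` at startup means a typo in the repo variable fails loudly rather than leaving the feature quietly off.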

See the README's Optional: GitHub activity as fallback liveness signal section for the click-by-click walkthrough, hard limitations, and threat model.

Security posture

  • Custom urllib opener installs no HTTPRedirectHandler (Authorization-header leak defense) but DOES install HTTPErrorProcessor + HTTPDefaultErrorHandler so 4xx/5xx raise HTTPError and route to the fail-closed branch
  • Regex inputs capped at 4 KB (ReDoS defense on long PR/issue bodies)
  • Per-event errors caught and skipped; the bot-filter call sits OUTSIDE that try, so filter-logic bugs propagate (loud failure, not silent over-fire)
  • Fail-closed network branch catches URLError (incl. HTTPError), JSONDecodeError, TimeoutError, AND http.client.HTTPException (the last doesn't subclass URLError)
  • Generic ::warning:: text on fail-closed (no detail leaked as a token-expiry / rate-limit timing oracle)
  • Token shape validated at startup; CHECK_PUBLIC_ACTIVITY is strict-parsed (yes, 1, and TRUE all raise)
  • Schema-drift guard: a non-list 200 body also fails closed
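Several of the points above can be sketched with the standard library alone. The following is an illustrative approximation, not the code shipped in dead_mans_switch.py: the function names, the warning text, the 10-second timeout, and the reading of "4 KB" as 4096 characters are all assumptions.

```python
import http.client
import json
import re
import urllib.error
import urllib.request

MAX_REGEX_INPUT = 4096  # assumed interpretation of the 4 KB cap


def build_opener_without_redirects():
    """Opener with no HTTPRedirectHandler, so the Authorization header is never re-sent."""
    # A bare OpenerDirector starts with no handlers; build_opener() would
    # add HTTPRedirectHandler by default.
    opener = urllib.request.OpenerDirector()
    for handler in (
        urllib.request.HTTPSHandler(),
        urllib.request.HTTPErrorProcessor(),      # routes non-2xx responses to error handlers
        urllib.request.HTTPDefaultErrorHandler(),  # raises HTTPError for them
    ):
        opener.add_handler(handler)
    return opener


def fetch_events_fail_closed(url, token, opener, timeout=10):
    """Return the events list, or None on any network, parse, or schema failure."""
    req = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    try:
        with opener.open(req, timeout=timeout) as resp:
            body = json.loads(resp.read())
    except (urllib.error.URLError,        # includes HTTPError
            json.JSONDecodeError,
            TimeoutError,
            http.client.HTTPException):   # not a URLError subclass
        print("::warning::public activity check unavailable")  # generic on purpose
        return None
    if not isinstance(body, list):  # schema-drift guard: a 200 with a non-list body
        print("::warning::public activity check unavailable")
        return None
    return body


def capped_search(pattern, text):
    """Run a regex against at most MAX_REGEX_INPUT characters of text."""
    return re.search(pattern, text[:MAX_REGEX_INPUT]) is not None
```

Returning `None` for every failure mode lets the caller treat "could not check" the same as "no activity found", which is the fail-closed behaviour the list describes.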

Hard invariants preserved

  • ALREADY_DECLARED_DEAD and DISARMED still take precedence — override gate runs only if interval has elapsed AND feature enabled AND not a manual dispatch
  • Manual dispatch short-circuits the API call entirely (no rate-limit burn on every "Run workflow" click; config validation still runs)
  • Heartbeat intervals longer than 30 days are clamped to GitHub's events-feed ceiling without failing; a ::notice:: records the clamp
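The precedence rules above can be expressed as a small gate function. This is a sketch under assumptions: the state names come from these notes, but the `State` enum, `should_consult_activity`, `clamp_lookback`, and the notice wording are hypothetical, and the 30-day constant is taken from the clamp bullet.

```python
from datetime import timedelta
from enum import Enum


class State(Enum):
    ALIVE = "ALIVE"
    ALREADY_DECLARED_DEAD = "ALREADY_DECLARED_DEAD"
    DISARMED = "DISARMED"


EVENTS_FEED_CEILING = timedelta(days=30)  # ceiling as stated in the notes


def should_consult_activity(state, interval_elapsed, feature_enabled, manual_dispatch):
    """Decide whether the override gate may call the events API at all."""
    # Terminal states take precedence over the override.
    if state in (State.ALREADY_DECLARED_DEAD, State.DISARMED):
        return False
    # All three conditions must hold; a manual dispatch never reaches the API.
    return interval_elapsed and feature_enabled and not manual_dispatch


def clamp_lookback(interval):
    """Limit the lookback window to the events-feed ceiling, logging a notice."""
    if interval > EVENTS_FEED_CEILING:
        print("::notice::heartbeat interval exceeds the events-feed ceiling; clamping")
        return EVENTS_FEED_CEILING
    return interval
```

Keeping the gate as a pure function of four inputs makes each invariant easy to pin down with a one-line test.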

Stats

  • Code: +423 LOC in dead_mans_switch.py
  • Tests: +1351 LOC in tests/test_liveness.py (332 passing, 1 skipped integration test)
  • Docs: +221 LOC in README
  • Workflow: 1 added line (CHECK_PUBLIC_ACTIVITY: "false" default)
  • 100% line + branch coverage maintained (enforced via --cov-fail-under=100)
  • No new third-party dependencies (stdlib urllib, json, http.client, re, datetime only)

Upgrade

If you don't want the feature, do nothing; it's off by default. To enable it, follow the Setup steps above together with the README walkthrough.

Full Changelog: v2.0.0...v2.1.0