When an alert fires, one command should tell you everything: what's failing (from alerts and logs), what changed, how they connect—across clusters, with history and learned patterns. That's kblame. That's the vision.
80% of Kubernetes outages trace to recent changes (Komodor, 2025). Yet today, finding them is a slog. Kubectl events expire in an hour. ArgoCD shows syncs, not details. Slack becomes your search engine. Forty-five minutes later: someone found a ConfigMap edit.
Our first release fills a gap: a unified change timeline across common Kubernetes resource types—images, ConfigMaps, Secrets, RBAC, HPA, NetworkPolicy, env vars, node events—in a single command. No open-source CLI has this today.
$ kblame -n payments --since 30m
CHANGES (last 30m)                                NAMESPACE: payments
────────────────────────────────────────────────────────────────────
TIME          KIND        NAME               CHANGE
14:21:03 UTC  Deployment  payments-service   image: v2.3.1 → v2.3.2
14:21:14 UTC  Event       payments-svc-7f..  Failed to pull image "v2.3.2"
14:22:01 UTC  ConfigMap   payments-config    updated by payments-team (2 fields)
14:25:30 UTC  HPA         payments-hpa       scaled 3 → 5 replicas
Single binary. Client-side today: no agents, no RBAC changes, no persistent storage. An in-cluster controller may be introduced in V3+ for persistent history and deeper correlation.
Here's how we're building toward the full vision:
| Version | Functionality | Status |
|---|---|---|
| V1 | What changed — unified change timeline across common Kubernetes resource types | Current |
| V2 | What changed + what's failing — logs, confirmed alert correlation | Planned |
| V3 | What changed + what's failing + how they connect — dependency-aware correlation, persistent history via in-cluster controller | Planned |
| V4 | Multi-cluster, web UI, learned patterns | Planned |
| V5 | Ideas fermenting... | |
Several tools address parts of the incident response workflow. We believe kblame brings these capabilities together cohesively in a single CLI:
| Tool | What It Does | Where It Falls Short |
|---|---|---|
| kubectl get events | Raw event stream | Expires after 1 hour. Events, not changes. Wall of noise. |
| kubectl rollout history | Deployment revision history | Single resource type only. No cross-resource view. |
| Sloop (Salesforce) | Event timeline web dashboard | Requires deploying a persistent service. Web UI, not CLI. |
| KHI (Google) | Audit log timeline visualization | Web app. Primarily GKE + Cloud Logging. |
| Robusta | In-cluster change tracking + alerts | Requires Helm installation, persistent cluster resources. |
| kubectl-blame | Field-level attribution (who edited field X) | Different problem — field audit, not change timeline. |
| Komodor | Commercial change intelligence | Proprietary, requires agent deployment, paid. |
| ArgoCD/Flux | GitOps sync history | Shows syncs, not what specifically changed or why it broke. |
kblame is different:
- CLI-native — run from terminal, pipe to jq, script it
- Client-side only — no agents, no RBAC changes, no persistent storage
- Multi-resource — images, ConfigMaps, Secrets, RBAC, HPA, NetworkPolicy, env vars, node events in one view
- Instant — no setup, no deployment, just run it
Pre-built binaries — download and run, no Go required:
| Platform | Architecture | Download | Status |
|---|---|---|---|
| macOS | ARM64 (Apple Silicon) | kblame-darwin-arm64 | Available |
| macOS | AMD64 (Intel) | kblame-darwin-amd64 | Available |
| Linux | AMD64 | kblame-linux-amd64 | Available |
| Linux | ARM64 | kblame-linux-arm64 | Available |
| Windows | AMD64 | kblame-windows-amd64.exe | Available |
From source (requires Go 1.22+):
go install github.com/abcdedf/kblame/cmd/kblame@latest
# Or clone and build
git clone https://github.com/abcdedf/kblame.git
cd kblame
make build
# Binary at bin/kblame
Copy bin/kblame to your PATH. To use as a kubectl plugin:
mv bin/kblame ~/.local/bin/kubectl-kblame
# Now run as: kubectl kblame
# Show changes in current namespace, last 30 minutes
kblame
# Show changes in a specific namespace, last 1 hour
kblame -n payments --since 1h
# All namespaces
kblame -A --since 15m
# Filter by resource kind
kblame -n payments --kind deployment,configmap
# JSON output (for scripting / piping to jq)
kblame -n payments --output json
# Markdown output (for Slack / GitHub issues)
kblame -n payments --output md
kblame can correlate changes with Prometheus AlertManager alerts. This feature exists in V1 but is intentionally experimental: it scores changes against alerts by timing and labels, and it is not the primary value yet. Confirmed alert correlation is planned for V2, and dependency-aware correlation for V3.
# Correlate changes with active alerts
kblame -n payments --alerts --since 30m
# Specify AlertManager URL
kblame --alerts --alertmanager-url http://alertmanager.monitoring:9093
When --alerts is enabled, kblame scores each (change, alert) pair using temporal proximity (40%), namespace match (20%), label overlap (20%), and change severity (20%), then shows the top 3 candidates. Use this to spot potential connections, but verify them manually.
| Change Type | Detection Method | Severity |
|---|---|---|
| Image updates | ReplicaSet revision diffing | 0.9 |
| ConfigMap changes | managedFields timestamps | 0.8 |
| Secret changes | managedFields (values redacted) | 0.8 |
| Environment variables | Pod spec diffing | 0.8 |
| Resource limits | Pod spec diffing | 0.7 |
| RBAC changes | Role/RoleBinding creation/modification | 0.7 |
| HPA scaling | Kubernetes events | 0.6 |
| NetworkPolicy changes | Resource creation/modification | 0.9 |
| Node issues | Node events (OOMKill, NotReady, etc.) | 0.6 |
- Event TTL: Kubernetes events expire after 1 hour by default. kblame can only see changes within this window (or within the --since duration) unless your cluster has extended event retention.
- Client-side only: No persistent history. Each run queries the live cluster. For persistent change tracking, consider Sloop or an in-cluster controller (planned for V3).
- Not a monitoring tool: kblame complements your monitoring stack (Datadog, Grafana, PagerDuty). It does not replace it.
- No application-level bugs: kblame detects infrastructure changes (deploys, config, scaling). It cannot detect application logic bugs.
# Build
make build
# Run tests
make test
# Lint
make lint
# Cross-compile for all platforms
make cross-compile
Set up a local kind cluster with sample services:
chmod +x scripts/demo-setup.sh
./scripts/demo-setup.sh
# Then run kblame against it
go run ./cmd/kblame -n demo --since 5m
MIT