Staff Engineer Mode

Staff Engineer Mode gives coding agents the judgement of senior staff-level engineers.

Give it a design, diff, rollout, incident, migration, or maintenance problem. The router picks the right specialist and makes the agent reason through availability, correctness, resilience, release safety, observability, performance, privacy, recovery, and maintainability before the work ships or changes production.

The specialists also carry built-in lessons from real outages, so the agent checks failure modes that have broken production systems before.

Sources

The practice library draws from first-party engineering sources: Amazon's Builders' Library, Google's SRE books and Software Engineering at Google, Meta Engineering, Microsoft's SDL and DevOps guidance, Apple's security and privacy documentation, and Netflix's resilience work. Standards and guidance come from NIST, CISA, OWASP, OpenSSF, IETF, and W3C.

Public outage and incident records provide case studies: AWS post-event summaries, Azure post-incident reviews, Google Cloud and Google Workspace incident reports, Meta's outage writeups, and Netflix's AWS-outage analysis.

See the source index for the full reference set. Staff Engineer Mode is independent and is not endorsed by or affiliated with these organizations.

How It Works

Ask a normal engineering question. Hand the agent a task, design, diff, incident, rollout, or maintenance problem. The router picks one specialist (occasionally one secondary), reads that file, and returns concrete decisions, risks, checks, owners, supporting details, and next steps. You never name a specialist.

Supported tools should list only the native staff-engineer-mode router. Specialist files live under specialists/ and load only after routing.

For commits and amends, Staff Engineer Mode calls agent-pr-review against the exact staged diff. For releases, tags, version bumps, packages, artifacts, and promotions, it calls release-build-reproducibility and production-readiness-review together.
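The fixed rules above can be pictured as a small lookup. The following is a minimal sketch only, not the router's actual implementation: the function name `fixed_specialists`, the action keywords, and the `router-default` placeholder are all hypothetical and chosen for illustration. Only the specialist names come from this README.

```shell
# Hypothetical helper: maps a change action to the specialists this README
# says are always called for it. The action words are assumptions.
fixed_specialists() {
  case "$1" in
    commit|amend)
      # Commits and amends are reviewed against the exact staged diff.
      echo "agent-pr-review"
      ;;
    release|tag|version-bump|package|artifact|promotion)
      # Release-type actions call both specialists together.
      echo "release-build-reproducibility production-readiness-review"
      ;;
    *)
      # Placeholder: everything else goes through normal routing.
      echo "router-default"
      ;;
  esac
}

fixed_specialists commit
fixed_specialists release
```

To see the staged diff that a commit review would cover, `git diff --staged` prints it.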

Installation

Commands labeled "terminal" are run in your shell. Commands labeled "agent chat" are typed inside that tool's interactive agent session.

Claude Code

Terminal:

claude plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
claude plugin install staff-engineer-mode@staff-engineer-mode

Agent chat:

/plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
/plugin install staff-engineer-mode@staff-engineer-mode

Codex

Terminal:

codex plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
codex plugin add staff-engineer-mode@staff-engineer-mode

Cursor

Terminal:

git clone https://github.com/sirmarkz/staff-engineer-mode.git ~/.cursor/staff-engineer-mode-src
mkdir -p ~/.cursor/plugins
ln -s ~/.cursor/staff-engineer-mode-src ~/.cursor/plugins/staff-engineer-mode
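Because the Cursor install relies on a symlink, a quick check that the link resolves can save a confusing first session. This is an optional sketch; the function name `check_plugin_link` is hypothetical and not part of the plugin.

```shell
# Hypothetical check: succeed only if the path is a symlink that
# resolves to an existing directory.
check_plugin_link() {
  link="$1"
  if [ -L "$link" ] && [ -d "$link" ]; then
    echo "ok: $link"
  else
    echo "broken or missing: $link"
    return 1
  fi
}

# Path taken from the install commands above.
check_plugin_link "$HOME/.cursor/plugins/staff-engineer-mode" || true
```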

OpenCode

Terminal:

opencode plugin 'staff-engineer-mode@git+https://github.com/sirmarkz/staff-engineer-mode.git'

GitHub Copilot CLI

Terminal:

copilot plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git

Install the plugin:

copilot plugin install staff-engineer-mode@staff-engineer-mode

Gemini CLI

Terminal:

gemini extensions install https://github.com/sirmarkz/staff-engineer-mode

Verify

Start a fresh session in any repository and ask one of:

  • "Before implementing partner webhooks, design delivery retries, replay, and dead-letter handling."
  • "For a new inventory dependency call, decide timeout, retry, and fallback."
  • "Review my last commit."

The agent should load the router, choose one specialist, and respond with concrete decisions, risks, checks, owners, supporting details, and next steps.

For more coverage, see the sample prompts.

What's Inside

One native router skill: staff-engineer-mode. It routes to 64 specialist files under specialists/; those files are not installed or listed as separate native skills.
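To check the count against a local clone, something like the sketch below may help. It assumes each specialist is a single file somewhere under specialists/ and that no other files live in that directory; neither assumption is confirmed by this README, so treat a mismatch as a prompt to look at the layout, not as an error. The function name `count_specialists` is hypothetical.

```shell
# Hypothetical helper: count regular files under a directory
# (defaults to ./specialists). Assumes one file per specialist.
count_specialists() {
  dir="${1:-specialists}"
  # tr strips the padding some wc implementations add.
  find "$dir" -type f | wc -l | tr -d ' '
}

# Run from the repository root; skipped if the directory is absent.
if [ -d specialists ]; then
  count_specialists specialists
fi
```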

Specialists by surface:

| Surface | Specialist files |
| --- | --- |
| Architecture & interfaces | architecture-decisions, api-design-and-compatibility, data-contracts, event-workflows, resilience-requirements, persistent-connection-systems |
| Correctness & testing | state-machine-correctness, testing-and-quality-gates, test-data-engineering |
| Reliability & resilience | slo-and-error-budgets, high-availability-design, dependency-resilience, backup-and-recovery, resilience-experiments, performance-and-capacity, cost-aware-reliability, multi-region-and-data-residency, scheduled-job-reliability |
| Data, storage & privacy | distributed-data-and-consistency, database-operations, data-pipeline-reliability, caching-and-derived-data, privacy-and-data-lifecycle, data-lineage-and-provenance |
| Delivery & change safety | progressive-delivery, feature-flag-lifecycle, release-build-reproducibility, fleet-upgrades, migration-and-deprecation, configuration-and-automation-safety, dev-environment-parity, service-decommission-and-sunset |
| Code quality & maintainability | code-readability-for-agents, dependency-and-code-hygiene |
| Operations & incident response | observability-and-alerting, incident-response-and-postmortems, oncall-health, operational-ownership-transfer |
| Security | secure-sdlc-and-threat-modeling, identity-and-secrets, cryptography-and-key-lifecycle, software-supply-chain-security, vulnerability-management, tenant-isolation, edge-traffic-and-ddos-defense, llm-application-security, input-validation-and-injection-defense, client-application-security |
| Platform & infrastructure | infrastructure-and-policy-as-code, internal-service-networking, platform-golden-paths, container-runtime-and-orchestration |
| Client & frontend | web-release-gates, mobile-release-engineering, accessibility-gates |
| AI/ML systems | llm-evaluation, ml-reliability-and-evaluation, llm-serving-cost-and-latency |
| Governance & readiness | agent-pr-review, ai-coding-governance, documentation-lifecycle, engineering-control-evidence, production-readiness-review, experimentation-and-metric-guardrails |

Contributing

Patches welcome, especially practices from authoritative sources: first-party engineering publications, official documentation, standards bodies, peer-reviewed papers, or widely cited practitioner references.

New specialist files must be technology-agnostic, cite source-index references, and avoid vendor endorsement. Read CONTRIBUTING.md before opening a PR. The voice is enforced.

License

MIT