A Claude Code plugin that brings a strict working model for agent orchestration into every session: Claude is not the worker, but the CEO of an agent organization — delegate instead of doing it yourself, run in parallel instead of serially, have work checked instead of signing it off yourself, everything on opus, and forget nothing.
The plugin is generic and tied to no industry, company, or task type. It encodes an orchestration pattern and applies to any large, multi-part, or quality-critical assignment.
The plugin encodes a strict, nested five-level team-lead hierarchy. Each level delegates work downward instead of doing it itself:
- CEO (the main session) — commissions team leads, reviews their results, follows up, consolidates for the user, and maintains the memory. The CEO never does substantive work itself (no research of its own, no coding of its own, no analysis of its own). The sole exception is maintaining the memory — that counts explicitly as administration, not as substantive work.
- Team Lead — leads, reviews, consolidates; likewise does no substantive work itself, but hires Specialists instead.
- Specialist — performs the actual substantive work; may start Interns for sub-pieces.
- Intern — takes on a well-bounded sub-task of a Specialist.
- Assistant — smallest, tightly scoped supporting tasks.
A Team Lead may build up to three levels beneath it; the total depth is capped at five levels.
Four rules constitute the strictness:
opusenforcement: EVERY agent dispatch at EVERY level runs withmodel: "opus". No Sonnet/Haiku exceptions — not even for Interns or Assistants.- CEO/Team Lead delegate only: they never do substantive work themselves; they commission, review, and consolidate.
- Author != reviewer: nobody assesses their own work. A separate reviewer agent checks each deliverable against its requirements before it counts as "done". "Done" is a state reached after a passed review, not after delivery — with verification-before-completion (real test/build evidence instead of claims).
- Mandatory memory upkeep: status, agent IDs, and handoff are kept gap-free in a single memory file, so any new session takes over seamlessly and nothing is lost.
Independent agents are always launched in parallel (multiple Agent calls in one block). Each agent gets a name; gaps are followed up via SendMessage to the running agent instead of relaunching it.
Precedence rule. This plugin is not a copy of a plain parallel fan-out pattern or an in-session plan execution. It layers the four points above — the strict 5-level hierarchy, the
opusenforcement via hook, the mandatory memory upkeep, and the author != reviewer gate — on top of that. It also distinguishes itself from a flat, deterministic workflow script: such a script orchestrates with hard-wired phases (build phase, then review, then fix) and calls all sub-agents FLAT on ONE level — breadth instead of depth, script instead of model autonomy. The Agent tool used here is recursive instead: every agent may itself call the Agent tool, is its own Claude instance with its own context window, and decides for itself how to sub-delegate further; there is no central script (called "Agent" in the harness, officially the "Task tool" — the same concept). Mnemonics: Agent tool = recursive, model-driven delegation in depth; Workflow tool/script = flat, script-driven orchestration in breadth. For this pattern, the nested Agent/Task hierarchy applies exclusively — even when another recommendation (for example, a Workflow tool offered by Ultracode) is active.
This pattern can become very expensive. Two factors multiply:
opusenforcement: every agent at every level runs on the most expensive model. There are deliberately no cheaper Sonnet/Haiku tiers for "small" sub-tasks.- Recursive fan-out: up to five levels deep, and each level can start several parallel agents. A Team Lead with five Specialists who each start three Interns quickly becomes dozens of parallel
opusinstances — each with its own context window.
opus enforcement times recursive fan-out means cost and token consumption can grow exponentially before you notice.
Recommendation before you turn the pattern loose on large assignments:
- Cap breadth: set an upper bound on the number of parallel agents per level (for example, at most 3 to 5).
- Cap depth: do not reflexively exhaust all five levels. Most assignments are fine with two to three levels; go deeper only when the task genuinely requires it.
- Start small: test the pattern on a small assignment first and observe consumption before pointing it at something large.
The plugin is obtained through a marketplace. After adding the marketplace, you install the plugin and restart Claude Code (hooks and skills load at session start).
1. Add the marketplace — the path points to the repo whose root holds the .claude-plugin/marketplace.json manifest (source: "./"):
/plugin marketplace add mguttmann/claude-team-hierarchy
2. Install the plugin:
/plugin install team-hierarchy@team-hierarchy-dev
Here team-hierarchy is the plugin name (from .claude-plugin/plugin.json) and team-hierarchy-dev is the marketplace name (from .claude-plugin/marketplace.json). The @<marketplace> suffix is always the marketplace name, never the repo name.
3. Restart Claude Code. Before that, hooks and skills do not take effect — they load only at session start.
Note: this is a PRIVATE repo. Access requires authentication. Make sure you are signed in with the
ghCLI (gh auth login) and have access tomguttmann/claude-team-hierarchy; otherwise/plugin marketplace add mguttmann/claude-team-hierarchycannot fetch it.
For local development — or to avoid the private-repo authentication entirely — you can use the absolute path to the plugin directory (the one containing .claude-plugin/marketplace.json) as the marketplace instead of a GitHub repo:
/plugin marketplace add /absolute/path/to/claude-team-hierarchy
/plugin install team-hierarchy@team-hierarchy-dev
The absolute path must point to wherever you actually cloned the repo on disk — the folder name is irrelevant and need not be claude-team-hierarchy; only the .claude-plugin/marketplace.json inside it matters. The same applies here: restart Claude Code after installation.
There are three ways to trigger the pattern. Each is a different entry point into the same orchestration model; they are distinct from the components that implement it (covered below).
- Slash command
/team-start <assignment>— the most convenient entry point. It instructs the session not to do the given assignment itself, but to apply the orchestration pattern (delegate, parallel, review). - Skill
team-orchestration— triggers on demand when the task is large, multi-part, or quality-critical, or on phrasings such as "delegate", "team lead", "orchestrate", or "team-start". It teaches the method; deep-dive texts live underskills/team-orchestration/references/(roles, lifecycle, conventions, prompt templates, case study). - Global
~/.claude/CLAUDE.mdrule — the manual, project-independent trigger: a rule you add yourself so the pattern applies in every session regardless of project. See Optional manual step: global CLAUDE.md rule below.
The plugin has five components that work together: the three agents (team-lead, specialist, reviewer) plus the PreToolUse hook plus the skill. (The slash command and the manifests below are deliverable files that round out the package; the slash command is also one of the three triggers above.) The agents are: team-lead (level-2 archetype: decomposes, dispatches Specialists in parallel, has them reviewed, consolidates), specialist (level-3 archetype: does the actual work, may start Interns, delivers real evidence), and reviewer (independent reviewer, author != reviewer, returns a numbered defect list with severity, fixes nothing itself). The opus PreToolUse hook intercepts every agent/task dispatch and checks the model field before the dispatch runs (details under Limitations).
| Component | File | Function |
|---|---|---|
Skill team-orchestration |
skills/team-orchestration/SKILL.md |
Teaches the method on demand. Loads when the task is large, multi-part, or quality-critical (triggers include "delegate", "team lead", "orchestrate", "team-start"). Deep-dive texts under skills/team-orchestration/references/{roles,lifecycle,conventions,prompt-templates,case-study}.md. |
Agent team-lead |
agents/team-lead.md |
Level-2 archetype: decomposes the assignment, dispatches Specialists in parallel (all opus), has them reviewed separately, consolidates, and delivers a management summary. Slug: team-hierarchy:team-lead. |
Agent specialist |
agents/specialist.md |
Level-3 archetype: does the actual work, may start Interns in parallel, delivers real evidence, writes no self-approval. Slug: team-hierarchy:specialist. |
Agent reviewer |
agents/reviewer.md |
Independent reviewer (author != reviewer), read-only oriented: checks against the requirements, returns a numbered defect list with severity, fixes nothing itself. Slug: team-hierarchy:reviewer. |
Slash command /team-start |
commands/team-start.md |
Convenient entry point: instructs the session not to do the given assignment itself, but to apply the orchestration pattern. |
PreToolUse hook (opus enforcement) |
hooks/hooks.json, hooks/enforce-opus.sh, hooks/run-hook.cmd |
Intercepts every agent/task dispatch and checks the model field. An explicitly set non-opus model (sonnet/haiku) is blocked (permissionDecision: deny); opus and an empty/missing model are allowed (allow). The hook rewrites nothing; it only blocks or allows. |
| Manifest | .claude-plugin/plugin.json |
Plugin manifest (name team-hierarchy). Skills, agents, commands, and the hook are auto-discovered. |
| Dev marketplace | .claude-plugin/marketplace.json |
Marketplace manifest team-hierarchy-dev with source: "./" for local or repo-based installation. |
To make the pattern apply project-independently in every session, you can add a rule to your personal ~/.claude/CLAUDE.md. This is a manual step you apply yourself — the plugin does not make this change automatically:
## Team hierarchy (global)
- For larger/multi-part/quality-critical tasks: do NOT work yourself,
but apply the team-orchestration pattern (delegate, parallel, review).
- EVERY agent dispatch at EVERY level with `model: "opus"`. No exceptions.
- Author != reviewer: a separate reviewer before anything is "done".
- Mandatory memory upkeep: keep status/agent IDs/handoff gap-free.This rule is a strong instruction, but not an enforcement (see Limitations).
This is the most important section: not every rule of the pattern can be enforced equally hard. Do not sell any of these rules as a guarantee. Only the opus PreToolUse hook is approximately deterministic; everything else is steering — effective, but not enforcement.
Important regarding hook behavior: the hook is deterministic only for explicitly set non-opus models. Specifically:
model: "opus"(or an opus identifier such asclaude-opus-...) → allow. The match is a case-insensitive delimited match foropus—opusmust be bounded by a non-letter (or a string end) on both sides. So real opus ids likeclaude-opus-4-8are allowed, whileoctopusis correctly denied (it is no longer mistaken foropus).model: "sonnet"/"haiku"(any explicitly set non-opus model) → deny, with a re-dispatch message asking to dispatch again withmodel: "opus".- Empty or missing
model→ allow. This is hardened on purpose: subagent dispatches whose model is set exclusively via the agent frontmatter (model: opus) carry notool_input.model. Adenywould wrongly block these legitimate dispatches — so the hook lets them through, and the frontmatter model applies. - If
jqis missing from the environment, the hook works fail-open (allow plus a warning) instead of hard-blocking dispatches.
The matcher in hooks/hooks.json is anchored to ^(Agent|Task)$ — it covers the two common names of the agent-dispatch tool. The actual tool name in the target harness (Agent or Task) should be verified before distributing; if the name does not match, the hook does not fire. Verify with claude --debug and /hooks.
| Rule | Mechanism | Enforceability |
|---|---|---|
model: "opus" on every explicitly set model |
PreToolUse hook checks tool_input.model/.model; explicit non-opus → deny |
DETERMINISTIC (limited) — the harness runs the hook, not the model. A dispatch with explicit sonnet/haiku is hard-blocked. Empty/missing model is, however, allowed (the frontmatter model applies). |
| Skill delivers the method | Skill description triggers probabilistically |
PROBABILISTIC — the model decides whether and when the skill loads. It may not trigger. |
| CLAUDE.md rules (delegate, parallel, review, memory) | Instruction in context | PROBABILISTIC — strong steering, but the model can deviate (context pressure, other instructions). |
| "CEO never does substantive work itself" | — | NOT HARD-ENFORCEABLE. A hook sees the model field of a dispatch but cannot reliably tell whether a Read/Edit/Bash call is the CEO's own substantive work or legitimate CEO administration (for example, memory upkeep, quality check). This boundary remains a behavioral rule. |
| "Author != reviewer" | Reviewer agent + convention | PROBABILISTIC — a PostToolUse/Stop hook could heuristically prompt ("Was it reviewed?"), but cannot prove that a separate agent checked. |
Consequence: the opus hook is the only approximately hard piece — and only for explicitly set non-opus models. The skill, the CLAUDE.md rule, "CEO never does substantive work itself", and "author != reviewer" are probabilistic or heuristic respectively — good wording makes them effective but does not replace enforcement.
Before distributing, the hook must be tested — not only that it allows, but also that it blocks:
- Positive (opus): an agent dispatch with
model: "opus"is allowed (permissionDecision: allow). - Negative (sonnet/haiku): an agent dispatch with explicit
model: "sonnet"or"haiku"must be blocked (permissionDecision: deny), with a re-dispatch message asking to dispatch again withmodel: "opus". - Empty model: a dispatch without a
modelfield is allowed (permissionDecision: allow) — the frontmatter model applies. - Missing
jq: ifjqis not available in the environment, the hook fails open (permissionDecision: allow, plus a warning) instead of hard-blocking — a missing tool is an environment problem, not a policy violation.
You can check this with claude --debug and /hooks, or, as in CI, by directly invoking hooks/run-hook.cmd. In addition, the matcher in hooks/hooks.json must be verified against the actual name of the agent-dispatch tool in the target harness (Agent|Task covers the two common names).
Note: hooks load at session start. Changes to the hook take effect only after restarting Claude Code.
The GitHub Actions pipeline under .github/workflows/ci.yml runs on every push and pull request and checks six areas:
- Manifest validity and consistency —
plugin.json,marketplace.json, andhooks.jsonare valid JSON (jq empty); the plugin name (team-hierarchy) matches theplugins[]entry in the marketplace; the marketplace name isteam-hierarchy-dev;sourceis./. - Hook matcher anchoring — asserts the
hooks.jsonPreToolUsematcher equals exactly^(Agent|Task)$, and a regex unit test confirms that matcher acceptsAgent/Taskand rejectsTaskCreate/TaskStop/MultiAgent. - Frontmatter checks — the agents (
team-lead,specialist,reviewer) carrymodel: opusplusname,description, andcolor; the skill carriesnameanddescription; the command carriesargument-hintand uses$ARGUMENTSin the body. - Hook behavior test — drives the hook through
hooks/run-hook.cmdand checks the decisions:opus→ allow,sonnet/haiku→ deny,octopus→ deny (delimited match), emptymodel→ allow, andjq-missing → fail-open (allow). - Secret/PII gate — generic secret/PII regex patterns — private-key (PEM) headers, AWS access key ids, credential assignments (an api key/secret/token/password assigned a quoted literal), and raw email addresses (with a small allowlist for no-reply/example/localhost) — scanned case-insensitively (
git grep -niE) across all tracked files; a hit aborts CI. - ShellCheck — static analysis of
hooks/enforce-opus.sh(the polyglot wrapperrun-hook.cmdis deliberately excluded).
MIT — see LICENSE.