feat: add double-check skill (cross-provider verification) by citypaul · Pull Request #177 · citypaul/.dotfiles

citypaul · 2026-06-28T13:28:31Z

What

Adds a new auto-discovered skill, double-check, that gets an independent second opinion on finished work from a different AI provider's CLI agent (codex / claude / gemini / cursor-agent) running locally on the machine, then drives a constructive back-and-forth between the two agents until both genuinely agree — convergence, not a single rubber-stamp pass.

Why

A model is the worst-placed reviewer of its own work: it shares every blind spot that produced the bug. A genuinely different reasoning system catches what self-review can't.

Design highlights

Host-agnostic. Detects the hosting agent and excludes it; works when run from Claude, Codex, Gemini, or Cursor.
Lab diversity, not binary diversity. "Different provider" means a different underlying model lab — cursor-agent running Sonnet is still Anthropic and doesn't count against a Claude host.
Best model + max effort + read-only. For example, `codex --sandbox read-only -c model_reasoning_effort="xhigh"` or `claude --model opus --effort max --permission-mode plan`.
A real loop, not one shot. Adversarial brief → structured findings → host fixes/pushes back → re-verify → converge, tracked in a finding ledger; genuine disagreement escalates to the human.
Security guardrails. Verifier output treated as untrusted; secrets must not reach the verifier (pasted or referenced); data-not-instructions rule in the brief.
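
The lab-diversity rule above can be illustrated with a minimal sketch. This is not the skill's implementation: `pick_verifier` and the shape of the `installed` mapping are hypothetical names chosen for illustration, and the real CLI-to-lab table lives in resources/providers.md.

```python
from typing import Optional


def pick_verifier(host_lab: str, installed: dict[str, str]) -> Optional[str]:
    """Return the first installed CLI whose model lab differs from the host's.

    `installed` (hypothetical shape) maps a CLI name to the lab behind the
    model that CLI would actually run, so the comparison is lab-vs-lab,
    not binary-vs-binary.
    """
    for cli, lab in installed.items():
        if lab != host_lab:
            return cli
    return None  # no cross-lab verifier available; the caller must handle this


# cursor-agent running Sonnet is still Anthropic, so a Claude host skips it.
available = {"cursor-agent": "anthropic", "codex": "openai"}
choice = pick_verifier("anthropic", available)  # "codex"
```

Keying on the lab rather than the binary name is what keeps a second Anthropic-backed CLI from counting as an independent reviewer.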

Files

claude/.claude/skills/double-check/SKILL.md
claude/.claude/skills/double-check/resources/providers.md — verified CLI invocations + best model/effort/sandbox flags + model-lab table
claude/.claude/skills/double-check/resources/brief-template.md — adversarial verifier brief
CLAUDE.md / README.md — registered the skill, counts 27 → 28
.changeset/double-check-skill.md — minor

Dogfooded

The skill was authored using the exact loop it prescribes, with codex (gpt-5.5, xhigh) as the cross-provider verifier:

| Round | Codex verdict | Result |
| --- | --- | --- |
| 1 | `issues-found` | 8 findings, all valid, fixed |
| 2 | `issues-found` | round-1 fixes confirmed resolved + 2 new findings, fixed |
| 3 | `no-issues` | convergence |

Real issues codex caught, all since fixed: Claude verifier not at max effort; cursor-agent has no read-only mode; lab-vs-binary diversity; a Claude-specific tool in a host-agnostic skill; a verdict-contract loophole hiding nit-only findings; and a secrets-disclosure gap (referencing a file transmits it).
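
For readers who want the shape of the loop, here is a rough sketch of a convergence loop with a finding ledger, replaying the three rounds from the table. It assumes a simplified model: `converge`, `verify`, `resolve`, and the `"fixed"`/`"disputed"` statuses are hypothetical names, and only the `issues-found` / `no-issues` verdicts come from this PR.

```python
from typing import Callable

Finding = tuple[str, str]  # (finding id, one-line summary)


def converge(
    verify: Callable[[int, dict[str, str]], tuple[str, list[Finding]]],
    resolve: Callable[[Finding], str],
    max_rounds: int = 5,
) -> tuple[str, int, dict[str, str]]:
    """Run verify/resolve rounds until the verifier reports no issues.

    Returns (outcome, rounds_used, ledger); outcome is "converged" or "escalate".
    """
    ledger: dict[str, str] = {}  # finding id -> "fixed" | "disputed"
    for round_no in range(1, max_rounds + 1):
        verdict, findings = verify(round_no, ledger)
        if verdict == "no-issues":
            return "converged", round_no, ledger
        for finding in findings:
            fid = finding[0]
            if ledger.get(fid) == "disputed":
                # The verifier repeats a finding the host already pushed back
                # on: treat it as genuine disagreement and hand it to the human.
                return "escalate", round_no, ledger
            ledger[fid] = resolve(finding)
    return "escalate", max_rounds, ledger


def replay(round_no: int, ledger: dict[str, str]) -> tuple[str, list[Finding]]:
    """Stand-in verifier that replays the round counts from the table."""
    if round_no == 1:
        return "issues-found", [(f"r1-{i}", "finding") for i in range(8)]
    if round_no == 2:
        return "issues-found", [(f"r2-{i}", "finding") for i in range(2)]
    return "no-issues", []


outcome, rounds, ledger = converge(replay, lambda finding: "fixed")
# outcome == "converged", rounds == 3, len(ledger) == 10
```

In this sketch, hitting `max_rounds` also escalates, so the loop cannot spin indefinitely; whether the skill itself caps rounds is not stated here.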

All repo tests pass.

🤖 Generated with Claude Code

New auto-discovered skill that gets an independent second opinion on finished work from a *different* AI provider's CLI agent (codex, claude, gemini, or cursor-agent) running locally, then drives a constructive back-and-forth between the two agents until both genuinely agree. Host-agnostic: detects the hosting agent's model lab and picks a verifier from a different lab, at its best model and highest reasoning effort, in a read-only sandbox. Includes a provider command reference and an adversarial verifier brief template in resources/. The skill was itself authored using the loop it prescribes: codex (gpt-5.5, xhigh) acted as the cross-provider verifier across three rounds until it returned no-issues.

- Add claude/.claude/skills/double-check/{SKILL.md,resources/}
- Register in CLAUDE.md skill list + trigger line
- Add README discovery bullet, bump skill counts 27 -> 28
- Add minor changeset

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Two refinements from review feedback, both re-verified by the codex cross-provider loop (no-issues):

- Context vs target: the verifier should read enough surrounding context (docs, callers, tests, conventions) to judge the work properly, while keeping the review *target* fixed on the named work: understand broadly, judge narrowly. Replaces the earlier over-tight "read only named paths" wording that starved the verifier of context.
- Work not yet on disk: handle double-checking an in-progress plan or change that lives only in the host agent's conversation. The host must materialize it (scratch file or inline) and tell the verifier that THAT is the work and the committed repo is background context only, so it never reviews stale committed code. Adds a "where the work lives" section to the brief template and a "Delivering work that isn't on disk yet" subsection.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
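
The "work not yet on disk" refinement can be sketched roughly as follows. This is an illustration under stated assumptions, not the template's actual text: `materialize`, `where_the_work_lives`, the scratch file name, and the brief wording are all hypothetical.

```python
import tempfile
from pathlib import Path


def materialize(work_text: str) -> Path:
    """Write in-conversation work to a scratch file the verifier can read."""
    scratch_dir = Path(tempfile.mkdtemp(prefix="double-check-"))
    scratch = scratch_dir / "work-under-review.md"  # hypothetical file name
    scratch.write_text(work_text, encoding="utf-8")
    return scratch


def where_the_work_lives(scratch: Path) -> str:
    """Build a brief section that pins the review target to the scratch file."""
    return "\n".join(
        [
            "Where the work lives",
            f"The work under review is this file: {scratch}",
            "The committed repository is background context only; "
            "do not review it as the work.",
            "Read surrounding context as needed, but judge only the named work.",
        ]
    )


plan = "1. Extract the parser\n2. Add tests\n"
section = where_the_work_lives(materialize(plan))
```

Naming the scratch file explicitly in the brief is what stops the verifier from falling back to whatever is currently committed.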

citypaul and others added 2 commits June 28, 2026 14:28

citypaul merged commit 476f512 into main Jun 28, 2026
1 check passed

citypaul deleted the feat/double-check-skill branch June 28, 2026 13:43
