Add init, guard, runtime, advisory intelligence, and contribution prompting#2

Merged
thebenignhacker merged 5 commits into main from feat/init-guard-runtime on Mar 2, 2026

@thebenignhacker

Contributor

Summary

  • opena2a init: Security posture assessment with trust score, credential scan, hygiene checks, and prioritized next steps (screenshot moment)
  • opena2a guard sign/verify/status: ConfigGuard MVP -- SHA-256 hash pinning for config file integrity
  • opena2a runtime start/status/tail/init: ARP runtime protection wrapper with auto-config generation
  • opena2a verify: Enhanced with trust profile queries and Ed25519-signed oracle verdicts from registry
  • Advisory intelligence: Fetches advisories from registry, matches against project dependencies, warns during init
  • Scan report submission: Opt-in community intelligence with delayed contribution prompting (asks after 3+ scans)
  • GitHub star CTA: Added to init report footer and HTML security report footer
  • Shared credential pattern extraction (used by both init and protect)
  • Fix intent-map regex for multi-word modifiers (e.g., "scan my MCP server")
  • Fix shared package test script (--passWithNoTests)
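The advisory flow in the summary (fetch advisories, match them against project dependencies, reuse a short-lived cache) can be sketched roughly as follows. This is only an illustration: the `Advisory` and `CacheEntry` shapes and the helper names `isCacheFresh` and `matchAdvisories` are hypothetical, exact-name matching is an assumption, and the registry's actual schema is not shown in this PR. The 5-minute lifetime comes from the commit notes.

```typescript
// Hypothetical advisory shape; the real registry schema is not shown in this PR.
interface Advisory {
  packageName: string;
  severity: string;
  summary: string;
}

// Hypothetical cache entry: the fetched advisories plus the fetch timestamp (ms).
interface CacheEntry {
  fetchedAt: number;
  advisories: Advisory[];
}

// 5-minute local cache, per the commit notes.
const CACHE_TTL_MS = 5 * 60 * 1000;

/** True when a cache entry exists and is younger than the TTL. */
function isCacheFresh(entry: CacheEntry | null, now: number): boolean {
  return entry !== null && now - entry.fetchedAt < CACHE_TTL_MS;
}

/** Return the advisories whose package name exactly matches a project dependency. */
function matchAdvisories(dependencies: string[], advisories: Advisory[]): Advisory[] {
  const names = new Set(dependencies);
  return advisories.filter((a) => names.has(a.packageName));
}
```

A caller would check `isCacheFresh` before fetching again and pass the dependency names collected from `package.json`, `go.mod`, or `requirements.txt` to `matchAdvisories`.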

Stats

  • 7 new source files, 3 new test files, 6 modified files
  • 130 tests passing (16 test files)
  • 0 lint errors

Test plan

  • All 130 tests pass (npx vitest run)
  • Build passes (npx turbo run build)
  • Lint passes (npx turbo run lint)
  • Manual: opena2a init in project with credentials shows protect as first next step
  • Manual: opena2a guard sign then modify file then opena2a guard verify detects tampering
  • Manual: opena2a verify --package hackmyagent shows trust score from registry
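The sign, modify, verify scenario in the test plan relies on SHA-256 hash pinning. A minimal sketch of that idea, assuming an in-memory map of file contents and a hypothetical `PinStore` type (the helper names `sha256Hex`, `signFiles`, and `verifyFiles` are illustrative, not the ConfigGuard API):

```typescript
import { createHash } from "node:crypto";

// Hypothetical pin store: maps a config file name to its pinned SHA-256 digest.
type PinStore = Record<string, string>;

/** Hex-encoded SHA-256 digest of the given contents. */
function sha256Hex(contents: string): string {
  return createHash("sha256").update(contents).digest("hex");
}

/** "Sign": record the current digest of every file. */
function signFiles(files: Record<string, string>): PinStore {
  const pins: PinStore = {};
  for (const [name, contents] of Object.entries(files)) {
    pins[name] = sha256Hex(contents);
  }
  return pins;
}

/** "Verify": list pinned files that are missing or whose digest has changed. */
function verifyFiles(files: Record<string, string>, pins: PinStore): string[] {
  const tampered: string[] = [];
  for (const [name, pinned] of Object.entries(pins)) {
    const current = files[name];
    if (current === undefined || sha256Hex(current) !== pinned) {
      tampered.push(name);
    }
  }
  return tampered;
}
```

Pinning the digest rather than the file contents keeps the pin store small, and any change to a pinned file produces a different digest on the next verify.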

- opena2a init: Security posture assessment with trust score, credential
  scan, hygiene checks, and prioritized next steps (protect flow)
- opena2a guard: ConfigGuard MVP with sign/verify/status subcommands
  for config file integrity via SHA-256 hash pinning
- opena2a runtime: ARP wrapper with start/status/tail/init subcommands
  for agent runtime protection
- verify: Enhanced with trust profile and oracle verdict queries from
  registry, showing trust score, verdict, and dependency risks
- Extract credential patterns to shared module for init/protect reuse
- Remove guard/runtime from adapter registry (now direct commands)
- 22 new tests (init: 8, guard: 8, runtime: 6)
- Advisory check: init command fetches advisories from registry and
  warns about flagged tools in the project's dependencies
- Advisory cache: 5-minute local cache to avoid repeated fetches
- Package detection: Scans package.json, go.mod, requirements.txt
  for dependency names to match against advisories
- Scan report submission: Utility for sharing detailed scan findings
  with registry when contribute mode is enabled (opt-in)
- 7 new tests for advisory matching, caching, and error handling
- Track scan count in user config, only prompt to share reports after 3+ scans
- Add shouldPromptContribute/dismissContributePrompt to shared config
- Add open source link to init report footer and HTML security report
- 5 new telemetry config tests

@github-actions github-actions Bot left a comment

Review failed

@thebenignhacker thebenignhacker merged commit e8c8bf9 into main Mar 2, 2026
9 of 10 checks passed
@thebenignhacker thebenignhacker deleted the feat/init-guard-runtime branch March 2, 2026 03:41
thebenignhacker added a commit that referenced this pull request Apr 22, 2026
…ck (#90)

* fix(cli): aggregate HMA findings into review output so Observations block reflects full scan

aggregateFindings was skipping the HMA phase entirely, so `opena2a review`
on a 34-critical fixture reported 1 critical while `hackmyagent secure` on
the same dir reported 34. The composite score already consumed hmaData.score
(weighted at 8%) so the headline grade was correct, but the text-mode
totalFindings, severity counts, and the @opena2a/cli-ui Observations block
saw only credential-scan + shield data -- the Categories line collapsed to
"other (1 critical)" even when HMA fired on 17 distinct categories.

Changes:
- Expose all failed HMA findings via a new HmaPhaseData.allFailedFindings
  field. topFindings is preserved for the HTML report's HMA tab (deduped-
  by-checkId + capped at 30); allFailedFindings carries the raw set so the
  Observations block and overview severity counts match the HMA scan 1:1.
- aggregateFindings now accepts hmaData and maps each HmaFinding to a
  ReviewFinding with source='hma'. checkId flows through as the ReviewFinding
  id so cli-ui's classifyCategory routes CRED-*, MCP-*, SKILL-*, etc. into
  the right buckets.
- Prefer-HMA dedupe: quickCredentialScan matches at the same file:line as an
  HMA finding are dropped. HMA credential detection is context-gated across
  200+ check IDs, which makes it the authoritative layer.
- computeCompositeScore already reads hmaData.score, not findings.length, so
  the composite remains display-only stable. Verified on test/hma/: score
  stays 66/100, findings jump 1 -> 232 (56 critical, 118 high, 57 medium),
  Categories line breaks out credentials/MCP/prompt/skill/etc., verdict
  names an HMA-detected lead finding.
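To illustrate the split between the display list and the counting list described in the changes above, here is a minimal sketch. The `HmaFinding` fields shown, the `passed` flag, and the helper names `splitFindings` and `countBySeverity` are assumptions; only the dedupe-by-checkId rule, the cap of 30, and the two field names `topFindings` and `allFailedFindings` come from the commit message.

```typescript
// Hypothetical minimal finding shape; the real HmaFinding likely carries more fields.
interface HmaFinding {
  checkId: string;
  severity: "critical" | "high" | "medium" | "low";
  passed: boolean;
}

// Cap on the display list, per the commit message.
const TOP_FINDINGS_CAP = 30;

/** Build the raw failed set (for counts) and the deduped, capped set (for display). */
function splitFindings(findings: HmaFinding[]): {
  allFailedFindings: HmaFinding[];
  topFindings: HmaFinding[];
} {
  const allFailedFindings = findings.filter((f) => !f.passed);
  const seen = new Set<string>();
  const deduped: HmaFinding[] = [];
  for (const f of allFailedFindings) {
    if (seen.has(f.checkId)) continue; // keep the first finding per checkId
    seen.add(f.checkId);
    deduped.push(f);
  }
  return { allFailedFindings, topFindings: deduped.slice(0, TOP_FINDINGS_CAP) };
}

/** Severity counts should be computed from the raw set, not the display list. */
function countBySeverity(findings: HmaFinding[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const f of findings) {
    counts[f.severity] = (counts[f.severity] ?? 0) + 1;
  }
  return counts;
}
```

Counting from `allFailedFindings` is what lets the review totals line up with the underlying scan, while `topFindings` stays short enough for a report tab.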

Tests (881 total, +4 new):
- review credentials smoke asserts CRED-* findings surface in the cli-ui
  "credentials" Categories bucket end-to-end (skipHma=true).
- aggregateFindings unit tests cover the prefer-HMA dedupe, the no-overlap
  pass-through, and the null-hmaData backward-compat path.

Closes the P1 follow-up tracked in memory/project_hma_observations_ux_shipped.md.

* fix(cli): scope credential dedupe to HMA credential checks + handle line-less HMA findings

Adversarial self-review surfaced two defects in the initial aggregateFindings
HMA merge:

1. HMA often emits credential findings without a line number (AST-CRED-001,
   AST-CRED-003, ENVLEAK-*, etc. on real output routinely have file=... and
   line=null). The original dedupe keyed credData matches against `file:line`
   only, so the "prefer HMA" path was a no-op on the canonical credential
   scenario. Users saw both the credential-scan CRITICAL and the HMA HIGH/
   CRITICAL at the same file, inflating totalFindings. Now we key by BOTH
   `file:line` AND `file` alone, so credData is suppressed when HMA already
   fired on the same location — with or without a line.

2. The file-only key was unscoped: a non-credential HMA finding on a file
   (e.g. GIT-002 on .gitignore, DEP-004 on package.json) would have silently
   masked a credential that quickCredentialScan found in that same file.
   Dedupe is now scoped to HMA credential-category check prefixes (CRED,
   AST-CRED, WEBCRED, SEM-CRED, AGENT-CRED, ENVLEAK, CLIPASS, DRIFT) so
   non-credential HMA findings on a file never suppress a credential
   finding in that same file.

3. MEDIUM bug: when dedupe did fire, the surviving HMA finding kept its own
   severity. If credData flagged CRITICAL (e.g. sk-ant-*) and HMA flagged
   HIGH (SEM-CRED-002), the user-visible severity silently narrowed. The
   HMA finding's severity is now upgraded to max(hmaSeverity, credSeverity)
   so the verdict and category counts reflect the worst case.

Three new unit tests cover: the line-less HMA dedupe path (regression for
#1), the non-credential-prefix bypass (regression for #2), and the
severity upgrade path (regression for #3).
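The three fixes above can be sketched together as one merge step. This is a simplified illustration under stated assumptions, not the actual `aggregateFindings` code: the `HmaFinding` and `CredMatch` shapes, the helper names, and the prefix-matching rule (`PREFIX` or `PREFIX-...`) are hypothetical, and the sketch registers a file-only key only when the HMA finding has no line number.

```typescript
type Severity = "low" | "medium" | "high" | "critical";
const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Hypothetical shapes for illustration only.
interface HmaFinding { checkId: string; severity: Severity; file: string; line: number | null; }
interface CredMatch { file: string; line: number; severity: Severity; }

// Credential-category check prefixes listed in the commit message.
const CRED_PREFIXES = [
  "CRED", "AST-CRED", "WEBCRED", "SEM-CRED", "AGENT-CRED", "ENVLEAK", "CLIPASS", "DRIFT",
];

/** Assumed matching rule: checkId equals a prefix or starts with "<prefix>-". */
function isCredentialCheck(checkId: string): boolean {
  return CRED_PREFIXES.some((p) => checkId === p || checkId.startsWith(p + "-"));
}

/** Prefer-HMA merge: drop overlapping credential matches, upgrading HMA severity. */
function mergeFindings(
  hma: HmaFinding[],
  cred: CredMatch[],
): { hma: HmaFinding[]; cred: CredMatch[] } {
  const merged = hma.map((f) => ({ ...f })); // copy so inputs are not mutated
  const byKey = new Map<string, HmaFinding>();
  for (const f of merged) {
    if (!isCredentialCheck(f.checkId)) continue; // non-credential checks never suppress
    const key = f.line === null ? f.file : `${f.file}:${f.line}`;
    if (!byKey.has(key)) byKey.set(key, f);
  }
  const survivors: CredMatch[] = [];
  for (const m of cred) {
    const hit = byKey.get(`${m.file}:${m.line}`) ?? byKey.get(m.file);
    if (hit === undefined) {
      survivors.push(m); // no overlapping HMA credential finding
    } else if (RANK[m.severity] > RANK[hit.severity]) {
      hit.severity = m.severity; // keep the worst-case severity
    }
  }
  return { hma: merged, cred: survivors };
}
```

For example, a line-less `SEM-CRED-002` finding at HIGH on a file absorbs a CRITICAL credential match in that file and is reported as CRITICAL, while a `GIT-002` finding on `.gitignore` leaves a credential match in `.gitignore` untouched.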

Verified: `opena2a review --ci test/hma/` still reports 232 findings /
66/100 composite (unchanged — credData returns 0 on that fixture, so the
dedupe behavior isn't exercised there). The regressions addressed are
path-dependent on credData producing overlap with HMA, which the new unit
tests exercise directly.

Full test suite: 884/884 passing.

* fix(cli): guard against out-of-targetDir paths in review credData aggregation

Auto-reviewer on #90 flagged the credData loop for lack of path boundary
validation. credData.matches is sourced from walkFiles(targetDir, ...) in
credential-patterns.ts, which already constrains traversal to within
targetDir — so this is defense in depth rather than a missing security
control. If a symlink escape or future upstream contract change ever
leaked an absolute or parent-traversal path into aggregation, we now
drop it rather than render a misleading ".." row in the review output.

Also adds a JSDoc note on HmaPhaseData.topFindings clarifying it is the
display-only, deduped-by-checkId + slice(0,30) list for the HMA tab,
distinct from allFailedFindings which is the source of truth for counts.
Addresses the dual-data-path architecture concern in the same review.

New unit test covers the boundary: three credData matches (abs path,
parent traversal, in-scope) → only the in-scope one survives.
885 tests passing.

* fix(cli): strengthen review path boundary check with resolve+startsWith

Auto-reviewer round 2 correctly pointed out that `rel.startsWith('..') ||
path.isAbsolute(rel)` doesn't catch Windows-drive-letter paths on Unix
(e.g. `C:\\etc\\passwd` is not absolute under POSIX `path.isAbsolute`).
Switch to the stronger pattern: resolve both sides, require the credential
file to be the targetDir itself or start with `resolvedTarget + path.sep`.

Still defense in depth — credData.matches is still produced by walkFiles
under targetDir — but now survives the specific bypass the reviewer named.
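The resolve-plus-startsWith pattern named in this commit can be sketched as below. The function name `isInsideTarget` is hypothetical, and resolving a relative candidate against `targetDir` (rather than the current working directory) is an assumption of this sketch.

```typescript
import * as path from "node:path";

/**
 * True when `candidate` resolves to `targetDir` itself or to a path beneath it.
 * Relative candidates are resolved against targetDir (an assumption of this sketch).
 */
function isInsideTarget(targetDir: string, candidate: string): boolean {
  const resolvedTarget = path.resolve(targetDir);
  const resolvedFile = path.resolve(resolvedTarget, candidate);
  return (
    resolvedFile === resolvedTarget ||
    resolvedFile.startsWith(resolvedTarget + path.sep)
  );
}
```

Appending `path.sep` before the prefix comparison matters: without it, a sibling directory such as `proj-evil` would pass a plain `startsWith` check against `proj`. Note that this compares path strings only and does not follow symlinks; doing so would need something like `fs.realpathSync` first.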