feat(security): OWASP MAESTRO trust-enforcement hardening on invocation path #132

Open
nicknacnic wants to merge 5 commits into dns-aid:main from nicknacnic:feat/owasp-maestro-trust-enforcement
@nicknacnic

Collaborator

Summary

Substrate-layer trust enforcement at TLS handshake and at invocation time,
driven by the OWASP MAESTRO threat catalog. Every hardening ships as an
opt-in SDKConfig flag — defaults are unchanged so existing callers see
the same WebPKI-only / no-re-verify behavior. Hardened deployments enable
the flags individually or as a profile.

Threats addressed

| MAESTRO ID | Threat | Mitigation in this PR |
|---|---|---|
| T47 / T7.1 / T9 | Rogue Server / Agent Impersonation / Identity Spoofing | DANE TLSA prefer-then-fallback at the TLS layer (prefer_dane / require_dane) |
| BV-9 | Time-of-Check-to-Time-of-Use between verify() and invoke() | verify_freshness_seconds opt-in re-resolve with drift detection |
| BV-2 | Tool Description Poisoning (Rug Pull) | Same freshness gate compares cap_sha256 between cached and fresh records; existing cap_fetcher hash check catches content-side drift |
| T7.6 | mTLS Fallback Downgrade | RFC 9460 §8 mandatory= enforced on consumption — publishers declare which keys clients MUST honor; the SDK skips records whose mandatory list names a key it doesn't implement |

Design posture

DNSSEC adoption on the public internet is partial; DANE TLSA adoption is
much rarer; mTLS is mostly internal. Strict-by-default would break a large
fraction of legitimate zones. The PR's posture:

  • Defaults match real-world internet adoption — no behavior change for existing callers
  • Every hardening is opt-in via SDKConfig flag or publisher-driven mandatory= declaration
  • Mismatch always refuses — a present-but-broken TLSA record is an attack signal, not a fallback case (RFC 7671)
  • Prefer-then-fallback for DANE — absent TLSA falls back to WebPKI in permissive mode; only refuses when require_dane=True

New SDKConfig flags

| Flag | Default | Mitigates | Behavior |
|---|---|---|---|
| prefer_dane | False | T47 / T7.1 / T9 | Query TLSA; pin when present and matches; fall back to WebPKI when absent; refuse on mismatch |
| require_dane | False | T47 strict | Refuse when TLSA absent. Implies prefer_dane=True. |
| require_dnssec | False | T37 | Refuse answers without AD flag / bogus |
| verify_freshness_seconds | 0 | BV-9, BV-2 | Re-resolve stale records; refuse on target_host / port / cap_sha256 drift |

All four have matching DNS_AID_* environment variables.
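
As a rough illustration of how the four flags and their environment variables could fit together, the sketch below uses a hypothetical `HardeningFlags` dataclass rather than the real `SDKConfig`. The exact environment variable names (`DNS_AID_PREFER_DANE` and so on) and the accepted truthy spellings are assumptions inferred from the `DNS_AID_*` prefix, not taken from the SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _env_bool(env: Mapping[str, str], name: str) -> bool:
    """Interpret common truthy spellings; anything else (or unset) is False."""
    return env.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class HardeningFlags:
    """Hypothetical stand-in for the four new SDKConfig fields."""

    prefer_dane: bool = False
    require_dane: bool = False
    require_dnssec: bool = False
    verify_freshness_seconds: int = 0

    @property
    def dane_enabled(self) -> bool:
        # require_dane implies prefer_dane, so either flag turns the gate on.
        return self.prefer_dane or self.require_dane

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "HardeningFlags":
        # Variable names below are assumed from the DNS_AID_* prefix.
        return cls(
            prefer_dane=_env_bool(env, "DNS_AID_PREFER_DANE"),
            require_dane=_env_bool(env, "DNS_AID_REQUIRE_DANE"),
            require_dnssec=_env_bool(env, "DNS_AID_REQUIRE_DNSSEC"),
            verify_freshness_seconds=int(
                env.get("DNS_AID_VERIFY_FRESHNESS_SECONDS", "0")
            ),
        )
```

With no variables set, every flag stays at its default, which mirrors the "defaults are unchanged" posture described above.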

What's NOT in this PR (scoped out)

  • Default require_dnssec=True migration — explicitly out of scope; adoption is too low to warrant a breaking default change
  • dns-aid monitor continuous re-verify daemon — separate PR
  • SLSA build provenance / Sigstore signing — CI-only, separate PR
  • Audit-log hash chaining (T23) — separate PR
  • policy= bundle schema (T46) — separate experimental design PR
  • W3C traceparent propagation (T44) — small, separable

See docs/security/owasp-maestro-mapping.md for the full status across T1-T47 + BV-1-12.

Files

New code:

  • src/dns_aid/core/_dane.py — shared DANE TLSA preflight helper

Modified code:

  • src/dns_aid/core/models.py — AgentRecord.discovered_at timestamp
  • src/dns_aid/core/discoverer.py — populate discovered_at; RFC 9460 §8 mandatory= enforcement
  • src/dns_aid/sdk/_config.py — four new opt-in flags + env-var wiring
  • src/dns_aid/sdk/client.py — freshness gate, DANE preflight gate, _parse_target_port, _reverify_agent

New tests (50 total):

  • tests/unit/test_dane_preflight.py (12) — DANE preflight matrix with mocked DNS + TLS
  • tests/unit/test_mandatory_keys.py (8) — RFC 9460 parser & gate
  • tests/unit/sdk/test_dane_invoke.py (10) — AgentClient wiring + URL parsing
  • tests/unit/sdk/test_verify_freshness.py (11) — TOCTOU re-verify positive + drift
  • tests/integration/test_dane_e2e.py (9) — end-to-end against a real in-process TLS server: generates a fresh RSA-2048 self-signed cert, spins up asyncio.start_server on a random localhost port, exercises the real handshake + cert extraction + RFC 6698 §2.1 matching. Hermetic, no Docker/BIND required, ~0.8s.

New docs:

  • docs/security/owasp-maestro-mapping.md — full T1-T47 + BV-1-12 mapping
  • docs/security/best-practices.md — operator-facing guidance + profiles

Doc updates:

  • README.md — security docs pointer
  • docs/api-reference.md — new flags + env var reference
  • docs/architecture.md — Trust-Enforcement Layer subsection

Attribution

This PR is built directly on published threat-model work. Full credit is given in docs/security/owasp-maestro-mapping.md.

Verification

All gates run locally on this branch:

  • pytest tests/unit tests/integration/test_dane_e2e.py -q → 1589 passed (1550 baseline + 30 unit + 9 e2e)
  • ruff format --check src/dns_aid → 81 files clean
  • ruff check src/dns_aid → clean
  • mypy src/dns_aid → 81 files, no issues

The end-to-end test in tests/integration/test_dane_e2e.py covers the
verification previously flagged for manual exercise — a real TLS
handshake against a self-signed cert + real cert extraction + real RFC 6698
match/mismatch — without needing Docker or external infrastructure.

Test plan

  • DANE preflight matrix (absent / match / mismatch / error) × (permissive / strict) — mocked
  • AgentClient wiring for DANE (skip-when-off, prefer-paths, refuse-paths)
  • Freshness re-verify positive + drift on target_host + drift on cap_sha256 + missing + exception
  • AgentClient gating for freshness (disabled, fresh, no-timestamp, stale+match, stale+drift, stale+failure)
  • mandatory= parser + consumer skip with unknown keys
  • _parse_target_port URL parsing edge cases
  • End-to-end: real TLS server + self-signed cert + real handshake → MATCH/MISMATCH verdicts; selectors 0/1; matching types 0/1/2; multi-record RRset

Introduce core/_dane.py as the single source of truth for TLSA lookup
and certificate-matching logic, used by the verifier (advisory mode
and full-cert-match mode inside verify()) and — in subsequent commits —
the SDK invocation path.

The helper exposes:

- DanePreflightStatus enum (match / absent / mismatch / error)
- DanePreflightResult dataclass with ok bool plus diagnostic fields
- fetch_tlsa_records() — DNS-only TLSA resolution
- match_cert_against_tlsa() — TLS handshake + RFC 6698 §2.1 matching
  across selector 0/1 and matching type 0/1/2
- dane_preflight() — high-level prefer-then-fallback semantics:
    absent + permissive -> ok (WebPKI fallback)
    absent + strict     -> refuse
    present + match     -> ok
    present + mismatch  -> always refuse (RFC 7671 promise)
    transient errors    -> fail-soft permissive, fail-hard strict
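
The prefer-then-fallback matrix above can be written as one small pure function. This is a minimal sketch: the `DanePreflightStatus` and `DanePreflightResult` names come from this commit, but the field layout and the `decide()` helper are assumptions made for illustration, not the actual code in core/_dane.py.

```python
from dataclasses import dataclass
from enum import Enum


class DanePreflightStatus(Enum):
    MATCH = "match"
    ABSENT = "absent"
    MISMATCH = "mismatch"
    ERROR = "error"


@dataclass(frozen=True)
class DanePreflightResult:
    status: DanePreflightStatus
    ok: bool


def decide(status: DanePreflightStatus, require_dane: bool) -> DanePreflightResult:
    """Map a lookup/match outcome plus strictness to an allow/refuse verdict."""
    if status is DanePreflightStatus.MATCH:
        ok = True
    elif status is DanePreflightStatus.MISMATCH:
        # A present-but-wrong TLSA record refuses in every mode.
        ok = False
    else:
        # ABSENT and ERROR: allowed in permissive mode, refused in strict mode.
        ok = not require_dane
    return DanePreflightResult(status=status, ok=ok)
```

Keeping the verdict logic separate from DNS and TLS I/O is what lets the matrix be unit-tested without a network.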

DANE-EE deployments commonly use self-signed certs; the handshake
intentionally disables WebPKI verification so the TLSA match is what
binds trust, per RFC 6698.
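
For readers unfamiliar with RFC 6698 §2.1, the matching step might look roughly like the sketch below. It assumes the DER-encoded certificate and its DER-encoded SubjectPublicKeyInfo have already been extracted from the handshake; the `Tlsa` tuple and `matches_any()` are hypothetical names, and the certificate usage field is carried but not evaluated here.

```python
import hashlib
import hmac
from typing import Iterable, NamedTuple, Optional


class Tlsa(NamedTuple):
    usage: int      # certificate usage (carried along, not evaluated in this sketch)
    selector: int   # 0 = full certificate, 1 = SubjectPublicKeyInfo
    mtype: int      # 0 = exact bytes, 1 = SHA-256, 2 = SHA-512
    data: bytes     # certificate association data from DNS


def _select(selector: int, cert_der: bytes, spki_der: bytes) -> Optional[bytes]:
    if selector == 0:
        return cert_der
    if selector == 1:
        return spki_der
    return None  # unknown selector: record cannot match


def _transform(mtype: int, blob: bytes) -> Optional[bytes]:
    if mtype == 0:
        return blob
    if mtype == 1:
        return hashlib.sha256(blob).digest()
    if mtype == 2:
        return hashlib.sha512(blob).digest()
    return None  # unknown matching type: record cannot match


def matches_any(records: Iterable[Tlsa], cert_der: bytes, spki_der: bytes) -> bool:
    """Return True if at least one TLSA record in the RRset matches."""
    for rec in records:
        blob = _select(rec.selector, cert_der, spki_der)
        if blob is None:
            continue
        candidate = _transform(rec.mtype, blob)
        if candidate is not None and hmac.compare_digest(candidate, rec.data):
            return True
    return False
```

A single matching record is sufficient, so an RRset that publishes both an old and a new key during rotation still validates.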

Mitigates OWASP MAESTRO T47 / T7.1 / T9 at the substrate-helper layer.
The threat enumeration comes from the OWASP Multi-Agentic System Threat
Modelling Guide v1.0 (Ken Huang et al.) and Scott Courtney's ANS/MAESTRO
mapping; full attribution lives in docs/security/owasp-maestro-mapping.md
landing in a subsequent commit.

Signed-off-by: Layer8 <NWillAU900@gmail.com>

…rcement

Two related substrate-side additions:

1. AgentRecord.discovered_at — POSIX timestamp populated by the discoverer
   when an agent record is built from a live resolution. Used by the SDK's
   verify_freshness_seconds knob (subsequent commit) to gate implicit
   re-verification before invoke. Records constructed outside the discovery
   pipeline (tests, manual builds) leave the field as None and are treated
   as fresh for backward compatibility.

2. RFC 9460 §8 mandatory= enforcement on consumption. The discoverer now
   parses the mandatory list from each SVCB record's text presentation and
   discards records whose mandatory list names any SvcParamKey the SDK
   does not implement. _SUPPORTED_SVCB_KEYS enumerates the RFC 9460
   baseline plus the DNS-AID extensions registered in models.DNS_AID_KEY_MAP;
   keyNNNNN aliases are normalized to their human-readable names.

   This is the publisher-driven fail-closed mechanism: a publisher who cares
   about (say) cap-sha256 integrity can declare it mandatory and know that
   downstream clients without that support will refuse to use the record
   rather than silently downgrading. Mitigates OWASP MAESTRO T7.6 (fallback
   downgrade).
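
A simplified sketch of the mandatory= gate follows. The key numbers are the RFC 9460 baseline registrations; the function names are hypothetical, the supported set omits the DNS-AID extension keys, and the whitespace-based tokenizer is an assumption that ignores quoted values containing spaces.

```python
import re
from typing import AbstractSet, List

# RFC 9460 baseline SvcParamKeys by number.
_KEY_NAMES = {
    0: "mandatory",
    1: "alpn",
    2: "no-default-alpn",
    3: "port",
    4: "ipv4hint",
    5: "ech",
    6: "ipv6hint",
}
# Assumption: the real supported set also includes the DNS-AID extension keys.
SUPPORTED_KEYS = frozenset(_KEY_NAMES.values())


def normalize_key(key: str) -> str:
    """Map a keyNNNNN alias to its registered name when one is known."""
    key = key.strip().lower()
    match = re.fullmatch(r"key(\d+)", key)
    if match:
        return _KEY_NAMES.get(int(match.group(1)), key)
    return key


def parse_mandatory(presentation: str) -> List[str]:
    """Extract the mandatory= list from an SVCB record's text presentation."""
    for token in presentation.split():
        name, sep, value = token.partition("=")
        if sep and name.lower() == "mandatory":
            return [normalize_key(k) for k in value.strip('"').split(",") if k.strip()]
    return []


def is_usable(presentation: str, supported: AbstractSet[str] = SUPPORTED_KEYS) -> bool:
    """A record is usable only if every mandatory key is one we implement."""
    return all(key in supported for key in parse_mandatory(presentation))
```

A record with no mandatory list parses to an empty list and therefore passes, which matches the non-breaking behavior described for existing records.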

Both additions are non-breaking. Existing records without mandatory keys
pass through unchanged; existing AgentRecord construction call-sites that
omit discovered_at get None and the SDK treats them as fresh.

Tests:
- tests/unit/test_mandatory_keys.py — parser correctness, key normalization,
  satisfied/unsatisfied matrix, mixed known/unknown cases

Signed-off-by: Layer8 <NWillAU900@gmail.com>

Adds three OWASP MAESTRO-driven hardening modes to the SDK invocation path.
Every mode is opt-in via SDKConfig flag; defaults preserve today's
WebPKI-only / no-re-verify behavior to match the real-world adoption
levels of DNSSEC, DANE TLSA, and mTLS on the public internet.

New SDKConfig flags (and DNS_AID_* env vars):

- prefer_dane (default False) — query TLSA before each invocation; pin
  TLS cert against the DNS-published key when present and matches; fall
  back to WebPKI when absent. Mismatch always refuses regardless of
  strictness (RFC 7671 promise).

- require_dane (default False) — refuse invocation when TLSA is absent.
  Implies prefer_dane=True. Use for zones committed to publishing TLSA.

- require_dnssec (default False) — refuse answers without AD flag /
  bogus DNSSEC. Off by default because most of the public DNS does not
  yet sign zones.

- verify_freshness_seconds (default 0) — when > 0, invocations against a
  stale DiscoveryResult (older than this many seconds) implicitly re-
  resolve via discover() and compare essential fields (target_host, port,
  cap_sha256) between cached and fresh records. Drift refuses with
  StaleDiscoveryDrift; match adopts the fresh record so downstream DANE
  and cap-sha256 checks operate on the latest authoritative state.
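
The freshness gate described for verify_freshness_seconds could be sketched as below. `StaleDiscoveryDrift` is named in this commit, but modelling it as an exception is an assumption, as are the `Record` dataclass, the `reverify()` signature, and the injected `resolve` callable standing in for discover().

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class StaleDiscoveryDrift(Exception):
    """Raised when a re-resolved record differs in an essential field."""


@dataclass(frozen=True)
class Record:
    target_host: str
    port: int
    cap_sha256: Optional[str]
    discovered_at: Optional[float] = None  # POSIX timestamp; None means "treat as fresh"


ESSENTIAL_FIELDS = ("target_host", "port", "cap_sha256")


def is_stale(record: Record, max_age: float, now: Optional[float] = None) -> bool:
    if max_age <= 0 or record.discovered_at is None:
        return False  # gate disabled, or record built outside discovery
    current = time.time() if now is None else now
    return current - record.discovered_at > max_age


def reverify(
    cached: Record,
    resolve: Callable[[], Record],
    max_age: float,
    now: Optional[float] = None,
) -> Record:
    """Return the record to invoke against, re-resolving only when stale."""
    if not is_stale(cached, max_age, now):
        return cached
    fresh = resolve()
    drifted = [f for f in ESSENTIAL_FIELDS if getattr(cached, f) != getattr(fresh, f)]
    if drifted:
        raise StaleDiscoveryDrift("drift in: " + ", ".join(drifted))
    return fresh  # adopt the fresh record so later gates see current state
```

Returning the fresh record on a match, rather than the cached one, is what lets the downstream DANE and cap-sha256 checks operate on the latest resolution.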

Threats addressed:

- T47 / T7.1 / T9 — rogue server / agent impersonation / identity spoofing
  via DANE TLSA pinning at the TLS layer
- BV-9 — time-of-check-to-time-of-use between verify() and invoke()
- BV-2 — tool description poisoning / rug-pull (the freshness gate's
  cap_sha256 comparison detects rotated cap-docs; cap_fetcher's existing
  hash check catches content-side drift on every fetch)

Implementation:

- src/dns_aid/sdk/_config.py — new fields + env-var wiring via
  SDKConfig.from_env()
- src/dns_aid/sdk/client.py — _parse_target_port() URL helper,
  _reverify_agent() pure helper, freshness gate (moved ahead of policy
  evaluation so that the DANE preflight operates on the fresh record),
  DANE preflight gate (between policy evaluation and protocol handler invoke).
  Both gates short-circuit with a structured InvocationSignal
  (REFUSED) instead of reaching the protocol handler.
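
The short-circuit behavior of the gates can be pictured as a simple ordered pipeline. This is only a schematic: `Refusal` and `run_gates()` are hypothetical names, and the real client returns a structured InvocationSignal rather than this dataclass.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Union


@dataclass(frozen=True)
class Refusal:
    gate: str
    reason: str


# A gate returns None to let the call proceed, or a Refusal to stop it.
Gate = Callable[[], Optional[Refusal]]


def run_gates(gates: Sequence[Gate], handler: Callable[[], str]) -> Union[Refusal, str]:
    """Run gates in order; the handler is reached only if every gate passes."""
    for gate in gates:
        refusal = gate()
        if refusal is not None:
            return refusal  # short-circuit: the protocol handler is never invoked
    return handler()
```

Under the ordering described above, the sequence passed in would be freshness, then policy evaluation, then DANE preflight.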

Tests:
- tests/unit/test_dane_preflight.py — full prefer/require matrix:
  absent + permissive/strict, present + match, present + mismatch (always
  refuses), transient errors (fail-soft permissive, fail-hard strict)
- tests/unit/sdk/test_dane_invoke.py — AgentClient wiring: skip-when-off,
  prefer-match-proceeds, prefer-absent-falls-back, mismatch-refuses,
  require-absent-refuses, require-error-refuses; plus _parse_target_port
  URL semantics (https/http default, explicit port, unknown scheme)
- tests/unit/sdk/test_verify_freshness.py — _reverify_agent positive and
  drift cases (target_host, cap_sha256, missing agent, resolver
  exception), then AgentClient gating: disabled, fresh, no-timestamp,
  stale+match (fresh adopted), stale+drift refused, stale+failure refused
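
The URL semantics exercised for _parse_target_port (scheme defaults, explicit port, unknown scheme) might be implemented along these lines. The return shape and the choice to return None for an unknown scheme without an explicit port are assumptions; the actual helper's contract is not shown in this PR text.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"https": 443, "http": 80}


def parse_target_port(url: str) -> Optional[Tuple[str, int]]:
    """Return (host, port) for a URL, or None when no port can be determined."""
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        return None
    try:
        port = parts.port  # raises ValueError for non-numeric or out-of-range ports
    except ValueError:
        return None
    if port is None:
        # No explicit port: fall back to the scheme default, if we know one.
        port = _DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        return None
    return host, port
```

The (host, port) pair is what a TLSA lookup needs, since TLSA owner names are built from the port and the host (for example `_443._tcp.<host>`).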

Signed-off-by: Layer8 <NWillAU900@gmail.com>

Lands the operator-facing and threat-model documentation for the
trust-enforcement hardening shipped in the prior commits.

Three new documents:

- docs/security/owasp-maestro-mapping.md — every threat in the canonical
  OWASP MAESTRO catalog (T1-T47 + BV-1 through BV-12) annotated against
  the dns-aid-core codebase: Mitigated / Partial / Out of scope / Gap.
  Summary at the bottom: 10 Mitigated, 9 Partial, 2 Gap (T23 audit-log
  integrity, BV-12 observability-overload classification — both tracked
  for follow-up PRs), remainder out of scope.

- docs/security/best-practices.md — operator-facing guidance split by
  persona (publishing operators vs calling operators). Includes the
  threat-to-flag matrix, three recommended SDKConfig profiles
  (permissive default / standard / strict), and env-var reference.

- docs/architecture.md — new Trust-Enforcement Layer subsection that
  shows the three substrate-layer gates (freshness, DANE preflight,
  caller policy) and their order in AgentClient.invoke().

Updates to existing docs:
- README.md — Documentation index now links docs/security/
- docs/api-reference.md — SDKConfig fields table extended with the four
  new flags + their MAESTRO threat citations; DNS_AID_* env vars added

Attribution

The threat enumeration, descriptions, and MAESTRO layering throughout
this PR are the work of:

- Ken Huang, A. Sheriff, J. Sotiropoulos, R. F. Del, V. Lu, et al. —
  OWASP Multi-Agentic System Threat Modelling Guide v1.0 (April 2025)
  https://genai.owasp.org/resource/multi-agentic-system-threat-modeling-guide-v1-0/
- MAESTRO Playbook & Threat Taxonomy (CC BY-SA 4.0)
  https://agentic-threat-modeling.github.io/MAESTRO/playbook/02-threat-taxonomy.html
- Scott Courtney (GoDaddy) — ANS / MAESTRO mapping
  https://github.com/godaddy/ans-registry/blob/main/MAESTRO.md

dns-aid-core's applicability commentary and code-side implementations
are derived from this prior art.

Signed-off-by: Layer8 <NWillAU900@gmail.com>

… server

Closes the 'manual verification needed' gap from the prior commits. The
new test file generates a fresh RSA-2048 self-signed certificate at
runtime, spins up an asyncio TLS server with that cert on a random
localhost port, and exercises the full DANE preflight against it.

Only fetch_tlsa_records is mocked (no DNS server required). The actual
TLS handshake, cert presentation, and the body of match_cert_against_tlsa
run for real, which is the path the existing unit tests with mocks
cannot verify.

Coverage (9 tests, hermetic, ~0.8s total):

- Real TLS handshake + TLSA(3, 1, 1) with the correct SPKI hash -> MATCH
- Real TLS handshake + TLSA(3, 1, 1) with a WRONG hash -> MISMATCH
  (refused in both permissive and strict modes — RFC 7671 promise)
- No TLSA + permissive -> ABSENT, ok=True (WebPKI fallback path)
- No TLSA + require_dane=True -> ABSENT, ok=False
- Selector 0 mtype 0 (exact full-cert match) against the live server
- Selector 0 mtype 1 (SHA-256 over full cert)
- Selector 0 mtype 2 (SHA-512 over full cert)
- RFC 6698 §2.1 multi-record RRset — first record wrong, second correct;
  match succeeds because any TLSA in the set matching is sufficient

No Docker or BIND container required; runs in the standard uv pytest
invocation. This complements the existing testbed/docker-compose
scaffolding for higher-level DCV/DNSSEC scenarios but keeps the DANE
substrate logic verifiable in pure-Python CI.

Signed-off-by: Layer8 <NWillAU900@gmail.com>