
feat(experimental): EDNS(0) agent-hint signaling [experimental] #123

Open
nicknacnic wants to merge 5 commits into dns-aid:main from nicknacnic:experimental/edns-signaling

@nicknacnic nicknacnic commented May 12, 2026

Collaborator

Summary

Experimental DNS-AID extension — an EDNS(0) option (agent-hint, code 65430 in the RFC 6891 private-use range) that lets clients attach selector filters to outgoing DNS queries. Any hop on the resolution path that understands the option may use the hint to narrow the response or short-circuit with a cached pre-filtered match. Stock authoritative servers ignore the unknown option, as RFC 6891 §6.1.2 requires, so the feature degrades gracefully.

This is a forward-looking design plus a client-side reference implementation, not a shipping feature. The primary deliverable is the design doc at docs/experimental/edns-signaling.md, written with section headers that map cleanly to a future IETF-draft structure so the content can be lifted as-is once an LF spec home exists.

Two-axis selector taxonomy

Axis 1 — substrate filters (0x01–0x0F). What records to return. Participate in the cache key.

| Code | Name | Value | What it asks for |
|------|------|-------|------------------|
| 0x01 | realm | UTF-8 | Match SVCB realm= param (multi-tenant scope) |
| 0x02 | transport | mcp / a2a / https | Encoded in _{proto}._agents + alpn |
| 0x03 | policy_required | "1" (or absent) | Only records carrying a policy= URI |
| 0x04 | min_trust | signed / dnssec / signed+dnssec | Gated on sig param + DNSSEC chain |
| 0x05 | jurisdiction | ISO region | Compliance lever |

Axis 2 — metering / lifecycle (0x10–0x1F). How to handle the request. Do NOT fragment the cache.

| Code | Name | Value | What it asks for |
|------|------|-------|------------------|
| 0x10 | client_intent_class | discovery / invocation | Browsing vs imminent-call; rate-limit policy |
| 0x11 | max_age | seconds | Cache freshness budget |
| 0x12 | parallelism | uint | Sibling-query count; fan-out signal to caches |
| 0x13 | deadline_ms | uint | Wait budget. Hint-only — no SLA refuse in v0 |

0x20+ reserved. client_cookie (0x20) and correlation_id (0x21) are documented as future selectors with explicit privacy / threat-model caveats; not coded in v0.

Not DNS-layer selectors. capabilities and intent live in the Channel 1 JSON advertisement (edns_signaling.honored_selectors in cap-doc / agent-card) and are used for post-fetch local filtering. The reason is layering: SVCB doesn't carry capability strings. Those live in cap-doc JSON that an authoritative server would have to dereference per query, breaking DNS latency budgets.
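
As a rough illustration of how the two-axis selectors could be serialised, the sketch below follows the layout quoted in this PR's commit messages: one VERSION byte, one SELECTOR-COUNT byte, then (1-byte code, 1-byte length, UTF-8 value) triples. The function name `encode_hint`, the dict-based input, and the sort-by-code ordering are assumptions made for the example; they are not the `AgentHint` API from this PR.

```python
from __future__ import annotations

# Selector codes copied from the two tables above.
REALM, TRANSPORT, POLICY_REQUIRED, MIN_TRUST, JURISDICTION = 0x01, 0x02, 0x03, 0x04, 0x05
CLIENT_INTENT_CLASS, MAX_AGE, PARALLELISM, DEADLINE_MS = 0x10, 0x11, 0x12, 0x13

QUERY_VERSION = 0x00  # VERSION byte seen in the capture quoted in this PR


def encode_hint(selectors: dict[int, str]) -> bytes:
    """Hypothetical encoder: VERSION, SELECTOR-COUNT, then code/length/value triples."""
    # Empty values are skipped, mirroring the "empty means not set" rule.
    # Sorting by code is an assumption; it keeps the output deterministic.
    items = [(code, value) for code, value in sorted(selectors.items()) if value]
    if len(items) > 255:
        raise ValueError("at most 255 selectors fit in the one-byte count")
    out = bytearray([QUERY_VERSION, len(items)])
    for code, value in items:
        raw = value.encode("utf-8")
        if len(raw) > 255:
            raise ValueError(f"selector 0x{code:02x} value exceeds 255 bytes")
        out += bytes([code, len(raw)]) + raw
    return bytes(out)


example = encode_hint({
    REALM: "prod",
    TRANSPORT: "mcp",
    MIN_TRUST: "signed",
    CLIENT_INTENT_CLASS: "invocation",
    PARALLELISM: "4",
    DEADLINE_MS: "30000",
})
print(len(example), example[:8].hex())
```

With the six selectors used in the smoke test, this yields a 43-byte payload, the same length as the 0x002b reported in the capture.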

Load-bearing invariant: Axis 1 ⊆ cache key

AgentHint.signature() includes Axis 1 only. Two queries that differ only in metering — say one with parallelism=4 and another with parallelism=64 — hit the same cache entry. Same answer set, different request policy. Locked in by test_axis2_only_differences_still_hit_cache and test_signature_includes_axis1_only.
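
The invariant can be pictured with a small sketch. It assumes a hint is a plain dict of selector names to strings and that the signature is a tuple of the Axis 1 fields that are set; the real `AgentHint.signature()` may compute something different, so treat `axis1_signature` as a hypothetical stand-in.

```python
from __future__ import annotations

# Axis 1 selector names from the taxonomy above; only these feed the cache key.
AXIS1_FIELDS = ("realm", "transport", "policy_required", "min_trust", "jurisdiction")


def axis1_signature(hint: dict[str, str]) -> tuple[tuple[str, str], ...]:
    """Hypothetical cache-key component built from Axis 1 selectors only."""
    # Unset or empty fields are left out so they cannot fragment the key.
    return tuple((name, hint[name]) for name in AXIS1_FIELDS if hint.get(name))


browse = {"realm": "prod", "transport": "mcp", "parallelism": "4"}
burst = {"realm": "prod", "transport": "mcp", "parallelism": "64"}
other_realm = {"realm": "staging", "transport": "mcp", "parallelism": "4"}

print(axis1_signature(browse) == axis1_signature(burst))        # metering differs only
print(axis1_signature(browse) == axis1_signature(other_realm))  # substrate filter differs
```

The first comparison is true and the second is false: metering selectors never change the key, while any change to a substrate filter does.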

Three loci of processing (forward-looking)

The wire format and advertisement schema support hint-aware processing at any hop along the resolution path:

  1. Locus 1 — in-client programmable hop. Today: EdnsAwareResolver. Long-term: the SDK growing into a real agentic cache, or a small DNS-like cache process co-located with the agent runtime. Always usable; ships in this PR.
  2. Locus 2 — hint-aware forwarder / recursive resolver. Future work.
  3. Locus 3 — hint-aware authoritative DNS server. Future work; the design treats this as a first-class deployment, not an aspiration.

What's in this PR

New experimental/ namespace convention (established here; documented in docs/architecture.md): code in src/dns_aid/experimental/, never re-exported from the top-level package, env-flag-gated runtime, [experimental] stderr banner on CLI commands, design docs in docs/experimental/.

Code:

  • src/dns_aid/experimental/{__init__,edns_hint,edns_cache}.py: AgentHint, AgentHintEcho, EdnsSignalingAdvertisement, EdnsAwareResolver
  • src/dns_aid/core/_edns_hint_ctx.py — private contextvar helper shared by discoverer.py and indexer.py (avoids circular import while keeping the hint threaded through discovery)
  • src/dns_aid/core/discoverer.py: agent_hint= kwarg on discover(), body extracted so the contextvar try/finally wraps cleanly
  • src/dns_aid/core/indexer.py — applies the hint on the _index._agents.{domain} TXT query path
  • src/dns_aid/core/{cap_fetcher,a2a_card,http_index}.py — lift the optional edns_signaling advertisement from JSON (forward-compat on unknown shapes)
  • src/dns_aid/cli/main.py — new dns-aid edns-probe <domain> command, env-flag-gated, [experimental] banner
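
The contextvar threading described for discoverer.py and indexer.py can be sketched as follows. This is a simplified, synchronous illustration: `_current_hint`, `get_current_hint`, and the body of `_discover_body` are assumptions made for the example, and the real `discover()` is likely asynchronous and does far more.

```python
from __future__ import annotations

import contextvars
from typing import Optional

# Hypothetical private contextvar; the PR keeps its helper in core/_edns_hint_ctx.py.
_current_hint: contextvars.ContextVar[Optional[dict]] = contextvars.ContextVar(
    "agent_hint", default=None
)


def get_current_hint() -> Optional[dict]:
    """What an indexer-side query path would call to pick up the active hint."""
    return _current_hint.get()


def _discover_body(domain: str) -> dict:
    # Stand-in for the extracted discovery body; it only reports what it sees.
    return {"domain": domain, "hint": get_current_hint()}


def discover(domain: str, agent_hint: Optional[dict] = None) -> dict:
    # set() returns a token; reset(token) restores the previous value even if
    # the body raises, so a hint never leaks into a later, unrelated call.
    token = _current_hint.set(agent_hint)
    try:
        return _discover_body(domain)
    finally:
        _current_hint.reset(token)


result = discover("example.com", agent_hint={"realm": "prod"})
print(result["hint"], get_current_hint())
```

Keeping the body in its own function means the try/finally wraps a single call, which is the shape the discoverer.py bullet describes.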

Tests (70 experimental):

  • test_edns_hint.py (40) — wire format round-trip both axes, malformed-input rejection, signature stability, Axis-1-only invariant, first-wins on duplicate selector codes (3 adversarial regression tests), empty Axis-1 values treated as field-not-set (3 tests)
  • test_edns_cache.py (12) — cache hit/miss, TTL, hint-mismatch, echo surfacing, Axis-2-only-differences-still-hit-cache
  • test_edns_hint_ctx.py (18) — env-flag gating, truthy/non-truthy matrix, exception-swallow (experimental crash MUST NOT propagate into stable core discovery), contextvar reset scoping
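
To make the cache behaviours listed for test_edns_cache.py concrete, here is a minimal sketch of a TTL cache keyed by query name, record type, and an Axis 1 signature. The class name `HintCache` and its methods are hypothetical and are not the `EdnsAwareResolver` interface; an injectable clock is used only so expiry can be shown without sleeping.

```python
from __future__ import annotations

import time
from typing import Callable, Hashable, Optional


class HintCache:
    """Hypothetical in-client cache: (qname, rdtype, Axis 1 signature) -> answer."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[tuple, tuple[float, object]] = {}

    def put(self, qname: str, rdtype: str, sig: Hashable, answer: object, ttl: float) -> None:
        key = (qname.lower(), rdtype, sig)
        self._entries[key] = (self._clock() + ttl, answer)

    def get(self, qname: str, rdtype: str, sig: Hashable) -> Optional[object]:
        key = (qname.lower(), rdtype, sig)
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss, or a different Axis 1 signature
        expires_at, answer = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return answer


now = [0.0]
cache = HintCache(clock=lambda: now[0])
prod_sig = (("realm", "prod"),)
cache.put("_index._agents.example.com", "TXT", prod_sig, ["agent-a"], ttl=30)

print(cache.get("_index._agents.example.com", "TXT", prod_sig))               # hit
print(cache.get("_index._agents.example.com", "TXT", (("realm", "dev"),)))    # hint mismatch
now[0] = 31.0
print(cache.get("_index._agents.example.com", "TXT", prod_sig))               # expired
```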

End-to-end wire verification (local): tests/testbed/smoke_edns.sh against the BIND9 testbed. tcpdump capture confirms the option (code 0xff96) reaches the authoritative on all three discovery query paths (index TXT / SVCB / capabilities TXT), with wire bytes matching the design doc bit-for-bit. Stock BIND9 returns no echo — correct behaviour for an inert authoritative, and the client's local-filter fallback engages cleanly. tcpdump is now pre-installed in the agent Dockerfile so the smoke script is self-contained.
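
For readers checking a capture by hand, the sketch below wraps a hint payload in the generic EDNS(0) option framing from RFC 6891: a 16-bit OPTION-CODE and a 16-bit OPTION-LENGTH in network byte order, followed by the option data. The selector values are the ones shown in the capture quoted in this PR's smoke-test commit message; the helper names `wrap_option` and `unwrap_option` are assumptions made for the example.

```python
from __future__ import annotations

import struct

AGENT_HINT_OPTION_CODE = 65430  # 0xff96, the private-use code used by this PR


def wrap_option(payload: bytes, code: int = AGENT_HINT_OPTION_CODE) -> bytes:
    """Prefix option data with OPTION-CODE and OPTION-LENGTH (both big-endian u16)."""
    if len(payload) > 0xFFFF:
        raise ValueError("option data does not fit a 16-bit length")
    return struct.pack("!HH", code, len(payload)) + payload


def unwrap_option(data: bytes) -> tuple[int, bytes]:
    """Split one framed option back into (code, payload); reject bad lengths."""
    if len(data) < 4:
        raise ValueError("truncated option header")
    code, length = struct.unpack("!HH", data[:4])
    if len(data) - 4 != length:
        raise ValueError("OPTION-LENGTH does not match the data present")
    return code, data[4:]


# Selectors as they appear in the capture: (code, UTF-8 value).
CAPTURED = [
    (0x01, b"prod"), (0x02, b"mcp"), (0x04, b"signed"),
    (0x10, b"invocation"), (0x12, b"4"), (0x13, b"30000"),
]
payload = bytes([0x00, len(CAPTURED)]) + b"".join(
    bytes([code, len(value)]) + value for code, value in CAPTURED
)
option = wrap_option(payload)
print(option[:4].hex())  # ff96002b
```

The first four bytes match the option code and length reported by tcpdump, which is a quick sanity check when comparing against the design doc.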

Docs:

  • docs/experimental/edns-signaling.md — full design doc (Overview / Motivation incl. async fan-out / Conceptual model / Wire format / Advertisement / Privacy / Security / Open Questions / Future Work)
  • docs/experimental/edns-signaling.abnf — wire-format ABNF with axis-encoded code ranges
  • docs/experimental/README.md — index + namespace conventions
  • README.md — Experimental Features pointer
  • docs/api-reference.md — new "Experimental: EDNS(0) signaling" section
  • docs/architecture.md — new "Experimental namespace" section

Self-audit against prior DCV review

| DCV pattern from prior review | EDNS feature status |
|-------------------------------|---------------------|
| First-wins on duplicate keys (mirroring _parse_txt_value) | ✓ Adversarial regression test: realm=prod + realm=evil → realm=="prod" |
| Empty-string vs None semantics | ✓ Empty Axis-1 values decode to None, matching encode-side if self.realm: skip |
| Fail-closed on malformed wire data | ✓ Truncated payload / invalid UTF-8 / garbage numeric → ValueError |
| Forward-compat on unknown codes | ✓ Silently skipped + test |
| DoS guards on parsed inputs | MAX_OPTION_PAYLOAD=512, MAX_SELECTOR_VALUE_LEN=255 |
| Lazy imports of experimental from core | TYPE_CHECKING or function-body imports only |
| Doc coverage (README / api-ref / architecture) | ✓ All three updated |
| MagicMock (not AsyncMock) for resolver in tests | ✓ Enforced via _make_upstream helper |
| Pre-commit gates green | ruff format, ruff check, mypy all clean |
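
Several rows of this table describe decoder behaviour, which the following sketch tries to show in one place: first-wins on duplicates, empty values treated as not set, unknown codes skipped, and `ValueError` on malformed input. It is not the decoder from edns_hint.py. The function name, the dict return type, applying the empty-value rule to Axis 2 as well, and ignoring the VERSION byte are all assumptions made for the example.

```python
from __future__ import annotations

MAX_OPTION_PAYLOAD = 512  # guard value named in the table above

KNOWN = {
    0x01: "realm", 0x02: "transport", 0x03: "policy_required",
    0x04: "min_trust", 0x05: "jurisdiction",
    0x10: "client_intent_class", 0x11: "max_age",
    0x12: "parallelism", 0x13: "deadline_ms",
}
NUMERIC = {"max_age", "parallelism", "deadline_ms"}


def decode_hint(payload: bytes) -> dict[str, str]:
    """Hypothetical decoder for VERSION, SELECTOR-COUNT, code/length/value triples."""
    if len(payload) > MAX_OPTION_PAYLOAD:
        raise ValueError("payload exceeds MAX_OPTION_PAYLOAD")
    if len(payload) < 2:
        raise ValueError("truncated header")
    count, pos = payload[1], 2  # payload[0] is VERSION, not checked in this sketch
    fields: dict[str, str] = {}
    for _ in range(count):
        if pos + 2 > len(payload):
            raise ValueError("truncated selector header")
        code, length = payload[pos], payload[pos + 1]
        pos += 2
        # A one-byte length can never exceed 255, so the value-length cap holds.
        if pos + length > len(payload):
            raise ValueError("truncated selector value")
        raw, pos = payload[pos:pos + length], pos + length
        name = KNOWN.get(code)
        if name is None:
            continue  # forward-compat: unknown codes are skipped silently
        try:
            value = raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError(f"selector 0x{code:02x} is not valid UTF-8") from exc
        if not value:
            continue  # empty value means "field not set"
        if name in NUMERIC and not (value.isascii() and value.isdigit()):
            raise ValueError(f"{name} must be a decimal number")
        fields.setdefault(name, value)  # first occurrence wins
    return fields


hostile = bytes([0x00, 0x02, 0x01, 0x04]) + b"prod" + bytes([0x01, 0x04]) + b"evil"
print(decode_hint(hostile))  # {'realm': 'prod'}
```

Using `setdefault` is what makes an appended `realm=evil` harmless here: the later duplicate is parsed for well-formedness but never overrides the first value.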

CI gates verified locally

  • ruff format --check src/dns_aid — clean
  • ruff check src/dns_aid — clean
  • mypy src/dns_aid — Success: no issues found in 84 source files
  • pytest tests/unit/ — 1569 passed; same 35 pre-existing CEL/ML-DSA failures, no new regressions
  • tests/testbed/smoke_edns.sh — option appears on the wire to BIND9 (manual; requires Docker)

Test plan

  • All 70 experimental tests pass
  • Full unit suite shows no regressions beyond pre-existing CEL/ML-DSA failures
  • mypy, ruff format --check, ruff check clean
  • Lazy import: from dns_aid import discover does not trigger dns_aid.experimental import
  • CLI gate respected: without DNS_AID_EXPERIMENTAL_EDNS_HINTS=1, dns-aid edns-probe prints env-var instruction and exits non-zero
  • First-wins on duplicate selector codes locked in by tests
  • Wire verification via tests/testbed/smoke_edns.sh — option 0xff96 appeared on the wire across all three discovery query paths
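
The gate and the exception-swallow behaviour from this checklist can be sketched like this. Only `DNS_AID_EXPERIMENTAL_EDNS_HINTS=1` is confirmed by this PR; the wider truthy set, the helper names, and the callable-based shape are assumptions for illustration.

```python
from __future__ import annotations

import os
from typing import Callable, Mapping

FLAG = "DNS_AID_EXPERIMENTAL_EDNS_HINTS"
# Assumption: the PR tests a truthy/non-truthy matrix, but only "1" is documented.
_TRUTHY = {"1", "true", "yes", "on"}


def hints_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the experimental flag holds a truthy value."""
    return environ.get(FLAG, "").strip().lower() in _TRUTHY


def apply_hint_safely(
    apply: Callable[[dict], dict],
    query: dict,
    environ: Mapping[str, str] = os.environ,
) -> dict:
    """Run the experimental step only behind the flag, and never let it raise."""
    if not hints_enabled(environ):
        return query  # dormant path: the query is left untouched
    try:
        return apply(query)
    except Exception:
        # An experimental crash must not propagate into stable discovery.
        return query


def broken(query: dict) -> dict:
    raise RuntimeError("experimental code path failed")


plain = {"qname": "_index._agents.example.com"}
print(apply_hint_safely(broken, plain, environ={FLAG: "1"}) is plain)
print(apply_hint_safely(lambda q: {**q, "hint": True}, plain, environ={}) is plain)
```

Both calls hand back the original query object: the first because the failure is swallowed, the second because the flag is unset.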

Out of scope (forward work)

  • Reference hint-aware authoritative server (wire format and advertisement schema designed to support it; building it is separate work)
  • IANA option-code reservation (currently private-use)
  • Recursive / forwarder reference implementation at Locus 2
  • Echo authentication (DNSSEC doesn't cover OPT records — open question §9.4)
  • deadline_ms enforcement (currently hint-only; structured SLA-refuse RCODE is §10 future work)

Branch

Three commits on `experimental/edns-signaling`:

  • `ec51cb5` — initial feature shape
  • `641068f` — taxonomy reshape to two-axis + adversarial-review fixes
  • `64a44f2` — testbed smoke-test polish (tcpdump in agent Dockerfile, v2 CLI flags)

Squash at merge time is fine if preferred.

Acknowledgments

Thanks to John Zinky (Akamai) for design-level input on the experimental EDNS(0) agent-hint signaling work — particularly on how hint-aware hops could fit into the broader resolver ecosystem.

@nicknacnic nicknacnic requested a review from iracic82 as a code owner May 12, 2026 17:14
Comment thread src/dns_aid/experimental/edns_hint.py Fixed
@nicknacnic nicknacnic force-pushed the experimental/edns-signaling branch 2 times, most recently from 0f30fc9 to 63a5998 Compare May 12, 2026 17:26
Introduce an experimental EDNS(0) option (agent-hint, code 65430, private use)
that lets a DNS-AID client signal selector filters with each outgoing query.
Any hop on the resolution path that understands the option may use the hint
to narrow the response or short-circuit with a cached pre-filtered match.
Stock authoritative servers treat the option as inert per RFC 6891 §6.1.2.

The design accommodates three loci of processing:
  1. In-process client cache (EdnsAwareResolver — shipped in this PR)
  2. Hint-aware forwarder / recursive resolver (out of scope, future work)
  3. Hint-aware authoritative DNS server (out of scope, future work)

Wire format carries up to 255 selectors of (1B code + 1B length + UTF-8 value),
with a soft 512-byte payload cap. v0 defines four selector codes matching the
existing Path A filter taxonomy: capabilities (0x01), intent (0x02),
transport (0x03), auth_type (0x04). A response-side echo (VERSION 0x80) lets
hint-aware hops advertise which selectors they honoured — absence is meaningful
(no upstream filtering happened, client should fall back to local filtering).

Publisher advertisement uses two complementary channels:
  - Channel 1 (JSON): `edns_signaling` block in cap-doc / agent-card / agents-index
    JSON. Lifted onto CapabilityDocument, A2AAgentCard, and HttpIndexAgent as an
    optional dict field. Forward-compat on unknown shapes (silent).
  - Channel 2 (DNS): response-side OPT echo (preferred), or an optional SVCB
    advertisement param key65409 = "agent-hint" (reserved, not emitted in v0).

Establishes the `src/dns_aid/experimental/` namespace convention:
  - Experimental APIs never re-exported from dns_aid/__init__.py — import
    explicitly: `from dns_aid.experimental import AgentHint`.
  - Per-feature env flag (DNS_AID_EXPERIMENTAL_EDNS_HINTS=1) gates runtime
    behaviour. Without the flag, code paths stay dormant.
  - CLI commands print `[experimental]` banner on stderr; exit non-zero when
    the flag isn't set.
  - Design docs live in docs/experimental/ with section headers that map to
    future IETF-draft structure for clean migration to LF spec material later.

Files:
  - src/dns_aid/experimental/{__init__,edns_hint,edns_cache}.py — new subpackage
  - src/dns_aid/core/_edns_hint_ctx.py — private contextvar helper shared by
    discoverer and indexer (avoids circular import while keeping the hint
    threaded through the discovery chain)
  - src/dns_aid/core/discoverer.py — agent_hint= kwarg on discover(); body
    extracted into _discover_body() so the contextvar try/finally wraps cleanly
  - src/dns_aid/core/indexer.py — applies the hint on the index TXT query path
  - src/dns_aid/core/{cap_fetcher,a2a_card,http_index}.py — lift the optional
    edns_signaling advertisement from JSON
  - src/dns_aid/cli/main.py — new edns-probe command with --show-wire
  - tests/unit/test_edns_hint.py + test_edns_cache.py — 37 new unit tests
  - tests/testbed/smoke_edns.sh — manual tcpdump-based wire verification against
    the BIND9 testbed
  - docs/experimental/{README,edns-signaling.md,edns-signaling.abnf} — full
    design doc with Overview / Motivation / Conceptual model / Wire format /
    Advertisement / Privacy / Security / Open questions / Future work
  - docs/api-reference.md — new "Experimental: EDNS(0) signaling" section
  - docs/architecture.md — new "Experimental namespace" section
  - README.md — Experimental Features pointer in the Documentation list

Out of scope (forward work):
  - Reference hint-aware authoritative server
  - IANA option-code reservation
  - Recursive/forwarder reference at Locus 2
  - Echo authentication (DNSSEC doesn't cover OPT records)

Signed-off-by: Layer8 <NWillAU900@gmail.com>
…ew fixes

Replaces the v1 selector set (capabilities / intent / transport / auth_type)
with a structured two-axis taxonomy after design review. The v1 selectors were
the wrong layer — capabilities and intent live in cap-doc JSON, not in SVCB,
so an authoritative server cannot filter on them without breaking DNS latency
budgets. Moves those to Channel 1 JSON (post-fetch local filter) and reshapes
the wire option around what the substrate actually has access to.

Wire format (option code 65430, RFC 6891 private use):

  Axis 1 — substrate filters (0x01–0x0F).  Participate in the cache key.
    0x01 realm            — multi-tenant scope; SVCB realm= param
    0x02 transport        — "mcp" | "a2a" | "https"
    0x03 policy_required  — only records carrying a policy= URI ("1" or absent)
    0x04 min_trust        — "signed" | "dnssec" | "signed+dnssec"
    0x05 jurisdiction     — ISO region tag ("eu", "us-east", ...)

  Axis 2 — metering / lifecycle (0x10–0x1F).  Drive request policy; do NOT
  fragment the cache.
    0x10 client_intent_class — "discovery" | "invocation"
    0x11 max_age             — UTF-8 decimal seconds
    0x12 parallelism         — sibling-query count (fan-out signal)
    0x13 deadline_ms         — wait budget; hint-only, no SLA refuse in v0

  0x20+ reserved.  client_cookie (0x20) and correlation_id (0x21) documented
  as future selectors; not coded in v0.

The two-axis split is structural: AgentHint.signature() includes Axis 1 only.
Two queries differing only in metering (parallelism, deadline_ms, etc.) hit
the same cache entry — the answer set is the same, the policy applied to the
request is different.  Locked in by test_axis2_only_differences_still_hit_cache.

Self-audit fixes against Igor's prior DCV review (CLAUDE.md / memory):

- First-wins on duplicate selector codes.  Mirrors _parse_txt_value in dcv.py.
  Defends against a hostile forwarder appending an overriding selector after
  a legitimate one.  Verified live: payload with realm=prod, realm=evil now
  decodes to realm="prod" (was "evil" pre-fix).
- Empty Axis-1 string values on decode treated as field-not-set.  Matches the
  encode-side semantics (encode skips empty strings via `if self.realm:`) and
  prevents a forged empty-value payload from fragmenting the cache key under
  a value the legitimate client would never produce.
- New tests/unit/test_edns_hint_ctx.py (18 tests).  Covers the most
  security-relevant runtime gate — no wire emission without
  DNS_AID_EXPERIMENTAL_EDNS_HINTS=1.  Parametrized truthy/non-truthy
  env-value matrix, exception-swallow (experimental crash MUST NOT propagate
  into stable core discovery), and contextvar reset scoping.

Conceptual model widened in the design doc: Locus 1 is now framed as "any
in-client programmable hop the SDK provides" rather than just the resolver
wrapper — covers the SDK growing into a real agentic cache, or a small
DNS-like cache process co-located with the agent runtime.  Axis-aware
code-range numbering documented as a deliberate trade-off vs flat numbering
(§9.3).  Async fan-out is now the primary worked example in §4.4.

Tests: 70 experimental (was 37 in v1).  ruff format/check, mypy --strict, full
unit suite (1569 passing) all clean before commit.

Signed-off-by: Layer8 <NWillAU900@gmail.com>
- Pre-install tcpdump in agents/Dockerfile so smoke_edns.sh doesn't need
  external network access at runtime (the testbed agent containers are
  intentionally pinned to the internal DNS topology with no upstream
  resolver, so apt-get install at smoke time can't fetch packages).
- Update the smoke_edns.sh probe call to use the new v2 CLI flags
  (--realm / --transport / --min-trust / --intent-class / --parallelism
   / --deadline-ms), matching the two-axis selector taxonomy.
- Tighten the result-check shell logic so `set -euo pipefail` doesn't bail
  early when grep doesn't find the marker on a real failure.

End-to-end verified locally against bind-orga (stock BIND9 9.20). The
agent-hint OPT record (code 0xff96 = 65430) reaches the authoritative on
all three discovery query paths — index TXT, SVCB, capabilities TXT — and
the wire payload matches the design doc bit-for-bit:

  0xff96 (option-code)  0x002b (length=43)
  0x00 (VERSION) 0x06 (SELECTOR-COUNT)
  01 04 "prod"        realm
  02 03 "mcp"         transport
  04 06 "signed"      min_trust
  10 0a "invocation"  client_intent_class
  12 01 "4"           parallelism
  13 05 "30000"       deadline_ms

Stock BIND9 returns the answer set without an AgentHintEcho — correct
behaviour for an inert-to-the-option authoritative, and the client's
"no upstream filtering happened" fallback path engages cleanly.

Signed-off-by: Layer8 <NWillAU900@gmail.com>
@nicknacnic nicknacnic force-pushed the experimental/edns-signaling branch from 63a5998 to 57c25ad Compare May 12, 2026 17:28
Rework the main README to position DNS-AID as a neutral DNS-layer
substrate rather than a centralized aggregator or competitor to other
agent-discovery efforts:

- Replace 'Companion services' hosted-directory billing with an
  'Ecosystem and Integrations' section that names no operator.
- Path B example: drop trust_tier / min_security_score from the lead
  snippet; frame the directory as opt-in convenience with the DNS
  substrate as the authoritative trust gate.
- Telemetry section: remove fetch_rankings() from the intro; keep
  local rank() and configurable HTTP push; mention community
  rankings only as caller-configured.
- CLI: dns-aid search / dns-aid submit use placeholder directory
  URLs and a --to flag.
- Delete the 'Server-Side: Agent Directory Pipeline' ASCII diagram
  (CRAWLING / CURATION / INDEXING / SERVING with trust_score /
  TSVECTOR / Rankings); replace with a brief paragraph stating
  directory/indexing services are out of scope and free to define
  their own scoring and ranking.
- Replace 'Why DNS-AID? / vs Competing Proposals / The Sovereignty
  Question / Google's Agent Ecosystem / .agent comparison table'
  with a single 'How DNS-AID Relates to Other Efforts' section in
  standards-org voice. Names ANS, A2A, AgentDNS, NANDA, ai.txt /
  llms.txt, and the .agent gTLD as parallel work at different layers
  of the stack rather than competitors.
- Soften three 'ANS-compatible' repetitions to 'format aligns with
  the ANS schema'.

Net: -145 / +48 lines.
Signed-off-by: Layer8 <NWillAU900@gmail.com>
Signed-off-by: Layer8 <NWillAU900@gmail.com>

2 participants