Skip to content

Nb/ensdb byte parity#9

Open
nikbhintade wants to merge 2 commits into
mainfrom
nb/ensdb-byte-parity
Open

Nb/ensdb byte parity#9
nikbhintade wants to merge 2 commits into
mainfrom
nb/ensdb-byte-parity

Conversation

@nikbhintade

@nikbhintade nikbhintade commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • New Features

    • Added validation-focused indexer configurations for testing and verification
    • Added label healing and name interpretation utilities for improved data accuracy
    • Added database validation comparison script for ENS indexer data consistency
  • Documentation

    • Added ENSDb parity analysis documenting schema compatibility and closure options
  • Refactoring

    • Standardized naming conventions across the indexer (camelCase to snake_case fields; singular to plural entity names)
    • Updated event handling patterns and data persistence layer throughout
  • Chores

    • Added @adraffy/ens-normalize dependency

nikbhintade and others added 2 commits June 11, 2026 17:53
Validated against the ENSIndexer reference (PLUGINS=subgraph,
SUBGRAPH_COMPAT=false) on mainnet 3327417-3700000: all 27 subgraph
tables byte-identical at the DB level.

- rename entities/fields to exact ENSDb postgres names (plural
  snake_case tables, snake_case columns); drop non-standard isActive
- add ENSRainbow label healing via Effect API (pinned subgraph/0
  label set) + ENSIP-15 label interpretation (lib/interpretation.ts)
- index Resolver events wildcard (no address filter) instead of
  contractRegister: the reference's Resolver datasource has no
  address, so it captures resolver events emitted before/without any
  NewResolver registration
- gate ALL RegistryOld event types on Domain.is_migrated (root-node
  exception for NewResolver), before any entity writes
- create root domain before first event (created_at=0, name='',
  is_migrated=true); drop preload-unsafe module-level init guards
- heal addr.reverse subname labels from tx sender/owner addresses
  (receipt/trace fallback TODO; throws loudly if required)
- add SUBGRAPH_COMPAT=true env mode mirroring the reference: legacy
  subgraph IDs (no chainId prefix, registrationId=labelHash),
  subgraph-interpreted labels, no reverse-address healing, null root
  name — for validation against the ENS Subgraph via
  ens-subgraph-transition-tools

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- scripts/compare-ensdb.py: byte-for-byte diff of all subgraph_*
  tables between the reference ENSIndexer postgres schema and the
  envio database (column sets, row counts, per-cell values)
- config.validation.yaml: mainnet-only subgraph scope at the
  snapshot-eq reference blockheight (21921222)
- config.validation-370k.yaml: same scope, end_block 3700000 for
  fast iteration (first auction-registrar activity)
- docs/ENSDB_PARITY.md: gap analysis notes

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR migrates the ENS indexer to a new schema naming convention with pluralized, snake_case entities (e.g., subgraph_domains, expiry_date), adds label healing/interpretation infrastructure for ENS domain names, updates resolver configuration to wildcard indexing, and harmonizes all handler code, tests, and documentation accordingly.

Changes

ENS Indexer Schema and Label Healing Migration

Layer / File(s) Summary
Schema, Configuration, and Dependencies
schema.graphql, config.yaml, config.validation.yaml, config.validation-370k.yaml, package.json
GraphQL schema redefined with pluralized entity names (subgraph_domains, subgraph_registrations, etc.) and snake_case field names (expiry_date, transaction_id, block_number, etc.). Configuration updated to enable wildcard resolver indexing across Ethereum, Base, Linea, and Optimism chains; NewOwner event field selection expanded to include transaction from value. Validation configurations added as references. @adraffy/ens-normalize dependency added for label normalization.
Label Interpretation and Healing Module
src/lib/interpretation.ts
New module providing label hashing, ENSIP-15 normalization checks, literal-to-interpreted label conversion, DNS packet parsing with strict validation, and external ENSRainbow heal API integration with retry logic and error handling. Exports IS_SUBGRAPH_COMPAT flag to control compat-mode behavior for ID generation and name encoding.
Helper Functions and ID Generation
src/lib/helpers.ts, src/lib/protocol-acceleration.ts, src/lib/registrar-helpers.ts
Updated helper utilities for entity upserts (upsertAccount, upsertResolver, upsertRegistration), event composition (sharedEventValues), and garbage collection to use pluralized stores and snake_case fields. ID generation helpers (makeResolverId, makeEventId, makeRegistrationId) now adjust behavior based on IS_SUBGRAPH_COMPAT flag. Protocol-acceleration coin-type handling and registrar-tracking block-number parameters switched to snake_case naming.
Registry Event Handlers
src/handlers/Registry.ts
Major refactor to gate RegistryOld events for migrated domains via shouldIgnoreRegistryOldEvents(), ensure root domain creation, integrate label healing via ENSRainbow, and emit events to pluralized entity stores. NewOwner handler updates migration flags and conditionally derives domain names; Transfer handler implements recursive garbage collection; NewResolver handler manages resolver references; NewTTL handler updates expiry-time fields.
Registrar Event Handlers
src/handlers/Registrar.ts, src/handlers/BaseRegistrar.ts, src/handlers/LineaRegistrar.ts
Registrars updated to heal and interpret labels via healLabelByLabelHash + interpretHealedLabel, use pluralized domain/registration stores, and emit events with snake_case fields. Controllers (EAController, RegController, UpgController, EthController, LegacyController, WrappedController, UnwrappedController) now pass plaintext label_name to setNamePreimage. Registration, renewal, and transfer events log using new field names (expiry_date, registration_date, new_owner_id, block_number).
Resolver and NameWrapper Handlers
src/handlers/Resolver.ts, src/handlers/NameWrapper.ts
Resolver handlers registered with wildcard: true for all event types; entity interactions migrated to pluralized stores and snake_case field names (content_hash, coin_types, resolved_address_id, coin_type). NameWrapper handler decodes interpreted names, updates wrapped-domain ownership and expiry, and manages wrapped-owner/fuse state using new schema.
Additional Handlers and Minor Updates
src/handlers/ThreeDNS.ts, src/handlers/ReverseRegistrar.ts, src/handlers/TokenScope.ts
ThreeDNS updated to use new domain/registration stores, manage subdomain_count on parent domains, and emit events with pluralized entity names. ReverseRegistrar and TokenScope updated with snake_case parameter and field names (coin_type, block_number).
Comprehensive Test Suite Updates
test/BaseRegistrar.test.ts, test/LineaRegistrar.test.ts, test/NameWrapper.test.ts, test/Registrar.test.ts, test/Registry.test.ts, test/Resolver.test.ts, test/ThreeDNS.test.ts, test/helpers.test.ts
All test files updated to read entities from pluralized store collections (subgraph_domains?.sets, subgraph_registrations?.sets, subgraph_resolvers?.sets), assert snake_case field names (expiry_date, transaction_id, block_number, label_name, wrapped_owner_id, coin_type, is_migrated), and verify event/entity correctness under new schema.
Documentation and Database Tooling
docs/ENSDB_PARITY.md, scripts/compare-ensdb.py
Added gap-analysis documentation describing divergences between Envio Postgres layout and ENSDb expectations (table pluralization, camelCase-to-snake_case field naming, type compatibility), with three proposed closure strategies. Python script for byte-for-byte comparison of ENS subgraph tables across two PostgreSQL containers, with configurable diff output limiting and exit codes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • enviodev/ens-indexer#8: Both PRs implement the same schema/entity naming migration to pluralized snake_case entities and update corresponding handler persistence and event-writing code paths.
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 73.81% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Nb/ensdb byte parity' is vague and uses non-descriptive terms that don't convey the actual scope of changes (schema migration, field renaming, new validation configs). Consider a more descriptive title that captures the main change, such as 'Migrate schema to snake_case naming and update entity references' or 'Refactor ENS schema to byte-parity with ENSDb conventions'.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/ThreeDNS.test.ts (1)

14-29: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard token-dependent integration tests to avoid hard CI failures.

These tests require HyperSync credentials, and CI is currently failing when ENVIO_API_TOKEN is absent. Add a shared gate (or fixture mode) so token-less runs skip integration specs instead of failing the whole suite.

Proposed minimal guard pattern
+const hasHyperSyncToken = Boolean(process.env.ENVIO_API_TOKEN);
+const itIfHyperSync = hasHyperSyncToken ? it : it.skip;
+
 describe("ThreeDNS (Optimism + Base)", () => {
   describe("ThreeDNSToken on Optimism", () => {
-    it("creates domains with hardcoded resolver on NewOwner", async () => {
+    itIfHyperSync("creates domains with hardcoded resolver on NewOwner", async () => {
       // ...
     });
   });

   describe("ThreeDNSToken on Base", () => {
-    it("decodes DNS name and creates registration on RegistrationCreated", async () => {
+    itIfHyperSync("decodes DNS name and creates registration on RegistrationCreated", async () => {
       // ...
     });
   });
 });

Also applies to: 53-68

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/ThreeDNS.test.ts` around lines 14 - 29, The integration tests in
test/ThreeDNS.test.ts that call createTestIndexer() and indexer.process(...)
depend on ENVIO_API_TOKEN and should be skipped when the token is missing; add a
guard at the top of the ThreeDNS test suite (or in the shared test setup/fixture
used by createTestIndexer) that checks process.env.ENVIO_API_TOKEN and calls the
test runner's skip/describe.skip (or returns early) to mark these specs as
skipped, ensuring both the early block that processes Ethereum (the
indexer.process calls) and the later block (lines around the second
indexer.process) are wrapped or gated so CI without the token does not run them.

Source: Pipeline failures

🧹 Nitpick comments (7)
src/handlers/Registry.ts (1)

136-157: 💤 Low value

Addr.reverse healing may throw on certain transactions.

The fallback error (Lines 146-151) is intentionally fail-fast, but this means the indexer will crash if neither transaction.from nor owner produces a matching label hash. The comment acknowledges receipt/trace healing isn't implemented.

This is acceptable for parity purposes, but consider logging a warning before throwing to aid debugging when this edge case is hit in production.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/handlers/Registry.ts` around lines 136 - 157, Before throwing the
fail-fast error in the addr.reverse healing branch, emit a warning with
contextual details: use context.logger.warn (or existing processLogger) inside
the IS_SUBGRAPH_COMPAT && parentNode === ADDR_REVERSE_NODE && event.chainId ===
1 block just before the throw to log labelHash, event.transaction.hash,
event.transaction.from and owner and note that receipt/trace fallback is
unimplemented; leave the throw of the Error unchanged and don't alter the
healing fallback to await healLabelByLabelHash or suppress the exception.
src/handlers/TokenScope.ts (1)

35-52: 💤 Low value

Inconsistent field naming in name_sale entity.

While block_number was correctly renamed to snake_case, other fields like logIndex, transactionHash, chainId, orderHash, contractAddress, tokenId, assetNamespace, assetId, domainId remain in camelCase. This creates an inconsistent naming convention within the same entity.

If this is intentional (e.g., matching a specific schema requirement), this can be ignored. Otherwise, consider aligning all fields to snake_case for consistency with the broader migration.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/handlers/TokenScope.ts` around lines 35 - 52, The name_sale entity
population in context.name_sale.set uses mixed casing; update the field keys to
snake_case for consistency (e.g., change logIndex → log_index, transactionHash →
transaction_hash, chainId → chain_id, orderHash → order_hash, contractAddress →
contract_address, tokenId → token_id, assetNamespace → asset_namespace, assetId
→ asset_id, domainId → domain_id) while leaving values and functions
(makeEventId, sale.nft.*, sale.payment.*, BigInt(event.block.timestamp))
unchanged so the object keys in the call to context.name_sale.set follow the
same snake_case convention as block_number.
docs/ENSDB_PARITY.md (1)

86-90: ⚡ Quick win

Verify drizzle snake_case conversion against live data.

The documentation notes that irregular identifier conversions (e.g., transactionIDtransaction_id vs. transaction_i_d) need verification against a live ENSDb dump to ensure the snake_case transformation matches drizzle's actual output.

Consider adding a validation step in the testing workflow to compare column names against the reference implementation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/ENSDB_PARITY.md` around lines 86 - 90, The docs warn that drizzle's
snake_case conversion for irregular identifiers (e.g., transactionID,
contentHash) can be incorrect, so add an automated validation step in the
test/workflow that fetches a live ENSDb dump (or a trusted sample) and compares
drizzle-generated column names against the source to detect mismatches
(transactionID → transaction_id vs transaction_i_d, contentHash → content_hash);
implement this as a test utility invoked in CI that runs drizzle's conversion on
the sample, diffs the column names, and fails the job with a clear message if
any conversions (transactionID, contentHash, blockNumber, etc.) diverge from the
reference.
schema.graphql (2)

318-343: ⚡ Quick win

Inconsistent field naming in registrar plugin entities.

The subregistry, registration_lifecycle, and registrar_action types mix snake_case and camelCase field names. While block_number and incremental_duration use snake_case, fields like subregistryId, expiresAt, baseCost, decodedReferrer, transactionHash, and eventIds remain in camelCase.

For consistency with the subgraph plugin migration (which uses snake_case throughout), consider renaming:

  • subregistryIdsubregistry_id
  • expiresAtexpires_at
  • baseCostbase_cost
  • decodedReferrerdecoded_referrer
  • transactionHashtransaction_hash
  • eventIdsevent_ids
  • logicalEventKeylogical_event_key
  • logicalEventIdlogical_event_id
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@schema.graphql` around lines 318 - 343, The schema mixes snake_case and
camelCase in registrar plugin types (types: subregistry, registration_lifecycle,
registrar_action) — rename the listed fields to snake_case to be consistent:
subregistryId → subregistry_id, expiresAt → expires_at, baseCost → base_cost,
premium (if desired) stay or change consistently, decodedReferrer →
decoded_referrer, transactionHash → transaction_hash, eventIds → event_ids, and
any logicalEventKey/logicalEventId to logical_event_key/logical_event_id; update
every schema declaration and all code that references these symbols (mappings,
resolvers, queries, tests) to use the new field names so compilation and runtime
lookups remain correct.

354-382: ⚡ Quick win

Inconsistent field naming in TokenScope plugin entities.

The name_token and name_sale types mix snake_case and camelCase field names. While block_number uses snake_case, fields like domainId, chainId, contractAddress, tokenId, assetNamespace, mintStatus, logIndex, transactionHash, orderHash, and assetId remain in camelCase.

For consistency with the subgraph plugin migration, consider snake_case for all fields in these types.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@schema.graphql` around lines 354 - 382, The schema has inconsistent field
naming in the TokenScope plugin entities (types name_token and name_sale);
convert all camelCase fields to snake_case to match block_number (e.g.,
domainId→domain_id, chainId→chain_id, contractAddress→contract_address,
tokenId→token_id, assetNamespace→asset_namespace, mintStatus→mint_status,
logIndex→log_index, transactionHash→transaction_hash, orderHash→order_hash,
assetId→asset_id, etc.), preserve any `@index` directives and types, and then
update any code, mappings or queries that reference these symbols (name_token,
name_sale and the renamed fields) so they use the new snake_case names
throughout the project.
scripts/compare-ensdb.py (2)

159-159: 💤 Low value

Add strict=True to zip() for Python 3.10+.

The zip() call compares values from rrow and erow without checking that they have the same length. While the equality check on line 157 (rrow == erow) would catch length mismatches, explicitly adding strict=True makes the intent clearer and provides better error messages.

for col, rv, ev in zip(shared, rrow, erow, strict=True):

This is only available in Python 3.10+. If supporting older Python versions, keep the current approach.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/compare-ensdb.py` at line 159, The loop using zip(shared, rrow, erow)
should be made explicit about sequence length equality by passing strict=True to
zip to surface mismatched-length errors (change the call in the loop that
iterates over shared, rrow, erow to use zip(..., strict=True)); if you must
support Python <3.10 keep the existing zip but add an explicit length check
comparing len(rrow) and len(erow) before the loop to raise a clear error. Ensure
you update the loop that binds col, rv, ev and any related error handling
accordingly.

71-74: 💤 Low value

Consider parameterized SQL queries for robustness.

While the current SQL construction using f-strings is safe in this context (inputs are controlled constants), consider using parameterized queries for better maintainability and to avoid potential issues if the script is adapted for other uses.

For example:

sql = """
    SELECT column_name, data_type 
    FROM information_schema.columns 
    WHERE table_schema = %s AND table_name = %s 
    ORDER BY column_name
"""
# Then pass parameters separately

Note: psql's -c flag doesn't support parameterized queries directly, but the current approach is acceptable for this local development tool with controlled inputs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/compare-ensdb.py` around lines 71 - 74, The SQL is currently built
with an f-string in the variable sql which can lead to fragility if schema/table
values change; change this to a parameterized query and pass (schema, table) as
parameters to the DB execute call instead of interpolating into sql directly
(e.g., replace the f-string in sql with a parameterized WHERE table_schema = %s
AND table_name = %s and modify the execute call that uses sql to supply (schema,
table)). Update the code that invokes sql (the DB execute/connection call in
compare-ensdb.py) to use those parameters so the query is executed safely and
maintainably.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@config.yaml`:
- Line 13: config.yaml declares transaction_fields: - from for
RegistryOld.NewOwner, Registry.NewOwner, and ThreeDNSToken.NewOwner but no
handler persists it; update either the GraphQL entity and handlers to persist
the field or remove the config entry. Specifically, add a "from" field to the
subgraph_new_owners GraphQL type/schema, run pnpm codegen, and in
src/handlers/Registry.ts and src/handlers/ThreeDNS.ts include
event.transaction.from when calling context.subgraph_new_owners.set (combine it
with sharedEventValues(...) so the saved object contains from), or alternatively
delete the transaction_fields: from lines from config.yaml if you intend not to
store it.

In `@schema.graphql`:
- Around line 11-20: Replace all entity relationship fields that reference other
entities (e.g., parent: subgraph_domains, owner: subgraph_accounts!, registrant:
subgraph_accounts) with ID fields named using the *_id convention and String
types (e.g., parent_id: String, owner_id: String!, registrant_id: String);
update every similar field across the schema to follow this pattern (use
non-null String! when original field was non-null), and then update any code
that accesses those fields (mappings, resolvers, ingest logic) to read/write the
new *_id fields instead of the entity references (search for occurrences of
parent, owner, registrant, resolver, wrapped_owner, resolved_address, etc., and
replace with parent_id, owner_id, registrant_id, resolver_id, wrapped_owner_id,
resolved_address_id as appropriate).

In `@src/lib/interpretation.ts`:
- Around line 219-220: In healLabelByLabelHash, the fetch(url) call can hang
indefinitely; replace it with a fetch that uses AbortSignal.timeout(...) and
pass the resulting signal via the fetch options (e.g., fetch(url, { signal }))
so the request times out; after the fetch, keep assigning the parsed JSON to
body as before (const response = await fetch(...); const body = await
response.json() as HealResponse), and ensure you choose a reasonable timeout
value (e.g., 5–15s) or read it from configuration; reference the
healLabelByLabelHash function and the response/body variables when making this
change.

---

Outside diff comments:
In `@test/ThreeDNS.test.ts`:
- Around line 14-29: The integration tests in test/ThreeDNS.test.ts that call
createTestIndexer() and indexer.process(...) depend on ENVIO_API_TOKEN and
should be skipped when the token is missing; add a guard at the top of the
ThreeDNS test suite (or in the shared test setup/fixture used by
createTestIndexer) that checks process.env.ENVIO_API_TOKEN and calls the test
runner's skip/describe.skip (or returns early) to mark these specs as skipped,
ensuring both the early block that processes Ethereum (the indexer.process
calls) and the later block (lines around the second indexer.process) are wrapped
or gated so CI without the token does not run them.

---

Nitpick comments:
In `@docs/ENSDB_PARITY.md`:
- Around line 86-90: The docs warn that drizzle's snake_case conversion for
irregular identifiers (e.g., transactionID, contentHash) can be incorrect, so
add an automated validation step in the test/workflow that fetches a live ENSDb
dump (or a trusted sample) and compares drizzle-generated column names against
the source to detect mismatches (transactionID → transaction_id vs
transaction_i_d, contentHash → content_hash); implement this as a test utility
invoked in CI that runs drizzle's conversion on the sample, diffs the column
names, and fails the job with a clear message if any conversions (transactionID,
contentHash, blockNumber, etc.) diverge from the reference.

In `@schema.graphql`:
- Around line 318-343: The schema mixes snake_case and camelCase in registrar
plugin types (types: subregistry, registration_lifecycle, registrar_action) —
rename the listed fields to snake_case to be consistent: subregistryId →
subregistry_id, expiresAt → expires_at, baseCost → base_cost, premium (if
desired) stay or change consistently, decodedReferrer → decoded_referrer,
transactionHash → transaction_hash, eventIds → event_ids, and any
logicalEventKey/logicalEventId to logical_event_key/logical_event_id; update
every schema declaration and all code that references these symbols (mappings,
resolvers, queries, tests) to use the new field names so compilation and runtime
lookups remain correct.
- Around line 354-382: The schema has inconsistent field naming in the
TokenScope plugin entities (types name_token and name_sale); convert all
camelCase fields to snake_case to match block_number (e.g., domainId→domain_id,
chainId→chain_id, contractAddress→contract_address, tokenId→token_id,
assetNamespace→asset_namespace, mintStatus→mint_status, logIndex→log_index,
transactionHash→transaction_hash, orderHash→order_hash, assetId→asset_id, etc.),
preserve any `@index` directives and types, and then update any code, mappings or
queries that reference these symbols (name_token, name_sale and the renamed
fields) so they use the new snake_case names throughout the project.

In `@scripts/compare-ensdb.py`:
- Line 159: The loop using zip(shared, rrow, erow) should be made explicit about
sequence length equality by passing strict=True to zip to surface
mismatched-length errors (change the call in the loop that iterates over shared,
rrow, erow to use zip(..., strict=True)); if you must support Python <3.10 keep
the existing zip but add an explicit length check comparing len(rrow) and
len(erow) before the loop to raise a clear error. Ensure you update the loop
that binds col, rv, ev and any related error handling accordingly.
- Around line 71-74: The SQL is currently built with an f-string in the variable
sql which can lead to fragility if schema/table values change; change this to a
parameterized query and pass (schema, table) as parameters to the DB execute
call instead of interpolating into sql directly (e.g., replace the f-string in
sql with a parameterized WHERE table_schema = %s AND table_name = %s and modify
the execute call that uses sql to supply (schema, table)). Update the code that
invokes sql (the DB execute/connection call in compare-ensdb.py) to use those
parameters so the query is executed safely and maintainably.

In `@src/handlers/Registry.ts`:
- Around line 136-157: Before throwing the fail-fast error in the addr.reverse
healing branch, emit a warning with contextual details: use context.logger.warn
(or existing processLogger) inside the IS_SUBGRAPH_COMPAT && parentNode ===
ADDR_REVERSE_NODE && event.chainId === 1 block just before the throw to log
labelHash, event.transaction.hash, event.transaction.from and owner and note
that receipt/trace fallback is unimplemented; leave the throw of the Error
unchanged and don't alter the healing fallback to await healLabelByLabelHash or
suppress the exception.

In `@src/handlers/TokenScope.ts`:
- Around line 35-52: The name_sale entity population in context.name_sale.set
uses mixed casing; update the field keys to snake_case for consistency (e.g.,
change logIndex → log_index, transactionHash → transaction_hash, chainId →
chain_id, orderHash → order_hash, contractAddress → contract_address, tokenId →
token_id, assetNamespace → asset_namespace, assetId → asset_id, domainId →
domain_id) while leaving values and functions (makeEventId, sale.nft.*,
sale.payment.*, BigInt(event.block.timestamp)) unchanged so the object keys in
the call to context.name_sale.set follow the same snake_case convention as
block_number.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 12dc1d2f-55ae-4381-9d19-358a3ea95431

📥 Commits

Reviewing files that changed from the base of the PR and between 12b7dfe and 451a931.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (28)
  • config.validation-370k.yaml
  • config.validation.yaml
  • config.yaml
  • docs/ENSDB_PARITY.md
  • package.json
  • schema.graphql
  • scripts/compare-ensdb.py
  • src/handlers/BaseRegistrar.ts
  • src/handlers/LineaRegistrar.ts
  • src/handlers/NameWrapper.ts
  • src/handlers/Registrar.ts
  • src/handlers/Registry.ts
  • src/handlers/Resolver.ts
  • src/handlers/ReverseRegistrar.ts
  • src/handlers/ThreeDNS.ts
  • src/handlers/TokenScope.ts
  • src/lib/helpers.ts
  • src/lib/interpretation.ts
  • src/lib/protocol-acceleration.ts
  • src/lib/registrar-helpers.ts
  • test/BaseRegistrar.test.ts
  • test/LineaRegistrar.test.ts
  • test/NameWrapper.test.ts
  • test/Registrar.test.ts
  • test/Registry.test.ts
  • test/Resolver.test.ts
  • test/ThreeDNS.test.ts
  • test/helpers.test.ts

Comment thread config.yaml
field_selection:
transaction_fields:
- hash
- from

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for NewOwner handler usage of transaction.from

# Check Registry and RegistryOld handlers
rg -n 'NewOwner' --type=ts -A 10 -B 2

# Check for transaction.from usage in handlers
rg -n 'transaction\.from' --type=ts src/handlers/

Repository: enviodev/ens-indexer

Length of output: 18455


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Find sharedEventValues definition
rg -n "sharedEventValues" -S src | head -n 50

# 2) Check where subgraph_new_owners is set and whether any `from` field is included directly
rg -n "subgraph_new_owners\.set\(" -S src/handlers

# 3) Search for direct assignment of a `from` property inside subgraph_new_owners writes
rg -n "subgraph_new_owners\.set\\(\\{[^}]*from" -S src/handlers || true

# 4) Inspect config.yaml around line 13
if [ -f config.yaml ]; then
  echo "----- config.yaml (lines 1-60) -----"
  sed -n '1,60p' config.yaml | cat -n
fi

# 5) Look for `from` field in the NewOwner entity/schema if present in repo (best-effort)
rg -n "subgraph_new_owners|NewOwner" -S schema.graphql src | head -n 100 || true

Repository: enviodev/ens-indexer

Length of output: 7489


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect sharedEventValues implementation
echo "----- sharedEventValues (src/lib/helpers.ts around 195) -----"
sed -n '160,260p' src/lib/helpers.ts | cat -n

# 2) Inspect subgraph_new_owners entity in schema.graphql
echo "----- subgraph_new_owners in schema.graphql -----"
# print around the type definition
sed -n '40,110p' schema.graphql | cat -n

# 3) Confirm config.yaml includes transaction_fields.from for ThreeDNSToken.NewOwner
echo "----- ThreeDNSToken NewOwner config (config.yaml) -----"
rg -n "ThreeDNSToken|ThreeDNSToken.*NewOwner|event: \"NewOwner" config.yaml
rg -n "ThreeDNSToken" config.yaml -n --context 2

Repository: enviodev/ens-indexer

Length of output: 6811


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Search schema.graphql for any `from` field on new owners-related entities
echo "----- schema.graphql: any 'from' references -----"
rg -n "\\bfrom\\b" schema.graphql || true

# 2) Show exact config.yaml block for ThreeDNSToken NewOwner (incl transaction_fields if present)
echo "----- config.yaml: ThreeDNSToken NewOwner block (around first occurrence) -----"
# Print around the first NewOwner event for ThreeDNSToken
python3 - <<'PY'
import re, itertools
path="config.yaml"
with open(path,'r',encoding='utf-8') as f:
    lines=f.readlines()
# find line indexes of ThreeDNSToken NewOwner event
idxs=[i for i,l in enumerate(lines) if 'name: ThreeDNSToken' in l]
# find first 'event: "NewOwner' after a ThreeDNSToken name
first=None
for i in idxs:
    for j in range(i, min(i+200, len(lines))):
        if 'event: "NewOwner' in lines[j]:
            first=j
            break
    if first is not None:
        break
if first is None:
    print("Could not find ThreeDNSToken NewOwner event block.")
    raise SystemExit(0)
start=max(0, first-40)
end=min(len(lines), first+80)
for k in range(start,end):
    ln=k+1
    print(f"{ln:4d}\t{lines[k].rstrip()}")
PY

# 3) Check all writes to subgraph_new_owners and see if any include `from`
echo "----- Handlers: subgraph_new_owners.set writes that mention from -----"
rg -n "subgraph_new_owners\\.set\\(" -S src/handlers
echo "----- Handlers: any writes including `from:` -----"
rg -n "subgraph_new_owners\\.set\\(\\{[^}]*from\\s*:" -S src/handlers || true

# 4) Check if any handler reads event.transaction.from inside Registry/ThreeDNS NewOwner flow
echo "----- Handlers: NewOwner handlers that reference transaction.from -----"
rg -n "NewOwner|handleNewOwner|contract: \"RegistryOld\"|contract: \"Registry\"|contract: \"ThreeDNSToken\"" src/handlers/Registry.ts src/handlers/ThreeDNS.ts -S
rg -n "transaction\\.from" src/handlers/Registry.ts src/handlers/ThreeDNS.ts -S || true

Repository: enviodev/ens-indexer

Length of output: 8706


NewOwner transaction.from is selected in config.yaml but not persisted by any NewOwner handler

  • config.yaml requests transaction_fields: - from for RegistryOld.NewOwner, Registry.NewOwner, and ThreeDNSToken.NewOwner, but sharedEventValues() only sets transaction_id (tx hash) and schema.graphql’s subgraph_new_owners has no from field.
  • src/handlers/Registry.ts and src/handlers/ThreeDNS.ts both write context.subgraph_new_owners.set({ ...sharedEventValues(...), ... }) without adding a from value; event.transaction.from is only used in Registry.ts for label healing, not stored on the NewOwner entity.

Update the GraphQL entity/types to include a from field and set it from event.transaction.from in all NewOwner handlers (then run pnpm codegen), or remove transaction_fields: from from config.yaml if it’s intentionally unused.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config.yaml` at line 13, config.yaml declares transaction_fields: - from for
RegistryOld.NewOwner, Registry.NewOwner, and ThreeDNSToken.NewOwner but no
handler persists it; update either the GraphQL entity and handlers to persist
the field or remove the config entry. Specifically, add a "from" field to the
subgraph_new_owners GraphQL type/schema, run pnpm codegen, and in
src/handlers/Registry.ts and src/handlers/ThreeDNS.ts include
event.transaction.from when calling context.subgraph_new_owners.set (combine it
with sharedEventValues(...) so the saved object contains from), or alternatively
delete the transaction_fields: from lines from config.yaml if you intend not to
store it.

Source: Coding guidelines

Comment thread schema.graphql
Comment on lines +11 to +20
parent: subgraph_domains @index
subdomain_count: Int!
resolved_address: subgraph_accounts @index
resolver: subgraph_resolvers
ttl: BigInt
isMigrated: Boolean!
createdAt: BigInt!
owner: subgraph_account! @index
registrant: subgraph_account @index
wrappedOwner: subgraph_account @index
expiryDate: BigInt
subdomains: [subgraph_domain!]! @derivedFrom(field: "parent")
is_migrated: Boolean!
created_at: BigInt!
owner: subgraph_accounts! @index
registrant: subgraph_accounts @index
wrapped_owner: subgraph_accounts @index

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Schema relationships violate coding guidelines.

The schema uses entity references (e.g., parent: subgraph_domains, owner: subgraph_accounts!) instead of ID fields. According to the coding guidelines, relationships must use entity_id fields, not entity references.

For example:

  • Line 11: parent: subgraph_domains should be parent_id: String
  • Line 18: owner: subgraph_accounts! should be owner_id: String!
  • Line 19: registrant: subgraph_accounts should be registrant_id: String

This pattern is repeated throughout the schema in all entity definitions.

As per coding guidelines: "Use entity_id for relationships in schema - Use token_id: String! not token: Token!"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@schema.graphql` around lines 11 - 20, Replace all entity relationship fields
that reference other entities (e.g., parent: subgraph_domains, owner:
subgraph_accounts!, registrant: subgraph_accounts) with ID fields named using
the *_id convention and String types (e.g., parent_id: String, owner_id:
String!, registrant_id: String); update every similar field across the schema to
follow this pattern (use non-null String! when original field was non-null), and
then update any code that accesses those fields (mappings, resolvers, ingest
logic) to read/write the new *_id fields instead of the entity references
(search for occurrences of parent, owner, registrant, resolver, wrapped_owner,
resolved_address, etc., and replace with parent_id, owner_id, registrant_id,
resolver_id, wrapped_owner_id, resolved_address_id as appropriate).

Source: Coding guidelines

Comment thread src/lib/interpretation.ts
Comment on lines +219 to +220
const response = await fetch(url);
const body = (await response.json()) as HealResponse;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does fetch AbortSignal.timeout work in Node.js 18+?

💡 Result:

Yes, AbortSignal.timeout is supported and works in Node.js 18 [1][2]. The AbortSignal.timeout(delay) method was added to Node.js in versions 17.3.0 and 16.14.0 [1][3]. Since Node.js 18 was released after these versions, it includes this functionality out of the box [4][5]. When used with the built-in fetch API in Node.js 18+, it allows you to specify a timeout for your requests [6]: import { fetch } from 'undici'; // or use the global fetch in Node.js 18+ const controller = new AbortController; const signal = AbortSignal.timeout(5000); // Aborts after 5 seconds try { const response = await fetch('https://example.com', { signal }); const data = await response.json; } catch (err) { if (err.name === 'TimeoutError') { console.error('Request timed out'); } } Important notes regarding its behavior: - Node.js process lifecycle: AbortSignal.timeout uses an unref'd timer by design [7]. This means it will not keep the Node.js event loop alive if there are no other active tasks [8]. - Troubleshooting: If you encounter issues where it seems not to work as expected with fetch, ensure you are passing the signal correctly in the options object [9][6]. Some advanced timeout configurations (like specific connection or header timeouts) may require using a custom undici Dispatcher/Agent instead of just a standard AbortSignal [10][6].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the relevant section around the cited lines
FILE="src/lib/interpretation.ts"
echo "== $FILE (lines 170-260) =="
nl -ba "$FILE" | sed -n '170,260p'

Repository: enviodev/ens-indexer

Length of output: 173


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the healLabelByLabelHash function definition (for additional context)
rg -n "healLabelByLabelHash|heal label|healLabel" src/lib/interpretation.ts || true

Repository: enviodev/ens-indexer

Length of output: 156


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check whether fetch is used elsewhere in this file without a timeout for consistency
echo "== fetch calls in src/lib/interpretation.ts =="
rg -n "await fetch\\(" src/lib/interpretation.ts || true

Repository: enviodev/ens-indexer

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/lib/interpretation.ts"
echo "== $FILE (lines 190-255) =="
awk 'NR>=190 && NR<=255 {printf "%d:%s\n", NR, $0}' "$FILE"

Repository: enviodev/ens-indexer

Length of output: 2273


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/lib/interpretation.ts"
echo "== $FILE (lines 155-190) =="
awk 'NR>=155 && NR<=190 {printf "%d:%s\n", NR, $0}' "$FILE"

Repository: enviodev/ens-indexer

Length of output: 1369


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/lib/interpretation.ts"
echo "== $FILE (lines 255-320) =="
awk 'NR>=255 && NR<=320 {printf "%d:%s\n", NR, $0}' "$FILE"

Repository: enviodev/ens-indexer

Length of output: 112


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="src/lib/interpretation.ts"
echo "== $FILE (lines 190-255) =="
awk 'NR>=190 && NR<=255 {printf "%d:%s\n", NR, $0}' "$FILE"

Repository: enviodev/ens-indexer

Length of output: 2273


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Check fetch/timeout-related runtime assumptions
echo "== package.json =="
cat package.json

Repository: enviodev/ens-indexer

Length of output: 715


Add an AbortSignal.timeout to the ENSRainbow fetch call (src/lib/interpretation.ts:219)

healLabelByLabelHash uses await fetch(url) without a timeout, so a hung connection can stall healing indefinitely; Node (engines >=22) supports AbortSignal.timeout.

🛡️ Proposed fix to add timeout
-        const response = await fetch(url);
+        const response = await fetch(url, { signal: AbortSignal.timeout(30_000) });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const response = await fetch(url);
const body = (await response.json()) as HealResponse;
const response = await fetch(url, { signal: AbortSignal.timeout(30_000) });
const body = (await response.json()) as HealResponse;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/interpretation.ts` around lines 219 - 220, In healLabelByLabelHash,
the fetch(url) call can hang indefinitely; replace it with a fetch that uses
AbortSignal.timeout(...) and pass the resulting signal via the fetch options
(e.g., fetch(url, { signal })) so the request times out; after the fetch, keep
assigning the parsed JSON to body as before (const response = await fetch(...);
const body = await response.json() as HealResponse), and ensure you choose a
reasonable timeout value (e.g., 5–15s) or read it from configuration; reference
the healLabelByLabelHash function and the response/body variables when making
this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant