Implement cross-section join for client-side filtering #1292

Open
DZakh wants to merge 13 commits into claude/amazing-thompson-IzI1M from claude/tender-davinci-C5CyW

@DZakh DZakh commented Jun 9, 2026

Member

Summary

Adds cross-section join logic that propagates row-level filters across the block ⊃ transaction ⊃ log hierarchy. When a filter on one section is combined with output selected from another, the client now restricts the output rows through the foreign-key relationships between sections (block number, transaction index).

Key Changes

  • JoinPlan struct: Determines when cross-section joining is needed based on which sections are filtered vs. selected for output. Activates only when at least two sections are relevant and one carries a row-level filter.

  • Extra field injection: JoinPlan::extra_fields() specifies which join-key columns (block.number, transaction.block_number, transaction.transaction_index, log.block_number, log.transaction_index) must be fetched from the server to evaluate joins client-side.

  • Bidirectional propagation: apply_join() implements down-propagation (filtered parents drop children) and up-propagation (filtered children drop parents with no surviving descendants). Filtered sections act as inner joins; unfiltered ones as optional.

  • Helper utilities: Added read_u64_col(), read_u64(), kept_set(), kept_pair_set(), materialize(), retain(), and section_rows() to extract and manipulate join keys and masks.

  • Integration: Updated compute_masks() signature to accept JoinPlan, and modified the executor to construct the plan from filter and output metadata before querying.

  • Test coverage: Added three tests validating bidirectional join behavior, block filter restriction of logs, and join activation logic.
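The activation rule and the extra-field injection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Section` enum, the constructor, and the internal sets are hypothetical, and only the name `JoinPlan`, the method name `extra_fields`, and the column names come from the description. The sketch also ignores the special handling of `block.number` ranges mentioned in the commit message.

```rust
use std::collections::BTreeSet;

/// The three sections of the block ⊃ transaction ⊃ log hierarchy.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Section {
    Block,
    Transaction,
    Log,
}

/// Hypothetical stand-in for the PR's `JoinPlan`; the field layout is an assumption.
#[derive(Debug)]
pub struct JoinPlan {
    filtered: BTreeSet<Section>,
    output: BTreeSet<Section>,
}

impl JoinPlan {
    pub fn new(filtered: &[Section], output: &[Section]) -> Self {
        JoinPlan {
            filtered: filtered.iter().copied().collect(),
            output: output.iter().copied().collect(),
        }
    }

    /// Sections that are either filtered or selected for output.
    fn relevant(&self) -> BTreeSet<Section> {
        self.filtered.union(&self.output).copied().collect()
    }

    /// Active only when at least two sections are relevant and one is filtered.
    pub fn is_active(&self) -> bool {
        self.relevant().len() >= 2 && !self.filtered.is_empty()
    }

    /// Join-key columns to fetch in addition to the user's selection.
    pub fn extra_fields(&self) -> Vec<&'static str> {
        let mut fields = Vec::new();
        if !self.is_active() {
            return fields;
        }
        for section in self.relevant() {
            match section {
                Section::Block => fields.push("block.number"),
                Section::Transaction => fields.extend([
                    "transaction.block_number",
                    "transaction.transaction_index",
                ]),
                Section::Log => {
                    fields.extend(["log.block_number", "log.transaction_index"])
                }
            }
        }
        fields
    }
}
```

Under these assumptions, a block filter combined with log output activates the plan, while a log filter with log-only output does not, because only one section is relevant.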

Implementation Details

  • Join keys are read as u64 with support for multiple Arrow types (UInt64, UInt8, Binary).
  • Masks are materialized to full-length boolean vectors when absent, enabling uniform join logic.
  • The join plan is constructed once per query using filtered_sections() (from WhereFilter) and output_sections() (from Selection), avoiding repeated computation.
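To make the first two points concrete, here is a small sketch of reading a join key as `u64` and of materializing an absent mask. It is written against a hypothetical `KeyColumn` enum instead of real Arrow arrays so that it stays self-contained, and it assumes binary keys are big-endian integers of at most 8 bytes; neither detail is confirmed by the PR text.

```rust
/// Hypothetical simplified column; the PR reads Arrow arrays instead.
pub enum KeyColumn {
    UInt64(Vec<u64>),
    UInt8(Vec<u8>),
    Binary(Vec<Vec<u8>>),
}

/// Reads the join key at `row` as a u64, or None if the row is out of range
/// or the binary value does not fit into 8 bytes.
pub fn read_u64(col: &KeyColumn, row: usize) -> Option<u64> {
    match col {
        KeyColumn::UInt64(values) => values.get(row).copied(),
        KeyColumn::UInt8(values) => values.get(row).map(|&b| u64::from(b)),
        KeyColumn::Binary(values) => {
            let bytes = values.get(row)?;
            if bytes.len() > 8 {
                return None;
            }
            // Assumption: big-endian byte order.
            Some(bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
        }
    }
}

/// Turns an absent mask into an explicit keep-everything mask of length `len`,
/// so the join logic can treat every section the same way.
pub fn materialize(mask: Option<Vec<bool>>, len: usize) -> Vec<bool> {
    mask.unwrap_or_else(|| vec![true; len])
}
```

With an explicit mask for every section, the join step can always combine masks with a plain logical AND instead of branching on whether a filter produced one.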

https://claude.ai/code/session_01RCunfUs5qfz83LiB3V2123

claude and others added 11 commits June 3, 2026 14:07
Filters on one section (block/transaction/log) now restrict rows of the
related sections via the block/transaction/log foreign keys. Filtered
sections act as inner joins, unfiltered ones as optional: a filtered
parent drops its children, and a filtered child drops parents with no
surviving descendant. Join keys (blockNumber/transactionIndex) are
auto-injected into the field selection when a join is active; block.number
ranges stay scan-window-only and request no extra column.
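
The inner-join versus optional-join behaviour described in this commit message can be illustrated with a two-level sketch. It is a simplification under stated assumptions: `Rows`, `restrict`, and this signature of `apply_join` are hypothetical, a single `u64` key stands in for the block number, and the real transaction-to-log join would presumably use a (block number, transaction index) pair.

```rust
use std::collections::HashSet;

/// Hypothetical per-section view: one join key and one keep-flag per row.
pub struct Rows {
    pub keys: Vec<u64>,
    pub mask: Vec<bool>,
    /// True when the section carries a row-level filter.
    pub filtered: bool,
}

/// Clears every `dst` row whose key has no surviving row in `src`.
fn restrict(src_keys: &[u64], src_mask: &[bool], dst_keys: &[u64], dst_mask: &mut [bool]) {
    let kept: HashSet<u64> = src_keys
        .iter()
        .zip(src_mask)
        .filter_map(|(&key, &keep)| if keep { Some(key) } else { None })
        .collect();
    for (keep, key) in dst_mask.iter_mut().zip(dst_keys) {
        *keep = *keep && kept.contains(key);
    }
}

/// A filtered parent drops its children (down-propagation); a filtered child
/// drops parents with no surviving descendant (up-propagation). An unfiltered
/// side never restricts the other side.
pub fn apply_join(parent: &mut Rows, child: &mut Rows) {
    if parent.filtered {
        restrict(&parent.keys, &parent.mask, &child.keys, &mut child.mask);
    }
    if child.filtered {
        restrict(&child.keys, &child.mask, &parent.keys, &mut parent.mask);
    }
}
```

In this sketch, selecting a child section for output without filtering it leaves childless parents untouched, which matches the "unfiltered ones as optional" rule above.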
* Persist batches concurrently with processing

Make the in-memory store fire its batch write into a single-slot
pendingPersistence and return control, so the next batch can process
while the previous one writes. At most one write is in flight: the next
write awaits the prior before firing, keeping writes in batch order.

- InMemoryStore: split writeBatch into a synchronous prepare (snapshot +
  store reset + committedCheckpointId advance) and a fired storage write.
  Concurrency is gated on keepLatestChanges; when the store drops its
  latest changes the write is awaited inline so later DB reads stay
  consistent. flushPendingPersistence awaits the in-flight write and is
  called from prepareRollbackDiff before clearing the cache.
- LoadLayer: serve effect cache hits from the in-flight write's effects
  snapshot before reading the not-yet-committed DB rows.
- GlobalState: flush the pending write before rollback DB reads and
  before the success-exit paths.
- MockIndexer: getBatchWritePromise awaits the in-flight write.
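
The single-slot idea (at most one write in flight, writes kept in batch order) can be sketched in Rust as an analogy. The actual change is in the ReScript runtime and uses promises; here OS threads stand in for them, and `Store`, `write_batch`, `flush`, and `written_ids` are hypothetical names chosen for the illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical store with a single pending-write slot.
pub struct Store {
    pending: Option<JoinHandle<()>>,
    written: Arc<Mutex<Vec<u32>>>,
}

impl Store {
    pub fn new() -> Self {
        Store { pending: None, written: Arc::new(Mutex::new(Vec::new())) }
    }

    /// Fires the write for `batch_id` and returns immediately. The new write
    /// first waits for the prior one, so writes never overlap and stay ordered.
    pub fn write_batch(&mut self, batch_id: u32) {
        let prior = self.pending.take();
        let written = Arc::clone(&self.written);
        self.pending = Some(thread::spawn(move || {
            if let Some(handle) = prior {
                handle.join().expect("prior write panicked");
            }
            // Stand-in for the actual storage write.
            written.lock().unwrap().push(batch_id);
        }));
    }

    /// Waits for the in-flight write, e.g. before a rollback or a clean exit.
    pub fn flush(&mut self) {
        if let Some(handle) = self.pending.take() {
            handle.join().expect("write panicked");
        }
    }

    /// Batch ids in the order they were written.
    pub fn written_ids(&self) -> Vec<u32> {
        self.written.lock().unwrap().clone()
    }
}
```

The caller can process the next batch right after `write_batch` returns; `flush` plays the role the commit assigns to `flushPendingPersistence`.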

Known pending: one realtime-ordering E2E test ("Live source should not
participate in initial height fetch but should after sync") asserts a
fetch-vs-write interleaving that legitimately shifts now that the write
no longer blocks EventBatchProcessed. Awaiting decision on how to update.

https://claude.ai/code/session_01VEEEfkaYzNwoeb1iuqm9A1

* Update realtime height-race test for concurrent batch writes

Reaching head now flips isRealtime before the first waitForNewBlock
since the batch write no longer blocks EventBatchProcessed, so the
first race already runs in realtime mode (Live primary, Sync secondary).

https://claude.ai/code/session_01VEEEfkaYzNwoeb1iuqm9A1

* Flush pending write on mock restart; fix effect read-through across schema change

Mock indexer restart now awaits the in-memory store's in-flight write
before starting the new indexer on the same DB, so the old and new
writes don't race.

Fix the effect cache read-through to only serve from the in-flight
write's snapshot when it's the same effect instance. A different effect
sharing the name (e.g. an updated output schema) must go through the
DB-load path, which re-validates and invalidates stale outputs; serving
the raw pending value bypassed invalidation and fed handlers a stale
result.

https://claude.ai/code/session_01VEEEfkaYzNwoeb1iuqm9A1

* Chain metadata persistence via throttled idle-flush on the background cycle (#1276)

* Make in-memory store persistence a standalone background cycle

Decouple the database write from batch processing. Processing now only
updates the in-memory store and continues; a persistence cycle owned by
the in-memory store drains changes to Postgres on its own.

- Split the checkpoint pointer into committedCheckpointId (last persisted
  to db) and processedCheckpointId (in-memory frontier). createBatch keys
  off processedCheckpointId; history retention still keys off committed.
- commitBatch accumulates batch metadata and triggers a single-writer
  background loop (strictly one write in flight, overlapping processing).
- Snapshot rawEvents/effects/entity changes synchronously at write start so
  the in-memory store is never reset before its changes are committed;
  effect outputs being written stay readable via a pending dict.
- Capacity gate (50k changes) before each batch: drop committed changes,
  else await a commit.
- Drain the cycle before a rollback and flush it before a successful exit.
- Serialize chain-metadata writes with batch writes to avoid concurrent
  updates to the chains table.
- MockIndexer awaits the full write (and settles) before returning.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Address PR review on EventProcessing

- Await in-memory store capacity before starting the batch timer.
- Drop the redundant comment over commitBatch.
- Remove db-write duration from processing metrics; the write now happens
  off the processing path in the in-memory store cycle.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Fold chain-metadata write into the in-memory store cycle

Persist chain metadata from the persistence cycle instead of a separate
throttled write. Because the cycle is the single db writer, the metadata
write no longer races batch writes on the chains table, so the throttler
and the serializeDbWrite mutex are both removed.

Also make the effect table's pendingDict always present instead of
optional, for simplicity.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Write chain metadata via a separate throttler again

Revert the in-cycle chain-metadata write back to a throttled, separate
setChainMeta, serialized through the store's serializeDbWrite so it never
overlaps a background batch write on the chains table.

Also replace drainForRollback with flush - awaiting the write cycle
already drains all pending batches, so the explicit resets were redundant.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Fold chain metadata into the batch write; tidy store fields

- Persist chain metadata as part of the batch write transaction instead of
  a separate throttled write. The store keeps current vs committed metadata
  and only the stale per-chain diff is folded into writeBatch, so metadata
  never races the batch write and the throttler/serializeDbWrite are gone.
- Make persistence and config immutable creation params of InMemoryStore
  instead of mutable fields set per batch.
- Stop the ProcessEventBatch loop once an exit is decided, so the async exit
  flush doesn't let further batches process (fixes the auto-exit smoke test
  processing past the first event block).

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Carry isInReorgThreshold on the batch and split writes on its boundary

Move isInReorgThreshold onto Batch.t (set at creation from the chain
manager) instead of passing it separately into commitBatch.

The persistence cycle no longer merges all queued batches blindly. It
drains the leading run of processed batches that share isInReorgThreshold
and writes only those, leaving the rest for the next write. Entity changes
are snapshotted up to the run boundary so a single write never mixes
history-saving modes (avoids over-saving history across the threshold
transition).
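
A minimal sketch of the "leading run" drain, assuming a queue of batches that each carry the flag. The `Batch` struct and the function signature are hypothetical; the name `drain_batch_run` only mirrors the `drainBatchRun` mentioned in a later commit.

```rust
/// Hypothetical batch record; only the fields needed for the split are shown.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub id: u32,
    pub is_in_reorg_threshold: bool,
}

/// Removes and returns the leading run of batches that share the first
/// batch's `is_in_reorg_threshold`, leaving the rest for the next write.
pub fn drain_batch_run(queue: &mut Vec<Batch>) -> Vec<Batch> {
    let flag = match queue.first() {
        Some(first) => first.is_in_reorg_threshold,
        None => return Vec::new(),
    };
    let run_len = queue
        .iter()
        .take_while(|batch| batch.is_in_reorg_threshold == flag)
        .count();
    queue.drain(..run_len).collect()
}
```

Because each call returns batches with a single flag value, one write never mixes the two history-saving modes.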

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Make persistence/config required; single-pass batch-run drain

- persistence and config are now non-optional fields of InMemoryStore.make.
  The in-memory-only test helper supplies a shared default persistence.
- Drive the write cycle off processedBatches being non-empty, so
  drainBatchRun is never called with an empty array.
- drainBatchRun now splits the run and accumulates checkpoints/progress in a
  single pass instead of one forEach plus five map+concatMany.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Review fixes: avoid capacity deadlock and remove dead code

- awaitCapacity only waits for a commit when there is a queued batch to
  free capacity. A large rollback diff is staged without a batch, so
  waiting on it would deadlock; let processing proceed instead.
- Remove resetButKeepLatestChanges/resetButKeepLoadedFromDbChanges, dead
  since the cycle uses snapshotChanges/dropCommittedChanges. Replace the
  obsolete unit test with one for dropCommittedChanges.
- Remove the now-unused chain-metadata throttle env var.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Persist chain metadata on a throttled idle path with delta tracking

Stage chain metadata as a per-chain dirty delta computed at stage time via
structural comparison instead of a JSON-stringify diff on every write. A batch
write folds the delta into its transaction for free; when no batch is flowing,
a throttled standalone upsert flushes it, restoring idle freshness while keeping
all writes serialized through the single write loop.

* Track chain-meta dirtiness with a bool instead of a delta dict

setChainMeta writes a single unnest upsert regardless of chain count, so a
per-chain delta bought nothing at the db level. Replace dirtyChainMeta with a
flag and write a shallow-copied snapshot of the latest per-chain metadata.

* Defer Throttler execution to setImmediate

Run scheduled functions on the next setImmediate instead of synchronously
inside schedule, so work queued before them (e.g. a batch task) runs first.
This makes chain-metadata fold into the imminent batch write by default and
replaces the startThrottled priming.

* Tighten comments; share setImmediate binding via NodeJs

Condense the persistence-cycle and chain-metadata comments to one line where
they earn it, and move the duplicated setImmediate external into NodeJs so
Throttler and GlobalStateManager share a single binding.

* Reuse NodeJs.setImmediate in Throttler test; retry timing tests

Drop the duplicate setImmediate external from the test and reuse the shared
NodeJs binding. Add retry to the two interval-timing tests, matching the
others, since deferred execution adds macrotask jitter.

---------

Co-authored-by: Claude <noreply@anthropic.com>

* Persist the in-memory store on a standalone background cycle (#1275)

* Make in-memory store persistence a standalone background cycle

Decouple the database write from batch processing. Processing now only
updates the in-memory store and continues; a persistence cycle owned by
the in-memory store drains changes to Postgres on its own.

- Split the checkpoint pointer into committedCheckpointId (last persisted
  to db) and processedCheckpointId (in-memory frontier). createBatch keys
  off processedCheckpointId; history retention still keys off committed.
- commitBatch accumulates batch metadata and triggers a single-writer
  background loop (strictly one write in flight, overlapping processing).
- Snapshot rawEvents/effects/entity changes synchronously at write start so
  the in-memory store is never reset before its changes are committed;
  effect outputs being written stay readable via a pending dict.
- Capacity gate (50k changes) before each batch: drop committed changes,
  else await a commit.
- Drain the cycle before a rollback and flush it before a successful exit.
- Serialize chain-metadata writes with batch writes to avoid concurrent
  updates to the chains table.
- MockIndexer awaits the full write (and settles) before returning.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Address PR review on EventProcessing

- Await in-memory store capacity before starting the batch timer.
- Drop the redundant comment over commitBatch.
- Remove db-write duration from processing metrics; the write now happens
  off the processing path in the in-memory store cycle.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Fold chain-metadata write into the in-memory store cycle

Persist chain metadata from the persistence cycle instead of a separate
throttled write. Because the cycle is the single db writer, the metadata
write no longer races batch writes on the chains table, so the throttler
and the serializeDbWrite mutex are both removed.

Also make the effect table's pendingDict always present instead of
optional, for simplicity.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Write chain metadata via a separate throttler again

Revert the in-cycle chain-metadata write back to a throttled, separate
setChainMeta, serialized through the store's serializeDbWrite so it never
overlaps a background batch write on the chains table.

Also replace drainForRollback with flush - awaiting the write cycle
already drains all pending batches, so the explicit resets were redundant.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Fold chain metadata into the batch write; tidy store fields

- Persist chain metadata as part of the batch write transaction instead of
  a separate throttled write. The store keeps current vs committed metadata
  and only the stale per-chain diff is folded into writeBatch, so metadata
  never races the batch write and the throttler/serializeDbWrite are gone.
- Make persistence and config immutable creation params of InMemoryStore
  instead of mutable fields set per batch.
- Stop the ProcessEventBatch loop once an exit is decided, so the async exit
  flush doesn't let further batches process (fixes the auto-exit smoke test
  processing past the first event block).

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Carry isInReorgThreshold on the batch and split writes on its boundary

Move isInReorgThreshold onto Batch.t (set at creation from the chain
manager) instead of passing it separately into commitBatch.

The persistence cycle no longer merges all queued batches blindly. It
drains the leading run of processed batches that share isInReorgThreshold
and writes only those, leaving the rest for the next write. Entity changes
are snapshotted up to the run boundary so a single write never mixes
history-saving modes (avoids over-saving history across the threshold
transition).

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Make persistence/config required; single-pass batch-run drain

- persistence and config are now non-optional fields of InMemoryStore.make.
  The in-memory-only test helper supplies a shared default persistence.
- Drive the write cycle off processedBatches being non-empty, so
  drainBatchRun is never called with an empty array.
- drainBatchRun now splits the run and accumulates checkpoints/progress in a
  single pass instead of one forEach plus five map+concatMany.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Review fixes: avoid capacity deadlock and remove dead code

- awaitCapacity only waits for a commit when there is a queued batch to
  free capacity. A large rollback diff is staged without a batch, so
  waiting on it would deadlock; let processing proceed instead.
- Remove resetButKeepLatestChanges/resetButKeepLoadedFromDbChanges, dead
  since the cycle uses snapshotChanges/dropCommittedChanges. Replace the
  obsolete unit test with one for dropCommittedChanges.
- Remove the now-unused chain-metadata throttle env var.

https://claude.ai/code/session_01TuuFyaX6X8RzzDK2v6gfAt

* Raise capacity limit to 100k incl. batch items; surface write errors via onError

- keepLatestChangesLimit 50k -> 100k, now counts queued batch items alongside
  entity changes so a low-entity/high-item workload can't outrun persistence.
- InMemoryStore.make takes a required onError callback; a failed background
  write reports through it immediately instead of being thrown at the next
  batch's awaitCapacity. Main wires it to dispatch ErrorExit.
- awaitCapacity/flush no longer rethrow persistenceError; they stop draining
  since onError owns surfacing the failure.

* Replace persistenceError option<exn> with a hasFailedWrite bool

The stored exn was never read back - it's handed straight to onError at the
failure site. The field only gates the write loop, so a plain bool says what
it is.

* Surface unexpected writes from in-memory-only test store instead of ignoring

These stores never run the persistence cycle, so onError firing means a test
is wired wrong - log and raise rather than swallow it.

* Route fatal errors through a single onError handler

Hold one onError callback (log + exit) on GlobalState and share it with the
in-memory store. The store calls it directly on a background write failure
instead of dispatching ErrorExit, and the ErrorExit action delegates to the
same callback rather than inlining its own log + exit.

* Tighten comments

* Pass required onError to InMemoryStore in ChainMeta_test

The merged store signature makes onError required; the in-memory test
store raises on any unexpected persistence write.

https://claude.ai/code/session_01Taw9xnp2tLPUvHiW1BSumS

---------

Co-authored-by: Claude <noreply@anthropic.com>

* Defer raw event creation and deep NUL stripping to batch write (#1278)

* Build raw_events in PgStorage from batch items

Move raw event row construction out of the per-event processing path and
into PgStorage.writeBatch, which now derives the rows by iterating the
batch items being written. Carry batch items through drainBatchRun so they
reach the write, and drop the rawEvents accumulator and the ~rawEvents
parameter from the storage interface.

* Cover raw_events in the e2e indexer test

Enable `raw_events: true` in the e2e_test config and assert the indexer
writes one raw_events row per processed event, with the decoded params,
src address and transaction fields matching the known first event.

* Sanitize NUL bytes in raw_events writes (#1195)

A NUL byte in event params made the raw_events jsonb INSERT fail with
22P05, poisoning the batch transaction and aborting unrelated entity
writes. Route the raw_events write through the same escape-and-retry path
used for entities: on a Postgres encoding error, escape the offending
table and retry. The stripper now recurses into nested objects/arrays so
a NUL buried inside an event param object (or a json entity field) is
removed, and the classifier also recognizes the jsonb-specific error
message in addition to the text-column one.
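
The recursive part of the stripper can be sketched as below. This is an illustration in Rust, not the runtime's ReScript code: the `Json` enum and `strip_nul` are hypothetical, and the sketch leaves object keys untouched because the commit message only mentions values.

```rust
/// Hypothetical minimal JSON value, defined here to keep the sketch self-contained.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Removes NUL characters from every string value, recursing into nested
/// arrays and objects so a NUL buried deep inside a param is also removed.
pub fn strip_nul(value: Json) -> Json {
    match value {
        Json::String(text) => Json::String(text.replace('\0', "")),
        Json::Array(items) => Json::Array(items.into_iter().map(strip_nul).collect()),
        Json::Object(fields) => Json::Object(
            fields
                .into_iter()
                .map(|(key, inner)| (key, strip_nul(inner)))
                .collect(),
        ),
        other => other,
    }
}
```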

---------

Co-authored-by: Claude <noreply@anthropic.com>

* Track effect cache entries by checkpoint for commit-gated eviction (#1279)

* Track effect cache entries by checkpoint for commit-gated eviction

Store each effect cache entry as a Change stamped with the per-item
checkpointId (mirroring entity changes) instead of a raw output in a dict
that was wiped after every write. Committed entries are now reclaimed by
dropCommittedEffects in awaitCapacity, and effect entries count toward the
in-memory changes limit. cache:false outputs are stored in memory but not
persisted, and are evictable (re-run on a later miss). Removes pendingDict
and the per-write dict swap.
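
As a rough Rust analogy of commit-gated eviction: each cache entry is stamped with the checkpoint it was produced at, and only entries at or below the committed checkpoint may be reclaimed. `EffectCache`, `Entry`, and `drop_committed` are hypothetical names; the real `dropCommittedEffects` lives in the ReScript runtime and may differ in detail.

```rust
use std::collections::HashMap;

/// Hypothetical cache entry stamped with the checkpoint that produced it.
pub struct Entry {
    pub checkpoint_id: u64,
    pub output: String,
}

#[derive(Default)]
pub struct EffectCache {
    entries: HashMap<String, Entry>,
}

impl EffectCache {
    pub fn insert(&mut self, key: &str, checkpoint_id: u64, output: &str) {
        self.entries.insert(
            key.to_string(),
            Entry { checkpoint_id, output: output.to_string() },
        );
    }

    /// Evicts entries whose checkpoint is already persisted and returns how
    /// many were dropped. Entries above the committed checkpoint must stay,
    /// because they have not been written to the database yet.
    pub fn drop_committed(&mut self, committed_checkpoint_id: u64) -> usize {
        let before = self.entries.len();
        self.entries
            .retain(|_, entry| entry.checkpoint_id > committed_checkpoint_id);
        before - self.entries.len()
    }

    pub fn contains(&self, key: &str) -> bool {
        self.entries.contains_key(key)
    }
}
```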

Make the changes limit configurable via ENVIO_MAX_IN_MEMORY_CHANGES.

Known open item: E2E "Track effects in prom metrics" fails. It swaps an
effect's output schema mid-run (no restart) and expects the warm in-memory
entry to be re-validated/invalidated. Under the new model committed entries
stay warm and are only re-validated on a db reload (i.e. across restarts,
the real-world schema-change path). Pending a decision on adapting the test.

* Test effect-cache schema invalidation via the single restart

Under commit-gated eviction a committed effect entry stays warm in memory,
so a mid-run output-schema change isn't re-validated. Schema changes are
code changes that take effect on restart, where the db cache is reloaded and
re-validated. Restructure the test so both cache entries are written before
the existing restart, then exercise the new schema in the post-restart batch
(avoids a second restart, which collides on the checkpoint pkey).

* Rename env to ENVIO_IN_MEMORY_OBJECTS_TARGET; inline mapChangeToEffectOutput

* Evict committed changes before db-loaded ones in awaitCapacity

Tiered backpressure: drop our committed writes first (cheap to re-derive),
then db-loaded entries, and only then wait for a commit. Applies to both
entity and effect tables via keepLoadedFromDb.

* Fix doc comment placement for dropCommitted/awaitCapacity

---------

Co-authored-by: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Add --runtime flag to envio metrics CLI command

Lets `envio metrics --runtime` fetch the indexer's /metrics/runtime
endpoint instead of the default /metrics.

https://claude.ai/code/session_01FruGiMDrPPWC2wL3KjxFb4

* Make envio metrics runtime a subcommand

`envio metrics runtime` fetches the indexer's /metrics/runtime endpoint;
`envio metrics` still fetches /metrics.

https://claude.ai/code/session_01FruGiMDrPPWC2wL3KjxFb4

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
* Add internal onRollbackCommit indexer callback

Add `indexer.~internalAndWillBeRemovedSoon_onRollbackCommit`, an unstable
internal hook fired once per chain affected by a reorg rollback, after the
rollback diff is durably written. A throwing callback bubbles to the write
loop's onError, crashing the indexer like a failed write.

The whole feature is concentrated in RollbackCommit.res plus three call
sites (registration in Main, per-chain block snapshot in GlobalState, fire
on a successful rollback write in InMemoryStore) so it can be removed in one
piece.

https://claude.ai/code/session_01E1JiTmZ9APfXhzDvjEVgX7

* Source rollback commit info from the in-memory store rollback object

Carry per-chain last-valid blocks on `Persistence.rollback` instead of a
global pending ref, and guard the fire on a registered callback so an empty
registry doesn't schedule an extra microtask.

https://claude.ai/code/session_01E1JiTmZ9APfXhzDvjEVgX7

---------

Co-authored-by: Claude <noreply@anthropic.com>
* test: reproduce effect cache nested-option sentinel leak

A cached effect with an optional output that resolves to None returns the
ReScript nested-option sentinel { BS_PRIVATE_NESTED_SOME_NONE: 0 } instead
of undefined on a cache hit, because the in-memory effect cache stores
option<output> and the hit path returns it without unwrapping the outer
option, leaking Some(None).

* fix: don't leak nested-option sentinel from effect cache hits

The in-memory effect cache stored option<output>; getEffectOutput wrapped
the raw output in Some(), encoding Some(None) as the nested-option sentinel
{ BS_PRIVATE_NESTED_SOME_NONE: 0 } for effects with an optional output that
resolved to None. getUnsafeInMemory unwrapped with Option.getUnsafe (an
identity), so the sentinel leaked to the handler instead of undefined.

Split getEffectOutput into hasEffectOutput (presence check) and
getEffectOutputUnsafe (raw output), mirroring InMemoryTable.Entity.getUnsafe,
so the output is never wrapped in an extra option.
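
The distinction the fix relies on (a cache hit whose output is `None` versus a cache miss) can be shown with Rust's nested options. Rust represents `Some(None)` safely, so this sketch illustrates the API split and does not reproduce the ReScript sentinel leak itself. The method names mirror `hasEffectOutput` and `getEffectOutputUnsafe`; the struct is hypothetical.

```rust
use std::collections::HashMap;

/// An effect output that may legitimately be absent.
pub type Output = Option<String>;

#[derive(Default)]
pub struct EffectCache {
    entries: HashMap<String, Output>,
}

impl EffectCache {
    pub fn insert(&mut self, key: &str, output: Output) {
        self.entries.insert(key.to_string(), output);
    }

    /// Presence check: true on a cache hit, even if the cached output is None.
    pub fn has_effect_output(&self, key: &str) -> bool {
        self.entries.contains_key(key)
    }

    /// Raw cached output without an extra option layer.
    /// Panics on a miss, so call `has_effect_output` first.
    pub fn get_effect_output_unsafe(&self, key: &str) -> &Output {
        &self.entries[key]
    }

    /// The old combined shape, kept for contrast: Some(None) means "hit with
    /// an empty output" and None means "miss".
    pub fn get_effect_output(&self, key: &str) -> Option<&Output> {
        self.entries.get(key)
    }
}
```

With the split API the handler receives the plain `Output`, so an empty output reads as absent instead of as a wrapped value.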

---------

Co-authored-by: Claude <noreply@anthropic.com>
…am names (#1286)

* test: reproduce same-signature event param name dedup bug

Two contracts emitting the same-signature event with differently named
params share one native decoder entry (collectEventParams dedupes by
sighash+topicCount, first-contract-wins). The second contract's events
then decode under the first contract's param names, so its handler reads
undefined.

https://claude.ai/code/session_01MyjtCSDfE2XybnkeEa9q9y

* fix: decode same-signature events under each contract's own param names

Two contracts emitting the same-signature event with differently named
params shared one native decoder entry: collectEventParams deduped by
(sighash, topicCount) first-contract-wins, so the second contract's events
decoded under the first contract's names and its handler read undefined.

The native decoder now returns params keyed by contract name. The inner
ABI decode stays single (keyed by signature); each contract sharing the
signature re-applies its own names over the shared positional values. After
the router resolves a log to its contract by address, the source picks
params[contractName].
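
A simplified sketch of this shape: decode once into positional values, then layer each contract's own names over them. The function `decode_for_contracts` and the string-typed values are hypothetical; only the `apply_names` name is taken from the later commit messages, and the real decoder works with ABI values, not strings.

```rust
use std::collections::HashMap;

/// Decoded params for one contract: param name -> value.
pub type NamedParams = HashMap<String, String>;

/// Pairs the shared positional values with one contract's param names.
pub fn apply_names(values: &[String], names: &[&str]) -> NamedParams {
    names
        .iter()
        .zip(values)
        .map(|(name, value)| ((*name).to_string(), value.clone()))
        .collect()
}

/// Returns params keyed by contract name. The positional decode happens once;
/// every contract sharing the signature re-applies its own names.
pub fn decode_for_contracts(
    values: &[String],
    contracts: &[(&str, Vec<&str>)],
) -> HashMap<String, NamedParams> {
    contracts
        .iter()
        .map(|(contract, names)| ((*contract).to_string(), apply_names(values, names)))
        .collect()
}
```

After the router resolves a log to its contract, a lookup such as `decoded["TokenB"]` would then yield that contract's names and never the first contract's.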

https://claude.ai/code/session_01MyjtCSDfE2XybnkeEa9q9y

* Guard missing contract key in decoded params lookup

Replace Dict.getUnsafe with a safe lookup at both source pick sites. A
missing contract key now folds into each source's existing decode-miss
path (HyperSyncSource raises via handleDecodeFailure for non-wildcard
events; RpcSource skips) instead of silently returning undefined params.

Document why decoder signatures are keyed by MetaKey alone: the upstream
decoder collapses by (topic0, topic count) too, so an indexed-layout
fingerprint can't be distinguished at this layer regardless.

https://claude.ai/code/session_01MyjtCSDfE2XybnkeEa9q9y

---------

Co-authored-by: Claude <noreply@anthropic.com>
* test: reproduce renamed-event decode failure (issue #1285)

An event given a `name:` that differs from its on-chain name fails to
decode over HyperSync (and RPC, which shares the native decoder). The
native decoder rebuilds each event's inner signature from the display
`name:` (`reconstruct_signature` in hypersync_source/decode.rs), so its
topic0 becomes keccak256("ApprovalRenamed(...)") and never matches the
real log's keccak256("Approval(...)"). The inner decoder map misses, the
log decodes as null, and every such log is dropped with
"Event ... was unexpectedly parsed as undefined".

Not fixed by 3.1.1 (#1286), which only addressed param-name sharing
across contracts and left the display-name signature reconstruction
untouched.

This test feeds a renamed event ("ApprovalRenamed" over the real
"Approval" sighash) through the production decoder and asserts it decodes
to named params; it currently returns undefined.

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

* fix: decode renamed events under their on-chain sighash (issue #1285)

The native decoder built its inner event decoder via
`hypersync_client::Decoder::from_signatures`, which keys each decode by
the keccak selector of the signature string. That string was rebuilt from
the event's display `name:`, so a renamed event (display name != on-chain
name) keyed on keccak256("ApprovalRenamed(...)") and never matched the
real log's keccak256("Approval(...)"). The log decoded as null and every
such event was dropped with "Event ... was unexpectedly parsed as
undefined" — over both HyperSync and RPC, which share this decoder.

Build the positional `DynSolEvent` ourselves and pin its topic0 to the
on-chain sighash the MetaKey already carries. The display name still
recovers the ABI types (names are layered on per-contract by
`apply_names`), but topic0 keying now matches the real log. This also
drops the dependency on the inner hypersync decoder's selector keying.

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

* refactor: build event decoder from param types directly

`build_event_decoder` rebuilt a signature string from the event name,
parsed it, and resolved it to a DynSolEvent just to recover the param
types — then discarded the name-derived selector in favour of the
MetaKey sighash. Skip the round-trip: parse each param's abi_type into a
DynSolType directly and split on indexed. This drops `reconstruct_signature`
and makes it explicit that the event name has no role in decoding.

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

* perf: single map lookup and clone-free decode on the hot path

Per-log decode did two HashMap lookups of the same MetaKey (one for the
decoder, one for variants) where the second could never miss, and
apply_names deep-cloned every decoded DynSolValue even when a single
contract owned the decode.

Merge the two maps into HashMap<MetaKey, RegisteredEvent> so each log
does one lookup, and have apply_names consume the DecodedEvent by value
(iterators over indexed/body instead of index-and-clone). The common
single-variant case moves the decode in without cloning; only the rare
same-signature-across-contracts case clones, once per extra contract.

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

* fix: satisfy clippy map_entry and annotate Utils.magic in test

CI ran `cargo clippy -- -D warnings` and rejected the contains_key +
insert + get_mut pattern in from_params (clippy::map_entry). Use the
HashMap entry API, which also makes registration a single lookup.

Annotate the Utils.magic cast in RenamedEventDecode_test with explicit
input/output types per the repo guideline.

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

* Reject incompatible ABI layouts colliding on the same MetaKey

When two events share a (topic0, topic_count) MetaKey but split their
params into indexed/body differently, the shared positional decoder
(built from the first variant) would silently mis-type the later one.
Config parsing should prevent this, but add a decoder-side backstop:
from_params now errors when a later variant's layout (ABI types +
indexed flags + nested components, names ignored) differs from the
registered one. Same-signature variants that differ only in param names
still register, since apply_names applies each variant's own names.
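
The backstop can be sketched with the `HashMap` entry API that the earlier clippy fix mentions. This is a hedged illustration: `Registry`, `Param`, and the string-typed `topic0` are hypothetical, `register` stands in for `from_params`, and nested components are left out of the layout comparison for brevity.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct MetaKey {
    pub topic0: String,
    pub topic_count: u8,
}

#[derive(Debug, Clone)]
pub struct Param {
    pub name: String,
    pub abi_type: String,
    pub indexed: bool,
}

/// ABI type and indexed flag per position; names are deliberately ignored.
type Layout = Vec<(String, bool)>;

struct RegisteredEvent {
    layout: Layout,
    /// (contract name, that contract's param names)
    variants: Vec<(String, Vec<String>)>,
}

#[derive(Default)]
pub struct Registry {
    events: HashMap<MetaKey, RegisteredEvent>,
}

impl Registry {
    /// Registers a variant; errors if its layout differs from the one
    /// already registered under the same MetaKey.
    pub fn register(&mut self, key: MetaKey, contract: &str, params: &[Param]) -> Result<(), String> {
        let layout: Layout = params.iter().map(|p| (p.abi_type.clone(), p.indexed)).collect();
        let names: Vec<String> = params.iter().map(|p| p.name.clone()).collect();
        match self.events.entry(key) {
            Entry::Occupied(mut slot) => {
                let event = slot.get_mut();
                if event.layout != layout {
                    return Err(format!("incompatible ABI layout for contract {}", contract));
                }
                event.variants.push((contract.to_string(), names));
            }
            Entry::Vacant(slot) => {
                slot.insert(RegisteredEvent { layout, variants: vec![(contract.to_string(), names)] });
            }
        }
        Ok(())
    }

    pub fn variant_count(&self, key: &MetaKey) -> usize {
        self.events.get(key).map_or(0, |event| event.variants.len())
    }
}
```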

https://claude.ai/code/session_01ReZeFShes7qPxp8zCkAgWo

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Fix SVM onSlot indexer getting stuck after restart

On resume, FetchState.make initializes knownHeight from the DB but leaves
the buffer empty. For onBlock-only indexers (e.g. SVM onSlot with no event
contracts), getNextQuery returns NothingToQuery because there are no
partitions to generate fetch queries, and shouldWaitForNewBlock is false
(since onBlock is behind headBlockNumber). This creates a deadlock where
no code path calls updateInternal to populate the buffer.

Fix: call updateInternal at the end of FetchState.make when knownHeight > 0
and onBlockConfigs are present. This populates the buffer with pending
onBlock items so ProcessEventBatch can consume them and keep the cycle going.

https://claude.ai/code/session_017fkJnxkYa4WxAsYFEN4axX

* Refactor onBlock buffer population into shared appendOnBlockItems

* Replace onBlock resume unit tests with svm_test MockIndexer regression test

* Make mutItems a labeled argument in appendOnBlockItems

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Remove Belt usage in favor of ReScript Stdlib

Replace all Belt.* calls across the runtime library and test scenarios
with their Stdlib equivalents (Array/Int/Float/Option). Reimplement
ChainMap on top of dict instead of Belt.Map/Belt.Id, preserving the
immutable API and ascending chain-id iteration order.

* Keep Belt-based ChainMap implementation

* Remove Belt from SlotResume_test merged from main

* Use Float.toInt for backoff interval conversion consistency

---------

Co-authored-by: Claude <noreply@anthropic.com>
…1291)

* Move batch-processing flags from GlobalState into InMemoryStore

Move currentlyProcessingBatch (renamed isProcessing) and the processedBatches
counter (renamed processedBatchesCount) out of the versioned GlobalState record
into the mutable InMemoryStore. Both were already updated identically in the
valid and invalidated reducer paths, so they never relied on snapshot
semantics; mutating them in place is behavior-preserving and drops a record
spread plus a duplicated reducer update.

* Rename InMemoryTable.Entity module to EntitiesState

The module held only an Entity submodule and was always referenced as
InMemoryTable.Entity. Flatten it into a top-level EntitiesState module so the
type reads EntitiesState.t.

* Revert "Rename InMemoryTable.Entity module to EntitiesState"

This reverts commit cc5fffa.

* Drop StartProcessingBatch action, set isProcessing in place

isProcessing now lives in the mutable InMemoryStore, so the action was a
vestigial state transition: it only flipped the store flag and returned the
state unchanged. It is dispatched synchronously before any await in the
ProcessEventBatch task, after the task already passed the stateId validity
check, so it could never be routed to the invalidated reducer. Replace the
dispatch with a direct mutation.

---------

Co-authored-by: Claude <noreply@anthropic.com>

coderabbitai Bot commented Jun 9, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a8a13b96-9b74-476c-a665-dd79e251cd85

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

claude added 2 commits June 9, 2026 09:33
…-C5CyW

# Conflicts:
#	packages/cli/CommandLineHelp.md
- selecting a child for output never drops a childless parent
- a filtered child drops parents left with no surviving row
- block.number range stays scan-window-only; a set marks block filtered
- the next-page hint round-trips cross-section filters

2 participants