Commit abc13df

zwxxb and cursoragent authored
[move-flow] Structured trace capture in move-flow replay (#19979)
* [move-flow] Add structured trace capture to move_replay_transaction

Wraps the MoveDebugger with an opt-in TracingDebugger that records state-view requests and (optionally) per-read storage accesses into the replay response. The recorder is bounded by max_trace_events, redacts StateKey debug output by default, and surfaces the captured trace in the error message when transaction-output materialization fails so the caller does not lose visibility into preceding reads.

Ships a move-replay skill template documenting the new trace, trace_storage_reads, max_trace_events, and redact_storage_keys parameters.

* address comments

- Validate `max_trace_events` when `trace` is enabled: reject 0 (would drop the guaranteed `state_view` event) and values above a 100k server-side cap. Surfaces as a structured `invalid_params` error.
- Add unit tests for `require_user_transaction`: happy path plus rejection branches for `StateCheckpoint` and `Genesis`, asserting the error code, variant name, and version land in the message.
- Add unit tests for the new `validate_capture_opts` helper covering the lower bound, the cap, and the default.
- Document the `[1, 100_000]` range for `max_trace_events` in the move-replay skill.

* [move-flow] Fix rustfmt formatting in replay_transaction test

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: zwxxb <zwxxb@users.noreply.github.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent 6a67410 commit abc13df

7 files changed

Lines changed: 870 additions & 62 deletions

Cargo.lock

Lines changed: 4 additions & 0 deletions

aptos-move/flow/Cargo.toml

Lines changed: 4 additions & 0 deletions
@@ -19,13 +19,16 @@ path = "src/main.rs"
 anyhow = { workspace = true }
 aptos-cli-common = { workspace = true }
 aptos-framework = { workspace = true }
+aptos-gas-profiling = { workspace = true }
 aptos-gas-schedule = { workspace = true }
 aptos-move-cli = { workspace = true }
 aptos-move-debugger = { workspace = true }
 aptos-rest-client = { workspace = true }
 aptos-types = { workspace = true }
 aptos-validator-interface = { workspace = true }
 aptos-vm = { workspace = true, features = ["testing"] }
+aptos-vm-types = { workspace = true }
+async-trait = { workspace = true }
 clap = { workspace = true, features = ["derive"] }
 codespan-reporting = { workspace = true }
 legacy-move-compiler = { workspace = true }
@@ -55,6 +58,7 @@ url = { workspace = true }
 walkdir = { workspace = true }
 
 [dev-dependencies]
+aptos-crypto = { workspace = true }
 aptos-package-builder = { workspace = true }
 move-core-types = { workspace = true }
 move-prover-test-utils = { workspace = true }
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
{{ frontmatter(name="move-replay", description="Replay a committed on-chain Aptos transaction locally to debug its outcome. Use when investigating a failed or unexpected transaction, reproducing an abort, or testing a local Move patch against a historical transaction.") }}

## When to Use This Skill

Use this skill whenever the user wants to:

- Understand why an on-chain transaction succeeded or failed (Move abort, execution failure, out-of-gas).
- Reproduce a transaction's behavior locally without re-submitting it.
- Test whether a *local* Move package fix would change the outcome of a committed transaction (regression check for a proposed patch).
- Inspect the storage reads a single transaction issued through the debugger.

The underlying tool is read-only: it fetches the committed transaction and aux info from the network, then executes it against the historical state. It does **not** mutate any on-chain state.

## Tool

Use the `{{ tool(name="move_replay_transaction") }}` MCP tool. Do not invoke the Aptos CLI's `aptos move replay` directly — this tool wraps it and returns structured JSON.

### Required Parameters

- **`txn_id`** (`u64`) — Committed ledger version of the transaction to replay.
- **`network`** (`string`) — One of `"mainnet"`, `"testnet"`, `"devnet"`, or a full REST endpoint URL (e.g. `"https://my-node.example.com/v1"`).

### Optional Parameters

- **`local_package_paths`** (`string[]`, default `[]`) — Paths to local Move packages whose modules override the on-chain versions during replay. Each path must point to a directory containing `Move.toml`. Use this to simulate a fix.
- **`named_addresses`** (`object`, default `{}`) — Named-address bindings (`{"name": "0xADDR"}`) used when compiling the local packages. Only consulted when `local_package_paths` is non-empty.
- **`node_api_key`** (`string`) — Bearer token sent as `Authorization: Bearer <key>` to the node. Use this when the public endpoint is rate-limited.
- **`trace`** (`bool`, default `false`) — When `true`, record a structured trace of debugger state-view requests (one `state_view { version, with_overrides }` entry per call) into the response. Off by default; tracing adds overhead. Only state-view requests are intercepted — the wrapper does not introspect Move bytecode execution itself.
- **`trace_storage_reads`** (`bool`, default `false`) — When `true`, additionally record one `storage_read` entry per state-view read. Off by default because a single replay typically issues hundreds of reads, which crowd out the higher-signal events. Only consulted when `trace` is `true`.
- **`max_trace_events`** (`usize`, default `500`) — Trace truncation limit. Only consulted when `trace` is `true`. Must be between `1` and the server-side cap of `100_000` (inclusive); requests outside that range fail fast with an `invalid_params` error. Raise it only if `truncated > 0` in the response.
- **`redact_storage_keys`** (`bool`, default `true`) — When `true`, storage-read trace entries omit the `Debug`-formatted `StateKey`. Only consulted when both `trace` and `trace_storage_reads` are `true`. Disable only when the key contents themselves are needed for debugging.
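
The range check described for `max_trace_events` can be sketched as follows. This is a minimal illustration, not the commit's `validate_capture_opts` helper, whose signature is not shown on this page. The function name, the `Option<usize>` input, and the error strings are assumptions; only the default of `500`, the lower bound of `1`, and the cap of `100_000` come from the parameter list above.

```rust
/// Server-side cap quoted in the parameter list.
const MAX_TRACE_EVENTS_CAP: usize = 100_000;
/// Default quoted in the parameter list.
const DEFAULT_MAX_TRACE_EVENTS: usize = 500;

/// Hypothetical helper: resolves the effective trace limit, or returns a
/// message that a caller could surface as an `invalid_params` error.
fn resolve_max_trace_events(trace: bool, requested: Option<usize>) -> Result<usize, String> {
    let limit = requested.unwrap_or(DEFAULT_MAX_TRACE_EVENTS);
    // The limit is only consulted when tracing is enabled, so skip the
    // range check otherwise.
    if !trace {
        return Ok(limit);
    }
    if limit == 0 {
        // A limit of zero would drop the guaranteed `state_view` entry.
        return Err("max_trace_events must be at least 1 when trace is enabled".to_string());
    }
    if limit > MAX_TRACE_EVENTS_CAP {
        return Err(format!(
            "max_trace_events {} exceeds the cap of {}",
            limit, MAX_TRACE_EVENTS_CAP
        ));
    }
    Ok(limit)
}
```

Under these assumptions, `resolve_max_trace_events(true, None)` yields the default, while `Some(0)` and `Some(100_001)` are rejected before any replay work starts.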

### Constraints

- Only **user** transactions are supported. Genesis, BlockMetadata, BlockEpilogue, StateCheckpoint, and ValidatorTransaction variants are rejected with a structured `invalid_params` error.
- The tool enforces a server-side timeout. If replay times out, suggest turning `trace_storage_reads` back off, dropping local overrides, or raising the server's `--tool-timeout`.
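
The user-transaction constraint amounts to a match on the transaction variant. The sketch below uses a hypothetical `TxnKind` enum as a stand-in for the real transaction type, which carries payloads and is not reproduced here. The function name and message format are also assumptions; the commit message only states that the error code, variant name, and version appear in the message.

```rust
/// Hypothetical stand-in for the transaction variants named above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TxnKind {
    User,
    Genesis,
    BlockMetadata,
    BlockEpilogue,
    StateCheckpoint,
    ValidatorTransaction,
}

/// Accepts user transactions and rejects every other variant with a message
/// that names the error code, the variant, and the requested version.
fn require_user_kind(kind: TxnKind, version: u64) -> Result<(), String> {
    match kind {
        TxnKind::User => Ok(()),
        other => Err(format!(
            "invalid_params: version {} is a {:?} transaction; only user transactions can be replayed",
            version, other
        )),
    }
}
```

Rejecting before the replay starts keeps the failure cheap and lets the caller tell a bad `txn_id` apart from a genuine VM error.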

## Interpreting the Response

The tool returns a JSON object with these fields:

| Field | Meaning |
|---|---|
| `success` | `true` = `Keep(Success)`. `false` = `Keep(<any failure>)`. `null` = `Discard` or `Retry` (transaction was not committed in the normal sense). |
| `vm_status` | Human-readable VM status, same formatting as the Aptos CLI's `replay` command. |
| `abort` | Present only when the status is `MoveAbort`. Includes `location` (`"0xADDR::module_name"` or `"script"`), `code`, and optional `reason` / `description` if the module shipped abort metadata. |
| `execution_failure` | Present only when the status is `ExecutionFailure`. Includes `location`, `function` index, and `code_offset` within that function. |
| `transaction_hash` | Hex hash of the signed transaction. |
| `version` | Echo of the input `txn_id`. |
| `sender` | Sender address as a `0x…` hex literal. |
| `sequence_number` | Present when the transaction uses sequence-number replay protection; absent for orderless (nonce-based) transactions. |
| `gas_used`, `gas_unit_price` | Same as on-chain. |
| `local_override_in_use` | `true` iff `local_package_paths` was non-empty — i.e. the replay diverged from on-chain bytecode. |
| `trace` | Captured trace entries, only when `trace: true` was set on the request. Each entry is one of `state_view` (always exactly one per replay) or `storage_read` (zero by default; many when `trace_storage_reads: true`). |

### Reading the Status

1. **`success == true`** → transaction would commit normally. If the user expected a failure, double-check the inputs.
2. **`success == false` with `abort` populated** → a Move `abort` was hit. Report:
   - `abort.location` (which module),
   - `abort.code` (raw code),
   - `abort.reason` / `abort.description` if available (these come from `#[error]` / abort-info metadata in the module),
   - The matching constant in the source if the reason name is symbolic (e.g. `EINSUFFICIENT_BALANCE`).
3. **`success == false` with `execution_failure` populated** → a non-abort runtime failure (arithmetic overflow, type error, vector bounds, etc.). Report `location`, `function`, and `code_offset`; offer to disassemble the module if the user wants the exact bytecode site.
4. **`success == false` with neither populated** → likely `OutOfGas` or `MiscellaneousError`. The `vm_status` string carries the detail.
5. **`success == null`** → transaction was `Discard`ed or marked `Retry`. The `vm_status` string explains why; common causes are signature/validation issues that prevent execution.
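
The five cases above form a small decision table over `success` and the presence of `abort` / `execution_failure`. A minimal sketch follows, assuming the response has already been parsed into an `Option<bool>` and two presence flags. The `Outcome` enum and `classify` function are illustrative names, not part of the tool.

```rust
/// Illustrative labels for the five cases in the list above.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Case 1: `success == true`.
    Committed,
    /// Case 2: `success == false` with `abort` populated.
    MoveAbort,
    /// Case 3: `success == false` with `execution_failure` populated.
    ExecutionFailure,
    /// Case 4: `success == false` with neither populated.
    OtherFailure,
    /// Case 5: `success == null` (discarded or marked for retry).
    NotKept,
}

/// Maps the parsed response fields onto one of the five cases.
fn classify(success: Option<bool>, has_abort: bool, has_execution_failure: bool) -> Outcome {
    match (success, has_abort, has_execution_failure) {
        (Some(true), _, _) => Outcome::Committed,
        (Some(false), true, _) => Outcome::MoveAbort,
        (Some(false), false, true) => Outcome::ExecutionFailure,
        (Some(false), false, false) => Outcome::OtherFailure,
        (None, _, _) => Outcome::NotKept,
    }
}
```

Checking `success` first mirrors the order recommended in the list; for `OtherFailure` and `NotKept` the `vm_status` string remains the only source of detail.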

## Workflows

### A. Plain Debugging — "Why did this transaction fail?"

1. Confirm with the user which `network` the transaction lives on.
2. Call `{{ tool(name="move_replay_transaction") }}` with just `txn_id` and `network`.
3. Read `success` first; then drill into `abort` or `execution_failure`.
4. If the user wants the source-level reason, query the module with `{{ tool(name="move_package_query") }}` (or read the module source) to find the constant matching `abort.code` / the function at `execution_failure.function`.

### B. Patch Testing — "Would my fix change this transaction's outcome?"

1. Ask for (or locate) the local Move package that re-implements the relevant module(s). It must be a buildable package with a `Move.toml`.
2. Determine the named-address bindings required to compile it. They must resolve every named address used in the package's source.
3. Call the tool with:
   - `local_package_paths` set to the package directory (or list of directories),
   - `named_addresses` mapping each name to its on-chain address,
   - the same `txn_id` and `network` as the failing transaction.
4. The response will have `local_override_in_use: true`. Compare its `success` / `abort` / `execution_failure` against the unmodified replay (workflow A) to see whether the patch changed behavior.
5. **Important**: if the patched module's bytecode is type-incompatible with the on-chain version (different public function signatures, removed structs, etc.), the VM will fail at link time — surface this clearly rather than treating it as a Move bug.

### C. Tracing — "Show me what the VM did step by step"

1. Start with `trace: true` alone. You will get exactly one `state_view { version, with_overrides }` entry — the state view the VM consumed for the run. With `with_overrides: false` you can confirm the on-chain path was taken; with `with_overrides: true` you can confirm the local-override path was taken. For most "why did this fail" questions this entry plus the structured `abort` / `execution_failure` fields are all you need.
2. Only set `trace_storage_reads: true` when you specifically need to see which `StateKey`s were touched during execution. Expect hundreds of entries per replay; raise `max_trace_events` (e.g. to `5000`) when you do this. Leave `redact_storage_keys: true` unless you need the `Debug`-formatted key bytes.
3. When reporting back, **enumerate the actual entries verbatim** — never collapse them to counts. The single `state_view` entry should always appear in the output you show the user; quote the `storage_read` entries one by one when the user asked for them.
4. If the response shows `truncated > 0`, the cap was hit. Either raise `max_trace_events` or turn `trace_storage_reads` back off.
5. The trace is at debugger-wrapper granularity — it does **not** introspect Move bytecode execution itself, so it cannot show the in-Move call frame that hit an abort. Use the structured `abort` / `execution_failure` fields plus a module query for that.
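
To make the interaction between `max_trace_events`, `truncated`, and `redact_storage_keys` concrete, here is a minimal sketch of a bounded recorder. It is not the commit's recorder: the type names, the field layout, and the choice to count every dropped event in `truncated` are assumptions made for illustration.

```rust
/// Hypothetical trace entry shapes mirroring the two kinds described above.
#[derive(Debug, Clone, PartialEq)]
enum TraceEvent {
    StateView { version: u64, with_overrides: bool },
    /// `key` is `None` when storage keys are redacted.
    StorageRead { key: Option<String> },
}

/// Bounded recorder: keeps at most `max_events` entries and counts the rest.
struct TraceRecorder {
    events: Vec<TraceEvent>,
    max_events: usize,
    truncated: usize,
    redact_keys: bool,
}

impl TraceRecorder {
    fn new(max_events: usize, redact_keys: bool) -> Self {
        TraceRecorder { events: Vec::new(), max_events, truncated: 0, redact_keys }
    }

    /// Stores the event if there is room; otherwise bumps `truncated`.
    fn push(&mut self, event: TraceEvent) {
        if self.events.len() < self.max_events {
            self.events.push(event);
        } else {
            self.truncated += 1;
        }
    }

    /// Records a storage read, dropping the key text when redaction is on.
    fn record_storage_read(&mut self, key_debug: &str) {
        let key = if self.redact_keys { None } else { Some(key_debug.to_string()) };
        self.push(TraceEvent::StorageRead { key });
    }
}
```

With a limit of `2`, one `state_view` entry followed by two storage reads leaves two stored entries and `truncated == 1` in this sketch, which is the signal step 4 tells the caller to look for.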

## Reporting Results

When summarizing for the user:

- Always quote `success`, `vm_status`, and the structured `abort` / `execution_failure` fields verbatim — these are the ground truth.
- When citing an abort, give both the symbolic reason (if present) **and** the raw code; the symbolic name can be absent for older modules.
- If `local_override_in_use == true`, label the result as "replayed with local overrides" so the user does not confuse it with the on-chain outcome.
- Do not speculate about state changes beyond what the tool returned. If the user wants deeper post-state inspection, suggest re-running with tracing enabled rather than guessing.
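
The reporting rules can be applied mechanically once the response is parsed. The sketch below assumes a trimmed-down, hypothetical `ReplaySummary` struct. Its field names follow the response table, but the `u64` type for the abort code and the output layout are assumptions.

```rust
/// Hypothetical, trimmed-down view of the response fields used in a summary.
struct ReplaySummary<'a> {
    success: Option<bool>,
    vm_status: &'a str,
    abort_code: Option<u64>,
    abort_reason: Option<&'a str>,
    local_override_in_use: bool,
}

/// Renders a summary that quotes the fields as returned and labels
/// replays that used local overrides.
fn render(s: &ReplaySummary<'_>) -> String {
    let mut lines = Vec::new();
    if s.local_override_in_use {
        lines.push("replayed with local overrides".to_string());
    }
    let success = match s.success {
        Some(true) => "true",
        Some(false) => "false",
        None => "null",
    };
    lines.push(format!("success: {}", success));
    lines.push(format!("vm_status: {}", s.vm_status));
    if let Some(code) = s.abort_code {
        // Give the symbolic reason and the raw code together when both exist.
        match s.abort_reason {
            Some(reason) => lines.push(format!("abort: {} (code {})", reason, code)),
            None => lines.push(format!("abort: code {}", code)),
        }
    }
    lines.join("\n")
}
```

Putting the override label first makes it hard to mistake a patched replay for the on-chain outcome.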

aptos-move/flow/src/mcp/tools/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ mod package_spec_infer;
 mod package_status;
 pub(crate) mod package_test;
 mod package_verify;
+pub(crate) mod replay_tracing;
 mod replay_transaction;
 
 use super::package_data::VerifiedScope;
