Skip to content

fix(flat): decode list/pair elements without type tags#5

Open
rodrigomd94 wants to merge 1 commit into
txpipe:mainfrom
rodrigomd94:main
Open

fix(flat): decode list/pair elements without type tags#5
rodrigomd94 wants to merge 1 commit into
txpipe:mainfrom
rodrigomd94:main

Conversation

@rodrigomd94

@rodrigomd94 rodrigomd94 commented Nov 17, 2025

Copy link
Copy Markdown

The Bug

The decoder was calling decode_constant() for list and pair elements, which tries to read type tags for each element. However, in UPLC flat encoding, type tags are only written once for the parent List/Pair type, not for each individual element.

This caused a "Missing type tag" error when decoding List constants at byte 163 in production scripts.

The Fix

  1. Created decode_constant_value() function that decodes values without reading type tags (the type is already known from the parent)
  2. Updated List decoding to use decode_constant_value()
  3. Updated Pair decoding to use decode_constant_value()

How It Works

When decoding a constant with type List:

  • Old behavior: Read type tags ONCE, then for each element read TYPE + VALUE
  • New behavior: Read type tags ONCE, then for each element read VALUE only

This matches the UPLC flat encoding specification where typed containers encode the type once, then encode only values for elements.

Testing

Verified with production Cardano scripts that previously failed at byte 163. The fix allows successful decoding of List and other typed collections.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling for unsupported cryptographic type operations.
  • Refactor

    • Streamlined constant value decoding to eliminate redundant type processing.
    • Enhanced handling of nested and complex data structures for improved performance.

## The Bug
The decoder was calling decode_constant() for list and pair elements,
which tries to read type tags for each element. However, in UPLC flat
encoding, type tags are only written once for the parent List/Pair type,
not for each individual element.

This caused a "Missing type tag" error when decoding List<Data> constants
at byte 163 in production scripts.

## The Fix
1. Created decode_constant_value() function that decodes values without
   reading type tags (the type is already known from the parent)
2. Updated List decoding to use decode_constant_value()
3. Updated Pair decoding to use decode_constant_value()

## How It Works
When decoding a constant with type List<T>:
- Old behavior: Read type tags ONCE, then for each element read TYPE + VALUE
- New behavior: Read type tags ONCE, then for each element read VALUE only

This matches the UPLC flat encoding specification where typed containers
encode the type once, then encode only values for elements.

## Testing
Verified with production Cardano scripts that previously failed at byte 163.
The fix allows successful decoding of List<Data> and other typed collections.
@coderabbitai

coderabbitai Bot commented Nov 17, 2025

Copy link
Copy Markdown

Walkthrough

The change refactors flat decoding logic in the UPLC module by introducing a new helper function decode_constant_value that decodes constants when their type is already known, eliminating redundant type tag re-reading. Complex types (List, Pair, Data) are updated to use this helper, and BLS types now return an unsupported error.

Changes

Cohort / File(s) Summary
UPLC Flat Decode Refactoring
crates/uplc/src/flat/decode/mod.rs
Introduced decode_constant_value helper function for type-aware decoding without re-reading type tags. Updated List and Pair decoding to recursively use the helper with provided subtypes. Added Data type CBOR decoding via d.bytes and minicbor::decode_with. Added BlsTypeNotSupported error for BLS12-381 types.

Sequence Diagram

sequenceDiagram
    autonumber
    participant decode_constant as decode_constant<br/>(legacy path)
    participant decode_constant_value as decode_constant_value<br/>(new optimized path)
    
    rect rgb(220, 240, 255)
    Note over decode_constant: Old: Reads type tag + decodes
    decode_constant->>decode_constant: read type tag
    decode_constant->>decode_constant: decode based on tag
    end
    
    rect rgb(240, 220, 255)
    Note over decode_constant_value: New: Type known, skip tag read
    decode_constant_value->>decode_constant_value: use provided Type
    decode_constant_value->>decode_constant_value: decode with subtype info
    end
    
    rect rgb(255, 240, 220)
    Note over decode_constant: Complex types now delegate
    decode_constant->>decode_constant_value: List(sub_typ)
    decode_constant->>decode_constant_value: Pair(sub_typ1, sub_typ2)
    decode_constant->>decode_constant_value: Data (CBOR decode)
    decode_constant_value-->>decode_constant: return Constant
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify decode_constant_value correctly handles all type variants without type tag reads
  • Ensure recursive List and Pair decoding maintains correctness with subtype delegation
  • Validate Data type CBOR decoding integration with minicbor::decode_with
  • Confirm BLS type error path is appropriate and doesn't break existing behavior
  • Check that optimization doesn't introduce edge cases or performance regressions

Poem

🐰 A helper function hops into view,
Type tags skip'd—efficiency shines bright and true!
Lists and Pairs now know their way,
Data decodes with CBOR's display,
BLS types bow out—optimization's here to stay! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main fix: avoiding type tag re-reading for list/pair elements during decoding, which aligns with the core bug fix objective.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example:

"Create a concise high-level summary as a bullet-point list. Then include a Markdown table showing lines added and removed by each contributing author."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
crates/uplc/src/flat/decode/mod.rs (2)

236-291: Function logic is correct, but consider refactoring to reduce duplication.

The decode_constant_value function correctly decodes constant values when the type is already known, and the recursive handling of List and Pair types is appropriate. However, there's substantial code duplication between this function and decode_constant (lines 179-234).

Suggested refactoring:

Consider restructuring to eliminate duplication. The decode_constant function could call decode_type first, then delegate to decode_constant_value:

fn decode_constant<'a>(
    ctx: &mut Ctx<'a>,
    d: &mut Decoder,
) -> Result<&'a Constant<'a>, FlatDecodeError> {
    let ty = decode_type(ctx, d)?;
    decode_constant_value(ctx, d, ty)
}

This would:

  • Eliminate ~55 lines of duplicated code
  • Make the distinction clearer: decode_constant reads type tags first, decode_constant_value uses provided type
  • Reduce maintenance burden (changes only need to be made in one place)
  • Keep the fix's correctness intact

236-237: Consider adding comprehensive documentation.

The inline comment is helpful but brief. Consider adding a doc comment that explains:

  • When to use this function vs decode_constant
  • That it's used for decoding elements of composite types (List, Pair) where the type is already known from the parent
  • The relationship to the UPLC flat encoding specification

Example:

/// Decodes a constant value when the type is already known (e.g., from a parent List or Pair).
///
/// Unlike `decode_constant`, this function does not read type tags from the decoder,
/// as UPLC flat encoding writes type tags only once for the parent type, not per-element.
///
/// # Arguments
/// * `ctx` - The decoding context with arena allocator
/// * `d` - The decoder positioned at the constant value (after type tags)
/// * `ty` - The already-known type of the constant to decode
fn decode_constant_value<'a>(
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd174b3 and 48e045a.

📒 Files selected for processing (1)
  • crates/uplc/src/flat/decode/mod.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
crates/uplc/src/flat/decode/mod.rs (3)
crates/uplc/src/constant.rs (9)
  • proto_list (74-80)
  • proto_pair (82-95)
  • data (66-68)
  • integer (27-29)
  • integer (46-48)
  • byte_string (54-56)
  • bool (62-64)
  • string (58-60)
  • unit (70-72)
crates/uplc/src/flat/decode/decoder.rs (1)
  • integer (184-188)
crates/uplc/src/data.rs (2)
  • integer (42-44)
  • byte_string (50-52)
🔇 Additional comments (4)
crates/uplc/src/flat/decode/mod.rs (4)

210-215: LGTM! Core fix correctly eliminates redundant type tag reads for list elements.

The change from decode_constant to decode_constant_value correctly addresses the bug where type tags were being read for each list element. By passing sub_typ to the new helper, the decoder now reads the type tag once for the parent List and uses it for all elements, matching the UPLC flat encoding specification.


216-223: LGTM! Pair element decoding correctly fixed.

The change to use decode_constant_value for both pair elements (fst and snd) correctly addresses the bug, ensuring type tags are read once for the parent Pair type and not redundantly for each element.


224-229: LGTM! Data type handling correctly added.

The Data type decoding logic correctly reads CBOR-encoded bytes and decodes them using minicbor, which is appropriate for PlutusData deserialization.


230-233: LGTM! Appropriate error handling for unsupported BLS types.

Returning BlsTypeNotSupported error for BLS types provides clear feedback and prevents silent failures.

pgrange added a commit to pgrange/dolos that referenced this pull request Nov 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant