🐛 Bug
The TypeTag string parser lexer (third_party/move/move-core/types/src/parser.rs) rejects identifiers starting with _ (underscore) or $, while Identifier::is_valid accepts them. This creates a Display→Parse roundtrip divergence: a valid StructTag with a _-prefixed module/struct name displays correctly, but the resulting string cannot be parsed back.
To reproduce
use move_core_types::{
    account_address::AccountAddress,
    identifier::Identifier,
    language_storage::{StructTag, TypeTag},
};
use std::str::FromStr;

// An identifier with a leading underscore is valid per is_valid()
let module = Identifier::new("_test").unwrap();
let tag = TypeTag::Struct(Box::new(StructTag {
    address: AccountAddress::ONE,
    module,
    name: Identifier::new("S").unwrap(),
    type_params: vec![],
}));

// BCS roundtrip: works fine
let bcs_bytes = bcs::to_bytes(&tag).unwrap();
let from_bcs: TypeTag = bcs::from_bytes(&bcs_bytes).unwrap();
assert_eq!(tag, from_bcs);

// Display: produces a valid-looking string
let display = tag.to_string();
// => "0x00000000000000000000000000000001::_test::S"

// Parse: FAILS
TypeTag::from_str(&display).unwrap_err();
// Error: "unrecognized token"
Root cause
identifier.rs:92 accepts _-prefixed (and $-prefixed) identifiers:
pub const fn is_valid(s: &str) -> bool {
    let b = s.as_bytes();
    match b {
        [b'a'..=b'z', ..] | [b'A'..=b'Z', ..] => all_bytes_valid(b, 1),
        [b'_', ..] | [b'$', ..] if b.len() > 1 => all_bytes_valid(b, 1),
        _ => false,
    }
}
parser.rs:181 — lexer only checks is_ascii_alphabetic() for the first character:
c if c.is_ascii_alphabetic() => { // '_' and '$' are not alphabetic
    let mut r = String::new();
    r.push(c);
    for c in it {
        if identifier::is_valid_identifier_char(c) { ... }
    }
    (name_token(r), len)
},
The inner loop correctly uses is_valid_identifier_char (allows _ and $), but the first-character check is stricter. The Move compiler (move-compiler-v2/.../lexer.rs:499) also allows _ as a valid identifier start character, confirming the parser is out of sync.
The is_valid code was updated to accept _/$ prefixes, but the parser lexer was not updated to match.
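The mismatch can be shown without pulling in the Move crates. The sketch below re-implements both first-character rules in isolation. `is_valid_identifier_char` and `all_bytes_valid` reuse the names quoted above, but their bodies here are assumptions reconstructed from the description in this issue, and `lexer_accepts_start` and `divergent` are hypothetical helpers (the former stands in for the guard at parser.rs:181).

```rust
/// Assumed character class for identifier bodies: ASCII letters, digits, `_`, `$`.
const fn is_valid_identifier_char(c: char) -> bool {
    matches!(c, '_' | '$' | 'a'..='z' | 'A'..='Z' | '0'..='9')
}

/// Assumed helper: every byte from `start` onward is a valid identifier char.
const fn all_bytes_valid(b: &[u8], start: usize) -> bool {
    let mut i = start;
    while i < b.len() {
        if !is_valid_identifier_char(b[i] as char) {
            return false;
        }
        i += 1;
    }
    true
}

/// Identifier rule as quoted in this issue.
const fn is_valid(s: &str) -> bool {
    let b = s.as_bytes();
    match b {
        [b'a'..=b'z', ..] | [b'A'..=b'Z', ..] => all_bytes_valid(b, 1),
        [b'_', ..] | [b'$', ..] if b.len() > 1 => all_bytes_valid(b, 1),
        _ => false,
    }
}

/// Hypothetical stand-in for the lexer's current first-character guard.
fn lexer_accepts_start(s: &str) -> bool {
    s.chars().next().map_or(false, |c| c.is_ascii_alphabetic())
}

/// Returns the samples that are valid identifiers but that the lexer guard rejects.
fn divergent<'a>(samples: &[&'a str]) -> Vec<&'a str> {
    samples
        .iter()
        .copied()
        .filter(|s| is_valid(s) && !lexer_accepts_start(s))
        .collect()
}
```

Calling `divergent(&["_test", "$x", "S", "_", "1a", "a_b"])` returns only `_test` and `$x`: the names that `is_valid` accepts but that the lexer cannot start a token on. A lone `_` is not reported because `is_valid` requires more than one byte for `_`- and `$`-prefixed names.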
Expected fix
c if c.is_ascii_alphabetic() || c == '_' || c == '$' => {
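To illustrate the effect of the relaxed guard, here is a simplified, self-contained model of name-token scanning. `lex_name` is a hypothetical helper, not the signature used in parser.rs, and the body of `is_valid_identifier_char` is an assumption based on the description above; the real lexer wraps the result in `name_token`, which is omitted here.

```rust
/// Assumed character class for identifier bodies: ASCII letters, digits, `_`, `$`.
fn is_valid_identifier_char(c: char) -> bool {
    matches!(c, '_' | '$' | 'a'..='z' | 'A'..='Z' | '0'..='9')
}

/// Hypothetical helper: scan one name token from the start of `input`.
/// Returns the token text and its length in bytes, or `None` if the first
/// character cannot start an identifier.
fn lex_name(input: &str) -> Option<(String, usize)> {
    let mut it = input.chars();
    let c = it.next()?;
    // Relaxed guard: ASCII letters plus `_` and `$`.
    if !(c.is_ascii_alphabetic() || c == '_' || c == '$') {
        return None;
    }
    let mut r = String::new();
    r.push(c);
    // Consume the rest of the identifier and stop at the first other character.
    for c in it {
        if is_valid_identifier_char(c) {
            r.push(c);
        } else {
            break;
        }
    }
    let len = r.len();
    Some((r, len))
}
```

With this guard, `lex_name("_test::S")` yields the token `_test` with length 5 instead of failing on the first character. One caveat: the guard alone also lets a lone `_` or `$` through as a token, which `is_valid` rejects, so a fix may still need the finished token to be validated against `is_valid` somewhere downstream.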
Affected files
- third_party/move/move-core/types/src/parser.rs: first-character check
- Also present in the archived move-language/move upstream