Escape command name in _remove_command regex#123
Open
Nas01010101 wants to merge 1 commit into
Open
Conversation
`_remove_command` interpolated the user-supplied `command` directly into a
regex pattern, so a command name containing regex metacharacters was matched
as a pattern rather than literally. For example, deleting a command named
`todo.` would also strip `\todoX{...}` because `.` matches any character.
Wrap the name with `regex.escape()`, consistent with the other pattern
constructions in this module (e.g. the filename handling in
`_replace_includegraphics`). Add parameterized regression tests covering `.`,
`*`, and `+` in command names; they fail before this change and pass after.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
_remove_commandinterpolates the user-suppliedcommanddirectly into a regexbase_pattern:So a command name containing regex metacharacters is matched as a pattern rather than literally. For example, with
commands_to_delete=['todo.']:.matches any character, so\todoX{...}is wrongly removed alongside\todo.{...}(over-removal).*/+in a name make the pattern match the wrong span (or fail to match), so the targeted command may not be removed at all (under-removal).Fix
Wrap the name with
regex.escape()so it is matched literally:This is consistent with how the rest of this module already builds patterns from
dynamic strings (e.g. the filename handling in
_replace_includegraphics, whichuses
regex.escapein several places).commands_to_delete/commands_only_to_deleteare documented as literal commandnames (CLI
--helpandcleaner_config.yaml), so literal matching is the intendedbehavior; this does not change handling of normal alphabetic command names.
Tests
Adds a parameterized regression test (
test_remove_command_escapes_regex_metacharacters_in_name)covering
.,*, and+in command names. Each case asserts the look-alike command ispreserved and the targeted command is removed. The cases fail before this change and pass after;
the full unit-test suite remains green.