check uri blacklist before fetching animation_url in nft parser#20186
Open
madib06ops wants to merge 1 commit into
Open
check uri blacklist before fetching animation_url in nft parser#20186madib06ops wants to merge 1 commit into
madib06ops wants to merge 1 commit into
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
raw_image_uriis checked againsturi_blacklistbefore it is fetched, but theraw_animation_uribranch parses and optimizes the URI without that check. A token whose JSON metadata setsanimation_urlto a blacklisted entry is still fetched and optimized, so the blacklist is only half enforced. Bothimageandanimation_urlcome from the same untrusted on-chain metadata, so an operator that blacklists a host to stop the crawler from reaching it can still be driven to it through the animation field. This adds the same check on the animation path, mirroring the image branch (skip, markdo_not_parse, bump theblacklistskip counter, return early).How Has This Been Tested?
cargo check -p aptos-nft-metadata-crawlerandcargo clippy -p aptos-nft-metadata-crawlerare clean. The new branch is identical in shape to the existingraw_image_uriblacklist branch, which is exercised by the sameparseflow.Workerrequires a live Postgres connection, so there is no unit-test harness for this path to extend.Key Areas to Review
The animation branch returns
Ok(())after settingdo_not_parse, matching how a blacklisted image URI is handled (both skip the trailing retry/success bookkeeping). Worth confirming that early-return is the intended behavior for a blacklisted animation, same as for a blacklisted image.Type of Change
Which Components or Systems Does This Change Impact?
Checklist
Note
Low Risk
Small, symmetric guard in the NFT metadata crawler parser with no auth or schema changes; behavior aligns with existing image blacklist handling.
Overview
The NFT metadata crawler already blocked
raw_image_uriagainsturi_blacklistbefore fetch/optimize, butanimation_urlcould still hit blacklisted hosts. This change runs the sameis_blacklisted_uricheck onraw_animation_uriimmediately beforeURIParser::parseand animation optimization.On a match, the worker sets
do_not_parse, upserts, incrementsSKIP_URI_COUNTwith labelblacklist, and returns early—matching the image and top-level asset URI blacklist behavior so operators cannot be driven to blocked hosts via metadataanimation_url.Reviewed by Cursor Bugbot for commit 3c066ba. Bugbot is set up for automated code reviews on this repo. Configure here.