
fix: isolate scraper failures so one crash doesn't break community search (#106) #107

Merged
PandyaJeet merged 1 commit into PandyaJeet:main from saurabhhhcodes:fix/scraper-fault-tolerance-106
Jun 25, 2026

Conversation

@saurabhhhcodes


Closes #106

Summary

asyncio.gather without return_exceptions=True propagates the first exception from any scraper, causing the entire /api/search/community endpoint to return 500.

Change: Added return_exceptions=True to asyncio.gather and replaced any failed scraper result with an empty list. Working scrapers still return data even when others fail.
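As a rough illustration of the change described above, here is a minimal, self-contained sketch. The scraper coroutines (`scrape_ok`, `scrape_broken`) and the `community_search` function are hypothetical stand-ins, not the project's actual code; only `asyncio.gather(..., return_exceptions=True)` and the empty-list fallback come from the PR description.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def scrape_ok(query):
    # Hypothetical scraper that succeeds.
    return [f"result for {query}"]


async def scrape_broken(query):
    # Hypothetical scraper that fails, e.g. because the site layout changed.
    raise RuntimeError("upstream site changed its markup")


async def community_search(query):
    # With return_exceptions=True, gather returns exception objects in place
    # of results instead of raising the first one it encounters.
    results = await asyncio.gather(
        scrape_ok(query),
        scrape_broken(query),
        return_exceptions=True,
    )
    cleaned = []
    for result in results:
        if isinstance(result, Exception):
            # Log the failure and fall back to an empty list for that source.
            logger.warning("scraper failed: %r", result)
            cleaned.append([])
        else:
            cleaned.append(result)
    return cleaned


search_output = asyncio.run(community_search("asyncio"))
```

In this sketch the working scraper's data is preserved and the failing scraper contributes an empty list, so the caller can still build a normal response rather than a 500.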

Type of Change

  • Bug fix

…arch (PandyaJeet#106)

asyncio.gather without return_exceptions=True propagates the first
exception from any scraper, causing the entire /api/search/community
endpoint to return 500. Added return_exceptions=True and replaced
failed scraper results with empty lists so working sources still
return data.

@itsdakshjain itsdakshjain left a comment

Contributor


Looks good. Third-party scrapers are risky, so using return_exceptions=True is the right move to stop a single crash from killing the whole endpoint.

One note: you loop through the results to log the errors, but then do four separate isinstance checks right after to reset the variables. You could handle the logging and the fallback assignment inside that same loop to cut down on the repetitive code.

It's optional; everything else looks good to merge, @PandyaJeet.
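The single-loop refactor suggested in this review might look something like the sketch below. The source names and the `normalize_results` helper are assumptions made for illustration; the PR page does not show the actual variable names, only that four separate checks follow the logging loop.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical source names; the real scrapers are not named on this page.
SOURCES = ("source_a", "source_b", "source_c", "source_d")


def normalize_results(raw_results, sources=SOURCES):
    """Log failures and assign the empty-list fallback in one pass."""
    by_source = {}
    for name, result in zip(sources, raw_results):
        if isinstance(result, Exception):
            # Logging and fallback happen together, so no per-source
            # isinstance checks are needed after the loop.
            logger.warning("%s scraper failed: %r", name, result)
            by_source[name] = []
        else:
            by_source[name] = result
    return by_source


# Example input shaped like the output of
# asyncio.gather(..., return_exceptions=True).
normalized = normalize_results(
    [["a1"], ValueError("boom"), [], TimeoutError("slow")]
)
```

Keying the results by source name keeps the fallback logic in one place, which also means adding a fifth scraper would only require extending the list of names.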

@PandyaJeet PandyaJeet added level:beginner Good for new contributors (+20 pts) type:performance Performance improvement (+15 pts) gssoc:approved GSSoC approved PR - earns base 50 pts mentor:itsdakshjain labels Jun 25, 2026
@PandyaJeet PandyaJeet merged commit 3238e08 into PandyaJeet:main Jun 25, 2026
1 check passed


Development

Successfully merging this pull request may close these issues.

[BUG] - Failure in single web scraper crashes the entire community search route with 500 error

3 participants