bug: No input validation on search query params - allows abuse #20

Description

@PandyaJeet

Summary

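The search endpoints accept q as an unbounded query string with no validation constraints, so arbitrarily long queries pass straight through to upstream APIs and scrapers. On /api/search/seo, empty queries are rejected only by a manual check that returns HTTP 400, not by schema validation.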

Reproduction

curl 'http://localhost:8000/api/search/seo?q=' 
# returns HTTP 400 (handled)

curl 'http://localhost:8000/api/search/seo?q='$(python -c "print('a'*10000)")
# accepted - 10k char query passed to SerpAPI and all 3 scrapers
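
The two requests above can also be built in Python, which may be handy when turning the reproduction into a regression test. This is a minimal sketch: build_search_url is a hypothetical helper, not part of the project, and it only constructs the URLs without sending them.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl commands in this issue.
BASE_URL = 'http://localhost:8000/api/search/seo'


def build_search_url(query: str) -> str:
    """Hypothetical helper: return the search URL with q URL-encoded."""
    return BASE_URL + '?' + urlencode({'q': query})


# Same two cases as the curl reproduction.
empty_url = build_search_url('')
long_url = build_search_url('a' * 10000)
```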

Current code

Expected fix

Add min/max length constraints: Query(...) for GET endpoints and Field(...) for Pydantic request models. For GET endpoints:

from fastapi import Query

@router.get('/seo')
async def get_seo(q: str = Query(..., min_length=1, max_length=500)):
    ...

For the Pydantic request model in ai.py:

from typing import Dict, Literal, Optional

from pydantic import BaseModel, Field

class ContextualAIRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=500)
    persona: Literal['default', 'chatgpt', 'gemini', 'perplexity', 'claude'] = 'default'
    context: Optional[Dict] = None
    region: str = Field(default='us', max_length=4)
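
To show how the proposed model would behave, here is a minimal sketch that repeats the model from above and wraps it in a hypothetical is_rejected helper (not part of the project). It relies only on Pydantic raising ValidationError for inputs that violate the constraints.

```python
from typing import Dict, Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class ContextualAIRequest(BaseModel):
    query: str = Field(..., min_length=1, max_length=500)
    persona: Literal['default', 'chatgpt', 'gemini', 'perplexity', 'claude'] = 'default'
    context: Optional[Dict] = None
    region: str = Field(default='us', max_length=4)


def is_rejected(**payload) -> bool:
    """Hypothetical helper: True if the payload fails model validation."""
    try:
        ContextualAIRequest(**payload)
    except ValidationError:
        return True
    return False


# Expected to be rejected: empty query, 501-char query, unknown persona.
rejected = [
    is_rejected(query=''),
    is_rejected(query='a' * 501),
    is_rejected(query='ok', persona='bard'),
]
# Expected to be accepted: an ordinary query and one at the 500-char limit.
accepted = [
    not is_rejected(query='best laptops'),
    not is_rejected(query='a' * 500),
]
```

Note that min_length=1 still accepts a whitespace-only query such as ' '. If that should be rejected too, the value would need to be stripped before validation, which is outside the scope of this issue.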

Acceptance Criteria

  • Queries over 500 chars return HTTP 422
  • Empty queries return HTTP 422 (instead of the manual 400 currently returned)
  • Invalid persona names return HTTP 422
  • Existing valid queries still work

GSSoC Points

level:beginner (+20) + type:bug (+10) + type:security (+20) + gssoc:approved (+50) = 100 base pts

Metadata

Labels

  • good first issue - Good for newcomers
  • level:beginner - Good for new contributors (+20 pts)
  • type:bug - Something isn't working (+10 pts)
  • type:security - Security improvement (+20 pts)
