fix(loadtest): eliminate harness false-positives across 30+ admin endpoints #5456
Open
Lang-Akshay wants to merge 3 commits into
Fix four categories of false-positive failures causing ~7.6% error rate
in `make load-test-cli` runs despite no real gateway misbehavior:
- Use `_validate_status` for /admin/events (SSE endpoint, not JSON)
- Allow 422 for GET /prompts/{id} (valid PromptError response)
- Use full uuid4().hex (128-bit) for user emails to prevent birthday
collisions at ~542 RPS sustained over 10 minutes
- Clear associated_resources from SSE virtual server to eliminate
ambiguous resource routing errors
Real gateway bugs (TaskGroup crash, -32603 tool error) tracked in #5453
and #5454.
Closes #5321
Signed-off-by: Lang-Akshay <akshay.shinde26@ibm.com>
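The birthday-collision claim in the commit message can be sanity-checked with a short calculation. This is a rough sketch, not part of the PR: it assumes the worst case in which every request in the run creates a user (the real user-creation rate is likely lower), and the helper name `collision_probability` is made up for illustration. Note that `uuid4().hex` is 128 bits wide, but only 122 of those bits are random, which is the figure used below.

```python
import math


def collision_probability(n_ids: int, bits: int) -> float:
    """Approximate P(at least one collision) among n_ids uniform random IDs.

    Uses the standard birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2.0 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


# Assumption: every request creates a user (upper bound on generated IDs).
n = 542 * 600  # ~542 RPS sustained for 10 minutes

p_short = collision_probability(n, 32)   # uuid4().hex[:8]: 8 hex chars = 32 bits
p_full = collision_probability(n, 122)   # uuid4().hex: 122 random bits

print(f"ids generated: {n}")
print(f"32-bit suffix collision probability: {p_short:.6f}")
print(f"full uuid4 collision probability:    {p_full:.3e}")
```

Under that assumption the 32-bit suffix is almost certain to collide at least once during a run, while the full UUID makes a collision negligible.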
7fc3215 to
d219976
Compare
Collaborator
|
PR fixes 4 load-test false positives (SSE endpoint wrongly validated as JSON, missing 422 case for prompt errors, 32-bit UUID collisions at high RPS, and ambiguous dual-registered resources). Might need a rebase on the latest main. Looks good otherwise. |
Collaborator
|
Log from the latest run: run.log. This is a significant improvement, given that this is a best-effort task subject to environment-based variation. |
Signed-off-by: Lang-Akshay <ashinde266@gmail.com>
Collaborator
|
3e7eae2 to
48e15c0
Compare
…asses to prevent false failures Signed-off-by: Lang-Akshay <akshay.shinde26@ibm.com>
Collaborator
Author
|
Thanks for the review @ja8zyjits
|
Summary
- Admin endpoints switched from `_validate_json_response` to `_validate_status`: these endpoints return HTML (not JSON) when `MCPGATEWAY_ADMIN_API_ENABLED=false`, causing false-positive `CatchResponseError` failures in load tests
- Adds the `SOFT_JSONRPC_ERROR_CODES = frozenset({-32000})` constant and tolerates JSON-RPC `-32000` errors in `_validate_jsonrpc_response`: `-32000` is expected when MCP backends are unavailable
- `POST /auth/email/admin/users` now accepts `400` alongside `403`/`409`/`422`: the server returns 400 for email validation and password policy failures
- `/admin/events` switched from `_validate_json_response` to `_validate_status`: the endpoint returns SSE (`text/event-stream`), not JSON
- `GET /prompts/{id}` now accepts `422` as a valid response: the gateway maps `PromptError` (missing required args) to HTTP 422
- User emails now use `uuid.uuid4().hex` (128-bit) instead of `uuid.uuid4().hex[:8]` (32-bit), eliminating birthday collisions at ~542 RPS over 10 minutes
- The SSE virtual server now sets `associated_resources: []` to remove ambiguous resource routing caused by both HTTP and SSE servers registering the same resource URIs

Affected admin endpoint groups (all converted to `_validate_status`):
- `heatmap`, `timeseries`, `top-errors`, `top-volume`, `top-slow`, `tools/usage`, `tools/performance`
- `cache`, `history`, `system`, `requests`
- `list`, `stats`, `detail`
- `servers`, `gateways`, `resources`, `prompts`, `users`, `tools`
- `tools/ids`, `servers/ids`
- `mcp-registry/servers`, `metrics/reset`, `well-known`, `grpc`

The two genuine gateway bugs surfaced by this investigation are tracked separately in #5453 (TaskGroup crash on `prompts/get`) and #5454 (`tools/call` returning `-32603`).

Closes #5321
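To illustrate why the choice of validator matters, here is a minimal, self-contained sketch of the three validation styles named in the summary. It is not the harness's actual code: the function signatures, the `FakeResponse` stand-in, and the `200` success code in the allowed lists are assumptions made for illustration. Only `SOFT_JSONRPC_ERROR_CODES` and the accepted error statuses come from the PR description.

```python
import json
from dataclasses import dataclass

# Constant as described in the summary; -32000 is treated as a soft error.
SOFT_JSONRPC_ERROR_CODES = frozenset({-32000})


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    text: str = ""


def validate_status(resp, allowed=(200,)):
    """Pass on status code alone; the body may be HTML, SSE, or JSON."""
    return resp.status_code in allowed


def validate_json_response(resp, allowed=(200,)):
    """Pass only if the status is allowed and the body parses as JSON."""
    if resp.status_code not in allowed:
        return False
    try:
        json.loads(resp.text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return False
    return True


def validate_jsonrpc_response(resp, allowed=(200,)):
    """Like the JSON check, but also tolerate soft JSON-RPC error codes."""
    if not validate_json_response(resp, allowed):
        return False
    body = json.loads(resp.text)
    error = body.get("error") if isinstance(body, dict) else None
    if error is None:
        return True
    return isinstance(error, dict) and error.get("code") in SOFT_JSONRPC_ERROR_CODES


# An SSE body is not JSON, so only the status-only check accepts it.
sse = FakeResponse(200, "data: ping\n\n")
print(validate_status(sse), validate_json_response(sse))

# A 422 passes once it is listed as allowed (200 here is an assumed success code).
print(validate_status(FakeResponse(422), allowed=(200, 422)))

# -32000 is tolerated, -32603 is still reported as a failure.
soft = FakeResponse(200, json.dumps({"error": {"code": -32000}}))
hard = FakeResponse(200, json.dumps({"error": {"code": -32603}}))
print(validate_jsonrpc_response(soft), validate_jsonrpc_response(hard))
```

Keeping `-32603` outside the soft set means the genuine gateway bug tracked in #5454 would still surface as a failure rather than being masked by the harness.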