The SightX Inference Engine is a clinical diagnostic service. It uses a custom ResNet-50 V2 architecture to grade Diabetic Retinopathy (DR) severity from retinal fundus photographs.
- Monte-Carlo TTA Ensemble: Each prediction runs through a 108-iteration Test-Time Augmentation loop to improve robustness and capture epistemic uncertainty.
- Bayesian Prior Correction: The engine corrects for class imbalance in the EyePACS training set by renormalizing predictions against real-world clinical prevalence.
- Risk-Minimized Decisions: Decisions are made using Bayesian decision theory, prioritizing clinical safety by penalizing false negatives according to a medical cost matrix.
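The three features above can be sketched end to end in plain Python. This is a minimal illustration and not the engine's actual code. It assumes that the TTA passes are combined by averaging, that prior correction reweights each class by the ratio of target to training prevalence, and that `cost[true][pred]` is the cost matrix layout. The function names, prior values, and cost values are all hypothetical placeholders.

```python
from typing import List, Sequence


def mean_probs(samples: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities over all TTA passes."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def correct_prior(
    probs: Sequence[float],
    train_prior: Sequence[float],
    target_prior: Sequence[float],
) -> List[float]:
    """Reweight each class by target/train prevalence, then renormalize."""
    weighted = [p * t / s for p, s, t in zip(probs, train_prior, target_prior)]
    total = sum(weighted)
    return [w / total for w in weighted]


def min_risk_grade(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> int:
    """Return the grade with the lowest expected cost; cost[true][pred]."""
    n = len(probs)
    risks = [sum(probs[t] * cost[t][d] for t in range(n)) for d in range(n)]
    return min(range(n), key=risks.__getitem__)


if __name__ == "__main__":
    # Three passes stand in for the 108-iteration loop (values are made up).
    tta_samples = [
        [0.50, 0.30, 0.15, 0.03, 0.02],
        [0.45, 0.35, 0.15, 0.03, 0.02],
        [0.55, 0.25, 0.15, 0.03, 0.02],
    ]
    # Hypothetical prevalences for DR grades 0..4, not real EyePACS figures.
    train_prior = [0.70, 0.10, 0.15, 0.03, 0.02]
    target_prior = [0.80, 0.08, 0.08, 0.02, 0.02]
    # Hypothetical costs: under-grading (pred < true) costs 4x over-grading.
    cost = [
        [4 * (t - d) if d < t else (d - t) for d in range(5)]
        for t in range(5)
    ]
    probs = correct_prior(mean_probs(tta_samples), train_prior, target_prior)
    print("corrected probabilities:", [round(p, 4) for p in probs])
    print("risk-minimizing grade:", min_risk_grade(probs, cost))
```

With an asymmetric cost matrix, the selected grade can be higher than the most probable grade, which is how a false-negative penalty shifts decisions toward the safer side.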
- Framework: FastAPI
- Runtime: PyTorch (CUDA/MPS Accelerated)
- Image Processing: Torchvision + Pillow
- Math & Ops: NumPy
- Model Warm-up: The model is loaded into memory on startup. Ensure the `checkpoints/best_model.pt` is present in the container.
- Resolution Standard: All inference pipelines must maintain a `384x384` resolution to stay consistent with V2 training weights.
- Accelerator Selection: The engine automatically detects and utilizes GPU (NVIDIA) or MPS (Apple Silicon) if available.
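Accelerator selection of this kind is commonly written as a short fallback chain. The sketch below is one possible shape, not the engine's code. The function names are hypothetical, and the preference order (CUDA, then MPS, then CPU) is an assumption. PyTorch is imported lazily so the selection logic can be exercised without it.

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick a device name; assumed preference is CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available accelerators; fall back to CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return select_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print("selected device:", detect_device())
```

The returned string can be passed to `torch.device(...)` when the checkpoint is loaded at startup.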
- Use TTA for Diagnostics: Always maintain the TTA ensemble loop for official clinical results.
- Calibrate Probabilities: Ensure `OPTIMAL_TEMPERATURE` is periodically re-calibrated using `calibrate_temperature.py`.
- Sanitize Responses: Return clear clinical tiers (`Doctor Visit Mandatory`, etc.) alongside raw model grades.
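For context on the calibration item, temperature scaling divides the logits by a scalar before the softmax. The sketch below assumes that is how `OPTIMAL_TEMPERATURE` is applied; the value `1.5` is a placeholder rather than the engine's calibrated value, and the function name is hypothetical.

```python
import math
from typing import List, Sequence

# Placeholder only; the real value would come from calibrate_temperature.py.
OPTIMAL_TEMPERATURE = 1.5


def softmax_with_temperature(
    logits: Sequence[float], temperature: float = OPTIMAL_TEMPERATURE
) -> List[float]:
    """Softmax over logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    print("T=1.0:", softmax_with_temperature(logits, 1.0))
    print("T=1.5:", softmax_with_temperature(logits))
```

A temperature above 1 flattens the distribution and a temperature below 1 sharpens it. The ranking of classes is unchanged, so calibration affects confidence but not the most probable grade.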
- Do Not Modify Pre-processing: Do not alter the normalization constants in `preprocessing.py` (`IMAGENET_MEAN`/`STD`), as they are baked into the pre-trained backbone.
- Do Not Bypass the Post-processor: Never return raw softmax outputs directly; they must pass through `clinical_postprocessor.py` for risk minimization.
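To show why the normalization constants must stay fixed, the sketch below applies per-channel normalization to a single RGB pixel. It assumes `preprocessing.py` uses the standard ImageNet statistics, which is the usual choice for ImageNet-pretrained backbones but is not confirmed here. The name `IMAGENET_STD` and the helper `normalize_pixel` are hypothetical.

```python
from typing import Tuple

# Standard ImageNet channel statistics (RGB, inputs scaled to [0, 1]).
# Assumed to match the constants in preprocessing.py.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb: Tuple[int, int, int]) -> Tuple[float, ...]:
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


if __name__ == "__main__":
    print("black pixel:", normalize_pixel((0, 0, 0)))
    print("white pixel:", normalize_pixel((255, 255, 255)))
```

A backbone trained on inputs standardized this way expects the same value range at inference, so changing either constant shifts every input away from the distribution the weights were fitted to.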
- Local Server: run `pip install -r requirements.txt`, then `uvicorn main:app --reload`
- Docker Deployment: run `docker-compose build inference-engine`
© 2026 SightX • Clinical AI Documentation