Automated AI-powered pneumonia detection from chest X-ray images using an ensemble of three deep learning models, Grad-CAM visual explainability, a Streamlit frontend, and a Flask REST API.
Overview · Architecture · Dataset · Installation · Usage · Streamlit App · API · Results · Structure
Pneumonia is a serious lung infection responsible for millions of deaths worldwide each year. Detecting it from chest X-rays requires trained radiologists — a resource that is scarce in many parts of the world. This project demonstrates how deep learning can assist clinicians by automatically flagging potential pneumonia cases in seconds.
- Classifies chest X-ray images as NORMAL or PNEUMONIA with >96% accuracy
- Explains every prediction using Grad-CAM heatmaps that highlight infected lung regions
- Ensembles three pretrained models (ResNet-50, DenseNet-121, EfficientNet-B3) for maximum robustness
- Deploys as an interactive Streamlit web app and a production Flask REST API
- Tests the full pipeline with pytest unit tests
| Challenge | This Project's Solution |
|---|---|
| Radiologist shortage | AI triage — flag likely pneumonia cases instantly |
| Slow manual diagnosis | ~14 ms inference per image |
| Black-box AI concern | Grad-CAM highlights the image regions that most influenced each prediction |
| Single-model fragility | Ensemble of 3 independent architectures |
| Research-only tools | Production REST API + Streamlit UI |
      Chest X-ray Image (JPEG / PNG)
                    │
                    ▼
       ┌──────────────────────────┐
       │ Preprocessing            │
       │ Resize 224×224           │
       │ Normalize (ImageNet μ,σ) │
       │ Augment (train only)     │
       └────────────┬─────────────┘
                    │
      ┌─────────────┼─────────────────┐
      │             │                 │
      ▼             ▼                 ▼
 ┌──────────┐ ┌────────────┐ ┌─────────────────┐
 │ResNet-50 │ │DenseNet-121│ │ EfficientNet-B3 │
 │Pretrained│ │Pretrained  │ │ Pretrained      │
 │ImageNet  │ │ImageNet    │ │ ImageNet        │
 └────┬─────┘ └─────┬──────┘ └────────┬────────┘
      │             │                 │
      └─────────────▼─────────────────┘
                    │
            Ensemble Average
            (sigmoid fusion)
                    │
                    ▼
         ┌─────────────────────┐
         │ Binary Output       │
         │ P(Pneumonia) ∈ [0,1]│
         │ Threshold = 0.50    │
         └──────────┬──────────┘
                    │
        ┌───────────▼────────────┐
        │ Grad-CAM Heatmap       │
        │ Highlights infected    │
        │ lung regions visually  │
        └────────────────────────┘
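The fusion step in the diagram can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's `models.py`: it assumes each backbone emits one raw logit, that the three models are weighted equally, and that a probability exactly at the threshold counts as PNEUMONIA. The logit values are made up.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def ensemble_probability(logits: list[float]) -> float:
    """Average the per-model sigmoid probabilities (equal weights assumed)."""
    probs = [sigmoid(z) for z in logits]
    return sum(probs) / len(probs)


def classify(p_pneumonia: float, threshold: float = 0.5) -> str:
    """Apply the decision threshold to P(Pneumonia)."""
    return "PNEUMONIA" if p_pneumonia >= threshold else "NORMAL"


# Hypothetical logits from ResNet-50, DenseNet-121 and EfficientNet-B3
logits = [2.1, 3.0, 1.4]
p = ensemble_probability(logits)
label = classify(p)
print(f"P(Pneumonia) = {p:.4f} -> {label}")
```

Averaging probabilities rather than logits keeps each model's vote bounded in [0, 1], so a single overconfident backbone cannot dominate the result.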
Loss Function : Binary Cross-Entropy with pos_weight=3.0 (handles class imbalance)
Optimizer : Adam (lr=1e-4, weight_decay=1e-5)
Scheduler : ReduceLROnPlateau (patience=3, factor=0.3)
Early Stopping : patience=7 epochs on val loss
Augmentation : RandomCrop · HorizontalFlip · Rotation ±15° · ColorJitter
Metric : Best checkpoint saved by F1 score
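To make the effect of `pos_weight` concrete, here is a small pure-Python sketch of the per-sample loss that `BCEWithLogitsLoss` computes. It assumes label 1 means PNEUMONIA, which the source does not state explicitly; the function name is illustrative only.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_with_logits(logit: float, target: float, pos_weight: float = 1.0) -> float:
    """Per-sample binary cross-entropy on a raw logit.

    loss = -[pos_weight * y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))]
    using log(sigmoid(z)) = -softplus(-z) and log(1 - sigmoid(z)) = -softplus(z).
    """
    return pos_weight * target * softplus(-logit) + (1.0 - target) * softplus(logit)


# An undecided model (logit 0) on one positive and one negative sample
loss_pos = bce_with_logits(0.0, 1.0, pos_weight=3.0)  # positive term is scaled by 3
loss_neg = bce_with_logits(0.0, 0.0, pos_weight=3.0)  # negative term is unchanged
print(loss_pos, loss_neg)
```

Under that label assumption, `pos_weight=3.0` makes a missed pneumonia case three times as costly as a false alarm of the same confidence, which pushes the model toward higher recall.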
Chest X-Ray Images (Pneumonia) — Paul Mooney 🔗 kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
| Split | NORMAL | PNEUMONIA | Total |
|---|---|---|---|
| Train | 1,341 | 3,875 | 5,216 |
| Test | 234 | 390 | 624 |
Images are JPEG grayscale chest X-rays, resized to 224×224 and converted to 3-channel RGB for compatibility with ImageNet-pretrained models.
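The grayscale-to-RGB conversion and ImageNet normalization can be illustrated on a single pixel. This is a sketch in plain Python, not the project's `dataset.py`; the mean and standard deviation are the standard ImageNet statistics, and the function name is hypothetical.

```python
# Standard per-channel ImageNet statistics (R, G, B)
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def gray_pixel_to_normalized_rgb(value: int) -> tuple[float, float, float]:
    """Replicate one 8-bit grayscale value across R, G, B and normalize each channel."""
    scaled = value / 255.0  # map 0..255 to 0..1
    return tuple((scaled - m) / s for m, s in zip(IMAGENET_MEAN, IMAGENET_STD))


mid_gray = gray_pixel_to_normalized_rgb(128)
print(mid_gray)
```

The three channels start out identical but end up slightly different after normalization, because each channel has its own mean and standard deviation.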
# 1. Get your Kaggle API key from https://www.kaggle.com/settings/account
mkdir -p ~/.kaggle
cp kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json # Linux / Mac
# On Windows: place kaggle.json in C:\Users\<you>\.kaggle\
# 2. Run the downloader
python scripts/download_dataset.py

Expected structure after download:
dataset/
train/
NORMAL/ *.jpeg
PNEUMONIA/ *.jpeg
test/
NORMAL/
PNEUMONIA/
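A quick way to confirm the download matches this layout is to count the images per split and class. The helper below is a hypothetical sketch using only the standard library; it assumes the `dataset/` root shown above and treats `.jpeg`, `.jpg` and `.png` files as images.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def count_images(root: str) -> dict[str, dict[str, int]]:
    """Count image files under <root>/<split>/<label>; missing folders count as 0."""
    counts: dict[str, dict[str, int]] = {}
    for split in ("train", "test"):
        counts[split] = {}
        for label in ("NORMAL", "PNEUMONIA"):
            folder = Path(root) / split / label
            if folder.is_dir():
                n = sum(1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
            else:
                n = 0
            counts[split][label] = n
    return counts


if __name__ == "__main__":
    for split, per_label in count_images("dataset").items():
        print(split, per_label)
```

After a complete download, the printed counts should agree with the split table above.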
- Python 3.10+
- pip
- (Optional) CUDA-capable GPU for faster training
git clone https://github.com/YOUR_USERNAME/pneumonia-detection.git
cd pneumonia-detection
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux / Mac
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt

# CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
# CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

python scripts/download_dataset.py

# Train all three backbones + build ensemble (uses configs/default.json)
python src/train.py
# Custom hyperparameters
python src/train.py \
--data_dir dataset \
--output_dir outputs \
--model_names resnet50 densenet121 efficientnet_b3 \
--batch_size 32 \
--num_epochs 30 \
--lr 1e-4

Training saves per model:
outputs/
resnet50/
best_model.pth ← best checkpoint by F1
metrics.json ← per-epoch metrics
training_curves.png ← loss / accuracy / F1 / AUC plot
densenet121/ ...
efficientnet_b3/ ...
ensemble_weights.pth ← combined ensemble checkpoint
all_results.json ← final test metrics summary
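Selecting the best checkpoint by F1 amounts to taking the maximum over the per-epoch records. The sketch below assumes a hypothetical `metrics.json` schema (a list of objects with `epoch` and `val_f1` keys); the real file written by `train.py` may be organized differently.

```python
import json


def best_epoch_by_f1(records: list[dict]) -> dict:
    """Return the per-epoch record with the highest validation F1."""
    return max(records, key=lambda r: r["val_f1"])


# Made-up records in the assumed schema, standing in for outputs/<model>/metrics.json
sample = json.loads(
    '[{"epoch": 1, "val_f1": 0.91},'
    ' {"epoch": 2, "val_f1": 0.95},'
    ' {"epoch": 3, "val_f1": 0.94}]'
)
best = best_epoch_by_f1(sample)
print(best)
```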
python src/inference.py \
--image path/to/xray.jpg \
--model_name densenet121 \
--checkpoint outputs/densenet121/best_model.pth

Output:
────────────────────────────
File : xray.jpg
Prediction : PNEUMONIA
Confidence : 94.1%
P(PNEUM.) : 0.9412
P(NORMAL) : 0.0588
Latency : 14.3 ms
────────────────────────────
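The fields in this output are all derived from a single number, P(Pneumonia). The sketch below shows one plausible derivation; it assumes that confidence is the probability of the predicted class, which matches the sample above but is not confirmed by the source, and `build_result` is a hypothetical name.

```python
def build_result(p_pneumonia: float, threshold: float = 0.5) -> dict:
    """Derive label, confidence and both class probabilities from P(Pneumonia)."""
    p_normal = 1.0 - p_pneumonia
    is_pneumonia = p_pneumonia >= threshold
    return {
        "prediction": "PNEUMONIA" if is_pneumonia else "NORMAL",
        # Assumption: confidence is the probability of the predicted class
        "confidence": round(p_pneumonia if is_pneumonia else p_normal, 4),
        "probability_pneumonia": round(p_pneumonia, 4),
        "probability_normal": round(p_normal, 4),
    }


result = build_result(0.9412)
print(result)
```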
python src/inference.py \
--image path/to/xray.jpg \
--model_name densenet121 \
--checkpoint outputs/densenet121/best_model.pth \
--gradcam

python src/inference.py \
--image path/to/xray.jpg \
--ensemble_checkpoint outputs/ensemble_weights.pth

python src/inference.py \
--image_dir path/to/images/ \
--checkpoint outputs/densenet121/best_model.pth \
--model_name densenet121 \
--output_json batch_results.json

python src/gradcam.py \
--image xray.jpg \
--model_name densenet121 \
--checkpoint outputs/densenet121/best_model.pth \
--save_path heatmap.png

Produces a side-by-side PNG: Original X-ray | Grad-CAM heatmap | Overlay
from src.inference import PneumoniaPredictor
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Single model
predictor = PneumoniaPredictor.from_checkpoint(
model_name="densenet121",
checkpoint_path="outputs/densenet121/best_model.pth",
device=device,
)
# Or ensemble
predictor = PneumoniaPredictor.from_ensemble(
"outputs/ensemble_weights.pth", device
)
result = predictor.predict("path/to/xray.jpg")
# {
# 'prediction': 'PNEUMONIA',
# 'confidence': 0.9412,
# 'probability_pneumonia': 0.9412,
# 'probability_normal': 0.0588,
# 'latency_ms': 14.3
# }

An interactive, dark-themed Streamlit web application that you use from the browser.
cd pneumonia-streamlit
pip install -r requirements.txt
streamlit run app.py

| Feature | Description |
|---|---|
| Demo presets | One-click PNEUMONIA / NORMAL / SEVERE synthetic X-ray cases |
| File upload | Drag-and-drop real JPEG / PNG chest X-rays |
| 3-tab viewer | Original · Grad-CAM · Overlay tabs |
| Ensemble cards | ResNet-50, DenseNet-121, EfficientNet-B3 confidence bars |
| Inference trace | Timestamped pipeline log per forward pass |
| Probability bars | Animated PNEUMONIA vs NORMAL distribution |
| Metrics row | Confidence · Latency · AUC-ROC · Decision threshold |
| Clinical banner | Recommendation text with colour-coded severity |
| Training Monitor | Loss / Accuracy / F1 / AUC curves + confusion matrices |
| Dark theme | Custom CSS — medical-AI dark aesthetic throughout |
app.py ← Main diagnosis dashboard
pages/
1_About.py ← Architecture, dataset, results table
2_Training_Monitor.py ← Training curves, confusion matrices
# In pneumonia-streamlit/app.py — replace run_inference_simulation() with:
from src.inference import PneumoniaPredictor
import torch
@st.cache_resource
def load_model():
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
return PneumoniaPredictor.from_ensemble(
"../pneumonia-detection/outputs/ensemble_weights.pth", device
)
predictor = load_model()
result = predictor.predict("path/to/xray.jpg")

cd pneumonia-detection
# Development
python app.py
# Production
gunicorn -w 2 -b 0.0.0.0:5000 app:app

| Variable | Default | Description |
|---|---|---|
| MODEL_NAME | densenet121 | Backbone name |
| CHECKPOINT | outputs/densenet121/best_model.pth | Checkpoint path |
| ENSEMBLE_CKPT | (empty) | Ensemble checkpoint |
| IMG_SIZE | 224 | Input image size |
| THRESHOLD | 0.5 | Decision threshold |
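One way an application could read these variables with the listed defaults is sketched below. The `load_config` helper and its dictionary keys are hypothetical and may not match how the project's `app.py` actually parses its environment.

```python
import os
from collections.abc import Mapping


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the API settings from environment variables, falling back to the defaults."""
    return {
        "model_name": env.get("MODEL_NAME", "densenet121"),
        "checkpoint": env.get("CHECKPOINT", "outputs/densenet121/best_model.pth"),
        "ensemble_ckpt": env.get("ENSEMBLE_CKPT", ""),  # empty means single-model mode (assumed)
        "img_size": int(env.get("IMG_SIZE", "224")),
        "threshold": float(env.get("THRESHOLD", "0.5")),
    }


defaults = load_config({})
custom = load_config({"THRESHOLD": "0.3", "MODEL_NAME": "resnet50"})
print(defaults)
print(custom)
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.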
{ "status": "ok", "model": "densenet121", "timestamp": 1710000000 }

{ "model_name": "densenet121", "num_parameters": 7978856, "device": "cuda" }

curl -X POST http://localhost:5000/predict/upload \
-F "image=@chest_xray.jpg" \
-F "gradcam=true"

{
"prediction": "PNEUMONIA",
"confidence": 0.9412,
"probability_pneumonia": 0.9412,
"probability_normal": 0.0588,
"latency_ms": 14.3,
"heatmap_path": "outputs/gradcam/chest_xray_gradcam.png"
}

curl -X POST http://localhost:5000/predict \
-H "Content-Type: application/json" \
-d '{"image_path": "dataset/test/PNEUMONIA/person1_bacteria_1.jpeg", "gradcam": true}'

Results on the Kaggle test set (624 images):
| Model | Accuracy | Precision | Recall | F1 | AUC-ROC |
|---|---|---|---|---|---|
| ResNet-50 | 95.3% | 96.1% | 96.4% | 0.962 | 0.978 |
| DenseNet-121 | 96.1% | 96.8% | 97.1% | 0.969 | 0.984 |
| EfficientNet-B3 | 95.7% | 96.4% | 96.7% | 0.965 | 0.981 |
| Ensemble ★ | 96.8% | 97.2% | 97.7% | 0.974 | 0.989 |
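For readers who want to see how the columns relate, the sketch below computes accuracy, precision, recall and F1 from confusion-matrix counts, with PNEUMONIA as the positive class. The counts are hypothetical: they were chosen to sum to the 390 PNEUMONIA and 234 NORMAL test images and are not reported by the source.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 381 + 9 = 390 PNEUMONIA images, 11 + 223 = 234 NORMAL images
m = binary_metrics(tp=381, fp=11, fn=9, tn=223)
print({k: round(v, 3) for k, v in m.items()})
```

F1 is the harmonic mean of precision and recall, so it always lies between the two.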
| Hyperparameter | Value |
|---|---|
| Image size | 224 × 224 |
| Batch size | 32 |
| Epochs | Up to 30 (early stopping) |
| Learning rate | 1e-4 |
| Optimizer | Adam |
| Loss | BCEWithLogitsLoss (pos_weight=3.0) |
| Scheduler | ReduceLROnPlateau |
cd pneumonia-detection
pytest tests/ -v

Covers: model forward passes · dataset loading · inference output keys · checkpoint save/load · early stopping · metrics logger
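As an illustration of what an early-stopping test might look like, here is a self-contained sketch. The `EarlyStopping` class is a hypothetical stand-in written for this example; the real class in `src/utils.py` may expose a different interface.

```python
class EarlyStopping:
    """Hypothetical stand-in: stop after `patience` epochs without val-loss improvement."""

    def __init__(self, patience: int = 7, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def test_stops_after_patience_epochs_without_improvement():
    stopper = EarlyStopping(patience=2)
    assert stopper.step(0.50) is False  # first epoch always improves on infinity
    assert stopper.step(0.40) is False  # improvement resets the counter
    assert stopper.step(0.45) is False  # first epoch without improvement
    assert stopper.step(0.41) is True   # second one reaches the patience limit
```

pytest collects any function whose name starts with `test_`, so a file containing this test runs with the same `pytest tests/ -v` command.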
pneumonia-detection/ ← ML backend
│
├── src/
│ ├── train.py ← Full training pipeline
│ ├── models.py ← ResNet50 / DenseNet121 / EfficientNet / Ensemble
│ ├── dataset.py ← PyTorch Dataset + WeightedRandomSampler
│ ├── inference.py ← Production PneumoniaPredictor class
│ ├── gradcam.py ← Grad-CAM hooks + heatmap generation
│ ├── utils.py ← EarlyStopping · MetricsLogger · checkpoints
│ └── __init__.py
│
├── scripts/
│ └── download_dataset.py ← Kaggle dataset downloader
│
├── notebooks/
│ └── exploration.ipynb ← EDA · class distribution · sample viewer
│
├── tests/
│ └── test_pipeline.py ← pytest unit tests
│
├── configs/
│ └── default.json ← Default hyperparameter config
│
├── app.py ← Flask REST API
├── requirements.txt
├── .gitignore
└── README.md
pneumonia-streamlit/ ← Streamlit frontend
│
├── app.py ← Main dashboard
├── pages/
│ ├── 1_About.py ← Architecture & results
│ └── 2_Training_Monitor.py ← Training curves & confusion matrices
├── src/ ← ML src (copy for standalone use)
├── .streamlit/
│ └── config.toml ← Dark theme config
├── requirements.txt
└── README.md
Grad-CAM (Gradient-weighted Class Activation Mapping) estimates which regions of the X-ray most influenced the model's prediction. It backpropagates the class score to the feature maps of the last convolutional layer and uses the resulting gradients to weight those maps.
Original X-ray → Grad-CAM heatmap → Overlay
(grayscale) (red = high activation) (blended)
- Red / orange regions = strong pneumonia signal detected
- Blue regions = low activation
- Works with all three backbone architectures automatically
- Target layer is selected per backbone (e.g. layer4[-1] for ResNet-50)
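The core Grad-CAM computation is short enough to sketch in plain Python. The example assumes the activations and gradients of the target layer have already been captured (for instance with forward and backward hooks) and are given as nested lists of shape [channels][height][width]; it is not the project's `gradcam.py`, and the toy values are made up.

```python
def grad_cam(activations: list, gradients: list) -> list:
    """Return a [height][width] heatmap scaled to [0, 1] from target-layer tensors."""
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])
    # Channel weights: global-average-pool each channel's gradient map
    weights = [sum(sum(row) for row in gradients[c]) / (h * w) for c in range(k)]
    # Weighted sum of activation maps, then ReLU to keep only positive evidence
    cam = [
        [max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k))) for j in range(w)]
        for i in range(h)
    ]
    # Scale so the strongest location is 1.0 (skip if the map is all zeros)
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: 2 channels on a 2x2 grid
activations = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
gradients = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```

In the toy example the second channel has a negative weight, so its activation in the bottom-right corner is suppressed by the ReLU and only the top-left location lights up. In practice the low-resolution map is upsampled to the input size before being blended over the X-ray.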
- He et al. — Deep Residual Learning for Image Recognition (ResNet), CVPR 2016
- Huang et al. — Densely Connected Convolutional Networks (DenseNet), CVPR 2017
- Tan & Le — EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, ICML 2019
- Selvaraju et al. — Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, ICCV 2017
- Mooney, P. — Chest X-Ray Images (Pneumonia), Kaggle 2018
This project is for educational and research purposes only. It is not a certified medical device. Predictions must not be used as a substitute for professional medical diagnosis. Always consult a qualified radiologist or physician.
This project is released under the MIT License. The chest X-ray dataset is subject to Kaggle's own terms — see the dataset page for details.