Three ways to ask a medical image classifier "where are you looking?", shown side by side so you can compare them: Grad-CAM, Grad-CAM++, and occlusion sensitivity. In a clinical setting, a saliency map that lands on the wrong anatomy is a warning sign, and agreement among the three methods is a quick check on whether a map can be trusted.

    pip install -r requirements.txt
    python src/explain.py --image chest.png --out outputs/

Out of the box it runs on a torchvision DenseNet121, so you can confirm the pipeline works without a trained checkpoint. To use your own model, swap the backbone and the target layer; the rest of the code stays the same.
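To illustrate why only the backbone and the target layer need to change, here is a minimal Grad-CAM sketch written against any PyTorch classifier. It is not the code in src/explain.py: the function name `grad_cam`, its arguments, and the tiny stand-in network are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def grad_cam(model, target_layer, x, class_idx=None):
    """Return a Grad-CAM heatmap in [0, 1] with the spatial size of x.

    model        : nn.Module classifier returning logits of shape (1, C)
    target_layer : layer whose output feature maps are explained
    x            : input tensor of shape (1, channels, H, W)
    """
    store = {}

    def save_activation(module, inputs, output):
        # Keep the feature maps so we can differentiate the score w.r.t. them.
        store["act"] = output

    handle = target_layer.register_forward_hook(save_activation)
    try:
        model.eval()
        logits = model(x)
    finally:
        handle.remove()

    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))

    # Gradient of the chosen class score with respect to the feature maps.
    grads = torch.autograd.grad(logits[0, class_idx], store["act"])[0]

    # Channel weights: gradients averaged over the spatial dimensions.
    weights = grads.mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * store["act"]).sum(dim=1, keepdim=True))

    # Upsample to the input resolution and rescale to [0, 1].
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam - cam.min()
    cam = cam / (cam.max() + 1e-8)
    return cam[0, 0].detach()


# Stand-in network (hypothetical), used only to show the call signature.
torch.manual_seed(0)
tiny = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 2),
)
heatmap = grad_cam(tiny, target_layer=tiny[0], x=torch.rand(1, 1, 8, 8))
```

Moving to another model means passing a different `model` and `target_layer`. With a torchvision DenseNet121, the final feature block (`model.features`) is a common choice of target, though which layer this repository uses is not stated here.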
You get three overlays out: gradcam.png, gradcampp.png and occlusion.png.
The occlusion map is the honest cross-check: it slides a patch across the image
and measures how much the prediction drops, with no gradients involved, so where
it agrees with Grad-CAM you can be fairly confident that the highlighted region
really drives the prediction.
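The sliding-patch idea can be sketched in a few lines of NumPy. This is an illustrative version under assumptions, not the repository's implementation: `occlusion_map`, `predict_fn`, and the patch, stride, and fill defaults are hypothetical names and values chosen for the example.

```python
import numpy as np


def occlusion_map(image, predict_fn, patch=16, stride=8, fill=0.0):
    """Occlusion sensitivity for a single-channel image.

    image      : array of shape (H, W)
    predict_fn : hypothetical callable mapping an (H, W) array to the score
                 of the class being explained (a float)
    Returns an (H, W) array holding the average score drop observed when
    each pixel was covered. Pixels never covered by a patch stay at 0.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    baseline = predict_fn(image)
    drop_sum = np.zeros((h, w))
    counts = np.zeros((h, w))

    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the covered region mattered to the score.
            drop = baseline - predict_fn(occluded)
            drop_sum[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1

    return drop_sum / np.maximum(counts, 1)


# Toy scorer (assumption): the "model" only looks at one 8x8 region.
def toy_score(img):
    return float(img[8:16, 8:16].mean())


heat = occlusion_map(np.ones((32, 32)), toy_score, patch=8, stride=8)
```

In the toy case the map is nonzero only over the region the scorer reads, which is the behaviour to expect from a real model when its prediction depends on a localized finding. For a real classifier, `predict_fn` would wrap a forward pass and return the softmax probability of the target class.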