A comprehensive study of autoencoder architectures for medical image analysis using the PathMNIST dataset from MedMNIST.
This project implements and evaluates five distinct autoencoder architectures for medical image reconstruction, denoising, and feature extraction. The models are trained and tested on colorectal cancer histopathology images from the PathMNIST dataset, which contains 32x32 RGB images across 9 tissue classes.
- Implement multiple autoencoder architectures with varying design principles
- Evaluate reconstruction quality using MSE, SSIM, and PSNR metrics
- Compare latent space representations across architectures
- Demonstrate practical applications: denoising, compression, and feature extraction
| Model | Architecture | Key Characteristics |
|---|---|---|
| Basic Autoencoder | Fully connected dense layers | Baseline model, does not exploit spatial structure |
| Convolutional Autoencoder | Conv2D / Conv2DTranspose layers | Preserves spatial hierarchies, parameter-efficient |
| Denoising Autoencoder | Convolutional + noise injection | Learns robust representations, effective for denoising |
| Variational Autoencoder (VAE) | Probabilistic encoder with KL divergence | Continuous latent space, suitable for generation |
| Sparse Autoencoder | Convolutional + sparsity penalty | Interpretable sparse features, regularized representations |
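The denoising, variational, and sparse variants in the table differ from the plain convolutional model mainly in what is added to the input or to the loss. The NumPy sketch below illustrates those extra terms only. It is not taken from the notebooks: the function names, the noise level `sigma`, the penalty `weight`, and the choice of an L1 sparsity penalty (rather than, say, a KL-based one) are all assumptions made for illustration.

```python
import numpy as np

def add_gaussian_noise(x, sigma=0.1, rng=None):
    """Corrupt images scaled to [0, 1] with Gaussian noise (denoising AE input)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = x + rng.normal(0.0, sigma, size=x.shape)
    return np.clip(noisy, 0.0, 1.0)

def reparameterize(mu, log_var, rng=None):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (VAE sampling step)."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)): summed over latent dims, averaged over the batch."""
    kl = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return float(np.mean(kl))

def l1_sparsity_penalty(activations, weight=1e-4):
    """L1 penalty on latent activations (assumed form of the sparsity term)."""
    return float(weight * np.mean(np.sum(np.abs(activations), axis=1)))

# Small demonstration on random data shaped like a PathMNIST batch.
rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32, 3))
noisy = add_gaussian_noise(batch, sigma=0.1, rng=rng)
mu = np.ones((4, 16))
log_var = np.zeros((4, 16))
z = reparameterize(mu, log_var, rng=rng)
print(noisy.shape, z.shape, kl_to_standard_normal(mu, log_var))
```

In a training loop, the KL term or the sparsity penalty would be added to the reconstruction loss, while the noisy batch would replace the clean input and the clean batch would stay as the target.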
medical-autoencoders-pneumoniamnist/
│
├── notebooks/ # Jupyter notebooks (run in order)
│ ├── 01_data_explore_preprocess.ipynb # Dataset exploration and preprocessing
│ ├── 02_basic_ae.ipynb # Basic autoencoder implementation
│ ├── 03_conv_ae.ipynb # Convolutional autoencoder
│ ├── 04_denoising_ae.ipynb # Denoising autoencoder
│ ├── 05_vae.ipynb # Variational autoencoder
│ ├── 06_sparse_ae.ipynb # Sparse autoencoder
│ ├── 07_eval_single_model.ipynb # Individual model evaluation
│ └── 08_compare_all_models.ipynb # Comparative analysis
│
├── figures/ # Generated visualizations
│ ├── basic_ae/ # Basic AE training curves and reconstructions
│ ├── conv_ae/ # Conv AE results
│ ├── denoising_ae/ # Denoising results
│ ├── vae/ # VAE results and generated samples
│ ├── sparse_ae/ # Sparse AE results
│ ├── comparison/ # Cross-model comparison charts
│ └── evaluation/ # Detailed evaluation figures
│
├── results/ # Quantitative results (JSON/CSV)
│ ├── *_evaluation.json # Per-model metrics
│ └── comparison_report.json # Comparative analysis data
│
├── report/ # LaTeX report source files
│ ├── main.tex # Report document
│ └── references.bib # Bibliography
│
├── report.pdf # Compiled final report
├── requirements.txt # Python dependencies
├── LICENSE # MIT License
└── README.md # This file
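The `results/` folder above holds one `*_evaluation.json` file per model. As a rough sketch of how such files could be written and read back, the helpers below use only the standard library. The function names and the JSON keys (`mse`, `ssim`, `psnr_db`) are hypothetical; the actual schema used by the notebooks is not documented here.

```python
import json
import tempfile
from pathlib import Path

def save_evaluation(results_dir, model_name, metrics):
    """Write metrics to <results_dir>/<model_name>_evaluation.json and return the path."""
    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{model_name}_evaluation.json"
    with out_path.open("w", encoding="utf-8") as fh:
        json.dump(metrics, fh, indent=2)
    return out_path

def load_all_evaluations(results_dir):
    """Collect every *_evaluation.json file into a {model_name: metrics} dict."""
    evaluations = {}
    for path in sorted(Path(results_dir).glob("*_evaluation.json")):
        model_name = path.name[: -len("_evaluation.json")]
        with path.open(encoding="utf-8") as fh:
            evaluations[model_name] = json.load(fh)
    return evaluations

# Demonstration in a temporary folder so nothing is written into the repository.
with tempfile.TemporaryDirectory() as tmp:
    save_evaluation(tmp, "conv_ae", {"mse": 0.0020, "ssim": 0.758, "psnr_db": 28.75})
    print(load_all_evaluations(tmp))
```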
Performance comparison on the PathMNIST test set (7,180 samples):
| Model | MSE | SSIM | PSNR (dB) | Parameters |
|---|---|---|---|---|
| Convolutional AE | 0.0020 | 0.758 | 28.75 | 862,211 |
| Denoising AE | 0.0035 | 0.591 | 25.66 | 862,211 |
| Sparse AE | 0.0050 | 0.820 | 24.57 | 870,403 |
Key Findings:
- Convolutional AE achieves the best pixel-wise reconstruction (lowest MSE, highest PSNR)
- Sparse AE achieves the highest structural similarity (SSIM 0.820), even though its MSE is the highest of the three models listed
- VAE produces smooth, continuous latent spaces suitable for generative tasks
- Denoising AE effectively removes Gaussian noise while learning robust features
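MSE and PSNR in the table are related by PSNR = 10 · log10(MAX² / MSE). The sketch below computes both with NumPy, assuming pixel values scaled to [0, 1] (so MAX = 1) and batches shaped (N, H, W, C). The helper names are illustrative and not the notebooks' own. SSIM is left out because it is usually taken from a library rather than written by hand.

```python
import numpy as np

def mse_per_image(x, x_hat):
    """Mean squared error for each image in a batch of shape (N, H, W, C)."""
    diff = np.asarray(x, dtype=np.float64) - np.asarray(x_hat, dtype=np.float64)
    return np.mean(diff**2, axis=(1, 2, 3))

def psnr_from_mse(mse, max_val=1.0):
    """PSNR in dB: 10 * log10(max_val^2 / mse). Assumes mse > 0."""
    return 10.0 * np.log10(max_val**2 / np.asarray(mse, dtype=np.float64))

# Synthetic "reconstructions": the originals plus a little Gaussian noise.
rng = np.random.default_rng(0)
x = rng.random((8, 32, 32, 3))
x_hat = np.clip(x + rng.normal(0.0, 0.05, size=x.shape), 0.0, 1.0)

per_image_mse = mse_per_image(x, x_hat)
print("PSNR of the mean MSE:", psnr_from_mse(per_image_mse.mean()))
print("Mean per-image PSNR: ", psnr_from_mse(per_image_mse).mean())
```

The two printed values generally differ. Averaging PSNR per image never gives a lower number than converting the averaged MSE, so it matters which aggregation a reported figure uses.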
- Python 3.8 or higher
- TensorFlow 2.x
- Google Colab (recommended for GPU access)
Install dependencies:
pip install -r requirements.txt
- Upload the `notebooks/` folder to Google Colab
- Mount Google Drive for data storage
- Execute notebooks sequentially (01 through 08)
- Results and figures are saved to Google Drive
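For the Drive-mounting step, one possible setup cell is sketched below. It mounts Google Drive when running inside Colab and falls back to a local folder otherwise. The project folder name `medical-autoencoders` and the helper `get_output_root` are placeholders chosen for this example, not paths taken from the notebooks.

```python
import tempfile
from pathlib import Path

def get_output_root(project_dir="medical-autoencoders", local_root="."):
    """Return an output folder, placed on Google Drive when running in Colab."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        # Not in Colab: keep outputs under a local folder instead.
        root = Path(local_root) / project_dir
    else:
        drive.mount("/content/drive")
        root = Path("/content/drive/MyDrive") / project_dir
    # Mirror the repository layout for generated artifacts.
    for sub in ("figures", "results"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root

# Demonstration outside Colab, using a temporary folder as the local root.
with tempfile.TemporaryDirectory() as tmp:
    root = get_output_root(local_root=tmp)
    print(sorted(p.name for p in root.iterdir()))
```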
- Clone the repository
- Create a virtual environment and install dependencies
- Download the PathMNIST dataset via the `medmnist` package
- Run notebooks in order
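A minimal sketch of the download step follows, assuming the `medmnist` package is installed (`pip install medmnist`). `load_pathmnist_split` and `to_unit_float` are illustrative helper names, not functions from the notebooks. The package's default images are 28 x 28, so the 32 x 32 size described in this README presumably comes from a resize or padding step in the preprocessing notebook, which is not reproduced here.

```python
import numpy as np

def load_pathmnist_split(split="train"):
    """Download (if needed) and return (images, labels) for one PathMNIST split."""
    from medmnist import PathMNIST  # imported lazily; requires `pip install medmnist`
    dataset = PathMNIST(split=split, download=True)
    return dataset.imgs, dataset.labels

def to_unit_float(images):
    """Convert uint8 images in [0, 255] to float32 in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0

# With network access, the real data would be loaded like this:
# x_train, y_train = load_pathmnist_split("train")

# Offline demonstration on random uint8 data with the package's default image shape.
fake_images = np.random.default_rng(0).integers(0, 256, size=(2, 28, 28, 3), dtype=np.uint8)
x = to_unit_float(fake_images)
print(x.shape, x.dtype)
```

Scaling to [0, 1] matches the common choice of a sigmoid output layer and keeps MSE and PSNR values comparable across models.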
PathMNIST is a subset of MedMNIST containing colorectal cancer histopathology images:
- Image dimensions: 32 x 32 pixels, RGB (3 channels)
- Number of classes: 9 tissue types
- Training samples: 89,996
- Validation samples: 10,004
- Test samples: 7,180
Reference: Yang, J., et al. (2023). MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification. Scientific Data.
The complete analysis, methodology, and discussion are available in report.pdf. The report includes:
- Detailed architecture descriptions
- Training procedures and hyperparameters
- Quantitative evaluation with metrics comparison
- Qualitative analysis of reconstructions and latent spaces
- Discussion of strengths, limitations, and real-world applications
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
- Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. ICLR.
- Vincent, P., et al. (2010). Stacked Denoising Autoencoders. JMLR.
- Yang, J., et al. (2023). MedMNIST v2. Scientific Data.
- Ng, A. (2011). Sparse Autoencoder. CS294A Lecture Notes, Stanford University.
This project is licensed under the MIT License. See LICENSE for details.
Deep Learning Assignment 2 | December 2025