Context
Developed an autoencoder neural network for image compression on the Fashion-MNIST dataset. The project focused on optimizing the architecture to maximize the Structural Similarity Index (SSIM) of the reconstructions while minimizing Mean Squared Error (MSE).
Technologies Used
- Framework: PyTorch
- Dataset: Fashion-MNIST (60,000 training images)
- Libraries: NumPy, Matplotlib, Scikit-learn
- Metrics: SSIM, MSE, compression ratio
Implementation
Architecture:
- Encoder: Convolutional layers for feature extraction
- Latent space: Compact bottleneck representation
- Decoder: Transposed convolutions for image reconstruction
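The encoder–bottleneck–decoder layout above can be sketched in PyTorch as follows. The layer sizes (16/32 channels, a 32-dimensional latent) and the `ConvAutoencoder` name are illustrative assumptions, not the project's actual configuration:

```python
# Hypothetical sketch of the convolutional autoencoder described above.
# Channel counts and latent_dim are assumed example values.
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        # Encoder: strided convolutions downsample 28x28 -> 7x7 feature maps,
        # then a linear layer projects to the compact latent vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, latent_dim),          # bottleneck
        )
        # Decoder: transposed convolutions upsample back to 28x28.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 7 * 7),
            nn.Unflatten(1, (32, 7, 7)),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1),  # 7x7 -> 14x14
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1),   # 14x14 -> 28x28
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)
```

A forward pass on a batch of 28×28 grayscale images returns reconstructions of the same shape, with the latent vector available from `encoder` alone.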
Training:
- Loss function: Weighted combination of MSE and an SSIM-based term (SSIM measures similarity, so it typically enters the loss as 1 − SSIM)
- Optimizer: Adam
- Regularization: Dropout and BatchNorm
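A combined MSE + SSIM objective like the one listed above can be sketched as follows. This uses a simplified SSIM with a uniform 7×7 window (the project may have used a Gaussian window or a library implementation), and the weight `alpha` is an illustrative assumption:

```python
# Hedged sketch of a combined MSE + SSIM loss for images scaled to [0, 1].
import torch
import torch.nn.functional as F

def ssim(x, y, window_size: int = 7, c1: float = 0.01**2, c2: float = 0.03**2):
    """Simplified SSIM using a uniform averaging window instead of a Gaussian."""
    pad = window_size // 2
    mu_x = F.avg_pool2d(x, window_size, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, window_size, stride=1, padding=pad)
    sigma_x = F.avg_pool2d(x * x, window_size, stride=1, padding=pad) - mu_x**2
    sigma_y = F.avg_pool2d(y * y, window_size, stride=1, padding=pad) - mu_y**2
    sigma_xy = F.avg_pool2d(x * y, window_size, stride=1, padding=pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x**2 + mu_y**2 + c1) * (sigma_x + sigma_y + c2)
    return (num / den).mean()

def combined_loss(recon, target, alpha: float = 0.5):
    # alpha balances pixel accuracy (MSE) against perceptual similarity;
    # SSIM is maximized, so it contributes to the loss as (1 - SSIM).
    return alpha * F.mse_loss(recon, target) + (1 - alpha) * (1 - ssim(recon, target))
```

For identical inputs the SSIM term is 1 and the combined loss is 0, so the loss decreases as reconstructions approach the originals both pixel-wise and structurally.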
The autoencoder learns to compress 28×28 grayscale images into a compact latent representation and reconstruct them with minimal information loss.
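The compression this bottleneck achieves can be quantified with simple arithmetic. The 32-dimensional latent used here is a hypothetical example, not the project's actual bottleneck size:

```python
# Illustrative compression-ratio calculation for the autoencoder bottleneck.
image_pixels = 28 * 28   # 784 values per Fashion-MNIST image
latent_dim = 32          # assumed example bottleneck size

ratio = image_pixels / latent_dim
print(f"compression ratio: {ratio:.1f}x")  # 24.5x
```

This count compares raw value counts; the effective ratio also depends on how the latent values are quantized or stored relative to 8-bit input pixels.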
Results
- Achieved a strong compression ratio relative to the 784-value input images
- High SSIM scores, indicating good perceptual quality of the reconstructions
- Low MSE, demonstrating pixel-accurate reconstruction
- Visual quality: near-perfect reconstructions for most samples
Challenges & Learnings
- Finding the optimal latent space dimension (trade-off between compression and quality)
- Tuning loss functions to prioritize perceptual similarity
- Understanding the learned features in the latent space