Data Acquisition & Exploration:
Retrieved a large dataset of retinal images through the Kaggle API, then explored image properties such as size distributions and used hash functions to detect duplicate images.
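As an illustration of the duplicate-detection step, here is a minimal sketch that groups image files by a hash of their contents. The choice of SHA-256, the function names, and the file patterns are assumptions made for this example; the project only states that hash functions were used. A byte-level hash like this finds exact copies only, so re-encoded or resized duplicates would need a perceptual hash instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(image_dir, patterns=("*.jpg", "*.jpeg", "*.png")):
    """Group files under image_dir by content hash.

    Returns a list of groups (lists of paths); each group holds two or
    more files whose bytes are identical.
    """
    groups = defaultdict(list)
    for pattern in patterns:
        for path in sorted(Path(image_dir).rglob(pattern)):
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates("data/retina")` (a hypothetical directory) would return the groups of identical files, from which all but one copy per group can be dropped.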
Data Preprocessing:
Designed a robust preprocessing pipeline involving:
- Corrupted image detection and removal
- Uniform resizing and cropping for model consistency
- Normalization strategies suitable for medical images
- Detection of low-quality images (e.g., blur, brightness issues)
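The low-quality check in the last item could be sketched as follows. This is one common approach (variance of the Laplacian for blur, mean intensity for brightness), not necessarily the one used in this project. The thresholds and function names are assumptions, and the image is represented as a plain nested list of grayscale values from 0 to 255 to keep the example dependency-free; in practice a library such as NumPy or OpenCV would do the same arithmetic much faster.

```python
from statistics import mean, pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    gray is a 2D list (rows of pixel values), at least 3x3.
    Low variance means few sharp edges, which suggests blur.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def quality_flags(gray, blur_threshold=100.0, dark=40.0, bright=215.0):
    """Return a list of quality issues found in one grayscale image.

    All three thresholds are illustrative and would need tuning
    on the actual dataset.
    """
    flags = []
    if laplacian_variance(gray) < blur_threshold:
        flags.append("blurry")
    brightness = mean(v for row in gray for v in row)
    if brightness < dark:
        flags.append("too_dark")
    elif brightness > bright:
        flags.append("too_bright")
    return flags
```

Images that return a non-empty list can be set aside for manual review rather than deleted outright, since fundus photographs are often dark near the edges by nature.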
Class Imbalance Handling:
Applied oversampling, data augmentation, and class-weighted loss functions to counter class imbalance
Prepared the dataset for multi-label classification, accounting for the real-world scenario where one image may exhibit signs of multiple diseases
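To make the multi-label setup and class weighting concrete, here is a minimal sketch. The label list, function names, and weighting formula are assumptions for illustration: each image gets a multi-hot vector (several entries can be 1 at once), and each class gets a weight equal to its ratio of negative to positive examples. That ratio follows the convention of the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`, though the project does not state which framework or formula it used.

```python
# Hypothetical label set, based on the diseases named in this project.
CLASSES = ["glaucoma", "diabetic_retinopathy", "amd", "choroidal_melanoma"]


def encode_multi_hot(diagnoses, classes=CLASSES):
    """Turn a list of diagnosis names into a 0/1 vector, one slot per class."""
    present = set(diagnoses)
    return [1 if c in present else 0 for c in classes]


def positive_class_weights(label_rows):
    """Compute one weight per class as (#negatives / #positives).

    label_rows is a list of multi-hot vectors. Rare classes get large
    weights, so their positive examples count more in the loss.
    Classes with no positive examples get 0.0 to avoid dividing by zero.
    """
    n = len(label_rows)
    num_classes = len(label_rows[0])
    weights = []
    for c in range(num_classes):
        pos = sum(row[c] for row in label_rows)
        weights.append((n - pos) / pos if pos else 0.0)
    return weights


# Example: four images, one of which shows two conditions at once.
labels = [
    encode_multi_hot(["glaucoma"]),
    encode_multi_hot(["glaucoma", "amd"]),
    encode_multi_hot([]),
    encode_multi_hot(["diabetic_retinopathy"]),
]
weights = positive_class_weights(labels)
```

The resulting weights can be passed to a per-class weighted binary cross-entropy loss, so that each disease is treated as its own yes/no decision.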
Model Development:
Built and evaluated a baseline CNN model to validate data readiness
Planned a more sophisticated architecture, with a focus on improving accuracy while preserving interpretability for clinical use
Coming soon:
- Retinal image before and after preprocessing
- Dataset class imbalance
- An architecture diagram of the CNN
- Notebook preview (GIF)
Machine Learning for Retinal Disease and Eye Cancer Classification




Ocular diseases such as glaucoma, diabetic retinopathy, and age-related macular degeneration are leading causes of blindness worldwide.
Automated deep learning models can catch these diseases earlier, which improves patient outcomes.
This project uses deep learning to identify multiple retinal diseases from retinal images, including glaucoma, diabetic retinopathy, and even rare cancers like choroidal melanoma.
The goal: support faster, more accurate diagnoses and improve patient outcomes through early detection.