Retinal scans are useful tools for detecting a number of eye conditions. However, reading and deciphering these images requires an expert. The goal of this project is to develop models for detecting various types of retinal disease. In the first iteration, the model will try to distinguish between:
- Diabetic Retinopathy
- Glaucoma
- Normal scans
- Other ailments
The dataset was taken from a Kaggle competition; the details can be found here: https://www.kaggle.com/c/vietai-advance-course-retinal-disease-detection/overview
- Jupyter Notebooks
- OpenCV
- PIL (pillow)
- Python 3.8
- TensorFlow 2.3
The image below shows a high-contrast example of Diabetic Retinopathy.
The image below shows an example of Glaucoma. The area affected by Glaucoma is around the optic disk.
Image taken from H. Fu et al., "Disc-Aware Ensemble Network for Glaucoma Screening From Fundus Image," in IEEE Transactions on Medical Imaging, vol. 37, no. 11, pp. 2493-2501, Nov. 2018, doi: 10.1109/TMI.2018.2837012.
The image below is an example of a normal scan.
Based on the observations above, the following image processing techniques were tried.
An analysis was carried out to see whether a particular colour channel separates the classes better than the others. The results are shown below.
The channels do show variation within an image, but the separation between images is not very clear.
The images below show increased contrast. This does enhance the differences between Diabetic Retinopathy and Glaucoma, and to some extent normal scans as well.
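A minimal sketch of the channel-split and contrast checks described above, using OpenCV (listed in the tools). The file names and the contrast/brightness factors are illustrative assumptions, not values taken from the notebooks.

```python
import cv2

def split_channels(path):
    """Return the blue, green and red channels of a fundus image."""
    img = cv2.imread(path)  # OpenCV loads images as BGR
    b, g, r = cv2.split(img)
    return b, g, r

def increase_contrast(path, alpha=1.8, beta=10):
    """Simple linear contrast (alpha) and brightness (beta) adjustment."""
    img = cv2.imread(path)
    return cv2.convertScaleAbs(img, alpha=alpha, beta=beta)

if __name__ == "__main__":
    # "sample_scan.png" is a hypothetical file name
    b, g, r = split_channels("sample_scan.png")
    print("mean intensity per channel:", b.mean(), g.mean(), r.mean())
    cv2.imwrite("sample_scan_contrast.png", increase_contrast("sample_scan.png"))
```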
The original dataset contains 7 categories. However, for the first model these were consolidated into the 4 categories mentioned above. The "other" category contained more than 1500 images, which made the classes imbalanced, so about 750 examples were randomly chosen from it to create more balanced training data (a sketch of the subsampling is given below).
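A possible way to do the random subsampling of the over-represented "other" category; the directory layout and image extension are assumptions.

```python
import random
import shutil
from pathlib import Path

random.seed(42)  # make the subsample reproducible

src = Path("data/original/other")      # hypothetical source folder
dst = Path("data/train_4cat/other")    # hypothetical balanced training folder
dst.mkdir(parents=True, exist_ok=True)

files = list(src.glob("*.png"))        # extension is an assumption
for f in random.sample(files, k=min(750, len(files))):
    shutil.copy(f, dst / f.name)
```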
The first approach was to create a single multi-class classification model. For the second approach, scroll down to that section.
A number of pretrained models from tf.keras.applications were used. The following models were tested:
- VGG19
- EfficientNet B5
- MobileNet V2
- ResNet50
Out of all these, ResNet50 showed the best results on both the 7-category and the 4-category datasets.
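A hedged sketch of the ResNet50 transfer-learning setup, assuming 224x224 inputs, a frozen ImageNet backbone and a simple classification head; the exact head and hyperparameters used in the notebooks may differ.

```python
import tensorflow as tf

NUM_CLASSES = 4  # Diabetic Retinopathy, Glaucoma, Normal, Other

base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False  # train only the classification head first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```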
Heat maps were created from the EfficientNet B5 and ResNet50 outputs. They show that the models focus mainly on the optic disk and the brighter spots on the retinal scans. At the moment, however, the models do not do a very good job of separating the different conditions.
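The exact heat-map technique is not stated here; the sketch below shows one common approach, a Grad-CAM style map, applied to a stand-in ResNet50. The layer name and input preprocessing are assumptions.

```python
import tensorflow as tf

model = tf.keras.applications.ResNet50(weights="imagenet")  # stand-in model

def grad_cam(model, image, last_conv_layer="conv5_block3_out"):
    """Return a coarse heat map of the regions the model attends to.

    `image` is assumed to be a preprocessed float array of shape (224, 224, 3).
    """
    grad_model = tf.keras.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(last_conv_layer).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        class_idx = int(tf.argmax(preds[0]))        # most confident class
        loss = preds[:, class_idx]
    grads = tape.gradient(loss, conv_out)           # gradients w.r.t. feature map
    weights = tf.reduce_mean(grads, axis=(1, 2))    # global-average the gradients
    cam = tf.reduce_sum(conv_out[0] * weights[0], axis=-1)
    cam = tf.nn.relu(cam)                           # keep positive contributions
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
```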
For label propagation, the k-nearest-neighbor (k-NN) algorithm was used.
The pretrained model was loaded and the output from its final prediction layer was fed to a k-NN model with 4 categories. The k-NN model was then used to make predictions on the test data, and based on these predictions the test images were moved into a new training dataset. A new ResNet50 model was initialized with the same data-generator parameters. The training results, however, were not very encouraging; the accuracy did increase to around 40%, compared to about 30% for the original model.
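A sketch of the label-propagation step described above. It assumes scikit-learn's KNeighborsClassifier (the specific k-NN implementation is not stated) and k=5; the feature arrays would come from model.predict on the corresponding image batches.

```python
from sklearn.neighbors import KNeighborsClassifier

def propagate_labels(train_features, train_labels, test_features, k=5):
    """Fit k-NN on labelled prediction-layer outputs and return pseudo-labels
    for the test set."""
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(train_features, train_labels)
    return knn.predict(test_features)

# train_features / test_features: outputs of the trained network's final
# prediction layer; train_labels: the known 4-category labels.
```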
One of the reasons could be that the original model was not doing a very good job at identifying the true categories.
In this approach, one-vs-normal (binary) models were trained.
Based on the scans, it was observed that each eye ailment affects different areas of the retina. So the normal vs Diabetic Retinopathy model used images enhanced by increasing contrast and brightness to highlight damage to the retina, while the normal vs Glaucoma model used images with increased contrast only, to shift the focus to the optic disk.
A custom architecture with a residual connection was used to train these models. This was done to test the concept and speed up the training process.
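An illustrative version of a small custom network with one residual connection for the one-vs-normal task; the layer sizes and input shape are assumptions and the notebooks may use a different configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_custom_model(input_shape=(224, 224, 3)):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)

    # Residual block: two convolutions plus a skip connection
    shortcut = x
    y = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(32, 3, padding="same")(y)
    x = layers.Activation("relu")(layers.Add()([shortcut, y]))

    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # one-vs-normal output
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```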
- The Diabetic Retinopathy test resulted in 87% accuracy.
- The Glaucoma test resulted in
- The other test resulted in
The heat maps generated for Diabetic Retinopathy are given below. The custom model does not yet produce very good heat-map representations, but work is being done to fix this.
The problem that arose in label transfer is that all three models need to work well to ensure good propagation. This would involve scoring each image with every model and assigning the label with the highest confidence (see the sketch below). In this case, when only one model was tested, it was able to identify normal images well but misclassified a lot of the other images.
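A sketch of the multi-model label-assignment idea: each image is scored by all three one-vs-normal models and the most confident label wins, falling back to "normal" when no model is confident. The model names, label names and threshold are illustrative assumptions.

```python
import numpy as np

LABELS = ["diabetic_retinopathy", "glaucoma", "other"]

def assign_labels(image_batch, dr_model, glaucoma_model, other_model,
                  normal_threshold=0.5):
    """Return one label per image based on the most confident binary model."""
    scores = np.stack([
        dr_model.predict(image_batch).ravel(),
        glaucoma_model.predict(image_batch).ravel(),
        other_model.predict(image_batch).ravel(),
    ], axis=1)                                   # shape: (n_images, 3)
    best = scores.argmax(axis=1)
    return [LABELS[i] if scores[n, i] >= normal_threshold else "normal"
            for n, i in enumerate(best)]
```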
The notebook custom_arch_label-prop explores this in more detail.