Fast Vision Transformer Framework for Proactive Diabetic Retinopathy Diagnosis in Fundus Images
- DOI
- 10.2991/978-94-6463-866-0_63
- Keywords
- Convolutional Neural Networks (CNN); Diabetic Retinopathy (DR); Fast Vision Transformer (Fast ViT)
- Abstract
Diabetic Retinopathy (DR) is a leading cause of vision loss worldwide, and prompt detection is essential for effective treatment and intervention. Although deep learning techniques such as Convolutional Neural Networks (CNNs) have improved DR detection, processing high-resolution retinal fundus images remains difficult because of resolution constraints, loss of important lesion information, and computational inefficiency. This study proposes an integrated framework that combines CNN and Transformer-based learning for early DR detection. It uses the Fast Vision Transformer (FastViT) for DR classification, leveraging its hybrid convolution-transformer architecture to enhance feature extraction while maintaining computational efficiency. The proposed model processes retinal fundus images using optimized self-attention mechanisms and multi-scale feature representations, so that it captures both the local and global structures critical for diagnosis. Trained on the APTOS 2019 Blindness Detection dataset, the model achieved an F1-score of 97.85% in classifying DR severity levels. Loss and accuracy curves, together with a confusion matrix, support its reliability. The study highlights FastViT's potential for real-time medical diagnostics as an effective and scalable automated DR screening solution. To improve accessibility and early-detection outcomes, future work can focus on improving model interpretability, expanding datasets, and deploying the system in clinical environments.
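The abstract reports an F1-score alongside a confusion matrix but does not say how the per-class scores were averaged. As a rough illustration only, the sketch below shows how macro- and support-weighted F1 can be derived from a multi-class confusion matrix. The function names and the 5x5 `toy_cm` counts are hypothetical and are not taken from the paper; the five rows and columns simply mirror the five APTOS 2019 severity grades.

```python
from typing import List

Matrix = List[List[int]]


def per_class_f1(cm: Matrix) -> List[float]:
    """Per-class F1, where cm[i][j] counts samples of true class i predicted as j."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # column total minus diagonal
        fn = sum(cm[k]) - tp                       # row total minus diagonal
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm: Matrix) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


def weighted_f1(cm: Matrix) -> float:
    """Mean of per-class F1 scores weighted by class support (row totals)."""
    scores = per_class_f1(cm)
    support = [sum(row) for row in cm]
    return sum(s * w for s, w in zip(scores, support)) / sum(support)


# Hypothetical counts for illustration only (NOT results from the paper).
# Rows/columns: No DR, Mild, Moderate, Severe, Proliferative DR.
toy_cm = [
    [50, 1, 0, 0, 0],
    [2, 20, 1, 0, 0],
    [0, 1, 30, 1, 0],
    [0, 0, 2, 10, 1],
    [0, 0, 0, 1, 12],
]

if __name__ == "__main__":
    print(f"macro F1:    {macro_f1(toy_cm):.4f}")
    print(f"weighted F1: {weighted_f1(toy_cm):.4f}")
```

On an imbalanced dataset the two averages can differ noticeably, so it helps to know which one a reported figure refers to.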
- Copyright
- © 2025 The Author(s)
- Open Access
- This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - R. Kingsly Stephen
AU - S. I. Surruthi
AU - M. Varshaa
AU - B. S. Abishek
PY - 2025
DA - 2025/10/31
TI - Fast Vision Transformer Framework for Proactive Diabetic Retinopathy Diagnosis in Fundus Images
BT - Proceedings of the International Conference on Intelligent Systems and Digital Transformation (ICISD 2025)
PB - Atlantis Press
SP - 774
EP - 786
SN - 2589-4919
UR - https://doi.org/10.2991/978-94-6463-866-0_63
DO - 10.2991/978-94-6463-866-0_63
ID - Stephen2025
ER -