Diabetic Retinopathy Detection using Vision Transformers
Project Overview
This project implements a state-of-the-art Vision Transformer (ViT) model for automated diabetic retinopathy detection from retinal fundus images. The system achieves 94% accuracy in classifying disease severity levels, providing a crucial tool for early detection and prevention of diabetes-related vision loss.
Key Achievements
- High Accuracy: 94% classification accuracy on diabetic retinopathy severity levels
- Modern Architecture: Leverages Vision Transformers for superior feature learning from retinal images
- Production Deployment: Fully containerized and deployed on AWS Lambda for scalable predictions
- Fast Inference: Optimized for real-time screening in clinical settings
Technical Stack
Deep Learning Framework
- Vision Transformer (ViT) architecture
- Custom preprocessing pipeline for retinal images
- Transfer learning from pre-trained medical imaging models
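The README does not show the actual preprocessing code, but the idea of a fundus preprocessing step can be sketched as follows. The function name, crop strategy, and target size here are illustrative assumptions, not the project's real pipeline:

```python
import numpy as np

def preprocess_fundus(image, size=224):
    """Illustrative fundus preprocessing: center square-crop, resize,
    and scale pixel intensities to [0, 1].  (Sketch only; the project's
    actual pipeline is not shown in this README.)"""
    h, w = image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    # Nearest-neighbour resize via index sampling keeps this sketch
    # dependency-free; a real pipeline would use a proper resampler.
    idx = np.arange(size) * side // size
    resized = cropped[idx][:, idx]
    return resized.astype(np.float32) / 255.0
```

A real system would typically also mask out the black border around the circular fundus region and apply channel-wise normalization before the image reaches the model.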
API & Deployment
- RESTful API for image upload and prediction
- Asynchronous request handling
- Input validation and error handling
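As a sketch of the input-validation step, the helper below checks an uploaded file's magic bytes and size before it is passed to the model. The size limit and accepted formats are assumptions for illustration, not the project's actual rules:

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MB cap (illustrative)

def validate_upload(data: bytes) -> str:
    """Return the detected image format, or raise ValueError.
    Sketch of request validation for the prediction endpoint."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds size limit")
    if data[:3] == b"\xff\xd8\xff":        # JPEG magic bytes
        return "jpeg"
    if data[:8] == b"\x89PNG\r\n\x1a\n":   # PNG magic bytes
        return "png"
    raise ValueError("unsupported image format")
```

Rejecting bad input before inference keeps error handling cheap: the API can return a 4xx response immediately instead of failing inside the model.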
Containerization
- Multi-stage Docker builds for optimized image size
- Reproducible development and production environments
- Easy deployment across different platforms
Cloud Infrastructure
- Serverless deployment for cost-effective scaling
- API Gateway integration for HTTPS endpoints
- S3 integration for model artifact storage
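A serverless prediction endpoint of this kind is usually a single Lambda handler behind API Gateway's proxy integration. The sketch below shows that shape; `run_model` is a hypothetical placeholder for the real ViT inference (which would load the artifact, e.g. from S3), not the project's actual code:

```python
import base64
import json

def handler(event, context):
    """Sketch of an AWS Lambda entry point for the prediction API.
    Assumes the API Gateway proxy event format (``body`` plus the
    ``isBase64Encoded`` flag for binary uploads)."""
    try:
        body = event.get("body") or ""
        if event.get("isBase64Encoded"):
            image_bytes = base64.b64decode(body)
        else:
            image_bytes = body.encode()
        severity = run_model(image_bytes)  # placeholder for real inference
        return {"statusCode": 200,
                "body": json.dumps({"severity": severity})}
    except Exception as exc:
        return {"statusCode": 400,
                "body": json.dumps({"error": str(exc)})}

def run_model(image_bytes: bytes) -> int:
    # Stub: a real handler would run the containerized ViT model here
    # and return the predicted severity class.
    return 0
```

Keeping the model artifact in S3 and loading it once outside the handler (at cold start) is the usual way to amortize load time across invocations.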
Model Architecture
The Vision Transformer processes retinal images by:
- Dividing images into fixed-size patches
- Linearly embedding each patch
- Adding positional embeddings
- Processing through transformer encoder layers
- Passing the encoded representation to a classification head for severity prediction
Clinical Impact
This system assists ophthalmologists by:
- Providing rapid screening results
- Prioritizing high-risk patients
- Reducing diagnostic workload
- Enabling early intervention for vision preservation
