An exploratory study on integrating radiomics with vision transformers for enhancing medical imaging classification accuracy.
BACKGROUND: Medical image analysis has advanced substantially through recent developments in deep learning (DL) algorithms. Vision Transformers (ViTs) have emerged as a powerful alternative that leverages self-attention to model both local and global interactions. Despite their promise, ViTs are data-intensive and lack inductive biases, which limits their utility in medical imaging. Conversely, radiomics offers domain-specific, interpretable descriptors of image heterogeneity but lacks scalability and integration with deep learning. This study proposes a unified Radiomics-Embedded Vision Transformer (RE-ViT) framework that combines handcrafted radiomic features and data-driven visual embeddings within a ViT architecture.

PURPOSE: To develop and evaluate an RE-ViT framework that integrates radiomics and patch-wise ViT embeddings to improve feature representation for medical image classification across heterogeneous datasets.

METHODS: Following the classic ViT design, the input image was first divided into multiple image patches. For each image patch, handcrafted radiomic features, including intensity, texture, and spatial heterogeneity descriptors, were extracted. Simultaneously, standard patch embeddings were obtained via linear projection of pixel intensities. The two embeddings were averaged, normalized, and combined with positional encodings before being tokenized and processed by a ViT encoder. A learnable token aggregated patch-level information for final classification. The model was evaluated on three publicly available datasets using 10-fold cross-validation: BUSI (lesion malignancy diagnosis through breast ultrasound), ChestXray2017 (lung pneumonitis diagnosis through chest X-ray), and Retinal OCT (retina disease diagnosis through retinal OCT). Performance metrics included accuracy, macro area under the ROC curve (AUC), sensitivity, and specificity. Ablation studies were performed to assess the contribution of RE-ViT architectural components on these three clinical problems. Comparative analyses were also conducted against CNN (VGG-16, ResNet) and hybrid (TransMed) models.

RESULTS: The proposed RE-ViT model demonstrated consistently robust classification performance across all three medical imaging datasets. In BUSI, RE-ViT achieved an accuracy of 0.848 ± 0.027, AUC of 0.950 ± 0.011, sensitivity of 0.796 ± 0.042, and specificity of 0.905 ± 0.020. In ChestXray2017, it yielded an accuracy of 0.950 ± 0.012, AUC of 0.989 ± 0.004, sensitivity of 0.953 ± 0.010, and specificity of 0.975 ± 0.005. In Retinal OCT, RE-ViT achieved an accuracy of 0.938 ± 0.001, AUC of 0.986 ± 0.001, sensitivity of 0.914 ± 0.023, and specificity of 0.969 ± 0.024. In the comparison studies, RE-ViT matched or outperformed the alternative models. Ablation revealed significant performance drops when either the radiomics or the projection-based embeddings were removed. Attention map visualizations demonstrated imaging modality-specific utilization of radiomics and learned features, with improved localization of clinically relevant regions.

CONCLUSIONS: The proposed radiomics-embedded vision transformer was developed for multiple image classification tasks. Current results underscore the potential of this approach to advance other transformer-based medical image classification tasks.
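The token-construction step described in METHODS can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it only mirrors the sequence stated in the abstract (patching, a radiomics branch, a linear-projection branch, averaging, normalization, positional encoding, and a class token). Everything else is an assumption made for illustration: the function names, the three first-order descriptors standing in for a full radiomics panel, the random (untrained) weights, the zero-initialized class token, the placement of the class token before adding positional encodings, and an input image scaled to [0, 1].

```python
import numpy as np

def split_into_patches(image, patch):
    """Split an (H, W) image into non-overlapping (patch, patch) tiles."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    tiles = image[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(rows * cols, patch, patch)

def toy_radiomics(tile, bins=8):
    """Hypothetical stand-in for a radiomics panel: mean, std, histogram entropy.
    Assumes intensities lie in [0, 1]."""
    hist, _ = np.histogram(tile, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))
    return np.array([tile.mean(), tile.std(), entropy])

def layer_norm(x, eps=1e-6):
    """Normalize each token to zero mean and unit variance across features."""
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def fused_tokens(image, patch=4, dim=16, seed=0):
    """Build the token sequence fed to a ViT encoder (encoder not shown)."""
    rng = np.random.default_rng(seed)
    tiles = split_into_patches(image, patch)
    n = tiles.shape[0]

    # Branch 1: linear projection of flattened pixel intensities.
    w_pix = rng.normal(0.0, 0.02, size=(patch * patch, dim))
    pix_emb = tiles.reshape(n, -1) @ w_pix

    # Branch 2: per-patch handcrafted features projected to the same width.
    feats = np.stack([toy_radiomics(t) for t in tiles])
    w_rad = rng.normal(0.0, 0.02, size=(feats.shape[1], dim))
    rad_emb = feats @ w_rad

    # Average the two embeddings, then normalize.
    fused = layer_norm(0.5 * (pix_emb + rad_emb))

    # Prepend a class token (learnable in a trained model; zeros here),
    # then add positional encodings (random placeholders here).
    cls = np.zeros((1, dim))
    tokens = np.concatenate([cls, fused], axis=0)
    pos = rng.normal(0.0, 0.02, size=tokens.shape)
    return tokens + pos

image = np.random.default_rng(1).random((16, 16))
tokens = fused_tokens(image, patch=4, dim=16)
print(tokens.shape)
```

In a trained model the projection weights, class token, and positional encodings would be learned parameters, and the radiomics branch would use a validated feature extractor rather than the three toy descriptors above.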
Related Subject Headings
- Radiomics
- Nuclear Medicine & Medical Imaging
- Image Processing, Computer-Assisted
- Humans
- Diagnostic Imaging
- Deep Learning
- 5105 Medical and biological physics
- 4003 Biomedical engineering
- 1112 Oncology and Carcinogenesis
- 0903 Biomedical Engineering