Self-Distilled Vision Transformer (SD-ViT) to Classify Brain Tumors using MRI images

Tanjina Ahmed Tuly; Tanjid Ahammed Shafin; Jahangir Alam Tamal; Jamil Hasan; Md Zahid Hasan; Md. Mashruf Hasan

doi:10.2991/978-94-6239-664-7_8

Authors

Tanjina Ahmed Tuly¹,*, Tanjid Ahammed Shafin¹, Jahangir Alam Tamal¹, Jamil Hasan¹, Md Zahid Hasan¹, Md. Mashruf Hasan¹

¹Department of Computer Science and Engineering, Daffodil International University, Dhaka, 1216, Bangladesh

*Corresponding author. Email: tuly15-4902@diu.edu.bd

Corresponding Author

Tanjina Ahmed Tuly

Available Online 8 June 2026.

DOI: 10.2991/978-94-6239-664-7_8
Keywords: Brain Tumor Classification; Vision Transformer; Self-Distillation; Deep Learning; Medical Imaging; MRI
Abstract: Medical imaging is essential for identifying brain tumors early and accurately so that diagnosis and treatment can be planned. Recent progress in Vision Transformers (ViTs) has shown promise in medical image classification tasks. However, standard ViT models usually need large datasets and substantial computational resources to perform well, which is restrictive in medical imaging applications where data are scarce and heterogeneous. To address this problem, this study proposes a Self-Distilled Vision Transformer (SD-ViT) model for binary brain tumor classification. The proposed model uses self-distillation: a student branch of the network learns from the intermediate representations of a teacher branch with the same architecture, which improves feature generalization and representation learning without any external supervision. Experiments used the publicly available Brain Tumor MRI dataset, which contains tumor and non-tumor brain scans. Under the same experimental conditions, the baseline Vision Transformer reached an accuracy of 88% and a standard ResNet-50 model reached 75%. The proposed SD-ViT model achieved a classification accuracy of 91%, indicating improved precision and robustness over both baselines. The findings suggest that self-distillation helps the Vision Transformer learn more discriminative tumor features and reduces overfitting, especially on medical image classification problems with limited data. Overall, the proposed SD-ViT offers a simple but effective framework for automatic brain tumor detection, which could support better clinical decision support systems for radiological diagnosis.
Copyright: © 2026 The Author(s)
Open Access: Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
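
The self-distillation idea described in the abstract, where a student branch is trained against the intermediate representations of a teacher branch in addition to the class labels, can be illustrated with a minimal loss sketch. The abstract does not state the actual loss, so everything here is an assumption made for illustration: the mean-squared-error feature term, the weighting parameter `alpha`, and all function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Supervised term: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])

def feature_mse(student_feat, teacher_feat):
    """Distillation term (assumed): mean squared error between the student's
    and the teacher's intermediate feature vectors. The teacher features are
    treated as fixed targets."""
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n

def self_distillation_loss(student_logits, label, student_feat, teacher_feat, alpha=0.5):
    """Weighted sum of the supervised and distillation terms.
    alpha is a hypothetical mixing weight, not a value from the paper."""
    ce = cross_entropy(student_logits, label)
    distill = feature_mse(student_feat, teacher_feat)
    return (1.0 - alpha) * ce + alpha * distill

# Binary case (0 = non-tumor, 1 = tumor) with made-up numbers.
loss = self_distillation_loss(
    student_logits=[0.2, 1.1],
    label=1,
    student_feat=[0.5, -0.3, 0.8],
    teacher_feat=[0.6, -0.1, 0.7],
)
print(f"combined loss: {loss:.4f}")
```

In this sketch the distillation term vanishes when the student's features match the teacher's, so the combined loss reduces to the scaled cross-entropy. A real implementation would compute both branches from the same transformer and backpropagate only through the student path.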

Volume Title: Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
Series: Advances in Intelligent Systems Research
Publication Date: 8 June 2026
ISBN: 978-94-6239-664-7
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Tanjina Ahmed Tuly
AU  - Tanjid Ahammed Shafin
AU  - Jahangir Alam Tamal
AU  - Jamil Hasan
AU  - Md Zahid Hasan
AU  - Md. Mashruf Hasan
PY  - 2026
DA  - 2026/06/08
TI  - Self-Distilled Vision Transformer (SD-ViT) to Classify Brain Tumors using MRI images
BT  - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB  - Atlantis Press
SP  - 88
EP  - 100
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6239-664-7_8
DO  - 10.2991/978-94-6239-664-7_8
ID  - Tuly2026
ER  -
