Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)

An Empirical Study on Various Word Sense Disambiguation Techniques in the Biomedical Domain

Authors
Pawan Makhija1, *, Sanjay Tanwani2
1Research Scholar, Department of Computer Engineering, Institute of Engineering and Technology, DAVV, Indore, India
2School of Computer Science and IT, DAVV, Indore, India
*Corresponding author. Email: pawanmakhija@acropolis.in
Corresponding Author
Pawan Makhija
Available Online 26 May 2025.
DOI
10.2991/978-94-6463-716-8_79
Keywords
Word Sense Disambiguation (WSD); Natural Language Processing (NLP); Long Short-Term Memory (LSTM); Support Vector Machine (SVM); Convolutional Neural Network (CNN)
Abstract

Word sense disambiguation (WSD) in the biomedical domain is a complex task that involves establishing the correct meaning of a word from its context within biomedical literature. Biomedical literature comprises research papers, clinical reports, electronic health records, and pharmaceutical articles. All of these sources are rich in terminology that often has multiple meanings, which creates uncertainty when assigning an accurate meaning to a word in biomedical text. This polysemy is a challenge because the accurate interpretation of such terms is important for downstream applications, including information retrieval, text mining, and knowledge extraction. Ambiguity in word meanings can lead to misunderstandings or increased errors in biomedical data analysis, thereby influencing clinical decision-making, research outcomes, drug discovery, and patient treatment. A WSD technique is therefore required so that computational systems can process and analyze biomedical literature precisely, forming a basis for more reliable and useful insights. This paper presents an in-depth investigation of the challenges associated with WSD in the biomedical domain, including the structural complexity of medical language, the need for domain-specific knowledge, and the drawbacks of existing natural language processing (NLP) techniques. We discuss the relevance of WSD to improving the accuracy and efficiency of biomedical data analysis. The study evaluates several methods, from fundamental rule-based approaches to more advanced machine learning and deep learning models, for their effectiveness in addressing WSD challenges in biomedical texts. The findings of the empirical study show that the BERT model is more effective than the other machine learning models on the classification problems.
Our study also reviews prior research in this area and the applications of WSD in the biomedical domain.
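
As a rough illustration of the task the abstract describes, the sketch below picks a sense for an ambiguous biomedical term by counting word overlap between the surrounding context and a short gloss for each sense (a simplified Lesk-style, knowledge-based heuristic). It is not the method evaluated in the paper: the sense inventory `SENSES`, its glosses, the stopword list, and the function names are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical toy sense inventory; the glosses are illustrative only and
# are not taken from the paper or from any real biomedical resource.
SENSES = {
    "cold": {
        "common_cold": "viral infection of the nose and throat causing "
                       "cough sneezing and runny nose",
        "low_temperature": "low temperature exposure of the body or "
                           "environment causing chill or hypothermia",
    }
}

# Small assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"the", "of", "and", "or", "a", "an", "to", "in", "with",
             "was", "for", "after", "by"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword alphabetic tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def disambiguate(word, context):
    """Return (best_sense, scores) using gloss/context word overlap.

    Ties are resolved in favour of the sense listed first in SENSES.
    """
    context_tokens = tokenize(context) - {word}
    scores = {
        sense: len(context_tokens & tokenize(gloss))
        for sense, gloss in SENSES[word].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    for sentence in (
        "The patient presented with a cold, persistent cough and a runny nose.",
        "Prolonged exposure to cold water lowered the body temperature.",
    ):
        print(sentence, "->", disambiguate("cold", sentence)[0])
```

An overlap heuristic like this depends entirely on the wording of the glosses, which is one reason supervised and contextual-embedding models are usually compared against such baselines.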

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.


Volume Title
Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
26 May 2025
ISBN
978-94-6463-716-8
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Pawan Makhija
AU  - Sanjay Tanwani
PY  - 2025
DA  - 2025/05/26
TI  - An Empirical Study on Various Word Sense Disambiguation Techniques in the Biomedical Domain
BT  - Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
PB  - Atlantis Press
SP  - 1059
EP  - 1078
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-716-8_79
DO  - 10.2991/978-94-6463-716-8_79
ID  - Makhija2025
ER  -