An Empirical Study on Various Word Sense Disambiguation Techniques in the Biomedical Domain

Pawan Makhija; Sanjay Tanwani

doi:10.2991/978-94-6463-716-8_79

Authors

Pawan Makhija¹,*, Sanjay Tanwani²

¹Research Scholar, Department of Computer Engineering, Institute of Engineering and Technology, DAVV, Indore, India

²School of Computer Science and IT, DAVV, Indore, India

*Corresponding author. Email: pawanmakhija@acropolis.in

Available Online 26 May 2025.

DOI: 10.2991/978-94-6463-716-8_79
Keywords: Word Sense Disambiguation (WSD); Natural Language Processing (NLP); Long Short-Term Memory (LSTM); Support Vector Machine (SVM); Convolutional Neural Network (CNN)
Abstract: Word sense disambiguation (WSD) in the biomedical domain is a complex task that involves establishing the correct meaning of a word from its context within biomedical literature. Biomedical literature comprises research papers, clinical reports, electronic health records, and pharmaceutical articles. All of these sources are rich in terminology that often has multiple meanings, which creates uncertainty when assigning an accurate meaning to a word in biomedical text. This polysemy poses a challenge, because the accurate interpretation of such terms is important for downstream applications including information retrieval, text mining, and knowledge extraction. Ambiguity in word meanings can lead to misunderstandings or increased errors in biomedical data analysis, thereby influencing clinical decision-making, research outcomes, drug discovery, and patient treatment. A WSD technique is therefore required so that computational systems can precisely process and analyze biomedical literature, which forms a basis for more reliable and useful insights. This paper presents an in-depth investigation of the challenges associated with WSD in the biomedical domain, including the structural complexity of medical language, the need for domain-specific knowledge, and the limitations of existing natural language processing (NLP) techniques. We discuss the relevance of WSD to improving the accuracy and efficiency of biomedical data analysis. The study evaluates several methods, from fundamental rule-based approaches to more advanced machine learning and deep learning models, for their effectiveness in addressing WSD challenges in biomedical texts. The findings of the empirical study show that the BERT model is more effective than the other machine learning models on the classification problems. The study also surveys prior research in this area and the applications of WSD in the biomedical domain.
Copyright: © 2025 The Author(s)
Open Access: This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title: Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
Series: Advances in Intelligent Systems Research
Publication Date: 26 May 2025
ISBN: 978-94-6463-716-8
ISSN: 1951-6851

Cite this article

TY  - CONF
AU  - Pawan Makhija
AU  - Sanjay Tanwani
PY  - 2025
DA  - 2025/05/26
TI  - An Empirical Study on Various Word Sense Disambiguation Techniques in the Biomedical Domain
BT  - Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
PB  - Atlantis Press
SP  - 1059
EP  - 1078
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-716-8_79
DO  - 10.2991/978-94-6463-716-8_79
ID  - Makhija2025
ER  -