Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)

A Hybrid Approach for Biomedical Question Answering: Combining Sparse and Dense Retrieval with LLM-based Reranking

Authors
Pawan Makhija (1,*), Sanjay Tanwani (2)
(1) Research Scholar, Department of Computer Engineering, Institute of Engineering and Technology, DAVV, Indore, India
(2) Professor, School of Computer Science and IT, DAVV, Indore, India
*Corresponding author. Email: pawanmakhija@acropolis.in
Corresponding Author
Pawan Makhija
Available Online 26 May 2025.
DOI
10.2991/978-94-6463-716-8_82
Keywords
Biomedical Question Answering; Large Language Models (LLMs); Natural Language Processing; ColBERT Reranking
Abstract

Biomedical question-answering systems typically retrieve relevant documents and then rerank them by their relevance to the query. Traditional sparse retrievers, such as BM25, often fail to capture semantic relationships. Dense retrievers address this limitation, but they can also miss relevant documents because of short queries, vocabulary mismatch, and document-specificity issues stemming from embeddings. To address these challenges, we propose a hybrid approach that combines the strengths of sparse and dense retrieval. This ensemble generates a comprehensive list of candidate documents, which is then passed to ColBERT, an LLM-based reranking model with a fine-tuned late-interaction mechanism that scores document relevance to refine the ranking. We use the Flan-T5 answer generation model to produce the final answer to the query. Experiments on the BioASQ dataset demonstrate the effectiveness of our approach and its ability to improve retrieval performance.
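
The retrieve-then-rerank flow described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the sparse and dense candidate lists are merged, so reciprocal rank fusion is used here purely as an assumption. The reranking step uses the MaxSim late-interaction score that ColBERT is known for. The document ids and two-dimensional token vectors are hypothetical toy stand-ins for real BM25 hits, dense-retriever hits, and encoder outputs, and the Flan-T5 generation step is left out.

```python
from collections import defaultdict
from math import sqrt


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of doc ids into one list, best first.

    Assumption: the source does not name its fusion rule; RRF is a stand-in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: each query token keeps its best-matching doc token,
    and the per-token maxima are summed."""
    return sum(max(cosine(q, d) for d in doc_vecs) for q in query_vecs)


def rerank(query_vecs, doc_token_vecs, candidates):
    """Order candidate doc ids by descending MaxSim score."""
    return sorted(
        candidates,
        key=lambda d: (-maxsim_score(query_vecs, doc_token_vecs[d]), d),
    )


# Hypothetical candidate lists from a sparse and a dense retriever.
sparse_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])

# Hypothetical token embeddings (toy 2-d vectors, not real encoder output).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_token_vecs = {
    "d1": [[1.0, 0.0]],
    "d2": [[0.0, 1.0], [1.0, 1.0]],
    "d3": [[1.0, 0.0], [0.0, 1.0]],
    "d4": [[-1.0, 0.0]],
}
reranked = rerank(query_vecs, doc_token_vecs, fused)

print("fused:   ", fused)
print("reranked:", reranked)
```

In this toy run, fusion favours documents that both retrievers return, while the reranker promotes the document whose tokens cover every query token. In a full system the top reranked passages would then be passed to the answer generator.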

Copyright
© 2025 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
Series
Advances in Intelligent Systems Research
Publication Date
26 May 2025
ISBN
978-94-6463-716-8
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Pawan Makhija
AU  - Sanjay Tanwani
PY  - 2025
DA  - 2025/05/26
TI  - A Hybrid Approach for Biomedical Question Answering: Combining Sparse and Dense Retrieval with LLM-based Reranking
BT  - Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
PB  - Atlantis Press
SP  - 1108
EP  - 1123
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-716-8_82
DO  - 10.2991/978-94-6463-716-8_82
ID  - Makhija2025
ER  -