A Hybrid Approach for Biomedical Question Answering: Combining Sparse and Dense Retrieval with LLM-based Reranking
- DOI
- 10.2991/978-94-6463-716-8_82
- Keywords
- Biomedical Question Answering; Large Language Models (LLMs); Natural Language Processing; ColBERT Reranking
- Abstract
Biomedical question-answering systems typically retrieve relevant documents and then rerank them by their relevance to the query. Traditional sparse retrievers, such as BM25, often fail to capture semantic relationships. Dense retrievers address this limitation, but they can also miss relevant documents because of short queries, vocabulary mismatch, and document-specificity issues stemming from the embeddings. To address these challenges, we propose a hybrid approach that combines the strengths of sparse and dense retrieval methods, resulting in better performance. This ensemble approach generates a comprehensive list of candidate documents, which is then passed to ColBERT, an LLM-based reranking model that uses a fine-tuned late-interaction mechanism to score document relevance and refine the ranking. Finally, we use the Flan-T5 answer generation model to produce an answer to the query. Experiments on the BioASQ dataset demonstrate the effectiveness of our approach and its ability to improve retrieval performance.
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Pawan Makhija
AU - Sanjay Tanwani
PY - 2025
DA - 2025/05/26
TI - A Hybrid Approach for Biomedical Question Answering: Combining Sparse and Dense Retrieval with LLM-based Reranking
BT - Proceedings of the International Conference on Recent Advancements and Modernisations in Sustainable Intelligent Technologies and Applications (RAMSITA 2025)
PB - Atlantis Press
SP - 1108
EP - 1123
SN - 1951-6851
UR - https://doi.org/10.2991/978-94-6463-716-8_82
DO - 10.2991/978-94-6463-716-8_82
ID - Makhija2025
ER -
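
The abstract describes merging sparse and dense retrieval results into one candidate list before reranking, but it does not say how the two lists are combined. The sketch below illustrates one common option, reciprocal rank fusion (RRF), purely as an assumption and not as the authors' method. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Each document receives 1 / (k + rank) from every list it appears in;
    k=60 is a conventional default, assumed here and not taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document IDs, for illustration only.
sparse_hits = ["d1", "d2", "d3"]  # e.g. the top results from a BM25 index
dense_hits = ["d3", "d1", "d4"]   # e.g. the top results from an embedding index

candidates = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(candidates)
```

In a pipeline like the one the abstract outlines, the fused `candidates` list would then go to the reranker. Documents found by both retrievers (`d1`, `d3`) rise above those found by only one, which is the intuition behind combining the two methods.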