Hybrid Semantic Retrieval: Augmenting Weighted TF–IDF with BERT for Enhanced Question Answering
- DOI
- 10.2991/978-94-6463-948-3_26
- Keywords
- Semantic Search; TF–IDF; BERT; Question Answering; Information Retrieval; Hybrid Retrieval; BiLSTM–CRF
- Abstract
Question-answering (QA) systems face a difficult trade-off between the speed of inverted indices and the contextual understanding of neural models. Traditional TF–IDF is fast but brittle when query wording shifts, while BERT offers deep context at a high computational cost. We bridge this divide with a hybrid architecture centered on "questionable spans": specific text segments statistically likely to hold answers. By training a BiLSTM–CRF model to detect these high-value spans and up-weighting them in a standard TF–IDF index, we create a semantically sharpened candidate set. This allows us to apply expensive BERT re-ranking only to that candidate set rather than to the full collection. Experiments on Yahoo! Answers show that this approach significantly boosts recall and precision, recovering relevant documents that standard lexical search misses.
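As a rough illustration of the span up-weighting step described in the abstract, the following minimal sketch builds a TF–IDF index in which tokens flagged as lying inside a span count more than ordinary tokens. It is not the paper's implementation: the span masks are supplied by hand in place of a trained BiLSTM–CRF tagger, the `span_boost` parameter, the smoothed IDF formula, and all function names are assumptions made for this example, and the BERT re-ranking stage is omitted (the returned candidate list is where a re-ranker would be applied).

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real system would use something stronger.
    return text.lower().split()


def weighted_tf(tokens, span_mask, span_boost):
    # Term counts where in-span tokens contribute span_boost instead of 1.
    tf = Counter()
    for tok, in_span in zip(tokens, span_mask):
        tf[tok] += span_boost if in_span else 1.0
    return tf


def build_index(docs, masks, span_boost=3.0):
    # docs: list of token lists; masks: parallel lists of booleans
    # (hypothetical stand-in for the output of a span tagger).
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant, not taken from the paper).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = []
    for doc, mask in zip(docs, masks):
        tf = weighted_tf(doc, mask, span_boost)
        vecs.append({t: w * idf[t] for t, w in tf.items()})
    return vecs, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def score(query, vec, idf):
    # Query terms unseen at indexing time are ignored.
    q_tf = Counter(tokenize(query))
    q = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    return cosine(q, vec)


def retrieve(query, vecs, idf, k=2):
    # Top-k candidate document ids; a neural re-ranker would run on these.
    scored = sorted(((score(query, v, idf), i) for i, v in enumerate(vecs)),
                    reverse=True)
    return [i for s, i in scored[:k] if s > 0]


docs = [tokenize("restart the router to fix the wifi"),
        tokenize("my router is old and the wifi is slow")]
# Hand-marked span: "restart the router" in the first document.
masks = [[True, True, True, False, False, False, False],
         [False] * 9]

plain_vecs, plain_idf = build_index(docs, masks, span_boost=1.0)
boosted_vecs, boosted_idf = build_index(docs, masks, span_boost=3.0)
```

With `span_boost=1.0` the index reduces to ordinary TF–IDF, so comparing `score("router", plain_vecs[0], plain_idf)` against `score("router", boosted_vecs[0], boosted_idf)` shows how boosting shifts a document's vector toward its in-span terms.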
- Copyright
- © 2025 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Dinesh Kumar Koilada
PY  - 2026
DA  - 2026/01/06
TI  - Hybrid Semantic Retrieval: Augmenting Weighted TF–IDF with BERT for Enhanced Question Answering
BT  - Proceedings of the International Conference on Sustainable Innovation with Artificial Intelligence and Machine Learning 2025 (ICSIAIML 2025)
PB  - Atlantis Press
SP  - 359
EP  - 365
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-948-3_26
DO  - 10.2991/978-94-6463-948-3_26
ID  - Koilada2026
ER  -