RAG-Driven Scholarly Assistant: Automating Research Paper Analysis with Open-Source LLM Benchmarking
- DOI
- 10.2991/978-94-6239-664-7_77
- Keywords
- Retrieval-Augmented Generation (RAG); Large Language Models (LLMs); Automated Literature Review; AI-Driven Document Analysis; OCR; Citation Network Analysis; Benchmarking; NLP
- Abstract
This work introduces a Retrieval-Augmented Generation (RAG)-based scholarly assistant for automated reading of research papers and uses it to benchmark several open-source LLMs. The system combines document processing, citation and structural analysis, and LLM-based question answering in a single pipeline to produce summaries and insights from academic literature. Benchmarking relies mainly on the quantitative metrics BLEU, METEOR, and ROUGE; perplexity, factual consistency, and computational resource usage are also taken into account. The tool generates an evaluation report, provides it as a downloadable CSV file, and visualizes the results in the user interface. The toolkit is assessed on five domain-specific research papers (in medicine, literature, economics, computer science, and mathematics) to allow an even comparison across domains. The smaller RAG-based models (DeepSeek-1.5B, 8B) respond faster and show higher factual consistency on average. In contrast, the larger generative models (Mistral-7B and LLaMA3-8B) give more detailed answers with higher overlap scores, at the cost of more computation and occasional factually inaccurate outputs. This evaluation supports the potential of an open scholarly assistant, gives a clearer picture of domain-dependent challenges and strengths, and suggests directions for future work.
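The overlap-based scoring and CSV export described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses a simplified ROUGE-1-style unigram overlap in place of the full BLEU, METEOR, and ROUGE metrics, and the function names (`unigram_overlap`, `scores_to_csv`), the model labels, and the sample answers are all hypothetical.

```python
import csv
import io
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> dict:
    """Clipped unigram precision, recall and F1 (a ROUGE-1-style score)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    precision = overlap / len(cand) if cand else 0.0
    recall = overlap / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def scores_to_csv(rows: list[dict]) -> str:
    """Serialize per-model score rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["model", "precision", "recall", "f1"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical reference and model answers, not data from the paper.
reference = "the model retrieves passages before answering"
answers = {
    "model_a": "the model retrieves passages before answering",
    "model_b": "the model answers directly",
}
rows = [
    {"model": name, **unigram_overlap(text, reference)}
    for name, text in answers.items()
]
report = scores_to_csv(rows)
```

Here `report` holds the CSV text that a tool of this kind could offer for download. A real benchmark would more likely rely on established metric implementations and human-written reference answers.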
- Copyright
- © 2026 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Irtefa Waseek
AU - Md Rezaul Karim
AU - Md Efatuzzaman Efat
AU - Sumiya Afrose
PY - 2026
DA - 2026/06/08
TI - RAG-Driven Scholarly Assistant: Automating Research Paper Analysis with Open-Source LLM Benchmarking
BT - Proceedings of the International Conference on Intelligent Data Analysis and Applications (IDAA 2025)
PB - Atlantis Press
SP - 1127
EP - 1143
SN - 1951-6851
UR - https://doi.org/10.2991/978-94-6239-664-7_77
DO - 10.2991/978-94-6239-664-7_77
ID - Waseek2026
ER -