Enhancing Retrieval-Augmented Generation with Multi-Armed Bandits for Complex Query Answering

Saturday 01 February 2025


Language models have revolutionized the field of natural language processing, enabling machines to generate human-like text and answer many questions accurately. However, these models often struggle with complex queries that require multiple steps or pieces of information to answer correctly. A team of researchers has developed a new approach to this challenge that changes how a retrieval-augmented generation (RAG) system chooses its retrieval strategy.


Traditional RAG systems apply a single retrieval method to every query and then generate text from the results. This can be inefficient when queries vary in difficulty: a simple question may trigger retrieval it does not need, while a multi-step question may need several rounds or different types of information. To address this issue, the researchers developed a multi-armed bandit (MAB) approach that dynamically selects the most suitable retrieval strategy based on query complexity.


In their experiments, the team used six benchmark datasets, covering both single-hop and multi-hop questions, to test the effectiveness of their MAB-based RAG system. They compared it with other state-of-the-art methods, such as Adaptive-RAG, which uses a classifier to choose a retrieval strategy based on predicted query complexity.


The results show that the MAB-based RAG system outperformed all other methods in terms of both accuracy and efficiency. It achieved an average accuracy of 38.8% across all datasets while reducing the step cost by 20%. In contrast, Adaptive-RAG achieved an average accuracy of 37.1%, about 1.7 percentage points lower, and with a higher step cost.


The MAB-based RAG system works by treating each retrieval strategy as an arm in a multi-armed bandit problem. It selects the most suitable arm based on the query complexity and then generates text using that strategy. The system is trained using a dynamic reward function that balances accuracy and efficiency, penalizing methods that require more steps or retrieval actions.
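
As a rough illustration of the idea, the sketch below runs an epsilon-greedy bandit over three retrieval strategies. It is not the researchers' implementation: the arm names, the step costs, the linear penalty in `reward`, the epsilon-greedy rule, and the hand-labeled "simple"/"complex" query types are all assumptions made for this example, since the article does not give the exact formula or exploration scheme.

```python
import random

# Hypothetical arms and step costs, for illustration only.
STEP_COST = {"no_retrieval": 0, "single_step": 1, "multi_step": 3}


def reward(correct, steps, penalty=0.1):
    """Assumed reward: 1 for a correct answer, minus a penalty per retrieval step."""
    return (1.0 if correct else 0.0) - penalty * steps


class EpsilonGreedyBandit:
    """Keeps a running mean reward for each (query type, arm) pair."""

    def __init__(self, arms, epsilon=0.1, initial_value=1.0, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.initial_value = initial_value  # optimistic start so untried arms get tried
        self.rng = random.Random(seed)
        self.counts = {}
        self.values = {}

    def estimate(self, context, arm):
        return self.values.get((context, arm), self.initial_value)

    def best_arm(self, context):
        # Ties go to the arm listed first (the cheapest one here).
        return max(self.arms, key=lambda arm: self.estimate(context, arm))

    def select(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)  # explore
        return self.best_arm(context)  # exploit

    def update(self, context, arm, observed_reward):
        key = (context, arm)
        n = self.counts.get(key, 0) + 1
        self.counts[key] = n
        old = self.values.get(key, self.initial_value)
        # Incremental mean; the first observation replaces the initial value.
        self.values[key] = old + (observed_reward - old) / n


def answers_correctly(context, arm):
    """Toy environment: simple queries always succeed, complex ones need multi-step."""
    return context == "simple" or arm == "multi_step"


def train(bandit, rounds=200):
    for i in range(rounds):
        context = "simple" if i % 2 == 0 else "complex"
        arm = bandit.select(context)
        r = reward(answers_correctly(context, arm), STEP_COST[arm])
        bandit.update(context, arm, r)
    return bandit
```

In this toy setting, every arm answers a simple query correctly, so the step penalty is what steers the bandit toward the cheapest arm. For complex queries only the multi-step arm earns a positive reward, so it wins despite its cost.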


The researchers also implemented a multi-label classification setup so that the choice of strategy is better justified once step costs are taken into account. This approach allowed more than one label, each corresponding to a retrieval strategy, to be considered for a query during training and inference, with the most likely label ultimately selected.
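
One way to picture a multi-label setup is sketched below. This is a guess at the general shape rather than the researchers' code: the strategy names, the multi-hot target format, and the plain argmax at inference are assumptions for illustration.

```python
# Hypothetical strategy names, ordered from cheapest to most expensive.
STRATEGIES = ["no_retrieval", "single_step", "multi_step"]


def multi_label_target(correct_by_strategy):
    """Multi-hot target: 1.0 for every strategy that answered the query correctly."""
    return [1.0 if correct_by_strategy.get(s, False) else 0.0 for s in STRATEGIES]


def choose_strategy(scores):
    """At inference, pick the strategy with the highest predicted score."""
    best = max(range(len(STRATEGIES)), key=lambda i: scores[i])
    return STRATEGIES[best]
```

Under a single-label setup, only one strategy per query would be marked as correct. Here, a query that both `single_step` and `multi_step` can answer keeps both labels, and the predicted scores decide which one is used.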


Overall, the MAB-based RAG system represents a significant advancement in the field of natural language processing, enabling machines to generate accurate and efficient responses to complex queries. Its adaptability and flexibility make it an attractive solution for a wide range of applications, from customer service chatbots to search engines and beyond.


Cite this article: “Enhancing Retrieval-Augmented Generation with Multi-Armed Bandits for Complex Query Answering”, The Science Archive, 2025.


Keywords: Natural Language Processing, Retrieval-Augmented Generation, Multi-Armed Bandit, Query Complexity, Efficiency, Accuracy, Benchmark Datasets, Adaptive-RAG, Step Cost, Dynamic Reward Function


Reference: Xiaqiang Tang, Qiang Gao, Jian Li, Nan Du, Qi Li, Sihong Xie, “MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity” (2024).

