Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB

Publication, Conference
Liang, N; Wenz, F; Giceva, J; Wills, LW
Published in: Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025
January 1, 2025

Retrieval-Augmented Generation (RAG) has emerged as a popular technique for addressing several challenges of Large Language Model (LLM) systems, including static model knowledge, hallucination, and limited input sequence lengths. Although RAG mitigates common pitfalls of current LLM systems, its inherent heterogeneity and configurability introduce new challenges. The performance of RAG is crucial for meeting the high-throughput and low-latency demands of LLM services. Different components of RAG operate on different hardware platforms, and their complexity scales with the configurability and complexity of the rest of the system. For example, larger embeddings may enhance retrieval accuracy, but they also increase the latency of embedding creation and indexing, thereby compromising the RAG system's performance and energy consumption. Thus, a comprehensive characterization of an end-to-end RAG system becomes necessary. In this work, we build an end-to-end RAG benchmarking framework, Athena, that supports various embedding models, vector databases, index/search algorithms, and LLMs. By characterizing the system under various RAG settings built using Athena, we demystify RAG by identifying performance bottlenecks and quantifying the impact of each sub-component on overall system performance. In addition, the plug-and-play, open-sourced Athena framework is designed to assist future RAG research.
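As a rough illustration of the per-stage characterization the abstract describes, the following Python sketch times each stage of a toy RAG pipeline while varying the embedding dimension. It is not Athena's API: `toy_embed`, `brute_force_search`, `toy_generate`, `timed`, and `run_pipeline` are hypothetical stand-ins for an embedding model, a vector index, an LLM, and a timing harness, and the documents are made-up placeholders.

```python
import hashlib
import math
import time
from typing import Callable, Dict, List, Tuple


def toy_embed(text: str, dim: int) -> List[float]:
    """Hypothetical stand-in for an embedding model: a hash-derived unit vector."""
    values: List[float] = []
    counter = 0
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode()).digest()
        values.extend(b / 255.0 - 0.5 for b in digest)
        counter += 1
    values = values[:dim]
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


def brute_force_search(index: List[Tuple[int, List[float]]],
                       query: List[float], k: int) -> List[int]:
    """Hypothetical stand-in for a vector database: exact inner-product top-k."""
    scored = [(sum(q * d for q, d in zip(query, vec)), doc_id)
              for doc_id, vec in index]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


def toy_generate(question: str, contexts: List[str]) -> str:
    """Hypothetical stand-in for an LLM call."""
    return f"Q: {question} | context docs: {len(contexts)}"


def timed(stage: str, timings: Dict[str, float], fn: Callable, *args):
    """Run fn(*args) and add its wall-clock time to timings[stage]."""
    start = time.perf_counter()
    result = fn(*args)
    timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start
    return result


def run_pipeline(docs: List[str], question: str, dim: int, k: int = 2):
    """Embed and index docs, embed the query, search, generate; time each stage."""
    timings: Dict[str, float] = {}
    index = timed("embed+index", timings,
                  lambda: [(i, toy_embed(d, dim)) for i, d in enumerate(docs)])
    query_vec = timed("embed_query", timings, toy_embed, question, dim)
    top_ids = timed("search", timings, brute_force_search, index, query_vec, k)
    answer = timed("generate", timings, toy_generate, question,
                   [docs[i] for i in top_ids])
    return answer, top_ids, timings


if __name__ == "__main__":
    docs = ["vector databases store embeddings",
            "LLMs generate text",
            "RAG retrieves context before generation"]
    for dim in (64, 1024):  # compare a small and a large embedding size
        _, top_ids, timings = run_pipeline(docs, "What is RAG?", dim)
        print(dim, top_ids, {s: f"{t * 1e3:.3f} ms" for s, t in timings.items()})
```

Swapping any stand-in for a real component leaves the timing harness unchanged, which is the sense in which such a setup can be called plug-and-play. The absolute numbers from this toy are meaningless; only the breakdown by stage mirrors the kind of measurement the abstract refers to.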

Published In

Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025

DOI

10.1109/IISWC66894.2025.00013

Publication Date

January 1, 2025

Start / End Page

28 / 41
 

Citation

APA: Liang, N., Wenz, F., Giceva, J., & Wills, L. W. (2025). Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB. In Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025 (pp. 28–41). https://doi.org/10.1109/IISWC66894.2025.00013
Chicago: Liang, N., F. Wenz, J. Giceva, and L. W. Wills. “Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB.” In Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025, 28–41, 2025. https://doi.org/10.1109/IISWC66894.2025.00013.
ICMJE: Liang N, Wenz F, Giceva J, Wills LW. Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB. In: Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025. 2025. p. 28–41.
MLA: Liang, N., et al. “Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB.” Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025, 2025, pp. 28–41. Scopus, doi:10.1109/IISWC66894.2025.00013.
NLM: Liang N, Wenz F, Giceva J, Wills LW. Athena: A Plug-and-Play Advisor for Retrieval-Augmented Generation using VectorDB. Proceedings 2025 IEEE International Symposium on Workload Characterization IISWC 2025. 2025. p. 28–41.
