While research on retrieval-augmented generation (RAG) is expanding, it predominantly consists of systematic reviews and comparisons of new state-of-the-art (SoTA) techniques against older ones.
The paper linked below, written by ML researchers from Predli and the University of California, Berkeley, aims to bridge this gap through extensive experimental comparisons, evaluating various RAG methods and analyzing their impact on retrieval precision and answer similarity.
Sentence Window Retrieval (SWR) emerged as the most effective method for retrieval precision, despite variable performance on answer similarity.
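To make the idea concrete, here is a minimal sketch of sentence window retrieval: index and score individual sentences, but hand back each hit together with its neighbouring sentences for context. The scoring function and sample data are toy illustrations, not the paper's implementation.

```python
# Sentence Window Retrieval sketch: match at sentence granularity,
# return the surrounding window of sentences for richer context.

def sentence_windows(sentences, window=1):
    """Map each sentence index to the joined text of its surrounding window."""
    return {
        i: " ".join(sentences[max(0, i - window): i + window + 1])
        for i in range(len(sentences))
    }

def toy_score(query, sentence):
    """Toy lexical-overlap score; a real system would use embeddings."""
    q, s = set(query.lower().split()), set(sentence.lower().split())
    return len(q & s) / (len(q) or 1)

def retrieve_with_window(query, sentences, window=1, top_k=1):
    windows = sentence_windows(sentences, window)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: toy_score(query, sentences[i]),
                    reverse=True)
    return [windows[i] for i in ranked[:top_k]]

sents = [
    "RAG combines retrieval with generation.",
    "Sentence Window Retrieval indexes single sentences.",
    "Each hit is expanded to its surrounding window.",
]
result = retrieve_with_window("sentence window retrieval", sents)
print(result)  # the best-matching sentence plus its neighbours
```

The key point is that scoring happens on small, precise units while the generator still receives enough surrounding text to answer well.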
They found that Hypothetical Document Embedding (HyDE) and LLM reranking, combined with SWR, notably improve retrieval precision.
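HyDE can be sketched in a few lines: instead of embedding the raw query, embed an LLM-drafted hypothetical answer and retrieve real documents closest to that embedding. The generator stub, bag-of-words embedding, and corpus below are stand-ins for illustration only.

```python
# HyDE sketch: retrieve against the embedding of a hypothetical answer,
# not the query itself. All helpers here are toy stand-ins.

def fake_llm_answer(query):
    # Stand-in for an LLM call that drafts a plausible answer to the query.
    return f"A plausible answer about {query} would discuss retrieval precision."

def embed(text):
    # Toy bag-of-words embedding over a tiny fixed vocabulary;
    # a real system would use a sentence-embedding model.
    vocab = ["retrieval", "precision", "answer", "reranking", "embedding"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def hyde_retrieve(query, corpus, top_k=1):
    hypo = fake_llm_answer(query)          # draft a hypothetical document
    qv = embed(hypo)                        # embed the draft, not the query
    ranked = sorted(corpus, key=lambda doc: cosine(qv, embed(doc)),
                    reverse=True)
    return ranked[:top_k]

corpus = [
    "reranking improves precision",
    "retrieval precision matters for answer quality",
    "embedding models vary",
]
print(hyde_retrieve("retrieval precision", corpus))
```

The intuition is that a full hypothetical answer sits closer in embedding space to relevant passages than a terse query does.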
However, Maximal Marginal Relevance (MMR) and Cohere rerank showed no significant advantage over a baseline naive RAG system.
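For reference, MMR selects documents by trading off relevance to the query against redundancy with documents already selected. The vectors and lambda value below are toy choices to show the mechanism, not settings from the paper.

```python
# MMR sketch: greedily pick documents maximising
# lam * sim(query, d) - (1 - lam) * max sim(d, already_selected).

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = [1.0, 0.1]
docs = [[1.0, 0.0], [0.9, 0.1], [0.2, 0.9]]  # docs 0 and 1 are near-duplicates
picked = mmr(query, docs, k=2, lam=0.3)      # low lambda favours diversity
print(picked)  # → [1, 2]: the near-duplicate of doc 1 is skipped
```

With a low lambda the near-duplicate is passed over in favour of a less relevant but more diverse document.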
Multi-query approaches underperformed in their assessments.
According to the paper, Document Summary Index could be a competent retrieval approach.
All resources related to this research are publicly accessible for further investigation in this repo: https://lnkd.in/db5Ecn29
Paper: https://lnkd.in/dq6XRJcN
#RAG #LLM