====== Hybrid Search ======

Hybrid search combines keyword-based search (such as BM25) with semantic vector search to retrieve more comprehensive and relevant results. This fusion addresses the limitations of each individual method: keyword search misses synonyms and context, while vector search may overlook precise terms like product codes or entity names. ((https://www.meilisearch.com/blog/hybrid-search-rag|Meilisearch: Hybrid Search for RAG))

===== How Hybrid Search Works =====

Hybrid search processes queries through two parallel pipelines: ((https://www.elastic.co/what-is/hybrid-search|Elastic: What Is Hybrid Search))

  * **Keyword search (sparse)**: BM25 or similar lexical methods match exact words and phrases, excelling at specific terms, entity names, and codes
  * **Semantic search (dense)**: Embedding models find conceptually similar content, handling paraphrased or rephrased queries

Results from both pipelines are merged via fusion techniques into a unified ranked list. The top items then augment the LLM prompt for generation.

===== Reciprocal Rank Fusion (RRF) =====

RRF is the most common fusion algorithm for hybrid search. It aggregates reciprocal ranks from each retriever, avoiding the need for score normalization across different scoring scales. ((https://www.elastic.co/what-is/hybrid-search|Elastic: Hybrid Search)) ((https://careers.edicomgroup.com/techblog/llm-rag-improving-the-retrieval-phase-with-hybrid-search/|EDICOM: Hybrid Search for RAG))

The formula for a document d, summed over all retrievers i:

  RRF(d) = sum( 1 / (k + rank_i(d)) )

Where rank_i(d) is the rank of document d in retriever i, and k is a smoothing constant (typically 60). Items are sorted by descending RRF score. The constant k dampens the influence of top-ranked items, preventing any single retriever from dominating results.
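
The RRF formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name ''reciprocal_rank_fusion'', the document IDs, and the two sample rankings are hypothetical, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of ranked lists, best match first (ranks start at 1).
    k: smoothing constant from the RRF formula.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each retriever contributes 1 / (k + rank) for every document it returned.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from the keyword and vector pipelines for one query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this sample, ''doc_b'' ends up first because it ranks near the top of both lists, and documents returned by only one retriever fall below those returned by both. Only ranks are used, so the raw BM25 and similarity scores never need to be normalized.
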
===== Weighted Score Combination =====

An alternative to RRF is a linear blend of normalized scores: ((https://rellify.com/blog/hybrid-search-for-marketers|Rellify: Hybrid Search))

  score = alpha * score_vector + (1 - alpha) * score_keyword

Where alpha ranges from 0 (pure keyword) to 1 (pure vector). Both scores must first be normalized to a common range, since BM25 and vector similarity use different scales. A reasonable starting point is alpha between 0.5 and 0.7 (semantic-biased), tuned afterwards on domain queries.

===== Database Implementations =====

==== Weaviate ====

Weaviate provides a hybrid search API with an alpha parameter controlling the balance between keyword (BM25) and vector search. Setting alpha=0 gives pure keyword, alpha=1 gives pure vector, and values in between blend both methods. ((https://www.meilisearch.com/blog/hybrid-search-rag|Meilisearch: Hybrid Search))

==== Qdrant ====

Qdrant supports hybrid search by combining sparse vectors (BM25-like keyword representations) with dense vectors (semantic embeddings) in a single query. Results are fused using weighted scoring or RRF.

==== Elasticsearch ====

Elasticsearch combines kNN vector search with BM25 lexical search via the rrf retriever. Each method retrieves its top-k candidates independently (e.g., top-5 from each), then RRF merges and ranks the combined results. ((https://www.elastic.co/what-is/hybrid-search|Elastic: Hybrid Search))

==== Pinecone ====

Pinecone hybrid search uses pod indexes that store both sparse (keyword) and dense (semantic) vectors, with automatic fusion and tunable weights for balancing the two signals.

==== Milvus ====

Milvus supports hybrid search across multiple vector fields, for example dense embeddings alongside sparse keyword vectors, optionally combined with scalar filtering. Results are fused with RRF or weighted scoring.

===== Performance =====

Hybrid search is reported to outperform single-method search by 10-30% on retrieval-quality metrics such as NDCG@10 on standard benchmarks like BEIR and MS MARCO. In RAG applications, it is reported to reduce retrieval errors by 20-40% compared to vector-only search, producing richer context that leads to fewer hallucinations.
((https://www.velodb.io/glossary/what-is-hybrid-search|VeloDB: Hybrid Search)) ((https://superlinked.com/vectorhub/articles/optimizing-rag-with-hybrid-search-reranking|Superlinked: Optimizing RAG))

===== When to Use =====

**Good candidates:**

  * Diverse queries mixing specific terms (e.g., "error code GFX-108") and concepts (e.g., "screen goes black")
  * Large-scale RAG with varied structured and unstructured data
  * Applications where both precision and semantic understanding matter

**Avoid for:**

  * Purely lexical workloads (ID lookups, exact-match only)
  * Purely semantic workloads (short creative text generation)

===== Best Practices =====

  * Start with alpha=0.5-0.7 (semantic bias) and tune via A/B testing on domain queries ((https://rellify.com/blog/hybrid-search-for-marketers|Rellify: Hybrid Search))
  * Fetch more candidates per method (e.g., top-50 each) before fusion to boost recall
  * Use RRF over weighted sums for score-scale invariance
  * Add a reranking stage after fusion for further gains
  * Monitor with Hit Rate and MRR; adapt weights dynamically via query classifiers
  * Consider query classification to route pure-keyword queries to BM25 and semantic queries to vector search

===== See Also =====

  * [[semantic_search|Semantic Search]]
  * [[retrieval_strategies|Retrieval Strategies]]
  * [[reranking|Reranking]]

===== References =====