Hybrid Search

Hybrid search combines keyword-based search (such as BM25) with semantic vector search to retrieve more comprehensive and relevant results. This fusion addresses the limitations of each individual method: keyword search misses synonyms and context, while vector search may overlook precise terms like product codes or entity names. 1)

How Hybrid Search Works

Hybrid search processes each query through two parallel pipelines: a keyword pipeline (such as BM25) and a semantic vector pipeline. 2)

Results from both pipelines are merged via fusion techniques into a single ranked list. In retrieval-augmented generation (RAG), the top items are then added to the LLM prompt as context for generation.
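As a rough illustration of the two pipelines, the sketch below ranks a toy corpus twice: once by token overlap (a crude stand-in for BM25) and once by cosine similarity. The documents, the 2-d "embeddings", and the function names are all made up for this example; a real system would use a search engine and an embedding model.

```python
import math
from collections import Counter

# Toy corpus; ids and texts are invented for illustration.
DOCS = {
    "d1": "replacement filter for vacuum model XR-200",
    "d2": "how to clean a hoover dust container",
    "d3": "XR-200 warranty and return policy",
}

# Hypothetical 2-d vectors standing in for real embeddings.
DOC_VECS = {"d1": (0.9, 0.1), "d2": (0.8, 0.3), "d3": (0.1, 0.9)}


def keyword_rank(query, docs):
    """Rank documents by number of shared lowercase tokens (not real BM25)."""
    q_tokens = Counter(query.lower().split())
    scores = {}
    for doc_id, text in docs.items():
        d_tokens = Counter(text.lower().split())
        overlap = sum(min(q_tokens[t], d_tokens[t]) for t in q_tokens)
        if overlap > 0:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


def vector_rank(query_vec, doc_vecs):
    """Rank documents by cosine similarity to the query vector."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    return sorted(doc_vecs, key=lambda d: (-cosine(query_vec, doc_vecs[d]), d))


keyword_hits = keyword_rank("XR-200 filter", DOCS)
vector_hits = vector_rank((1.0, 0.2), DOC_VECS)
print(keyword_hits)
print(vector_hits)
```

In this toy setup the keyword pipeline finds only the documents containing the exact product code, while the vector pipeline also surfaces the semantically close document that shares no tokens with the query. The two lists are what a fusion step would then merge.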

Reciprocal Rank Fusion (RRF)

RRF is the most common fusion algorithm for hybrid search. It aggregates reciprocal ranks from each retriever, avoiding the need for score normalization across different scoring scales. 3) 4)

The formula for a document d, summing over every retriever i:

RRF(d) = sum( 1 / (k + rank_i(d)) )

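Where rank_i(d) is the rank of document d in retriever i's result list (the first result has rank 1), and k is a smoothing constant (typically 60). A retriever that does not return d contributes nothing to the sum. Items are sorted by descending RRF score. The constant k dampens the advantage of top-ranked items, so that a single retriever's first-place result cannot dominate the fused list.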
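The formula can be sketched in a few lines of Python. This is a minimal illustration, assuming each retriever returns a list of document ids ordered best first; the function name and the sample ids are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids using RRF with 1-based ranks."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each retriever adds 1 / (k + rank) for the documents it returned.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Here doc_b ends up first because it ranks second and first in the two lists (1/62 + 1/61), narrowly ahead of doc_a, which ranks first and third (1/61 + 1/63). Documents seen by only one retriever fall below both.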

Weighted Score Combination

An alternative to RRF is a linear blend of normalized scores: 5)

score = alpha * score_vector + (1 - alpha) * score_keyword

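Where alpha ranges from 0 (pure keyword) to 1 (pure vector). Both scores must first be normalized to a common range, because BM25 scores are unbounded while similarity scores such as cosine are bounded. A reasonable starting point is alpha between 0.5 and 0.7 (biased toward semantic search), tuned afterwards on queries from the target domain.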
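A minimal sketch of the linear blend, assuming min-max normalization (one common choice among several) and treating a document missing from one retriever as scoring 0 there. The function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(vector_scores, keyword_scores, alpha=0.5):
    """Blend normalized scores: alpha * vector + (1 - alpha) * keyword."""
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    blended = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


vector_scores = {"doc_a": 0.82, "doc_b": 0.75, "doc_c": 0.40}   # e.g. cosine similarity
keyword_scores = {"doc_b": 12.0, "doc_c": 7.0, "doc_d": 2.0}    # e.g. BM25

ranked = weighted_fusion(vector_scores, keyword_scores, alpha=0.6)
print([doc_id for doc_id, _ in ranked])
```

Unlike RRF, this approach depends on the normalization scheme, so an outlier score in one retriever can compress the rest of that retriever's range.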

Database Implementations

Weaviate

Weaviate provides a hybrid search API with an alpha parameter controlling the balance between keyword (BM25) and vector search. Setting alpha=0 gives pure keyword, alpha=1 gives pure vector, and values in between blend both methods. 6)

Qdrant

Qdrant supports hybrid search by combining sparse vectors (BM25-like keyword representations) with dense vectors (semantic embeddings) in a single query. Results are fused using weighted scoring or RRF.

Elasticsearch

Elasticsearch combines kNN vector search with BM25 lexical search via the rrf retriever. Each method retrieves its top-k candidates independently (e.g., top-5 from each), then RRF merges and ranks the combined results. 7)
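As a hedged illustration, the helper below builds a search request body in the shape used by the rrf retriever in recent Elasticsearch 8.x releases. The helper name and the field names `content` and `embedding` are assumptions for this example, and the exact parameters should be checked against the documentation for the version in use.

```python
import json


def build_rrf_search_body(query_text, query_vector, top_k=5,
                          text_field="content", vector_field="embedding"):
    """Build a _search body combining BM25 and kNN through the rrf retriever.

    text_field and vector_field are hypothetical; match them to your index mapping.
    """
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical (BM25) leg.
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Dense vector (kNN) leg.
                    {"knn": {
                        "field": vector_field,
                        "query_vector": list(query_vector),
                        "k": top_k,
                        "num_candidates": top_k * 10,
                    }},
                ],
                # How many results from each leg take part in the fusion.
                "rank_window_size": top_k,
                # The k constant from the RRF formula.
                "rank_constant": 60,
            }
        },
        "size": top_k,
    }


body = build_rrf_search_body("wireless headphones", [0.12, -0.03, 0.88])
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a `_search` request; no client library is needed to construct it.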

Pinecone

Pinecone hybrid search uses pod indexes that store both sparse (keyword) and dense (semantic) vectors, with automatic fusion and tunable weights for balancing the two signals.
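One way such weights can be applied, assuming the index scores sparse-dense vectors by dot product, is to scale the query vectors on the client before sending them. The sketch below is illustrative only: the function name and the `indices`/`values` dictionary layout are assumptions for this example, not a description of Pinecone's client API.

```python
def scale_hybrid_query(dense, sparse, alpha):
    """Weight a dense/sparse query pair by alpha and (1 - alpha).

    dense:  list of floats (semantic embedding of the query)
    sparse: {"indices": [...], "values": [...]} (keyword representation)
    alpha:  1.0 means dense only, 0.0 means sparse only
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [alpha * v for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [(1 - alpha) * v for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


dense_q = [0.2, 0.4, 0.6]
sparse_q = {"indices": [10, 45], "values": [0.5, 1.5]}

hybrid_dense, hybrid_sparse = scale_hybrid_query(dense_q, sparse_q, alpha=0.75)
print(hybrid_dense, hybrid_sparse)
```

Because a dot product is linear, scaling the query this way is equivalent to computing alpha * dense_score + (1 - alpha) * sparse_score for each document.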

Milvus

Milvus supports hybrid ANN search across multiple vector fields, such as sparse (keyword-style) and dense (semantic) vectors, optionally combined with scalar filtering. Results are fused using RRF or weighted scoring.

Performance

Hybrid search is reported to outperform single-method search by 10-30% in recall and ranking-quality metrics (such as NDCG@10) on standard benchmarks like BEIR and MS MARCO. In RAG applications, it is reported to reduce retrieval errors by 20-40% compared to vector-only search, producing richer context that leads to fewer hallucinations. Actual gains depend on the corpus, the queries, and the retrievers being combined. 8) 9)
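For readers who want to measure such gains on their own data, NDCG@10 can be computed as sketched below. This is a simplified version that assumes graded relevance labels are available for the returned results and computes the ideal ordering from that same list; benchmark implementations usually derive the ideal ranking from all judged documents for the query.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels (0 = irrelevant, 3 = highly relevant), in ranked order.
vector_only = [1, 0, 3, 0, 2]
hybrid = [3, 2, 1, 0, 0]

print(round(ndcg_at_k(vector_only), 3))
print(round(ndcg_at_k(hybrid), 3))
```

A ranking that places the most relevant documents first scores 1.0; any other ordering of the same labels scores lower.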

When to Use

Good candidates:

Avoid for:

Best Practices

See Also

References

1) Meilisearch: Hybrid Search for RAG, https://www.meilisearch.com/blog/hybrid-search-rag
2) Elastic: What Is Hybrid Search, https://www.elastic.co/what-is/hybrid-search