Hybrid Search

Hybrid search combines keyword-based search (such as BM25) with semantic vector search to retrieve more comprehensive and relevant results. This fusion addresses the limitations of each individual method: keyword search misses synonyms and context, while vector search may overlook precise terms like product codes or entity names. ¹⁾

How Hybrid Search Works

Hybrid search processes queries through two parallel pipelines: ²⁾

- Keyword search (sparse): BM25 or similar lexical methods match exact words and phrases, excelling at specific terms, entity names, and codes
- Semantic search (dense): Embedding models find conceptually similar content, handling paraphrased or rephrased queries

Results from both pipelines are merged via fusion techniques into a unified ranked list. In a retrieval-augmented generation (RAG) setting, the top-ranked items are then added to the LLM prompt as context for generation.
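
The two pipelines described above can be illustrated with a minimal sketch. It assumes a toy corpus, hand-written 3-dimensional vectors in place of real embeddings, and a simple term-overlap count standing in for BM25; the names DOCS, keyword_rank, and vector_rank are hypothetical. The point is only that the two retrievers can rank the same documents differently, which is why a fusion step is needed.

```python
import math

# Toy corpus: doc id -> (text, embedding). The vectors are invented for illustration.
DOCS = {
    "d1": ("Error code GFX-108 indicates a display driver fault", [0.9, 0.1, 0.0]),
    "d2": ("The screen goes black when the GPU overheats", [0.8, 0.3, 0.1]),
    "d3": ("How to reset your account password", [0.0, 0.2, 0.9]),
}


def keyword_rank(query):
    """Rank documents by the number of shared terms (a crude stand-in for BM25)."""
    query_terms = set(query.lower().split())
    scores = {
        doc_id: len(query_terms & set(text.lower().split()))
        for doc_id, (text, _) in DOCS.items()
    }
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_rank(query_embedding):
    """Rank documents by cosine similarity to the query embedding."""
    scores = {
        doc_id: cosine(query_embedding, embedding)
        for doc_id, (_, embedding) in DOCS.items()
    }
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# The query embedding is also invented; a real system would call an embedding model.
keyword_results = keyword_rank("error code GFX-108")
vector_results = vector_rank([0.7, 0.4, 0.1])
print("keyword:", keyword_results)
print("vector: ", vector_results)
```

With these made-up inputs the keyword pipeline puts the exact-match document d1 first, while the vector pipeline prefers d2.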

Reciprocal Rank Fusion (RRF)

RRF is the most common fusion algorithm for hybrid search. It aggregates reciprocal ranks from each retriever, avoiding the need for score normalization across different scoring scales. ³⁾ ⁴⁾

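The RRF score of a document d is a sum over all retrievers i:

RRF(d) = sum_i( 1 / (k + rank_i(d)) )

Where rank_i(d) is the 1-based rank of document d in the results of retriever i, and k is a smoothing constant (typically 60). A document that is missing from a retriever's results contributes nothing for that retriever. Items are sorted by descending RRF score. The constant k flattens the score differences among top-ranked items, so a single high rank from one retriever cannot dominate the fused list.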
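
The formula translates almost directly into code. The following is a minimal sketch, assuming each retriever returns an ordered list of document ids (best first); the function name rrf_fuse and the sample ids are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    Returns (doc_id, score) pairs sorted by descending RRF score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; documents absent from a list add nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_ranking = ["d1", "d3", "d2"]
vector_ranking = ["d2", "d1", "d4"]
fused = rrf_fuse([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here d1 (ranks 1 and 2) narrowly beats d2 (ranks 3 and 1), and both outrank d3 and d4, which each appear in only one list.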

Weighted Score Combination

An alternative to RRF is a linear blend of normalized scores: ⁵⁾

score = alpha * score_vector + (1 - alpha) * score_keyword

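Where alpha ranges from 0 (pure keyword) to 1 (pure vector). Because the two score types live on different scales, both must be normalized to a common range before blending. A reasonable starting point is alpha between 0.5 and 0.7 (a slight semantic bias), tuned afterwards on domain queries.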
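
As a rough illustration of the linear blend, the sketch below assumes min-max normalization (one of several possible choices) and treats a document missing from one retriever as scoring 0 there. The function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores equal: treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def weighted_fuse(vector_scores, keyword_scores, alpha=0.6):
    """Blend normalized scores: alpha * vector + (1 - alpha) * keyword."""
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    # Sort by descending score, breaking ties by doc id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Invented raw scores: cosine similarities and BM25-style values.
vector_scores = {"d1": 0.82, "d2": 0.91, "d3": 0.40}
keyword_scores = {"d1": 12.5, "d3": 7.0, "d4": 3.0}

blended = weighted_fuse(vector_scores, keyword_scores, alpha=0.6)
print([doc_id for doc_id, _ in blended])
```

Setting alpha=1.0 reproduces the vector-only ordering and alpha=0.0 the keyword-only ordering, which is a quick sanity check when tuning.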

Database Implementations

Weaviate

Weaviate provides a hybrid search API with an alpha parameter controlling the balance between keyword (BM25) and vector search. Setting alpha=0 gives pure keyword, alpha=1 gives pure vector, and values in between blend both methods. ⁶⁾

Qdrant

Qdrant supports hybrid search by combining sparse vectors (BM25-like keyword representations) with dense vectors (semantic embeddings) in a single query. Results are fused using weighted scoring or RRF.

Elasticsearch

Elasticsearch combines kNN vector search with BM25 lexical search via the rrf retriever. Each method retrieves its top-k candidates independently (e.g., top-5 from each), then RRF merges and ranks the combined results. ⁷⁾
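
A request of this shape can be sketched as a Python dictionary that is later sent as the body of a search call. The field names content and embedding, the helper name build_rrf_search_body, and the choice of num_candidates are assumptions for illustration; the retriever keys (rrf, standard, knn, rank_window_size, rank_constant) follow the retriever API as commonly documented, but should be checked against the Elasticsearch version in use.

```python
import json


def build_rrf_search_body(query_text, query_vector, text_field="content",
                          vector_field="embedding", candidates_per_method=5,
                          rank_constant=60):
    """Build a search body that fuses a BM25 match query and a kNN query with RRF."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical (BM25) leg
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Dense vector (kNN) leg
                    {
                        "knn": {
                            "field": vector_field,
                            "query_vector": query_vector,
                            "k": candidates_per_method,
                            # Assumed heuristic: examine 10x more candidates per shard.
                            "num_candidates": candidates_per_method * 10,
                        }
                    },
                ],
                # How many results from each leg take part in the fusion
                "rank_window_size": candidates_per_method,
                # The constant k from the RRF formula
                "rank_constant": rank_constant,
            }
        },
        "size": candidates_per_method,
    }


body = build_rrf_search_body("error code GFX-108", [0.7, 0.4, 0.1])
print(json.dumps(body, indent=2))
```

The dictionary is plain data, so it can be inspected or logged before being passed to whichever client is used.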

Pinecone

Pinecone hybrid search uses pod indexes that store both sparse (keyword) and dense (semantic) vectors, with automatic fusion and tunable weights for balancing the two signals.

Milvus

Milvus supports hybrid ANN search that blends scalar filtering (keyword-based) with vector similarity, using RRF or weighted scoring for result fusion.

Performance

Hybrid search is reported to outperform single-method search by 10-30% on retrieval-quality metrics such as recall and NDCG@10 on standard benchmarks like BEIR and MS MARCO. In RAG applications, it is reported to reduce retrieval errors by 20-40% compared to vector-only search, producing richer context that leads to fewer hallucinations. ⁸⁾ ⁹⁾
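
For readers unfamiliar with the metric, NDCG@k can be computed as in the sketch below. It assumes graded relevance labels listed in ranked order, uses the linear-gain variant of DCG (some evaluations use 2^rel - 1 instead), and computes the ideal ordering only from the labels supplied. The function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels of returned documents, in ranked order (invented values).
perfect_order = [3, 2, 1, 0]
swapped_order = [1, 3, 2, 0]
print(ndcg_at_k(perfect_order), ndcg_at_k(swapped_order))
```

A ranking that already lists documents in descending relevance scores 1.0; any other ordering of the same labels scores lower.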

When to Use

Good candidates:

- Diverse queries mixing specific terms (e.g., “error code GFX-108”) and concepts (e.g., “screen goes black”)
- Large-scale RAG with varied structured and unstructured data
- Applications where both precision and semantic understanding matter

Avoid for:

- Purely lexical workloads (ID lookups, exact-match only)
- Purely semantic workloads (short creative text generation)
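
The distinction between lexical, semantic, and mixed queries can be approximated per query with a simple heuristic. The sketch below is hypothetical: it treats any token containing both letters and digits (such as “GFX-108”) as an identifier and picks an alpha accordingly. The thresholds 0.0, 0.3, and 0.7 are arbitrary illustrative values, not recommendations from the sources cited here.

```python
def looks_like_identifier(token):
    """True for tokens mixing letters and digits, e.g. 'GFX-108' or 'SKU12345'."""
    has_digit = any(ch.isdigit() for ch in token)
    has_alpha = any(ch.isalpha() for ch in token)
    return has_digit and has_alpha


def choose_alpha(query):
    """Pick a keyword/vector balance (0 = pure keyword, 1 = pure vector)."""
    tokens = query.split()
    if not tokens:
        return 0.5  # nothing to go on; stay neutral
    identifier_count = sum(looks_like_identifier(t) for t in tokens)
    if identifier_count == len(tokens):
        return 0.0  # pure ID lookup: keyword search only
    if identifier_count > 0:
        return 0.3  # mixed query: lean toward keyword matching
    return 0.7      # conceptual query: lean toward semantic search


for q in ["GFX-108", "error code GFX-108", "screen goes black"]:
    print(q, "->", choose_alpha(q))
```

A production classifier would more likely be trained on logged queries, but even a rule like this makes it possible to test whether per-query weighting helps.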

Best Practices

- Start with alpha between 0.5 and 0.7 (a slight semantic bias) and tune via A/B testing on domain queries ¹⁰⁾
- Fetch more candidates per method (e.g., top-50 each) before fusion to boost recall
- Prefer RRF over weighted sums when retriever scores are on different scales, since RRF depends only on ranks
- Add a reranking stage after fusion for further gains
- Monitor retrieval quality with Hit Rate and MRR (mean reciprocal rank)
- Use query classification to adapt weights dynamically, or to route pure-keyword queries to BM25 and purely semantic queries to vector search
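
The two monitoring metrics named in this list are straightforward to compute from logged results. The following is a minimal sketch, assuming one ranked list of doc ids per query and a set of known relevant ids for each; the function names and sample data are hypothetical.

```python
def hit_rate_at_k(results, relevant, k=10):
    """Fraction of queries with at least one relevant doc in the top k."""
    hits = sum(
        1
        for ranked, rel in zip(results, relevant)
        if any(doc_id in rel for doc_id in ranked[:k])
    )
    return hits / len(results)


def mean_reciprocal_rank(results, relevant):
    """Average of 1/rank of the first relevant doc per query (0 if none found)."""
    total = 0.0
    for ranked, rel in zip(results, relevant):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(results)


# Three queries: relevant doc at rank 1, at rank 3, and not retrieved at all.
results = [["d1", "d2", "d3"], ["d4", "d5", "d6"], ["d7", "d8", "d9"]]
relevant = [{"d1"}, {"d6"}, {"d0"}]

print("hit rate@3:", hit_rate_at_k(results, relevant, k=3))
print("MRR:", mean_reciprocal_rank(results, relevant))
```

Tracking both is useful because Hit Rate only records whether a relevant document appeared, while MRR also rewards placing it near the top.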