====== Hybrid Search ======

Hybrid search combines keyword-based search (such as BM25) with semantic vector search to retrieve more comprehensive and relevant results. This fusion addresses the limitations of each individual method: keyword search misses synonyms and context, while vector search may overlook precise terms like product codes or entity names. ((https://www.meilisearch.com/blog/hybrid-search-rag|Meilisearch: Hybrid Search for RAG))

===== How Hybrid Search Works =====

Hybrid search processes queries through two parallel pipelines: ((https://www.elastic.co/what-is/hybrid-search|Elastic: What Is Hybrid Search))

  * **Keyword search (sparse)**: BM25 or similar lexical methods match exact words and phrases, excelling at specific terms, entity names, and codes
  * **Semantic search (dense)**: Embedding models find conceptually similar content, handling paraphrased or rephrased queries

Results from both pipelines are merged via fusion techniques into a single ranked list. In a RAG pipeline, the top-ranked items are then added to the LLM prompt as context for generation.
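The two pipelines can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three documents, the hand-made two-dimensional "embeddings", and the token-overlap scorer, which is only a crude stand-in for BM25. The point is that the two retrievers can return different candidates for the same query.

```python
import math
from collections import Counter

# Toy corpus and hand-made 2-d "embeddings" (illustrative values, not model output).
DOCS = {
    "d1": "error code GFX-108 display driver crash",
    "d2": "monitor turns dark after waking from sleep",
    "d3": "how to update the display driver",
}
EMBEDDINGS = {"d1": [0.9, 0.1], "d2": [0.2, 0.9], "d3": [0.7, 0.3]}


def keyword_rank(query, docs):
    """Sparse pipeline: rank by shared lowercase tokens (crude stand-in for BM25)."""
    q = Counter(query.lower().split())
    scores = {
        doc_id: sum((Counter(text.lower().split()) & q).values())
        for doc_id, text in docs.items()
    }
    hits = [doc_id for doc_id, s in scores.items() if s > 0]
    return sorted(hits, key=lambda d: scores[d], reverse=True)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def vector_rank(query_vec, embeddings, top_k=2):
    """Dense pipeline: rank by cosine similarity to the query embedding."""
    ranked = sorted(embeddings, key=lambda d: cosine(query_vec, embeddings[d]), reverse=True)
    return ranked[:top_k]


query = "screen goes black GFX-108"
query_vec = [0.1, 0.95]  # hypothetical embedding of the query

sparse_hits = keyword_rank(query, DOCS)          # finds the exact code match
dense_hits = vector_rank(query_vec, EMBEDDINGS)  # finds the paraphrase
print(sparse_hits, dense_hits)
```

In this toy setup the keyword pipeline only finds the document containing the literal code, while the dense pipeline surfaces the paraphrased symptom description; a fusion step then has to reconcile the two lists.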

===== Reciprocal Rank Fusion (RRF) =====

RRF is the most common fusion algorithm for hybrid search. It aggregates reciprocal ranks from each retriever, avoiding the need for score normalization across different scoring scales. ((https://www.elastic.co/what-is/hybrid-search|Elastic: Hybrid Search)) ((https://careers.edicomgroup.com/techblog/llm-rag-improving-the-retrieval-phase-with-hybrid-search/|EDICOM: Hybrid Search for RAG))

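For a document d, the RRF score is a sum over all retrievers i:

  RRF(d) = sum_i( 1 / (k + rank_i(d)) )

Where rank_i(d) is the 1-based rank of document d in the results of retriever i, and k is a smoothing constant (typically 60). A retriever that does not return d contributes nothing to the sum. Items are sorted by descending RRF score. The constant k flattens the score difference between adjacent top ranks, so a document ranked first by only one retriever does not automatically outrank documents that several retrievers place near the top.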
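A minimal sketch of the formula above, assuming each retriever returns an ordered list of document IDs with the best match first. The function name and the sample rankings are illustrative, not taken from any particular library.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs with RRF.

    rankings: list of lists, each ordered best-first.
    Returns (doc_id, score) pairs sorted by descending RRF score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; documents absent from a list add nothing.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


bm25_ranking = ["d1", "d3", "d2"]    # hypothetical keyword results
vector_ranking = ["d2", "d1", "d4"]  # hypothetical vector results

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here d1 (ranks 1 and 2) scores 1/61 + 1/62 and narrowly beats d2 (ranks 3 and 1) at 1/63 + 1/61, while documents returned by only one retriever fall behind both.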

===== Weighted Score Combination =====

An alternative to RRF is a linear blend of normalized scores: ((https://rellify.com/blog/hybrid-search-for-marketers|Rellify: Hybrid Search))

  score = alpha * score_vector + (1 - alpha) * score_keyword

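Where alpha ranges from 0 (pure keyword) to 1 (pure vector). Both scores must first be normalized to a common range (for example, min-max scaling to [0, 1]), because raw BM25 scores are unbounded while similarity scores such as cosine are bounded. A common starting point is an alpha between 0.5 and 0.7 (semantic-biased), tuned on domain queries.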
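The blend can be sketched as follows, assuming min-max normalization per retriever and treating a document missing from one retriever as scoring 0 there. Both choices, along with the function names and sample scores, are assumptions for illustration; other normalizations are possible.

```python
def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_blend(vector_scores, keyword_scores, alpha=0.5):
    """score = alpha * score_vector + (1 - alpha) * score_keyword."""
    v = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    blended = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(v) | set(kw)
    }
    # Sort by descending score, breaking ties by document ID for stable output.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))


vector_scores = {"d1": 0.82, "d2": 0.91, "d3": 0.40}  # e.g. cosine similarities
keyword_scores = {"d1": 12.0, "d3": 7.5}              # e.g. raw BM25 scores

for doc_id, score in weighted_blend(vector_scores, keyword_scores, alpha=0.6):
    print(f"{doc_id}: {score:.3f}")
```

Note that min-max scaling maps the lowest-scoring document of each retriever to 0, so the outcome is sensitive to which candidates happen to be in each list; this sensitivity is one reason rank-based fusion is often preferred.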

===== Database Implementations =====

==== Weaviate ====

Weaviate provides a hybrid search API with an alpha parameter controlling the balance between keyword (BM25) and vector search. Setting alpha=0 gives pure keyword, alpha=1 gives pure vector, and values in between blend both methods. ((https://www.meilisearch.com/blog/hybrid-search-rag|Meilisearch: Hybrid Search))

==== Qdrant ====

Qdrant supports hybrid search by combining sparse vectors (BM25-like keyword representations) with dense vectors (semantic embeddings) in a single query. Results are fused using weighted scoring or RRF.

==== Elasticsearch ====

Elasticsearch combines kNN vector search with BM25 lexical search via the rrf retriever. Each method retrieves its top-k candidates independently (e.g., top-5 from each), then RRF merges and ranks the combined results. ((https://www.elastic.co/what-is/hybrid-search|Elastic: Hybrid Search))
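As a rough sketch, a request body for the rrf retriever can be assembled as a plain Python dictionary. The index field names (''content'', ''embedding'') and the helper function are hypothetical, and the key names reflect the retriever syntax as commonly documented for recent Elasticsearch 8.x releases; check them against the documentation for the version in use.

```python
import json


def build_rrf_search_body(query_text, query_vector,
                          text_field="content", vector_field="embedding",
                          window=50, rank_constant=60):
    """Build a search body that fuses a BM25 match query and a kNN query with RRF.

    text_field and vector_field are hypothetical index field names.
    """
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    # Lexical (BM25) leg
                    {"standard": {"query": {"match": {text_field: query_text}}}},
                    # Dense vector (kNN) leg
                    {"knn": {
                        "field": vector_field,
                        "query_vector": query_vector,
                        "k": window,
                        "num_candidates": window * 2,
                    }},
                ],
                # Number of candidates each leg contributes to the fusion
                "rank_window_size": window,
                # The k constant from the RRF formula
                "rank_constant": rank_constant,
            }
        }
    }


body = build_rrf_search_body("error code GFX-108", [0.12, 0.48, 0.33])
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the body of a search request; building it separately keeps the sketch runnable without a cluster.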

==== Pinecone ====

Pinecone hybrid search stores sparse (keyword) and dense (semantic) vectors together in a single index and scores them jointly. The balance between the two signals is tuned by weighting the sparse and dense parts of the query.
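One common way to tune the weights is to scale the query vectors on the client before sending them, so that a single dot-product score already reflects the desired balance. The sketch below assumes a dot-product index and a sparse vector represented as a dictionary with ''indices'' and ''values''; the function name is hypothetical and no Pinecone client call is shown.

```python
def weight_hybrid_query(dense, sparse, alpha):
    """Scale a dense/sparse query pair by a convex weight.

    alpha = 1.0 keeps only the dense (semantic) signal,
    alpha = 0.0 keeps only the sparse (keyword) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


dense_query = [1.0, 2.0]                                # hypothetical embedding
sparse_query = {"indices": [3, 7], "values": [0.5, 1.0]}  # hypothetical term weights

dense_w, sparse_w = weight_hybrid_query(dense_query, sparse_query, alpha=0.75)
print(dense_w, sparse_w)
```

Because a dot product is linear, scaling the query side alone is enough; the stored document vectors do not need to be re-weighted when alpha changes.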

==== Milvus ====

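Milvus supports hybrid search across multiple vector fields, for example a dense embedding field and a sparse (keyword-style) vector field, and fuses the per-field results with RRF or weighted scoring. Scalar filtering can be applied alongside the vector search to narrow the candidate set.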

===== Performance =====

The cited sources report that hybrid search outperforms single-method search by 10-30% on ranking quality metrics such as NDCG@10 on standard benchmarks like BEIR and MS MARCO. In RAG applications, they report 20-40% fewer retrieval errors compared to vector-only search, producing richer context that leads to fewer hallucinations. Actual gains depend on the corpus and query mix, so these figures are best treated as indicative. ((https://www.velodb.io/glossary/what-is-hybrid-search|VeloDB: Hybrid Search)) ((https://superlinked.com/vectorhub/articles/optimizing-rag-with-hybrid-search-reranking|Superlinked: Optimizing RAG))

===== When to Use =====

**Good candidates:**
  * Diverse queries mixing specific terms (e.g., "error code GFX-108") and concepts (e.g., "screen goes black")
  * Large-scale RAG with varied structured and unstructured data
  * Applications where both precision and semantic understanding matter

**Avoid for:**
  * Purely lexical workloads (ID lookups, exact-match only)
  * Purely semantic workloads (short creative text generation)
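The distinction between lexical and semantic queries can also be applied per query. The following is a deliberately simple heuristic sketch, not an established method: the regular expression for identifier-like tokens (such as "GFX-108") and the alpha values it returns are assumptions that would need tuning on real traffic.

```python
import re

# Hypothetical pattern for identifier-like tokens: two or more uppercase
# letters, an optional hyphen, then digits (e.g. "GFX-108", "SKU12345").
IDENTIFIER = re.compile(r"\b[A-Z]{2,}-?\d+\b")


def choose_alpha(query, default=0.6):
    """Pick a keyword/vector balance from the shape of the query.

    Returns 0.0 (pure keyword) for a bare identifier lookup, a keyword-leaning
    value when an identifier appears among other words, and the default
    semantic-leaning value otherwise.
    """
    tokens = query.split()
    if not tokens:
        return default
    id_tokens = IDENTIFIER.findall(query)
    if id_tokens and len(id_tokens) == len(tokens):
        return 0.0  # ID lookup: hybrid search adds little here
    if id_tokens:
        return 0.3  # mixed query: lean towards exact matching
    return default


for q in ["GFX-108", "error code GFX-108", "screen goes black"]:
    print(q, "->", choose_alpha(q))
```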

===== Best Practices =====

  * Start with alpha=0.5-0.7 (semantic bias) and tune via A/B testing on domain queries ((https://rellify.com/blog/hybrid-search-for-marketers|Rellify: Hybrid Search))
  * Fetch more candidates per method (e.g., top-50 each) before fusion to boost recall
  * Prefer RRF over weighted sums when retriever scores are on different scales, since RRF uses only ranks and needs no normalization
  * Add a reranking stage after fusion for further gains
  * Monitor retrieval quality with Hit Rate and MRR on a set of labeled domain queries
  * Consider query classification to adapt weights per query, or to route pure-keyword queries to BM25 and purely semantic queries to vector search
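For monitoring, Hit Rate and MRR can be computed from a small set of labeled queries. The sketch below assumes retrieval results and relevance labels are available as dictionaries keyed by a query ID; the data shown is made up for illustration.

```python
def hit_rate_at_k(results, relevant, k=10):
    """Fraction of queries with at least one relevant document in the top k."""
    if not results:
        return 0.0
    hits = sum(
        1 for qid, ranked in results.items()
        if set(ranked[:k]) & relevant.get(qid, set())
    )
    return hits / len(results)


def mean_reciprocal_rank(results, relevant):
    """Average of 1 / rank of the first relevant document (0 if none is found)."""
    if not results:
        return 0.0
    total = 0.0
    for qid, ranked in results.items():
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in relevant.get(qid, set()):
                total += 1.0 / rank
                break
    return total / len(results)


# Hypothetical fused results and relevance labels for three queries.
results = {
    "q1": ["d3", "d1", "d7"],
    "q2": ["d4", "d5", "d2"],
    "q3": ["d9", "d8", "d6"],
}
relevant = {"q1": {"d1"}, "q2": {"d4"}, "q3": {"d0"}}

print("Hit Rate@3:", hit_rate_at_k(results, relevant, k=3))
print("MRR:", mean_reciprocal_rank(results, relevant))
```

Running the same evaluation for keyword-only, vector-only, and fused results on the same queries gives a direct comparison when tuning alpha or the number of candidates per method.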

===== See Also =====

  * [[semantic_search|Semantic Search]]
  * [[retrieval_strategies|Retrieval Strategies]]
  * [[reranking|Reranking]]

===== References =====