Definition
Hybrid search runs two complementary retrieval strategies in parallel and merges their results: a sparse keyword search (typically BM25) and a dense vector search over embeddings. Dense and sparse retrievers fail in opposite ways, so combining them catches what either alone would miss. It is a foundational pattern for self-hosted retrieval pipelines where you want both precision and recall without depending on a single black-box ranker.
Why two retrievers are better than one
BM25 excels at exact-match queries: product codes, named entities, rare technical terms, and acronyms a vector model has never internalised. Dense vector search handles conceptual and paraphrased queries where the user's wording differs from the document's. A query like "BM1370 error code" benefits from BM25's literal matching, while "why does my hashboard run hot" benefits from semantic matching. Hybrid search gives you both.
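To make the exact-match behaviour concrete, here is a minimal sketch of BM25 scoring in plain Python. It assumes whitespace tokenisation, the commonly used parameter values k1 = 1.5 and b = 0.75, and two made-up documents. The function names and the sample text are illustrative only and do not come from any particular library.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokeniser; real systems also strip
    # punctuation and may apply stemming.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # a term absent from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical two-document corpus, invented for illustration.
docs = [
    "BM1370 error code reported by the control board",
    "thermal throttling occurs when airflow across the board is blocked",
]

literal = bm25_scores("BM1370 error code", docs)
paraphrase = bm25_scores("why does my hashboard run hot", docs)
```

With this toy corpus, the literal query scores only the first document, while the paraphrased query shares no tokens with either document and scores zero for both, even though the second document is clearly about overheating. That gap is the one a dense retriever is meant to cover.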
Fusing the results
Because the two retrievers produce scores on incompatible scales, results are usually merged with a rank-based method such as Reciprocal Rank Fusion (RRF), which ignores raw scores and combines documents by their position in each list. Run the two searches concurrently rather than sequentially: run one after the other, they cost the sum of both latencies, whereas run in parallel they cost roughly the slower of the two. A reranking pass can then refine the merged list.
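The fusion and concurrency steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: `sparse_search` and `dense_search` are hypothetical stand-ins that return fixed, made-up rankings, and the constant k = 60 is the value commonly used with RRF rather than anything this article prescribes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document ids, scoring each id by sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def sparse_search(query):
    # Hypothetical stand-in for a BM25 index lookup.
    return ["doc_a", "doc_b", "doc_c"]


def dense_search(query):
    # Hypothetical stand-in for a vector index lookup.
    return ["doc_b", "doc_d", "doc_a"]


def hybrid_search(query):
    # Submit both retrievers at once so total latency is roughly
    # that of the slower one, not the sum of the two.
    with ThreadPoolExecutor(max_workers=2) as pool:
        sparse_future = pool.submit(sparse_search, query)
        dense_future = pool.submit(dense_search, query)
        return reciprocal_rank_fusion(
            [sparse_future.result(), dense_future.result()]
        )


fused = hybrid_search("BM1370 error code")
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```

Documents that appear in both lists (`doc_b`, `doc_a`) rise above those found by only one retriever, and no score normalisation is needed because only ranks are used. Threads suit retrievers that spend their time waiting on I/O, such as calls to a local index server; CPU-bound retrievers running in the same process may need a different concurrency model.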
For sovereign operators, hybrid search is the practical default for a private knowledge base feeding a local LLM: it stays fully on your own hardware and avoids over-reliance on any one similarity metric. Pair it with good chunking and reranking for a robust on-premise RAG stack.
