Okapi BM25

Definition

Okapi BM25 ("Best Matching 25") is a probabilistic ranking function that scores how relevant a document is to a keyword query. Developed by Stephen Robertson, Karen Spärck Jones, and colleagues, and first implemented in the Okapi information retrieval system at City University London, it remains the default sparse-retrieval baseline in modern search engines and the keyword half of most hybrid search stacks. Unlike dense vector methods, BM25 matches literal terms, making it strong on exact keywords, product codes, and rare technical jargon.

The three ingredients

BM25 combines three signals. Term frequency (TF) rewards documents that use a query term more often, but with diminishing returns: a saturation function, controlled by the parameter k1, means the tenth occurrence adds far less than the second. Inverse document frequency (IDF) rewards rare terms, so a distinctive word like a chip part number counts more than a common stopword. Document-length normalisation, controlled by the parameter b, penalises documents that are longer than the collection average so they do not win simply by containing more words. Together these produce a relevance score that is more robust than plain TF-IDF with raw term counts, where repeating a term keeps raising the score without limit.
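The three signals can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bm25_scores`, the toy documents, and the whitespace tokenisation are assumptions made for the example, the defaults k1 = 1.2 and b = 0.75 are commonly used values rather than anything mandated, and the IDF shown is the always-non-negative variant used by several search libraries.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: above 1 for longer-than-average documents.
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # a term the document lacks contributes nothing
            # IDF: rare terms get a larger weight (non-negative variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating TF: grows with tf[term] but never exceeds k1 + 1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores


# Hypothetical toy collection; "x100" stands in for a rare part number.
docs = [
    "replace the x100 chip on the board".split(),
    "the chip the chip the chip".split(),
    "a guide to home heating".split(),
]
scores = bm25_scores(["x100", "chip"], docs)
```

In this toy collection the first document ranks highest because it contains the rare term "x100"; the second repeats "chip" three times but gains less from the repetition than from a rare match; the third shares no query term and scores zero.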

Where it fits

BM25 needs no model training, no GPU, and no embeddings; it runs on a plain inverted index, which makes it cheap and fully transparent for a self-hosted search box. Its blind spot is paraphrase: it cannot match "car" to "automobile" because it sees only tokens, not meaning.
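A small sketch makes the paraphrase blind spot concrete. It builds a plain inverted index (a mapping from each token to the documents that contain it) over two hypothetical documents; the helper name `build_inverted_index` and the lowercase whitespace tokenisation are assumptions for illustration only.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


# Hypothetical two-document collection.
docs = {1: "Used car for sale", 2: "Automobile repair manual"}
index = build_inverted_index(docs)

# Lookups are exact token matches: "car" never reaches document 2.
car_hits = index.get("car", set())
automobile_hits = index.get("automobile", set())
```

A query for "car" retrieves only document 1, even though document 2 is about the same concept, because the index stores tokens and nothing about their meaning.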

That weakness is exactly why operators pair it with vector-based semantic search and merge the two ranked lists with Reciprocal Rank Fusion, giving a private retrieval-augmented generation (RAG) pipeline both literal precision and conceptual recall.
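Reciprocal Rank Fusion can be sketched as follows: each document earns 1 / (k + rank) from every list it appears in, and the sums decide the final order. The function name, the document ids, and the two input rankings are hypothetical; k = 60 is the constant commonly used with this method, assumed here rather than taken from the text.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one fused ranking."""
    fused = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top result in a list earns 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical outputs of a BM25 search and a vector search for one query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Because only ranks are used, the method needs no calibration between BM25 scores and vector similarities, which live on unrelated scales. Documents that both retrievers rank well (here doc_b and doc_a) rise above those found by only one.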
