Reranking

Reranking is a precision step in retrieval pipelines that re-orders an initial set of candidate documents by how relevant each one actually is to the query. It sits between broad retrieval and answer generation, sharpening the results a fast first-stage search returned before they are handed to a language model. In retrieval-augmented generation, reranking is what stops the generator from being fed loosely related context — and loosely related context is where hallucinated answers with confident citations come from.

Two-stage retrieval

Production retrieval typically runs in two stages. Stage one retrieves broadly: a vector or hybrid search pulls perhaps 50 to 100 candidates, tuned for recall so that as little relevant material as possible is missed. Stage two ranks precisely: a reranker scores each candidate against the specific query and reorders them, and only the top handful survive into the prompt. The reason both stages exist is economic. The embedding models behind first-stage search are trained for scalable similarity: they compress every document into a fixed vector ahead of time, so search is cheap but coarse. Rerankers are the opposite: expensive per comparison but far more discriminating. A cheap broad sweep followed by an expensive precise pass over the shortlist is the efficient design, the same funnel logic as any triage process.
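
The funnel described above can be sketched in a few lines. This is a minimal illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, and `pair_score` (counting query bigrams found in the document) stands in for a real reranker. Both are hypothetical toys chosen so the example runs with the standard library alone, and the corpus is made up.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def first_stage(query: str, index: dict, k: int) -> list:
    """Recall pass: one cheap comparison per document, against vectors built ahead of time."""
    q = embed(query)
    return sorted(index, key=lambda doc: cosine(q, index[doc]), reverse=True)[:k]

def pair_score(query: str, doc: str) -> int:
    """Toy stand-in for a reranker: reads query and document together and
    counts query bigrams that occur in the document, so word order matters."""
    q, d = query.lower().split(), doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams)

def second_stage(query: str, candidates: list, top_n: int) -> list:
    """Precision pass: one scoring call per shortlisted candidate, nothing precomputed."""
    return sorted(candidates, key=lambda doc: pair_score(query, doc), reverse=True)[:top_n]

# Made-up corpus for illustration only.
corpus = [
    "error 42 on hashboard revision b means the temperature sensor failed "
    "reseat the sensor cable and retest before replacing the board",
    "error 41 on hashboard revision b means the voltage regulator failed",
    "error 42 on hashboard revision a means the fan controller failed",
    "firmware update steps for hashboard revision b",
    "how to clean dust from the intake fans",
]
index = {doc: embed(doc) for doc in corpus}  # built once, before any query arrives

query = "error 42 revision b"
candidates = first_stage(query, index, k=4)           # broad and cheap
shortlist = second_stage(query, candidates, top_n=2)  # narrow and expensive

print(candidates.index(corpus[0]))  # where the target passage sits after stage one
print(shortlist[0] == corpus[0])    # whether stage two moved it to the top
```

In this toy setup the long target passage is diluted in the first-stage similarity score and lands low in the candidate pool, while the pair scorer promotes it. A real embedding model and a real cross-encoder would behave differently in detail, but the shape of the pipeline is the same.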

Cross-encoders, and why they see more

The classic reranker is a cross-encoder: it feeds the query and a candidate document into the model together, so every query token can attend to every document token, and outputs a single relevance score. This joint processing captures things a pair of static vectors cannot: negation, exact entity matches, whether the document answers the question or merely mentions its keywords. That is why cross-encoders typically lift ranking accuracy well above bi-encoder retrieval, the first-stage approach that embeds query and document separately. The cost is that nothing can be precomputed: the model runs once per query-candidate pair, which is exactly why reranking is applied only to a shortlist and never to the whole corpus. Listwise and LLM-based rerankers exist as heavier alternatives, but the cross-encoder remains the workhorse. Evaluation follows standard ranking metrics such as recall@k, mean reciprocal rank (MRR), and nDCG, which all measure how reliably the truly relevant document lands in the top handful of results. Even a small labeled test set of your own queries is usually a better guide than a published benchmark for judging a reranker on your corpus.
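
To make the evaluation step concrete, here is a small sketch of two common ranking metrics, recall@k and mean reciprocal rank, computed over a hand-labeled test set. The document ids and the "before" and "after" rankings are invented for illustration; in practice they would come from running your own queries with and without the reranker.

```python
def recall_at_k(ranked_ids: list, relevant_ids: set, k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

def reciprocal_rank(ranked_ids: list, relevant_ids: set) -> float:
    """1 / position of the first relevant document, or 0.0 if none was returned."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0

def evaluate(runs: list, k: int) -> dict:
    """runs holds one (ranked_ids, relevant_ids) pair per labeled test query."""
    n = len(runs)
    return {
        "recall_at_k": sum(recall_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }

# Hypothetical labeled set: two queries, one relevant document each.
before_rerank = [(["d7", "d3", "d1"], {"d1"}), (["d2", "d9", "d4"], {"d2"})]
after_rerank = [(["d1", "d7", "d3"], {"d1"}), (["d2", "d4", "d9"], {"d2"})]

print(evaluate(before_rerank, k=1))  # first-stage ordering
print(evaluate(after_rerank, k=1))   # reranked ordering
```

Comparing the two dictionaries on your own labeled queries gives a direct read on whether a given reranker helps on your corpus.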

When reranking earns its keep

Reranking matters most when the corpus is dense with near-duplicates and the cost of a wrong passage is high — technical documentation being the canonical case. Consider a repair-bench knowledge base full of ASIC troubleshooting notes: a query about one error code on one hashboard revision will pull dozens of superficially similar passages about neighboring codes and sibling models. First-stage semantic search gets them all in the candidate pool; the reranker is what puts the passage about your board revision at position one instead of position eleven. Fixed context windows make this ordering decisive — whatever ranks below the cutoff simply does not exist as far as the model is concerned.
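
The cutoff effect can be shown with a short sketch that packs passages into a fixed budget in rank order. It assumes a crude whitespace word count as the token counter, which a real tokenizer would replace, and the passages and the budget of 20 are made up for illustration.

```python
def pack_context(ranked_passages: list, token_budget: int, count_tokens=None) -> list:
    """Keep passages in rank order until the budget is spent; the rest are dropped."""
    if count_tokens is None:
        # Assumption: whitespace word count as a rough proxy for model tokens.
        count_tokens = lambda text: len(text.split())
    kept, used = [], 0
    for passage in ranked_passages:
        cost = count_tokens(passage)
        if used + cost > token_budget:
            break  # hard cutoff: lower-ranked passages never reach the model
        kept.append(passage)
        used += cost
    return kept

# Hypothetical passages, ordered as a good reranker might return them.
ranked = [
    "error 42 on revision b: reseat the temperature sensor cable",
    "error 41 on revision b: check the voltage regulator",
    "error 42 on revision a: replace the fan controller",
]

well_ordered = pack_context(ranked, token_budget=20)
badly_ordered = pack_context(list(reversed(ranked)), token_budget=20)

print(ranked[0] in well_ordered)   # target passage survives when ranked first
print(ranked[0] in badly_ordered)  # same passage is cut when ranked last
```

The same three passages and the same budget produce a prompt with or without the right answer, depending only on order.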

Running it locally

Open-weight cross-encoder rerankers are small by LLM standards, typically well under a billion parameters, and run comfortably on CPU or a modest GPU. The latency this adds per query is usually small next to the accuracy gain, because only the shortlist is scored. That makes reranking one of the cheapest quality upgrades available to a self-hosted stack: your documents, your embeddings, your reranker, and your generator all on your own hardware, with nothing leaving the machine. D-Central documents reranking as the quality multiplier on top of semantic search; running both stages locally keeps a private knowledge base fully sovereign while still delivering retrieval precise enough to trust.
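
As one possible way to wire this up locally, the sketch below wraps a cross-encoder behind a plain scoring function and times a rerank pass. It assumes the third-party `sentence-transformers` package and its `CrossEncoder` class; neither is named in the text above, and any open-weight reranker with a score-the-pairs interface would fit the same slot. `load_local_scorer`, `timed_rerank`, and `fake_scorer` are hypothetical helper names. The demo at the bottom uses the fake scorer so it runs without downloading a model.

```python
import time
from typing import Callable, List, Sequence, Tuple

PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]

def load_local_scorer(model_path: str, device: str = "cpu") -> PairScorer:
    """Wrap a cross-encoder as a function from (query, doc) pairs to scores.

    model_path can be a local directory of weights, or a hub model name such as
    "cross-encoder/ms-marco-MiniLM-L-6-v2" (an example, not taken from the article).
    """
    # Imported lazily so the rest of this sketch works without the optional dependency.
    from sentence_transformers import CrossEncoder
    model = CrossEncoder(model_path, device=device)

    def score(pairs: List[Tuple[str, str]]) -> List[float]:
        return [float(s) for s in model.predict(pairs)]

    return score

def timed_rerank(query: str, docs: List[str], score_pairs: PairScorer, top_n: int):
    """Score every (query, doc) pair, return the top_n docs and the elapsed milliseconds."""
    start = time.perf_counter()
    scores = score_pairs([(query, doc) for doc in docs])
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order[:top_n]], elapsed_ms

def fake_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in scorer (shared-word count) so the demo runs offline."""
    return [float(len(set(q.split()) & set(d.split()))) for q, d in pairs]

docs = [
    "fan controller replacement guide",
    "error 42 temperature sensor fault",
    "error 41 voltage fault",
]
top, elapsed_ms = timed_rerank("error 42 sensor", docs, fake_scorer, top_n=2)
print(top, f"{elapsed_ms:.2f} ms")
```

Swapping `fake_scorer` for `load_local_scorer(...)` on a shortlist of realistic size gives a measured latency figure for your own hardware rather than an estimate.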

Cost a rerank pass in the inference cost calculator.
