Best-of-N Sampling

Sovereign AI

Definition

Best-of-N (BoN) sampling is an inference-time alignment method: instead of returning the first response a model produces, you sample N candidates and use a reward model to pick the highest-scoring one. It is a simple, training-free way to raise output quality by spending more compute at generation time.
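The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `best_of_n`, `generate`, and `reward` are hypothetical names, and the toy generator and toy reward function below stand in for a real language model and a real reward model.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int,
) -> Tuple[str, float]:
    """Sample n candidates and return the highest-scoring one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Draw n independent samples from the (unchanged) base model.
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    # Score every candidate with the reward model.
    scores = [reward(prompt, c) for c in candidates]
    # Keep the candidate with the highest score (first one wins on ties).
    best_index = max(range(n), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]


# Toy stand-ins (assumptions for illustration only).
_REPLIES = ["ok", "a short answer", "a longer, more detailed answer"]
_rng = random.Random(0)


def toy_generate(prompt: str) -> str:
    """Pretend model: returns one canned reply at random."""
    return _rng.choice(_REPLIES)


def toy_reward(prompt: str, response: str) -> float:
    """Pretend reward model: simply prefers longer replies."""
    return float(len(response))


if __name__ == "__main__":
    response, score = best_of_n("Explain best-of-N.", toy_generate, toy_reward, n=8)
    print(response, score)
```

With N = 1 the function reduces to ordinary sampling, so N acts as the single knob that trades extra generation and scoring calls for a better-scoring answer.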

The quality-versus-divergence tradeoff

Raising N generally improves the selected response, but it also pushes the resulting distribution away from the base model. A commonly cited expression for the KL divergence between the best-of-N policy and the reference policy is log(N) − (N−1)/N, which is an upper bound on the divergence and grows roughly logarithmically with N. Push N too high and the selection can exploit flaws in the reward model, a failure called reward over-optimization or reward hacking, where outputs score well on the proxy reward but are worse by the standard the reward model was meant to capture.
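To get a feel for how quickly that divergence budget grows, the expression quoted above can be evaluated for a few values of N. The sketch below assumes the natural logarithm (so the result is in nats); the function name `bon_kl_upper_bound` is hypothetical.

```python
import math


def bon_kl_upper_bound(n: int) -> float:
    """Evaluate log(n) - (n - 1) / n, the commonly cited KL bound, in nats."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.log(n) - (n - 1) / n


if __name__ == "__main__":
    for n in (1, 2, 4, 16, 64):
        print(f"N={n:>3}  KL bound = {bon_kl_upper_bound(n):.3f} nats")
```

The bound is exactly 0 at N = 1 (no selection, so no drift), about 0.19 nats at N = 2, and about 3.17 nats at N = 64. Because it grows only logarithmically, each doubling of N adds a little under 0.7 nats of allowed divergence once N is large.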

Inference-time, not training-time

The key trait of best-of-N is that it changes nothing about the weights: it is pure inference-time selection. That makes it a useful baseline and a knob you can turn on any deployed model, trading more sampling for better answers. When the selected best responses are instead fed back into training, the method becomes rejection sampling fine-tuning.

For self-hosters, best-of-N is one of the cheapest alignment levers available: it needs only a reward model and extra GPU cycles, with no retraining. Its results are only as good as that reward model, which supplies the same kind of reward signal used to build a preference dataset.
