Knowledge Distillation

Sovereign AI

Definition

Knowledge distillation is a model-compression technique that transfers the behaviour of a large, capable teacher model into a smaller, cheaper student model. Instead of learning only from hard labels (the single correct answer), the student is trained to reproduce the teacher's full output distribution: the probabilities it assigns to every class or token, known as soft labels. Those soft targets carry richer information about how the teacher relates classes or tokens to one another, so the student often reaches higher accuracy than a model of the same size trained on hard labels alone.
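To make the difference between hard and soft labels concrete, here is a minimal sketch in plain Python. The class names and teacher logits are made-up values for illustration, and the temperature parameter is an assumption borrowed from common practice, not something this entry specifies.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical teacher scores for one input over three classes.
classes = ["cat", "dog", "truck"]
teacher_logits = [4.0, 3.0, -2.0]

hard_label = [1.0, 0.0, 0.0]          # one-hot: only says "cat"
soft_label = softmax(teacher_logits)  # also says "dog" is far likelier than "truck"

for name, hard, soft in zip(classes, hard_label, soft_label):
    print(f"{name:>5}: hard={hard:.2f}  soft={soft:.3f}")
```

The hard label treats "dog" and "truck" as equally wrong, while the soft label records that the teacher finds "dog" a much closer miss. That extra signal is what the student learns from.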

Why it matters for self-hosters

For a sovereign Bitcoiner running models on local hardware, distillation is a large part of what makes small models, such as those around 7 billion parameters, worth running. A distilled small model can capture much of a frontier model's competence while fitting in the VRAM of a single consumer GPU, with no API, no cloud account, and no telemetry leaving your network. Many popular small open-weight models that you can run locally were produced or refined with distillation as part of the pipeline.

How the training works

The student minimises a loss that measures the gap between its predictions and the teacher's soft targets, commonly using KL divergence. Variants also align intermediate feature representations (feature-based distillation) or the relationships between examples (relation-based distillation), not just the final outputs. The teacher stays frozen; only the student learns.
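The output-matching loss described above can be sketched in a few lines of dependency-free Python. This is one common formulation under stated assumptions, not the only one: the `temperature` and `alpha` parameters, the `temperature ** 2` scaling, and the extra cross-entropy term on the hard label are conventions from widely used recipes that this entry does not itself specify. The function names are hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-probabilities from raw scores."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]

def distillation_loss(student_logits, teacher_logits, true_index,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of KL(teacher || student) and hard-label cross-entropy."""
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    # Soft part: KL divergence from the teacher's distribution to the student's.
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    soft = kl * temperature ** 2  # assumed scaling to offset the softened targets
    # Hard part: ordinary cross-entropy against the single correct class.
    hard = -log_softmax(student_logits)[true_index]
    return alpha * soft + (1.0 - alpha) * hard

# Made-up logits: the teacher is fixed, and only the student's would change in training.
teacher = [4.0, 3.0, -2.0]
close_student = [3.5, 2.8, -1.5]
far_student = [-1.0, 0.5, 3.0]

print(distillation_loss(close_student, teacher, true_index=0))
print(distillation_loss(far_student, teacher, true_index=0))
```

A student whose logits track the teacher's gets a smaller loss than one that disagrees. In a real training loop an autodiff framework would compute gradients of this value with respect to the student's weights only, which is how the teacher stays frozen.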

Distillation is closely related to other shrink-to-fit techniques you will encounter when running models on your own metal. See quantization for reducing numerical precision, and parameter count for what model size actually measures.
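As a rough illustration of how parameter count and numerical precision together determine whether a model fits in VRAM, the sketch below estimates the memory needed for the weights alone. The helper name is hypothetical, and the estimate deliberately ignores activations, the KV cache, and framework overhead, so treat it as a lower bound, not a sizing guide.

```python
def weight_memory_gib(num_parameters, bits_per_parameter):
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = num_parameters * bits_per_parameter / 8
    return total_bytes / (1024 ** 3)

# A 7-billion-parameter model at three common precisions.
for bits in (16, 8, 4):
    size = weight_memory_gib(7e9, bits)
    print(f"7B parameters at {bits}-bit: about {size:.1f} GiB")
```

Distillation lowers the first argument by training a smaller model, and quantization lowers the second by storing each weight in fewer bits. The two can be combined.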
