

Inference

Sovereign AI

Definition

Inference is the stage at which a trained machine-learning model is actually used: it receives an input (such as a text prompt) and produces an output (a prediction or generated response). It is the "execution" phase, as opposed to training, the "learning" phase in which the model's weights are fitted to large datasets. For sovereign, self-hosted AI, inference is the part that runs on your own hardware.

Forward Pass vs. Backward Pass

During inference, data flows through the network's layers in a single direction, the forward pass, to compute an output. Training additionally measures how far that output is from a known answer and runs a backward pass that propagates the error back through the layers to adjust the weights. Because inference skips the backward pass and weight updates, it does not need to keep gradients or optimizer state in memory, so it needs far less memory and compute than training. This is one reason a model that took a data center weeks to train can run on a laptop or, for smaller models, even a phone.
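
As a rough illustration of the difference, the sketch below uses a toy one-input linear model in plain Python. The names forward and training_step, the learning rate, and the toy data are assumptions made up for this example; a real network has many layers and would use a framework, but the split between the two passes is the same.

```python
def forward(weights, x):
    """Forward pass: compute the model's output for input x."""
    w, b = weights
    return w * x + b


def training_step(weights, x, target, lr=0.1):
    """One training step: forward pass, then backward pass and weight update."""
    w, b = weights
    pred = forward(weights, x)      # forward pass (the same call inference uses)
    error = pred - target           # compare the output to the known answer
    grad_w = 2 * error * x          # backward pass: gradient of error**2 w.r.t. w
    grad_b = 2 * error              # backward pass: gradient of error**2 w.r.t. b
    return (w - lr * grad_w, b - lr * grad_b)


# Training: repeated forward + backward passes adjust the weights.
weights = (0.0, 0.0)
for _ in range(500):
    for x, target in [(1.0, 3.0), (2.0, 5.0)]:  # toy data drawn from y = 2x + 1
        weights = training_step(weights, x, target)

# Inference: the weights are frozen and only the forward pass runs.
prediction = forward(weights, 3.0)  # should land close to 2*3 + 1 = 7
```

Note that inference never touches grad_w or grad_b: it holds only the weights and the current input, which is where the memory and compute savings come from.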

Autoregressive Generation

For large language models, inference is autoregressive: each token is produced by one forward pass, appended to the running context, and fed back in to predict the next token. Throughput is commonly measured in tokens per second.
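
The loop can be sketched as follows. This is a minimal illustration, not how any particular runtime is implemented: the BIGRAMS lookup table is a hypothetical stand-in for a real model's forward pass, and next_token and generate are names invented for the example.

```python
import time

# Hypothetical stand-in for a language model: a bigram lookup table.
# A real LLM would run a neural-network forward pass here instead.
BIGRAMS = {"the": "model", "model": "predicts", "predicts": "the"}


def next_token(context):
    """One 'forward pass': predict the next token from the running context."""
    return BIGRAMS.get(context[-1], "<eos>")


def generate(prompt, max_new_tokens):
    """Autoregressive loop: predict a token, append it, feed the context back in."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(context)  # one forward pass per new token
        if token == "<eos>":
            break
        context.append(token)        # the output becomes part of the next input
    return context


start = time.perf_counter()
output = generate(["the"], max_new_tokens=5)
elapsed = time.perf_counter() - start

# Throughput: newly generated tokens divided by wall-clock time.
new_tokens = len(output) - 1
tokens_per_second = new_tokens / elapsed if elapsed > 0 else float("inf")
```

Because each new token depends on all the tokens before it, the steps run one after another, which is why generation speed is reported as tokens per second rather than as a single latency figure.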

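Local inference is the heart of self-hosting and air-gapped AI. Models for local use are often distributed as GGUF files, a single-file format used by llama.cpp and compatible runtimes, usually with quantized weights for efficient on-device inference.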

Estimate the cost of local inference with the inference cost calculator.
