

Transformer

Sovereign AI

Definition

The Transformer is a neural network architecture introduced in the 2017 paper "Attention Is All You Need" by Vaswani and colleagues at Google. It dispenses with the recurrence and convolutions used by earlier sequence models and relies entirely on attention, chiefly self-attention, which relates every token in a sequence to every other token. Because attention can be computed in parallel across a whole sequence during training, rather than one step at a time as in a recurrent network, Transformers train far faster on modern hardware than the recurrent networks they replaced. That is why they underpin nearly every large language model a sovereign operator might run locally.
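To make "relate every token to every other token" concrete, here is a minimal sketch of scaled dot-product self-attention for a single head, written in plain Python. It is an illustration under simplifying assumptions, not a production implementation: the function names are hypothetical, and the same toy vectors are used as queries, keys, and values, whereas a real model first applies learned projections.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(q, k, v):
    """Scaled dot-product attention for one head.

    q, k, v are lists of per-token vectors (lists of floats).
    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    d_k = len(k[0])
    scale = 1.0 / math.sqrt(d_k)
    weights = []
    for qi in q:
        # Score token i against every token j, then normalize to a distribution.
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights.append(softmax(scores))
    # Each output is a weighted average of all value vectors.
    outputs = [
        [sum(w * vj[c] for w, vj in zip(w_row, v)) for c in range(len(v[0]))]
        for w_row in weights
    ]
    return outputs, weights

# Toy input: 3 tokens, 2 dimensions each (assumed values, no learned projections).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(x, x, x)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of `attn` sums to 1, and no row depends on the result of another, which is what allows the rows to be computed in parallel.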

How a Transformer is built

The original design is an encoder-decoder stack, though most generative LLMs use a decoder-only variant. Each layer contains a multi-head self-attention block and a position-wise feed-forward network; decoder layers in the original encoder-decoder design add a third sub-block that attends to the encoder's output. Every sub-block is wrapped in a residual connection and a layer normalization step, which together keep gradients stable as depth grows. Because attention itself is order-agnostic, the model needs a positional encoding to know where each token sits in the sequence.
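The wrapping and positional steps described above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions: it uses the sinusoidal encoding and the post-norm ordering of the original paper (many later models normalize before the sub-block instead), it omits the learned gain and bias of layer normalization, and `toy_feed_forward` is a hypothetical stand-in for a real learned feed-forward network.

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: even dimensions use sin, odd dimensions use cos."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

def layer_norm(vec, eps=1e-5):
    """Normalize one token vector to zero mean and (near) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

def residual_block(x, sublayer):
    """Post-norm wrapper: LayerNorm(x + sublayer(x)), applied per token."""
    y = sublayer(x)
    return [layer_norm([a + b for a, b in zip(xi, yi)]) for xi, yi in zip(x, y)]

def toy_feed_forward(x):
    """Hypothetical stand-in for the feed-forward network: ReLU(2x - 1) per element."""
    return [[max(0.0, 2.0 * v - 1.0) for v in token] for token in x]

# Two tokens with 4-dimensional embeddings (assumed toy values).
tokens = [[0.5, -0.2, 0.1, 0.9], [0.3, 0.8, -0.5, 0.0]]
pe = positional_encoding(len(tokens), 4)
# Inject order information by adding the encoding to each embedding.
x = [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, pe)]
h = residual_block(x, toy_feed_forward)
print(h)
```

The same `residual_block` wrapper would be applied around the attention sub-block as well; only the feed-forward case is shown to keep the sketch short.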

Why it matters for sovereignty

Understanding the Transformer is the entry point to running models you control rather than renting them. Architectural choices such as grouped-query attention, which shares key and value heads across groups of query heads and so shrinks the key-value cache, directly affect how much VRAM a model needs at inference time. That in turn decides whether a given model fits on hardware you own.
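Grouped-query attention affects VRAM mainly through the key-value cache that a decoder keeps while generating text. A back-of-the-envelope estimate of that cache can be sketched as follows. The configuration numbers are hypothetical, chosen only to give round figures, and the estimate covers the cache alone, not the model weights or activations.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch=1):
    """Approximate key-value cache size in bytes.

    The leading factor of 2 counts one key and one value vector per token,
    per layer, per KV head. bytes_per_value=2 assumes 16-bit storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value * batch

# Hypothetical model: 32 layers, head dimension 128, 8192-token context.
# Standard multi-head attention: one KV head per query head (32).
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=8192)
# Grouped-query attention: 8 KV heads shared across the 32 query heads.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=8192)

print(f"MHA cache: {mha / 2**30:.1f} GiB")  # MHA cache: 4.0 GiB
print(f"GQA cache: {gqa / 2**30:.1f} GiB")  # GQA cache: 1.0 GiB
```

Under these assumptions, cutting the number of key-value heads from 32 to 8 cuts the cache by the same factor of four, which can be the difference between a long context fitting on a consumer GPU or not.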

For practical deployment of these models on your own hardware, see our work on self-hosted inference, and explore related entries such as self-attention and backpropagation to understand how Transformers learn.
