
Gated Linear Attention


Definition

Gated linear attention (GLA) is a refinement of linear attention that adds data-dependent gating to the model's recurrent state. Plain linear attention accumulates information into a fixed-size state matrix but has no principled way to forget; over long sequences this can blur or saturate the state. GLA introduces gates whose values depend on the input, letting the model decide how much of the existing state to keep and how much new information to write at each step. This gives it a controllable, decaying memory while preserving the linear-time, constant-space inference that makes linear attention attractive.
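The gated update described above can be sketched as a short single-head recurrence. This is a minimal illustration, not a reference implementation: the function name `gla_recurrent`, the sigmoid gate, and passing `gate_logits` in directly are assumptions made for the example (in a trained model the gate would come from a learned projection of each token).

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gla_recurrent(q, k, v, gate_logits):
    """Token-by-token gated linear attention for one head (illustrative sketch).

    q, k, gate_logits: arrays of shape (T, d_k); v: array of shape (T, d_v).
    Returns the outputs (T, d_v) and the final state (d_k, d_v).
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))       # fixed-size state, independent of T
    out = np.empty((T, d_v))
    for t in range(T):
        # Assumed gate form: one forget value in (0, 1) per key dimension.
        alpha = sigmoid(gate_logits[t])
        # Keep a gated fraction of the old state, then write the new key-value pair.
        S = alpha[:, None] * S + np.outer(k[t], v[t])
        out[t] = q[t] @ S          # read the state with the current query
    return out, S
```

With gates close to 1 the loop reduces to plain linear attention, where the state is the running sum of key-value outer products; with gates close to 0 the state holds only the most recent token.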

Parallel training, recurrent inference

Like other modern linear-recurrent models, GLA exhibits a sequential-parallel duality: it can be trained in a parallel form (in practice usually a chunkwise-parallel formulation rather than a token-by-token loop) that processes a whole sequence efficiently on a GPU, then deployed in a recurrent form that updates one fixed-size state per token. The recurrent form is what gives it constant memory at inference: there is no key-value cache that grows as the context lengthens, which makes it well suited to serving long-context models on bounded hardware.
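The duality can be checked numerically on a toy example. The sketch below assumes a single head and per-key-dimension gate values `alpha` in (0, 1) supplied directly; the names `gla_recurrent_ref` and `gla_parallel` are hypothetical. The parallel version rescales queries and keys by cumulative gate products, which is only numerically safe for short sequences; practical implementations work chunk by chunk to avoid that problem.

```python
import numpy as np


def gla_recurrent_ref(q, k, v, alpha):
    """Recurrent form: one fixed-size state, updated once per token."""
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_k, d_v))
    out = np.empty((T, d_v))
    for t in range(T):
        S = alpha[t][:, None] * S + np.outer(k[t], v[t])
        out[t] = q[t] @ S
    return out


def gla_parallel(q, k, v, alpha):
    """Parallel form: the whole sequence at once via matrix products.

    Uses cumulative gate products b_t = alpha_1 * ... * alpha_t, computed in
    log space. The decay from step s to step t is b_t / b_s, so scaling
    queries by b_t and keys by 1 / b_s reproduces the recurrence.
    """
    log_b = np.cumsum(np.log(alpha), axis=0)   # (T, d_k)
    q_scaled = q * np.exp(log_b)
    k_scaled = k * np.exp(-log_b)              # grows quickly for long T
    scores = np.tril(q_scaled @ k_scaled.T)    # causal mask: keep s <= t only
    return scores @ v


rng = np.random.default_rng(0)
T, d_k, d_v = 6, 4, 3
q = rng.normal(size=(T, d_k))
k = rng.normal(size=(T, d_k))
v = rng.normal(size=(T, d_v))
alpha = rng.uniform(0.5, 1.0, size=(T, d_k))

same = np.allclose(gla_recurrent_ref(q, k, v, alpha), gla_parallel(q, k, v, alpha))
```

The recurrent function holds only a `(d_k, d_v)` state regardless of sequence length, while the parallel function builds a `(T, T)` score matrix, which is the trade-off between serving and training described above.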

Where it sits

GLA belongs to the same generation of efficient architectures as gated linear RNNs, RetNet, and the state-space models. All of them use a linear recurrence, which keeps training parallelizable and inference cheap. GLA and the selective state-space models additionally make the state update data-dependent for stronger recall, whereas RetNet uses a fixed decay. The shared theme is replacing quadratic attention with a fixed-size memory, usually decayed or gated, that you can run economically and locally.

For the broader context, see linear attention, selective state space, and state space duality.
