Gradient Clipping

Definition

Gradient clipping caps the magnitude of gradients during backpropagation, guarding against the exploding-gradient problem. In deep networks, the backward pass multiplies the gradient by each layer's Jacobian in turn; if those factors consistently have norm greater than one, the gradient grows exponentially with depth, and a single resulting update can push the model's weights to infinity or NaN.
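The exponential growth described above can be illustrated with a toy calculation. This is only a sketch: it assumes every layer scales the gradient norm by the same constant factor, whereas real layers multiply by Jacobian matrices whose effect varies. The function name `backprop_norm` and the gain values are hypothetical, chosen for illustration.

```python
def backprop_norm(per_layer_gain: float, depth: int, start_norm: float = 1.0) -> float:
    """Gradient norm after flowing backward through `depth` layers.

    Toy assumption: each layer multiplies the norm by `per_layer_gain`.
    """
    norm = start_norm
    for _ in range(depth):
        norm *= per_layer_gain
    return norm


# A gain slightly above one explodes; a gain below one shrinks toward zero.
exploding = backprop_norm(1.5, depth=50)
shrinking = backprop_norm(0.5, depth=50)
stable = backprop_norm(1.0, depth=50)
print(f"gain 1.5: {exploding:.3e}  gain 0.5: {shrinking:.3e}  gain 1.0: {stable}")
```

Even a modest per-layer gain compounds quickly at depth 50, which is why a single unclipped step can be enough to destabilize training.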

Clipping by global norm

The most common form is clip-by-global-norm. The framework computes a single norm across all parameter gradients, and if that norm exceeds a chosen threshold, it scales every gradient by the same ratio (threshold divided by norm) so the total norm equals the threshold. Because all gradients shrink proportionally, the direction of the combined gradient is preserved while only its length is capped. That direction encodes the steepest-descent path, which is why norm clipping is generally preferred over clipping each value independently, an approach that can change the direction.
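A minimal sketch of clip-by-global-norm in plain Python, with clip-by-value alongside for contrast. Gradients are represented as a list of flat lists (one per parameter tensor), which is an assumption made for illustration; the function names are hypothetical and do not correspond to any particular framework's API.

```python
import math


def global_norm(grads):
    """L2 norm computed jointly over every value in every gradient tensor."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def clip_by_global_norm(grads, max_norm):
    """Scale all gradients by one shared ratio if their joint norm exceeds max_norm."""
    norm = global_norm(grads)
    if norm <= max_norm:
        return [list(tensor) for tensor in grads]
    scale = max_norm / norm
    return [[g * scale for g in tensor] for tensor in grads]


def clip_by_value(grads, limit):
    """Clamp each value independently to [-limit, limit]."""
    return [[max(-limit, min(limit, g)) for g in tensor] for tensor in grads]


grads = [[3.0, 0.0], [0.0, 4.0]]  # joint norm is 5.0
by_norm = clip_by_global_norm(grads, max_norm=1.0)
by_value = clip_by_value(grads, limit=1.0)

print(by_norm)   # the 3:4 ratio between the two nonzero entries is kept
print(by_value)  # both nonzero entries become 1.0, so the ratio is lost
```

In this example the norm-clipped gradient keeps the original 3:4 proportion between its components, while value clipping flattens both to the same magnitude and so points in a different direction.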

Where it matters

Gradient clipping is standard practice when training recurrent networks, long-sequence models, and transformers, where backpropagation through many steps or layers can produce sudden gradient spikes. A common threshold is 1.0, applied after gradients are computed but before the optimizer step. Frameworks expose it directly, for example PyTorch's torch.nn.utils.clip_grad_norm_ function or the global_clipnorm argument on Keras optimizers, and the extra computation is a single norm over the gradients, so it costs almost nothing to enable.
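The ordering described above (compute gradients, clip, then step) can be sketched with a plain SGD update. This is an illustrative toy, not a framework implementation: the function `sgd_step_with_clipping`, the learning rate, and the gradient values are all hypothetical, and the gradients are assumed to have been computed already by the caller.

```python
import math


def sgd_step_with_clipping(params, grads, lr=0.1, max_norm=1.0):
    """One SGD step with clip-by-global-norm applied between backward and update."""
    # Step 1 (done by the caller): gradients have been computed.
    # Step 2: measure the joint norm and derive a shared scale factor (at most 1).
    norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    # Step 3: apply the optimizer update using the clipped gradients.
    return [p - lr * scale * g for p, g in zip(params, grads)]


params = [0.0, 0.0]
spike = [300.0, -400.0]  # a gradient spike with norm 500
after_spike = sgd_step_with_clipping(params, spike)
after_small = sgd_step_with_clipping(params, [0.3, 0.4])  # norm 0.5, left untouched

print(after_spike, math.hypot(*after_spike))
print(after_small)
```

With the threshold at 1.0 and a learning rate of 0.1, the spike moves the parameters by a distance of about 0.1 instead of 50, while the small gradient passes through unchanged.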

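Clipping addresses gradients that grow too large. A different hazard, small gradient values underflowing to zero in low-precision formats such as fp16, is handled by loss scaling. When the two are combined, gradients are unscaled before clipping so the threshold applies to their true magnitudes. Both are routine safeguards layered on top of an optimizer whose state, such as momentum or Adam's moment estimates, already smooths updates over time.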
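Loss scaling can be illustrated with a small sketch that uses the standard library's half-precision format code (`"e"` in `struct`) to imitate fp16 storage. The helper `to_fp16`, the scale factor of 1024, and the gradient value are assumptions chosen for illustration; real mixed-precision training usually adjusts the scale dynamically.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]


true_grad = 1e-8      # too small for fp16, whose smallest positive value is about 6e-8
loss_scale = 1024.0   # hypothetical fixed scale factor

# Without scaling, the gradient underflows to zero when stored in fp16.
unscaled = to_fp16(true_grad)

# With scaling, the loss (and hence the gradient) is multiplied up before the
# fp16 store, then divided back down in higher precision.
recovered = to_fp16(true_grad * loss_scale) / loss_scale

print(unscaled)   # 0.0
print(recovered)  # close to 1e-8
```

Any clipping threshold should be compared against `recovered`, the unscaled value, rather than the scaled gradient; otherwise the scale factor would inflate the norm and trigger clipping spuriously.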
