

Activation Recomputation

Sovereign AI

Definition

Activation recomputation, also called gradient or activation checkpointing, is a memory-saving technique that trades extra computation for reduced memory during training. Backpropagation normally needs the intermediate activations from every layer's forward pass in order to compute gradients, and for deep models or long sequences those stored activations can consume more memory than the model's own weights.

Save now, recompute later

Instead of keeping every activation, the technique stores only a sparse subset, the checkpoints, and discards the rest. During the backward pass, whenever a missing activation is needed, it is regenerated on the fly by re-running the forward computation from the nearest preceding checkpoint. For a network of L layers, placing a checkpoint roughly every √L layers cuts activation memory from O(L) to O(√L): about √L checkpoints are held for the whole step, plus at most about √L recomputed activations for the one segment currently being backpropagated.
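The store-and-regenerate scheme above can be sketched in plain Python. This is a toy illustration under stated assumptions, not how any particular framework implements it: the scalar `tanh` layer, the function names, and the `segment` parameter are all hypothetical, and "memory" is simply a count of stored layer inputs.

```python
import math

def layer(x, w):
    """Toy scalar layer (an assumption for illustration): y = tanh(w * x)."""
    return math.tanh(w * x)

def layer_grad(x, w, grad_y):
    """Backward for the toy layer; needs the layer's input x. Returns (grad_x, grad_w)."""
    y = math.tanh(w * x)
    dz = grad_y * (1.0 - y * y)
    return dz * w, dz * x

def backward_full(x0, weights):
    """Baseline: keep every layer input, then backpropagate."""
    inputs, x = [], x0
    for w in weights:
        inputs.append(x)
        x = layer(x, w)
    grad, grads_w = 1.0, [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grad, grads_w[i] = layer_grad(inputs[i], weights[i], grad)
    return x, grads_w, len(inputs)

def backward_checkpointed(x0, weights, segment):
    """Keep only every `segment`-th layer input; regenerate the rest per segment."""
    n = len(weights)
    checkpoints, x = {}, x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x          # stored checkpoint
        x = layer(x, w)                 # all other inputs are discarded
    grad, grads_w, peak = 1.0, [0.0] * n, 0
    for start in sorted(checkpoints, reverse=True):   # last segment first
        end = min(start + segment, n)
        # Re-run the forward pass from the preceding checkpoint.
        inputs = [checkpoints[start]]
        for i in range(start, end - 1):
            inputs.append(layer(inputs[-1], weights[i]))
        # Live values: all checkpoints plus this segment's recomputed inputs.
        peak = max(peak, len(checkpoints) + len(inputs) - 1)
        for i in reversed(range(start, end)):
            grad, grads_w[i] = layer_grad(inputs[i - start], weights[i], grad)
    return x, grads_w, peak
```

With 16 layers and `segment=4` (that is, √16), the baseline keeps 16 layer inputs while the checkpointed version peaks at 7 (4 checkpoints plus 3 recomputed inputs), and both return the same gradients.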

The compute trade-off

The price is that the forward computation for non-checkpointed layers runs twice, once during the original forward pass and again when the backward pass regenerates it. In practice this typically adds roughly 20% to 30% to training time. In exchange, the freed memory lets you fit a larger model or, often more valuably, a larger batch size on the same GPU, which can improve throughput and training stability.
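A back-of-the-envelope estimate shows where an overhead of this size can come from. The sketch below assumes the common rule of thumb that a backward pass costs about twice a forward pass; that ratio, the function name, and its parameters are illustrative assumptions, and real overhead depends on the model, kernels, and hardware.

```python
def recompute_overhead(recomputed_fraction, backward_to_forward=2.0):
    """Extra step time as a fraction of the baseline forward + backward cost.

    recomputed_fraction: share of the forward pass that is run a second time.
    backward_to_forward: assumed cost ratio of backward to forward (rule of thumb).
    """
    forward = 1.0
    backward = backward_to_forward * forward
    baseline = forward + backward
    extra = recomputed_fraction * forward   # the second forward run
    return extra / baseline

full = recompute_overhead(1.0)      # every layer recomputed: one third of a step
partial = recompute_overhead(0.75)  # three quarters recomputed: one quarter of a step
```

Under these assumptions, recomputing the entire forward pass adds about a third to each step, and recomputing only part of it lands in the 20% to 30% range quoted above.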

Activation recomputation is a core tool for training large models on limited hardware and stacks cleanly with other memory savers. It is frequently combined with ZeRO-Offload / CPU offload and reduced-precision formats like BF16 to make ambitious training runs feasible on modest, self-owned rigs.

