Residual Connection

Definition

A residual connection, also called a skip connection, is a shortcut that adds a layer's input directly to its output before passing the result on: if the layer computes F(x), the block outputs x + F(x). Introduced in the 2015 ResNet paper by He and colleagues, it made networks hundreds of layers deep trainable in practice. Instead of forcing each block to learn a full transformation, the block only has to learn the residual F(x), the difference between the desired output and the identity, which is a far easier target when the ideal mapping is close to the identity.
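The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the ResNet implementation: the two-layer `sublayer`, the weight shapes, and the variable names are all hypothetical choices made for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sublayer(x, w1, w2):
    """F(x): a small two-layer transformation (hypothetical example)."""
    return relu(x @ w1) @ w2

def residual_block(x, w1, w2):
    """Return x + F(x): the input is added back to the sub-layer output."""
    return x + sublayer(x, w1, w2)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # batch of 4 vectors, width 8
w1 = rng.standard_normal((8, 16)) * 0.1  # assumed hidden width of 16
w2 = np.zeros((16, 8))                   # zero output weights, so F(x) = 0

y = residual_block(x, w1, w2)
print(np.allclose(y, x))  # the block reduces to the identity when F(x) = 0
```

With the output weights at zero the block passes its input through unchanged, which is why the identity is an easy starting point: the sub-layer only has to learn a correction on top of it.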

Why deep models need it

When errors are propagated backward through many layers during backpropagation, the gradient is multiplied by each layer's local derivative, and the product tends to shrink toward zero. This is the vanishing-gradient problem. A residual connection changes each block's derivative from that of F(x) to the identity plus that of F(x), so the gradient always has a clean additive path straight to earlier layers and survives the journey through a deep stack. This is what lets a Transformer stack dozens of attention and feed-forward blocks and still train successfully.
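A rough numerical sketch of this effect, under simplifying assumptions: each layer is treated as a purely linear map with small random weights (no nonlinearity, no normalization), so the end-to-end Jacobian is just a product of matrices. The width, depth, and weight scale below are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, depth = 8, 20          # assumed layer width and number of layers
identity = np.eye(dim)

plain_jac = np.eye(dim)     # Jacobian of a plain stack: W_n ... W_1
resid_jac = np.eye(dim)     # Jacobian of a residual stack: (I + W_n) ... (I + W_1)

for _ in range(depth):
    # Small random weights, standing in for a layer's local derivative.
    w = 0.05 * rng.standard_normal((dim, dim)) / np.sqrt(dim)
    plain_jac = w @ plain_jac
    resid_jac = (identity + w) @ resid_jac

plain_norm = np.linalg.norm(plain_jac)
resid_norm = np.linalg.norm(resid_jac)
print(f"plain stack Jacobian norm:    {plain_norm:.3e}")
print(f"residual stack Jacobian norm: {resid_norm:.3e}")
```

In the plain stack the Jacobian norm collapses toward zero as depth grows, while the identity term in each residual factor keeps it at a usable magnitude. Real networks are nonlinear, so this only illustrates the mechanism rather than predicting actual gradient sizes.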

In the Transformer block

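Every sub-block in a Transformer, both the self-attention and the feed-forward network, is wrapped in a residual connection paired with layer normalization. The pattern is simple but indispensable. In the original post-norm design, the output is the layer-normalized sum of the input and the sub-block result; in the pre-norm variant used by many later models, the input is normalized first and the sub-block result is added back to the unnormalized input. Without the residual paths, a deep Transformer is very hard to train and often fails to converge.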
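The two common ways of pairing the residual add with layer normalization can be sketched as follows. This is a simplified NumPy illustration, not a full Transformer layer: attention is single-head with no masking or output projection, layer normalization has no learned scale or bias, and the parameter dictionary `p` and its key names are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale/bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention over a (seq, dim) input."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def feed_forward(x, w1, w2):
    return np.maximum(x @ w1, 0.0) @ w2

def post_ln_block(x, p):
    """Post-norm ordering: LayerNorm(x + Sublayer(x))."""
    x = layer_norm(x + self_attention(x, p["wq"], p["wk"], p["wv"]))
    x = layer_norm(x + feed_forward(x, p["w1"], p["w2"]))
    return x

def pre_ln_block(x, p):
    """Pre-norm ordering: x + Sublayer(LayerNorm(x))."""
    x = x + self_attention(layer_norm(x), p["wq"], p["wk"], p["wv"])
    x = x + feed_forward(layer_norm(x), p["w1"], p["w2"])
    return x

rng = np.random.default_rng(0)
seq, dim, hidden = 5, 8, 32  # assumed toy sizes
p = {
    "wq": rng.standard_normal((dim, dim)) * 0.1,
    "wk": rng.standard_normal((dim, dim)) * 0.1,
    "wv": rng.standard_normal((dim, dim)) * 0.1,
    "w1": rng.standard_normal((dim, hidden)) * 0.1,
    "w2": rng.standard_normal((hidden, dim)) * 0.1,
}
x = rng.standard_normal((seq, dim))
out_post = post_ln_block(x, p)
out_pre = pre_ln_block(x, p)
print(out_post.shape, out_pre.shape)
```

In both orderings the input `x` reaches the output through a plain addition, which is the residual path. The pre-norm form leaves that path completely untouched by normalization, which is one reason it is often reported to be easier to train at depth.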

For the bigger picture of how these pieces assemble into a working model, see Transformer.
