Pipeline Parallelism

Pipeline parallelism, also called vertical or inter-layer parallelism, splits a model by depth. The layers are divided into consecutive stages, and each stage lives on a different accelerator. A batch flows through the stages like an assembly line: device one computes the first block of layers, hands its activations to device two, and so on, with gradients flowing back the same path in reverse during the backward pass.
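As a minimal sketch of the split described above, the hypothetical helper `partition_layers` below divides layer indices into consecutive stages of near-equal size. The function name and the balancing rule (earlier stages absorb the remainder) are assumptions made for illustration. Real frameworks often balance stages by compute or memory cost rather than by layer count.

```python
def partition_layers(num_layers: int, num_stages: int) -> list[range]:
    """Split layer indices 0..num_layers-1 into consecutive stages.

    Hypothetical helper: balances by layer count only, which is an
    assumption; it ignores per-layer compute and memory cost.
    """
    if num_stages < 1 or num_layers < num_stages:
        raise ValueError("need at least one layer per stage")

    base, extra = divmod(num_layers, num_stages)
    stages = []
    start = 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if stage < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


if __name__ == "__main__":
    # Each stage would live on its own accelerator.
    for device, layers in enumerate(partition_layers(10, 4)):
        print(f"device {device}: layers {list(layers)}")
```

Because the stages are consecutive ranges, each device only ever exchanges data with the device holding the previous or the next range.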

Keeping the pipeline busy

The obvious problem is idle hardware. While device one works on the first stage, the later devices have nothing to do, and while the last device works, the earlier ones wait. This idle time is called the pipeline bubble. The standard fix is to split each batch into smaller micro-batches and feed them in one after another, so that multiple stages work on different micro-batches at once. Schedules differ in how they order this work. GPipe runs the forward passes for all micro-batches and then all the backward passes. 1F1B (one forward, one backward) alternates the two, which mainly lowers peak activation memory, and interleaved variants of 1F1B shrink the bubble further.
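The effect of micro-batching on the bubble can be estimated with a short calculation. The sketch below assumes a GPipe-style schedule (all forward passes, then all backward passes) in which every stage takes the same time per micro-batch. Under those assumptions the idle share of each device is (p - 1) / (m + p - 1) for p stages and m micro-batches. The function names are hypothetical, and `forward_timeline` only models the forward half of the schedule.

```python
from fractions import Fraction
from typing import Optional


def bubble_fraction(num_stages: int, num_micro_batches: int) -> Fraction:
    """Idle share of device time, assuming equal-cost stages.

    Assumption: GPipe-style schedule, so a step spans (m + p - 1)
    time slots per direction and each device is busy for m of them.
    """
    p, m = num_stages, num_micro_batches
    if p < 1 or m < 1:
        raise ValueError("need at least one stage and one micro-batch")
    return Fraction(p - 1, m + p - 1)


def forward_timeline(num_stages: int, num_micro_batches: int) -> list[list[Optional[int]]]:
    """Rows are stages, columns are time slots.

    A cell holds the micro-batch index the stage is processing,
    or None when the stage is idle (part of the bubble).
    """
    p, m = num_stages, num_micro_batches
    slots = m + p - 1
    return [
        [t - s if 0 <= t - s < m else None for t in range(slots)]
        for s in range(p)
    ]


if __name__ == "__main__":
    for m in (1, 4, 8, 32):
        print(f"4 stages, {m:>2} micro-batches: bubble = {float(bubble_fraction(4, m)):.2f}")
    for row in forward_timeline(2, 3):
        print(row)
```

With a single micro-batch the formula reduces to (p - 1) / p, the fully serial case described above. Raising m shrinks the bubble but never removes it.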

Communication profile

Pipeline parallelism only needs to pass intermediate activations, and their gradients on the backward pass, between adjacent stages. Its communication cost is therefore low compared to tensor parallelism, which needs collective communication inside every layer. That makes it well suited to spanning nodes, where bandwidth is more limited, complementing tensor parallelism inside each node.
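A rough sense of the traffic involved comes from counting the bytes that cross each stage boundary. The sketch below assumes a transformer-style model whose boundary activation has shape [micro-batch size, sequence length, hidden size], and that the backward pass sends a gradient of the same shape. Neither assumption comes from the text above, and the function names and example sizes are hypothetical.

```python
def boundary_bytes(micro_batch_size: int, seq_len: int, hidden_size: int,
                   bytes_per_element: int = 2) -> int:
    """Bytes in one activation tensor handed to the next stage.

    Assumption: the tensor has shape [micro_batch, seq_len, hidden]
    and is stored in a 2-byte format such as fp16 or bf16 by default.
    """
    return micro_batch_size * seq_len * hidden_size * bytes_per_element


def pipeline_traffic_per_step(num_stages: int, num_micro_batches: int,
                              micro_batch_size: int, seq_len: int,
                              hidden_size: int, bytes_per_element: int = 2) -> int:
    """Total point-to-point bytes across all stage boundaries in one step.

    Each micro-batch crosses each of the (num_stages - 1) boundaries
    twice: activations forward, gradients backward.
    """
    per_transfer = boundary_bytes(micro_batch_size, seq_len, hidden_size,
                                  bytes_per_element)
    boundaries = num_stages - 1
    return 2 * boundaries * num_micro_batches * per_transfer


if __name__ == "__main__":
    one = boundary_bytes(micro_batch_size=1, seq_len=2048, hidden_size=4096)
    print(f"one transfer: {one / 2**20:.0f} MiB")
    total = pipeline_traffic_per_step(4, 8, 1, 2048, 4096)
    print(f"whole step, 4 stages, 8 micro-batches: {total / 2**20:.0f} MiB")
```

The volume depends on the activation size at the boundary and on the number of boundaries, not on how many parameters each stage holds.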

Pipeline parallelism is one leg of 3D parallelism, alongside data and tensor parallelism. See Tensor Parallelism for splitting work inside a layer, and Gradient Accumulation, which shares the micro-batch mechanics used to fill the pipeline bubble.
