Tensor Parallelism

Definition

Tensor parallelism, also known as horizontal or intra-layer parallelism, divides the math inside a single layer across multiple accelerators. Instead of giving each device a copy of the whole model, the large weight matrices of a layer are sliced column-wise or row-wise, each device computes its partial result, and the pieces are combined. This lets a single layer that would never fit on one accelerator run across several.

How the work is split

In a transformer, the attention and feed-forward matrix multiplications are the natural targets. A weight matrix is partitioned so each device multiplies its slice against the activations, and a collective operation then stitches the outputs back together before the next layer. With a column-wise split, each device produces a block of output columns and an all-gather concatenates them. With a row-wise split, each device produces a full-size partial sum and an all-reduce adds them. Because this synchronization happens inside every layer, tensor parallelism carries the highest communication overhead of the common strategies and is most effective within a single high-bandwidth node where devices are tightly interconnected.
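The two splits can be checked on a toy example. The sketch below is a single-process simulation in plain Python, not a distributed implementation: each "device" is just an entry in a list, the all-gather and all-reduce are ordinary list operations, and all function names are hypothetical helpers chosen for illustration. It assumes the matrix dimensions divide evenly by the device count.

```python
# Toy simulation of tensor parallelism for one matmul Y = X @ W.
# Matrices are lists of rows; "devices" are simulated by list entries.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(m, parts):
    """Slice a matrix into `parts` equal column blocks."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Slice a matrix into `parts` equal row blocks."""
    height = len(m) // parts
    return [m[i * height:(i + 1) * height] for i in range(parts)]

def column_parallel(x, w, devices):
    """Column-wise split of W: every device sees all of X."""
    partials = [matmul(x, w_shard) for w_shard in split_columns(w, devices)]
    # Simulated all-gather: concatenate each device's block of output columns.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

def row_parallel(x, w, devices):
    """Row-wise split of W: each device sees the matching column block of X."""
    x_shards = split_columns(x, devices)
    w_shards = split_rows(w, devices)
    partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
    # Simulated all-reduce: elementwise sum of the full-size partial outputs.
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

x = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
w = [[1, 0, 2, 1],
     [0, 1, 1, 3],
     [2, 1, 0, 1],
     [1, 2, 3, 0]]

reference = matmul(x, w)
col_result = column_parallel(x, w, devices=2)
row_result = row_parallel(x, w, devices=2)
```

Both `col_result` and `row_result` should equal `reference`. The difference lies in what each device holds and which collective is needed: the column split ends with a concatenation, the row split ends with a sum.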

Where it fits

Tensor parallelism rarely stands alone at scale. It is typically combined with pipeline parallelism (splitting layers across devices) and data parallelism (replicating the whole stack) into what practitioners call 3D parallelism, the layout used to train the largest models. The rule of thumb is to keep tensor parallelism inside a node, where bandwidth is highest, and to run pipeline and data parallelism, which synchronize less often, across nodes.
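One way to picture the rule of thumb is to map each global rank to a coordinate on the three axes. The sketch below is illustrative only: the function names are hypothetical, the ordering (tensor index varying fastest) is an assumption rather than the convention of any particular framework, and it assumes consecutive ranks are placed on the same node.

```python
# Illustrative 3D layout: the tensor-parallel index varies fastest, so each
# tensor-parallel group is a run of consecutive ranks.

def rank_to_coords(rank, tp, pp, dp):
    """Map a global rank to (dp_index, pp_index, tp_index)."""
    if not 0 <= rank < tp * pp * dp:
        raise ValueError("rank outside the world size")
    tp_index = rank % tp
    pp_index = (rank // tp) % pp
    dp_index = rank // (tp * pp)
    return dp_index, pp_index, tp_index

def tensor_parallel_groups(tp, pp, dp):
    """List the ranks that share one layer's sharded weights."""
    world = tp * pp * dp
    return [list(range(start, start + tp)) for start in range(0, world, tp)]

def node_of(rank, gpus_per_node):
    """Assumption: the launcher fills nodes with consecutive ranks."""
    return rank // gpus_per_node

# Hypothetical cluster: 16 devices, 4 per node, tp=4, pp=2, dp=2.
groups = tensor_parallel_groups(tp=4, pp=2, dp=2)
stays_in_node = all(len({node_of(r, 4) for r in g}) == 1 for g in groups)
```

With the tensor-parallel degree equal to the number of devices per node, `stays_in_node` is true: every tensor-parallel group sits inside one node, and only pipeline and data parallel traffic crosses node boundaries.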

Tensor parallelism is one of the three core axes of distributed training. Compare it with Pipeline Parallelism, which splits the model by layer, and Data Parallelism, which replicates the model and splits the data.
