Definition
NVLink is NVIDIA's proprietary high-speed interconnect for direct GPU-to-GPU communication. When a model is too large to fit on a single accelerator, the GPUs must constantly exchange data; NVLink provides a dedicated, low-latency path between them that is far wider than the standard PCIe bus, keeping multi-GPU systems from stalling on data movement.
Bandwidth across generations
NVLink bandwidth has grown with each generation: the first generation offered 160 GB/s of bidirectional bandwidth on the Tesla P100, the A100's third generation reached 600 GB/s, the H100's fourth generation hit 900 GB/s across 18 links, and the fifth generation in Blackwell GPUs delivers 1.8 TB/s per GPU. For comparison, a PCIe Gen 5 x16 slot tops out near 63 GB/s in each direction (roughly 126 GB/s bidirectional), about a seventh of the H100's NVLink bandwidth and a fourteenth of Blackwell's.
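To make the gap concrete, the following sketch compares one-way transfer rates. It assumes NVLink traffic splits evenly between directions (so one-way bandwidth is half the bidirectional figure quoted above) and that PCIe Gen 5 x16 carries about 63 GB/s per direction. These are peak numbers; real throughput is lower. The function names are illustrative only.

```python
# Peak bidirectional NVLink bandwidth per GPU in GB/s, as quoted in the text.
NVLINK_BIDIR_GBPS = {
    "NVLink 1 (P100)": 160,
    "NVLink 3 (A100)": 600,
    "NVLink 4 (H100)": 900,
    "NVLink 5 (Blackwell)": 1800,
}

# Assumption: PCIe Gen 5 x16 peaks at roughly 63 GB/s in each direction.
PCIE_GEN5_X16_ONE_WAY_GBPS = 63


def one_way_gbps(bidirectional_gbps):
    """Assume bandwidth is split evenly between the two directions."""
    return bidirectional_gbps / 2


def speedup_over_pcie(bidirectional_gbps):
    """Ratio of NVLink one-way bandwidth to PCIe Gen 5 x16 one-way bandwidth."""
    return one_way_gbps(bidirectional_gbps) / PCIE_GEN5_X16_ONE_WAY_GBPS


def transfer_seconds(payload_gb, bandwidth_gbps):
    """Idealized time to move payload_gb at a sustained bandwidth_gbps."""
    return payload_gb / bandwidth_gbps


if __name__ == "__main__":
    for name, bidir in NVLINK_BIDIR_GBPS.items():
        ratio = speedup_over_pcie(bidir)
        millis = transfer_seconds(10, one_way_gbps(bidir)) * 1000
        print(f"{name}: {ratio:.1f}x PCIe, 10 GB one-way in {millis:.1f} ms")
```

Under these assumptions, the H100 figure works out to roughly 7 times PCIe Gen 5 x16 and the Blackwell figure to roughly 14 times.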
Why it matters
NVLink is what makes a tray of GPUs behave more like one large accelerator, and it is central to training and serving the biggest models. It is also a vendor-specific technology: relying on it ties a cluster tightly to NVIDIA's ecosystem, reinforcing the same centralization concerns that surround CUDA.
For most self-hosters running a single-card local LLM, NVLink is irrelevant; it becomes important only when scaling up to a multi-GPU node whose GPUs must exchange data constantly.
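A rough way to tell which case applies is to estimate whether the model weights alone fit on one card. The sketch below is a back-of-the-envelope check, not a sizing tool: it counts weights only and ignores the KV cache, activations, and framework overhead, so real memory needs are higher. The function names and the example model and card sizes are hypothetical.

```python
import math


def weights_gb(params_billion, bytes_per_param):
    """Weights-only footprint in GB (1e9 params * bytes / 1e9 bytes per GB)."""
    return params_billion * bytes_per_param


def min_gpus_for_weights(params_billion, bytes_per_param, vram_gb):
    """Smallest GPU count whose combined VRAM holds the weights alone."""
    return math.ceil(weights_gb(params_billion, bytes_per_param) / vram_gb)


if __name__ == "__main__":
    # Hypothetical examples: 16-bit weights (2 bytes each) on 24 GB cards.
    for params in (7, 70):
        n = min_gpus_for_weights(params, bytes_per_param=2, vram_gb=24)
        need = "single card, NVLink irrelevant" if n == 1 else "multi-GPU, interconnect matters"
        print(f"{params}B params: {weights_gb(params, 2)} GB of weights, {n} GPU(s): {need}")
```

When the count comes out above one, the GPUs have to share the model, and the bandwidth between them starts to affect performance.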
