Definition
High-Bandwidth Memory (HBM) is a JEDEC-standard memory technology that stacks DRAM dies vertically and connects them to the processor over an extremely wide interface. Mounted on the same package as an AI GPU or accelerator, HBM delivers far more memory bandwidth than conventional GDDR memory, which is exactly what data-hungry neural networks need.
Generations and bandwidth
The standard has advanced quickly. HBM3, published by JEDEC in January 2022, reaches data rates up to 6.4 Gbps per pin, or about 819 GB/s per stack across a 1024-bit interface. HBM3E, finalized in 2023, pushes pin speeds toward 9.8 Gbps and roughly 1.2 TB/s per stack. HBM4, standardized in 2025, widens the interface to 2048 bits for bandwidth of about 2 TB/s per stack. Each generation chips away at the "memory wall" that otherwise leaves compute cores idle.
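The per-stack figures follow from simple arithmetic: peak bandwidth is the per-pin data rate multiplied by the interface width, divided by 8 bits per byte. The sketch below reproduces the numbers quoted above. The function name `stack_bandwidth_gbs` is illustrative, and the 8.0 Gbps pin rate used for HBM4 is an assumption, since only the interface width and approximate bandwidth are stated here.

```python
def stack_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak per-stack bandwidth in GB/s.

    pin rate (Gb/s per pin) x interface width (pins) / 8 bits per byte.
    """
    return pin_rate_gbps * bus_width_bits / 8


# (pin rate in Gbps, interface width in bits)
generations = {
    "HBM3": (6.4, 1024),
    "HBM3E": (9.8, 1024),
    "HBM4": (8.0, 2048),  # assumed pin rate, not stated in the text above
}

for name, (rate, width) in generations.items():
    gbs = stack_bandwidth_gbs(rate, width)
    print(f"{name}: {gbs:.1f} GB/s per stack ({gbs / 1000:.2f} TB/s)")
```

These are theoretical peaks for a single stack. An accelerator with several stacks multiplies the figure accordingly, and sustained bandwidth in practice is lower.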
Why it matters
In both training and inference, performance is frequently limited by how fast weights and activations can be moved into the compute units rather than by raw arithmetic. HBM exists to relieve that bottleneck, which is why it is reserved for high-end Tensor Core GPUs and data-centre accelerators, and why HBM supply has become a strategic chokepoint for the whole AI industry.
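To make the bottleneck concrete, here is a rough back-of-the-envelope sketch. It assumes a memory-bound workload in which every model weight is read from memory once per generated token, so memory bandwidth alone caps the token rate. The function name, the 70-billion-parameter model, the 2-byte weights, and the six-stack configuration are all hypothetical values chosen for illustration, not figures for any specific product.

```python
def max_tokens_per_second(param_count: float, bytes_per_param: float,
                          stacks: int, stack_bandwidth_gbs: float) -> float:
    """Bandwidth-only upper bound on single-stream token generation.

    Assumes each weight is read exactly once per token and ignores
    compute time, caches, activations, and batching.
    """
    model_bytes = param_count * bytes_per_param
    total_bandwidth_bytes_per_s = stacks * stack_bandwidth_gbs * 1e9
    return total_bandwidth_bytes_per_s / model_bytes


# Hypothetical setup: 70e9 parameters at 2 bytes each (16-bit weights),
# served from 6 HBM3 stacks at 819.2 GB/s per stack.
rate = max_tokens_per_second(70e9, 2, 6, 819.2)
print(f"Upper bound: {rate:.1f} tokens/s")
```

Under these assumptions the ceiling works out to roughly 35 tokens per second no matter how much arithmetic throughput is available, which is why extra bandwidth translates so directly into performance.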
If you are sourcing a rig for on-premise AI, understanding HBM helps explain why top-tier AI hardware commands such a premium.