Definition
The roofline model is a simple, visual performance model that gives an upper bound on the throughput a workload can achieve on a given processor using just two hardware limits: peak compute (FLOP/s) and peak memory bandwidth (GB/s). Introduced by Williams, Waterman, and Patterson in 2009, it has become a standard way to reason about whether AI hardware is being used well.
How to read the chart
The horizontal axis is operational intensity (operations per byte of memory traffic, typically FLOP/byte); the vertical axis is attainable performance (FLOP/s). Both axes are usually drawn on a logarithmic scale. The plot has two parts that form a "roofline": a sloped line on the left, where performance is capped by memory bandwidth and rises in proportion to intensity, and a flat line on the right, where performance saturates at the chip's peak compute rate. The corner where they meet is the ridge point, located at an intensity equal to peak compute divided by peak bandwidth.
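The two-part shape described above reduces to a single min() expression. The following is a minimal sketch, not a reference implementation: the function names are illustrative, and the hardware figures (100 TFLOP/s peak compute, 1,000 GB/s bandwidth) are assumed round numbers rather than the specifications of any real chip.

```python
def attainable_flops(intensity, peak_flops, peak_bandwidth):
    """Roofline upper bound on performance, in FLOP/s.

    intensity      -- operational intensity, FLOP per byte
    peak_flops     -- peak compute rate, FLOP/s (the flat roof)
    peak_bandwidth -- peak memory bandwidth, bytes/s (the slope)
    """
    return min(peak_flops, peak_bandwidth * intensity)


def ridge_point(peak_flops, peak_bandwidth):
    """Intensity (FLOP/byte) at which the slope meets the flat roof."""
    return peak_flops / peak_bandwidth


# Assumed, illustrative hardware limits (not a real product).
PEAK_FLOPS = 100e12   # 100 TFLOP/s
PEAK_BW = 1000e9      # 1,000 GB/s

if __name__ == "__main__":
    print(f"ridge point: {ridge_point(PEAK_FLOPS, PEAK_BW):.0f} FLOP/byte")
    for oi in (1, 10, 100, 1000):
        bound = attainable_flops(oi, PEAK_FLOPS, PEAK_BW)
        print(f"intensity {oi:>4} FLOP/byte -> bound {bound / 1e12:.0f} TFLOP/s")
```

With these assumed numbers the bound grows tenfold for each tenfold increase in intensity until the ridge point, then stays flat at the peak compute rate.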
What it tells you
Plot a workload's operational intensity and you immediately see its ceiling. If it lands left of the ridge point, the workload is memory-bound: faster compute will not help, but better data reuse (which raises operational intensity) or more bandwidth will. If it lands right of the ridge point, the workload is compute-bound: faster math units, lower precision, or fewer operations are what help. The vertical gap between your measured point and the roof is an upper bound on how much headroom remains, since the roof assumes peak hardware rates that real code rarely sustains.
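The reading above can be sketched as two small helpers: one that places a workload on either side of the ridge point, and one that reports how far a measured result sits below the roof. Everything here is an assumption made for illustration: the helper names, the hardware limits (100 TFLOP/s, 1,000 GB/s), and the sample workload (2 FLOP/byte, measured at 1.2 TFLOP/s).

```python
def classify(intensity, peak_flops, peak_bandwidth):
    """Return which roof limits a workload at the given intensity."""
    ridge = peak_flops / peak_bandwidth
    # A point exactly on the ridge touches both roofs; this sketch
    # arbitrarily labels it compute-bound.
    return "memory-bound" if intensity < ridge else "compute-bound"


def roof_fraction(measured_flops, intensity, peak_flops, peak_bandwidth):
    """Fraction of the roofline bound that the measured rate achieves."""
    roof = min(peak_flops, peak_bandwidth * intensity)
    return measured_flops / roof


# Assumed, illustrative hardware limits and workload measurement.
PEAK_FLOPS = 100e12      # 100 TFLOP/s
PEAK_BW = 1000e9         # 1,000 GB/s
INTENSITY = 2.0          # FLOP per byte (hypothetical workload)
MEASURED = 1.2e12        # 1.2 TFLOP/s (hypothetical measurement)

if __name__ == "__main__":
    kind = classify(INTENSITY, PEAK_FLOPS, PEAK_BW)
    frac = roof_fraction(MEASURED, INTENSITY, PEAK_FLOPS, PEAK_BW)
    print(f"{kind}, running at {frac:.0%} of its roofline bound")
```

In this hypothetical case the workload is memory-bound and reaches a little over half of its bound, so the remaining gap is at most the distance to the bandwidth slope, not to the chip's peak compute rate.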
The roofline turns abstract specs into an actionable picture, which makes it a useful lens for sizing a local AI box. See operational intensity and the compute-bound vs memory-bound distinction to apply it.
