Definition
FLOPS stands for floating-point operations per second and is the standard unit for expressing the raw arithmetic throughput of a processor. Modern AI accelerators are rated in TFLOPS (10^12 FLOPS) or PFLOPS (10^15 FLOPS), almost always measured on the dense matrix-multiply (GEMM) that dominates neural-network compute.
FLOPS versus FLOPs: a critical distinction
The case of the final letter carries real meaning. FLOPS (capital S) is a rate — operations per second — describing how fast hardware runs. FLOPs (lowercase s) is a count — the total number of floating-point operations a workload requires, with no time dimension. Training a large model takes some fixed number of FLOPs; the GPU cluster delivers some number of FLOPS. Confusing the two is a frequent source of error in capacity planning.
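The relationship between the two quantities can be sketched in a few lines: dividing a count (FLOPs) by a rate (FLOPS) gives a time. The workload size, cluster rating, and utilization below are made-up illustrative numbers, not measurements of any real model or hardware.

```python
def training_time_seconds(total_flops, peak_flops_per_s, utilization):
    """Estimate wall-clock time: a count of FLOPs divided by a rate in FLOPS."""
    sustained_flops_per_s = peak_flops_per_s * utilization
    return total_flops / sustained_flops_per_s

# Hypothetical inputs, chosen only for illustration.
workload_flops = 1e21   # FLOPs: total operations the job needs (a count)
cluster_peak = 1e15     # FLOPS: 1 PFLOPS of peak cluster throughput (a rate)
utilization = 0.4       # assumed fraction of peak actually sustained

seconds = training_time_seconds(workload_flops, cluster_peak, utilization)
print(f"{seconds:.3g} s, or {seconds / 86400:.1f} days")
```

With these assumed inputs the job takes roughly 2.5 million seconds, a little under a month. Swapping the count and the rate, or ignoring the utilization factor, changes the answer by orders of magnitude.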
Peak versus sustained
Vendors quote peak (theoretical) FLOPS derived from clock speed, core count, and operations per cycle. Real workloads rarely hit it. Sustained throughput is commonly 30–60% of peak because of memory bandwidth limits, communication, and software overhead. Precision matters too: a chip's FP8 or BF16 rate can be many times its FP32 rate, so any FLOPS figure is meaningless without the precision attached.
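As a rough illustration of how a peak figure is derived and then discounted, the sketch below multiplies core count, clock speed, and operations per cycle, then applies the 30–60% sustained range quoted above. The chip parameters are hypothetical and do not describe any real product; a real datasheet would also give a separate figure per precision.

```python
def peak_flops(cores, clock_hz, flops_per_cycle_per_core):
    """Theoretical peak: every core issues its maximum FLOPs on every cycle."""
    return cores * clock_hz * flops_per_cycle_per_core

# Hypothetical chip, for illustration only.
cores = 1000
clock_hz = 2e9            # 2 GHz
flops_per_cycle = 64      # assumed per-core rate at one particular precision

peak = peak_flops(cores, clock_hz, flops_per_cycle)

# Sustained range from the text: commonly 30-60% of peak.
sustained_low = 0.30 * peak
sustained_high = 0.60 * peak

print(f"peak:      {peak / 1e12:.1f} TFLOPS")
print(f"sustained: {sustained_low / 1e12:.1f} to {sustained_high / 1e12:.1f} TFLOPS")
```

Under these assumptions the peak is 128 TFLOPS, while a realistic workload would see somewhere between about 38 and 77 TFLOPS.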
For sovereign Bitcoiners running local AI inference, FLOPS is only half the story — memory bandwidth often decides real speed. Pair this with the roofline model and Model FLOPs Utilization to judge whether hardware is actually being used well.
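For readers who want to see how those two tools work, here is a minimal sketch. The roofline model caps attainable throughput at the lower of the compute peak and memory bandwidth multiplied by arithmetic intensity (FLOPs performed per byte moved). Model FLOPs Utilization (MFU) is the achieved model FLOPS divided by peak FLOPS. All hardware numbers are hypothetical.

```python
def roofline_attainable_flops(peak_flops, bandwidth_bytes_per_s, flops_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_flops, bandwidth_bytes_per_s * flops_per_byte)

def model_flops_utilization(achieved_flops_per_s, peak_flops):
    """MFU: fraction of peak that the model's useful FLOPs actually reach."""
    return achieved_flops_per_s / peak_flops

# Hypothetical accelerator, for illustration only.
peak = 100e12        # 100 TFLOPS compute peak
bandwidth = 1e12     # 1 TB/s memory bandwidth

# Intensity at which the two roofs meet (the "ridge point").
ridge_point = peak / bandwidth

# A low-intensity workload (assumed 2 FLOPs/byte) is memory-bound.
memory_bound = roofline_attainable_flops(peak, bandwidth, 2.0)

# A high-intensity workload (assumed 400 FLOPs/byte) is compute-bound.
compute_bound = roofline_attainable_flops(peak, bandwidth, 400.0)

print(f"ridge point:   {ridge_point:.0f} FLOPs/byte")
print(f"memory-bound:  {memory_bound / 1e12:.0f} TFLOPS attainable")
print(f"compute-bound: {compute_bound / 1e12:.0f} TFLOPS attainable")
print(f"MFU if memory-bound: {model_flops_utilization(memory_bound, peak):.0%}")
```

In this sketch the memory-bound workload can reach at most 2 of the 100 available TFLOPS, which is why a high FLOPS rating alone says little about speed when the workload moves many bytes per operation.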
