GPTQ

Sovereign AI

Definition

GPTQ (Generative Pre-trained Transformer Quantization) is a one-shot, post-training quantization method that compresses large language model weights down to 3 or 4 bits each while keeping accuracy close to the full-precision baseline. Introduced in a 2022 paper by Frantar et al., it was among the first techniques to reliably push LLMs below 8 bits, making it practical to run multi-billion-parameter models on a single consumer GPU. For sovereign Bitcoiners running local inference, this matters: smaller weights mean a capable model fits in the VRAM you already own, with no cloud dependency.
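To make the VRAM argument concrete, here is a rough back-of-the-envelope calculation. The 7-billion-parameter model size and the helper name `weight_memory_gib` are assumptions chosen for illustration. The figure covers the raw weights only: it ignores the per-group scales and zero-points that quantized formats store, as well as the KV cache and activations, so real usage will be somewhat higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumed model size for illustration: 7 billion parameters.
N_PARAMS = 7e9

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # full-precision baseline (FP16)
int4_gib = weight_memory_gib(N_PARAMS, 4)   # 4-bit quantized weights

print(f"FP16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {int4_gib:.1f} GiB")
print(f"Reduction:     {fp16_gib / int4_gib:.0f}x")
```

Under these assumptions the weights shrink by a factor of four, which is the difference between needing a data-center card and fitting on a typical consumer GPU.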

How it works

GPTQ is a weight-only method: it quantizes the stored weights but leaves activations in higher precision at runtime. It processes each layer's weight matrix column by column. After rounding a column to the quantization grid, it adjusts the remaining unquantized columns to compensate for the error just introduced. The adjustment uses approximate second-order (Hessian) information computed from the layer's inputs on a small calibration dataset, so much of the rounding error is absorbed by later columns rather than left to accumulate in the layer's output. The original work reported quantizing a 175-billion-parameter model in roughly four GPU-hours.
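The column-by-column procedure can be sketched in a few lines of NumPy. This is a simplified illustration, not the reference implementation: it omits the blocked "lazy" updates, weight grouping, and other optimizations real toolkits use, and it assumes a symmetric per-row 4-bit grid. The function names (`quantize_to_grid`, `gptq_quantize`), the damping value, and the random test data are all hypothetical choices made for this example.

```python
import numpy as np


def quantize_to_grid(w, scale, bits=4):
    """Round to the nearest level of a symmetric signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale


def gptq_quantize(W, X, bits=4, damp=0.01):
    """Simplified GPTQ-style pass over one layer.

    W: (d_out, d_in) weights; X: (d_in, n_samples) calibration inputs.
    Returns the quantized weights and the per-row scales.
    """
    W = W.astype(np.float64).copy()          # work on a copy
    d_in = W.shape[1]

    # Hessian of the layer-wise squared error, with damping for stability.
    H = 2.0 * X @ X.T
    H += damp * np.mean(np.diag(H)) * np.eye(d_in)

    # Upper Cholesky factor U of the inverse Hessian: Hinv = U.T @ U.
    U = np.linalg.cholesky(np.linalg.inv(H)).T

    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max(axis=1) / qmax     # one scale per output row

    Q = np.zeros_like(W)
    for i in range(d_in):
        w = W[:, i]
        q = quantize_to_grid(w, scale, bits)
        Q[:, i] = q
        # Spread this column's rounding error over the unquantized columns.
        err = (w - q) / U[i, i]
        W[:, i + 1:] -= np.outer(err, U[i, i + 1:])
    return Q, scale


# Synthetic layer with correlated inputs (assumed data, for illustration).
rng = np.random.default_rng(0)
d_out, d_in, n_samples = 16, 32, 256
W = rng.normal(size=(d_out, d_in))
X = rng.normal(size=(d_in, d_in)) @ rng.normal(size=(d_in, n_samples))

Q_gptq, scale = gptq_quantize(W, X)
Q_rtn = quantize_to_grid(W, scale[:, None])  # plain round-to-nearest baseline


def output_error(Q):
    """Squared error of the layer output on the calibration inputs."""
    return np.linalg.norm(W @ X - Q @ X) ** 2


gptq_err, rtn_err = output_error(Q_gptq), output_error(Q_rtn)
print(f"round-to-nearest output error: {rtn_err:.3e}")
print(f"GPTQ-style output error:       {gptq_err:.3e}")
```

Both results use the same 4-bit grid; the only difference is the compensation step. The benefit comes from correlations between input features, which is why the synthetic inputs above are deliberately mixed.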

Where it fits

GPTQ is one of several competing approaches in the local-inference toolkit, alongside activation-aware methods such as AWQ and the block-wise schemes used by other runtimes. It is commonly used for GPU-served INT4 models and is supported by many inference frameworks. Because it is a post-training method, it needs only a small calibration dataset and no retraining, which keeps the barrier to entry low.

To understand the broader landscape, see our entries on LLM quantization and the closely related AWQ method.

See what fits at 4-bit in the GPU–LLM fit dataset.
