

EXL2 (ExLlamaV2 format)

Sovereign AI

Definition

EXL2 is the quantization format used by ExLlamaV2, a fast inference library for running large language models on consumer-class GPUs. It builds on the same optimization method as GPTQ, an earlier second-order quantization technique, and adds fine-grained mixed precision: a single model can blend 2-, 3-, 4-, 5-, 6-, and 8-bit weights to reach nearly any target average bitrate between 2 and 8 bits per weight. For a sovereign operator with a fixed amount of VRAM, this means dialing in the size-versus-quality tradeoff the card can actually hold rather than being stuck with rigid bit-width tiers.
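To make "target average bitrate" concrete, here is a minimal Python sketch of the arithmetic. The function name `average_bpw` and the example blend are illustrative assumptions, not part of ExLlamaV2, and the calculation ignores the scales and other metadata a real quantized file also stores.

```python
def average_bpw(fractions):
    """Average bits per weight for a blend of bit-widths.

    fractions: dict mapping bit-width -> fraction of all weights stored
    at that width (hypothetical input; fractions must sum to 1).
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Weighted mean of the bit-widths.
    return sum(bits * frac for bits, frac in fractions.items())


# Half the weights at 2 bits, a quarter at 3 bits, a quarter at 4 bits.
mix = {2: 0.5, 3: 0.25, 4: 0.25}
print(average_bpw(mix))  # 2.75
```

Shifting the fractions moves the average in small steps, which is why a blended format can land between the fixed 2-, 3-, or 4-bit tiers.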

Per-layer and per-column precision

EXL2 does not apply one uniform bit-width across the whole model. It allocates more bits to layers and even individual weight columns that are sensitive to precision loss, and fewer bits where the model tolerates aggressive compression. The result is something close to sparse quantization, where the most important weights are stored at higher precision within an otherwise heavily compressed tensor. This per-layer optimization tends to give better quality than uniform quantization at the same average bitrate.
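The allocation idea can be illustrated with a simple greedy budget loop. This is a sketch under stated assumptions and not ExLlamaV2's actual optimizer: the layer names, the per-width error numbers, and the function `allocate_bits` are all hypothetical, and it works per layer only, not per column.

```python
WIDTHS = [2, 3, 4, 5, 6, 8]  # bit-widths the format can blend


def allocate_bits(layers, target_bpw):
    """Greedy per-layer allocation (illustrative, not the real algorithm).

    layers: list of dicts with 'name', 'n_weights', and 'error', where
    'error' maps bit-width -> error measured at that width (made-up data).
    Returns {name: bit-width} with an average no higher than target_bpw.
    """
    budget = target_bpw * sum(l["n_weights"] for l in layers)
    choice = {l["name"]: 0 for l in layers}  # index into WIDTHS
    used = sum(WIDTHS[0] * l["n_weights"] for l in layers)
    if used > budget:
        raise ValueError("target is below the minimum bit-width")

    while True:
        best = None
        for l in layers:
            i = choice[l["name"]]
            if i + 1 >= len(WIDTHS):
                continue  # already at the widest setting
            cur, nxt = WIDTHS[i], WIDTHS[i + 1]
            cost = (nxt - cur) * l["n_weights"]  # extra bits spent
            if used + cost > budget:
                continue  # upgrade would exceed the budget
            gain = (l["error"][cur] - l["error"][nxt]) / cost
            if best is None or gain > best[0]:
                best = (gain, l["name"], cost)
        if best is None:
            break  # no affordable upgrade remains
        _, name, cost = best
        choice[name] += 1
        used += cost
    return {name: WIDTHS[i] for name, i in choice.items()}


layers = [
    # Sensitive layer: error drops sharply with more bits.
    {"name": "attn", "n_weights": 1000,
     "error": {2: 0.9, 3: 0.4, 4: 0.2, 5: 0.1, 6: 0.05, 8: 0.01}},
    # Tolerant layer: extra bits barely help.
    {"name": "mlp", "n_weights": 1000,
     "error": {2: 0.10, 3: 0.08, 4: 0.07, 5: 0.065, 6: 0.06, 8: 0.055}},
]
print(allocate_bits(layers, target_bpw=3.0))  # {'attn': 4, 'mlp': 2}
```

Both layers average out to 3 bits per weight, but the sensitive one ends up at 4 bits while the tolerant one stays at 2, which is the behavior the paragraph above describes.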

What it enables

The format is designed for GPU inference. Documented examples include running a 70-billion-parameter model on a single 24 GB card at around 2.55 bits per weight, and fitting 13B models into 8 GB of VRAM at roughly 2.65 bits. That makes capable local models reachable on hardware many enthusiasts already own.
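A quick back-of-the-envelope check shows why those figures work. The helper `weight_gb` below is a hypothetical name for illustration. It counts weight storage only (parameters times bits per weight, divided by 8 bits per byte) and leaves out the KV cache, activations, and format metadata, so treat the results as lower bounds.

```python
def weight_gb(n_params, bpw):
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return n_params * bpw / 8 / 1e9


# 70B parameters at 2.55 bits per weight
print(round(weight_gb(70e9, 2.55), 1))  # 22.3

# 13B parameters at 2.65 bits per weight
print(round(weight_gb(13e9, 2.65), 1))  # 4.3
```

The 70B case leaves little headroom on a 24 GB card, so context length is what gets squeezed. The 13B case leaves more room within 8 GB for the cache and runtime overhead.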

EXL2 is one of several local formats; for the block-wise CPU/GPU alternative, see GGUF, and for background, see LLM quantization.

Pick a bitrate for your VRAM in the GPU–LLM fit dataset.
