PEFT (Parameter-Efficient Fine-Tuning)

Parameter-Efficient Fine-Tuning (PEFT) is an umbrella term for techniques that specialise a large language model to a new task or domain by training only a tiny fraction of its parameters — often well under 1% — while keeping the original pretrained weights frozen. This sidesteps the enormous memory and compute cost of full fine-tuning, where every weight is updated, and makes adaptation feasible on modest hardware that a self-hoster might actually own. If full fine-tuning is remanufacturing an engine, PEFT is bolting on a precisely machined part: the base stays intact, and the modification is small, cheap, and removable.

Why full fine-tuning is out of reach

Updating every weight of a multi-billion-parameter model requires holding not just the weights but their gradients and optimizer states in memory — typically several times the model's own size — which puts full fine-tuning of even mid-sized open models beyond a single consumer GPU's VRAM. It also produces a complete new copy of the model per task, gigabytes each, which quickly becomes unmanageable. PEFT attacks both problems at once: frozen base weights need no optimizer state, and the trained artefact is only the small set of added parameters.
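As a rough illustration of the gap described above, the sketch below estimates the memory taken by weights, gradients, and optimizer state. The byte counts are assumptions rather than figures from this entry: roughly 16 bytes per trainable parameter for mixed-precision Adam, 2 bytes per frozen fp16 parameter, and activations ignored. The 7-billion-parameter base model and the 35 million added parameters are hypothetical.

```python
# Rough training-memory estimate. All byte counts are assumptions, not measurements.
BYTES_TRAINABLE = 16  # fp16 weight + fp16 gradient + fp32 master copy + two fp32 Adam moments
BYTES_FROZEN = 2      # fp16 weight only: no gradient, no optimizer state

def training_memory_gb(frozen_params: float, trainable_params: float) -> float:
    """Estimate weight, gradient, and optimizer memory in GB (activations ignored)."""
    total_bytes = frozen_params * BYTES_FROZEN + trainable_params * BYTES_TRAINABLE
    return total_bytes / 1e9

# Hypothetical 7B-parameter model with every weight trained.
full = training_memory_gb(frozen_params=0, trainable_params=7e9)

# Same model frozen, plus 35M added trainable parameters (about 0.5% of the base).
peft = training_memory_gb(frozen_params=7e9, trainable_params=35e6)

print(f"full fine-tuning: {full:.0f} GB, PEFT: {peft:.1f} GB")
```

Under these assumptions the frozen base dominates the PEFT footprint, which is why shrinking the base itself is the natural next step.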

The main PEFT approaches

Several families fall under PEFT. LoRA (low-rank adaptation) injects a pair of small low-rank matrices alongside each targeted frozen weight matrix, learning the update to that matrix rather than the matrix itself; at inference the update can be merged into the base weights, so it adds no latency. Adapters insert lightweight bottleneck modules inside the transformer layers. Prompt-tuning and prefix-tuning leave the network's weights untouched and instead learn a small set of continuous vectors, prepended to the input embeddings (prompt-tuning) or to the attention keys and values of each layer (prefix-tuning). Studies report that adapter and LoRA-style methods can match full fine-tuning accuracy on many tasks while reducing trainable parameters by over 95%, a result that reshaped how the open-model community customises models.
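The LoRA idea can be sketched in a few lines of dependency-free Python. This is a toy illustration, not any library's implementation: the matrix sizes, the `alpha / r` scaling convention, and the random values are assumptions, and a real setup would normally start `B` at zero so that training begins from the unmodified base model.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

random.seed(0)
d_in, d_out, r, alpha = 6, 4, 2, 4   # toy sizes; r is the LoRA rank
scale = alpha / r                    # assumed scaling convention

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen base weight
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]      # trainable, r x d_in
B = [[random.gauss(0, 1) for _ in range(r)] for _ in range(d_out)]     # trainable, d_out x r
x = [random.gauss(0, 1) for _ in range(d_in)]

# Adapter path: y = W x + scale * B (A x); W itself is never changed.
y_adapter = [w + scale * u for w, u in zip(matvec(W, x), matvec(B, matvec(A, x)))]

# Merged path: fold the update into a single matrix, W' = W + scale * B A.
BA = matmul(B, A)
W_merged = [[w + scale * u for w, u in zip(w_row, u_row)] for w_row, u_row in zip(W, BA)]
y_merged = matvec(W_merged, x)

def lora_fraction(d_in, d_out, r):
    """Trainable LoRA parameters as a fraction of the full weight matrix."""
    return r * (d_in + d_out) / (d_in * d_out)

print(max(abs(a - b) for a, b in zip(y_adapter, y_merged)))  # difference is floating-point noise
print(f"{lora_fraction(4096, 4096, 8):.2%} of a 4096 x 4096 layer is trainable at rank 8")
```

The two paths give the same output up to rounding, which is why a merged adapter costs nothing extra at inference, while the unmerged form keeps the adapter removable.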

Why it matters for sovereignty

PEFT is what makes running and customising your own model practical without a data-centre. Because only the small added weights are trained, the resulting artefact is typically measured in megabytes (a few to a few hundred, depending on the method and its settings) and can be shared, swapped, and version-controlled independently of the multi-gigabyte base model, a good fit for local, self-custodied AI. A home-lab machine can hold one base model and a shelf of task-specific adapters: one tuned on your documentation, one for code, one for a domain like ASIC repair notes, loaded and unloaded as needed. That modularity is the practical difference between "AI you rent" and "AI you own": the base model is a commodity you download once, and the intelligence you add stays on your disk, in your format, under your control.
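The "one base model, a shelf of adapters" arrangement can be pictured as a small registry. The class below is purely hypothetical: `AdapterShelf`, its methods, and the file names are invented for illustration and do not correspond to any particular tool, and no weights are actually loaded.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class AdapterShelf:
    """Hypothetical registry: one shared base model plus named adapter files."""
    base_model: str
    adapters: Dict[str, str] = field(default_factory=dict)  # name -> adapter path
    active: Optional[str] = None

    def register(self, name: str, path: str) -> None:
        self.adapters[name] = path

    def activate(self, name: str) -> None:
        if name not in self.adapters:
            raise KeyError(f"unknown adapter: {name}")
        self.active = name  # a real loader would read the adapter weights here

    def discard(self, name: str) -> None:
        """Drop an adapter entry; the base model is never modified."""
        self.adapters.pop(name, None)
        if self.active == name:
            self.active = None

shelf = AdapterShelf(base_model="base-model.bin")      # placeholder file name
shelf.register("docs", "adapters/docs.bin")            # placeholder paths
shelf.register("code", "adapters/code.bin")
shelf.activate("docs")
shelf.activate("code")   # switch tasks without touching the base model
shelf.discard("docs")    # an unwanted adapter is simply removed
print(shelf.active, sorted(shelf.adapters))
```

The point of the design is that every operation touches only the small adapter entries; the base model reference stays fixed throughout.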

PEFT has quieter virtues, too. Because the base weights are frozen, it largely sidesteps catastrophic forgetting (the tendency of full fine-tuning to overwrite general capability while learning a narrow task), and an adapter that turns out badly is simply deleted, with the base model untouched. Serving-side, one loaded base model can host multiple adapters switched per request, which is how a single modest GPU serves several specialised assistants at once. The discipline that remains essential is evaluation: a small adapter can still skew a model's behaviour in unexpected ways, so test against your real tasks before trusting the result.

Combining PEFT with compression

PEFT methods combine naturally with model compression: QLoRA, for instance, applies LoRA-style training on top of a quantized base model, pushing fine-tuning of surprisingly large models onto a single consumer GPU. The result is a lighter, modular alternative to traditional full fine-tuning, and for knowledge that changes frequently it pairs well with retrieval approaches like RAG rather than competing with them: tune for style and skill, retrieve for facts.
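To see why quantizing the base matters, the back-of-envelope sketch below compares a frozen 16-bit base with a frozen 4-bit base, each carrying the same trainable adapter. The numbers are assumptions for illustration only: a hypothetical 13-billion-parameter model, 50 million adapter parameters, about 16 bytes of training state per adapter parameter, and no allowance for activations or quantization metadata.

```python
ADAPTER_BYTES_PER_PARAM = 16  # assumed: weight, gradient, and Adam state for trainable parameters

def lora_memory_gb(base_params: float, adapter_params: float, base_bits: int) -> float:
    """Estimate memory in GB for a frozen base at `base_bits` precision plus a trainable adapter."""
    base_bytes = base_params * base_bits / 8
    adapter_bytes = adapter_params * ADAPTER_BYTES_PER_PARAM
    return (base_bytes + adapter_bytes) / 1e9

base, adapter = 13e9, 50e6   # hypothetical model and adapter sizes

lora_16bit = lora_memory_gb(base, adapter, base_bits=16)   # LoRA on a 16-bit base
qlora_4bit = lora_memory_gb(base, adapter, base_bits=4)    # QLoRA-style 4-bit base

print(f"16-bit base: {lora_16bit:.1f} GB, 4-bit base: {qlora_4bit:.1f} GB")
```

In this estimate the adapter's training state is the same in both cases; nearly all of the saving comes from storing the frozen base at a quarter of the width.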
