DoRA (Weight-Decomposed Low-Rank Adaptation)

Definition

DoRA (Weight-Decomposed Low-Rank Adaptation) is a parameter-efficient fine-tuning method that improves on LoRA by first decomposing each pre-trained weight matrix into two parts: a magnitude vector and a direction matrix. The magnitude is trained directly, while the direction receives a low-rank update in the LoRA style. Introduced by NVIDIA researchers and accepted as an Oral paper at ICML 2024, DoRA was designed to close the gap between LoRA and full fine-tuning without adding any cost at inference time.
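The decomposition described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the column-wise norm convention (one magnitude per column of a d x k weight), and the names `column_norms`, `dora_weight`, `w0`, `m`, `b`, and `a`, as well as the tiny matrix sizes, are hypothetical choices made for the example.

```python
import numpy as np

def column_norms(w):
    """L2 norm of each column, kept as a (1, k) row for broadcasting."""
    return np.linalg.norm(w, axis=0, keepdims=True)

def dora_weight(w0, m, b, a):
    """Recompose a weight as magnitude times unit-norm direction.

    The direction is the frozen weight plus a LoRA-style low-rank update b @ a.
    """
    v = w0 + b @ a
    return m * (v / column_norms(v))

rng = np.random.default_rng(0)
d, k, r = 6, 4, 2                 # illustrative sizes, not from the source
w0 = rng.normal(size=(d, k))      # stand-in for a pre-trained weight matrix
m = column_norms(w0)              # trainable magnitude, initialised from w0
b = np.zeros((d, r))              # low-rank factor that starts at zero
a = rng.normal(size=(r, k))       # low-rank factor that starts random

# With b at zero the low-rank update vanishes, so the recomposed weight
# reproduces the pre-trained weight before any training step is taken.
w_init = dora_weight(w0, m, b, a)
print(np.allclose(w_init, w0))
```

During training, only `m`, `b`, and `a` would receive gradients in this sketch; `w0` stays frozen.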

Why decompose the weights

The DoRA authors analysed how full fine-tuning and LoRA each change a weight's magnitude and direction during training. They found that LoRA tends to change the two in proportion, whereas full fine-tuning can pair a small change in direction with a larger change in magnitude, or the reverse. By training the two components separately, DoRA recovers that flexibility, which the authors credit for improved learning capacity and training stability. Because the trained magnitude and direction can be merged back into a single weight matrix after training, there is no extra latency when the model runs.
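Two of the claims above can be checked numerically: that magnitude can move without touching direction, and that the trained pieces collapse into one ordinary weight matrix. The sketch below assumes a layer that computes `x @ W` with column-wise norms; the helper names, the random "trained" values, and the sizes are all hypothetical stand-ins rather than anything taken from the paper's code.

```python
import numpy as np

def column_norms(w):
    """L2 norm of each column, kept as a (1, k) row for broadcasting."""
    return np.linalg.norm(w, axis=0, keepdims=True)

def dora_weight(w0, m, b, a):
    """Magnitude times unit-norm direction, with a low-rank direction update."""
    v = w0 + b @ a
    return m * (v / column_norms(v))

def column_cosine(w1, w2):
    """Cosine similarity between matching columns of two matrices."""
    dots = np.sum(w1 * w2, axis=0)
    return dots / (np.linalg.norm(w1, axis=0) * np.linalg.norm(w2, axis=0))

rng = np.random.default_rng(1)
d, k, r = 8, 5, 2                       # illustrative sizes
w0 = rng.normal(size=(d, k))            # frozen pre-trained weight
a = rng.normal(size=(r, k))
b = 0.1 * rng.normal(size=(d, r))       # pretend training moved b away from zero
m = 1.5 * column_norms(w0)              # pretend training rescaled the magnitude

# Magnitude-only change: with the direction update switched off,
# every column keeps its direction (cosine 1) while its norm changes.
w_mag_only = dora_weight(w0, m, np.zeros((d, r)), a)
cos_mag_only = column_cosine(w_mag_only, w0)

# Merge once after training, then run inference with a plain matrix multiply.
w_merged = dora_weight(w0, m, b, a)
x = rng.normal(size=(3, d))             # a small batch of inputs
y_merged = x @ w_merged

# The unmerged path (direction, normalise, rescale) gives the same outputs.
v = w0 + b @ a
y_adapter = (x @ v) / column_norms(v) * m
print(np.allclose(y_merged, y_adapter))
```

Once `w_merged` is stored, the magnitude vector and low-rank factors are no longer needed at inference, which is why the merged model runs at the same cost as the base model.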

Practical relevance for sovereign builders

For anyone fine-tuning an open-weight model on local hardware, DoRA offers higher accuracy than vanilla LoRA at a similar memory budget, which matters when you are training on a single consumer GPU rather than a rented cluster. The DoRA paper reported gains over LoRA on commonsense reasoning and vision-language tasks. The trade-off is slightly higher training-time compute from the magnitude/direction bookkeeping, though inference is unchanged.
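To give a rough sense of why the trainable-parameter budget stays close to LoRA's, the arithmetic can be sketched as below. It assumes one magnitude entry per column of a d x k weight, and the layer size and rank are illustrative numbers, not figures from the source. It counts trainable parameters only and says nothing about activation memory or training speed.

```python
def lora_trainable(d, k, r):
    """Trainable parameters in a rank-r update: a (d x r) and an (r x k) factor."""
    return r * (d + k)

def dora_trainable(d, k, r):
    """LoRA's factors plus one magnitude entry per column (assumed convention)."""
    return lora_trainable(d, k, r) + k

d, k, r = 4096, 4096, 16          # hypothetical layer size and rank
full = d * k                      # parameters touched by full fine-tuning
lora = lora_trainable(d, k, r)
dora = dora_trainable(d, k, r)

print(full)   # 16777216
print(lora)   # 131072
print(dora)   # 135168
```

In this example the magnitude vector adds 4,096 parameters on top of LoRA's 131,072, so both methods train under 1% of the layer's weights.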

DoRA sits in the same toolbox as other adapter methods covered in our glossary, including the LoRA rank and LoRA alpha hyperparameters it inherits, and the broader family of soft prompt techniques for adapting frozen models.
