Definition
LoRA rank, written as r, is the central hyperparameter of Low-Rank Adaptation. It sets the inner dimension of the two small matrices that LoRA inserts to approximate a weight update. Instead of training a full weight-update matrix, LoRA learns a product of a down-projection and an up-projection whose shared inner dimension is r, so the rank directly determines how many trainable parameters the adapter adds.
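The link between rank and adapter size can be made concrete with a little arithmetic. The sketch below assumes a single weight matrix of shape d_out x d_in adapted by a down-projection of shape r x d_in and an up-projection of shape d_out x r; the function names and the 4096-wide layer are illustrative choices, not values taken from any particular model.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by one LoRA adapter of rank r.

    Down-projection: r x d_in, up-projection: d_out x r.
    """
    return r * d_in + d_out * r


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a full (dense) update of the same weight matrix."""
    return d_in * d_out


# Hypothetical 4096 x 4096 projection layer, adapted at a few ranks.
d_in = d_out = 4096
for r in (8, 16, 128):
    added = lora_param_count(d_in, d_out, r)
    share = added / full_param_count(d_in, d_out)
    print(f"r={r:>3}: {added:>9,} trainable parameters ({share:.2%} of a full update)")
```

The count grows linearly with r, so doubling the rank doubles the adapter's parameters for every adapted layer.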
Choosing a rank
A higher rank gives the adapter more capacity to capture complex task-specific patterns, but it increases memory use and the risk of overfitting on small datasets. A lower rank is cheaper and often generalizes better when training data is limited. Common values run from about 8 or 16 for simpler tasks up to 128 or 256 for harder, data-rich scenarios. Because the adapter weights can be merged back into the base model after training, the chosen rank affects training cost and adapter size but not the inference latency of the merged model; an adapter served unmerged adds a small extra computation per adapted layer.
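Why merging removes the rank from the inference path can be seen in a toy example. The sketch below uses tiny hand-written matrices and a hypothetical `scale` argument standing in for whatever scaling factor the adapter uses; it is an illustration of the algebra, not any library's merge routine.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(W, B, A, scale=1.0):
    """Fold the low-rank update into the base weight: W + scale * (B @ A)."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy layer: d_out = 2, d_in = 3, rank r = 1 (all values are made up).
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # base weight, 2 x 3
B = [[1.0], [2.0]]                        # up-projection, 2 x r
A = [[0.5, 0.0, -1.0]]                    # down-projection, r x 3
x = [[1.0], [2.0], [3.0]]                 # input as a column vector

# Unmerged: base path plus a separate adapter path.
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
y_adapter = [[b[0] + a[0]] for b, a in zip(base_out, adapter_out)]

# Merged: a single matrix of the original shape, regardless of r.
merged = merge_lora(W, B, A)
y_merged = matmul(merged, x)

print(y_adapter, y_merged)
```

The merged matrix has the same shape as the base weight, so the forward pass costs the same whether the adapter was trained at rank 1 or rank 256.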
Rank in practice
For someone fine-tuning an open-weight model on a single consumer GPU, rank is the main dial for balancing quality against the memory you actually have. A sensible workflow is to start low, confirm the task learns at all, then raise the rank only if validation results plateau below target. Rank rarely acts alone: it is paired with the scaling hyperparameter described under LoRA alpha, which is typically set relative to r.
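The start-low workflow can be sketched as a simple search loop. Everything here is an assumption made for illustration: `train_and_eval` is a hypothetical callback that fine-tunes at a given rank and returns a validation score where higher is better, and `min_gain` is an arbitrary threshold for deciding that a larger rank has stopped helping.

```python
from typing import Callable, Iterable, Optional, Tuple


def pick_rank(
    candidates: Iterable[int],
    train_and_eval: Callable[[int], float],
    target: float,
    min_gain: float = 0.005,
) -> Tuple[Optional[int], float]:
    """Try ranks from low to high; stop at the target or when gains stall."""
    best_rank, best_score = None, float("-inf")
    for r in sorted(candidates):
        score = train_and_eval(r)
        # A larger rank that barely improves the score is not worth its memory.
        if best_rank is not None and score - best_score < min_gain:
            break
        best_rank, best_score = r, score
        if score >= target:
            break
    return best_rank, best_score


# Stand-in for real training runs: made-up validation scores per rank.
fake_scores = {8: 0.70, 16: 0.78, 32: 0.81, 64: 0.812}
rank, score = pick_rank(fake_scores, fake_scores.__getitem__, target=0.80)
print(rank, score)
```

In a real run the callback would also choose the scaling hyperparameter for each rank, since the two are normally tuned together.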
LoRA rank also carries over to variants such as DoRA, which applies a LoRA-style low-rank update to the directional component of each weight.
