Definition
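A hyperparameter is a configuration value you set before training begins, which governs how a model learns rather than what it learns. Unlike the model's weights, which are adjusted automatically during training, hyperparameters are chosen by whoever configures the run, by hand or through an automated search, and are not updated by the training process itself. Even a value that changes during training, such as a learning rate on a decay schedule, follows a schedule that was fixed in advance. They are the knobs on the machine.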
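The distinction can be made concrete with a toy example. The sketch below is illustrative only: the function name `train`, the data, and the chosen values are assumptions, not part of any particular library. It fits a single weight by gradient descent, where `learning_rate` and `epochs` are hyperparameters passed in from outside and `w` is the parameter learned from data.

```python
def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # model weight: adjusted automatically during training
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # learning_rate controls the step size; it is never updated here
        w -= learning_rate * grad
    return w


# Toy data generated by y = 2x (an assumption made for illustration)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Hyperparameters are chosen before the run starts
w = train(xs, ys, learning_rate=0.05, epochs=200)
print(w)
```

The training loop changes `w` on every step, while `learning_rate` and `epochs` only shape how that change happens.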
Examples
Common hyperparameters include the learning rate, the batch size, the number of training epochs, the strength of regularization, and architectural choices such as the number of layers or neurons. In fine-tuning workflows, the rank of a low-rank adapter is also a hyperparameter. Each of these shapes the trajectory of training without itself being learned from data.
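In practice these values are often collected in a single configuration object. The following is a minimal sketch under stated assumptions: the class name `Hyperparameters`, the field names, and the default values are hypothetical placeholders chosen for illustration, not recommendations and not the schema of any specific framework.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)  # frozen: values cannot be reassigned once the run is configured
class Hyperparameters:
    learning_rate: float = 3e-4   # optimizer step size
    batch_size: int = 32          # examples per gradient update
    num_epochs: int = 3           # full passes over the training set
    weight_decay: float = 0.01    # regularization strength
    num_layers: int = 12          # architectural choice
    hidden_size: int = 768        # neurons per layer
    lora_rank: int = 8            # rank of a low-rank adapter (fine-tuning only)


# Override only the values you want to change for this run
config = Hyperparameters(learning_rate=1e-4, batch_size=64)
print(asdict(config))
```

Keeping the configuration immutable and logging it alongside results makes it easier to tell afterwards which settings produced which model.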
Tuning and trade-offs
Because good hyperparameter values are rarely obvious in advance, practitioners search for them, sometimes by hand, sometimes with automated methods such as grid search, random search, or Bayesian optimization. The choices interact: a learning rate that trains stably at one batch size may cause the loss to diverge at another. Poorly chosen hyperparameters are a frequent cause of both underfitting and overfitting, so disciplined tuning, evaluated against a held-out validation set, is one of the highest-leverage activities in self-hosted model training.
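A grid search over one hyperparameter can be sketched in a few lines. This is a toy illustration, not a production tuning setup: the helper names `train` and `mse`, the synthetic data, the candidate learning rates, and the fixed epoch budget are all assumptions made for the example. Each candidate is trained on the training split and scored on a held-out validation split, and the candidate with the lowest validation error is kept.

```python
import random


def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data: y = 3x plus a little noise (seeded for repeatability)
rng = random.Random(0)
xs = [i / 10 for i in range(1, 41)]
ys = [3.0 * x + rng.gauss(0, 0.05) for x in xs]

# Hold out every fourth point as the validation set
val_idx = set(range(0, 40, 4))
train_x = [x for i, x in enumerate(xs) if i not in val_idx]
train_y = [y for i, y in enumerate(ys) if i not in val_idx]
val_x = [x for i, x in enumerate(xs) if i in val_idx]
val_y = [y for i, y in enumerate(ys) if i in val_idx]

# Grid search: train once per candidate, score on validation data only
results = {}
for lr in [0.001, 0.01, 0.1, 0.5]:
    w = train(train_x, train_y, learning_rate=lr, epochs=20)
    results[lr] = (mse(w, val_x, val_y), w)

best_lr = min(results, key=lambda lr: results[lr][0])
best_w = results[best_lr][1]
print(best_lr, best_w)
```

In this toy setup the two smallest rates leave the weight well short of its target within the epoch budget, and the largest makes the weight grow without bound. Real searches usually cover several hyperparameters at once, which is where random search and Bayesian optimization tend to be more economical than an exhaustive grid.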
For related concepts, see our entries on the training epoch, regularization, and overfitting.
See how settings affect cost in the inference cost calculator.
