Definition
A reasoning model, sometimes called a large reasoning model (LRM) or reasoning language model, is a language model specifically trained to work through complex problems in explicit intermediate steps before committing to an answer. Rather than responding immediately, it generates a long internal chain of reasoning, breaks the problem into smaller parts, explores candidate solutions, and checks itself, then emits a final answer. OpenAI's o1 series and DeepSeek-R1 are the models that popularized the category.
What distinguishes it from a standard LLM
A conventional chat model can be coaxed into step-by-step reasoning through prompting, but a reasoning model is trained to do it by default, typically using large-scale reinforcement learning that rewards reaching correct conclusions. The reasoning tokens it produces are the trace of that deliberation, although some providers hide the raw trace and expose only a summary. This yields markedly stronger performance on logic, mathematics, and programming, where a single early misstep derails the whole solution. The cost is latency and token spend: longer thinking means more compute per answer.
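The cost side of that trade-off is simple arithmetic, and a rough sketch makes it concrete. The prices and token counts below are hypothetical, not real provider rates, and the sketch assumes reasoning tokens are billed at the same rate as ordinary output tokens, which may not hold for every provider.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical per-million-token prices in dollars (not real rates)."""
    input_per_million: float
    output_per_million: float


def estimate_cost(prompt_tokens: int, reasoning_tokens: int,
                  answer_tokens: int, pricing: Pricing) -> float:
    """Estimate the dollar cost of a single request.

    Assumption: reasoning tokens are billed at the output-token rate.
    """
    output_tokens = reasoning_tokens + answer_tokens
    total = (prompt_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Illustrative numbers only: same prompt and answer length,
# with and without a 4,000-token reasoning trace.
pricing = Pricing(input_per_million=1.0, output_per_million=4.0)
direct = estimate_cost(500, 0, 200, pricing)
reasoned = estimate_cost(500, 4000, 200, pricing)

print(f"direct:   ${direct:.4f}")
print(f"reasoned: ${reasoned:.4f}")
print(f"ratio:    {reasoned / direct:.1f}x")
```

With these made-up figures, the reasoning trace dominates the bill even though the final answer is the same length, which is why the overhead only pays off when the extra deliberation actually changes the result.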
When it is the right tool
Reasoning models earn their overhead on genuinely multi-step problems and waste it on simple retrieval or formatting tasks, where a standard model answers faster and cheaper. DeepSeek-R1 also demonstrated that this capability can be trained efficiently and released with open weights, which matters for anyone who wants to run a capable reasoner on hardware they control rather than renting it.
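The long reasoning a reasoning model emits is a form of test-time compute: extra tokens spent at inference in exchange for a better answer. Similar behavior can be partially elicited from base models, without special prompting or further training, via Chain-of-Thought decoding.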
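A toy sketch may help show the idea behind Chain-of-Thought decoding as it is commonly described: instead of following only the single most likely first token, branch on the top-k first tokens, decode each branch greedily, and keep the branch whose answer token was chosen with the largest probability margin. Everything here is hypothetical: `TOY_MODEL` is a hand-written lookup table standing in for a real language model, answer tokens are identified naively as digit strings, and the margin for a branched first token is a simplification made for this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy "model": maps a token prefix to next-token probabilities.
# The implied question is "I have 3 apples, my dad has 2 more than me;
# how many do we have in total?" (correct answer: 8).
TOY_MODEL: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"5": 0.6, "I": 0.4},
    ("5",): {"<eos>": 1.0},
    ("I",): {"3+5=": 0.9, "5": 0.1},
    ("I", "3+5="): {"8": 0.95, "5": 0.05},
    ("I", "3+5=", "8"): {"<eos>": 1.0},
}


def margin(probs: Dict[str, float], chosen: str) -> float:
    """Probability of the chosen token minus the best alternative."""
    others = [p for tok, p in probs.items() if tok != chosen]
    return probs[chosen] - max(others, default=0.0)


def decode_branch(first_token: str,
                  max_len: int = 8) -> Tuple[List[str], List[float]]:
    """Force the first token, then continue greedily until <eos>."""
    tokens = [first_token]
    margins = [margin(TOY_MODEL[()], first_token)]
    while len(tokens) < max_len:
        probs = TOY_MODEL[tuple(tokens)]
        best = max(probs, key=probs.get)
        if best == "<eos>":
            break
        tokens.append(best)
        margins.append(margin(probs, best))
    return tokens, margins


def answer_confidence(tokens: List[str],
                      margins: List[float]) -> Tuple[str, float]:
    """Treat the last digit token as the answer; return it with its margin."""
    for tok, m in zip(reversed(tokens), reversed(margins)):
        if tok.isdigit():
            return tok, m
    return "", float("-inf")


def cot_decode(k: int = 2) -> Tuple[str, float]:
    """Branch on the top-k first tokens and keep the most confident answer."""
    first = TOY_MODEL[()]
    starts = sorted(first, key=first.get, reverse=True)[:k]
    candidates = [answer_confidence(*decode_branch(s)) for s in starts]
    return max(candidates, key=lambda c: c[1])


greedy_answer, greedy_conf = cot_decode(k=1)  # plain greedy decoding
best_answer, best_conf = cot_decode(k=2)      # explore one alternative start
print(greedy_answer, best_answer)
```

In this toy table, plain greedy decoding commits to the wrong answer immediately, while the branch that first writes out an intermediate step reaches a different answer with a much larger margin, so it is the one selected. A real implementation would replace the lookup table with model logits and would need a more careful way to locate the answer span.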
