Definition
JSON mode is a generation setting that forces a language model to return output that parses as valid JSON. It exists because applications need machine-readable data, and free-form text wrapped in prose is fragile to extract. Running a local model behind your own tooling, JSON mode is what lets you treat the model as a structured data source rather than a chatbot, without the brittleness of regex-scraping its answers.
JSON mode vs. strict structured output
There is an important distinction. Basic JSON mode guarantees only that the output is syntactically valid JSON; it does not guarantee that required fields are present or that types are correct, and real-world testing has shown a small but nonzero schema-mismatch rate. Strict schema-bound structured output goes further: by enforcing a supplied JSON schema through constrained decoding, the model literally cannot emit a token that would violate the schema, so every required field appears and every type is correct.
How it is enforced
Under the hood, the stricter mode is grammar-constrained decoding applied to a JSON grammar derived from your schema. At each step, tokens that would break valid JSON, or violate the schema, are masked before sampling. This moves correctness from a post-hoc validation step to a guarantee baked into generation itself. The remaining work for the operator is designing a schema that is expressive enough to be useful but not so rigid that it degrades the model's reasoning.
For the general mechanism behind it, see Grammar-Constrained Decoding, and for the decoding step it sits on top of, see Greedy Decoding.
In Simple Terms
JSON mode is a generation setting that forces a language model to return output that parses as valid JSON. It exists because applications need machine-readable…
