Definition
Special tokens are reserved entries in a tokenizer's vocabulary that carry structural or control meaning instead of representing literal text. They tell a language model where a sequence begins and ends, where to stop generating, how to pad a batch, and — in chat models — who is speaking. Without them, a model has no reliable way to distinguish a system instruction from a user message, or to know when to halt.
The common set
The classic trio is BOS (beginning of sequence), EOS (end of sequence), and PAD (padding). BOS signals the start of generation; EOS marks completion and is the cue that stops decoding; PAD fills shorter sequences so every example in a batch has equal length, with an attention mask ensuring the model ignores the filler. Many tokenizers also reserve an UNK (unknown) token, though byte-level schemes rarely need it. Chat-tuned models add role and turn markers such as control tokens that delimit user and assistant turns.
Why correct handling matters
Special tokens are a frequent source of subtle bugs. Adding them twice — for example, applying a chat template that already inserts them and then re-tokenizing with automatic insertion enabled — produces duplicate markers that degrade output quality. Conversely, omitting an EOS during fine-tuning teaches a model never to stop. Anyone running models locally for sovereignty reasons should inspect exactly which special tokens their tokenizer injects.
Special tokens occupy reserved slots in the tokenizer vocabulary and are the building blocks that a chat template arranges into a structured conversation.
In Simple Terms
Special tokens are reserved entries in a tokenizer’s vocabulary that carry structural or control meaning instead of representing literal text. They tell a language model…
