Definition
A State Space Model (SSM) is a sequence-modelling approach borrowed from control theory, in which a hidden state evolves over time according to linear dynamics and produces outputs at each step. In deep learning, SSMs have become a competitive alternative to the transformer for long sequences because they can process them in roughly linear time and memory, rather than the quadratic cost of attention. Mamba is the architecture that brought this approach to the forefront for large-scale language modelling.
Selective state spaces
Classic SSMs use fixed transition matrices, which limits their ability to reason about content. Mamba introduces a selective SSM in which the key matrices governing how the state is updated and read out become functions of the input itself. This input-dependent parameterisation lets the model decide, token by token, what to remember, suppress, or forget — recovering the content-based reasoning that made attention so powerful, but without attention's quadratic blow-up.
Hardware-aware design
Because the selective formulation cannot use the simple convolution shortcut of earlier SSMs, Mamba relies on a hardware-aware parallel associative scan tuned for modern GPUs, keeping training fast and inference efficient. The payoff is strong throughput and the ability to handle very long contexts on a fixed memory budget — attractive properties for anyone running models locally on constrained or sovereign hardware.
SSMs sit alongside transformers and recurrent networks in the modern architecture landscape. For the lineage they grew out of, see the Recurrent Neural Network (RNN), and for the broader class of large pretrained systems, the foundation model.
In Simple Terms
A State Space Model (SSM) is a sequence-modelling approach borrowed from control theory, in which a hidden state evolves over time according to linear dynamics…
