Definition
A Recurrent Neural Network (RNN) is a class of neural network designed for sequential data, where connections loop back on themselves so the network can carry information from one step to the next. This recurrence gives RNNs a form of memory, making them suitable for time series, speech, and (historically) language modelling, where context from earlier in a sequence shapes the interpretation of what comes later.
Hidden state and sequence processing
At each time step an RNN computes a new hidden state as a function of the current input and the previous hidden state. That hidden state acts as a running summary of everything seen so far, so the state at the final step can serve as a context-aware encoding of the whole sequence. The same weights are applied at every step, so the parameter count does not grow with sequence length.
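The recurrence described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes a tanh activation, an all-zero initial hidden state, and hypothetical names (`rnn_forward`, `W_xh`, `W_hh`, `b_h`) chosen only for this example.

```python
import math

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return the hidden state at every step.

    inputs: list of input vectors (each a list of floats)
    W_xh:   hidden_size x input_size matrix, as a list of rows
    W_hh:   hidden_size x hidden_size matrix, as a list of rows
    b_h:    bias vector of length hidden_size
    """
    hidden_size = len(b_h)
    h = [0.0] * hidden_size  # assumed initial hidden state: all zeros
    states = []
    for x in inputs:
        # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
        # The same W_xh, W_hh and b_h are reused at every time step.
        h = [
            math.tanh(
                sum(W_xh[i][j] * x[j] for j in range(len(x)))
                + sum(W_hh[i][k] * h[k] for k in range(hidden_size))
                + b_h[i]
            )
            for i in range(hidden_size)
        ]
        states.append(h)
    return states

# Toy example: 1-dimensional inputs, 2 hidden units, made-up weights.
W_xh = [[0.5], [-0.3]]
W_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]

states = rnn_forward([[1.0], [0.5], [-1.0]], W_xh, W_hh, b_h)
final_state = states[-1]  # running summary after the last input
```

Because the weights are shared across steps, the same function handles a sequence of any length without adding parameters; only the number of loop iterations changes.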
The vanishing gradient problem and LSTM
Training RNNs by backpropagation through time runs into the vanishing gradient problem: the gradient is multiplied by a per-step factor for every step it travels back, and when those factors are smaller than one the product shrinks exponentially toward zero, so the network struggles to learn long-range dependencies. The Long Short-Term Memory (LSTM) network addresses this by adding a separate cell state and gating mechanisms that update the cell state additively, which lets information and gradients flow across many steps with far less attenuation. Gated Recurrent Units (GRUs) offer a lighter-weight alternative with similar benefits.
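The shrinking product can be made concrete with a scalar toy model. The sketch below is a deliberately simplified illustration with hypothetical function names: it compares the gradient through a one-unit tanh RNN with the gradient along an LSTM-style cell state, and it treats the forget gate as a fixed constant. In a real LSTM the gates depend on the inputs and hidden state, so the full gradient has additional terms.

```python
import math

def scalar_rnn_gradient(w, xs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in xs:
        h = math.tanh(w * h + x)
        # Chain rule: derivative of tanh at this step, times the recurrent weight.
        grad *= w * (1.0 - h * h)
    return grad

def cell_state_gradient(forget_gates):
    """Return d c_T / d c_0 for c_t = f_t * c_{t-1} + i_t * g_t.

    Simplifying assumption: the gates are treated as constants, so each
    step contributes only its forget-gate value to the product.
    """
    grad = 1.0
    for f in forget_gates:
        grad *= f
    return grad

steps = 50
rnn_grad = scalar_rnn_gradient(w=0.9, xs=[0.5] * steps)   # every factor is below 0.9
cell_grad = cell_state_gradient([0.99] * steps)           # forget gate held near 1

print(f"tanh RNN gradient over {steps} steps:   {rnn_grad:.3e}")
print(f"cell-state gradient over {steps} steps: {cell_grad:.3f}")
```

With these made-up values, every factor in the RNN product is below 0.9, so the gradient after 50 steps is below 0.9^50 (about 0.005), while a forget gate held near 1 keeps the cell-state gradient above 0.5.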
RNNs and LSTMs dominated sequence modelling until transformers and, more recently, state space models offered better parallelism and longer effective context. For those successors, see our entries on the State Space Model (Mamba) and the encoder-decoder architecture.
