Recurrent Neural Network (RNN)

Sovereign AI

A Recurrent Neural Network (RNN) is a class of neural network designed for sequential data, where connections loop back on themselves so the network can carry information from one step to the next. This recurrence gives RNNs a form of memory, making them suitable for time series, speech, and (historically) language modelling, where context from earlier in a sequence shapes the interpretation of what comes later. Before the transformer era, RNNs were the workhorse of sequence modelling — and understanding them explains a great deal about why modern architectures look the way they do.

Hidden state and sequence processing

At each time step an RNN computes a hidden state as a function of the current input and the previous hidden state. That hidden state acts as a running summary of everything seen so far; the state at the final step represents a context-aware encoding of the whole sequence. The same weights are applied at every step, which keeps the parameter count compact regardless of sequence length — an elegant property that also means an RNN's memory footprint during inference stays constant no matter how long the input grows, unlike a transformer whose attention cache grows with every token.

The vanishing gradient problem and LSTM

Training RNNs by backpropagation through time runs into the vanishing gradient problem: repeated multiplication of small gradients across many steps drives them toward zero, so the network struggles to learn long-range dependencies — by the time the error signal travels back a few hundred steps, it has effectively evaporated. The Long Short-Term Memory (LSTM) network addresses this by adding a separate cell state and gating mechanisms (input, forget, and output gates) that let information flow across many steps with an additive update, preserving gradients. Gated Recurrent Units (GRUs) offer a lighter-weight alternative with similar benefits. LSTMs powered a generation of practical systems — speech recognition, machine translation, keyboard prediction — through the mid-2010s.

Why transformers displaced them

The RNN's defining feature is also its bottleneck: each step depends on the previous one, so computation is inherently serial and cannot exploit the massive parallelism of modern GPUs during training. The transformer replaced recurrence with the attention mechanism, which looks at all positions in a sequence simultaneously — trivially parallel, and far better at connecting distant tokens. Combined with scale, that architectural shift produced the large language models of today. The trade was not free: attention costs grow quadratically with sequence length, which is exactly the pressure that has revived recurrent ideas. Modern state space models such as Mamba are, at heart, principled descendants of the RNN — linear-time sequence processing with a fixed-size state, redesigned so they can also be trained in parallel.

Relevance for local AI

There is a mining-adjacent footnote worth making: recurrence is the natural shape for the telemetry a fleet throws off. Hashboard temperatures, fan speeds, per-chain hashrate, and reject rates are all time series, and the failure modes that matter — a slowly clogging heatsink, a power supply drifting under load, a chain that degrades before it dies — announce themselves as patterns over time rather than single bad readings. Small recurrent or gated models are a proven, lightweight fit for exactly this kind of anomaly detection, and they run comfortably on modest hardware with no GPU at all. The same architecture that once powered translation engines is, at hobbyist scale, a sensible way to teach a monitoring script what "normal" sounds like for your particular machines and to flag the week when it changes.

For anyone running models on their own hardware, the RNN lineage matters practically. Constant-memory sequence processing is attractive on machines with limited VRAM, which is why hybrid and state-space architectures keep appearing in the open-weight ecosystem, and why small recurrent networks still earn their keep in wake-word detection and other edge audio tasks. For the architectural successors, see the encoder-decoder architecture that framed the RNN's greatest successes and the attention-based designs that followed.

A Recurrent Neural Network (RNN) is a class of neural network designed for sequential data, where connections loop back on themselves so the network can…

Explore the Full Glossary

Browse all Bitcoin mining terms from A to Z. Whether you are a beginner or expert, deepen your understanding of the mining ecosystem.

Mining Glossary

ASIC Miner Database

Compare 500+ miners with real-time profitability data, home mining scores, and detailed specs.

Compare Miners