Definition
The encoder-decoder architecture is a neural network design built from two cooperating components: an encoder that compresses an input into an internal representation, and a decoder that expands that representation into an output. It is the foundation of sequence-to-sequence (seq2seq) learning, where the input and output can be sequences of different lengths, as when translating a sentence from one language to another.
How it works
In the classic recurrent formulation, the encoder processes the input one element at a time and folds the information into an internal state, often called a context vector or hidden state. The decoder then takes that representation and generates the output step by step, each step conditioned on what it has produced so far. Because the encoder can consume an input of any length and the decoder keeps generating until it emits an end-of-sequence symbol, input and output lengths are decoupled, which a classifier with a single fixed-size output cannot offer.
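The encode-then-decode loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a trained model: the weights are random, and every name here (HIDDEN, INPUT_DIM, VOCAB, encode, decode, the weight matrices) is a hypothetical choice made for the example. It assumes a simple tanh recurrence and greedy decoding, with token id 0 doubling as the start and end-of-sequence symbol.

```python
import math
import random

HIDDEN = 4     # size of the context vector (assumed for illustration)
INPUT_DIM = 3  # size of each input element (assumed)
VOCAB = 5      # number of output token ids; id 0 is start/end-of-sequence (assumed)

def make_matrix(rows, cols, rng):
    """Random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def encode(inputs, w_in, w_rec):
    """Fold a variable-length list of input vectors into one fixed-size state."""
    state = [0.0] * HIDDEN
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, state))]
        state = [math.tanh(p) for p in pre]
    return state

def decode(context, w_rec, w_fb, w_out, eos=0, max_len=10):
    """Generate token ids one at a time, each step fed the previous token."""
    state = context
    prev = eos  # the end-of-sequence id is reused as the start symbol
    output = []
    for _ in range(max_len):
        one_hot = [1.0 if i == prev else 0.0 for i in range(VOCAB)]
        pre = [a + b for a, b in zip(matvec(w_rec, state), matvec(w_fb, one_hot))]
        state = [math.tanh(p) for p in pre]
        scores = matvec(w_out, state)
        prev = max(range(VOCAB), key=lambda i: scores[i])  # greedy choice
        if prev == eos:
            break  # the decoder, not the input length, decides when to stop
        output.append(prev)
    return output

rng = random.Random(0)
w_in = make_matrix(HIDDEN, INPUT_DIM, rng)
w_enc = make_matrix(HIDDEN, HIDDEN, rng)
w_dec = make_matrix(HIDDEN, HIDDEN, rng)
w_fb = make_matrix(HIDDEN, VOCAB, rng)
w_out = make_matrix(VOCAB, HIDDEN, rng)

short_input = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
long_input = short_input * 4  # four times as many elements

short_ctx = encode(short_input, w_in, w_enc)
long_ctx = encode(long_input, w_in, w_enc)
generated = decode(long_ctx, w_dec, w_fb, w_out)
```

Both inputs end up as a context vector of the same size, and the length of the generated list is set by the decoder loop rather than by the input. With untrained weights the generated ids carry no meaning; the point is only the shape of the data flow.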
Attention and modern variants
Early encoder-decoder models built on recurrent networks struggled to pack long inputs into one fixed-size context vector. The attention mechanism addressed this bottleneck by letting the decoder look back at all encoder states and weight them dynamically, so that each output step draws mostly on the most relevant parts of the input. This idea paved the way for the transformer, which in its original form is an attention-based encoder-decoder (many later models use only the encoder half or only the decoder half). The pattern underpins machine translation, summarisation, speech recognition, and chatbots.
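To make the attention step concrete, here is a small sketch of one decoder step attending over a set of encoder states. It assumes plain dot-product scoring followed by a softmax, which is only one of several scoring functions in use (transformers, for instance, scale the dot product by the square root of the vector dimension). The function names and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Score every encoder state against the decoder state, then blend them."""
    scores = [sum(q * k for q, k in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context is a weighted average of all encoder states,
    # recomputed at every decoding step instead of being fixed once.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Three toy encoder states and one decoder state (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoder_states)
```

In this toy case the first and third encoder states score equally against the decoder state and receive more weight than the second, so the context vector is dominated by them. A different decoder state would produce different weights, which is what lets each output step focus on a different part of the input.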
Encoder-decoder thinking also appears in generative models. For related architectures, see the Variational Autoencoder (VAE), which pairs an encoder and decoder for generation, and the Recurrent Neural Network (RNN) that powered the first seq2seq systems.
