Definition
Optical Character Recognition (OCR) is the conversion of images of text, from scanned documents, photographs, or screenshots, into machine-encoded characters a computer can search, edit, and index. OCR has existed for decades, but modern systems are built on deep learning rather than the hand-written rules and template matching that earlier engines relied on.
From rules to neural networks
Classic OCR segmented an image into individual glyphs and matched each against a library of character shapes, which broke down on noisy scans, unusual fonts, or handwriting. Deep-learning OCR instead trains neural networks on large labeled datasets so the model learns the visual patterns of characters directly. These pipelines typically combine a detection stage that locates text regions with a recognition stage that transcribes them, often using a vision encoder followed by a sequence decoder.
Why it matters for sovereignty
OCR is the bridge between the paper world and a searchable digital archive. Open-source engines such as Tesseract and modern transformer-based recognizers let an individual digitize receipts, books, schematics, and records on their own hardware, with no document ever leaving the local machine. That self-hosted posture matters for anyone digitizing sensitive paperwork who refuses to upload it to a third-party cloud service.
OCR is closely related to broader image understanding. See our entries on visual question answering and the vision encoder that increasingly power document-understanding models.
In Simple Terms
Optical Character Recognition (OCR) is the conversion of images of text, from scanned documents, photographs, or screenshots, into machine-encoded characters a computer can search, edit,…
