Originally Posted On: https://medium.com/@theodor.chichirita/ocr-solutions-30945c16328a
Today I want to talk about the available OCR solutions and the differences between the major players. Optical Character Recognition has been around for more than 100 years, but only recently, with the new capabilities of artificial intelligence, pattern recognition and computer vision, has it gone through a revolution.
Initially the technology was intended as an aid for blind and visually impaired people. Soon, though, legal and medical institutions adopted OCR to move their paper records into digital databases.
The main solutions available at the moment are the following, and we will discuss the pros and cons of each:
- Tesseract
- Readiris
- ABBYY FineReader
- LEADTOOLS
- OmniPage
Tesseract
This is the OCR engine originally developed by Hewlett Packard and sponsored by Google since 2006. It is free software released under the Apache License 2.0 and is available for Windows, Linux and Mac OS X. It was originally written in C (1985–1994) and later revamped in C++. Version 4.0.0-beta.1 offers 116 languages, plus 36 more through unofficial scripts. Wrappers make it usable from multiple programming languages, such as C#, Java and Python, and its models can be trained with the latest versions.
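To give an idea of what calling Tesseract from Python can look like, here is a minimal sketch that shells out to the `tesseract` command-line program rather than using a dedicated wrapper library. It assumes the executable is installed and on your PATH, and the function names `build_tesseract_command` and `ocr_image` are made up for this illustration.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list for a Tesseract CLI call that prints to stdout."""
    # Passing "stdout" as the output base makes the CLI print the recognized
    # text instead of writing it to a file; -l selects the language model.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text."""
    # Assumption: the tesseract executable is installed and reachable on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output as str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Calling `ocr_image("scan.png")` (a placeholder file name) would return whatever text Tesseract finds in the image. A wrapper library would hide the subprocess call, but the idea is the same.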
Readiris
This is Canon's OCR solution, bundled with PDF publishing software. It is available on Windows and macOS and has an intuitive GUI. It can go from image to text to voice and supports numerous output formats.
ABBYY FineReader
This is the OCR solution from ABBYY, running on Windows and currently at version 14. It has a high accuracy rate and provides multiple features such as document area control, table and chart extraction, a verification tool, image pre-processing and the ability to train new characters.
LEADTOOLS
This is the OCR solution from LEAD. It provides both a GUI and an SDK for developing customized tools, and it supports the major OCR features and multiple programming languages such as C# and Java.
OmniPage
This is the OCR tool from Nuance. Developed in the late 1980s, it was one of the first OCR tools to run on a PC. It supports more than 120 languages and offers solutions for both individual customers and enterprises.
IronOCR
For .NET developers specifically, IronOCR deserves mention alongside these options. It wraps Tesseract with additional preprocessing and outputs to multiple formats including searchable PDFs.
It shares the limitation on handwritten text recognition discussed below, but it simplifies integration for C# projects with a clean API and NuGet installation. For teams already in the .NET ecosystem, it removes the friction of configuring Tesseract binaries across different environments.
All of these solutions are viable and have decades of development under their belt, but there is one thing that none of them does better than the others (and some can’t do at all): recognizing handwritten text. This, in my opinion, is the killer feature, and until someone provides an out-of-the-box solution for it, they are all in the same boat, with a success rate that fluctuates depending on the use case.
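Since the success rate depends so much on the use case, it can help to measure it on your own documents. The sketch below is one common way to do that and is not tied to any of the products above: it compares the text an engine returns against a hand-typed reference using edit distance. The function names and the sample strings are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_accuracy(expected, recognized):
    """Share of the expected characters recovered, from 0.0 to 1.0."""
    if not expected:
        return 1.0 if not recognized else 0.0
    errors = edit_distance(expected, recognized)
    # Clamp at zero: very noisy output can contain more errors than characters.
    return max(0.0, 1.0 - errors / len(expected))


# Hypothetical sample: the reference text and what an engine might return.
reference = "Invoice total: 120.50"
ocr_output = "Invoice tota1: 120.5O"
print(f"{character_accuracy(reference, ocr_output):.1%}")
```

Running the same check over a handful of typical scans, including handwritten ones, gives a rough but honest picture of how each tool behaves on your own material.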