Tesseract OCR

Text generation

Tesseract was originally developed at Hewlett-Packard Laboratories in Bristol, UK, and at Hewlett-Packard Co. in Greeley, Colorado, USA, between 1985 and 1994. Further changes followed in 1996 to port it to Windows, and parts of the code were converted to C++ in 1998. HP open sourced Tesseract in 2005. From 2006 until November 2018 it was developed by Google.


Intended Use

This model runs Tesseract (https://github.com/tesseract-ocr/tesseract) for OCR on the input and writes the recognized text as a text annotation.
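
As a rough illustration of this flow, the sketch below calls the `tesseract` command-line tool and wraps its output in a simple record. It is not this model's actual implementation. It assumes the `tesseract` executable is installed and on the PATH. The function names and the annotation layout (`source`, `type`, `value`) are hypothetical and chosen only for the example.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Build a Tesseract CLI call that prints the recognized text to stdout.

    `stdout` as the output base tells Tesseract to print instead of writing a file;
    `-l` selects the language and `--psm` the page segmentation mode.
    """
    return ["tesseract", str(image_path), "stdout", "-l", lang, "--psm", str(psm)]


def make_text_annotation(image_path, text):
    """Wrap OCR output in a record. This schema is an assumption, not the model's format."""
    return {"source": str(image_path), "type": "text", "value": text.strip()}


def ocr_to_annotation(image_path, lang="eng", psm=3, runner=subprocess.run):
    """Run Tesseract on one image and return a text annotation.

    `runner` is injectable so the function can be exercised without Tesseract installed.
    """
    if runner is subprocess.run and shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = runner(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return make_text_annotation(image_path, result.stdout)
```

With Tesseract installed, `ocr_to_annotation("page.png")` would return the record for a local image named `page.png` (a placeholder file name).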


Performance

Tesseract works best on straight, well-scanned printed text. For text in the wild (such as photographs of signs), handwriting, and similar use cases, other models should be used.


Citation

https://github.com/tesseract-ocr/tesseract