Reads text in the given image.
This tool uses the Tesseract Optical Character Recognition
(OCR) engine, which utilizes training data files for different
languages and scripts. The engine is able to recognize characters
and read texts with the character sets of the languages that these
data files are trained for. The English trained data file is included
in the component and used by default. This means that the output text
contains only characters used in English texts. In ambiguous cases,
it may also tend to return words belonging to the English language rather
than those belonging to other languages that share the same character
set. Data files for other languages are available at
https://github.com/tesseract-ocr/tessdata_fast. New data files can
also be trained and customized using the Tesseract training tools.
To add a new data file, copy it to the directory
in the VisionAppster installation. This tool also supports the
compressed archive format used by Tesseract for the data files.
Use the language parameter to select the data files to be loaded.
image – The input image.
The page segmentation mode used with the Tesseract engine.
Automatic – Automatic page segmentation, but no OSD or OCR.
Automatic With OSD – Automatic page segmentation with orientation and script detection (OSD).
Automatic With OCR – Fully automatic page segmentation, but no OSD.
Single Column – Assume a single column of text of variable sizes.
Single Vertical Block – Assume a single uniform block of vertically aligned text.
Single Block – Assume a single uniform block of text.
Single Line – Treat the image as a single text line.
Single Word – Treat the image as a single word.
Circled Word – Treat the image as a single word in a circle.
Single Character – Treat the image as a single character.
Sparse Text – Find as much text as possible in no particular order.
Sparse Text With OSD – Sparse text with orientation and script detection.
Raw Line – Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
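For readers who also use the Tesseract command-line tool, the modes listed above can be related to the numbers accepted by its --psm option. The following is a minimal sketch, not this tool's API: the Python names PageSegMode and psm_args are hypothetical, and it is an assumption that this tool's modes correspond one-to-one to those numbers.

```python
from enum import IntEnum
from typing import List


class PageSegMode(IntEnum):
    """Tesseract --psm numbers under hypothetical Python names."""
    AUTOMATIC_WITH_OSD = 1     # automatic segmentation with OSD
    AUTOMATIC = 2              # automatic segmentation, no OSD or OCR
    AUTOMATIC_WITH_OCR = 3     # fully automatic segmentation, no OSD
    SINGLE_COLUMN = 4
    SINGLE_VERTICAL_BLOCK = 5
    SINGLE_BLOCK = 6
    SINGLE_LINE = 7
    SINGLE_WORD = 8
    CIRCLED_WORD = 9
    SINGLE_CHARACTER = 10
    SPARSE_TEXT = 11
    SPARSE_TEXT_WITH_OSD = 12
    RAW_LINE = 13


def psm_args(mode: PageSegMode) -> List[str]:
    """Build the command-line arguments that select the given mode."""
    return ["--psm", str(int(mode))]
```

For a cropped image that holds one line of text, psm_args(PageSegMode.SINGLE_LINE) yields the argument pair to append to a tesseract command line.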
The Tesseract engine generally recognizes only dark text on a light
background. In this tool, the text reading is first attempted with
the original image, and if the resulting confidence falls below
this value, a new attempt is made with an inverted image.
The result with the higher confidence is then returned.
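The retry logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: the image is modeled as rows of 8-bit grayscale values, the recognizer is any callable returning a (text, confidence) pair, and the names invert and read_with_inversion_fallback are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]                            # rows of 0..255 values (assumed)
Recognizer = Callable[[Image], Tuple[str, float]]  # returns (text, confidence)


def invert(image: Image) -> Image:
    """Return the negative of an 8-bit grayscale image."""
    return [[255 - pixel for pixel in row] for row in image]


def read_with_inversion_fallback(
    image: Image, recognize: Recognizer, threshold: float
) -> Tuple[str, float]:
    """Read the original image; retry on the inverted one if confidence is low."""
    text, confidence = recognize(image)
    if confidence >= threshold:
        return text, confidence
    inv_text, inv_confidence = recognize(invert(image))
    # Return whichever attempt has the higher confidence.
    if inv_confidence > confidence:
        return inv_text, inv_confidence
    return text, confidence
```

With a threshold of 0, the second attempt is never made; a threshold of 1 makes the second attempt whenever the first is less than fully confident, at the cost of running the recognizer twice.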
language – Defines the language training data files to be
loaded for the Tesseract engine. The name of the data file
excluding the file extension is used. Multiple languages are
defined by joining their names with a + sign: for example,
hin+eng will load Hindi and English. Languages may
internally specify that they want to be loaded with one or more
other languages, so the ~ sign is available to override that.
For example, if hin were set to load eng by default, then
hin+~eng would force loading only hin. The number of loaded languages is
limited only by memory, with the caveat that loading additional
languages will impact both speed and accuracy, as there is more
work to do to decide the applicable language, and there is more
chance of hallucinating incorrect words.
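A small helper can make such language strings less error-prone to assemble. The sketch below assumes that names are joined with + and that ~ is written directly before the name it suppresses, as in hin+~eng; the function names are hypothetical and not part of this tool.

```python
from typing import Iterable, List, Tuple


def build_language_spec(load: Iterable[str], suppress: Iterable[str] = ()) -> str:
    """Join language names with '+', prefixing suppressed ones with '~'."""
    parts = list(load) + ["~" + name for name in suppress]
    if not parts:
        raise ValueError("at least one language is required")
    return "+".join(parts)


def parse_language_spec(spec: str) -> Tuple[List[str], List[str]]:
    """Split a spec into (languages to load, languages to suppress)."""
    load: List[str] = []
    suppress: List[str] = []
    for part in spec.split("+"):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing '+'
        if part.startswith("~"):
            suppress.append(part[1:])
        else:
            load.append(part)
    return load, suppress
```

Because each additional language costs speed and accuracy, it may be worth listing only the languages that are actually expected in the input.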
languagePath – The path where the language training data
files are loaded from. The default path points to the internal
resources folder of the installation.
engine – The engine version that Tesseract uses. It has two
engines, the legacy Tesseract engine and the new LSTM line
recognizer engine. There is rarely a reason to change the default value.
In some extreme cases, using the Tesseract Only value to force the old
engine version to be used may lead to better results.
Tesseract Only – Use only the old Tesseract engine.
LSTM Only – Use only the new LSTM line recognizer engine.
Combined – Run the LSTM line recognizer but allow fallback to the old Tesseract engine if it fails.
Default – Allow the language-specific configurations in the data files to specify the engine to use or, if none is specified, use the default one.
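The four engine values match the numbers accepted by the --oem option of the Tesseract command-line tool. The sketch below records that mapping; EngineMode and oem_args are hypothetical names, and it is an assumption that this tool passes the same numbers to the engine.

```python
from enum import IntEnum
from typing import List


class EngineMode(IntEnum):
    """Tesseract --oem numbers under hypothetical Python names."""
    TESSERACT_ONLY = 0   # legacy engine only
    LSTM_ONLY = 1        # LSTM line recognizer only
    COMBINED = 2         # legacy and LSTM engines together
    DEFAULT = 3          # whatever the data files and build provide


def oem_args(mode: EngineMode = EngineMode.DEFAULT) -> List[str]:
    """Build the command-line arguments that select the given engine."""
    return ["--oem", str(int(mode))]
```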
text – The recognized UTF-8 encoded text.
confidence – The average confidence of the recognized words
in the returned text, on the scale [0, 1].
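As an illustration of how such a value could be derived, the sketch below averages per-word scores and rescales them to [0, 1]. It assumes the per-word scores are on a 0 to 100 scale, which is how Tesseract reports word confidences; how this tool actually computes the output is not stated here, and mean_confidence is a hypothetical name.

```python
from typing import Sequence


def mean_confidence(word_confidences: Sequence[float]) -> float:
    """Average per-word confidences (assumed 0..100) and rescale to [0, 1]."""
    if not word_confidences:
        return 0.0  # no words recognized (a choice made for this sketch)
    return sum(word_confidences) / (100.0 * len(word_confidences))
```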