There are many applications in which the automatic detection and recognition
of text embedded in images is useful. These applications include
digital libraries, multimedia systems, information retrieval
systems, and geographical information systems. When machine-generated
text is printed against clean backgrounds, it can be converted
to a computer-readable form (ASCII) using current Optical Character
Recognition (OCR) technology. However, text is often printed against
shaded or textured backgrounds or is embedded in images. Examples
include maps, advertisements, photographs, videos and stock certificates.
Current document segmentation and recognition technologies cannot
handle these situations well. In this paper, a four-step system
that automatically detects and extracts text in images is proposed.
First, a texture segmentation scheme is used to focus attention
on regions where text may occur. Second, strokes are extracted
from the segmented text regions. Using reasonable heuristics on
text strings such as height similarity, spacing and alignment,
the extracted strokes are then processed to form rectangular boxes
surrounding the corresponding text strings. To detect text over
a wide range of font sizes, the above steps are first applied
to a pyramid of images generated from the input image, and then
the boxes formed at each resolution level of the pyramid are fused
at the original image resolution. Third, text is
extracted by cleaning up the background and binarizing the detected
text strings. Finally, better text bounding boxes are generated
by using the binarized text as strokes. Text is then cleaned and
binarized from these new boxes. If the text is of an OCR-recognizable
font, it is passed through a commercial OCR engine for recognition.
The system is stable, robust, and works well on images (with or
without structured layouts) from a wide variety of sources, including
digitized video frames, photographs, newspapers, advertisements,
stock certificates, and personal checks. All parameters remain
the same for all the experiments.
This is joint work with Victor Wu and Edward Riseman.
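
The grouping step described above (height similarity, spacing, and
alignment) and the mapping of boxes from a pyramid level back to the
original resolution can be illustrated with a minimal sketch. This is
not the authors' implementation: the Box type, the function names, the
threshold values, and the assumption that each pyramid level halves the
resolution are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle from (x0, y0) to (x1, y1), in pixels."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def compatible(line: Box, stroke: Box,
               height_ratio: float = 0.6,
               gap_factor: float = 1.5,
               overlap_min: float = 0.5) -> bool:
    """Hypothetical test: may `stroke` extend the text string `line`?

    All three thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    small, large = sorted((line.height, stroke.height))
    # Height similarity: the two heights must be comparable.
    if large <= 0 or small / large < height_ratio:
        return False
    # Spacing: the horizontal gap must be small relative to the height.
    if stroke.x0 - line.x1 > gap_factor * large:
        return False
    # Alignment: the two boxes must overlap vertically.
    overlap = min(line.y1, stroke.y1) - max(line.y0, stroke.y0)
    return overlap >= overlap_min * small


def union(a: Box, b: Box) -> Box:
    """Smallest box containing both a and b."""
    return Box(min(a.x0, b.x0), min(a.y0, b.y0),
               max(a.x1, b.x1), max(a.y1, b.y1))


def group_strokes(strokes: List[Box]) -> List[Box]:
    """Greedily merge strokes, left to right, into text-string boxes."""
    lines: List[Box] = []
    for stroke in sorted(strokes, key=lambda b: b.x0):
        for i, line in enumerate(lines):
            if compatible(line, stroke):
                lines[i] = union(line, stroke)
                break
        else:
            # No existing string accepts this stroke: start a new one.
            lines.append(stroke)
    return lines


def to_original_scale(box: Box, level: int) -> Box:
    """Map a box found at pyramid `level` back to level 0.

    Assumes each level halves the resolution of the one below it.
    """
    f = 2 ** level
    return Box(box.x0 * f, box.y0 * f, box.x1 * f, box.y1 * f)
```

For example, three neighbouring strokes of similar height would be
merged into one box by group_strokes, while a much taller, distant
stroke would start a box of its own. Boxes found at coarser levels
would be passed through to_original_scale before any fusion at the
original resolution.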