ScanWorX by Xerox
This document explains how to use the ScanWorX software, which provides
off-the-shelf OCR capability for desktop computing. Before starting the
program, be advised that this software is licensed to operate only on
scribble, which is a Sparc-10 running SunOS 4.1.4. The ScanWorX home
directory is /fs/scanworx/, which is a link to the
/fs/betelgeuse/scanworx directory. To use the ScanWorX software, add
/fs/scanworx/bin to your path by inserting this line in your .cshrc
file:
set path=($path /fs/scanworx/bin)
Then start the ScanWorX software by typing the following command at the
shell prompt:
The xterm will then echo
and the following window will appear.
After clicking the Continue bar with the mouse, you will see the first
menu-driven window, shown below.
The Help button provides self-explanatory information, and you are
encouraged to use it when you run into specific problems. This document
gives only a basic explanation of how to use the software.
The primary difference between a Doc and a Group, as shown in the previous
figure, is that a document is organized as a collection of images, while a
group is a collection of documents.
For first-time users, the first thing to do is to create a new document.
Click the NewDoc button and the following window appears.
The Untitled.dct file stores the document's settings, such as the image
file source and the various OCR settings. You can use the File menu
options to save the file under another name for future retrieval.
The main procedure at this stage is to define the input source, the
processing parameters, and the output location. These can be specified in
a number of different combinations, depending on the situation. For
starters, let's assume that you have TIFF-formatted images to be OCRed.
Choose the preset option, which is at the Scan, Recognize, and Convert
setting; this presents you with the set of options shown below:
Choose the TIFF, Preview, Recognize, and Convert option so that
the application will load a TIFF image, show it to you, recognize the
characters, and convert the results to text format. Press the Go button
to start this process.
First, a window opens asking for a base filename and lets you browse the
disk for existing file names. Type in a name in the desired path and hit
the OK button. This name is used both to generate the text output and to
save the document settings. The text output name is the name you typed
followed by a txt extension. After you hit the OK button, another file
specification window opens asking for the location of the input TIFF
image. Select the image you want and hit the OK button. If the TIFF
image has the right tags, a preview window opens showing the bitmap of
the image you specified. Not all TIFF files are equal; the next section
says a little more about this. When you are satisfied that this is the
image you want to OCR, hit the RESUME button. The ScanWorX software then
recognizes the image and converts its recognition output to text. The
image window should close without showing any error messages and return
you to the original window. If you go to an xterm window and edit or
display the file named by the base name plus the txt extension, you will
see the OCR output. At this point you can save the document by going
through the File menu and taking the appropriate action.
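As a small illustration of the naming rule above, the following Python
sketch derives the text output path from a base filename and reads the
result back. It is not part of ScanWorX. The function names
ocr_text_path and read_ocr_text are hypothetical, and the sketch assumes
the txt extension is joined to the base name with a dot.

```python
from pathlib import Path


def ocr_text_path(base_name: str) -> Path:
    """Path of the text output for a base filename (assumed: base + '.txt')."""
    return Path(base_name + ".txt")


def read_ocr_text(base_name: str) -> str:
    """Return the OCR text; raises FileNotFoundError if no output exists yet."""
    # errors="replace" keeps the read from failing on stray non-text bytes.
    return ocr_text_path(base_name).read_text(errors="replace")
```

For a base name such as /tmp/page1 (a placeholder), ocr_text_path gives
/tmp/page1.txt, which is the file to inspect once the image window has
closed.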
Some useful information
A key item to know about is the image file format. The ScanWorX software
is rather picky about TIFF images. Regardless of compression method, it
is not guaranteed that every TIFF image can be read and processed by the
ScanWorX software. For example, the University of Washington database
images are in the wrong format and cannot be readily input to the
ScanWorX software for processing. Even the output of the xv software is
not compatible. This significantly reduces the range of input available
to this package. However, the output of the scanner software (openscan),
if selected as TIFF(G3) or TIFF(G4), can be used as input to the
ScanWorX software.
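Because the package is picky about its input, it can help to check how a
TIFF file is compressed before loading it. The following Python sketch,
which is not part of ScanWorX or openscan, reads the Compression tag
(tag 259 in the TIFF 6.0 specification) from the first image directory
and reports whether it is CCITT Group 3 (value 3) or Group 4 (value 4).
The function names are hypothetical, and a matching tag is only a hint,
not a guarantee that ScanWorX will accept the file, since other tags
matter too.

```python
import struct

# TIFF 6.0 Compression tag and the values used for the CCITT fax encodings.
COMPRESSION_TAG = 259
CCITT_NAMES = {3: "G3", 4: "G4"}


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a TIFF byte string."""
    if len(data) < 8:
        raise ValueError("too short to be a TIFF file")
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file (bad byte-order mark)")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file (bad magic number)")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, field type, count, value/offset.
        entry_offset = ifd_offset + 2 + 12 * i
        tag, _ftype, _count = struct.unpack_from(endian + "HHI", data, entry_offset)
        if tag == COMPRESSION_TAG:
            # A single SHORT is stored left-justified in the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry_offset + 8)
            return value
    return 1  # TIFF default when the tag is absent: no compression


def looks_like_fax_tiff(data: bytes) -> bool:
    """True if the first image is tagged as CCITT G3 or G4 compressed."""
    return tiff_compression(data) in CCITT_NAMES
```

For example, looks_like_fax_tiff(open("page.tif", "rb").read()) returns
True for a file tagged G3 or G4; page.tif is a placeholder name.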
A very nice option available in the ScanWorX software is the ability to
verify obscure words or marks found in the image. To use it, check the
verify box, which is on the right side of the main window that appears
after pressing the NewDoc button of the first options bar. Note that the
verify button is right below the feedback button, and after the verify
button is pressed the feedback button is grayed out, signaling that you
may not use that option. The result is that, after the image is
previewed, a new window appears which allows the user to intervene in
ambiguous character recognition. An example of such a window is shown
below.
The verifier's main purpose is not to correct misspellings or incorrect
recognition but to steer the recognition engine in the appropriate
direction. In the example shown above, associating the letter `g' with
the text forces the OCR engine to modify its template to include that
instance of `g'. The process of template modification has several
ramifications and will not be covered here. The set of buttons on the
right side of the verifier window allows for several tasks, including:
change the recognition output of the current letter, skip the word
without any modifications, associate the highlighted component with a
non-character entity, join the previous component with the current
component, join the next component with the current component, and
others. Full details of these tasks can be found by pressing the HELP
button.
A couple of operations are available while previewing the image.
The head of the previewer is shown below:
Other than the obvious zooming capabilities, the `T' button allows the
user to specify an image template, in the form of text boxes, that
applies to the other images specified within the document structure, so
that the OCR engine can concentrate on performing OCR on these selected
regions. After at least two regions are selected, the chain link at the
right of the buttons is enabled. By pressing the chain link you can
specify the order in which the OCR engine converts the boxed regions.
This is useful for getting the OCR engine to perform recognition with
verification on the desired regions; after the verification is closed,
it generalizes to the remaining boxed regions.
A large number of other options are available which are not covered
in this document. The user is encouraged to experiment with the software
and discover the capabilities of the application on their own. Any
questions about this document should be sent to /dev/null.
Last modified on Sat Mar 15 12:17:28 EST 1997
Created and maintained by