ScanWorX by Xerox

This document explains how to use the ScanWorX software, which provides off-the-shelf OCR capability for desktop computing. Before starting the program, be advised that the software is licensed to run only on scribble, a Sparc-10 running SunOS 4.1.4. The ScanWorX home directory is /fs/scanworx/, which is a link to the /fs/betelgeuse/scanworx directory. To use the software, add /fs/scanworx/bin to your path by inserting this line in your .cshrc file:

set path=($path /fs/scanworx/bin)

Then start ScanWorX by typing the following command at the shell prompt:

% scanworx

The xterm will echo

Starting ScanWorX...

and the following window will appear.

After clicking the Continue bar with the mouse, you will see the first menu-driven window, shown below.

The Help button provides self-explanatory information and is recommended when you run into a specific problem. This document gives only a basic explanation of how to use the software.
The primary difference between a Doc and a Group, as shown in the previous figure, is that a document is organized as a collection of images, while a Group is a collection of documents.

For first-time users, the first thing to do is create a new document. Click the NewDoc button and the following window appears.

The Untitled.dct file stores the document's OCR settings, such as the image file source. You can use the File menu options to save it under another name for later retrieval.

The main task at this stage is to define the input source, the processing parameters, and the output location. These can be specified in a number of different combinations, depending on the situation. For starters, let's assume that you have TIFF-formatted images to be OCRed. Choose the preset option, which is in the Scan, Recognize, and Convert setting and presents you with the set of options shown below:

Choose the TIFF, Preview, Recognize, and Convert option so that the application reads a TIFF image, shows it to you, recognizes the characters, and converts the results to text format. Press the Go button to start the process.

First, a window opens asking for a base filename and lets you browse the disk. Type a name in the desired path and click the OK button. This name is used both to generate the text output and to save the document settings. The text output file is named with the name you typed followed by a txt extension. After you click OK, another file selection window opens asking for the location of the input TIFF image. Select the image you want and click OK. If the TIFF image has the right tags, a preview window opens showing the bitmap of the image you specified. Not all TIFF files are equal; the next section says a little more about this. When you are satisfied that this is the image you want to OCR, click the RESUME button. ScanWorX will then go ahead, recognize the image, and convert its recognition output to text. The image window should close without showing any error messages and return you to the original window. If you go to an xterm window and edit or type out the file named by the base name with a txt extension, you will see the OCR output. At this point you can save the document through the File menu.
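As a small illustration of the naming rule described above, the following Python sketch derives the text output path from a base name and reads the result back. It assumes the extension is appended as ".txt" to the base name exactly as typed; the function names are hypothetical and are not part of ScanWorX.

```python
from pathlib import Path


def text_output_path(base_name: str) -> Path:
    """Return the assumed text output path for a given base file name."""
    # Assumption: ".txt" is appended to the base name exactly as typed.
    return Path(base_name + ".txt")


def read_ocr_output(base_name: str) -> str:
    """Read the recognized text, failing clearly if the file is missing."""
    path = text_output_path(base_name)
    if not path.is_file():
        raise FileNotFoundError(f"no OCR output found at {path}")
    # Undecodable bytes are replaced rather than aborting the read.
    return path.read_text(errors="replace")
```

Calling read_ocr_output with the base name typed in the first dialog would then return the recognized text, provided the run completed and the naming assumption holds.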

Some useful information

A key item to know about is the image file format. ScanWorX is rather picky about the TIFF images it will process. Regardless of compression method, there is no guarantee that a given TIFF image can be read and processed by ScanWorX. For example, the University of Washington database images are in the wrong format and cannot be readily input to ScanWorX for processing. Even the output of the xv software is not compatible. This significantly reduces the range of input available to this package. However, the output of the scanner software (openscan), if saved as TIFF(G3) or TIFF(G4), can be used as input to the ScanWorX application.
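Since TIFF(G3) and TIFF(G4) output from openscan is the one input named above as usable, a quick way to triage a candidate file before opening it in ScanWorX is to read its Compression tag. The Python sketch below does this for the first image directory of a TIFF file. The tag number (259) and the values 3 and 4 come from the TIFF 6.0 specification; the function names are hypothetical. Passing this check does not guarantee that ScanWorX will accept the file, since other tags may also matter.

```python
import struct

COMPRESSION_TAG = 259          # TIFF 6.0 "Compression" tag
CCITT_GROUP3, CCITT_GROUP4 = 3, 4


def tiff_compression(data: bytes) -> int:
    """Return the Compression value from the first IFD of TIFF data."""
    if data[:2] == b"II":
        endian = "<"           # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"           # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file: bad byte-order mark")
    try:
        magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
        if magic != 42:
            raise ValueError("not a TIFF file: bad magic number")
        (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
        for i in range(entry_count):
            entry = ifd_offset + 2 + 12 * i   # each IFD entry is 12 bytes
            (tag,) = struct.unpack_from(endian + "H", data, entry)
            if tag == COMPRESSION_TAG:
                # A SHORT value sits left-justified in the 4-byte value field.
                (value,) = struct.unpack_from(endian + "H", data, entry + 8)
                return value
    except struct.error as exc:
        raise ValueError("truncated TIFF data") from exc
    return 1                   # TIFF 6.0 default when the tag is absent


def is_g3_or_g4(data: bytes) -> bool:
    """True if the first image uses CCITT Group 3 or Group 4 compression."""
    return tiff_compression(data) in (CCITT_GROUP3, CCITT_GROUP4)
```

For a file on disk, something like is_g3_or_g4(open("page1.tif", "rb").read()) would report whether the image is worth trying; "page1.tif" is only a placeholder name.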

A very nice option in ScanWorX is the ability to verify obscure words or marks found in the image. To use it, check the verify box, which is on the right side of the main window that appears after you press the NewDoc button of the first options bar. Note that the verify button is right below the feedback button, and that after you press the verify button the feedback button is grayed out, signaling that you may not use that option. With verification on, a new window appears after the image preview, allowing you to intervene when character recognition is ambiguous. An example of such a window is shown below.

The verifier's main purpose is not to correct misspellings or incorrect recognition but to steer the recognition engine in the appropriate direction. In the example shown above, associating the letter `g' with the text forces the OCR engine to modify its template to include that instance of `g'. Template modification has several ramifications that are not covered here, but the buttons on the right side of the verifier window allow several tasks, including: change the recognition output of the current letter, skip the word without any modification, associate the highlighted component with a non-character entity, join the previous component with the current component, join the next component with the current component, and others. Full details of these tasks can be found by pressing the HELP button.

A couple of operations are available while previewing the image. The head of the previewer is shown below:

Besides the obvious zooming capabilities, the `T' button lets the user specify an image template, in the form of text boxes, so that the OCR engine concentrates on these selected regions. The template also applies to the other images specified within the document structure. After at least two regions are selected, the chain link to the right of the buttons is enabled. Pressing the chain link lets you specify the order in which the OCR engine converts the boxed regions. This is useful for having the OCR engine perform recognition with verification on the desired regions; after the verifier is closed, it generalizes to the remaining boxed regions.

A large number of other options are available that are not covered in this document. Users are encouraged to experiment with the software and discover the capabilities of the application on their own. Any questions about this document should be sent to /dev/null.

Last modified on Sat Mar 15 12:17:28 EST 1997
Created and maintained by