Precision Micrographics and Imaging, Inc.

Frequently Asked Questions about OCR

How does OCR work?

Here is the process our OCR software uses to convert your documents to text:

  1. First the page formatting is detected and the document is divided into zones.
  2. Next the zones are classified as either text, image or table.
  3. The text zones are processed first.
  4. The text is broken down into individual letters, and each letter is compared against a list of fonts to identify the character.
  5. Once all the text zones have been recognized, each is run through a spell check to fix any characters that could not be identified.
  6. Tables are processed next, cell by cell.
  7. Document formatting is applied to the text, tables and images.
  8. The new text document is saved.

This whole process is completed in a few seconds by the OCR software.
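
For readers who like to see the steps as code, here is a minimal sketch of the process above in Python. It is only an illustration, not our actual software: the names `Zone`, `FONT_TABLE`, `DICTIONARY`, `recognize_word`, `spell_fix` and `process_document` are hypothetical, and each letter shape is stood in for by a simple ID string rather than real image data.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins: each letter shape is an ID string, and the
# "list of fonts" is a lookup table from shape ID to character.
FONT_TABLE = {"g1": "c", "g2": "a", "g3": "t"}
DICTIONARY = {"cat", "cab"}
UNKNOWN = "?"

@dataclass
class Zone:
    kind: str                                   # "text", "image" or "table"
    words: list = field(default_factory=list)   # text zone: words as lists of shape IDs
    rows: list = field(default_factory=list)    # table zone: rows of cells, one word per cell

def recognize_word(shapes):
    """Step 4: match each letter shape against the font table."""
    return "".join(FONT_TABLE.get(s, UNKNOWN) for s in shapes)

def spell_fix(word):
    """Step 5: fill in unidentified characters when exactly one dictionary word fits."""
    if UNKNOWN not in word:
        return word
    candidates = [
        w for w in DICTIONARY
        if len(w) == len(word)
        and all(a == b or a == UNKNOWN for a, b in zip(word, w))
    ]
    return candidates[0] if len(candidates) == 1 else word

def process_document(zones):
    results = {}
    # Step 3: text zones are processed first.
    for i, zone in enumerate(zones):
        if zone.kind == "text":
            results[i] = " ".join(spell_fix(recognize_word(w)) for w in zone.words)
    # Step 6: tables are processed next, cell by cell.
    for i, zone in enumerate(zones):
        if zone.kind == "table":
            results[i] = [[spell_fix(recognize_word(c)) for c in row] for row in zone.rows]
    # Images are carried through unchanged (shown here as a placeholder).
    for i, zone in enumerate(zones):
        if zone.kind == "image":
            results[i] = "[image]"
    # Step 7: put everything back in its original page order.
    return [results[i] for i in range(len(zones))]

page = [
    Zone("text", words=[["g1", "g2", "g3"], ["g1", "zz", "g3"]]),
    Zone("image"),
    Zone("table", rows=[[["g2"], ["g3"]]]),
]
print(process_document(page))
```

In this sketch the second word contains a shape that is not in the font table, so it is first read as "c?t" and then corrected by the spell check because only one dictionary word fits. Zone detection and saving the finished file (steps 1, 2 and 8) are left out to keep the example short.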

Why should I use OCR on my documents?

OCR is commonly used to accomplish one of three goals:

What kinds of documents can be processed?

You can use OCR to extract text from any kind of document, but the process works best on clean black-and-white documents. Photocopied or faxed documents are never as clear as the original and consequently are difficult to process accurately. Small fonts can also be problematic. As a general rule, if you can't read it, neither can our OCR software.

What is Paper Capture?

Adobe's PDF format uses a technique called 'paper capture' to make your PDF documents searchable. The process is OCR, but the display works a little differently. The PDF displays the original image and puts the OCR text 'behind' it. This makes the document look exactly like the original while still allowing users to search for text within the document or highlight text on it. This is the most common kind of OCR that we perform. Pricing can be found here.
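
The idea of text sitting 'behind' an image can be pictured with a small Python sketch. This is a simplified model only, not the actual PDF file format or any Adobe API: the names `HiddenWord` and `SearchablePage`, the file name `scan.tif`, and the box coordinates are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class HiddenWord:
    text: str    # what the OCR software recognized
    box: tuple   # (x, y, width, height) of the word on the page image

@dataclass
class SearchablePage:
    image_file: str   # the scanned image the reader actually sees
    words: list       # OCR text placed invisibly behind the image

    def search(self, term):
        """Return the boxes to highlight for every word containing the term."""
        term = term.lower()
        return [w.box for w in self.words if term in w.text.lower()]

page = SearchablePage(
    image_file="scan.tif",
    words=[
        HiddenWord("Invoice", (10, 10, 80, 12)),
        HiddenWord("Total", (10, 40, 50, 12)),
    ],
)
print(page.search("total"))
```

The reader only ever sees the image. A search runs against the hidden words, and the matching boxes tell the viewer where to draw the highlight on top of the picture.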

Where can I get more information?

If you need more information, just give us a call at 512-832-6602 or send an e-mail to info@imagescan.com

8204 North Lamar • Suite C-20 • Austin, Tx 78753 • 512-832-6602 • info@imagescan.com