Advanced OCR with OmniPage and FineReader - PowerPoint PPT Presentation

Slides: 22
Provided by: gae80
Transcript and Presenter's Notes

1
Advanced OCR with OmniPage and FineReader
2
Overview
  • Optical character recognition
  • Structural recognition
  • Options
  • Loading
  • Zoning
  • OCR
  • Editing

3
Optical Character Recognition (OCR)
  • OCR turns pictures of text into e-text
  • Does well unless:
    • The picture is fuzzy
    • The contrast is poor
    • The font is unusual
    • The font is too small or too large
    • The material has unusual characters

4
Structural Recognition
  • Analyzes the layout of the page:
    • Columns
    • Headings
    • Graphics
    • Tables
  • Usually does fairly well, unless the layout is
    non-standard

5
Programs that Run OCR
  • Programs for consumers
    • Kurzweil 1000, 3000
    • OpenBook
    • Intel Reader
    • Many others
  • Programs for production
    • ABBYY FineReader
    • Nuance OmniPage

6
Consumer Programs
  • Highly automated
  • Designed for individuals who have print
    disabilities
  • Are not good production tools
    • Do not provide flexibility
    • Do not allow much overriding
    • Interfaces not designed for editing

7
Production Programs in General
  • A good program for production allows you to:
    • Control the zones (areas or blocks of text and
      graphics): add, delete, or change them
    • Edit easily
    • Improve recognition

8
Preferred Programs
  • ABBYY FineReader
    • Relatively easy to learn
    • Fairly intuitive
    • Good structural recognition
  • Nuance OmniPage
    • Less intuitive but more accessible
    • Often does better with technical materials

9
Both Good Tools
  • If you can afford to have both, it's nice, but
    not absolutely necessary.
  • If you have both, run a couple test pages through
    each to see which is doing better on a particular
    job.
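
One way to make the test-page comparison less subjective is to proofread one page by hand and score each program's output against it. The following is a minimal sketch, assuming both programs' results have been exported as plain text; the function names and the sample strings are hypothetical and not part of either product.

```python
from difflib import SequenceMatcher

def similarity(ocr_text, reference):
    """Return a 0..1 similarity ratio between OCR output and a proofread reference."""
    # autojunk=False keeps the heuristic from discarding frequent characters on long pages.
    return SequenceMatcher(None, ocr_text, reference, autojunk=False).ratio()

def pick_engine(reference, outputs):
    """Return the name of the engine whose output is closest to the reference."""
    return max(outputs, key=lambda name: similarity(outputs[name], reference))

# Hypothetical test page: a hand-corrected reference and two exported results.
reference = "The quick brown fox jumps over the lazy dog."
outputs = {
    "engine_a": "The quick brown fox jumps over the lazy dog.",
    "engine_b": "Tlie qu1ck brovvn fox jumps ovcr the lazy dog.",
}
print(pick_engine(reference, outputs))  # prints "engine_a"
```

A couple of representative pages are usually enough for this kind of check; the score only ranks the two outputs and says nothing about layout or zoning quality.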

10
Under the Hood
  • For best results with a program, set up your
    options before you begin!
  • Tools > Options

11
Lots of Languages
  • FineReader and OmniPage handle multiple
    languages.
  • For foreign-language material, turn on all the
    languages used in the book.
  • The program will then recognize the diacritical
    marks.
  • Turn on what you need, but only what you need.

12
Math
  • If you are running OCR on math, try turning on
    Greek.
  • Greek will allow the program to recognize alphas,
    deltas, sigmas, etc.
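
A quick way to check whether turning on Greek made a difference is to list the Greek letters that actually appear in the exported text. This is a rough sketch, assuming the OCR result has been saved as Unicode text; the function name and sample sentence are made up for illustration.

```python
import unicodedata

def greek_letters(text):
    """Return the sorted list of distinct Greek letters found in text."""
    found = set()
    for ch in text:
        # unicodedata.name() returns e.g. "GREEK SMALL LETTER ALPHA" for Greek letters.
        if ch.isalpha() and unicodedata.name(ch, "").startswith("GREEK"):
            found.add(ch)
    return sorted(found)

# Hypothetical line of exported OCR text from a math page.
sample = "Let α = 0.05 and Δx → 0; then Σ converges."
print(greek_letters(sample))  # prints ['Δ', 'Σ', 'α']
```

If the list comes back empty on a page that clearly contains Greek symbols, the letters were probably recognized as look-alike Latin characters and the language setting is worth rechecking.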

13
Another Decision
  • Detect page orientation or not?
    • Automatic detection does not always get it right
    • Try it if many of your pages are rotated

14
Considerations
  • You may or may not want to keep headers and
    footers.
  • I generally keep them to pull the page numbers.
  • You may want to keep the page breaks.
  • Retaining page breaks helps to maintain
    one-to-one page correspondence with the book.
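
When headers, footers, and page breaks are kept, the printed page numbers can be pulled out of the exported text afterwards. The sketch below assumes a plain-text export in which a form feed character separates pages and the page number sits alone on the last non-empty line of each page; both are assumptions about the export, not documented behavior of either program.

```python
import re

def page_numbers(exported_text):
    """Return the printed page number found on each page, or None if missing."""
    numbers = []
    # Assumption: the export marks each page break with a form feed ("\f").
    for page in exported_text.split("\f"):
        lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
        last = lines[-1] if lines else ""
        # Assumption: the footer is a bare number on the last non-empty line.
        match = re.fullmatch(r"\d+", last)
        numbers.append(int(match.group()) if match else None)
    return numbers

# Hypothetical three-page export; the third page has no footer.
text = "Chapter 1\nText here.\n12\fMore text.\n13\fNo footer here."
print(page_numbers(text))  # prints [12, 13, None]
```

A `None` in the result flags a page whose footer was missed or dropped, which is a useful spot check for one-to-one page correspondence.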

15
Fitting Everything
  • In some cases, you may need to work with a custom
    paper size to fit everything onto one page.
  • This feature can be helpful when you are
    retaining everything on the page but not the
    layout.
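
As a rough guide to how tall a custom page needs to be, the required height can be estimated from the number of text lines on the page. This is only a back-of-the-envelope sketch; the function name and the default font size, line spacing, and margin are illustrative assumptions, not settings from either program.

```python
def page_height_inches(line_count, font_size_pt=12.0, line_spacing=1.2, margin_in=1.0):
    """Estimate the page height (in inches) needed to hold line_count lines of text."""
    # There are 72 points per inch; each line occupies font size times line spacing.
    text_height_in = line_count * font_size_pt * line_spacing / 72.0
    # Add the top and bottom margins.
    return text_height_in + 2 * margin_in

# Hypothetical page: 60 lines of 12 pt text with 1-inch margins.
print(round(page_height_inches(60), 2))  # prints 14.0
```

In this example the text would overflow an 11-inch page, so a custom size of at least 14 inches tall would be needed to keep everything on one page.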

16
Loading Files
  • Open
    • Opens saved program files
  • Load
    • Loads image files to process
  • Note that the same distinction comes up when saving!

17
Wizards Are Evil
  • Do not rely on the automation
  • Load the image file and choose the processes you
    want

18
Workspace
  • The program has three primary areas
    • Pages Pane
      • Either thumbnails or details
      • Allows simple navigation of pages
    • Image Pane
      • Your graphic
    • Text Pane
      • Area where the text from OCR will show

19
More Accessible
  • Both programs have a detail view.
    • Shows text instead of graphics
  • Detail view is more accessible for screen
    readers.
  • Otherwise, it is personal preference.

20
Two Ways to Save
  • To save the program file so you can return to it
    later in the OCR program, choose File > Save
    • This saves your work file.
  • You save your converted file during the last
    phase of the processing.

21
Production Tips
  • Work with dual monitors
    • Check your computer and video card
    • Stretching an OCR program across two monitors is
      a HUGE time-saver!
  • Learn to use keyboard shortcuts.
    • They save tons of time!