OPTICAL CHARACTER RECOGNITION - PowerPoint PPT Presentation

Provided by: stude1456

1
OPTICAL CHARACTER RECOGNITION
  • A proposal from IIIT Allahabad

2
Project Title
  • Development of OCR technologies with
    application to various Indian and Tibetan
    scripts.

3
Objectives
  • The two major objectives are:
  • To develop a toolset that will allow rapid
    prototyping of, and experimentation with, OCRs
  • To investigate the use of Gabor filters for
    feature extraction, which allow robust handling
    of degradation, and to test their applicability
    to various Indian and Tibetan scripts

4
Modules foreseen
  • Annotation tool / Ground Truth Generator
  • Development of general framework
  • Visual component recognizer engine using Gabor
    filter
  • Line component recognizer engine using HMM
  • Tibetan OCR

5
Annotation tool / Ground truth generator
  • To be used for generating the training sets for
    OCR in a short span of time and testing its
    accuracy for degraded images. The purpose of this
    tool is twofold.
  • - It can conveniently generate large amounts of
    ground truth data, including degraded images,
    suitable as training data for building an OCR
  • - It can be used for testing an OCR and
    analysing its errors in the form of confusion
    matrices, which are useful for fine-tuning the OCR
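
A minimal sketch of the error analysis described above, assuming the ground truth and the OCR output are already aligned character by character (a real tool would need an alignment step first). The function names and sample strings are illustrative and not part of the proposal.

```python
from collections import Counter

def confusion_counts(truth, recognized):
    """Count (true_char, recognized_char) pairs for two aligned strings."""
    if len(truth) != len(recognized):
        raise ValueError("this sketch assumes one-to-one aligned sequences")
    return Counter(zip(truth, recognized))

def top_confusions(counts, n=5):
    """Return the n most frequent misrecognitions, most frequent first."""
    errors = {pair: c for pair, c in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

def accuracy(counts):
    """Fraction of characters recognized correctly."""
    total = sum(counts.values())
    correct = sum(c for (t, r), c in counts.items() if t == r)
    return correct / total if total else 0.0

# Hypothetical example: the OCR confuses the letter "l" with the digit "1".
truth = "illegal"
ocr_output = "i11ega1"
counts = confusion_counts(truth, ocr_output)
worst = top_confusions(counts)
```

Here `worst` lists the (true, recognized) pairs that occur most often, which is the kind of information one would use to decide where the recognizer needs tuning.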

6
Development of general framework
  • Involves development of an open architecture
    for optical character recognition
  • The design consists of an OCR system that has:
  • A component-oriented design
  • High configurability
  • The ability to add any new algorithm to the
    system very easily
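
One way such a component-oriented, configurable design could look is sketched below. This is an assumption for illustration only: the proposal does not specify an interface, and `PluginRegistry`, the stage names and the toy plugins are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

class PluginRegistry:
    """Maps (stage, name) to a callable. All names here are hypothetical."""

    def __init__(self) -> None:
        self._plugins: Dict[Tuple[str, str], Callable] = {}

    def register(self, stage: str, name: str) -> Callable:
        """Decorator that adds a function as a plugin for the given stage."""
        def decorator(fn: Callable) -> Callable:
            self._plugins[(stage, name)] = fn
            return fn
        return decorator

    def get(self, stage: str, name: str) -> Callable:
        try:
            return self._plugins[(stage, name)]
        except KeyError:
            raise KeyError(f"no plugin {name!r} for stage {stage!r}") from None

registry = PluginRegistry()

@registry.register("binarize", "fixed_threshold")
def fixed_threshold(image):
    # Toy binarizer: dark pixels (below 128) become ink (1).
    return [[1 if px < 128 else 0 for px in row] for row in image]

@registry.register("features", "ink_per_row")
def ink_per_row(binary):
    # Toy feature extractor: number of ink pixels in each row.
    return [sum(row) for row in binary]

def run_pipeline(reg: PluginRegistry, config: List[Tuple[str, str]], data):
    """Run the configured plugins in order, feeding each output to the next."""
    for stage, name in config:
        data = reg.get(stage, name)(data)
    return data

# The configuration selects which plugin handles each stage.
config = [("binarize", "fixed_threshold"), ("features", "ink_per_row")]
gray = [[255, 0, 10], [200, 220, 255], [0, 0, 0]]
result = run_pipeline(registry, config, gray)
```

Adding a new algorithm then amounts to registering one more function and naming it in the configuration, without touching the pipeline code.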

7
Design

8
Important issues
  • Interface definition
  • Control panel for configuring plugins
  • Installation package

9
Visual component recognizer engine using Gabor
filter
  • A complete visual component recognizer engine
    will be built that exploits the robustness of the
    Gabor filter feature set.
  • The engine would accept visual components as
    input and extract estimates for line width, stroke
    width etc.
  • It would then adjust its parameters automatically
    to obtain the best set of filters.
  • These filters will then be used to obtain the
    feature vector corresponding to the given visual
    component.
  • The engine will also provide various classifiers
    that would allow the final classification of the
    visual component.
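
The feature extraction step above can be sketched as a small Gabor filter bank in plain Python. This is an illustration under stated assumptions, not the proposed engine: tying the wavelength to twice the stroke width, the ratio sigma = 0.56 * wavelength, the aspect ratio 0.5 and the use of the mean absolute response as a feature are all choices made here for the example.

```python
import math

def gabor_kernel(wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real (cosine) Gabor kernel with its mean removed, as a list of rows."""
    half = int(3 * sigma)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * cos_t + y * sin_t      # coordinate along the carrier
            yr = -x * sin_t + y * cos_t     # coordinate across the carrier
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength + psi))
        kernel.append(row)
    # Subtract the mean so that uniform regions give zero response.
    mean = sum(map(sum, kernel)) / (len(kernel) ** 2)
    return [[v - mean for v in row] for row in kernel]

def mean_abs_response(image, kernel):
    """Average |correlation| of the kernel over all fully covered positions."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for top in range(rows - k + 1):
        for left in range(cols - k + 1):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    acc += image[top + dy][left + dx] * kernel[dy][dx]
            total += abs(acc)
            count += 1
    return total / count if count else 0.0

def gabor_features(image, stroke_width, orientations=4):
    """One feature per orientation; parameters derived from the stroke width."""
    wavelength = 2.0 * stroke_width   # assumption made for this sketch
    sigma = 0.56 * wavelength         # assumption made for this sketch
    return [
        mean_abs_response(
            image, gabor_kernel(wavelength, i * math.pi / orientations, sigma)
        )
        for i in range(orientations)
    ]

# Synthetic test image: vertical strokes two pixels wide, repeating every four.
stripes = [[1 if (x % 4) < 2 else 0 for x in range(20)] for _ in range(20)]
features = gabor_features(stripes, stroke_width=2)
```

For vertical strokes the 0 degree filter (carrier along x) should give the strongest response, so the position of the largest entry in `features` indicates the dominant stroke orientation. A classifier would take such vectors as input.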

10
Some observations
  • Experiments with degraded text images show that
    the chief source of error is at the level of
    segmentation of characters.
  • A similar situation exists for recognition of
    handwritten texts.
  • Error rates are at acceptable levels for the
    other stages, i.e. line segmentation, word
    segmentation, character recognition etc.
  • Solution: Develop a strategy that avoids
    character segmentation.

11
Line component recognizer engine using Gabor
filter and HMM
  • To build an OCR using Gabor Filter and HMM
  • - A family of two-dimensional Gabor functions was
    proposed by Daugman to model the spatial
    summation properties (of the receptive fields) of
    simple cells in the visual cortex.
  • - Provides an excellent feature set that is
    robust against noise and degradations
  • - Highly experimental. Expected to minimize
    the error rate of character segmentation.
  • - Mimics the approach used in automatic speech
    recognition
12
Feature Extraction: A look at Gabor filter
extraction for a 0 degree filter

13
Basic approach
  • Segment lines from the image using a pixel row scan
  • Extract line width
  • Extract stroke width using zero crossings
  • Extract features using the Gabor filter by moving in
    the x direction with a jump size equal to the
    stroke width.
  • At each x(i), make an observation sequence by
    moving in the y direction.
  • Find the HMM that best matches the sequence.
  • Repeat the process.
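
The steps above can be sketched on a tiny binary image as follows. This is a simplified illustration, not the proposed implementation: raw pixel values stand in for quantized Gabor features, the stroke width is taken as the most common horizontal ink run (a stand-in for the zero-crossing method), and the two toy HMMs ("ink" and "blank") and all function names are hypothetical.

```python
import math
from collections import Counter

def segment_lines(image):
    """Row scan: return (top, bottom) ranges of rows containing ink."""
    lines, start = [], None
    for r, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = r
        elif not has_ink and start is not None:
            lines.append((start, r))      # bottom is exclusive
            start = None
    if start is not None:
        lines.append((start, len(image)))
    return lines

def estimate_stroke_width(image):
    """Most common length of a horizontal run of ink pixels."""
    runs = Counter()
    for row in image:
        run = 0
        for px in list(row) + [0]:        # trailing 0 closes the last run
            if px:
                run += 1
            elif run:
                runs[run] += 1
                run = 0
    return runs.most_common(1)[0][0] if runs else 1

def column_observations(image, line, step):
    """At each x (stepping by `step`), read symbols moving in y."""
    top, bottom = line
    return [[image[r][x] for r in range(top, bottom)]
            for x in range(0, len(image[0]), step)]

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p

def best_model(obs, models):
    """Name of the HMM that gives the observation sequence the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
uniform = [[0.5, 0.5], [0.5, 0.5]]
models = {  # (start, transition, emission) for two toy two-state HMMs
    "ink": ([0.5, 0.5], uniform, [[0.2, 0.8], [0.1, 0.9]]),
    "blank": ([0.5, 0.5], uniform, [[0.8, 0.2], [0.9, 0.1]]),
}
lines = segment_lines(image)
stroke = estimate_stroke_width(image)
sequences = column_observations(image, lines[0], stroke)
labels = [best_model(seq, models) for seq in sequences]
```

In a full system each HMM would model a character or sub-character unit and would be trained on feature sequences, so that recognition proceeds along the line without an explicit character segmentation step.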

14
Tibetan OCR
  • It is proposed to develop an OCR for the Tibetan
    script.
  • The methods developed in the previous modules
    will also be applied for the Tibetan script.
  • The primary motivation is that it will provide a
    useful test-bed for checking the other
    technologies.

15
Time targeted deliverables
  • Annotation tool / Ground Truth Generator
  • 0-6 months: Basic design to be completed,
    including determination of the format for the XML
    document.
  • 7-15 months: Implementation of the editor
    (multilingual) and I/O interface for images in
    various formats.
  • 16-24 months: Development of a module for
    producing noise / degradation of input images.
  • 24-30 months: Development of the tool for
    performing error analysis of the output of OCR.
  • 30-36 months: Testing, debugging and
    documentation of the entire module including
    installer.

16
Time targeted deliverables
  • Development of general framework
  • 0-4 months: Exploring various technologies
    available for creating a plug and play
    environment.
  • 4-8 months: Design of the system.
  • 8-16 months: Development of the system.
  • 17-20 months: Testing, debugging and
    documentation of the system.
  • 21-36 months: Implementation of various
    techniques / algorithms that are used in creating
    OCRs, together with their integration into the
    system.

17
Time targeted deliverables
  • Visual component recognizer engine using
    Gabor filter
  • 0-6 months: Development of sub-modules for
    extracting line width, stroke width
    and other relevant parameters
  • 6-18 months: Determination of the Gabor
    function parameters as functions of the document
    level parameters extracted earlier, including
    the extent of scaling of images.
  • 19-30 months: Development of at least three
    classifiers for performing the classification
  • 31-36 months: Testing, debugging and
    documentation of the complete tool including
    installer.

18
Time targeted deliverables
  • Line component recognizer engine using HMM
  • 0-6 months: Design of the basic system
  • 7-12 months: Developing a module for performing
    HMM training and classification
  • 13-18 months: Developing a training set having
    different font styles and sizes in Devanagari
  • 19-30 months: Training the HMM and fine-tuning
    the parameters for optimal performance
    (recognition and time)
  • 31-36 months: Testing, debugging and
    documentation of the system including installer.

19
Time targeted deliverables
  • Tibetan OCR
  • 0-6 months: Analysis of Tibetan script
  • 7-12 months: Development of training set for
    Tibetan script
  • 12-30 months: Design and development of OCR for
    Tibetan script
  • 31-36 months: Testing, debugging and
    documentation of the system including installer.