Optical Character Recognition - PowerPoint PPT Presentation

Title:

Optical Character Recognition

Provided by: tdilM

Transcript and Presenter's Notes


1
Optical Character Recognition
  • - Proposer -
  • Prof. S. Rama Mohan
  • Department of Applied Mathematics
  • Faculty of Technology and Engineering
  • The M. S. University of Baroda
  • Vadodara 390002. Gujarat

Language: Gujarati
  • Components that will be implemented
  • Page Layout Analysis Engine
  • Visual Component Extraction Engine
  • Visual Component Recognizer Engine
  • Post Processor Engine
  • Training and Testing Data

2
Page Layout Analysis Engine
  • List of the technique(s) that will be used
  • Smearing and histogram analysis (for locating
    heading(s), paragraphs, etc.)
  • Connected component analysis (skew estimation,
    text-image separation)
  • Performance of these techniques in other
    languages
  • The claims in other languages are to be evaluated
  • Estimate of the expected performance (Quarterly
    PERT Chart)
  • To be worked out
  • Name of the domain for which the performance will
    be optimized
  • Magazine articles with regularly shaped images
  • Other evaluation metrics in addition to the
    domain
  • Preformatted forms
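
The smearing and histogram techniques listed on this slide can be illustrated with a small sketch. It is not the proposers' implementation: it assumes a binarized page stored as lists of 0 (background) and 1 (ink), and the names smear_rows, row_histogram, find_bands and the sample page are hypothetical. Short background gaps inside each row are filled first (run-length smearing), then the per-row ink counts are scanned to find horizontal bands that would correspond to text lines or headings.

```python
def smear_rows(image, threshold):
    """Horizontal run-length smearing: fill background gaps of at most
    `threshold` pixels that lie between two ink pixels in the same row."""
    smeared = []
    for row in image:
        out = list(row)
        last_ink = None  # column of the most recent ink pixel in this row
        for x, pixel in enumerate(row):
            if pixel == 1:
                if last_ink is not None and 0 < x - last_ink - 1 <= threshold:
                    for k in range(last_ink + 1, x):
                        out[k] = 1
                last_ink = x
        smeared.append(out)
    return smeared


def row_histogram(image):
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in image]


def find_bands(histogram, min_count=1):
    """Return (start, end) row ranges, end exclusive, whose ink count
    is at least `min_count`."""
    bands = []
    start = None
    for y, count in enumerate(histogram):
        if count >= min_count and start is None:
            start = y
        elif count < min_count and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(histogram)))
    return bands


# Hypothetical 7x8 page: a two-row text line and a short one-row line.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

bands = find_bands(row_histogram(smear_rows(page, threshold=2)))
print(bands)
```

The smearing threshold controls how aggressively neighbouring glyphs merge into one block. A real layout engine would typically smear in both directions and tune the threshold to the scan resolution.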

3
Visual Component Extraction Engine
  • List of the technique(s) that will be used
  • Connected component analysis
  • Artificial neural networks (unsupervised)
  • Performance of these techniques in other
    languages
  • Recorded quality performance for European scripts
  • Experimental approaches for Indian scripts
  • Estimate of the expected performance (Quarterly
    PERT Chart)
  • To be worked out
  • Name of the domain for which the performance will
    be optimized
  • Printed documents with computerized typesetting
  • Other evaluation metrics in addition to the
    domain
  • Printed books (robustness check against banding
    noise etc.)
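
Connected component analysis, the first technique on this slide, can be sketched as a flood fill over a binarized image. The sketch below is only an illustration under stated assumptions: 8-connectivity, ink encoded as 1, and the hypothetical names connected_components and glyphs. It returns one bounding box per component, which is the kind of unit an extraction engine would pass on to a recognizer.

```python
from collections import deque


def connected_components(image):
    """Label 8-connected groups of ink pixels and return their bounding
    boxes as (top, left, bottom, right), in raster-scan order."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical 5x5 image containing three separate components.
glyphs = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]

print(connected_components(glyphs))
```

For scripts such as Gujarati, where vowel signs and other marks are often disconnected from the base glyph, the resulting boxes would still need a grouping step before recognition. That step is not shown here.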

4
Visual Component Recognizer Engine
  • List of the technique(s) that will be used
  • Performance of these techniques in other
    languages
  • Established performance in European scripts
  • Template matching with fringe maps gives more
    than 90% character-level recognition accuracy for
    Telugu script.
  • Estimate of the expected performance (Quarterly
    PERT Chart)
  • To be worked out
  • Name of the domain for which the performance will
    be optimized
  • Printed documents with computerized typesetting
  • Other evaluation metrics in addition to the
    domain
  • Printed books
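
The template matching with fringe maps mentioned on this slide can be sketched as follows. This is an illustration only, and it rests on an assumption the slide does not spell out: that a fringe map stores, for every pixel, the distance to the nearest ink pixel of a glyph. City-block distance is used here for simplicity. The names fringe_map, fringe_distance, symmetric_distance, classify and the tiny templates are hypothetical.

```python
from collections import deque


def fringe_map(glyph):
    """Distance from every pixel to the nearest ink pixel (city-block),
    computed with a multi-source breadth-first search."""
    height, width = len(glyph), len(glyph[0])
    far = height + width  # larger than any reachable city-block distance
    dist = [[far] * width for _ in range(height)]
    queue = deque()
    for y in range(height):
        for x in range(width):
            if glyph[y][x] == 1:
                dist[y][x] = 0
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and dist[ny][nx] > dist[y][x] + 1:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def fringe_distance(glyph, template_map):
    """Sum of template fringe values under the ink pixels of `glyph`."""
    return sum(template_map[y][x]
               for y, row in enumerate(glyph)
               for x, pixel in enumerate(row) if pixel == 1)


def symmetric_distance(a, b):
    """Fringe distance measured in both directions between equal-size glyphs."""
    return fringe_distance(a, fringe_map(b)) + fringe_distance(b, fringe_map(a))


def classify(glyph, templates):
    """Return the label of the template with the smallest symmetric distance."""
    return min(templates, key=lambda label: symmetric_distance(glyph, templates[label]))


# Hypothetical 3x3 templates and a noisy input (a bar with one extra pixel).
bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
dash = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
templates = {"bar": bar, "dash": dash}
noisy = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]

print(classify(noisy, templates))
```

Because a stray pixel close to the template adds only a small distance, this kind of matching tolerates minor noise better than exact pixel overlap. Inputs are assumed to be size-normalized to the template dimensions beforehand.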

5
Post Processor Engine
  • List of the technique(s) that will be used
  • N-gram analysis
  • OCRized dictionary and spell checker
  • Confusion matrix
  • Performance of these techniques in other
    languages
  • Extensively used for the European languages
  • Estimate of the expected performance (Quarterly
    PERT Chart)
  • To be worked out
  • Name of the domain for which the performance will
    be optimized
  • Magazine articles on sports
  • Other evaluation metrics in addition to the
    domain
  • Government circulars
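
One way the dictionary and confusion matrix listed on this slide could work together is sketched below. It is a minimal illustration, not the proposed post-processor: the confusion table, its probabilities, the word list and the names candidate_score and correct are all hypothetical, and Latin letters stand in for Gujarati glyphs. Each dictionary word of the same length is scored by how plausibly the recognizer could have produced the observed string from it.

```python
import math


def candidate_score(observed, candidate, confusion, match_p=0.9, other_p=0.001):
    """Log-probability that `candidate` was read as `observed`.
    confusion[(seen, truth)] is an assumed probability of reading `truth`
    as `seen`; only same-length (substitution-only) candidates are scored."""
    if len(observed) != len(candidate):
        return float("-inf")
    score = 0.0
    for seen, truth in zip(observed, candidate):
        if seen == truth:
            p = match_p
        else:
            p = confusion.get((seen, truth), other_p)
        score += math.log(p)
    return score


def correct(observed, dictionary, confusion):
    """Return `observed` if it is a known word, else the best-scoring
    same-length dictionary word, else `observed` unchanged."""
    if observed in dictionary:
        return observed
    best = max(dictionary, key=lambda word: candidate_score(observed, word, confusion))
    if candidate_score(observed, best, confusion) == float("-inf"):
        return observed
    return best


# Hypothetical data: a tiny word list and two recognizer confusions.
dictionary = ["goal", "coal", "gold"]
confusion = {("q", "g"): 0.05, ("1", "l"): 0.05}

print(correct("qoa1", dictionary, confusion))
```

An n-gram model, the other technique listed here, could be added as an extra term in the score so that candidates are also ranked by how likely they are in context.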

6
Training and Testing Data
  • List of the technique(s) that will be used
  • Ground truth data generation with information for
    every identified component
  • Performance of these techniques in other
    languages
  • Encouraging performance in Telugu, Bangla and
    Devanagari
  • Estimate of the expected performance (Quarterly
    PERT Chart)
  • To be worked out
  • Name of the domain for which the performance will
    be optimized
  • Magazine articles on sports
  • Other evaluation metrics in addition to the
    domain
  • Government circulars
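
Ground truth with information for every identified component could be stored as one record per component and then used to measure character-level accuracy. The sketch below is an assumption-laden illustration: the record fields, the names ComponentTruth, edit_distance and character_accuracy, and the sample data are hypothetical. Accuracy is taken as one minus the edit distance divided by the number of ground-truth characters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentTruth:
    """One ground-truth record for a single identified component."""
    page_id: str
    box: tuple   # (top, left, bottom, right) in pixels
    label: str   # the character or glyph this component represents


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(truth_labels, recognized_labels):
    """1 - (edit distance / number of ground-truth labels), floored at 0."""
    if not truth_labels:
        raise ValueError("ground truth must not be empty")
    errors = edit_distance(truth_labels, recognized_labels)
    return max(0.0, 1.0 - errors / len(truth_labels))


# Hypothetical ground truth for four components on one page.
truth = [
    ComponentTruth("page-1", (0, 0, 9, 7), "g"),
    ComponentTruth("page-1", (0, 9, 9, 16), "o"),
    ComponentTruth("page-1", (0, 18, 9, 25), "a"),
    ComponentTruth("page-1", (0, 27, 9, 30), "l"),
]
recognized = ["q", "o", "a", "l"]

accuracy = character_accuracy([c.label for c in truth], recognized)
print(accuracy)
```

Using edit distance rather than position-by-position comparison keeps the measure meaningful when the extraction stage splits or merges components, since a single insertion or deletion then counts as one error instead of misaligning everything after it.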