Title: A Modular System to Recognize Numerical Amounts on Brazilian Bank Cheques
1. A Modular System to Recognize Numerical Amounts on Brazilian Bank Cheques
- L.S. Oliveira, R. Sabourin, F. Bortolozzi, and C.Y. Suen
- Pontifícia Universidade Católica do Paraná (PUCPR), Brazil
- École de Technologie Supérieure (ETS), Canada
- Centre for Pattern Recognition and Machine Intelligence (CENPARMI), Canada
2. System Overview
- Segmentation-based recognition.
  - Explicit segmentation.
  - Integration of all modules is done through a probabilistic model.
- Problem to overcome:
  - To distinguish, at the recognition stage, isolated (correctly segmented) characters from over- and under-segmented ones.
- Recognition and verification approach.
3. Over- and Under-segmentation Problems
Misclassification caused by over-segmentation (a and b) and under-segmentation (c).
4. Modular System
Grey boxes: AD modules. White boxes: TD modules.
5. Component Detection and Segmentation
- Component detection:
  - It operates in three steps: connected component analysis, delimiter detection and grouping.
- Segmentation:
  - Relationship among complementary structural features: contour, profile and skeleton (IWFHR'00).
  - Segmentation graphs.
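The first step of component detection, connected component analysis, can be illustrated with a minimal sketch. It assumes a binary image stored as a list of rows of 0/1 values and 8-connectivity; the function name `connected_components` and the bounding-box output are illustrative choices, not the authors' implementation, and delimiter detection and grouping are not covered.

```python
from collections import deque


def connected_components(image):
    """Find 8-connected foreground regions of a binary image.

    `image` is a list of rows of 0/1 values (an assumed representation).
    Returns one bounding box (top, left, bottom, right) per component,
    ordered from left to right.
    """
    rows = len(image)
    cols = len(image[0]) if image else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


example = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
boxes = connected_components(example)
```

Sorting the boxes by their left edge gives the components in reading order, which is a convenient starting point for any later grouping step.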
6. Features and General-purpose Recognizer
- General-purpose recognizer:
  - Mixture of concavity and contour features.
  - e10 and e3: 132 components.
  - e13: 18 components (13 outputs + 4 structural features + 1 contextual feature).
- Databases: 11 400, 2 000 and 4 000 (training, validation and testing).
- Performance: 99.2%, 99.0% and 98.9%.
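The 18-component input of e13 can be sketched as a simple concatenation. This assumes that the 13 outputs are the 10 outputs of e10 followed by the 3 outputs of e3, which the slide suggests but does not state; the function name `build_e13_input` and the ordering of the parts are hypothetical.

```python
def build_e13_input(e10_outputs, e3_outputs, structural, contextual):
    """Assemble an 18-component vector: 13 outputs + 4 structural + 1 contextual.

    Assumption: the 13 outputs are 10 values from e10 plus 3 values from e3.
    """
    if len(e10_outputs) != 10 or len(e3_outputs) != 3:
        raise ValueError("expected 10 outputs from e10 and 3 from e3")
    if len(structural) != 4:
        raise ValueError("expected 4 structural features")
    return list(e10_outputs) + list(e3_outputs) + list(structural) + [contextual]


# Placeholder values only, to show the resulting dimensionality.
vector = build_e13_input([0.0] * 10, [0.0] * 3, [0.0] * 4, 0.0)
```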
7. Verifier
- To overcome over- and under-segmentation problems, we have proposed the following verifier:
  - MLP with 3 classes: isolated, over-segmented and under-segmented characters.
- Features:
  - Multi-level concavity analysis (MCA).
  - Profile distances.
- Databases:
  - 40 500, 4 000 and 4 000 (training, validation and testing).
  - 99.02% on the test set.
8. New Feature Set
42 components from MCA + 6 components from profile distances = 48 components.
9. Interaction Between GPR and Verifier
10. Global Hypothesis and Post-processor
- Hypothesis generation.
- Modified Viterbi algorithm.
- Post-processor.
- Deterministic automaton.
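Hypothesis generation over a segmentation graph can be illustrated with a plain dynamic-programming best-path search. This is a generic Viterbi-style sketch, not the modified algorithm used in the system: it assumes the graph nodes are cut points numbered from left to right, and that each edge carries a hypothetical character label and a probability produced by the recognition stage. The name `best_path` and the edge format are illustrative.

```python
import math


def best_path(num_nodes, edges):
    """Return (label_string, log_probability) of the most probable path.

    `edges` is a list of (src, dst, label, prob) tuples with src < dst.
    Node 0 is the start of the string and node num_nodes - 1 is its end.
    Returns None when the end node cannot be reached.
    """
    best = [-math.inf] * num_nodes
    back = [None] * num_nodes
    best[0] = 0.0
    # Because src < dst, visiting edges by increasing source is a topological order.
    for src, dst, label, prob in sorted(edges, key=lambda e: e[0]):
        if best[src] == -math.inf or prob <= 0.0:
            continue
        score = best[src] + math.log(prob)
        if score > best[dst]:
            best[dst] = score
            back[dst] = (src, label)
    if best[-1] == -math.inf:
        return None
    # Follow the back-pointers from the end node to recover the labels.
    labels = []
    node = num_nodes - 1
    while node != 0:
        node, label = back[node]
        labels.append(label)
    return "".join(reversed(labels)), best[-1]


# Toy graph: one component read either as "8" or as the two segments "1" and "0".
toy_edges = [
    (0, 1, "1", 0.9),
    (1, 2, "0", 0.8),
    (0, 2, "8", 0.6),
]
result = best_path(3, toy_edges)
```

In this toy graph the two-segment reading wins because 0.9 × 0.8 = 0.72 exceeds 0.6. Summing log-probabilities rather than multiplying probabilities avoids numerical underflow on longer strings.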
11. Experimental Results
- Experiments on numerical amounts.
- 503 images (about 9 characters per image).
Recognition Rates (zero-rejection level)
12. Experimental Results
- NIST SD19.
- Database:
  - Isolated digits: 195 000, 28 000 and 60 000 (training, validation and testing).
  - Performance on isolated digits (zero-rejection level): 99.66%, 99.55% and 99.13%.
  - Verifier: 40 500, 4 000 and 4 000 (training, validation and testing).
  - Performance on test set: 98.90%.
  - Database of strings: 12 800 images (hsf_7 series).
13. Experimental Results
Recognition Rates (zero-rejection level)
14. Experimental Results
Recognition rates on NIST database reported by
other authors.
(1) ICDAR, (2) New Results.
- Results achieved without knowledge about the
number of digits in the strings.
15. Future Work
- Optimization of the classifiers:
  - General-purpose recognizer.
  - Verifier.
- Optimization of the system:
  - Ensemble of classifiers.