Towards Performance Evaluation of Symbol Recognition

1
Towards Performance Evaluation of Symbol Recognition and Spotting Systems in a Localization Context
  • Mathieu Delalandre
  • CVC, Barcelona, Spain
  • EuroMed Meeting
  • LORIA, Nancy, France
  • Monday, 18 May 2009

2
Introduction
Performance evaluation in related fields:
  • Information Retrieval Salton1992
  • Computer Vision Thacker2005
  • CBIR Muller2001
  • DIA Haralick2000
Case of symbol recognition and spotting: Ezra2008, Delalandre2008
Figure labels: Training data, System, Groundtruthing, Characterisation, Performance evaluation
4
Plan
  1. Groundtruth and test documents
  2. Performance characterization
  3. Conclusions and perspectives

5
Groundtruth and test documents: Overview of approaches
1. Overview of approaches 2. Existing datasets

Approach   Reference     Speed/Realism/Reliability  Symbol  Connected  Noise
real       Dosch06       - - -                      many    yes        no
real       Yan04         - - -                      many    yes        no
real       Rusinol09     - --                       many    yes        no
synthetic  Aksoy00       - -                        many    no         yes
synthetic  Zhai03        - -                        one     no         yes
synthetic  Valveny07     - -                        one     no         yes
synthetic  Delalandre08                             many    yes        no

Rating legend: - - weak good
7
Groundtruth and test documents: Overview of approaches
1. Overview of approaches 2. Existing datasets
Synthetic approach of Delalandre2008: use the same background layer with different symbol layers.
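The layering idea on this slide can be illustrated with a small sketch. It is not the Delalandre2008 generator: the names `PlacedSymbol` and `build_documents`, the fixed list of placement slots, and the uniform random choice of symbol models are all assumptions made for illustration. The point it shows is that one background can be reused across many documents, and that each symbol layer doubles as the groundtruth.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PlacedSymbol:
    """One symbol of a symbol layer (hypothetical structure)."""
    label: str
    x: int
    y: int


def build_documents(background, slots, symbol_models, n_docs, seed=0):
    """Combine a single background layer with n_docs different symbol layers.

    `slots` are assumed free positions on the background. The symbol layer
    of each document is also its groundtruth, since labels and positions
    are known by construction.
    """
    rng = random.Random(seed)
    documents = []
    for _ in range(n_docs):
        layer = [PlacedSymbol(rng.choice(symbol_models), x, y) for (x, y) in slots]
        documents.append({"background": background, "symbols": layer})
    return documents


docs = build_documents(
    background="floorplan_01",
    slots=[(10, 20), (40, 80)],
    symbol_models=["door", "sink", "table"],
    n_docs=3,
)
```

Because the groundtruth is produced together with the document, no manual annotation step is needed in this sketch.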
10
Groundtruth and test documents: Existing datasets
1. Overview of approaches 2. Existing datasets

Group   Name        Datasets  Images  Symbols  Degradations  Models
GREC    GREC03      30        3000    3000     10            5-50
GREC    GREC05      16        1000    1000     6             25-150
GREC    GREC07      6         2100    2100     6             50-150
ICPR    ICPR00      9         450     11250    9             25
SESYD   bags        16        1600    15046    none          25-150
SESYD   floorplans  10        1000    26830    none          16
SESYD   diagrams    10        1000    14100    none          21
SESYD   queries     6         6000    6000     none          16-21
Others  Rusinol09   1         42      344      none          38
17
Groundtruth and test documents: Existing datasets
1. Overview of approaches 2. Existing datasets
Generator of queries, working from the groundtruth:
  1. Random selection of a document
  2. Random selection of a symbol
  3. Random crop
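The three query-generation steps on this slide can be sketched as follows. The slide does not say how the random crop is drawn, so this sketch assumes the crop is the symbol's bounding box enlarged by a random margin on each side. The function name `generate_query`, the groundtruth layout (document id mapped to a list of labelled boxes) and the `max_margin` parameter are hypothetical.

```python
import random


def generate_query(groundtruth, rng, max_margin=10):
    """Draw one query: random document, random symbol, random crop.

    groundtruth maps a document id to a list of (label, (x, y, w, h)).
    The crop rule (box plus random margins) is an assumption.
    """
    # 1. Random selection of a document
    doc_id = rng.choice(sorted(groundtruth))
    # 2. Random selection of a symbol in that document
    label, (x, y, w, h) = rng.choice(groundtruth[doc_id])
    # 3. Random crop around the symbol, returned as (x0, y0, x1, y1)
    left, top, right, bottom = (rng.randint(0, max_margin) for _ in range(4))
    crop = (max(0, x - left), max(0, y - top), x + w + right, y + h + bottom)
    return {"document": doc_id, "label": label, "crop": crop}


groundtruth = {
    "plan_a": [("door", (10, 10, 20, 30))],
    "plan_b": [("sink", (50, 60, 15, 15)), ("table", (5, 5, 40, 20))],
}
query = generate_query(groundtruth, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the generated query set reproducible, which matters if several systems are to be compared on the same queries.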
19
Plan
  1. Groundtruth and test documents
  2. Performance characterization
  3. Conclusions and perspectives

20
Performance characterization: Introduction
  • Performance characterization (segmented symbols)
    Valveny2004, Dosch2006, Valveny2007, Valveny2008a, Valveny2008b
    • Recognition rate
    • Precision/Recall
    • Homogeneity
    • Separability
  • Performance characterization (real context)
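Two of the criteria listed for segmented symbols have standard definitions and can be sketched in a few lines. This is a minimal sketch, not the protocol of the cited papers: the function names and input formats are assumptions, and homogeneity and separability are left out because the slide does not define them.

```python
def recognition_rate(truth, predicted):
    """Fraction of segmented symbols whose predicted label equals the true one."""
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)


def precision_recall(relevant, retrieved):
    """Precision and recall of a retrieved set against the relevant set."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


rate = recognition_rate(["a", "b", "c", "d"], ["a", "b", "x", "d"])
precision, recall = precision_recall({1, 2, 3, 4}, {3, 4, 5})
```

In the example, three of four labels are correct, and two of the three retrieved items are among the four relevant ones.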
21
Performance characterization: About mapping
Mapping cases between the groundtruth (truth) and the results:
  • Single: a model line matches only one detected line.
  • Split: two model lines match one detected line.
  • Merge: a model line matches two detected lines.
  • False alarm: a detected line does not match any model line.
  • Miss: a model line does not match any detected line.
Symbol spotting Rusinol2009
Figure labels: Groundtruth (g1, g2), Results (r), Mapping (c1, c2)
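The five mapping cases can be made concrete with a short sketch that classifies a set of model/detection matches. It follows the wording of this slide for split and merge. The function name `classify_mapping`, the identifiers, and the assumption that the match pairs have already been computed by some matching criterion are all hypothetical.

```python
from collections import defaultdict


def classify_mapping(model_ids, detected_ids, matches):
    """Sort (model, detected) match pairs into the mapping cases of the slide."""
    by_model = defaultdict(set)
    by_detected = defaultdict(set)
    for m, d in matches:
        by_model[m].add(d)
        by_detected[d].add(m)

    cases = {"single": [], "split": [], "merge": [], "false_alarm": [], "miss": []}
    for m in model_ids:
        partners = by_model.get(m, set())
        if not partners:
            cases["miss"].append(m)                      # no detected line
        elif len(partners) > 1:
            cases["merge"].append((m, sorted(partners)))  # one model, several detections
        else:
            d = next(iter(partners))
            if len(by_detected[d]) == 1:
                cases["single"].append((m, d))           # one-to-one match
    for d in detected_ids:
        partners = by_detected.get(d, set())
        if not partners:
            cases["false_alarm"].append(d)               # no model line
        elif len(partners) > 1:
            cases["split"].append((sorted(partners), d))  # several models, one detection
    return cases


cases = classify_mapping(
    model_ids=["g1", "g2", "g3", "g4", "g5"],
    detected_ids=["r1", "r2", "r3", "r4", "r5"],
    matches=[("g1", "r1"), ("g2", "r2"), ("g3", "r2"), ("g4", "r3"), ("g4", "r4")],
)
```

The classification only needs the match relation, so the same sketch would apply to symbols once a region-matching rule is chosen.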
22
Performance characterization: Mapping, application to symbols
  • Which representation?
    • Candidates: point; wrapper box or ellipse; convex polygon; concave polygon
    • Trade-offs: the precision will depend on the model; a representation could be of weak precision, or precise but time consuming to compare
  • How to define the regions?
    • Groundtruth: does the polarized part of the capacitor belong to the symbol? Same question for the moving area of the door.
    • Segmentation: systems providing regions of interest can tune their results; how to limit the over-segmentation cases?
    • How to define local thresholds?
  • Compatibility with recognition systems?
    • Many systems use sliding windows to detect symbols, providing only points (Adam2001, Dosch2004, Rusinol2007).
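The compatibility question can be illustrated with the simplest region representation, the wrapper box. The sketch below is an assumption-laden illustration, not a criterion proposed in the talk: a system that returns only points can be checked with a point-in-box test, while a system that returns boxes can be scored with an overlap ratio (intersection over union). The function names and the `(x0, y0, x1, y1)` box layout are hypothetical.

```python
def point_in_box(point, box):
    """True if a detected point falls inside a groundtruth box (x0, y0, x1, y1)."""
    px, py = point
    x0, y0, x1, y1 = box
    return x0 <= px <= x1 and y0 <= py <= y1


def box_area(box):
    """Area of a box given as (x0, y0, x1, y1)."""
    return (box[2] - box[0]) * (box[3] - box[1])


def box_overlap(a, b):
    """Intersection over union of two boxes; 0.0 when they do not intersect."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (box_area(a) + box_area(b) - inter)


hit = point_in_box((5, 5), (0, 0, 10, 10))
overlap = box_overlap((0, 0, 10, 10), (5, 0, 15, 10))
```

A polygon representation would replace both tests with polygon operations, which is where the comparison cost mentioned on the slide comes from.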
23
Performance characterization: Work in progress
  • Comparison of some criteria
  • System of Qureshi08, 100 floorplans (2521 symbols)
  • Signature-based characterization
24
Plan
  1. Groundtruth and test documents
  2. Performance characterization
  3. Conclusions and perspectives

25
Conclusions and perspectives
  • Conclusions
    • Large databases of segmented symbol images exist (GREC)
    • Synthetic databases in real context exist (SESYD)
    • Real-life documents and groundtruth are around the corner (EPEIRES)
    • Characterization tools have been proposed (SymbolRec)
  • Perspectives
    • Continue to produce other databases, using existing platforms
    • Mapping is the key problem today for achieving performance evaluation in real context

26
Thanks
  • All the referenced papers can be found in:
  • [1] M. Delalandre, E. Valveny and J. Lladós. Performance Evaluation of Symbol Recognition and Spotting Systems: An Overview. Workshop on Document Analysis Systems (DAS), pp. 497-505, 2008.