Performance Evaluation Measures for Face Detection Algorithms


Transcript and Presenter's Notes

1
Performance Evaluation Measures for Face
Detection Algorithms
  • Prag Sharma, Richard B. Reilly
  • DSP Research Group,
  • Department of Electronic and Electrical
    Engineering,
  • University College Dublin, Ireland.

2
Aim
  • To highlight the lack of standard performance
    evaluation measures for face detection purposes.
  • To propose a method for the evaluation and
    comparison of existing face detection algorithms
    in an unbiased manner.
  • To apply the proposed method to an existing
    face detection algorithm.

3
Face Detection Applications and Challenges Posed
4
Need for Face Detection
  • Face Recognition
  • Intelligent Vision-based Human Computer
    Interaction
  • Object-based Video Processing
  • Content-based functionalities
  • Improved Coding Efficiency
  • Improved Error-Robustness
  • Content Description

5
Challenges Associated with Face Detection
  • Pose Estimation and Orientation
  • Presence or Absence of Structural Components
  • Facial Expressions and Occlusion
  • Imaging Conditions

6
Performance Evaluation Measures
7
Need for Standard Performance Evaluation Measures
  • Comparison and testing are the main drivers of
    research progress.
  • To obtain an impartial and empirical evaluation
    and comparison of any two methods, it is
    important to consider the following points:
  • Use of a standard and representative test set for
    evaluation.
  • Use of standard terminology for the presentation
    of results.

8
Standard and Representative Test Set for
Evaluation

9
Use of Standard Terminology
  • Lack of standard terminology to describe results
    leads to difficulty in comparing algorithms.
  • E.g., while one algorithm may count a detection
    as successful if the bounding box contains the
    eyes and mouth, another may require the entire
    face (including forehead and hair) to be enclosed
    in a bounding box for a positive result.

Successful face detection by (a) Rowley et al.
and (b) Hsu et al.
10
Use of Standard Terminology
  • Lack of standard terminology to describe results
    leads to difficulty in comparing algorithms.
  • Moreover, there may be differences in the
    definition of a face (e.g., cartoon, hand-drawn
    or human faces).

11
Use of Standard Terminology
  • Therefore, the first step towards a standard
    evaluation protocol is to answer the following
    questions:
  • What is a face?
  • What constitutes successful face detection?

12
Use of Standard Terminology
  • What is a face?
  • Several databases contain human faces, animal
    faces, cartoon faces, line-drawn faces, frontal
    and profile view faces.
  • MIT-23 contains 23 images with 149 faces.
  • MIT-20 contains only 20 images with 136 faces
    (excluding hand-drawn and cartoon faces).
  • CMU: Rowley established ground truth for 483
    faces in this database by excluding some of the
    occluded faces and non-human faces.

Therefore, the total number of faces in a database
can vary between algorithms!
13
Use of Standard Terminology
  • To eliminate this problem:
  • Use only standard databases that come with
    clearly marked faces in terms of cartoon/human,
    pose, orientation, occlusion and presence or
    absence of structural components such as glasses
    or sunglasses.
  • Previous work in this area has led to the
    development of the UCD Colour Face Image
    Database. Each face in the database is marked
    using clearly defined terms.
    http://dsp.ucd.ie/prag
  • This eliminates misinterpretation of pose
    variations, orientation, etc. between
    researchers, as a fixed count of cartoon faces,
    hand-drawn faces and faces in different poses and
    orientations is provided with the database.
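A per-face annotation of this kind can be sketched as a simple record; the field names below are illustrative, not the UCD database's actual schema:

```python
from dataclasses import dataclass

@dataclass
class FaceAnnotation:
    # Illustrative fields covering the labels the text calls for;
    # the UCD Colour Face Image Database defines its own terms.
    face_type: str          # "human", "cartoon", or "hand-drawn"
    pose: str               # e.g. "frontal" or "profile"
    orientation_deg: float  # in-plane rotation of the face
    occluded: bool          # partially hidden by another object
    has_glasses: bool       # presence of a structural component

ann = FaceAnnotation("human", "frontal", 0.0, False, True)
```

With every face labelled this way, the number of cartoon, hand-drawn, or profile faces in the test set is fixed by the database rather than re-decided by each researcher.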

14
Use of Standard Terminology
  • What constitutes successful face detection?
  • Most face detection algorithms do not clearly
    define a successful face detection process.
  • A uniform criterion should be adopted to define a
    successful detection.
  1. Test image.
  2. Possible face detection results to be classified
    as face or non-face.

15
Use of Standard Terminology
  • What constitutes successful face detection?
  • Criterion adopted by Rowley: the center of the
    detected bounding box must be within four pixels,
    and the scale within a factor of 1.2 (their scale
    step size), of the ground truth (recorded
    manually).
  • Face detection results should be presented in a
    manner that leaves the interpretation of results
    open to specific applications.
  • Graphical representation: number of faces vs.
    percentage overlap.
  • Use a database that comes with hand-segmented
    results outlining each face, e.g. the UCD Colour
    Face Image Database.

Therefore, a correct face detection is one in
which the bounding box includes the visible eyes
and the mouth region, and the overlap between the
hand-segmented result and the detection result is
greater than a fixed threshold (the threshold
depending on the application).
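This overlap criterion can be sketched in a few lines, assuming faces and detections are given as (x1, y1, x2, y2) corner tuples; measuring overlap as the fraction of the hand-segmented region covered by the detection, and the default 85% threshold, are illustrative choices, not part of the paper's specification:

```python
def overlap_percentage(det, gt):
    """Percentage of the hand-segmented box `gt` covered by the detection `det`.

    Boxes are (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2.
    """
    ix1, iy1 = max(det[0], gt[0]), max(det[1], gt[1])
    ix2, iy2 = min(det[2], gt[2]), min(det[3], gt[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)  # zero if boxes do not intersect
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return 100.0 * (iw * ih) / gt_area

def is_correct_detection(det, gt, threshold=85.0):
    # A detection is correct when the overlap with the hand-segmented
    # result exceeds an application-dependent threshold.
    return overlap_percentage(det, gt) >= threshold
```

The same helper also yields the data for the "number of faces vs. percentage overlap" plot recommended above.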
16
Use of Standard Terminology
  • What constitutes successful face detection?
  • Use of standard terminology in describing
    results.
  • Detection rate: the ratio of the number of faces
    correctly detected to the number of faces
    determined by a human expert (hand-segmented
    results).
  • False positives: an image region is declared to
    be a face but is not.
  • False negatives: an image region that is a face
    is not detected at all.
  • False detections = False positives + False
    negatives.
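These definitions translate directly into code; a minimal sketch (the function name and argument layout are illustrative):

```python
def detection_metrics(num_ground_truth, num_correct, num_false_positives):
    """Compute the standard terminology from raw counts.

    num_ground_truth: faces marked by a human expert (hand-segmented results)
    num_correct: faces correctly detected
    num_false_positives: non-face regions declared to be faces
    """
    # Detection rate: correctly detected faces over expert-marked faces.
    detection_rate = num_correct / num_ground_truth
    # False negatives: ground-truth faces the detector missed entirely.
    false_negatives = num_ground_truth - num_correct
    # False detections = false positives + false negatives.
    false_detections = num_false_positives + false_negatives
    return detection_rate, false_negatives, false_detections
```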

17
Use of Standard Terminology
  • What constitutes successful face detection?
  • For methods that require training:
  • The number and variety of training examples
    have a direct effect on the classification
    performance.
  • The training and execution time varies for
    different algorithms. 
  • Most of these systems can often be tested at
    different threshold values to balance the
    detection rate and the number of false positives.

18
Use of Standard Terminology
  • What constitutes successful face detection?
  • To standardize this variability:
  • Training should be completed on a different
    dataset prior to testing.
  • The number and variety of training examples
    should be left to the algorithm developer.
  • The training and execution time should always
    be mentioned for all algorithms that require
    training.
  • All methods should present results in terms of
    an ROC curve.
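An ROC curve of this kind can be sketched by sweeping the detector's threshold and recording one (false positives, detection rate) point per setting; representing detector output as (score, is_true_face) pairs is an assumption for illustration, not any particular algorithm's interface:

```python
def roc_points(scored_detections, num_ground_truth, thresholds):
    """One (false positives, detection rate) point per threshold.

    scored_detections: list of (score, is_true_face) pairs from the detector.
    num_ground_truth: number of expert-marked faces in the test set.
    """
    points = []
    for t in thresholds:
        # Keep only detections whose confidence clears this threshold.
        kept = [is_face for score, is_face in scored_detections if score >= t]
        true_pos = sum(kept)
        false_pos = len(kept) - true_pos
        points.append((false_pos, true_pos / num_ground_truth))
    return points
```

Plotting these points shows the correct-detection/false-positive trade-off at a glance, which is why the procedure asks every trainable method to report one.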

19
Overall Procedure
  • Employ a colour face detection database that
    comes with hand-segmented results in the form of
    eyes and mouth coordinates along with segmented
    face regions.
  • The face database should also contain details of
    the faces in standard terminology of pose,
    orientation, occlusion and presence of structural
    components, along with the type of faces
    (hand-drawn, cartoon, etc.).
  • Clearly define the type of faces the algorithm
    can detect.

20
Overall Procedure
  • For algorithms that require training, the
    training should be completed prior to testing
    using face recognition databases for the face
    class and the boot-strap training technique for
    the non-face class.
  • All results should be presented in the form of
    two graphical plots: ROC curves to show the
    correct-detection/false-positive trade-off, and
    the "number of faces vs. percentage overlap"
    plot to determine correct face detection.
  • All results should also present the training and
    execution times for comparison.

21
Presentation of Results
  • The above procedure is implemented for the
    performance evaluation of a previously developed
    face detection algorithm as follows:
  • The colour face detection database chosen is the
    HHI MPEG-7 image database.
  • The algorithm developed does not require any
    training before execution. 
  • The results are presented in terms of number of
    faces vs. percentage overlap for the HHI MPEG-7
    database (see figure).
  • Since there is no adjustable threshold, an ROC
    curve is not presented.
  • The execution time is 3.54 seconds/image on a
    Pentium III processor.

22
Presentation of Results
The graph shows that there are 13 faces with no
overlap (i.e. false detections) and 43 faces with
over 85% overlap with the hand-segmented results.
23
Conclusions
  • This paper highlights the problems associated
    with evaluating and comparing the performance of
    new and existing face detection methods in an
    unbiased manner.
  • A solution in the form of a standard procedure
    for the evaluation and presentation of results
    has been presented.
  • The evaluation procedure described in this paper
    concentrates on using standard terminology along
    with carefully labelled face databases for
    evaluation purposes.
  • The method also recommends that results be
    presented graphically: ROC curves to show the
    correct-detection/false-positive trade-off, and
    the "number of faces vs. percentage overlap"
    plot to determine correct face detection
    accuracy.

24
Questions??