Title: Performance Evaluation Measures for Face Detection Algorithms
1. Performance Evaluation Measures for Face Detection Algorithms
- Prag Sharma, Richard B. Reilly
- DSP Research Group, Department of Electronic and Electrical Engineering, University College Dublin, Ireland.
2. Aim
- To highlight the lack of standard performance evaluation measures for face detection purposes.
- To propose a method for the evaluation and comparison of existing face detection algorithms in an unbiased manner.
- To apply the proposed method to an existing face detection algorithm.
3. Face Detection Applications and Challenges Posed
4. Need for Face Detection
- Face Recognition
- Intelligent Vision-based Human-Computer Interaction
- Object-based Video Processing
  - Content-based functionalities
  - Improved coding efficiency
  - Improved error-robustness
  - Content description
5. Challenges Associated with Face Detection
- Pose and Orientation
- Presence or Absence of Structural Components
- Facial Expressions and Occlusion
- Imaging Conditions
6. Performance Evaluation Measures
7. Need for Standard Performance Evaluation Measures
- Comparison and testing are the main drivers of research advancement.
- To obtain an impartial and empirical evaluation and comparison of any two methods, it is important to consider the following points:
  - Use of a standard and representative test set for evaluation.
  - Use of standard terminology for the presentation of results.
8. Standard and Representative Test Set for Evaluation
9. Use of Standard Terminology
- Lack of standard terminology to describe results leads to difficulty in comparing algorithms.
- E.g., while one algorithm may consider a detection successful if the bounding box contains the eyes and mouth, another may require the entire face (including forehead and hair) to be enclosed in the bounding box for a positive result.
Figure: Successful face detection by (a) Rowley et al. and (b) Hsu et al.
10. Use of Standard Terminology
- Lack of standard terminology to describe results leads to difficulty in comparing algorithms.
- Moreover, there may be differences in the definition of a face (e.g., cartoon, hand-drawn or human faces).
11. Use of Standard Terminology
- Therefore, the first step towards a standard evaluation protocol is to answer the following questions:
  - What is a face?
  - What constitutes successful face detection?
12. Use of Standard Terminology
- What is a face?
  - Several databases contain human faces, animal faces, cartoon faces, line-drawn faces, and frontal and profile view faces.
  - MIT-23 contains 23 images with 149 faces.
  - MIT-20 contains only 20 images with 136 faces (excluding hand-drawn and cartoon faces).
  - For the CMU database, Rowley established ground truth for 483 faces by excluding some of the occluded faces and the non-human faces.
- Therefore, the total number of faces in a database can vary for different algorithms!
13. Use of Standard Terminology
- To eliminate this problem:
  - Use only standard databases that come with clearly marked faces in terms of cartoon/human, pose, orientation, occlusion and the presence or absence of structural components such as glasses or sunglasses.
  - Previous work in this area has led to the development of the UCD Colour Face Image Database, in which each face is marked using clearly defined terms (http://dsp.ucd.ie/prag).
  - This eliminates any misinterpretation of pose variations, orientation etc. by different researchers, as a fixed number of cartoon faces, hand-drawn faces and faces in different poses and orientations is provided with the database.
14. Use of Standard Terminology
- What constitutes successful face detection?
  - Most face detection algorithms do not clearly define a successful face detection process.
  - A uniform criterion should be adopted to define a successful detection.
Figure: (a) Test image. (b) Possible face detection results to be classified as face or non-face.
15. Use of Standard Terminology
- What constitutes successful face detection?
  - Criterion adopted by Rowley: the center of the detected bounding box must be within four pixels, and the scale must be within a factor of 1.2 (their scale step size), of the ground truth (recorded manually).
  - Face detection results should be presented in such a manner that the interpretation of the results remains open for specific applications.
  - Graphical representation: number of faces vs. percentage overlap.
  - Use a database that comes with hand-segmented results outlining each face, e.g. the UCD Colour Face Image Database.
- Therefore, a correct face detection is one in which the bounding box includes the visible eyes and the mouth region, and the overlap between the hand-segmented result and the detection result is greater than a fixed threshold (the threshold depending on the application).
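The overlap criterion above can be sketched in a few lines of Python. This is a minimal sketch, assuming axis-aligned (x1, y1, x2, y2) boxes and measuring overlap as the fraction of the hand-segmented region covered by the detection; the box representation and function names are illustrative, not taken from the paper.

```python
def overlap_percentage(gt, det):
    """Percentage of the ground-truth (hand-segmented) box covered by the detection.

    Boxes are axis-aligned (x1, y1, x2, y2) tuples -- an assumed representation.
    """
    ix1, iy1 = max(gt[0], det[0]), max(gt[1], det[1])
    ix2, iy2 = min(gt[2], det[2]), min(gt[3], det[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)  # intersection width/height
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return 100.0 * iw * ih / gt_area if gt_area else 0.0


def is_correct_detection(gt, det, threshold=85.0):
    # A detection counts as correct when its overlap with the hand-segmented
    # ground truth exceeds the application-dependent threshold.
    return overlap_percentage(gt, det) >= threshold
```

A perfectly aligned detection gives 100% overlap; a disjoint one gives 0% and is rejected at any threshold.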
16. Use of Standard Terminology
- What constitutes successful face detection?
  - Use standard terminology in describing results:
    - Detection rate: the ratio of the number of faces correctly detected to the number of faces determined by a human expert (hand-segmented results).
    - False positive: an image region is declared to be a face when it is not.
    - False negative: an image region that is a face is not detected at all.
    - False detections = false positives + false negatives.
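The definitions above can be sketched as a small summary helper. The function and its inputs are hypothetical; the counts are assumed to come from comparing detections against hand-segmented ground truth.

```python
def summarize(num_ground_truth, num_correct, num_false_positives):
    """Summarize detection results using the terminology defined above."""
    false_negatives = num_ground_truth - num_correct
    return {
        # faces correctly detected / faces determined by a human expert
        "detection_rate": num_correct / num_ground_truth,
        "false_positives": num_false_positives,
        "false_negatives": false_negatives,
        # false detections = false positives + false negatives
        "false_detections": num_false_positives + false_negatives,
    }
```

For example, 90 correct detections out of 100 ground-truth faces with 5 false positives gives a 90% detection rate and 15 false detections.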
17. Use of Standard Terminology
- What constitutes successful face detection?
  - For methods that require training:
    - The number and variety of training examples have a direct effect on the classification performance.
    - The training and execution time varies for different algorithms.
    - Most of these systems can often be tested at different threshold values to balance the detection rate against the number of false positives.
18. Use of Standard Terminology
- What constitutes successful face detection?
  - To standardize this variability:
    - Training should be completed on a different dataset prior to testing.
    - The number and variety of training examples should be left to the algorithm developer.
    - The training and execution time should always be reported for all algorithms that require training.
    - All methods should present results in terms of an ROC curve.
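The ROC recommendation amounts to sweeping the decision threshold and recording the detection-rate/false-positive trade-off at each setting. A minimal sketch, assuming the classifier produces a per-region score (the score lists and function name are illustrative):

```python
def roc_points(face_scores, nonface_scores, thresholds):
    """One (false positives, detection rate) point per threshold value.

    face_scores: classifier scores for true face regions;
    nonface_scores: scores for non-face regions.
    """
    points = []
    for t in thresholds:
        # Detection rate: fraction of true faces scoring at or above t.
        detection_rate = sum(s >= t for s in face_scores) / len(face_scores)
        # False positives: non-face regions that also pass the threshold.
        false_positives = sum(s >= t for s in nonface_scores)
        points.append((false_positives, detection_rate))
    return points
```

Plotting these points (false positives on the x-axis, detection rate on the y-axis) yields the ROC curve; lowering the threshold raises both quantities.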
19. Overall Procedure
- Employ a colour face detection database that comes with hand-segmented results in the form of eye and mouth coordinates along with segmented face regions.
- The face database should also describe the faces in standard terminology of pose, orientation, occlusion and presence of structural components, along with the type of faces (hand-drawn, cartoon etc.).
- Clearly define the type of faces the algorithm can detect.
20. Overall Procedure
- For algorithms that require training, the training should be completed prior to testing, using face recognition databases for the face class and the bootstrap training technique for the non-face class.
- All results should be presented in the form of two graphical plots: ROC curves to show the correct-detection/false-positive trade-off, and "number of faces vs. percentage overlap" for determining correct face detection.
- All results should also report the training and execution times for comparison.
21. Presentation of Results
- The above procedure is applied to the performance evaluation of a previously developed face detection algorithm as follows:
  - The colour face detection database chosen is the HHI MPEG-7 image database.
  - The algorithm does not require any training before execution.
  - The results are presented in terms of number of faces vs. percentage overlap for the HHI MPEG-7 database (see figure).
  - Since there is no adjustable threshold, the ROC curve is not presented.
  - The execution time is 3.54 seconds/image on a Pentium III processor.
22. Presentation of Results
- The graph shows that there are 13 faces with no overlap (i.e. false detections) and 43 faces with over 85% overlap with the hand-segmented results.
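The "number of faces vs. percentage overlap" plot is essentially a histogram of per-face overlap values. A minimal sketch, assuming one overlap percentage per ground-truth face; the bin edges are illustrative, not the ones used for the graph above.

```python
def overlap_histogram(overlaps, bin_edges=(0, 25, 50, 75, 85, 100)):
    """Bin per-face overlap percentages for the 'faces vs. overlap' plot.

    overlaps: one overlap percentage per ground-truth face.
    Returns one count per bin; faces at 0% overlap (false detections)
    land in the first bin, and 100% lands in the last.
    """
    counts = [0] * (len(bin_edges) - 1)
    for o in overlaps:
        for i in range(len(bin_edges) - 1):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            # Bins are half-open [lo, hi), except the final bin, which
            # is closed so that 100% overlap is counted.
            if lo <= o < hi or (o == hi == bin_edges[-1]):
                counts[i] += 1
                break
    return counts
```

Plotting the bin counts against the bin edges reproduces the shape of the graph described above.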
23. Conclusions
- This paper highlights the problems associated with evaluating and comparing the performance of new and existing face detection methods in an unbiased manner.
- A solution in the form of a standard procedure for the evaluation and presentation of results has been presented.
- The evaluation procedure described in this paper concentrates on using standard terminology along with carefully labelled face databases for evaluation purposes.
- The method also recommends that results be presented graphically: ROC curves to show the correct-detection/false-positive trade-off, and "number of faces vs. percentage overlap" to determine correct face detection accuracy.
24. Questions?