1
A Robust Real-Time Face Detection
2
Outline
  • AdaBoost Learning Algorithm
  • Face Detection in real life
  • Using AdaBoost for Face Detection
  • Improvements
  • Demonstration

3
AdaBoost
  • A Short Introduction to Boosting (Freund & Schapire, 1999)
  • Logistic Regression, AdaBoost and Bregman Distances (Collins, Schapire, Singer, 2002)

4
Boosting
  • The horse-racing gambler problem: derive rules of thumb from a set of races
  • How should we choose the set of races in order to get the best rules of thumb?
  • How should the rules be combined into a single, highly accurate prediction rule?
  • Boosting!

5
AdaBoost - the idea
  • AdaBoost agglomerates many weak classifiers into one strong classifier.
  • Initialize sample weights
  • For each cycle:
    • Find a classifier that performs well on the weighted sample
    • Increase weights of misclassified examples
  • Return a weighted list of classifiers

(Figure: a toy sample plotted on IQ and shoe-size axes, split by successive weak classifiers)
6
AdaBoost - algorithm
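A minimal Python sketch of the loop just described (our illustration, not the deck's original pseudocode); weak_learner(X, y, w) is a hypothetical helper that returns a callable classifier with outputs in {-1, +1} fitted to the weighted sample:

import numpy as np

def adaboost(X, y, weak_learner, n_rounds):
    # y must be a numpy array with labels in {-1, +1};
    # start with uniform sample weights
    m = len(y)
    w = np.full(m, 1.0 / m)
    hypotheses, alphas = [], []
    for _ in range(n_rounds):
        h = weak_learner(X, y, w)        # fit to the weighted sample
        pred = h(X)
        eps = w[pred != y].sum()         # weighted training error
        if eps >= 0.5:                   # no better than chance: stop
            break
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        w *= np.exp(-alpha * y * pred)   # up-weight the misclassified examples
        w /= w.sum()                     # renormalize to a distribution
        hypotheses.append(h)
        alphas.append(alpha)
    # the strong classifier is the sign of the weighted vote
    return lambda Z: np.sign(sum(a * h(Z) for a, h in zip(alphas, hypotheses)))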
7
AdaBoost training error
  • Freund and Schapire (1997) proved that the training error of the final classifier is at most \prod_t 2\sqrt{\epsilon_t(1-\epsilon_t)} = \prod_t \sqrt{1-4\gamma_t^2} \le \exp\left(-2\sum_t \gamma_t^2\right), where \epsilon_t is the weighted error of the t-th weak hypothesis and \gamma_t = 1/2 - \epsilon_t
  • Thus AdaBoost ADApts to the error rates of the individual weak hypotheses.

8
AdaBoost generalization error
  • Freund and Schapire (1997) showed that, with high probability, the generalization error is at most the training error plus \tilde{O}\left(\sqrt{Td/m}\right), where T is the number of boosting rounds, d the VC-dimension of the weak hypothesis space, and m the training set size

9
AdaBoost generalization error
  • The analysis implies that boosting will overfit if run for too many rounds
  • However, it was observed empirically that AdaBoost often does not overfit, even when run for thousands of rounds
  • Moreover, the generalization error was observed to continue decreasing long after the training error reached zero

10
AdaBoost generalization error
  • An alternative, margin-based analysis that fits the empirical findings was presented by Schapire et al. (1998): with high probability the generalization error is at most \hat{\Pr}\left[\mathrm{margin}(x,y) \le \theta\right] + \tilde{O}\left(\sqrt{d/(m\theta^2)}\right) for any \theta > 0, a bound independent of the number of rounds

11
AdaBoost different point of view
  • We try to approximate the labels y_i using a linear combination of weak hypotheses
  • In other words, we are interested in finding a vector of parameters a such that f(x_i) = \sum_j a_j h_j(x_i) is a good approximation of y_i
  • For classification problems we try to match the sign of f(x_i) to y_i

12
AdaBoost different point of view
  • Sometimes it is advantageous to minimize some other (non-negative) loss function instead of the number of classification errors
  • For AdaBoost the loss function is \sum_i \exp(-y_i f(x_i))
  • This point of view was used by Collins, Schapire and Singer (2002) to demonstrate that AdaBoost converges to optimality

13
Face Detection (not face recognition)
14
Face Detection in Monkeys
  • There are cells that detect faces

15
Face Detection in Humans
  • There are dedicated processes for face detection

16
Faces Are Special
  • We analyze faces in a different way

19
Face Recognition in Humans
  • We analyze faces in a specific brain location

20
Robust Real-Time Face Detection
  • Viola and Jones, 2003

21
Features
  • Picture analysis, Integral Image

22
Features
  • The system classifies images based on the value
    of simple features

Two-rectangle, three-rectangle, and four-rectangle features
Value = Σ(pixels in white area) − Σ(pixels in black area)
23
Contrast Features
(Figure: a source image and the resulting contrast-feature output)
24
Features
  • Notice that each feature is tied to a specific location within the sub-window
  • Why features and not pixels?
    • They encode domain knowledge
    • A feature-based system operates faster
    • Inspiration from human V1

25
Features
  • Later we will see that there are other features
    that can be used to implement an efficient face
    detector
  • The original system of Viola and Jones used only
    rectangle features

26
Computing Features
  • Given a base detection resolution of 24x24 pixels, the exhaustive set of rectangle features numbers about 160,000!
  • We need to find a way to rapidly compute the
    features

27
Integral Image
  • Intermediate representation of the image
  • Computed in one pass over the original image

28
Integral Image
Using the integral image representation one can compute the value of any rectangular sum in constant time. For example, the sum inside rectangle D is ii(4) + ii(1) - ii(2) - ii(3), where ii(n) is the integral image value at corner n of the rectangle.
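A brief sketch of both ideas in Python, assuming a numpy image and inclusive rectangle coordinates (the corner handling mirrors the ii(1)..ii(4) notation above):

import numpy as np

def integral_image(img):
    # ii(x, y) = sum of img over all pixels above and to the left, inclusive;
    # two cumulative sums give the one-pass computation
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    # sum over rows top..bottom and columns left..right:
    # D = ii(4) + ii(1) - ii(2) - ii(3)
    total = ii[bottom, right]                     # ii(4)
    if top > 0:
        total -= ii[top - 1, right]               # ii(2)
    if left > 0:
        total -= ii[bottom, left - 1]             # ii(3)
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]            # ii(1)
    return total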
29
Integral Image
30
Building a Detector
  • Cascading, training a cascade

31
Main Ideas
  • The Features will be used as weak classifiers
  • We will concatenate several detectors serially
    into a cascade
  • We will boost (using a version of AdaBoost) a
    number of features to get good enough detectors

33
Weak Classifiers
  • Weak classifier: the single feature that best separates the examples
  • Given a sub-window x, a feature f, a threshold T, and a polarity p indicating the direction of the inequality, the classifier is h(x) = 1 if p \cdot f(x) < p \cdot T and h(x) = 0 otherwise, as sketched below
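A one-line sketch of this rule (names follow the slide's notation):

def weak_classify(f_x, T, p):
    # h(x) = 1 when p * f(x) < p * T, else 0; polarity p is +1 or -1
    return 1 if p * f_x < p * T else 0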

34
Weak Classifiers
  • A weak classifier is a combination of a feature
    and a threshold
  • We have K features
  • We have N thresholds where N is the number of
    examples
  • Thus there are KN weak classifiers

35
Weak Classifier Selection
  • For each feature, sort the examples by feature value
  • For each element, evaluate the total sums of positive/negative example weights (T+ / T-) and the sums of positive/negative weights below the current example (S+ / S-)
  • The error for a threshold which splits the range between the current and previous example in the sorted list is e = min(S+ + (T- - S-), S- + (T+ - S+))

36
An example

x   y   f  w    T+   T-   S+   S-   A    B    e
X1  -1  2  1/5  3/5  2/5  0    0    2/5  3/5  2/5
X2  -1  3  1/5  3/5  2/5  0    1/5  1/5  4/5  1/5
X3  +1  5  1/5  3/5  2/5  0    2/5  0    5/5  0
X4  +1  7  1/5  3/5  2/5  1/5  2/5  1/5  4/5  1/5
X5  +1  8  1/5  3/5  2/5  2/5  2/5  2/5  3/5  2/5

Here A = S+ + (T- - S-), B = S- + (T+ - S+), and e = min(A, B); the best split (e = 0) falls just below X3.
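A short Python sketch of the scan using the table's data; it prints the same errors e as the rightmost column:

f = [2, 3, 5, 7, 8]            # feature values, already sorted
y = [-1, -1, +1, +1, +1]       # example labels
w = [0.2] * 5                  # example weights (all 1/5 here)

T_pos = sum(wi for wi, yi in zip(w, y) if yi == +1)   # T+ = 3/5
T_neg = sum(wi for wi, yi in zip(w, y) if yi == -1)   # T- = 2/5
S_pos = S_neg = 0.0            # weight sums below the current example

for fi, yi, wi in zip(f, y, w):
    A = S_pos + (T_neg - S_neg)    # error if below-threshold is labeled negative
    B = S_neg + (T_pos - S_pos)    # error if below-threshold is labeled positive
    print(f"threshold below f={fi}: e = {min(A, B):.2f}")
    if yi == +1:
        S_pos += wi
    else:
        S_neg += wi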
37
Main Ideas
  • The Features will be used as weak classifiers
  • We will concatenate several detectors serially
    into a cascade
  • We will boost (using a version of AdaBoost) a
    number of features to get good enough detectors

39
Cascading
  • We start with simple classifiers which reject
    many of the negative sub-windows while detecting
    almost all positive sub-windows
  • A positive result from the first classifier triggers the evaluation of a second (more complex) classifier, and so on
  • A negative outcome at any point leads to the immediate rejection of the sub-window, as sketched below
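A minimal sketch of this control flow, assuming each stage is a callable that returns its weighted vote for a sub-window:

def cascade_classify(sub_window, stages):
    # stages: list of (boosted_classifier, stage_threshold) pairs,
    # ordered from the simplest/cheapest to the most complex
    for classifier, threshold in stages:
        if classifier(sub_window) < threshold:
            return False           # rejected: no further stages are evaluated
    return True                    # accepted by every stage: report a face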

40
Cascading
41
Main Ideas
  • The Features will be used as weak classifiers
  • We will concatenate several detectors serially
    into a cascade
  • We will boost (using a version of AdaBoost) a
    number of features to get good enough detectors

43
Training a cascade
  • User selects values for:
    • Maximum acceptable false positive rate per layer
    • Minimum acceptable detection rate per layer
    • Target overall false positive rate
  • User gives a set of positive and negative examples

44
Training a cascade (cont.)
  • While the overall false positive rate is not met:
    • While the false positive rate of the current layer is more than the per-layer maximum, train a classifier with n features (increasing n) using AdaBoost on the set of positive and negative examples
    • Decrease the threshold for the current classifier until the detection rate of the layer is more than the per-layer minimum
    • Evaluate the current cascaded classifier on a validation set
    • Evaluate the current cascaded detector on a set of non-face images and put any false detections into the negative training set (see the sketch below)
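A rough sketch of this loop; train_stage, measure_fp, and collect_false_positives are hypothetical callables standing in for the steps above:

def train_cascade(pos, neg, non_faces, f_max, d_min, F_target,
                  train_stage, measure_fp, collect_false_positives):
    # train_stage(pos, neg, n): boost n features, then lower the stage
    #   threshold until the layer's detection rate is at least d_min
    # measure_fp(stage, neg): the layer's false positive rate on a validation set
    # collect_false_positives(stages, non_faces): new negative training set
    stages, F_overall = [], 1.0
    while F_overall > F_target:          # overall goal not yet met
        n, f_layer, stage = 0, 1.0, None
        while f_layer > f_max:           # layer still lets too much through
            n += 1
            stage = train_stage(pos, neg, n)
            f_layer = measure_fp(stage, neg)
        stages.append(stage)
        F_overall *= f_layer
        neg = collect_false_positives(stages, non_faces)
    return stages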

45
Results
46
Training Data Set
  • 4916 hand-labeled faces
  • Aligned to base resolution (24x24)
  • Non-faces for the first layer were collected from 9500 non-face images
  • Non-faces for subsequent layers were obtained by scanning the partial cascade across non-face images and collecting false positives (at most 6000 per layer)

47
Structure of the Detector
  • 38-layer cascade
  • 6060 features in total

48
Speed of final Detector
  • On a 700 MHz Pentium III processor, the face detector can process a 384x288-pixel image in about 0.067 seconds

49
Improvements
  • Learning Object Detection from a Small Number of Examples: the Importance of Good Features (Levy & Weiss, 2004)

50
Improvements
  • Performance depends crucially on the features that are used to represent the objects (Levy & Weiss, 2004)
  • Good features imply:
    • Good results from small training databases
    • Better generalization abilities
    • Shorter (faster) classifiers

51
Edge Orientation Histogram
  • Invariant to global illumination changes
  • Captures geometric properties of faces
  • Represents domain knowledge:
    • The inner part of the face includes more horizontal edges than vertical ones
    • The ratio between vertical and horizontal edges is bounded
    • The area of the eyes includes mainly horizontal edges
    • The chin has roughly the same number of oblique edges on both sides

52
Edge Orientation Histogram
  • The EOH can be calculated using the same kind of integral image
  • We find the gradients at the point (x, y) using Sobel masks
  • We calculate the orientation of the edge at (x, y)
  • We divide the edges into K orientation bins
  • The result is stored in K matrices
  • We use the same integral-image idea for each of the K matrices (see the sketch below)
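A sketch under numpy/scipy assumptions; the choice of K and the use of scipy.ndimage.sobel are ours:

import numpy as np
from scipy.ndimage import sobel

def eoh_integral_images(img, K=8):
    gx = sobel(img.astype(float), axis=1)   # horizontal gradient (Sobel mask)
    gy = sobel(img.astype(float), axis=0)   # vertical gradient (Sobel mask)
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx)        # edge orientation in (-pi, pi]
    # assign each pixel's edge magnitude to one of K orientation bins
    bins = ((orientation + np.pi) / (2 * np.pi) * K).astype(int) % K
    # one integral image per bin, so any rectangular EOH sum is constant-time
    return [np.where(bins == k, magnitude, 0.0).cumsum(axis=0).cumsum(axis=1)
            for k in range(K)]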

53
EOH Features
  • The ratio between two orientations
  • The dominance of a given orientation
  • Symmetry Features

54
Results
  • Already with only 250 positive examples we see a detection rate above 90%
  • Faster classifier
  • Better performance on profile faces

55
Demo: Implementing the Viola-Jones System (Frank Fritze, 2004)