Synergistic Face Detection and Pose Estimation

About This Presentation

Slides: 26

Provided by: Rita157

Transcript and Presenter's Notes

1
Synergistic Face Detection and Pose Estimation

M. Osadchy (Technion), M. Miller (NEC Labs), Y. LeCun (NYU)

2
Our System
No tracking!

Detects faces independently of their poses.
Estimates head poses.

3
Our System

Robust to yaw (from left to right profile),
roll (-45°, 45°), and pitch (-60°, 60°).
A single detector is applied to all poses.
Pose estimation: within 15° error, about 90% of
poses are estimated correctly.
Near real-time: 5 frames per second on standard
hardware.

4
Synergy
Common Problems

Intra-class variation (skin color, hair style,
etc.)
Lighting Variations
Scale Variations
Facial Expressions

Multi-view face detection and pose estimation are
closely related.
5
Integrating Face Detection and Pose Estimation
Previous Methods
Image -> rough pose estimation -> pose-specific face detector
Unmanageable in real problems
6
Integrating Face Detection and Pose Estimation
Our Approach
Mapping G takes image X into a low-dimensional space
8
Integrating Face Detection and Pose Estimation
Our Approach
Train the mapping G into the low-dimensional space
10
Integrating Face Detection and Pose Estimation
Our Approach
Apply the mapping G to image X to get a point in the low-dimensional space
11
Parameterization of the Face Manifold: Single Parameter
Yaw
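
The formula for this parameterization is not preserved in the transcript. The following is a minimal sketch that assumes the three-cosine form given in the authors' accompanying paper: yaw is embedded as three shifted cosine responses and recovered with an arctangent. The names `ALPHAS`, `yaw_to_manifold`, and `manifold_to_yaw` are hypothetical, and the offsets are an assumption rather than something stated on the slide.

```python
import math

# Assumed basis offsets (taken from the paper's formulation, not from this slide).
ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)

def yaw_to_manifold(theta):
    """Map a yaw angle (radians) to a 3-D point on the assumed face manifold."""
    return [math.cos(theta - a) for a in ALPHAS]

def manifold_to_yaw(point):
    """Recover yaw from a 3-D point; exact when the point lies on the manifold."""
    s = sum(p * math.sin(a) for p, a in zip(point, ALPHAS))
    c = sum(p * math.cos(a) for p, a in zip(point, ALPHAS))
    return math.atan2(s, c)

theta = math.radians(40)
recovered = manifold_to_yaw(yaw_to_manifold(theta))
print(round(math.degrees(recovered), 6))
```

With these offsets the cross terms cancel, so the round trip returns the original angle for any yaw between the left and right profiles.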
12
Parameterization of the Face Manifold: Two Parameters
Yaw and roll
A portion of the surface of a sphere
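
As a hedged illustration of the two-parameter case, the sketch below assumes the product-of-cosines form from the authors' paper: a 3x3 grid of responses, one per pair of yaw and roll offsets. Nine values would be consistent with the nine network outputs listed on the architecture slide. The function names and both offset tuples are assumptions made for illustration.

```python
import math

ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)  # assumed yaw offsets
BETAS = (-math.pi / 3, 0.0, math.pi / 3)   # assumed roll offsets

def pose_to_manifold(yaw, roll):
    """Return a 3x3 grid (9 values) of products of cosine responses."""
    return [[math.cos(yaw - a) * math.cos(roll - b) for b in BETAS]
            for a in ALPHAS]

def manifold_to_pose(grid):
    """Recover (yaw, roll) from a 3x3 grid lying on the assumed manifold."""
    def weighted(f, g):
        return sum(grid[i][j] * f(a) * g(b)
                   for i, a in enumerate(ALPHAS)
                   for j, b in enumerate(BETAS))
    cc = weighted(math.cos, math.cos)
    cs = weighted(math.cos, math.sin)
    sc = weighted(math.sin, math.cos)
    ss = weighted(math.sin, math.sin)
    total = math.atan2(cs + sc, cc - ss)  # equals yaw + roll on the manifold
    diff = math.atan2(sc - cs, cc + ss)   # equals yaw - roll on the manifold
    return 0.5 * (total + diff), 0.5 * (total - diff)

yaw, roll = math.radians(60), math.radians(-30)
yaw_hat, roll_hat = manifold_to_pose(pose_to_manifold(yaw, roll))
```

The decoding works because the four weighted sums factor into sine and cosine terms of the two angles, so their combinations give the sum and difference of yaw and roll.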
13
Minimum Energy Machine
14
Operating the Machine

Clamp X to the observed value (the image)
Find the Z and Y that minimize the complete energy.
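
The energy formula itself is not preserved in the transcript. The sketch below assumes the form used in the authors' paper, which matches the "switch" and "T" labels on the architecture slide: for a face (Y = 1) the energy is the distance between the network output G(X) and the manifold point F(Z), and for a non-face (Y = 0) it is the constant T. The grid search over yaw, the single-parameter manifold, and all function names are illustrative assumptions.

```python
import math

ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)  # assumed manifold offsets

def manifold_point(yaw):
    """F(Z): assumed analytical mapping from pose (yaw) onto the manifold."""
    return [math.cos(yaw - a) for a in ALPHAS]

def energy(label, yaw, g_x, threshold):
    """Assumed energy: distance to the manifold for faces, constant T otherwise."""
    if label == 1:
        return math.dist(g_x, manifold_point(yaw))
    return threshold

def infer(g_x, threshold, steps=181):
    """With X clamped (g_x stands in for G(X)), pick the lowest-energy label and pose."""
    candidates = [-math.pi / 2 + math.pi * k / (steps - 1) for k in range(steps)]
    best_yaw = min(candidates, key=lambda z: energy(1, z, g_x, threshold))
    face_energy = energy(1, best_yaw, g_x, threshold)
    label = 1 if face_energy < threshold else 0
    return label, (best_yaw if label == 1 else None), min(face_energy, threshold)

# A point on the manifold is accepted as a face; a distant point is rejected.
face_result = infer(manifold_point(math.radians(30)), threshold=1.0)
nonface_result = infer([5.0, 5.0, 5.0], threshold=1.0)
```

Under this assumption, detection and pose estimation come out of one minimization: the label is decided by comparing the best face energy with T, and the pose is the Z that achieved it.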

16
Architecture
[Diagram: the convolutional network, with parameters W, processes the image X; an analytical mapping takes the pose Z onto the face manifold; a switch set by the label Y selects between the face energy and T.]
17
Convolutional Network

End-to-end trainable systems, from low-level
features to high-level representations.
Easily learn the kind of shift-invariant
features relevant to object recognition.
Can be replicated over large images much more
efficiently than traditional classifiers.

Considerable advantage for real-time systems!
18
Similar to LeNet5, with more maps
Input: 32x32
C1: 8 feature maps @ 28x28 (convolutions)
S1: 8 feature maps @ 14x14 (subsampling)
C3: 20 feature maps @ 10x10 (convolutions)
S4: 20 feature maps @ 5x5 (subsampling)
C5: 120 (convolutions)
Output: 9 (full connection)
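
The map sizes on this slide can be checked with simple arithmetic. The sketch below assumes 5x5 convolution kernels without padding and 2x2 subsampling, as in LeNet5; the slide does not state these values, so `KERNEL` and `POOL` are assumptions.

```python
KERNEL = 5  # assumed convolution kernel width, as in LeNet5
POOL = 2    # assumed subsampling factor

def conv_out(size, kernel):
    """Output width of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, factor):
    """Output width of non-overlapping subsampling."""
    return size // factor

layers = [("C1", "conv"), ("S1", "pool"), ("C3", "conv"),
          ("S4", "pool"), ("C5", "conv")]

size = 32
sizes = {"Input": size}
for name, kind in layers:
    size = conv_out(size, KERNEL) if kind == "conv" else pool_out(size, POOL)
    sizes[name] = size

print(sizes)
```

Under these assumptions the computed widths agree with the slide (28, 14, 10, 5), and C5 reduces each map to a single value, which is why it is listed only by its count of 120.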
19
Training with Discriminative Loss Function
Minimize a loss with two terms: the loss for face samples with known pose, taken over the training faces, and the loss for non-face samples, taken over the training non-faces.
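
The loss formulas are not preserved in the transcript. As a sketch, assuming the per-sample losses given in the authors' paper: a face sample contributes the squared energy at its known pose, and a non-face sample contributes K * exp(-E), where E is its lowest face energy over all poses. The constant `k`, its default value, and the function names are assumptions.

```python
import math

def face_loss(energy_at_true_pose):
    """Assumed face loss: pull the network output toward the correct manifold point."""
    return energy_at_true_pose ** 2

def nonface_loss(min_energy_over_poses, k=1.0):
    """Assumed non-face loss: push the output away from every point on the manifold."""
    return k * math.exp(-min_energy_over_poses)

def total_loss(face_energies, nonface_energies, k=1.0):
    """Average each term over its own training set, then add the two."""
    face_term = sum(face_loss(e) for e in face_energies) / len(face_energies)
    nonface_term = (sum(nonface_loss(e, k) for e in nonface_energies)
                    / len(nonface_energies))
    return face_term + nonface_term

near = total_loss([0.1, 0.2], [1.0, 1.5])
far = total_loss([0.1, 0.2], [5.0, 6.0])
```

The loss is discriminative in the sense that it falls both when faces land closer to the manifold and when non-faces land farther from it, so `far` is smaller than `near` here.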
20
Running the Machine

Works on grey-level images.
Applied at a range of scales, stepping by a factor
of [value missing in transcript].
The network is replicated over the image at each
scale, stepping by 4 pixels in x and y.
Overlapping detections are replaced by the
strongest.
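
The scanning and overlap steps above can be sketched as follows. This is an illustration only: it enumerates 32x32 window positions explicitly, whereas the slide describes replicating the network over the image, which is more efficient. The detection tuple layout, the convention that a higher score is stronger, and the function names are assumptions, and scale stepping is left out because the factor is not preserved in the transcript.

```python
def window_positions(width, height, window=32, stride=4):
    """Top-left corners of all windows, stepping by `stride` pixels in x and y."""
    return [(x, y)
            for y in range(0, height - window + 1, stride)
            for x in range(0, width - window + 1, stride)]

def boxes_overlap(a, b):
    """True if two square boxes (x, y, size, ...) intersect."""
    ax, ay, asize = a[:3]
    bx, by, bsize = b[:3]
    return (ax < bx + bsize and bx < ax + asize and
            ay < by + bsize and by < ay + asize)

def keep_strongest(detections):
    """Greedy suppression over (x, y, size, score) tuples: keep the strongest of each overlapping group."""
    kept = []
    for det in sorted(detections, key=lambda d: d[3], reverse=True):
        if not any(boxes_overlap(det, k) for k in kept):
            kept.append(det)
    return kept

positions = window_positions(40, 36)
detections = [(0, 0, 32, 0.9), (4, 4, 32, 0.8), (100, 100, 32, 0.5)]
kept = keep_strongest(detections)
```

In this example the two overlapping boxes near the origin collapse to the one with the higher score, while the isolated box is kept.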

21
Results

Our system is robust to yaw (from left to right
profile), in-plane rotation (-45°, 45°), and
pitch (-60°, 60°).

22
Training

52,850 32x32 grey-level images of faces (NEC
Labs hand-annotated set) with a uniform
distribution of poses.
Initial negative set: 52,850 random non-face
natural images.
Second phase: half of the initial negative set
was replaced by false positives of the initial
version of the detector.
Each training image was used 5 times with random
variation in scale, in-plane rotation, brightness
and contrast.
9 passes on the data: 26 hours on a 2 GHz Pentium 4.
The system converged to an EER of 5% on the training
set and 6% on a test set of 90,000 images.
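
The second training phase described above can be sketched as a simple set update. The slide does not say which half of the negative set was dropped or how the false positives were chosen, so the random selection here, the function name, and the seeded generator are assumptions.

```python
import random

def refresh_negatives(negatives, false_positives, rng=None):
    """Replace up to half of the negative set with detector false positives."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    n_replace = min(len(negatives) // 2, len(false_positives))
    survivors = rng.sample(negatives, len(negatives) - n_replace)
    hard = rng.sample(false_positives, n_replace)
    return survivors + hard

negatives = [f"neg{i}" for i in range(10)]
false_positives = [f"fp{i}" for i in range(8)]
updated = refresh_negatives(negatives, false_positives)
```

The size of the negative set stays the same, so the balance between face and non-face samples is unchanged while the negatives become harder.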

23
Test on Standard Data Sets

No standard set tests all the poses that our system
is designed to detect.
We use 3 standard sets, each focusing on a particular
pose variation: tilted, profile, and frontal.

Real time
24
Standard Sets
Pose Estimation of the detected faces
Detection
Note: typical pose estimation systems take
centered faces as input. When we hand-localize these
faces, we get 89% of yaw and 100% of in-plane rotation
estimates within 15 degrees.
25
Synergy Test
Detection
Pose Estimation
