1
Synergistic Face Detection and Pose Estimation
  • M. Osadchy (Technion), M. Miller (NEC Labs), Y. LeCun (NYU)

2
Our System
No tracking!
  • Detects faces independently of their poses.
  • Estimates head poses.

3
Our System
  • Robust to yaw (from left to right profile),
    roll (-45°, 45°), and pitch (-60°, 60°).
  • A single detector is applied to all poses.
  • Pose estimation: within 15° error, about 90% of
    poses are estimated correctly.
  • Near real-time: 5 frames per second on standard
    hardware.

4
Synergy
Common Problems
  • Intra-class variation (skin color, hair style,
    etc.)
  • Lighting Variations
  • Scale Variations
  • Facial Expressions

Multi-view face detection and pose estimation are closely related.
5
Integrating Face Detection and Pose Estimation
Previous Methods
[Diagram: image, rough pose estimation, pose-specific face detector]
Unmanageable in real problems
6
Integrating Face Detection and Pose Estimation
Our Approach
[Diagram: mapping G takes image X into a low-dimensional space]
8
Integrating Face Detection and Pose Estimation
Our Approach
[Diagram: training the mapping G into the low-dimensional space]
10
Integrating Face Detection and Pose Estimation
Our Approach
[Diagram: applying the mapping G to image X in the low-dimensional space]
11
Parameterization of the Face Manifold: Single Parameter
Yaw
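The slide's formula did not survive in this transcript. As a minimal sketch, assume the one-parameter manifold is the half-circle F_i(yaw) = cos(yaw - alpha_i), i = 1, 2, 3, with yaw in [-pi/2, pi/2]. The angles in `ALPHAS` and the function names are illustrative assumptions, not taken from the slides. Under that assumption, the manifold point nearest to a 3-D network output has a closed form:

```python
import math

# Hypothetical basis angles for the three output units (an assumption).
ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)

def face_manifold(yaw):
    """Point F(yaw) on the assumed manifold, F_i = cos(yaw - alpha_i)."""
    return [math.cos(yaw - a) for a in ALPHAS]

def nearest_yaw(g):
    """Yaw whose manifold point is closest to the 3-D output vector g."""
    # F(yaw) = cos(yaw) * c + sin(yaw) * s, with c_i = cos(alpha_i) and
    # s_i = sin(alpha_i). For these angles c and s are orthogonal with equal
    # norms, so projecting g onto the circle reduces to one atan2.
    gc = sum(gi * math.cos(a) for gi, a in zip(g, ALPHAS))
    gs = sum(gi * math.sin(a) for gi, a in zip(g, ALPHAS))
    yaw = math.atan2(gs, gc)
    # Clamp to the half-circle that spans left to right profile.
    return max(-math.pi / 2, min(math.pi / 2, yaw))
```

For a point that lies exactly on the manifold, `nearest_yaw(face_manifold(0.4))` recovers 0.4 up to rounding.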
12
Parameterization of the Face Manifold: Two Parameters
Yaw and roll
A portion of the surface of a sphere
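To illustrate the "surface of a sphere" remark, here is a sketch that assumes a product parameterization with nine components, cos(yaw - a_i) * cos(roll - b_j), using the same three hypothetical basis angles for both yaw and roll. The formula and the angles are assumptions, since the slide's own expression is missing. With these angles every manifold point has the same norm, so the manifold lies on a sphere in the 9-D output space:

```python
import math

# Assumed basis angles, shared by yaw and roll (hypothetical values).
BASIS = (-math.pi / 3, 0.0, math.pi / 3)

def face_manifold_2d(yaw, roll):
    """Assumed 9-D point with components cos(yaw - a_i) * cos(roll - b_j)."""
    return [math.cos(yaw - a) * math.cos(roll - b) for a in BASIS for b in BASIS]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))
```

Under these assumptions `norm(face_manifold_2d(yaw, roll))` equals 1.5 for any pose, because the sum of cos^2(angle - a_i) over the three basis angles is always 1.5.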
13
Minimum Energy Machine
14
Operating the Machine
  • Clamp X to the observed value (the image)
  • Find Z and Y such that the complete energy
    is minimized.

16
Architecture
[Diagram: X (image) enters a convolutional network with parameters W (param); Z (pose) enters an analytical mapping onto the face manifold; a switch set by Y (label) outputs the energy, which is T otherwise.]
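The energy formula itself is missing from this transcript. A minimal sketch, assuming the energy is the distance between the network output G(X) and the manifold point F(Z) when Y = 1 (face), and the constant threshold T otherwise, which is one reading of the "switch" in the diagram. The manifold formula, the names `energy` and `infer`, and the grid search over Z are illustrative; the slides describe an analytical mapping rather than a search.

```python
import math

ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)  # assumed manifold angles

def manifold(z):
    """Assumed face manifold point for pose z (yaw only)."""
    return [math.cos(z - a) for a in ALPHAS]

def energy(y, z, g, threshold):
    """Assumed energy: distance to F(z) for a face (y == 1), T otherwise."""
    if y == 1:
        return math.dist(g, manifold(z))
    return threshold

def infer(g, threshold, steps=181):
    """With X clamped (g = G(X)), search Y and Z for the minimum energy."""
    grid = [-math.pi / 2 + k * math.pi / (steps - 1) for k in range(steps)]
    best_z = min(grid, key=lambda z: energy(1, z, g, threshold))
    face_energy = energy(1, best_z, g, threshold)
    if face_energy < threshold:
        return 1, best_z, face_energy   # face, with its estimated pose
    return 0, None, threshold           # non-face: the energy is T
```

In this reading the window is labelled a face exactly when its output lies closer than T to the manifold, and the pose estimate comes for free as the minimizing Z.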
17
Convolutional Network
  • End-to-end trainable systems, from low-level
    features to high-level representations.
  • Easily learn the type of shift-invariant
    features relevant to object recognition.
  • Can be replicated over large images much more
    efficiently than traditional classifiers.

Considerable advantage for real-time systems!
18
Similar to LeNet5, with more maps
  • Input: 32x32
  • C1 (convolutions): 8 feature maps at 28x28
  • S2 (subsampling): 8 feature maps at 14x14
  • C3 (convolutions): 20 feature maps at 10x10
  • S4 (subsampling): 20 feature maps at 5x5
  • C5 (convolutions): 120
  • Output (full connection): 9
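The map sizes listed on this slide are consistent with unpadded 5x5 convolutions and 2x2 subsampling, as in LeNet5. The kernel and subsampling sizes are inferred from the listed sizes rather than stated on the slide, and the subsampling layer after C1 is called S2 here, following LeNet5 numbering. A short sketch that checks the arithmetic:

```python
def conv_size(size, kernel=5):
    """Output width of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def subsample_size(size, factor=2):
    """Output width after non-overlapping subsampling."""
    return size // factor

def layer_sizes(input_size=32):
    """Spatial width of each layer for a square input of the given size."""
    sizes = {"input": input_size}
    sizes["C1"] = conv_size(sizes["input"])
    sizes["S2"] = subsample_size(sizes["C1"])
    sizes["C3"] = conv_size(sizes["S2"])
    sizes["S4"] = subsample_size(sizes["C3"])
    sizes["C5"] = conv_size(sizes["S4"])
    return sizes

print(layer_sizes())
# {'input': 32, 'C1': 28, 'S2': 14, 'C3': 10, 'S4': 5, 'C5': 1}
```

Calling `layer_sizes(64)` gives a C5 width of 9, that is (64 - 32) / 4 + 1 window positions per row. Under these inferred sizes, replicating the network over a larger image evaluates a 32x32 window every 4 pixels.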
19
Training with Discriminative Loss Function
  • Minimize the total loss over the training faces
    and the training non-faces.
  • Loss for a face sample with known pose
  • Loss for a non-face sample
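The loss formulas are missing from this transcript. As a sketch only, assume a squared energy for faces, which pulls G(X) toward the manifold point of the known pose, and a decaying exponential of the minimum energy for non-faces, which pushes G(X) away from the manifold. The manifold formula, the constant `k`, and all function names are hypothetical.

```python
import math

ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)  # assumed manifold angles

def manifold(z):
    """Assumed face manifold point for pose z (yaw only)."""
    return [math.cos(z - a) for a in ALPHAS]

def face_loss(g, pose):
    """Assumed loss for a face with known pose: squared distance to F(pose)."""
    return math.dist(g, manifold(pose)) ** 2

def nonface_loss(g, k=1.0, steps=181):
    """Assumed loss for a non-face: k * exp(-min over Z of the distance)."""
    grid = [-math.pi / 2 + i * math.pi / (steps - 1) for i in range(steps)]
    min_energy = min(math.dist(g, manifold(z)) for z in grid)
    return k * math.exp(-min_energy)

def total_loss(face_outputs, face_poses, nonface_outputs):
    """Average face loss plus average non-face loss: the quantity to minimize."""
    faces = sum(face_loss(g, z) for g, z in zip(face_outputs, face_poses))
    nonfaces = sum(nonface_loss(g) for g in nonface_outputs)
    return faces / len(face_outputs) + nonfaces / len(nonface_outputs)
```

The design point this illustrates is that the two terms act in opposite directions: the face term is zero only on the manifold at the correct pose, while the non-face term shrinks as the output moves away from every point of the manifold.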
20
Running the Machine
  • Works on grey-level images.
  • Applied at a range of scales, stepping by a
    constant factor.
  • The network is replicated over the image at each
    scale, stepping by 4 pixels in x and y.
  • Overlapping detections are replaced by the
    strongest.
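A minimal sketch of the scanning procedure in the bullets above. The scale factor is left as a parameter because its value is missing from this transcript. The overlap rule (two squares intersect), the detection tuple layout `(x, y, size, score)`, and the convention that a larger `score` means a stronger detection are assumptions.

```python
def window_positions(width, height, window=32, step=4):
    """Top-left corners of every window on a grid with the given step."""
    return [(x, y)
            for y in range(0, height - window + 1, step)
            for x in range(0, width - window + 1, step)]

def scales(min_size, max_size, factor):
    """Window sizes from min_size up to max_size, multiplying by factor."""
    sizes, size = [], float(min_size)
    while size <= max_size:
        sizes.append(size)
        size *= factor
    return sizes

def overlaps(a, b):
    """True when two (x, y, size, score) squares intersect."""
    ax, ay, asz, _ = a
    bx, by, bsz, _ = b
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz

def keep_strongest(detections):
    """Replace each group of overlapping detections by its strongest member."""
    kept = []
    for det in sorted(detections, key=lambda d: d[3], reverse=True):
        if not any(overlaps(det, k) for k in kept):
            kept.append(det)
    return kept
```

For example, `keep_strongest([(0, 0, 32, 0.9), (4, 4, 32, 0.5), (100, 100, 32, 0.7)])` drops the weaker of the two overlapping windows and keeps the isolated one.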

21
Results
  • Our system is robust to yaw (from left to right
    profile), in-plane rotation (-45°, 45°), and
    pitch (-60°, 60°).

22
Training
  • 52,850 32x32 grey-level images of faces (NEC
    Labs hand-annotated set) with a uniform
    distribution of poses.
  • Initial negative set: 52,850 random non-face
    natural images.
  • Second phase: half of the initial negative set
    was replaced by false positives of the initial
    version of the detector.
  • Each training image was used 5 times with random
    variation in scale, in-plane rotation, brightness
    and contrast.
  • 9 passes over the data: 26 hours on a 2 GHz
    Pentium 4.
  • The system converged to an EER of 5% on the
    training set and 6% on a test set of 90,000 images.
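For reference, the equal error rate (EER) is the operating point at which the fraction of faces rejected equals the fraction of non-faces accepted. A small sketch of how it could be estimated from per-sample energies, assuming that lower energy means more face-like; the threshold sweep and the function names are illustrative, not the authors' evaluation code.

```python
def error_rates(face_energies, nonface_energies, threshold):
    """False-rejection and false-acceptance rates at one energy threshold."""
    frr = sum(e >= threshold for e in face_energies) / len(face_energies)
    far = sum(e < threshold for e in nonface_energies) / len(nonface_energies)
    return frr, far

def equal_error_rate(face_energies, nonface_energies):
    """Sweep observed energies as thresholds; return the rate where FRR ~ FAR."""
    def gap(threshold):
        frr, far = error_rates(face_energies, nonface_energies, threshold)
        return abs(frr - far)

    candidates = sorted(set(face_energies) | set(nonface_energies))
    best = min(candidates, key=gap)
    frr, far = error_rates(face_energies, nonface_energies, best)
    return (frr + far) / 2
```

With face energies `[0.1, 0.2, 0.3, 0.9]` and non-face energies `[0.25, 1.0, 1.1, 1.2]`, the two error rates meet at a threshold of 0.9, where one face in four is rejected and one non-face in four is accepted.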

23
Test on Standard Data Sets
  • No standard set tests all the poses that our
    system is designed to detect.
  • Three standard sets, each focusing on a particular
    pose variation: tilted, profile, and frontal.

Real time
24
Standard Sets
Pose Estimation of the detected faces
Detection
Note: typical pose estimation systems take centered
faces as input. When we hand-localize these faces,
we get 89% of yaw and 100% of in-plane rotations
within 15 degrees.
25
Synergy Test
Detection
Pose Estimation