Title: Estimating 3D Facial Pose in Video with Just Three Points
1. Estimating 3D Facial Pose in Video with Just Three Points
- Ginés García Mateos, Alberto Ruiz García (Dept. de Informática y Sistemas)
- P.E. López-de-Teruel, A.L. Rodríguez, L. Fernández (Dept. de Ingeniería y Tecnología de Computadores)
- University of Murcia, Spain
2. Introduction (1/3)
- Main objective: to develop a new method to estimate the 3D pose of the head of a human user.
- Estimation through a video sequence.
- Working with the minimum necessary information: a 2D location of the face.
- A very simple method, without training, running in real time (fast processing).
- Under realistic conditions: robust to facial expressions, lighting and movements.
- Robustness preferred over accuracy.
3. Introduction (2/3)
- 3D pose estimation using 3D tracking:
  - 3D morphable meshes (http://cvlab.epfl.ch/research/body)
  - Active Appearance Models (http://www.lysator.liu.se/eru/research/)
  - Cylindrical models (http://www.merl.com/projects/3Dfacerec/)
  - Shape + texture models (http://www.cs.bu.edu/groups/ivc/html/research_list.php)
4. Introduction (3/3)
- In short, we want to obtain something like this:
- The result is a 3D location (x, y, z) and a 3D orientation (roll, pitch, yaw): 6 D.O.F.
5. Index of the presentation
- Overview of the proposed method
- 2D facial detection and location
- 2D face tracking
- 3D facial pose estimation
  - 3D position
  - 3D orientation
- Experimental results
- Conclusions
6. Overview of the Proposed Method
- The key idea: separate the problems of 2D tracking and 3D pose estimation.
- Pipeline: 2D face detection → 2D face tracking → 3D pose estimation.
- The proposed 3D pose estimator could use any 2D facial tracker.
- Introducing some assumptions and simplifications, the pose is extracted with very little information.
7. 2D Face Detection, Location and Tracking Using I.P.
- We use a method based on integral projections (I.P.), which is simple and fast.
- Definition of I.P.: the average of the gray levels of an image i(x, y) along rows and columns:
  - Vertical projection: PV_i : {y_min, ..., y_max} → R, given by PV_i(y) = average of i(x, y) over all x.
  - Horizontal projection: PH_i : {x_min, ..., x_max} → R, given by PH_i(x) = average of i(x, y) over all y.
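The two projections defined above can be sketched in a few lines of pure Python, assuming the image is a list of rows of gray levels (the function names are ours, not from the paper):

```python
def vertical_projection(image):
    """PV_i: one average gray level per row y."""
    return [sum(row) / len(row) for row in image]

def horizontal_projection(image):
    """PH_i: one average gray level per column x."""
    height = len(image)
    width = len(image[0])
    return [sum(image[y][x] for y in range(height)) / height
            for x in range(width)]

# Tiny 2x3 example image.
img = [[10, 20, 30],
       [40, 50, 60]]
print(vertical_projection(img))    # [20.0, 50.0]
print(horizontal_projection(img))  # [25.0, 35.0, 45.0]
```

In practice these one-dimensional signals are what the detector and locator below analyze, instead of the full 2D image.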
8. 2D Face Detection with I.P.
- Global view of the I.P. face detector:
  - Step 1. Vertical projections by strips (PV_face).
  - Step 2. Horizontal projection of the candidates (PH_eyes).
  - Step 3. Grouping of the candidates.
- Input image → final result.
9. 2D Face Detection with I.P.
- To improve the results, we combine two face detectors into a combined detector:
  - Face Detector 1, to look for candidates: Haar + AdaBoost (Viola and Jones, 2001).
  - Face Detector 2, to verify face candidates: Integral Projections (García et al., 2007).
  - Their outputs are combined into the final detection result.
10. 2D Face Detection with I.P.
- Sample detection results (García et al., 2007).
11. 2D Face Location with I.P.
- Global view of the 2D face locator:
  - Step 1. Orientation estimation.
  - Step 2. Vertical alignment.
  - Step 3. Horizontal alignment.
- Input image and face → final result.
12. 2D Face Location with I.P.
- Location accuracy of the 2D face locator, and average processing times:

  Method      Avg. time (Pentium IV, 2.6 GHz)
  NeuralNet   323.6 ms
  EigenFeat   20.5 ms
  IntProj     1.7 ms
13. 2D Face Tracking with I.P.
14. 2D Face Tracking with I.P.
- Sample result of the proposed tracker:
  - (e1x, e1y): location of the left eye; (e2x, e2y): location of the right eye; (mx, my): location of the mouth.
- 320x240 pixels, 312 frames at 25 fps, laptop webcam.
15. 3D Facial Pose Estimation
- In theory, 3 points should be enough to solve for the 6 degrees of freedom (if the focal length and the face geometry are known).
- But:
  - Location errors are high in the mouth for non-frontal faces.
  - Some assumptions are introduced to avoid the effect of this error.
16. 3D Facial Pose Estimation
- Fixed-body assumption: the user's body stays fixed while the head moves → the 3D position is estimated in the first frame, and the 3D orientation in the following frames.
- A simple perspective projection model is used to estimate the 3D position.
17. 3D Position Estimation
- p = (px, py, pz): 3D position of the face; the camera is at the origin (0, 0, 0).
- f: focal length (known).
- (cx, cy): tracked center of the face, with cx = (e1x + e2x + mx)/3 and cy = (e1y + e2y + my)/3.
18. 3D Position Estimation
- We have:
  - cx/f = px/pz and cy/f = py/pz
- Where:
  - cx = (e1x + e2x + mx)/3 and cy = (e1y + e2y + my)/3
- So:
  - px = (e1x + e2x + mx)/3 · pz/f
  - py = (e1y + e2y + my)/3 · pz/f
- The depth of the face, pz, is computed as pz = f·t/r, where r is the apparent face size and t is the real size.
- For more information, see the paper.
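The position equations on this slide invert directly; a minimal Python sketch of the perspective model follows (the function name and argument conventions are ours, and the sample numbers are made up):

```python
def estimate_position(e1, e2, m, f, t, r):
    """3D face position from the slide's perspective model.

    e1, e2, m: (x, y) image coordinates of left eye, right eye, mouth.
    f: focal length in pixels; t: real face size; r: apparent face size
    in pixels (t and r measured the same way, e.g. inter-eye distance).
    """
    cx = (e1[0] + e2[0] + m[0]) / 3.0   # tracked center of the face
    cy = (e1[1] + e2[1] + m[1]) / 3.0
    pz = f * t / r                      # depth from apparent size: pz = f*t/r
    px = cx * pz / f                    # invert cx/f = px/pz
    py = cy * pz / f                    # invert cy/f = py/pz
    return (px, py, pz)

# Example: eyes 60 px apart (r), real inter-eye distance 6 cm (t),
# focal length 500 px -> depth pz = 500 * 6 / 60 = 50 cm.
p = estimate_position((-30, 0), (30, 0), (0, 40), f=500.0, t=6.0, r=60.0)
print(p)  # (0.0, 1.333..., 50.0)
```

Note that pz only needs to be computed once, in the first frame, under the fixed-body assumption.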
19. Estimation of Roll Angle
- The roll angle can be approximately associated with the 2D rotation of the face in the image:
  - roll = arctan((e2y - e1y) / (e2x - e1x))
- Sample results: roll = -43.7°, -2.8°, 15.9°, 34.6°.
- This equation is valid in most practical situations, but it is not precise in all cases.
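The roll formula is a one-liner; the sketch below uses atan2 rather than a bare arctan so the eye line can be vertical without a division by zero (the function name and sample coordinates are ours):

```python
import math

def estimate_roll(e1, e2):
    """Roll angle (degrees) from the 2D rotation of the eye line:
    roll = arctan((e2y - e1y) / (e2x - e1x)), computed with atan2."""
    return math.degrees(math.atan2(e2[1] - e1[1], e2[0] - e1[0]))

print(estimate_roll((100, 120), (160, 120)))  # 0.0 (level face)
print(estimate_roll((100, 120), (160, 150)))  # ~26.57 (tilted face)
```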
20. Estimation of Pitch and Yaw
- The head-neck system can be modeled as a robotic arm with 3 rotational DOF (roll, pitch and yaw).
- Figure: top, front and orthographic views of the model, with link lengths a, b and c on the X, Y and Z axes.
- In this model, any point of the head lies on a sphere → its projection is related to pitch and yaw.
21. Estimation of Pitch and Yaw
- rw: radius of the sphere on which the center of the eyes lies → rw = sqrt(a² + c²).
- ri: radius of the circle onto which that sphere is projected → ri = rw·f/pz.
- (dx0, dy0): initial center of the eyes.
- (dxt, dyt): current center of the eyes.
- Figure: the initial frame (pitch = 0, yaw = 0), and the eye centers (dx1, dy1) and (dx2, dy2) at instants t = 1 and t = 2, on a circle of radius ri.
22. Estimation of Pitch and Yaw
- In essence, we have the problem of computing altitude and latitude for a given point on a circle.
- The center of the circle is:
  - (dx0, dy0 - a·f/pz)
- So we have:
  - pitch = arcsin( (dyt - (dy0 - a·f/pz)) / ri ) - arcsin(a/c)
- And:
  - yaw = arcsin( (dxt - dx0) / (ri · cos(pitch + arcsin(a/c))) )
23. Experimental Results (1/7)
- Experiments carried out:
  - Off-the-shelf webcams.
  - Different individuals.
  - Variations in facial expressions and facial elements (glasses).
  - Studies of robustness and efficiency, and a comparison with a projection-based 3D estimation algorithm.
- On a Pentium IV at 2.6 GHz: 5 ms for file reading, 3 ms for tracking, 0.006 ms for pose estimation.
24. Experimental Results (2/7)
- Sample input video: bego.a.avi.
- 320x240 pixels, 312 frames at 25 fps, laptop webcam.
25. Experimental Results (3/7)
- 3D pose estimation results.
- 320x240 pixels, 312 frames at 25 fps, laptop webcam.
26. Experimental Results (4/7)
- Plots: pitch estimates of the proposed method vs. the projection-based method.
27. Experimental Results (5/7)
- Approx. 20° in pitch and 40° in yaw.
- The 2D tracker is not explicitly prepared for profile faces!
28. Experimental Results (6/7)
- With glasses and without glasses.
29. Experimental Results (7/7)
- When the fixed-body assumption does not hold:
  - Body/shoulder tracking could be used to compensate for body movement.
30. Conclusions (1/3)
- Our purpose was to design a fast, robust, generic and approximate 3D pose estimation method:
  - Separation of 2D tracking and 3D pose estimation.
  - Fixed-body assumption.
  - Robotic head model.
  - The 3D position is computed in the first frame.
  - The 3D orientation is estimated in the remaining frames.
- The estimation process is very simple, and avoids inaccuracies of the 2D tracker.
31. Conclusions (2/3)
- Future work: using the 3D pose estimator in a perceptual interface.
32. Conclusions (3/3)
- The simplifications introduced lead to several limitations of our system, but in general:
  - The human anatomy of the head/neck system could be used in 3D face trackers.
  - The human head cannot move independently of the body!
  - Taking advantage of these anatomical limitations could simplify and improve current trackers.
33. Last
- This work has been supported by the projects Consolider Ingenio-2010 CSD2006-00046 and TIN2006-15516-C04-03.
- Sample videos:
  - http://dis.um.es/ginesgm/fip
- Grupo PARP web page:
  - http://perception.inf.um.es/
- Thank you very much!