Title: A Non-obtrusive Head Mounted Face Capture System
1. A Non-obtrusive Head Mounted Face Capture System
Chandan K. Reddy, Master's Thesis Defense
Dr. George C. Stockman (Main Advisor), Dr. Frank Biocca (Co-Advisor), Dr. Charles Owen, Dr. Jannick Rolland (External Faculty)
2. Modes of Communication
- Text only, e.g. Mail, Electronic Mail
- Voice only, e.g. Telephone
- PC camera based conferencing, e.g. Web cam
- Multi-user Teleconferencing
- Teleconferencing through Virtual Environments
- Augmented Reality Based Teleconferencing
3. Face-to-Face Communication
"There is no landscape that we know as well as the human face. The twenty-five-odd square inches containing the features is the most intimately scrutinized piece of territory in existence, examined constantly, and carefully, with far more than an intellectual interest." - Gary Faigin
A well-developed Face-to-Face Communication System will advance the state of the art in teleconferencing systems.
The Face-to-Face Communication System is part of the Teleportal project that is being developed in the MIND Lab at Michigan State University and the ODA Lab at the University of Central Florida.
4. Problem Definition
- Face Capture System (FCS)
- Virtual View Synthesis
- Depth Extraction and 3D Face Modeling
- Head Mounted Projection Displays
- 3D Tele-immersive Environments
- High Bandwidth Network Connections
5. Thesis Contributions
- Complete hardware setup for the FCS.
- Camera-mirror parameter estimation for the optimal configuration of the FCS.
- Generation of quality frontal videos from two side videos.
- Reconstruction of a texture-mapped 3D face model from two side views.
- Evaluation mechanisms for the generated frontal views.
6. Existing Face Capture Systems
- FaceCap3D, a product from Standard Deviation
- Optical Face Tracker, a product from Adaptive Optics
- Advantages: freedom for head movements
- Drawbacks: obstruction of the user's field of view
- Main applications: character animation and mobile environments
7. Existing Face Capture Systems (Contd.)
- Sea of Cameras (UNC Chapel Hill)
- National Tele-immersion Initiative
- Advantages: no burden for the user
- Drawbacks: highly equipped environments and restricted head motion
- Main applications: teleconferencing and collaborative work
8. Virtual View Synthesis
- View Interpolation for Image Synthesis, by Chen and Williams '93
- View Morphing, by Seitz and Dyer '96
- The Lumigraph, by Gortler et al. '96
- Light Field Rendering, by Levoy and Hanrahan '96
- Stereo-based View Synthesis, by Kanade et al. '99
- Dynamic View Morphing, by Manning and Dyer '99
- Spatio-Temporal View Interpolation, by Vedula and Kanade '02
9. Depth Extraction and Face Modeling
- Depth Extraction
- Structured Light
- Shape from Shading
- Structure from Stereo
- Structure from Motion
- Face Modeling
- A parametric model of human faces, Parke '74
- 3D individualized head model from orthogonal views, Ip and Yin '96
- Realistic facial expressions synthesized from photographs, Pighin et al. '98
- Face model from a video sequence of face images, Lai and Cheng '01
10. Head Mounted Displays and Tele-immersive Environments
- Head Mounted Displays, Ivan Sutherland '68
- VIDEOPLACE, Krueger '85
- CAVEs, Cruz-Neira '93
- Teleconferencing using a Sea of Cameras, Fuchs et al. '94
- Head Mounted Projective Displays, Fischer '96
- Degenerate CAVEs (ImmersaDesk, Immersive WorkBench), Czernuszenko et al. '97
- Office of the Future, Raskar et al. '98
- MAGIC BOOK, Billinghurst et al. '01
- Mobile Displays, Feiner '02
11. Proposed Face Capture System
(F. Biocca and J. P. Rolland, Teleportal face-to-face system, patent filed, 2000.)
A novel Face Capture System that is being developed: two cameras capture the corresponding side views through the mirrors.
12. Advantages
- User's field of view is unobstructed
- Portable and easy to use
- Gives very accurate, high-quality face images
- Can process in real time
- Simple and user-friendly system
- Static with respect to the human head
- By flipping the mirrors, the cameras view the scene from the user's viewpoint
13. Applications
- Mobile Environments
- Collaborative Work
- Multi-user Teleconferencing
- Medical Areas
- Distance Learning
- Gaming and Entertainment industry
- Others
14. System Design
15. Equipment Required
16. Transmission using Internet2
- Over 190 universities are working in partnership with industry to develop Internet2.
- Internet2 connections are capable of transmitting full broadcast-quality video streams between remote collaborative sites using MPEG-2 video encoding and decoding technology.
- Suitable for high-bandwidth applications such as medical visualization, teleconferencing, and other applications that use enormous amounts of data.
- An Internet2 test bed has been established between the MIND Lab at Michigan State University and the ODA Lab at the University of Central Florida, implemented using MPEG-2 video streams.
17. Optical Layout
- Three Components to be considered
- Camera
- Mirror
- Human Face
18. Specification Parameters
- Camera
- Sensing area: 3.2 mm x 2.4 mm (1/4-inch format).
- Pixel dimensions: the image sensed is 768 x 494 pixels; the digitized image size is 320 x 240 due to RAM size restrictions.
- Focal length (Fc): 12 mm (VCL-12UVM).
- Field of view (FOV): 15.2 degrees x 11.4 degrees.
- Diameter (Dc): 12 mm.
- F-number (Nc): 1, to achieve maximum light gathering.
- Minimum working distance (MWD): 200 mm.
- Depth of field (DOF): to be estimated.
19. Specification Parameters (Contd.)
- Mirror
- Diameter (Dm) / F-number (Nm)
- Focal length (fm)
- Magnification factor (Mm)
- Radius of curvature (Rm)
- Human Face
- Height of the face to be captured (H = 250 mm)
- Width of the face to be captured (W = 175 mm)
- Distances
- Distance between the camera and the mirror (Dcm = 150 mm)
- Distance between the mirror and the face (Dmf = 200 mm)
20. Estimation of the Variable Parameters
The Imaging Equation
The diameter of the mirror: Dm = 26.32 / (10.16 N)
21. Optimal Design Calculations
22. Customization of Cameras and Mirrors
- Off-the-shelf cameras
- Customizing a camera lens is a tedious task.
- A trade-off has to be made between the field of view and the depth of field.
- The Sony DXC-LS1 with a 12 mm lens is suitable for our application.
- Custom-designed mirrors
- A plano-convex lens with 40 mm diameter is coated black on the planar side.
- The radius of curvature of the convex surface is 155.04 mm.
- The thickness at the center of the lens is 5 mm.
- The thickness at the edge is 3.7 mm.
23. Block Diagram of the System
24. Experimental Setup
25. Virtual Video Synthesis
26. Problem Statement
Generating a virtual frontal view from two side views
27. Data Processing
- Two synchronized videos are captured in real time (30 frames/sec) simultaneously.
- For effective capturing and processing, the data is stored in uncompressed format.
- Machine specifications (Lorelei at metlab.cse.msu.edu):
- Pentium III processor
- Processor speed: 746 MHz
- RAM size: 384 MB
- Hard disk write speed (practical): 9 MB/s
- MIL-LITE is configured to use 150 MB of RAM
28. Data Processing (Contd.)
- Size of 1 second of video: 30 x 320 x 240 x 3 bytes, about 6.59 MB.
- Using 150 MB of RAM, only 10 seconds of video from the two cameras can be captured.
- Why does the processing have to be offline?
- The calibration procedure is not automatic.
- The disk write speed must be at least 14 MB/s.
- To capture two videos at 640 x 480 resolution, the disk write speed must be at least 54 MB/s.
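The arithmetic above can be checked with a short sketch (pure Python; "MB" here means 2^20 bytes, and the function name is illustrative):

```python
# Back-of-envelope data rates for the capture setup described above.
# Frame geometry and rate are taken from the slides.

FPS = 30                 # frames per second
BYTES_PER_PIXEL = 3      # uncompressed RGB

def video_rate_mb(width, height, cameras=1):
    """Uncompressed video data rate in MB/s (1 MB = 2**20 bytes)."""
    return cameras * FPS * width * height * BYTES_PER_PIXEL / 2**20

print(f"{video_rate_mb(320, 240):.2f} MB/s per camera")       # ~6.59
print(f"{video_rate_mb(320, 240, 2):.2f} MB/s, two cameras")  # ~13.2, so >= 14 MB/s disk
print(f"{video_rate_mb(640, 480, 2):.2f} MB/s at 640x480")    # ~52.7, so >= 54 MB/s disk
```

This makes the offline constraint concrete: the measured 9 MB/s disk cannot keep up with even two 320 x 240 streams.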
29. Structured Light Technique
Projecting a grid on the frontal view of the face: a square grid in the frontal view appears as a quadrilateral (with curved edges) in the real side view.
30. Color Balancing
- Hardware-based approach: white balancing of the cameras.
- Why is this more robust than a software-based approach?
- There is no change to the input camera.
- Better handling of varying lighting conditions.
- No prior knowledge of the skin color is required.
- No additional overhead.
- It is enough if the two cameras are color balanced relative to each other.
31. Off-line Calibration Stage
Left Calibration Face Image
Right Calibration Face Image
Projector
Transformation Tables
32. Calibration Procedure
- Capture the two side views, with a grid projected on the face, from the two cameras placed near the two ears, and store them in the corresponding images (IL(s,t) and IR(u,v)).
- Take some grid intersection points and define transform functions for determining the (s,t) coordinates in the left image (IL) and the (u,v) coordinates in the right image (IR).
- Apply a bilinear interpolation technique to obtain any point inside the grid coordinates.
- Based on the transformation functions, construct two transformation tables (one for the left image and one for the right) which are indexed by (x,y) and give the corresponding (s,t) of IL and (u,v) of IR.
33. Operational Stage
Right Face Image
Left Face Image
Transformation Tables
Right Warped Face Image
Left Warped Face Image
Mosaiced Face Image
34. Generation of Virtual Frontal Views
- Get the two side views, without a grid projected on the face, from the two cameras placed near the two ears (IL and IR).
- Generate the (x,y) coordinate in the virtual view, move to the corresponding location in the transformation table, and store the mapping (Mp(x,y)) at that pixel value.
- Reconstruct the (x,y) coordinates of the frontal view (image V) with the help of Mp(x,y) and the values of IL(s,t) and IR(u,v).
- Smooth the geometric and lighting variations across the vertical midline in V by applying a linear (one-dimensional) filter.
- Continue this reconstruction of V(x,y) for every frame of the videos to produce the final virtual frontal video.
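The lookup-and-mosaic step above can be sketched as follows. This is a minimal illustration, assuming grayscale images stored as 2D lists and transformation tables stored as dicts from frontal-view pixels to source pixels; the actual system works on color video frames.

```python
# Hypothetical sketch of the operational stage: each frontal-view pixel (x, y)
# is filled from the side image that the precomputed table points it to.

def synthesize_frontal(left_img, right_img, left_table, right_table, width, height):
    """left_table maps (x, y) -> (s, t) in the left image;
    right_table maps (x, y) -> (u, v) in the right image."""
    frontal = [[0] * width for _ in range(height)]
    mid = width // 2
    for y in range(height):
        for x in range(width):
            if x < mid:                       # left half from the left camera
                s, t = left_table[(x, y)]
                frontal[y][x] = left_img[t][s]
            else:                             # right half from the right camera
                u, v = right_table[(x, y)]
                frontal[y][x] = right_img[v][u]
    return frontal
```

A one-dimensional smoothing filter would then be run over the columns around `mid`, as the steps above describe.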
35. Bilinear Mapping
- To get the corresponding (u,v) point inside the quadrilateral:
Computed by linearly interpolating by the fraction u along the top and bottom edges of the quadrilateral, and then linearly interpolating by the fraction v between the two interpolated points to yield the destination point.
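A minimal sketch of that bilinear mapping (the helper name and the corner ordering are assumptions for the example):

```python
def bilinear_map(corners, u, v):
    """Map fractions (u, v) in [0, 1] x [0, 1] to a point inside a
    quadrilateral given by its four corners (top-left, top-right,
    bottom-left, bottom-right), each an (x, y) pair."""
    (tlx, tly), (trx, trY), (blx, bly), (brx, brY) = corners
    # interpolate by fraction u along the top and bottom edges
    top = (tlx + u * (trx - tlx), tly + u * (trY - tly))
    bot = (blx + u * (brx - blx), bly + u * (brY - bly))
    # then interpolate by fraction v between the two edge points
    return (top[0] + v * (bot[0] - top[0]), top[1] + v * (bot[1] - top[1]))
```

On an axis-aligned unit square this reduces to ordinary bilinear interpolation; on a warped grid cell it bends the sampling accordingly.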
36. Virtual Video Synthesis (Calibration Phase)
37. Virtual Video Synthesis (Contd.)
38. Virtual Frontal Video
39. Comparison of the Frontal Views
First row: virtual frontal views. Second row: original frontal views.
40. Video Synchronization (Eye Blinking)
First row: virtual frontal views. Second row: original frontal views.
41. Face Data through the Head Mounted System
42. 3D Face Model
43. Coordinate Systems
- There are five coordinate systems in our application:
- World Coordinate System (WCS)
- Face Coordinate System (FCS)
- Left Camera Coordinate System (LCCS)
- Right Camera Coordinate System (RCCS)
- Projector Coordinate System (PCS)
44. Camera Calibration
- Conversion from 3D world coordinates to 2D camera coordinates
- Perspective Transformation Model
Eliminating the scale factor (with c34 normalized to 1) gives, for each calibration point j:
(c11 - c31 uj) xj + (c12 - c32 uj) yj + (c13 - c33 uj) zj + c14 = uj
(c21 - c31 vj) xj + (c22 - c32 vj) yj + (c23 - c33 vj) zj + c24 = vj
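Stacking two such linear equations per calibration point gives an overdetermined system in the 11 unknowns c11..c33, solvable by least squares. The sketch below is a pure-Python illustration; the camera matrix in the demo is synthetic, chosen only to exercise the method, and is not the thesis camera.

```python
# Least-squares (DLT-style) solution of the calibration equations above.

def project(C, X):
    """Project 3D point X through a 3x4 matrix C with c34 normalized to 1.
    C = [[c11..c14], [c21..c24], [c31, c32, c33]]."""
    x, y, z = X
    w = C[2][0] * x + C[2][1] * y + C[2][2] * z + 1.0
    return ((C[0][0] * x + C[0][1] * y + C[0][2] * z + C[0][3]) / w,
            (C[1][0] * x + C[1][1] * y + C[1][2] * z + C[1][3]) / w)

def calibrate(pts3d, pts2d):
    """Recover the 11 unknowns c11..c33 from point correspondences by
    forming the normal equations and solving with Gaussian elimination."""
    A, b = [], []
    for (x, y, z), (u, v) in zip(pts3d, pts2d):
        A.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]); b.append(u)
        A.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]); b.append(v)
    n, m = 11, len(A)
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    r = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    for i in range(n):                      # elimination with partial pivoting
        p = max(range(i, n), key=lambda k: abs(M[k][i]))
        M[i], M[p], r[i], r[p] = M[p], M[i], r[p], r[i]
        for k in range(i + 1, n):
            f = M[k][i] / M[i][i]
            for j in range(i, n):
                M[k][j] -= f * M[i][j]
            r[k] -= f * r[i]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution
        c[i] = (r[i] - sum(M[i][j] * c[j] for j in range(i + 1, n))) / M[i][i]
    return [c[0:4], c[4:8], c[8:11]]
```

With real data the calibration points come from the sphere described on the following slides; at least six non-coplanar points are needed for the system to be solvable.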
45. Calibration Sphere
- A sphere can be used for calibration.
- Calibration points on the sphere are chosen such that:
- The azimuthal angle is varied in steps of 45 degrees.
- The polar angle is varied in steps of 30 degrees.
- The location of these calibration points is known in the 3D coordinate system with respect to the origin of the sphere.
- The origin of the sphere defines the origin of the World Coordinate System.
46. Spherical to Cartesian Coordinates
- The 3D coordinates are known in the spherical coordinate system.
- A point (R, theta, phi) in the spherical coordinate system corresponds to the 3D location (Px, Py, Pz) in the Cartesian coordinate system:
- R: radius of the sphere
- theta: azimuthal angle in the xy-plane from the x-axis
- phi: polar angle from the z-axis (also known as the "colatitude" of P)
- Ranges: 0 <= theta <= 2*pi and 0 <= phi <= pi
- Px = R sin(phi) cos(theta)
- Py = R sin(phi) sin(theta)
- Pz = R cos(phi)
- Given (R, theta, phi) we can compute (Px, Py, Pz).
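These formulas translate directly into code. A small sketch generating the calibration grid at the stated angular steps (the unit radius and the exact polar range are illustrative choices):

```python
import math

def spherical_to_cartesian(R, theta, phi):
    """theta: azimuthal angle from the x-axis; phi: polar angle from the z-axis."""
    return (R * math.sin(phi) * math.cos(theta),
            R * math.sin(phi) * math.sin(theta),
            R * math.cos(phi))

# Calibration points: azimuth in 45-degree steps, polar angle in 30-degree steps.
points = [spherical_to_cartesian(1.0, math.radians(az), math.radians(pol))
          for az in range(0, 360, 45) for pol in range(30, 180, 30)]
```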
47. Projector Calibration
- Similar to camera calibration.
- 2D image coordinates cannot be obtained directly from a 2D image.
- A blank image is projected onto the sphere.
- The 2D coordinates of the calibration points in the projected image are noted.
- More points can be seen from the projector's point of view; some points are common to both camera views.
- Results appear to have slightly more error when compared to the camera calibration.
48. 3D Face Model Construction
- Why?
- To obtain different views of the face
- To generate the stereo pair to view it in the HMPD
- Steps required:
- Computation of 3D locations
- Customization of the 3D model
- Texture mapping
49. Computation of 3D Points
- 3D point estimation using stereo.
- Stereo between the two cameras is not possible because of occlusion by the facial features.
- Hence, two stereo pair computations:
- Left camera and projector
- Right camera and projector
- Using stereo, compute the 3D points of prominent facial feature points in the FCS.
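Each stereo pair reduces to the same triangulation problem: given two calibrated devices (a camera and the projector) and a matched image point in each, intersect the two viewing rays in a least-squares sense. A pure-Python sketch using the same c34 = 1 matrix convention as the calibration slide; the matrices in the test are synthetic examples, not the thesis calibration results.

```python
def triangulate(C1, uv1, C2, uv2):
    """Least-squares 3D point from two 3x4 device matrices (c34 = 1),
    C = [[c11..c14], [c21..c24], [c31, c32, c33]], and one pixel in each."""
    rows, rhs = [], []
    for C, (u, v) in ((C1, uv1), (C2, uv2)):
        # (c11 - u*c31) x + (c12 - u*c32) y + (c13 - u*c33) z = u - c14, etc.
        rows.append([C[0][i] - u * C[2][i] for i in range(3)]); rhs.append(u - C[0][3])
        rows.append([C[1][i] - v * C[2][i] for i in range(3)]); rhs.append(v - C[1][3])
    m = len(rows)
    # 3x3 normal equations, solved by Cramer's rule
    M = [[sum(rows[k][i] * rows[k][j] for k in range(m)) for j in range(3)]
         for i in range(3)]
    r = [sum(rows[k][i] * rhs[k] for k in range(m)) for i in range(3)]
    def det3(A):
        return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
              - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
              + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    d = det3(M)
    point = []
    for i in range(3):
        Mi = [row[:] for row in M]
        for k in range(3):
            Mi[k][i] = r[k]
        point.append(det3(Mi) / d)
    return tuple(point)
```

The four linear equations (two per device) overdetermine the three unknowns, so noisy correspondences still yield the best-fitting 3D point.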
50. 3D Generic Face Model
A generic face model with 395 vertices and 818 triangles. Left: front view. Right: side view.
51. Texture Mapped 3D Face
52. Evaluation
53. Evaluation Schemes
- Evaluation of facial expressions is not studied extensively in the literature.
- Evaluation can be done for facial alignment and face recognition for static images.
- Lip and eye movements in a dynamic event.
- Perceptual quality: how are the moods conveyed?
- Two types of evaluation:
- Objective evaluation
- Subjective evaluation
54. Objective Evaluation
- Theoretical evaluation
- No human feedback required
- This evaluation can give us a measure of:
- Face recognition
- Face alignment
- Facial movements
- Methods applied:
- Normalized cross-correlation
- Euclidean distance measures
55. Evaluation Images
Five frames were considered for objective evaluation. First row: virtual frontal views. Second row: original frontal views.
56. Normalized Cross-Correlation
- Regions considered for normalized cross-correlation
- (Left: real image. Right: virtual image.)
57. Normalized Cross-Correlation (Contd.)
- Let V be the virtual image and R be the real image.
- Let w be the width and h be the height of the images.
- The normalized cross-correlation between the two images V and R is given by
NCC(V, R) = sum over (x, y) of (V(x,y) - mean(V)) * (R(x,y) - mean(R)), divided by sqrt( sum of (V(x,y) - mean(V))^2 * sum of (R(x,y) - mean(R))^2 )
- where mean(V) and mean(R) are the mean pixel values of V and R over all w x h pixels.
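Normalized cross-correlation can be computed in a few lines; a sketch assuming equal-size grayscale images stored as 2D lists:

```python
def ncc(V, R):
    """Normalized cross-correlation of two equal-size grayscale images
    (2D lists). Returns 1.0 for a perfect linear match."""
    v = [p for row in V for p in row]
    r = [p for row in R for p in row]
    n = len(v)
    mv, mr = sum(v) / n, sum(r) / n
    num = sum((a - mv) * (b - mr) for a, b in zip(v, r))
    den = (sum((a - mv) ** 2 for a in v) * sum((b - mr) ** 2 for b in r)) ** 0.5
    return num / den
```

Because the mean is removed and the result is normalized, uniform brightness or contrast differences between the virtual and real views do not affect the score.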
58. Normalized Cross-Correlation (Contd.)
59. Euclidean Distance Measures
- The Euclidean distance between two points i and j is given by d(i,j) = sqrt((xi - xj)^2 + (yi - yj)^2).
- Let Rij be the Euclidean distance between two points i and j in the real image.
- Let Vij be the Euclidean distance between two points i and j in the virtual image.
- Dij = Rij - Vij
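A sketch of this measure over matched feature points (the helper names are illustrative, and taking the absolute value of Dij is an assumption; the slide's Dij could also be left signed):

```python
def dist(p, q):
    """Euclidean distance between two 2D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def distance_discrepancy(real_pts, virtual_pts, i, j):
    """|Rij - Vij|: how much the i-j feature distance differs
    between the real and the virtual frontal image."""
    return abs(dist(real_pts[i], real_pts[j]) - dist(virtual_pts[i], virtual_pts[j]))
```

A small Dij over pairs of facial landmarks indicates that the virtual view preserves the face geometry of the real view.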
60. Euclidean Distance Measures (Contd.)
61. Subjective Evaluation
- Evaluates human perception.
- Measurement of the quality of a talking face.
- Factors that might affect it:
- Quality of the video
- Facial movements and expressions
- Synchronization of the two halves of the face
- Color and texture of the face
- Quality of the audio
- Synchronization of the audio
- A preliminary study has been made to assess the quality of the generated videos.
62. Conclusion and Future Work
Results organized by time domain (static vs. dynamic) and dimension:
- 2D, static: Virtual Frontal Image (conclusion)
- 2D, dynamic: Virtual Frontal Video (conclusion)
- 3D, static: Texture Mapped 3D Face Model (conclusion)
- 3D, dynamic: 3D Facial Animation (future work)
63. Summary
- Design and implementation of a novel Face Capture System
- Generation of a virtual frontal view from two side views in a video sequence
- Extraction of depth information using the stereo method
- Texture-mapped 3D face model generation
- Evaluation of virtual frontal videos
64. Future Work
- Online processing in real time
- Automatic calibration
- 3D facial animation
- Subjective evaluation of the virtual frontal videos
- Data compression during processing and transmission
- Customization of camera lenses
- Integration with a Head Mounted Projection Display
65. Thank You
- Doubts, queries, and suggestions