Nicolas Galoppo von Borries

Slides: 20

Provided by: nicogalopp


Transcript and Presenter's Notes


1
Speech-synchronized facial animation

Speech animation using Viseme Space
G.A. Kalberer, P. Mueller and L. Van Gool

2
Introduction

Why not make the job even easier, and generate
animations directly from speech or transcribed
text?
Make animations directly from the audio channel,
without performance capture
This animation can be the point of departure for
the animators, who then also get support from the
system to make further changes as desired

3
Visemes

Visemes can be considered as the visual speech
counterparts of phonemes.
They are associated with 3D deformations of a
neutral face
Animation is achieved by concatenating visemes

VIDEO

4
Animating faces in Viseme Space

Smooth and convincing transitions by performing
interpolation in Viseme Space rather than in
geometric space
Viseme space can be roamed by the animator, as a
convenient tool to make creative modifications to
the animation.

5
Face animation with visemes

In related work, they describe how to extract a
set of visemes from a face, observed in 3D while
talking
This is a time-consuming process: we don't want to
repeat it for every single face to be animated

6
Animation of novel faces

What if we haven't observed visemes for a novel
face? There are 3 main steps to animate such a
face:
Personalizing the visemes
Automatic audio-based animation
Further modifications by the animator

7
Personalizing the Visemes

Simply cloning the visemes of a particular
example face onto the novel face doesn't look real
We represent faces as points in Face Space

8
Personalizing the Visemes

The novel face is projected orthogonally onto
this hyperplane

9
Personalizing the Visemes

Express the projected novel face as a weighted
combination of the example faces, with weights wi
Now we apply the weights wi to the visemes of the
example faces, to yield a personalized set of
visemes for the novel face
The effect: a rounded face will get visemes that
are closer to those of the more rounded example
face
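
The weighting step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes faces are plain coordinate vectors in Face Space, that the hyperplane is the linear span of the example faces, and that each viseme is stored as a deformation vector per example face. The helper names (`projection_weights`, `personalize`) and all numbers are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(A, b):
    """Solve the small linear system A w = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def projection_weights(examples, novel):
    """Weights w_i of the orthogonal projection of `novel` onto span(examples).

    Assumption: the example faces are linearly independent, so the
    normal equations (Gram matrix) have a unique solution.
    """
    gram = [[dot(fi, fj) for fj in examples] for fi in examples]
    rhs = [dot(fi, novel) for fi in examples]
    return solve(gram, rhs)

def personalize(example_visemes, weights):
    """Blend one viseme's deformation across the example faces with weights w_i."""
    n_dims = len(example_visemes[0])
    return [sum(w * v[d] for w, v in zip(weights, example_visemes))
            for d in range(n_dims)]

# Hypothetical data: two example faces and one novel face in a 3D Face Space.
examples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
novel = [0.75, 0.25, 0.4]          # the 0.4 component lies off the hyperplane
weights = projection_weights(examples, novel)

# Hypothetical deformation vectors of the same viseme on each example face.
viseme_on_examples = [[2.0, 0.0], [0.0, 4.0]]
personal_viseme = personalize(viseme_on_examples, weights)
```

With these toy numbers the novel face sits closer to the first example face, so its personalized viseme borrows more from that face's deformation.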

10
Automatic audio-based animation

Basic steps for animation:
From the audio track, extract allophones and their timings
Translate the allophones to visemes
Concatenate the visemes
The animation sequence is a trajectory in Viseme
Space (similar to Face Space)
Viseme Space is based on Independent Component
Analysis (ICA), as opposed to PCA
(more about ICA later)
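
The basic steps above can be sketched as a small pipeline. The allophone labels, the viseme class names, and the choice of placing each keyframe at the midpoint of its allophone are all assumptions made for illustration; the slides do not specify the mapping table or the timing rule, and the audio analysis itself is taken as given.

```python
# Hypothetical many-to-one table: several allophones share one viseme.
ALLOPHONE_TO_VISEME = {
    "p": "PBM", "b": "PBM", "m": "PBM",
    "f": "FV", "v": "FV",
    "a": "A", "o": "O",
}

def to_keyframes(segments):
    """Turn (allophone, start, end) segments into (time, viseme) keyframes.

    Assumption: each viseme is reached at the midpoint of its allophone.
    """
    keyframes = []
    for allophone, start, end in segments:
        viseme = ALLOPHONE_TO_VISEME[allophone]
        keyframes.append((0.5 * (start + end), viseme))
    return keyframes

# Hypothetical output of the audio analysis step, times in seconds.
segments = [("m", 0.0, 0.1), ("a", 0.1, 0.3), ("p", 0.3, 0.4)]
keyframes = to_keyframes(segments)
```

Each keyframe would then be replaced by the Viseme Space coordinates of its viseme, giving the points that the animation trajectory has to visit.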

11
Automatic audio-based animation

Two problems with straight interpolation:
Point-to-point navigation between visemes yields
jerky motions
The temporal samples in the audio track may not
coincide with the pace at which visemes change
Solution: fit a spline (NURBS) to the Viseme Space
coordinates
All the visited deformations look realistic
Fixed-rate sampling gives smooth interpolations
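
To illustrate the idea of a smooth curve through viseme keyframes that is then sampled at a fixed frame rate, here is a minimal sketch. It deliberately swaps in a Catmull-Rom spline for the NURBS fit named on the slide, because Catmull-Rom is short to write and passes through every keyframe; it is not the authors' method. Function names, key times, and coordinates are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, u):
    """Uniform Catmull-Rom segment between p1 (u=0) and p2 (u=1)."""
    return [
        0.5 * (2 * b + (c - a) * u
               + (2 * a - 5 * b + 4 * c - d) * u * u
               + (3 * b - a - 3 * c + d) * u ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    ]

def sample_trajectory(times, points, fps):
    """Sample the curve through (times[i], points[i]) at a fixed frame rate.

    Key times may be irregular; the output frames are evenly spaced.
    End points are duplicated to supply the missing neighbours.
    """
    n_frames = int(round((times[-1] - times[0]) * fps)) + 1
    frames = []
    seg = 0
    for k in range(n_frames):
        t = min(times[0] + k / fps, times[-1])
        # Advance to the key interval that contains t.
        while seg < len(times) - 2 and t > times[seg + 1]:
            seg += 1
        u = (t - times[seg]) / (times[seg + 1] - times[seg])
        p0 = points[max(seg - 1, 0)]
        p3 = points[min(seg + 2, len(points) - 1)]
        frames.append(catmull_rom(p0, points[seg], points[seg + 1], p3, u))
    return frames

# Hypothetical keyframes: irregular times, 2D Viseme Space coordinates.
times = [0.0, 0.2, 0.5]
points = [[0.0, 0.0], [1.0, 0.5], [0.0, 1.0]]
frames = sample_trajectory(times, points, fps=10)
```

The frame rate is decoupled from the viseme timings: the keyframes fall wherever the audio puts them, while the renderer receives evenly spaced frames.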

12
Automatic audio-based animation

2 types of visemes:
Vowels and labial consonants: strict deformations
Others can be pronounced with a lot of visual
variation
First perform the fitting using the first type, then
bend the curve towards the points of the second type.
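
The second pass can be pictured with the sketch below. The slides do not say how the curve is bent, so the scheme here is purely an assumption: frames already sampled from the curve through the strict visemes are pulled part of the way towards each loose viseme, with a triangular falloff in time. The parameters `strength` and `radius` are hypothetical knobs.

```python
def bend_towards(frames, fps, loose_targets, strength=0.5, radius=0.1):
    """Pull sampled frames towards loose viseme targets.

    frames        : Viseme Space points sampled at `fps` from the curve
                    fitted through the strict visemes
    loose_targets : list of (time, point) for the second type of viseme
    strength      : fraction of the gap closed exactly at the target time
    radius        : time window (seconds) over which the pull fades to zero
    """
    bent = [frame[:] for frame in frames]
    for t_target, target in loose_targets:
        for k, frame in enumerate(bent):
            # Triangular falloff: 1 at the target time, 0 beyond `radius`.
            falloff = max(0.0, 1.0 - abs(k / fps - t_target) / radius)
            pull = strength * falloff
            bent[k] = [x + pull * (q - x) for x, q in zip(frame, target)]
    return bent

# Hypothetical 1D example: a flat curve and one loose target at t = 0.1 s.
flat = [[0.0], [0.0], [0.0]]
bent = bend_towards(flat, fps=10, loose_targets=[(0.1, [1.0])])
```

Because the pull never reaches full strength, the curve only approaches the loose visemes, while frames outside the time window, including strict keyframes that lie far enough away, stay exactly where the first pass put them.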

13
Modifications by the animator

We want to allow the animator to add his creative
input to the generated animation
The animator can change:
The visited visemes
The spline trajectories in between

14
Modifications by the animator

What happens when the animator changes the spline
trajectory?
The space of possible deformations is the same
whether it is based on PCA or ICA
The difference: ICA generates independent
components

15
Modifications by the animator

In fact, PCA is a preprocessing step for ICA
PCA gives the major modes of variation in the net
effect of the movements of the individual
muscles
ICA decouples the net effect again. Therefore,
ICA makes the modifications by the animator more
intuitive.

16
Modifications by the animator

Example:
1 IC can model opening the mouth, whereas more PCs
are needed to model the same change
Changing 1 PC can open the mouth, but it will
also round it
Animators want intuitive keyframes like visemes,
but basic emotions as the primary modeling
interface
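
A toy calculation can make the PCA side of this example concrete. Everything here is assumed for illustration: a 2D measurement space, two independent actions ("open" and "round") with made-up effect vectors, and a small grid of activations. ICA itself is not run; the sketch only computes the first principal axis and expresses it in terms of the two actions.

```python
import math

# Hypothetical effect of each independent action on a 2D mouth measurement.
OPEN = (1.0, 0.0)
ROUND = (0.6, 0.8)

def first_principal_axis(samples):
    """Unit vector along the direction of largest variance of 2D samples."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    cxx = sum((x - mx) ** 2 for x, _ in samples) / n
    cyy = sum((y - my) ** 2 for _, y in samples) / n
    cxy = sum((x - mx) * (y - my) for x, y in samples) / n
    # Closed-form orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (math.cos(theta), math.sin(theta))

# Independent activations: every "open" level paired with every "round" level.
samples = [
    (s1 * OPEN[0] + s2 * ROUND[0], s1 * OPEN[1] + s2 * ROUND[1])
    for s1 in (-2, -1, 0, 1, 2)
    for s2 in (-1, 0, 1)
]

pc1 = first_principal_axis(samples)

# Decompose pc1 = a * OPEN + b * ROUND (valid because OPEN has no y component).
b = pc1[1] / ROUND[1]
a = pc1[0] - b * ROUND[0]
```

Both `a` and `b` come out non-zero, so a step along the first PC opens and rounds the mouth at once; a basis aligned with the two underlying actions, which is what ICA aims for, would let the animator change one without the other.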

17
Results

Cloning the visemes alone does not work;
weighting after projection in Face Space is needed

18
Results

With weights applied

And animated
VIDEO

19
Additional remarks

The main key to producing realistic animations is
to add non-verbal, speech-related facial
expressions
Add Perlin noise
Automatic Generation of Non-Verbal Facial
Expressions from Speech
Irene Albrecht, Jörg Haber, Hans-Peter Seidel
