Title: EE 492 ENGINEERING PROJECT
1 EE 492 ENGINEERING PROJECT
- LIP TRACKING
- Yusuf Ziya Isik, Ashat Turlibayev
- Advisor: Prof. Dr. Bülent Sankur
2 Outline
- IDENTIFICATION OF THE PROBLEM
- LIP CONTOUR EXTRACTION
- LIP TRACKING
- RESULTS AND CONCLUSION
- FUTURE WORK
3 IDENTIFICATION OF THE PROBLEM
- Automatic Speech Recognition (ASR) systems
- 1. Systems Using Only Acoustic Information
- - Poor performance in noisy environments
- 2. Bimodal Audio-Visual Systems
- - The visual signal often contains information that is complementary to the audio information
- - Visual information is not affected by acoustic noise
- - The overall performance of the combined system is better
4 Recognition rates of audio, visual, and audio-visual approaches
5 LIP READING
- Obtaining the visual information is known as the lip-reading problem.
- Lip tracking is a crucial step in extracting visual features.
6 LIP TRACKING
- The lip tracking problem can be solved in two steps:
- - Extracting the lip boundary in the first frame with the help of the user
- - Tracking the obtained contour through the subsequent frames automatically
7 Lip Contour Extraction
- Fully automatic segmentation is a very difficult task.
- Semi-automatic methods are therefore both unavoidable and desirable.
- Intelligent Scissors is a robust, accurate, and interactive semi-automatic boundary extraction tool that requires minimal user input.
8 Intelligent Scissors I
- The Intelligent Scissors tool extracts an object's contour from several seed points specified interactively by the user.
- The Intelligent Scissors algorithm converts object boundary extraction into the problem of optimal path search in a weighted graph.
9 Obtaining Weighted Graph
- Weighted Graph: A local cost is calculated from every pixel in the image to each of its neighbouring pixels.
- Local Cost Functionals
- - Laplacian zero crossing
- - Gradient Magnitude
- - Gradient Direction
- Pixels that exhibit strong edge features are given low local costs.
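The mapping from edge strength to low cost can be illustrated with a minimal sketch. It covers only the gradient-magnitude term; the Laplacian zero-crossing and gradient-direction terms, and the weights that combine them, are not given on the slides and are left out. The function names and the toy image are illustrative assumptions.

```python
def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2-D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Clamp indices at the border (edge pixels are replicated).
            gx = img[r][min(c + 1, w - 1)] - img[r][max(c - 1, 0)]
            gy = img[min(r + 1, h - 1)][c] - img[max(r - 1, 0)][c]
            mag[r][c] = (gx * gx + gy * gy) ** 0.5
    return mag


def local_cost(img):
    """Map strong edges to low cost: cost = 1 - G / max(G) (assumed normalisation)."""
    mag = gradient_magnitude(img)
    g_max = max(max(row) for row in mag) or 1.0  # avoid division by zero on flat images
    return [[1.0 - g / g_max for g in row] for row in mag]


# Toy image: dark left half, bright right half, so the edge lies in the middle columns.
image = [[0, 0, 10, 10] for _ in range(4)]
cost = local_cost(image)
```

In this toy case the two middle columns straddle the edge and receive the lowest cost, while the flat outer columns receive the highest.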
10 Optimal Path Selection
- User Interaction: Seed points are specified on the image after all local costs are calculated.
- Contour as Minimal-Cost Path: The optimal path from every pixel in the image to the seed point is determined using Dijkstra's algorithm.
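A minimal sketch of this step, assuming an 8-connected pixel grid and a precomputed local cost per pixel. The diagonal scaling factor and the function name `dijkstra_from_seed` are assumptions made for illustration, not details taken from the slides.

```python
import heapq


def dijkstra_from_seed(cost, seed):
    """Return (dist, prev) for minimal-cost paths between every pixel and `seed`.

    cost[r][c] is the local cost of stepping onto pixel (r, c).
    prev[p] is the next pixel on the optimal path from p back to the seed.
    """
    h, w = len(cost), len(cost[0])
    dist = {seed: 0.0}
    prev = {seed: None}
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry, a cheaper path was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < h and 0 <= nc < w):
                    continue
                # Diagonal steps are longer, so their cost is scaled (an assumption).
                step = cost[nr][nc] * (2 ** 0.5 if dr and dc else 1.0)
                nd = d + step
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, prev


# Toy cost map: the middle row is a cheap "edge", the other rows are expensive.
cost = [
    [1.0, 1.0, 1.0],
    [0.1, 0.1, 0.1],
    [1.0, 1.0, 1.0],
]
dist, prev = dijkstra_from_seed(cost, seed=(1, 0))
```

Because the search expands outward from the seed once, the optimal path from any later cursor position can be read off `prev` without re-running the search.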
11 Live-Wire Tool
- Live-Wire Tool: As the user moves the mouse, the optimal path from the free point to the seed point is displayed.
- Property of the live-wire: If the cursor comes close to an edge, the live-wire snaps to the object boundary.
- Extracting the Contour: When a new seed point is specified, the live-wire from this point to the previous seed point is taken as a segment of the contour.
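The live-wire behaviour can be sketched as a walk along predecessor pointers. The predecessor map below is hand-built and hypothetical; in practice it would come from the shortest-path search started at the current seed point.

```python
def live_wire(prev, free_point):
    """Follow predecessor pointers from the cursor position back to the seed."""
    path = []
    node = free_point
    while node is not None:
        path.append(node)
        node = prev[node]
    return path  # free point first, seed last


# Hypothetical predecessor map for a search started at seed (1, 0).
prev = {(1, 0): None, (1, 1): (1, 0), (1, 2): (1, 1), (0, 2): (1, 1)}

contour = []
wire = live_wire(prev, (1, 2))   # recomputed and redrawn on every mouse move
# When the user clicks, the current wire is frozen as a contour segment,
# stored here in seed-to-new-seed order.
contour.extend(reversed(wire))
```

After the click, the new seed would become the start of the next search, and the next segment is appended in the same way.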
12 Extracting a Lip Contour Using Intelligent Scissors
- At every move of the mouse, the previous live-wire is deleted and a new one, running from the current cursor position to the seed point, is displayed.
13 Extraction of Outer Boundaries of Lena and a Lip Image Using Intelligent Scissors
14 LIP TRACKING
- Method 1: Non-Rigid Object Tracking Algorithm
- Method 2: Tracking with Intelligent Scissors
- Method 3: Active Shape Models
15 Non-Rigid Object Tracking
16 Results of Non-Rigid Object Tracking
- Figure panels: Esra-8, Aysel-0, and Esra-6 video sequences
17 Evaluation of Algorithm
- Figure panels: Color Edge, Color Segmentation; Frame 67, Frame 68
18 Remarks
- The overall performance of the algorithm is satisfactory.
- Advantage: Ability to track the lips through a large number of frames.
- Drawback: The long computation time of this algorithm in closed-loop mode makes it unsuitable for accurate tracking in real-time applications.
19 Lip Tracking Using Intelligent Scissors
- Motivations
- - A desire to obtain a more accurate and faster lip tracking tool.
- - Intelligent Scissors may be extended easily from lip segmentation to lip tracking.
20 Lip Tracking Using Intelligent Scissors
- Seed points from the first frame are tracked into the following frames, and the lip contour is then extracted automatically using Intelligent Scissors.
- Suitable seed points are located using a priori information about the lip image.
- Used Features
- - Gradient Magnitude
- - Hue Value
- - Distance between successive seed points
21 Gradient Magnitude Feature
- The lip region has a larger gradient magnitude than its surrounding region.
- The N points with the highest gradient magnitudes ($N \ll M \times M$, where $M$ is the search range) are seed candidates.
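Candidate selection can be sketched as picking the N strongest gradient responses inside a square search window. The window is assumed to be M x M pixels with odd M, centred on the seed's previous position; the function name and the toy gradient map are illustrative only.

```python
import heapq


def seed_candidates(grad_mag, center, m, n):
    """Return the n pixels with the largest gradient magnitude in an m x m window.

    Assumes m is odd, so the window is centred on `center`; the window is
    clipped at the image border.
    """
    h, w = len(grad_mag), len(grad_mag[0])
    r0, c0 = center
    half = m // 2
    window = [
        (r, c)
        for r in range(max(r0 - half, 0), min(r0 + half + 1, h))
        for c in range(max(c0 - half, 0), min(c0 + half + 1, w))
    ]
    return heapq.nlargest(n, window, key=lambda p: grad_mag[p[0]][p[1]])


# Toy gradient-magnitude map with a diagonal ridge of strong responses.
grad = [
    [0, 1, 0, 0, 0],
    [0, 9, 2, 0, 0],
    [0, 3, 8, 1, 0],
    [0, 0, 4, 7, 0],
    [0, 0, 0, 0, 0],
]
candidates = seed_candidates(grad, center=(2, 2), m=3, n=2)
```

Keeping N much smaller than the window size limits how many candidates the later hue and distance checks have to examine.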
22 Hue Values
- Hue value is very useful for separating the boundary from the inner lip regions.
- Hue triple: In addition to the hue of the seed point being tracked, the hues of the neighbours p pixels above and below the current point are calculated.
- Selected Seed Point: From the N points with the largest gradients, the one whose hue triple is most similar to the previous seed's triple is selected.
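A minimal sketch of the hue-triple comparison, using the standard-library `colorsys` module. The slides do not say how similarity between triples is measured, so the sum of circular hue differences used here is an assumption, as are the function names and the toy colours.

```python
import colorsys


def hue(img, r, c):
    """Hue in [0, 1) of the RGB pixel img[r][c] (channels in 0..255)."""
    red, green, blue = (v / 255.0 for v in img[r][c])
    return colorsys.rgb_to_hsv(red, green, blue)[0]


def hue_triple(img, point, p):
    """Hues p pixels above, at, and p pixels below `point` (rows clamped to the image)."""
    r, c = point
    last = len(img) - 1
    return tuple(hue(img, min(max(rr, 0), last), c) for rr in (r - p, r, r + p))


def hue_distance(t1, t2):
    """Sum of circular hue differences (assumed similarity measure)."""
    return sum(min(abs(a - b), 1.0 - abs(a - b)) for a, b in zip(t1, t2))


def select_seed(img, candidates, prev_triple, p):
    """Pick the candidate whose hue triple is closest to the previous seed's triple."""
    return min(candidates, key=lambda q: hue_distance(hue_triple(img, q, p), prev_triple))


SKIN, LIP = (200, 150, 120), (180, 40, 60)   # illustrative colours only
frame = [
    [SKIN, LIP],
    [SKIN, LIP],
    [LIP, LIP],
    [LIP, LIP],
]
prev_triple = hue_triple(frame, (1, 0), p=1)  # skin above, skin at, lip below
best = select_seed(frame, [(2, 1), (1, 0)], prev_triple, p=1)
```

Here the candidate at `(1, 0)` sits on the skin-to-lip transition like the previous seed, so it is preferred over the candidate inside the lip.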
23 The Distance Between Seed Points
- The relative position of seed points is very important during tracking. The Intelligent Scissors tool gives wrong results if they get too close to or too far away from each other.
- The figure shows the search range of seed point s2 in the following frame.
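One simple way to enforce this spacing is to discard candidates that fall outside a distance band around the neighbouring seed. The thresholds `d_min` and `d_max` and the function names are hypothetical; the slides do not state how the search range is actually bounded.

```python
import math


def within_spacing(candidate, neighbour_seed, d_min, d_max):
    """True if the candidate is neither too close to nor too far from the neighbouring seed."""
    d = math.hypot(candidate[0] - neighbour_seed[0], candidate[1] - neighbour_seed[1])
    return d_min <= d <= d_max


def filter_by_spacing(candidates, neighbour_seed, d_min, d_max):
    """Keep only the candidates that satisfy the spacing constraint."""
    return [q for q in candidates if within_spacing(q, neighbour_seed, d_min, d_max)]


s1 = (10, 10)                                  # already-tracked neighbouring seed
candidates_s2 = [(10, 11), (10, 18), (10, 40)]
kept = filter_by_spacing(candidates_s2, s1, d_min=4, d_max=15)
```

In this toy case only the candidate 8 pixels away survives; the ones at 1 and 30 pixels are rejected.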
24 Result
- Result of the Tracking Using Intelligent Scissors method applied to the 20-frame lip sequence
25 Active Shape Models
- Motivations
- - Lip tracking is a specific case of the general object tracking problem. Therefore, taking knowledge about the shape of the lip into account will increase the performance of a tracker.
- - Active Shape Models may be used for lip tracking on their own, as well as for complementing and correcting the errors of a tracker based on Intelligent Scissors.
26 Lip Training Set
- The shape of a lip is represented by a set of n 2-D points:
- $x = (x_1, x_2, x_3, \ldots, x_n, y_1, y_2, y_3, \ldots, y_n)$
- If there are s training examples in a set, the corresponding s vectors are constructed and brought into the same coordinate frame.
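A minimal sketch of building the shape vector and bringing shapes into a common frame. The alignment here removes only translation and scale; removing rotation as well (as a full Procrustes alignment would) is omitted for brevity, and the slides do not say which alignment was used. All names are illustrative.

```python
def shape_vector(points):
    """Stack n (x, y) landmarks as (x1, ..., xn, y1, ..., yn)."""
    return [x for x, _ in points] + [y for _, y in points]


def normalise(points):
    """Translate the centroid to the origin and scale to unit norm.

    Simplification: rotation is not removed here.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    scale = sum(x * x + y * y for x, y in centred) ** 0.5 or 1.0
    return [(x / scale, y / scale) for x, y in centred]


lip_a = [(0, 0), (2, 1), (4, 0), (2, -1)]
lip_b = [(10, 10), (14, 12), (18, 10), (14, 8)]   # lip_a scaled by 2 and shifted
vec_a = shape_vector(normalise(lip_a))
vec_b = shape_vector(normalise(lip_b))
```

The two toy lips differ only in position and size, so their normalised shape vectors coincide.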
27 Active Shape Models I
- Shape Model: We look for a parametric model $x = M(b)$, where $b$ is a vector of model parameters.
- Principal Component Analysis: Helps to reduce the dimensionality of the data.
- Covariance matrix $S$ of shape vectors
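The slide's own formula for $S$ is not preserved. In the usual Active Shape Model formulation, the mean shape and covariance of the $s$ aligned training vectors $x_i$ are written as follows; the $1/(s-1)$ normalisation is an assumption, and some texts use $1/s$ instead.

```latex
\bar{x} = \frac{1}{s} \sum_{i=1}^{s} x_i ,
\qquad
S = \frac{1}{s-1} \sum_{i=1}^{s} (x_i - \bar{x})(x_i - \bar{x})^{T}
```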
28 Active Shape Models II
- Eigenlips: Eigenvectors $\phi_i$ of $S$ are computed and the corresponding eigenvalues $\lambda_i$ are determined.
- The matrix $\Phi$ is formed, which contains the $t$ eigenvectors corresponding to the $t$ largest eigenvalues. Hence
- New Lip Shapes: By changing the components of the vector $b$ in a controlled way, we may obtain new plausible lip shapes.
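The eigen-decomposition and shape synthesis can be sketched with NumPy. This assumes the standard linear Active Shape Model form, in which a shape is approximated as the mean shape plus $\Phi b$; the slide's own equation is not preserved, so this form, the function names, and the 3-standard-deviation limit on $b$ are all assumptions.

```python
import numpy as np


def build_shape_model(shapes, t):
    """PCA on aligned shape vectors (one per row of `shapes`); keep t modes."""
    x_mean = shapes.mean(axis=0)
    S = np.cov(shapes, rowvar=False)           # covariance of the shape vectors
    eigvals, eigvecs = np.linalg.eigh(S)       # eigh: S is symmetric
    order = np.argsort(eigvals)[::-1][:t]      # indices of the t largest eigenvalues
    return x_mean, eigvecs[:, order], eigvals[order]


def synthesise(x_mean, Phi, b):
    """Generate a shape from parameters b: x = x_mean + Phi @ b."""
    return x_mean + Phi @ b


# Toy training set: four "shapes" that vary along a single coordinate.
shapes = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 3.0],
])
x_mean, Phi, lam = build_shape_model(shapes, t=1)
b = Phi.T @ (shapes[0] - x_mean)               # project a shape into the model
limit = 3.0 * np.sqrt(lam[0])                  # common +/- 3 sigma bound (assumed)
new_shape = synthesise(x_mean, Phi, np.clip(b, -limit, limit))
```

Varying `b` within the limit moves the synthesised shape along the dominant mode of variation seen in the training set.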
29 Applications of Active Shape Models
- 1. Determining Visemes of a Language
- 2. Increasing Robustness of any Tracking Algorithm
- 3. If the shape model of an object is extracted a priori
- - i) To locate the object in the image
- - ii) To track that object through an image sequence
30 Visemes of a Language
- Determining the viseme of each letter: Using Active Shape Models, the parameter vector $b$ of the lip shape corresponding to a letter of a language is obtained.
- Benefits to Speech Recognition: Parameter vectors obtained from an image sequence may be fused with acoustic information, thus increasing the recognition rate.
31 Contribution of EigenLips to Lip Tracking Algorithms
- Lip tracking algorithms may give wrong lip contours for frames far from the first frame.
- The shape vector $x$ of a wrong lip is projected into the shape space.
- Distribution of the parameter vector $b$
- - If $p(b)$ is larger than a given threshold, the contour is accepted as correct.
- - If $p(b)$ is smaller, the closest $b$ vector that meets the threshold is assigned to the lip, thus correcting the wrong boundary.
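The accept-or-correct rule can be sketched under a Gaussian assumption on $b$, where thresholding $p(b)$ is equivalent to thresholding the Mahalanobis distance of $b$. Rescaling $b$ onto the limit is a simple approximation of "the closest acceptable $b$", not an exact nearest-point projection; the function name and the default limit of 3 are assumptions.

```python
def constrain(b, eigvals, d_max=3.0):
    """Accept b if its Mahalanobis distance is within d_max, else shrink it onto the limit.

    eigvals are the (positive) eigenvalues of the retained modes.
    Returns (b_corrected, accepted).
    """
    d = sum(bi * bi / lam for bi, lam in zip(b, eigvals)) ** 0.5
    if d <= d_max:
        return list(b), True            # contour accepted as correct
    scale = d_max / d                   # pull b back onto the d_max ellipsoid
    return [bi * scale for bi in b], False


ok_b, accepted = constrain([1.0, 0.5], eigvals=[4.0, 1.0])
fixed_b, was_accepted = constrain([12.0, 0.0], eigvals=[4.0, 1.0])
```

A corrected contour would then be rebuilt from the adjusted parameters using the shape model.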
32 Conclusion I
- Intelligent Scissors is an interactive semi-automatic image segmentation tool.
- It may be used for extracting the initial lip boundary as well as for tracking that boundary through an image sequence.
33 Conclusion II
- Non-Rigid Object Tracking Algorithm
- - High time complexity
- - Tracking through a large number of frames
- Tracking with Intelligent Scissors
- - More accurate results
- - Low time complexity
- - Tracking through a small number of frames
34 Future Work
- Active Shape Models
- - The library of lip shapes was obtained
- - Viseme group for the Turkish language
- - Correction of wrong contours
- - Extraction and tracking of contours
35 Future Work II
- The method of Lip Tracking Using Intelligent Scissors may be made more robust by imposing a Shape Constraint factor.
- Given an image, the region of the lip may be located by using Shape Models.
- A lip tracking system based entirely on Active Shape Models may be developed.