LING124 Acoustic models

Slides: 12

Provided by: hahn7

1
LING124 Acoustic models

2
Class outline

3
Acoustic model

In the statistical approach to ASR, the acoustic
model calculates the likelihood P(O|W), where O is the
observation sequence and W is a word sequence
Among the many types of acoustic models, we
focus on the most common one, the Gaussian mixture
model (GMM)
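For context, the likelihood P(O|W) is one factor in the usual Bayes decision rule for statistical ASR. The prior P(W) (the language model) is not covered on this slide and is shown only to place the acoustic model; P(O) is dropped because it does not depend on W.

```latex
\hat{W} = \arg\max_{W} P(W \mid O)
        = \arg\max_{W} \frac{P(O \mid W)\, P(W)}{P(O)}
        = \arg\max_{W} P(O \mid W)\, P(W)
```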

4
GMM Basic idea

A GMM is most often used together with an HMM
An HMM state represents a symbolic linguistic unit,
most commonly a context-specific phone
Each state emits a feature vector, e.g. a
vector consisting of MFCCs, deltas, and
double-deltas
The emission probability is calculated using a
mixture of multivariate Gaussians

5
Gaussian distribution
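The slide body did not survive in the transcript. As a minimal sketch of the standard univariate Gaussian density, assuming a mean and a variance as parameters (the function name `gaussian_pdf` is illustrative, not from the slides):

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

# The density is highest at the mean and falls off symmetrically on both sides.
print(gaussian_pdf(0.0, 0.0, 1.0))
print(gaussian_pdf(1.0, 0.0, 1.0))
print(gaussian_pdf(-1.0, 0.0, 1.0))
```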

6
Multivariate Gaussians
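The formula on this slide is missing from the transcript. The standard density of a D-dimensional Gaussian is given here for reference; the symbols (x for the feature vector, mu for the mean vector, Sigma for the covariance matrix) are notation chosen here, not taken from the slides.

```latex
\mathcal{N}(\mathbf{x};\, \boldsymbol{\mu}, \boldsymbol{\Sigma})
  = \frac{1}{(2\pi)^{D/2}\, |\boldsymbol{\Sigma}|^{1/2}}
    \exp\!\left( -\tfrac{1}{2}\,
      (\mathbf{x} - \boldsymbol{\mu})^{\top}
      \boldsymbol{\Sigma}^{-1}
      (\mathbf{x} - \boldsymbol{\mu}) \right)
```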

7
Covariance matrix (1)

Variance of each vector component
Covariance between any two components
How much two variables change together
e.g. two variables X and Y have a positive
covariance if, whenever X is above its mean,
Y also tends to be above its mean
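A small sketch of the idea, using made-up toy data and the sample covariance (dividing by n - 1, one common convention). The helper names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # above its mean whenever xs is above its mean
zs = [8.0, 6.0, 4.0, 2.0]  # below its mean whenever xs is above its mean

print(covariance(xs, ys))  # positive
print(covariance(xs, zs))  # negative
```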

8
Covariance matrix (2)

The variance of a vector component is the
covariance between that component and itself
So all the variances and covariances can be
represented in a single covariance matrix
M(i,j) = covariance between the ith component and
the jth component
M(i,i) = variance of the ith component
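To make the M(i,j) notation concrete, here is a sketch that builds the sample covariance matrix from a handful of made-up 2-dimensional vectors. The function name and the data are illustrative only.

```python
def covariance_matrix(vectors):
    """Sample covariance matrix of a list of equal-length vectors (divides by n - 1)."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dim)]
    return [
        [
            sum((v[i] - means[i]) * (v[j] - means[j]) for v in vectors) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]

vectors = [[1.0, 2.0], [2.0, 1.0], [3.0, 6.0], [4.0, 3.0]]
M = covariance_matrix(vectors)

print(M[0][0], M[1][1])  # diagonal: variance of each component
print(M[0][1], M[1][0])  # off-diagonal: covariance, the same in both positions
```

The matrix is symmetric, so only the diagonal and one triangle carry independent values.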

9
Diagonal matrix

Modeling the covariance between different components means
More calculation
More data needed to train the parameters
At the risk of some loss in performance, the
covariance between different components is often
assumed to be zero
The resulting covariance matrix is a diagonal
matrix
All elements off the main diagonal of the matrix
are zero
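A sketch of what the diagonal assumption buys. With zero off-diagonal entries the log density is just a sum of univariate log densities, one per component, and the number of covariance parameters drops from D(D+1)/2 to D. The dimension 39 used below is an assumed example size, not a figure from the slides, and the function names are illustrative.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance.

    x, mean and var are equal-length sequences; var holds the diagonal entries.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def n_covariance_params(dim, diagonal):
    """Number of free covariance parameters for one Gaussian."""
    return dim if diagonal else dim * (dim + 1) // 2

# Assumed example dimension (hypothetical, chosen for illustration).
print(n_covariance_params(39, diagonal=False))  # 780
print(n_covariance_params(39, diagonal=True))   # 39

print(diag_gaussian_logpdf([0.0, 0.0], [0.0, 0.0], [1.0, 1.0]))
```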

10
Gaussian mixture model

Using a single multivariate Gaussian as the
observation pdf assumes that the vector components
are normally distributed
In reality, the normality assumption may not
hold
One way around this problem is to represent the
pdf as a weighted sum of multiple Gaussians
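The weighted sum can be sketched as follows, assuming diagonal-covariance components and mixture weights that are positive and sum to 1. The function names and toy parameters are illustrative. The sum is computed in the log domain (log-sum-exp), a common choice to avoid numerical underflow with high-dimensional features.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (var holds the diagonal)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_logpdf(x, weights, means, variances):
    """Log of sum_k weights[k] * N(x; means[k], diag(variances[k]))."""
    log_terms = [
        math.log(w) + diag_gaussian_logpdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

# Toy two-component mixture in one dimension: a bimodal pdf that a single
# Gaussian could not represent.
weights = [0.5, 0.5]
means = [[-2.0], [2.0]]
variances = [[1.0], [1.0]]

for point in ([-2.0], [0.0], [2.0]):
    print(point, math.exp(gmm_logpdf(point, weights, means, variances)))
```

In this toy case the density is higher near -2 and 2 than at 0, where a single Gaussian fitted to the same data would put its peak.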

11
Gaussian mixture model (2)
