An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Model for Mandarin Large Vocabulary Continuous Speech Recognition
- Jen-Wei Kuo
- National Taiwan Normal University
Outline
- Minimum Phone Error (MPE) Training
- Objective Function From Minimum Overall Risk
- Function Maximization
- Update Formula
- MAP updates (I-Smoothing)
- MPE Linear Transform based Adaptation
- Model Space Adaptation (MPE-LR)
- Feature Space Adaptation (MPE-LT)
- Experiments
Notations
Minimum Overall Risk
- Loss function
- Overall risk
- Classifier design: conventional MAP decoding, hypothesis testing, WER minimization (sausage), MBR recognition
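The decision-theoretic setup behind this slide can be sketched as follows (a standard formulation; notation assumed: $O$ the acoustic observations, $W$ a candidate word sequence, $\ell$ a loss function):

```latex
\mathcal{R}(W \mid O) \;=\; \sum_{W'} \ell(W, W')\, P(W' \mid O),
\qquad
\hat{W} \;=\; \arg\min_{W}\; \mathcal{R}(W \mid O)
```

With the zero-one loss, minimizing the risk reduces to conventional MAP decoding, $\hat{W} = \arg\max_W P(W \mid O)$; with a Levenshtein loss over a sausage (confusion network) it yields WER-minimizing MBR recognition.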
Overall Risk Criterion Estimation
- ORCE was first proposed by Na (Eurospeech '95), using the zero-one loss function.
- Kaiser introduced the Levenshtein distance as the loss function in ORCE (ICSLP '00, Speech Communication '02).
average loss of all hypotheses
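In training rather than decoding, ORCE minimizes this average loss over the training set; a sketch of the criterion (notation assumed: $O_r$ the $r$-th training utterance, $s_r$ its reference, $s$ ranging over hypotheses):

```latex
\mathcal{F}_{\mathrm{ORCE}}(\lambda)
\;=\; \sum_{r=1}^{R} \sum_{s} P_\lambda(s \mid O_r)\, \ell(s, s_r)
```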
Maximize the Average Accuracy
- ORCE can be regarded as maximizing the average accuracy of all possible hypotheses.
- It tries to increase the weight of paths with higher accuracy and reduce the weight of those with lower accuracy.
- The higher/lower the accuracy of a path, the more positive/negative its contribution.
average accuracy of all hypotheses
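Since accuracy is the complement of loss, minimizing the average loss is equivalent to maximizing the posterior-weighted average accuracy; a sketch of that criterion, assuming an accuracy function $A(s, s_r)$:

```latex
\mathcal{F}(\lambda)
\;=\; \sum_{r=1}^{R}
\frac{\sum_{s} p_\lambda(O_r \mid s)\, P(s)\, A(s, s_r)}
     {\sum_{s} p_\lambda(O_r \mid s)\, P(s)}
```

Differentiating shows each path's contribution is weighted by $A(s, s_r) - \bar{A}_r$, its accuracy minus the average, which is why above-average paths are boosted and below-average paths suppressed.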
Minimum Phone Error (MPE)
- The improvements from ORCE to MPE:
  - The use of lattices
  - MAP estimation of parameters (I-smoothing)
  - The setting of the smoothing constants in the EB update equations
  - The emphasis on phone error rather than word error
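MPE scores hypotheses by phone-level accuracy rather than a word-level 0/1 loss. A minimal sketch of the raw phone accuracy via exact Levenshtein alignment (function names are illustrative; the lattice-based MPE implementation approximates this with a time-overlap heuristic rather than a full alignment):

```python
def levenshtein(ref, hyp):
    """Edit distance between two phone sequences
    (substitution, insertion, deletion each cost 1)."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all of ref[:i]
    for j in range(n + 1):
        d[0][j] = j  # insert all of hyp[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # match / substitution
    return d[m][n]

def raw_phone_accuracy(ref, hyp):
    """Raw accuracy used as the MPE 'gain':
    number of reference phones minus phone errors."""
    return len(ref) - levenshtein(ref, hyp)
```

A perfect hypothesis scores `len(ref)`; each insertion, deletion, or substitution subtracts one, so a very bad hypothesis can score negative.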
Expectation Maximization
Berlin Chen
How is it done?
Lower Bound
Construct a tractable lower bound
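For any distribution $q$ over the hidden variables $z$, the log-likelihood admits a tractable lower bound (a standard construction; notation assumed):

```latex
\log p(O \mid \theta)
\;=\; \log \sum_{z} p(O, z \mid \theta)
\;\ge\; \sum_{z} q(z) \log \frac{p(O, z \mid \theta)}{q(z)}
\;=:\; \mathcal{L}(q, \theta)
```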
Three Steps for EM
- Step 1. Draw a lower bound
  - Use Jensen's inequality
- Step 2. Find the best lower bound
  - Let the lower bound touch the objective function at the current guess
- Step 3. Maximize the best lower bound
  - Obtain the new guess
  - Go to Step 1 until convergence
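The three steps above can be sketched on a toy model: a two-component, unit-variance 1-D Gaussian mixture with equal weights, re-estimating only the means (all names are illustrative, not from the slides):

```python
import math

def em_gmm_1d(data, mu, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture with unit variances
    and equal weights, updating only the means `mu = [mu0, mu1]`."""
    for _ in range(n_iter):
        # Steps 1-2: the best lower bound touches the log-likelihood at the
        # current guess when q(z) is the posterior responsibility.
        resp = []
        for x in data:
            p0 = math.exp(-0.5 * (x - mu[0]) ** 2)
            p1 = math.exp(-0.5 * (x - mu[1]) ** 2)
            resp.append(p0 / (p0 + p1))  # responsibility of component 0
        # Step 3: maximizing the bound gives responsibility-weighted means,
        # which become the new guess for the next iteration.
        r0 = sum(resp)
        r1 = len(data) - r0
        mu = [sum(r * x for r, x in zip(resp, data)) / r0,
              sum((1 - r) * x for r, x in zip(resp, data)) / r1]
    return mu
```

The responsibility computation realizes Steps 1-2, and the weighted-mean update realizes Step 3; iterating the loop is the "go to Step 1" of the slide.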
Step 1. Draw a lower bound
Apply Jensen's inequality
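Concretely, multiplying and dividing by $q(z)$ inside the sum and applying Jensen's inequality (the log of an expectation is at least the expectation of the log, since $\log$ is concave):

```latex
\log \sum_{z} q(z)\,\frac{p(O, z \mid \theta)}{q(z)}
\;\ge\;
\sum_{z} q(z) \log \frac{p(O, z \mid \theta)}{q(z)}
```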
Step 2. Find the best lower bound
- Let the lower bound touch the objective function at the current guess
- Find the best bound at the current guess
Step 2. Find the best lower bound
Set the derivative to zero
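Setting the derivative of the bound with respect to $q(z)$ to zero (with a Lagrange multiplier enforcing $\sum_z q(z) = 1$) gives the posterior, at which the bound is tight:

```latex
q^{*}(z) \;=\; p(z \mid O, \theta_{\mathrm{old}}),
\qquad
\mathcal{L}(q^{*}, \theta_{\mathrm{old}}) \;=\; \log p(O \mid \theta_{\mathrm{old}})
```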
Step 2. Find the best lower bound
Q function
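Substituting $q^{*}$ back into the bound, the part that depends on $\theta$ is exactly the familiar EM auxiliary (Q) function; the remaining entropy term is constant in $\theta$:

```latex
\mathcal{L}(q^{*}, \theta)
\;=\; \underbrace{\sum_{z} p(z \mid O, \theta_{\mathrm{old}}) \log p(O, z \mid \theta)}_{Q(\theta,\, \theta_{\mathrm{old}})}
\;-\; \sum_{z} p(z \mid O, \theta_{\mathrm{old}}) \log p(z \mid O, \theta_{\mathrm{old}})
```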
Strong-sense auxiliary function
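In Povey's terminology, $G(\theta, \theta')$ is a strong-sense auxiliary function for $\mathcal{F}(\theta)$ around $\theta'$ if increasing $G$ guarantees increasing $\mathcal{F}$:

```latex
\mathcal{F}(\theta) - \mathcal{F}(\theta')
\;\ge\;
G(\theta, \theta') - G(\theta', \theta')
\quad \text{for all } \theta
```

The EM Q-function is of this kind, which is why an EM step can never decrease the likelihood.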
Weak-sense auxiliary function and smooth function
- Weak-sense auxiliary function
- Smooth function
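The MPE objective does not admit a convenient strong-sense bound, so the derivation relaxes the requirement to gradient equality at the current point; a smooth function can then be added freely for stability (a sketch of the two defining conditions):

```latex
\left.\frac{\partial G(\theta, \theta')}{\partial \theta}\right|_{\theta = \theta'}
= \left.\frac{\partial \mathcal{F}(\theta)}{\partial \theta}\right|_{\theta = \theta'}
\quad \text{(weak-sense)},
\qquad
\left.\frac{\partial S(\theta, \theta')}{\partial \theta}\right|_{\theta = \theta'} = 0
\quad \text{(smooth)}
```

$G + S$ is then still a weak-sense auxiliary function, so a parameter value that maximizes it is a stationary point of $\mathcal{F}$; the smooth term only controls step size.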
Weak-sense auxiliary function for MPE
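Following Povey's formulation, the weak-sense auxiliary function for MPE weights each lattice arc's log-likelihood by how much its average accuracy exceeds the utterance average (notation assumed: $\gamma_q$ the arc posterior, $c(q)$ the average accuracy of paths through arc $q$, $\bar{c}_r$ the average over all paths in the lattice):

```latex
G(\lambda, \lambda')
\;=\; \sum_{r=1}^{R} \sum_{q \in \mathrm{lat}_r} \gamma_q^{\mathrm{MPE}} \log p\bigl(O_r^{(q)} \mid q; \lambda\bigr),
\qquad
\gamma_q^{\mathrm{MPE}} \;=\; \gamma_q \left( c(q) - \bar{c}_r \right)
```

Arcs better than average get a positive weight (numerator-like statistics), arcs worse than average a negative one (denominator-like statistics).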
EBW equivalent smooth function
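The Extended Baum-Welch update that results, for the mean and variance of Gaussian $m$ in state $j$ (standard form; $\theta(\cdot)$ are weighted-sum statistics, $\gamma$ occupancies, $D_{jm}$ the per-Gaussian smoothing constant):

```latex
\hat{\mu}_{jm}
= \frac{\theta^{\mathrm{num}}_{jm}(O) - \theta^{\mathrm{den}}_{jm}(O) + D_{jm}\,\mu_{jm}}
       {\gamma^{\mathrm{num}}_{jm} - \gamma^{\mathrm{den}}_{jm} + D_{jm}},
\qquad
\hat{\sigma}^2_{jm}
= \frac{\theta^{\mathrm{num}}_{jm}(O^2) - \theta^{\mathrm{den}}_{jm}(O^2) + D_{jm}\bigl(\sigma^2_{jm} + \mu^2_{jm}\bigr)}
       {\gamma^{\mathrm{num}}_{jm} - \gamma^{\mathrm{den}}_{jm} + D_{jm}}
- \hat{\mu}^2_{jm}
```

$D_{jm}$ is commonly set proportional to the denominator occupancy and floored so that the updated variance stays positive.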
MAP updates (I-smoothing)
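I-smoothing gives the update a MAP flavor by adding $\tau$ "points" of ML statistics to the numerator counts before applying the EBW update (a sketch following Povey's formulation; the second-order statistics $\theta(O^2)$ are adjusted analogously):

```latex
\gamma^{\mathrm{num}}_{jm} \leftarrow \gamma^{\mathrm{num}}_{jm} + \tau,
\qquad
\theta^{\mathrm{num}}_{jm}(O) \leftarrow \theta^{\mathrm{num}}_{jm}(O)
+ \frac{\tau}{\gamma^{\mathrm{ML}}_{jm}}\,\theta^{\mathrm{ML}}_{jm}(O)
```

This backs off toward the ML estimate for Gaussians with little discriminative data, with $\tau$ controlling the strength of the prior.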
Experiments: Baseline
- Character error rate for ML training
Experiments
- Character error rate under different settings