Transcript and Presenter's Notes


1
An Initial Study on Minimum Phone Error
Discriminative Learning of Acoustic Model for
Mandarin Large Vocabulary Continuous Speech
Recognition
  • Jen-Wei Kuo
  • National Taiwan Normal University

2
Outline
  • Minimum Phone Error (MPE) Training
    • Objective Function from Minimum Overall Risk
    • Function Maximization
    • Update Formula
    • MAP Updates (I-Smoothing)
  • MPE Linear Transform based Adaptation
    • Model Space Adaptation (MPE-LR)
    • Feature Space Adaptation (MPE-LT)
  • Experiments

3
Notations
4
Minimum Overall Risk
Loss function
Overall risk
Classifier design: conventional MAP decoding, hypothesis testing, WER minimization (sausage/confusion network), MBR recognition
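The slide's formulas are not in the transcript. A standard way to write the overall risk, assuming \Lambda are the model parameters, O_r the r-th training utterance, s_r its reference transcription, and l(s, s_r) the loss of hypothesis s, is

  R(\Lambda) = \sum_{r=1}^{R} \sum_{s} P_\Lambda(s \mid O_r)\, l(s, s_r)

Conventional MAP decoding corresponds to the zero-one loss, while MBR recognition plugs in a task-specific loss such as word or phone error.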
5
Overall Risk Criterion Estimation
  • ORCE was first proposed by Na (Eurospeech '95), using the zero-one loss function.
  • Kaiser introduced the Levenshtein distance as the loss function in ORCE (ICSLP '00, Speech Communication '02).

average loss of all hypotheses
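In the same assumed notation, the ORCE criterion is the average loss over all competing hypotheses \mathcal{H}(O_r), with l taken as the zero-one loss in Na's work and as the Levenshtein distance in Kaiser's:

  F_{\mathrm{ORCE}}(\Lambda) = \sum_{r=1}^{R} \sum_{s \in \mathcal{H}(O_r)} P_\Lambda(s \mid O_r)\, l(s, s_r)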
6
Maximize the average accuracy
  • ORCE can be regarded as maximizing the average
    accuracy of all possible hypotheses.
  • It tries to increase the weight of paths with higher accuracy and to reduce the weight of those with lower accuracy.
  • The higher (lower) the accuracy of a path, the more positive (negative) its contribution.

average accuracy of all hypotheses
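Replacing the loss l(s, s_r) with an accuracy A(s, s_r) turns minimization into maximization of the average accuracy (notation assumed as above); paths whose accuracy is above the average receive a positive contribution, those below a negative one:

  F(\Lambda) = \sum_{r=1}^{R} \sum_{s \in \mathcal{H}(O_r)} P_\Lambda(s \mid O_r)\, A(s, s_r)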
7
Minimum Phone Error (MPE)
  • The improvements from ORCE to MPE
  • The use of lattices
  • MAP estimation of parameters (I-smoothing)
  • The setting of the smoothing constants in the EB (Extended Baum-Welch) update equations
  • The emphasis on phone error rather than word
    error
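The MPE objective in its usual form (following Povey's formulation; \kappa is a probability scale, P(s) the language-model prior, and \mathrm{PhoneAcc}(s, s_r) the raw phone accuracy of hypothesis s against the reference s_r; all symbols assumed here) is

  F_{\mathrm{MPE}}(\lambda) = \sum_{r=1}^{R} \frac{\sum_{s} p_\lambda(O_r \mid s)^{\kappa} P(s)\, \mathrm{PhoneAcc}(s, s_r)}{\sum_{s'} p_\lambda(O_r \mid s')^{\kappa} P(s')}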

8
Expectation Maximization (EM)
Berlin Chen
How is it done?
9
Lower Bound
Construct a tractable lower bound
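For the EM review that follows, the usual decomposition (x observed data, z hidden variables, q an arbitrary distribution over z; notation assumed) makes the bound explicit:

  \log p(x \mid \theta) = \mathcal{L}(q, \theta) + \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x, \theta)\right),
  \qquad
  \mathcal{L}(q, \theta) = \sum_{z} q(z) \log \frac{p(x, z \mid \theta)}{q(z)}

Since the KL divergence is non-negative, \mathcal{L}(q, \theta) is a tractable lower bound on the log-likelihood.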
10
Three Steps for EM
  • Step 1: Draw a lower bound
    • Use Jensen's inequality
  • Step 2: Find the best lower bound
    • Let the lower bound touch the objective function at the current guess
  • Step 3: Maximize the best lower bound
    • Obtain the new guess
    • Go back to Step 1 until convergence
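In the same assumed notation, one EM iteration can be summarized as

  q^{(t)}(z) = p(z \mid x, \theta^{(t)})  (E-step: best lower bound at the current guess)
  \theta^{(t+1)} = \arg\max_{\theta} \sum_{z} q^{(t)}(z) \log p(x, z \mid \theta)  (M-step: maximize the bound)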

11
Step 1: Draw a lower bound
Apply Jensen's inequality
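Applying Jensen's inequality to the concave logarithm gives, for any distribution q(z),

  \log p(x \mid \theta) = \log \sum_{z} q(z)\, \frac{p(x, z \mid \theta)}{q(z)} \;\ge\; \sum_{z} q(z) \log \frac{p(x, z \mid \theta)}{q(z)} = \mathcal{L}(q, \theta)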
12
Step 2: Find the best lower bound
  • Let the lower bound touch the objective function at the current guess
  • Find the best bound at the current guess

13
Step 2: Find the best lower bound
Set the derivative to zero
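Concretely, maximizing \mathcal{L}(q, \theta^{(t)}) over q subject to \sum_z q(z) = 1 (Lagrange multiplier \eta) and setting the derivative to zero recovers the posterior:

  \frac{\partial}{\partial q(z)} \left[ \sum_{z'} q(z') \log \frac{p(x, z' \mid \theta^{(t)})}{q(z')} + \eta \Big( \sum_{z'} q(z') - 1 \Big) \right] = 0
  \;\Rightarrow\; q(z) \propto p(x, z \mid \theta^{(t)})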
14
Step 2: Find the best lower bound
Q function
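Dropping terms that do not depend on \theta, maximizing the best lower bound is equivalent to maximizing the familiar Q-function:

  Q(\theta, \theta^{(t)}) = \sum_{z} p(z \mid x, \theta^{(t)}) \log p(x, z \mid \theta)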
15
Strong-sense auxiliary function
  • e.g. EM algorithm


16
Weak-sense auxiliary function and smooth function
  • Weak-sense auxiliary function
  • Smooth function
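A weak-sense auxiliary function only has to match the gradient of the objective at the current parameters, and a smooth function has its maximum at the current parameters (definitions following Povey; notation assumed):

  \frac{\partial G(\theta, \theta')}{\partial \theta}\Big|_{\theta=\theta'} = \frac{\partial F(\theta)}{\partial \theta}\Big|_{\theta=\theta'},
  \qquad
  \frac{\partial S(\theta, \theta')}{\partial \theta}\Big|_{\theta=\theta'} = 0

Adding a smooth function to a weak-sense auxiliary function leaves the gradient at \theta' unchanged, so the sum is still weak-sense but better behaved during optimization.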

17
Weak-sense auxiliary function for MPE
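The slide equations are not in the transcript; in Povey's formulation the weak-sense auxiliary function for MPE is built from per-arc lattice statistics, roughly

  G(\lambda, \lambda') = \sum_{q} \gamma_q^{\mathrm{MPE}} \log p_\lambda(O_q \mid q),
  \qquad
  \gamma_q^{\mathrm{MPE}} = \gamma_q \left( c(q) - c_{\mathrm{avg}} \right)

where q ranges over lattice arcs, O_q is the observation segment aligned to arc q, \gamma_q is the arc posterior, c(q) the average accuracy of paths through q, and c_{\mathrm{avg}} the average accuracy of all paths (symbols assumed here).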
18
Weak-sense auxiliary function for MPE
19
Weak-sense auxiliary function for MPE
20
Weak-sense auxiliary function for MPE
21
EBW equivalent smooth function
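Reconstructing the standard result rather than the original slide: adding a per-Gaussian smoothing function with constant D_{jm} to the weak-sense auxiliary function makes its maximum coincide with the Extended Baum-Welch style updates

  \hat{\mu}_{jm} = \frac{\theta_{jm}(O) + D_{jm}\,\mu'_{jm}}{\gamma_{jm} + D_{jm}},
  \qquad
  \hat{\sigma}^2_{jm} = \frac{\theta_{jm}(O^2) + D_{jm}\left(\sigma'^2_{jm} + \mu'^2_{jm}\right)}{\gamma_{jm} + D_{jm}} - \hat{\mu}^2_{jm}

where \gamma_{jm} and \theta_{jm}(\cdot) are the numerator-minus-denominator MPE occupancies and first/second-order statistics for Gaussian m of state j, and primes denote the current parameters.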
22
EBW equivalent smooth function
23
MAP updates (I-smoothing)
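I-smoothing acts as a MAP prior centered on the ML estimate: before the EBW update, \tau "points" of ML statistics are added to the numerator accumulators (following Povey; \tau is a tunable constant, symbols as above):

  \gamma^{\mathrm{num}}_{jm} \leftarrow \gamma^{\mathrm{num}}_{jm} + \tau,
  \qquad
  \theta^{\mathrm{num}}_{jm}(O) \leftarrow \theta^{\mathrm{num}}_{jm}(O) + \frac{\tau}{\gamma^{\mathrm{ML}}_{jm}}\, \theta^{\mathrm{ML}}_{jm}(O)

with the second-order statistics \theta^{\mathrm{num}}_{jm}(O^2) smoothed in the same way.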
24
Experiments - Baseline
  • Character error rate for ML training

25
Experiments
  • Character error rate on different settings