Chap. 10: Environment Robustness

Provided by: yihru

1
Chap. 10: Environment Robustness
  • Environment mismatch between training and test data
  • Channel and speaker normalization: CMN, SBR
  • Removes the channel effect (e.g., telephone-line vs.
    microphone speech)
  • Targets convolutional noise
  • Noise compensation: PMC
  • Compensates for additive noise
  • Speaker adaptation: MLLR
  • Speaker-adapted (SA) performance lies between
    speaker-independent (SI) and speaker-dependent (SD)
    performance
  • SI → SA → SD

2
System Modeling of a Robust Speech Recognition System
  • Block diagram
  • Acoustic mismatch between training and test data
  • Convolutional noise (channel effect)
  • Additive noise
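
The block diagram is not preserved in this transcript. As a hedged sketch, one common way to write the mismatch model is shown below; the symbols x (clean speech), h (channel impulse response), v (additive noise) and y (observed signal) are assumed here and are not taken from the slides.

```latex
% Assumed notation: x = clean speech, h = channel, v = additive noise, y = observation
\[
  y[n] = x[n] * h[n] + v[n]
\]
% With v[n] = 0, convolution becomes a product of spectra and a sum in the
% log-spectral and cepstral domains, so the channel appears as a constant bias:
\[
  Y(\omega) = X(\omega)\,H(\omega)
  \quad\Longrightarrow\quad
  \mathbf{c}_y = \mathbf{c}_x + \mathbf{c}_h
\]
```

This is why the channel effect can be treated as a bias to subtract in the cepstral domain, whereas additive noise needs a different treatment.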

3
CMN and SBR
  • CMN (CMS): Cepstral Mean Normalization
    (Subtraction)
  • SBR: Signal Bias Removal
  • Both remove convolutional noise
  • CMN removes the cepstral mean
  • The mean is averaged over a speaker or an utterance
  • If the average is taken over a single syllable,
    spectral information is removed as well
  • 10% error reduction
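
As an illustration of the CMN idea above, here is a minimal sketch of per-utterance cepstral mean subtraction. The function name `cepstral_mean_normalize` and the array layout (frames by coefficients) are assumptions made for this example, not something given on the slides.

```python
import numpy as np

def cepstral_mean_normalize(cepstra):
    """Subtract the per-utterance mean from every cepstral frame.

    cepstra: array of shape (num_frames, num_coeffs), one utterance.
    (Layout and name are assumptions for this sketch.)
    """
    cepstra = np.asarray(cepstra, dtype=float)
    # A constant channel bias in the cepstral domain is part of the mean,
    # so removing the mean removes the bias as well.
    return cepstra - cepstra.mean(axis=0, keepdims=True)

# Illustrative data: the same "clean" features seen through a fixed channel bias.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))
channel_bias = rng.normal(size=13)
observed = clean + channel_bias
normalized = cepstral_mean_normalize(observed)
```

After normalization, the clean and channel-distorted versions of the utterance give the same features, which is the sense in which CMN removes the channel effect.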

4
  • Real-time CMN
  • Use a low-pass filter (LPF) to estimate the cepstral mean
  • Time constant > 5 sec
  • RASTA: the LPF used is
  • SBR removes the cepstral mean
  • Ref.: M. G. Rahim and B.-H. Juang, "Signal bias
    removal by maximum likelihood estimation for
    robust telephone speech recognition," IEEE Trans.
    on Speech and Audio Processing, Vol. 4, No. 1, Jan. 1996.
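
The real-time variant can be sketched with a first-order recursive low-pass filter that tracks the cepstral mean frame by frame. The filter form, the 10 ms frame shift, the initialization with the first frame, and the name `running_cmn` are all assumptions for illustration; the slides do not give the filter itself.

```python
import numpy as np

def running_cmn(cepstra, frame_shift_s=0.01, time_constant_s=5.0):
    """Real-time CMN with a first-order low-pass estimate of the mean.

    cepstra: array of shape (num_frames, num_coeffs).
    The recursion and its parameters are assumptions for this sketch.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    # Forgetting factor of a one-pole low-pass filter with the given time constant.
    alpha = np.exp(-frame_shift_s / time_constant_s)
    mean = cepstra[0].copy()  # assumed initialization: the first frame
    out = np.empty_like(cepstra)
    for t, frame in enumerate(cepstra):
        # Low-pass update of the running mean, then subtract it.
        mean = alpha * mean + (1.0 - alpha) * frame
        out[t] = frame - mean
    return out
```

A long time constant keeps the estimate close to a speaker or utterance average; a short one would start removing spectral detail, which matches the warning about averaging over a syllable.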

5
  • In Cepstral domain
  • Assume
  • Model
  • If

6
PMC: Parallel Model Combination
  • Training data: noisy or clean
  • Testing data: noisy
  • We cannot train a model for every SNR and every
    type of noise
  • (1) Try to remove the noise from the input signal
    (Wiener filter, etc.)
  • (2) Try to construct the noisy-speech model
    from separate speech and noise models: PMC
    (Parallel Model Combination)

7
  • Assume the noise is additive; then it is also
    additive in the frequency domain
  • Speech and noise parameters do not add in the
    MFCC/cepstral/log-spectral domain, only in the
    linear-spectral domain
  • Noise model → 1-state HMM (the noise is assumed
    stationary)
  • Can the HMM parameters (mean, variance) be added
    in the linear-spectral domain?

8
  • In linear-spectral domain, the observation prob.
  • RV addition → pdf convolution; for Gaussian pdfs
    → the means and variances add
  • Transform from the cepstral domain to the linear domain
  • The parameters (mean, variance) are then easy to
    find in the linear domain

9
  • Convert back from the linear-spectral domain to the
    log-spectral (cepstral) domain
  • In the MFCC domain,
  • Problem: estimating the SNR
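
The combination steps can be sketched as follows, using the log-normal approximation that is commonly associated with PMC. This is only a sketch under stated assumptions: it works on diagonal Gaussians in the log-spectral domain (the cepstral-to-log-spectral DCT step is omitted), and the function names and the `gain` parameter, which stands in for the SNR-dependent scaling, are hypothetical.

```python
import numpy as np

def log_to_linear(mu_log, var_log):
    """Map a Gaussian in the log-spectral domain to the mean and variance
    of the corresponding log-normal variable in the linear-spectral domain."""
    mu_lin = np.exp(mu_log + 0.5 * var_log)
    var_lin = mu_lin ** 2 * (np.exp(var_log) - 1.0)
    return mu_lin, var_lin

def linear_to_log(mu_lin, var_lin):
    """Inverse mapping, assuming the combined variable is again log-normal."""
    var_log = np.log(var_lin / mu_lin ** 2 + 1.0)
    mu_log = np.log(mu_lin) - 0.5 * var_log
    return mu_log, var_log

def pmc_combine(speech_mu, speech_var, noise_mu, noise_var, gain=1.0):
    """Combine speech and noise Gaussians (log-spectral domain).

    gain is a hypothetical level-matching factor; choosing it is the
    SNR-estimation problem mentioned above.
    """
    s_mu, s_var = log_to_linear(np.asarray(speech_mu, float), np.asarray(speech_var, float))
    n_mu, n_var = log_to_linear(np.asarray(noise_mu, float), np.asarray(noise_var, float))
    # In the linear-spectral domain the means and variances simply add.
    y_mu = gain * s_mu + n_mu
    y_var = gain ** 2 * s_var + n_var
    return linear_to_log(y_mu, y_var)
```

With a single-state noise model, this combination would be applied to every Gaussian of every speech state to obtain the noisy-speech model.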

10
  • Performance of PMC

11
Maximum Likelihood Linear Regression (MLLR)
  • MLLR: model adaptation
  • Transform the observation probabilities in the HMM
  • Linear regression
  • Applied to the means of the mixtures
  • Find the optimal W using the ML criterion
  • → minimize Q-function

12
  • Q-function
  • Then,
  • The observation probabilities are clustered into
    classes, and a separate linear regression is used
    for each class.

13
  • Rewrite the above equation
  • Finally,
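
The equations of this derivation are not preserved in the transcript. As a hedged sketch of the idea, the code below estimates a single mean transform W for one regression class, assuming diagonal covariances and the usual row-by-row closed-form solution. The function names, the extended mean vector [1, mu], and the form of the sufficient statistics are assumptions for this example.

```python
import numpy as np

def extended_means(means):
    """Prepend a 1 to each mean vector so W also carries a bias term."""
    means = np.asarray(means, dtype=float)
    return np.hstack([np.ones((means.shape[0], 1)), means])

def estimate_mllr_mean_transform(means, variances, occupancies, obs_sums):
    """Estimate W (d x (d+1)) for one regression class.

    means, variances: (M, d) parameters of M Gaussians (diagonal covariance).
    occupancies: (M,) summed posterior counts of each Gaussian.
    obs_sums: (M, d) posterior-weighted sums of the adaptation frames.
    """
    variances = np.asarray(variances, dtype=float)
    occupancies = np.asarray(occupancies, dtype=float)
    obs_sums = np.asarray(obs_sums, dtype=float)
    xi = extended_means(means)                 # (M, d+1)
    d = xi.shape[1] - 1
    W = np.zeros((d, d + 1))
    for i in range(d):
        inv_var = 1.0 / variances[:, i]
        # Accumulate the normal equations for row i of W.
        G = (xi * (occupancies * inv_var)[:, None]).T @ xi
        k = (obs_sums[:, i] * inv_var) @ xi
        W[i] = np.linalg.solve(G, k)
    return W

def adapt_means(means, W):
    """Apply the transform: each adapted mean is W times [1, mu]."""
    return extended_means(means) @ W.T
```

The class needs at least d+1 Gaussians with data for the linear systems to be solvable, which is one practical reason for sharing a transform across a class of Gaussians.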

14
  • How to find the classes for MLLR
  • → models with the same statistics share the same LR
  • → use a decision tree to find the classes
  • Performance

  • (Figure labels: MLLR, New Model, MAP)