LOW-RESOURCE NOISE-ROBUST FEATURE POST-PROCESSING ON AURORA 2.0

1
LOW-RESOURCE NOISE-ROBUST FEATURE POST-PROCESSING
ON AURORA 2.0
  • Chia-Ping Chen, Jeff Bilmes and Katrin Kirchhoff
  • SSLI Lab
  • Department of Electrical Engineering
  • University of Washington
  • Presenter: Shih-Hsiang

ICSLP 2002
2
Introduction
  • The performance of ASR systems often decreases
    dramatically when the noise level increases
  • The degradation is minor when the signal-to-noise
    ratio (SNR) is high, but quite significant at low
    SNR levels
  • In the past, a variety of techniques have been
    proposed
  • Principal component analysis and a discriminative
    neural network (Ellis et al. 2001)
  • Missing Data theory (Cooke et al. 2001)
  • A voice activity detector (VAD) and variable frame
    rate processing are used to drop noisy feature
    vectors and reduce insertion errors (John et al. 2001)
  • Nonlinear spectral subtraction, noise masking,
    feature filters, and model adaptation (Lieb et
    al. 2001)
  • Data-driven temporal filters, on-line mean and
    variance normalization, voice activity detection,
    and server-side discriminative features are
    integrated to improve noise robustness
    (Morgan et al. 2001)
  • etc.

3
Literature Review
  • Ellis et al. 2001
  • John et al. 2001
  • Variable frame rate processing
  • An observation vector is discarded if it does not
    differ much from the previous observation vector.
    In our implementation of VFR, frame-to-frame
    variation is estimated as the Euclidean norm of
    the sub-vector corresponding to the
    delta-cepstrum.
  • Voice activity detection
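
The variable frame rate rule described above (drop a frame when the Euclidean norm of its delta-cepstrum sub-vector is small) can be sketched as follows. This is a minimal illustration, not the cited implementation: the function name, the `delta_slice` argument (where the delta-cepstrum sits in the feature vector) and the `threshold` value are all assumptions.

```python
import numpy as np

def variable_frame_rate(features, delta_slice, threshold):
    """Keep only frames whose delta-cepstrum sub-vector has a Euclidean
    norm of at least `threshold` (hypothetical parameter).

    features    : array of shape (num_frames, feature_dim)
    delta_slice : slice selecting the delta-cepstrum columns (assumed layout)
    """
    features = np.asarray(features, dtype=float)
    # Frame-to-frame variation, estimated from the delta-cepstrum sub-vector.
    variation = np.linalg.norm(features[:, delta_slice], axis=1)
    keep = variation >= threshold
    return features[keep], keep

# Toy example: column 0 is a static coefficient, columns 1-2 are deltas.
frames = np.array([[1.0, 0.0, 0.0],
                   [1.0, 3.0, 4.0],
                   [1.0, 0.1, 0.0]])
kept, mask = variable_frame_rate(frames, slice(1, 3), threshold=1.0)
```

In this toy input only the middle frame survives, because its delta sub-vector has norm 5 while the other two fall below the threshold.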

4
Literature Review
  • Morgan et al. (2001)

5
Proposed method
  • The first step is standard mean subtraction (MS)
  • The second step is variance normalization (VN)
  • The third step is autoregressive moving average
    (ARMA) filtering

Symbols in the filter definition: the feature vector (cepstral coefficient) and M, the order of the ARMA filter
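
The three steps can be sketched in a few lines of NumPy. The slide equations did not survive in this transcript, so the ARMA recursion below is an assumption: each output frame is taken as the average of the previous M filtered frames and the current plus next M unfiltered frames. Per-utterance statistics and leaving the first and last M frames unfiltered are also assumptions, and all function names are illustrative.

```python
import numpy as np

def mean_subtraction(c):
    # Step 1: remove the per-dimension mean over the utterance.
    return c - c.mean(axis=0)

def variance_normalization(c, eps=1e-12):
    # Step 2: scale each dimension to unit variance (eps avoids division by zero).
    return c / (c.std(axis=0) + eps)

def arma_filter(c, order):
    # Step 3 (assumed form): out[t] averages `order` past outputs with the
    # current and `order` future inputs; edge frames are copied unchanged.
    c = np.asarray(c, dtype=float)
    out = c.copy()
    for t in range(order, c.shape[0] - order):
        past_outputs = out[t - order:t].sum(axis=0)
        current_and_future_inputs = c[t:t + order + 1].sum(axis=0)
        out[t] = (past_outputs + current_and_future_inputs) / (2 * order + 1)
    return out

def mva(c, order=2):
    # Full post-processing chain applied to one utterance of shape (T, D).
    c = np.asarray(c, dtype=float)
    return arma_filter(variance_normalization(mean_subtraction(c)), order)
```

A call such as `mva(cepstra, order=2)` returns an array of the same shape as `cepstra`. The whole chain needs only running sums and a short buffer, which is consistent with the "low-resource" framing in the title.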
6
Choosing a proper order M of the filter
The transfer function is
The frequency response of the ARMA filter of
order M is
The number of zeros in the frequency response of the
ARMA filter is approximately proportional to its
order. This suggests that a large M will perform
poorly, since it could filter out important speech
information.
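
The slide equations are not preserved here, so the following is a worked sketch under an assumed form of the filter (past M outputs averaged with the current and next M inputs), not a reproduction of the original slide. The symbols \hat{c}_t (filtered coefficient) and c_t (input coefficient) are illustrative.

```latex
% Assumed ARMA recursion of order M
\hat{c}_t = \frac{\sum_{i=1}^{M} \hat{c}_{t-i} + \sum_{k=0}^{M} c_{t+k}}{2M+1}

% Taking z-transforms of both sides and solving for H(z) = \hat{C}(z) / C(z)
H(z) = \frac{\sum_{k=0}^{M} z^{k}}{(2M+1) - \sum_{i=1}^{M} z^{-i}}

% Frequency response: evaluate on the unit circle, z = e^{\mathrm{j}\omega}
H(e^{\mathrm{j}\omega}) = \frac{\sum_{k=0}^{M} e^{\mathrm{j}k\omega}}{(2M+1) - \sum_{i=1}^{M} e^{-\mathrm{j}i\omega}}

% The numerator is a geometric sum, so its zeros are the (M+1)-th roots of unity except z = 1
\sum_{k=0}^{M} z^{k} = \frac{z^{M+1} - 1}{z - 1}
\quad\Longrightarrow\quad
z_n = e^{\mathrm{j} 2\pi n/(M+1)}, \qquad n = 1, \dots, M
```

Under this assumption the response has M zeros on the unit circle, so the number of nulls grows with the order, and the denominator never vanishes there because the magnitude of its sum is at most M.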
7
Gain and phase shifts of the ARMA filter
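
The figure itself is not preserved, but gain and phase curves of this kind can be recomputed numerically. The sketch below assumes the transfer function H(z) = (1 + z + ... + z^M) / ((2M+1) - z^-1 - ... - z^-M), which follows from averaging M past outputs with the current and M future inputs; that form and the function name are assumptions rather than content from the slides.

```python
import numpy as np

def arma_frequency_response(order, omega):
    """Evaluate the assumed ARMA transfer function on the unit circle.

    order : filter order M
    omega : normalized angular frequency in radians per frame (scalar or array)
    """
    z = np.exp(1j * np.asarray(omega, dtype=float))
    numerator = sum(z ** k for k in range(order + 1))
    denominator = (2 * order + 1) - sum(z ** (-i) for i in range(1, order + 1))
    return numerator / denominator

omega = np.linspace(0.0, np.pi, 512)   # 0 up to the Nyquist frequency
H = arma_frequency_response(2, omega)  # order M = 2
gain = np.abs(H)                       # magnitude response
phase = np.angle(H)                    # phase shift in radians
```

With this form the gain is 1 at zero frequency, so slow trends pass unchanged, and for M = 2 there is a null at omega = 2π/3. Plotting `gain` and `phase` against `omega` for several orders shows how the passband narrows as M grows.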
8
The time sequences of the cepstral coefficient c1
for the digit string 5376869 corrupted with
different levels of noise
9
Evaluation
  • Evaluated on the Aurora 2.0 noisy digits database
  • Two training sets and three test sets
  • Training sets: clean speech only /
    multi-condition speech
  • Test sets: stationary-noise sets /
    non-stationary-noise sets / convolutional noise
  • Seven noise levels
  • Clean, 20 dB, 15 dB, 10 dB, 5 dB, 0 dB, -5 dB
  • Recognizer
  • Simple HMM-based system using whole-word models
  • "Zero" to "nine" and "oh": 16 states per word, 3
    Gaussian mixture components per state
  • Silence: 3 states

10
Recognition results
Word accuracies (as percentages)
Top: multi-condition training. Bottom: clean
training.
11
A comparison of different orders of the ARMA
filter
  • A small M will retain the short-term cepstral
    information but is more vulnerable to noise
  • A large M will make the processed features less
    corrupted by noise, but the short-term cepstral
    information will be lost.

Top: multi-condition training. Bottom: clean
training.
12
Test the effectiveness of proposed technique
  • The results show that while variance
    normalization and mean subtraction improve
    performance over the baseline, the addition of
    the ARMA filter provides significant further
    improvements

13
Comparison of different filters
  • causal ARMA filter
  • non-causal MA filter
  • causal MA filter
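
The definitions of the three alternatives are not preserved in this transcript, so the sketch below uses one plausible reading of each name: a non-causal MA filter that averages M past and M future inputs, a causal MA filter that averages the current and M past inputs, and a causal ARMA filter that averages M past outputs with the current input. These definitions, the function names and the edge handling (edge frames copied unchanged) are all assumptions.

```python
import numpy as np

def noncausal_ma(c, order):
    # Average of inputs from t - order to t + order (needs future frames).
    c = np.asarray(c, dtype=float)
    out = c.copy()
    for t in range(order, len(c) - order):
        out[t] = c[t - order:t + order + 1].mean(axis=0)
    return out

def causal_ma(c, order):
    # Average of the current input and the `order` previous inputs.
    c = np.asarray(c, dtype=float)
    out = c.copy()
    for t in range(order, len(c)):
        out[t] = c[t - order:t + 1].mean(axis=0)
    return out

def causal_arma(c, order):
    # Average of the `order` previous outputs and the current input.
    c = np.asarray(c, dtype=float)
    out = c.copy()
    for t in range(order, len(c)):
        out[t] = (out[t - order:t].sum(axis=0) + c[t]) / (order + 1)
    return out
```

The causal variants need no look-ahead and so add no latency, whereas the non-causal MA filter must wait for M future frames before it can emit a frame.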

14
Comparison of different filters (cont.)
Top: multi-condition training. Bottom: clean
training.