Noisy Speech Recognition based on Harmonic Noise Model (HNM)

About This Presentation
Title:

Noisy Speech Recognition based on Harmonic Noise Model (HNM)

Description:

Noise affects different bands of speech differently. In spectrogram of noisy speech, regions of low SNR will ... What is the high SNR region of the spectrogram ...

Slides: 16
Provided by: cnel4

Transcript and Presenter's Notes

1
Noisy Speech Recognition based on Harmonic Noise Model (HNM)
  • Xiuyun Shan
  • Shengli Li

2
Idea
  • Mimic the human auditory system's way of recognizing
    noisy speech.
  • Preferentially process the high-energy
    components of the speech signal.
  • Suppress the weaker parts.
  • Noise affects different bands of speech
    differently.
  • In the spectrogram of noisy speech, regions of low
    SNR are more corrupted than regions of high SNR.

3
Question
  • What is the high-SNR region of the spectrogram?
  • The voiced part of a speech signal has higher energy
    and has a harmonic structure.
  • Explore the harmonicity of voiced speech for
    noisy speech recognition.
  • Algorithm
  • Weighted Harmonic Noise Model

4
What is an HNM?
  • The HNM represents speech signals as the sum of a
    harmonic part and a noise part.
  • The harmonic part corresponds to the
    quasi-periodic components.
  • The noise part corresponds to non-periodic
    components.
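
The decomposition above can be illustrated with a small synthetic frame. This is only a sketch: the function name `synth_hnm_frame`, the Gaussian noise, and all parameter names are assumptions made for illustration and do not come from the slides.

```python
import math
import random


def synth_hnm_frame(f0_hz, amplitudes, noise_std, n_samples, sample_rate, seed=0):
    """Build one frame as harmonic part + noise part (illustrative only)."""
    rng = random.Random(seed)
    harmonic = []
    for n in range(n_samples):
        t = n / sample_rate
        # Quasi-periodic part: sinusoids at integer multiples of f0.
        harmonic.append(sum(a * math.cos(2.0 * math.pi * (k + 1) * f0_hz * t)
                            for k, a in enumerate(amplitudes)))
    # Non-periodic part: white Gaussian noise (an assumption of this sketch).
    noise = [rng.gauss(0.0, noise_std) for _ in range(n_samples)]
    signal = [h + r for h, r in zip(harmonic, noise)]
    return harmonic, noise, signal
```

With `f0_hz=100` and `sample_rate=8000`, the harmonic part repeats every 80 samples, while the noise part does not.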

5
HNM
  • The harmonic part contains only harmonic
    multiples of the fundamental frequency and is
    modeled as
  • fundamental frequency
  • total number of harmonics in the signal.
  • amplitude coefficient
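
For orientation, one commonly used way to write a harmonic part built from a fundamental frequency, a number of harmonics, and amplitude coefficients is sketched below. The symbols $f_0$, $f_s$, $\omega_0$, $K$, $a_k$, and $b_k$ are assumptions chosen for illustration; the slides' own notation may differ.

```latex
h[n] = \sum_{k=1}^{K} \bigl( a_k \cos(k \omega_0 n) + b_k \sin(k \omega_0 n) \bigr),
\qquad \omega_0 = \frac{2 \pi f_0}{f_s}
```

Here $f_0$ is the fundamental frequency, $f_s$ the sampling rate, $K$ the total number of harmonics, and $a_k$, $b_k$ the amplitude coefficients of the $k$-th harmonic.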

6
HNM Parameter Estimation
  • Pitch
  • Using pitch tracking algorithms from the
    literature.
  • "A Robust Algorithm for Pitch Tracking" (RAPT)
  • Amplitude Parameters
  • Least-squares solution for the amplitude
    coefficient is
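
As a general reminder, a linear least-squares fit has the closed form below. The notation is an assumption for illustration: $\mathbf{x}$ is the frame of samples, $\mathbf{A}$ is a matrix whose columns are the cosines and sines at the harmonic frequencies (assumed to have full column rank), and $\mathbf{b}$ stacks the amplitude coefficients.

```latex
\hat{\mathbf{b}}
  = \arg\min_{\mathbf{b}} \lVert \mathbf{x} - \mathbf{A}\mathbf{b} \rVert^{2}
  = \left( \mathbf{A}^{\mathsf{T}} \mathbf{A} \right)^{-1} \mathbf{A}^{\mathsf{T}} \mathbf{x}
```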

7
HNM
  • When HNM is applied to speech signals corrupted
    by additive noise

8
HNM Parameter Estimation
  • Harmonic Part
  • Noise Part
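
One plausible way to obtain the two parts from a frame is sketched below: fit sinusoids at multiples of a known pitch by least squares, then take the residual as the noise part. The function names, the cosine/sine basis, and the residual-as-noise choice are assumptions of this sketch, not details confirmed by the slides.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def harmonic_basis(num_samples, f0_hz, sample_rate, num_harmonics):
    """Rows of [cos(k*w0*n), sin(k*w0*n)] for k = 1..num_harmonics."""
    w0 = 2.0 * math.pi * f0_hz / sample_rate
    return [[func(k * w0 * n)
             for k in range(1, num_harmonics + 1)
             for func in (math.cos, math.sin)]
            for n in range(num_samples)]


def split_harmonic_noise(frame, f0_hz, sample_rate, num_harmonics):
    """Least-squares harmonic fit; the residual is treated as the noise part."""
    basis = harmonic_basis(len(frame), f0_hz, sample_rate, num_harmonics)
    size = 2 * num_harmonics
    # Normal equations: (B^T B) coeffs = B^T frame.
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(size)]
            for i in range(size)]
    proj = [sum(row[i] * x for row, x in zip(basis, frame)) for i in range(size)]
    coeffs = solve_linear(gram, proj)
    harmonic = [sum(c * b for c, b in zip(coeffs, row)) for row in basis]
    noise = [x - h for x, h in zip(frame, harmonic)]
    return coeffs, harmonic, noise
```

The returned coefficients are ordered as cosine and sine amplitudes for the first harmonic, then the second, and so on. The pitch passed in as `f0_hz` would come from a pitch tracker.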

9
HNM applied to ASR
  • Assumptions
  • Noise and speech are uncorrelated
  • Harmonic and noise components are uncorrelated
  • An estimate of the clean Mel spectral component from
    a noise-corrupted frame of speech
  • Mel spectral components of the harmonic part
  • Mel spectral components of the noise part
  • Scaling factors
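
The reason the uncorrelatedness assumptions matter can be seen in a short derivation. The symbols are assumptions for illustration: $H(\omega)$ and $R(\omega)$ denote the spectra of two zero-mean components, and $w_m(\omega)$ the weights of the $m$-th Mel filter.

```latex
\lvert H(\omega) + R(\omega) \rvert^{2}
  = \lvert H(\omega) \rvert^{2} + \lvert R(\omega) \rvert^{2}
    + 2 \, \mathrm{Re}\!\left\{ H(\omega) R^{*}(\omega) \right\}

\mathbb{E}\!\left[ H(\omega) R^{*}(\omega) \right] = 0
\;\Longrightarrow\;
\mathbb{E}\,\lvert H(\omega) + R(\omega) \rvert^{2}
  = \mathbb{E}\,\lvert H(\omega) \rvert^{2} + \mathbb{E}\,\lvert R(\omega) \rvert^{2}

\sum_{\omega} w_m(\omega) \, \lvert H(\omega) + R(\omega) \rvert^{2}
  \approx \sum_{\omega} w_m(\omega) \, \lvert H(\omega) \rvert^{2}
        + \sum_{\omega} w_m(\omega) \, \lvert R(\omega) \rvert^{2}
```

Because a Mel filterbank is a weighted sum of power-spectrum bins, the cross term drops out in expectation and the Mel spectral components of the two parts add, which is what makes a scaled sum of the two a reasonable form for an estimate.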

10
HNM
11
WHNM (Weighted HNM)
  • Use scaled versions of the harmonic and noise
    components to obtain an estimate of the
    underlying clean speech signal.
  • But how to estimate ?
  • Hard to obtain an estimate for
  • held constant over all utterances and a range of
    values
  • 0.1
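
The weighting idea can be sketched as a per-band scaled sum of the two Mel spectra. The function and parameter names are hypothetical. The transcript lists the value 0.1 without the symbol it belongs to, so the weights in the usage note are illustrative only.

```python
def whnm_mel_estimate(mel_harmonic, mel_noise, weight_harmonic, weight_noise):
    """Scaled sum of harmonic and noise Mel spectral components, per band."""
    if len(mel_harmonic) != len(mel_noise):
        raise ValueError("both Mel spectra must have the same number of bands")
    return [weight_harmonic * h + weight_noise * r
            for h, r in zip(mel_harmonic, mel_noise)]
```

For example, `whnm_mel_estimate(mel_h, mel_r, 1.0, 0.1)` keeps the harmonic part unchanged and strongly attenuates the noise part; a weight held constant over all utterances would simply be passed as the same number for every frame.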

12
WHNM-based MFCC Implementation
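
The last stage of a conventional MFCC front end, applied to a vector of Mel spectral energies, can be sketched as follows. This assumes the standard log-then-DCT recipe with an unnormalized DCT-II; the function name and the energy floor are assumptions, and the slide's actual implementation is not preserved in this transcript.

```python
import math


def mel_to_cepstrum(mel_energies, num_ceps, floor=1e-10):
    """Log-compress Mel energies, then apply an unnormalized DCT-II."""
    # Floor the energies so the logarithm is always defined.
    log_mel = [math.log(max(e, floor)) for e in mel_energies]
    num_bands = len(log_mel)
    return [sum(v * math.cos(math.pi * i * (j + 0.5) / num_bands)
                for j, v in enumerate(log_mel))
            for i in range(num_ceps)]
```

In a weighted-HNM front end, `mel_energies` would be the estimated clean Mel spectrum for one frame rather than the Mel spectrum of the raw noisy frame.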
13
ASR Experiments
  • Training data: HW05Traindata
  • Test data: HW05Testdata
  • Corrupted by additive noise of three kinds with
    SNR from -5 to 30 dB
  • White noise, music noise, vehicle noise
  • Features: WHNM-based MFCC features
  • Frames labeled as unvoiced are assigned a pitch
    of 150 Hz
  • 0.1
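
Corrupting clean speech with additive noise at a chosen SNR is usually done by rescaling the noise before adding it. The sketch below shows that step; the function name and the use of average power over the whole signal are assumptions, not details from the slides.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(v * v for v in noise) / len(noise)
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("speech and noise must both have non-zero power")
    # SNR_dB = 10*log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]
```

Calling this once per noise type and per SNR value yields a set of test conditions of the kind listed above.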

14
ASR Results
15
Conclusions
  • No assumption is made about the corrupting noise.
  • At lower SNR, the model achieves significant
    recognition accuracy.
  • In high-SNR conditions, the algorithm does not
    perform as well as pure MFCC does.