EEL 6586: AUTOMATIC SPEECH PROCESSING Windows Lecture

1
EEL 6586: AUTOMATIC SPEECH PROCESSING
Windows Lecture
  • Mark D. Skowronski
  • Computational Neuro-Engineering Lab
  • University of Florida
  • February 10, 2003

2
No, not MS Windows
3
not those either!
4
Speech windows
Speech is NONSTATIONARY
5
Speech windows
SEVEN
  • Assume speech is stationary over short window
    of time.

6
What is a short window of time?
  • 10 µs: smallest difference detectable by the auditory
    system (localization),
  • 3 ms: shortest phoneme (plosive burst),
  • 10 ms: glottal pulse period,
  • 100 ms: average phoneme duration,
  • 4 s: exhale period during speech.

"Short" depends on the application.
7
Applications using windows
  • Automatic speech recognition,
  • Speech coding/decoding,
  • Speaker identification,
  • Text-to-speech synthesis,
  • Noise reduction

Typical window (frame) length: 20-30 ms
Typical frame rate: 100 frames/sec
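
To make these numbers concrete, the sketch below converts a frame length and a frame rate into sample counts. The 12.5 kHz sampling rate is borrowed from the frequency-domain example later in the lecture, the 20 ms frame is the lower end of the typical range, and the function name `frame_parameters` is illustrative only.

```python
def frame_parameters(fs_hz, frame_ms, frames_per_sec):
    """Convert frame length (ms) and frame rate (frames/s) to sample counts."""
    frame_len = round(fs_hz * frame_ms / 1000.0)   # samples per frame
    hop = round(fs_hz / frames_per_sec)            # samples between frame starts
    overlap = frame_len - hop                      # samples shared by neighbouring frames
    return frame_len, hop, overlap


# Assumed values: fs = 12.5 kHz, 20 ms frames, 100 frames/sec.
frame_len, hop, overlap = frame_parameters(12500, 20, 100)
print(frame_len, hop, overlap)   # 250 125 125
```

At 100 frames/sec a new frame starts every 10 ms, so 20-30 ms frames overlap their neighbours by half or more.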
8
Short-time analysis
s(n): entire speech utterance
w(n): window function
x(n): frame of speech
Window function is non-zero for N samples,
n = 0, ..., N-1
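
The framing equation itself did not survive in the transcript. A common form consistent with the symbols above is x(n) = s(n0 + n) w(n) for n = 0, ..., N-1, where n0 is the start sample of the frame; that form, the helper name `extract_frames`, and the random test signal are assumptions made for this sketch.

```python
import numpy as np


def extract_frames(s, frame_len, hop, window=None):
    """Return a 2-D array whose rows are windowed frames x(n) = s(n0 + n) * w(n)."""
    if window is None:
        window = np.ones(frame_len)               # rectangular window
    n_frames = 1 + (len(s) - frame_len) // hop    # frames that fit entirely in s
    frames = np.empty((n_frames, frame_len))
    for i in range(n_frames):
        n0 = i * hop                              # start sample of frame i
        frames[i] = s[n0:n0 + frame_len] * window
    return frames


# Hypothetical 1-second "utterance" at an assumed fs = 12.5 kHz.
fs = 12500
s = np.random.default_rng(0).standard_normal(fs)
frames = extract_frames(s, frame_len=250, hop=125, window=np.hamming(250))
print(frames.shape)   # (99, 250)
```

Each row is one 20 ms frame; a hop of 125 samples gives roughly 100 frames/sec.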
9
Short-term Fourier Transform
s(m): entire speech utterance
w(m): window function
X(n,ω): STFT of speech at time n
STFT is a smoothed version of the original spectrum.
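
The STFT equation is missing from the transcript. A standard definition that matches the symbols listed above (an assumption about what the slide showed) is given first below. Writing S and W for the Fourier transforms of s(m) and w(m), the product in time becomes a convolution in frequency, shown second, which is the sense in which the STFT is a smoothed version of the original spectrum.

```latex
X(n,\omega) = \sum_{m=-\infty}^{\infty} s(m)\, w(n-m)\, e^{-j\omega m}

X(n,\omega) = \frac{1}{2\pi} \int_{-\pi}^{\pi} W(\theta)\, e^{j\theta n}\, S(\omega+\theta)\, d\theta
```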
10
STFT example
s(n): pure sinewave of infinite length
w(n): rectangular window
11
STFT example
W(ω)

X(ω)
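
The plots for this example are not in the transcript. As a rough numerical stand-in, the sketch below checks that the spectrum of a rectangular-windowed sinewave is the window spectrum shifted to the sinewave frequency (plus a mirror image at the negative frequency). The window length, FFT size, and tone frequency are illustrative choices, not values from the lecture.

```python
import numpy as np

N = 64          # window length in samples (illustrative)
n_fft = 1024    # zero-padded FFT size, to sample the spectrum finely
k0 = 256        # sinewave frequency in FFT bins (0.25 cycles/sample)

n = np.arange(N)
w = np.ones(N)                                  # rectangular window
x = np.cos(2 * np.pi * k0 * n / n_fft) * w      # windowed sinewave

W = np.fft.fft(w, n_fft)    # spectrum of the window
X = np.fft.fft(x, n_fft)    # spectrum of the windowed sinewave

# A cosine is the sum of two complex exponentials, so X is two half-height
# copies of W, one shifted to +k0 and one to -k0.
X_predicted = 0.5 * (np.roll(W, k0) + np.roll(W, -k0))

print(np.allclose(X, X_predicted))
print(int(np.argmax(np.abs(X[: n_fft // 2]))))
```

An unwindowed, infinitely long sinewave would give a single spectral line; the finite window replaces that line with a copy of its own main lobe and side lobes.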

12
Window types
  • Rectangular
  • Hann (cosine)
  • Hamming (raised cosine)
  • Blackman
  • Kaiser-Bessel

Tradeoff between leakage and blurring
13
Window tradeoff
  • Blurring: main lobe width (A)
  • Leakage: side lobe suppression (B)
14
Popular windows
Window          Unit BW   Sidelobe
Rectangle       1         -13 dB
Hann            2         -31 dB
Hamming         2         -43 dB
Blackman        3         -68 dB
Kaiser-Bessel   4         -91 dB
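
Figures like these can be estimated numerically. The sketch below zero-pads each window, finds the first null of its spectrum (in units of 1/N, which appears to be what the "Unit BW" column expresses relative to the rectangle), and reports the highest side lobe. The helper `window_stats` is hypothetical. Measured values depend on the exact window definition, so they need not match the table digit for digit: NumPy's `np.blackman` uses the 0.42/0.5/0.08 coefficients, and Kaiser-Bessel is left out because its side lobe level depends on a shape parameter the slide does not give.

```python
import numpy as np


def window_stats(w, pad=64):
    """Return (first-null position in bins of 1/N, peak sidelobe level in dB)."""
    N = len(w)
    mag = np.abs(np.fft.rfft(w, pad * N))
    mag = mag / mag[0]                       # normalize the main-lobe peak to 1
    # Walk down the main lobe until the magnitude starts rising again.
    rising = np.nonzero(np.diff(mag) > 0)[0]
    first_null = rising[0]
    peak_sidelobe_db = 20 * np.log10(mag[first_null:].max())
    return first_null / pad, peak_sidelobe_db


N = 256
windows = {
    "Rectangle": np.ones(N),
    "Hann": np.hanning(N),
    "Hamming": np.hamming(N),
    "Blackman": np.blackman(N),
}
for name, w in windows.items():
    null_bins, sidelobe_db = window_stats(w)
    print(f"{name:10s} first null at {null_bins:4.2f} bins, "
          f"peak sidelobe {sidelobe_db:6.1f} dB")
```

A wider main lobe (more blurring) buys lower side lobes (less leakage), which is the tradeoff the table summarizes.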
15
Practical issues
  • Rule of thumb:
  • Time domain: use a rectangular window
  • Freq domain: use a Hamming window
  • Why?

16
Time domain issues
  • Tapered windows interfere with correlation in the
    time domain

20 ms of /eh/, male utterance, pitch measurement
(normalized autocorrelation). The first side peak is
lower using a Hamming window.
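
The recording is not available here, so the sketch below reproduces the effect on a synthetic stand-in: five harmonics of a hypothetical 100 Hz pitch, sampled at an assumed 12.5 kHz, in a 20 ms frame. It compares the normalized autocorrelation peak at the pitch lag for a rectangular and a Hamming window.

```python
import numpy as np


def normalized_autocorrelation(x):
    """Autocorrelation of x for lags 0..len(x)-1, scaled so lag 0 equals 1."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    return r / r[0]


fs = 12500                 # Hz, assumed (taken from the frequency-domain slide)
f0 = 100                   # Hz, hypothetical pitch -> period of 125 samples
N = fs * 20 // 1000        # 20 ms frame = 250 samples
n = np.arange(N)

# Synthetic voiced-like signal: five harmonics of f0 with 1/h amplitudes.
s = sum(np.sin(2 * np.pi * h * f0 * n / fs) / h for h in range(1, 6))

lags = np.arange(50, 201)  # search range: pitch between 62.5 Hz and 250 Hz
peaks = {}
for name, w in [("Rectangle", np.ones(N)), ("Hamming", np.hamming(N))]:
    r = normalized_autocorrelation(s * w)
    best = int(lags[np.argmax(r[lags])])
    peaks[name] = (best, float(r[best]))
    print(f"{name:10s} peak {r[best]:.3f} at lag {best} ({fs / best:.1f} Hz)")
```

The taper attenuates the samples near the frame edges, which are exactly the ones that overlap at the pitch lag, so the Hamming peak comes out lower and is easier to miss with a fixed voicing threshold.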
17
Frequency domain issues
fs = 12.5 kHz, /eh/, 800 samples, male
speaker. Evidence of the blurring/leakage tradeoff.
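
The spectra from this slide are not in the transcript. A minimal sketch of the same tradeoff, using the slide's fs and frame length but a hypothetical single test tone in place of the vowel, measures the -3 dB width of the spectral peak (blurring) and the highest level more than 10 bins away from it (leakage). The FFT size and tone frequency are illustrative choices.

```python
import numpy as np

fs = 12500                      # Hz, from the slide
N = 800                         # samples, from the slide (64 ms)
n_fft = 8192                    # zero-padded FFT size (illustrative choice)
f_tone = 64.5 * fs / N          # hypothetical tone halfway between two DFT bins

n = np.arange(N)
s = np.sin(2 * np.pi * f_tone * n / fs)
freqs = np.fft.rfftfreq(n_fft, d=1 / fs)
far = np.abs(freqs - f_tone) > 10 * fs / N     # more than 10 bins from the tone

results = {}
for name, w in [("Rectangle", np.ones(N)), ("Hamming", np.hamming(N))]:
    mag = np.abs(np.fft.rfft(s * w, n_fft))
    db = 20 * np.log10(mag / mag.max() + 1e-12)
    width_hz = np.count_nonzero(db > -3) * fs / n_fft   # blurring: -3 dB width
    leak_db = db[far].max()                              # leakage: worst far-off level
    results[name] = (width_hz, leak_db)
    print(f"{name:10s} -3 dB width {width_hz:5.1f} Hz, leakage {leak_db:6.1f} dB")
```

The Hamming window should show the wider peak and the lower far-off level, which is why it is the usual choice when the frame is analyzed in the frequency domain.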