Short Time Fourier Analysis - PowerPoint PPT Presentation

Title: Short Time Fourier Analysis


1
Short Time Fourier Analysis
  • RPI ECSE
  • Jie Zou

2
Introduction
  • Frequently used in speech processing.
  • The properties of speech signals change as a
    function of time.
  • The basic idea is to analyze the signal over short
    time intervals using a window function.

3
Time-varying Fourier Transform (1)
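A standard form of the time-dependent Fourier transform, following Rabiner and Schafer (the reference cited on later slides), is sketched below. The notation is an assumption, chosen to agree with the later requirement that w(0) be nonzero: x(m) is the signal and w(n−m) is the window positioned at time n.

```latex
X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} x(m)\, w(n-m)\, e^{-j\omega m}
```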
4
Time-Varying Fourier Transform (2)
Important: the time-varying Fourier transform is a
function of two variables: n, the discrete time
index, and ω, the frequency variable.
In this example, high frequency components
increase with time.
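A minimal sketch of this two-variable view, assuming the definition X_n(e^{jω}) = Σ x(m) w(n−m) e^{−jωm} and a hypothetical linear chirp as the test signal (the slide's own example signal is not given). The helper names `stft_at` and `peak_bin` are illustrative, and `hamming` mirrors the formula of MATLAB's symmetric `hamming(N)`.

```python
import cmath
import math

def hamming(N):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*i/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

def stft_at(x, w, n, num_freqs):
    """X_n at frequencies 2*pi*k/num_freqs: sum_m x[m] * w[n-m] * e^{-j*omega*m}."""
    spectrum = []
    for k in range(num_freqs):
        omega = 2.0 * math.pi * k / num_freqs
        acc = 0j
        for i, wi in enumerate(w):      # w[i] weights the sample x[n - i]
            m = n - i
            if 0 <= m < len(x):
                acc += x[m] * wi * cmath.exp(-1j * omega * m)
        spectrum.append(acc)
    return spectrum

def peak_bin(x, w, n, num_freqs=128):
    """Index of the strongest positive-frequency bin at time n."""
    half = stft_at(x, w, n, num_freqs)[: num_freqs // 2]
    mags = [abs(v) for v in half]
    return mags.index(max(mags))

# Assumed test signal: instantaneous frequency rises linearly from 0 to 0.8*pi.
RATE = 0.8 * math.pi / 1024
x = [math.cos(0.5 * RATE * t * t) for t in range(1024)]
w = hamming(64)

print("peak bin at n=200:", peak_bin(x, w, 200))
print("peak bin at n=800:", peak_bin(x, w, 800))
```

Evaluating the same signal at a later time index moves the spectral peak to a higher bin, which is the behavior the slide describes.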
5
Window Shape
The shape of the window sequence has an important
effect on the time-dependent Fourier transform.
We study the rectangular window and the Hamming window.
6
Rectangular Window (N = 64)
w = rectwin(64); wvtool(w)
Narrow main lobe → greater frequency resolution →
increased sharpness.
Large side lobes → adjacent harmonics interact →
ragged, noisy spectrum.
7
Hamming Window (N = 64)
w = hamming(64); wvtool(w)
Slightly wider main lobe. Much smaller side lobes
and leakage factor. Better than the rectangular
window.
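The comparison between the two windows can be checked numerically without MATLAB. The sketch below is a plain-Python stand-in for what `wvtool` reports, under stated assumptions: the −3 dB main-lobe width and the peak side-lobe level are measured on a dense frequency grid, and `lobe_stats` is a hypothetical helper, not part of any toolbox.

```python
import cmath
import math

def rectwin(N):
    return [1.0] * N

def hamming(N):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*i/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

def magnitude_response(w, num_points=2048):
    """|W(e^{j*omega})| at num_points frequencies in [0, pi)."""
    mags = []
    for k in range(num_points):
        omega = math.pi * k / num_points
        mags.append(abs(sum(wi * cmath.exp(-1j * omega * i)
                            for i, wi in enumerate(w))))
    return mags

def lobe_stats(w, num_points=2048):
    """Return (-3 dB main-lobe width in rad/sample, peak side-lobe level in dB)."""
    mags = magnitude_response(w, num_points)
    step = math.pi / num_points
    # First grid point more than 3 dB below the peak (one-sided), doubled.
    k3 = next(k for k in range(num_points) if mags[k] < mags[0] / math.sqrt(2.0))
    width_3db = 2.0 * k3 * step
    # The first local minimum marks the edge of the main lobe.
    edge = next(k for k in range(1, num_points - 1)
                if mags[k] < mags[k - 1] and mags[k] <= mags[k + 1])
    sidelobe_db = 20.0 * math.log10(max(mags[edge:]) / mags[0])
    return width_3db, sidelobe_db

for name, win in (("rectwin(64)", rectwin(64)), ("hamming(64)", hamming(64))):
    width, level = lobe_stats(win)
    print(f"{name}: -3 dB width = {width:.4f} rad/sample, "
          f"peak side lobe = {level:.1f} dB")
```

The grid-based estimates are only as fine as the grid spacing, so the widths are approximate.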
8
Rectangular Window (Larger N)
N = 500. Clearly periodic. First formant peak at
300-400 Hz. Other peaks at 2200 Hz and 3800 Hz.
(From Rabiner and Schafer)
9
Hamming Window (Larger N)
N = 500. Smoother. First formant peak at 300-400
Hz. Other peaks at 2200 Hz and 3800 Hz. (From
Rabiner and Schafer)
10
Rectangular Window (N = 16)
w = rectwin(16); wvtool(w)
The width of the main lobe is inversely
proportional to the length of the window.
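For the rectangular window the inverse proportionality can be written out directly. This is a short worked derivation, assuming a window of ones on 0 ≤ n ≤ N−1:

```latex
W(e^{j\omega}) = \sum_{n=0}^{N-1} e^{-j\omega n}
             = e^{-j\omega (N-1)/2}\,\frac{\sin(\omega N/2)}{\sin(\omega/2)}

\text{first zero: } \omega = \frac{2\pi}{N}
\qquad
\text{null-to-null main-lobe width: } \frac{4\pi}{N}

N = 64:\ \frac{4\pi}{64} \approx 0.196\ \text{rad/sample}
\qquad
N = 16:\ \frac{4\pi}{16} \approx 0.785\ \text{rad/sample}
```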
11
Hamming Window (N = 16)
w = hamming(16); wvtool(w)
12
Rectangular Window (Smaller N)
N = 50. No periodicity. Lower frequency
resolution (but higher temporal resolution).
Broad peaks at 400, 1400, and 2200 Hz.
(From Rabiner and Schafer)
13
Hamming Window (Smaller N)
N = 50. No periodicity. Broad peaks at 400, 1400,
and 2200 Hz. (From Rabiner and Schafer)
14
Summary
The basic idea is to analyze the signal over short
time intervals with a window function.
The purpose of the window is to limit the time
interval to be analyzed so that the properties of
the waveform do not change appreciably.
A good window function should have a narrow main
lobe and small side lobes. Ideally, the frequency
response of the window would be an impulse.
Good temporal resolution requires a short window
while good frequency resolution calls for a long
window.
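The trade-off can be illustrated with two closely spaced tones. The sketch below assumes Fs = 10000 Hz and hypothetical tones at 1000 Hz and 1100 Hz. It compares the spectrum magnitude midway between the tones with the magnitude at the first tone: a small ratio means the tones appear as separate peaks. `dip_ratio` is an illustrative helper name.

```python
import cmath
import math

FS = 10000.0  # assumed sampling rate in Hz

def hamming(N):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*i/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

def windowed_magnitude(x, w, freq_hz):
    """Magnitude of the Fourier transform of x[i]*w[i] at freq_hz."""
    omega = 2.0 * math.pi * freq_hz / FS
    return abs(sum(xi * wi * cmath.exp(-1j * omega * i)
                   for i, (xi, wi) in enumerate(zip(x, w))))

def dip_ratio(L, f1=1000.0, f2=1100.0):
    """|X| midway between the tones divided by |X| at the first tone."""
    x = [math.cos(2.0 * math.pi * f1 * i / FS) + math.cos(2.0 * math.pi * f2 * i / FS)
         for i in range(L)]
    w = hamming(L)
    midpoint = windowed_magnitude(x, w, 0.5 * (f1 + f2))
    return midpoint / windowed_magnitude(x, w, f1)

# Long window: deep dip between the tones (resolved).
# Short window: no dip, the two tones merge into one broad peak.
print("L = 500:", dip_ratio(500))
print("L = 50: ", dip_ratio(50))
```

The short window still localizes events in time ten times more finely, which is the other side of the trade-off.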
15
Fourier Transform Interpretation (1)
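For fixed n, X_n(e^{jω}) is the ordinary Fourier
transform of the sequence x(m)w(n−m).
The input signal can be recovered exactly from the
time-varying Fourier transform, provided that
w(0) is nonzero.
Proof: the inverse Fourier transform of X_n(e^{jω})
is x(m)w(n−m); evaluating it at m = n gives
x(n) = (1/(2π w(0))) ∫ X_n(e^{jω}) e^{jωn} dω,
integrated over −π ≤ ω ≤ π.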
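The recovery claim can be checked numerically. This sketch assumes the definition X_n(e^{jω}) = Σ x(m) w(n−m) e^{−jωm} and replaces the inverse-transform integral with a sum over K uniformly spaced frequencies, which is exact when K is at least the window length. The test signal and the helper names are hypothetical.

```python
import cmath
import math

def hamming(N):
    """Symmetric Hamming window; its first sample w[0] = 0.08 is nonzero."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

def stft_at(x, w, n, K):
    """X_n at frequencies 2*pi*k/K: sum_m x[m] * w[n-m] * e^{-j*omega*m}."""
    spectrum = []
    for k in range(K):
        omega = 2.0 * math.pi * k / K
        acc = 0j
        for i, wi in enumerate(w):      # w[i] weights the sample x[n - i]
            m = n - i
            if 0 <= m < len(x):
                acc += x[m] * wi * cmath.exp(-1j * omega * m)
        spectrum.append(acc)
    return spectrum

def recover_sample(x, w, n, K):
    """x(n) = (1 / (K * w(0))) * sum_k X_n(omega_k) * e^{j*omega_k*n}."""
    X = stft_at(x, w, n, K)
    total = sum(X[k] * cmath.exp(2j * math.pi * k * n / K) for k in range(K))
    return (total / (K * w[0])).real

x = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(200)]
w = hamming(32)
print("original: ", x[100])
print("recovered:", recover_sample(x, w, 100, 32))
```

If w(0) were zero, the division in `recover_sample` would fail, which is the numerical counterpart of the requirement on the slide.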
16
Fourier Transform Interpretation (2)
Windowing: the Fourier transform of the sequence
x(m) is convolved with the Fourier transform of
the shifted window sequence.
The time dependent Fourier transform can be
interpreted as a smoothed version of the Fourier
transform of the part of the signal within the
window.
17
Fourier Transform Interpretation (3)
[Figure: Time Varying Spectral Display (TVSD); axes: frequency and time]
18
Linear Filtering Interpretation (1)
For each value of ω, X_n(e^{jω}) is the
convolution of the sequence x(n)e^{−jωn} with the
sequence w(n).
The window w(n) acts as a low-pass filter on the
modulated signal, so the output shows how the
content at a particular frequency changes over
time.
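A minimal sketch of the filtering view, assuming a causal window w(0..L−1) used as the impulse response of the low-pass filter. The test signal (a tone that switches on at sample 100) and the name `stft_filter_view` are hypothetical.

```python
import cmath
import math

def hamming(N):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*i/(N-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (N - 1)) for i in range(N)]

def stft_filter_view(x, w, omega):
    """X_n(e^{j*omega}) for every n: modulate x by e^{-j*omega*n}, then convolve with w."""
    modulated = [xi * cmath.exp(-1j * omega * i) for i, xi in enumerate(x)]
    output = []
    for n in range(len(x)):
        acc = 0j
        for i, wi in enumerate(w):      # convolution sum over the window taps
            if n - i >= 0:
                acc += wi * modulated[n - i]
        output.append(acc)
    return output

OMEGA0 = math.pi / 4                    # analysis frequency in rad/sample
x = [0.0] * 100 + [math.cos(OMEGA0 * t) for t in range(100, 200)]
w = hamming(32)

track = [abs(v) for v in stft_filter_view(x, w, OMEGA0)]
print("before onset (n=50): ", track[50])
print("after onset  (n=199):", track[199])
```

The magnitude track stays at zero before the tone starts and settles near half the sum of the window samples once the window is filled with the tone.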
19
Linear Filtering Interpretation (2)
[Figure: Time Varying Spectral Display (TVSD); axes: frequency and time]
20
Speech Terminology
  • Voiced speech: caused by excitation from a
    periodic sound source.
  • Unvoiced speech: caused by aperiodic noise
    excitation.
  • Formant: the major resonances are called
    formants, and they appear as dark bands. Typical
    spacing of formants is 1 kHz. The range of
    possible bandwidths is limited to 30-500 Hz.
  • Start: the start of voiced speech after a gap.
  • Stop: a short period of silence following voiced
    speech.

21
Speech Spectrum (Voice Print)
[Figure: speech spectrogram with labeled regions: unvoiced, start, end, voiced, formant]
24
Overlapping Windows
[Figure: Time Varying Spectral Display (TVSD) with 50% overlapping windows]
25
Required Sampling Rate in Time Dimension
Suppose the effective bandwidth of the window is
B Hz. For fixed ω, the sequence X_n(e^{jω}),
viewed as a function of n, has the same bandwidth
as the window. According to the sampling theorem,
it should be sampled at a rate of at least 2B
samples/second to avoid aliasing.
Example: the approximate bandwidth of a Hamming
window is B = 2Fs/L, where Fs is the sampling rate
of the original signal x(n) and L is the window
width. Suppose Fs = 10000 Hz and L = 100; then
B = 200 Hz and 2B = 400 samples/second, i.e. one
evaluation every 25 samples of x(n).
Because 25 < L, successive windows overlap.
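The arithmetic in this example can be reproduced with a small helper. This is a sketch under the slide's approximation B = C·Fs/L with C = 2 for the Hamming window; the function name `required_hop` and its parameter names are illustrative.

```python
import math

def required_hop(fs_hz, window_len, c=2.0):
    """Return (bandwidth in Hz, minimum frame rate in Hz, largest hop in samples).

    Assumes the window bandwidth is B = c * Fs / L, with c = 2 for a Hamming
    window, and applies the 2B rule from the sampling theorem.
    """
    bandwidth_hz = c * fs_hz / window_len
    frame_rate_hz = 2.0 * bandwidth_hz
    hop = math.floor(fs_hz / frame_rate_hz)
    return bandwidth_hz, frame_rate_hz, hop

bandwidth, frame_rate, hop = required_hop(10000.0, 100)
overlap = 1.0 - hop / 100
print(f"B = {bandwidth} Hz, 2B = {frame_rate} frames/s, "
      f"hop = {hop} samples, overlap = {overlap:.0%}")
```

With a hop of 25 samples and a 100-sample window, each window shares three quarters of its samples with the next one.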
26
Required Sampling Rate in Frequency Dimension
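For fixed n, the inverse Fourier transform of
X_n(e^{jω}) is the signal x(m)w(n−m), and this
signal is of duration L (the width of the window)
samples.
X_n(e^{jω}) should therefore be sampled at a set
of at least L uniformly spaced frequencies,
ω_k = 2πk/N, k = 0, 1, ..., N−1, with N ≥ L.
Example: for a Hamming window of duration L = 100,
X_n(e^{jω}) must be evaluated at no fewer than
100 uniformly spaced frequencies.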
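The effect of taking too few frequency samples can be seen on a small example. The sketch assumes a hypothetical windowed segment of length L = 8, indexed from 0 for simplicity, and compares N = 8 frequency samples with N = 4; the helper names are illustrative.

```python
import cmath
import math

def sample_spectrum(segment, N):
    """Fourier transform of the segment at omega_k = 2*pi*k/N, k = 0..N-1."""
    return [sum(s * cmath.exp(-2j * math.pi * k * m / N)
                for m, s in enumerate(segment))
            for k in range(N)]

def inverse_dft(X):
    """Length-N inverse DFT (real part), i.e. what the N samples can represent."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * m / N) for k in range(N)).real / N
            for m in range(N)]

# Hypothetical windowed segment x(m)w(n-m) of duration L = 8.
segment = [0.1, 0.5, 0.9, 1.0, 0.8, 0.4, 0.2, 0.05]

enough = inverse_dft(sample_spectrum(segment, 8))   # N = L: segment recovered
too_few = inverse_dft(sample_spectrum(segment, 4))  # N < L: time aliasing

print("N = 8:", [round(v, 6) for v in enough])
print("N = 4:", [round(v, 6) for v in too_few])
```

With N = 4 the second half of the segment folds onto the first half, so the original samples can no longer be separated.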
27
Total Sampling Rate
SR = 2B · L samples/sec
For most practical windows, B can be represented
as B = C · Fs/L.
SR = 2C · Fs samples/sec
2C indicates the over-sampling ratio of the
short-time analysis. 2C is usually between 2
(rectangular window) and 4 (Hamming window).
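Worked through with the numbers used on the earlier slides (a Hamming window with C = 2, Fs = 10000 Hz, L = 100), the total rate comes out as follows:

```latex
SR = 2B \cdot L = 2\,\frac{C F_s}{L}\,L = 2C F_s

B = \frac{2 \cdot 10000}{100} = 200\ \text{Hz}, \qquad
SR = 400\ \tfrac{\text{frames}}{\text{s}} \times 100\ \tfrac{\text{frequencies}}{\text{frame}}
   = 40000\ \tfrac{\text{samples}}{\text{s}} = 4 F_s
```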
28
Summary
The basic idea is to analyze the signal over short
time intervals with a window function.
A good window function should have a narrow main
lobe and small side lobes. Ideally, the frequency
response of the window would be an impulse.
Good temporal resolution requires a short window
while good frequency resolution calls for a long
window.
Overlap the windows to obtain both good temporal
and good frequency resolution.