LING124 Cepstrum, LPC, Autocorrelation - PowerPoint PPT Presentation

Provided by: hahn7
1
LING124 Cepstrum, LPC, Autocorrelation
  • September 18, 2008

2
Class outline
  • Source-filter model of speech production
  • Cepstrum
  • Linear Predictive Coding
  • Autocorrelation

3
Source-filter model
  • Source = glottal pulse
  • Filter = the vocal tract, characterized as a tube
  • Harmonics = whole-number multiples of the
    fundamental frequency
  • Different shapes of the filter (vocal tract)
    amplify different harmonics
  • Different vowel → different shape of the vocal
    tract → amplification of different harmonics →
    different spectral shape

4
Source-filter model (2)
  • $x(t)$ = excitation signal = source (glottal pulse)
  • $h(t)$ = transfer function = filter = shape of the
    vocal tract
  • $y(t) = x(t) * h(t)$
  • $y(t)$ = output signal = speech sound
  • The output signal is the convolution of the
    excitation signal and the transfer function
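
The relationship above can be sketched numerically. This is a minimal illustration, not the lecture's own example: the sampling rate, the 100 Hz pulse rate, and the single damped 500 Hz resonance standing in for the vocal tract are all assumed values, and the names `fs`, `f0`, `period`, `x`, `h`, and `y` are chosen here for illustration.

```python
import numpy as np

fs = 8000                 # sampling rate in Hz (assumed)
f0 = 100                  # fundamental frequency in Hz (assumed)
period = fs // f0         # samples per glottal cycle

# x: excitation, idealized as one unit impulse per glottal cycle
x = np.zeros(fs // 10)    # 100 ms of signal
x[::period] = 1.0

# h: toy filter response, a single damped resonance near 500 Hz (assumed shape)
t = np.arange(200) / fs
h = np.exp(-300.0 * t) * np.sin(2 * np.pi * 500.0 * t)

# y: output signal, the convolution of the excitation with the filter response
y = np.convolve(x, h)
```

Each impulse in `x` launches one copy of `h`, so `y` is a sum of overlapping, shifted copies of the filter response. Changing `h` (a different "vowel") changes the shape of `y` while the pulse spacing stays the same.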

5
Convolution
  • The amount of overlap between two functions as
    one function is shifted over another function

From Wolfram MathWorld
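
For sampled signals, the "shift and measure overlap" idea becomes a sum of products. The sketch below spells that sum out with explicit loops and compares it against NumPy's built-in `np.convolve`. The function name `convolve_direct` and the two short input arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

def convolve_direct(f, g):
    """Discrete convolution: out[n] = sum over k of f[k] * g[n - k]."""
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):
        for k in range(len(f)):
            # Only count positions where the shifted copy of g overlaps f
            if 0 <= n - k < len(g):
                out[n] += f[k] * g[n - k]
    return out

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
result = convolve_direct(f, g)      # [0.0, 1.0, 2.5, 4.0, 1.5]
reference = np.convolve(f, g)       # NumPy's implementation of the same sum
```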
6
Cepstrum
  • $y(t) = x(t) * h(t)$
  • We want to find out the shape of the vocal tract
    that led to the speech sound
  • We want to separate h(t) from y(t)
  • A homomorphic transformation $D[y(t)]$ converts
    $y(t) = x(t) * h(t)$ into $\hat{y}(t) = \hat{x}(t) + \hat{h}(t)$
  • One such transformation is cepstral analysis:
    the inverse Fourier transform of the log
    of the Fourier transform of the signal
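
Why this sequence of operations turns convolution into addition can be traced step by step. The derivation below assumes the convolution theorem and works with magnitudes only (the real cepstrum). The uppercase symbols $Y(\omega)$, $X(\omega)$, $H(\omega)$ for the Fourier transforms are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
y(t) &= x(t) * h(t) \\
Y(\omega) &= X(\omega)\, H(\omega) \\
\log |Y(\omega)| &= \log |X(\omega)| + \log |H(\omega)| \\
\mathcal{F}^{-1}\{\log |Y(\omega)|\} &= \mathcal{F}^{-1}\{\log |X(\omega)|\} + \mathcal{F}^{-1}\{\log |H(\omega)|\}
\end{aligned}
```

The Fourier transform converts convolution into multiplication, the log converts multiplication into addition, and the inverse transform is linear, so the sum survives.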

7
Cepstrum (2)
  • Spectrum of spectrum
  • (Inverse) Fourier transform of the log of Fourier
    transform
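  • For $x_N[n]$, a discrete-time signal with N samples
    in a cycle
  • The kth frequency component from the DFT, which has
    a magnitude (and a phase), is
    $X[k] = \sum_{n=0}^{N-1} x_N[n] \, e^{-j 2\pi k n / N}$
  • The cepstral value for the nth sample is
    $c[n] = \frac{1}{N} \sum_{k=0}^{N-1} \log |X[k]| \, e^{j 2\pi k n / N}$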
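
A minimal sketch of this computation with NumPy's FFT routines follows. The helper name `real_cepstrum`, the toy excitation and filter, and the 40 to 120 sample search range for the pitch period are all assumptions made for illustration. The frame length is chosen so that the DFT of `y` equals the product of the zero-padded DFTs of `x` and `h`, which lets the additive property be checked directly.

```python
import numpy as np

def real_cepstrum(x, n_fft=None):
    """Inverse DFT of the log magnitude of the DFT of x."""
    spectrum = np.fft.fft(x, n_fft)
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)  # tiny offset guards against log(0)
    return np.fft.ifft(log_magnitude).real

fs = 8000                                # sampling rate in Hz (assumed)
period = 80                              # pulse spacing in samples, i.e. 100 Hz (assumed)
x = np.zeros(800)
x[::period] = 1.0                        # excitation: impulse train
t = np.arange(200) / fs
h = np.exp(-300.0 * t) * np.sin(2 * np.pi * 500.0 * t)   # toy filter response
y = np.convolve(x, h)                    # output signal

n = len(y)
c_y = real_cepstrum(y)
c_x = real_cepstrum(x, n)                # zero-padded to the length of y
c_h = real_cepstrum(h, n)

# Look for the strongest cepstral peak within an assumed pitch-period range
lo, hi = 40, 120
pitch_period = lo + int(np.argmax(c_y[lo:hi]))
```

Here `c_y` should match `c_x + c_h` to within rounding, and `pitch_period` should land on the 80-sample pulse spacing. The slowly varying filter contributes mostly at low index values, while the periodic source produces the isolated peak further out.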

8
Cepstrum (3)
  • Spectrum
  • Cepstrum

Figures from JM (p.300), originally from Taylor
(2008)
9
Linear Predictive Coding (LPC)
Figure 11.1 in Ladefoged (p.182)
10
LPC (2)
  • We assume that each sample ($x[n]$) of a speech
    sound can be predicted by a linear combination of
    a number of previous samples (e.g. $x[n-1]$,
    $x[n-2]$, ..., $x[n-p]$)
  • $\hat{x}[n] = a_1 x[n-1] + a_2 x[n-2] + \dots + a_p x[n-p]$
  • Of course there will be errors: there will be a
    difference between the predicted value and the actual
    value
  • $e[n] = (x[n] - \hat{x}[n])^2$

11
LPC (3)
  • We want to find the coefficients that minimize
    the error, more specifically the sum of squared
    error within a window
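  • Assuming pth order and a window of w samples
  • Define the error as
    $E = \sum_{n} \left( x[n] - \sum_{k=1}^{p} a_k x[n-k] \right)^2$,
    summed over the samples in the window
  • Solve $\partial E / \partial a_i = 0$ for $1 \le i \le p$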
  • We will consider the coefficients as
    characterizing the filter
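
One way to carry out this minimization is to treat it as an ordinary least-squares problem. The slides do not say which solver is intended, so the sketch below is an assumption: it stacks the previous samples into a matrix and calls `np.linalg.lstsq`. The function name `lpc_least_squares` and the toy second-order test signal are hypothetical.

```python
import numpy as np

def lpc_least_squares(x, p):
    """Find a_1..a_p minimizing sum_n (x[n] - sum_k a_k * x[n-k])^2 for n = p..len(x)-1."""
    # The row for sample n holds its p predecessors: x[n-1], x[n-2], ..., x[n-p]
    rows = np.array([x[n - p:n][::-1] for n in range(p, len(x))])
    targets = x[p:]
    a, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return a

# Toy signal that obeys x[n] = 1.5 x[n-1] - 0.8 x[n-2] exactly after the first sample
true_a = np.array([1.5, -0.8])
x = np.zeros(200)
x[0] = 1.0
x[1] = true_a[0] * x[0]
for n in range(2, len(x)):
    x[n] = true_a[0] * x[n - 1] + true_a[1] * x[n - 2]

a = lpc_least_squares(x, p=2)
```

Because this toy signal is perfectly predictable from two previous samples, the fit should recover `true_a`. Real speech leaves a nonzero residual, and the fitted coefficients then describe the filter only approximately.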

12
LPC (4)
  • The prediction error will be the greatest when
    there is an abrupt change in the sound wave
  • This is most likely to happen whenever there is
    an impulse from the vocal fold vibration
  • So if we plot the size of the error over time,
    the interval between two nearest peaks in error
    will equal the period of the glottal pulse
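
The idea can be tried on a synthetic signal. In this sketch an impulse train drives a toy second-order filter, prediction coefficients are fitted by least squares, and the squared prediction error is inspected. The filter coefficients, the 80-sample pulse spacing, and the half-of-maximum threshold used to pick out error peaks are assumptions made for illustration.

```python
import numpy as np

period = 80                          # pulse spacing in samples (assumed)
n_samples = 800
excitation = np.zeros(n_samples)
excitation[::period] = 1.0           # one impulse per glottal cycle

# Toy filter: x[n] = 1.5 x[n-1] - 0.8 x[n-2] + excitation[n]
true_a = np.array([1.5, -0.8])
x = np.zeros(n_samples)
for n in range(n_samples):
    x[n] = excitation[n]
    for k, a_k in enumerate(true_a, start=1):
        if n - k >= 0:
            x[n] += a_k * x[n - k]

# Fit a second-order predictor by least squares
p = 2
rows = np.array([x[n - p:n][::-1] for n in range(p, n_samples)])
a, *_ = np.linalg.lstsq(rows, x[p:], rcond=None)

# Squared prediction error for samples p .. n_samples-1
squared_error = (x[p:] - rows @ a) ** 2

# Sample indices where the error is large, and the spacing between them
peaks = p + np.flatnonzero(squared_error > 0.5 * squared_error.max())
intervals = np.diff(peaks)
```

The predictor tracks the smooth ringing of the filter but cannot anticipate each new impulse, so the error spikes once per cycle and `intervals` should all equal `period`.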

13
Autocorrelation
  • $f(t + T) = f(t)$ for a periodic wave with period T
  • If there are N samples in a cycle, $x[n + N] = x[n]$
  • Take a copy of the wave
  • Shift the wave by a small number of samples
  • Identify after how many samples the copy and the
    original completely overlap
  • e.g. Plot the mismatch between the copy and the
    original against the shift, and compute the
    distance between two nearest minima
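
The shift-and-compare procedure can be sketched as follows. The test wave (a 100 Hz sine plus a weaker 200 Hz component), the 40 to 120 sample lag range, and the helper names `mean_squared_difference` and `autocorrelation` are assumptions made for illustration. The mismatch measure used here is a mean squared difference, which is one possible choice rather than the one confirmed by the slides.

```python
import numpy as np

fs = 8000                                        # sampling rate in Hz (assumed)
n = np.arange(800)
x = np.sin(2 * np.pi * 100 * n / fs) + 0.5 * np.sin(2 * np.pi * 200 * n / fs)

def mean_squared_difference(x, lag):
    """Average squared mismatch between x and a copy of x shifted by lag samples."""
    return np.mean((x[:-lag] - x[lag:]) ** 2)

def autocorrelation(x, lag):
    """Sum of products between x and a copy of x shifted by lag samples."""
    return np.sum(x[:-lag] * x[lag:])

lags = np.arange(40, 121)                        # candidate periods in samples (assumed range)
mismatch = np.array([mean_squared_difference(x, k) for k in lags])
similarity = np.array([autocorrelation(x, k) for k in lags])

period_from_mismatch = int(lags[np.argmin(mismatch)])       # where the copy overlaps best
period_from_similarity = int(lags[np.argmax(similarity)])   # where the products are largest
```

Both estimates should come out at 80 samples, which corresponds to 100 Hz at this sampling rate. The mismatch curve has minima where the shifted copy lines up with the original, and the autocorrelation has maxima at the same shifts.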

14
Autocorrelation (2)