Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology

1
Landmark-Based Speech Recognition: Spectrogram Reading, Support Vector Machines, Dynamic Bayesian Networks, and Phonology
  • Mark Hasegawa-Johnson
  • jhasegaw_at_uiuc.edu
  • University of Illinois at Urbana-Champaign, USA

2
Lecture 2: Acoustics of Vowel and Glide Production
  • One-Dimensional Linear Acoustics
      • The Acoustic Wave Equation
      • Transmission Lines
      • Standing Wave Patterns
  • One-Tube Models
      • Schwa
      • Front cavity resonance of fricatives
  • Two-Tube Models
      • The vowel /a/
      • Helmholtz Resonator
      • The vowels /u,i,e/
  • Perturbation Theory
      • The vowels /u/, /o/ revisited
      • Glides

3
1. One-Dimensional Acoustic Wave Equation and Solutions
4
Acoustics: Constitutive Equations
5
Acoustic Plane Waves: Time Domain
6
Acoustic Plane Waves: Frequency Domain
7
Solution for a Tube with Constant Area and Hard Walls
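For reference, a sketch of the standard lossless plane-wave equations that slides 4 to 7 refer to. The notation is an assumption, not taken from the slides: p is acoustic pressure, v is particle velocity, ρ is air density, c is the speed of sound, and Ω is radian frequency. P+ and P- are the forward- and backward-going wave amplitudes.

```latex
% Assumed notation: p = acoustic pressure, v = particle velocity,
% \rho = air density, c = speed of sound, \Omega = radian frequency.
\begin{align}
\frac{\partial p}{\partial x} &= -\rho\,\frac{\partial v}{\partial t}, &
\frac{\partial v}{\partial x} &= -\frac{1}{\rho c^{2}}\,\frac{\partial p}{\partial t} \\
P(x,j\Omega) &= P_{+}\,e^{-j\Omega x/c} + P_{-}\,e^{j\Omega x/c} \\
V(x,j\Omega) &= \frac{1}{\rho c}\left(P_{+}\,e^{-j\Omega x/c} - P_{-}\,e^{j\Omega x/c}\right)
\end{align}
```

The two unknowns P+ and P- are then fixed by the boundary conditions at the two ends of the tube.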
8
2. One-Tube Models
9
Boundary Conditions
(Figure: tube with axis marked from 0 to L.)
10
Resonant Frequencies
11
Standing Wave Patterns
12
Standing Wave Patterns: Quarter-Wave Resonators
Tube Closed at the Left End, Open at the Right End
13
Standing Wave Patterns: Half-Wave Resonators
Tube Closed at Both Ends
Tube Open at Both Ends
14
Schwa and Invv (inverted v, /ʌ/): the vowels in "a tug"
F3 = 2500 Hz = 5c/4L
F2 = 1500 Hz = 3c/4L
F1 = 500 Hz = c/4L
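The resonance formulas for the one-tube models can be checked numerically. The sketch below is illustrative only: the speed of sound (35400 cm/s) and the 17.7 cm tube length are assumptions chosen so that c/4L comes out at 500 Hz; neither value appears on the slides.

```python
# Assumption: speed of sound in warm, humid air, in cm/s (not given on the slides).
SPEED_OF_SOUND_CM_S = 35400.0


def quarter_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances of a tube closed at one end and open at the other: (2n-1)c/4L."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


def half_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Nonzero resonances of a tube closed (or open) at both ends: nc/2L."""
    return [n * c / (2.0 * length_cm) for n in range(1, count + 1)]


# Hypothetical 17.7 cm vocal tract: F1, F2, F3 of a schwa-like uniform tube.
schwa_formants = quarter_wave_resonances(17.7)
print(schwa_formants)
```

With these assumed values the three quarter-wave resonances land on the 500, 1500, and 2500 Hz pattern listed for schwa.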
15
Front Cavity Resonances of a Fricative
/s/ Front Cavity Resonance: 4500 Hz
4500 Hz = c/4L if Front Cavity Length is L = 1.9 cm
/sh/ Front Cavity Resonance: 2200 Hz
2200 Hz = c/4L if Front Cavity Length is L = 4.0 cm
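The same quarter-wave relation can be inverted to estimate the front-cavity length from a measured resonance. This is a minimal sketch; the function name and the speed of sound (35400 cm/s) are assumptions, so the results only approximately match the rounded lengths on the slide.

```python
def front_cavity_length_cm(resonance_hz, c=35400.0):
    """Invert F = c/4L for a quarter-wave front cavity: L = c/(4F).

    c is an assumed speed of sound in cm/s.
    """
    return c / (4.0 * resonance_hz)


s_length = front_cavity_length_cm(4500.0)   # /s/
sh_length = front_cavity_length_cm(2200.0)  # /sh/
print(round(s_length, 2), round(sh_length, 2))
```

Under this assumption the lengths come out near 2 cm for /s/ and near 4 cm for /sh/, in line with the slide's 1.9 cm and 4.0 cm.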
16
3. Two-Tube Models
17
Conservation of Mass at the Juncture of Two Tubes
Total liters/second transmitted = (velocity) × (tube area)
A2 = A1/2
U2(x,t) = 2U1(x,t)
18
Two-Tube Model: Two Different Sets of Waves
Incident Wave P1+
Reflected Wave P2+
Reflected Wave P1-
Incident Wave P2-
19
Two-Tube Model: Solution in the Time Domain
20
Two-Tube Model in the Frequency Domain
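One way to solve the two-tube model numerically is sketched below. It assumes a back tube closed at the glottis and a front tube open at the lips, with no losses; under those assumptions the resonance condition can be written tan(kL_back)·tan(kL_front) = A_front/A_back, with k = 2πf/c. The function name, the scan-and-bisect strategy, the speed of sound, and the example areas are all illustrative choices, not taken from the slides.

```python
import math


def two_tube_formants(l_back, a_back, l_front, a_front,
                      f_max=5000.0, step=1.0, c=35400.0):
    """Resonances (Hz) of a lossless two-tube model.

    Assumes the back tube is closed at the glottis and the front tube is
    open at the lips. Lengths in cm, areas in cm^2, c in cm/s (assumed).
    """
    def g(f):
        # g(f) = 0 is equivalent to tan(k*l_back)*tan(k*l_front) = a_front/a_back,
        # written without tangents so the function has no poles.
        k = 2.0 * math.pi * f / c
        return (a_back * math.sin(k * l_back) * math.sin(k * l_front)
                - a_front * math.cos(k * l_back) * math.cos(k * l_front))

    roots = []
    lo, g_lo = step, g(step)
    while lo < f_max:
        hi = lo + step
        g_hi = g(hi)
        if g_hi == 0.0:
            roots.append(hi)
        elif g_lo * g_hi < 0.0:
            # Sign change: refine the root by bisection.
            a, b, g_a = lo, hi, g_lo
            for _ in range(60):
                mid = 0.5 * (a + b)
                g_mid = g(mid)
                if g_a * g_mid <= 0.0:
                    b = mid
                else:
                    a, g_a = mid, g_mid
            roots.append(0.5 * (a + b))
        lo, g_lo = hi, g_hi
    return roots


# Sanity check: equal areas reduce to a single uniform 17.7 cm tube.
uniform = two_tube_formants(8.85, 1.0, 8.85, 1.0)
print([round(f) for f in uniform[:3]])
```

With equal areas the model collapses to a quarter-wave resonator of the total length, which is a convenient check on the solver before trying unequal areas.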
21
Approximate Solution of the Two-Tube Model, A1A2
(Figure labels: LBACK, LFRONT)
Approximate solution: Assume that the two tubes are completely decoupled, so that the formants include
  • F(BACK CAVITY) = c/4LBACK
  • F(FRONT CAVITY) = c/4LFRONT
22
The Vowels /AA/, /AH/
LBACK = 8.8 cm → F2 = c/4LBACK = 1000 Hz
LFRONT = 12.6 cm → F1 = c/4LFRONT = 700 Hz
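The decoupled approximation for /AA/ amounts to treating each cavity as its own quarter-wave resonator. A minimal sketch, assuming a speed of sound of 35400 cm/s (not stated on the slide) and a hypothetical function name:

```python
def decoupled_two_tube_formants(l_back_cm, l_front_cm, c=35400.0):
    """Treat back and front cavities as independent quarter-wave resonators.

    Returns the two lowest cavity resonances in ascending order, in Hz.
    c is an assumed speed of sound in cm/s.
    """
    back = c / (4.0 * l_back_cm)
    front = c / (4.0 * l_front_cm)
    return sorted([back, front])


f1, f2 = decoupled_two_tube_formants(8.8, 12.6)
print(round(f1), round(f2))
```

With the slide's cavity lengths this gives values within a few hertz of the rounded 700 Hz and 1000 Hz figures.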
23
Acoustic Impedance
(Figure: two panels, each labeled with impedance Z(x,jΩ) and an x axis with origin 0.)
24
Low-Frequency Approximations of Acoustic Impedance
25
Helmholtz Resonator
(Figure labels: -Z1(x,jΩ) and Z2(x,jΩ), each with an x axis and origin 0.)
26
The Vowel /i/
Back Cavity (Pharynx) Resonances: 0 Hz, 2000 Hz, 4000 Hz
Front Cavity (Palatal Constriction) Resonances: 0 Hz, 2500 Hz, 5000 Hz
Back Cavity Volume = 70 cm^3
Front Cavity Length/Area = 7 cm^-1
→ $1/(2\pi\sqrt{MC})$ = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
(Figure labels: 2500 Hz, 2000 Hz, 250 Hz)
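The Helmholtz frequency for /i/ can be reproduced from the numbers on the slide. The sketch assumes the usual low-frequency approximations, acoustic mass M = ρ·(l/A) for the constriction and acoustic compliance C = V/(ρc²) for the back cavity, so that the air density ρ cancels. The speed of sound (35400 cm/s) and the function name are assumptions.

```python
import math


def helmholtz_resonance_hz(volume_cm3, length_over_area_per_cm, c=35400.0):
    """Helmholtz resonance f = 1 / (2*pi*sqrt(M*C)).

    Assumes M = rho*(l/A) and C = V/(rho*c^2), so rho cancels and
    f = (c / 2*pi) / sqrt((l/A) * V). c is an assumed speed of sound in cm/s.
    """
    return c / (2.0 * math.pi * math.sqrt(length_over_area_per_cm * volume_cm3))


f1_iy = helmholtz_resonance_hz(volume_cm3=70.0, length_over_area_per_cm=7.0)
print(round(f1_iy, 1))
```

For a 70 cm^3 back cavity and a length/area ratio of 7 cm^-1 this comes out at roughly 255 Hz, consistent with the rounded 250 Hz on the slide.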
27
The Vowel /u/: A Two-Tube Model
Back Cavity (Mouth + Pharynx) Resonances: 0 Hz, 1000 Hz, 2000 Hz
Front Cavity (Lips) Resonances: 0 Hz, 18000 Hz, ...
Back Cavity Volume = 200 cm^3
Front Cavity Length/Area = 2 cm^-1
→ $1/(2\pi\sqrt{MC})$ = 250 Hz
Helmholtz Resonance replaces all 0 Hz partial-tube resonances.
(Figure labels: 2000 Hz, 1000 Hz, 250 Hz)
28
The Vowel /u/: A Four-Tube Model
(Figure labels: Pharynx, Velar Tongue Body Constriction, Mouth, Lips)
Two Helmholtz Resonators = Two Low-Frequency Formants!
F1 = 250 Hz
F2 = 500 Hz
F3 = Pharynx resonance, c/2L = 2000 Hz
29
4. Perturbation Theory
30
Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)
A(x) is constant everywhere, except for one small perturbation.
Method:
1. Compute formants of the unperturbed vocal tract.
2. Perturb the formant frequencies to match the area perturbation.
31
Conservation of Energy Under Perturbation
32
Conservation of Energy Under Perturbation
33
Sensitivity Functions
34
Sensitivity Functions for the Quarter-Wave Resonator (Lips Open)
(Figure: x axis from 0 to L; constriction locations marked /AA/, /ER/, /IY/, /W/.)
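A sensitivity function for the quarter-wave resonator can be sketched as the normalized difference between kinetic and potential energy density of each standing wave. The conventions here are assumptions, not read from the slide: the closed (glottis) end is at x = 0, the open (lips) end is at x = L, and a positive value means that a small constriction at that point lowers the formant while a negative value means it raises it.

```python
import math


def quarter_wave_sensitivity(x_over_l, formant=1):
    """Normalized (kinetic - potential) energy density for formant n.

    Assumes a uniform lossless tube closed at x = 0 and open at x = L,
    where pressure ~ cos(k_n x) and velocity ~ sin(k_n x), with
    k_n * L = (2n - 1) * pi / 2. Result lies in [-1, 1].
    """
    k_l = (2 * formant - 1) * math.pi / 2.0
    phase = k_l * x_over_l
    return math.sin(phase) ** 2 - math.cos(phase) ** 2


# Sample F1 and F2 sensitivity at a few positions along the tract.
for pos in (0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0):
    print(round(pos, 2),
          round(quarter_wave_sensitivity(pos, 1), 2),
          round(quarter_wave_sensitivity(pos, 2), 2))
```

Under these conventions every formant has sensitivity +1 at the lips, which matches the familiar observation that narrowing the lip opening lowers all formants.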
35
Sensitivity Functions for the Half-Wave Resonator (Lips Rounded)
(Figure: x axis from 0 to L; constriction locations marked /L,OW/, /UW/.)
36
Formant Frequencies of Vowels
From Peterson & Barney, 1952
37
Summary
  • Acoustic wave equation: easiest to solve in frequency domain, for example
      • Solve two boundary condition equations for P+ and P-, or
      • Solve the two-tube model (four equations in four unknowns)
  • Quarter-Wave Resonator: Open at one end, Closed at the other
      • Schwa or Invv ("a tug")
      • Front cavity resonance of a fricative or stop
  • Half-Wave Resonator: Closed at the glottis, Nearly closed at the lips
      • /uw/
  • Two-Tube Models
      • Exact solution: use reflection coefficient
      • Approximate solution: decouple the tubes, solve separately
  • Helmholtz Resonator
      • When the two-tube model seems to have resonances at 0 Hz, use, instead, the Helmholtz Resonance frequency, computed with low-frequency approximations of acoustic impedance
      • /iy/: F1 is a Helmholtz Resonance
      • /uw/ and /ow/: Both F1 and F2 are Helmholtz Resonances
  • Perturbation Theory
      • Perturbed area → Perturbed formants
      • Sensitivity function explains most vowels and glides in one simple chart