Theoretical Perspectives to Speech Perception

1
Theoretical Perspectives to Speech Perception
  • Mark C. Flynn
  • The University of Canterbury

2
Question to be answered
  • How are we able to perceive the acoustic speech
    wave and transform it into a linguistically coded
    message in our brain?

3
Basic Problem is
  • "Human communication is not just a transfer of
    information like two fax machines connected with
    a wire. It is a series of alternating displays of
    behaviour by sensitive, scheming, second-guessing,
    social animals."
    (Steven Pinker)

4
Spectrogram

5
The speech input consists of
  • Frequency range: 50-5600 Hz
  • Critical band filters
  • Dynamic range: 50 dB
  • Temporal resolution: 10 ms
  • Smallest detectable change in F0: 2 Hz
  • Smallest detectable change in F1: 40 Hz
  • Smallest detectable change in F2: 100 Hz
  • Smallest detectable change in F3: 150 Hz
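
The thresholds on this slide can be read as a simple detectability check. The sketch below is illustrative only: the function names are hypothetical, the thresholds are copied from the bullets, and the Bark conversion uses Traunmüller's (1990) approximation, which is an assumption about how "critical band filters" might be modelled and is not taken from the slides.

```python
# Smallest detectable changes (Hz), copied from the slide's bullets.
JND_HZ = {"F0": 2.0, "F1": 40.0, "F2": 100.0, "F3": 150.0}


def is_detectable(cue: str, old_hz: float, new_hz: float) -> bool:
    """Return True if the change in `cue` reaches the listed threshold."""
    return abs(new_hz - old_hz) >= JND_HZ[cue]


def hz_to_bark(f_hz: float) -> float:
    """Critical-band rate in Bark (Traunmüller 1990 approximation; assumed model)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


if __name__ == "__main__":
    # A 30 Hz shift in F1 falls below the 40 Hz threshold; a 40 Hz shift meets it.
    print(is_detectable("F1", 500.0, 530.0))
    print(is_detectable("F1", 500.0, 540.0))
    # Position of the slide's frequency limits on the Bark scale.
    print(hz_to_bark(50.0), hz_to_bark(5600.0))
```

Treating each threshold as a fixed constant is a simplification; in practice such limits vary with frequency region, level, and listener.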

6
Issues in models of speech perception
  • Bottom-up vs top-down processing
  • Acoustic phonetic invariance
  • Segmentation of the signal into phonetic units
  • Time normalization
  • Talker normalization
  • Lexical representations for optimal search
  • Phonological recoding of words in sentences
  • Dealing with errors in the initial representation
  • Interpretation of prosodic cues

7
Bottom-up processing
  • Peripheral processing
  • Acoustic property detectors
  • Phonetic feature detectors
  • Segmental analysis
  • Lexical search
  • Syntactic and semantic analysis

8
Top-down processing
  • Higher level processing
  • Lexical cues
  • Contextual cues
  • World knowledge
  • Cognitive skills

9
Acoustic phonetic non-invariance
  • Intra-speaker variability
  • Co-articulation (phonetic context)
  • Which acoustic cues are used?
  • How are the acoustic cues combined?

10
Segmentation into phonetic units
  • Segments overlap (co-articulation)
  • Segments are not always well defined acoustically
  • Acoustic segments do not always match phonetic
    segments
  • Errors are difficult to recover from

11
Overlapping speech segments

12
Effects of consonants on vowel transitions

13
Time normalization
  • Segment durations can vary by a factor of two or
    three, depending on
  • Speaking rate
  • Syllable stress
  • Syntactic boundaries
  • Co-articulation
  • Duration is a phonetic cue that interacts with
    the other factors above
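
One way to picture time normalization is to express each segment's duration relative to the average segment duration of the utterance, so that a uniform change in speaking rate cancels out. This is a minimal sketch under that assumption; the function name and the example durations are hypothetical, and the slides do not say which normalization listeners actually use.

```python
from statistics import mean


def rate_normalize(durations_ms: list[float]) -> list[float]:
    """Divide each segment duration by the utterance's mean segment duration."""
    avg = mean(durations_ms)
    return [d / avg for d in durations_ms]


# Hypothetical three-segment utterance spoken slowly and at twice the rate.
slow = [120.0, 240.0, 120.0]
fast = [60.0, 120.0, 60.0]

if __name__ == "__main__":
    # Both versions map to the same relative pattern.
    print(rate_normalize(slow))
    print(rate_normalize(fast))
```

This only removes a global rate difference. Stress, syntactic boundaries, and co-articulation change durations locally, which is why duration remains a cue that interacts with those factors.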

14
Talker normalization
  • Formant frequencies depend on anatomical size
  • Dialects vary
  • Moderate amounts of noise and distortion are
    common but do not affect the accuracy of speech
    perception in normally hearing listeners
  • Indexical communication
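
As an illustration of how talker differences in formant frequencies might be factored out, the sketch below applies Lobanov-style z-scoring within each talker. This technique is not mentioned in the slides; it is swapped in here as one well-known normalization, and the formant values and the 1.2 scaling factor are hypothetical.

```python
from statistics import mean, pstdev


def lobanov(formant_hz: list[float]) -> list[float]:
    """Z-score one talker's values for a single formant (population SD assumed)."""
    mu = mean(formant_hz)
    sigma = pstdev(formant_hz)
    return [(f - mu) / sigma for f in formant_hz]


# Hypothetical F1 values for three vowels from talker A.
talker_a_f1 = [300.0, 500.0, 700.0]
# Talker B: same vowels, all formants scaled up (e.g. a shorter vocal tract).
talker_b_f1 = [f * 1.2 for f in talker_a_f1]

if __name__ == "__main__":
    # After normalization the two talkers' vowel patterns coincide.
    print(lobanov(talker_a_f1))
    print(lobanov(talker_b_f1))
```

The example assumes talkers differ only by a uniform scaling of formants; real talker differences, dialect variation, and indexical information are not captured by this sketch.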

15
Lexical normalization
  • Perceptual units?
  • features
  • phonemes
  • syllables
  • words
  • beyond?

16
Phonological recoding
  • The phonetic representation of a word depends on
    the sentence/word context.
  • For example, /p/ is normally aspirated, but in
    some contexts it is unreleased, as in /apt/, or
    unaspirated, as in /spIt/ (Kess, 1992).

17
Interpretation of prosodic cues
  • Prosodic cues (F0, segmental durations, intensity
    contour) provide syntactic, semantic and
    pragmatic information
  • The prosodic cues modify the acoustic phonetic
    cues
  • How is the prosodic information preserved and
    incorporated?

18
Models of Speech Perception
  • Motor Theory (Liberman et al., 1967; Liberman &
    Mattingly, 1985).
  • TRACE (McClelland & Elman, 1986; Elman, 1989).
  • Neighborhood Activation Model (Luce, 1986; Luce,
    Pisoni, & Goldinger, 1989).
  • Logogen Theory (Morton, 1969, 1979).
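
To give a feel for the Neighborhood Activation Model, the sketch below counts a word's neighbors (words one phoneme substitution, deletion, or addition away) and applies a frequency-weighted choice rule. It is a heavy simplification: the full model also weights each candidate by how confusable its phonemes are with the stimulus, which is omitted here. The toy lexicon, its one-character-per-phoneme spellings, and the frequencies are all hypothetical.

```python
def is_neighbor(a: str, b: str) -> bool:
    """True if b differs from a by one substitution, deletion, or addition."""
    if a == b:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        short, long_ = sorted((a, b), key=len)
        # Dropping one symbol from the longer word must give the shorter one.
        return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))
    return False


def id_probability(target: str, lexicon: dict[str, float]) -> float:
    """Target frequency divided by target plus summed neighbor frequencies."""
    neighbor_freq = sum(f for w, f in lexicon.items() if is_neighbor(target, w))
    return lexicon[target] / (lexicon[target] + neighbor_freq)


# Hypothetical lexicon: one character per phoneme, made-up frequencies.
LEXICON = {"kat": 50.0, "bat": 30.0, "kap": 10.0, "at": 10.0, "dog": 100.0}

if __name__ == "__main__":
    # "kat" competes with three neighbors; "dog" has none in this lexicon.
    print(id_probability("kat", LEXICON))
    print(id_probability("dog", LEXICON))
```

Words in dense, high-frequency neighborhoods receive a lower identification probability in this sketch, which is the qualitative pattern the model is meant to capture.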

21
The Speech Chain

22
Why perception is not all phonetic!
  • Woman: I'm leaving you
  • Man: Who is he?
  • Blundy gets chair for wife
  • Isle of view

23
Predicted vs Actual CNC words.

24
Effect of Context (at 5 dB SNR).
Participants (in order of PTA)

25
Distortion and Audibility.