Title: Visual speech speeds up the neural processing of auditory speech (van Wassenhove, Grant, & Poeppel, 2005)
1 Visual speech speeds up the neural processing of auditory speech
van Wassenhove, V., Grant, K. W., & Poeppel, D. (2005). Proceedings of the National Academy of Sciences, 102(4), 1181-1186.
- Jaimie Gilbert
- Psychology 593
- October 6, 2005
2 Audio-Visual Integration
- Information from one modality (e.g., visual) can influence the perception of information presented in a different modality (e.g., auditory)
- Speech in noise
- McGurk Effect
3 Demonstration of the McGurk Effect
- Audiovisual Speech Web-Lab
- http://www.faculty.ucr.edu/rosenblu/lab-index.html
- Arnt Maasø, University of Oslo
- http://www.media.uio.no/personer/arntm/McGurk_english.html
4 Unresolved questions about AV integration
- Behavioral evidence exists for vision altering the perception of speech, but:
- When does it occur in processing?
- How does it occur?
5 ERPs can help answer the "when" question
- EEG/MEG studies have demonstrated AV integration effects using oddball/mismatch paradigms
- These effects occur around 150-250 ms
- A non-speech ERP study with non-ecologically valid stimuli demonstrated earlier interaction effects (40-95 ms) (Giard & Peronnet, 1999)
- Does AV integration for speech occur earlier than 150-250 ms?
6 There's a debate about the "how" question
- Enhancement: audio-visual integration generates activity at multisensory integration sites, with information possibly fed back to sensory cortices
- vs.
- Suppression: the reduction of stimulus uncertainty by two corresponding sensory stimuli reduces the amount of processing required
7 The Experiments
- 3 experiments were conducted
- Each had behavioral and EEG measures
- Behavioral: forced-choice task
- EEG: auditory P1/N1/P2
- 26 participants
- Experiment 1: 16
- Experiment 2: 10
- Experiment 3: 10 (of the 16 who participated in Experiment 1)
8 The Stimuli
- Audio /pa/
- Audio /ta/
- Audio /ka/
- Visual /pa/
- Visual /ta/
- Visual /ka/
- AV /pa/
- AV /ta/
- AV /ka/
- Incongruent AV: Audio /pa/ + Visual /ka/
- 1 female face/voice for all stimuli
- In Exps. 1 & 2, each stimulus was presented 100 times, for a total of 1,000 trials
9 Experiment 1
- Stimuli presented in blocks of audio, blocks of visual, or blocks of AV (congruent and incongruent)
- Participants knew before each block which stimuli were going to be presented
10 Experiment 2
- Stimuli presented in randomized blocks containing all stimulus types (A, V, congruent AV, incongruent AV) to reduce expectancy
- Task for both experiments: choose which stimulus was presented; for AV, choose what was heard while looking at the face
11 Experiment 3
- Presented 200 incongruent AV stimuli
- Task: choose which syllable you saw, and neglect what you heard
- In all experiments, the correct response to incongruent AV was /ta/
12 Waveform Analysis
- Retained 75-80% of recordings after artifact rejection and ocular artifact reduction
- Only correct responses were analyzed
- 6 electrodes used in analysis: FC3, FC4, FCz, CPz, P7, P8
- Reference electrodes: linked mastoids
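The peak measures these analyses rest on (N1/P2 amplitude and latency) can be sketched as follows. This is an illustrative Python snippet on a simulated waveform, not the authors' pipeline; the Gaussian waveform shape and the search windows are assumptions.

```python
import numpy as np

# Hypothetical averaged ERP at one fronto-central electrode (e.g., FCz),
# sampled at 1 kHz; amplitudes in microvolts. A real analysis would use the
# artifact-cleaned, correct-trial averages described above.
fs = 1000
t = np.arange(0, 0.400, 1 / fs)  # 0-400 ms after sound onset (s)

# Toy waveform with an N1 trough (~100 ms) and a P2 peak (~200 ms)
erp = (-5.0 * np.exp(-((t - 0.100) ** 2) / (2 * 0.020 ** 2))
       + 4.0 * np.exp(-((t - 0.200) ** 2) / (2 * 0.030 ** 2)))

def peak(signal, times, lo, hi, polarity):
    """Return (latency_ms, amplitude) of the extreme point in [lo, hi] s."""
    win = (times >= lo) & (times <= hi)
    idx = np.argmin(signal[win]) if polarity == "neg" else np.argmax(signal[win])
    return times[win][idx] * 1000, signal[win][idx]

n1_lat, n1_amp = peak(erp, t, 0.050, 0.150, "neg")   # assumed N1 window
p2_lat, p2_amp = peak(erp, t, 0.150, 0.250, "pos")   # assumed P2 window
print(f"N1: {n1_amp:.1f} uV at {n1_lat:.0f} ms")
print(f"P2: {p2_amp:.1f} uV at {p2_lat:.0f} ms")
```

Comparing these latency and amplitude values between the AV and A-alone conditions is what yields the facilitation and suppression effects reported in the results.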
13 Results
- This study's answer to "How": the Suppression/Deactivation Hypothesis
- AV N1 and P2 amplitudes were significantly reduced compared to auditory-alone peaks
- A separate analysis determined whether summing the responses to the unimodal stimuli would reproduce the amplitude reduction present in the data; it did not. The AV waveform is therefore not a superposition of the two sensory waveforms, but reflects actual multisensory interaction.
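The superposition check can be illustrated with a toy computation: if the bimodal response were just the sum of the unimodal ones, subtracting A+V from AV would leave essentially nothing. All waveforms below are simulated for illustration; none are the study's data, and the amplitudes are invented.

```python
import numpy as np

t = np.arange(-0.1, 0.5, 0.001)          # time relative to sound onset (s)

def erp(n1_amp, p2_amp):
    """Toy auditory ERP: N1 trough near 100 ms, P2 peak near 200 ms."""
    n1 = n1_amp * np.exp(-((t - 0.10) ** 2) / (2 * 0.02 ** 2))
    p2 = p2_amp * np.exp(-((t - 0.20) ** 2) / (2 * 0.03 ** 2))
    return n1 + p2

a_alone = erp(-5.0, 4.0)                 # simulated auditory-alone response
v_alone = erp(-1.0, 0.8)                 # simulated visual-alone response
av = erp(-3.5, 2.8)                      # simulated bimodal response (reduced peaks)

additive_prediction = a_alone + v_alone  # what pure superposition would predict

# If AV were just A + V, this residual would be ~0 everywhere; a sizable
# residual is the signature of genuine multisensory interaction.
residual = av - additive_prediction
print(f"AV N1 amplitude:   {av.min():.2f} uV")
print(f"A+V predicted N1:  {additive_prediction.min():.2f} uV")
print(f"max |residual|:    {np.abs(residual).max():.2f} uV")
```

Here the simulated AV N1 is both smaller than the A-alone N1 and smaller than the additive prediction, mirroring the reported pattern.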
15 Results: Experiment 1
- N1/P2 amplitude: AV < A (p < .0001)
- N1/P2 latency: AV < A (significant, but confounded by a Modality x Stimulus Identity interaction)
- P < T < K (p < .0001)
- The latency effect is more pronounced in P2, but can occur as early as N1
17 Results: Experiment 2
- N1/P2 amplitude: AV < A (p < .0001)
- N1/P2 latency: AV < A (p < .0001)
- Modality x Stimulus Identity (p < .06)
18 Results: comparison of Exp. 1 & Exp. 2
- Similar results for Exps. 1 & 2
- Temporal facilitation varied by stimulus identity, but amplitude reduction did not
- No evidence for an attention effect (i.e., for expectancy affecting waveform morphology)
19 Temporal facilitation depends on visual saliency/signal redundancy
- More temporal facilitation is expected to occur if:
- The audio and visual signals are redundant
- The visual cue (which naturally precedes the auditory cue) is more salient
- (Figure 3)
20 Results: Experiment 3 / Incongruent AV Stimuli
- Incongruent AV stimuli in Exps. 1 & 2: no temporal facilitation
- Amplitude reduction was present and equivalent to the reduction seen for congruent AV stimuli
- Experiment 3: both temporal facilitation and amplitude reduction occurred
22 Visual speech effects on auditory speech
- Perceptual ambiguity/salience of visual speech affects the processing time of auditory speech
- Incorporating visual speech with auditory speech "reduces the amplitude of N1/P2 independent of AV congruency, participants' expectancy, and attended modality" (p. 1184)
23 Ecologically valid stimuli
- The authors suggest that AV speech processing is different from general multisensory integration because of the ecological validity of speech
24 Possible explanation for amplitude reduction
- Visemes provide information regarding place of articulation
- If this information is salient and/or redundant with auditory place-of-articulation cues (e.g., the 2nd and 3rd formants), the auditory cortex does not need to analyze these frequency regions, resulting in fewer firing neurons
25 Analysis-by-Synthesis Model of AV Speech Perception
- Visual speech activates an internal representation/prediction
- This representation/prediction is updated as more visual information is received over time
- The representation/prediction is compared to the incoming auditory signal
- Residual errors from this matching process are reflected in the temporal facilitation and amplitude reduction effects
- The attended modality can influence temporal facilitation
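The matching idea can be caricatured in a few lines of Python. The viseme likelihoods and the residual measure below are invented for illustration; this is a schematic of the predict-then-compare logic, not the authors' computational model.

```python
# Visual speech arrives first and sets up a prediction over candidate
# syllables; the auditory input is then matched against that prediction.
visual_evidence = {   # hypothetical viseme likelihoods for a visual /pa/
    "pa": 0.80,       # bilabial closure is highly visible, so /pa/ dominates
    "ta": 0.12,
    "ka": 0.08,
}

def residual_error(prediction, heard):
    """Mismatch between the visual prediction and the syllable heard."""
    return 1.0 - prediction.get(heard, 0.0)

# A strong, correct visual prediction leaves a small residual (large
# temporal facilitation); a weak or wrong one leaves a large residual.
for heard in ("pa", "ta", "ka"):
    print(heard, round(residual_error(visual_evidence, heard), 2))
```

On this caricature, a salient viseme such as /pa/ yields the smallest residual, consistent with the slide-19 claim that facilitation tracks visual saliency.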
27 Two suggested time scales for AV integration
- 1. Feature stage
  - 25 ms
  - Latency facilitation
  - (Sub-)segmental analysis
- 2. Perceptual unit stage
  - 200 ms
  - Amplitude reduction
  - Syllable-level analysis
  - Independent of feature content and attended modality
28 Summary
- AV speech interaction occurs by the time N1 is elicited (50-100 ms)
- The processing time of auditory speech varies with the saliency/ambiguity of visual speech
- The amplitude of the AV ERP is reduced compared to the amplitude of the A-alone ERP
29 Questions
- Dynamic visual stimulus and ocular artifact
- If effects of AV integration are influenced by the attended modality, would modality dominance also influence these effects?
- Are incongruent AV/McGurk stimuli ecologically valid?