Title: Aug 10 Outline
1. Aug 10 Outline
- Spoken word recognition
- Evidence for top-down feedback
- TRACE theory
- Cohort theory
- Windmann Presentation
- But isn't word recognition automatic?
- Differences between spoken and written word recognition
2. Evidence for Top-Down influence on speech perception
- Phoneme Restoration Effect (Warren, 1970)
- Lexical bias in categorical perception task, e.g. dype vs. type (Connine & Clifton, 1987)
- Errors made by close shadowers (Marslen-Wilson, 1973)
3. What kinds of Top-Down knowledge can we use for Speech Perception?
- Lexical
- Syntactic and Semantic
- Right-context comes too late, but Left-context might be useful IF our syntactic and semantic processing keeps pace with speech perception.
- The driver turned the eel.
- She saw his/him duck.
4. Marslen-Wilson (1973)
- Speech Shadowing Task
- While listening to continuous speech, repeat it back as rapidly as possible.
- For isolated words or nonsense syllables, RT is about 150-250 ms.
- For continuous prose, shadowing latency is about 500-1500 ms.
- Why different? Maybe because of syntactic and semantic processing for sentences, which requires larger units of processing (e.g. phrase or clause).
- If so, people shadowing at very short latencies should make errors that ignore the syntactic and semantic constraints of the sentence.
- Only distant shadowers will make errors that respect syntactic and semantic constraints.
5. Marslen-Wilson (1973)
- Ran 65 participants in shadowing task and measured average latency.
- 7 participants were close shadowers (< 350 ms).
- Remaining participants had latencies of 500-800 ms.
- Test passage presented over headphones to 7 close and 7 distant shadowers
- 300 words at 160 words/min.
- Average syllable is about 200 ms
- Original passage and shadowing performance recorded on separate tracks of tape recorder.
- 4 closest shadowers had 254-287 ms latencies and made 1.7-6.6% errors
6. Marslen-Wilson (1973)
- Were close shadowers comprehending the input more superficially than distant shadowers?
- No.
- Memory test on 600 word passage showed no reliable correlation between shadowing latency and memory score
- But this could reflect additional processes that lag behind shadowing performance.
- Do close shadowers make different types of errors in their shadowing performance itself?
7. Marslen-Wilson (1973)
- There were 111 constructive errors, in which participants added a real word or changed a word into another real word.
- All but 3 were grammatical and semantically appropriate.
- No qualitative difference between close and distant shadowers; sometimes they made the exact same error
- It was beginning to be light enough so THAT I could see.
- Especially for close shadowers, constructive errors tended to occur at very short latencies, perhaps relying more on predictive top-down cues than bottom-up information.
8. Marslen-Wilson (1973)
- Summary
- Syntactic and semantic information (higher order structure) was available to both close and distant shadowers
- When shadowers made errors, the errors were syntactically and semantically well-formed
- Language comprehension is incremental; DTC cannot be correct
- Syntactic and semantic processing keep pace with speech perception (within a syllable or so)
- Potential source of top-down cues to guide speech perception and spoken word recognition.
9. Connine & Clifton (1987)
- Lexical Bias effect is enhanced by sentential context.
- At her birthday, she received a valuable ?ift.
- The initial sound (?) is ambiguous between /g/ and /t/
- Such top-down effects are clearly consistent with interactive (though underspecified) models like TRACE.
- How would an autonomous (modular) account of speech perception handle this finding?
10. TRACE (McClelland & Elman, 1986)
- At each level, individual nodes (corresponding to features, phonemes, or words) compete for activation.
- Facilitatory activation from bottom-up and top-down sources
- Inhibition from bottom-up, top-down, and lateral sources
- Recognition occurs when network settles into stable state with a clear winner.
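The settling process can be illustrated with a toy competition between two phoneme nodes. This is a minimal sketch and not the TRACE implementation: the function name `settle`, the update rule, and every parameter value (rate, decay, inhibition, input strengths) are assumptions chosen only for illustration. It uses the /g/ vs. /t/ ambiguity from the Connine & Clifton example, giving both nodes equal bottom-up support and giving /g/ some extra top-down (lexical) support.

```python
def settle(bottom_up, top_down=None, cycles=30, rate=0.2, decay=0.1, inhibition=0.5):
    """Toy interactive-activation loop (hypothetical parameters, not TRACE itself).

    bottom_up / top_down map node names to excitatory input strengths.
    Each cycle, a node gains activation from its inputs, loses some to decay,
    and is inhibited in proportion to the activation of its competitors.
    """
    top_down = top_down or {}
    act = {node: 0.0 for node in bottom_up}
    for _ in range(cycles):
        total = sum(act.values())
        new_act = {}
        for node in act:
            excitation = bottom_up[node] + top_down.get(node, 0.0)
            competition = inhibition * (total - act[node])  # lateral inhibition
            value = act[node] * (1 - decay) + rate * (excitation - competition)
            new_act[node] = min(1.0, max(0.0, value))  # keep activation in [0, 1]
        act = new_act
    return act


# Equal acoustic evidence for /g/ and /t/; the word level favors /g/ ("gift").
with_context = settle({"g": 0.5, "t": 0.5}, top_down={"g": 0.2})
without_context = settle({"g": 0.5, "t": 0.5})
print(with_context, without_context)
```

With identical bottom-up input the two nodes stay tied; a small top-down advantage is enough for one node to pull ahead and suppress the other, which is the qualitative pattern an interactive account predicts.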
11. Word-level competition in TRACE
[Figure: word-level activation plotted against processing cycles (time)]
12. Visual Trace Example
13. Cohort (Marslen-Wilson)
- Theory combines initial autonomous stage with secondary interactive stage.
- Word-initial cohort formed solely on the basis of bottom-up acoustic input
- All cohort members are actual words
- Lexical access of candidates
- Words in the cohort are removed on the basis of
- Inconsistency with further acoustic input
- Inconsistency with context
- Word recognition occurs when only one candidate remains
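The two stages listed above can be sketched in a few lines. This is an illustrative toy, not Marslen-Wilson's model: letters stand in for phonemes, and the lexicon, the function name `recognize`, the two-segment onset, and the `fits_context` predicate are all assumptions made for the example.

```python
# Hypothetical mini-lexicon; letters are used as stand-ins for phonemes.
LEXICON = ["stand", "stack", "stamp", "stair", "staff", "steak", "stone", "sting"]


def recognize(heard, lexicon, onset_len=2, fits_context=None):
    """Return the cohort after each increment of input.

    Stage 1 (autonomous): the word-initial cohort is built from the onset alone.
    Stage 2 (interactive): candidates are dropped when they mismatch further
    input or, if a context predicate is supplied, when they do not fit context.
    """
    onset = heard[:onset_len]
    cohort = {w for w in lexicon if w.startswith(onset)}
    trace = [(onset, sorted(cohort))]
    for end in range(onset_len + 1, len(heard) + 1):
        if len(cohort) <= 1:  # recognition: only one candidate remains
            break
        prefix = heard[:end]
        cohort = {w for w in cohort if w.startswith(prefix)}
        if fits_context is not None:
            cohort = {w for w in cohort if fits_context(w)}
        trace.append((prefix, sorted(cohort)))
    return trace


for prefix, candidates in recognize("stand", LEXICON):
    print(prefix, candidates)
```

Passing a `fits_context` function that rejects some candidates lets the cohort shrink to one member on less acoustic input, which is one way to picture a recognition point that precedes the uniqueness point.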
14. Cohort Example: stand
15. Use Gating to find Recognition Point
- Gating study from Zwitserlood (1989)
- People heard successively longer fragments of critical words
- In 3 kinds of context
- Carrier phrase: The next word is kapitein.
- Neutral context: They mourned the loss of their kapitein.
- Biasing context: With dampened spirits the men stood around the grave. They mourned the loss of their kapitein.
- Guessed what the word was
- Recognition point: Point in word where everyone identifies it as the critical word
- Often earlier than uniqueness point
- How much earlier typically depends on degree of contextual constraint
- Get to see what competitors are produced before recognition point
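For comparison with the recognition point, the uniqueness point of a word can be computed directly from a lexicon. The sketch below is only an illustration: it uses spelling in place of phonemes, English stand-ins rather than the Dutch items, and a made-up five-word lexicon; the function name `uniqueness_point` is likewise an assumption.

```python
def uniqueness_point(word, lexicon):
    """Length of the shortest prefix of `word` shared by no other lexicon entry.

    Returns None when the word never becomes unique (for example, when it is
    itself the beginning of a longer word in the lexicon).
    """
    others = [w for w in lexicon if w != word]
    for n in range(1, len(word) + 1):
        prefix = word[:n]
        if not any(other.startswith(prefix) for other in others):
            return n
    return None


# Hypothetical mini-lexicon for illustration only.
TOY_LEXICON = ["captain", "captive", "capital", "capsule", "cabin"]
for w in TOY_LEXICON:
    print(w, uniqueness_point(w, TOY_LEXICON))
```

In a gating study the recognition point is measured from listeners' guesses, so it can fall earlier than this lexicon-based value when context rules out competitors.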
16. Zwitserlood (1989)
- Evidence for parallel lexical activation of cohort members
- Present participants with /kæpt/, which is ambiguous between captain and captive
- Experiments were conducted in Dutch, so modified here slightly to work in English
- Then present a word related to either of those continuations, like ship and guard
- Both ship and guard are recognized fast, compared with an unrelated control word. Indicates access to semantics for both cohort members
- True even in biasing sentence context, so top-down context did not prevent lexical access of cohort candidates!
- Example of semantic priming
17. Priming paradigm
- Name (or make a lexical decision, LDT, to) red stimulus (i.e., target).
- prime word: CAT
- target word: CAT
Repetition Priming: Faster to name/LDT target after same-word prime than after any other kind of prime.
Semantic Priming: CAT is faster after related prime (DOG) compared to unrelated prime (DOT)
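A priming effect is usually reported as a difference between mean response times. The sketch below shows that arithmetic only; the RT values are invented for illustration and the function name `priming_effect` is an assumption, not something taken from the studies cited here.

```python
from statistics import mean


def priming_effect(rt_related, rt_unrelated):
    """Mean RT after unrelated primes minus mean RT after related primes (ms).

    A positive value means the related prime sped up responses to the target.
    """
    return mean(rt_unrelated) - mean(rt_related)


# Made-up RTs (ms) to the target CAT after DOG (related) vs. DOT (unrelated).
after_dog = [510, 530, 520]
after_dot = [560, 590, 575]
print(priming_effect(after_dog, after_dot))
```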
18. Implications of Cohort
- Special role for word-onset
- Recognition point can precede end of word
- An infelicitous word might not be accurately recognized
- I mailed the letter w/o a STEAK.
- Can account for most top-down effects
- But not word-initial phoneme restoration
19. TRACE vs. Cohort
- Cohort focuses specifically on the word level, whereas TRACE models feature and letter/phoneme identification as well.
- A later, connectionist version of Cohort incorporates speech perception and addresses shortcomings of the original Cohort model (Gaskell & Marslen-Wilson, 1997)
- Both theories allow for top-down effects on spoken word recognition
- TRACE is fully interactive; Cohort has an initial autonomous stage
- Cohort depends upon clear phonological input at word onset, for activation of cohort.
- TRACE allows for graded activation based on shared features
- ba activates papa as well as /b/ words.
- TRACE allows for activation of rhyming words
- ball partially activates fall and call
20. Windmann Presentation
21. Take-Home Points
- Speech perception is fast and many aspects of it seem to be automatic and feed-forward.
- Yet when bottom-up input is ambiguous, noisy, or conflicted, top-down knowledge can influence the final percept, and perhaps the initial percept.
- Sentence-level syntactic and semantic processing keeps pace with speech perception, lagging by no more than a syllable or two.
- Unit of syntactic analysis during comprehension is the word, not the sentence: build parse tree incrementally.
22. Is lexical access Automatic and Modular?
- Automatic Processes
- Fast
- Do not require attention
- Feed-forward (can't be guided, controlled, or stopped midstream)
- Not subject to top-down feedback (informational encapsulation)
23. Priming paradigm
- Name (or make LDT to) red stimulus (i.e., target).
- prime word: CAT
- target word: CAT
Repetition Priming: Faster to name/LDT target after same-word prime than after any other kind of prime.
Subliminal Priming: Even if prime is presented too quickly for conscious awareness
24. Stroop Effect
RED GREEN BLUE YELLOW GREEN
What happens if you have to name the word?
25. Stroop Effect
- When font color conflicts with word itself, we are slower and less accurate to name the font color.
- Recognition of word interferes with naming color of letters.
- No such interference from font color if task is to name the word.
- Word recognition is fast and feed-forward: we can't stop recognizing the word, even when doing so is detrimental to task performance.
26. Is lexical access sensitive to top-down context?
- Maybe not.
- Zwitserlood (1989) found that cohort members were activated, even if they were inconsistent with the semantic context.
- Context did have an effect, but it was after the initial bottom-up activation of cohort members.
27. A Puzzle
- Lexical Access seems like an automatic, feed-forward, bottom-up process.
- Speech perception seems quite sensitive to top-down context effects.
- Can both of these be true?
- Is lexical access really more interactive than it appears?
- Is speech perception really more bottom-up than it appears?
28. Word Recognition Across Modalities
- Production
- Spoken vs. Written
29. Lexical Access in Language Production
- Levels of Processing
- Concept selection
- Word selection
- Phonological and phonetic encoding
- Construction of motor plan
- Articulation
- Is this bottom-up or top-down processing?
- Describe the Stroop effect in terms of these levels of processing.
- Describe Ashcroft's deficit in terms of these levels of processing.
30. Differences between spoken and written word recognition
- For relatively short words, letters in a written word are processed in parallel
- Eye movement data
- Word superiority effect
- Letter-Search Task
- Spoken word unfolds across time
- Can recognize some words before they are completely pronounced.
31. Eye Tracking
32. Word Superiority Effect (Cattell, 1886; Reicher, 1969)
- Present stimulus for brief (near threshold) interval on a tachistoscope (T-scope). Is the (final) letter a D or a K?
It is easier to recognize a letter when it is in a word, compared to a non-word or isolation.
OWRK
K
WORK
Is the word easier, due to guessing?
33. Visual Trace Example
Equal bottom-up support for R and K, but R wins due to top-down support from word level.
34. How many instances of the letter t in the first sentence?
35. Which letter t's do people miss?
36. Implications
- Word Superiority effect
- Letter Search Task
- Do we recognize a word by recognizing each of the letters?
- Does word recognition facilitate letter recognition?
- What is the role of top-down and bottom-up processing in these tasks?
37. Letter Recognition in Words
- Just like for phoneme perception in spoken words, there is a great deal of evidence that word and letter perception are intertwined in visual word recognition.
- We may recognize the word faster than we can recognize each of the letters, providing the opportunity for top-down processing from word to letter.
38. A Psycholinguistic Hoax
- Aoccdrnig to rscheearch at Cmabrigde
Uinervtisy, it deosn't mttaer in waht oredr the
ltteers in a wrod are, the olny iprmoetnt tihng
is taht the frist and lsat ltteer be at the rghit
pclae.The rset can be a total mses and you can
sitll raed it wouthit a porbelm. Tihs is bcuseae
the huamn mnid deos not raed ervey lteter by
istlef, but the wrod as a wlohe.amzanig huh?
- Can we take this at face value? Is the order of intermediate letters really irrelevant? Do the number and identity of intermediate letters matter?
- How do we notice typos such as transposed letters?
- How do we realize we're reading novel words?
- How do we distinguish skates from steaks?
39. Tasks for studying Word Recognition
- Words in Isolation
- Naming
- LDT
- Words in Context
- Eye-tracking during reading
- Priming (often cross-modal)
40. Some Basic Findings about Word Recognition
- Frequency influences RT in naming and LDT, and gaze duration in eye-tracking
- LDT slow for word-like non-words
- Priming (Repetition, Semantic, etc.)
- Subliminal priming demonstrates that word recognition (WR) doesn't require attention
- High-level context effects???
- Faster to recognize word in congruent context?
- Slower in incongruent context?
41. Experimental Design
- Balota et al. (2004)
- Factorial designs
- Very common
- Many important findings
- Limitations
- (large-scale) Regression studies
- Increasingly popular in word recognition lit
42. Experimental Design
- Factorial
- Item factors manipulated categorically
- E.g. frequency or contextual bias split into high and low conditions
- ANOVA
- Main effects, interactions
- If there is a main effect (e.g. of frequency on naming latency), it suggests that that factor (frequency) impacts lexical access
43. Example Experiment: Factorial LDT
- Hypothesis: High frequency words will be recognized faster than low frequency words.
- Null Hypothesis: No effect of frequency on word recognition
- Dependent Measure: time to say "yes", measured from onset of visually presented word.
- Participants: 24 college students who are native speakers, with normal vision and no reading problems.
44. Stimuli
- Critical Trials: 2 levels of Frequency
- 20 high frequency words, ranging from 75 to 300 tokens per million words
- 5-8 letters in length
- 20 low frequency words, ranging from 1 to 15 tokens per million words.
- 5-8 letters in length
- Filler Trials
- 20 words
- 5-8 letters in length
- 60 nonwords
- 5-8 letters in length
- All are pronounceable and word-like
45. Analysis of Variance
- For each participant, measure average latency on high frequency trials and average latency on low frequency trials.
- Is there a main effect of frequency?
- F ratio = variance between conditions / variance within conditions
- p = probability of obtaining an F at least this large if the null hypothesis is true, given the degrees of freedom in your study
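Because each participant contributes a mean for both frequency conditions, one way to compute the F ratio is a by-participants repeated-measures ANOVA. The sketch below makes that assumption explicitly: the error term is the condition-by-participant residual, the function name `repeated_measures_f` is hypothetical, and the RTs are invented. It stops at F and the degrees of freedom, since turning F into p needs the F distribution, which the standard library does not provide.

```python
def repeated_measures_f(scores):
    """One-way repeated-measures F ratio.

    scores[p][c] is the mean RT of participant p in condition c.
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[c] for row in scores) / n for c in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Variability between condition means.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    # Residual variability after removing participant and condition effects.
    ss_error = sum(
        (scores[p][c] - subj_means[p] - cond_means[c] + grand) ** 2
        for p in range(n)
        for c in range(k)
    )
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error


# Made-up data: [high-frequency mean RT, low-frequency mean RT] per participant.
data = [[500, 550], [520, 575], [540, 590]]
print(repeated_measures_f(data))
```

With only two conditions this F equals the square of the paired t statistic computed on the per-participant differences.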
46. 2 by 2 Factorial Design
- Hypothesis: Frequency effect is larger for long words than for short words
- Stimuli (4 critical conditions, 2 factors)
- Short, High freq words
- Short, Low freq words
- Long, High freq words
- Long, Low freq words
- Predicting an interaction between our 2 factors
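The predicted interaction amounts to a difference of differences across the four cells. The sketch below shows only that arithmetic; the cell means are invented and the function names are hypothetical. A significance test of the interaction would still require the ANOVA.

```python
def frequency_effect(cell_means, length):
    """Low-frequency mean RT minus high-frequency mean RT at one word length."""
    return cell_means[(length, "low")] - cell_means[(length, "high")]


def interaction_contrast(cell_means):
    """Frequency effect for long words minus frequency effect for short words.

    A positive value matches the hypothesis that frequency matters more
    for long words; zero means the two factors are additive.
    """
    return frequency_effect(cell_means, "long") - frequency_effect(cell_means, "short")


# Made-up cell means (ms) for the 2 x 2 design.
means = {
    ("short", "high"): 520,
    ("short", "low"): 560,
    ("long", "high"): 540,
    ("long", "low"): 620,
}
print(frequency_effect(means, "short"), frequency_effect(means, "long"),
      interaction_contrast(means))
```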
47. Limitations of Factorial Designs
- Hard to manipulate one factor while holding all other variables constant
- E.g., length, regularity, imageability, and age of acquisition are all correlated with frequency
- If we don't control for imageability, it could be confounded with frequency. If so, our frequency effect might really be an imageability effect.
- Words are not randomly selected
- Though this assumption is implicit in ANOVA
- Researchers may be using intuitions to select subsets of words that are recognized fast/slow due to variability on dimensions not intentionally manipulated.
48. Limitations of Factorial Designs
- Unwanted list-context effects
- Related to non-random sampling
- Experimental stimuli may lead participants to expect certain types of words
- Categorizing continuous variables decreases statistical power (sensitivity)
- More informative to know how much a factor influences word recognition rather than simply that the factor has an impact
49. Balota et al. Regression Study
- Goals
- What is the best way to measure frequency?
- What is the independent contribution of theoretically interesting predictor variables?
- How much variance can each explain?
- Does importance of predictor variable differ for naming and lexical decision?
- Does it differ for younger (mean age 20) and older (74) adults?
- 50 more years of practice
- Cognitive declines in late adulthood
50. Balota et al. Stimuli
- Critical Stimuli: All monosyllabic, monomorphemic words from million-word, balanced corpus (Kucera & Francis, 1967).
- 2,428 words with high accuracy in analyses
- Each word coded for various types of frequency, length, and many other variables.
- LDT version has an equal number of nonwords created by changing 1-3 letters of real words
51. Naming RT was not very predictive of LDT RT
52. Naming RT was not very predictive of LDT RT
53. Young RT predicts Old RT fairly well
54. Young RT predicts Old RT fairly well
55. Interim Summary
- For a given word, RT in naming is not a very good predictor of RT in LDT
- Suggests that some predictor variables contribute more to naming, and some contribute more to LDT
- For a given word, RT by young adults is a pretty good predictor of RT by older adults, regardless of task
- Older adults are slower, but performance may be influenced by same predictor variables as young adults
56. Mean and SD
Young adults were faster and less variable. Overall, predictor variables accounted for about 50% of variance in young, 40% of variance in old.
57. Frequency explains more variance in LDT than in Naming
They like Zeno (17 million) corpus, which does pretty well. Other large-scale studies have found that spoken language corpora do better than text corpora (all these are text).
Most common measure does poorly (1 million words)
58. Regression Analysis
- Surface Predictors
- Phonetic features of onset phoneme coded as 1 (present) or 0 (absent)
- Bilabial, dental, fricative, voiced
- Important for naming, because of voice key sensitivity
- Lexical Predictors
- Semantic Predictors
59. Regression Analysis
- Surface Predictors
- Lexical Predictors
- Length in letters (2-8)
- Neighborhood size (number of other words that differ only by 1 letter)
- Objective frequency (Zeno norms)
- Subjective frequency (Balota familiarity ratings)
- Consistency (4 types of spelling-sound correspondence)
- Semantic Predictors
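Neighborhood size as defined above (other words that differ by exactly one letter) is easy to compute from a word list. This is a minimal sketch that counts same-length, one-substitution neighbors only; the lexicon is invented and the function name `neighborhood_size` is an assumption.

```python
def neighborhood_size(word, lexicon):
    """Count lexicon entries of the same length that differ from `word` by one letter."""
    def one_letter_apart(a, b):
        return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

    # set() removes duplicate entries so each neighbor is counted once.
    return sum(one_letter_apart(word, other) for other in set(lexicon))


# Hypothetical mini-lexicon for illustration only.
toy_lexicon = ["cat", "cot", "cut", "bat", "can", "dog", "cart"]
print(neighborhood_size("cat", toy_lexicon))
```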
60. Regression Analysis
- Surface Predictors: step 1
- Lexical Predictors: step 2
- Semantic Predictors: step 3
- Nelson's set size of associates in free association task
- Imageability rating
- Wordnet connectivity (Miller)
- Many of the predictor variables are related to one another (e.g., longer words have smaller neighborhoods), so analysis must partial out shared variance.
- I will focus on RT-by-items analysis (ignoring Accuracy and subject-level analyses)
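Entering predictors in steps and asking how much R² rises at each step is one common way to partial out shared variance. The sketch below illustrates that logic with ordinary least squares written in plain Python. It does not reproduce Balota et al.'s analysis: the eight items, their predictor values and RTs are invented, and the helper names (`solve`, `r_squared`, `incremental_r2`) are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, y):
    """R^2 from regressing y on the predictor columns plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in col] for col in predictors]
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(bk * col[i] for bk, col in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def incremental_r2(steps, y):
    """For each (name, columns) step, report the R^2 gained beyond earlier steps."""
    entered, previous, gains = [], 0.0, []
    for name, columns in steps:
        entered.extend(columns)
        current = r_squared(entered, y)
        gains.append((name, current - previous))
        previous = current
    return gains


# Made-up item-level data for eight words (illustration only).
onset_voiced = [1, 0, 1, 0, 0, 1, 1, 0]
length = [3, 4, 5, 3, 4, 5, 6, 4]
log_freq = [2.0, 1.5, 0.5, 2.5, 1.0, 0.8, 0.3, 1.9]
imageability = [5, 3, 4, 6, 2, 5, 3, 4]
rt = [520, 560, 640, 500, 600, 610, 680, 545]

steps = [
    ("surface", [onset_voiced]),
    ("lexical", [length, log_freq]),
    ("semantic", [imageability]),
]
for name, gain in incremental_r2(steps, rt):
    print(name, round(gain, 3))
```

Because later steps are credited only with variance that earlier steps leave unexplained, the order of entry matters: a semantic predictor that is correlated with frequency gets a smaller share at step 3 than it would if entered first.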
61. Regression Analysis
Phono onset matters a lot for Naming, especially for young. Length also matters more for Naming.
Freq matters more for LDT
62. Cost of length higher for low frequency words
Interaction
63. Implications of Length Effects
- Coltheart et al. (2001) predict a length by lexicality interaction, with non-words showing a greater effect of length.
- They also predict the length by frequency interaction observed by Balota et al.
- Motivated by the Dual Route Model of Visual Word Recognition
64. Dual Route Model
- Two pathways for lexical access
- Lexical route: proceeds directly from orthography to lexicon
- Available to well-known words
- Preferred for irregularly spelled words
- May dominate in languages with little spelling-sound consistency
- Sublexical route: graphemic form converted to phonological representation BEFORE lexical access (each grapheme is assigned a pronunciation by mapping to a phoneme)
65. Some semantic effects (esp. LDT) after partialling out phono and lexical effects
Meaning probably plays a stronger role in the LDT compared with Naming
66. Summary of Balota et al.
- Large-scale regression study replicated many effects established by factorial studies
- PLUS power to detect many small effects, such as the influence of imageability on naming RT
- while overcoming limitations associated with small item sets
- Clarified unique contributions and interactions of specific variables
- Allowed careful examination of
- task differences between naming and LDT
- age differences between younger and older adults
67. Bilingual Word Recognition
68. Assumptions about the Lexicon
- Storehouse of knowledge about all the words you know
- Organized phonologically
- Word-initial cohort together
- Distinct from Semantic Memory
- Are bilinguals any different from monolinguals?
69. How is bilingual memory organized?
- General agreement on the separation of lexicon(s) and semantic memory.
- Dog and chien access same semantic network, because both prime cat in French-English bilinguals.
- Whether there is a distinct lexicon for each language is controversial
70. Why study the bilingual lexicon?
- Not really a special case, worldwide
- Mapping between words and meanings
- Mapping between phonology and words/meanings
- Are words from multiple languages in word-initial cohort?
- If not, can we also limit by topic domain, e.g., no physics words in history class?
71. Two Opposite Hypotheses
- Bilinguals have 2 distinct lexicons; tri-linguals have 3, and so on.
- Everyone has a single lexicon
- How do we keep languages straight?
73. Possible Evidence for Separate Lexicons
- Lack of repetition priming across languages: chien doesn't prime dog like dog primes dog.
- But couch doesn't prime sofa like sofa primes sofa either
74. Possible Evidence for Separate Lexicons
- Release from proactive interference (PI)
- In a single language
- It is difficult to recall an item that occurs late on a list when it is preceded by a lot of similar items. The earlier items cause proactive (as opposed to retroactive) interference.
- apple, pear, peach, orange, pineapple
- As the list increases in length, likelihood of remembering a late-occurring item decreases, unless it is from a new semantic category
- apple, pear, peach, orange, fireman
- The same release from PI occurs with a language change
- pear, peach, orange, pineapple, manzana
75. Possible Evidence for Combined Lexicon
- In comprehension, word-initial cohort includes candidates from both languages
- In production, code-switching mid-sentence
- Just use the best word, regardless of language?
- But only if your listener knows both languages too!
- Lexical access vs. lexical selection
76. What about Cog-Neuro evidence?
- Patterns of aphasia in bilingual and multi-lingual speakers
- Pre-operative brain stimulation
- Imaging (PET, fMRI) and ERP studies
77. Patterns of Recovery in Aphasia
- Fabbro (1999)
- 40%: L1 and L2 recover in parallel
- 32%: L1 > L2
- 28%: L2 > L1
78. Pre-op electrical stimulation (Ojemann)
79. Imaging Dutch-French-English tri-linguals in Belgium (Vingerhoets et al., 2003)
- Picture naming, word fluency and paragraph comprehension tasks
- All tasks revealed predominantly overlapping regions for the 3 languages
- L2s show activation in more areas and more extensive recruitment of areas activated by L1 (Dutch)
Word Fluency Task: Covertly generate as many words as possible beginning with a specified letter.
80. Lexical Access in Speaking
- There is currently enthusiasm for single-store models or partially-overlapping lexicons.
- When preparing an utterance for output, how then does the bilingual activate only words from a single language?
- automatic, parallel access: concept to words
- deliberate selection mechanism: best word
81. Some insights from Bilinguals
- More evidence for separation of semantic memory and lexicon
- More evidence for automaticity of lexical access in both comprehension and production
- Distinction between lexical access and lexical selection
- So we may activate physics words in history class, and then filter them out
82. Brain potential and functional MRI evidence for how to handle two languages with one brain