Stressing what is important: Orthographic cues and Lexical Stress Assignment

1
Stressing what is important: Orthographic cues
and Lexical Stress Assignment
  • Nada Ševa
  • University of York, UK
  • Padraic Monaghan
  • Lancaster University, UK
  • Joanne Arciuli
  • Charles Sturt University, Australia

2
Previous models of reading in English
  • Dual-route cascaded (DRC) model
  • (Coltheart, 2000; Coltheart, Rastle, Perry,
    Langdon, & Ziegler, 2001)
  • rule-based model (grapheme-to-phoneme
    correspondence (GPC) rules for novel words)
  • Connectionist models
  • (Harm & Seidenberg, 1999, 2004; Plaut,
    McClelland, Seidenberg, & Patterson, 1996;
    Seidenberg & McClelland, 1989)
  • triangle model (Harm & Seidenberg, 2004):
    interaction between orthography, phonology and
    semantics
  • Connectionist Dual Process (CDP) model (Perry,
    Ziegler, & Zorzi, 2007)

3
  • Problems
  • Only monosyllabic words
  • - There are only approx. 8,500 monosyllabic words
    in English and over 50,000 polysyllabic words
  • - Extension to other languages
  • Increased complexity in grapheme-to-phoneme
    coding in polysyllabic words
  • hothouse
  • Stress assignment

4
  • Stress and spoken word processing
  • lexical access (van Donselaar et al., 2005;
    Soto-Faraco et al., 2001)
  • the division of words into sublexical units such
    as onset-rime (Goswami, 2003; Wood, 2006)
  • word, phrase, sentence boundaries (Cutler et al.,
    1997; Sebastián-Gallés & Costa, 1997)

5
  • Stress and written word processing
  • Stress sensitivity facilitates learning to read
    (Wood & Terrell, 1998; Wood, 2006) and stress
    assignment in second language learning
    (Wade-Woolley et al., 2004; Goetry et al., 2006)
  • Stress representation is activated during silent
    reading (Ashby & Clifton, 2005)

6
  • Nature of the stress representation?
  • Current theories of word production state that
    lexical stress is part of the metrical
    representation, which is retrieved or computed
    in parallel with phonological encoding (Caramazza,
    1997; Levelt, Roelofs, & Meyer, 1999; Schiller,
    2006).
  • Reading and stress assignment in languages with
    non-fixed stress placement (English, Dutch,
    Italian)?
  • English
  • ZEbra (trochaic) vs. giRAffe (iambic)
  • 70% vs. 30%

7
  • Rastle & Coltheart (2000) proposed a system of
    sublexical rules that translates an orthographic
    representation into both the segmental and the
    suprasegmental parts of the phonological
    representation.

8
  • Rastle & Coltheart (2000) model
  • a) Forms part of the Dual-route Cascaded
    (DRC) model of reading (Coltheart et al., 2001)
  • b) Draws on linguistic analyses of stress patterns
    in English by Fudge (1984) and Garde (1968)
  • 54 beginnings and 101 endings (most of them
    morphemes in English) that could influence the
    placement of stress

9
Steps in the algorithm
  • 1) identification of predefined beginnings and
    then endings
  • 2) translation of the remaining parts of the word
    into a phonological representation using
    grapheme-to-phoneme (GPC) rules, plus a set of
    additional rules for correcting illegal
    phoneme combinations
  • 3) stress assignment based on the stored affix
    stress position and the quality of the vowels
    (presence of schwa)
  • 4) if no prefix or suffix is identified,
    application of first-syllable stress as the
    default stress position
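The control flow of these four steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the two affix tables are small hypothetical stand-ins for the 54 beginnings and 101 endings, the rule that an ending takes priority over a beginning is an assumption of the sketch, and the GPC translation and schwa check from steps 2 and 3 are left out.

```python
# Toy sketch of the stress-assignment steps. The tables are hypothetical
# examples, NOT the affix lists used by Rastle & Coltheart (2000).
HYPOTHETICAL_BEGINNINGS = {"con": 2, "ab": 2}      # affix -> stressed syllable
HYPOTHETICAL_ENDINGS = {"een": 2, "oon": 2, "ness": 1}


def assign_stress(word: str) -> int:
    """Return the stressed syllable (1 or 2) for a disyllabic letter string."""
    word = word.lower()

    # Step 1: look for a predefined beginning, then a predefined ending.
    beginning = next(
        (b for b in HYPOTHETICAL_BEGINNINGS if word.startswith(b)), None
    )
    ending = next((e for e in HYPOTHETICAL_ENDINGS if word.endswith(e)), None)

    # Step 2 (GPC translation of the remainder) is omitted in this sketch.

    # Step 3: use the stored affix stress position. Letting the ending win
    # over the beginning is an assumption made here for simplicity; the
    # vowel-quality (schwa) check is not modelled.
    if ending is not None:
        return HYPOTHETICAL_ENDINGS[ending]
    if beginning is not None:
        return HYPOTHETICAL_BEGINNINGS[beginning]

    # Step 4: no affix identified, so default to first-syllable stress.
    return 1
```

With these toy tables, a string with no listed affix falls through to the first-syllable default, while a string ending in -een is assigned second-syllable stress.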
10
  • Correct stress assignment for 89.7% of English
    disyllabic words from the CELEX database (Baayen
    et al., 1993).
  • Nonword test
  • 210 nonwords: 115 trochaic and 95 iambic
  • 15 subjects assigned stress position in a
    reading-aloud task
  • 84.8% correct stress assignment on the nonword
    test.

11
  • Problems?
  • Is this really a sublexical procedure, given the
    role of affixes in the stress assignment process?
  • What is the role of orthography?

12
  • Connectionist account?

13
  • The statistical regularities with respect to
    stress assignment could be learned in the same
    way as the regularities in the
    orthography-to-phonology mapping (Harm &
    Seidenberg, 1999, 2004; Plaut et al., 1996;
    Seidenberg & McClelland, 1989).

14
  • Distributional cues
  • general (trochaic words are more frequent)
  • nouns (trochaic) vs. verbs (iambic) (Kelly &
    Bock, 1988; Sereno, 1986)
  • Phonological cues
  • the rime: reduced vowels are unstressed, and
    syllables with consonant clusters in the coda
    are stressed (Chomsky & Halle, 1968)
  • the onset: consonant clusters (Kelly, 2004)
  • Orthographic cues
  • length and complexity of beginnings and
    endings, the identity of letters (both consonants
    and vowels) (Arciuli & Cupples, 2006, in press;
    Kelly, Morris, & Verrekia, 1998)

15
  • Experimental studies have demonstrated that
    readers are sensitive to such phonological,
    orthographic and distributional cues present in
    the input (Arciuli & Cupples, 2006, in press;
    Colombo, 1992; Kelly & Bock, 1988; Kelly et al.,
    1998)

16
Corpus analyses of orthographic cues
To what extent can beginnings and endings predict
stress position?
17
Corpus analyses of orthographic cues
  • Disyllabic words from CELEX
  • Words with distinct orthography and/or
    pronunciation and/or grammatical category count
    as separate words.
  • All words
  • 18,571 1st syllable stress, 2,387 2nd syllable
    stress
  • Lemma analyses (no inflectional morphology)
  • 9,485 1st syllable stress, 1,813 2nd syllable
    stress
  • Monomorphemic analyses (no inflectional or
    derivational morphology)
  • 2,420 1st syllable stress, 375 2nd syllable stress
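These counts imply a strong baseline: always guessing first-syllable stress would already be right for roughly 84% to 89% of the words, depending on the word set. A small sketch of that arithmetic, using only the counts given on this slide (the function and dictionary names are illustrative):

```python
# (first-syllable, second-syllable) counts taken from the slide.
COUNTS = {
    "all words": (18571, 2387),
    "lemmas": (9485, 1813),
    "monomorphemic": (2420, 375),
}


def trochaic_share(first: int, second: int) -> float:
    """Proportion of words with first-syllable stress."""
    return first / (first + second)


for name, (first, second) in COUNTS.items():
    share = 100 * trochaic_share(first, second)
    print(f"{name}: {share:.1f}% first-syllable stress")
```

Any cue-based classifier therefore has to be judged against this majority-class baseline rather than against 50%.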

18
Analysis
  • Discriminant analysis
  • used to determine which variables discriminate
    between trochaic vs. iambic words.
  • Type and token analysis (weighted by frequency)
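The difference between the type and the token analysis can be illustrated with a short sketch. It is not the discriminant analysis itself; it only shows how the same set of classifications is scored once per word (type) or weighted by word frequency (token). The three-item data set is hypothetical.

```python
def accuracy(items, weighted=False):
    """Score (predicted, actual, frequency) triples.

    Type accuracy counts each word once; token accuracy weights each
    word by its frequency.
    """
    total = 0.0
    correct = 0.0
    for predicted, actual, freq in items:
        weight = freq if weighted else 1
        total += weight
        if predicted == actual:
            correct += weight
    return correct / total


# Hypothetical mini-lexicon: (predicted stress, actual stress, frequency).
items = [(1, 1, 900), (1, 2, 50), (2, 2, 50)]

type_acc = accuracy(items)                   # each word counts once
token_acc = accuracy(items, weighted=True)   # frequent words dominate
```

In this toy case the one misclassified word is rare, so the token score is higher than the type score.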

19
Beginnings and endings
  • Beginning cue
  • Orthography up to and including first vowel (as
    in Arciuli & Cupples, 2006)
  • 789 distinct beginnings
  • Ending cue
  • Orthography from final vowel onwards
  • 1411 distinct endings
  • E.g.
  • penguin: pe-, -uin
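One way to read these two definitions is sketched below. The details are assumptions chosen so that the penguin example comes out as pe- and -uin: the beginning stops at the first vowel letter, the ending starts at the last run of vowel letters, and y is treated as a consonant. The function name is illustrative, and the original analyses may have handled cases such as final silent e differently.

```python
import re

VOWELS = "aeiou"  # assumption: y is treated as a consonant


def extract_cues(word: str) -> tuple[str, str]:
    """Return (beginning, ending) orthographic cues for a word.

    beginning: letters up to and including the first vowel letter.
    ending: letters from the start of the final vowel-letter run onwards.
    """
    word = word.lower()
    first_vowel = re.search(f"[{VOWELS}]", word)
    if first_vowel is None:
        # No vowel letters at all: fall back to the whole string.
        return word, word
    vowel_runs = list(re.finditer(f"[{VOWELS}]+", word))
    beginning = word[: first_vowel.end()]
    ending = word[vowel_runs[-1].start():]
    return beginning, ending


beginning, ending = extract_cues("penguin")
print(f"{beginning}-, -{ending}")
```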

20
Results: All Words
  • Token
  • Type

21
Results: Lemmata
  • Token
  • Type

22
Results: Monomorphemes
  • Token
  • Type

23
  • The Educator's Word Frequency Guide (Zeno et al.,
    1995).
  • A quantitative summary of the printed vocabulary
    encountered by students in American schools.
  • 60,527 samples of text from over 6,000 textbooks,
    works of literature, and popular works of fiction
    and nonfiction.
  • From grade 1 (age 5) to college.

24
Results: Tokens
25
Educator's WFL vs. CELEX
26
  • There is a large amount of potential information
    in orthographic beginnings/endings
  • This goes well beyond morphemes
  • Most beginnings/endings were not morphemes
  • For all analyses, classification was better from
    endings than from beginnings (more so for children
    than for adults)

27
Modelling
  • Architecture

28
  • 25,016 English disyllabic words
  • CELEX lexical database (Baayen et al., 1993)
  • 83% trochaic, 17% iambic
  • learning rate: 0.005
  • alignment: left
  • 5 million presentations of words, selected
    according to their log-compressed frequency
  • 20 simulations
  • 90% training, 10% testing, randomly selected
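The data handling described here (a random 90/10 split and frequency-based selection of training words) might look roughly like the sketch below. The network itself is not shown. The function names are hypothetical, and log(1 + f) is only an assumed form of the log compression, since the slide does not give the exact formula.

```python
import math
import random


def log_weights(freqs):
    """Log-compress raw frequencies; log(1 + f) is an assumed form."""
    return [math.log(1 + f) for f in freqs]


def split_and_sample(words, freqs, n_presentations, seed=0):
    """Hold out 10% of words for testing, then draw training presentations
    with probability proportional to log-compressed frequency."""
    rng = random.Random(seed)
    indices = list(range(len(words)))
    rng.shuffle(indices)

    n_test = len(words) // 10  # 10% testing, the rest training
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train_words = [words[i] for i in train_idx]
    test_words = [words[i] for i in test_idx]
    weights = log_weights([freqs[i] for i in train_idx])

    # Sampling with replacement: frequent words are presented more often,
    # but far less disproportionately than under raw frequency.
    presentations = rng.choices(train_words, weights=weights, k=n_presentations)
    return train_words, test_words, presentations
```

Running 20 simulations would then correspond to calling this with 20 different seeds, so that each run gets its own random split.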

29
d2.1
d2.6
d3.8
30
  • nouns vs. verbs: CONtrast as a noun versus
    conTRAST as a verb
  • overgeneralization errors
  • ab-: about, above, abroad (second syllable)
  • CELEX
  • 60 ab- (51897) 2nd syllable stress,
  • 21 ab- (7708) 1st syllable stress
  • error: abject
  • evenly distributed errors
  • con-
  • CELEX
  • 101 con- (13008) 1st syllable stress,
  • 169 con- (44292) 2nd syllable stress
  • errors: 38 1st syllable stress,
  • 44 2nd syllable stress
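The strength of each beginning as a cue can be read off these counts. The sketch below computes the share of second-syllable stress for ab- and con-, once by word count and once by the parenthesised figures, which are assumed here to be summed frequencies (the slide does not label them). The names are illustrative.

```python
# prefix -> {stressed syllable: (word count, parenthesised figure on the slide)}
# Assumption: the parenthesised figures are summed word frequencies.
CELEX_COUNTS = {
    "ab-": {1: (21, 7708), 2: (60, 51897)},
    "con-": {1: (101, 13008), 2: (169, 44292)},
}


def second_syllable_share(prefix: str, by_frequency: bool = False) -> float:
    """Share of second-syllable stress among words with this beginning."""
    column = 1 if by_frequency else 0
    first = CELEX_COUNTS[prefix][1][column]
    second = CELEX_COUNTS[prefix][2][column]
    return second / (first + second)


for prefix in CELEX_COUNTS:
    by_type = 100 * second_syllable_share(prefix)
    by_token = 100 * second_syllable_share(prefix, by_frequency=True)
    print(f"{prefix} second-syllable stress: {by_type:.1f}% of words, "
          f"{by_token:.1f}% by frequency")
```

On these figures ab- is the more one-sided cue (about 74% of words), which fits an overgeneralization error such as abject, while con- is more mixed (about 63% of words), which fits errors in both directions.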

31
  • Test on Rastle & Coltheart (2000) nonwords?

32
R&C (2000) nonwords
33
  • no-/-ate
  • nonword: nockate (second syllable)
  • CELEX
  • 104 no- (22077) 1st syllable stress,
  • 15 no- (285) 2nd syllable stress
  • 108 -ate (6565) 1st syllable stress,
  • 165 -ate (3608) 2nd syllable stress

34
  • Why does the R&C model exhibit better performance
    than neural networks?
  • Limited and non-representative training set for
    NN models

35
  • Training on all polysyllabic words with the
    stress on the 1st or 2nd syllable
  • 51,948 words, 89.6% of the polysyllabic word types
    in the CELEX database.
  • 68.6% 1st syllable and 31.4% 2nd syllable
    words
  • (disyllabic words: 87% trochaic vs. 13%
    iambic words)

37
  • Why does the R&C model exhibit better performance
    than neural networks?
  • Limited training set for NN models
  • - Explicitly defined beginnings and endings

38
  • Kelly (2004) nonwords
  • 96 nonwords varying in onset complexity
  • ½ C onset: pamdeen
  • ½ CC onset: plamdeen
  • 78 trochaic vs. 18 iambic words
  • 20 subjects in a silent reading task

39
Kelly (2004) nonwords
Trochaic
Iambic
40
Kelly (2004) results
41
  • R&C (2000) model
  • 1/3 of errors were from the no-prefix/no-suffix
    class of words
  • (bolay, wispay)
  • co- (colvane, corlax)
  • Conflicting cues (beginnings vs. endings)
  • plamdeen, gronvoon
  • pl-, gr- (complex onset): trochaic words
  • -een, -oon (suffix): iambic words

42
  • Why does the R&C model exhibit better performance
    than neural networks?
  • Limited training set for NN models
  • - Explicitly defined beginnings and endings
  • Phonology and/or part-of-speech information

44
Phonology and Parts-of-speech
d0.74
d3.06
45
  • Multiple-cue accounts have been shown to
    result in more accurate classification in
  • speech segmentation tasks
    (Onnis, Monaghan, Chater, & Richmond, 2005)
  • grammatical categorisation tasks (Monaghan,
    Christiansen, & Chater, 2007).

47
Orthography, Phonology, Parts-of-speech
48
  • What is the role of orthography?
  • Orthography and other cues?
  • Rule-based vs. connectionist account?
  • Sublexical nature of the stress assignment?

49
Conclusions
  • The present study provided a demonstration that
    stress assignment for words and nonwords can be
    accomplished with good accuracy in a
    connectionist model that learns to map
    orthography onto stress position for disyllabic
    words in English.
  • Additional simulations indicated that a
    combination of orthographic, phonological and
    distributional cues can give improved performance
    in the stress assignment task.

50
Conclusions
  • Rule-based vs. connectionist accounts
  • The connectionist account allowed a more detailed
    exploration of the different cues relevant for
    stress assignment
  • Stress assignment is clearly part of a sublexical
    process.

51
Further simulations
  • Further testing on novel sets of nonwords,
    including phonological and distributional
    information
  • Cross-linguistic comparison with Italian
  • Simulations of the developmental results.

52
  • This work was supported by the ESRC/ARC
    Bilateral Research Awards Grant, RES 000-22-1975.

53
  • Thank you!