Title: Representing Regularity: The English Past Tense
- Matt Davis
- William Marslen-Wilson
- Centre for Speech and Language, Birkbeck College,
University of London
- and
- Mary Hare
- Center for Research in Language, University of
California, San Diego
- Abstract
- Evidence from priming experiments suggests
differences in the lexical representation of
regular and irregular forms of the English past
tense. Such results have been used to argue for a
dual mechanism account of English inflectional
morphology.
- A single mechanism connectionist model is
described which learns an abstract version of the
task of recognising English inflected verbs.
Analysis of the network's internal representations
shows differences between regular and irregular
verbs that could account for the priming data.
This suggests that behavioural and
representational differences need not be taken as
evidence for two distinct processing mechanisms.
The English past tense has been a popular
case-study for investigating language processing
since it provides clear examples of both regular
and irregular linguistic processes.
Psycholinguistic accounts of English inflection
have focused on the process or processes that map
between stem and past tense forms. The debate
between single and dual mechanism accounts of
language processing has been directed at the
psychological status of the rule that describes
how a verb stem is inflected to produce a regular
past tense.
- Dual Mechanism Accounts
- (e.g. Pinker 1991)
- Regular verbs
- Inflected by a symbolic rule-based system
- Irregular verbs
- Stored in an (associative) memory system that
blocks the application of the rule-governed route
- Single Mechanism Accounts
- (e.g. Rumelhart & McClelland 1986)
- Regular and Irregular verbs
- Both regular and irregular verbs are inflected by
a distributed network mapping from verb stems to
past tenses
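The dual mechanism account outlined above can be illustrated with a minimal sketch. It is not taken from Pinker (1991): the tiny lexicon, the function names and the orthographic (rather than phonological) "add -ed" rule are all assumptions made for illustration.

```python
# Hypothetical miniature lexicon of stored irregular past tenses (illustrative only).
IRREGULAR_PAST = {"give": "gave", "sleep": "slept", "bend": "bent", "hit": "hit"}


def regular_rule(stem: str) -> str:
    """Orthographic stand-in for the regular rule: add -ed (or -d after a final e)."""
    return stem + "d" if stem.endswith("e") else stem + "ed"


def dual_route_past(stem: str) -> str:
    """A stored irregular form blocks the rule; otherwise the rule applies."""
    stored = IRREGULAR_PAST.get(stem)
    if stored is not None:
        return stored          # associative memory route wins
    return regular_rule(stem)  # default rule-governed route
```

In this sketch `dual_route_past("give")` is answered from memory, while `dual_route_past("walk")` and an unfamiliar stem such as `"blick"` fall through to the rule. A single mechanism account replaces both routes with one learned mapping.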
These accounts, focusing just on the phonological
relationship between verb stems and past tenses,
seem unsatisfactory as an account of
comprehension or production, and make the
implicit assumption that accessing the lexical
representation of an inflected verb proceeds via
a phonological representation of the verb
stem. Experiments using a repetition priming
task have cast doubt on this assumption since
they suggest that the representations accessed in
comprehending inflected words differ according to
the regularity of the inflection.
- Hare, Older, Ford and Marslen-Wilson (1995)
- Cross-modal immediate repetition priming
- Subjects hear an auditory prime
- A visual target is presented on a computer screen
at the acoustic offset of the prime
- Subjects make a lexical decision response to the
target word
- Compared lexical decision RTs to verb stems
preceded by
- Past tense primes (reg/irreg)
- Present tense primes (all reg)
- Unrelated control primes
- Tested all the irregular verbs in British English
and matched regular verbs
- Excluding homophones (e.g. ate/eight)
- Excluding identity inflected verbs (e.g. hit)
Results show that the past tense forms of regular verbs
significantly prime their stems, whereas those of
irregular verbs do not. Such data are hard to
explain in terms of semantic or phonological
priming and have been interpreted as evidence for
differences in the lexical representation of
regular and irregular verbs, in line with a dual mechanism
account (Pinker 1991, citing Stanners et al.
1979). Our purpose here is to investigate whether
representational differences between regular and
irregular verbs can be accounted for by a single
mechanism, connectionist model.
Previous research has shown that this cross-modal
repetition priming task is not susceptible to
form-based priming (e.g. whisky does not prime
whisk) (Marslen-Wilson et al. 1994).
- Results: Hare, Older, Ford and Marslen-Wilson
(1995)
The network we report here was trained to map
from a phonological input to a distributed
semantic vector and a tense output. This is the
reverse of the mapping investigated by Cottrell
and Plunkett (1991) and can be seen as
analogous to the comprehension of inflected
verbs. The network was trained on 988
monosyllabic English verbs, each presented as a
stem and a past tense in proportion to its log
frequency of occurrence. An additional 110
regular verbs were presented in one form only, to
allow testing of the network's generalization
abilities.
- Network trained to identify verb stems and past
tenses
- Semantic output: a 50-bit random vector that
uniquely identifies each verb root.
- Phonological input: a structured phonological
representation developed for models of reading
aloud. It uses phonotactics and sonority to
minimise duplication of segments within
monosyllables (Plaut et al. 1996).
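The output coding described above can be sketched as follows. Only the 50-bit width and the uniqueness of each verb root's vector come from the text. The single tense unit (1 = past, 0 = stem), the seed and the function names are assumptions, and the phonological input coding of Plaut et al. (1996) is not reproduced here.

```python
import random

SEMANTIC_BITS = 50  # from the text: one 50-bit random vector per verb root


def make_semantic_codes(roots, seed=0):
    """Assign each verb root a distinct random 50-bit vector."""
    rng = random.Random(seed)
    codes, used = {}, set()
    for root in roots:
        while True:
            bits = tuple(rng.randint(0, 1) for _ in range(SEMANTIC_BITS))
            if bits not in used:  # redraw in the unlikely event of a collision
                break
        used.add(bits)
        codes[root] = bits
    return codes


def make_target(codes, root, is_past):
    """Target = semantic vector plus one tense unit (assumed: 1 = past, 0 = stem)."""
    return codes[root] + (1 if is_past else 0,)
```

Under this coding the stem and past tense of a verb share the same 50 semantic bits and differ only in the tense unit, which is what makes the task a many-to-one mapping from two phonological forms onto one meaning.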
The network was trained for 2000 passes through
the training set, at which point the training
error curve had reached asymptote and training
was stopped. The performance of the network was then
evaluated using a nearest target criterion.
The hidden unit representations developed by the
network to perform the mapping were also
evaluated. Measures of the Euclidean Distance
between the representation of stem and past tense
forms of regular and irregular verbs were taken.
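A nearest target criterion of the kind mentioned above can be sketched as below. The text does not say which distance metric was used for scoring, so Euclidean distance is an assumption here, as are the function names and the toy target labels.

```python
import math


def nearest_target(output, targets):
    """Return the label of the target vector closest to the network output."""
    return min(targets, key=lambda label: math.dist(output, targets[label]))


def is_correct(output, intended, targets):
    """An item counts as correct if its own target is the nearest one."""
    return nearest_target(output, targets) == intended


# Toy targets: three semantic bits plus a final tense unit (hypothetical values).
toy_targets = {
    "give+stem": (1, 0, 1, 0),
    "give+past": (1, 0, 1, 1),
    "turn+stem": (0, 1, 1, 0),
}
```

With these toy targets, an output of `(0.9, 0.2, 0.8, 0.7)` is scored as `"give+past"`. A tense error corresponds to landing nearest the right verb with the wrong tense unit, and a homophone error to landing nearest a different verb with the same phonological form.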
- Training set
- Error rate < 3% (of 1768 items)
- Most were homophone errors
- build - billed (65%)
- Some tense errors
- threw - identified as stem (35%)
- dread - identified as past tense
- Test set
- The network was correct on 85 of the novel forms
of familiar verbs (of 110 items)
- Euclidean Distance between stem and past tense
representations
$$ d = \sqrt{\sum_{n=0}^{r-1} (s_n - p_n)^2} $$
where r = total no. of units in group, s_n = stem
activation of unit n, p_n = past tense activation of unit n
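The distance measure can be computed directly from two activation vectors recorded over the same group of units. The sketch below is illustrative: the function names and the grouping by verb class are assumptions, and the activation values in any call would come from the trained network.

```python
import math


def stem_past_distance(stem_act, past_act):
    """Euclidean distance between stem and past tense activations over one unit group."""
    if len(stem_act) != len(past_act):
        raise ValueError("activation vectors must cover the same units")
    return math.sqrt(sum((s - p) ** 2 for s, p in zip(stem_act, past_act)))


def mean_distance_by_class(pairs):
    """pairs: iterable of (verb_class, stem_activations, past_activations)."""
    totals = {}
    for verb_class, stem_act, past_act in pairs:
        total, count = totals.get(verb_class, (0.0, 0))
        totals[verb_class] = (total + stem_past_distance(stem_act, past_act), count + 1)
    return {cls: total / count for cls, (total, count) in totals.items()}
```

The same function applies to the input units and to the hidden units, so the two sets of distances can be compared verb by verb.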
Distance measures in hidden unit space show that
the representations of stems and past tenses are
more similar for regular than for irregular
verbs. However, we need to confirm that this is an
effect of regularity and not just of differences in
the amount of phonological overlap. The same
analysis was therefore carried out on the input
representations. Comparing distance measures in
the input and hidden units shows that the
stem and past tense representations of regular
verbs are significantly more similar than would be
predicted on the basis of phonological overlap alone.
An ANOVA on the ratio of input to hidden distance shows
significant differences between the three sets
of verbs, F(2,961) = 1434.4,
p < 0.0001.
The unequal scales in the two graphs reflect the
different numbers of units in the input and
hidden unit representations.
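The comparison reported above can be sketched as a one-way ANOVA on per-verb distance ratios. The direction of the ratio (input distance divided by hidden distance) follows the wording of the text, and the function names are assumptions. The F statistic is computed by hand so that the degrees of freedom are explicit.

```python
def distance_ratio(input_distance, hidden_distance):
    """Ratio of input to hidden distance for one verb (hidden distance must be non-zero)."""
    return input_distance / hidden_distance


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means against the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: each value against its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With three sets of verbs the between-groups degrees of freedom are 2, and a within-groups value of 961 corresponds to 964 verbs entering the analysis, which matches the form of the reported F(2,961).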
The network appears to have learnt to use the
consistent relationship between the final segment
of regular verbs and their tense. This can be
seen in the network's generalization performance
and in the tense errors that it makes after
training. Setting aside the inflectional ending on
regular verbs, the network can then map an
invariant phonological form onto the semantics of
the verb. Hence the very similar representation
of both forms of the regular verbs at the hidden
units. However, for the irregular verbs,
this process breaks down, either through changes
in the verb stem (semi-weak verbs such as
sleep-slept), exceptions to the affix-tense
regularity (semi-weak verbs such as bend-bent) or
a combination of the two (vowel-change verbs such
as give-gave). In these cases there is no longer
a consistent mapping for both forms of the verb
and the network must therefore develop more
separate representations at the hidden units.
[Figure: example phonological forms; panels: Regular verbs, Irregular verbs]