Title: How to Compare Frequent and Rare Words
Virginia Savova (Johns Hopkins University)
Leonid Peshkin (MIT)
- Lexical Frequency Effects
  - More frequent tokens have different (usually faster) reaction times (RTs) than less frequent tokens in psycholinguistic tasks.
  - Tasks: reading, picture naming, lexical decision, grammatical gender decision, translation, object decision
- Theoretical significance of frequency effects
  - In computational models of lexical retrieval
    - Generally regarded as evidence against serial search models (but see Forster, 1976)
    - Logogen models: word detector thresholds lowered as encounters rise (Morton, 1969); base-level activation frequency (McClelland and Rumelhart, 1981)
    - Distributed versus stage-specific frequency effect for serial-stage models of speech production (Dell, 1986; Levelt, 1983, 1989)
  - Rule-based retrieval versus raw memorization
    - In reading, writing: Frequency/Regularity interaction: HF/R words > HF/Irr, LF/R > LF/Irr
Example dataset
References
[1] D. J. Bartram. The role of visual and semantic codes in object naming. Cognitive Psychology, 6(3):325–356, 1974.
[2] C. M. Connine, J. Mullennix, E. Shernoff, and J. Yelen. Word familiarity and frequency in visual and auditory word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16(6):1084–1096, 1990.
[3] G. S. Dell. A spreading-activation theory of retrieval in sentence production. Psychological Review, 93(3):283–321, 1986.
[4] K. I. Forster. Accessing the mental lexicon. In R. J. Wales and E. Walker, editors, New Approaches to Language Mechanisms, pages 257–276, 1976.
[5] K. I. Forster and S. M. Chambers. Lexical access and naming time. Journal of Verbal Learning and Verbal Behavior, 12:627–635, 1973.
[6] Z. M. Griffin and K. Bock. Constraint, word frequency, and the relationship between lexical processing levels in spoken word production. Journal of Memory and Language, 38(3):313–338, 1998.
[7] H. H. Clark. The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12:335–359, 1973.
[8] G. W. Humphreys, M. J. Riddoch, and P. T. Quinlan. Cascade processes in picture identification. Cognitive Neuropsychology, 5(1):67–104, 1988.
[9] J. Huttenlocher and L. F. Kubicek. The source of relatedness effects on naming latency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9(3):486–496, 1983.
[10] J. D. Jescheniak and W. J. M. Levelt. Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(4):824–843, 1994.
[11] H. Kucera and W. N. Francis. Computational Analysis of Present-Day American English. Brown University Press, Providence, 1967.
[12] W. J. M. Levelt. Monitoring and self-repair in speech. Cognition, 14(1):41–104, 1983.
[13] W. J. M. Levelt. Speaking: From Intention to Articulation. The MIT Press, Cambridge, MA, US, 1989.
[14] J. L. McClelland and D. E. Rumelhart. An interactive activation model of context effects in letter perception: An account of basic findings. Psychological Review, 88(5):375–407, 1981.
[15] C. Morrison, A. W. Ellis, and P. T. Quinlan. Age of acquisition, not word frequency, affects object naming, not object recognition. Memory & Cognition, 20(6), 1992.
[16] J. Morton. Interaction of information in word recognition. Psychological Review, 76(2):165–178, 1969.
[17] MRC Psycholinguistic Database. http://www.psy.uwa.edu.au/MRCDataBase/uwa_mrc.htm.
[18] R. Oldfield and A. Wingfield. Response latencies in naming objects. Quarterly Journal of Experimental Psychology, 17(4):273–281, 1965.
[19] K. Rayner and G. W. McConkie. What guides a reader's eye movements? Vision Research, 16(8):829–837, 1976.
[20] E. Strain, K. E. Patterson, and M. S. Seidenberg. Semantic influences on word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1995.
[21] M. Taft and G. Hambly. Exploring the cohort model of spoken word recognition. Cognition, 22(3):259–282, 1986.
[22] A. Wingfield. Perceptual and response hierarchies in object identification. Acta Psychologica, 26(3):216–226, 1967.
[23] A. Wingfield. Effects of frequency on identification and naming of objects. American Journal of Psychology, 81(2):226–234, 1968.
The data suggest that familiarity affects the RT distributions of HF and LF words differently unless it is controlled for, which it rarely is.
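Significance tests for equal versus unequal sample variances
For ANOVA with two classes (LF and HF words), $F = t^2$.
Notation: $\sigma$ is the population standard deviation, $s$ the sample standard deviation, $\bar{X}$ the sample mean, $\mu$ the population mean, and $n$ the number of observations in each sample.
$$t = \frac{\sqrt{n}\,\left[(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\right]}{\sqrt{2}\,s}$$
For $\sigma_1 = \sigma_2$, the statistic
$$t = \frac{\sqrt{n}\,(\bar{X}_1 - \bar{X}_2)}{\sqrt{2}\,s} \quad \text{for } \mu_1 = \mu_2$$
follows Student's t distribution.
At the 5% level,
$$\mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm t_{0.05}\,\frac{\sqrt{2}\,s}{\sqrt{n}}$$
For $\sigma_1 \neq \sigma_2$, the statistic
$$t' = \frac{\sqrt{n}\,(\bar{X}_1 - \bar{X}_2)}{\sqrt{s_1^2 + s_2^2}} \quad \text{for } \mu_1 = \mu_2$$
does not follow Student's t distribution.
Adjustments to make $t' \approx t$ have been proposed by Behrens, Fisher, Welch and Aspin.
They can be approximated by a modification of the degrees of freedom assumed.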
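As an illustration of the equal-variance and unequal-variance tests above, here is a minimal Python sketch that computes both statistics for two word classes and checks that the two-class ANOVA F equals the square of the equal-variance t. It rests on several assumptions that are not in the slides: the RT values are invented, the function names are hypothetical, the formulas are written for general sample sizes n1 and n2 rather than a single n, and the degrees-of-freedom modification shown is the Welch-Satterthwaite approximation, which is one common choice among the adjustments mentioned.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x1, x2):
    """Equal-variance (Student) t statistic and its degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(x1, x2):
    """Unequal-variance statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1) / n1, variance(x2) / n2  # squared standard errors
    t = (mean(x1) - mean(x2)) / sqrt(v1 + v2)
    # Assumption: Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def anova_f(x1, x2):
    """One-way ANOVA F statistic for exactly two groups."""
    n1, n2 = len(x1), len(x2)
    grand = mean(list(x1) + list(x2))
    # Between-group sum of squares (1 degree of freedom for two groups).
    between = n1 * (mean(x1) - grand) ** 2 + n2 * (mean(x2) - grand) ** 2
    # Within-group sum of squares (n1 + n2 - 2 degrees of freedom).
    within = (n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)
    return between / (within / (n1 + n2 - 2))


# Hypothetical RTs in ms, invented for illustration only.
hf_rts = [512, 498, 530, 505, 521, 517]
lf_rts = [640, 575, 720, 610, 690, 555]

t_eq, df_eq = pooled_t(hf_rts, lf_rts)
t_w, df_w = welch_t(hf_rts, lf_rts)
f_stat = anova_f(hf_rts, lf_rts)

print(f"pooled: t = {t_eq:.3f}, df = {df_eq}")
print(f"Welch:  t = {t_w:.3f}, df = {df_w:.2f}")
print(f"ANOVA:  F = {f_stat:.3f}, t^2 = {t_eq ** 2:.3f}")
```

With equal sample sizes the two statistics coincide, so the only difference between the tests is the degrees of freedom against which the statistic is evaluated. When the LF sample is much more variable than the HF sample, as in these invented numbers, the adjusted degrees of freedom fall well below n1 + n2 - 2 and the test becomes more conservative.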
The data suggest that imageability and concreteness are further potential sources of variation that affect the RT distributions of HF and LF words differently.