Title: Automatic Verb Classification
The Role of the Syntax/Semantics Mapping in SLA
Computational Experiments in Verb Classification
Vivian Tsang and Suzanne Stevenson,
University of Toronto
Results
Motivation
- Theories of verb classification have
elaborated a detailed mapping from underlying
semantics to overt syntactic behavior (Pinker
1989, Levin 1993).
- This syntax/semantics mapping appears to aid
language acquisition, as the child uses
syntactic/semantic cues to induce properties of
verb semantics (e.g., Gleitman 1990, Gillette
et al. 1999, Pinker 1994).
- Materials
- Corpora: British National Corpus (100M
words), Mandarin Chinese News Text (165M words).
- Sample: 60 English verbs, 20 from each class.
- Method
- Vectors (of relative frequencies of occurrence,
one vector per verb) are training data for
a machine learning system.
- Chance accuracy: 2-way 50%, 3-way 33.3%.
Theoretical maximum accuracy: 100%.
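The Method bullets above can be illustrated with a minimal sketch. The poster does not name the machine learning system, so a simple nearest-centroid classifier with leave-one-out evaluation stands in for it here; that choice, the feature names, and all function names are assumptions made for illustration only.

```python
from math import dist

# Hypothetical feature names, loosely following the English feature list
# on this poster; the real feature extraction is not shown in the source.
FEATURES = ["transitive", "passive", "causative", "past_tense", "animate_subject"]

def relative_frequencies(feature_counts, total_occurrences):
    """Turn raw per-feature counts for one verb into a relative-frequency vector."""
    if total_occurrences <= 0:
        raise ValueError("a verb must occur at least once in the corpus")
    return [feature_counts.get(f, 0) / total_occurrences for f in FEATURES]

def chance_accuracy(n_classes):
    """Accuracy of random guessing when all classes are the same size."""
    return 1.0 / n_classes

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]

def nearest_centroid(train, query):
    """train is a list of (vector, label); return the label of the closest class mean."""
    by_class = {}
    for vector, label in train:
        by_class.setdefault(label, []).append(vector)
    return min(by_class, key=lambda label: dist(centroid(by_class[label]), query))

def leave_one_out_accuracy(data):
    """Hold out each verb in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (vector, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if nearest_centroid(train, vector) == label:
            correct += 1
    return correct / len(data)
```

With 20 verbs per class, `chance_accuracy(3)` gives the 33.3% baseline quoted above, and `leave_one_out_accuracy` would be compared against it.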
Figure labels: L1, L2. Example utterances:
"I break an egg?" / "I make an egg break?"
Does this syntax/semantics mapping play a role in
Second Language Acquisition (SLA) as well? L2
learners appear to generalize their knowledge of
the syntax/semantics mapping of verb classes in
L1 to learn the usage of verbs in L2 ("transfer
effects") (e.g., Helms-Park 2001, Inagaki 2001,
Montrul 2001).
Chinese learner of English
- The best bilingual features consistently
outperform the best English features and the best
Chinese features. See chart. (All
differences in accuracy (unasterisked bars)
are statistically significant, one-way ANOVA with
Tukey-Kramer post-test, p < 0.05.)
- Features that perform well:
English: animacy of subject, transitivity, passive
feature. Chinese: subcategorization,
active/stative distinction, passive feature.
- Merlo and Stevenson (2001) experimentally
determined a best performance of 87% (for a
similar three-way verb classification task) among
a group of human experts -- this suggests a
more realistic upper bound than the theoretical
maximum accuracy of 100%.
Automatic Verb Classification
- The Task
- Monolingual computational experiments support
the view that surface syntactic/semantic indicators
can help determine the underlying verb semantics
(e.g., Allen 1997, Dorr and Jones 1996, Lapata
and Brew 1999, Schulte im Walde 2000, Merlo and
Stevenson 2001).
- Statistical syntactic/semantic features within
English can be used to classify English verbs
into semantic classes (Merlo and Stevenson
2001).
- Extension: In a multilingual computational
setting (corpus-based automatic verb
classification), explore the ability of L1
syntactic/semantic features to aid in the
learning of L2 verb classes.
- Optionally transitive English Verb Classes
- Manner of motion: e.g., jump, race
- Change of State: e.g., break, melt
- Creation/Transformation: e.g., build, dance
- Syntactic/Semantic Features
- The verbs cannot be distinguished by
subcategorization alone.
- English features: transitivity, passive
construction, causativity, past-tense form,
animacy of subject.
- Chinese features: subcategorization,
active/stative distinction, overt causative
and passive indicators.
- Automatic Verb Classification with Transfer
- For each feature, collect its frequency of
occurrence over a sample of English verbs and
their translations from multiple English and
Chinese corpora.
- Use the bilingual features to train a system to
classify the English verbs (i.e., L1 Chinese,
L2 English).
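One possible way to form a bilingual feature vector is sketched below. The poster does not say how the features of several Chinese translations are combined for one English verb, so averaging them before concatenation is an assumption, and the function names are hypothetical.

```python
def average(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]

def bilingual_vector(english_vec, translation_vecs):
    """Append the (averaged) Chinese translation features to the English features.

    english_vec: relative frequencies of the English features for one verb.
    translation_vecs: one relative-frequency vector per Chinese translation.
    Averaging over translations is an illustrative assumption, not the
    authors' stated procedure.
    """
    if not translation_vecs:
        raise ValueError("at least one translation vector is required")
    return list(english_vec) + average(translation_vecs)
```

The resulting vector keeps the English features first and the Chinese features last, so a classifier trained on it can draw on either language's indicators for the same English verb.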
Conclusions
The performance of one feature in one language is
an indicator of the performance of the related
feature in another language -- there are
syntactic and semantic properties that hold
across languages allowing transfer to occur.
- Chinese
- Subcategorization (esp. transitivity)
- Overt causative particle
- Overt passive particle
- English
- Transitivity
- Causativity
- Passive Feature
performance indicator
Features of English verbs
Features of Chinese Translations
- Provided there is sufficient overlap in the
semantics of the L1 and L2 verbs, our experiments
support the hypothesis that L2 learners use
syntax/semantics of verbs in L1 in acquiring
properties of verbs in L2.
- Future/On-going work
- Other languages (we have experimented with
German and Italian verbs).
- Elaborate a possible mechanism underlying the
transfer of knowledge (statistical analysis of
verb behaviour and its relation to semantic
classes).
Classification of English verbs
Contact Information: Vivian Tsang
(vyctsang_at_cs.toronto.edu), Suzanne Stevenson
(suzanne_at_cs.toronto.edu)
Updated 9/17/2009 4:34 PM