Title: Improving Subcategorization Acquisition using Word Sense Disambiguation
- Anna Korhonen and Judith Preiss
- University of Cambridge, Computer Laboratory
- 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Anna.Korhonen_at_cl.cam.ac.uk, Judita.Preiss_at_cl.cam.ac.uk
2Outline
- Subcategorization Acquisition
- Baseline System
- Baseline System combined with WSD
- Probabilistic WSD
- Experiment
- Evaluation
- Methods
3Introduction
- Subcategorization
- The dependents of a verb are classified into
- - arguments (subject, object, direct object)
- - - subject
- - - non-subject arguments (complements), e.g. Mary knows that she is winning.
- - adjuncts, e.g. She read the book with great interest.
- The types of complements that a verb permits give the verb classification
- The verb classification is called subcategorization
- SCFs: subcategorization frames for a given predicate, essential for parsing
4Introduction
- SCFs: a particular set of arguments that a verb can appear with
- Intransitive verb. NPsubject. They danced.
- Transitive verb. NPsubject, NPobject. Mary appreciates her Professor.
- Intransitive with PP. NPsubject, PP. He lives in Paris.
- Transitive with PP. NPsubject, NPobject, PP. She put the book on the table.
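As an illustration of the frame types listed above, here is a minimal Python sketch of how SCFs could be stored in a lexicon. The class name `SCF`, the slot labels, and the toy `LEXICON` are assumptions made for this example; they are not the representation used by the authors' system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCF:
    """A subcategorization frame: a name plus the argument slots it requires."""
    name: str
    slots: tuple


# The four frame types from the slide (slot labels are illustrative).
INTRANS = SCF("intransitive", ("NP_subject",))
TRANS = SCF("transitive", ("NP_subject", "NP_object"))
INTRANS_PP = SCF("intransitive with PP", ("NP_subject", "PP"))
TRANS_PP = SCF("transitive with PP", ("NP_subject", "NP_object", "PP"))

# Hypothetical toy lexicon: verb lemma -> set of frames it can appear with.
LEXICON = {
    "dance": {INTRANS},
    "appreciate": {TRANS},
    "put": {TRANS_PP},
}


def licenses(verb, frame):
    """Return True if the toy lexicon lists `frame` for `verb`."""
    return frame in LEXICON.get(verb, set())
```

A parser could consult such a lexicon with, for example, `licenses("put", TRANS_PP)` to check whether an analysis is permitted for a verb.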
5Introduction
- Manual versus automatic subcategorization acquisition
- Manual: does not provide the relative frequency of SCFs
- - predicates change behavior
- Automatic: no lexical/semantic information is exploited
- - reveals only syntactic aspects
- - no distinction between predicate senses
- Korhonen (2002): a model with back-off estimates based on the predominant sense of a verb (WordNet)
- Acquisition goal: a domain-specific lexicon (written vs. spoken genre, based on different senses)
6Subcategorization Acquisition
- Baseline System
- a system with knowledge of verb semantics
- Levin (93)
- - verb senses divide verbs into classes that are distinctive for subcategorization
- Korhonen (2002)
- - verb forms can be divided into semantic classes based on their predominant sense (fly - move)
- - determine the sense and the semantic class (Levin classes: Motion verbs)
- Briscoe & Carroll (97): SCF distributions are acquired from corpus data
7Subcategorization Acquisition
- Baseline System description
- Linear interpolation smoothing with back-off estimates is used for the SCF distribution
- The method of obtaining back-off estimates
- a) 4-5 representative verbs are chosen from a verb class
- b) for these verbs the SCF distribution is built by manual analysis of 300 occurrences of each verb (BNC)
- c) the resulting SCF distributions are merged, giving equal weight to each distribution
- E.g. fly - move, slide, arrive, travel, sail
- An empirical threshold is used to filter out noisy SCFs
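The three steps of the baseline (equal-weight merging into back-off estimates, linear interpolation smoothing, threshold filtering) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the interpolation weight `lam`, and the renormalization after filtering are choices made for this example, not details given on the slide.

```python
def merge_equal_weight(distributions):
    """Average several SCF distributions, giving each the same weight."""
    frames = set().union(*distributions)
    n = len(distributions)
    return {f: sum(d.get(f, 0.0) for d in distributions) / n for f in frames}


def interpolate(observed, backoff, lam):
    """Linear interpolation: lam * observed + (1 - lam) * back-off.

    `lam` is a hypothetical weight in [0, 1]; the slide gives no value.
    """
    frames = set(observed) | set(backoff)
    return {
        f: lam * observed.get(f, 0.0) + (1 - lam) * backoff.get(f, 0.0)
        for f in frames
    }


def filter_by_threshold(dist, threshold):
    """Drop SCFs below the threshold, then renormalize (an assumption)."""
    kept = {f: p for f, p in dist.items() if p >= threshold}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {f: p / total for f, p in kept.items()}


# Made-up distributions for two representative verbs of one class.
move = {"NP": 0.5, "NP_PP": 0.5}
slide = {"NP": 1.0}
backoff = merge_equal_weight([move, slide])

# Smooth a made-up observed distribution for the target verb, then filter.
smoothed = interpolate({"NP": 1.0}, backoff, lam=0.5)
lexicon_entry = filter_by_threshold(smoothed, threshold=0.2)
```

Because the back-off distribution assigns some probability to frames never observed with the target verb, smoothing lets those frames enter the lexicon before the threshold removes the unreliable ones.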
8Subcategorization Acquisition
- Combining with WSD
- Preiss & Korhonen (02)
- - created separate corpus datasets for the senses being disambiguated (first and/or second) and other datasets for the remaining senses
- - SCFs were acquired from both types of datasets
- - back-off estimates were used for the SCFs acquired from the initial dataset; the estimates were used for smoothing according to the relevant sense
- - the acquired SCF lexicons were merged at the end; the SCF distribution was specific to a verb rather than to a sense
- - problems with subcategorization acquisition: datasets too small, separation of the data was unnecessary
9Subcategorization Acquisition
- New method
- does not involve separating the data; it uses back-off estimates for the sense distribution given by the WSD system, not only for the predominant sense
- $p_j(scf_i)$, $j = 1, \dots, n_{bo}$ ($n_{bo}$ = the number of back-off estimates)
- - the probabilities of SCFs in the different back-off distributions
- $P(scf_i) = \sum_{j=1}^{n_{bo}} \lambda_j \, p_j(scf_i)$
- $\lambda_j$: weights for the different distributions, which sum to 1 and are obtained from the probabilistic WSD system
- Probabilistic WSD
- - able to determine the probability distribution for each noun, verb, adjective and adverb
- - able to determine a probability distribution over the senses for each verb and compute its average
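The weighted sum over back-off distributions described on this slide can be sketched in Python as below. The weights (called `lam` in the code) are taken to be the average of the per-occurrence sense distributions from the WSD system; the function names, sense labels, frame labels, and numbers are all made up for illustration.

```python
def sense_weights(per_occurrence):
    """Average per-occurrence sense distributions into one weight per sense."""
    senses = set().union(*per_occurrence)
    n = len(per_occurrence)
    return {s: sum(d.get(s, 0.0) for d in per_occurrence) / n for s in senses}


def mix_backoff(weights, backoff_by_sense):
    """Compute P(scf) = sum over senses j of weight_j * p_j(scf)."""
    mixed = {}
    for sense, lam in weights.items():
        for scf, p in backoff_by_sense[sense].items():
            mixed[scf] = mixed.get(scf, 0.0) + lam * p
    return mixed


# Hypothetical WSD output for two occurrences of one verb.
occurrences = [
    {"motion": 0.8, "escape": 0.2},
    {"motion": 0.4, "escape": 0.6},
]

# Hypothetical back-off estimates, one distribution per sense class.
backoff_by_sense = {
    "motion": {"INTRANS": 0.5, "PP": 0.5},
    "escape": {"NP": 1.0},
}

weights = sense_weights(occurrences)
verb_backoff = mix_backoff(weights, backoff_by_sense)
```

Since each back-off distribution sums to 1 and the weights sum to 1, the mixed distribution also sums to 1, so it can be used for smoothing without renormalization.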
10Subcategorization Acquisition
- System Description
- - based on the Stevenson and Wilks (2001) system, which combines knowledge sources to produce a WSD tool
- - the probabilistic WSD system combines the probability distributions over senses produced by each module (modules described in Yarowsky (2000), Mihalcea (2002), Pedersen (2002))
- - each module's output is smoothed according to its confidence value: a module with low confidence is smoothed extensively towards the uniform distribution
- - the optimal combination of modules is chosen by accuracy (F-measure) on the English all-words task
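One simple way to realize confidence-based smoothing is to interpolate each module's sense distribution with the uniform distribution and then combine the results. The sketch below does this and averages the smoothed distributions; both the interpolation form and the averaging rule are assumptions for illustration, since the slide does not specify how the modules are actually combined.

```python
def smooth_towards_uniform(dist, confidence):
    """Interpolate a sense distribution with the uniform distribution.

    confidence = 1 leaves `dist` unchanged; confidence = 0 yields uniform.
    """
    k = len(dist)
    return {s: confidence * p + (1 - confidence) / k for s, p in dist.items()}


def combine(modules):
    """Average the smoothed outputs of all modules (assumed combination rule).

    `modules` is a list of (sense_distribution, confidence) pairs.
    """
    senses = set().union(*(d for d, _ in modules))
    smoothed = []
    for dist, confidence in modules:
        # Give every module an entry for every sense before smoothing.
        full = {s: dist.get(s, 0.0) for s in senses}
        smoothed.append(smooth_towards_uniform(full, confidence))
    n = len(smoothed)
    return {s: sum(d[s] for d in smoothed) / n for s in senses}


# Two hypothetical modules that disagree; the second has zero confidence.
combined = combine([
    ({"s1": 1.0, "s2": 0.0}, 1.0),
    ({"s1": 0.0, "s2": 1.0}, 0.0),
])
```

In this toy case the zero-confidence module is flattened to a uniform distribution, so the combined result still favors the sense chosen by the confident module.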
11Subcategorization Acquisition
- Experiment
- Test Data
- - polysemous verbs whose predominant sense is not very frequent: 29 verbs chosen randomly
- - the WordNet senses of the chosen verbs are mapped to Levin-style senses
- - the maximum number of Levin senses considered was 4, and some of the given senses were left out
12Subcategorization Acquisition
13Subcategorization Acquisition
- Evaluation
- Method
- - took 20 million words of the BNC corpus and extracted all sentences for the test verbs
- - 1000 sentences for each verb were disambiguated with the probabilistic WSD system
- - the modified subcategorization system was applied
- - for each verb, an individual set of back-off estimates was built based on the frequencies of the different senses in the corpus data
- - results were evaluated against a manual analysis of the corpus data
- - from an average of 300 occurrences of each verb in the BNC test data, 5-21 gold standard SCFs were found (16 SCFs per verb)
14Subcategorization Acquisition
- Evaluation
- Method
- F-measure: $F = \frac{2PR}{P + R}$
- P: precision
- R: recall
- RC: Spearman rank correlation
- KL: Kullback-Leibler distance
- CE: cross entropy
- - the total number of SCFs missing from the distribution is recorded, to determine the accuracy of the back-off estimates
- - comparison with other systems: the baseline, and another that assumed no sense at all
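The measures listed above have standard definitions, which the following Python sketch implements on toy data. Several details are assumptions for this example: the balanced F-measure 2PR / (P + R) is computed over sets of SCF types, KL and cross entropy use natural logarithms with the gold distribution as the reference, and the Spearman formula assumes no tied ranks. The slide does not state which variants the authors used.

```python
import math


def f_measure(acquired, gold):
    """Balanced F-measure 2PR / (P + R) over sets of SCF types."""
    tp = len(acquired & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(acquired)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def kl_distance(p, q):
    """KL(p || q); assumes q[f] > 0 wherever p[f] > 0."""
    return sum(pf * math.log(pf / q[f]) for f, pf in p.items() if pf > 0)


def cross_entropy(p, q):
    """Cross entropy of q with respect to p (natural log)."""
    return -sum(pf * math.log(q[f]) for f, pf in p.items() if pf > 0)


def _ranks(dist, frames):
    """Rank frames from most to least probable (1 = most probable)."""
    ordered = sorted(frames, key=lambda f: dist[f], reverse=True)
    return {f: i + 1 for i, f in enumerate(ordered)}


def spearman(p, q):
    """Spearman rank correlation over shared frames, assuming no ties."""
    frames = sorted(set(p) & set(q))
    n = len(frames)
    if n < 2:
        raise ValueError("need at least two shared frames")
    rp, rq = _ranks(p, frames), _ranks(q, frames)
    d2 = sum((rp[f] - rq[f]) ** 2 for f in frames)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Made-up gold and acquired SCF distributions for one verb.
gold = {"NP": 0.5, "NP_PP": 0.3, "PP": 0.2}
acquired = {"NP": 0.6, "NP_PP": 0.1, "PP": 0.3}

f = f_measure({"NP", "PP", "SCOMP"}, {"NP", "PP", "NP_PP", "ING"})
rc = spearman(gold, acquired)
kl = kl_distance(gold, acquired)
ce = cross_entropy(gold, acquired)
```

F-measure compares which frames were found, while RC, KL and CE compare how well the acquired frequencies match the gold standard distribution.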
15Subcategorization Acquisition
- Results
- - of a total of 175 gold standard SCFs unseen in the unsmoothed lexicon, 107 remain unseen after using the predominant sense method
- - using the WSD method only 22 remain unseen
- the performance improves with the number of senses
- - the IS measure shows an intersection between the acquired and the gold standard SCFs when WSD is used
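The unseen-SCF counts above can be turned into the share of frames each method recovers. The short Python sketch below only restates the three figures from the slide (175, 107, 22); the variable names are chosen for this example.

```python
# Counts reported on the slide.
unseen_total = 175
still_unseen = {"predominant sense": 107, "WSD": 22}

# Frames recovered by each method, and the fraction of the 175 they cover.
recovered = {m: unseen_total - n for m, n in still_unseen.items()}
recovered_share = {m: r / unseen_total for m, r in recovered.items()}

for method, share in recovered_share.items():
    print(f"{method}: {recovered[method]} of {unseen_total} ({share:.1%})")
```

On these figures the predominant sense method recovers 68 of the 175 unseen frames and the WSD method recovers 153.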
16Subcategorization Acquisition
17Subcategorization Acquisition
- Results
- - improvement for the highly polysemous verbs (bear, count, roar, etc.)
- - verbs whose senses differ substantially in terms of subcategorization (conceive, continue, grasp, etc.)
- - verbs whose senses involve mainly NP/PP
- - SCFs seem to appear in the data as families for a sense of a verb
- - worse performance for seek when using WSD, even though it is highly polysemous and its senses differ in terms of subcategorization
- - no clear improvement: choose, compose, induce, watch
18Subcategorization Acquisition
- Conclusions
- - using WSD, an improvement can be shown for SCF acquisition of difficult verbs, because their senses also differ in terms of subcategorization, not only in the degree of polysemy
- Future work
- - a better way of integrating the frequency of acquired senses into the SCFs, and a refinement of the subcategorization method