Learning noun phrase coreference resolution

1
Learning noun phrase coreference resolution
Veronique Hoste ATILA Meeting 2004
2
Outline

Definition
Data sets
Experimental setup
The effect of optimization
The effect of skewedness
Results on the test set

3
Definition (Hirst, 81)

Anaphora is the device of making in discourse an
abbreviated reference to some entity in the
expectation that the perceiver will be able to
disabbreviate the reference and thereby determine
the identity of the entity.

6
Definition (Hirst, 81)

Anaphora is the device of making in discourse an
abbreviated reference to some entity in the
expectation that the perceiver will be able to
disabbreviate the reference and thereby determine
the identity of the entity.

ANAPHOR: the abbreviated reference
ANTECEDENT or REFERENT: the entity referred to
RESOLUTION: disabbreviating the reference to determine the identity of the entity
7
Example (KNACK-2002)

(...) In practice there is no question of autonomy
or freedom in either Kashmir, because they have
for years been the bone of contention between
Pakistan and India. Those two countries came into
being in 1947 to avoid a conflict between Muslims
and Hindus. (...) The United States tried in vain
to extract from Pakistan and India the promise
that they would not use nuclear weapons. That
even led to economic sanctions against both
countries.


10
Which anaphora?

Identity relation
<-> type-token relation: I prefer the red car, but my husband wanted the grey one.
<-> part-whole relation: If the gas tank is empty, you should refuel the car.
NPs
Personal, possessive and demonstrative pronouns
Definite and indefinite NPs

11
MUC-6 and MUC-7

Message Understanding Conference
Extensively used for evaluation
Articles from WSJ and NYT
Identity relation between NPs
MUC-6: 2,141 coreferential NPs in training set and 2,091 in test set
MUC-7: 2,569 coreferential NPs in training set and 1,728 in test set

12
KNACK-2002

Articles from KNACK 2002 on different topics: national and international politics, science, culture, ...
Annotation: adapted version of MUC guidelines
Identity, bound, ISA, modality relations between NPs
http://cnts.uia.ac.be/hoste/manual_dutch.ps
Ca. 12,546 coreferentially annotated NPs

14
Approaches

The past: mostly knowledge-based techniques (constraints and preferences),
e.g. Lappin & Leass (1994), Baldwin (CogNIAC, 1996)
Recently: machine learning (C4.5, Ripper, maximum entropy)

Redefine coreference resolution as a CLASSIFICATION task
15
A classification based approach

Given two entities in a text, NP1 and NP2, classify the pair as coreferent or not coreferent.
E.g.
The United States tried in vain to extract from Pakistan and India the promise that they would not use nuclear weapons.
they - the promise: not coreferential
they - Pakistan and India: coreferential
they - The United States: not coreferential

16
Selected features (41)

Positional features (e.g. dist_sent, dist_NP)
Local context features
Morphological and lexical features (e.g. i/j/ij-pron, j_demon, j_def, i/j/ij-proper, num_agree)
Syntactic features (e.g. i/j/ij_SBJ, appos)
String-matching features (comp_match, part_match, alias, same_head)
Semantic features (syn, hyper, same_NE, 4 features indicating semantic class)

17
Positive and negative instances

Per NP type (pronouns / proper nouns / common nouns)
Positive: combination of the anaphor with each preceding element in the coreference chain.
Negative: combination of the anaphor with each preceding NP which is not part of the coreference chain (search scope < 20 sentences)
E.g. MUC-7: 1,905 coreferential NPs
positive: 11,266 inst.
negative: 159,815 inst.
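
The instance construction above can be sketched in a few lines. This is only an illustration, not the code used for the experiments: the `NP` record, the `training_instances` function and the choice to skip the first mention of a chain (it has no preceding chain element) are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class NP(NamedTuple):
    sent: int              # sentence number of the NP
    text: str              # surface string
    chain: Optional[int]   # gold coreference chain id, None if not coreferential


def training_instances(nps, scope=20):
    """Return (candidate, anaphor, label) triples for one document.

    Assumption: an NP only acts as anaphor if an earlier NP belongs to
    the same chain; the scope limit is applied to negatives only.
    """
    instances = []
    for j, anaphor in enumerate(nps):
        preceding = nps[:j]
        if anaphor.chain is None or not any(c.chain == anaphor.chain for c in preceding):
            continue
        for cand in preceding:
            if cand.chain == anaphor.chain:
                instances.append((cand.text, anaphor.text, 1))   # positive
            elif anaphor.sent - cand.sent < scope:
                instances.append((cand.text, anaphor.text, 0))   # negative
    return instances


example = [
    NP(0, "The United States", None),
    NP(0, "Pakistan and India", 1),
    NP(0, "the promise", None),
    NP(0, "they", 1),
]
instances = training_instances(example)
for cand, anaphor, label in instances:
    print(anaphor, "-", cand, "->", label)
```

One anaphor already yields two negatives for one positive here, which hints at why the full data set is so skewed.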

18
Two step procedure

First step: validation
Application of Timbl and Ripper on train set: 10-fold cv
Evaluation: accuracy, precision, recall, F-beta
Second step: testing
Training of Timbl and Ripper on train set, testing on test set.
Reconstruction of coreference chains
Evaluation using MUC scoring software
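
For reference, the validation metrics can be computed from gold and predicted labels as sketched below. The function name `evaluate` and the toy label lists are made up for the example; the positive class (label 1) is taken to be "coreferential".

```python
def evaluate(gold, pred, beta=1.0):
    """Return accuracy, precision, recall and F-beta for the positive class (1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    accuracy = sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        f_beta = 0.0
    else:
        b2 = beta ** 2
        f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return accuracy, precision, recall, f_beta


# Toy data: 2 coreferential pairs among 10 instances
gold = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
accuracy, precision, recall, f1 = evaluate(gold, pred)
print(accuracy, precision, recall, f1)
```

On skewed data accuracy stays high even when half of the positives are missed, which is why precision, recall and F-beta are reported next to it.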

19
Algorithms compared

Ripper (Cohen, 95)
Rule induction
Algorithm parameters: different class ordering principles, negative conditions or not, loss ratio values, cover parameter values
TiMBL
Memory-based learning
Algorithm parameters: ib1, igtree; overlap, mvdm; 5 feature weighting methods; 4 distance weighting methods; 10 values of k
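
To make the memory-based parameters more concrete, here is a minimal sketch of nearest-neighbour classification with an overlap metric and optional feature weights. It is not TiMBL: the function names and the toy feature vectors are invented for the example, and TiMBL's own handling of k, ties and weighting may differ.

```python
from collections import Counter


def overlap_distance(a, b, weights=None):
    """Sum the weights of the features on which two instances disagree."""
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def classify(memory, query, k=1, weights=None):
    """Majority vote among the k stored instances closest to the query."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], query, weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical instances: (anaphor type, number agreement, distance) -> label
memory = [
    (("pron", "agree", "near"), 1),
    (("pron", "disagree", "far"), 0),
    (("noun", "disagree", "far"), 0),
]
query = ("pron", "agree", "near")
print(classify(memory, query, k=1))
print(classify(memory, query, k=3))
```

With k=1 the exact match decides; with k=3 the two negative neighbours outvote it, which shows how much the choice of k matters on a skewed instance base.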

20
Default classifier results (MUC6)
21
Conclusions default experiments

The concatenation of the NP-type classifiers is beneficial for Ripper, not for Timbl.
Low precision scores for Timbl (large number of false positives). The scores are up to 30% lower than the ones for Ripper.
Reason: feature weighting?
Higher recall for Timbl: distinguishes better between true and false negatives.

22
GA optimization
23
GA individuals
(Figure: an example GA individual, with feature genes and parameter genes)
Features: values 0, 1, 2
Parameters: feature weighting 0, 1, 2, 3, 4; neighbour weighting 0, 1, 2, 3; k
Example: 0 1 0 1 2 0 2 1 0 2 0 0 2 1 0 2 2 0 3 2 2.0288721872
24
GA optimization results MUC6
25
Is skewedness a problem?

In an unbalanced data set, the majority class is represented by a large portion of all the instances, whereas the other class, the minority class, has only a small part of the instances.
E.g. MUC-7: only 6% positive instances
Imbalanced data sets may result in poor performance of standard classification algorithms
=> problem of ignoring the minority class
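
The MUC-7 figure can be checked against the instance counts given earlier (11,266 positive and 159,815 negative instances). The short calculation below also shows what a classifier that ignores the minority class would score; the variable names are chosen for the example.

```python
positive = 11_266
negative = 159_815
total = positive + negative

positive_share = positive / total
# A classifier that always answers "not coreferential" is right on every negative
majority_baseline_accuracy = negative / total

print(f"positive share: {positive_share:.1%}")
print(f"always-negative accuracy: {majority_baseline_accuracy:.1%}")
```

The positive share comes out at about 6.6%, so an always-negative classifier reaches roughly 93% accuracy while finding no coreferential pair at all.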

26
Strategies for dealing with skewed data sets

Sampling
undersampling
oversampling
Adjusting misclassification costs (high cost to
misclassification of the minority class)
Weighting of examples (focus on the minority
class)

27
Sampling

Undersampling: examples from the majority class are removed
Problem: throws away possibly useful information
Oversampling: examples from the minority class are duplicated
Problem: no increase of information, overfitting
General observation in ML literature:
- undersampling leads to better performance
- oversampling does not help
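
Both sampling strategies are easy to sketch. The version below uses random removal and plain duplication; the function names, the `keep_ratio` and `factor` parameters and the toy data are assumptions for the example, not the settings used in the experiments.

```python
import random


def undersample(instances, keep_ratio, seed=0):
    """Keep every positive and a random fraction of the negatives."""
    rng = random.Random(seed)
    pos = [x for x in instances if x[1] == 1]
    neg = [x for x in instances if x[1] == 0]
    kept = rng.sample(neg, int(len(neg) * keep_ratio))
    return pos + kept


def oversample(instances, factor):
    """Repeat every positive so that it occurs `factor` times in total."""
    pos = [x for x in instances if x[1] == 1]
    return instances + pos * (factor - 1)


# Toy data set with 6 positives and 94 negatives: (instance id, label)
data = [(i, 1) for i in range(6)] + [(i, 0) for i in range(6, 100)]
down = undersample(data, keep_ratio=0.5)
up = oversample(data, factor=3)
print(len(down), sum(label for _, label in down))
print(len(up), sum(label for _, label in up))
```

Undersampling shrinks the training set, while oversampling only repeats what is already there, which matches the two problems listed above.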

28
Skewedness (MUC6)
29
Downsampling results MUC6
30
Changing loss ratio in Ripper

The loss ratio parameter allows specifying the relative cost of false positives and false negatives
Focus on recall: loss ratio < 1
Focus on precision: loss ratio > 1
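
The effect of the loss ratio can be illustrated with a small cost calculation. The sketch assumes that the loss ratio is the cost of a false positive divided by the cost of a false negative, which is the reading consistent with the two bullets above; how Ripper uses the value internally is not shown here, and the two rule sets and their error counts are hypothetical.

```python
def weighted_loss(fp, fn, loss_ratio):
    """Total cost when one false negative costs 1 and one false positive costs loss_ratio."""
    return loss_ratio * fp + fn


# Two hypothetical rule sets evaluated on the same data
rule_sets = {
    "cautious": {"fp": 5, "fn": 40},   # predicts few positives: precise, low recall
    "eager": {"fp": 30, "fn": 10},     # predicts many positives: high recall, less precise
}


def preferred(loss_ratio):
    """Name of the rule set with the lowest weighted loss."""
    return min(rule_sets, key=lambda name: weighted_loss(loss_ratio=loss_ratio, **rule_sets[name]))


print(preferred(0.5))
print(preferred(2.0))
```

With a loss ratio of 0.5 the recall-oriented rule set is cheaper, with 2.0 the precision-oriented one wins.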

31
Skewedness summary

Comparison of the sensitivity of Timbl and Ripper to the skewed data set (ML past: C4.5)
Both learners: large number of FN
Ripper has a much poorer performance on the minority class (forgetting exceptions?)
Ripper is also more sensitive to rebalancing
No particular downsampling level or loss ratio value leads to overall best performance
=> yet another optimization step ...

32
Testing

Construction of test instances: all NPs starting from the second NP in the document are considered a possible anaphor, whereas all preceding NPs are considered possible antecedents of the anaphor under consideration.
Useful?
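
A sketch of this pairing, with a hypothetical `candidate_pairs` function, makes the cost visible: a document with n NPs gives n(n-1)/2 test instances.

```python
def candidate_pairs(nps):
    """Pair every NP from the second one on with each NP that precedes it."""
    return [(nps[i], nps[j]) for j in range(1, len(nps)) for i in range(j)]


nps = ["The United States", "Pakistan and India", "the promise", "they"]
pairs = candidate_pairs(nps)
for antecedent, anaphor in pairs:
    print(anaphor, "-", antecedent)
print(len(pairs), "pairs for", len(nps), "NPs")
```

The quadratic growth, and the fact that most of these pairs are not coreferential, is what the question above is about.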

34
Testing (ctd.)

Application of optimized classifiers
Antecedent selection
New evaluation procedure: evaluation of the equivalence classes (transitive closure of a coreference chain)
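
Equivalence classes can be built from pairwise decisions with a union-find structure, as sketched below. The sketch takes the positive pairs as given and says nothing about how antecedents are selected; the function name and the example mentions are chosen for illustration.

```python
def coreference_chains(mentions, coreferent_pairs):
    """Group mentions into chains by taking the transitive closure of the pairs."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the representative, shortening the path on the way
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in coreferent_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    # Singletons are not chains
    return [chain for chain in chains.values() if len(chain) > 1]


mentions = ["Pakistan and India", "Those two countries", "The United States", "both countries"]
positive_pairs = [
    ("Pakistan and India", "Those two countries"),
    ("Those two countries", "both countries"),
]
chains = coreference_chains(mentions, positive_pairs)
print(chains)
```

"Pakistan and India" and "both countries" end up in the same chain although the pair itself was never classified as coreferential, which is the point of evaluating on equivalence classes.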

35
Testing results (MUC6)
36
What about ...

One reason Lockheed Martin Corp. did not
announce a full acquisition of Loral Corp. was
that Lockheed could not meet the price he had
placed on Loral's 31 percent ownership of
Globalstar Telecommunications Ltd. Lockheed will
invest $344 million in Loral Space and
Communications Corp., a new company whose
principal holding will be Loral's interest in
Globalstar.

37
What about ...

Hughes pays U.S. $4 mln fine from
whistleblower case. Hughes Electronics Corp. has
paid the U.S. government $4 million to settle a
1990 lawsuit.

38
What about ...

China's Foreign Trade Minister Wu Yi has
extended an olive branch to Taiwan, saying Beijing
remained committed in talks with the breakaway
island to establish direct trade and
communication links.
