Reference Resolution

Slides: 54

Provided by: ginaann

Learn more at: https://www.classes.cs.uchicago.edu


1
Reference Resolution

Natural Language Processing
January 22, 2008

2
Agenda

Reference resolution
Knowledge-rich, deep analysis approaches
Centering
Knowledge-based, shallow analysis: CogNIAC ('95)
Learning approaches: fully, weakly, and unsupervised
Cardie & Ng ('99-'04)

3
Centering

Identify the local center of attention
Pronominalization focuses attention, appropriate
use establishes coherence
Identify entities available for reference
Describe shifts in what discourse is about
Prefer different types for coherence

4
Centering Structures

Each utterance (Un) has
List of forward-looking centers Cf(Un)
Entities realized/evoked in Un
Rank by likelihood of focus of future discourse
Highest ranked element Cp(Un)
Backward looking center (focus) Cb(Un)

5
Centering Transitions
6
Centering Constraints and Rules

Constraints
Exactly ONE backward-looking center
Everything in Cf(Un) realized in Un
Cb(Un): highest-ranked item of Cf(Un-1) that is realized in Un
Rules
If any item in Cf(Un-1) realized as pronoun in
Un, Cb(Un) must be realized as pronoun
Transitions are ranked
Continuing > Retaining > Smooth Shift > Rough Shift

7
Centering Example

John saw a beautiful Acura Integra at the
dealership
Cf: (John, Integra, dealership); no Cb
He showed it to Bill.
Cf: (John/he, Integra/it, Bill); Cb: John/he
He bought it.
Cf: (John/he, Integra/it); Cb: John/he
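
A minimal sketch of the Cb constraint and the ranked transitions, applied to the example above. The transition definitions are an assumption: this transcript does not spell them out, so the code uses the commonly cited four-way classification (compare Cb(Un) with Cb(Un-1) and with Cp(Un)). The function names and the hand-ranked Cf lists are illustrative only.

```python
def backward_center(cf_prev, realized_now):
    """Cb(Un): highest-ranked element of Cf(Un-1) that is realized in Un."""
    for entity in cf_prev:          # cf_prev is ordered from highest to lowest rank
        if entity in realized_now:
            return entity
    return None                     # no Cb, e.g. for the first utterance


def transition(cb_now, cp_now, cb_prev):
    """Classify the transition into Un (assumed four-way table)."""
    same_cb = cb_prev is None or cb_now == cb_prev
    if cb_now == cp_now:
        return "Continuing" if same_cb else "Smooth Shift"
    return "Retaining" if same_cb else "Rough Shift"


# Forward-looking centers for the three utterances, ranked by hand.
cfs = [
    ["John", "Integra", "dealership"],
    ["John", "Integra", "Bill"],
    ["John", "Integra"],
]

cb_prev = None
labels = []
for prev, cur in zip(cfs, cfs[1:]):
    cb = backward_center(prev, set(cur))
    labels.append((cb, transition(cb, cur[0], cb_prev)))   # cur[0] is Cp(Un)
    cb_prev = cb

print(labels)
```

Both pronominal utterances keep John as Cb and as Cp, so under these assumptions each one is labeled as a Continuing transition.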

8
Reference Resolution Differences

Different structures to capture focus
Different assumptions about
number of foci, ambiguity of reference
Different combinations of features

9
Reference Resolution Agreements

Knowledge-based
Deep analysis: full parsing, semantic analysis
Enforce syntactic/semantic constraints
Preferences
Recency
Grammatical Role Parallelism (ex. Hobbs)
Role ranking
Frequency of mention
Local reference resolution
Little/No world knowledge
Similar levels of effectiveness

10
Alternative Strategies

Knowledge-based, but
Shallow processing, simple rules!
CogNIAC (Baldwin '95)
Data-driven
Fully or weakly supervised learning
Cardie & Ng ('02-'04)

11
Questions

80% on (clean) text. What about...
Conversational speech?
Ill-formed, disfluent
Dialogue?
Multiple speakers introduce referents
Multimodal communication?
How else can entities be evoked?
Are all equally salient?

12
More Questions

80% on (clean) (English) text. What about...
Other languages?
Salience hierarchies the same?
Other factors
Syntactic constraints?
E.g. reflexives in Chinese, Korean, ...
Zero anaphora?
How do you resolve a pronoun if you can't find it?

13
CogNIAC

Goal: Resolve with high precision
Identify where ambiguous, use no world knowledge,
simple syntactic analysis
Precision: # correct labelings / # of labelings
Recall: # correct labelings / # of anaphors
Uses simple set of ranked rules
Applied incrementally left-to-right
Designed to work on newspaper articles
Tune/rank rules
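
The two ratios above, as a minimal sketch. The counts in the usage lines are made-up numbers for illustration, not results from any CogNIAC evaluation.

```python
def precision_recall(num_correct, num_labeled, num_anaphors):
    """Precision = correct / labeled; recall = correct / all anaphors."""
    precision = num_correct / num_labeled if num_labeled else 0.0
    recall = num_correct / num_anaphors if num_anaphors else 0.0
    return precision, recall


# Hypothetical run: 100 anaphors, the system labels only 60 of them, 54 correctly.
p, r = precision_recall(num_correct=54, num_labeled=60, num_anaphors=100)
print(f"precision={p:.2f} recall={r:.2f}")
```

A system that declines to label ambiguous pronouns keeps precision high at the cost of recall, which is the trade-off this slide describes.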

14
CogNIAC Rules

Only resolve reference if unique antecedent
1) Unique in prior discourse
2) Reflexive: nearest legal in same sentence
3) Unique in current + prior
4) Possessive pronoun: single exact possessive in prior
5) Unique in current
6) Unique subj/subj pronoun

15
CogNIAC Example

John saw a beautiful Acura Integra in the
dealership.
He showed it to Bill.
He = John (Rule 1); it -> ambiguous (Integra)
He bought it.
He = John (Rule 6); it = Integra (Rule 3)
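
A rough sketch of the "resolve only if unique" idea, covering just Rule 1 and Rule 6 and using gender as the only compatibility check. This is a simplification for illustration, not Baldwin's implementation: the `Mention` class, its fields, and the hand-built sentences are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    entity: str
    gender: str               # "m", "f", or "n"
    is_subject: bool = False


def compatible(mentions, gender):
    """Distinct entities among the mentions that agree in gender."""
    return {m.entity for m in mentions if m.gender == gender}


def resolve(gender, is_subject, prior_sentences):
    """Return (entity, rule), or (None, None) when no rule gives a unique antecedent."""
    # Rule 1: exactly one compatible antecedent in the prior discourse.
    everything = [m for sentence in prior_sentences for m in sentence]
    candidates = compatible(everything, gender)
    if len(candidates) == 1:
        return candidates.pop(), "Rule 1"
    # Rule 6: subject pronoun, and the previous sentence has one compatible subject.
    if is_subject and prior_sentences:
        subjects = compatible(
            (m for m in prior_sentences[-1] if m.is_subject), gender
        )
        if len(subjects) == 1:
            return subjects.pop(), "Rule 6"
    return None, None         # ambiguous: leave the pronoun unresolved


s1 = [Mention("John", "m", True), Mention("Integra", "n"), Mention("dealership", "n")]
s2 = [Mention("John", "m", True), Mention("Bill", "m")]   # "it" stayed unresolved

print(resolve("m", True, [s1]))        # "He" in "He showed it to Bill."
print(resolve("n", False, [s1]))       # "it" in the same sentence
print(resolve("m", True, [s1, s2]))    # "He" in "He bought it."
```

Rules 2 through 5 are omitted, so this sketch does not reproduce the Rule 3 resolution of "it" in the last sentence.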

16
Data-driven Reference Resolution

Prior approaches
Knowledge-based, hand-crafted
Data-driven machine learning approach
Cast coreference as classification problem
For each pair NPi,NPj, do they corefer?
Cluster to form equivalence classes

17
NP Coreference Examples

Link all NPs that refer to the same entity

Queen Elizabeth set about transforming her
husband, King George VI, into a viable monarch.
Logue, a renowned speech therapist, was summoned
to help the King overcome his speech
impediment...
Example from Cardie & Ng 2004
18
Training Instances

25 features per instance: 2 NPs, features, class
lexical (3)
string matching for pronouns, proper names,
common nouns
grammatical (18)
pronoun_1, pronoun_2, demonstrative_2,
indefinite_2,
number, gender, animacy
appositive, predicate nominative
binding constraints, simple contra-indexing
constraints,
span, maximalnp,
semantic (2)
same WordNet class
alias
positional (1)
distance between the NPs in terms of # of
sentences
knowledge-based (1)
naïve pronoun resolution algorithm

19
Classification & Clustering

Classifiers
C4.5 (Decision Trees), RIPPER
Cluster: Best-first, single-link clustering
Each NP in own class
Test preceding NPs
Select highest confidence coref, merge classes
Tune: Training sample skew: class, type
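
The clustering steps above can be sketched as follows. The pairwise classifier is replaced by a hypothetical lookup table of confidences (`toy_scores`), and the 0.5 threshold is an assumption; a trained C4.5 or RIPPER model would supply the scores in practice.

```python
def best_first_cluster(nps, coref_confidence, threshold=0.5):
    """Link each NP to its single most confident preceding antecedent, if any."""
    cluster_of = {i: {i} for i in range(len(nps))}     # each NP in its own class
    for j in range(1, len(nps)):
        best_i, best_conf = None, threshold
        for i in range(j):                             # test preceding NPs
            conf = coref_confidence(nps[i], nps[j])
            if conf > best_conf:
                best_i, best_conf = i, conf
        if best_i is not None:                         # merge the two classes
            merged = cluster_of[best_i] | cluster_of[j]
            for k in merged:
                cluster_of[k] = merged
    # Collect distinct classes in order of first mention.
    seen, clusters = set(), []
    for i in range(len(nps)):
        if i not in seen:
            clusters.append([nps[k] for k in sorted(cluster_of[i])])
            seen |= cluster_of[i]
    return clusters


nps = ["Queen Elizabeth", "her", "husband", "King George VI", "the King", "his"]
# Hypothetical classifier confidences for (earlier NP, later NP) pairs.
toy_scores = {
    ("Queen Elizabeth", "her"): 0.9,
    ("husband", "King George VI"): 0.8,
    ("King George VI", "the King"): 0.85,
    ("husband", "the King"): 0.6,
    ("the King", "his"): 0.7,
    ("King George VI", "his"): 0.65,
}
clusters = best_first_cluster(nps, lambda a, b: toy_scores.get((a, b), 0.0))
print(clusters)
```

Because each NP merges with the class of one chosen antecedent, links propagate through the class: "his" ends up with "husband" even though that pair was never scored as coreferent.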

20
Classifier for MUC-6 Data Set
21
Unsupervised Clustering

Analogous features to supervised
Distance measure: weighted sum of features
Positive infinite weights: block clustering
Negative infinite weights: cluster, unless
blocked
Others, heuristic
If distance > r (cluster radius), non-coref
Clustering
Each NP in own class
Test each preceding NP for dist < r
If so, cluster, UNLESS incompatible NP
Performance: Middling, between best and worst
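
A minimal sketch of the distance measure and the radius-based clustering described above. The two features, their weights, the NP dictionaries, and the radius value are all assumptions chosen to keep the example small; the slide does not list the actual feature set or weights.

```python
import math


def distance(np_a, np_b, features):
    """Weighted sum of feature values, with special handling of infinite weights."""
    total, forced = 0.0, False
    for weight, fires in features:
        value = fires(np_a, np_b)
        if value == 0:
            continue
        if weight == math.inf:
            return math.inf            # positive infinity blocks clustering
        if weight == -math.inf:
            forced = True              # negative infinity forces it, unless blocked
        else:
            total += weight * value
    return -math.inf if forced else total


def cluster(nps, features, r):
    """Return a class id per NP; NPs closer than r merge unless a member is incompatible."""
    class_of = list(range(len(nps)))   # each NP starts in its own class
    for j in range(len(nps)):
        for i in range(j - 1, -1, -1):  # test each preceding NP
            if class_of[i] == class_of[j]:
                continue
            if distance(nps[i], nps[j], features) >= r:
                continue
            members_i = [k for k, c in enumerate(class_of) if c == class_of[i]]
            members_j = [k for k, c in enumerate(class_of) if c == class_of[j]]
            blocked = any(
                distance(nps[a], nps[b], features) == math.inf
                for a in members_i for b in members_j
            )
            if not blocked:
                for k in members_j:
                    class_of[k] = class_of[i]
    return class_of


def gender_mismatch(a, b):
    return 1 if a["gender"] != b["gender"] else 0


def head_mismatch(a, b):
    if a["pronoun"] or b["pronoun"]:
        return 0                       # toy choice: pronouns match any head noun
    return 1 if a["head"] != b["head"] else 0


FEATURES = [(math.inf, gender_mismatch), (1.0, head_mismatch)]
NPS = [
    {"text": "Queen Elizabeth", "head": "Elizabeth", "gender": "f", "pronoun": False},
    {"text": "her", "head": "her", "gender": "f", "pronoun": True},
    {"text": "King George VI", "head": "King", "gender": "m", "pronoun": False},
    {"text": "the King", "head": "King", "gender": "m", "pronoun": False},
    {"text": "his", "head": "his", "gender": "m", "pronoun": True},
]
classes = cluster(NPS, FEATURES, r=0.5)
print(classes)
```

The class-level compatibility check is what distinguishes this from purely pairwise linking: one infinitely distant member is enough to keep two classes apart.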

22
Problem 1

Coreference is a rare relation
skewed class distributions (2% positive
instances)
remove some negative instances

(Figure: NP1 through NP9 in sequence, with the farthest antecedent marked)
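
One way to remove negative instances, sketched under an assumption: the selection rule is taken to be "keep negatives only for NPs lying between an anaphor and its farthest preceding antecedent", as the "farthest antecedent" label in the figure suggests. The gold chain ids below are invented for illustration.

```python
def training_pairs(chain_ids):
    """Build (i, j, is_coref) instances with i < j.

    chain_ids[k] is the gold entity id of the k-th NP, or None for a singleton.
    Pairs are generated only from the farthest antecedent of NP j up to NP j.
    """
    pairs = []
    for j, entity in enumerate(chain_ids):
        if entity is None:
            continue
        antecedents = [i for i in range(j) if chain_ids[i] == entity]
        if not antecedents:
            continue                      # NP j is not anaphoric: no instances
        farthest = antecedents[0]
        for i in range(farthest, j):
            pairs.append((i, j, chain_ids[i] == entity))
    return pairs


# Hypothetical document with seven NPs and two coreference chains, A and B.
chain_ids = ["A", None, "B", "A", None, "B", "A"]
pairs = training_pairs(chain_ids)
num_pos = sum(1 for _, _, label in pairs if label)
num_all_pairs = len(chain_ids) * (len(chain_ids) - 1) // 2
print(len(pairs), num_pos, num_all_pairs)
```

On this toy input the filter keeps 12 of the 21 possible pairs while retaining all 4 positive ones, which raises the share of positive instances.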
23
Problem 2

Coreference is a discourse-level problem
different solutions for different types of NPs
proper names: string matching and aliasing
inclusion of hard positive training instances
positive example selection: selects easy positive
training instances (cf. Harabagiu et al. (2001))

Queen Elizabeth set about transforming her
husband, King George VI, into a viable monarch.
Logue, the renowned speech therapist, was
summoned to help the King overcome his speech
impediment...
24
Problem 3

Coreference is an equivalence relation
loss of transitivity
need to tighten the connection between
classification and clustering
prune learned rules w.r.t. the clustering-level
coreference scoring function

(Figure: pairwise decisions over NPs in "Queen Elizabeth set about transforming her husband, ...": two pairs marked "coref?" and one marked "not coref?")
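
A small sketch of why independent pairwise decisions can lose transitivity. It takes the transitive closure of the positive decisions with a union-find structure and reports any negative decision that the closure contradicts. The NP names and decisions are placeholders, not the pairs from the figure.

```python
def closure_conflicts(positive_pairs, negative_pairs):
    """Return negative pairs that end up in one class after closing the positives."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in positive_pairs:
        parent[find(a)] = find(b)           # union the two classes
    return [(a, b) for a, b in negative_pairs if find(a) == find(b)]


# Placeholder classifier output: NP1~NP2 and NP2~NP3 are coref, NP1~NP3 is not.
conflicts = closure_conflicts(
    positive_pairs=[("NP1", "NP2"), ("NP2", "NP3")],
    negative_pairs=[("NP1", "NP3")],
)
print(conflicts)
```

Clustering must override one of the three decisions, which is why the classifier's pairwise accuracy and the final clustering score can diverge.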
25
Weakly Supervised Learning

Exploit small pool of labeled training data
Larger pool of unlabeled data
Single-View Multi-Learner Co-training
2 different learning algorithms, same feature set
each classifier labels unlabeled instances for
the other classifier
data pool is flushed after each iteration

26
Effectiveness

Supervised learning approaches
Comparable performance to knowledge-based
Weakly supervised approaches
Decent effectiveness, still lags supervised
Dramatically less labeled training data
1K vs 500K

27
Reference Resolution Extensions

Cross-document co-reference
(Baldwin & Bagga 1998)
Break the document boundary
Question: John Smith in A = John Smith in B?
Approach: Integrate within-document co-reference
with Vector Space Model similarity

28
Cross-document Co-reference

Run within-document co-reference (CAMP)
Produce chains of all terms used to refer to
entity
Extract all sentences with reference to entity
Pseudo per-entity summary for each document
Use Vector Space Model (VSM) distance to compute
similarity between summaries
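
A minimal sketch of the summary-comparison step, assuming raw term-frequency vectors and cosine similarity. The original system may weight terms differently (for example with inverse document frequency); the summaries and the 0.5 threshold below are invented for illustration.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Bag-of-words term frequencies over lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def same_entity(summary_a, summary_b, threshold=0.5):
    """Decide whether two per-entity summaries describe the same person."""
    return cosine(tf_vector(summary_a), tf_vector(summary_b)) > threshold


# Hypothetical per-entity summaries extracted from three documents.
doc_a = "John Smith chairman of General Motors announced record profits"
doc_b = "General Motors chairman John Smith said profits rose"
doc_c = "John Smith a pitcher struck out nine batters"

sim_ab = cosine(tf_vector(doc_a), tf_vector(doc_b))
sim_ac = cosine(tf_vector(doc_a), tf_vector(doc_c))
print(round(sim_ab, 3), round(sim_ac, 3))
```

The shared name contributes to every pair's similarity, so the decision rests on the surrounding context words that the within-document chains pull into each summary.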

29
Cross-document Co-reference

Experiments
197 NYT articles referring to John Smith
35 different people, 24 with 1 article each
With CAMP: Precision 92%, Recall 78%
Without CAMP: Precision 90%, Recall 76%
Pure Named Entity: Precision 23%, Recall 100%

30
Conclusions

Co-reference establishes coherence
Reference resolution depends on coherence
Variety of approaches
Syntactic constraints, Recency, Frequency, Role
Similar effectiveness - different requirements
Co-reference can enable summarization within and
across documents (and languages!)

31
Coherence & Coreference

Cohesion: Establishes semantic unity of discourse
Necessary condition
Different types of cohesive forms and relations
Enables interpretation of referring expressions
Reference resolution
Syntactic/Semantic Constraints/Preferences
Discourse, Task/Domain, World knowledge
Structure and semantic constraints

32
Challenges

Alternative approaches to reference resolution
Different constraints, rankings, combination
Different types of referent
Speech acts, propositions, actions, events
Inferrables, e.g. car -> door, hood, trunk, ...
Discontinuous sets
Generics
Time

33
Discourse Structure Theories

Natural Language Processing
CMSC 35100-1
January 22, 2008

34
Roadmap

Goals of Discourse Structure Models
Limitations of early approaches
Models of Discourse Structure
Attention & Intentions (Grosz & Sidner '86)
Rhetorical Structure Theory (Mann & Thompson '87)
Contrasts, Constraints & Conclusions

35
Why Model Discourse Structure? (Theoretical)

Discourse not just constituent utterances
Create joint meaning
Context guides interpretation of constituents
How????
What are the units?
How do they combine to establish meaning?
How can we derive structure from surface forms?
What makes discourse coherent vs not?
How do they influence reference resolution?

36
Why Model Discourse Structure? (Applied)

Design better summarization, understanding
Improve speech synthesis
Influenced by structure
Develop approach for generation of discourse
Design dialogue agents for task interaction
Guide reference resolution

37
Early Discourse Models

Schemas & Plans
(McKeown, Reichman, Litman & Allen)
Task/Situation model = discourse model
Specific -> General: restaurant -> AI planning
Topic/Focus Theories (Grosz '76, Sidner '76)
Reference structure = discourse structure
Speech Act
Single-utterance intentions vs extended discourse

38
Discourse Models: Common Features

Hierarchical, Sequential structure applied to
subunits
Discourse segments
Need to detect, interpret
Referring expressions provide coherence
Explain and link
Meaning of discourse more than that of component
utterances
Meaning of units depends on context

39
Earlier Models

Issues
Conflate different aspects of discourse
Task plan, discourse plan
Ignore aspects of discourse
Goals, intentions vs focus
Overspecific
Fixed plan, schema, relation inventory

40
Attention, Intentions and the Structure of
Discourse

Grosz & Sidner (1986)
Goals
Integrate approaches for focus (reference res.),
plan/task structure, discourse structure, goals
Three part model
Linguistic structure (utterances)
Attentional structure (focus, reference)
Intentional structure (plans, purposes)

41
Linguistic Structure

Utterances group into discourse segments
Hierarchical, not necessarily contiguous
Not strictly decompositional
2-way interactions
Utterances define structure
Cue phrases mark segment boundaries
But, okay, fine, incidentally
Structure guides interpretation
Reference

42
Intentional Structure

Discourse participants' overall purpose
Discourse segments have purposes (DP/DSP)
Contribute to overall
Main DP/DSP intended to be recognized

43
Intentional Structure Relations

Two relations between purposes
Dominance
DSP1 dominates DSP2 if doing DSP2 contributes to
achieving DSP1
Satisfaction-Precedence
DSP1 must be satisfied before DSP2
Purposes
Intend that someone know something, do something,
believe something, etc
Open-ended

44
Attentional State

Captures focus of attention in discourse
Incremental
Focus Spaces
Include entities salient/evoked in discourse
Include a current DSP
Stack-structured
higher -> more salient; lower still accessible
Push: segment contributes to previous DSP
Pop: segment contributes to more dominant DSP
Tied to intentional structure

45
Attentional State (cont'd)

Focusing structure depends on the intentional
structure: the relationships between DSPs
determine pushes and pops from the stack
Focusing structure coordinates the linguistic and
intentional structures during processing
Like the other 2 structures, focusing structure
evolves as discourse proceeds
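
A minimal sketch of the focus-space stack, assuming the dominance relation between DSPs is already known. The `AttentionalState` class, the `dominated_by` mapping, and the three-segment reading of the flywheel task are hypothetical; recognizing the DSPs themselves is the hard part and is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    dsp: str
    entities: set = field(default_factory=set)


class AttentionalState:
    def __init__(self, dominated_by):
        # Maps each DSP to the DSP that directly dominates it (assumed given).
        self.dominated_by = dominated_by
        self.stack = []

    def open_segment(self, dsp):
        # Pop until the top focus space belongs to the DSP dominating the new one.
        while self.stack and self.stack[-1].dsp != self.dominated_by.get(dsp):
            self.stack.pop()
        self.stack.append(FocusSpace(dsp))   # push the new segment's focus space

    def mention(self, entity):
        self.stack[-1].entities.add(entity)

    def accessible(self):
        # Entities in higher focus spaces come first (more salient).
        return [e for space in reversed(self.stack) for e in sorted(space.entities)]


state = AttentionalState(
    dominated_by={
        "loosen screw": "remove flywheel",
        "pull off flywheel": "remove flywheel",
    }
)
state.open_segment("remove flywheel")
state.mention("flywheel")
state.open_segment("loosen screw")        # contributes to the previous DSP: push
state.mention("screw")
state.mention("key")
state.open_segment("pull off flywheel")   # sibling segment: pop, then push
state.mention("wheelpuller")
print(state.accessible())
```

After the second subsegment opens, the screw and key are no longer on the stack, while the flywheel from the dominating segment remains accessible at lower salience.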

46
Discourse examples

Essay
Task-oriented dialog
Intentional structure is neither identical nor
isomorphic to the general plan

47
The "movies" are so attractive to the great
American public, especially to young people, that
it is time to take careful thought about their
effect on mind and morals. Ought any parent to
permit his children to attend a moving picture
show often or without being quite certain of the
show he permits them to see? No one can deny, of
course, that great educational and ethical gains
may be made through the movies because of their
astonishing vividness. But the important fact to
be determined is the total result of continuous
and indiscriminate attendance on shows of this
kind. Can it be other than harmful? In the first
place the character of the plays is seldom of the
best. One has only to read the ever-present
"movie" billboard to see how cheap, melodramatic
and vulgar most of the photoplays are. Even the
best plays, moreover, are bound to be exciting
and over-emotional. Without spoken words, facial
expression and gesture must carry the meaning;
but only strong emotion or buffoonery can be
represented through facial expression and
gesture. The more reasonable and quiet aspects of
life are necessarily neglected. How can our young
people drink in through their eyes a continuous
spectacle of intense and strained activity and
feeling without harmful effects? Parents and
teachers will do well to guard the young against
overindulgence in the taste for the "movie".
48
H1. First you have to remove the flywheel.
R2. How do I remove the flywheel?
H3. First, loosen the screw, then pull it off.
R4. OK.
5. The tool I have is awkward. Is there another tool that I could use instead?
H6. Show me the tool you are using.
R7. OK.
H8. Are you sure you are using the right size key?
R9. I'll try some others.
10. I found an angle I can get at it.
11. The screw is loose, but I'm having trouble getting the flywheel off.
H12. Use the wheelpuller. Do you know how to use it?
R13. No.
H14. Do you know what it looks like?
R15. Yes.
H16. Show it to me please.
R17. OK.
H18. Good. Loosen the screw in the center and place the jaws around the hub of the flywheel, then tighten the screw onto the center of the shaft. The flywheel should slide off.
49
Processing issues

Intention recognition
What info can be used to recognize an intention?
At what point does this info become available?
Overall processing module has to be able to
operate on partial information
It must allow for incrementally constraining the
range of possibilities on the basis of new info
that becomes available as the segment progresses