Coreference - PowerPoint PPT Presentation

About This Presentation
Title:

Coreference


Provided by: SEEM9

Transcript and Presenter's Notes



1
Coreference and Anaphora Resolution with Markov
Logic Network
  • Cecia Chan
  • Supervised by Wai Lam
  • Department of Systems Engineering and Engineering
    Management,
  • The Chinese University of Hong Kong

2
Introduction
  • Overwhelming knowledge information in texts
  • Complicated structures, formation of relations
    and entities
  • Understanding of language
  • Function annotation, knowledge extraction
  • Resolving relations
  • Identifying and resolving the relation between
    different named entities
  • Coreference resolution: identifying unique
    concepts
  • Two important subtasks
  • Named entity recognition
  • Coreference resolution

3
Named Entity Recognition
  • Named entity recognition (NER)
  • a subtask of information extraction
  • locate and classify atomic elements in text into
    predefined categories such as the names of
    persons, organizations, locations, etc.
  • Much attention on biological domain
  • Recognition of Gene/Protein names

4
Coreference Resolution
  • Coreference
  • Variation of named entities
  • Cohesion in clauses and sentences
  • Presentation as a coherent continuation
  • E.g. repetition, conjunction, thematization
  • Lexical and Grammatical Repetition
  • Reference
  • Exophoric reference
  • Pointing outwards
  • Endophoric reference
  • Pointing inwards

5
Coreference Resolution
  • Finding the equivalence relation between noun
    phrases
  • Variation [Nenadic et al., 02]
  • E.g. Nuclear factor-kappa B
  • Orthographic
  • Nuclear factor kappa B
  • Morphological
  • mRNAs -> mRNA
  • Syntactic
  • SMRT and Trip-1 mRNAs -> SMRT mRNA, Trip-1 mRNA
  • Acronyms
  • NF-kappa B, NF-kB
  • Semantic
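
A minimal sketch of how the orthographic and morphological variants above could be mapped to one canonical string. The function name normalize_term and its rules (hyphens become spaces, lowercasing, a naive plural strip) are assumptions for illustration, not the method of Nenadic et al.; acronyms and semantic variants are deliberately not handled.

```python
def normalize_term(term: str) -> str:
    """Map simple orthographic/morphological variants to one key.

    Hypothetical rules, for illustration only:
      - hyphens are treated as spaces (orthographic variation)
      - case is ignored
      - a trailing plural "s" is stripped from longer tokens
        (morphological variation, e.g. mRNAs -> mRNA)
    """
    tokens = term.replace("-", " ").lower().split()
    normalized = []
    for tok in tokens:
        # Naive plural stripping; skips short tokens and "-ss" endings.
        if len(tok) > 3 and tok.endswith("s") and not tok.endswith("ss"):
            tok = tok[:-1]
        normalized.append(tok)
    return " ".join(normalized)


same_orthographic = (
    normalize_term("Nuclear factor-kappa B")
    == normalize_term("Nuclear factor kappa B")
)
same_morphological = normalize_term("mRNAs") == normalize_term("mRNA")
# Acronym variants need a separate expansion step, so these stay distinct.
acronym_unresolved = normalize_term("NF-kB") != normalize_term("NF-kappa B")
```

String normalization alone only covers the first two variation types; the remaining ones are where the logical constraints later in the deck come in.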

6
Anaphora Resolution
  • Anaphors
  • Personal pronouns
  • E.g. he, she, it, they, them
  • Possessive pronouns
  • E.g. its, his, etc.
  • Demonstratives
  • This, that, these, those, here, there, now, then
  • Definite noun phrases
  • E.g. the girl
  • Comparatives
  • The same as,
  • Different,
  • Another,
  • The third, etc.

7
Coreference Resolution -- Related Work
  • Linguistic approaches
  • Hobbs algorithm
  • Searching along the syntactic parse tree of
    sentences
  • Employing syntactic constraints for resolving
    pronouns
  • Contra-indexing
  • E.g. Mark was talking to him
  • Mark was talking to himself
  • Centering theory
  • Tracking entities in focus
  • Resolving demonstrative noun phrases or pronouns
  • By ranking entities

8
Coreference Resolution -- Related Work
  • Machine Learning Approaches
  • Naïve Bayes
  • Decision Trees
  • Conditional Random Fields
  • Discriminative model
  • McCallum et al. proposed three CRF-based models
  • Restricting clique potentials to only pairs of
    mentions
  • F1 of 73 on NP coreference on the MUC-6 dataset
  • Coreference as clustering

9
Markov Logic Network
  • Proposed by Richardson & Domingos, 2004
  • Combining first-order logic and probabilistic
    graphical models in a single representation
  • MLN: a first-order knowledge base with a weight
    attached to each formula
  • Flexible
  • Incorporating domain knowledge
  • Enabling uncertainty, tolerating contradictory
    knowledge

10
First-order Logic
  • First-order logic knowledge base (KB)
  • Set of sentences or formulas in first-order logic
  • Formulas are built from constants, variables,
    functions, and predicates
  • E.g. Friends(x, y), Author(x)
  • A KB is a set of hard constraints on the set of
    possible worlds (a world assigns a truth value to
    each possible ground atom, i.e. an atom whose
    arguments (variables) are all replaced with
    constants)
  • A world has zero probability even when it
    violates only one formula

11
Markov Logic
  • Softening the constraints
  • The fewer formulas a world violates, the more
    probable it is.
  • Associating weight with formula to reflect the
    strength of a constraint.
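  • For a possible world x, the probability
    distribution is
    P(X = x) = \frac{1}{Z} \exp\left( \sum_{i=1}^{F} w_i \, n_i(x) \right)
  • where F is the number of formulas in the MLN,
    w_i is the weight of formula F_i, n_i(x) is the
    number of true groundings of F_i in x, and Z is
    the normalizing constant.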
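
A small sketch of this softening, assuming the standard Markov logic form P(x) = exp(sum_i w_i n_i(x)) / Z. It enumerates every world over two ground atoms and one weighted formula; the constant A, the weight 1.5, and the helper world_distribution are made up for illustration.

```python
import itertools
import math
from typing import Callable, Dict, List, Tuple

World = Dict[str, bool]
# A formula is a (weight, counter) pair; the counter returns n_i(x),
# the number of true groundings of the formula in world x.
Formula = Tuple[float, Callable[[World], int]]


def world_distribution(
    atoms: List[str], formulas: List[Formula]
) -> List[Tuple[World, float]]:
    """Enumerate all worlds and return (world, probability) pairs."""
    worlds = [
        dict(zip(atoms, values))
        for values in itertools.product([False, True], repeat=len(atoms))
    ]
    scores = [math.exp(sum(w * n(x) for w, n in formulas)) for x in worlds]
    z = sum(scores)  # normalizing constant Z
    return [(x, s / z) for x, s in zip(worlds, scores)]


atoms = ["Studies(A)", "Student(A)"]
# Studies(A) => Student(A), with an illustrative weight of 1.5.
formulas: List[Formula] = [
    (1.5, lambda x: int((not x["Studies(A)"]) or x["Student(A)"])),
]
dist = world_distribution(atoms, formulas)

# The single world that violates the formula.
p_violating = next(
    p for x, p in dist if x["Studies(A)"] and not x["Student(A)"]
)
```

The violating world keeps a non-zero probability, 1 / (3e^1.5 + 1), which is lower than that of each of the three worlds satisfying the formula. Exhaustive enumeration only works for toy cases, since the number of worlds doubles with every ground atom.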

12
Example of ground MLN
  • ∀x, Studies(x) ⇒ Student(x)
  • ∀x ∀y, Friends(x, y) ⇒ (Studies(x) ⇔ Studies(y))
  • C = {Ann, Mary}

[Figure: ground Markov network over the atoms Friends(Ann, Ann), Friends(Ann, Mary), Friends(Mary, Ann), Friends(Mary, Mary), Studies(Ann), Studies(Mary), Student(Ann), Student(Mary)]

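
The grounding in this example can be sketched as follows: each predicate is instantiated with every combination of the constants Ann and Mary, and atoms that appear together in one ground formula form a clique of the ground network. The helper names (atom, ground_atoms, ground_cliques) are hypothetical, and the sketch is not tied to any MLN toolkit.

```python
import itertools
from typing import Dict, FrozenSet, List

constants = ["Ann", "Mary"]
predicates: Dict[str, int] = {"Friends": 2, "Studies": 1, "Student": 1}


def atom(pred: str, *args: str) -> str:
    """Render a ground atom such as 'Friends(Ann, Mary)'."""
    return f"{pred}({', '.join(args)})"


def ground_atoms(preds: Dict[str, int], consts: List[str]) -> List[str]:
    """Replace every argument of every predicate with constants."""
    return [
        atom(name, *args)
        for name, arity in preds.items()
        for args in itertools.product(consts, repeat=arity)
    ]


def ground_cliques(consts: List[str]) -> List[FrozenSet[str]]:
    """Atoms sharing a ground formula are pairwise connected."""
    cliques = []
    # Formula 1: Studies(x) => Student(x)
    for x in consts:
        cliques.append(frozenset({atom("Studies", x), atom("Student", x)}))
    # Formula 2: Friends(x, y) => (Studies(x) <=> Studies(y))
    for x, y in itertools.product(consts, repeat=2):
        cliques.append(
            frozenset(
                {atom("Friends", x, y), atom("Studies", x), atom("Studies", y)}
            )
        )
    return cliques


atoms = ground_atoms(predicates, constants)
cliques = ground_cliques(constants)
```

With two constants this yields the eight ground atoms of the figure and six ground formulas.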
13
Markov Logic for Coreference Resolution
  • Statistical relational learning task
  • Can be concisely formulated in MLNs
  • Link Prediction/ Link-based clustering
  • Decision on relations between objects
  • R(x1, x2)
  • Coreference
  • If there exists a cohesion between two objects
    x1 and x2
  • ⇒ COREF(x1, x2)
  • Where x1, x2 can be proper name entities,

14
Problem Formulation
  • Proper Name Resolution
  • Anaphora Resolution
  • Considering only surface strings or parse
    structures is not enough
  • Cross-sentence analysis
  • Linguists have long tried to define anaphoric
    relations
  • Syntactic clues
  • Semantic relations
  • Deduce and analyze as a logical network
  • Not hard constraints
  • Aim: incorporate those anaphoric relations
    into a Markov Logic Network

15
Problem Formulation
  • Assumption
  • A syntactic or semantic chain exists between two
    coherent entities
  • Linguistic constraints on coherent entities
  • Syntactic or Semantic Chains
  • The syntactic or semantic path between two
    coherent objects (entities)
  • Syntactic chains
  • From parse tree
  • E.g. (subj(A), comp_of(A, B) ,)
  • Semantic chains
  • From Discourse Representation
  • (agent(A), activity(A, B) )

16
Linguistic Constraints
  • Feature Comparison
  • ∀x1,x2 ∀y1,y2, Feature(x1, y1) ∧ Feature(x2, y2)
    ∧ y1 = y2 ⇒ COREF(x1, x2)
  • Agreement
  • Quantity-agreement(x1, x2)
  • x1 and x2 must agree on the quantities involved
  • ∀x1 ∀x2, (quantity(x1) = quantity(x2)) ⇒
    quantity-agreement(x1, x2)
  • Identity-agreement(x1, x2)
  • Object type agreement
  • (type(x1) = type(x2)) ⇒ Identity-agreement(x1,
    x2)
  • Colinked objects (Reflexives)
  • A reflexive appears within the same sentence as
    its antecedent
  • ∀x1 ∀x2, NounPhrase(x1) ∧ reflexive(x2) ∧
    closest(x1, x2) ∧ within-sent(x1, x2) ⇒
    COREF(x1, x2)
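
A minimal sketch of how the agreement predicates and the reflexive rule might be evaluated over mentions. The Mention record, its fields, and the reading of closest(x1, x2) as "nearest preceding noun phrase" are assumptions for illustration; in an MLN these rules would carry weights rather than act as hard filters.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; field names are illustrative."""
    text: str
    quantity: str        # e.g. "sg" or "pl"
    type: str            # e.g. "person", "protein"
    sentence: int        # index of the sentence containing the mention
    position: int        # token offset inside the sentence
    reflexive: bool = False


def quantity_agreement(x1: Mention, x2: Mention) -> bool:
    return x1.quantity == x2.quantity


def identity_agreement(x1: Mention, x2: Mention) -> bool:
    return x1.type == x2.type


def within_sent(x1: Mention, x2: Mention) -> bool:
    return x1.sentence == x2.sentence


def reflexive_antecedent(
    refl: Mention, mentions: List[Mention]
) -> Optional[Mention]:
    """Closest preceding non-reflexive mention in the same sentence."""
    candidates = [
        m for m in mentions
        if not m.reflexive
        and within_sent(m, refl)
        and m.position < refl.position
    ]
    return max(candidates, key=lambda m: m.position, default=None)


# "John left. Mark was talking to himself."
john = Mention("John", "sg", "person", sentence=0, position=0)
mark = Mention("Mark", "sg", "person", sentence=1, position=0)
himself = Mention("himself", "sg", "person", sentence=1, position=4,
                  reflexive=True)
antecedent = reflexive_antecedent(himself, [john, mark, himself])
```

John is ruled out by within-sent, so the rule proposes COREF(Mark, himself); the two agreement checks hold for that pair as well.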

17
Linguistic Constraints
  • C(onstituent)-command
  • Def. 1
  • A is said to c-command B iff the first branching
    node that dominates A also dominates B.
  • Def. 2
  • If a pronoun does not c-command the other noun
    phrase, they may refer to the same individual
  • ∀x1 ∀x2, Pronoun(x1) ∧ NounPhrase(x2) ∧
    ¬c-commands(x1, x2) ∧ within_sent(x1, x2) ⇒
    COREF(x1, x2)
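
Def. 1 can be sketched on a toy constituency tree as follows. The Node class and the tree for "Mark was talking to him" are illustrative assumptions, and the sketch adds the common side condition that neither node dominates the other, which the slide does not state.

```python
from typing import List, Optional


class Node:
    """Minimal parse-tree node (hypothetical, for illustration)."""

    def __init__(self, label: str, children: Optional[List["Node"]] = None):
        self.label = label
        self.children: List["Node"] = children or []
        self.parent: Optional["Node"] = None
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """Def. 1: the first branching node dominating a also dominates b.

    Assumption beyond the slide: a node does not c-command itself or
    anything it dominates (or is dominated by).
    """
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent  # skip non-branching ancestors
    return node is not None and dominates(node, b)


# [S [NP Mark] [VP [V was talking] [PP [P to] [NP him]]]]
mark = Node("NP:Mark")
him = Node("NP:him")
pp = Node("PP", [Node("P:to"), him])
vp = Node("VP", [Node("V:was talking"), pp])
s = Node("S", [mark, vp])

mark_commands_him = c_commands(mark, him)
him_commands_mark = c_commands(him, mark)
```

In this tree the subject c-commands the object pronoun, but not the other way round, since the first branching node above "him" is the PP.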

18
Linguistic Constraints
  • Theme of text
  • Parallelism
  • Objects that occur in parallel positions
  • Mary killed John.
  • She also killed Ben.
  • Mary and She are coherent.
  • ∀x1 ∀x2, subj(x1, z1) ∧ subj(x2, z2) ∧
    within-window(z1, z2) ⇒ parallel(x1, x2)
  • ∀x1 ∀x2, parallel(x1, x2) ∧ R(x1, x2) ⇒ COREF(x1,
    x2)
  • Dependency
  • Logical relations between clauses: clues
    indicating coherence
  • E.g. cause-effect, condition
  • Resemblance relations
  • Elaboration relations
  • Semantic-variants
  • For demonstrative pronouns
  • Semantic synonyms among the objects: clues
    for coherence
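
The parallelism rule can be sketched as a search for subjects whose clauses lie within a small window of each other. The function parallel_pairs, the clause indices, and the window size of 1 are assumptions for illustration; the further relation R(x1, x2) required before concluding COREF is left out.

```python
from typing import List, Tuple


def parallel_pairs(
    subjects: List[Tuple[str, int]], window: int = 1
) -> List[Tuple[str, str]]:
    """Pair up subjects of clauses that lie within `window` clauses.

    `subjects` holds (mention, clause_index) facts, i.e. subj(x, z).
    """
    pairs = []
    for i, (x1, z1) in enumerate(subjects):
        for x2, z2 in subjects[i + 1:]:
            if abs(z1 - z2) <= window:  # within-window(z1, z2)
                pairs.append((x1, x2))
    return pairs


# "Mary killed John. She also killed Ben." plus a distant clause.
subjects = [("Mary", 0), ("She", 1), ("They", 5)]
pairs = parallel_pairs(subjects)
```

Only (Mary, She) is proposed as parallel here; "They" sits too far away to fall inside the window.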

19
Current Issues
  • Syntactic and Semantic Chain Generation
  • Weight Estimation
  • Inference
  • Constraints validation
  • Datasets
  • MUC-6 Coreference Datasets
  • Annotation of Coreference Datasets on Medstract

20
Future Work
  • Tasks of information processing in text are
    inter-related to each other
  • Considering two or more tasks together can affect
    the performance
  • Named entity recognition and normalization
    (coreference)
  • Usually done in two separate processes
  • Actual decision making
  • Entity mentions being coreferenced
  • Helps verify the corresponding mentions as
    being named entities
  • Mutual reinforcement

21
The End
Thank you very much!