Extended Gloss Overlaps as a Measure of Semantic Relatedness

Provided by: satanjeev
Learn more at: http://www.cs.cmu.edu

1
Extended Gloss Overlaps as a Measure of Semantic
Relatedness
  • Satanjeev Banerjee, Carnegie Mellon University
  • Ted Pedersen, University of Minnesota Duluth
  • Supported by NSF Grants 0092784, REC-9979894

2
Semantic Relatedness
  • Some pairs of words are closer in meaning than
    others
  • E.g. car and tire are strongly related
  • car and tree are not strongly related
  • Relatedness between words can consist of:
  • Synonymy, e.g. car and automobile
  • Is-a/has-a relationships, e.g. car and tire
  • Co-occurrence, e.g. car and insurance

3
Goal of this Paper
  • Create a measure to quantify semantic relatedness
  • Most existing work measures noun-noun only:
    Resnik (1995), Lin (1997), Jiang-Conrath (1997),
    Leacock-Chodorow (1998)
  • We can measure across parts of speech.
  • Based on WordNet definitions and relations.
  • Evaluate:
  • Using word sense disambiguation.
  • Compare to human relatedness judgments (in paper)

4
Description of WordNet
  • Online English lexical database.
  • Like dictionaries, contains word senses and their
    definitions or glosses
  • E.g. sentence: the penalty meted out to one
    adjudged guilty
  • Word senses that mean the same are grouped into
    synonym sets or synsets
  • E.g. sentence, conviction, condemnation

9
Semantic Relations in WordNet
  • Synsets are connected to other synsets through
    semantic relations
  • sentence: the penalty meted out to one adjudged
    guilty
  • Hypernym: final judgment (a sentence is a final
    judgment): a judgment disposing of the case
    before the court of law
  • Hyponym: hard time (hard time is a sentence):
    term served in a maximum security prison
  • Hyponym: death penalty (death penalty is a
    sentence): punishment by death via execution

12
Gloss Overlaps as Relatedness
  • Lesk's (1986) idea: related word senses are
    (often) defined using the same words. E.g.
  • bank(1): a financial institution
  • bank(2): sloping land beside a body of water
  • lake: a body of water surrounded by land
  • Gloss overlaps = # content words common to two
    glosses, taken as relatedness
  • Thus, relatedness(bank(2), lake) = 3
  • And, relatedness(bank(1), lake) = 0
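
The two relatedness values on this slide can be reproduced with a small sketch. The stop list below is an assumption made for illustration (the slides do not say which function words were excluded), and the helper names are hypothetical.

```python
# Illustrative stop list; an assumption, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "by", "to", "in", "on", "beside"}

def content_words(gloss):
    """Lowercase the gloss, split on whitespace, and drop stop words."""
    return {w for w in gloss.lower().split() if w not in STOP_WORDS}

def gloss_overlap(gloss_a, gloss_b):
    """Number of distinct content words shared by two glosses."""
    return len(content_words(gloss_a) & content_words(gloss_b))

# Glosses exactly as given on the slide.
glosses = {
    "bank(1)": "a financial institution",
    "bank(2)": "sloping land beside a body of water",
    "lake": "a body of water surrounded by land",
}

print(gloss_overlap(glosses["bank(2)"], glosses["lake"]))  # 3 (land, body, water)
print(gloss_overlap(glosses["bank(1)"], glosses["lake"]))  # 0
```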

13
Limitations of (Lesk's) Gloss Overlaps
  • Most glosses are very short.
  • So there are not enough words to find overlaps
    with.
  • Solution: extended gloss overlaps
  • Add glosses of synsets connected to the input
    synsets.

14
Extending a Gloss
  • sentence: the penalty meted out to one adjudged
    guilty
  • bench: persons who hear cases in a court of
    law
  • Overlapped words: 0

16
Extending a Gloss
  • sentence: the penalty meted out to one adjudged
    guilty
  • Hypernym of sentence: final judgment: a judgment
    disposing of the case before the court of law
  • bench: persons who hear cases in a court of
    law
  • Overlapped words: 2 (court, law)

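
A minimal sketch of the extension step, assuming that "extending" a gloss means appending the hypernym's gloss before counting shared content words. The stop list and function names are illustrative assumptions.

```python
# Illustrative stop list; an assumption, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "who", "before", "out", "one"}

def content_words(gloss):
    """Distinct non-stop words of a gloss."""
    return {w for w in gloss.lower().split() if w not in STOP_WORDS}

def overlap(gloss_a, gloss_b):
    """Number of content words the two glosses share."""
    return len(content_words(gloss_a) & content_words(gloss_b))

sentence = "the penalty meted out to one adjudged guilty"
# Gloss of the hypernym of "sentence", as shown on the slide.
final_judgment = "a judgment disposing of the case before the court of law"
bench = "persons who hear cases in a court of law"

print(overlap(sentence, bench))                         # 0
print(overlap(sentence + " " + final_judgment, bench))  # 2 (court, law)
```
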
17
Creating the Extended Gloss Overlap Measure
  • How to measure overlaps?
  • Which relations to use for gloss extension?

18
How to Score Overlaps?
  • Lesk simply summed up overlapped words.
  • But matches involving phrases (phrasal matches)
    are rarer, and more informative
  • E.g. court of law
  • Aim: score of n words in a phrase > sum of
    scores of n words in shorter phrases
  • Solution: give a phrase of n words a score of
    n^2
  • E.g. court of law gets a score of 3^2 = 9.
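
The scoring idea can be sketched as follows, assuming an n-word phrase scores n^2 (consistent with court of law scoring 9, whereas three separate single-word matches would give only 1 + 1 + 1 = 3). The greedy longest-phrase-first matching and the rule that a phrase may not start or end with a function word are assumptions of this sketch, since the slides do not spell out the matching procedure. The stop list is illustrative.

```python
# Illustrative stop list; an assumption, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "on", "by", "who", "before"}

def longest_common_phrase(a, b):
    """Return (length, start_in_a, start_in_b) of the longest shared run of
    words that neither starts nor ends with a stop word; length 0 if none."""
    best = (0, 0, 0)
    for i, word in enumerate(a):
        if word is None or word in STOP_WORDS:
            continue  # masked position, or phrase would start with a stop word
        for j in range(len(b)):
            n = 0
            while (i + n < len(a) and j + n < len(b)
                   and a[i + n] is not None and a[i + n] == b[j + n]):
                n += 1
            # Trim trailing stop words so the phrase ends on a content word.
            while n > 0 and a[i + n - 1] in STOP_WORDS:
                n -= 1
            if n > best[0]:
                best = (n, i, j)
    return best

def phrasal_overlap_score(gloss_a, gloss_b):
    """Repeatedly take the longest shared phrase, add n**2, and mask it out."""
    a, b = gloss_a.lower().split(), gloss_b.lower().split()
    score = 0
    while True:
        n, i, j = longest_common_phrase(a, b)
        if n == 0:
            return score
        score += n * n
        a[i:i + n] = [None] * n  # mask matched words so they are not reused
        b[j:j + n] = [None] * n

print(phrasal_overlap_score(
    "a judgment disposing of the case before the court of law",
    "persons who hear cases in a court of law"))   # 9  (court of law)
print(phrasal_overlap_score(
    "sloping land beside a body of water",
    "a body of water surrounded by land"))         # 10 (body of water: 9, land: 1)
```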

19
Which Relations to Use?
  • Hypernyms: car → vehicle
  • Hyponyms: car → convertible
  • Meronyms: car → accelerator
  • Holonyms: car → train
  • Also-see relation: enter → move in
  • Attribute: measure → standard
  • Pertainym: centennial → century

20
Extended Gloss Overlap Measure
  • Input: two synsets A and B
  • Find phrasal gloss overlaps between A and B
  • Next, find phrasal gloss overlaps between
    every synset connected to A and every synset
    connected to B
  • Compute phrasal scores for all such overlaps
  • Add phrasal scores to get relatedness of A and B
  • A and B can be from different parts of speech.
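
The steps above can be sketched on the toy synsets from the earlier slides. This is one reading of the procedure, not the authors' implementation: it compares each of A and its connected synsets against each of B and its connected synsets, and sums the n^2 phrasal scores. The `GLOSSES` and `RELATED` tables, the stop list, and the function names are all illustrative assumptions.

```python
# Illustrative stop list; an assumption, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "on", "by", "who", "before"}

def phrasal_score(gloss_a, gloss_b):
    """Greedy phrasal overlap: each shared n-word phrase contributes n**2."""
    a, b = gloss_a.lower().split(), gloss_b.lower().split()
    total = 0
    while True:
        best_n, best_i, best_j = 0, 0, 0
        for i in range(len(a)):
            if a[i] is None or a[i] in STOP_WORDS:
                continue  # masked, or phrase would start with a stop word
            for j in range(len(b)):
                n = 0
                while (i + n < len(a) and j + n < len(b)
                       and a[i + n] is not None and a[i + n] == b[j + n]):
                    n += 1
                while n > 0 and a[i + n - 1] in STOP_WORDS:
                    n -= 1  # phrase may not end with a stop word
                if n > best_n:
                    best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return total
        total += best_n ** 2
        a[best_i:best_i + best_n] = [None] * best_n
        b[best_j:best_j + best_n] = [None] * best_n

# Toy data: glosses from the slides; no relations are given for "bench".
GLOSSES = {
    "sentence": "the penalty meted out to one adjudged guilty",
    "final judgment": "a judgment disposing of the case before the court of law",
    "hard time": "term served in a maximum security prison",
    "death penalty": "punishment by death via execution",
    "bench": "persons who hear cases in a court of law",
}
RELATED = {
    "sentence": ["final judgment", "hard time", "death penalty"],
    "bench": [],
}

def extended_relatedness(a, b, use_relations=True):
    """Sum phrasal scores over all pairs drawn from {A + neighbors} x {B + neighbors}."""
    side_a = [a] + (RELATED.get(a, []) if use_relations else [])
    side_b = [b] + (RELATED.get(b, []) if use_relations else [])
    return sum(phrasal_score(GLOSSES[x], GLOSSES[y])
               for x in side_a for y in side_b)

print(extended_relatedness("sentence", "bench", use_relations=False))  # 0
print(extended_relatedness("sentence", "bench"))                       # 9
```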

21
Evaluation On WSD
  • Test semantic relatedness measures on the Word
    Sense Disambiguation (WSD) task.
  • WSD: determine the intended sense of a
    multi-sense word in a sentence
  • E.g. I sat on the bank of the lake.
  • Our WSD algorithm: pick the sense of the target
    word that is most strongly related to its
    neighboring words (based on Lesk, 1986).
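
A minimal sketch of this selection rule, using the glosses for the bench example shown later in the slides. Several things here are assumptions made for illustration: relatedness is a plain content-word overlap, a sense's score is the sum of its relatedness to every sense of every neighboring word, the sense labels such as `bench(2)` are not WordNet sense numbers, and the gloss of `sentence(2)` is extended with its hypernym gloss by simple concatenation.

```python
# Illustrative stop list; an assumption, not taken from the slides.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "on", "for", "or", "that",
              "who", "than", "more", "one", "out", "before"}

def content_words(text):
    """Distinct non-stop words of a gloss."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}

def relatedness(gloss_a, gloss_b):
    """Stand-in relatedness measure: number of shared content words."""
    return len(content_words(gloss_a) & content_words(gloss_b))

SENSES = {
    "bench": {
        "bench(1)": "a long seat for more than one person",
        "bench(2)": "persons who hear cases in a court of law",
    },
    "pronounce": {
        "pronounce(1)": "speak or utter in a certain way",
        "pronounce(2)": "pronounce judgment on",
    },
    "sentence": {
        "sentence(1)": "a string of words that satisfies grammar rules",
        # Own gloss plus the hypernym (final judgment) gloss.
        "sentence(2)": "the penalty meted out to one adjudged guilty "
                       "a judgment disposing of the case before the court of law",
    },
}

def disambiguate(target, context):
    """Return the sense of `target` most related to the other context words."""
    def score(sense_gloss):
        return sum(relatedness(sense_gloss, neighbor_gloss)
                   for word in context if word != target
                   for neighbor_gloss in SENSES[word].values())
    return max(SENSES[target], key=lambda s: score(SENSES[target][s]))

# Lemmas of the content words in "the bench pronounced the sentence".
context = ["bench", "pronounce", "sentence"]
print(disambiguate("bench", context))      # bench(2)
print(disambiguate("pronounce", context))  # pronounce(2)
print(disambiguate("sentence", context))   # sentence(2)
```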

22
Word sense disambiguation using a relatedness
measure
  • Sentence: the bench pronounced the sentence
  • bench: a long seat for more than one person
  • bench: persons who hear cases in a court of
    law
  • pronounce: speak or utter in a certain way
  • pronounce: pronounce judgment on
  • sentence: a string of words that satisfies
    grammar rules
  • sentence: the penalty meted out to one adjudged
    guilty

35
Evaluation Data
  • Data from SENSEVAL-2 WSD exercise.
  • 4,328 passages, each 2-3 sentences long and
    containing 1 multi-sense target word.
  • Each target word labeled by humans with its most
    appropriate WordNet sense.
  • The WSD algorithm's output senses are compared
    against these human labels.
  • Precision, recall, and f-measure reported.
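
For reference, a sketch of how the three figures are commonly computed in this kind of evaluation, assuming precision is measured over the instances the system attempted and recall over all instances. The instance ids and sense labels below are made up for illustration and are not SENSEVAL-2 data.

```python
def evaluate(gold, predicted):
    """gold: instance id -> correct sense.
    predicted: instance id -> chosen sense, or None when no answer was given."""
    attempted = [i for i in gold if predicted.get(i) is not None]
    correct = sum(1 for i in attempted if predicted[i] == gold[i])
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Hypothetical toy labels: 4 instances, 3 attempted, 2 correct.
gold = {"i1": "bank(2)", "i2": "bench(2)", "i3": "sentence(2)", "i4": "bank(1)"}
predicted = {"i1": "bank(2)", "i2": "bench(1)", "i3": "sentence(2)", "i4": None}

p, r, f = evaluate(gold, predicted)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.667 0.5 0.571
```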

36
Evaluation Results
37
Which WN Relations Help?
  • Evaluation with a single relation at a time
  • E.g., comparing only hypernyms, only hyponyms,
    etc.
  • Result: no single comparison is a big source of
    information.
  • No pair exceeded an f-measure of 0.136, as
    compared to the overall f-measure of 0.346

38
Which WN Relations Help?
  • Most helpful were:
  • Hyponym relation
  • kinds of car → compact, SUV, coupe, etc.
  • Meronym relation
  • parts of car → accelerator, wheel, hood,
    etc.
  • These relations are usually one-many.
  • Thus they give access to many glosses.
  • Implies: more glosses → more useful.

39
Conclusions
  • We presented a new measure of semantic
    relatedness
  • Can operate across parts of speech.
  • We evaluated on the task of WSD.
  • Performed much better than the Lesk baseline
  • Performance comparable to other systems.
  • Future work:
  • Augment using corpus statistics.
  • Evaluate on a different task.

40
Resources
  • WordNet::Similarity (relatedness measures)
    (http://search.cpan.org/dist/WordNet-Similarity)
  • Extended gloss overlaps
  • Resnik, Lin, Jiang-Conrath
  • Leacock-Chodorow, Hirst-St. Onge
  • Edge Counting, Random
  • SenseRelate (WSD using relatedness)
    (http://www.d.umn.edu/~tpederse/senserelate.html)