Deep Text Understanding with WordNet

1
Deep Text Understanding with WordNet

Christiane Fellbaum
Princeton University and
Berlin-Brandenburg Academy of Sciences

2
WordNet

What is WordNet and why is it interesting/useful?
A bit of history
WordNet for natural language processing/word
sense disambiguation

3
What is WordNet?

A large lexical database, or electronic
dictionary, developed and maintained at
Princeton University
http://wordnet.princeton.edu
Includes most English nouns, verbs, adjectives,
adverbs
Electronic format makes it amenable to automatic
manipulation
Used in many Natural Language Processing
applications (information retrieval, text mining,
question answering, machine translation,
AI/reasoning,...)
Wordnets are built for many languages (including
Danish!)

4
What's special about WordNet?

Traditional paper dictionaries are organized
alphabetically: words that are found together (on
the same page) are not related by meaning
WordNet is organized by meaning: words in close
proximity are semantically similar
Human users and computers can browse WordNet and
find words that are meaningfully related to their
queries (somewhat like in a hyperdimensional
thesaurus)
Meaning similarity can be measured and
quantified to support Natural Language
Understanding
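
One simple way to quantify meaning similarity in a network is to count the links on the shortest path between two concepts. The sketch below is only an illustration under stated assumptions: the tiny IS-A table is hand-written (not WordNet data), and the score 1 / (1 + path length) is one common path-based measure, not necessarily the one the presenter had in mind.

```python
from collections import deque

# Toy IS-A links (child -> parent); illustrative only, not WordNet data.
HYPERNYM = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "convertible": "car",
    "SUV": "car",
    "mountain bike": "bicycle",
}

def neighbors(node):
    """Concepts one IS-A link away from `node`, in either direction."""
    out = set()
    if node in HYPERNYM:
        out.add(HYPERNYM[node])
    out.update(child for child, parent in HYPERNYM.items() if parent == node)
    return out

def path_length(a, b):
    """Breadth-first search: number of links on the shortest path, or None."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbors(node) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None

def path_similarity(a, b):
    """Score in (0, 1]; identical concepts score 1.0."""
    dist = path_length(a, b)
    return None if dist is None else 1 / (1 + dist)
```

With this toy table, SUV and convertible (two links apart) come out as more similar than SUV and mountain bike (four links apart).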

5
A bit of history

Research in Artificial Intelligence (AI)
How do humans store and access knowledge about
concepts?
Hypothesis: concepts are interconnected via
meaningful relations
Knowledge about concepts is huge--must be stored
in an efficient and economic fashion

6
A bit of history

Knowledge about concepts is computed on the fly
via access to general concepts
E.g., we know that canaries fly because
birds fly and canaries are a kind of bird

7
A simple picture

animal (animate, breathes, has
heart,...)
bird (has feathers, flies,...)
canary (yellow, sings nicely,...)

Knowledge is stored at the highest possible node
and inherited by lower (more specific) concepts
rather than being multiply stored
Collins & Quillian (1969) measured reaction times
to statements involving knowledge distributed
across different levels

Do birds fly?
--short RT
Do canaries fly?
--longer RT
Do canaries have a heart?
--even longer RT
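
The storage scheme can be sketched as a small program: each concept keeps only its own properties, and a lookup climbs to more general concepts until the property is found. This is a minimal illustration with hand-written data from the slide; the number of levels climbed is used here only as a rough stand-in for the reaction-time pattern and is not a model of the experiment.

```python
# Toy semantic network: each concept stores only its own properties
# and points to its more general parent (None at the top).
PARENT = {"canary": "bird", "bird": "animal", "animal": None}
PROPERTIES = {
    "animal": {"is animate", "breathes", "has a heart"},
    "bird": {"has feathers", "flies"},
    "canary": {"is yellow", "sings nicely"},
}

def lookup(concept, prop):
    """Return how many levels up `prop` is stored, or None if it is absent."""
    levels = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES[node]:
            return levels
        node = PARENT[node]
        levels += 1
    return None

print(lookup("bird", "flies"))          # 0: stored at the node itself
print(lookup("canary", "flies"))        # 1: inherited from bird
print(lookup("canary", "has a heart"))  # 2: inherited from animal
```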

Collins & Quillian's results are subject to
criticism (reaction times to statements like "do
canaries move?" are influenced by
prototypicality, word frequency, uneven semantic
distance across levels)
But other evidence from psychological experiments
confirms that humans organize knowledge about
words and concepts by means of meaningful
relations
Access to one concept activates related concepts
in an outward spreading (radial) fashion

11
A bit of history

But the idea inspired WordNet (1986), which
asked:
Can most/all of the lexicon be represented as a
semantic network where words are interlinked by
meaning?
If so, the result would be a semantic network (a
graph)

12
WordNet

If the (English) lexicon can be represented as a
semantic network, which are the relations that
connect the nodes?

13
Whence the relations?

Inspection of association norms
stimulus: hand; response: finger, arm
Classical ontology (Aristotle): IS-A
(maple-tree),
HAS-A (maple-leaves)
Co-occurrence patterns in texts (meaningfully
related words are used together)

14
Relations: Synonymy

One concept is expressed by several different
word forms
beat, hit, strike
car, motorcar, auto, automobile
big, large
Synonymy: one:many mapping of meaning and form

15
Synonymy in WordNet

WordNet groups (roughly) synonymous,
denotationally equivalent, words into unordered
sets of synonyms (synsets)
hit, beat, strike
big, large
queue, line
Each synset expresses a distinct meaning/concept

16
Polysemy

One word form expresses multiple meanings
Polysemy: one:many mapping of form and meaning
table, tabular_array
table, piece_of_furniture
table, mesa
table, postpone
Note: the most frequent word forms are the most
polysemous!

17
Polysemy in WordNet

A word form that appears in n synsets
is n-fold polysemous
table, tabular_array
table, piece_of_furniture
table, mesa
table, postpone
table is fourfold polysemous/has four senses
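
The definition translates directly into a count over synsets. The sketch below assumes a hand-written list of synsets taken from the slides (real WordNet lists more senses for these words); the function names are illustrative, not WordNet's API.

```python
# Synsets as unordered sets of word forms, copied from the slides.
SYNSETS = [
    {"table", "tabular_array"},
    {"table", "piece_of_furniture"},
    {"table", "mesa"},
    {"table", "postpone"},
    {"hit", "beat", "strike"},
    {"big", "large"},
]

def senses(word_form):
    """All synsets that contain the word form (one synset per sense)."""
    return [synset for synset in SYNSETS if word_form in synset]

def polysemy(word_form):
    """A word form that appears in n synsets is n-fold polysemous."""
    return len(senses(word_form))

def synonyms(word_form):
    """Other word forms that share at least one synset with `word_form`."""
    result = set()
    for synset in senses(word_form):
        result |= synset
    result.discard(word_form)
    return result

print(polysemy("table"))        # 4
print(sorted(synonyms("hit")))  # ['beat', 'strike']
```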

18
Some WordNet stats
19
The "Net" part of WordNet

Synsets are the building blocks of the network
Synsets are interconnected via relations
Bi-directional arcs express semantic relations
Result: large semantic network (graph)

20
Hypo-/hypernymy relates noun synsets

Relates more/less general concepts
Creates hierarchies, or trees
                vehicle
               /       \
   car, automobile     bicycle, bike
     /        \              \
convertible   SUV        mountain bike
A car is a kind of vehicle <--> The class of
vehicles includes cars, bikes
Hierarchies can have up to 16 levels

21
Hyponymy

Transitivity
A car is a kind of vehicle
An SUV is a kind of car
=> An SUV is a kind of vehicle
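
Transitivity means the IS-A question can be answered by following hypernym links upward. A minimal sketch, assuming a hand-written table with a single hypernym per concept (real WordNet synsets can have more than one):

```python
# Toy hypernym table (specific -> more general), taken from the slide.
HYPERNYM = {
    "SUV": "car",
    "convertible": "car",
    "car": "vehicle",
    "mountain bike": "bicycle",
    "bicycle": "vehicle",
}

def hypernym_chain(concept):
    """All increasingly general concepts above `concept`, nearest first."""
    chain = []
    while concept in HYPERNYM:
        concept = HYPERNYM[concept]
        chain.append(concept)
    return chain

def is_a(specific, general):
    """True if `general` is reachable from `specific` via hypernym links."""
    return general in hypernym_chain(specific)

print(hypernym_chain("SUV"))   # ['car', 'vehicle']
print(is_a("SUV", "vehicle"))  # True
print(is_a("vehicle", "SUV"))  # False
```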

22
Meronymy/holonymy (part-whole relation)

car, automobile
       |
     engine
    /      \
spark plug  cylinder
An engine has spark plugs
Spark plugs and cylinders are parts of an engine

23
Meronymy/Holonymy

Inheritance
A finger is part of a hand
A hand is part of an arm
An arm is part of a body
=> a finger is part of a body

24
Structure of WordNet (Nouns)
25
WordNet Data Model
[Figure: the vocabulary of a language (word forms with numbered senses) is mapped to concepts (numbered records), and concepts are linked by relations (type-of, part-of)]
Word forms shown: bank, fiddle, violin, fiddler, violist, string
Concept records shown:
rec 12345 - financial institute
rec 54321 - side of a river
rec 9876 - small string instrument
rec 25876 - string instrument
rec 65438 - musician playing violin
rec 42654 - musician
rec 35576 - string of instrument
rec 29551 - subatomic particle
26
(No Transcript)
27
WordNet for Natural Language Processing

Challenge:
get a computer to understand language
Information retrieval
Text mining
Document sorting
Machine translation

28
Natural Language Processing

Stemming, parsing currently at >90% accuracy
level
Word sense discrimination (lexical
disambiguation) still a major hurdle for
successful NLP
Which sense is intended by the writer (relative
to a dictionary)?
Best systems: 60% precision, 60% recall (but
human inter-annotator agreement isn't perfect,
either!)

Understanding text beyond the word level
(joint work with Peter Clark and Jerry Hobbs)

30
Knowledge in text

Human language users routinely derive knowledge
from text that is NOT expressed on the surface
Perhaps more knowledge is unexpressed than
overtly expressed on the surface
Graesser (1981) estimates
explicit:implicit info = 1:8

31
An example

Text: A soldier was killed in a gun battle
Inferences:
Soldiers were fighting one another
The soldiers had guns with live ammunition
Multiple shots were fired
One soldier shot another soldier
The shot soldier died as a result of the injuries
caused by the shot
The time interval between the fatal shot and the
death was short

Humans use world knowledge to supplement word
knowledge
(How) can such knowledge be encoded and harnessed
by automatic systems?
Previous attempts (e.g., Cyc's microtheories)
--too few theories
--uneven coverage of world knowledge

33
Recognizing Textual Entailment

Task
Evaluate truth of hypothesis H given a text T
(T) A soldier was killed in a gun battle
(H) A soldier died
Answer may be yes/no/probably/...

34
RTE

Many automatic systems attempt RTE via lexical,
syntactic matching algorithms (do the same words
occur in T, H? do T, H have the same
subject/object?)
Not deep language understanding
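
A shallow lexical matcher of this kind can be sketched in a few lines. The scoring function below is a generic word-overlap baseline written for illustration (not any particular system's method); it shows why such matching is not understanding: it cannot connect "killed" with "died", yet it gives a perfect score to a scrambled hypothesis.

```python
def tokens(sentence):
    """Lower-cased word forms with surrounding punctuation stripped."""
    return {word.strip(".,!?").lower() for word in sentence.split()}

def overlap(text, hypothesis):
    """Fraction of hypothesis word forms that also occur in the text."""
    h = tokens(hypothesis)
    return len(h & tokens(text)) / len(h)

T = "A soldier was killed in a gun battle"
H = "A soldier died"
SCRAMBLED = "A gun was killed in a soldier battle"  # nonsense, same words as T

print(overlap(T, H))          # "died" never occurs in T, so the score is below 1
print(overlap(T, SCRAMBLED))  # every word occurs in T, so the score is 1.0
```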

35
Our RTE test suite

250 Text-Hypothesis pairs
for 50% of them, H is entailed by T
for the remaining 50%, H is not (necessarily)
entailed
Focus on semantic interpretation

36
RTE test suite

Core of T statements came from newspaper texts
H statements were hand-coded
focus on general world knowledge

37
RTE test suite

Manually analyzed pairs
Distinguished, classified 19 types of knowledge
among the T-H pairs
some partial overlap

38
Exx.: Types of knowledge (increasing order of
difficulty)

Lexical: relation among irregular forms of a
single lemma, Named Entities vs. proper nouns
Lexical-semantic (paradigmatic): synonyms,
hypernyms, meronyms, antonyms, metonymy,
derivations
Syntagmatic: selectional preferences, telic roles
Propositional: cause-effect, preconditions
World knowledge/core theories (e.g., ambush
entails concealment)

39
Overall approach (bag of tricks)

Initial text interpretation with language
processing tools (Peter Clark et al.)
Compute subsumption among text fragments
WordNet augmentations

40
Text interpretation

First step: parsing (assign a structure to a
sentence or phrase)
SAPIR parser (Harrison & Maxwell 1986)
SAPIR also produces a Logical Form (LF)

41
LFs

LF structures are trees generated by rules
parallel to grammar rules
contain logic elements
nouns, verbs, adjs, prepositions represented as
variables
LFs are parsed and have part-of-speech tags
LFs generate ground logical assertions

42
Example

LF for "A soldier was killed in a gun battle."
(DECL
  ((VAR X1 "a" "soldier")
   (VAR X2 "a" "battle" (NN "gun" "battle")))
  (S (PAST) NIL "kill" ?X1 (PP "in" ?X2)))

43
Logical assertions

Logic for "A soldier was killed in a gun
battle."
object(kill,soldier) in(kill,battle)
modifier(battle,gun)
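
The step from a Logical Form to ground assertions can be sketched as follows. Everything about the input format here is an assumption made for illustration: the dictionary is a hand-simplified stand-in for the parser's LF (with the passive already resolved so that the soldier fills the object role), and `ground_assertions` is a hypothetical helper, not the presenters' tool.

```python
# Hand-simplified stand-in for the LF of
# "A soldier was killed in a gun battle." (hypothetical format).
LF = {
    "variables": {
        "X1": {"head": "soldier", "modifiers": []},
        "X2": {"head": "battle", "modifiers": ["gun"]},
    },
    "clause": {
        "verb": "kill",
        "subject": None,   # passive: no expressed agent
        "object": "X1",
        "prepositions": [("in", "X2")],
    },
}

def ground_assertions(lf):
    """Replace variables by their head words and emit one fact per link."""
    heads = {var: info["head"] for var, info in lf["variables"].items()}
    clause = lf["clause"]
    verb = clause["verb"]
    facts = []
    for role in ("subject", "object"):
        var = clause[role]
        if var is not None:
            facts.append(f"{role}({verb},{heads[var]})")
    for prep, var in clause["prepositions"]:
        facts.append(f"{prep}({verb},{heads[var]})")
    for info in lf["variables"].values():
        for modifier in info["modifiers"]:
            facts.append(f"modifier({info['head']},{modifier})")
    return facts

print(ground_assertions(LF))
# ['object(kill,soldier)', 'in(kill,battle)', 'modifier(battle,gun)']
```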

Result: T, H in Logical Form

45
Matching sentences/fragments with subsumption

A basic reasoning operation
A person loves a person
subsumes
A man loves a woman
A set S1 of clauses subsumes another set S2 of
clauses if each clause in S1 subsumes some member
of S2.
Similarly, a set of clauses subsumes another set
of clauses if the arguments of the first subsume
or match the arguments of the second
Argument (word) subsumption as in WordNet (X is a
Y)
Matching synonyms
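
The subsumption test can be sketched with clauses written as (predicate, argument, argument) tuples. This is a minimal illustration, assuming a hand-written hypernym table and synonym list in place of WordNet; the tuple encoding is an assumption, not the system's actual representation.

```python
# Toy lexical knowledge standing in for WordNet.
HYPERNYM = {"man": "person", "woman": "person"}
SYNONYMS = [{"car", "automobile"}]

def word_subsumes(general, specific):
    """True if the words match, are synonyms, or `specific` IS-A `general`."""
    if general == specific:
        return True
    if any(general in s and specific in s for s in SYNONYMS):
        return True
    while specific in HYPERNYM:
        specific = HYPERNYM[specific]
        if specific == general:
            return True
    return False

def clause_subsumes(general, specific):
    """Clauses are tuples; every position must subsume its counterpart."""
    return len(general) == len(specific) and all(
        word_subsumes(g, s) for g, s in zip(general, specific)
    )

def set_subsumes(set1, set2):
    """Each clause in set1 must subsume some clause in set2."""
    return all(any(clause_subsumes(c1, c2) for c2 in set2) for c1 in set1)

general = [("love", "person", "person")]  # A person loves a person
specific = [("love", "man", "woman")]     # A man loves a woman
print(set_subsumes(general, specific))    # True
print(set_subsumes(specific, general))    # False
```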

46
Syntactic matching of predicates

--both are the same
--one is the predicate "of" or "modifier" (my
friend's car, the car of my friend)
--the predicates "subject" and "by" match (passives)

47
Lexical (word) matching

Words related by derivational morphology
(destroy, destruction) are considered matches in
conjunction with syntactic matches

Recognize as equivalent:
the bomb destroyed the shrine
the destruction of the shrine by the bomb
But not:
the destruction of the bomb by the shrine
a person attacks with a bomb
there is a bomb attack by a person
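
The combination of derivational and syntactic matching can be sketched by normalizing both constructions to the same (predicate, agent, patient) triple. The table and the two helper functions are hypothetical and assume the phrases have already been parsed and lemmatized; the role mapping ("of" marks the patient, "by" marks the agent) follows the shrine example.

```python
# Toy derivational links (noun -> verb it is derived from).
DERIVED_FROM = {"destruction": "destroy"}

def from_verb_clause(subject, verb, obj):
    """Active clause: the subject is the agent, the object is the patient."""
    return (verb, subject, obj)

def from_nominalization(noun, of_phrase, by_phrase):
    """Nominalization: "of" marks the patient, "by" marks the agent."""
    return (DERIVED_FROM.get(noun, noun), by_phrase, of_phrase)

clause = from_verb_clause("bomb", "destroy", "shrine")
same = from_nominalization("destruction", of_phrase="shrine", by_phrase="bomb")
swapped = from_nominalization("destruction", of_phrase="bomb", by_phrase="shrine")

print(clause == same)     # True: same event, same roles
print(clause == swapped)  # False: roles are reversed
```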

49
Benefits for text understanding/RTE

(T) Moore is a prolific writer
(H) Moore writes many books
Moore is the Agent of write

Exploiting word and world knowledge encoded in
WordNet

51
Use of WordNet glosses

Glosses: definition of concept expressed by
synset members
airplane, plane (an aircraft that has fixed
wings and is powered by propellers or jets)
syntagmatic information, world knowledge

52
Translating glosses into First Order Logic Axioms

bridge, span (any structure that allows people
or vehicles to cross an obstacle such as a river
or canal...)
bridgeN1(x,y)
<--> structureN1(x) & allowV1(x,e1) &
crossV1(e1,z,y) &
obstacleN2(y) & person/vehicle(z)
personN1(z) --> person/vehicle(z)
vehicleN1(z) --> person/vehicle(z)
riverN2(y) --> obstacleN2(y)
canalN3(y) --> obstacleN2(y)
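
The one-directional axioms above can be fed to a very small forward-chaining loop. The sketch below handles only the unary implications (it does not attempt the bridge definition itself), treats person/vehicle as a single predicate name, and uses the hypothetical constants "thames" and "truck" purely as example individuals.

```python
# Unary implications from the gloss axioms: premise predicate -> conclusion.
RULES = {
    "personN1": "person/vehicle",
    "vehicleN1": "person/vehicle",
    "riverN2": "obstacleN2",
    "canalN3": "obstacleN2",
}

def forward_chain(facts):
    """Close a set of (predicate, argument) facts under RULES."""
    derived = set(facts)
    agenda = list(facts)
    while agenda:
        predicate, argument = agenda.pop()
        if predicate in RULES:
            new_fact = (RULES[predicate], argument)
            if new_fact not in derived:
                derived.add(new_fact)
                agenda.append(new_fact)
    return derived

facts = {("riverN2", "thames"), ("vehicleN1", "truck")}
closure = forward_chain(facts)
print(("obstacleN2", "thames") in closure)      # True
print(("person/vehicle", "truck") in closure)   # True
```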

The nouns, verbs, adjectives, adverbs in the LF
glosses were manually disambiguated
Thus, each variable in the LFs was identified not
just with a word form, but a form-meaning pair
(sense) in WordNet
LFs were generated for 110K glosses
Particular emphasis on CoreWordNet

54
How well do our tricks perform?
55
An example that works

Exploiting formally related words in WN
(T) go through licensing procedures
(H) go through licensing processes
Exploiting hyponymy (IS-A relation)
(T) Beverley served at WEDCOR
(H) Beverley worked at WEDCOR

56
More complex example that works

(T) Britain puts curbs on immigrant labor from
Bulgaria
(H) Britain restricted workers from Bulgaria

57
Knowledge from WordNet

Synset with gloss: restrict, restrain,
place_limits_on (place restrictions on)
Synonymy: put, place, curb, limit
Morphosemantic link: labor - laborer
Hyponymy: laborer ISA worker

58
Example that doesn't work

(T) The Philharmonic orchestra draws large crowds
(H) Large crowds were drawn to listen to the
orchestra
WordNet tells us that
orchestra: collection of musicians
musician: someone who plays musical instrument
music: sound produced by musical instruments
listen: hear, perceive sound
But WN doesn't tell us that playing results in
sound production and that there is a listener

59
Examples that don't work

The most fundamental knowledge that humans take
for granted trips up automatic systems
Such knowledge is not explicitly taught to
children
But it must be taught to machines!

60
Core theories (Jerry Hobbs)

Attempt to encode fundamental knowledge
Space, time, causality,...
Essential for reasoning
Not encoded in WordNet glosses

61
Core theories

Manually encoded
Axiomatized

62
Core theories

Composite entities (things made of other things,
stuff)
Scalar notions (time, space,...)
Change of state
Causality

63
Core theories

Example of predications
change(e1,e2)
changeFrom(e1)
changeTo(e2)

64
Core theories and WordNet

map core theories to Core WN synsets
encode meanings of synsets denoting events, event
structure in terms of core theory predications

65
Examples

let(x,e) <--> not(cause(x,not(e)))
go, become, get (he went wild)
go(x,e) <--> changeTo(e)
free(x,y) <--> cause(x,changeTo(free(y)))
(All words are linked to WN senses)

66
Example

The captors freed the hostages
The hostages were free
free: let(x, go(y, free(y)))
<--> not(cause(x, not(changeTo(free(y)))))
<--> cause(x, changeTo(free(y)))
<--> free(x,y)
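
The first rewriting step of this derivation can be reproduced mechanically by applying the let and go axioms as left-to-right rewrite rules. The sketch assumes terms encoded as nested tuples (an illustrative choice) and covers only that first step; it does not implement the later simplification to cause(x, changeTo(free(y))).

```python
def rewrite(term):
    """Apply the let and go axioms bottom-up.

    Terms are nested tuples (functor, arg1, ...); variables are strings.
    """
    if isinstance(term, str):
        return term
    functor, *args = term
    args = [rewrite(arg) for arg in args]
    if functor == "let":   # let(x,e) <--> not(cause(x,not(e)))
        x, e = args
        return ("not", ("cause", x, ("not", e)))
    if functor == "go":    # go(x,e) <--> changeTo(e)
        _, e = args
        return ("changeTo", e)
    return (functor, *args)

def show(term):
    """Render a term in the slide's functional notation."""
    if isinstance(term, str):
        return term
    functor, *args = term
    return f"{functor}({', '.join(show(arg) for arg in args)})"

term = ("let", "x", ("go", "y", ("free", "y")))
print(show(term))           # let(x, go(y, free(y)))
print(show(rewrite(term)))  # not(cause(x, not(changeTo(free(y)))))
```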

67
Preliminary evaluation

(What) does each component contribute to RTE?
For the 250 Text-Hypothesis pairs in our test
suite

68
(No Transcript)
69
Conclusion

Way to go!
Deliberately exclude statistical similarity
measures (this hurts our results)
Symbolic approach: aim at deep-level understanding

70
WordNet for Deeper Text Understanding

Axioms in Logical Form are useful for many other
NL Understanding applications
E.g., automated question answering: translate Qs
and As into logic representation
Logic representations enable reasoning (axioms
can be fed into a reasoner/logic prover)