Title: Lexical Resources for Kindle
1. Lexical Resources for Kindle
- Martha Palmer
- with Olga Babko-Malaya, Edward Loper, Jinying Chen, Szuting Yi, and Ben Snyder - University of Pennsylvania
- September 10, 2004
- ARDA Site Visit - UIUC
2. Outline
- VerbNet
- Proposition Bank
- NomBank
- Semantic Role Labeling
- Comparison to FrameNet
- Comparison to WordNet
3. Levin classes (Levin, 1993)
- 3100 verbs, 47 top-level classes, 193 second- and third-level classes
- Each class has a syntactic signature based on alternations.
  - John broke the jar. / The jar broke. / Jars break easily.
  - John cut the bread. / *The bread cut. / Bread cuts easily.
  - John hit the wall. / *The wall hit. / *Walls hit easily.
4. Levin classes (Levin, 1993)
- Verb class hierarchy: 3100 verbs, 47 top-level classes, 193 second- and third-level classes
- Each class has a syntactic signature based on alternations.
  - John broke the jar. / The jar broke. / Jars break easily.
    - change-of-state
  - John cut the bread. / *The bread cut. / Bread cuts easily.
    - change-of-state, recognizable action, sharp instrument
  - John hit the wall. / *The wall hit. / *Walls hit easily.
    - contact, exertion of force
5. Limitations to Levin Classes
- Coverage of only half of the verbs (types) in the Penn Treebank (1M words, WSJ)
- Usually only one or two basic senses are covered for each verb
- Confusing sets of alternations
  - Different classes have almost identical syntactic signatures
  - or worse, contradictory signatures
Dang, Kipper & Palmer, ACL98
6. Multiple class listings
- Homonymy or polysemy?
  - draw a picture, draw water from the well
- Conflicting alternations?
  - Carry verbs disallow the Conative (*she carried at the ball), but include push, pull, shove, kick, yank, tug
  - These verbs are also in the Push/Pull class, which does take the Conative (she kicked at the ball)
7. Intersective Levin Classes
Dang, Kipper & Palmer, ACL98
[Diagram of intersecting verb classes, illustrated with particles/PPs: apart (CH-STATE), across the room (CH-LOC), at (CH-LOC)]
8. Intersective Levin Classes
- More syntactically and semantically coherent
  - sets of syntactic patterns
  - explicit semantic components
  - relations between senses
- VERBNET: www.cis.upenn.edu/verbnet
9. VerbNet (Karin Kipper)
- Class entries
  - Capture generalizations about verb behavior
  - Organized hierarchically
  - Members have common semantic elements, semantic roles, and syntactic frames
- Verb entries
  - Refer to a set of classes (different senses)
  - Each class member linked to WN synset(s) (not all WN senses are covered)
Dang, Kipper & Palmer, IJCAI00, Coling00
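As a quick illustration (an addition to this transcript, assuming NLTK's bundled VerbNet reader rather than the 2004 release described here), class memberships and entries can be browsed programmatically:

```python
# Minimal sketch using NLTK's VerbNet corpus reader.
# Setup (assumed): pip install nltk; then nltk.download('verbnet')
from nltk.corpus import verbnet as vn

# Class IDs for a lemma: one ID per class membership / sense.
print(vn.classids('break'))          # e.g. ['break-45.1', ...]

# Inspect one class: members, thematic roles, syntactic frames,
# and semantic predicates, pretty-printed from the class XML.
vn.pprint(vn.vnclass('break-45.1'))
```

Class IDs and role inventories in current VerbNet releases differ somewhat from the 2004 version on these slides.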
10. Semantic role labels
- Grace broke the LCD projector.
  - break(agent(Grace), patient(LCD-projector))
  - cause(agent(Grace), change-of-state(LCD-projector))
  - (broken(LCD-projector))
- agent(A) → intentional(A), sentient(A), causer(A), affector(A)
- patient(P) → affected(P), change(P), ...
11. VerbNet entry for leave, Levin class future_having-13.3
- WordNet senses: leave (WN 2, 10, 13), promise, offer, ...
- Thematic roles: Agent [+animate OR +organization], Recipient [+animate OR +organization], Theme
- Frames with semantic roles:
  - "I promised somebody my time": Agent V Recipient Theme
  - "I left my fortune to Esmerelda": Agent V Theme Prep(to) Recipient
  - "I offered my services": Agent V Theme
12. Handmade resources vs. real data
- VerbNet is based on linguistic theory: how useful is it?
- How well does it correspond to syntactic variations found in naturally occurring text?
13. Proposition Bank: From Sentences to Propositions (Predicates!)
meet(Somebody1, Somebody2)
. . .
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss(Powell, Zhu, return(X, plane))
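To make the nesting concrete, here is a small, hypothetical Python representation of such propositions (an illustration added to this transcript, not something from the original talk):

```python
# Hypothetical sketch: a proposition is a predicate plus arguments,
# where an argument may itself be another proposition, as in
# discuss(Powell, Zhu, return(X, plane)).
from dataclasses import dataclass
from typing import Tuple, Union

Arg = Union[str, "Proposition"]

@dataclass(frozen=True)
class Proposition:
    predicate: str
    args: Tuple[Arg, ...]

    def __str__(self):
        return f"{self.predicate}({', '.join(map(str, self.args))})"

meet = Proposition("meet", ("Powell", "Zhu"))
discuss = Proposition("discuss", ("Powell", "Zhu",
                                  Proposition("return", ("X", "plane"))))
print(meet)     # meet(Powell, Zhu)
print(discuss)  # discuss(Powell, Zhu, return(X, plane))
```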
14. Capturing semantic roles
- [SUBJ John] broke [PATIENT the laser pointer].
- [SUBJ/PATIENT The windows] were broken by the hurricane.
- [SUBJ/PATIENT The vase] broke into pieces when it toppled over.
15. Capturing neutral roles
- John broke [Arg1 the laser pointer].
- [Arg1 The windows] were broken by the hurricane.
- [Arg1 The vase] broke into pieces when it toppled over.
16. A TreeBanked phrase (1M words, WSJ Penn TreeBank II)
Marcus et al., 93
A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company.
[Parse tree diagram for the sentence: S over the subject NP and a VP (would) dominating a VP headed by give, with NP and PP-LOC constituents.]
17. The same phrase, PropBanked (same data, released March 04)
A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company.
- Arg0: A GM-Jaguar pact
- REL: would give
- Arg2: the U.S. car maker
- Arg1: an eventual 30% stake in the British company
18. Frames File example: give (< 4000 Frames for PropBank)
- Roles:
  - Arg0: giver
  - Arg1: thing given
  - Arg2: entity given to
- Example: double object
  - The executives gave the chefs a standing ovation.
  - Arg0: The executives
  - REL: gave
  - Arg2: the chefs
  - Arg1: a standing ovation
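For comparison (an added sketch, assuming NLTK's PropBank reader and its bundled frames files), the give roleset can be read programmatically:

```python
# Sketch: look up the PropBank frames-file roleset for give.01 with NLTK.
# Setup (assumed): nltk.download('propbank')
from nltk.corpus import propbank

roleset = propbank.roleset('give.01')        # an XML <roleset> element
for role in roleset.findall('roles/role'):
    print(f"Arg{role.attrib['n']}: {role.attrib['descr']}")
# Expected, per the frames file above:
#   Arg0: giver
#   Arg1: thing given
#   Arg2: entity given to
```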
19. NomBank Frames File example: gift (nominalizations, noun predicates, partitives, etc.)
- Roles:
  - Arg0: giver
  - Arg1: thing given
  - Arg2: entity given to
- Example: double object
  - Nancy's gift from her cousin was a complete surprise.
  - Arg0: her cousin
  - REL: gift
  - Arg2: Nancy's
  - Arg1: gift
20. Frames File example: give, with Thematic Role Labels
- Roles:
  - Arg0: giver
  - Arg1: thing given
  - Arg2: entity given to
- Example: double object
  - The executives gave the chefs a standing ovation.
  - Arg0 (Agent): The executives
  - REL: gave
  - Arg2 (Recipient): the chefs
  - Arg1 (Theme): a standing ovation
(Thematic role labels from VerbNet, which is based on Levin classes)
21. Annotation procedure (released March 2004, LDC)
- PTB II: extract all sentences containing a given verb
- Create Frames File for that verb (Paul Kingsbury)
  - (3100 lemmas, 4700 framesets, 120K predicates)
- 1st pass: automatic tagging (Joseph Rosenzweig)
- 2nd pass: double-blind hand correction, by verb
  - Inter-annotator agreement 84%
- 3rd pass: adjudication (Olga Babko-Malaya)
- 4th pass: train automatic semantic role labellers
  - Dan Gildea, Sameer Pradhan, Nianwen Xue, Szuting Yi, ...
  - CoNLL-04 shared task
22. Mapping from PropBank to VerbNet
- Overlap with PropBank framesets
  - 50,000 PropBank instances
  - < 50% of VN entries, > 85% of VN classes
- Results
  - MATCH: 78.63% (80.90% relaxed)
  - (VerbNet isn't just linguistic theory!)
- Benefits
  - Thematic role labels and semantic predicates
  - Can extend PropBank coverage with VerbNet classes
  - WordNet sense tags
- Kingsbury & Kipper, NAACL03 Text Meaning Workshop
- http://www.cs.rochester.edu/~gildea/VerbNet/
23. Outline
- VerbNet
- Proposition Banks (PropBanks)
- Semantic Role Labeling
- Comparison to FrameNet
- Comparison to WordNet
24. Automatic Semantic Role Labeling
- Objective
  - Correctly detecting and characterizing semantic relations within text is at the heart of successful natural language processing applications
- Problem formalization: machine learning
  - Classification-based statistical approach
- Training and testing data: PropBank annotations
  - Both the 2004 release and the 2002 release (for comparison)
25. Approach
- Pre-processing
  - A heuristic which filters out unwanted constituents with significant confidence
- Argument identification
  - A binary SVM classifier which identifies arguments
- Argument classification
  - A multi-class SVM classifier which tags arguments as ARG0-5, ARGA, and ARGM
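A schematic version of this two-stage pipeline, with scikit-learn SVMs standing in for the classifiers used at the time (an added sketch; the feature extraction is a stub and the names are illustrative, not from the slides):

```python
# Illustrative two-stage SRL pipeline: binary argument identification,
# then multi-class argument classification.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def features(constituent):
    """Stub: map a candidate parse constituent to a feature dict
    (path, phrase type, position, voice, head word, ...)."""
    raise NotImplementedError

# Stage 1: is this constituent an argument of the predicate at all?
identifier = make_pipeline(DictVectorizer(), SVC(kernel='poly', degree=2))

# Stage 2: label identified arguments as ARG0-5, ARGA, or ARGM.
# SVC's built-in multi-class handling is one-vs-one, i.e. the
# "pairwise" strategy mentioned on a later slide.
classifier = make_pipeline(DictVectorizer(), SVC(kernel='poly', degree=2))

# Training would use PropBank annotations, e.g.:
# identifier.fit([features(c) for c in candidates], is_argument)
# classifier.fit([features(c) for c in gold_arguments], role_labels)
```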
26. Automatic Semantic Role Labeling
Gildea & Jurafsky, CL02; Gildea & Palmer, ACL02
- Stochastic model
- Basic features
  - Predicate (the verb)
  - Phrase type (e.g., NP or SBAR)
  - Parse tree path
  - Position (before/after predicate)
  - Voice (active/passive)
  - Head word of constituent
  - Subcategorization
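The parse tree path deserves a concrete example: it is the chain of node labels from the constituent up to the lowest common ancestor and down to the predicate (NP↑S↓VP↓VBD for a subject). A rough sketch over an nltk.Tree (added here for illustration; not the original implementation):

```python
# Sketch: the Gildea & Jurafsky "parse tree path" feature over an nltk.Tree.
from nltk import Tree

def tree_path(tree, start, end):
    """Labels from `start` up to the lowest common ancestor ('^' steps),
    then down to `end` ('v' steps), e.g. NP^SvVPvVBD for a subject."""
    i = 0
    while i < min(len(start), len(end)) and start[i] == end[i]:
        i += 1                      # longest common prefix = LCA depth
    up = [tree[start[:j]].label() for j in range(len(start), i, -1)]
    down = [tree[end[:j]].label() for j in range(i, len(end) + 1)]
    return '^'.join(up) + '^' + 'v'.join(down)

t = Tree.fromstring(
    "(S (NP (NNP John)) (VP (VBD broke) (NP (DT the) (NN jar))))")
# Path from the subject NP (tree position (0,)) to the predicate's VBD node.
print(tree_path(t, (0,), (1, 0)))   # NP^SvVPvVBD
```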
27. Discussion Part I (Szuting Yi)
- Comparisons between Pradhan and Penn (SVM)
  - Both systems are SVM-based
  - Kernel: Pradhan uses a degree-2 polynomial kernel; Penn uses a degree-3 RBF kernel
  - Multi-classification: Pradhan uses a one-versus-others approach; Penn uses a pairwise approach
  - Features: Pradhan includes rich features: NE, head-word POS, partial path, verb classes, verb sense, head word of PP, first or last word/POS in the constituent, constituent tree distance, constituent relative features, temporal cue words, and dynamic class context (Pradhan et al., 2004)
28. Discussion Part II
Xue & Palmer, EMNLP04
- Different features for different subtasks
- Basic feature analysis
29. Discussion Part III (New Features, Bert Xue)
- Syntactic frame
  - use NPs as pivots
  - varying with position within the frame
  - lexicalization with the predicate
- Predicate
- Head word
- Phrase type
- Head word of PP parent
- Position & voice
30. Results
31. Outline
- VerbNet
- Proposition Banks (PropBanks)
- Semantic Role Labeling
- Comparison to FrameNet
- Comparison to WordNet
32. PropBank/FrameNet
Rambow, et al., PBML03
- buy: Arg0 buyer, Arg1 goods, Arg2 seller, Arg3 rate, Arg4 payment
- sell: Arg0 seller, Arg1 goods, Arg2 buyer, Arg3 rate, Arg4 payment
- PropBank roles are more syntactic and generic, and map readily to VerbNet and thematic roles. NSF funding to map PropBank to FrameNet.
33. Word Senses in PropBank
- Orders to ignore word sense: not feasible for 700 verbs
  - Mary left the room
  - Mary left her daughter-in-law her pearls in her will
- Frameset leave.01 "move away from":
  - Arg0: entity leaving
  - Arg1: place left
- Frameset leave.02 "give":
  - Arg0: giver
  - Arg1: thing given
  - Arg2: beneficiary
- How do these relate to traditional word senses in VerbNet and WordNet?
34. WordNet (Princeton) (Miller 1985, Fellbaum 1998)
- On-line lexical reference (dictionary)
  - Nouns, verbs, adjectives, and adverbs grouped into synonym sets; also hypernyms (ISA), antonyms, meronyms
- Limitations as a sense inventory for creating training data for supervised machine learning
  - No explicit ties to syntactic structure or to participants
  - Definitions fine-grained and vague
- SENSEVAL-2
  - 29 verbs with > 16 senses (including call)
  - Inter-annotator agreement (ITA) 73%
  - Automatic word sense disambiguation (WSD) 62.5%
  - Slow annotation speed: 60 tokens per hour
Dang & Palmer, SIGLEX02
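For scale, the sense inventory can be listed directly, e.g. with NLTK's WordNet interface (an added illustration; sense counts vary by WordNet version):

```python
# Sketch: enumerate the WordNet verb senses of "call".
from nltk.corpus import wordnet as wn

senses = wn.synsets('call', pos=wn.VERB)
print(len(senses))          # 28 in the WordNet version cited on the slides
for s in senses[:4]:
    print(s.name(), '--', s.definition())
```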
35. WordNet: call, 28 senses
1. name, call -- (assign a specified, proper name to; "They named their son David") → LABEL
2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night") → TELECOMMUNICATE
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard") → LABEL
4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") → ORDER
36. WordNet - call, 28 senses
[Diagram: the 28 WordNet senses of call (WN1-WN28) laid out without grouping.]
37. WordNet - call, 28 senses, Senseval-2 groups
[Diagram: the 28 senses clustered into Senseval-2 groups: Loud cry; Bird or animal cry; Request; Label; Call a loan/bond; Challenge; Visit; Phone/radio; Bid.]
38. Overlap with PropBank Framesets
[Same diagram as the previous slide, with PropBank Frameset boundaries overlaid on the Senseval-2 groups (Loud cry, Bird or animal cry, Request, Label, Call a loan/bond, Challenge, Visit, Phone/radio, Bid).]
39. Overlap between Senseval-2 groups and Framesets: 95%
[Diagram for develop: Frameset1 covers WN1, WN2, WN3, WN4, WN6, WN7, WN8; Frameset2 covers WN5, WN9, WN10, WN11, WN12, WN13, WN14, WN19, WN20.]
40. Sense Hierarchy (Palmer et al., SNLU04 - NAACL04)
- PropBank Framesets: ITA > 90%
  - coarse-grained distinctions
  - 20 Senseval-2 verbs with > 1 Frameset
  - MaxEnt WSD system: 73.5% baseline, 90% accuracy
- Sense Groups (Senseval-2): ITA 82%
  - intermediate level (includes Levin classes): 69%
- WordNet: ITA 71%
  - fine-grained distinctions: 60.2%
Tagging w/ groups: ITA 89%, 200 tokens/hr
41. PropBank II: nominal reference, event variables
- Also, [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].
[Diagram: the same annotation drawn as nested predicate-argument structures for help (REL = help, Arg0 = tax rates, Arg1 = the company keep its tax outlay flat, ArgM-ADV = relative to earnings) and keep (REL = keep, Arg0 = the company, Arg1 = its tax outlay, Arg3-PRD = flat).]
42. Summary of Needed KINDLE Resources