1
Lexical Resources for Kindle
  • Martha Palmer
  • with Olga Babko-Malaya, Edward Loper, Jinying
    Chen, Szuting Yi and Ben Snyder
  • University of Pennsylvania
  • September 10, 2004
  • ARDA Site Visit - UIUC

2
Outline
  • VerbNet
  • Proposition Bank
  • NomBank
  • Semantic Role Labeling
  • Comparison to FrameNet
  • Comparison to WordNet

3
Levin classes (Levin, 1993)
  • 3100 verbs, 47 top-level classes, 193 second- and
    third-level classes
  • Each class has a syntactic signature based on
    alternations.
  • John broke the jar. / The jar broke. /
    Jars break easily.
  • John cut the bread. / The bread cut. /
    Bread cuts easily.
  • John hit the wall. / The wall hit. /
    Walls hit easily.

4
Levin classes (Levin, 1993)
  • Verb class hierarchy: 3100 verbs, 47 top-level
    classes, 193 second- and third-level classes
  • Each class has a syntactic signature based on
    alternations.
  • John broke the jar. / The jar broke. /
    Jars break easily.
  • change-of-state
  • John cut the bread. / The bread cut. /
    Bread cuts easily.
  • change-of-state, recognizable
    action,
  • sharp instrument
  • John hit the wall. / The wall hit. /
    Walls hit easily.
  • contact, exertion of force

5
Limitations to Levin Classes
  • Coverage of only half of the verbs (types) in the
    Penn Treebank (1M words, WSJ)
  • Usually only one or two basic senses are covered
    for each verb
  • Confusing sets of alternations
  • Different classes have almost identical
    syntactic signatures
  • or worse, contradictory signatures

Dang, Kipper &amp; Palmer, ACL-98
6
Multiple class listings
  • Homonymy or polysemy?
  • draw a picture, draw water from the well
  • Conflicting alternations?
  • Carry verbs disallow the Conative (*she carried at
    the ball), but include push, pull, shove, kick, yank, tug
  • These verbs are also in the Push/Pull class, which does
    take the Conative (she kicked at the ball)

7
Intersective Levin Classes
Dang, Kipper &amp; Palmer, ACL-98
[Diagram: intersective Levin classes, illustrated with complements: apart (CH-STATE), across the room (CH-LOC), at (CH-LOC)]
8
Intersective Levin Classes
  • More syntactically and semantically coherent
  • sets of syntactic patterns
  • explicit semantic components
  • relations between senses
  • VERBNET
  • www.cis.upenn.edu/verbnet

9
VerbNet (Karin Kipper)
  • Class entries
  • Capture generalizations about verb behavior
  • Organized hierarchically
  • Members have common semantic elements, semantic
    roles and syntactic frames
  • Verb entries
  • Refer to a set of classes (different senses)
  • each class member linked to WN synset(s) (not
    all WN senses are covered)

Dang, Kipper &amp; Palmer, IJCAI-00, Coling-00
10
Semantic role labels
  • Grace broke the LCD projector.
  • break(agent(Grace), patient(LCD-projector))
  • cause(agent(Grace), change-of-state(LCD-projector))
  • (broken(LCD-projector))

agent(A) → intentional(A), sentient(A), causer(A), affector(A)
patient(P) → affected(P), change(P), ...
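A minimal Python sketch (not from the slides) of how the decomposition above might be represented; the class and table names are illustrative only.

    from dataclasses import dataclass

    @dataclass
    class Predication:
        predicate: str   # e.g. "break"
        roles: dict      # e.g. {"agent": "Grace", "patient": "LCD-projector"}

    # Entailments associated with each thematic role, as listed above.
    ROLE_ENTAILMENTS = {
        "agent":   ["intentional", "sentient", "causer", "affector"],
        "patient": ["affected", "change"],
    }

    broke = Predication("break", {"agent": "Grace", "patient": "LCD-projector"})
    for role, filler in broke.roles.items():
        print(filler, "->", ROLE_ENTAILMENTS[role])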
11
VerbNet entry for leave: Levin class future_having-13.3
  • WordNet Senses: leave (WN 2, 10, 13), promise, offer, ...
  • Thematic Roles: Agent [+animate OR +organization],
    Recipient [+animate OR +organization], Theme
  • Frames with Semantic Roles:
  • "I promised somebody my time": Agent V Recipient Theme
  • "I left my fortune to Esmerelda": Agent V Theme Prep(to) Recipient
  • "I offered my services": Agent V Theme

12
Handmade resources vs. Real data
  • VerbNet is based on linguistic theory
  • how useful is it?
  • How well does it correspond to syntactic
    variations found in naturally occurring text?

13
Proposition Bank: From Sentences to Propositions (Predicates!)
meet(Somebody1, Somebody2)
. . .
"When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane."
meet(Powell, Zhu)
discuss(Powell, Zhu, return(X, plane))
14
Capturing semantic roles
  • [John]SUBJ broke [the laser pointer]PATIENT.
  • [The windows]SUBJ, PATIENT were broken by the hurricane.
  • [The vase]SUBJ, PATIENT broke into pieces when it toppled over.
15
Capturing neutral roles
  • John broke [Arg1 the laser pointer].
  • [Arg1 The windows] were broken by the hurricane.
  • [Arg1 The vase] broke into pieces when it toppled over.

16
A TreeBanked phrase (1M words, WSJ Penn TreeBank II)
Marcus et al., 1993
A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company.
[Parse tree diagram: the Penn Treebank analysis of the sentence, with S, NP, VP, and PP-LOC nodes]
17
The same phrase, PropBanked (same data, released March 2004)
A GM-Jaguar pact would give the U.S. car maker an eventual 30% stake in the British company.
Arg0: a GM-Jaguar pact
REL: would give
Arg2: the U.S. car maker
Arg1: an eventual 30% stake in the British company
18
Frames File example: give (&lt; 4000 Frames for PropBank)
  • Roles:
  • Arg0: giver
  • Arg1: thing given
  • Arg2: entity given to
  • Example: double object
  • The executives gave the chefs a standing ovation.
  • Arg0: The executives
  • REL: gave
  • Arg2: the chefs
  • Arg1: a standing ovation
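If the NLTK PropBank sample corpus is installed (nltk.download('propbank')), a roleset such as give.01 can be inspected programmatically; a sketch, and the exact output depends on the frames files shipped with that sample.

    from nltk.corpus import propbank

    # The roleset is returned as an XML element from the give frames file.
    give_01 = propbank.roleset('give.01')
    for role in give_01.findall('roles/role'):
        print('Arg' + role.attrib['n'], '=', role.attrib['descr'])
    # Expected along the lines of: Arg0 = giver, Arg1 = thing given, Arg2 = entity given to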

19
NomBank Frames File example: gift (nominalizations, noun predicates, partitives, etc.)
  • Roles:
  • Arg0: giver
  • Arg1: thing given
  • Arg2: entity given to
  • Example:
  • Nancy's gift from her cousin was a complete surprise.
  • Arg0: her cousin
  • REL: gift
  • Arg2: her
  • Arg1: gift

20
Frames File example: give, w/ Thematic Role Labels
  • Roles:
  • Arg0: giver
  • Arg1: thing given
  • Arg2: entity given to
  • Example: double object
  • The executives gave the chefs a standing ovation.
  • Arg0 (Agent): The executives
  • REL: gave
  • Arg2 (Recipient): the chefs
  • Arg1 (Theme): a standing ovation

VerbNet based on Levin classes
21
Annotation procedure (released March 2004 by LDC)
  • PTB II: extract all sentences containing a given verb
  • Create a Frame File for that verb (Paul Kingsbury)
  • (3100 lemmas, 4700 framesets, 120K predicates)
  • 1st pass: automatic tagging (Joseph Rosenzweig)
  • 2nd pass: double-blind hand correction, verb by verb
  • Inter-annotator agreement: 84%
  • 3rd pass: adjudication (Olga Babko-Malaya)
  • 4th pass: training of automatic semantic role labelers
    (Dan Gildea, Sameer Pradhan, Nianwen Xue, Szuting Yi, ...)
  • CoNLL-04 shared task

22
Mapping from PropBank to VerbNet
  • Overlap with PropBank framesets
  • 50,000 PropBank instances
  • &lt; 50% of VN entries, &gt; 85% of VN classes
  • Results
  • MATCH: 78.63% (80.90% relaxed)
  • (VerbNet isn't just linguistic theory!)
  • Benefits
  • Thematic role labels and semantic predicates
  • Can extend PropBank coverage with VerbNet classes
  • WordNet sense tags
  • Kingsbury &amp; Kipper, NAACL-03 Text Meaning Workshop
  • http://www.cs.rochester.edu/gildea/VerbNet/
23
Outline
  • VerbNet
  • Proposition Banks (PropBanks)
  • Semantic Role Labeling
  • Comparison to FrameNet
  • Comparison to WordNet

24
Automatic Semantic Role Labeling
  • Objective:
  • correctly detecting and characterizing semantic
    relations within text is at the heart of successful
    natural language processing applications
  • Problem formalization: machine learning
  • Classification-based statistical approach
  • Training and testing data: PropBank annotations
  • both the 2004 release and the 2002 release (for comparison)

25
Approach
  • Pre-processing:
  • a heuristic that filters out unwanted constituents
    with high confidence
  • Argument identification:
  • a binary SVM classifier that identifies arguments
  • Argument classification:
  • a multi-class SVM classifier that tags arguments
    as ARG0-5, ARGA, or ARGM
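A schematic sketch of this three-stage pipeline, with scikit-learn SVMs standing in for the system's actual classifiers; the function and feature names are placeholders, not the system's code.

    from sklearn.svm import SVC

    def train_classifiers(X_all, is_argument, X_args, argument_labels):
        """Fit the two SVMs: argument identification and argument classification."""
        id_clf = SVC().fit(X_all, is_argument)        # binary: argument vs. non-argument
        cls_clf = SVC().fit(X_args, argument_labels)  # multi-class: ARG0-5 / ARGA / ARGM
        return id_clf, cls_clf

    def label_arguments(candidates, features, prefilter, id_clf, cls_clf):
        """candidates: parse-tree constituents; features: one vector per candidate."""
        labels = {}
        for node, x in zip(candidates, features):
            if not prefilter(node):                        # 1. heuristic pruning
                continue
            if id_clf.predict([x])[0] == 1:                # 2. binary argument identification
                labels[node] = cls_clf.predict([x])[0]     # 3. argument classification
        return labels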

26
Automatic Semantic Role Labeling
Gildea &amp; Jurafsky, CL-02; Gildea &amp; Palmer, ACL-02
  • Stochastic model
  • Basic features:
  • Predicate (the verb)
  • Phrase type (e.g., NP or S-BAR)
  • Parse tree path
  • Position (before/after the predicate)
  • Voice (active/passive)
  • Head word of the constituent
  • Subcategorization
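Most of these features are read straight off the parse tree; the parse tree path is the least obvious, so here is a small sketch of it in Python using an nltk.Tree parse (the '^'/'v' separators are one common rendering, not necessarily the one used in the papers).

    from nltk import Tree

    def tree_path(tree, const_pos, pred_pos):
        """Phrase-label path from a constituent up to the lowest common
        ancestor and back down to the predicate node; both positions are
        tree indices as returned by Tree.treepositions()."""
        i = 0
        while (i < min(len(const_pos), len(pred_pos))
               and const_pos[i] == pred_pos[i]):
            i += 1
        up = [tree[const_pos[:j]].label() for j in range(len(const_pos), i, -1)]
        down = [tree[pred_pos[:j]].label() for j in range(i, len(pred_pos) + 1)]
        return "^".join(up) + "^" + "v".join(down)

    t = Tree.fromstring("(S (NP (NNP John)) (VP (VBD broke) (NP (DT the) (NN jar))))")
    print(tree_path(t, (0,), (1, 0)))   # NP^SvVPvVBD  (subject NP up to S, down to the verb)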

27
Discussion Part I (Szuting Yi)
  • Comparisons between Pradhan and Penn (SVM)
  • Both systems are SVM-based
  • Kernel: Pradhan uses a degree-2 polynomial kernel;
    Penn uses a degree-3 RBF kernel
  • Multi-class classification: Pradhan uses a
    one-versus-others approach; Penn uses a pairwise approach
  • Features: Pradhan includes rich features, including NE,
    head word POS, partial path, verb classes, verb sense,
    head word of PP, first or last word/POS in the constituent,
    constituent tree distance, constituent relative features,
    temporal cue words, dynamic class context
    (Pradhan et al., 2004)

28
Discussion Part II
Xue &amp; Palmer, EMNLP-04
  • Different features for different subtasks
  • Basic features analysis

29
Discussion Part III (New Features: Bert Xue)
  • Syntactic frame:
  • uses NPs as pivots
  • varies with position within the frame
  • lexicalization with the predicate
  • Predicate:
  • head word
  • phrase type
  • head word of the PP parent
  • Position and voice

30
Results
31
Outline
  • VerbNet
  • Proposition Banks (PropBanks)
  • Semantic Role Labeling
  • Comparison to FrameNet
  • Comparison to WordNet

32
PropBank/FrameNet
Rambow et al., PBML-03
Buy:  Arg0 = buyer,  Arg1 = goods, Arg2 = seller, Arg3 = rate, Arg4 = payment
Sell: Arg0 = seller, Arg1 = goods, Arg2 = buyer, Arg3 = rate, Arg4 = payment
More syntactic and generic; maps readily to VN, TR. NSF funding to map PropBank to FrameNet.

33
Word Senses in PropBank
  • Orders to ignore word sense: not feasible for 700 verbs
  • Mary left the room
  • Mary left her daughter-in-law her pearls in her will
  • Frameset leave.01, "move away from":
  • Arg0: entity leaving
  • Arg1: place left
  • Frameset leave.02, "give":
  • Arg0: giver
  • Arg1: thing given
  • Arg2: beneficiary
  • How do these relate to traditional word senses in
    VerbNet and WordNet?

34
WordNet (Princeton): Miller 1985, Fellbaum 1998
  • On-line lexical reference (dictionary)
  • Nouns, verbs, adjectives, and adverbs grouped into
    synonym sets; also hypernyms (ISA), antonyms, meronyms
  • Limitations as a sense inventory for creating training
    data for supervised Machine Learning:
  • No explicit ties to syntactic structure or to participants
  • Definitions fine-grained and vague
  • SENSEVAL-2:
  • 29 verbs, &gt; 16 senses each (including call)
  • Inter-annotator agreement (ITA): 73%
  • Automatic Word Sense Disambiguation (WSD): 62.5%
  • Slow annotation speed: 60 tokens per hour

Dang &amp; Palmer, SIGLEX-02
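The call inventory on the next slide can be browsed directly with NLTK's WordNet interface; a sketch, and the sense ordering and count depend on the WordNet version shipped with NLTK.

    from nltk.corpus import wordnet as wn

    for i, synset in enumerate(wn.synsets('call', pos=wn.VERB), start=1):
        print(i, synset.name(), '-', synset.definition())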
35
WordNet: call, 28 senses
  • 1. name, call -- (assign a specified, proper name to;
    "They named their son David") → LABEL
  • 2. call, telephone, call up, phone, ring -- (get or try
    to get into communication (with someone) by telephone;
    "I tried to call you all night") → TELECOMMUNICATE
  • 3. call -- (ascribe a quality to or give a name of a
    common noun that reflects a quality; "He called me a
    bastard") → LABEL
  • 4. call, send for -- (order, request, or command to come;
    "She was called into the director's office"; "Call the
    police!") → ORDER

36
WordNet - call, 28 senses

[Diagram: all 28 WordNet senses of call (WN1-WN28), shown as an unorganized scatter]
37
WordNet - call, 28 senses, Senseval-2 groups

[Diagram: the 28 senses clustered into Senseval-2 sense groups: Loud cry, Bird or animal cry, Request, Label, Call a loan/bond, Challenge, Visit, Phone/radio, Bid]
38
Overlap with PropBank Framesets

[Diagram: the same Senseval-2 sense groups for call, shown in relation to the PropBank framesets]
39
Overlap between Senseval-2 groups and Framesets: 95%

[Diagram: WordNet senses of develop partitioned between Frameset1 and Frameset2]
40
Sense Hierarchy (Palmer et al., SNLU-04, NAACL-04)
  • PropBank Framesets: ITA &gt; 90%
  • coarse-grained distinctions
  • 20 Senseval-2 verbs w/ &gt; 1 Frameset
  • MaxEnt WSD system: 73.5% baseline, 90% accuracy
  • Sense Groups (Senseval-2): ITA 82%
  • intermediate level (includes Levin classes): 69%
  • WordNet: ITA 71%
  • fine-grained distinctions: 60.2%

Tagging w/ groups: ITA 89%, 200 tokens per hour
41
PropBank II
  • nominal reference
  • event variables
  • Also, [Arg0 substantially lower Dutch corporate tax
    rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax
    outlay] [Arg3-PRD flat]] [ArgM-ADV relative to earnings growth].

[Diagram: the nested predicate-argument structures -- help: Arg0 = tax rates, REL = help, Arg1 = the company keep its tax outlay flat, ArgM-ADV = relative to earnings; keep: Arg0 = the company, REL = keep, Arg1 = its tax outlay, Arg3-PRD = flat]
42
Summary of Needed KINDLE Resources