1
Proposition Bank: a resource of
predicate-argument relations
  • Martha Palmer
  • University of Pennsylvania
  • October 9, 2001
  • Columbia University

2
Outline
  • Overview (ACE consensus: BBN, NYU, MITRE, Penn)
  • Motivation
  • Approach
  • Guidelines, lexical resources, frame sets
  • Tagging process, hand correction of automatic
    tagging
  • Status: accuracy, progress
  • Colleagues: Joseph Rosenzweig, Paul Kingsbury,
    Hoa Dang, Karin Kipper, Scott Cotton, Lauren
    Delfs, Christiane Fellbaum

3
Proposition Bank: Generalizing from Sentences to
Propositions

Powell met Zhu Rongji
  meet(Somebody1, Somebody2)

When Powell met Zhu Rongji on Thursday they
discussed the return of the spy plane.
  meet(Powell, Zhu)
  discuss(Powell, Zhu, return(X, plane))
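A minimal sketch (my own, not from the slides) of one way to hold such propositions as data; the class and field names are illustrative only.

from dataclasses import dataclass

@dataclass
class Proposition:
    predicate: str   # e.g. "meet", "discuss", "return"
    args: tuple      # argument fillers; an argument may itself be a Proposition

plane_return = Proposition("return", ("X", "plane"))   # X: the returner is unspecified
props = [
    Proposition("meet", ("Powell", "Zhu")),
    Proposition("discuss", ("Powell", "Zhu", plane_return)),
]
for p in props:
    print(p.predicate, p.args)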
4
Penn English Treebank
  • 1.3 million words
  • Wall Street Journal and other sources
  • Tagged with Part-of-Speech
  • Syntactically Parsed
  • Widely used in NLP community
  • Available from Linguistic Data Consortium

5
A TreeBanked Sentence
(S (NP-SBJ Analysts)
   (VP have
     (VP been
       (VP expecting
         (NP (NP a GM-Jaguar pact)
             (SBAR (WHNP-1 that)
               (S (NP-SBJ T-1)
                 (VP would
                   (VP give
                     (NP the U.S. car maker)
                     (NP (NP an eventual (ADJP 30%) stake)
                         (PP-LOC in (NP the British company))))))))))))
Analysts have been expecting a GM-Jaguar pact
that would give the U.S. car maker an eventual
30% stake in the British company.
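A bracketing like the one above can be loaded and walked programmatically; below is a sketch assuming the NLTK library, with the tree simplified from the slide.

from nltk import Tree

bracketing = """
(S (NP-SBJ (NNS Analysts))
   (VP (VBP have)
       (VP (VBN been)
           (VP (VBG expecting)
               (NP (DT a) (NNP GM-Jaguar) (NN pact))))))
"""
tree = Tree.fromstring(bracketing)
tree.pretty_print()                                  # draw the constituent tree
print([sub.label() for sub in tree.subtrees()])      # S, NP-SBJ, VP, ...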
6
The same sentence, PropBanked
(S Arg0 (NP-SBJ Analysts)
   (VP have
     (VP been
       (VP expecting
         Arg1 (NP (NP a GM-Jaguar pact)
                  (SBAR (WHNP-1 that)
                    (S Arg0 (NP-SBJ T-1)
                      (VP would
                        (VP give
                          Arg2 (NP the U.S. car maker)
                          Arg1 (NP (NP an eventual (ADJP 30%) stake)
                                   (PP-LOC in (NP the British company))))))))))))
expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
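An illustrative sketch of the information the PropBank layer adds per verb instance; the field names are mine, not PropBank's file format, and the arguments are shown here as constituent text.

propbank_layer = [
    {"rel": "expect",
     "Arg0": "Analysts",
     "Arg1": "a GM-Jaguar pact that would give the U.S. car maker "
             "an eventual 30% stake in the British company"},
    {"rel": "give",
     "Arg0": "a GM-Jaguar pact",          # via the trace T-1 bound to "that"
     "Arg2": "the U.S. car maker",
     "Arg1": "an eventual 30% stake in the British company"},
]
for prop in propbank_layer:
    print(prop["rel"], {k: v for k, v in prop.items() if k != "rel"})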
7
Motivation
  • Why do we need accurate predicate-argument
    relations?
  • They have a major impact on information
    processing.
  • Example: Korean/English Machine Translation (ARL/SBIR)
  • CoGenTex, Penn, Systran (K/E bilingual lexicon,
    20K)
  • 4K words (< 500 words from Systran, military
    messages)
  • Plug-and-play architecture based on DsyntS
    (rich dependency structure)
  • A converter bug led to random relabeling of
    predicate arguments
  • Correcting the predicate argument labels alone led
    to a tripling of acceptable sentence output

8
Focusing on parser comparisons
  • 200 sentences hand-selected to represent good
    translations given a correct parse
  • Used to compare:
  • Corrected DsyntS output
  • Juntae's parser output (off-the-shelf)
  • Anoop's parser output (Treebank-trained, 95% F)

9
Evaluating translation quality
  • Compare DLI human translation to system output
    (200)
  • Criteria used by human judges (2 or more, not
    blind):
  • g - good, exactly right
  • f1 - fairly good, but small grammatical mistakes
  • f2 - needs fixing, but vocabulary basically there
  • f3 - needs quite a bit of fixing; usually some
    un-translated vocabulary, but most vocabulary is right
  • m - seems grammatical, but semantically wrong,
    actually misleading
  • i - irredeemable, really wrong, major problems

10
Results Comparison (200 sentences)
11
Plug and play?
  • Converter used to map parser outputs into the MT
    DsyntS format
  • A bug in the converter affected both systems:
  • Predicate argument structure labels were being
    lost in the conversion process and relabeled
    randomly
  • The converter was also still tuned to Juntae's
    parse output and needed to be customized to Anoop's

12
Anoop's parse -> MTW DsyntS
  • 0010 Target: Unit designations are normally
    transmitted in code.
  • 0010 Corrected: Normally unit designations are
    notified in the code.
  • 0010 Anoop: Normally it is notified unit
    designations in code.

[DSyntS diagram for "notified": parser (P) vs. corrected (C) argument labels over "unit designations", "code", "normally"]
13
Anoop's parse -> MTW DsyntS
0022 Target: Under what circumstances does radio
interference occur?
0022 Corrected: In what circumstances does the
interference happen in the radio?
0022 Anoop: Do in what circumstance happen
interference in radio?
[DSyntS diagram for "happen": parser (P) vs. corrected (C) argument labels over "interference", "circumstances", "radio", "what"]
14
New and Old Results Comparison
15
English PropBank
  • 1M words of Treebank over 2 years, May 2001 - May 2003
  • New semantic augmentations:
  • Predicate-argument relations for verbs
  • label arguments: Arg0, Arg1, Arg2, ...
  • First subtask: 300K-word financial subcorpus
    (12K sentences, 35K predicates)
  • Spin-offs: Guidelines (necessary for annotators)
  • English lexical resource:
    6000 verbs with labeled examples, rich semantics

16
Task: not just undoing passives
  • The earthquake shook the building.
  •   <arg0>      <WN3>    <arg1>
  • The walls shook; the building rocked.
  •   <arg1> <WN3>   <arg1>   <WN1>
  • The guidelines / lexicon with examples:
  • Frames Files

17
Guidelines: Frames Files
  • Created manually (Paul Kingsbury);
    working on semi-automatic expansion
  • Refer to VerbNet, WordNet and FrameNet
  • Currently in place for 230 verbs
  • Can expand to 2000 using VerbNet;
    will need hand correction
  • Use semantic role glosses unique to each verb
    (mapped to Arg0, Arg1 labels appropriate to the class)

18
Frames Example: expect
  • Roles:
  • Arg0: expecter
  • Arg1: thing expected
  • Example: transitive, active
  • Portfolio managers expect further declines in
    interest rates.
  • Arg0: Portfolio managers
  • REL: expect
  • Arg1: further declines in interest rates
19
Frames File example: give
  • Roles:
  • Arg0: giver
  • Arg1: thing given
  • Arg2: entity given to
  • Example: double object (also sketched as data below)
  • The executives gave the chefs a standing
    ovation.
  • Arg0: The executives
  • REL: gave
  • Arg2: the chefs
  • Arg1: a standing ovation
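The sketch below shows the information such a Frames File entry carries for give, written as plain Python data; this structure is illustrative only, not the project's on-disk format.

give_frame = {
    "lemma": "give",
    "roles": {"Arg0": "giver", "Arg1": "thing given", "Arg2": "entity given to"},
    "examples": [{
        "name": "double object",
        "text": "The executives gave the chefs a standing ovation.",
        "args": {"Arg0": "The executives",
                 "REL": "gave",
                 "Arg2": "the chefs",
                 "Arg1": "a standing ovation"},
    }],
}
print(give_frame["roles"])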

20
The same sentence, PropBanked
(S Arg0 (NP-SBJ Analysts)
   (VP have
     (VP been
       (VP expecting
         Arg1 (NP (NP a GM-Jaguar pact)
                  (SBAR (WHNP-1 that)
                    (S Arg0 (NP-SBJ T-1)
                      (VP would
                        (VP give
                          Arg2 (NP the U.S. car maker)
                          Arg1 (NP (NP an eventual (ADJP 30%) stake)
                                   (PP-LOC in (NP the British company))))))))))))
expect(Analysts, GM-J pact)
give(GM-J pact, US car maker, 30% stake)
21
Complete Sentence
Analysts have been expecting a GM-Jaguar pact
that T-1 would give the U.S. car maker an
eventual 30% stake in the British company and
create joint ventures that T-2 would produce an
executive-model range of cars.
22
How are arguments numbered?
  • Examination of example sentences
  • Determination of required / highly preferred
    elements
  • Sequential numbering; Arg0 is the typical first
    argument, except for
  • ergative/unaccusative verbs (shake example)
  • Arguments mapped for "synonymous" verbs

23
Additional tags (arguments or adjuncts?)
  • Variety of ArgMs (Arg>4):
  • TMP - when?
  • LOC - where at?
  • DIR - where to?
  • MNR - how?
  • PRP - why?
  • REC - himself, themselves, each other
  • PRD - this argument refers to or modifies another
  • ADV - others

24
Tense/aspect
  • Verbs also marked for tense/aspect
  • Passive
  • Perfect
  • Progressive
  • Infinitival
  • Modals and negation marked as ArgMs

25
Ergative/Unaccusative Verbs: rise
  • Roles
  • Arg1: logical subject, patient, thing rising
  • Arg2: EXT, amount risen
  • Arg3: start point
  • Arg4: end point
  • Sales rose 4% to $3.28 billion from $3.16 billion.

Note: Have to mention the preposition explicitly
(Arg3-from, Arg4-to), or could have used
ArgM-Source, ArgM-Goal. Arbitrary distinction.
26
Synonymous Verbs: add in the sense of rise
  • Roles
  • Arg1: logical subject, patient, thing
    rising/gaining/being added to
  • Arg2: EXT, amount risen
  • Arg4: end point
  • The Nasdaq composite index added 1.01 to
    456.6 on paltry volume.

27
Phrasal Verbs
  • Put together
  • Put in
  • Put off
  • Put on
  • Put out
  • Put up
  • ...

28
Frames: Multiple Rolesets
  • Rolesets are not necessarily consistent between
    different senses of the same verb
  • A verb with multiple senses can have multiple
    frames, but not necessarily
  • Roles and mappings onto argument labels are
    consistent between different verbs that share
    similar argument structures (similar to FrameNet)
  • Levin / VerbNet classes
  • http://www.cis.upenn.edu/~dgildea/VerbNet/
  • Out of the 179 most frequent verbs:
  • 1 roleset: 92
  • 2 rolesets: 45
  • 3 rolesets: 42 (includes light verbs)

29
Annotation procedure
  • Extraction of all sentences with a given verb
  • First pass: automatic tagging
  • Second pass: double-blind hand correction
  • Annotators have a variety of backgrounds,
    with less syntactic training than for treebanking
  • Script to discover discrepancies (see the sketch
    after this list)
  • Third pass: Solomonization (adjudication)
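A hedged sketch of the kind of discrepancy check mentioned above: compare two annotators' labels for the same predicate instances and list the disagreements. The sentence id and the abbreviated fillers below are illustrative.

def discrepancies(pass_a, pass_b):
    """Each pass maps (sentence_id, predicate) -> {arg_label: filler}."""
    diffs = []
    for key in sorted(set(pass_a) | set(pass_b)):
        a, b = pass_a.get(key, {}), pass_b.get(key, {})
        if a != b:
            diffs.append((key, a, b))
    return diffs

kate = {("sent-36", "tell"): {"Arg0": "Intel", "Arg2": "analysts",
                              "Arg1": "the company will resume shipments ..."}}
erwin = {("sent-36", "tell"): {"Arg0": "Intel", "Arg2": "analysts",
                               "Arg1": "that the company will resume shipments ..."}}
for key, a, b in discrepancies(kate, erwin):
    print(key)
    print("  annotator 1:", a)
    print("  annotator 2:", b)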

30
Inter-annotator agreement
31
Annotator Accuracy vs. Gold Standard
  • One version of annotation chosen (senior annotator)
  • Solomon modifies -> Gold Standard

32
Status
  • 179 verbs framed (Senseval2 verbs)
  • 97 verbs first-passed
  • 12,300 predicates
  • Does not include 3000 predicates tagged for
    Senseval
  • 54 verbs second-passed
  • 6600 predicates
  • 9 verbs solomonized
  • 885 predicates

33
Throughput
  • Framing: approximately 2 verbs per hour
  • Annotation: approximately 50 sentences per hour
  • Solomonization: approximately 1 hour per verb

34
Automatic Predicate Argument Tagger
  • Assigns predicate argument labels
  • Uses TreeBank cues
  • Consults a lexical semantic KB:
  • Hierarchically organized verb subcategorization
    frames and alternations associated with tree
    templates
  • Ontology of noun-phrase referents
  • Multi-word lexical items
  • Matches annotated tree templates against the parse
    in Tree-Adjoining Grammar style
  • Standoff annotation in an external file referencing
    tree nodes (sketched below)
  • Preliminary accuracy rate of 83.7% (800
    predicates)
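Sketch only: one plausible shape for a standoff record that points back at tree nodes rather than copying text. Every field name and the file name below are illustrative, not the tagger's actual output format.

standoff_record = {
    "file": "wsj_0231.mrg",                # illustrative Treebank file name
    "sentence": 12,                        # sentence index within the file
    "predicate_node": "VP/VBG:expecting",  # pointer into the parse tree
    "frameset": "expect",
    "args": {"Arg0": "NP-SBJ[0]",          # tree-node addresses, not raw text
             "Arg1": "NP[1]"},
}
print(standoff_record)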

35
Summary
  • Predicate-argument structure labels are arbitrary
    to a certain degree, but still consistent, and
    generic enough to be mappable to particular
    theoretical frameworks
  • Automatic tagging as a first pass makes the task
    feasible
  • Agreement and accuracy figures are reassuring

36
Solomonization
  • Source tree: Intel told analysts that the company
    will resume shipments of the chips within two to
    three weeks.
  • kate said:
  • arg0: Intel
  • arg1: the company will resume shipments of
    the chips within two to three weeks
  • arg2: analysts
  • erwin said:
  • arg0: Intel
  • arg1: that the company will resume shipments
    of the chips within two to three weeks
  • arg2: analysts

37
Solomonization
  • Such loans to Argentina also remain classified as
    non-accruing, TRACE-1 costing the bank $10
    million TRACE-U of interest income in the
    third period.
  • kate said:
  • argM-TMP: in the third period
  • arg3: the bank
  • arg2: $10 million TRACE-U of interest
    income
  • arg1: TRACE-1
  • erwin said:
  • argM-TMP: in the third period
  • arg3: the bank
  • arg2: $10 million TRACE-U of interest
    income
  • arg1: TRACE-1
  • Such loans to Argentina

38
Solomonization
  • Also, substantially lower Dutch corporate tax
    rates helped the company keep its tax outlay flat
    relative to earnings growth.
  • kate said:
  • argM-MNR: relative to earnings growth
  • arg3-PRD: flat
  • arg1: its tax outlay
  • arg0: the company
  • katherine said:
  • argM-ADV: relative to earnings growth
  • arg3-PRD: flat
  • arg1: its tax outlay
  • arg0: the company