Title: Automatic semantic role labeling using FrameNet: a literature survey
- Justin Betteridge
- 11-731 Machine Translation
- April 18, 2005
Outline
- Introduction
- Automatic Labeling of Semantic Roles (Gildea & Jurafsky, 2002)
- SENSEVAL-3 semantic role labeling task
- Applications in MT
- Conclusions / Work in progress
1. Introduction
- Example
- She blames the Government for failing to do enough to help.
- [Judge She] blames [Evaluee the Government] [Reason for failing to do enough to help].
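The labeled example above can be represented concretely as character spans over the sentence. A minimal Python sketch follows; the dict layout and the `span` helper are illustrative choices, not FrameNet's actual export format (the Judge/Evaluee/Reason roles come from the example itself):

```python
# Toy representation of a FrameNet-style annotation: character spans over
# the sentence, keyed by frame element name. Offsets are computed rather
# than hand-entered so the sketch stays self-checking.

sentence = "She blames the Government for failing to do enough to help."

def span(text):
    """Return the (start, end) character offsets of `text` in the sentence."""
    start = sentence.index(text)
    return (start, start + len(text))

annotation = {
    "target": span("blames"),
    "Judge": span("She"),
    "Evaluee": span("the Government"),
    "Reason": span("for failing to do enough to help"),
}
print(annotation)
```

Span-based storage like this is also what the SENSEVAL-3 scoring (overlapping characters) operates on.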
1.2 Semantic Roles
- linking theory
- mapping from syntax to semantics
- spectrum of approaches
- specific → general
- (CS) → (linguists)
1.3 FrameNet (Johnson et al., 2003)
2. Automatic Labeling of Semantic Roles (Gildea & Jurafsky, 2002)
- Statistical techniques
- training: 36,995 sentences
- development: 8,167 sentences
- test: 7,900 sentences
- FrameNet 1.0 (67 frames, 12 domains)
- theory of frame semantics by Fillmore (1976)
- Applications
- generalizing IR, QA, semantic dialogue systems
- help in WSD
- intermediate representation in SMT, text summarization, text data mining
- incorporation into probabilistic language models: accurate parsers, better LM for ASR
2.1 Previous Work
- Data-driven techniques
- Miller et al. (1996): ATIS air travel domain
- Riloff (1993): data-driven IE, dictionary of patterns for filling slots in a specific domain
- Riloff and Schmelzenbach (1998): automatically derive entire "case frames" for words in the domain
- Blaheta and Charniak (2000): domain-independent system trained on function tags in PTB; doesn't include all arguments of most predicates
- this work: identifies all semantic roles for a wide variety of predicates in unrestricted text
2.2 Features Used
- Governing Category
- either S (for subjects) or VP (objects)
- only applies to NPs
- Parse Tree Path
- Example
- constituent's syntactic relation to target word (unlike gov)
- dependent on parse tree formalism: 2,978 different values in training data, not counting unmatched frame elements; 4,086 otherwise
- subsumes gov when S or VP appears in the path (only 4 of 35,138 NPs do not)
2.2 Features Used
- Position
- whether constituent occurs before or after target word
- gov, position, and path all represent the syntactic relation between target word and constituent
- individual experiments showed all performed about the same
- Voice
- direct objects of active verbs often correspond to subjects of passive verbs
- passives identified using patterns (passive auxiliary and past participle)
- about 5% of examples were classified as passive
- Head Word
- expected lexical information to be important (as in other areas of NLP)
- integral part of the Collins parser → read directly from parse tree
- prepositions and complementizers are heads
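Gildea & Jurafsky's path feature can be sketched in a few lines. The tree encoding and function names below are my own (plain `(label, children)` tuples standing in for a real parser's output); the sketch reproduces their canonical "VB↑VP↓NP" example:

```python
# Sketch of the "parse tree path" feature: the chain of category labels
# from the target word's node up to the lowest common ancestor, then down
# to the candidate constituent (↑ = ascend, ↓ = descend).

def spine(tree, goal):
    """Labels from the root of `tree` down to the subtree `goal`, else None."""
    label, children = tree
    if tree is goal:
        return [label]
    for child in children:
        if isinstance(child, tuple):
            below = spine(child, goal)
            if below is not None:
                return [label] + below
    return None

def tree_path(root, target, constituent):
    """Path feature string, e.g. 'VB↑VP↓NP'."""
    up, down = spine(root, target), spine(root, constituent)
    i = 0  # index of the lowest common ancestor in both spines
    while i + 1 < min(len(up), len(down)) and up[i + 1] == down[i + 1]:
        i += 1
    ups = list(reversed(up[i:]))  # from the target up to the common ancestor
    return "↑".join(ups) + "↓" + "↓".join(down[i + 1:])

# He ate some pancakes: path from the target verb to the object NP
vb = ("VB", ["ate"])
np2 = ("NP", [("DT", ["some"]), ("NN", ["pancakes"])])
sent = ("S", [("NP", [("PRP", ["He"])]), ("VP", [vb, np2])])

print(tree_path(sent, vb, np2))  # VB↑VP↓NP
```

The large number of distinct path values noted above follows directly from this definition: every tree shape between target and constituent yields its own string.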
2.3 Probability Estimation
- Linear interpolation of distributions
- used equal weights and also EM training
- only included distributions with data
- interpolation weights have "relatively little impact"
- (interested in ranking, not exact probabilities)
- "backoff" model: lattice of distributions
- organized by specificity
- minimal lattice gives 9/10 of the performance of the whole system
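The equal-weight interpolation idea can be sketched as follows; the counts and the two conditioning feature sets are toy stand-ins, not G&J's actual distributions:

```python
# Linear interpolation of conditional role distributions with equal
# weights, including only those distributions that have data for the
# conditioning event (as the slide above describes).

from collections import Counter

def estimate(counts):
    """Turn a Counter of (condition, role) counts into P(role | condition)."""
    totals = Counter()
    for (cond, _), n in counts.items():
        totals[cond] += n
    return lambda role, cond: (counts[(cond, role)] / totals[cond]
                               if totals[cond] else None)

# toy distributions: P(r | target) and P(r | phrase type, target)
p_t  = estimate(Counter({(("blame",), "Judge"): 3, (("blame",), "Evaluee"): 1}))
p_pt = estimate(Counter({(("NP", "blame"), "Judge"): 2}))

def interp(role, target, phrase_type):
    """Equal-weight average over the distributions that have seen this event."""
    vals = [p for p in (p_t(role, (target,)),
                        p_pt(role, (phrase_type, target))) if p is not None]
    return sum(vals) / len(vals) if vals else 0.0

print(interp("Judge", "blame", "NP"))  # (0.75 + 1.0) / 2 = 0.875
```

Since the model is only used to rank candidate roles, the exact weights matter little, which matches the "relatively little impact" observation.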
2.4 Identification of Frame Element Boundaries
- Finding boundaries handled separately
- probabilities using similar features
- path, target word, constituent head word
- fe: whether a constituent is a frame element
- Distributions
- data sparseness/fragmentation problems for some feature combinations
- only about 30 sentences available for each target word
2.5 Generalizing Lexical Statistics
- Lexical features most informative
- 87.4% accuracy for P(r | h, pt, t)
- also lowest coverage: data sparseness problem
- 3 ways to generalize NPs (4,086 instances)
- automatic clustering (85%)
- WordNet-based semantic hierarchy (84.3%)
- bootstrapping (83.2%)
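The clustering-based generalization amounts to a backoff: condition on the head word when it was seen in training, otherwise on the head word's cluster. A toy sketch (clusters, head words, and probabilities are all invented for illustration):

```python
# Cluster backoff for head-word statistics: unseen heads fall back to a
# distribution conditioned on their automatically derived cluster.

cluster_of = {"government": "ORG", "committee": "ORG", "hoist": "TOOL"}

# toy estimates of P(role | head) and P(role | cluster)
role_given_head = {("government", "Evaluee"): 0.9, ("government", "Judge"): 0.1}
role_given_cluster = {("ORG", "Evaluee"): 0.8, ("ORG", "Judge"): 0.2}

def p_role(role, head):
    """Use head-word statistics if available, else back off to the cluster."""
    if any(h == head for h, _ in role_given_head):
        return role_given_head.get((head, role), 0.0)
    cluster = cluster_of.get(head)
    return role_given_cluster.get((cluster, role), 0.0)

print(p_role("Evaluee", "committee"))  # unseen head: falls back to ORG cluster
```

The WordNet-hierarchy variant works the same way, with hypernyms playing the role of clusters.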
2.7 Verb Argument Structure
- how to handle different argument signatures for the same word?
- He opened the door vs. The door opened
- 2 strategies
- sentence-level feature for frame element groups
- subcategorization feature
- 81.6% performance
2.8 Integrating Syntactic and Semantic Parsing
- Collins (1999) form of chart parsing with PCFG
- frame element probabilities applied to full parses
- average of 14.9 parses per sentence
- 18% of sentences assigned a different parse
- still, performance effect quite small
- not enough available parses per sentence? inefficient n-best parsing algorithm
2.9 Generalizing to Unseen Data
- Used 18 abstract thematic roles
- Performance using abstract roles: 82.1%
- Unseen Predicates
- results encouraging: linearly interpolated model achieves 79.4% on the test set
- agrees with linking theory
- FN frames are fine-grained enough
- Unseen Frames
- only 67 frames in FrameNet 1.0
- using a minimal lattice, performance was 51%
- Unseen Domains: 39.8% (baseline 40.9%)
- FN domains serve mainly as a way of organizing the project
3. SENSEVAL-3 Task (Litkowski, 2004)
- FrameNet 1.1
- 487 frames
- 696 different frame element names (may have different meanings in different frames)
- 132,968 annotated sentences (mostly from BNC)
- test set for this task
- 8,002 sentences selected randomly from 40 frames (themselves randomly selected from those with at least 370 annotations)
- training set
- 24,558 sentence IDs → look up in FN
- 2 cases
- Unrestricted case: frame element labeling
- Restricted case: frame element identification + labeling
3. SENSEVAL-3 Task
- scoring measures
- Precision
- correct / attempted
- Recall
- correct / total FE
- Overlap (avg. overlap of all correct)
- overlapping chars / chars in FN answer
- Attempted
- (# FEs generated / # FEs in test set) × 100
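The four measures can be sketched as follows, assuming gold and predicted answers are (frame element name, start char, end char) triples; all spans below are toy data, and complications like duplicate FE names are ignored:

```python
# Sketch of the SENSEVAL-3 scoring measures: precision, recall, character
# overlap (averaged over correct answers), and percent attempted.

def score(gold, predicted):
    # a prediction is "correct" if it names a gold FE and overlaps its span
    correct = [p for p in predicted
               if any(p[0] == g[0] and p[1] < g[2] and g[1] < p[2] for g in gold)]
    # fraction of the gold answer's characters covered, per correct answer
    overlaps = []
    for name, s, e in correct:
        g = next(g for g in gold if g[0] == name)
        covered = max(0, min(e, g[2]) - max(s, g[1]))
        overlaps.append(covered / (g[2] - g[1]))
    return {
        "precision": len(correct) / len(predicted),
        "recall":    len(correct) / len(gold),
        "overlap":   sum(overlaps) / len(overlaps) if overlaps else 0.0,
        "attempted": 100 * len(predicted) / len(gold),
    }

gold = [("Judge", 0, 3), ("Evaluee", 11, 25), ("Reason", 26, 60)]
pred = [("Judge", 0, 3), ("Evaluee", 11, 20)]
print(score(gold, pred))
```

Note how overlap tracks precision unless predicted boundaries are slightly off, which is exactly the pattern in the results slide that follows.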
3.1 Results
- 8 teams, 20 runs
- CLResearch, USaarland, UAmsterdam: 1 restricted case
- ISI, UTDMoldovan, UTDMorarescu: 1 restricted, 1 unrestricted
- HKPolyU: 8 unrestricted cases
- UUtah: 2 restricted, 2 unrestricted
- Unrestricted case (classification task)
- avg. precision 0.870
- avg. recall 0.828
- overlap almost identical to precision (slight positional errors)
- avg. precision 0.677
- avg. recall 0.547
- avg. overlap 0.622
3.2 University of Amsterdam
- dependency-based syntactic analysis important
- added PTB functional tags and non-local dependencies w/ TiMBL
- Memory-Based Learning based on syntactic paths from target word
- features used include
- frame name
- words along the path
- semantic classes of words along the path
- nouns: using WordNet
- adverbs, prepositions: one of 6 clusters obtained from FN using k-means
- POS tags of words along the path
- subcategorization of target word
- others (22 total)
- P (precision) 86.9, O (overlap) 84.7, R (recall) 75.2, A (attempted) 86.4
3.3 Saarland University
- focused on generalizing using various similarity measures for frame elements of different frames
- syntactic nodes are instances for frame-level learning
- 2 learning methods
- log-linear Maximum Entropy model
- Memory Based Learning
- generalization techniques
- Frame hierarchy
- peripheral frame elements
- EM-based semantic clustering
- novel features
- preposition (if any)
- whether this path had been seen for a frame element in training data
- MaxEnt learner w/ all features and the 3 most helpful generalization techniques (EM head lemma, EM path, peripherals)
- P65.4, O60.2, R47.1, A72.0
- MBL learner w/ all features, no extra training data from generalization
- P73.6, O67.5, R59.4, A80.7
3.4 CL Research
- only exploratory participation
- integrate frame semantics into their Knowledge Management System
- P58.3, O48.0, R11.1, A19.0
3.5 Information Sciences Institute (ISI)
- Maximum Entropy models
- FE identification: classify as FE, Target, or None
- features for FE identification
- partial path
- logical function: external argument, object argument, or other
- previous class: class information of the nth-previous constituent (Target, FE, or None)
- Semantic role classification
- order: relative position of a FE in the sentence (0 at left)
- syntactic pattern: phrase type and logical function of each FE
- Unrestricted: P86.7, O86.6, R85.8, A99.0
- Restricted: P80.2, O78.4, R65.4, A81.5
3.6 University of Texas, Dallas (Morarescu)
- SVM classifier for each frame using combinations of 4 feature sets
- from the Gildea & Jurafsky study
- from Surdeanu (2003)
- content word (for PPs, SBARs, and VPs)
- head word POS
- content word POS
- named entity class of content word
- boolean named entity flags (whether an organization, location, person, etc. was recognized in the phrase)
- new features
- from Pradhan (2004)
3.6 University of Texas, Dallas (Morarescu)
- New Features
- human: personal pronoun or a hyponym of PERSON sense 1 in WN
- support verbs: head of the VP that contains the target word (for nouns, adjectives)
- target type: lexical class of the target word (verb, noun, or adjective)
- list constituent (FEs): phrase types of the other FEs
- grammatical function: external argument, object, complement, modifier, head noun modified by attributive adjective, genitive determiner, appositive
- list grammatical function: grammatical functions of the other FEs
- number of FEs in sentence
- frame name
- coverage: whether there is a subparse that exactly covers the FE
- coreness: core, peripheral, or extrathematic
- subcorpus: name of subcorpus (12,456 possible); indicates relations between target word and some of its FEs
3.6 University of Texas, Dallas (Morarescu)
- Generalization to obtain extended data
- Unrestricted: P94.6, O94.6, R90.7, A95.8
- Restricted: P89.9, O88.2, R77.2, A85.9
3.7 University of Texas, Dallas (Moldovan)
- SVM classifiers
- divided up training data by target word type: verb, noun, adjective
- 16 features, 3 sets
- baseline: same as G&J
- modified: slight modifications
- new
- argument structure: phrase structure of the nodes along the path between the root node of the argument and its head word, in level order
- distance between argument and target
- PropBank semantic argument: captures semantic type of argument
- diathesis alternation: flat representation of the predicate argument structure (as in PropBank)
- Unrestricted: P89.8, O89.7, R83.9, A93.4
- Restricted: P80.7, O77.7, R78.0, A96.7
3.8 Hong Kong Polytechnic University
- frame-level classifiers
- 5 different ML techniques
- Boosting
- most successful
- SVM
- Maximum Entropy
- SNOW (Sparse Network Of Winnows)
- Decision Lists
- various ensembles of these techniques (8 runs submitted)
- SVM, Boosting, MaxEnt (binary) got the highest scores
- Unrestricted: P87.4, O87.3, R86.7, A99.2
3.9 University of Utah
- used generative models (Jordan 1999)
- joint probability distribution over targets, frames, roles, and constituents (basically a 1st-order HMM)
- novel features
- depth, height of constituent in parse tree
- constituent word count
- Unrestricted: P85.8, O85.7, R84.9, A98.9
- Restricted: P35.5, O25.5, R45.3, A127.9
4. Applications in MT
- intermediate representation in SMT
- Replacement for expensive hand-crafted domain models in KBMT
4. Applications in MT
- KANT Domain Model
- used for disambiguating PP attachment
- Use SRL to learn
- Lift [Theme the engine] [Source from the chassis] [Instrument with a hoist].
5. Conclusions
- FrameNet is a useful resource for shallow semantic parsing via supervised learning
- Still need word sense / frame disambiguation
5. Work in progress
- Work outside FrameNet/SENSEVAL-3
- Comparison with PropBank/CoNLL
- Details of application in KBMT
Questions?