Title: Approximating Textual Entailment with LFG and FrameNet Frames
1. Approximating Textual Entailment with LFG and FrameNet Frames
- Aljoscha Burchardt, Anette Frank
- Computational Linguistics Department
- Saarland University, Saarbrücken
- Second Pascal Challenge Workshop
- Venice, April 2006
2. Outline of this Talk
- Frame Semantics
- A baseline system for approximating Textual Entailment
  - LFG syntactical analyses with Frame Semantics
  - Statistical decision: entailed?
- Walk-through example from RTE 2006
- RTE 2006 results / brief conclusions
3. Frame Semantics (Fillmore 1976, Fillmore et al. 2003)
- Lexical semantic classification of predicates and their argument structure
- A frame represents a prototypical situation (e.g. Commercial_transaction, Theft, Awareness)
- A set of roles identifies the participants or propositions involved
- Frames are organized in a hierarchy
- Berkeley FrameNet Project database: 600 frames, 9,000 lexical units, 135,000 annotated sentences
4. Linguistic Normalizations (Frame: Commerce_buy)
Roles: Seller, Buyer, Goods, Money
- BMW bought Rover from British Aerospace.
- Rover was bought by BMW, which financed ... the new Range Rover.
- BMW, which acquired Rover in 1994, is now dismantling the company.
- BMW's purchase of Rover for 1.2 billion was a good move.
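One way to picture the normalization: the four sentences above differ in voice, part of speech and lexical choice, yet all can be reduced to the same frame with agreeing role fillers. The sketch below is illustrative only. The `FrameInstance` class, the hand-written role assignments and the `compatible` helper are assumptions made for this example, not FrameNet data or output of the system described in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class FrameInstance:
    """A frame evoked by a trigger word, with hand-annotated role fillers."""
    frame: str
    trigger: str
    roles: dict = field(default_factory=dict)

# Hand-written annotations for the four example sentences (illustrative only).
instances = [
    FrameInstance("Commerce_buy", "bought",
                  {"Buyer": "BMW", "Goods": "Rover", "Seller": "British Aerospace"}),
    FrameInstance("Commerce_buy", "was bought", {"Buyer": "BMW", "Goods": "Rover"}),
    FrameInstance("Commerce_buy", "acquired", {"Buyer": "BMW", "Goods": "Rover"}),
    FrameInstance("Commerce_buy", "purchase",
                  {"Buyer": "BMW", "Goods": "Rover", "Money": "1.2 billion"}),
]

def compatible(a, b):
    """True if both instances evoke the same frame and agree on every shared role."""
    if a.frame != b.frame:
        return False
    shared = a.roles.keys() & b.roles.keys()
    return all(a.roles[r] == b.roles[r] for r in shared)

# Compare the active-voice sentence with the passive, synonym and nominal variants.
all_compatible = all(compatible(instances[0], other) for other in instances[1:])
print(all_compatible)
```

Roles that one sentence leaves unexpressed (for example Seller in the passive variant) are simply ignored in the comparison, which is one possible design choice among several.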
5. Frame Semantics for RTE
- Focusing on lexical semantic classes and role-based argument structure
- Built-in normalizations help to determine semantic similarity at a high level of abstraction
- Disregarding aspects of deep semantics: negation, modality, quantification, ...
- Open for deeper modeling on demand (e.g. our treatment of modality)
6. A Baseline System for Approximating Textual Entailment
- Fine-grained LFG-based syntactic analysis
  - English LFG grammar (Riezler et al. 2002)
  - Wide-coverage with high-quality probabilistic disambiguation
- Frame Semantics
  - Shallow lexical-semantic classification of predicate-argument structure
  - Extensions: WordNet senses, SUMO concepts
- Computing structural and semantic overlap of text (t) and hypothesis (h)
  - Working hypothesis: large overlap indicates entailment
7. A Baseline System for Approximating Textual Entailment
[Architecture diagram. Stages shown: Linguistic Analyses; Computing Semantic Overlap; Model training / classification; Statistical Decision: Entailment?]
8. Linguistic Components
XLE parsing: LFG f-structure
WordNet-based WSD: WordNet, SUMO
Fred / Detour / Rosy: frames, roles
F-structure w/ semantics projection
Using XLE term rewriting system (Crouch 2005)
- Rule-based: extend and refine the semantics projection
- NEs, locations
- Co-reference
- Modality, etc.
9. Example from RTE 2006
- Pair 716
- Text: In 1983, Aki Kaurismäki directed his first full-time feature.
- Hypothesis: Aki Kaurismäki directed a film.
10. LFG F-Structures
11. Automatic Frame Annotation for Text (SALTO Viewer)
Collins Parse
12. Automatic Frame Annotation for Hypothesis
- 716_h: Aki Kaurismäki directed a film.
13. LFG Frames for Hypothesis (FEFViewer)
Aki Kaurismäki directed a film.
14. Hypothesis-Text-Match Graphs: Computing Structural and Semantic Overlap
- Match graph: bundles overlapping partial graphs, marked by match types
- Aspects of similarity
  - Syntax-based (i.e. lexical and structural): identical predicates (attributes) trigger node (edge) matches.
  - Semantics-based: identical frames/concepts (roles) trigger node (edge) matches.
- Degrees of similarity
  - Strict matching
  - Weak matching: conditions for non-identical predicates
    - Structurally related, e.g. via coreference (relative clauses, appositives, pronominals)
    - Semantically related via WordNet, frame relations
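The strict and weak matching conditions above can be sketched roughly as follows. This is a minimal illustration, not the system's implementation: the graph encoding (node id to predicate, plus labelled edges), the `RELATED` table standing in for a WordNet or frame-relation lookup, and the function names are all assumptions made for the example.

```python
# Toy graphs: nodes map an id to a predicate; edges are (source, label, target).
text_nodes = {"t0": "direct", "t1": "Kaurismaki", "t2": "feature", "t3": "first"}
text_edges = {("t0", "dsubj", "t1"), ("t0", "dobj", "t2"), ("t2", "mod", "t3")}
hyp_nodes = {"h0": "direct", "h1": "Kaurismaki", "h2": "film"}
hyp_edges = {("h0", "dsubj", "h1"), ("h0", "dobj", "h2")}

# Hypothetical stand-in for a lexical resource lookup (e.g. WordNet relations).
RELATED = {("film", "feature")}

def match_nodes(hyp, text, related=frozenset()):
    """Return {hyp_id: (text_id, match_type)}, preferring strict over weak matches."""
    matches = {}
    for h_id, h_pred in hyp.items():
        for t_id, t_pred in text.items():
            if h_pred == t_pred:
                matches[h_id] = (t_id, "strict")  # identical predicates
                break
            if (h_pred, t_pred) in related and h_id not in matches:
                matches[h_id] = (t_id, "weak")    # related, non-identical predicates
    return matches

def match_edges(hyp_e, text_e, node_matches):
    """An edge matches if its label is identical and both end nodes are matched."""
    matched = set()
    for src, label, dst in hyp_e:
        if src in node_matches and dst in node_matches:
            mapped = (node_matches[src][0], label, node_matches[dst][0])
            if mapped in text_e:
                matched.add((src, label, dst))
    return matched

strict_only = match_nodes(hyp_nodes, text_nodes)
with_weak = match_nodes(hyp_nodes, text_nodes, RELATED)
print(len(strict_only), len(match_edges(hyp_edges, text_edges, strict_only)))
print(len(with_weak), len(match_edges(hyp_edges, text_edges, with_weak)))
```

With strict matching alone, the hypothesis node for "film" stays unmatched and so does the object edge; allowing the weak match to "feature" brings both into the match graph.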
15. t: In 1983, Aki Kaurismäki directed his first full-time feature.
16. Statistical Modeling
- Feature extraction on the basis of
  - Syntactic, semantic matches (of different types)
  - Matching cluster sizes
  - Ratio (matched vs. hypothesis)
  - (Non-)matching modality
  - RTE task, fragmentary (parse)
- Training/classification with the WEKA tool
- Feature selection
  - (1) Predicate matches
  - (2) Frame overlap
  - (3) Matching cluster size
- Model 1: Conjunctive rule (Feat. 1, 2)
- Model 2: LogitBoost (Feat. 1, 2, 3)
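To make the feature set and Model 1 more concrete, here is a rough sketch of a conjunctive rule over features 1 and 2. The feature definitions, the threshold values, the toy input lists and the function names are assumptions for illustration; the slides do not give them, and the actual models were trained with WEKA rather than written by hand.

```python
def extract_features(hyp_preds, text_preds, hyp_frames, text_frames):
    """Compute two illustrative overlap features as ratios over the hypothesis."""
    hp, hf = set(hyp_preds), set(hyp_frames)
    pred_ratio = len(hp & set(text_preds)) / max(len(hp), 1)
    frame_ratio = len(hf & set(text_frames)) / max(len(hf), 1)
    return {"predicate_matches": pred_ratio, "frame_overlap": frame_ratio}

def conjunctive_rule(features, min_pred=0.5, min_frame=0.5):
    """Predict entailment only if every condition holds (thresholds are made up)."""
    return (features["predicate_matches"] >= min_pred
            and features["frame_overlap"] >= min_frame)

# Toy values loosely modelled on pair 716; the text-side frames are invented.
feats = extract_features(
    hyp_preds=["direct", "Kaurismaki", "Aki", "film"],
    text_preds=["direct", "Kaurismaki", "Aki", "feature", "first", "full-time"],
    hyp_frames=["Behind_the_scenes", "People"],
    text_frames=["Behind_the_scenes", "People"],
)
print(feats, conjunctive_rule(feats))
```

A rule of this shape can only accept a pair when all of its similarity conditions are met; it has no separate evidence for rejecting a pair, which is worth keeping in mind when reading the error analysis.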
17. RTE 2006 Results
         all tasks   IE     IR     QA     SUM
Model 1  59.0        49.5   59.5   54.5   72.5
Model 2  57.8        48.5   58.5   57.0   67.0
- SUM (and IR) are natural tasks for Frame Semantics; IE and QA need deeper modeling (aboutness vs. factivity)
- Error analysis
  - True positives: high semantic overlap
  - True negatives: 27 involve modality mismatches
  - False examples: poor modeling of dissimilarity
    - Many high-frequency features measuring similarity
    - Few low-frequency features measuring dissimilarity
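As a quick sanity check on the results table, the "all tasks" figure for each model equals the unweighted mean of the four per-task scores. That is what one would expect if the four task subsets are the same size, which is an assumption here; the slides do not state the subset sizes.

```python
# Per-task scores copied from the results table.
per_task = {
    "Model 1": {"IE": 49.5, "IR": 59.5, "QA": 54.5, "SUM": 72.5},
    "Model 2": {"IE": 48.5, "IR": 58.5, "QA": 57.0, "SUM": 67.0},
}
reported_overall = {"Model 1": 59.0, "Model 2": 57.8}

# Unweighted mean over the four tasks for each model.
means = {model: sum(scores.values()) / len(scores) for model, scores in per_task.items()}
for model, mean in means.items():
    print(model, mean, reported_overall[model])
```

Model 1 comes out at exactly 59.0 and Model 2 at 57.75, which rounds to the reported 57.8.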
18. Brief Conclusions
- Good approximation of semantic similarity
  - Deep LFG syntactical analyses integrated with shallow lexical Frame Semantics (plus other lex. resources)
  - Match graph measuring overlap
- Need better model for semantic dissimilarity
  - Too few rejections (false positives >> false negatives)
- Towards deeper modeling
  - Treatment of modal contexts
  - Integration of lexical inferences
- Open for collaborations
19. LFG Frames for Hypothesis (FEF)
stmt_type(f(0),declarative). tense(f(0),past). pred(f(0),direct). mood(f(0),indicative). dsubj(f(0),f(7)). dobj(f(0),f(2)).
pred(f(2),film). num(f(2),sg). det_type(f(2),indef).
proper(f(7),name). pred(f(7),'Kaurismaki'). num(f(7),sg). mod(f(7),f(10)).
proper(f(10),name). pred(f(10),'Aki'). num(f(10),sg).
sslink(f(0),s(41)). sslink(f(2),s(42)). sslink(f(7),s(45)). sslink(f(10),s(59)).
frame(s(41),'Behind_the_scenes'). artist(s(41),s(45)). production(s(41),s(42)).
frame(s(42),'Behind_the_scenes').
frame(s(45),'People'). person(s(45),s(59)). person(s(45),s(45)).
ont(s(41),s(48)). ont(s(42),s(49)). ont(s(45),s(56)).
wn_syn(s(48),'directv11'). sumo_sub(s(48),'Steering'). milo_sub(s(48),'Steering').
wn_syn(s(49),'filmn1'). sumo_sub(s(49),'MotionPicture'). milo_sub(s(49),'MotionPicture').
sumo_syn(s(56),'Human'). sumo_syn(s(58),'Human').
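The fact list is a flat, Prolog-style encoding, so it is straightforward to load into other tools. The following sketch parses a handful of the facts from the listing with a regular expression and follows the `sslink` from an f-structure node to the frame on its semantic node. The regex, the helper names and the choice of facts are assumptions for illustration; this is not the FEF toolchain.

```python
import re

# A subset of the facts from the listing, copied verbatim.
FEF = """pred(f(0),direct). dsubj(f(0),f(7)). dobj(f(0),f(2)). pred(f(2),film).
pred(f(7),'Kaurismaki'). sslink(f(0),s(41)). sslink(f(2),s(42)). sslink(f(7),s(45)).
frame(s(41),'Behind_the_scenes'). artist(s(41),s(45)). production(s(41),s(42)).
frame(s(42),'Behind_the_scenes'). frame(s(45),'People')."""

# functor(arg1,arg2). where each argument is f(N), s(N), a quoted atom, or a bare atom.
ARG = r"(?:[fs]\(\d+\)|'[^']*'|\w+)"
FACT = re.compile(r"(\w+)\((" + ARG + r"),(" + ARG + r")\)\.")

def parse_facts(text):
    """Return a list of (functor, arg1, arg2) triples with quotes stripped."""
    return [(f, a.strip("'"), b.strip("'")) for f, a, b in FACT.findall(text)]

def frame_of(facts, fnode):
    """Follow sslink from an f-structure node to the frame(s) of its semantic node."""
    snodes = [b for f, a, b in facts if f == "sslink" and a == fnode]
    return [b for f, a, b in facts if f == "frame" and a in snodes]

facts = parse_facts(FEF)
print(len(facts))
print(frame_of(facts, "f(0)"))
print(frame_of(facts, "f(7)"))
```

Keeping the facts as plain triples makes it easy to index them by functor or by node, for example to collect all role edges (`artist`, `production`) that leave a given semantic node.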