Transcript and Presenter's Notes

Title: Modelling Human Thematic Fit Judgments


1
Modelling Human Thematic Fit Judgments
  • IGK Colloquium
  • 3/2/2005
  • Ulrike Padó

2
Overview
  • (Very) quick introduction to my framework
  • Testing the Semantic Module
  • Different input corpora
  • Smoothing
  • Comparing the Semantic Module to standard
    selectional preference methods

3
Modelling Semantic Processing
  • General idea: build a probabilistic, large-scale, broad-coverage model
    of syntactic and semantic sentence processing

4
Semantic Processing
  • Assign thematic roles on the basis of
    co-occurrence statistics from semantically
    annotated corpora
  • Corpus-based frequency estimates of (see the sketch below):
  • Semantic subcategorisation (probability of seeing the role with the verb)
  • Selectional preferences (probability of seeing the argument head in a
    role, given the verb frame)
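A minimal counting sketch of these two estimates, assuming a simplified annotation format of (verb, frame, role, argument head) tuples; the frame and role names and all counts are invented for illustration, not taken from the actual PropBank/FrameNet setup:

```python
from collections import Counter

# Hypothetical role annotations: (verb, frame, role, argument head).
annotations = [
    ("arrest", "Arrest", "Agent", "cop"),
    ("arrest", "Arrest", "Agent", "officer"),
    ("arrest", "Arrest", "Patient", "crook"),
]

verb_counts = Counter(v for v, _, _, _ in annotations)
verb_role_counts = Counter((v, r) for v, _, r, _ in annotations)
frame_role_counts = Counter((v, f, r) for v, f, r, _ in annotations)
full_counts = Counter(annotations)

def p_role_given_verb(role, verb):
    # Semantic subcategorisation: P(role | verb)
    return verb_role_counts[(verb, role)] / verb_counts[verb]

def p_head_given_role(head, role, verb, frame):
    # Selectional preference: P(argument head | verb, frame, role)
    return full_counts[(verb, frame, role, head)] / frame_role_counts[(verb, frame, role)]

print(p_role_given_verb("Agent", "arrest"))                   # 2/3
print(p_head_given_role("cop", "Agent", "arrest", "Arrest"))  # 1/2
```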

5
Testing the Semantic Module
  • Evaluate just the thematic fit of verbs and argument phrases
  • Evaluation (see the correlation sketch below):
  • Correlate predictions with human judgments
  • Role labelling (prefer the correct role)
  • Try:
  • Different input corpora
  • Smoothing
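The correlations reported on later slides are given as ρ values, which suggests a rank correlation; a sketch of that evaluation step, with invented model scores and human ratings:

```python
from scipy.stats import spearmanr

# Invented numbers: model thematic-fit scores and human plausibility
# ratings for the same verb-argument pairs.
model_scores  = [0.42, 0.10, 0.77, 0.05, 0.61, 0.33]
human_ratings = [5.8, 2.1, 6.3, 1.9, 4.4, 3.0]

rho, p = spearmanr(model_scores, human_ratings)
print(f"rho = {rho:.2f}, p = {p:.3f}")
```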

6
Training Data
  • Frequency counts from:
  • the PropBank (ca. 3000 verb types)
  • Very specific domain
  • Relatively flat, syntax-based annotation
  • FrameNet (ca. 1500 verb types)
  • Deep semantic annotation: frames encode situations, grouping verbs that
    describe similar events and their arguments
  • Extracted from a balanced corpus
  • Skewed sample through frame-wise annotation

7
Development/Test Data
  • Development: 60 verb-argument pairs from McRae et al. (1998)
  • Two judgments for each pair: Agent and Patient (a possible item format
    is sketched below)
  • Used to determine the optimal parameters of clustering (number of
    clusters, smoothing)
  • Test: 50 verb-argument pairs, 100 data points
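One plausible representation of a single item; the 1-7 scale and all values are assumptions for illustration, not taken from the slides:

```python
# Hypothetical shape of one development item: a verb-argument pair with
# one human fit rating per role.
dev_item = {
    "verb": "arrest",
    "noun": "cop",
    "ratings": {"Agent": 6.7, "Patient": 2.1},
}
```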

8
Sparse Data
  • Raw frequencies are sparse
  • 1 (Dev)/2 (Test) pairs seen in PropBank
  • 0 (Dev)/2 (Test) pairs seen in FrameNet
  • Use semantic classes as a level of abstraction: class-based smoothing

9
Smoothing
  • Reconstruct probabilities for unseen data
  • Smoothing by verb and noun classes: count class members instead of word
    tokens (see the sketch below)
  • Compare two alternatives:
  • Hand-constructed classes
  • Induced verb classes (clustering)
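A minimal sketch of the class-based idea, with invented class inventories; the point is that an unseen verb-noun pair still receives a nonzero estimate because its classes were seen:

```python
from collections import Counter

# Invented class inventories; in the talk these are hand-built
# (WordNet/VerbNet) or induced by clustering.
verb_class = {"arrest": "CAPTURE", "apprehend": "CAPTURE"}
noun_class = {"cop": "LAW_PERSON", "officer": "LAW_PERSON"}

# Observed (verb, role, argument head) triples from training data.
triples = [("apprehend", "Agent", "officer")]

# Count over classes rather than word tokens.
class_triples = Counter(
    (verb_class[v], r, noun_class[n]) for v, r, n in triples
)
class_role = Counter((vc, r) for vc, r, _ in class_triples.elements())

def smoothed_p(head, role, verb):
    # P(noun class | verb class, role): nonzero even for a pair like
    # (arrest, cop) that never co-occurred, because its classes did.
    vc, nc = verb_class[verb], noun_class[head]
    return class_triples[(vc, role, nc)] / class_role[(vc, role)]

print(smoothed_p("cop", "Agent", "arrest"))  # 1.0 despite zero raw counts
```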

10
Hand-constructed Verb and Noun Classes
  • WordNet: use the top-level ontology and synsets as noun classes (see
    the sketch below)
  • VerbNet: use the top-level classes for verbs
  • Presumably correct and reliable
  • Result: no significant correlations with human data for either training
    corpus
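For the WordNet side, a sketch of reading coarse noun classes off NLTK's WordNet interface; treating the node just below the root as the "top level" is an assumption, and the VerbNet side would be handled analogously:

```python
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def top_level_noun_classes(noun):
    # Collect the second node of each hypernym path (just below the
    # 'entity.n.01' root) as a coarse noun class.
    classes = set()
    for synset in wn.synsets(noun, pos=wn.NOUN):
        for path in synset.hypernym_paths():
            if len(path) > 1:
                classes.add(path[1].name())
    return classes

print(top_level_noun_classes("cop"))  # e.g. {'physical_entity.n.01', ...}
```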

11
Induced Verb Classes
  • Automatically cluster verbs
  • Group by similarities of argument heads, paths from argument to verb,
    frames, and role labels
  • Determine the optimal number of clusters and the parameters of the
    clustering algorithm on the development set (see the sketch below)
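A sketch of that tuning loop, with k-means as a stand-in for the (unnamed) clustering algorithm, an invented feature matrix, and the silhouette score standing in for the actual development-set criterion:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Invented feature matrix: one row per verb; columns could count
# co-occurring argument heads, argument-to-verb paths, frames, roles.
verbs = ["arrest", "apprehend", "eat", "devour"]
X = np.array([
    [5.0, 1.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 3.0],
    [0.0, 0.0, 5.0, 4.0],
])

# Sweep the number of clusters; the talk tunes this on the development set.
for k in (2, 3):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 2), dict(zip(verbs, labels)))
```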

12
Induced Classes (PB/FN)

  Model                 Data points covered   ρ / Significance
  Raw data (PB)                    2          -/-
  Raw data (FN)                    2          -/-
  All arguments (PB)              59          ns
  All arguments (FN)              12          ρ = 0.55 / p < 0.05
  Just NPs (PB)                   48          ns
  Just NPs (FN)                   16          ρ = 0.56 / p < 0.05

13
Results
  • Hand-built classes do not work (with this amount
    of data)
  • Module achieves reliable correlations with FN
    data
  • Important result for the overall feasibility of
    my model

14
Adding Noun Classes (PB/FN)

  Model                         Data points covered   ρ / Significance
  Raw data (PB)                            2          -/-
  Raw data (FN)                            2          -/-
  PB, all args, noun classes               4          ρ = 1 / p < 0.01
  FN, just NPs, noun classes              18          ρ = 0.63 / p < 0.01

15
Results
  • Hand-built classes do not work (with this amount
    of data)
  • Module achieves reliable correlations with FN
    data
  • Adding noun classes helps a little further

16
Comparison with Selectional Preference Methods
  • Have established that our system reliably
    predicts human data
  • How do we do in comparison to standard
    computational linguistics methods?

17
Selectional Preference Methods
  • Clark & Weir (2002)
  • Add data points by finding the topmost class in WN that still reliably
    mirrors the target word's frequency
  • Resnik (1996)
  • Quantify the contribution of a WN class to the overall preference
    strength of the verb (sketched below)
  • Both rely on WN noun classes; neither smooths over verb classes
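Resnik's two quantities can be written down compactly; a sketch with invented class distributions (the real ones come from WordNet-smoothed counts):

```python
from math import log2

def preference_strength(p_class_given_verb, p_class):
    # Resnik (1996): S(v) = sum_c P(c|v) * log2(P(c|v) / P(c)),
    # the KL divergence between the verb's argument-class distribution
    # and the prior class distribution.
    return sum(
        p_cv * log2(p_cv / p_class[c])
        for c, p_cv in p_class_given_verb.items()
        if p_cv > 0
    )

def selectional_association(c, p_class_given_verb, p_class):
    # A(v, c): class c's share of the verb's preference strength.
    s = preference_strength(p_class_given_verb, p_class)
    return p_class_given_verb[c] * log2(p_class_given_verb[c] / p_class[c]) / s

# Invented distributions over two WN classes for a verb like "eat".
p_class = {"food": 0.5, "person": 0.5}
p_class_given_eat = {"food": 0.9, "person": 0.1}
print(selectional_association("food", p_class_given_eat, p_class))
```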

18
Selectional Preference Methods (PB/FN)

  Method              Data points covered   ρ / Significance      Labelling (Cov/Acc)
  Sem. Module 1                18           ρ = 0.63 / p < 0.01   38/47.4
  Sem. Module 2                16           ρ = 0.56 / p < 0.05   30/60
  Clark & Weir (PB)            72           ns                    84/50
  Clark & Weir (FN)            23           ns                    36/50
  Resnik (PB)                  75           ns                    74/48.6
  Resnik (FN)                  46           ns                    50/48

19
Results
  • Too little input data:
  • No results for the selectional preference models
  • Small coverage for the Semantic Module
  • The Semantic Module manages to make predictions all the same
  • It relies on verb clusters: verbs are less sparse than nouns in small
    corpora
  • Next step: annotate a larger corpus with FN roles

20
Annotating the BNC
  • Annotate a large, balanced corpus: the BNC
  • More data points for verbs covered in FN
  • More verb coverage (though purely syntactic annotation for unknown
    verbs)
  • Results:
  • Annotation relatively sensible and reliable for non-FN verbs
  • Frame-wise annotation in FN causes problems for FN verbs