1
System Combination
  • LING 572
  • Fei Xia
  • 01/31/06

2
Papers
  • (Henderson and Brill, EMNLP-1999) Exploiting
    Diversity in Natural Language Processing:
    Combining Parsers
  • (Henderson and Brill, ANLP-2000) Bagging and
    Boosting a Treebank Parser

3
Task
4
Paper 1
  (Diagram: different learners ML1, ML2, …, MLm
  each produce a classifier f1, f2, …, fm; the
  outputs are combined into a single f.)
5
Paper 2: bagging
  (Diagram: the same learner ML, trained on
  different samples of the data, produces
  f1, f2, …, fm, which are combined into f.)
6
Combining parsers
7
Scenario
  (Diagram: parsers ML1, ML2, …, MLm produce
  parses f1, f2, …, fm, which are combined
  into a single output f.)
8
Three parsers
  • Collins (1997)
  • Charniak (1997)
  • Ratnaparkhi (1997)

9
Major strategies
  • Parse hybridization: combine substructures of
    the input parses to produce a better parse.
  • Parser switching: for each x, f(x) is one of
    the fi(x).

10
Parse hybridization: Method 1
  • Constituent voting
  • Include a constituent if it appears in the output
    of a majority of the parsers.
  • It requires no training.
  • All parsers are treated equally.
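Constituent voting can be sketched as follows, under the simplifying assumption that each parse is represented as a set of (label, start, end) tuples rather than a full tree (the names here are illustrative, not from the paper):

```python
from collections import Counter

def constituent_vote(parses):
    # Keep a constituent iff it appears in more than half of the
    # input parses (majority vote, all parsers weighted equally).
    counts = Counter(c for parse in parses for c in parse)
    return {c for c, n in counts.items() if n > len(parses) / 2}

# Three hypothetical parses of the same 5-word sentence
p1 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
p2 = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
p3 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
print(sorted(constituent_vote([p1, p2, p3])))
# → [('NP', 0, 2), ('S', 0, 5), ('VP', 2, 5)]
```

The minority constituent ("VP", 3, 5), proposed by only one of the three parsers, is dropped.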

11
Parse hybridization: Method 2
  • Naïve Bayes
  • Y = π(c) is a binary function that returns true
    when constituent c should be included in the
    hypothesis parse.
  • Xi = Mi(c) is a binary function that returns
    true when parser i suggests that c should be in
    the parse.
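The formula on this slide was an image in the original. A plausible reconstruction, consistent with the variable definitions above (the standard Naïve Bayes decomposition, with constituents treated independently):

```latex
P\bigl(\pi(c)=1 \mid M_1(c),\dots,M_m(c)\bigr)
  \;\propto\;
P\bigl(\pi(c)=1\bigr)\,\prod_{i=1}^{m} P\bigl(M_i(c) \mid \pi(c)=1\bigr)
```

Constituent c is included when this quantity exceeds the corresponding one for π(c)=0; the probabilities are estimated from held-out data.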
12
Parse hybridization
  • If the number of votes required by constituent
    voting is greater than half of the parsers, the
    resulting structure has no crossing constituents:
    any two constituents that each win a strict
    majority must co-occur in at least one input
    parse, and constituents within a single
    well-formed parse never cross.
  • What will happen if the input parsers disagree
    often?

13
Parser switching: Method 1
  • Similarity switching
  • Intuition: choose the parse that is most similar
    to the other parses.
  • Algorithm:
  • For each parse pi, create the constituent set Si.
  • The score for pi is its total constituent
    overlap with the other parses.
  • Choose the parse with the highest score.
  • No training is required.
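A sketch of similarity switching, assuming the score dropped from the slide (it was an image) is the total pairwise overlap score(pi) = Σ over j≠i of |Si ∩ Sj|, with constituent sets represented as hypothetical (label, start, end) tuples:

```python
def similarity_switch(constituent_sets):
    # Score each parse by its total constituent overlap with the
    # other parses, then return the index of the highest-scoring one.
    def score(i):
        return sum(len(constituent_sets[i] & constituent_sets[j])
                   for j in range(len(constituent_sets)) if j != i)
    return max(range(len(constituent_sets)), key=score)

s1 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
s2 = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
s3 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
print(similarity_switch([s1, s2, s3]))
# → 0 (s1 and s3 tie at score 5; max returns the first)
```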

14
Parser switching: Method 2
  • Naïve Bayes

15
Experiments
  • Training data: WSJ, all sections except 22
    and 23
  • Development data: Section 23
    (for training Naïve Bayes)
  • Test data: Section 22

16
Parsing results
17
Robustness testing
  • F-measures: 90.43, 90.74, 91.25, 91.25
  • Adding a 4th parser (F-measure about 67.6):
    performance remains the same except for
    constituent voting.
18
Summary of 1st paper
  • Combining parsers produces good results:
    89.67 → 91.25
  • Different methods of combining:
    • Parse hybridization:
      • Constituent voting
      • Naïve Bayes
    • Parser switching:
      • Similarity switching
      • Naïve Bayes

19
Bagging and Boosting a Treebank Parser
20
Experiment settings
  • Parser: Collins's Model 2 (1997)
  • Training data: sections 01-21
  • Test data: Section 23

21
Bagging
  (Diagram: bootstrap samples of (sentence, tree)
  pairs (s, t) are drawn from the training corpus;
  the same learner ML is trained on each bag,
  producing f1, f2, …, fm, which are combined
  into f.)
  • Combining method: constituent voting
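The bag-construction step above can be sketched as drawing m bootstrap replicates of the treebank; the function and corpus names are illustrative, not from the paper:

```python
import random

def make_bags(corpus, m, seed=0):
    # Each bag is a sample of the corpus drawn with replacement,
    # the same size as the original corpus.
    rng = random.Random(seed)
    n = len(corpus)
    return [[corpus[rng.randrange(n)] for _ in range(n)] for _ in range(m)]

# Toy (sentence, tree) pairs standing in for the treebank
corpus = [("sent%d" % i, "tree%d" % i) for i in range(100)]
bags = make_bags(corpus, m=15)
```

A separate copy of the parser is then trained on each bag, and the 15 resulting parsers are combined by constituent voting, as on the slide.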
22
Experiment results
  • Baseline (no bagging): 88.63
  • Initial (one bag): 88.38
  • Final (15 bags): 89.17
23
Training corpus size effects
24
Boosting
  (Diagram: the learner ML is first trained on the
  original training sample to produce f1; the
  sample is reweighted after each round, and ML is
  retrained on the weighted sample to produce
  f2, …, fT, which are combined into f.)
  • Weighted Sample
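The reweighting arrow in the diagram can be sketched as one generic AdaBoost-style update; this is the textbook step, not necessarily the exact variant Henderson and Brill use for parsing:

```python
def reboost(weights, correct):
    # Down-weight examples the current parser handled correctly
    # (multiply by beta = err / (1 - err)), then renormalize so the
    # next round concentrates on the hard examples.
    total = sum(weights)
    err = sum(w for w, ok in zip(weights, correct) if not ok) / total
    beta = err / (1 - err)
    new = [w * beta if ok else w for w, ok in zip(weights, correct)]
    z = sum(new)
    return [w / z for w in new]

# Four equally weighted examples; the parser got the last one wrong.
w = reboost([0.25, 0.25, 0.25, 0.25], [True, True, True, False])
# → [1/6, 1/6, 1/6, 1/2]: the misparsed example now carries half the mass
```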

25
Boosting results
  • Boosting does not help: 88.63 → 88.84
26
Summary
  • Combining parsers produces good results:
    89.67 → 91.25
  • Bagging helps: 88.63 → 89.17
  • Boosting does not help (in this case):
    88.63 → 88.84