Title: System Combination
1 System Combination
- LING 572
- Fei Xia
- 01/31/06
2 Papers
- (Henderson and Brill, EMNLP-1999) Exploiting Diversity in NLP: Combining Parsers
- (Henderson and Brill, ANLP-2000) Bagging and Boosting a Treebank Parser
3 Task
4 Paper 1
[Diagram: m different learners ML1 ... MLm each produce a parser f1 ... fm; their outputs are combined into a single f]
5 Paper 2: bagging
[Diagram: one learner ML, trained on m resampled training sets, produces parsers f1 ... fm; their outputs are combined into a single f]
6 Combining parsers
7 Scenario
[Diagram: m different learners ML1 ... MLm each produce a parser f1 ... fm; their outputs are combined into a single f]
8 Three parsers
- Collins (1997)
- Charniak (1997)
- Ratnaparkhi (1997)
9 Major strategies
- Parse hybridization: combine substructures of the input parses to produce a better parse.
- Parser switching: for each input x, f(x) is one of the fi(x).
10 Parse hybridization: Method 1
- Constituent voting
- Include a constituent if it appears in the output of a majority of the parsers.
- It requires no training.
- All parsers are treated equally.
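Constituent voting can be sketched in a few lines. This is a minimal illustration, assuming a parse is modelled as a set of (label, start, end) span tuples (my representation, not the paper's):

```python
from collections import Counter

def constituent_voting(parses):
    """Keep every constituent that appears in a strict majority
    of the input parses."""
    votes = Counter()
    for parse in parses:
        votes.update(parse)
    majority = len(parses) / 2.0
    return {c for c, n in votes.items() if n > majority}

# Three toy parses that disagree only on the VP span:
p1 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
p2 = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
p3 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
constituent_voting([p1, p2, p3])  # keeps ("VP", 2, 5), drops ("VP", 3, 5)
```

Because no parameters are estimated, this matches the slide's point that the method needs no training and treats all parsers equally.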
11 Parse hybridization: Method 2
- Naïve Bayes
- π(c) is a binary function that returns true when constituent c should be included in the hypothesis parse.
- Mi(c) is a binary function that returns true when parser i suggests that c should be in the parse.
- Include c when P(π(c)=1 | M1(c), ..., Mm(c)) > P(π(c)=0 | M1(c), ..., Mm(c)), treating the Mi(c) as conditionally independent given π(c).
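The Naïve Bayes decision for one constituent can be sketched as follows. The per-parser reliability probabilities here are made up for illustration; in the paper they would be estimated from held-out data:

```python
def nb_include(suggestions, prior, likelihoods):
    """Naive-Bayes vote on one constituent c.

    suggestions -- booleans M_i(c), one per parser
    prior       -- P(pi(c) = 1), the prior that c belongs in the parse
    likelihoods -- per parser: (P(M_i=1 | pi=1), P(M_i=1 | pi=0))
    Returns True when the 'include' posterior beats the 'exclude' one.
    """
    p_yes, p_no = prior, 1.0 - prior
    for m_i, (hit, false_alarm) in zip(suggestions, likelihoods):
        p_yes *= hit if m_i else 1.0 - hit
        p_no *= false_alarm if m_i else 1.0 - false_alarm
    return p_yes > p_no

# Two reliable parsers say yes, one near-chance parser says no:
nb_include([True, False, True], 0.5,
           [(0.9, 0.1), (0.8, 0.3), (0.6, 0.5)])
```

Unlike constituent voting, this weights parsers by how trustworthy their suggestions are, which is why it requires training.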
12 Parse hybridization
- If the number of votes required by constituent voting is greater than half of the parsers, the resulting structure has no crossing constituents: two crossing constituents would each need a majority, so some single parser would have had to propose both, yet no individual parse contains crossing constituents.
- What will happen if the input parsers disagree often?
13 Parser switching: Method 1
- Similarity switching
- Intuition: choose the parse that is most similar to the other parses.
- Algorithm:
- For each parse pi, create the constituent set Si.
- The score for pi is Σj≠i |Si ∩ Sj|, the number of constituents pi shares with the other parses.
- Choose the parse with the highest score.
- No training is required.
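The algorithm above can be sketched directly; again a parse is assumed to be a collection of constituent span tuples:

```python
def similarity_switch(parses):
    """Return the index of the parse whose constituent set overlaps
    most with the others: score(p_i) = sum over j != i of |S_i & S_j|."""
    sets = [set(p) for p in parses]

    def score(i):
        return sum(len(sets[i] & sets[j])
                   for j in range(len(sets)) if j != i)

    return max(range(len(parses)), key=score)
```

The whole output is one of the original parses, so the result is always a well-formed tree, and nothing has to be estimated from data.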
14 Parser switching: Method 2
- Naïve Bayes
15 Experiments
- Training data: WSJ except sections 22 and 23
- Development data: section 23 (used to train the Naïve Bayes models)
- Test data: section 22
16 Parsing results
17 Robustness testing
- Add a 4th parser with an F-measure of about 67.6.
- Performance remains the same except for constituent voting.
- F-measures: 90.43, 90.74, 91.25, 91.25
18 Summary of 1st paper
- Combining parsers produces good results
- 89.67 → 91.25
- Different methods of combining:
- Parse hybridization
- Constituent voting
- Naïve Bayes
- Parser switching
- Similarity switching
- Naïve Bayes
19 Bagging and Boosting a Treebank Parser
20 Experiment settings
- Parser: Collins's Model 2 (1997)
- Training data: sections 01-21
- Test data: section 23
21 Bagging
[Diagram: m bootstrap samples of the (s, t) training pairs each train the learner ML, producing parsers f1 ... fm; their outputs are combined into a single f]
- Combining method: constituent voting
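The resampling step in the diagram can be sketched as below: each "bag" is a bootstrap replicate of the treebank (sampled with replacement, same size as the original), and one copy of the parser is trained per bag. The function and parameter names are mine:

```python
import random

def bags(corpus, m, seed=0):
    """Draw m bootstrap replicates of a treebank, where corpus is a
    list of (sentence, tree) pairs. Each replicate has the same size
    as the original and is sampled with replacement."""
    rng = random.Random(seed)
    n = len(corpus)
    return [[rng.choice(corpus) for _ in range(n)] for _ in range(m)]
```

With 15 bags (the paper's final setting), 15 parsers are trained and their outputs are merged by constituent voting.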
22 Experiment results
- Baseline (no bagging): 88.63
- Initial (one bag): 88.38
- Final (15 bags): 89.17
23 Training corpus size effects
24 Boosting
[Diagram: the learner ML is trained on the initial training sample to produce f1; each later round trains ML on a reweighted sample, producing f2 ... fT; the outputs are combined into f]
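The reweighting loop in the diagram can be sketched with a standard AdaBoost-style update; this is the generic scheme, an assumption on my part, not necessarily the exact boosting variant the paper uses:

```python
def adaboost_reweight(weights, wrong, eps):
    """One round of AdaBoost-style reweighting.

    weights -- current distribution over training examples
    wrong   -- booleans: did this round's parser get example i wrong?
    eps     -- the parser's weighted error this round (0 < eps < 0.5)
    Correctly handled examples are scaled down by beta = eps / (1 - eps),
    so the next round concentrates on the errors.
    """
    beta = eps / (1.0 - eps)
    new = [w * (1.0 if w_i else beta) for w, w_i in zip(weights, wrong)]
    z = sum(new)
    return [w / z for w in new]  # renormalize to a distribution
```

Each round trains ML on the reweighted sample, and the T resulting parsers f1 ... fT are combined into f.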
25 Boosting results
- Boosting does not help: 88.63 → 88.84
26 Summary
- Combining parsers produces good results: 89.67 → 91.25
- Bagging helps: 88.63 → 89.17
- Boosting does not help (in this case): 88.63 → 88.84