Learning and Inference for Hierarchically Split PCFGs
- Slav Petrov and Dan Klein
The Game of Designing a Grammar
- Annotation refines base treebank symbols to improve the statistical fit of the grammar
- Parent annotation [Johnson '98] (sketched below)
- Head lexicalization [Collins '99, Charniak '00]
- Automatic clustering?
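As a concrete illustration of annotation refinement, here is a minimal sketch of parent annotation on a toy tree; the tuple-based tree encoding and the helper name are illustrative assumptions, not treebank tooling.

```python
# A minimal sketch of parent annotation: every nonterminal is refined with
# its parent's label, so an NP under S becomes NP^S, an NP under VP NP^VP.
def parent_annotate(tree, parent_label=None):
    """Trees are (label, child, ...) tuples; leaves are plain word strings."""
    if isinstance(tree, str):                    # a word: leave it unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent_label}" if parent_label else label
    # children are annotated with this node's *base* label
    return (new_label, *(parent_annotate(child, label) for child in children))

tree = ("S", ("NP", "she"), ("VP", ("VBZ", "sees"), ("NP", "it")))
print(parent_annotate(tree))
# ('S', ('NP^S', 'she'), ('VP^S', ('VBZ^VP', 'sees'), ('NP^VP', 'it')))
```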
Learning Latent Annotations [Matsuzaki et al. '05]
- Brackets are known
- Base categories are known
- Only induce subcategories
- Training uses EM with Inside-Outside scores on each tree, just like Forward-Backward for HMMs (inside pass sketched below)
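A minimal sketch of the inside pass of that E-step, under assumed `rules` and `lexicon` probability tables; the names and the tuple tree encoding are illustrative, not the Berkeley Parser's data structures.

```python
# Brackets and base categories are fixed, so inside scores are summed over
# subcategory assignments only, like the backward pass of Forward-Backward
# but run over a tree.
import numpy as np

def inside(node, rules, lexicon):
    """Return the vector I[x] = P(words below node | node has subcategory x).

    Leaves are (tag, word); internal nodes are (label, left, right).
    rules[(A, B, C)][x, y, z] = P(A_x -> B_y C_z)
    lexicon[tag][word][x]     = P(word | tag_x)
    """
    if isinstance(node[1], str):                   # preterminal: (tag, word)
        tag, word = node
        return np.asarray(lexicon[tag][word])
    label, left, right = node
    i_left = inside(left, rules, lexicon)
    i_right = inside(right, rules, lexicon)
    rule = np.asarray(rules[(label, left[0], right[0])])
    # I[x] = sum_{y,z} P(A_x -> B_y C_z) * I_left[y] * I_right[z]
    return np.einsum('xyz,y,z->x', rule, i_left, i_right)

# Outside scores are computed analogously top-down; the posterior count of a
# rule at a node is outside[x] * P(A_x -> B_y C_z) * I_left[y] * I_right[z]
# divided by the tree likelihood, and the M-step renormalizes these counts.
```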
Overview
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing
Refinement of the DT tag
(figure: the learned subcategories of the DT tag)
Hierarchical refinement of the DT tag
(figure: DT subcategories split further, two at a time, in a binary hierarchy; the split step is sketched below)
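A minimal sketch of one such split step, assuming rule probabilities stored as a NumPy table; the 1% symmetry-breaking noise follows the paper, but the function name and layout are illustrative.

```python
# Each subcategory's rule distribution is duplicated into two children and
# perturbed slightly so that EM can pull the two copies apart.
import numpy as np

def split_in_two(probs, rng, noise=0.01):
    """probs[x, r] = P(rule r | subcategory x); returns a (2n, r) table."""
    doubled = np.repeat(probs, 2, axis=0)                    # copy each row twice
    jitter = 1.0 + noise * rng.uniform(-1.0, 1.0, size=doubled.shape)
    perturbed = doubled * jitter
    return perturbed / perturbed.sum(axis=1, keepdims=True)  # renormalize rows

rng = np.random.default_rng(0)
print(split_in_two(np.array([[0.6, 0.4]]), rng))
```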
Hierarchical Estimation Results
Refinement of the , tag
- Splitting all categories the same amount is wasteful
Adaptive Splitting
- Want to split complex categories more
- Idea: split everything, then roll back the splits that were least useful (roll-back criterion sketched below)
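A minimal sketch of that roll-back test, assuming inside/outside scores have already been collected for every occurrence of the split category; the data layout and function name are illustrative, not the released code.

```python
# Estimate the likelihood loss of undoing the split of a category into
# A_1, A_2; the splits with the smallest loss are merged back.
import math

def merge_loss(occurrences, p1, p2):
    """Approximate log-likelihood loss of merging subcategories A_1 and A_2.

    `occurrences` holds one tuple (in1, in2, out1, out2, tree_prob) per node
    labeled A in the training trees; p1, p2 are the relative frequencies of
    the two subcategories (p1 + p2 == 1).
    """
    loss = 0.0
    for in1, in2, out1, out2, tree_prob in occurrences:
        merged_in = p1 * in1 + p2 * in2            # merged inside score
        merged_out = out1 + out2                   # merged outside score
        # swap this node's contribution to the tree likelihood
        new_prob = tree_prob - (in1 * out1 + in2 * out2) + merged_in * merged_out
        loss += math.log(tree_prob) - math.log(new_prob)
    return loss

# Rank all sibling subcategory pairs by merge_loss and merge back, say, the
# half with the smallest loss before the next split round.
```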
Adaptive Splitting Results
Number of Phrasal Subcategories
(chart: learned subcategory counts per phrasal category, highlighting NP, VP, PP and NAC, X)
Number of Lexical Subcategories
(chart: learned subcategory counts per part-of-speech tag, highlighting POS, TO, ',' and NNP, JJ, NNS, NN)
Smoothing
- Heavy splitting can lead to overfitting
- Idea: smoothing lets us pool statistics across the subcategories of a base category (see the sketch below)
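A minimal sketch of that pooling as linear smoothing toward the subcategory mean; the table layout and the value of `alpha` are illustrative assumptions.

```python
# The rule distributions of all subcategories of one base category are
# shrunk toward their shared mean so rare subcategories borrow statistics.
import numpy as np

def smooth_rules(probs, alpha=0.01):
    """probs[x, r] = P(rule r | subcategory x of one base category)."""
    mean = probs.mean(axis=0, keepdims=True)       # pooled statistics
    return (1.0 - alpha) * probs + alpha * mean    # rows still sum to 1

p = np.array([[0.70, 0.10, 0.20],
              [0.05, 0.80, 0.15]])
print(smooth_rules(p))   # each row pulled slightly toward the shared mean
```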
Result Overview
Linguistic Candy
- Proper Nouns (NNP)
- Personal pronouns (PRP)
- Relative adverbs (RBR)
- Cardinal Numbers (CD)
Inference
- Exhaustive parsing with the fully split grammar: 1 min per sentence
Coarse-to-Fine Parsing [Goodman '97, Charniak & Johnson '05]
Hierarchical Pruning
- Consider again the span 5 to 12: chart items whose posterior under the current grammar is < t are pruned before parsing with the next, more refined grammar
(figure: the items over this span under the coarse grammar, then split in two, split in four, and split in eight; a pruning sketch follows)
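A minimal sketch of one pruning pass, using plain dictionaries rather than the Berkeley Parser's actual chart data structures.

```python
# Keep only refined items whose coarse ancestor survives the posterior
# threshold under the coarser grammar.
def prune_chart(inside, outside, sentence_prob, threshold, projection):
    """inside/outside map (start, end, coarse_symbol) -> score under the
    coarser grammar; projection maps a coarse symbol to its refined symbols
    in the next grammar. Returns the allowed (start, end, refined_symbol)."""
    allowed = set()
    for (i, j, symbol), in_score in inside.items():
        posterior = in_score * outside[(i, j, symbol)] / sentence_prob
        if posterior >= threshold:                  # item survives pruning
            for refined in projection[symbol]:      # expand for the next pass
                allowed.add((i, j, refined))
    return allowed
```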
Intermediate Grammars
(figure: the sequence of increasingly refined grammars, from X-Bar = G0 up to the final grammar G)
Projected Grammars
(figure: coarse grammars obtained by projecting the final grammar G back onto each level, from X-Bar = G0 up to G; a projection sketch follows)
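A minimal sketch of projecting one refined binary rule onto the coarser symbol set; the paper derives the parent-subcategory weights from expected counts under the grammar, whereas here they are simply an input, and the function name is illustrative.

```python
# Sum out the child subcategories and average over the parent subcategories,
# weighted by how often each parent subcategory is expected to occur.
import numpy as np

def project_rule(refined_rule, parent_weights):
    """refined_rule[x, y, z] = P(A_x -> B_y C_z);
    parent_weights[x] = expected relative frequency of A_x (sums to 1).
    Returns the projected probability P(A -> B C) for the coarser grammar."""
    per_parent = refined_rule.sum(axis=(1, 2))      # sum over B_y and C_z
    return float(parent_weights @ per_parent)       # weighted average over A_x
```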
Final Results (Efficiency)
- Parsing the development set (1600 sentences)
- Berkeley Parser: 10 min, implemented in Java
- Charniak & Johnson '05 parser: 19 min, implemented in C++
Final Results (Accuracy)
Extensions
- Acoustic modeling [Petrov, Pauls & Klein '07]
- Infinite grammars via nonparametric Bayesian learning [Liang, Petrov, Jordan & Klein '07]
Conclusions
- Split & Merge Learning
  - Hierarchical Training
  - Adaptive Splitting
  - Parameter Smoothing
- Hierarchical Coarse-to-Fine Inference
  - Projections
  - Marginalization
- Multi-lingual Unlexicalized Parsing
Thank You!
- http://nlp.cs.berkeley.edu