Title: Efficient HPSG Parsing with Supertagging and CFG-filtering
1 Efficient HPSG Parsing with Supertagging and CFG-filtering
- Takuya Matsuzaki, Yusuke Miyao, Junichi Tsujii (University of Tokyo)
2 Overview
- Objective: to make HPSG parsing faster
- Method: combining three complementary techniques
  - Supertagging
  - CFG-filtering
  - Deterministic disambiguation
- Result: 6-fold speed-up on the Penn WSJ data
3 Background: HPSG
- Head-driven Phrase Structure Grammar [Pollard and Sag, 1994]
- Constraint-based, lexicalized grammar
  - A small number of rule schemata: generic grammatical constraints
  - A large number of lexical entries (LEs): word-specific constraints
4 Example: HPSG Parsing
[Figure: the input sentence "I like it"]
5 Example: HPSG Parsing
Assignment of lexical entries
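6 Schema Application
[Figure: application of the Head-Complement schema, with shared tag [2]]
7 Schema Application
[Figure: application of the Subject-Head schema, with shared tags [1] and [2]]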
8 Real Grammar
- Lexical entries and parse trees: huge feature structures (demo)
- Rule schema application: unification of the feature structures, a costly operation
Our basic strategy for the speed-up: minimize the operations on the feature structures
9 Specificity of HPSG from the viewpoint of parsing
Many constraints are specified in the lexical entries (LEs)
The assignment of LEs determines the form of the parse tree for the most part
Correct LE assignments → the parsing thereafter is easy
Wrong LE assignments → there is a risk of parse failures
12 Previous method: Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
- Supertagging [Bangalore and Joshi, 1999]: selecting a few LEs for each word by using a probabilistic model of P(LEs | input sentence)
[Figure: candidate LEs listed for each word (e.g. "I", "it"), such as [HEAD noun, SUBJ <>, COMPS <>] and [HEAD verb, SUBJ <NP>, COMPS <NP>], ranked from large P to small P]
13 Previous method: Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
- Ignore the LEs with small probabilities
[Figure: the ranked candidate LEs for each word (e.g. "I", "it"), from large P to small P, with a threshold line marking off the LEs with small probabilities]
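The thresholding step can be sketched as follows. This is a minimal illustration, assuming that the supertagger returns a list of (LE, probability) pairs per word and that an LE is kept when its probability is at least a fraction `beta` of the best one for that word. The relative-threshold rule, the name `prune_les`, and the LE labels and probabilities are all hypothetical and are not taken from the talk.

```python
def prune_les(candidates, beta):
    """Keep, for each word, the LEs whose probability is at least
    beta times the probability of that word's best LE."""
    pruned = []
    for word_candidates in candidates:
        best = max(p for _, p in word_candidates)
        kept = [(le, p) for le, p in word_candidates if p >= beta * best]
        pruned.append(kept)
    return pruned


# Hypothetical supertagger output for "I like it" (labels and numbers invented).
tagged = [
    [("noun", 0.9), ("verb_trans", 0.1)],
    [("verb_trans", 0.6), ("prep", 0.3), ("noun", 0.1)],
    [("noun", 0.8), ("det", 0.2)],
]
pruned = prune_les(tagged, beta=0.4)
for word, kept in zip(["I", "like", "it"], pruned):
    print(word, [le for le, _ in kept])
```

A smaller `beta` keeps more LEs per word, which is exactly the trade-off between speed and the risk of parse failure.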
14 A dilemma in the previous method
- Fewer LEs → faster parsing, but
- Too few LEs → more risk of no well-formed parse trees
15 Previous solution to the dilemma
1. Assign a few LEs at first
2. Input the LEs to a chart parser
3. If no well-formed parse tree is found, increase the number of LEs and go to step 2
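The loop above can be sketched as follows. This is only an illustration: `chart_parse` is a hypothetical callable standing in for a real chart parser, and "increasing the number of LEs" is modeled here by lowering a relative probability threshold, which is an assumption rather than the exact schedule used in the cited work.

```python
def parse_with_backoff(candidates, chart_parse, betas):
    """Try increasingly permissive thresholds until a parse is found.

    candidates:  per-word lists of (LE, probability) pairs
    chart_parse: hypothetical callable; returns a tree, or None on failure
    betas:       decreasing relative thresholds (smaller means more LEs)
    """
    for beta in betas:
        # Step 1 (and step 3 on later rounds): choose the LEs for each word.
        selected = []
        for word_candidates in candidates:
            best = max(p for _, p in word_candidates)
            selected.append([le for le, p in word_candidates if p >= beta * best])
        # Step 2: run the chart parser on the selected LEs.
        tree = chart_parse(selected)
        if tree is not None:
            return tree, beta
    return None, None  # no well-formed tree at any threshold


def toy_chart_parse(selected):
    # Stand-in for a real chart parser: succeeds only if some word kept a verb LE.
    return "tree" if any("verb" in les for les in selected) else None


tagged = [
    [("noun", 0.9)],
    [("noun", 0.7), ("verb", 0.2)],
    [("noun", 0.8)],
]
result = parse_with_backoff(tagged, toy_chart_parse, betas=[0.5, 0.1])
print(result)
```

In this toy run the first round fails because the verb LE is pruned, so the whole chart parse is repeated with more LEs; that repeated work is the cost the dilemma refers to.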
16 Our idea
- If we guarantee the parsability of the LE assignment,
  - the parsing becomes faster, because
  - we can replace chart parsing with a simpler algorithm → deterministic parsing
17 Our system: enumeration of LE assignments and deterministic disambiguation
- Enumerate (maybe-)parsable LE assignments in the order of their probabilities by combining the supertagger and a CFG-filter
- Input the LE assignments one by one to a deterministic disambiguation module until a well-formed tree is obtained
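The two steps above can be sketched as a best-first enumeration followed by a try-until-success loop. This is a minimal sketch under stated assumptions: the probability of an assignment is taken to be the product of the per-word LE probabilities, and `cfg_filter` and `disambiguate` are hypothetical callables standing in for the CFG-filter and the deterministic disambiguation module. The talk does not specify the enumeration algorithm, so the heap-based search here is one possible choice.

```python
import heapq
from math import prod


def enumerate_assignments(candidates):
    """Yield (probability, assignment) in non-increasing probability order.

    candidates: per-word lists of (LE, probability), each sorted by
    decreasing probability. An assignment's probability is assumed to be
    the product of its per-word probabilities.
    """
    def score(idx):
        return prod(candidates[i][j][1] for i, j in enumerate(idx))

    start = (0,) * len(candidates)
    heap = [(-score(start), start)]
    seen = {start}
    while heap:
        neg_p, idx = heapq.heappop(heap)
        yield -neg_p, tuple(candidates[i][j][0] for i, j in enumerate(idx))
        # Successors: move exactly one word to its next-best LE.
        for i in range(len(idx)):
            if idx[i] + 1 < len(candidates[i]):
                nxt = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))


def first_parsable(candidates, cfg_filter, disambiguate):
    """Feed maybe-parsable assignments one by one until a tree is obtained."""
    for _, assignment in enumerate_assignments(candidates):
        if not cfg_filter(assignment):   # not even CFG-parsable: skip it
            continue
        tree = disambiguate(assignment)  # may still fail on the real grammar
        if tree is not None:
            return tree
    return None


# Hypothetical supertagger output (labels and numbers invented).
tagged = [
    [("noun", 0.9), ("verb", 0.1)],
    [("noun", 0.6), ("verb", 0.4)],
    [("noun", 1.0)],
]
ranked = [assignment for _, assignment in enumerate_assignments(tagged)]
chosen = first_parsable(
    tagged,
    cfg_filter=lambda a: a[1] == "verb",      # toy filter
    disambiguate=lambda a: a if a[0] == "noun" else None,  # toy disambiguator
)
print(ranked)
print(chosen)
```

Because each successor differs from its parent by one step down a sorted list, its score never exceeds the parent's, which is what makes the heap pop assignments in probability order.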
18 System Overview
[Figure: system pipeline for the input sentence "I like it", with the components Supertagger, Enumeration of assignments, and Deterministic disambiguation; candidate LEs for each word, such as [HEAD noun, SUBJ <>, COMPS <>] and [HEAD verb, SUBJ <NP>, COMPS <NP>], are listed by probability]
19 Enumeration of the maybe-parsable LE assignments
- Based on the CFG-filtering technique [Kiefer and Krieger, 2000; Torisawa et al., 2000]
  - Parsing with a CFG that approximates the HPSG
  - Covering property: if an LE assignment is parsable by the HPSG → it is also parsable by the approximating CFG
  - CFG parsing is much faster than HPSG parsing
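As an illustration of how a CFG can act as a filter on LE assignments, the sketch below runs CKY recognition over a sequence of LE labels. The grammar here is a hand-written toy in Chomsky normal form, not the CFG actually compiled from the HPSG, and the symbol names and LE labels are hypothetical.

```python
from itertools import product

# Toy approximating CFG in Chomsky normal form (hypothetical):
# binary rules map a (left, right) pair of symbols to the possible parents.
BINARY = {("NP", "VP"): {"S"}, ("Vt", "NP"): {"VP"}}
# Each LE label is mapped to the CFG preterminal that approximates it.
LEXICAL = {"noun": "NP", "verb_trans": "Vt"}


def cfg_parsable(assignment, root="S"):
    """CKY recognition: True if the LE sequence derives `root`."""
    n = len(assignment)
    if n == 0:
        return False
    chart = {}
    for i, le in enumerate(assignment):
        chart[i, i + 1] = {LEXICAL[le]} if le in LEXICAL else set()
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = set()
            for mid in range(start + 1, end):
                for left, right in product(chart[start, mid], chart[mid, end]):
                    cell |= BINARY.get((left, right), set())
            chart[start, end] = cell
    return root in chart[0, n]


print(cfg_parsable(("noun", "verb_trans", "noun")))  # LEs for "I like it"
print(cfg_parsable(("noun", "noun", "noun")))
```

Under the covering property, an assignment rejected by such a filter cannot be parsed by the HPSG either, so discarding it loses nothing; an accepted assignment is only "maybe-parsable" and still has to go through the HPSG itself.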
20 Enumeration of the maybe-parsable LE assignments
[Figure: supertagging result, enumeration of the highest-probability LE sequences, and the CFG-filter]
21 Deterministic disambiguation
- Implemented as a shift-reduce parser
  - Deterministic parsing: only one analysis at a time
  - The next parsing action is selected using a scoring function: argmax over a in A of F(a, S, Q)
    - F: scoring function (averaged-perceptron algorithm [Collins and Duffy, 2002])
    - Features are extracted from the stack state S and the lookahead queue Q
    - A: the set of possible actions (the CFG forest is used as a guide)
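The action selection can be sketched as a linear model over features of the action, the stack, and the queue. This is a minimal sketch: the feature templates, the weight values, and the function names are hypothetical, and the averaged-perceptron training that would produce the weights is not shown.

```python
def extract_features(action, stack, queue):
    """Hypothetical feature templates over the stack top and the lookahead."""
    s0 = stack[-1] if stack else "<none>"
    q0 = queue[0] if queue else "<none>"
    return [
        f"a={action}|s0={s0}",
        f"a={action}|q0={q0}",
        f"a={action}|s0={s0}|q0={q0}",
    ]


def score(weights, action, stack, queue):
    """F(a, S, Q): the sum of the weights of the active features."""
    return sum(weights.get(f, 0.0) for f in extract_features(action, stack, queue))


def select_action(weights, possible_actions, stack, queue):
    """Return the action a in A that maximizes F(a, S, Q)."""
    return max(possible_actions, key=lambda a: score(weights, a, stack, queue))


# Invented weights; a trained model would supply these.
weights = {
    "a=SHIFT|q0=verb": 1.0,
    "a=REDUCE(Subj_Head)|s0=noun": 0.5,
}
chosen = select_action(
    weights,
    possible_actions=["SHIFT", "REDUCE(Subj_Head)"],
    stack=["noun"],
    queue=["verb", "noun"],
)
print(chosen)
```

Restricting `possible_actions` to the set A before taking the argmax is where a guide such as the CFG forest would come in: actions that cannot lead to a complete tree are never scored.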
22 Example
Initial state
[Figure: stack S and lookahead queue Q, with the LEs "I" [HEAD noun, SUBJ <>, COMPS <>], "like" [HEAD verb, SUBJ <NP>, COMPS <NP>], "it" [HEAD noun, SUBJ <>, COMPS <>]]
23 argmax F(a, S, Q) = SHIFT
[Figure: stack S and lookahead queue Q over the LEs for "I", "like", "it"]
24 argmax F(a, S, Q) = SHIFT
[Figure: stack S and lookahead queue Q over the LEs for "I", "like", "it"]
25 argmax F(a, S, Q) = SHIFT
[Figure: stack S and lookahead queue Q over the LEs for "I", "like", "it"]
26 argmax F(a, S, Q) = REDUCE(Head_Comp)
[Figure: stack S and queue Q; the Head-Comp schema combines "like" [HEAD verb, SUBJ <[1]>, COMPS <NP>] and "it" [HEAD noun, SUBJ <>, COMPS <>] into [HEAD verb, SUBJ <[1] NP>, COMPS <>]; "I" [HEAD noun, SUBJ <>, COMPS <>] is not yet attached]
27 argmax F(a, S, Q) = REDUCE(Subj_Head)
[Figure: stack S and queue Q; the Subj-Head schema combines "I" [HEAD noun, SUBJ <>, COMPS <>] and the phrase "like it" [HEAD verb, SUBJ <[1] NP>, COMPS <>] into [HEAD verb, SUBJ <>, COMPS <>]]
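The action sequence of this example (three SHIFTs, then REDUCE(Head_Comp), then REDUCE(Subj_Head)) can be replayed with a small sketch. It makes strong simplifying assumptions: feature structures are reduced to dictionaries with HEAD, SUBJ, and COMPS, and unification is replaced by a simple check that an "NP" slot is filled by a saturated noun category. A real HPSG parser unifies full typed feature structures, and all function names here are hypothetical.

```python
def is_np(cat):
    """A saturated noun category counts as an NP in this toy model."""
    return cat["HEAD"] == "noun" and not cat["SUBJ"] and not cat["COMPS"]


def head_comp(head, comp):
    """Head-Complement schema: the head consumes its first complement."""
    if head["COMPS"] and head["COMPS"][0] == "NP" and is_np(comp):
        return {"HEAD": head["HEAD"], "SUBJ": head["SUBJ"], "COMPS": head["COMPS"][1:]}
    return None


def subj_head(subj, head):
    """Subject-Head schema: a complement-saturated head consumes its subject."""
    if not head["COMPS"] and head["SUBJ"] == ["NP"] and is_np(subj):
        return {"HEAD": head["HEAD"], "SUBJ": [], "COMPS": []}
    return None


def run(actions, lexical_entries):
    """Apply a fixed action sequence; return the final stack and queue."""
    stack, queue = [], list(lexical_entries)
    for action in actions:
        if action == "SHIFT":
            stack.append(queue.pop(0))
            continue
        right = stack.pop()
        left = stack.pop()
        if action == "REDUCE(Head_Comp)":
            mother = head_comp(left, right)   # head is the left daughter
        else:                                 # "REDUCE(Subj_Head)"
            mother = subj_head(left, right)   # head is the right daughter
        if mother is None:
            raise ValueError(f"{action} is not applicable")
        stack.append(mother)
    return stack, queue


NOUN = {"HEAD": "noun", "SUBJ": [], "COMPS": []}
VERB = {"HEAD": "verb", "SUBJ": ["NP"], "COMPS": ["NP"]}
stack, queue = run(
    ["SHIFT", "SHIFT", "SHIFT", "REDUCE(Head_Comp)", "REDUCE(Subj_Head)"],
    [NOUN, VERB, NOUN],  # LEs for "I", "like", "it"
)
print(stack, queue)
```

In the actual system the actions are not given in advance but chosen one at a time by the scoring function; the fixed list here only reproduces the trace on the slides.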
28 Experimental results
- Training / test corpus: Penn Treebank sections 02-21 / 23
- Grammar: the Enju HPSG English grammar [Miyao et al., 2005]
- Evaluation metrics
  - LP/LR: labeled precision/recall of predicate-argument relations in the output
  - Avg. time: average parse time per sentence (on a 2.4 GHz CPU)
- Previous method 1: [Ninomiya et al., 2006]
- Previous method 2: [Miyao and Tsujii, 2005]
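For reference, labeled precision and recall over predicate-argument relations can be computed as in the sketch below. The representation of a relation as a (predicate, argument label, argument) triple and the example relations are assumptions made for illustration; the evaluation in the talk may use a different relation format.

```python
def labeled_precision_recall(predicted, gold):
    """LP and LR over sets of labeled predicate-argument relations."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    lp = correct / len(predicted) if predicted else 0.0
    lr = correct / len(gold) if gold else 0.0
    return lp, lr


# Hypothetical relations as (predicate, argument label, argument) triples.
gold = {
    ("like", "ARG1", "I"),
    ("like", "ARG2", "it"),
    ("very", "ARG1", "much"),
    ("much", "ARG1", "like"),
}
predicted = {
    ("like", "ARG1", "I"),
    ("like", "ARG2", "I"),  # wrong argument
}
lp, lr = labeled_precision_recall(predicted, gold)
print(lp, lr)
```

A relation counts as correct only if the predicate, the label, and the argument all match, which is why the second predicted triple contributes to neither score.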
29 Experimental results: the effect of the CFG-filter
- We got a well-formed parse tree on the first maybe-parsable LE assignment for 95% of the sentences
We reduced the cost of the operations on feature structures nearly to its minimum
30 Conclusion
- An efficient HPSG parsing system
  - Combination of supertagging, CFG-filtering, and deterministic disambiguation
  - 6-fold speed-up with almost the same level of parsing accuracy as the previous method
- Inefficiency of the feature structure operations is no longer a problem in HPSG parsing
31 Back-up slides
32 If we do not use the CFG-filter?
33 Breakdown of the processing time