Efficient HPSG Parsing with Supertagging and CFG-filtering

1
Efficient HPSG Parsing with Supertagging and
CFG-filtering
  • Takuya Matsuzaki, Yusuke Miyao, Junichi Tsujii
    University of Tokyo

2
Overview
  • Objective: to make HPSG parsing faster
  • Method: combining three complementary techniques
  • Supertagging
  • CFG-filtering
  • Deterministic disambiguation
  • Result: 6-fold speed-up on the Penn WSJ data

3
Background HPSG
  • Head-driven Phrase Structure Grammar [Pollard and
    Sag, 1994]
  • Constraint-based, lexicalized grammar
  • A small number of Rule Schemata: generic
    grammatical constraints
  • A large number of Lexical Entries (LEs):
    word-specific constraints

4
Example HPSG Parsing
[Figure: the input sentence "I like it"]
5
Example HPSG Parsing
Assignments of Lexical Entries
6
Schema Application
[Figure: Head-Complement schema]
7
Schema Application
[Figure: Subject-Head schema]
8
Real Grammar
  • Lexical entries and parse trees: huge feature
    structures → demo
  • Rule schema application: unification of the
    feature structures → costly operations

Our basic strategy for the speed-up: minimize
the operations on the feature structures
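
As a rough illustration of why schema application is costly, unification can be sketched on plain nested dictionaries. This is a simplified stand-in and not the grammar's actual implementation: real HPSG feature structures are typed and use structure sharing (reentrancy), which this sketch omits. The function name `unify` and the toy structures are assumptions made for the example.

```python
def unify(a, b):
    """Unify two toy feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if the two structures conflict.
    Assumption: no types and no reentrancy, unlike real HPSG structures.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                sub = unify(result[feature], value)
                if sub is None:
                    return None  # conflicting values under the same feature
                result[feature] = sub
            else:
                result[feature] = value
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None


# Hypothetical toy structures, far smaller than those of a real grammar.
verb = {"HEAD": "verb", "SUBJ": {"HEAD": "noun"}}
constraint = {"SUBJ": {"HEAD": "noun", "CASE": "nom"}}
merged = unify(verb, constraint)
clash = unify({"HEAD": "verb"}, {"HEAD": "noun"})
```

Every feature of both structures is visited recursively, so the cost grows with the size of the structures, which is the motivation for keeping such operations to a minimum.
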
9
Specificity of HPSG from the viewpoint of parsing
Many constraints are specified in the lexical
entries (LEs)
The assignment of LEs determines the form of the
parse tree for the most part
10
Specificity of HPSG from the viewpoint of parsing
11
Specificity of HPSG from the viewpoint of parsing
Many constraints are specified in the lexical
entries (LEs)
The assignment of LEs determines the form of the
syntactic tree for the most part
Correct LE assignments → the parsing thereafter
is easy
Wrong LE assignments → there is a risk
of parse failures
12
Previous method: Supertagger-based parsing
[Clark and Curran, 2004; Ninomiya et al., 2006]
  • Supertagging [Bangalore and Joshi, 1999]:
    selecting a few LEs for each word by using a
    probabilistic model of P(LEs | input sentence)

[Figure: candidate LEs for each word of the input (e.g. "I", "it"), ordered from large P to small P; each LE is a feature structure such as [HEAD noun, SUBJ < >, COMPS < >] or [HEAD verb, SUBJ <NP>, COMPS <NP>]]
13
Previous method: Supertagger-based parsing
[Clark and Curran, 2004; Ninomiya et al., 2006]
  • Ignore the LEs with small probabilities

[Figure: candidate LEs for each word (e.g. "I", "it"), ordered from large P to small P, with a threshold line; each LE is a feature structure such as [HEAD noun, SUBJ < >, COMPS < >] or [HEAD verb, SUBJ <NP>, COMPS <NP>]]
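
The thresholding step can be sketched as follows. This is only an illustration under stated assumptions: the function name `prune_lexical_entries`, the LE labels, the probabilities, and the use of an absolute threshold are all made up for the example, and the supertagger itself is not modeled.

```python
def prune_lexical_entries(tag_probs, threshold):
    """Keep, for each word, only the LEs whose probability is >= threshold.

    tag_probs: list of (word, {LE label: probability}) pairs, standing in
    for the output of a supertagger (hypothetical format).
    """
    pruned = []
    for word, candidates in tag_probs:
        kept = {le: p for le, p in candidates.items() if p >= threshold}
        pruned.append((word, kept))
    return pruned


# Made-up supertagger output for "I like it".
tag_probs = [
    ("I", {"noun": 0.9, "verb_trans": 0.1}),
    ("like", {"verb_trans": 0.6, "prep": 0.3, "noun": 0.1}),
    ("it", {"noun": 0.95, "verb_trans": 0.05}),
]
pruned = prune_lexical_entries(tag_probs, threshold=0.2)
```

With a higher threshold fewer LEs survive, so parsing is cheaper but a word may lose the LE that the correct parse needs.
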
14
A dilemma in the previous method
  • Fewer LEs → faster parsing, but
  • Too few LEs → more risk of no well-formed parse
    trees

15
Previous solution to the dilemma
  1. Assign a few LEs at first
  2. Input the LEs to a chart parser
  3. If no well-formed parse tree is found, increase
     the number of LEs and go to step 2
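
The loop described above might look like the sketch below. The names `parse_with_relaxation` and `toy_chart_parse`, the threshold schedule, and the probabilities are assumptions for illustration; the toy parser is a stand-in that only checks for a noun / transitive verb / noun pattern and is not a real chart parser.

```python
def parse_with_relaxation(tag_probs, chart_parse, thresholds=(0.5, 0.1, 0.01)):
    """Retry parsing with progressively more LEs per word.

    Each lower threshold admits more LEs; stop at the first success.
    """
    for threshold in thresholds:
        assignment = [
            (word, [le for le, p in candidates.items() if p >= threshold])
            for word, candidates in tag_probs
        ]
        tree = chart_parse(assignment)
        if tree is not None:
            return tree, threshold
    return None, None


def toy_chart_parse(assignment):
    """Hypothetical stand-in for a chart parser on three-word input."""
    les = [set(candidates) for _, candidates in assignment]
    if len(les) == 3 and "noun" in les[0] and "verb_trans" in les[1] and "noun" in les[2]:
        return ("S", "noun", ("VP", "verb_trans", "noun"))
    return None


# Made-up probabilities in which the best LE for "like" is the wrong one.
tag_probs = [
    ("I", {"noun": 0.9, "verb_trans": 0.1}),
    ("like", {"prep": 0.6, "verb_trans": 0.3, "noun": 0.1}),
    ("it", {"noun": 0.95, "verb_trans": 0.05}),
]
tree, used_threshold = parse_with_relaxation(tag_probs, toy_chart_parse)
```

In this toy run the first attempt fails because only the wrong LE for "like" passes the threshold, and the second attempt succeeds after the parser has already been run once for nothing.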

16
Our idea
  • If we guarantee the parsability of the LE
    assignment,
  • the parsing becomes faster because
  • we could replace the chart parsing by a
    simpler algorithm → deterministic parsing

17
Our system: Enumeration of LE assignments +
Deterministic disambiguation
  • Enumerate (maybe-)parsable LE assignments in the
    order of their probabilities by combining the
    supertagger and a CFG-filter
  • Input the LE assignments one by one to a
    deterministic disambiguation module until a
    well-formed tree is obtained
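
A minimal sketch of this two-stage loop follows. It assumes that the probability of an LE sequence is the product of per-word probabilities and enumerates sequences best-first with a heap; the slides do not say how the enumeration is actually implemented. The names `enumerate_assignments` and `parse`, and the two callbacks `maybe_parsable` and `deterministic_parse`, are hypothetical.

```python
import heapq
import math


def enumerate_assignments(candidates):
    """Yield (probability, LE sequence) in descending order of probability.

    candidates: one list per word of (LE label, probability) pairs.
    Assumption: sequence probability = product of per-word probabilities.
    """
    ranked = [sorted(c, key=lambda pair: -pair[1]) for c in candidates]

    def score(index):
        return math.prod(ranked[i][j][1] for i, j in enumerate(index))

    start = (0,) * len(ranked)
    heap = [(-score(start), start)]
    seen = {start}
    while heap:
        negative, index = heapq.heappop(heap)
        yield -negative, [ranked[i][j][0] for i, j in enumerate(index)]
        # Successors: move one word to its next-best LE.
        for i in range(len(index)):
            if index[i] + 1 < len(ranked[i]):
                nxt = index[:i] + (index[i] + 1,) + index[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))


def parse(candidates, maybe_parsable, deterministic_parse):
    """Try maybe-parsable assignments one by one until a tree is obtained."""
    for _, assignment in enumerate_assignments(candidates):
        if not maybe_parsable(assignment):
            continue  # rejected by the filter, no feature structures touched
        tree = deterministic_parse(assignment)
        if tree is not None:
            return tree
    return None


# Made-up candidates for "I like it".
candidates = [
    [("noun", 0.9), ("verb_trans", 0.1)],
    [("prep", 0.6), ("verb_trans", 0.4)],
    [("noun", 1.0)],
]
order = [sequence for _, sequence in enumerate_assignments(candidates)]
result = parse(
    candidates,
    maybe_parsable=lambda a: a == ["noun", "verb_trans", "noun"],
    deterministic_parse=lambda a: ("S", a),
)
```

Because each word's LEs are sorted by probability, every successor scores no higher than its parent, so popping from the heap yields sequences in descending order.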

18
System Overview
[Figure: the input sentence "I like it" is passed to the supertagger, which gives LEs for each word with their probabilities (e.g. [HEAD noun, SUBJ < >, COMPS < >], [HEAD verb, SUBJ <NP>, COMPS <NP>]); the other stages shown are enumeration of assignments and deterministic disambiguation]
19
Enumeration of the maybe-parsable LE assignments
  • Based on the CFG-filtering technique [Kiefer and
    Krieger, 2000; Torisawa et al., 2000]
  • Parsing with a CFG that approximates the HPSG
  • Covering property: if an LE assignment is parsable
    by the HPSG → it is also parsable by the
    approximating CFG
  • CFG parsing is much faster than HPSG parsing
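
To make the filtering idea concrete, here is a small CKY recognizer over a toy CFG. The grammar in `TOY_RULES`, the category names, and the function name `cfg_accepts` are invented for the example; a real approximating CFG would be derived from the HPSG and would be far larger.

```python
def cfg_accepts(le_sequence, binary_rules, start="S"):
    """CKY recognition: can the LE sequence be reduced to the start symbol?

    binary_rules maps (left category, right category) to a set of parents.
    """
    n = len(le_sequence)
    if n == 0:
        return False
    # Spans of width 1 hold the LE labels themselves.
    chart = {(i, i + 1): {le} for i, le in enumerate(le_sequence)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):
                for left in chart[(i, j)]:
                    for right in chart[(j, k)]:
                        cell |= binary_rules.get((left, right), set())
            chart[(i, k)] = cell
    return start in chart[(0, n)]


# Hypothetical two-rule grammar mimicking Head-Complement and Subject-Head.
TOY_RULES = {
    ("verb_trans", "noun"): {"VP"},
    ("noun", "VP"): {"S"},
}
accepted = cfg_accepts(["noun", "verb_trans", "noun"], TOY_RULES)
rejected = cfg_accepts(["noun", "prep", "noun"], TOY_RULES)
```

By the covering property, a sequence rejected by such a filter cannot be parsed by the HPSG either, so it can be discarded without any unification; an accepted sequence is only "maybe-parsable".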

20
Enumeration of the maybe-parsable LE assignments
[Figure: supertagging result, enumeration of the highest-prob. LE sequences, CFG-filter]
21
Deterministic disambiguation
  • Implemented as a shift-reduce parser
  • Deterministic parsing: only one analysis at a
    time
  • Next parsing action is selected using a scoring
    function: argmax over a in A of F(a, S, Q)
  • F: scoring function (averaged-perceptron
    algorithm [Collins and Duffy, 2002])
  • Features are extracted from the stack state S
    and lookahead queue Q
  • A: the set of possible actions (CFG-forest is
    used as a guide)
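
The action-selection loop can be sketched as below. Everything here is a toy under stated assumptions: `SCHEMAS` is a two-entry table standing in for the CFG-forest guide, the features and the `weights` dictionary are placeholders for a trained averaged-perceptron model, and categories are plain strings instead of feature structures.

```python
# Hypothetical guide: which schema can combine two adjacent categories.
SCHEMAS = {
    ("verb_trans", "noun"): ("Head_Comp", "VP"),
    ("noun", "VP"): ("Subj_Head", "S"),
}


def possible_actions(stack, queue):
    """The set A of actions allowed in the current state."""
    actions = []
    if queue:
        actions.append("SHIFT")
    if len(stack) >= 2 and (stack[-2], stack[-1]) in SCHEMAS:
        actions.append("REDUCE(%s)" % SCHEMAS[(stack[-2], stack[-1])][0])
    return actions


def score(action, stack, queue, weights):
    """F(a, S, Q): linear score over features of stack top and lookahead."""
    top = stack[-1] if stack else "<empty>"
    lookahead = queue[0] if queue else "<end>"
    features = [(action, "top", top), (action, "next", lookahead)]
    return sum(weights.get(feature, 0.0) for feature in features)


def deterministic_parse(le_sequence, weights):
    """Greedy shift-reduce: always take the highest-scoring allowed action."""
    stack, queue, history = [], list(le_sequence), []
    while True:
        actions = possible_actions(stack, queue)
        if not actions:
            break
        best = max(actions, key=lambda a: score(a, stack, queue, weights))
        history.append(best)
        if best == "SHIFT":
            stack.append(queue.pop(0))
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(SCHEMAS[(left, right)][1])
    return stack, history


# LEs for "I like it"; empty weights stand in for a trained model.
final_stack, history = deterministic_parse(["noun", "verb_trans", "noun"], weights={})
```

On this input the toy guide leaves only one allowed action in each state, so the scores never have to break a tie; with a realistic grammar several actions would compete and the learned weights would decide.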

22
Example
Initial state
[Figure: stack S and queue Q with the LEs of "I like it": [HEAD noun, SUBJ < >, COMPS < >] (I), [HEAD verb, SUBJ <NP>, COMPS <NP>] (like), [HEAD noun, SUBJ < >, COMPS < >] (it)]
23
argmax F(a, S, Q) = SHIFT
[Figure: stack S and queue Q with the LEs of "I like it": [HEAD noun, SUBJ < >, COMPS < >] (I), [HEAD verb, SUBJ <NP>, COMPS <NP>] (like), [HEAD noun, SUBJ < >, COMPS < >] (it)]
24
argmax F(a, S, Q) = SHIFT
[Figure: stack S and queue Q with the LEs of "I like it": [HEAD noun, SUBJ < >, COMPS < >] (I), [HEAD verb, SUBJ <NP>, COMPS <NP>] (like), [HEAD noun, SUBJ < >, COMPS < >] (it)]
25

argmax F(a, S, Q) = SHIFT
[Figure: stack S and queue Q with the LEs of "I like it": [HEAD noun, SUBJ < >, COMPS < >] (I), [HEAD verb, SUBJ <NP>, COMPS <NP>] (like), [HEAD noun, SUBJ < >, COMPS < >] (it)]
26

argmax F(a, S, Q) = REDUCE(Head_Comp)
[Figure: stack S and queue Q; the Head-Comp-Schema combines [HEAD verb, SUBJ <[1]>, COMPS <NP>] (like) and [HEAD noun, SUBJ < >, COMPS < >] (it) into [HEAD verb, SUBJ <[1] NP>, COMPS < >]; [HEAD noun, SUBJ < >, COMPS < >] (I) is not yet combined]
27

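argmax F(a, S, Q) = REDUCE(Subj_Head)
[Figure: stack S and queue Q; the Subj-Head-Schema combines [HEAD noun, SUBJ < >, COMPS < >] (I) and [HEAD verb, SUBJ <[1] NP>, COMPS < >] into [HEAD verb, SUBJ < >, COMPS < >]; the verb phrase is the result of the Head-Comp-Schema over [HEAD verb, SUBJ <[1]>, COMPS <NP>] (like) and [HEAD noun, SUBJ < >, COMPS < >] (it)]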
28
Experimental result
  • Training / test corpus: Penn Treebank sections
    02-21 / 23
  • Grammar: the Enju HPSG English grammar [Miyao et
    al., 2005]
  • Evaluation metrics
  • LP/LR: Labeled Precision/Recall of
    predicate-argument relations in the output
  • Avg. time: average parse time per sentence (on
    a 2.4GHz CPU)
  • Previous method 1: [Ninomiya et al., 2006]
  • Previous method 2: [Miyao and Tsujii, 2005]
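
For reference, labeled precision and recall over predicate-argument relations can be computed as in this sketch. The relation format (predicate, label, argument), the label names, and the example relations are assumptions for illustration and are not taken from the evaluation in the slides.

```python
def labeled_precision_recall(predicted, gold):
    """Return (LP, LR) for two collections of labeled relations.

    A predicted relation counts as correct only if the predicate, the
    label, and the argument all match a gold relation.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up relations for "I like it"; the second prediction is wrong.
gold = {("like", "ARG1", "I"), ("like", "ARG2", "it")}
predicted = {("like", "ARG1", "I"), ("like", "ARG2", "I")}
lp, lr = labeled_precision_recall(predicted, gold)
```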

29
Experimental result: The effect of the CFG-filter
  • We got a well-formed parse tree on the first
    maybe-parsable LE assignment for 95% of the
    sentences

We reduced the cost for the operations on
feature structures nearly to its minimum
30
Conclusion
  • An efficient HPSG parsing system
  • Combination of supertagging, CFG-filtering, and
    deterministic disambiguation
  • 6-fold speed-up with almost the same level of
    parsing accuracy as the previous method
  • Inefficiency of the feature structure operations
    is no longer a problem in HPSG parsing

31
Back-up slides
32
If we do not use the CFG-filter?
33
Breakdown of the processing time