Title: LING 406 Intro to Computational Linguistics Parsing 1
1 LING 406 Intro to Computational Linguistics: Parsing 1
- Richard Sproat
- URL: http://catarina.ai.uiuc.edu/L406_08/
2 This Lecture
- Some Context-free Parsing Algorithms
- Simple top-down/bottom-up parsing
- Problems
- Deterministic left-corner parsing
- Cocke-Younger-Kasami chart parsing
3 Parsing
- Parsing is the recovery of structure for a string
given a grammar.
- Parsing is a search problem. (So are finite-state
operations such as composition.)
- Find the right route to generating parse tree(s)
amongst all possible routes.
- Different parsing algorithms have different
advantages and disadvantages and, especially,
different time complexity.
4 Basic approaches
- Top-Down Parsers start at the top of the
grammar and predict constituents.
- Bottom-Up Parsers start with the input words and
build constituents.
- You might think that one is more intuitively
correct than the other, but in fact both have
their drawbacks.
5 Simple top-down algorithm
- Builds from the root S node to the leaves
- Assuming we build all trees in parallel
- Find all trees with root S (or all rules with left-hand side S; see the grammar sketch after this list)
- Next expand all constituents in these
trees/rules
- Continue until leaves are parts of speech (POS)
- Candidate trees failing to match POS of input
string are rejected
- This describes breadth-first search.
- In depth-first search you keep expanding rules
until you reach a terminal, and then when that
succeeds (or fails) you backtrack and search
other rules.
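To make "rules with left-hand side S" concrete, here is a minimal sketch of a toy CFG in Python. The rules are illustrative, not the lecture's grammar:

```python
# A toy CFG in plain Python: each non-terminal maps to its possible
# expansions. The rules here are illustrative only.
GRAMMAR = {
    "S":  [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
# "Expanding" a node in a top-down parse means replacing it by each
# right-hand side in turn, e.g. S by ("NP", "VP") and ("Aux", "NP", "VP").
```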
6 Simple top-down algorithm
7 Schematic breadth-first search
8 Depth-first search
- Agenda of search states: expand the search space incrementally, exploring the most recently generated state (tree) each time
- When you reach a state (tree) inconsistent with the input, backtrack to the most recent unexplored state (tree)
- Which node to expand? Leftmost or rightmost
- Which grammar rule to use? Order in the grammar
9 Top-down, depth-first, left-right strategy
- Initialize the agenda with an S tree and a pointer to the first word, and make this the current search state (cur)
- Loop until a successful parse or an empty agenda:
- Apply all applicable grammar rules to the leftmost unexpanded node of cur
- If this node is a POS category and matches that of the current input, push this onto the agenda
- Otherwise push the new trees onto the agenda
- Pop a new cur from the agenda (a recognizer along these lines is sketched below)
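A minimal sketch of such a recognizer in Python. The grammar and lexicon formats and all names are my own, not the lecture's; Python's call stack plays the role of the explicit agenda, backtracking when a POS fails to match. Note that a left-recursive rule would send this code into infinite recursion, exactly as discussed on slide 14:

```python
# Top-down, depth-first, left-to-right recognizer (a sketch).
# `symbols` is the frontier of unexpanded nodes; `pos` indexes the input.
def parse(symbols, words, pos, grammar, lexicon):
    if not symbols:
        return pos == len(words)          # success iff all input consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                  # non-terminal: try each rule in order
        return any(parse(list(rhs) + rest, words, pos, grammar, lexicon)
                   for rhs in grammar[first])
    # POS category: must match the current input word, else backtrack
    if pos < len(words) and first in lexicon.get(words[pos], ()):
        return parse(rest, words, pos + 1, grammar, lexicon)
    return False

grammar = {"S": [("NP", "VP")], "NP": [("Det", "N")], "VP": [("V", "NP")]}
lexicon = {"the": {"Det"}, "dog": {"N"}, "saw": {"V"}, "cat": {"N"}}
print(parse(["S"], "the dog saw the cat".split(), 0, grammar, lexicon))  # True
```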
10 Does this flight include a meal?
11 Bottom-up parsing
- The parser begins with the words of the input and builds up trees, applying grammar rules whose right-hand side matches (a sketch follows below)
- Parsing continues until an S root node is reached or no further node expansion is possible
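A naive bottom-up recognizer, sketched under the assumption that the input has already been mapped to POS tags; the rule format and names are mine. It searches depth-first over reductions, rewriting any right-hand-side match to its left-hand side (a grammar with cyclic unary rules would need a loop check):

```python
# Naive bottom-up recognizer (a sketch; input is a tuple of POS tags).
def bottom_up_recognize(symbols, rules, goal="S"):
    # rules: list of (lhs, rhs) pairs, e.g. ("VP", ("V", "NP"))
    if symbols == (goal,):
        return True
    for lhs, rhs in rules:                      # try every possible reduction
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:         # right-hand-side match
                reduced = symbols[:i] + (lhs,) + symbols[i + n:]
                if bottom_up_recognize(reduced, rules, goal):
                    return True
    return False                                # no further reduction possible

rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]
print(bottom_up_recognize(("Det", "N", "V", "Det", "N"), rules))  # True
```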
12 Bottom-up parsing
13 General issues
- Top-down parsers never explore illegal parses (e.g. ones that can't form an S) but waste time on trees that can never match the input
- Bottom-up parsers never explore trees inconsistent with the input but waste time exploring illegal parses (ones with no S root)
14 Problems with top-down parsing
- Left-recursion is a big problem for top-down parsers. With a rule such as
- NP → NP PP
- a simple depth-first search will keep expanding the NP forever.
- Ambiguity in natural language means that any sentence might have hundreds or thousands of possible parses. With no way to filter out any valid parse, simple bottom-up and top-down parsers simply have to compute all of the parses.
- There is no storage other than the agenda and the accumulated set of successful parses. This means that tree fragments might get rebuilt many times as the parser re-explores analyses for the same string.
- For British left waffles on Falkland Islands, the analysis of Falkland Islands is the same no matter which analysis is picked for the first part of the sentence.
15 Adding a bottom-up filter for top-down parsing
- Generate a left-corner table that includes all of the leftmost dependents of each non-terminal (a sketch of the construction follows below)
- Don't expand any non-terminal for which the leftmost word does not have a category in the non-terminal's left-corner table. E.g., don't explore NP if the leftmost POS is Aux.
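A sketch of how the left-corner table can be built by transitive closure over the leftmost symbol of each rule; the grammar format matches the earlier sketches and the function names are mine:

```python
# Build the left-corner table: lc[nt] = every symbol that can begin an nt.
def left_corner_table(grammar):
    lc = {nt: {rhs[0] for rhs in expansions}
          for nt, expansions in grammar.items()}
    changed = True
    while changed:                 # close under "left corner of a left corner"
        changed = False
        for nt in lc:
            for sym in list(lc[nt]):
                for deeper in lc.get(sym, ()):
                    if deeper not in lc[nt]:
                        lc[nt].add(deeper)
                        changed = True
    return lc

# The filter: don't expand `nt` unless the next word's POS can begin it.
def worth_expanding(nt, next_pos, lc):
    return next_pos == nt or next_pos in lc.get(nt, ())
```

With the toy GRAMMAR above, lc["S"] contains Aux and Det (via NP), so an S expansion survives when the next POS is Aux, but an NP expansion is pruned.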
16 Left-corner parsing
- [Diagram contrasting predicted and announced constituents]
17 Left-corner parsing
- Handles left-recursion because it waits until the leftmost child is completed before predicting the parent.
- The algorithm can be used to transform a grammar into a left-corner grammar, which can then be used with a regular top-down parser (see RS, pp. 141-142).
- But the algorithm is deterministic, and thus not well suited to natural language grammars.
19 Cocke-Younger-Kasami Algorithm
- A bottom-up algorithm
- Uses dynamic programming.
- JM don't discuss this until later in the book (Chapter 12), in the context of the discussion of probabilistic CFGs.
- Same for RS: we discuss it as a probabilistic method (that's Roark's fault).
- But originally it was developed as a non-probabilistic algorithm.
- I find it the easiest algorithm to understand.
20 What is dynamic programming?
- Answer: a class of algorithms that use tables to store solutions to subproblems of larger problems. Some examples in language processing:
- minimum edit distance
- CYK algorithm
- Earley algorithm
- Viterbi algorithm
- forward algorithm
- We'll return to several of these later on
21 Minimum edit distance
- Compute the minimum edit distance between cat and at according to the following criteria (Levenshtein distance):
- Substituting one letter for another costs 1 point
- Deleting a letter costs 1 point
- Inserting a letter costs 1 point
- Intuitively the right alignment is as follows, and costs 1 (one deletion/insertion):
- c a t
- - a t
- How to compute this efficiently?
22 Efficient algorithm (see JM p. 156 for pseudocode)
- Pad each string with a dummy symbol at the beginning (e.g. #).
- Create an n × m matrix, where n and m are the lengths of the padded strings.
- Seed the matrix at (0, 0) with distance (cost) 0.
- Loop over all columns i, loop over all rows j, assigning to cell (i, j) the distance
- d(i, j) = min of: d(i-1, j) + 1 (deletion), d(i, j-1) + 1 (insertion), and d(i-1, j-1) + 1 if the ith and jth symbols differ, or d(i-1, j-1) if they match (as implemented in the sketch below)
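The recipe above, implemented as a short Python sketch; variable names are mine. The substitution cost of 1 follows the slide's criteria (JM's pseudocode uses cost 2 for substitutions):

```python
# Levenshtein distance: d[i][j] = cost of turning s[:i] into t[:j].
def min_edit_distance(s, t):
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i                      # i deletions from s
    for j in range(1, m + 1):
        d[0][j] = j                      # j insertions into s
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution (or match)
    return d[n][m]

print(min_edit_distance("cat", "at"))  # 1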
23 The distance matrix for cat vs. at (each cell can be annotated with the partial alignment that achieves it, e.g. the bottom-right cell corresponds to cat aligned with -at, cost 1):

      #   a   t
  #   0   1   2
  c   1   1   2
  a   2   1   2
  t   3   2   1
24 Minimum edit distance
- Use back pointers to recover the cheapest path
- This algorithm was independently discovered seven times.
- String matching is very important in a lot of fields, including computational linguistics and computational biology
25 Back to the CYK algorithm
26 The CYK algorithm requires CNF grammars
- The rule
- VP → V NP PP
- must be converted, so we'll do the following (a sketch of this binarization step follows below):
- VP → V XX
- XX → NP PP
- We also need to do something with
- NP → N
- What we can do here is to allow the following as well:
- NP → British, left, waffles, Falkland, Islands
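A sketch of the binarization step in Python: rules with more than two right-hand-side symbols are split by introducing fresh non-terminals, like the XX above. The fresh-symbol naming and rule format are mine:

```python
# Split any rule with more than two RHS symbols into binary rules.
def binarize(rules):
    out, counter = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            fresh = f"X{counter}"            # fresh non-terminal, like XX
            out.append((lhs, (rhs[0], fresh)))
            lhs, rhs = fresh, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'X1')), ('X1', ('NP', 'PP'))]
```

(Unit rules like NP → N are not handled here; as the slide says, the workaround is to let NP rewrite directly to the words as well.)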
27 CYK algorithm
28 CYK algorithm
- For a more formal statement of the algorithm see:
- JM section 14.2
- RS pages 145-152
- but ignore the probabilistic aspect for now
- Question: do you see why the grammar needs to be in CNF for this to work? (A recognizer sketch follows below.)
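A minimal CYK recognizer in Python. The grammar format is an assumption of mine: binary rules as a map from a pair of categories to the set of parents, plus a lexicon from words to categories (lexical handling is simplified; unary chains are ignored). CNF is what lets the inner loop consider only one split point k per pair of daughters:

```python
from collections import defaultdict

def cyk_recognize(words, binary_rules, lexicon):
    n = len(words)
    chart = defaultdict(set)                  # chart[i, j]: cats over span i..j
    for i, w in enumerate(words):
        chart[i, i + 1] = set(lexicon[w])     # seed the diagonal with POS
    for length in range(2, n + 1):            # span length
        for i in range(0, n - length + 1):    # span start
            j = i + length
            for k in range(i + 1, j):         # split point
                for B in chart[i, k]:
                    for C in chart[k, j]:
                        chart[i, j] |= binary_rules.get((B, C), set())
    return "S" in chart[0, n]
```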
29 Example: CYK chart for British left waffles on Falkland Islands, where cell (i, j) holds the categories spanning words i through j:

       j=1        j=2        j=3        j=4   j=5        j=6
 i=0   N, A, NP   NP         NP         ---   S          S
 i=1              N, V, NP   NP         ---   VP, S, XX  S, VP
 i=2                         N, V, NP   ---   XX, VP     VP, XX
 i=3                                    P     PP         PP
 i=4                                          N, NP      NP
 i=5                                                     N, NP

       British(0-1) left(1-2) waffles(2-3) on(3-4) Falkland(4-5) Islands(5-6)
30 Notes
- What we have shown is actually just a recognizer, not a parser. For a parser we also need to extract the trees from the chart (a sketch follows below).
- The recognizer itself runs in polynomial time, more specifically cubic time in the length of the sentence:
- You have to loop over all i and j between 0 and n, and then over each k between i and j, where k determines the split of the span (i, j) into two parts.
- Extracting all possible trees is much worse: it can be exponential in time complexity.
- For each span, there might be several different ways of constructing the nodes at that span; each of those nodes might have several different ways of being constructed, and so forth.
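A sketch of tree extraction, assuming the CYK loop also recorded backpointers: back[(A, i, j)] holds (k, B, C) triples recording that A was built from B over (i, k) and C over (k, j). All names are illustrative, not from the lecture. The nested loops, multiplying every way of building a cell by every way of building its daughters, are exactly the source of the exponential blowup:

```python
# Extract every tree for category `cat` over span (i, j) from backpointers.
def extract_trees(back, cat, i, j, words):
    entries = back.get((cat, i, j))
    if not entries:                        # length-1 span: a lexical cell
        return [(cat, words[i])]
    trees = []
    for k, B, C in entries:                # every way this cell was built...
        for left in extract_trees(back, B, i, k, words):
            for right in extract_trees(back, C, k, j, words):
                trees.append((cat, left, right))   # ...times every sub-way
    return trees
```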