Title: LING 406 Intro to Computational Linguistics Parsing 1
1 LING 406 Intro to Computational Linguistics: Parsing 1
- Richard Sproat
- URL: http://catarina.ai.uiuc.edu/L406_08/
2 This Lecture
- Some Context-free Parsing Algorithms
- Simple top-down/bottom-up parsing
- Problems
- Deterministic left-corner parsing
- Cocke-Younger-Kasami chart parsing
3 Parsing
- Parsing is the recovery of structure for a string
given a grammar.
- Parsing is a search problem. (So are finite-state
operations such as composition.)
- Find the right route to generating parse tree(s)
amongst all possible routes.
- Different parsing algorithms have different
advantages and disadvantages and, especially,
different time complexity.
4 Basic approaches
- Top-Down Parsers start at the top of the
grammar and predict constituents.
- Bottom-Up Parsers start with the input words and
build constituents.
- You might think that one is more intuitively
correct than the other, but in fact both have
their drawbacks.
5 Simple top-down algorithm
- Builds from the root S node to the leaves
- Assuming we build all trees in parallel
- Find all trees with root S (or all rules with left-hand side S; see the grammar sketch after this list)
- Next expand all constituents in these
trees/rules
- Continue until leaves are parts of speech (POS)
- Candidate trees failing to match POS of input
string are rejected
- This describes breadth-first search.
- In depth-first search you keep expanding rules
until you reach a terminal, and then when that
succeeds (or fails) you backtrack and search
other rules.
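To make "rules with left-hand side S" concrete, here is a minimal sketch of a toy CFG in Python. The rules are illustrative, not the lecture's grammar:

```python
# A toy CFG in plain Python: each non-terminal maps to its possible
# expansions. The rules here are illustrative only.
GRAMMAR = {
    "S":  [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
}
# "Expanding" a node in a top-down parse means replacing it by each
# right-hand side in turn, e.g. S by ("NP", "VP") and ("Aux", "NP", "VP").
```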
6 Simple top-down algorithm
7 Schematic breadth-first search
8 Depth-first search
- Agenda of search states: expand the search space incrementally, exploring the most recently generated state (tree) each time
- When you reach a state (tree) inconsistent with the input, backtrack to the most recent unexplored state (tree)
- Which node to expand? Leftmost or rightmost
- Which grammar rule to use? Order in the grammar
9 Top-down, depth-first, left-right strategy
- Initialize the agenda with an S tree and a pointer to the first word, and make this the current search state (cur)
- Loop until a successful parse or an empty agenda:
- Apply all applicable grammar rules to the leftmost unexpanded node of cur
- If this node is a POS category and matches that of the current input, push this onto the agenda
- Otherwise push the new trees onto the agenda
- Pop a new cur from the agenda (a recognizer along these lines is sketched below)
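A minimal sketch of such a recognizer in Python. The grammar and lexicon formats and all names are my own, not the lecture's; Python's call stack plays the role of the explicit agenda, backtracking when a POS fails to match. Note that a left-recursive rule would send this code into infinite recursion, exactly as discussed on slide 14:

```python
# Top-down, depth-first, left-to-right recognizer (a sketch).
# `symbols` is the frontier of unexpanded nodes; `pos` indexes the input.
def parse(symbols, words, pos, grammar, lexicon):
    if not symbols:
        return pos == len(words)          # success iff all input consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                  # non-terminal: try each rule in order
        return any(parse(list(rhs) + rest, words, pos, grammar, lexicon)
                   for rhs in grammar[first])
    # POS category: must match the current input word, else backtrack
    if pos < len(words) and first in lexicon.get(words[pos], ()):
        return parse(rest, words, pos + 1, grammar, lexicon)
    return False

grammar = {"S": [("NP", "VP")], "NP": [("Det", "N")], "VP": [("V", "NP")]}
lexicon = {"the": {"Det"}, "dog": {"N"}, "saw": {"V"}, "cat": {"N"}}
print(parse(["S"], "the dog saw the cat".split(), 0, grammar, lexicon))  # True
```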
10 Does this flight include a meal?
11 Bottom-up parsing
- The parser begins with the words of the input and builds up trees, applying grammar rules whose right-hand side matches (a sketch follows below)
- Parsing continues until an S root node is reached or no further node expansion is possible
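A naive bottom-up recognizer, sketched under the assumption that the input has already been mapped to POS tags; the rule format and names are mine. It searches depth-first over reductions, rewriting any right-hand-side match to its left-hand side (a grammar with cyclic unary rules would need a loop check):

```python
# Naive bottom-up recognizer (a sketch; input is a tuple of POS tags).
def bottom_up_recognize(symbols, rules, goal="S"):
    # rules: list of (lhs, rhs) pairs, e.g. ("VP", ("V", "NP"))
    if symbols == (goal,):
        return True
    for lhs, rhs in rules:                      # try every possible reduction
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:         # right-hand-side match
                reduced = symbols[:i] + (lhs,) + symbols[i + n:]
                if bottom_up_recognize(reduced, rules, goal):
                    return True
    return False                                # no further reduction possible

rules = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]
print(bottom_up_recognize(("Det", "N", "V", "Det", "N"), rules))  # True
```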
12 Bottom-up parsing
13 General issues
- Top-down parsers never explore illegal parses (e.g. ones that can't form an S) but waste time on trees that can never match the input
- Bottom-up parsers never explore trees inconsistent with the input but waste time exploring illegal parses (ones with no S root)
14 Problems with top-down parsing
- Left-recursion is a big problem for top-down parsers. With a rule such as
- NP → NP PP
- a simple depth-first search will keep expanding the NP forever.
- Ambiguity in natural language means that any sentence might have hundreds or thousands of possible parses. With no way to filter out any valid parse, simple bottom-up and top-down parsers simply have to compute all of the parses.
- There is no storage other than the agenda and the accumulated set of successful parses. This means that tree fragments might get rebuilt many times as the parser re-explores analyses for the same string.
- For British left waffles on Falkland Islands, the analysis of Falkland Islands is the same no matter which analysis is picked for the first part of the sentence.
15 Adding a bottom-up filter for top-down parsing
- Generate a left-corner table that includes all of the leftmost dependents of each non-terminal (a sketch of the construction follows below)
- Don't expand any non-terminal for which the leftmost word does not have a category in the non-terminal's left-corner table. E.g., don't explore NP if the leftmost POS is Aux.
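A sketch of how the left-corner table can be built by transitive closure over the leftmost symbol of each rule; the grammar format matches the earlier sketches and the function names are mine:

```python
# Build the left-corner table: lc[nt] = every symbol that can begin an nt.
def left_corner_table(grammar):
    lc = {nt: {rhs[0] for rhs in expansions}
          for nt, expansions in grammar.items()}
    changed = True
    while changed:                 # close under "left corner of a left corner"
        changed = False
        for nt in lc:
            for sym in list(lc[nt]):
                for deeper in lc.get(sym, ()):
                    if deeper not in lc[nt]:
                        lc[nt].add(deeper)
                        changed = True
    return lc

# The filter: don't expand `nt` unless the next word's POS can begin it.
def worth_expanding(nt, next_pos, lc):
    return next_pos == nt or next_pos in lc.get(nt, ())
```

With the toy GRAMMAR above, lc["S"] contains Aux and Det (via NP), so an S expansion survives when the next POS is Aux, but an NP expansion is pruned.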
16 Left-corner parsing
- [Diagram contrasting predicted and announced constituents]
17 Left-corner parsing
- Handles left-recursion because it waits until the leftmost child is completed before predicting the parent.
- The algorithm can be used to transform a grammar into a left-corner grammar, which can then be used with a regular top-down parser (see RS, pp. 141-142).
- But the algorithm is deterministic, and thus not well suited to natural language grammars.
19 Cocke-Younger-Kasami Algorithm
- A bottom-up algorithm
- Uses dynamic programming.
- JM don't discuss this until later in the book (Chapter 12), in the context of the discussion of probabilistic CFGs.
- Same for RS: we discuss it as a probabilistic method (that's Roark's fault).
- But originally it was developed as a non-probabilistic algorithm.
- I find it the easiest algorithm to understand.
20 What is dynamic programming?
- Answer: a class of algorithms that use tables to store solutions to subproblems of larger problems. Some examples in language processing:
- minimum edit distance
- CYK algorithm
- Earley algorithm
- Viterbi algorithm
- forward algorithm
- We'll return to several of these later on
21 Minimum edit distance
- Compute the minimum edit distance between cat and at according to the following criteria (Levenshtein distance):
- Substituting one letter for another costs 1 point
- Deleting a letter costs 1 point
- Inserting a letter costs 1 point
- Intuitively the right alignment is as follows, and costs 1 (one deletion/insertion):
- c a t
- - a t
- How to compute this efficiently?
22 Efficient algorithm (see JM p. 156 for pseudocode)
- Pad each string with a dummy symbol at the beginning (e.g. #).
- Create an n × m matrix, where n and m are the lengths of the padded strings.
- Seed the matrix at (0, 0) with distance (cost) 0.
- Loop over all columns i, loop over all rows j, assigning to cell (i, j) the distance
- d(i, j) = min of: d(i-1, j) + 1 (deletion), d(i, j-1) + 1 (insertion), and d(i-1, j-1) + 1 if the ith and jth symbols differ, or d(i-1, j-1) if they match (as implemented in the sketch below)
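The recipe above, implemented as a short Python sketch; variable names are mine. The substitution cost of 1 follows the slide's criteria (JM's pseudocode uses cost 2 for substitutions):

```python
# Levenshtein distance: d[i][j] = cost of turning s[:i] into t[:j].
def min_edit_distance(s, t):
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i                      # i deletions from s
    for j in range(1, m + 1):
        d[0][j] = j                      # j insertions into s
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution (or match)
    return d[n][m]

print(min_edit_distance("cat", "at"))  # 1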
23 The distance matrix for cat vs. at (each cell can be annotated with the partial alignment that achieves it, e.g. the bottom-right cell corresponds to cat aligned with -at, cost 1):

      #   a   t
  #   0   1   2
  c   1   1   2
  a   2   1   2
  t   3   2   1
24 Minimum edit distance
- Use back pointers to recover the cheapest path
- This algorithm was independently discovered seven times.
- String matching is very important in a lot of fields, including computational linguistics and computational biology
25 Back to the CYK algorithm
26 The CYK algorithm requires CNF grammars
- The rule
- VP → V NP PP
- must be converted, so we'll do the following (a sketch of this binarization step follows below):
- VP → V XX
- XX → NP PP
- We also need to do something with
- NP → N
- What we can do here is to allow the following as well:
- NP → British, left, waffles, Falkland, Islands
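A sketch of the binarization step in Python: rules with more than two right-hand-side symbols are split by introducing fresh non-terminals, like the XX above. The fresh-symbol naming and rule format are mine:

```python
# Split any rule with more than two RHS symbols into binary rules.
def binarize(rules):
    out, counter = [], 0
    for lhs, rhs in rules:
        while len(rhs) > 2:
            counter += 1
            fresh = f"X{counter}"            # fresh non-terminal, like XX
            out.append((lhs, (rhs[0], fresh)))
            lhs, rhs = fresh, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'X1')), ('X1', ('NP', 'PP'))]
```

(Unit rules like NP → N are not handled here; as the slide says, the workaround is to let NP rewrite directly to the words as well.)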
27 CYK algorithm
28 CYK algorithm
- For a more formal statement of the algorithm see:
- JM section 14.2
- RS pages 145-152
- but ignore the probabilistic aspect for now
- Question: do you see why the grammar needs to be in CNF for this to work? (A recognizer sketch follows below.)
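A minimal CYK recognizer in Python. The grammar format is an assumption of mine: binary rules as a map from a pair of categories to the set of parents, plus a lexicon from words to categories (lexical handling is simplified; unary chains are ignored). CNF is what lets the inner loop consider only one split point k per pair of daughters:

```python
from collections import defaultdict

def cyk_recognize(words, binary_rules, lexicon):
    n = len(words)
    chart = defaultdict(set)                  # chart[i, j]: cats over span i..j
    for i, w in enumerate(words):
        chart[i, i + 1] = set(lexicon[w])     # seed the diagonal with POS
    for length in range(2, n + 1):            # span length
        for i in range(0, n - length + 1):    # span start
            j = i + length
            for k in range(i + 1, j):         # split point
                for B in chart[i, k]:
                    for C in chart[k, j]:
                        chart[i, j] |= binary_rules.get((B, C), set())
    return "S" in chart[0, n]
```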
29 Example: CYK chart for British left waffles on Falkland Islands, where cell (i, j) holds the categories spanning words i through j:

       j=1        j=2        j=3        j=4   j=5        j=6
 i=0   N, A, NP   NP         NP         ---   S          S
 i=1              N, V, NP   NP         ---   VP, S, XX  S, VP
 i=2                         N, V, NP   ---   XX, VP     VP, XX
 i=3                                    P     PP         PP
 i=4                                          N, NP      NP
 i=5                                                     N, NP

       British(0-1) left(1-2) waffles(2-3) on(3-4) Falkland(4-5) Islands(5-6)
30 Notes
- What we have shown is actually just a recognizer, not a parser. For a parser we also need to extract the trees from the chart (a sketch follows below).
- The recognizer itself runs in polynomial time, more specifically cubic time in the length of the sentence:
- You have to loop over all i and j between 0 and n, and then over each k between i and j, where k determines the split of the span (i, j) into two parts.
- Extracting all possible trees is much worse: it can be exponential in time complexity.
- For each span, there might be several different ways of constructing the nodes at that span; each of those nodes might have several different ways of being constructed, and so forth.
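A sketch of tree extraction, assuming the CYK loop also recorded backpointers: back[(A, i, j)] holds (k, B, C) triples recording that A was built from B over (i, k) and C over (k, j). All names are illustrative, not from the lecture. The nested loops, multiplying every way of building a cell by every way of building its daughters, are exactly the source of the exponential blowup:

```python
# Extract every tree for category `cat` over span (i, j) from backpointers.
def extract_trees(back, cat, i, j, words):
    entries = back.get((cat, i, j))
    if not entries:                        # length-1 span: a lexical cell
        return [(cat, words[i])]
    trees = []
    for k, B, C in entries:                # every way this cell was built...
        for left in extract_trees(back, B, i, k, words):
            for right in extract_trees(back, C, k, j, words):
                trees.append((cat, left, right))   # ...times every sub-way
    return trees
```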