LING 406 Intro to Computational Linguistics: Parsing 1

1
LING 406 Intro to Computational Linguistics: Parsing 1
  • Richard Sproat
  • URL: http://catarina.ai.uiuc.edu/L406_08/

2
This Lecture
  • Some Context-free Parsing Algorithms
  • Simple top-down/bottom-up parsing
  • Problems
  • Deterministic left-corner parsing
  • Cocke-Younger-Kasami chart parsing

3
Parsing
  • Parsing is the recovery of structure for a string
    given a grammar.
  • Parsing is a search problem. (So are finite-state
    operations such as composition.)
  • Find the right route to generating parse tree(s)
    amongst all possible routes.
  • Different parsing algorithms have different
    advantages and disadvantages and, especially,
    different time complexity.

4
Basic approaches
  • Top-Down Parsers start at the top of the
    grammar and predict constituents.
  • Bottom-Up Parsers start with the input words and
    build constituents.
  • You might think that one is more intuitively
    correct than the other, but in fact both have
    their drawbacks.

5
Simple top-down algorithm
  • Builds from the root S node to the leaves
  • Assuming we build all trees in parallel
  • Find all trees with root S (or all rules with
    left-hand side S)
  • Next expand all constituents in these
    trees/rules
  • Continue until leaves are parts of speech (POS)
  • Candidate trees failing to match POS of input
    string are rejected
  • This describes breadth-first search.
  • In depth-first search you keep expanding rules
    until you reach a terminal, and then when that
    succeeds (or fails) you backtrack and search
    other rules.

6
Simple top-down algorithm
7
Schematic breadth-first search
8
Depth-first search
  • An agenda of search states expands the search space
    incrementally, exploring the most recently generated
    state (tree) each time
  • When you reach a state (tree) inconsistent with the
    input, backtrack to the most recent unexplored state
    (tree)
  • Which node to expand? Leftmost or rightmost.
  • Which grammar rule to use? Order in the grammar.

9
Top-down, depth-first, left-right strategy
  • Initialize the agenda with an S tree and a pointer to
    the first word, and make this the current search state
    (cur)
  • Loop until a successful parse or an empty agenda:
  • Apply all applicable grammar rules to the leftmost
    unexpanded node of cur
  • If this node is a POS category and matches that
    of the current input word, push this onto the agenda
  • Otherwise push the new trees onto the agenda
  • Pop a new cur from the agenda
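The loop above can be sketched as a recursive backtracking parser: recursion plays the role of the agenda, with the call stack holding the unexplored states. The toy grammar, POS set, and sentence below are illustrative assumptions, not from the lecture.

```python
# Top-down, depth-first, left-to-right search (sketch).
# Toy grammar and sentence are assumptions for illustration.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
POS = {"Det", "N", "V"}  # preterminal (part-of-speech) categories

def parse(symbols, words):
    """Try to expand `symbols` to cover `words`; True on success."""
    if not symbols:
        return not words            # succeed only if input is consumed
    first, rest = symbols[0], symbols[1:]
    if first in POS:                # match a POS leaf against the input
        if words and words[0][1] == first:
            return parse(rest, words[1:])
        return False                # mismatch: backtrack
    # nonterminal: try each rule in grammar order, backtracking on failure
    return any(parse(rhs + rest, words) for rhs in GRAMMAR[first])

# Input as (word, POS) pairs, e.g. for "the dog bit the cat"
sent = [("the", "Det"), ("dog", "N"), ("bit", "V"),
        ("the", "Det"), ("cat", "N")]
print(parse(["S"], sent))  # True
```

Note that with a left-recursive rule such as NP → NP PP this search never terminates, which is exactly the problem discussed on a later slide.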

10
Does this flight include a meal?
11
Bottom-up parsing
  • The parser begins with the words of the input and
    builds up trees, applying grammar rules whose
    right-hand sides match
  • Parsing continues until an S root node is reached or
    no further node expansion is possible
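The brute-force idea can be sketched as follows: repeatedly replace any substring of categories matching a rule's right-hand side, succeeding when only S remains. The toy grammar is an assumption; practical bottom-up parsers are shift-reduce or chart-based rather than this exhaustive search.

```python
# Naive bottom-up search over reductions (sketch, toy grammar assumed).

GRAMMAR = [                       # (lhs, rhs) pairs
    ("S",  ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
]

def bottom_up(symbols):
    """Exhaustively search reductions of the symbol list to S."""
    if symbols == ["S"]:
        return True
    for lhs, rhs in GRAMMAR:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if tuple(symbols[i:i + n]) == rhs:
                # reduce the matched span and keep searching
                if bottom_up(symbols[:i] + [lhs] + symbols[i + n:]):
                    return True
    return False

# POS sequence for "the dog bit the cat"
print(bottom_up(["Det", "N", "V", "Det", "N"]))  # True
```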

12
Bottom-up parsing
13
General issues
  • Top-Down parsers never explore illegal parses
    (e.g. ones that can't form an S) but waste time on
    trees that can never match the input
  • Bottom-Up parsers never explore trees
    inconsistent with the input but waste time
    exploring illegal parses (ones with no S root)

14
Problems with top-down parsing
  • Left-recursion is a big problem for top-down
    parsers. With a rule such as
  • NP → NP PP
  • a simple depth-first search will keep
    expanding the NP forever.
  • Ambiguity in natural language means that any
    sentence might have hundreds or thousands of
    possible parses. With no way to filter out any
    valid parse, simple bottom-up and top-down
    parsers simply have to compute all of the
    parses.
  • There is no storage other than the agenda and the
    accumulated set of successful parses. This means
    that tree fragments might get rebuilt many times
    as the parser re-explores analyses for the same
    string.
  • For "British left waffles on Falkland
    Islands", the analysis of "Falkland Islands" is the
    same no matter which analysis is picked for the
    first part of the sentence.

15
Adding a bottom-up filter for top-down parsing
  • Generate a left-corner table that includes all of
    the leftmost dependents of non-terminals
  • Don't expand any non-terminal for which the
    leftmost word does not have a category in the
    non-terminal's left-corner table. E.g., don't
    explore NP if the leftmost POS is Aux.
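Building such a left-corner table is a small fixed-point computation over the grammar: seed each non-terminal with the first symbols of its rules, then close under "left corner of a left corner". The grammar below is an assumed toy example.

```python
# Sketch of building the left-corner table described above.
# Toy grammar is an assumption for illustration.

GRAMMAR = {
    "S":  [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "N"], ["NP", "PP"]],
    "VP": [["V", "NP"]],
    "PP": [["P", "NP"]],
}

def left_corners(grammar):
    """Map each nonterminal to everything that can start it (closure)."""
    table = {nt: {rhs[0] for rhs in rules} for nt, rules in grammar.items()}
    changed = True
    while changed:                  # iterate to a fixed point
        changed = False
        for nt, corners in table.items():
            for c in list(corners):
                extra = table.get(c, set()) - corners
                if extra:           # inherit corners of corners
                    corners |= extra
                    changed = True
    return table

lc = left_corners(GRAMMAR)
print(lc["S"])            # e.g. {'NP', 'Aux', 'Det'} (set order varies)
# The filter: don't predict NP when the next POS is Aux,
# because Aux is not a left corner of NP.
print("Aux" in lc["NP"])  # False
```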

16
Left-corner parsing
(figure: predicted vs. announced constituents)
17
Left-corner parsing
  • Handles left-recursion because it waits until the
    leftmost child is completed before predicting the
    parent.
  • The algorithm can be used to transform a grammar into
    a left-corner grammar, which can then be used with a
    regular top-down parser (see RS, pp. 141-142).
  • But the algorithm is deterministic, and thus not well
    suited to natural language grammars.

19
Cocke-Younger-Kasami Algorithm
  • A bottom-up algorithm
  • Uses dynamic programming.
  • JM don't discuss this until later in the book
    (Chapter 12), in the context of probabilistic CFGs.
  • Same for RS: we discuss it as a probabilistic
    method (that's Roark's fault).
  • But it was originally developed as a
    non-probabilistic algorithm
  • I find it the easiest algorithm to understand

20
What is dynamic programming?
  • Answer: a class of algorithms that use tables to
    store solutions to subproblems of larger
    problems. Some examples in language processing:
  • minimum edit distance
  • CYK algorithm
  • Earley algorithm
  • Viterbi algorithm
  • forward algorithm
  • We'll return to several of these later on

21
Minimum edit distance
  • Compute the minimum edit distance between "cat" and
    "at" according to the following criteria
    (Levenshtein distance):
  • Substituting one letter for another costs 1
    point
  • Deleting a letter costs 1 point
  • Inserting a letter costs 1 point
  • Intuitively the right alignment is as follows,
    and costs 1 (one deletion/insertion):
  • c a t
  • - a t
  • How can we compute this efficiently?

22
Efficient algorithm (see JM p. 156 for pseudocode)
  • Pad each string with a dummy symbol (e.g. "-") at the
    beginning.
  • Create an n × m matrix, where n and m are the
    lengths of the padded strings.
  • Seed the matrix at [0,0] with distance (cost) 0.
  • Loop over all columns i and all rows j, assigning to
    [i,j] the minimum of: the cost at [i-1,j] plus 1
    (deletion), the cost at [i,j-1] plus 1 (insertion),
    and the cost at [i-1,j-1] plus the substitution cost
    (0 if the two symbols match, 1 otherwise).
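The fill loop can be written directly; this is a sketch of the standard Levenshtein computation with the unit costs stated above.

```python
# Minimum edit distance by dynamic programming:
# pad, seed [0,0] with 0, fill each cell from its three neighbours.

def min_edit_distance(source, target):
    n, m = len(source) + 1, len(target) + 1   # padded lengths
    d = [[0] * m for _ in range(n)]
    for i in range(1, n):                     # first column: all deletions
        d[i][0] = i
    for j in range(1, m):                     # first row: all insertions
        d[0][j] = j
    for i in range(1, n):
        for j in range(1, m):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # delete
                          d[i][j - 1] + 1,        # insert
                          d[i - 1][j - 1] + sub)  # substitute / match
    return d[n - 1][m - 1]

print(min_edit_distance("cat", "at"))  # 1: delete the initial "c"
```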

23
The distance matrix for "cat" (rows, its prefixes) against
"at" (columns, its prefixes), both padded with "-"; each
cell holds the minimum cost of aligning the two prefixes:

          -    a    t
     -    0    1    2
     c    1    1    2
     a    2    1    2
     t    3    2    1

The bottom-right cell gives the minimum edit distance, 1,
for the alignment that deletes "c" and matches "a t".
24
Minimum edit distance
  • Use back pointers to recover the cheapest path
  • This algorithm was independently discovered seven
    times.
  • String matching is very important in a lot of
    fields including computational linguistics and
    computational biology

25
Back to CYK algorithm
26
CYK algorithm requires CNF grammars
  • The rule
  • VP → V NP PP
  • must be converted. So we'll do the following:
  • VP → V XX
  • XX → NP PP
  • We also need to do something with
  • NP → N
  • What we can do here is allow the following
    as well:
  • NP → British, left, waffles, Falkland,
    Islands
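The binarization step, introducing fresh symbols like XX to break rules with more than two right-hand-side symbols into binary rules, can be sketched mechanically. The helper name and the X1, X2, ... numbering scheme below are assumptions for illustration.

```python
import itertools

_fresh = itertools.count(1)  # supplies fresh symbol numbers

def binarize(lhs, rhs):
    """Split lhs -> X1 ... Xk (k > 2) into a chain of binary rules."""
    rules = []
    while len(rhs) > 2:
        new = f"X{next(_fresh)}"          # fresh symbol, like XX above
        rules.append((lhs, (rhs[0], new)))
        lhs, rhs = new, rhs[1:]           # remainder hangs off the fresh symbol
    rules.append((lhs, tuple(rhs)))
    return rules

print(binarize("VP", ["V", "NP", "PP"]))
# [('VP', ('V', 'X1')), ('X1', ('NP', 'PP'))]
```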

27
CYK algorithm
28
CYK algorithm
  • For a more formal statement of the algorithm
    see:
  • JM section 14.2
  • RS pages 145-152
  • but ignore the probabilistic aspects for now
  • Question: do you see why the grammar needs to be
    in CNF for this to work?
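With the grammar in CNF, the recognizer is compact: every span of length > 1 can only be built from exactly two smaller spans, so a triple loop over span ends and split points suffices. The CNF grammar and lexicon below are simplified assumptions chosen so the example sentence parses; they are not the lecture's exact grammar.

```python
from collections import defaultdict

# CYK recognizer sketch (grammar/lexicon assumed, not the lecture's).
LEX = {                               # POS -> words (lexical rules)
    "N":  {"British", "left", "waffles", "Falkland", "Islands"},
    "A":  {"British"}, "V": {"left", "waffles"}, "P": {"on"},
    "NP": {"British", "left", "waffles", "Falkland", "Islands"},
}
BIN = [                               # binary rules, lhs -> (B, C)
    ("S",  ("NP", "VP")), ("S",  ("NP", "V")),
    ("VP", ("V", "XX")),  ("VP", ("V", "NP")),
    ("XX", ("NP", "PP")), ("PP", ("P", "NP")),
    ("NP", ("NP", "NP")), ("NP", ("A", "NP")),
]

def cyk(words):
    n = len(words)
    chart = defaultdict(set)          # chart[i, j]: categories over span i..j
    for i, w in enumerate(words):     # length-1 spans from the lexicon
        chart[i, i + 1] = {cat for cat, ws in LEX.items() if w in ws}
    for length in range(2, n + 1):    # longer spans, shortest first
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):           # every split of [i, j]
                for lhs, (b, c) in BIN:
                    if b in chart[i, k] and c in chart[k, j]:
                        chart[i, j].add(lhs)
    return "S" in chart[0, n]         # recognize: S spans the whole input

print(cyk("British left waffles on Falkland Islands".split()))  # True
```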

29
Example
CYK chart for "British left waffles on Falkland Islands"
(string positions 0-6). The length-1 spans hold each
word's lexical categories:

    British  [0,1]: N, A, NP
    left     [1,2]: N, V, NP
    waffles  [2,3]: N, V, NP
    on       [3,4]: P
    Falkland [4,5]: N, NP
    Islands  [5,6]: N, NP

Longer spans are filled bottom-up, e.g. NP over [4,6]
("Falkland Islands"), PP over [3,6] ("on Falkland
Islands"), XX and VP over the spans ending at 6, and
finally S over the whole span [0,6].
30
Notes
  • What we have shown is actually just a recognizer,
    not a parser. For a parser we also need to
    extract the trees from the chart.
  • The recognizer itself runs in polynomial time,
    more specifically cubic time in the length of the
    sentence:
  • You have to loop over all i and j between 0 and
    n, and then over each k between i and j, where k
    determines the split of the span [i, j] into two
    parts.
  • Extracting all possible trees is much worse: it
    can be exponential in time complexity.
  • For each span, there might be several different
    ways of constructing the nodes at that span; each
    of those nodes might have several different ways
    of being constructed, and so forth.