Title: The CYK Parsing Method
1The CYK Parsing Method
- Chiyo Hotani
- Tanya Petrova
- CL2 Parsing Course
- 28 November, 2007
2Overview
- CYK Recognition with CF grammar
- Basic Algorithm
- Problems: unit rules, ε-rules
- Recognition with a grammar in CNF
- CYK Parsing with CNF
- Parsing with CNF
- Recognition Table
- Chart Parsing
- Summary
- Advantages and Disadvantages
- Other remarks
3Basic Algorithm of CYK Recognition (1)
- Example Grammar
- A grammar describing numbers in scientific notation
- Input: 32.5e+1
4Basic Algorithm of CYK Recognition (2)
Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Sign -> + | -
derivations of substrings of length 1
5Basic Algorithm of CYK Recognition (3)
- Number_S -> Integer | Real
- Integer -> Digit | Integer Digit
- Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- derivations of substrings of length 1
- Unit rule: a rule of the form A -> B, where A and B are non-terminals. A derivation can contain chains of unit rules.
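The effect of unit rules on the length-1 step can be sketched as follows. This is a minimal illustration, assuming the grammar is stored as a Python dict from non-terminal to a list of right-hand-side tuples; that representation and the function name `derives_single` are choices made here, not taken from the slides. Starting from the non-terminals that derive the symbol directly, unit rules are applied repeatedly until nothing new is added.

```python
# Fragment of the example grammar that matters for single symbols.
# The dict-of-tuples representation is an assumption made for this sketch.
GRAMMAR = {
    "Number": [("Integer",), ("Real",)],
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Digit": [(d,) for d in "0123456789"],
    "Sign": [("+",), ("-",)],
}


def derives_single(symbol, grammar):
    """Return the non-terminals that derive the one-symbol string `symbol`."""
    # Direct derivations: rules of the form A -> symbol.
    found = {lhs for lhs, rhss in grammar.items() if (symbol,) in rhss}
    # Follow chains of unit rules A -> B until no new non-terminal appears.
    changed = True
    while changed:
        changed = False
        for lhs, rhss in grammar.items():
            if lhs in found:
                continue
            if any(len(rhs) == 1 and rhs[0] in found for rhs in rhss):
                found.add(lhs)
                changed = True
    return found


print(sorted(derives_single("3", GRAMMAR)))  # ['Digit', 'Integer', 'Number']
```

The repeat-until-stable loop is what the chain Digit, Integer, Number requires: each pass can only add non-terminals whose unit rule points at something already found.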
6Basic Algorithm of CYK Recognition (4)
- Number_S -> Integer | Real
- Integer -> Digit | Integer Digit
- Fraction -> . Integer
- Scale -> e Sign Integer | Empty
7Basic Algorithm of CYK Recognition (5)
- Number_S -> Integer | Real
- Real -> Integer Fraction Scale
Number does indeed derive 32.5e+1.
8Basic Algorithm of CYK Recognition (6)
ε-rules
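With ε-rules present, the recognizer needs to know which non-terminals can derive the empty string. A minimal sketch of that computation, assuming the same dict-of-tuples grammar representation as an illustration (an empty tuple stands for an ε right-hand side; the function name `nullable` is hypothetical):

```python
# Example grammar; "Empty" derives the empty string.
# The empty tuple () encodes an epsilon right-hand side (assumed encoding).
GRAMMAR = {
    "Number": [("Integer",), ("Real",)],
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Real": [("Integer", "Fraction", "Scale")],
    "Fraction": [(".", "Integer")],
    "Scale": [("e", "Sign", "Integer"), ("Empty",)],
    "Digit": [(d,) for d in "0123456789"],
    "Sign": [("+",), ("-",)],
    "Empty": [()],
}


def nullable(grammar):
    """Return the set of non-terminals that derive the empty string."""
    result = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhss in grammar.items():
            if lhs in result:
                continue
            # A rule qualifies when every symbol on its right-hand side is
            # already known to derive epsilon; all() over () is True.
            if any(all(sym in result for sym in rhs) for rhs in rhss):
                result.add(lhs)
                changed = True
    return result


print(sorted(nullable(GRAMMAR)))  # ['Empty', 'Scale']
```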
9Basic Algorithm of CYK Recognition (7)
- R_ε = {Empty, Scale} (the non-terminals that derive ε)
- Sentence: z = z_1 z_2 ... z_n
- Substring of z starting at position i, of length l: s_{i,l} = z_i z_{i+1} ... z_{i+l-1}
- R_{s_{i,l}}: the set of non-terminals deriving the substring s_{i,l}
A graphical presentation of substrings
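The substring notation can be made concrete with a short sketch. Positions are 1-based as on the slide; the helper names `substring` and `all_substrings` are introduced here purely for illustration.

```python
def substring(z, i, l):
    """Return s_{i,l}: the substring of z starting at 1-based position i, of length l."""
    return z[i - 1 : i - 1 + l]


def all_substrings(z):
    """Map every valid (i, l) pair to s_{i,l}, shortest substrings first."""
    n = len(z)
    return {
        (i, l): substring(z, i, l)
        for l in range(1, n + 1)
        for i in range(1, n - l + 2)
    }


z = "32.5e+1"
print(substring(z, 2, 4))      # 2.5e
print(len(all_substrings(z)))  # 28, i.e. n(n+1)/2 for n = 7
```

Since a sentence of length n has n(n+1)/2 non-empty substrings, the recognizer has that many sets R_{s_{i,l}} to fill in.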
10CYK recognition with a grammar in CNF
- Required restrictions
- Eliminate ε-rules and unit rules
- Limit the maximum length of RHS of the rule to 2
- CNF
- No ε-rules or unit rules
- All rules have one of the following two forms: A -> a, A -> BC
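The two permitted rule forms are easy to check mechanically. The following is a minimal sketch, assuming a grammar stored as a dict from non-terminal to a list of right-hand-side tuples, and assuming that every symbol that is not a key of the dict is a terminal; both are conventions chosen for this example.

```python
def is_cnf(grammar):
    """Check that every rule has the form A -> a or A -> B C."""
    nonterminals = set(grammar)
    for rhss in grammar.values():
        for rhs in rhss:
            # A -> a: exactly one terminal symbol.
            if len(rhs) == 1 and rhs[0] not in nonterminals:
                continue
            # A -> B C: exactly two non-terminals.
            if len(rhs) == 2 and all(sym in nonterminals for sym in rhs):
                continue
            # Anything else (epsilon rule, unit rule, longer or mixed
            # right-hand side) violates CNF.
            return False
    return True


with_unit_rule = {
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Digit": [("0",), ("1",)],
}
without_unit_rule = {
    "Integer": [("0",), ("1",), ("Integer", "Digit")],
    "Digit": [("0",), ("1",)],
}
print(is_cnf(with_unit_rule))     # False: Integer -> Digit is a unit rule
print(is_cnf(without_unit_rule))  # True
```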
11Our example grammar in CNF
12CYK Parsing with CNF
- Building the recognition table
- Input
- Our example grammar in CNF
- Input sentence: 32.5e+1
13CYK Parsing with the CNF
- Bottom row: read directly from the grammar (rules of the form A -> a)
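Filling the bottom row can be sketched as a simple lookup. The terminal rules below are an assumption: the text of the slides does not reproduce the CNF grammar, so the table follows a standard CNF conversion of the example grammar, with T2 as the helper non-terminal for "e" (a name the slides do use) and T1 as a hypothetical helper for ".".

```python
# Assumed terminal rules (A -> a) of the CNF grammar; T1 is a hypothetical name.
TERMINAL_RULES = {
    "Number": set("0123456789"),
    "Integer": set("0123456789"),
    "Digit": set("0123456789"),
    "T1": {"."},
    "T2": {"e"},
    "Sign": {"+", "-"},
}


def bottom_row(sentence, terminal_rules):
    """Return, for each position, the non-terminals with a rule A -> symbol."""
    return [
        {lhs for lhs, terminals in terminal_rules.items() if ch in terminals}
        for ch in sentence
    ]


for ch, cell in zip("32.5e+1", bottom_row("32.5e+1", TERMINAL_RULES)):
    print(ch, sorted(cell))
```

No closure loop is needed here: with unit rules eliminated, one pass over the rules of the form A -> a gives the complete sets for length 1.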
14Two Ways to Compute R_{s_{i,l}}
- check each right-hand side
- compute possible right-hand sides from the
recognition table
15How this is done
- Example: 2.5e (= s_{2,4})
- 1) N1 is not in R_{s_{2,1}} or R_{s_{2,2}}
- N1 is a member of R_{s_{2,3}}
- But Scale is not a member of R_{s_{5,1}}
- 2) R_{s_{2,4}} is the set of non-terminals that have a right-hand side AB where either
- A in R_{s_{2,1}} and B in R_{s_{3,3}}, or
- A in R_{s_{2,2}} and B in R_{s_{4,2}}, or
- A in R_{s_{2,3}} and B in R_{s_{5,1}}
- Possible combinations: N1 T2 or Number T2
- In our grammar we do not have such a right-hand side, so nothing is added to R_{s_{2,4}}.
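The second method, applied to s_{2,4}, can be sketched as below. Both the binary rules and the smaller table entries are assumptions for illustration: the rules follow a standard CNF conversion of the example grammar (with N2 as a hypothetical helper name), and the entries are what that grammar yields for the shorter substrings of 32.5e+1.

```python
# Assumed binary rules (A -> B C), indexed by right-hand side.
BINARY_RULES = {
    ("Integer", "Digit"): {"Number", "Integer"},
    ("Integer", "Fraction"): {"Number", "N1"},
    ("N1", "Scale"): {"Number"},
    ("T1", "Integer"): {"Fraction"},
    ("N2", "Integer"): {"Scale"},
    ("T2", "Sign"): {"N2"},
}

# Already computed entries R[(i, l)] needed for s_{2,4} = "2.5e".
R = {
    (2, 1): {"Number", "Integer", "Digit"},  # "2"
    (3, 3): set(),                           # ".5e"
    (2, 2): set(),                           # "2."
    (4, 2): set(),                           # "5e"
    (2, 3): {"Number", "N1"},                # "2.5"
    (5, 1): {"T2"},                          # "e"
}


def combine(table, i, l, binary_rules):
    """Compute R_{s_{i,l}} from the entries for shorter substrings."""
    result = set()
    for k in range(1, l):  # split into s_{i,k} and s_{i+k,l-k}
        for a in table[(i, k)]:
            for b in table[(i + k, l - k)]:
                result |= binary_rules.get((a, b), set())
    return result


print(combine(R, 2, 4, BINARY_RULES))  # set(): nothing derives "2.5e"
```

Only the split after three symbols produces candidate pairs (Number T2 and N1 T2), and neither is a right-hand side in the assumed grammar.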
16Recognition table
(Figure: recognition table, indexed by substring length l and start position i.)
17As a result we find out that
- This process is much less complicated than the
one we saw before
18Reasons
- We do not have to repeat the process again and again until no new non-terminals are added to R_{s_{i,l}}
- (The substrings we are dealing with are really substrings and cannot be equal to the string we start with)
- We only have to find one place where the substring must be split into two: A -> B C
(Figure: the split point between B and C is marked.)
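Put together, recognition with a CNF grammar reduces to one pass over the substrings in order of increasing length. The following is a minimal sketch under stated assumptions: the CNF grammar is a standard conversion of the example grammar (the helper names T1 and N2 are hypothetical; N1, T2 and Scale appear in the slides), and the function names are chosen for this example.

```python
# Assumed CNF grammar, split into terminal rules and binary rules.
DIGITS = set("0123456789")
TERMINAL_RULES = {
    "Number": DIGITS, "Integer": DIGITS, "Digit": DIGITS,
    "T1": {"."}, "T2": {"e"}, "Sign": {"+", "-"},
}
BINARY_RULES = {
    ("Integer", "Digit"): {"Number", "Integer"},
    ("Integer", "Fraction"): {"Number", "N1"},
    ("N1", "Scale"): {"Number"},
    ("T1", "Integer"): {"Fraction"},
    ("N2", "Integer"): {"Scale"},
    ("T2", "Sign"): {"N2"},
}


def cyk_table(sentence):
    """Build the recognition table R[(i, l)] with 1-based start positions."""
    n = len(sentence)
    table = {}
    # Length 1: read directly from the rules A -> a.
    for i in range(1, n + 1):
        ch = sentence[i - 1]
        table[(i, 1)] = {a for a, ts in TERMINAL_RULES.items() if ch in ts}
    # Longer substrings: each entry depends only on strictly shorter ones,
    # so a single pass suffices and no repeat-until-stable loop is needed.
    for l in range(2, n + 1):
        for i in range(1, n - l + 2):
            cell = set()
            for k in range(1, l):  # position of the split
                for a in table[(i, k)]:
                    for b in table[(i + k, l - k)]:
                        cell |= BINARY_RULES.get((a, b), set())
            table[(i, l)] = cell
    return table


def recognizes(sentence, start="Number"):
    """Return True if the start symbol derives the whole sentence."""
    if not sentence:
        return False  # a CNF grammar without epsilon-rules derives no empty string
    return start in cyk_table(sentence)[(1, len(sentence))]


print(recognizes("32.5e+1"))  # True
print(recognizes("2.5e"))     # False
```

The three nested loops over length, start position and split point give the familiar cubic running time in the sentence length for a fixed grammar.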
19Chart Parsing
A chart is just a recognition table.
20 A short retrospective of CYK
- First: recognition table using the original grammar.
- Then: transforming the grammar to CNF.
21 A short retrospective of CYK cont.
- CNF is useful for improving the efficiency, but it is actually a bit too restrictive
- Disadvantage of CNF
- The resulting recognition table lacks the information we need to construct a derivation using the original grammar!
22 A short retrospective of CYK cont.
- In the transformation process, some non-terminals were thrown away
- (non-productive)
- Missing information could be added.
23 A short retrospective of CYK cont.
- Result: almost the same recognition table.
- Extra information on non-terminals
- Obtained in a simpler and much more efficient
way.
24Thank you for your attention!