Parsing Techniques - PowerPoint PPT Presentation

1 / 31
About This Presentation
Title:

Parsing Techniques

Description:

Quiz (on your time): Find a general argument to show the grammar is not LL(k). 7. Example ... Quiz: What happens if a grammar is left-recursive such as S Sa | b? ... – PowerPoint PPT presentation

Number of Views:119
Avg rating:3.0/5.0
Slides: 32
Provided by: hong88
Category:

less

Transcript and Presenter's Notes

Title: Parsing Techniques


1
Chapter 12 - 3
  • Parsing Techniques

2
Section 12.3 Parsing Techniques
  • We know (via a theorem) that the context-free
    languages are exactly those languages that are
    accepted by PDAs.
  • When a context-free language can be recognized by
    a deterministic final-state PDA, it is called a
    deterministic context-free language.

3
LL(k) grammar
  • An LL(k) grammar has the property that a parser
    can be constructed to scan an input string from
    left to right and build a leftmost derivation by
    examining next k input symbols to determine the
    unique production for each derivation step.
  • If a language has an LL(k) grammar, it is called
    an LL(k) language.
  • LL(k) languages are deterministic context-free
    languages, but there are deterministic
    context-free languages that are not LL(k). (See
    text for an example on page 730.)

4
Example
  • Consider the language anb n ? N.
  • (1) It has the LL(1) grammar S ? aS b.
  • A parser can examine one input letter to decide
    whether to use S ? aS or S ? b for the next
    derivation step.
  • (2) It has the LL(2) grammar S ? aaS ab b.
  • A parser can examine two input letters to
    determine whether to use S ? aaS or S ? ab for
    the next derivation step. Notice that the grammar
    is not LL(1).
  • (3) Quiz. Find an LL(3) grammar that is not
    LL(2). Solution. S ? aaaS aab ab b.

5
Example/Quiz
  • Why is the following grammar
  • S ? AB
  • A ? aAb ?
  • B ? bB ?.
  • for anbn k n, k ? N an-LL(1) grammar?
  • Answer Any derivation starts with S ? AB. The
    next derivation step uses one of the productions
    A ? aAb or A ? ?, depending on whether the
    scanned input letter is a or not. The argument is
    the same for B-productions.

6
Example
  • The following grammar for anbn k n, k ? N
    is not LL(k) for every k.
  • S ? aSb T
  • T ? bT ?.
  • It is not LL(1) Let the input be ab. The first
    letter a tells us to start with S ? aSb.The
    letter a in aSb matches the first input letter,
    so we look ahead to the second input letter
    b.This tells us we must use S ? T to get S ? aSb
    ? aTb.
  • The look ahead remains at b, so we cant
    determine whether it is the last b of the input
    string.
  • So we dont know whether to choose T ? bT or T ?
    ? for the next step. Thus the grammar is not
    LL(1).
  • It is not LL(2) Let the input be aabb. The first
    two input letters aa, tell us S ? aSb.
  • The letter a in aSb matches the first input, so
    we look ahead to the next two-letter substring
    ab.We must use S ? aSb to get S ? aSb ? aaSbb.
  • Now the look ahead becomes bb, so we must use S
    ? T to get S ? aSb ? aaSbb ? aaTbb.
  • The lookahead remains at bb, so we cant
    determine whether these are the last two bs of
    the input string. So we dont know whether to
    choose T ? bT or T ? ? for the next step. Thus
    the grammar is not LL(2).
  • Quiz (on your time) Find a general argument to
    show the grammar is not LL(k).

7
Example
  • The language an k bn k, n ? N is not
    deterministic context-free. So it has no LL(k)
    grammar for every k.
  • Proof Any PDA for the language must keep a count
    of the as with the stack so that when the bs
    come along the stack can be popped with each b.
    But there might still be as on the stack (e.g.,
    when k gt 0), so there must be a nondeterministic
    state transition to a final state from the
    popping state. i.e., We need two instructions
    like, (i, b, a, pop, i) and (i, ?, a, pop,
    final).
  • Note The previous two examples show that ambn
    m n is LL(1) but ambn m n is not LL(k)
    for any k.

8
Grammar Transformations
  • Left factoring Sometimes we can left-factor an
    LL(k) grammar to obtain an equivalent LL(n)
    grammar where n lt k.
  • Example. The grammar S ? aaS ab b is LL(2)
    but not LL(1). But we can factor out the common
    prefix a from productions S ? aaS ab to obtain
  • S ? aT
  • T ? aS b.
  • This gives the new grammar
  • S ? aT b
  • T ? aS b.

9
Quiz
  • Find an LL(k) grammar where k is as small as
    possible that is equivalent to the following
    grammar.
  • S ? abS abcT ab
  • T ? cT c.
  • Solution The given grammar is LL(3) and it can
    be left-factored to become an LL(1) grammar as
    follows
  • S ? abR
  • R ? S cT ?
  • T ? cU
  • U ? T ?.

10
Removing left recursion
  • A left-recursive grammar is one which has a
    derivation of the form
  • A ? Ax
  • for some nonterminal A and sentential form x.

11
Left-recursive grammars are not LL(k) for any k
  • Example. The language ban n ? N has a grammar
    S ? Sa b, which is left-recursive. It is not
    LL(k) for any k. Consider the following cases
  • LL(1) case If the input string is ba, the
    lookahead is b. So we dont know whether there
    are any as to the right of b. Therefore we dont
    know which production to start the derivation.
  • LL(2) case If the input string is baa, the
    lookahead is ba. So the derivation starts with S
    ? Sa. But the a in Sa denotes the a at the right
    end of the derived string. So the input string
    could be ba or baa. Therefore we dont know which
    production to choose next.
  • LL(k) case If the input string is baa with k
    as, the lookahead is baa with k 1 as. So the
    derivation after k 1 steps is S ? Sa ? Saa ? ?
    Saa. Now, the input string could be baa (length
    k) or baa (length k 1). Therefore we dont
    know which production to choose next.

12
removing left recursion
  • Sometimes we can obtain an LL(k) grammar by
    removing left recursion.
  • Algorithm Idea for direct left recursion
  • Transform A ? Aw Au Av a b.
  • To A ? aB bB
  • B ? wB uB vB ?.
  • Example/Quiz. Remove left recursion.
  • S ? Sa b.
  • Solution. S ? bT
  • T ? aT ?.
  • It is LL(1).

13
Example/Quiz
  • Example/Quiz. Remove left recursion.
  • S ? Saa aab aac.
  • Solution S ? aabT aacT
  • T ? aaT ?.
  • It is LL(3).
  • Quiz Rewrite the previous solution as LL(1).
  • Solution S ? aaU
  • U ? bT cT
  • T ? aaT ?.

14
Example
  • Example (Removing indirect left recursion).
    Consider the following grammar.
  • S ? Ab a
  • A ? Sa b.
  • The grammar is left-recursive because of the
    indirect left recursion S ? Ab ? Sab.
  • To remove the indirect left recursion, replace A
    in S ? Ab by the right side of A ? Sa b to
    obtain S ? Sab bb. The grammar becomes S ? Sab
    bb a. Remove the left recursion
  • S ? bbT aT
  • T ? abT ?.

15
Quiz
  • Remove left recursion from the grammar
  • S ? Ab a
  • A ? SAa b.
  • Solution Replace A in S ? Ab a by the right
    side of A ? SAa b to obtain
  • S ? SAab bb a
  • A ? SAa b.
  • Now remove the direct left recursion
  • S ? bbT aT
  • T ? AabT ?
  • A ? SAa b.

16
Top-Down Parsing of LL Languages
  • LL(k) grammars have top-down parsing algorithms
    because a leftmost derivation can be constructed
    by starting at the start symbol and proceeding to
    the desired string.
  • Example/Quiz. Consider the following LL(1)
    grammar.
  • S ? aSC b
  • C ? cC d.
  • The string aabcdd has the following leftmost
    derivation, where each step is uniquely
    determined by the current lookahead symbol.
  • S ? aSC ? aaSCC ? aabCC ? aabcCC ? aabcdC ?
    aabcdd.

17
Recursive Descent LL(1) Parsing
  • A procedure is associated with each nonterminal.
    Well use the following procedure for LL(1)
    grammars to match a symbol with the lookahead
    symbol.
  • match(x) if lookahead x
  • then lookahead next input symbol
  • else error
  • fi.
  • Example. For the preceding example here are two
    possible recursive descent procedures
  • S if lookahead a then match(a) S C else
    match(b) fi.
  • C if lookahead c then matct(c) C else
    match(d) fi.

18
Quiz
  • Write recursive descent procedures for the
    following LL(1) grammar.
  • S ? aaM
  • M ? bT cT
  • T ? aaT ?.
  • Solution
  • S match(a) match(a) M.
  • M if lookahead b then match(b) T else
    match(c) T fi
  • T if lookahead a then match(a) match(a) T fi

19
Table-Driven LL(1) Parsing
  • Well start with an example showing how to use
    the table to parse a string. Then well discuss
    how to construct the table.
  • Example. Consider the following LL(1) grammar and
    its corresponding parse table.
  • Grammar S ? aSb ?.
  • a b
  • Table S

20
Parse of the string aabb
21
LL(1) Parse Table Construction
  • To construct the table we need two types of sets
    constructed from the grammar.
  • First Sets For a sentential form w the set
    First(w) consists of any terminal at the left end
    of w or at the left end of any sentential form
    derived from w, including ? if w is ? or w
    derives ?.
  • Example/Quiz. Given the grammar
  • S ? aS Tb
  • T ? cT ?.
  • Use the definition to find the First sets for
    aS, cT, T, Tb and S.

22
Algorithm to Construct First Sets
  • First(?) ?.
  • First(aw) First(a) a for a terminal a.
  • If A ? w1 wn, then First(A) First(w1) ?
    ? First(wn).
  • Case for First(Aw) where w ? ?
  • If ? ? First(A), then First(Aw) First(A).
  • If ? ? First(A), then First(Aw) (First(A) -
    ?) ? First(w).

23
Example/Quiz
  • Quiz What happens if a grammar is left-recursive
    such as S ? Sa b?
  • Answer The algorithm defines First(Sa) and
    First(S) in terms of each other.
  • Example/Quiz. Given the grammar
  • S ? abTU bUT
  • T ? aT b
  • U ? cU ?.
  • Find the First sets for T, U, TU, and UT.

24
Follow Sets
  • Follow Sets If A is a nonterminal, then
    Follow(A) is the set of terminals to the right of
    A in sentential forms derived from the start
    symbol S, where we include in Follow(S).
  • Example. Given the grammar S ? aSb c ?, we
    have Follow(S) , b.
  • Example/Quiz. Given the grammar
  • S ? aSb Tc T
  • T ? dT ?.
  • Find the Follow sets for S and T.
  • Solution
  • The derivation S ? aSb implies that b ?
    Follow(S). So we have Follow(S) , b.
  • The derivations S ? aSb ? aTb and S ? Tc imply
    that b, c ? Follow(T). The derivation S ? T
    implies that anything following S also follows T.
    So we have Follow(T) , b, c.

25
Algorithm to Construct Follow Sets
  • Algorithm to Construct Follow Sets (x and y
    denote sentential forms)
  • ? Follow(S).
  • If A ? xB, then Follow(A) ? Follow(B).
  • If A ? xBy, then (First(y) - ?) ? Follow(B).
  • If A ? xBy and ? ? First(y), then Follow(A) ?
    Follow(B).
  • Notice Follow(B) is constructed by examining the
    productions with B on the right side.

26
Example/Quiz
  • Given the grammar
  • S ? aTU bUT
  • T ? aT b
  • U ? cU ?.
  • Find the Follow sets for S, T, and U.
  • Solution
  • Follow(S) .
  • Follow(T) , c.
  • Follow(U) , a, b.

27
Algorithm to Construct LL(1) Parse Table
  • Let Px, y represent the contents of row x
    column y of the table. For each production A ? w,
    do the following actions.
  • (1) For each terminal a ? First(w) put A ? w in
    PA, a.
  • (2) If ? ? First(w) then for each symbol t ?
    Follow(A) put A ? w in PA, t.
  • Remark If the grammar has no ?-productions, then
    there is no need to construct Follow sets.

28
Example
  • Construct the parse table for the following LL(1)
    grammar.
  • S ? AB ?
  • A ? aAb ?
  • B ? bB c.
  • Solution Calculate the First sets for the
    right-hand sides of the productions
  • First(AB) (First(A) - ?) ? First(B) a ?
    b, c a, b, c.
  • First(?) ?, First(aAb) a, First(bB)
    b, First(c) c.
  • Calculate the Follow sets for the nonterminals
  • Follow(S) , Follow(A) b, c, Follow(B)
    .
  • Construct the table

29
LL(k) Facts and Notes
  • In 1969 Kurki-Suoni showed that the LL(k)
    languages form an infinite hierarchy
  • For each k there is an LL(k 1) language that
    is not LL(k).
  • Example. The language defined by the following
    grammar is LL(k 1) but has no LL(k) grammar,
    where aa stands for a k-length string of as.
  • S ? aSA ?
  • A ? aabS c.

30
Example/Quiz
  • Why is the following grammar LL(2) but not LL(1)?
  • S ? aSA ?
  • A ? abS c.
  • Answer Consider the string aab. A derivation
    must start with S ? aSA. Now the lookahead is at
    the second a in aab, but there are two choices to
    pick from S ? aSA and S ? ?. So the grammar is
    not LL(1). But two lookahead letters allow the
    derivation to see the substring ab or ac, so that
    S ? aSA can be chosen for the next step.

31
The End of Chapter 12 - 3
Write a Comment
User Comments (0)
About PowerShow.com