Context-Free Languages - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Context-Free Languages
Programming Language Translators
  • Prepared by
  • Manuel E. Bermúdez, Ph.D.
  • Associate Professor
  • University of Florida

2
Context-Free Grammars
  • Definition: A context-free grammar (CFG) is a quadruple G = (Φ, Σ, P, S), where all productions are of the form A → α, with A ∈ Φ and α ∈ (Φ ∪ Σ)*.
  • Left-most derivation: at each step, the left-most nonterminal is rewritten.
  • Right-most derivation: at each step, the right-most nonterminal is rewritten.

3
(No Transcript)
4
Derivation Trees
  • Derivation trees describe rewrites, independently of the order (left-most or right-most).
  • Each tree branch matches a production rule in the grammar.

5
(No Transcript)
6
Derivation Trees (cont'd)
  • Notes
  • Leaves are terminals.
  • Bottom contour is the sentence.
  • Left recursion causes left branching.
  • Right recursion causes right branching.

7
Goals of Parsing
  • Examine input string, determine whether it's
    legal.
  • Equivalent to building derivation tree.
  • Added benefit tree embodies syntactic structure
    of input.
  • Therefore, tree should be unique.

8
Grammar Ambiguity
  • Definition: A CFG is ambiguous if there exist two different right-most (or left-most, but not both) derivations for some sentence z.
  • (Equivalent) Definition: A CFG is ambiguous if there exist two different derivation trees for some sentence z.

9
Ambiguous Grammars
  • Classic ambiguities:
  • Simultaneous left/right recursion:
  •   E → E + E
  • Dangling else problem:
  •   S → if E then S
  •     | if E then S else S

10
(No Transcript)
11
Grammar Reduction
  • What language does this grammar generate?
  • S → a        D → EDBC
  • A → BCDEF    E → CBA
  • B → ASDFA    F → S
  • C → DDCF
  • L(G) = { a }
  • Problem: many nonterminals (and productions) cannot be used in the generation of any sentence.

12
Grammar Reduction
  • Definition: A CFG is reduced iff for all A ∈ Φ:
  •   a) S ⇒* αAβ, for some α, β ∈ V*
  •      (we say A is generable), and
  •   b) A ⇒* z, for some z ∈ Σ*
  •      (we say A is terminable).
  • G is reduced iff every nonterminal A is both generable and terminable.

13
Grammar Reduction
  • Example:   S → BB      A → aA
  •             B → bB       | a
  • B is not terminable, since B ⇒* z holds for no z ∈ Σ*.
  • A is not generable, since S ⇒* αAβ holds for no α, β ∈ V*.

14
Grammar Reduction
  • To find out which nonterminals are generable:
  • Build the graph (Φ, δ), where (A, B) ∈ δ iff A → αBβ is a production.
  • Check that all nodes are reachable from S.

15
Grammar Reduction
  • Example:   S → BB      A → aA
  •             B → bB       | a
  • A is not reachable from S, so A is not generable.
  • (Graph: edges S → B, B → B, A → A; the node A is disconnected from S.)
16
Grammar Reduction
  • Algorithmically:
  •   Generable := { S }
  •   while (Generable changes) do
  •     for each A → αBβ do
  •       if A ∈ Generable then
  •         Generable := Generable ∪ { B }
  •   od
  • Now, Generable contains the nonterminals that are generable.
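
For concreteness, here is a minimal Python sketch of this fixed-point computation; the dict-of-productions encoding of the grammar is my own assumption, not notation from the slides.

    # Minimal sketch: compute the generable nonterminals by fixed-point iteration.
    # A grammar is a dict mapping each nonterminal to a list of right parts,
    # where each right part is a list of symbols (strings).
    def generable(grammar, start):
        gen = {start}
        changed = True
        while changed:                      # repeat until nothing changes
            changed = False
            for a, right_parts in grammar.items():
                if a not in gen:
                    continue
                for rhs in right_parts:     # A -> X1 ... Xn with A generable
                    for x in rhs:
                        if x in grammar and x not in gen:   # x is a nonterminal
                            gen.add(x)
                            changed = True
        return gen

    # Example from the slides: S -> BB, B -> bB, A -> aA | a.
    g = {"S": [["B", "B"]], "B": [["b", "B"]], "A": [["a", "A"], ["a"]]}
    print(generable(g, "S"))   # {'S', 'B'}: A is not generable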

17
Grammar Reduction
  • To find out which nonterminals are terminable:
  • Build the graph (2^Φ, δ), where (N, N ∪ {A}) ∈ δ iff
  •   A → X1 ... Xn is a production, and for all i, either Xi ∈ Σ or Xi ∈ N.
  • Check that the node Φ (the set of all nonterminals) is reachable from the node ∅ (the empty set).

18
Grammar Reduction
  • Example:   S → BB      A → aA
  •             B → bB       | a
  • The node {S, A, B} is not reachable from ∅: only {A} is reachable from ∅. Thus S and B are not terminable.

19
Grammar Reduction
  • Algorithmically:
  •   Terminable := ∅
  •   while (Terminable changes) do
  •     for each A → X1 ... Xn do
  •       if every nonterminal among the X's is in Terminable then
  •         Terminable := Terminable ∪ { A }
  •   od
  • Now, Terminable contains the nonterminals that are terminable.
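
The companion Python sketch for terminability, under the same assumed grammar encoding as before:

    # Minimal sketch: compute the terminable nonterminals by fixed-point iteration.
    def terminable(grammar):
        term = set()
        changed = True
        while changed:
            changed = False
            for a, right_parts in grammar.items():
                if a in term:
                    continue
                for rhs in right_parts:
                    # A -> X1 ... Xn is usable once every nonterminal Xi is terminable
                    if all(x in term or x not in grammar for x in rhs):
                        term.add(a)
                        changed = True
                        break
        return term

    g = {"S": [["B", "B"]], "B": [["b", "B"]], "A": [["a", "A"], ["a"]]}
    print(terminable(g))   # {'A'}: S and B are not terminable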

20
Grammar Reduction
  • Reducing a grammar:
  • Find all generable nonterminals.
  • Find all terminable nonterminals.
  • Remove any production A → X1 ... Xn if either
  •   a) A is not generable, or
  •   b) any Xi is not terminable.
  • If the new grammar is not reduced, repeat the process.

21
Grammar Reduction
  • Example:   E → E + T      F → not F
  •              | T          Q → P / Q
  •            T → F * T      P → (E)
  •              | P            | i
  • Generable: E, T, F, P.   Not generable: Q.
  • Terminable: P, T, E.     Not terminable: F, Q.
  • So, eliminate every production for Q, and every production whose right part contains either F or Q.

22
Grammar Reduction
  • New Grammar:
  •   E → E + T
  •     | T
  •   T → P
  •   P → (E)
  •     | i
  • Generable: E, T, P.   Terminable: P, T, E.
  • Now the grammar is reduced.

23
Operator Precedence and Associativity
  • Let's build a CFG for expressions consisting of:
  • the elementary identifier i.
  • + and - (binary ops) have lowest precedence, and are left associative.
  • * and / (binary ops) have middle precedence, and are right associative.
  • + and - (unary ops) have highest precedence, and are right associative.

24
Sample Grammar for Expressions
  • E → E + T        E consists of T's,
  •   | E - T        separated by +'s and -'s
  •   | T            (lowest precedence).
  • T → F * T        T consists of F's,
  •   | F / T        separated by *'s and /'s
  •   | F            (next precedence).
  • F → - F          F consists of a single P,
  •   | + F          preceded by +'s and -'s
  •   | P            (next precedence).
  • P → '(' E ')'    P consists of a parenthesized E,
  •   | i            or a single i (highest precedence).

25
Operator Precedence and Associativity (cont'd)
  • Operator Precedence
  • The lower in the grammar, the higher the
    precedence.
  • Operator Associativity
  • left recursion in the grammar means left
    associativity of the operator, and causes left
    branching in the tree.
  • right recursion in the grammar means right
    associativity of the operator, and causes right
    branching in the tree.

26
Building Derivation Trees
  • Sample Input
  • - i - i ( i i ) / i i
  • (Human) derivation tree construction
  • Bottom-up.
  • On each pass, scan entire expression, process
    operators with highest precedence (parentheses
    are highest).
  • Lowest precedence operators are last, at the top
    of tree.

27
(No Transcript)
28
Operator Precedence and Associativity
  • Exercise
  • Write a grammar for expressions that consists
    of
  • elementary identifier i.
  • , , are next (left associative)
  • , , are next (right associative)
  • @, ! have highest precedence (left associative).
  • Parentheses override precedence and associativity.

29
Precedence and Associativity
  • Grammar:
  •   E0 → E0 E1
  •      | E0 E1
  •      | E0 E1
  •      | E1
  •   E1 → E2 E1
  •      | E2 E1
  •      | E2
  •   E2 → E2 @ E3
  •      | E2 ! E3
  •      | E3
  •   E3 → (E0)
  •      | i

30
Operator Precedence and Associativity
  • Example Construct the derivation tree for
  • i i @ i i ( i i i ! ) ( i i ) i @ i
  • Easier to construct the tree from the leaves to
    the root.
  • On each pass, scan the entire expression, and
    process first the operators with highest
    precedence.
  • Leave operators with lowest precedence for last.

31
Derivation Tree
32
Transduction Grammars
  • Definition: A transduction grammar (a.k.a. syntax-directed translation scheme) is like a CFG, except for the following generalization:
  • Each production is a triple (A, β, γ) ∈ Φ × V* × V*, called a translation rule, denoted A → β => γ, where
  •   A is the left part,
  •   β is the right part, and
  •   γ is the translation part.

33
Sample Transduction Grammar
  • Translation of infix to postfix expressions.
  • E → E + T  =>  E T +
  •   | T      =>  T
  • T → P * T  =>  P T *
  •   | P      =>  P
  • P → (E)    =>  E        Note: ()'s are discarded.
  •   | i      =>  i
  • The translation part describes how the output is generated, as the input is derived.

34
Sample Transduction Grammar
  • We keep track of a pair (α, β), where α and β are the sentential forms of the input and the output.
  •   (E, E)
  •   ⇒ (E + T, E T +)
  •   ⇒ (T + T, T T +)
  •   ⇒ (P + T, P T +)
  •   ⇒ (i + T, i T +)
  •   ⇒ (i + P * T, i P T * +)
  •   ⇒ (i + i * T, i i T * +)
  •   ⇒ (i + i * i, i i i * +)

35
String to Tree Transduction
  • Transduction to Abstract Syntax Trees
  • Notation: < N t1 ... tn > denotes the tree with root N and subtrees t1, ..., tn.
  • String-to-tree transduction grammar:
  • E → E + T  =>  < + E T >
  •   | T      =>  T
  • T → P * T  =>  < * P T >
  •   | P      =>  P
  • P → (E)    =>  E
  •   | i      =>  i

36
String to Tree Transduction
  • Example
  •   (E, E)
  •   ⇒ (E + T, < + E T >)
  •   ⇒ (T + T, < + T T >)
  •   ⇒ (P + T, < + P T >)
  •   ⇒ (i + T, < + i T >)
  •   ⇒ (i + P * T, < + i < * P T > >)
  •   ⇒ (i + i * T, < + i < * i T > >)
  •   ⇒ (i + i * P, < + i < * i P > >)
  •   ⇒ (i + i * i, < + i < * i i > >)
  • (Resulting tree: a + node with children i and a * node, whose children are i and i.)
37
String to Tree Transduction
  • Definition: A transduction grammar is simple if, for every rule A → α => β, the sequence of nonterminals appearing in α is identical to the sequence appearing in β.
  • Example:  E → E + T  =>  < + E T >
  •             | T      =>  T
  •           T → P * T  =>  < * P T >
  •             | P      =>  P
  •           P → (E)    =>  E
  •             | i      =>  i

38
String to Tree Transduction
  • For notational convenience, we dispense with both
    the nonterminals and the tree notation in the
    translation parts, leaving
  • E → E + T  =>  '+'
  •   | T
  • T → P * T  =>  '*'
  •   | P
  • P → (E)
  •   | i      =>  'i'       Look familiar?

39
Abstract Syntax Trees
  • AST is a condensed version of the derivation
    tree.
  • No noise (intermediate nodes).
  • Result of simple String-to-tree transduction
    grammar.
  • Rules are of the form A → α => 's'.
  • Build an 's' tree node, with one child per tree from each nonterminal in α.
  • We transduce from the vocabulary of input symbols (which appear in α) to the vocabulary of tree node names.

40
Sample AST
Input - i - i ( i i ) / i i
(Slide shows the grammar G, the derivation tree DT, and the AST for this input.)
41
The Game of Syntactic Dominoes
  • The grammar
  • E → E + T    T → P * T    P → (E)
  •   | T          | P          | i
  • The playing pieces An arbitrary supply of each
    piece (one per grammar rule).
  • The game board
  • Start domino at the top.
  • Bottom dominoes are the "input".

42
(No Transcript)
43
Parsing: The Game of Syntactic Dominoes (cont'd)
  • Game rules
  • Add game pieces to the board.
  • Match the flat parts and the symbols.
  • Lines are infinitely elastic.
  • Object of the game
  • Connect start domino with the input dominoes.
  • Leave no unmatched flat parts.

44
Parsing Strategies
  • Same as for the game of syntactic dominoes.
  • Top-down parsing start at the start symbol,
    work toward the input string.
  • Bottom-up parsing start at the input string,
    work towards the goal symbol.
  • In either strategy, the input can be processed left-to-right or right-to-left.

45
Top-Down Parsing
  • Attempt a left-most derivation, by predicting the
    re-write that will match the remaining input.
  • Use a string (a stack, really) from which the
    input can be derived.

46
Top-Down Parsing
  • Start with S on the stack.
  • At every step, two alternatives
  • α (the stack) begins with a terminal t: match t against the first input symbol.
  • α begins with a nonterminal A: consult an OPF (omniscient parsing function) to determine which production for A would lead to a match with the first symbol of the input.
  • The OPF does the predicting in such a predictive parser.

47
(No Transcript)
48
Classical Top-Down Parsing Algorithm
  • Push(Stack, S)
  • while not Empty(Stack) do
  •   if Top(Stack) ∈ Σ
  •   then if Top(Stack) = Head(input)
  •        then input := Tail(input)
  •             Pop(Stack)
  •        else Error(Stack, input)
  •   else P := OPF(Stack, input)
  •        Push(Pop(Stack), RHS(P))
  • od
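
A small Python rendering of this driver, assuming the prediction function is given as a dictionary mapping (nonterminal, lookahead) pairs to right parts; "$" stands in here for the end-of-input marker, and all names are illustrative.

    # Minimal sketch of the classical top-down (predictive) parsing driver.
    def parse(tokens, opf, nonterminals, start):
        stack = [start]
        pos = 0
        tokens = tokens + ["$"]               # "$" plays the role of the end marker
        while stack:
            top = stack.pop()
            if top in nonterminals:
                rhs = opf.get((top, tokens[pos]))
                if rhs is None:
                    raise SyntaxError(f"no prediction for ({top}, {tokens[pos]})")
                stack.extend(reversed(rhs))   # push the RHS, first symbol on top
            elif top == tokens[pos]:          # terminal on top: match the input
                pos += 1
            else:
                raise SyntaxError(f"expected {top}, found {tokens[pos]}")
        return pos == len(tokens) - 1         # everything except the end marker consumed

    # Grammar S -> A, A -> bAd | (empty), as in a later example.
    opf = {("S", "b"): ["A"], ("S", "$"): ["A"],
           ("A", "b"): ["b", "A", "d"], ("A", "d"): [], ("A", "$"): []}
    print(parse(list("bbdd"), opf, {"S", "A"}, "S"))   # True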

49
(No Transcript)
50
Top-Down Parsing (contd)
  • Most parsing methods impose bounds on the amount
    of stack lookback and input lookahead. For
    programming languages, a common choice is (1,1).
  • We must define OPF (A,t), where A is the top
    element of the stack, and t is the first symbol
    on the input.
  • Storage requirements: O(n²), where n is the size of the grammar vocabulary (a few hundred).

51
Top-Down Parsing
  • OPF(A, t) = A → α if either
  •   1. α ⇒* tβ, for some β, or
  •   2. α ⇒* λ, and S ⇒* γAδtβ, for some γ, δ, β, where δ ⇒* λ.
52
Top-Down Parsing
  • Example (illustrating 1):   S → A        B → b
  •                             A → BAd      C → c
  •                               | C
  • OPF        b          c          d
  •   B      B → b      B → b      B → b
  •   C      C → c      C → c      C → c
  •   S      S → A      S → A      S → A
  •   A      A → BAd    A → C      ???
  • OPF(A, b) = A → BAd, because BAd ⇒* bAd.
  • OPF(A, c) = A → C, because C ⇒* c.
  • i.e., B begins with b, and C begins with c.

Tan entries are optional. So is the ??? entry.
53
Top-Down Parsing
  • Example (illustrating 2):   S → A        A → bAd
  •                                            | λ
  • OPF        b           d          ⊥
  •   S      S → A                  S → A
  •   A      A → bAd     A → λ      A → λ
  • OPF(S, b) = S → A, because A ⇒* bAd.
  • OPF(S, d) = --------, because no derivation from S begins with d.
  • OPF(S, ⊥) = S → A, because S⊥ is legal (S ⇒* λ).
  • OPF(A, b) = A → bAd, because A ⇒* bAd.
  • OPF(A, d) = A → λ, because S ⇒* bAd (d can follow A).
  • OPF(A, ⊥) = A → λ, because S⊥ ⇒* A⊥.

54
Top-Down Parsing
  • Definitions:
  •   First(A) = { t | A ⇒* tα, for some α }
  •   Follow(A) = { t | S ⇒* αAtβ, for some α, β }
  • Computing First sets:
  • 1. Build the graph (Φ, δ), where (A, B) ∈ δ if B → αAγ, α ⇒* λ  (then First(A) ⊆ First(B)).
  • 2. Attach to each node an empty set of terminals.
  • 3. Add t to the set for A if A → αtβ, α ⇒* λ.
  • 4. Propagate the elements of the sets along the edges of the graph.
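
A minimal Python sketch of the same computation, done as a direct fixed point rather than the explicit graph; the grammar encoding (dict of productions, with [] for the empty right part) is an assumption carried over from the earlier sketches.

    # Minimal sketch: First sets and the nullable set, by fixed-point iteration.
    def first_sets(grammar):
        nullable = set()
        first = {a: set() for a in grammar}
        changed = True
        while changed:
            changed = False
            for a, right_parts in grammar.items():
                for rhs in right_parts:
                    all_nullable = True
                    for x in rhs:
                        if x in grammar:                      # nonterminal
                            if not first[x] <= first[a]:
                                first[a] |= first[x]
                                changed = True
                            if x not in nullable:
                                all_nullable = False
                                break
                        else:                                 # terminal
                            if x not in first[a]:
                                first[a].add(x)
                                changed = True
                            all_nullable = False
                            break
                    if all_nullable and a not in nullable:
                        nullable.add(a)
                        changed = True
        return first, nullable

    # Grammar from the slides: S -> ABCD, A -> CDA | a | λ, B -> BC | b, C -> A, D -> AC.
    g = {"S": [["A", "B", "C", "D"]], "A": [["C", "D", "A"], ["a"], []],
         "B": [["B", "C"], ["b"]], "C": [["A"]], "D": [["A", "C"]]}
    fs, nul = first_sets(g)
    print(nul)        # {'A', 'C', 'D'}
    print(fs["S"])    # {'a', 'b'}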

55
Top-Down Parsing
  • Example:   S → ABCD     A → CDA     C → A
  •             B → BC         | a      D → AC
  •               | b          | λ
  • Nullable: A, C, D.
  • (First-set graph over S, A, B, C, D: First(B) = {b}; First(A) = First(C) = First(D) = {a}; First(S) = {a, b}. White = after step 3, tan = after step 4.)
56
Top-Down Parsing
  • Computing Follow sets:
  • 1. Build the graph (Φ, δ), where (A, B) ∈ δ if A → αBγ, γ ⇒* λ.
  •    Then Follow(A) ⊆ Follow(B), because any symbol X that follows A also follows B.
57
Top-Down Parsing
  • 2. Attach to each node an empty set of terminals. Add ⊥ to the set for the start symbol.
  • 3. Add First(X) to the set for A (i.e. to Follow(A)) if B → αAγXδ, γ ⇒* λ.
  • 4. Propagate the elements of the sets along the edges of the graph.
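
And a matching Python sketch for Follow sets, again as a direct fixed point; it assumes the first_sets helper and grammar encoding from the previous sketch, and uses "$" for the end marker ⊥.

    # Minimal sketch: Follow sets by fixed-point iteration, given First and nullable.
    END = "$"

    def follow_sets(grammar, start, first, nullable):
        follow = {a: set() for a in grammar}
        follow[start].add(END)
        def first_of(seq):                       # First of a suffix X(i+1) ... Xn
            out, eps = set(), True
            for x in seq:
                out |= first[x] if x in grammar else {x}
                if x in grammar and x in nullable:
                    continue
                eps = False
                break
            return out, eps
        changed = True
        while changed:
            changed = False
            for b, right_parts in grammar.items():
                for rhs in right_parts:
                    for i, x in enumerate(rhs):
                        if x not in grammar:     # only nonterminals get Follow sets
                            continue
                        tail_first, tail_eps = first_of(rhs[i + 1:])
                        add = tail_first | (follow[b] if tail_eps else set())
                        if not add <= follow[x]:
                            follow[x] |= add
                            changed = True
        return follow

    fl = follow_sets(g, "S", fs, nul)
    print(fl["B"])   # {'a', '$'}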

58
Top-Down Parsing
  • Example:   S → ABCD     A → CDA     C → A
  •             B → BC         | a      D → AC
  •               | b          | λ
  • Nullable: A, C, D.       First(S) = {a, b}
  •                          First(C) = {a}
  •                          First(A) = {a}
  •                          First(D) = {a}
  •                          First(B) = {b}
  • (Follow-set graph over S, A, B, C, D; white = after step 3, tan = after step 4.)
59
Top-Down Parsing
  • So,
  •   Follow(S) = { ⊥ }
  •   Follow(A) = Follow(C) = Follow(D) = { a, b, ⊥ }
  •   Follow(B) = { a, ⊥ }

60
Top-Down Parsing
  • Back to parsing.
  • We want OPF(A, t) = A → α if either
  •   1. t ∈ First(α), i.e. α ⇒* tβ, or
  •   2. α ⇒* λ and t ∈ Follow(A), i.e. S ⇒* γAδ ⇒* γAtβ.
61
Top-Down Parsing
  • Definition: Select(A → α) =
  •   First(α) ∪ (if α ⇒* λ then Follow(A) else ∅)
  • So PT(A, t) = A → α if t ∈ Select(A → α).
  • It is a Parse Table, rather than an OPF, because it isn't omniscient.
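
Putting the pieces together, here is a small Python sketch that computes Select sets, fills the parse table PT, and reports LL(1) conflicts; it assumes the first_sets and follow_sets helpers (and the grammar g) from the earlier sketches.

    # Minimal sketch: Select sets and the LL(1) parse table.
    def build_ll1_table(grammar, start):
        first, nullable = first_sets(grammar)
        follow = follow_sets(grammar, start, first, nullable)
        table, conflicts = {}, []
        for a, right_parts in grammar.items():
            for rhs in right_parts:
                # Select(A -> alpha) = First(alpha), plus Follow(A) if alpha =>* empty
                select, eps = set(), True
                for x in rhs:
                    select |= first[x] if x in grammar else {x}
                    if x in grammar and x in nullable:
                        continue
                    eps = False
                    break
                if eps:
                    select |= follow[a]
                for t in select:
                    if (a, t) in table:          # two productions in one cell
                        conflicts.append((a, t))
                    table[(a, t)] = rhs
        return table, conflicts

    table, conflicts = build_ll1_table(g, "S")
    print(bool(conflicts))   # True: the example grammar is not LL(1)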

62
Top-Down Parsing
  • Example:  First(S) = {a, b}    Follow(S) = {⊥}
  •           First(A) = {a}       Follow(A) = {a, b, ⊥}
  •           First(B) = {b}       Follow(B) = {a, ⊥}
  •           First(C) = {a}       Follow(C) = {a, b, ⊥}
  •           First(D) = {a}       Follow(D) = {a, b, ⊥}
  • Grammar and Select sets:

  S → ABCD   {a, b}        B → BC    {b}
  A → CDA    {a, b, ⊥}       | b     {b}
    | a      {a}           C → A     {a, b, ⊥}
    | λ      {a, b, ⊥}     D → AC    {a, b, ⊥}
Grammar is not LL(1).
63
Top-Down Parsing
Non-LL(1) grammar: multiple entries in the PT.
  •            a                        b                      ⊥
  •   S    S → ABCD                  S → ABCD
  •   A    A → CDA, A → a, A → λ     A → CDA, A → λ         A → CDA, A → λ
  •   B                              B → BC, B → b
  •   C    C → A                     C → A                  C → A
  •   D    D → AC                    D → AC                 D → AC

64
LL(1) Grammars
  • Definition: A CFG G is LL(1)
  •   (Left-to-right, Left-most, (1)-symbol lookahead)
  • iff for all A ∈ Φ, and for all pairs of productions A → α, A → β with α ≠ β,
  •   Select(A → α) ∩ Select(A → β) = ∅.
  • The previous example grammar is not LL(1).
  • More later on what to do about it.

65
Sample LL(1) Grammar
  • S → A       {b, ⊥}
  • A → bAd     {b}
  •   | λ       {d, ⊥}

Disjoint! Grammar is LL(1)!
        d         b          ⊥
  S               S → A      S → A
  A   A → λ       A → bAd    A → λ
One production per entry.
66
Example
  • Build the LL(1) parse table for the following
    grammar.
  • S → begin SL end   {begin}
  •   | id := E ;      {id}
  • SL → SL S          {begin, id}
  •    | S             {begin, id}
  • E → E + T          {(, id}
  •   | T              {(, id}
  • T → P * T          {(, id}
  •   | P              {(, id}
  • P → (E)            {(}
  •   | id             {id}

Not disjoint - the grammar is not LL(1).
67
(No Transcript)
68
Example (cont'd)
  • Lemma: Left recursion always produces a non-LL(1) grammar (e.g., SL, E above).
  • Proof: Consider   A → Aα     Select ⊇ First(β) (or Follow(A), if β ⇒* λ)
  •                     | β      Select ⊇ First(β) (or Follow(A), if β ⇒* λ)
  •   so the two Select sets always intersect.

69
Problems with our Grammar
  • SL is left recursive.
  • E is left recursive.
  • T → P * T and T → P both begin with the same sequence of symbols (P).

70
Solution to Problem 3
  • Change   T → P * T   {(, id}
  •            | P        {(, id}
  • to        T → P X     {(, id}
  •           X → * T     {*}
  •             | λ       {+, ;, )} = Follow(X)
  • Follow(X) = Follow(T)     due to T → P X
  •           = Follow(E)     due to E → E + T and E → T
  •           = {+, ;, )}     due to E → E + T, S → id := E ;, and P → (E)

Disjoint!
71
Solution to Problem 3 (cont'd)
  • In general, change
  •   A → αβ1
  •     | αβ2
  •     . . .
  •     | αβn
  • to   A → α X
  •      X → β1
  •        . . .
  •        | βn

Hopefully all the β's begin with different symbols.
72
Solution to Problems 1 and 2
  • We want (((T + T) + T) + T), i.e. left branching.
  • Instead, we will get T (+T) (+T) (+T), i.e. right branching.
  • Change   E → E + T   {(, id}
  •            | T       {(, id}
  • to        E → T Y    {(, id}
  •           Y → + T Y  {+}
  •             | λ      {;, )}
  • Follow(Y) = Follow(E)
  •           = {;, )}

It no longer contains +, because we eliminated the production E → E + T.
73
Solution to Problems 1 and 2 (cont'd)
  • In general,
  • change   A → Aα1        A → β1
  •            . . .          . . .
  •            | Aαn          | βm
  • to       A → β1 X       X → α1 X
  •            . . .          . . .
  •            | βm X         | αn X
  •                           | λ
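
A small Python sketch of this rewrite (immediate left recursion only), using the same assumed grammar encoding as the earlier sketches; the fresh nonterminal name "A_tail" is my own convention.

    # Minimal sketch: replace A -> A a1 | ... | A an | b1 | ... | bm
    # by A -> b1 X | ... | bm X and X -> a1 X | ... | an X | (empty).
    def remove_left_recursion(grammar):
        out = {}
        for a, right_parts in grammar.items():
            rec = [rhs[1:] for rhs in right_parts if rhs and rhs[0] == a]    # the alphas
            non = [rhs for rhs in right_parts if not rhs or rhs[0] != a]     # the betas
            if not rec:
                out[a] = right_parts
                continue
            x = a + "_tail"                                  # fresh nonterminal (assumed unused)
            out[a] = [beta + [x] for beta in non]
            out[x] = [alpha + [x] for alpha in rec] + [[]]   # the final [] is the empty production
        return out

    # SL -> SL S | S becomes SL -> S SL_tail, SL_tail -> S SL_tail | (empty).
    print(remove_left_recursion({"SL": [["SL", "S"], ["S"]]}))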

74
Solution to Problems 1 and 2 (cont'd)
  • In our example,
  • change   SL → SL S   {begin, id}
  •            | S       {begin, id}
  • to        SL → S Z   {begin, id}
  •           Z → S Z    {begin, id}
  •             | λ      {end}

75
Modified Grammar
  • S → begin SL end   {begin}
  •   | id := E ;      {id}
  • SL → S Z           {begin, id}
  • Z → S Z            {begin, id}
  •   | λ              {end}
  • E → T Y            {(, id}
  • Y → + T Y          {+}
  •   | λ              {;, )}
  • T → P X            {(, id}
  • X → * T            {*}
  •   | λ              {+, ;, )}
  • P → (E)            {(}
  •   | id             {id}

Disjoint. Grammar is LL(1).
76
(No Transcript)
77
(No Transcript)
78
Recursive Descent Parsing
  • Top-down parsing strategy, suitable for LL(1)
    grammars.
  • One procedure per nonterminal.
  • Contents of stack embedded in recursive call
    sequence.
  • Each procedure commits to one production, based
    on the next input symbol, and the select sets.
  • Good technique for hand-written parsers.

79
Sample Recursive Descent Parser
  • proc S                    S → begin SL end
  •                             | id := E ;
  •   case Next_Token of
  •     T_begin: Read(T_begin)
  •              SL
  •              Read(T_end)
  •     T_id:    Read(T_id)
  •              Read(T_:=)
  •              E
  •              Read(T_;)
  •     otherwise: Error
  •   end
  • end

Read(T_X) verifies that the upcoming token is X, and consumes it.
Next_Token is the upcoming token.
80
Sample Recursive Descent Parser
  • proc SL                   SL → S Z
  •   S
  •   Z
  • end
  • proc E                    E → T Y
  •   T
  •   Y
  • end

Technically, we should have insisted that Next_Token be either T_begin or T_id, but S will do that anyway. Checking early would aid error recovery.
// Ditto for T_( and T_id.
81
Sample Recursive Descent Parser
  • proc Z                    Z → S Z
  •                             | λ
  •   case Next_Token of
  •     T_begin, T_id: S
  •                    Z
  •     T_end:         (nothing to do)
  •     otherwise: Error
  •   end
  • end

82
Sample Recursive Descent Parser
Could have used a case statement
  • proc Y                    Y → + T Y
  •                             | λ
  •   if Next_Token = T_+ then
  •     Read(T_+)
  •     T
  •     Y
  • end
  • proc T                    T → P X
  •   P
  •   X
  • end

Could have checked for T_( and T_id.
83
Sample Recursive Descent Parser
  • proc X                    X → * T
  •                             | λ
  •   if Next_Token = T_* then
  •     Read(T_*)
  •     T
  • end

84
Sample Recursive Descent Parser
  • proc P                    P → (E)
  •                             | id
  •   case Next_Token of
  •     T_(:  Read(T_()
  •           E
  •           Read(T_))
  •     T_id: Read(T_id)
  •     otherwise: Error
  •   end
  • end
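
For concreteness, here is a compact Python version of the same recursive descent recognizer (recognition only, no output). The token spellings ":=" and ";" follow the reconstruction used above and are assumptions, as is the token-list interface.

    # Minimal sketch of a recursive descent recognizer for the modified grammar:
    # S -> begin SL end | id := E ;   SL -> S Z   Z -> S Z | λ
    # E -> T Y   Y -> + T Y | λ   T -> P X   X -> * T | λ   P -> ( E ) | id
    class Parser:
        def __init__(self, tokens):
            self.toks = tokens + ["$"]
            self.pos = 0
        def next(self):                    # Next_Token
            return self.toks[self.pos]
        def read(self, t):                 # Read(T_t): verify and consume the upcoming token
            if self.next() != t:
                raise SyntaxError(f"expected {t}, found {self.next()}")
            self.pos += 1
        def S(self):
            if self.next() == "begin":
                self.read("begin"); self.SL(); self.read("end")
            elif self.next() == "id":
                self.read("id"); self.read(":="); self.E(); self.read(";")
            else:
                raise SyntaxError("S: expected 'begin' or 'id'")
        def SL(self):
            self.S(); self.Z()
        def Z(self):                       # Z -> S Z | λ
            if self.next() in ("begin", "id"):
                self.S(); self.Z()
        def E(self):
            self.T(); self.Y()
        def Y(self):                       # Y -> + T Y | λ
            if self.next() == "+":
                self.read("+"); self.T(); self.Y()
        def T(self):
            self.P(); self.X()
        def X(self):                       # X -> * T | λ
            if self.next() == "*":
                self.read("*"); self.T()
        def P(self):
            if self.next() == "(":
                self.read("("); self.E(); self.read(")")
            else:
                self.read("id")

    Parser(["begin", "id", ":=", "(", "id", "+", "id", ")", "*", "id", ";", "end"]).S()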

85
String-To-Tree Transduction
  • Can obtain derivation or abstract syntax tree.
  • Tree can be generated top-down, or bottom-up.
  • We will show how to obtain
  • Derivation tree top-down
  • AST for the original grammar, bottom-up.

86
TD Generation of Derivation Tree
  • In each procedure, and for each alternative,
    write out the appropriate production AS SOON AS
    IT IS KNOWN

87
TD Generation of Derivation Tree
  • proc S                    S → begin SL end
  •                             | id := E ;
  •   case Next_Token of
  •     T_begin: Write(S → begin SL end)
  •              Read(T_begin)
  •              SL
  •              Read(T_end)

88
TD Generation of Derivation Tree
  •     T_id: Write(S → id := E ;)
  •           Read(T_id)
  •           Read(T_:=)
  •           E
  •           Read(T_;)
  •     otherwise: Error
  •   end
  • end

89
TD Generation of Derivation Tree
  • proc SL                   SL → S Z
  •   Write(SL → S Z)
  •   S
  •   Z
  • end
  • proc E                    E → T Y
  •   Write(E → T Y)
  •   T
  •   Y
  • end

90
TD Generation of Derivation Tree
  • proc Z                    Z → S Z
  •                             | λ
  •   case Next_Token of
  •     T_begin, T_id: Write(Z → S Z)
  •                    S
  •                    Z
  •     T_end:         Write(Z → λ)
  •     otherwise: Error
  •   end
  • end

91
TD Generation of Derivation Tree
  • proc Y                    Y → + T Y
  •                             | λ
  •   if Next_Token = T_+ then
  •     Write(Y → + T Y)
  •     Read(T_+)
  •     T
  •     Y
  •   else Write(Y → λ)
  • end

92
TD Generation of Derivation Tree
  • proc T                    T → P X
  •   Write(T → P X)
  •   P
  •   X
  • end
  • proc X                    X → * T
  •                             | λ

93
TD Generation of Derivation Tree
  •   if Next_Token = T_* then
  •     Write(X → * T)
  •     Read(T_*)
  •     T
  •   else Write(X → λ)
  • end

94
TD Generation of Derivation Tree
  • proc P                    P → (E)
  •                             | id
  •   case Next_Token of
  •     T_(:  Write(P → (E))
  •           Read(T_()
  •           E
  •           Read(T_))
  •     T_id: Write(P → id)
  •           Read(T_id)
  •     otherwise: Error
  •   end

95
Notes
  • The placement of the Write statements is obvious
    precisely because the grammar is LL(1).
  • Can build the tree as we go, or have it built
    by a post-processor.

96
Example
  • Input String:
  •   begin id := (id + id) * id ; end
  • Output:

S → begin SL end, SL → S Z, S → id := E ;, E → T Y, T → P X, P → (E), E → T Y, T → P X,
P → id, X → λ, Y → + T Y, T → P X, P → id, X → λ, Y → λ, X → * T, T → P X, P → id,
X → λ, Y → λ, Z → λ
97
(No Transcript)
98
Bottom-up Generation of the Derivation Tree
  • We could have placed the write statements at the
    END of each phrase, instead of the beginning. If
    we do, the tree will be generated bottom-up.
  • In each procedure, and for each alternative, write out the production A → α AFTER α is parsed.

99
BU Generation of the Derivation Tree
  • proc S                    S → begin SL end
  •                             | id := E ;
  •   case Next_Token of
  •     T_begin: Read(T_begin)
  •              SL
  •              Read(T_end)
  •              Write(S → begin SL end)
  •     T_id:    Read(T_id)
  •              Read(T_:=)
  •              E
  •              Read(T_;)
  •              Write(S → id := E ;)
  •     otherwise: Error
  •   end

100
BU Generation of the Derivation Tree
  • proc SL                   SL → S Z
  •   S
  •   Z
  •   Write(SL → S Z)
  • end
  • proc E                    E → T Y
  •   T
  •   Y
  •   Write(E → T Y)
  • end

101
BU Generation of the Derivation Tree
  • proc Z                    Z → S Z
  •                             | λ
  •   case Next_Token of
  •     T_begin, T_id: S
  •                    Z
  •                    Write(Z → S Z)
  •     T_end:         Write(Z → λ)
  •     otherwise: Error
  •   end
  • end

102
BU Generation of the Derivation Tree
  • proc Y                    Y → + T Y
  •                             | λ
  •   if Next_Token = T_+ then
  •     Read(T_+)
  •     T
  •     Y
  •     Write(Y → + T Y)
  •   else Write(Y → λ)
  • end

103
BU Generation of the Derivation Tree
  • proc T                    T → P X
  •   P
  •   X
  •   Write(T → P X)
  • end
  • proc X                    X → * T
  •                             | λ
  •   if Next_Token = T_* then
  •     Read(T_*)
  •     T
  •     Write(X → * T)
  •   else Write(X → λ)
  • end

104
BU Generation of the Derivation Tree
  • proc P                    P → (E)
  •                             | id
  •   case Next_Token of
  •     T_(:  Read(T_()
  •           E
  •           Read(T_))
  •           Write(P → (E))
  •     T_id: Read(T_id)
  •           Write(P → id)
  •     otherwise: Error
  •   end

105
Notes
  • The placement of the Write statements is still
    obvious.
  • The productions are emitted as procedures quit,
    not as they start.
  • Productions emitted in reverse order, i.e., the
    sequence of productions must be used in reverse
    order to obtain a right-most derivation.
  • Again, we can build the tree as we go (using a stack of trees), or later.

106
Example
  • Input String:
  •   begin id := (id + id) * id ; end
  • Output:

P → id, X → λ, T → P X, P → id, X → λ, T → P X, Y → λ, Y → + T Y, E → T Y, P → (E),
P → id, X → λ, T → P X, X → * T, T → P X, Y → λ, E → T Y, S → id := E ;, Z → λ,
SL → S Z, S → begin SL end
107
(No Transcript)
108
Replacing Recursion with Iteration
  • Not all the nonterminals are needed.
  • The recursion in SL, X, Y and Z can be replaced
    with iteration.

109
Replacing Recursion with Iteration
SL → S Z      Z → S Z | λ
  • proc S                    S → begin SL end
  •                             | id := E ;
  •   case Next_Token of
  •     T_begin: Read(T_begin)
  •              repeat                                -- replaces the call to SL
  •                S
  •              until Next_Token ∉ {T_begin, T_id}    -- replaces the recursion on Z
  •              Read(T_end)
  •     T_id:    Read(T_id)
  •              Read(T_:=)
  •              E
  •              Read(T_;)
  •     otherwise: Error
  •   end
  • end
110
Replacing Recursion with Iteration
  • proc E E ? TY
  • Y ? TY
  • ?
  • T
  • while Next_Token T_ do
  • Read (T_)
  • T
  • od
  • end

Replaces recursion on Y.
111
Replacing Recursion with Iteration
  • proc T T ? PX
  • X ? T
  • ?
  • P
  • if Next_Token T_
  • then Read (T_)
  • T
  • end

Replaces call to X.
112
Replacing Recursion with Iteration
  • proc PP ? (E)
  • ? id
  • case Next_Token of
  • T_( Read (T_()
  • E
  • Read (T_))
  • T_id Read (T_id)
  • otherwise Error
  • end
  • end

113
Construction of Derivation Tree for the Original
Grammar (Bottom Up)
  • proc S          (1) S → begin SL end      (2) S → begin SL end
  •                       | id := E ;               | id := E ;
  •                     SL → S Z                  SL → SL S
  •                     Z → S Z                      | S
  •                       | λ
  •   case Next_Token of
  •     T_begin: Read(T_begin)
  •              S
  •              Write(SL → S)
  •              while Next_Token in {T_begin, T_id} do
  •                S
  •                Write(SL → SL S)
  •              od
  •              Read(T_end)
  •     T_id:    Read(T_id)
  •              Read(T_:=)
  •              E
  •              Read(T_;)
  •              Write(S → id := E ;)

114
Construction of Derivation Tree for the Original
Grammar (Bottom Up)
  • proc E          (1) E → T Y       (2) E → E + T
  •                     Y → + T Y           | T
  •                       | λ
  •   T
  •   Write(E → T)
  •   while Next_Token = T_+ do
  •     Read(T_+)
  •     T
  •     Write(E → E + T)
  •   od
  • end

115
Construction of Derivation Tree for the Original
Grammar (Bottom Up)
  • proc T          (1) T → P X       (2) T → P * T
  •                     X → * T             | P
  •                       | λ
  •   P
  •   if Next_Token = T_*
  •   then Read(T_*)
  •        T
  •        Write(T → P * T)
  •   else Write(T → P)
  • end

116
Construction of Derivation Tree for the Original
Grammar (Bottom Up)
  • proc P          (1) P → (E)       (2) P → (E)
  •                       | id              | id
  •   // SAME AS BEFORE
  • end

117
Example
  • Input String:
  •   begin id := (id + id) * id ; end
  • Output:

P → id, T → P, E → T, P → id, T → P, E → E + T, P → (E), P → id, T → P,
T → P * T, E → T, S → id := E ;, SL → S, S → begin SL end
118
(No Transcript)
119
Generating the Abstract Syntax Tree, Bottom Up,
for the Original Grammar
  • proc S          S → begin SL end  =>  'block'
  •                   | id := E ;     =>  'assign'
  •   var N: integer
  •   case Next_Token of
  •     T_begin: Read(T_begin)
  •              S
  •              N := 1
  •              while Next_Token in {T_begin, T_id} do
  •                S
  •                N := N + 1
  •              od
  •              Read(T_end)
  •              BuildTree('block', N)
  •     T_id:    Read(T_id)                -- assume this builds (pushes) a leaf node
  •              Read(T_:=)
  •              E
  •              Read(T_;)
  •              BuildTree('assign', 2)
  •     otherwise: Error

BuildTree(x, n) pops n trees from the stack, builds an x node as their parent, and pushes the resulting tree.
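
A minimal Python sketch of the tree-stack discipline that BuildTree relies on; the tuple representation of nodes and the event list are my own illustrative choices.

    # Leaves are pushed as they are read; BuildTree(x, n) pops n subtrees,
    # makes them children of a new 'x' node, and pushes the result.
    tree_stack = []

    def push_leaf(name):
        tree_stack.append((name, []))            # a leaf is a node with no children

    def build_tree(x, n):
        children = tree_stack[-n:] if n else []
        del tree_stack[len(tree_stack) - n:]
        tree_stack.append((x, children))         # children keep left-to-right order

    # Events for: begin id1 := (id2 + id3) * id4 ; end
    for event in [("leaf", "id1"), ("leaf", "id2"), ("leaf", "id3"), ("node", "+", 2),
                  ("leaf", "id4"), ("node", "*", 2), ("node", "assign", 2), ("node", "block", 1)]:
        if event[0] == "leaf":
            push_leaf(event[1])
        else:
            build_tree(event[1], event[2])
    print(tree_stack[0])   # ('block', [('assign', [...])])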
120
Generating the Abstract Syntax Tree, Bottom Up,
for the Original Grammar
  • proc E          E → E + T  =>  '+'
  •                   | T
  •   T
  •   while Next_Token = T_+ do
  •     Read(T_+)
  •     T
  •     BuildTree('+', 2)
  •   od
  • end

Left branching in tree!
121
Generating the Abstract Syntax Tree, Bottom Up,
for the Original Grammar
  • proc T          T → P * T  =>  '*'
  •                   | P
  •   P
  •   if Next_Token = T_*
  •   then Read(T_*)
  •        T
  •        BuildTree('*', 2)
  • end

Right branching in tree!
122
Generating the Abstract Syntax Tree, Bottom Up,
for the Original Grammar
  • proc P          P → (E)
  •                   | id
  •   // SAME AS BEFORE,
  •   // i.e., no trees built
  • end

123
Example
  • Input String:
  •   begin id1 := (id2 + id3) * id4 ; end
  • Sequence of events:

push id1, push id2, push id3, BT('+', 2), push id4, BT('*', 2), BT('assign', 2), BT('block', 1)
124
(No Transcript)
125
Summary
  • Bottom-up or top-down tree construction.
  • Original or modified grammar.
  • Derivation Tree or Abstract Syntax Tree.
  • Technique of choice
  • Top-down, recursive descent parser.
  • Bottom-up tree construction for the original
    grammar.

126
LR Parsing
  • Procedures in the recursive descent code can be
    annotated with items, i.e. productions with a
    dot marker somewhere in the right-part.
  • We can use the items to describe the operation of
    the recursive descent parser.
  • There is an FSA that describes all possible
    calling sequences in the R.D. parser.

127
Recursive Descent Parser with items
  • Example:
  • proc E                         E → .E + T    E → .T
  •   T                            E → E .+ T    E → T.
  •   while Next_Token = T_+ do
  •                                E → E .+ T
  •     Read(T_+)                  E → E + .T
  •     T                          E → E + T.
  •   od
  •                                E → E + T.    E → T.
  • end
128
FSA Connecting Items
  • The FSA is
  •   M = (DP, V, δ, S → .S⊥, S → S⊥.)
  • where DP is the set of all possible items (DP = dotted productions), and δ is defined such that:
  •   1. A → α.Bβ  --λ-->  B → .γ        (simulate a call to B)
  •   2. A → α.Xβ  --X-->  A → αX.β      (simulate the execution of statement X, if X is a nonterminal, or Read(X), if X is a terminal).
129
FSA Connecting Items
  • Example:   E → E + T     T → i       S → E ⊥
  •              | T           | (E)
  • (Item NFA shown on slide: each item A → α.Xβ has an X-transition to A → αX.β, and a silent transition to B → .γ for every production B → γ whenever the symbol after the dot is B.)
130
FSA Connecting Items
  • We need to run this machine with the aid of a stack, i.e. keep track of the recursive calling sequence.
  • To return from A → γ., back up |γ| + 1 states, then advance on A.
  • Problem with this machine: it is nondeterministic.
  • No problem. Be happy. Transform it to a DFA!

131
Deterministic FSA Connecting Items
(LR(0) automaton shown on slide: the deterministic FSA obtained from the item NFA by the subset construction; its states are sets of items, e.g. { S → .E⊥, E → .E + T, E → .T, T → .i, T → .(E) }.)
  • THIS IS AN LR(0) AUTOMATON

132
LR Parsing
  • LR means Left-to-Right, Right-most Derivation.
  • Need a stack of states to operate the parser.
  • No look-ahead required, thus LR(0).
  • The DFA describes all possible positions in the R.D. parser's code.
  • Once the automaton is built, items can be
    discarded.

133
LR Parsing
  • Operation of an LR parser:
  • Two moves: shift and reduce.
  • Shift: advance from the current state on Next_Token; push the new state on the stack.
  • Reduce (on A → γ): pop |γ| states from the stack; advance from the new top state on A.

134
LR Parsing
  • Stack Input Derivation Tree
  • 1 i (i i) i (
    i i )
  • 14 (i i)
  • 13 (i i)
    T
  • 12 (i i)
  • 127 (i i)
    E
  • 1275 i i)
  • 12754 i)
  • 12753 i)
    T
  • 12758 i)
  • 127587 i)
    E
  • 1275874 )
  • 1275879 )
    T
  • 12758 )
    E
  • 12758 10
  • 1279
    T
  • 12
    E
  • 126 ------

(LR(0) automaton for the grammar S → E ⊥, E → E + T | T, T → i | (E), with states 1-10; its reduce states are labeled E → T, T → i, T → (E), and E → E + T.)
135
LR Parsing
  • Table Representation of LR Parsers
  • Two Tables
  • Action Table indexed by state, and by terminal
    symbol. Contains all shift and reduced moves.
  • GOTO Table indexed by state, and by nonterminal
    symbol. Contains all transitions on nonterminals
    symbols.

136
LR Parsing
  • Example:
               ACTION                                      GOTO
         i      (      )      +      ⊥                    E    T
   1    S/4    S/5                                         2    3
   2                         S/7    S/6
   3    R/E→T   (in every terminal column)
   4    R/T→i   (in every terminal column)
   5    S/4    S/5                                         8    3
   6    Accept  (in every terminal column)
   7    S/4    S/5                                              9
   8                  S/10   S/7
   9    R/E→E+T (in every terminal column)
  10    R/T→(E) (in every terminal column)

(The corresponding LR(0) automaton, states 1-10, is shown alongside.)
137
LR Parsing
  • Algorithm LR_Driver:
  •   Push(Start_State, S)
  •   while ACTION(Top(S), Next_Token) ≠ Accept do
  •     case ACTION(Top(S), Next_Token) of
  •       S/r:      Read(Next_Token)
  •                 Push(r, S)
  •       R/A → γ:  Pop(S) |γ| times
  •                 Push(GOTO(Top(S), A), S)
  •       empty:    Error
  •     end
  •   end
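
A minimal Python sketch of the same driver; the dictionary encodings of the ACTION and GOTO tables, and the use of "$" for the end marker, are my own assumptions.

    # action[(state, terminal)] is ("shift", s), ("reduce", A, n) with n = |rhs|, or "accept";
    # goto_[(state, nonterminal)] is the successor state.
    def lr_parse(tokens, action, goto_, start_state):
        stack = [start_state]
        toks = tokens + ["$"]
        pos = 0
        while True:
            move = action.get((stack[-1], toks[pos]))
            if move is None:
                raise SyntaxError(f"no action in state {stack[-1]} on {toks[pos]}")
            if move == "accept":
                return True
            if move[0] == "shift":
                stack.append(move[1])        # consume the token and push the new state
                pos += 1
            else:                            # ("reduce", A, n): pop n states, advance on A
                _, a, n = move
                del stack[len(stack) - n:]
                stack.append(goto_[(stack[-1], a)])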

138
LR Parsing
  • Direct construction of the LR(0) automaton:
  •   PT(G) = Closure({S → .S⊥}) ∪ { Closure(P') : P' ∈ Successors(P), P ∈ PT(G) }
  •   Closure(P) = P ∪ { A → .ω : B → α.Aβ ∈ Closure(P) }
  •   Successors(P) = { Nucleus(P, X) : X ∈ V }
  •   Nucleus(P, X) = { A → αX.β : A → α.Xβ ∈ P }
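
The equations translate almost directly into code. Here is a Python sketch under my own item encoding (an item is a pair of production index and dot position, and production 0 is assumed to be the augmented S' → S ⊥ production):

    def closure(items, prods):
        result = set(items)
        changed = True
        while changed:
            changed = False
            for (pi, dot) in list(result):
                lhs, rhs = prods[pi]
                if dot < len(rhs):
                    b = rhs[dot]                          # symbol after the dot
                    for qi, (lhs2, _) in enumerate(prods):
                        if lhs2 == b and (qi, 0) not in result:
                            result.add((qi, 0))           # add B -> .omega
                            changed = True
        return frozenset(result)

    def nucleus(items, x, prods):
        # advance the dot over X in every item of the form A -> alpha . X beta
        return {(pi, dot + 1) for (pi, dot) in items
                if dot < len(prods[pi][1]) and prods[pi][1][dot] == x}

    def lr0_states(prods, symbols):
        start = closure({(0, 0)}, prods)                  # production 0: S' -> S end-marker
        states, work = {start}, [start]
        while work:
            p = work.pop()
            for x in symbols:
                nuc = nucleus(p, x, prods)
                if nuc:
                    q = closure(nuc, prods)
                    if q not in states:
                        states.add(q)
                        work.append(q)
        return states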

139
LR Parsing
  • Direct Construction of Previous Automaton

(The previous LR(0) automaton, states 1-10, rebuilt directly from the Closure / Successors / Nucleus equations; each state is shown with its item set.)
140
LR Parsing
  • Notes
  • Two states are equal if their Nuclei are
    identical.
  • This grammar is LR(0) because there are no
    conflicts.
  • A conflict occurs when a state contains:
  •   (i) both a final (dot-at-the-end) item and a non-final one (shift-reduce), or
  •   (ii) two or more final items (reduce-reduce).

141
LR Parsing
  • Example:   E → E + T     T → P * T     P → i
  •              | T           | P          | (E)
  • (LR(0) automaton, states 1-13, shown on slide. The state containing T → P .* T and T → P. has a shift-reduce conflict.)

Grammar is not LR(0).
142
LR Parsing
  • Solution Use lookahead!
  • In LL(1), lookahead is used at the beginning of
    the production.
  • In LR(1), lookahead is used at the end of the
    production.
  • We will use SLR(1) (Simple LR(1))
  •   and LALR(1) (Lookahead LR(1)).

143
LR Parsing
  • The conflict appears in the ACTION table as multiple entries:
  •   State     ACTION entries
  •     1       S/5  S/6
  •     2       S/8  S/7
  •     3       R/E→T
  •     4       R/T→P everywhere, plus S/9 under *  (the conflict)
  •     5       R/P→i
  •     6       S/5  S/6
  •     7       Accept
  •     8       S/5  S/6
  •     9       S/5  S/6
  •    10       S/8  S/13
  •    11       R/E→E+T
  •    12       R/T→P*T
  •    13       R/P→(E)
144
LR Parsing
  • SLR(1): For each inconsistent state p, compute Follow(A) for each conflicting production A → γ. Then place R/A → γ in the ACTION table, row p, column t, only if t ∈ Follow(A). In our case, Follow(T) = Follow(E) = { +, ), ⊥ }. So:
  •          i     (     )        +        *       ⊥
  •     4               R/T→P    R/T→P    S/9     R/T→P

Grammar is SLR(1)
145
LR Parsing
  • Example:   S → aSb       L(G) = { aⁿbⁿ | n ≥ 0 }
  •              | λ
  • (LR(0) automaton, states 1-6, shown on slide. States 1 and 3 each contain the shift item S → .aSb and the final item S → ., a shift-reduce conflict.)
  •          a              b        ⊥       S
  •     1   S/3, R/S→λ    R/S→λ    R/S→λ     2
  •     2                          S/4
  •     3   S/3, R/S→λ    R/S→λ    R/S→λ     5
  •     4   Accept (in every column)
  •     5                 S/6
  •     6   R/S→aSb (in every column)

Grammar is not LR(0)
146
LR Parsing
  • SLR(1) analysis:
  • State 1: Follow(S) = { b, ⊥ }. Since a ∉ Follow(S), the shift/reduce conflict is resolved.
  • State 3: same story.
  • Rows 1 and 3 become:
  •          a      b        ⊥       S
  •     1   S/3    R/S→λ    R/S→λ    2
  •     3   S/3    R/S→λ    R/S→λ    5
  • All single entries. Grammar is SLR(1).

147
LR Parsing
  • LALR(1) Grammars
  • Consider the grammar:   S → AbAa     A → a
  •                           | Ba       B → a
  • (LR(0) automaton shown on slide; the state reached from state 1 on a contains both A → a. and B → a.)

Grammar is not LR(0): reduce-reduce conflict.
148
LR Parsing
  • SLR(1) analysis (state 5):
  •   Follow(A) = { a, b }
  •   Follow(B) = { a }

Conflict not resolved. Grammar is not SLR(1).
149
LR Parsing
  • LALR(1) Technique
  • I. For each conflicting reduction A → ω at each inconsistent state q, find all nonterminal transitions (pi, A) such that the automaton can reach q from pi along a path spelling ω.
  • II. Compute Follow(pi, A) (see below), for all i, and union together the results. The resulting set is the LALR(1) lookahead set for the A → ω reduction at q.
150
LR Parsing
  • Computation of Follow(p, A):
  • An ordinary Follow computation, except on a different grammar, called G'. G' embodies both the structure of G and the structure of the LR(0) automaton. To build G': for each nonterminal transition (p, A) and for each production A → ω1 ω2 ... ωn, the LR(0) automaton contains a path p --ω1--> p2 --ω2--> ... --ωn--> q, where q contains A → ω1 ... ωn.
  • For each such situation, G' contains a production of the form
  •   (p, A) → (p, ω1)(p2, ω2) ... (pn, ωn)
151
LR Parsing
  • In our example,   G:   S → AbAa     A → a
  •                          | Ba       B → a
  •   G':  (1, S) → (1, A)(3, b)(7, A)(9, a)
  •               | (1, B)(4, a)
  •        (1, A) → (1, a)
  •        (7, A) → (7, a)        ← these have split!
  •        (1, B) → (1, a)
  • (LR(0) automaton as before.)
152
LR Parsing
  • For the conflict in state 5, we need
  •   Follow(1, A) = { (3, b) }
  •   Follow(1, B) = { (4, a) }.  Extract the terminal symbols from these to obtain:
  •          a           b           ⊥
  •     5   R/B → a     R/A → a
  • The conflict is resolved.
  • Grammar is LALR(1).
153
LR Parsing
  • Example:   S → bBb       B → A
  •              | aBa       A → c
  •              | acb
  • (LR(0) automaton shown on slide.)
  • State 10, which contains A → c. and S → ac.b, is inconsistent (shift-reduce conflict).
  • Grammar is not LR(0).
154
LR Parsing
  • SLR(1) analysis, state 10:
  •   Follow(A) = Follow(B) = { a, b }.
  •   Grammar is not SLR(1).
  • LALR(1) analysis: need Follow(4, A).
  •   G':  (1, S) → (1, b)(3, B)(6, b)       (3, B) → (3, A)
  •               | (1, a)(4, B)(9, a)       (4, B) → (4, A)
  •               | (1, a)(4, c)(10, b)      (3, A) → (3, c)
  •                                          (4, A) → (4, c)
  • Thus Follow(4, A) = Follow(4, B) = { (9, a) }.
  • The lookahead set is { a }. The grammar is LALR(1).

155
LR Parsing
  • Example:   S → aBb       B → A
  •              | aDa       A → a
  •              | bBa       D → a
  •              | bDb
  • (LR(0) automaton shown on slide.)
  • The state containing both A → a. and D → a. is inconsistent. Grammar is not LR(0).
156
LR Parsing
  • SLR(1) analysis: Follow(A) = Follow(B) = { a, b }
  •                   Follow(D) = { a, b }.     Grammar is not SLR(1).
  • LALR(1) analysis:
  •   G':  (1, S) → (1, a)(3, B)(5, b)       (3, B) → (3, A)
  •               | (1, a)(3, D)(7, a)       (3, D) → (3, a)
  •               | (1, b)(4, B)(9, a)       (3, A) → (3, a)
  •               | (1, b)(4, D)(10, b)      (4, B) → (4, A)
  •                                          (4, D) → (4, a)
  •                                          (4, A) → (4, a)
  • Need Follow(3, A) ∪ Follow(4, A) = { a, b }
  •      Follow(3, D) ∪ Follow(4, D) = { a, b }
  • The lookahead sets are not disjoint. The grammar is not LALR(1).
157
LR Parsing
  • Solution: Modify the LR(0) automaton by splitting state 8 into two states.
  • LR(1) Parsers:
  • Construction similar to LR(0).
  • Difference: the lookahead symbol is carried explicitly, as part of each item, e.g. A → α.β {t}.
  • PT(G) = Closure({S → .S⊥}) ∪ { Closure(P') : P' ∈ Successors(P), P ∈ PT(G) }
158
LR Parsing
  • Closure(P) = P ∪ { A → .ω {t} : B → α.Aβ {t'} ∈ Closure(P), t ∈ First(βt') }
  • Successors(P) = { Nucleus(P, X) : X ∈ V }
  • Nucleus(P, X) = { A → αX.γ {t} : A → α.Xγ {t} ∈ P }
  • Notes:
  • New lookahead symbols appear during Closure.
  • Lookahead symbols are carried from state to state.

159
LR Parsing
  • Example:   S → aBb       B → A
  •              | aDa       A → a
  •              | bBa       D → a
  •              | bDb
  • (LR(1) automaton shown on slide. The state that the LR(0) construction merged is now split in two: one state contains A → a. {b} and D → a. {a}, the other contains A → a. {a} and D → a. {b}.)
160
LR Parsing
  • (Remaining states and transitions of the LR(1) automaton shown on slide.)
  • No conflicts.
  • Grammar is LR(1).
161
Summary of Parsing
  • Top-Down Parsing
  • Hand-written or Table Driven (LL(1))

(Diagram: the parser has already consumed part of the input and built the corresponding part of the tree; the stack holds the part of the tree still to be predicted for the remaining input.)
162
Summary of Parsing
(Diagram: an LL(1) driver reads the remaining input, consults the LL(1) table, and manipulates the prediction stack.)
  • Two moves:
  •   Terminal on stack: match the input.
  •   Nonterminal on stack: rewrite according to the table.