Title: Chapter 4: Top-Down Parsing
Objectives of Top-Down Parsing
- An attempt to find a leftmost derivation for an input string.
- An attempt to construct a parse tree for the input string, starting from the root and creating the nodes of the parse tree in preorder.
Input String
[Figure: leftmost derivation of the input string; each step is marked ⇒lm]
Approaches of Top-Down Parsing
- 1. With backtracking (making repeated scans of the input; a general form of top-down parsing)
- Method: create a procedure for each nonterminal.
L = {cabd, cad}
- e.g. S -> cAd, A -> ab | a
- S( )
      if input-symbol = c
          Advance()
          if A()
              if input-symbol = d
                  Advance()
                  return true
      return false
- A( )
      isave = input-pointer
      if input-symbol = a
          Advance()
          if input-symbol = b
              Advance()
              return true
      input-pointer = isave
      if input-symbol = a
          Advance()
          return true
      else
          return false
[Figure showing the symbols c, a, d]
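The two procedures for S -> cAd, A -> ab | a can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the class name `BacktrackingParser`, the method names, and the helper `accepts` are assumptions made for the example, and `pos` plays the role of the input pointer.

```python
class BacktrackingParser:
    """Recursive-descent recognizer for S -> cAd, A -> ab | a."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # plays the role of input-pointer

    def peek(self):
        # Current input symbol, or "$" once the input is exhausted.
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def advance(self):
        self.pos += 1

    def parse_A(self):
        saved = self.pos  # isave = input-pointer
        # First alternative: A -> ab
        if self.peek() == "a":
            self.advance()
            if self.peek() == "b":
                self.advance()
                return True
        # Backtrack, then try the second alternative: A -> a
        self.pos = saved
        if self.peek() == "a":
            self.advance()
            return True
        return False

    def parse_S(self):
        # S -> c A d
        if self.peek() == "c":
            self.advance()
            if self.parse_A():
                if self.peek() == "d":
                    self.advance()
                    return True
        return False


def accepts(text):
    parser = BacktrackingParser(text)
    # Accept only if S matches and the whole input was consumed.
    return parser.parse_S() and parser.pos == len(text)
```

Calling `accepts("cabd")` and `accepts("cad")` succeeds, while strings outside L such as `"cab"` are rejected.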
- Problems for top-down parsing with backtracking
- (1) Left recursion (can cause a top-down parser to go into an infinite loop)
- Def. A grammar is said to be left-recursive if it has a nonterminal A such that there is a derivation A ⇒+ A α for some α.
- (2) Backtracking: the parser must undo not only the movement of the input pointer but also semantic actions, such as entries made in the symbol table.
- (3) The order in which the alternatives are tried (for the grammar shown above, try w = cabd where A -> a is applied first)
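Problem (3) can be made concrete with a small sketch for the grammar S -> cAd, A -> ab | a. The function `parse` and its parameter `a_alternatives` are hypothetical names for this illustration. It assumes, as in the procedures for this grammar, that A commits to the first alternative that matches and that S never asks A for a different choice afterwards.

```python
def parse(text, a_alternatives):
    """Recognize S -> c A d, trying the A-alternatives in the given order.

    A commits to the first alternative that matches; there is no
    backtracking into A once it has returned.
    """
    pos = 0
    if not text.startswith("c", pos):
        return False
    pos += 1
    for alt in a_alternatives:
        if text.startswith(alt, pos):
            pos += len(alt)
            break
    else:
        return False  # no alternative of A matched
    # The rest of the input must be exactly the final "d".
    return text[pos:] == "d"
```

With the order `["ab", "a"]` the string `"cabd"` is accepted. With the order `["a", "ab"]` it is rejected, because A consumes only `a` and S then finds `b` where it expects `d`, even though cabd is in the language.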
Elimination of Left-Recursion
- With immediate left recursion: A -> A α | β
- ⇒ transform into: A -> β A', A' -> α A' | ε
[Figure: parse trees deriving β α ... α with the original productions and with the transformed productions]
e.g. E -> E + T | T, T -> T * F | F, F -> (E) | id
- After transformation:
- E -> TE', E' -> +TE' | ε
- T -> FT', T' -> *FT' | ε
- F -> (E) | id
- General form (with left recursion)
- A -> A α1 | A α2 | ... | A αn | β1 | β2 | ... | βm
- After transformation:
- ⇒ A -> β1 A' | β2 A' | ... | βm A'
- A' -> α1 A' | α2 A' | ... | αn A' | ε
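The general transformation for immediate left recursion can be sketched as below. The representation is an assumption of this example: each alternative is a list of symbols, the empty list stands for ε, and the new nonterminal is named by appending a prime. The function name `eliminate_immediate` is likewise hypothetical.

```python
def eliminate_immediate(nt, alternatives):
    """Split A -> A a1 | ... | A an | b1 | ... | bm into A and A' productions.

    Each alternative is a list of symbols; [] stands for epsilon.
    Assumes there is no production A -> A (no cycles).
    """
    # The alpha_i: tails of the alternatives that start with A itself.
    recursive = [alt[1:] for alt in alternatives if alt[:1] == [nt]]
    # The beta_j: alternatives that do not start with A.
    others = [alt for alt in alternatives if alt[:1] != [nt]]
    if not recursive:
        return {nt: alternatives}  # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }
```

For example, `eliminate_immediate("E", [["E", "+", "T"], ["T"]])` yields E -> TE' and E' -> +TE' | ε.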
- What about left recursion that arises through derivations of two or more steps?
- e.g., S -> Aa | b, A -> Ac | Sd | e
- where S ⇒ Aa ⇒ Sda
Algorithm: Eliminating left recursion
- Input: Context-free grammar G with no cycles (i.e., A ⇒+ A) or ε-productions
- Method:
- 1. Arrange the nonterminals in some order A1, A2, ..., An
- 2. for i = 1 to n do
         for j = 1 to i-1 do
             replace each production of the form Ai -> Aj γ by the
             productions Ai -> δ1 γ | δ2 γ | ... | δk γ, where
             Aj -> δ1 | δ2 | ... | δk are all the current Aj-productions
         eliminate the immediate left recursion among the Ai-productions
An Example
- e.g. S -> Aa | b
- A -> Ac | Sd | e
- Step 1 ⇒ S -> Aa | b
- Step 2 ⇒ A -> Ac | Aad | bd | e
- Step 3 ⇒ A -> bdA' | eA', A' -> cA' | adA' | ε
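The elimination algorithm can be sketched in Python and checked against the grammar S -> Aa | b, A -> Ac | Sd | e. The grammar encoding (a dict from nonterminal to a list of symbol lists, with the empty list for ε), the function name `eliminate_left_recursion`, and the primed names for new nonterminals are assumptions of this sketch. The symbol e is treated as an ordinary terminal here.

```python
def eliminate_left_recursion(grammar, order):
    """grammar: dict nonterminal -> list of alternatives (lists of symbols).

    `order` is the chosen ordering A1, ..., An of the nonterminals.
    Assumes no cycles and no epsilon-productions in the input grammar.
    """
    g = {nt: [list(alt) for alt in alts] for nt, alts in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for alt in g[ai]:
                if alt[:1] == [aj]:
                    # Ai -> Aj gamma becomes Ai -> delta gamma
                    # for every current production Aj -> delta.
                    expanded.extend(delta + alt[1:] for delta in g[aj])
                else:
                    expanded.append(alt)
            g[ai] = expanded
        # Eliminate immediate left recursion among the Ai-productions.
        recursive = [alt[1:] for alt in g[ai] if alt[:1] == [ai]]
        if recursive:
            new_nt = ai + "'"
            others = [alt for alt in g[ai] if alt[:1] != [ai]]
            g[ai] = [beta + [new_nt] for beta in others]
            g[new_nt] = [alpha + [new_nt] for alpha in recursive] + [[]]
    return g


EXAMPLE = {
    "S": [["A", "a"], ["b"]],
    "A": [["A", "c"], ["S", "d"], ["e"]],
}
RESULT = eliminate_left_recursion(EXAMPLE, ["S", "A"])
```

With the order S, A, `RESULT` leaves S -> Aa | b unchanged and gives A -> bdA' | eA' and A' -> cA' | adA' | ε. A different ordering of the nonterminals may produce a different but equivalent grammar.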
2. Non-backtracking (recursive-descent) parsing
- Recursive descent: use a collection of mutually recursive routines to perform the syntax analysis.
- Left factoring: A -> α β1 | α β2 ⇒ A -> α A', A' -> β1 | β2
- Methods
- 1. For each nonterminal A, find the longest prefix α common to two or more of its alternatives. If α ≠ ε, replace all the A-productions A -> α β1 | α β2 | ... | α βn | others by A -> α A' | others, A' -> β1 | β2 | ... | βn.
- 2. Repeat the transformation until no two alternatives of a nonterminal share a common prefix.
- e.g. S -> iCtS | iCtSeS | a, C -> b
- ⇒ S -> iCtSS' | a, S' -> eS | ε, C -> b
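One round of left factoring can be sketched as follows. The names `common_prefix` and `left_factor_once` are hypothetical, alternatives are lists of symbols with the empty list standing for ε, and the caller supplies the name of the new nonterminal.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(nt, alternatives, new_nt):
    """Factor the longest prefix shared by two or more alternatives of nt.

    Returns (changed, productions); [] stands for epsilon.
    """
    best = []
    for i, a in enumerate(alternatives):
        for b in alternatives[i + 1:]:
            p = common_prefix(a, b)
            if len(p) > len(best):
                best = p
    if not best:
        return False, {nt: alternatives}  # alpha = epsilon: nothing to factor
    k = len(best)
    # The beta_i: what remains of each alternative after the prefix alpha.
    factored = [alt[k:] for alt in alternatives if alt[:k] == best]
    others = [alt for alt in alternatives if alt[:k] != best]
    return True, {nt: [best + [new_nt]] + others, new_nt: factored}
```

Applied to S -> iCtS | iCtSeS | a with the new name S', this gives S -> iCtSS' | a and S' -> ε | eS. Step 2 of the method corresponds to calling the function again on every nonterminal until it reports no change.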
Predictive Parsing
- Features
- Maintains a stack rather than using recursive calls
- Table-driven
- Components
- 1. An input buffer with endmarker ($)
- 2. A stack with endmarker ($) on the bottom
- 3. A parsing table, a two-dimensional array M[A, a], where A is a nonterminal symbol and a is the current input symbol (terminal/token). Each entry of the array is either a production (grammar rule) or blank.
Parsing Table
M[A, a]    (                )           $
S          S -> ( S ) S     S -> ε      S -> ε
- Algorithm
- Input: An input string w and a parsing table M for grammar G.
- Output: A leftmost derivation of w or an error indication.
- Initially w$ is in the input buffer and S$ is on the stack, with S, the starting symbol of the grammar, on top.
- Method:
      do
          let a be the next input symbol of w$ and X be the top stack symbol
          if X is a terminal
              if X = a then pop X from the stack and remove a from the input
              else ERROR()
          else
              if M[X, a] = X -> Y1Y2...Yn then
                  1. pop X from the stack
                  2. push YnYn-1...Y1 onto the stack, with Y1 on top
              else ERROR()
      while (X ≠ $)
      if (X = $) and (the next input symbol = $) then accept else ERROR()
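The table-driven method can be sketched in Python using the table for S -> ( S ) S | ε. The function name `predictive_parse`, the dict-based table, and the use of `SyntaxError` for ERROR() are assumptions of this sketch. It uses a `while` loop that stops when $ reaches the top of the stack, which is a small restructuring of the do-while formulation.

```python
def predictive_parse(tokens, table, start, nonterminals):
    """Return the list of productions used (a leftmost derivation) or raise.

    table maps (nonterminal, terminal) -> right-hand side as a list of
    symbols; [] stands for epsilon. "$" is the endmarker.
    """
    buffer = list(tokens) + ["$"]
    stack = ["$", start]  # the top of the stack is the end of the list
    pos = 0
    output = []
    while stack[-1] != "$":
        x, a = stack[-1], buffer[pos]
        if x not in nonterminals:  # X is a terminal
            if x != a:
                raise SyntaxError(f"expected {x!r}, found {a!r}")
            stack.pop()
            pos += 1
        else:
            rhs = table.get((x, a))
            if rhs is None:  # blank entry of M
                raise SyntaxError(f"no entry M[{x}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push Yn ... Y1, leaving Y1 on top
            output.append((x, rhs))
    if buffer[pos] != "$":
        raise SyntaxError("input remains after the stack emptied")
    return output


# Table for S -> ( S ) S | epsilon.
TABLE = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],
    ("S", "$"): [],
}
```

For the input `()` the call `predictive_parse(list("()"), TABLE, "S", {"S"})` returns three productions: S -> ( S ) S followed by two uses of S -> ε. Unbalanced inputs such as `(()` raise `SyntaxError`.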
Construction of the parsing table for predictive parser
- First and Follow
- Def. First(α) /* α denotes a string of grammar symbols */ is the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in First(α).
- Def. Follow(A), where A is a nonterminal, is the set of terminals a that can appear immediately to the right of A in some sentential form; that is, the set of terminals a such that there exists a derivation of the form S ⇒* α A a β for some α and β. If A can be the rightmost symbol in some sentential form, then $ is in Follow(A).
Compute First(X) for all grammar symbols X
- 1. If X is a terminal, then First(X) = {X}.
- 2. If X -> ε is a production, then ε is in First(X).
- 3. If X is a nonterminal and X -> Y1Y2...Yk is a production, then place a in First(X) if, for some i, a is in First(Yi) and ε is in all of First(Y1), ..., First(Yi-1); that is, Y1...Yi-1 ⇒* ε. If ε is in First(Yj) for all j = 1, 2, ..., k, then add ε to First(X).
An Example
- E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε
- F -> (E) | id
- First(E) = First(T) = First(F) = {(, id}
- First(E') = {+, ε}
- First(T') = {*, ε}
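The rules for First can be applied repeatedly until no set grows. The sketch below does this for the expression grammar. The encoding is an assumption of the example: a dict from nonterminal to a list of symbol lists, the empty list for an ε-production, and any symbol that is not a dict key treated as a terminal. The names `compute_first` and `GRAMMAR` are hypothetical.

```python
EPS = "ε"


def compute_first(grammar):
    """Compute First(X) for every nonterminal X by fixed-point iteration."""
    first = {nt: set() for nt in grammar}

    def first_of(symbol):
        # Rule 1: the First set of a terminal is the terminal itself.
        return first[symbol] if symbol in grammar else {symbol}

    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                before = len(first[nt])
                for symbol in alt:
                    f = first_of(symbol)
                    first[nt] |= f - {EPS}  # rule 3
                    if EPS not in f:
                        break
                else:
                    # Every Yi can derive epsilon, or the body is empty (rule 2).
                    first[nt].add(EPS)
                changed |= len(first[nt]) != before
    return first


GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = compute_first(GRAMMAR)
```

`FIRST` agrees with the sets listed in the example: {(, id} for E, T, and F, {+, ε} for E', and {*, ε} for T'.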
Compute Follow(A) for all nonterminals A
- 1. Place $ in Follow(S), where S is the start symbol and $ is the input buffer endmarker.
- 2. If there is a production A -> α B β, then everything in First(β) except for ε is placed in Follow(B).
- 3. If there is a production A -> α B, or a production A -> α B β where First(β) contains ε, then everything in Follow(A) is in Follow(B).
An Example
- E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε
- F -> (E) | id /* E is the start symbol */
- Follow(E) = {$, )} // rules 1 and 2
- Follow(E') = {$, )} // rule 3
- Follow(T) = {+, $, )} // rules 2 and 3
- Follow(T') = {+, $, )} // rule 3
- Follow(F) = {*, +, $, )} // rules 2 and 3
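The three Follow rules can be iterated to a fixed point in the same way. In this sketch the First sets of the nonterminals are written out by hand from the expression-grammar example rather than computed. The names `first_of_string` and `compute_follow` and the grammar encoding (lists of symbols, the empty list for ε) are assumptions of the example.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a symbol string, given First sets of the nonterminals."""
    result = set()
    for s in symbols:
        f = first.get(s, {s})  # the First set of a terminal is itself
        result |= f - {EPS}
        if EPS not in f:
            return result
    result.add(EPS)  # every symbol (or none at all) derives epsilon
    return result


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")  # rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for alt in alternatives:
                for i, b in enumerate(alt):
                    if b not in grammar:
                        continue  # Follow is defined for nonterminals only
                    before = len(follow[b])
                    f_beta = first_of_string(alt[i + 1:], first)
                    follow[b] |= f_beta - {EPS}  # rule 2
                    if EPS in f_beta:
                        follow[b] |= follow[a]  # rule 3
                    changed |= len(follow[b]) != before
    return follow


GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
```

`FOLLOW` reproduces the sets of the example, for instance {$, )} for E and E' and {*, +, $, )} for F.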
Construct a Predictive Parsing Table
- 1. For each production A -> α of the grammar, do steps 2 and 3.
- 2. For each terminal a in First(α), add A -> α to M[A, a].
- 3. If ε is in First(α), add A -> α to M[A, b] for each terminal b in Follow(A). If ε is in First(α) and $ is in Follow(A), add A -> α to M[A, $].
- 4. Make each undefined entry of M be error.
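The four construction steps can be sketched as follows for the expression grammar E -> TE', E' -> +TE' | ε, T -> FT', T' -> *FT' | ε, F -> (E) | id. The First and Follow sets are entered by hand here. The function names, the dict representation of M, and the choice to store a list of productions per entry are assumptions of this sketch.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a symbol string, given First sets of the nonterminals."""
    result = set()
    for s in symbols:
        f = first.get(s, {s})  # the First set of a terminal is itself
        result |= f - {EPS}
        if EPS not in f:
            return result
    return result | {EPS}


def build_table(grammar, first, follow):
    """Return M as a dict (A, a) -> list of right-hand sides.

    Keeping a list per entry makes multiply-defined entries visible;
    keys that are absent are the error entries of step 4.
    """
    table = {}
    for a, alternatives in grammar.items():  # step 1
        for alt in alternatives:
            f = first_of_string(alt, first)
            targets = f - {EPS}  # step 2
            if EPS in f:
                targets |= follow[a]  # step 3, "$" included
            for terminal in targets:
                table.setdefault((a, terminal), []).append(alt)
    return table


GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}
TABLE = build_table(GRAMMAR, FIRST, FOLLOW)
```

For this grammar every entry of `TABLE` holds exactly one production, for example `TABLE[("E'", ")")]` holds the ε-production, and a missing key such as `("E", "+")` corresponds to an error entry.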
LL(1) grammar
- A grammar whose parsing table has no multiply-defined entries is said to be LL(1).
- First 'L': scan the input from left to right.
- Second 'L': produce a leftmost derivation.
- '1': use one input symbol of lookahead to determine the parsing action.
- No ambiguous or left-recursive grammar can be LL(1).
Properties of LL(1) grammar
- A grammar G is LL(1) iff whenever A -> α | β are two distinct productions of G, the following conditions hold:
- (1) For no terminal a do both α and β derive strings beginning with a (based on step 2 of the table construction). ⇒ First(α) ∩ First(β) = ∅
- (2) At most one of α and β can derive the empty string ε (based on step 3).
- (3) If β ⇒* ε, then α does not derive any string beginning with a terminal in Follow(A) (based on steps 2 and 3). ⇒ First(α) ∩ Follow(A) = ∅
- (i.e., if First(A) contains ε, then First(A) ∩ Follow(A) = ∅)
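The three conditions can be checked mechanically for one pair of alternatives once the relevant sets are known. The sketch below takes First(α), First(β), and Follow(A) as plain Python sets. The function name `ll1_conflicts` and the wording of the messages are assumptions of this example, and condition (3) is checked in both directions since either alternative may be the nullable one.

```python
EPS = "ε"


def ll1_conflicts(first_alpha, first_beta, follow_a):
    """Check conditions (1)-(3) for one pair A -> alpha | beta.

    Arguments are the sets First(alpha), First(beta) and Follow(A).
    Returns a list of violated conditions (empty if the pair is fine).
    """
    problems = []
    common = (first_alpha & first_beta) - {EPS}
    if common:
        problems.append(f"(1) both begin with {sorted(common)}")
    if EPS in first_alpha and EPS in first_beta:
        problems.append("(2) both derive the empty string")
    pairs = (("beta", first_beta, first_alpha),
             ("alpha", first_alpha, first_beta))
    for name, nullable, other in pairs:
        clash = (other - {EPS}) & follow_a
        if EPS in nullable and clash:
            problems.append(f"(3) {name} is nullable, clash on {sorted(clash)}")
    return problems
```

For E' -> +TE' | ε with Follow(E') = {$, )}, the call `ll1_conflicts({"+"}, {EPS}, {"$", ")"})` returns an empty list. For S' -> eS | ε from the left-factored grammar S -> iCtSS' | a, where Follow(S') = {e, $}, the call `ll1_conflicts({"e"}, {EPS}, {"e", "$"})` reports a violation of condition (3).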
- Def. of multiply-defined entry: an entry of M that holds more than one production.
- If G is left-recursive or ambiguous, then M will have at least one multiply-defined entry.
- e.g.
- S -> iCtSS' | a, S' -> eS | ε, C -> b
- generates
- M[S', e] = {S' -> ε, S' -> eS}, a multiply-defined entry.
Parsing table with multiply-defined entry
      a         b         e                     i               t     $
S     S -> a                                    S -> iCtSS'
S'                        S' -> ε, S' -> eS                           S' -> ε
C               C -> b
Difficulty in predictive parsing
- Left recursion elimination and left factoring make the resulting grammar hard to read and difficult to use for translation purposes.
- Thus:
- Use a predictive parser for control constructs.
- Use operator precedence for expressions.