
Provided by: andrew433

Learn more at: https://www.cs.cornell.edu

1
CS412/413

Introduction to
Compilers and Translators
Spring 99
Lecture 5: Bottom-up parsing

2
Outline

Creating LL(1) grammars
Limitations of LL(1) grammars
Bottom-up parsing
LR(0) parser construction

3
Administration

Should have received mail about group assignments
by now
Homework 1 due next class (Friday)
Monday considered 2 days late (-20), Tuesday 3
days (-40)
No class next Monday (Feb 8)

4
Programming Assignment

Due Monday, Feb 15
Implement a lexer for Iota language
Do not need to implement DFA construction
Opportunity to work as group
We expect high quality

5
Review

Can construct recursive descent parsers for LL(1)
grammars

Language grammar
→ (how to perform this step?)
→ LL(1) grammar
→ predictive parse table
→ recursive-descent parser
→ recursive-descent parser w/ AST generation
6
Grammars

Have been using grammar for language of sums with parentheses
Original grammar
S → S + E | E
E → number | ( S )
LL(1) grammar for same language
S → E S'
S' → ε | + S
E → number | ( S )

Example: (1 + 2 + (3 + 4)) + 5
7
Left-recursive vs Right-recursive
(1 + 2 + (3 + 4)) + 5

Original grammar was left-recursive
S → S + E
S → E
LL(1) grammar is right-recursive, parsed top-down
S → E S'
S' → ε | + S
Left-recursive grammars don't work with top-down parsing -- need an arbitrary amount of look-ahead

[Figure: parse tree in which S expands repeatedly to S + E over the input (...) + (...) + (...) + (...) ...]
8
How to create an LL(1) grammar

Write a right-recursive grammar
S → E + S
S → E
Left-factor common prefixes, place suffix in new non-terminal
S → E S'
S' → ε
S' → + S
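
The left-factoring step above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it factors a prefix shared by all alternatives of one non-terminal, and the names `common_prefix` and `left_factor` are made up for this example. An empty list stands for ε.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if all(symbol == symbols[0] for symbol in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(nonterminal, alternatives):
    """Move the differing suffixes into a new non-terminal named X'."""
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {nonterminal: alternatives}  # nothing to factor
    primed = nonterminal + "'"
    suffixes = [alternative[len(prefix):] for alternative in alternatives]
    return {nonterminal: [prefix + [primed]], primed: suffixes}


# S -> E + S | E   becomes   S -> E S',  S' -> + S | ε
factored = left_factor("S", [["E", "+", "S"], ["E"]])
print(factored)
```

A production grammar tool would also have to factor prefixes shared by only some of the alternatives and repeat until nothing changes; that is left out here.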

9
Right Recursion
(1 + 2 + (3 + 4)) + 5

Right recursion: right-associative
[Figure: parse tree of the example under the right-recursive grammar S → E S']

Left recursion: left-associative
[Figure: parse tree of the example under the left-recursive grammar S → S + E]
10
Associativity

We can provide left-associativity by massaging
the recursive-descent code

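// LPAREN, NUMBER, PLUS, RPAREN, EOF are token kinds; parse_Sprime parses S'
void parse_S() {
  switch (token) {
    case LPAREN:
    case NUMBER: parse_E(); parse_Sprime(); return;
    default: throw new ParseError();
  }
}
void parse_Sprime() {
  switch (token) {
    case PLUS: token = input.read(); parse_S(); return;
    case RPAREN: return;
    case EOF: return;
    default: throw new ParseError();
  }
}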
11
Associativity

void parse_S() {  // parses a sequence of E + E + E ...
  switch (token) {
    case LPAREN: case NUMBER:
      parse_E();
      switch (token) {
        case PLUS: token = input.read(); parse_S(); return;  // tail recursion
        case RPAREN: return;
        case EOF: return;
        default: throw new ParseError();
      }
    default: throw new ParseError();
  }
}
12
Flattening Associative Operators

void parse_S() {  // parses an arbitrary sequence of E + E + E ...
  while (true) {
    switch (token) {
      case LPAREN: case NUMBER:
        parse_E();
        switch (token) {
          case PLUS: token = input.read(); break;  // go around the loop again
          case RPAREN: case EOF: return;
          default: throw new ParseError();
        }
        break;
      default: throw new ParseError();
    }
  }
}

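(1 + 2 + (3 + 4)) + 5

[Figure: flattened tree for the example, in which each sum node holds all of its operands as direct children]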
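
As a runnable illustration of the loop-based version, here is a minimal sketch in Python rather than the slides' Java-style pseudocode. The names `tokenize`, `Parser` and `parse` are assumptions made for this example, and the tokenizer silently skips characters it does not recognize. Each sum is returned as a flat list of operands, which corresponds to the flattened tree.

```python
import re


class ParseError(Exception):
    pass


def tokenize(text):
    # Numbers, parentheses and '+'; any other character is silently skipped.
    return re.findall(r"\d+|[()+]", text) + ["EOF"]


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def parse_S(self):
        # E (+ E)* handled with a loop instead of recursion on S
        operands = [self.parse_E()]
        while self.token == "+":
            self.advance()
            operands.append(self.parse_E())
        if self.token not in (")", "EOF"):
            raise ParseError(f"unexpected token {self.token!r}")
        return operands

    def parse_E(self):
        if self.token.isdigit():
            value = int(self.token)
            self.advance()
            return value
        if self.token == "(":
            self.advance()
            inner = self.parse_S()
            if self.token != ")":
                raise ParseError("expected ')'")
            self.advance()
            return inner
        raise ParseError(f"unexpected token {self.token!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.parse_S()
    if parser.token != "EOF":
        raise ParseError("trailing input")
    return tree


print(parse("(1+2+(3+4))+5"))  # [[1, 2, [3, 4]], 5]
```

Because the operands arrive in a flat list, a later pass is free to combine them from the left, which gives the left-associative reading without a left-recursive grammar.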
13
Summary

Now have complete recipe for building a parser

Language grammar
LL(1) grammar
predictive parse table
recursive-descent parser
recursive-descent parser w/ AST generation
14
Bottom-up parsing

A more powerful parsing technology
LR grammars -- more power than LL
can handle left-recursive grammars, virtually all
programming languages
More natural expression of programming language
syntax
Shift-reduce parsers
automatic parser generators (e.g. yacc)
detect errors as soon as possible
allows better error recovery

15
Top-down parsing
S → S + E | E     E → number | ( S )
(1 + 2 + (3 + 4)) + 5

S → S+E → E+E → (S)+E → (S+E)+E → (S+E+E)+E
→ (E+E+E)+E → (1+E+E)+E → (1+2+E)+E ...
In left-most derivation, entire tree above a token (2) has been expanded when encountered
Must be able to predict!

[Figure: parse tree for (1 + 2 + (3 + 4)) + 5 under the left-recursive grammar]
16
Bottom-up parsing
S → S + E | E     E → number | ( S )

Right-most derivation -- backward
Start with the tokens
End with the start symbol
(1+2+(3+4))+5 ← (E+2+(3+4))+5 ← (S+2+(3+4))+5
← (S+E+(3+4))+5 ← (S+(3+4))+5 ← (S+(E+4))+5
← (S+(S+4))+5 ← (S+(S+E))+5 ← (S+(S))+5 ← (S+E)+5
← (S)+5 ← E+5 ← S+5 ← S+E ← S

17
Bottom-up parsing
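S → S + E | E     E → number | ( S )

derivation           scanned          unscanned
(1+2+(3+4))+5 ←                       (1+2+(3+4))+5
(E+2+(3+4))+5 ←      (1               +2+(3+4))+5
(S+2+(3+4))+5 ←      (1               +2+(3+4))+5
(S+E+(3+4))+5 ←      (1+2             +(3+4))+5
(S+(3+4))+5 ←        (1+2             +(3+4))+5
(S+(E+4))+5 ←        (1+2+(3          +4))+5
(S+(S+4))+5 ←        (1+2+(3          +4))+5
(S+(S+E))+5 ←        (1+2+(3+4        ))+5
(S+(S))+5 ←          (1+2+(3+4        ))+5
(S+E)+5 ←            (1+2+(3+4)       )+5
(S)+5 ←              (1+2+(3+4)       )+5
E+5 ←                (1+2+(3+4))      +5
S+5 ←                (1+2+(3+4))      +5
S+E ←                (1+2+(3+4))+5
S                    (1+2+(3+4))+5

Read from bottom to top, the left column is a right-most derivation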
18
Bottom-up parsing
S → S + E | E     E → number | ( S )

(1+2+(3+4))+5 ← (E+2+(3+4))+5 ← (S+2+(3+4))+5
← (S+E+(3+4))+5
Advantage of bottom-up parsing: can select productions based on more information

[Figure: parse tree for (1 + 2 + (3 + 4)) + 5]
19
Top-down vs. Bottom-up
Bottom-up: don't need to figure out as much of the parse tree for a given amount of input
[Figure: two parse-tree sketches, labeled Top-down and Bottom-up, each drawn over input split into scanned and unscanned parts]
20
Shift-reduce parsing

Parsing is a sequence of shift and reduce operations
Parser state is a stack of terminals and non-terminals (grows to the right)
Unconsumed input is a string of terminals
Current derivation step is always stack + input
Shift -- push head of input onto stack
stack    input
(        1+2+(3+4))+5
(1       +2+(3+4))+5

21
Reduce

Replace symbols γ on top of stack with non-terminal symbol X, corresponding to production X → γ (pop γ, push X)
stack    input
(S+E     +(3+4))+5     reduce S → S+E
(S       +(3+4))+5
What effect does this have on derivation?

22
Shift-reduce parsing
S → S + E | E     E → number | ( S )

derivation          stack    input stream      action
(1+2+(3+4))+5 ←              (1+2+(3+4))+5     shift
(1+2+(3+4))+5 ←     (        1+2+(3+4))+5      shift
(1+2+(3+4))+5 ←     (1       +2+(3+4))+5       reduce E → num
(E+2+(3+4))+5 ←     (E       +2+(3+4))+5       reduce S → E
(S+2+(3+4))+5 ←     (S       +2+(3+4))+5       shift
(S+2+(3+4))+5 ←     (S+      2+(3+4))+5        shift
(S+2+(3+4))+5 ←     (S+2     +(3+4))+5         reduce E → num
(S+E+(3+4))+5 ←     (S+E     +(3+4))+5         reduce S → S+E
(S+(3+4))+5 ←       (S       +(3+4))+5         shift
(S+(3+4))+5 ←       (S+      (3+4))+5          shift
(S+(3+4))+5 ←       (S+(     3+4))+5           shift
(S+(3+4))+5 ←       (S+(3    +4))+5            reduce E → num
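
The trace above can be reproduced with a small Python sketch. It is not a general parser: the function names `try_reduce` and `shift_reduce` are invented for this example, and the order in which reductions are tried is hand-picked to work for this one grammar, which is exactly the decision that the parser states developed later in the lecture automate.

```python
import re


def try_reduce(stack):
    """Apply one reduction to the top of the stack, if a handle is there."""
    if stack and stack[-1].isdigit():
        stack[-1:] = ["E"]
        return "E -> number"
    if stack[-3:] == ["(", "S", ")"]:
        stack[-3:] = ["E"]
        return "E -> ( S )"
    if stack[-3:] == ["S", "+", "E"]:
        stack[-3:] = ["S"]
        return "S -> S + E"
    if stack[-1:] == ["E"]:
        stack[-1:] = ["S"]
        return "S -> E"
    return None


def shift_reduce(text):
    tokens = re.findall(r"\d+|[()+]", text)  # other characters are skipped
    stack, actions, pos = [], [], 0
    while True:
        rule = try_reduce(stack)
        if rule is not None:
            actions.append("reduce " + rule)
        elif pos < len(tokens):
            stack.append(tokens[pos])  # shift: push head of input onto stack
            pos += 1
            actions.append("shift")
        else:
            return stack, actions


stack, actions = shift_reduce("(1+2+(3+4))+5")
for action in actions[:12]:
    print(action)
print(stack)  # ['S'] when the whole input reduces to the start symbol
```

Trying S → S + E before S → E matters: with the opposite order the E in "S + E" would be turned into a second S and the parse would get stuck.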
23
Problem

How do we know which action to take -- whether to shift or reduce, and which production?
Sometimes can reduce but shouldn't
e.g., X → ε can always be reduced
Sometimes can reduce in different ways

24
Action Selection Problem

Given stack σ and input symbol b, should we
shift b onto the stack (making it σb)
reduce some production X → γ assuming that stack has the form αγ (making it αX)
Should apply reduction X → γ depending on what stack prefix α is -- but α is different for different possible reductions, since γ's have different length. How to keep track?

25
Parser States

Idea: summarize all possible stack prefixes α as a parser state
A state transition function updates the parser state as shifts and reductions are performed: a DFA
Summarizing discards information
affects what grammars parser handles
affects size of DFA (number of states)

26
LR(0) parser

Left-to-right scanning, Right-most derivation,
zero look-ahead characters
Too weak to handle most language grammars
(including this one)
But will help us understand how to build better
parsers

27
LR(0) states

A state is a set of items
An LR(0) item is a production from the language
with a separator . somewhere in the RHS of the
production
Stuff before . already on stack (beginnings of possible γ's)
Stuff after . what we might see next
The prefixes α represented by state

E → number .     E → ( . S )
[Figure labels: "state" marks the box, "item" marks one dotted production inside it]
28
An LR(0) grammar: non-empty lists

S → ( L )
S → id
L → S
L → L , S
Example strings: x    (x,y)    (x, (y,z), w)
((((x))))    (x, (y, (z, w)))

29
Closure
S → ( L ) | id     L → S | L , S

start state: S' → . S
Closure of start state:
S' → . S
S → . ( L )
S → . id

Closure of a state adds items for all productions whose LHS occurs in an item in the state, just after .
Added items have the . located at the beginning
Like NFA → DFA conversion
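
The closure rule can be written down directly. The sketch below is in Python and makes its own representation choices, which are assumptions of this example rather than anything fixed by the slides: an item is a triple (lhs, rhs, dot) where dot is an index into rhs, and the grammar is a dictionary named `GRAMMAR`.

```python
# Grammar from the slides: S' -> S,  S -> ( L ) | id,  L -> S | L , S
GRAMMAR = {
    "S'": [("S",)],
    "S": [("(", "L", ")"), ("id",)],
    "L": [("S",), ("L", ",", "S")],
}


def closure(items):
    """Add X -> . gamma for every non-terminal X that appears just after a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)  # dot at the beginning
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


start_state = closure({("S'", ("S",), 0)})
for lhs, rhs, dot in sorted(start_state):
    print(lhs, "->", " ".join(rhs[:dot] + (".",) + rhs[dot:]))
```

Returning a frozenset makes states hashable and comparable, which is convenient when states are later collected into a DFA.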

30
Applying shift actions
S → ( L ) | id     L → S | L , S

Start state: S' → . S     S → . ( L )     S → . id
on ( → state: S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
on id → state: S → id .
From the ( state: on ( → the same state, on id → state S → id .

In new state, include all items that have appropriate input symbol just after dot, and advance dot in those items. (and take closure)
31
Applying reduce actions
Start state: S' → . S     S → . ( L )     S → . id
on ( → state: S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
on id → state: S → id .
From the ( state:
on L → state: S → ( L . )     L → L . , S
on S → state: L → S .
on ( → the same state, on id → state S → id .
States causing reductions: L → S .     S → id .

Need to set state after reducing
On reduction, pop back to old state and take DFA transition on non-terminal reduced

32
Full DFA (Appel p. 63)
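States:
1: S' → . S     S → . ( L )     S → . id
2: S → id .
3: S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
4: S' → S .     (final state)
5: S → ( L . )     L → L . , S
6: S → ( L ) .
7: L → S .
8: L → L , . S     S → . ( L )     S → . id
9: L → L , S .

Transitions:
from 1: ( → 3, id → 2, S → 4
from 3: ( → 3, id → 2, S → 7, L → 5
from 5: ) → 6, , → 8
from 8: ( → 3, id → 2, S → 9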
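
The whole state machine can be generated mechanically by repeating the shift step and the closure step until no new state appears. The following Python sketch does that for the list grammar. The function names and the item representation, (lhs, rhs, dot) triples, are assumptions of this example, and states are numbered from 0 in order of discovery, so the numbers will not match the figure.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("(", "L", ")"), ("id",)],
    "L": [("S",), ("L", ",", "S")],
}


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        lhs, rhs, dot = worklist.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for production in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], production, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Advance the dot over `symbol` in every item that allows it, then close."""
    moved = {(lhs, rhs, dot + 1) for lhs, rhs, dot in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


def build_dfa(start_item):
    states = [closure({start_item})]
    edges = {}  # (state index, grammar symbol) -> state index
    index = 0
    while index < len(states):  # states appended below are visited later
        state = states[index]
        symbols = {rhs[dot] for _, rhs, dot in state if dot < len(rhs)}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(index, symbol)] = states.index(target)
        index += 1
    return states, edges


states, edges = build_dfa(("S'", ("S",), 0))
reducing = [s for s in states if any(dot == len(rhs) for _, rhs, dot in s)]
print(len(states), len(edges), len(reducing))
```

For this grammar every state that contains a completed item contains nothing else, which is what lets the parser choose between shifting and reducing without looking at the next input symbol.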
33
S → ( L ) | id     L → S | L , S

Idea: stack is labeled w/ state
Let's try parsing ((x),y)
derivation    stack                  input      action
((x),y) ←     1                      ((x),y)    shift, goto 3
((x),y) ←     1 (3                   (x),y)     shift, goto 3
((x),y) ←     1 (3 (3                x),y)      shift, goto 2
((x),y) ←     1 (3 (3 x2             ),y)       reduce S → id
((S),y) ←     1 (3 (3 S7             ),y)       reduce L → S
((L),y) ←     1 (3 (3 L5             ),y)       shift, goto 6
((L),y) ←     1 (3 (3 L5 )6          ,y)        reduce S → ( L )
(S,y) ←       1 (3 S7                ,y)        reduce L → S
(L,y) ←       1 (3 L5                ,y)        shift, goto 8
(L,y) ←       1 (3 L5 ,8             y)         shift, goto 2
(L,y) ←       1 (3 L5 ,8 y2          )          reduce S → id
(L,S) ←       1 (3 L5 ,8 S9          )          reduce L → L , S
(L) ←         1 (3 L5                )          shift, goto 6
(L) ←         1 (3 L5 )6                        reduce S → ( L )
S             1 S4                              done
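
The same trace can be driven by tables read off the DFA. The Python sketch below hard-codes the shift and goto transitions using the state numbers from the slides. The table names (`SHIFT`, `GOTO`, `REDUCE`, `ACCEPT`) and the decision to keep only state numbers on the stack are assumptions of this example; the symbols shown beside the states in the trace are implied by the states.

```python
SHIFT = {  # (state, terminal) -> next state
    (1, "("): 3, (1, "id"): 2,
    (3, "("): 3, (3, "id"): 2,
    (5, ")"): 6, (5, ","): 8,
    (8, "("): 3, (8, "id"): 2,
}
GOTO = {(1, "S"): 4, (3, "S"): 7, (3, "L"): 5, (8, "S"): 9}
REDUCE = {  # state -> (left-hand side, length of right-hand side, rule text)
    2: ("S", 1, "S -> id"),
    6: ("S", 3, "S -> ( L )"),
    7: ("L", 1, "L -> S"),
    9: ("L", 3, "L -> L , S"),
}
ACCEPT = 4  # state containing S' -> S .


def parse(tokens):
    stack, pos, actions = [1], 0, []
    while True:
        state = stack[-1]
        if state == ACCEPT:
            if pos != len(tokens):
                raise SyntaxError("input continues after a complete parse")
            return actions
        if state in REDUCE:
            lhs, length, text = REDUCE[state]
            del stack[-length:]                   # pop back to the old state
            stack.append(GOTO[(stack[-1], lhs)])  # transition on the non-terminal
            actions.append("reduce " + text)
        elif pos < len(tokens) and (state, tokens[pos]) in SHIFT:
            stack.append(SHIFT[(state, tokens[pos])])
            pos += 1
            actions.append(f"shift, goto {stack[-1]}")
        else:
            raise SyntaxError(f"no action in state {state} at position {pos}")


# ((x),y) with both identifiers already tokenized as "id"
for action in parse(["(", "(", "id", ")", ",", "id", ")"]):
    print(action)
```

Note that the parser never inspects the next token to decide whether to reduce; the state alone determines it, which is the "0" in LR(0).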

34
Summary

Grammars can be parsed bottom-up using a DFA + stack
State construction converts grammar into states that capture information needed to know what action to take
Stack entries labeled by state index
Next time: SLR, LR(1) parsers, automatic parser generators
