CS412/413 - PowerPoint PPT Presentation

Provided by: andrew433
1
CS412/413
  • Introduction to Compilers and Translators
  • Spring 99
  • Lecture 5: Bottom-up parsing

2
Outline
  • Creating LL(1) grammars
  • Limitations of LL(1) grammars
  • Bottom-up parsing
  • LR(0) parser construction

3
Administration
  • Should have received mail about group assignments
    by now
  • Homework 1 due next class (Friday)
  • Monday considered 2 days late (-20), Tuesday 3
    days (-40)
  • No class next Monday (Feb 8)

4
Programming Assignment
  • Due Monday, Feb 15
  • Implement a lexer for Iota language
  • Do not need to implement DFA construction
  • Opportunity to work as group
  • We expect high quality

5
Review
  • Can construct recursive descent parsers for LL(1)
    grammars

Language grammar → (how to perform this step?) → LL(1) grammar → predictive parse table → recursive-descent parser → recursive-descent parser w/ AST generation
6
Grammars
  • Have been using grammar for language of sums
    with parentheses
  • Original grammar
  • S → S + E | E
  • E → number | ( S )
  • LL(1) grammar for same language
  • S → E S'
  • S' → ε | + S
  • E → number | ( S )

Example input: (1+(3+4))+5
7
Left-recursive vs Right-recursive
(1 + 2 + (3 + 4)) + 5
  • Original grammar was left-recursive
  • S → S + E
  • S → E
  • LL(1) grammar is right-recursive, parsed
    top-down
  • S → E S'
  • S' → ε | + S
  • Left-recursive grammars don't work with top-down
    parsing -- need an arbitrary amount of look-ahead

[Figure: productions S → E + S and S → E, shown with a parse tree of nested S and E nodes over an input of the form (...) + (...) + (...) + (...) ...]
8
How to create an LL(1) grammar
  • Write a right-recursive grammar
  • S → E + S
  • S → E
  • Left-factor common prefixes, place suffix in new
    non-terminal
  • S → E S'
  • S' → ε
  • S' → + S

9
Right Recursion
(1 + 2 + (3 + 4)) + 5
  • Right recursion: right-associative
  • Left recursion: left-associative

[Figure: two parse trees for (1 + 2 + (3 + 4)) + 5, one built with right recursion and one with left recursion]
10
Associativity
  • We can provide left-associativity by massaging
    the recursive-descent code

void parse_S() {
  switch (token) {
    case '(':
    case number: parse_E(); parse_S_prime(); return;
    default: throw new ParseError();
  }
}
void parse_S_prime() {  // parses S'
  switch (token) {
    case '+': token = input.read(); parse_S(); return;
    case ')': return;
    case EOF: return;
    default: throw new ParseError();
  }
}
11
Associativity
void parse_S() {  // parses a sequence of E + E + E ...
  switch (token) {
    case '(': case number:
      parse_E();
      switch (token) {
        case '+': token = input.read(); parse_S(); return;  // tail recursion
        case ')': return;
        case EOF: return;
        default: throw new ParseError();
      }
      return;
    default: throw new ParseError();
  }
}
12
Flattening Associative Operators
void parse_S() {  // parses an arbitrary sequence of E + E + E ...
  while (true) {
    switch (token) {
      case '(':
      case number:
        parse_E();
        switch (token) {
          case '+': token = input.read(); break;
          case ')': case EOF: return;
          default: throw new ParseError();
        }
        break;
      default: throw new ParseError();
    }
  }
}

[Figure: flattened tree for (1 + 2 + (3 + 4)) + 5]
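
The loop on this slide can be turned into a small runnable program. The following is only a sketch under stated assumptions: the class name SumParser, the character-level "lexer", the '$' end marker, and the "sum[...]" string output are all invented here for illustration and are not part of the course code. It shows how the while loop collects all operands of E + E + E ... into one flat list instead of a nested right-leaning tree.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: iterative parser for sums, E -> number | ( S ). */
class SumParser {
    private final String input;
    private int pos = 0;

    SumParser(String input) { this.input = input.replace(" ", ""); }

    /** Current lookahead character; '$' stands in for EOF (an assumption). */
    private char token() { return pos < input.length() ? input.charAt(pos) : '$'; }

    /** Parses E + E + E ... and returns the operands as one flat list. */
    String parseS() {
        List<String> operands = new ArrayList<>();
        while (true) {
            char t = token();
            if (t != '(' && !Character.isDigit(t)) {
                throw new IllegalStateException("unexpected '" + t + "' at " + pos);
            }
            operands.add(parseE());
            switch (token()) {
                case '+': pos++; break;               // consume '+', loop for the next E
                case ')': case '$': return "sum" + operands;
                default: throw new IllegalStateException("unexpected '" + token() + "' at " + pos);
            }
        }
    }

    /** Parses a number or a parenthesized sum. */
    String parseE() {
        if (token() == '(') {
            pos++;
            String inner = parseS();
            if (token() != ')') throw new IllegalStateException("expected ')' at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (Character.isDigit(token())) pos++;
        if (start == pos) throw new IllegalStateException("expected number at " + pos);
        return input.substring(start, pos);
    }

    /** Parses a whole input string and rejects trailing characters. */
    static String parse(String text) {
        SumParser p = new SumParser(text);
        String tree = p.parseS();
        if (p.token() != '$') throw new IllegalStateException("trailing input at " + p.pos);
        return tree;
    }
}
```

Calling SumParser.parse("(1 + 2 + (3 + 4)) + 5") yields a two-operand sum at the top level whose first operand is itself a flat three-operand sum.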
13
Summary
  • Now have complete recipe for building a parser

Language grammar → LL(1) grammar → predictive parse table → recursive-descent parser → recursive-descent parser w/ AST generation
14
Bottom-up parsing
  • A more powerful parsing technology
  • LR grammars -- more power than LL
  • can handle left-recursive grammars, virtually all
    programming languages
  • More natural expression of programming language
    syntax
  • Shift-reduce parsers
  • automatic parser generators (e.g. yacc)
  • detect errors as soon as possible
  • allows better error recovery

15
Top-down parsing
S → S + E | E     E → number | ( S )
(1+2+(3+4))+5
  • S ⇒ S+E ⇒ E+E ⇒ (S)+E ⇒ (S+E)+E ⇒ (S+E+E)+E
    ⇒ (E+E+E)+E ⇒ (1+E+E)+E ⇒ (1+2+E)+E ...
  • In left-most derivation, entire tree above a
    token (2) has been expanded when encountered
  • Must be able to predict!

[Figure: parse tree for (1+2+(3+4))+5]
16
Bottom-up parsing
S → S + E | E     E → number | ( S )
  • Right-most derivation -- backward
  • Start with the tokens
  • End with the start symbol
  • (1+2+(3+4))+5 ⇐ (E+2+(3+4))+5 ⇐ (S+2+(3+4))+5
    ⇐ (S+E+(3+4))+5 ⇐ (S+(3+4))+5 ⇐ (S+(E+4))+5
    ⇐ (S+(S+4))+5 ⇐ (S+(S+E))+5 ⇐ (S+(S))+5 ⇐ (S+E)+5
    ⇐ (S)+5 ⇐ E+5 ⇐ S+5 ⇐ S+E ⇐ S

17
Bottom-up parsing
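S → S + E | E     E → number | ( S )
right-most derivation (backward)     input (^ separates scanned from unscanned)
(1+2+(3+4))+5  ⇐                     ^(1+2+(3+4))+5
(E+2+(3+4))+5  ⇐                     (1 ^ +2+(3+4))+5
(S+2+(3+4))+5  ⇐                     (1 ^ +2+(3+4))+5
(S+E+(3+4))+5  ⇐                     (1+2 ^ +(3+4))+5
(S+(3+4))+5    ⇐                     (1+2 ^ +(3+4))+5
(S+(E+4))+5    ⇐                     (1+2+(3 ^ +4))+5
(S+(S+4))+5    ⇐                     (1+2+(3 ^ +4))+5
(S+(S+E))+5    ⇐                     (1+2+(3+4 ^ ))+5
(S+(S))+5      ⇐                     (1+2+(3+4 ^ ))+5
(S+E)+5        ⇐                     (1+2+(3+4) ^ )+5
(S)+5          ⇐                     (1+2+(3+4) ^ )+5
E+5            ⇐                     (1+2+(3+4)) ^ +5
S+5            ⇐                     (1+2+(3+4)) ^ +5
S+E            ⇐                     (1+2+(3+4))+5 ^
S                                    (1+2+(3+4))+5 ^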
18
Bottom-up parsing
S → S + E | E     E → number | ( S )
  • (1+2+(3+4))+5 ⇐ (E+2+(3+4))+5 ⇐ (S+2+(3+4))+5
    ⇐ (S+E+(3+4))+5
  • Advantage of bottom-up parsing: can select
    productions based on more information

[Figure: parse tree for (1+2+(3+4))+5]
19
Top-down vs. Bottom-up
  • Bottom-up: don't need to figure out as much of
    the parse tree for a given amount of input

[Figure: two parse-tree diagrams, labeled Top-down and Bottom-up, each with the input divided into scanned and unscanned parts]
20
Shift-reduce parsing
  • Parsing is a sequence of shift and reduce
    operations
  • Parser state is a stack of terminals and
    non-terminals (grows to the right)
  • Unconsumed input is a string of terminals
  • Current derivation step is always stack + input
  • Shift -- push head of input onto stack
  • stack     input
  • (         1+2+(3+4))+5
  • (1        +2+(3+4))+5

21
Reduce
  • Replace symbols γ in top of stack with
    non-terminal symbol X, corresponding to
    production X → γ (pop γ, push X)
  • stack     input
  • (S+E      +(3+4))+5     reduce S → S+E
  • (S        +(3+4))+5
  • What effect does this have on derivation?

22
Shift-reduce parsing
S → S + E | E     E → number | ( S )
derivation          stack     input stream      action
(1+2+(3+4))+5 ⇐               (1+2+(3+4))+5     shift
(1+2+(3+4))+5 ⇐     (         1+2+(3+4))+5      shift
(1+2+(3+4))+5 ⇐     (1        +2+(3+4))+5       reduce E → num
(E+2+(3+4))+5 ⇐     (E        +2+(3+4))+5       reduce S → E
(S+2+(3+4))+5 ⇐     (S        +2+(3+4))+5       shift
(S+2+(3+4))+5 ⇐     (S+       2+(3+4))+5        shift
(S+2+(3+4))+5 ⇐     (S+2      +(3+4))+5         reduce E → num
(S+E+(3+4))+5 ⇐     (S+E      +(3+4))+5         reduce S → S+E
(S+(3+4))+5 ⇐       (S        +(3+4))+5         shift
(S+(3+4))+5 ⇐       (S+       (3+4))+5          shift
(S+(3+4))+5 ⇐       (S+(      3+4))+5           shift
(S+(3+4))+5 ⇐       (S+(3     +4))+5            reduce E → num
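
The shift and reduce steps in this trace can be reproduced with a short program. This is only a sketch: the class ShiftReduceDemo, the single-character tokens (single-digit numbers, no spaces), and above all the hand-written "reduce whenever the top of the stack matches" rule are assumptions made for this one grammar. It is not the LR construction; choosing actions in general is exactly the problem the following slides address.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: shift-reduce recognizer for S -> S+E | E, E -> number | (S). */
class ShiftReduceDemo {
    /** Runs the parser and returns the list of actions it performed. */
    static List<String> parse(String input) {
        StringBuilder stack = new StringBuilder();   // grows to the right, as on the slide
        List<String> actions = new ArrayList<>();
        int pos = 0;
        while (true) {
            String rule = tryReduce(stack);
            if (rule != null) {                      // reduce whenever some rule applies
                actions.add("reduce " + rule);
                continue;
            }
            if (pos == input.length()) break;        // nothing to reduce, nothing to shift
            stack.append(input.charAt(pos++));       // shift: push head of input
            actions.add("shift");
        }
        if (!stack.toString().equals("S")) {
            throw new IllegalStateException("parse error, stack = " + stack);
        }
        return actions;
    }

    /** Applies one reduction to the top of the stack; returns its name or null. */
    static String tryReduce(StringBuilder st) {
        int n = st.length();
        if (n == 0) return null;
        String s = st.toString();
        if (Character.isDigit(s.charAt(n - 1))) {    // pop number, push E
            st.setCharAt(n - 1, 'E');
            return "E->num";
        }
        if (s.endsWith("S+E")) {                     // pop S+E, push S
            st.setLength(n - 3);
            st.append('S');
            return "S->S+E";
        }
        if (s.endsWith("E")) {                       // E not preceded by S+ : pop E, push S
            st.setCharAt(n - 1, 'S');
            return "S->E";
        }
        if (s.endsWith("(S)")) {                     // pop (S), push E
            st.setLength(n - 3);
            st.append('E');
            return "E->(S)";
        }
        return null;
    }
}
```

For the input (1+2+(3+4))+5 the first twelve actions match the rows of the table above; the order in which tryReduce tests the rules matters, since S+E must be checked before the lone E.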
23
Problem
  • How do we know which action to take -- whether to
    shift or reduce, and which production?
  • Sometimes can reduce but shouldn't
  • e.g., X → ε can always be reduced
  • Sometimes can reduce in different ways

24
Action Selection Problem
  • Given stack σ and input symbol b, should we
  • shift b onto the stack (making it σb)
  • reduce some production X → γ assuming that stack
    has the form αγ (making it αX)
  • Should apply reduction X → γ depending on what
    stack prefix α is -- but α is different for
    different possible reductions, since γ's have
    different length. How to keep track?

25
Parser States
  • Idea: summarize all possible stack prefixes α as
    a parser state
  • A state transition function updates the parser
    state as shifts and reductions are performed: a DFA
  • Summarizing discards information
  • affects what grammars parser handles
  • affects size of DFA (number of states)

26
LR(0) parser
  • Left-to-right scanning, Right-most derivation,
    zero look-ahead characters
  • Too weak to handle most language grammars
    (including this one)
  • But will help us understand how to build better
    parsers

27
LR(0) states
  • A state is a set of items
  • An LR(0) item is a production from the language
    with a separator . somewhere in the RHS of the
    production
  • Stuff before . : already on stack (beginnings of
    possible γ's)
  • Stuff after . : what we might see next
  • The prefixes α represented by state

[Figure: a state drawn as a box containing the items E → number . and E → ( . S )]
28
An LR(0) grammar: non-empty lists
  • S → ( L )
  • S → id
  • L → S
  • L → L , S
  • Example inputs: x    (x,y)    (x, (y,z), w)
  • ((((x))))    (x, (y, (z, w)))

29
Closure
S → ( L ) | id     L → S | L , S
start state:            S' → . S
closure of that state:  S' → . S     S → . ( L )     S → . id
  • Closure of a state adds items for all
    productions whose LHS occurs in an item in the
    state, just after .
  • Added items have the . located at the
    beginning
  • Like NFA → DFA conversion
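
The closure rule above can be sketched in a few lines of Java. Everything about the representation is an assumption made for illustration: the class name Lr0Closure, items stored as space-separated strings such as "S' -> . S", and the added start production S' -> S.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: LR(0) closure for the list grammar on these slides. */
class Lr0Closure {
    // Each row is {LHS, RHS}; the first row is the added start production.
    static final String[][] PRODUCTIONS = {
        {"S'", "S"},
        {"S", "( L )"},
        {"S", "id"},
        {"L", "S"},
        {"L", "L , S"},
    };

    /** Returns the symbol right after the dot, or null if the dot is last. */
    static String afterDot(String item) {
        String[] parts = item.split(" ");
        for (int i = 0; i < parts.length - 1; i++) {
            if (parts[i].equals(".")) return parts[i + 1];
        }
        return null;
    }

    /** Adds X -> . rhs for every non-terminal X that appears just after a dot. */
    static Set<String> closure(Set<String> kernel) {
        Set<String> state = new LinkedHashSet<>(kernel);
        Deque<String> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String next = afterDot(work.pop());
            for (String[] p : PRODUCTIONS) {
                if (p[0].equals(next)) {             // terminals match no LHS
                    String item = p[0] + " -> . " + p[1];
                    if (state.add(item)) work.push(item);   // only new items are revisited
                }
            }
        }
        return state;
    }
}
```

Closing the single item "S' -> . S" gives the three-item start state shown on this slide; closing "S -> ( . L )" pulls in both L productions and, through "L -> . S", both S productions.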

30
Applying shift actions
S → ( L ) | id     L → S | L , S
start state:     S' → . S     S → . ( L )     S → . id
on ( go to:      S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
                 (this state also has a ( edge back to itself)
on id go to:     S → id .     (reached on id from both states above)
In new state, include all items that have
appropriate input symbol just after dot, and
advance dot in those items. (and take closure)
31
Applying reduce actions
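S → ( L ) | id     L → S | L , S
start state:                  S' → . S     S → . ( L )     S → . id
on ( go to:                   S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
from the ( state, on L:       S → ( L . )     L → L . , S
from the ( state, on S:       L → S .      (state causing a reduction)
on id go to:                  S → id .     (state causing a reduction)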
  • Need to set state after reducing
  • On reduction, pop back to old state and take DFA
    transition on non-terminal reduced

32
Full DFA (Appel p. 63)
States:
  1: S' → . S     S → . ( L )     S → . id
  2: S → id .
  3: S → ( . L )     L → . S     L → . L , S     S → . ( L )     S → . id
  4: S' → S .     (final state)
  5: S → ( L . )     L → L . , S
  6: S → ( L ) .
  7: L → S .
  8: L → L , . S     S → . ( L )     S → . id
  9: L → L , S .
Transitions:
  from 1:  ( → 3     id → 2     S → 4
  from 3:  ( → 3     id → 2     S → 7     L → 5
  from 5:  ) → 6     , → 8
  from 8:  ( → 3     id → 2     S → 9
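
The whole DFA can be generated mechanically by repeating the closure and shift (goto) steps until no new states appear. The sketch below is an illustration under assumptions, not the construction from the textbook's code: the class Lr0Automaton, the string encoding of items, and the added production S' -> S are invented here, and states are identified by their item sets rather than by the numbers 1 to 9 used on the slide.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: builds the LR(0) DFA for the list grammar. */
class Lr0Automaton {
    static final String[][] PRODUCTIONS = {
        {"S'", "S"}, {"S", "( L )"}, {"S", "id"}, {"L", "S"}, {"L", "L , S"},
    };

    /** Returns the symbol right after the dot, or null if the dot is last. */
    static String afterDot(String item) {
        String[] parts = item.split(" ");
        for (int i = 0; i < parts.length - 1; i++) {
            if (parts[i].equals(".")) return parts[i + 1];
        }
        return null;
    }

    /** Adds X -> . rhs for every non-terminal X found just after a dot. */
    static Set<String> closure(Set<String> kernel) {
        Set<String> state = new LinkedHashSet<>(kernel);
        Deque<String> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String next = afterDot(work.pop());
            for (String[] p : PRODUCTIONS) {
                String item = p[0] + " -> . " + p[1];
                if (p[0].equals(next) && state.add(item)) work.push(item);
            }
        }
        return state;
    }

    /** Advances the dot over x in every item that has x after the dot, then closes. */
    static Set<String> gotoState(Set<String> state, String x) {
        Set<String> kernel = new LinkedHashSet<>();
        for (String item : state) {
            if (x.equals(afterDot(item))) {
                kernel.add(item.replace(". " + x, x + " ."));
            }
        }
        return closure(kernel);
    }

    /** Worklist construction: maps each state to its outgoing edges. */
    static Map<Set<String>, Map<String, Set<String>>> build() {
        Set<String> start = closure(new LinkedHashSet<>(Arrays.asList("S' -> . S")));
        Map<Set<String>, Map<String, Set<String>>> dfa = new LinkedHashMap<>();
        Deque<Set<String>> work = new ArrayDeque<>();
        dfa.put(start, new LinkedHashMap<>());
        work.add(start);
        while (!work.isEmpty()) {
            Set<String> state = work.remove();
            for (String item : state) {
                String x = afterDot(item);
                if (x == null || dfa.get(state).containsKey(x)) continue;
                Set<String> target = gotoState(state, x);
                dfa.get(state).put(x, target);
                if (!dfa.containsKey(target)) {      // a state not seen before
                    dfa.put(target, new LinkedHashMap<>());
                    work.add(target);
                }
            }
        }
        return dfa;
    }
}
```

For this grammar the construction stops after finding nine states and twelve edges, the same counts as in the diagram above.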
33
S → ( L ) | id     L → S | L , S
  • Idea: stack is labeled w/ state
  • Let's try parsing ((x),y)
derivation     stack                 input       action
((x),y) ⇐      1                     ((x),y)     shift, goto 3
((x),y) ⇐      1 (3                  (x),y)      shift, goto 3
((x),y) ⇐      1 (3 (3               x),y)       shift, goto 2
((x),y) ⇐      1 (3 (3 x2            ),y)        reduce S → id
((S),y) ⇐      1 (3 (3 S7            ),y)        reduce L → S
((L),y) ⇐      1 (3 (3 L5            ),y)        shift, goto 6
((L),y) ⇐      1 (3 (3 L5 )6         ,y)         reduce S → ( L )
(S,y) ⇐        1 (3 S7               ,y)         reduce L → S
(L,y) ⇐        1 (3 L5               ,y)         shift, goto 8
(L,y) ⇐        1 (3 L5 ,8            y)          shift, goto 2
(L,y) ⇐        1 (3 L5 ,8 y2         )           reduce S → id
(L,S) ⇐        1 (3 L5 ,8 S9         )           reduce L → L , S
(L) ⇐          1 (3 L5               )           shift, goto 6
(L) ⇐          1 (3 L5 )6                        reduce S → ( L )
S              1 S4                              done
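
The trace above can be replayed by a small table-driven loop. This is a sketch only: the class Lr0Driver, the "state symbol" string keys, and the use of the token "id" for both x and y are assumptions made for illustration. The edges and reducing states are copied from the Full DFA slide.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: LR(0) parser driver over the DFA from the slides. */
class Lr0Driver {
    // DFA edges keyed as "state symbol".
    static final Map<String, Integer> GOTO = new HashMap<>();
    // Reducing states: state -> {LHS, length of RHS}.
    static final Map<Integer, String[]> REDUCE = new HashMap<>();
    static {
        String[] edges = {"1 ( 3", "1 id 2", "1 S 4", "3 ( 3", "3 id 2", "3 S 7",
                          "3 L 5", "5 ) 6", "5 , 8", "8 ( 3", "8 id 2", "8 S 9"};
        for (String e : edges) {
            String[] p = e.split(" ");
            GOTO.put(p[0] + " " + p[1], Integer.parseInt(p[2]));
        }
        REDUCE.put(2, new String[] {"S", "1"});      // S -> id
        REDUCE.put(6, new String[] {"S", "3"});      // S -> ( L )
        REDUCE.put(7, new String[] {"L", "1"});      // L -> S
        REDUCE.put(9, new String[] {"L", "3"});      // L -> L , S
    }

    /** Parses the tokens and returns the actions taken, ending in "accept". */
    static List<String> parse(List<String> tokens) {
        Deque<Integer> states = new ArrayDeque<>();  // stack of state labels only
        states.push(1);
        List<String> actions = new ArrayList<>();
        int pos = 0;
        while (true) {
            int top = states.peek();
            if (top == 4 && pos == tokens.size()) {  // final state, input consumed
                actions.add("accept");
                return actions;
            }
            String[] rule = REDUCE.get(top);
            if (rule != null) {
                // Pop one state per RHS symbol, then follow the edge on the LHS.
                for (int i = 0; i < Integer.parseInt(rule[1]); i++) states.pop();
                Integer target = GOTO.get(states.peek() + " " + rule[0]);
                if (target == null) throw new IllegalStateException("no goto on " + rule[0]);
                states.push(target);
                actions.add("reduce to " + rule[0] + ", goto " + target);
            } else {
                if (pos == tokens.size()) throw new IllegalStateException("unexpected end of input");
                Integer target = GOTO.get(top + " " + tokens.get(pos));
                if (target == null) throw new IllegalStateException("syntax error at token " + pos);
                states.push(target);
                pos++;
                actions.add("shift, goto " + target);
            }
        }
    }
}
```

Because the grammar is LR(0), the state on top of the stack alone decides between shifting and reducing; no look-ahead token is consulted before a reduction.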

34
Summary
  • Grammars can be parsed bottom-up using a DFA +
    stack
  • State construction converts grammar into states
    that capture information needed to know what
    action to take
  • Stack entries labeled by state index
  • Next time: SLR, LR(1) parsers, automatic parser
    generators