Title: Bottom-Up Syntax Analysis
1Bottom-Up Syntax Analysis
- Mooly Sagiv
- http://www.math.tau.ac.il/~msagiv/courses/wcc03.html
- Textbook: Modern Compiler Design
- Chapter 2.2.5
2Efficient Parsers
- Pushdown automata
- Deterministic
- Report an error as soon as the input is not a prefix of a valid program
- Not usable for all context free grammars
[Figure: bison, with the labels "Ambiguity errors" and "parse tree"]
3Kinds of Parsers
- Top-Down (Predictive Parsing): LL
  - Construct parse tree in a top-down manner
  - Find the leftmost derivation
  - For every non-terminal and token, predict the next production
- Bottom-Up: LR
  - Construct parse tree in a bottom-up manner
  - Find the rightmost derivation in reverse order
  - For every potential right-hand side and token, decide when a production is found
4Bottom-Up Syntax Analysis
- Input
  - A context free grammar
  - A stream of tokens
- Output
  - A syntax tree or error
- Method
  - Construct parse tree in a bottom-up manner
  - Find the rightmost derivation (in reverse order)
  - For every potential right-hand side and token, decide when a production is found
  - Report an error as soon as the input is not a prefix of a valid program
5Plan
- Pushdown automata
- Bottom-up parsing (informal)
- Bottom-up parsing (given a parser table)
- Constructing the parser table
- Interesting non-LR grammars
6Pushdown Automaton
[Figure: pushdown automaton. Labels: input (u, t, w), V, control, parser-table, stack]
7Informal Example
S → E $    E → T | E + T    T → i | ( E )
shift
8Informal Example
S → E $    E → T | E + T    T → i | ( E )
reduce T → i
9Informal Example
S → E $    E → T | E + T    T → i | ( E )
stack: T
reduce E → T
10Informal Example
S → E $    E → T | E + T    T → i | ( E )
shift
11Informal Example
S → E $    E → T | E + T    T → i | ( E )
shift
12Informal Example
S → E $    E → T | E + T    T → i | ( E )
reduce T → i
13Informal Example
S → E $    E → T | E + T    T → i | ( E )
reduce E → E + T
14Informal Example
S → E $    E → T | E + T    T → i | ( E )
shift
15Informal Example
S → E $    E → T | E + T    T → i | ( E )
stack: E $
reduce S → E $
16Informal Example
reduce S → E $
reduce E → E + T
reduce T → i
reduce E → T
reduce T → i
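The list of reductions, read from the last one performed to the first, is a rightmost derivation of the input. A minimal Python sketch of that observation, assuming the input of the informal example is i + i $ and the start production is S → E $ (the helper name rightmost_derivation is hypothetical):

```python
# Nonterminals of the example grammar; every other symbol is a token.
NONTERMINALS = {"S", "E", "T"}

def rightmost_derivation(start, productions):
    """Apply each production to the rightmost nonterminal of the sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in productions:
        # Position of the rightmost nonterminal in the current form.
        pos = max(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        assert form[pos] == lhs, f"expected {lhs}, found {form[pos]}"
        form[pos:pos + 1] = rhs
        steps.append(" ".join(form))
    return steps

# Reductions in the order a bottom-up parser performs them on "i + i $".
reductions = [
    ("T", ["i"]),
    ("E", ["T"]),
    ("T", ["i"]),
    ("E", ["E", "+", "T"]),
    ("S", ["E", "$"]),
]

# Reversing the reductions yields the rightmost derivation.
derivation = rightmost_derivation("S", list(reversed(reductions)))
print(" => ".join(derivation))
```

The internal assert fails if the reversed list is not a rightmost derivation, so the sketch also checks the claim it illustrates.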
17Handles
- Identify the leftmost node (nonterminal) that has not been constructed but all of whose children have been constructed
- A → α
18Identifying Handles
- Create a finite state automaton over grammar symbols
  - Accepting states identify handles
- Report if the grammar is inadequate
- Push automaton states onto the parser stack
- Use the automaton states to decide
  - Shift/Reduce
19Example Finite State Automaton
20Example Control Table
21Example Control Table (2)
state  i    +    (    )    $    E    T
0      s5   err  s7   err  err  1    6
1      err  s3   err  err  s2
2      acc  acc  acc  acc  acc
3      s5   err  s7   err  err       4
4      reduce E → E + T    (all token columns)
5      reduce T → i        (all token columns)
6      reduce E → T        (all token columns)
7      s5   err  s7   err  err  8    6
8      err  s3   err  s9   err
9      reduce T → ( E )    (all token columns)
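The control table can drive a parser directly. The following is a minimal Python sketch, not the course's implementation: it assumes the token columns are i, +, (, ), $ and that a missing entry means err. The names SHIFT, GOTO, REDUCE and parse are chosen here for illustration.

```python
# Shift and goto entries copied from the control table (assumed columns: i + ( ) $ E T).
SHIFT = {
    0: {"i": 5, "(": 7},
    1: {"+": 3, "$": 2},
    3: {"i": 5, "(": 7},
    7: {"i": 5, "(": 7},
    8: {"+": 3, ")": 9},
}
GOTO = {0: {"E": 1, "T": 6}, 3: {"T": 4}, 7: {"E": 8, "T": 6}}
# LR(0) reduce states: the reduction does not depend on the next token.
REDUCE = {
    4: ("E", ["E", "+", "T"]),
    5: ("T", ["i"]),
    6: ("E", ["T"]),
    9: ("T", ["(", "E", ")"]),
}
ACCEPT = 2

def parse(tokens):
    """Return the reductions performed, or raise SyntaxError at the first bad token."""
    stack = [0]      # stack of automaton states; state 0 is at the bottom
    pos = 0
    reductions = []
    while True:
        state = stack[-1]
        if state == ACCEPT:
            reductions.append("S -> E $")
            return reductions
        if state in REDUCE:
            lhs, rhs = REDUCE[state]
            del stack[len(stack) - len(rhs):]     # pop one state per rhs symbol
            stack.append(GOTO[stack[-1]][lhs])    # goto on the reduced nonterminal
            reductions.append(f"{lhs} -> {' '.join(rhs)}")
            continue
        token = tokens[pos] if pos < len(tokens) else None
        target = SHIFT.get(state, {}).get(token)
        if target is None:
            raise SyntaxError(f"unexpected {token!r} at position {pos} in state {state}")
        stack.append(target)                      # shift: push the new state
        pos += 1

print(parse(["i", "+", "i", "$"]))
```

Calling parse(list("((i)$")) raises SyntaxError in state 8 on the token $, which corresponds to the err entry for state 8 in the $ column.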
22(control table as on slide 21)
stack (top first): s0()
input: i + i $
action: shift 5
23(control table as on slide 21)
stack (top first): s5(i) s0()
input: + i $
action: reduce T → i
24(control table as on slide 21)
stack (top first): s6(T) s0()
input: + i $
action: reduce E → T
25(control table as on slide 21)
stack (top first): s1(E) s0()
input: + i $
action: shift 3
26(control table as on slide 21)
stack (top first): s3(+) s1(E) s0()
input: i $
action: shift 5
27(control table as on slide 21)
stack (top first): s5(i) s3(+) s1(E) s0()
input: $
action: reduce T → i
28(control table as on slide 21)
stack (top first): s4(T) s3(+) s1(E) s0()
input: $
action: reduce E → E + T
29(control table as on slide 21)
stack (top first): s1(E) s0()
input: $
action: shift 2
30(control table as on slide 21)
stack (top first): s2($) s1(E) s0()
input: (empty)
action: reduce S → E $ (accept)
31(control table as on slide 21)
stack (top first): s0()
input: ( ( i ) $
action: shift 7
32(control table as on slide 21)
stack (top first): s7(() s0()
input: ( i ) $
action: shift 7
33(control table as on slide 21)
stack (top first): s7(() s7(() s0()
input: i ) $
action: shift 5
34(control table as on slide 21)
stack (top first): s5(i) s7(() s7(() s0()
input: ) $
action: reduce T → i
35(control table as on slide 21)
stack (top first): s6(T) s7(() s7(() s0()
input: ) $
action: reduce E → T
36(control table as on slide 21)
stack (top first): s8(E) s7(() s7(() s0()
input: ) $
action: shift 9
37(control table as on slide 21)
stack (top first): s9()) s8(E) s7(() s7(() s0()
input: $
action: reduce T → ( E )
38(control table as on slide 21)
stack (top first): s6(T) s7(() s0()
input: $
action: reduce E → T
39(control table as on slide 21)
stack (top first): s8(E) s7(() s0()
input: $
action: err
40Grammar Hierarchy
Non-ambiguous CFG
CLR(1)
LL(1)
LALR(1)
SLR(1)
LR(0)
41Constructing LR(0) parsing table
- Add a production S' → S
- Construct a finite automaton accepting valid stack symbols
  - States are sets of items A → α • β
- The states of the automaton become the states of the parsing table
- Determine shift operations
- Determine goto operations
- Determine reduce operations
42Example Finite State Automaton
43Constructing a Finite Automaton
- NFA
- For X → X1 X2 … Xn
  - X → X1 X2 … Xi • Xi+1 … Xn
  - prefixes of rhs (handles)
  - X1 X2 … Xi is at the top of the stack and we expect Xi+1 … Xn
- The initial state: S' → • S
- δ(X → X1 … Xi • Xi+1 … Xn, Xi+1) = X → X1 … Xi Xi+1 • … Xn
- For every production Xi+1 → α: δ(X → X1 X2 … Xi • Xi+1 … Xn, ε) = Xi+1 → • α
- Convert into DFA
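The construction above can be sketched in Python by folding the ε moves into a closure function and running the usual subset construction. This is only an illustration under assumptions: it uses the example grammar S → E $, E → T | E + T, T → i | ( E ), encodes an item as a pair (production index, dot position), and numbers states in discovery order, which need not match the numbering on the slides.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("i",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Follow epsilon moves: for a dot before nonterminal X, add every X -> . alpha."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (j, 0) not in result:
                    result.add((j, 0))
                    work.append((j, 0))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None

def build_dfa():
    """Subset construction: explore every state reachable from the initial item."""
    start = closure({(0, 0)})
    states, edges, work = [start], {}, [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols):
            target = goto(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges

def show(item):
    """Render an item such as (2, 1) as 'E -> E . + T'."""
    prod, dot = item
    lhs, rhs = GRAMMAR[prod]
    return f"{lhs} -> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"

states, edges = build_dfa()
print(len(states), "states")
for item in sorted(states[0]):
    print(show(item))
```

Representing a DFA state as a frozenset of items makes equal item sets compare equal, so the subset construction needs no separate state naming step.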
44S → E $    E → T | E + T    T → i | ( E )
45Example Finite State Automaton
46Filling Parsing Table
- A state si
- Reduce A → α
  - A → α • ∈ si
- Shift
  - A → α • t β ∈ si
- Goto(si, X) = sj
  - A → α • X β ∈ si
  - δ(si, X) = sj
- When conflicts occur, the grammar is not LR(0)
47Example Finite State Automaton
48Example Control Table
(control table as on slide 21)
49Example
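S → E $    E → E + E | i
50(LR(0) automaton and control table for the grammar of slide 49)
state 0: S → • E $    E → • E + E    E → • i
state 1: E → i •
state 2: S → E • $    E → E • + E
state 3: S → E $ •
state 4: E → E + • E    E → • E + E    E → • i
state 5: E → E + E •    E → E • + E
transitions: 0 on i to 1, 0 on E to 2, 2 on + to 4, 2 on $ to 3, 4 on i to 1, 4 on E to 5, 5 on + to 4
state  i       +       $       E
0      s1      err     err     2
1      reduce E → i    (all token columns)
2      err     s4      s3
3      accept  accept  accept
4      s1                      5
5      red     s4/red  red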
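The s4/red entry in state 5 is a shift/reduce conflict, and for this grammar it reflects a genuine ambiguity: E → E + E | i does not say how to group i + i + i. A small Python sketch that counts the parse trees of a sum with n operands makes this concrete (count_trees is a hypothetical helper, not part of the slides):

```python
from functools import lru_cache

def count_trees(n):
    """Number of parse trees of i + i + ... + i (n operands) under E -> E + E | i."""
    @lru_cache(maxsize=None)
    def trees(k):
        if k == 1:
            return 1          # a single operand: E -> i
        # Split at each top-level '+': `left` operands on one side, k - left on the other.
        return sum(trees(left) * trees(k - left) for left in range(1, k))
    return trees(n)

for n in range(1, 5):
    print(n, count_trees(n))
```

Any input with more than two operands has more than one tree, so no deterministic parser can be built for this grammar without extra rules such as precedence and associativity declarations.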
51Interesting Non LR(0) Grammar
S' → S    S → L = R | R    L → * R | id    R → L
Partial DFA
initial state: S' → • S    S → • L = R    S → • R    L → • * R    L → • id    R → • L
on L: S → L • = R    R → L •
on =: S → L = • R    R → • L    L → • * R    L → • id
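The LR(0) table-filling rules can be applied to a single state to see why it is inadequate. The sketch below is an illustration only: it encodes an item as a triple (lhs, rhs, dot position), and the function names lr0_actions and conflicts are chosen here, not taken from the slides.

```python
def lr0_actions(state, terminals):
    """Return (shift tokens, reducible productions) for one LR(0) state."""
    shifts = set()
    reduces = []
    for lhs, rhs, dot in sorted(state):
        if dot == len(rhs):
            reduces.append((lhs, rhs))     # A -> alpha .        : reduce
        elif rhs[dot] in terminals:
            shifts.add(rhs[dot])           # A -> alpha . t beta : shift t
    return shifts, reduces

def conflicts(state, terminals):
    """List the kinds of LR(0) conflict present in the state."""
    shifts, reduces = lr0_actions(state, terminals)
    found = []
    if shifts and reduces:
        found.append("shift/reduce")
    if len(reduces) > 1:
        found.append("reduce/reduce")
    return found

# State reached on L in the partial DFA: S -> L . = R and R -> L .
state_after_L = {("S", ("L", "=", "R"), 1), ("R", ("L",), 1)}
print(conflicts(state_after_L, {"=", "*", "id"}))

# State 5 of the E -> E + E | i automaton: E -> E + E . and E -> E . + E
state_5 = {("E", ("E", "+", "E"), 3), ("E", ("E", "+", "E"), 1)}
print(conflicts(state_5, {"i", "+", "$"}))
```

Both states mix a completed item with an item that still expects a token, which is exactly the situation an LR(0) table cannot encode in one entry.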
52LR(1) Parser
- Item [A → α • β, t]
  - α is at the top of the stack and we are expecting βt
- LR(1) State
  - Sets of items
- LALR(1) State
  - Merge states whose items differ only in their look-ahead
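To see what the look-ahead buys, here is a minimal Python sketch of the LR(1) closure applied to the grammar S' → S, S → L = R | R, L → * R | id, R → L. It rests on stated assumptions: $ is used as the end-of-input look-ahead of the initial item, an item is a triple (production index, dot position, look-ahead), and FIRST is computed in a simplified way that is valid only because this grammar has no empty productions.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first(symbol, seen=frozenset()):
    """FIRST of one symbol; sufficient here because no production is empty."""
    if symbol not in NONTERMINALS:
        return {symbol}
    if symbol in seen:
        return set()
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == symbol:
            result |= first(rhs[0], seen | {symbol})
    return result

def closure(items):
    """LR(1) closure: [A -> a . X b, t] adds [X -> . g, u] for each u in FIRST(b t)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            rest = rhs[dot + 1:]
            lookaheads = first(rest[0]) if rest else {look}
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot]:
                    for t in lookaheads:
                        if (j, 0, t) not in result:
                            result.add((j, 0, t))
                            work.append((j, 0, t))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol`, keep each look-ahead, then close."""
    moved = {(p, d + 1, t) for p, d, t in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def reduce_lookaheads(state):
    """Map each completed production index to the look-aheads on which it reduces."""
    table = {}
    for p, d, t in state:
        if d == len(GRAMMAR[p][1]):
            table.setdefault(p, set()).add(t)
    return table

start = closure({(0, 0, "$")})
after_L = goto(start, "L")
print(sorted(after_L))
print(reduce_lookaheads(after_L))
```

In the state reached on L, the item R → L • carries only the look-ahead $, while the shift in S → L • = R applies to the token =, so the two actions no longer compete for the same table entry.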
53Interesting Non LR(1) Grammars
- Ambiguous
  - Arithmetic expressions
  - Dangling-else
- Common derived prefix
  - A → B1 a b | B2 a c
  - B1 → ε
  - B2 → ε
- Optional non-terminals
  - St → OptLab Ass
  - OptLab → id : | ε
  - Ass → id := Exp
54A motivating example
- Create a desk calculator
- Challenges
- Non trivial syntax
- Recursive expressions (semantics)
- Operator precedence
55Solution (lexical analysis)
/* desk.l */
%%
[0-9]+      { yylval = atoi(yytext); return NUM; }
"+"         { return PLUS; }
"-"         { return MINUS; }
"/"         { return DIV; }
"*"         { return MUL; }
"("         { return LPAR; }
")"         { return RPAR; }
"//".*[\n]  ; /* comment */
[\t\n ]     ; /* whitespace */
.           { error("illegal symbol", yytext[0]); }
56Solution (syntax analysis)
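/* desk.y */
%{
#include <stdio.h>   /* printf */
%}
%token NUM
%left PLUS MINUS
%left MUL DIV
%token LPAR RPAR
%start P
%%
P : E              { printf("%d\n", $1); }
  ;
E : NUM            { $$ = $1; }
  | LPAR E RPAR    { $$ = $2; }
  | E PLUS E       { $$ = $1 + $3; }
  | E MINUS E      { $$ = $1 - $3; }
  | E MUL E        { $$ = $1 * $3; }
  | E DIV E        { $$ = $1 / $3; }
  ;
%%
#include "lex.yy.c"

flex desk.l
bison -y desk.y
cc y.tab.c -ll -ly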
57Summary
- LR is a powerful technique
- Generates efficient parsers
- Generation tools exist
- Bison, yacc, CUP
- But some grammars need to be tuned
- Shift/Reduce conflicts
- Reduce/Reduce conflicts
- Efficiency of the generated parser