1
LR Parsing
  • Compiler
  • Baojian Hua
  • bjhua_at_ustc.edu.cn

2
Front End
source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR
3
Parsing
  • The parser translates the source program into
    abstract syntax trees
  • Input: token sequence
  • returned from the lexer
  • Output: abstract syntax tree
  • check validity of programs
  • form compiler internal data structures for
    programs
  • Must take the program syntax into account

4
Conceptually
token sequence -> parser -> abstract syntax tree
language syntax -> parser
5
Predictive Parsing
  • Grammars encode enough information to choose
    production rules when input terminals are
    seen
  • LL(1) pros
  • simple, easy to implement
  • efficient
  • Cons
  • grammar rewriting
  • ugly

6
Today's Topic
  • Bottom-up Parsing
  • shift-reduce parsing, LR parsing
  • This is the predominant algorithm used by
    automatic YACC-like parser generators
  • YACC, bison, CUP, etc.

7
Bottom-up Parsing
  • 1 S -> exp
  • 2 exp -> exp + term
  • 3 exp -> term
  • 4 term -> term * factor
  • 5 term -> factor
  • 6 factor -> ID
  • 7 factor -> INT

Input: 2 + 3 * 4
    2 + 3 * 4
    factor + 3 * 4
    term + 3 * 4
    exp + 3 * 4
    exp + factor * 4
    exp + term * 4
    exp + term * factor
    exp + term
    exp
    S
A reverse of right-most derivation!
8
Dot notation
  • As a convenient notation, we mark how much
    of the input we have consumed with a • symbol

exp + 3 • * 4
(left of the •: consumed; right of the •: remaining input)
9
Bottom-up Parsing
With the dot                 Sentential form
2 • + 3 * 4                  2 + 3 * 4
factor • + 3 * 4             factor + 3 * 4
term • + 3 * 4               term + 3 * 4
exp + 3 • * 4                exp + 3 * 4
exp + factor • * 4           exp + factor * 4
exp + term * 4 •             exp + term * 4
exp + term * factor •        exp + term * factor
exp + term •                 exp + term
exp •                        exp
S •                          S
10
Another View
Left of the •            Remaining input
(empty)                  2 + 3 * 4
2                        + 3 * 4
factor                   + 3 * 4
term                     + 3 * 4
exp                      + 3 * 4
exp +                    3 * 4
exp + 3                  * 4
exp + factor             * 4
exp + term               * 4
exp + term *             4
exp + term * 4
exp + term * factor
exp + term
exp
S

  • S -> exp
  • exp -> exp + term
  • exp -> term
  • term -> term * factor
  • term -> factor
  • factor -> ID
  • factor -> INT

What's the data structure on the left?
11
Producing a rightmost derivation in reverse
  • We do two things
  • shift a token (terminal) onto the stack, or
  • reduce the top n symbols on the stack by a
    production
  • When we reduce by a production A -> β
  • β is on the top of the stack, pop β
  • and push A
  • Key problem: when to shift and when to reduce?

12
Yet Another View
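Stack        Remaining input
(empty)      2 + 3 * 4
2            + 3 * 4
factor       + 3 * 4
term         + 3 * 4
exp          + 3 * 4

[Figure: partial parse tree built so far: E over T over F over the leaf 2]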
13
Yet Another View
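Stack            Remaining input
(empty)          2 + 3 * 4
2                + 3 * 4
factor           + 3 * 4
term             + 3 * 4
exp              + 3 * 4
exp +            3 * 4
exp + 3          * 4
exp + factor     * 4
exp + term       * 4

[Figure: parse tree under construction, with nodes S, E, T and F over the leaves 2, 3 and 4]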
14
A shift-reduce parser
  • Two components
  • Stack: holds the viable prefixes
  • Input stream: holds the remaining source
  • Four actions
  • shift: push a token from the input stream onto the stack
  • reduce: the right-hand side (β of A -> β) is at the top of
    the stack; pop β, push A
  • accept: success
  • error: syntax error discovered
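
The shift and reduce actions can be replayed by hand. The following is a minimal sketch, not the deck's own code: it assumes the expression grammar from the earlier slides, represents the input 2 + 3 * 4 by its token kinds, and takes the action sequence as given (deciding which action to take is exactly the "key problem" the tables solve later). The names PRODUCTIONS, TOKENS, ACTIONS and replay are illustrative only.

```python
# Productions of the expression grammar: number -> (left-hand side, right-hand side).
PRODUCTIONS = {
    1: ("S", ["exp"]),
    2: ("exp", ["exp", "+", "term"]),
    3: ("exp", ["term"]),
    4: ("term", ["term", "*", "factor"]),
    5: ("term", ["factor"]),
    6: ("factor", ["ID"]),
    7: ("factor", ["INT"]),
}

# Token kinds for the input 2 + 3 * 4.
TOKENS = ["INT", "+", "INT", "*", "INT"]

# Hand-written action sequence: "shift" or ("reduce", production number).
ACTIONS = [
    "shift", ("reduce", 7), ("reduce", 5), ("reduce", 3),
    "shift", "shift", ("reduce", 7), ("reduce", 5),
    "shift", "shift", ("reduce", 7), ("reduce", 4),
    ("reduce", 2), ("reduce", 1),
]

def replay(tokens, actions):
    """Apply the given actions and return a list of (stack, remaining input)."""
    stack, rest, trace = [], list(tokens), []
    for act in actions:
        if act == "shift":
            stack.append(rest.pop(0))
        else:
            lhs, rhs = PRODUCTIONS[act[1]]
            # A reduce is only legal when the right-hand side is on top.
            assert stack[-len(rhs):] == rhs, "handle is not on top of the stack"
            del stack[-len(rhs):]
            stack.append(lhs)
        trace.append((" ".join(stack), " ".join(rest)))
    return trace

if __name__ == "__main__":
    for stack, rest in replay(TOKENS, ACTIONS):
        print(f"{stack:<22} {rest}")
```

Each printed row pairs the stack with the remaining input, so the trace ends with S on the stack and no input left.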

15
Table-driven LR(k) parsers
Grammar -> Parser Generator -> Action table, GOTO table
Lexer -> tokens -> Parser Loop (uses Stack, Action table, GOTO table) -> AST
16
An LR parser
  • Put S on the stack in state s0
  • Parser configuration is (S, s0, X1, s1, X2, s2, ...,
    Xm, sm; ai ai+1 ... an $)
  • do forever
  • read ai
  • if action[ai, sm] is shift s, then the configuration becomes
    (S, s0, X1, s1, X2, s2, ..., Xm, sm, ai, s; ai+1 ... an $)
  • if action[ai, sm] is reduce A -> β, then the configuration becomes
    (S, s0, X1, s1, X2, s2, ..., Xm-|β|, sm-|β|, A, s; ai ai+1 ... an $),
    where s = goto[sm-|β|, A]
  • if action[ai, sm] is accept, DONE
  • if action[ai, sm] is error, handle error
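
The loop above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the deck's implementation: it hard-codes the action and GOTO entries of the list grammar (1 S -> ( L ), 2 S -> x, 3 L -> S, 4 L -> L , S) from the LR(0) Table slide, with state 3 shifting x to state 2 (the state holding S -> x •), and it keeps only states on the stack because each state implies its grammar symbol. The names PRODS, ACTION, GOTO and parse are made up for this example.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODS = {1: ("S", 3), 2: ("S", 1), 3: ("L", 1), 4: ("L", 3)}

# Shift and accept entries: (state, terminal) -> action.
ACTION = {
    (1, "("): ("shift", 3), (1, "x"): ("shift", 2),
    (3, "("): ("shift", 3), (3, "x"): ("shift", 2),
    (4, "$"): ("accept",),
    (5, ")"): ("shift", 6), (5, ","): ("shift", 8),
    (8, "("): ("shift", 3), (8, "x"): ("shift", 2),
}
# LR(0) reduce states reduce on every terminal.
for state, prod in [(2, 2), (6, 1), (7, 3), (9, 4)]:
    for terminal in "()x,$":
        ACTION[(state, terminal)] = ("reduce", prod)

# GOTO entries: (state, nonterminal) -> state.
GOTO = {(1, "S"): 4, (3, "S"): 7, (3, "L"): 5, (8, "S"): 9}

def parse(text):
    """Return True if the token string is accepted, False on a syntax error."""
    tokens = list(text) + ["$"]
    stack = [1]          # state stack; state 1 is the start state
    pos = 0
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:              # empty table entry: error
            return False
        if act[0] == "shift":
            stack.append(act[1])
            pos += 1
        elif act[0] == "reduce":
            lhs, length = PRODS[act[1]]
            del stack[-length:]      # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:                        # accept
            return True

if __name__ == "__main__":
    print(parse("(x,(x))"), parse("(x"))
```

Keeping only states on the stack is a common simplification; a parser that builds an AST would push semantic values alongside them.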

17
Generating LR parsers
  • In order to generate an LR parser, we must create
    the action and GOTO tables
  • Many different ways to do this
  • We will start here with the simplest approach,
    called LR(0)
  • Left-to-right parsing, Rightmost derivation, 0
    lookahead

18
Item
  • LR(0) items have the form [production-with-dot]
  • For example, X -> A B C has 4 items
  • X -> • A B C
  • X -> A • B C
  • X -> A B • C
  • X -> A B C •
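
As a small illustration of the item list above, the sketch below enumerates the items of one production by sliding the dot across its right-hand side. The tuple representation (lhs, rhs, dot) and the helper names items and show are assumptions made for this example.

```python
def items(lhs, rhs):
    """All LR(0) items of the production lhs -> rhs, as (lhs, rhs, dot position)."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]

def show(item):
    """Render an item with the dot placed before the symbol at index `dot`."""
    lhs, rhs, dot = item
    return f"{lhs} -> " + " ".join(list(rhs[:dot]) + ["•"] + list(rhs[dot:]))

if __name__ == "__main__":
    for item in items("X", ["A", "B", "C"]):
        print(show(item))
```

A production with n right-hand-side symbols always has n + 1 items.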

19
What do items mean?
  • X -> • α β γ
  • input is consistent with X -> α β γ
  • X -> α • β γ
  • input is consistent with X -> α β γ and we have
    already recognized α
  • X -> α β • γ
  • input is consistent with X -> α β γ and we have
    already recognized α β
  • X -> α β γ •
  • input is consistent with X -> α β γ and we can
    reduce to X

20
LR(0) Items
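0 S -> S   1 S -> x S   2 S -> y

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> x •
State 3: S -> ( • L )   L -> • S   L -> • L , S   S -> • ( L )   S -> • x
State 5: S -> ( L • )   L -> L • , S
State 6: S -> ( L ) •
State 7: L -> S •
State 8: L -> L , • S   S -> • ( L )   S -> • x
State 9: L -> L , S •
Edge labels: x, (, S, L, ",", )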
21
LR(0) Items
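0 S' -> S $   1 S -> ( L )   2 S -> x   3 L -> S   4 L -> L , S

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> x •
State 3: S -> ( • L )   L -> • S   L -> • L , S   S -> • ( L )   S -> • x
State 5: S -> ( L • )   L -> L • , S
State 6: S -> ( L ) •
State 7: L -> S •
State 8: L -> L , • S   S -> • ( L )   S -> • x
State 9: L -> L , S •
Edge labels: x, (, S, L, ",", )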
22
LR(0) table construction
  • Construct LR(0) items
  • Item set Ii becomes state i
  • Parsing actions at state i are
  • [A -> α • a β] ∈ Ii and goto(Ii, a) = Ij, then
    action[i, a] = shift j
  • [A -> α •] ∈ Ii and A ≠ S', then action[i, a] =
    reduce by A -> α (for every terminal a)
  • [S' -> S • $] ∈ Ii, then action[i, $] = accept

23
LR(0) table construction, cont'd
  • GOTO table for non-terminals: GOTO[i, A] = j if
    GOTO(Ii, A) = Ij
  • Empty entries are errors
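
The item sets that the construction starts from can be computed with the usual closure and goto operations. The sketch below is an assumption-laden illustration rather than the deck's code: it uses the list grammar S' -> S $, S -> ( L ) | x, L -> S | L , S, represents an item as (production index, dot position), and numbers states in discovery order starting at 0, so the numbers do not match the state numbers in the slides' figures.

```python
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("(", "L", ")")),
    ("S", ("x",)),
    ("L", ("S",)),
    ("L", ("L", ",", "S")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> • γ for every nonterminal X that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (q, 0) not in items:
                    items.add((q, 0))
                    work.append((q, 0))
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    """Return (list of item sets, edges) where edges[(i, symbol)] = j."""
    start = closure({(0, 0)})
    states, work, edges = [start], [start], {}
    while work:
        state = work.pop(0)
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in sorted(symbols - {"$"}):   # no transition on end-of-input
            target = goto(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            edges[(states.index(state), sym)] = states.index(target)
    return states, edges

if __name__ == "__main__":
    states, edges = canonical_collection()
    print(len(states), "states,", len(edges), "edges")
```

From `edges`, terminals give shift entries and nonterminals give GOTO entries; items with the dot at the end give the reduce entries.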

24
LR(0) Table
        action                      goto
state   (    )    x    ,    $       S    L
1       s3        s2                g4
2       r2   r2   r2   r2   r2
3       s3        s2                g7   g5
4                           accept
5            s6        s8
6       r1   r1   r1   r1   r1
7       r3   r3   r3   r3   r3
8       s3        s2                g9
9       r4   r4   r4   r4   r4
25
Problems with LR(0)
  • For every item of the form X -> α •
  • blindly reduce to X, followed by a goto
  • this never misses an error, but may postpone
    the detection of some errors

26
Problems with LR(0)
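0 S' -> S $   1 S -> ( L )   2 S -> x   3 L -> S   4 L -> L , S

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> x •
State 3: S -> ( • L )   L -> • S   L -> • L , S   S -> • ( L )   S -> • x
State 5: S -> ( L • )   L -> L • , S
State 6: S -> ( L ) •
State 7: L -> S •
State 8: L -> L , • S   S -> • ( L )   S -> • x
State 9: L -> L , S •
Edge labels: x, (, S, L, ",", )

Consider this input: x 5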
27
Problems with LR(0)
        action                      goto
state   (    )    x    ,    $       S    L
1       s3        s2                g4
2       r2   r2   r2   r2   r2
3       s3        s2                g7   g5
4                           accept
5            s6        s8
6       r1   r1   r1   r1   r1
7       r3   r3   r3   r3   r3
8       s3        s2                g9
9       r4   r4   r4   r4   r4
28
Another Example
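0 S -> E $   1 E -> T + E   2 E -> T   3 T -> x

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> E • $
State 3: E -> T • + E   E -> T •
State 4: E -> T + • E   E -> • T + E   E -> • T   T -> • x
State 6: E -> T + E •
Edge labels: T, +, x, E

State 3 can both shift + and reduce by E -> T:
A shift-reduce conflict!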
29
LR(0) Parse Table
        action                  goto
state   x    +        $         E    T
1       s5                      g2   g3
2                     accept
3       r2   s4, r2   r2
4       s5                      g6   g3
5       r3   r3       r3
6       r1   r1       r1
30
SLR table construction
  • Construct LR(0) items
  • Item set Ii becomes state i
  • Parsing actions at state i are
  • [A -> α • a β] ∈ Ii and goto(Ii, a) = Ij, then
    action[i, a] = shift j
  • [A -> α •] ∈ Ii and A ≠ S', then action[i, a] =
    reduce by A -> α for all a ∈ FOLLOW(A)
  • [S' -> S • $] ∈ Ii, then action[i, $] = accept
  • GOTO table for non-terminals
  • GOTO[i, A] = j if GOTO(Ii, A) = Ij
  • Empty entries are errors
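
To see the FOLLOW restriction at work, the sketch below computes FOLLOW sets for the grammar of the "Another Example" slide (0 S -> E $, 1 E -> T + E, 2 E -> T, 3 T -> x) and then fills the actions of the conflicting state 3. It is a simplified illustration that assumes the grammar has no empty productions, so FIRST of a right-hand side is just FIRST of its first symbol; the function names are made up for this example.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T", "+", "E")),
    ("E", ("T",)),
    ("T", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST sets by fixpoint iteration (valid only without empty productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first[rhs[0]] if rhs[0] in NONTERMINALS else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets():
    """FOLLOW sets by fixpoint iteration."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):      # something follows sym in this rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                     # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

def slr_actions_state3():
    """State 3 holds E -> T • + E and E -> T •."""
    actions = {"+": ["s4"]}                   # shift from the first item
    for a in follow_sets()["E"]:              # reduce by rule 2 only on FOLLOW(E)
        actions.setdefault(a, []).append("r2")
    return actions

if __name__ == "__main__":
    print(follow_sets())
    print(slr_actions_state3())
```

Because + is not in FOLLOW(E), the cell for + keeps only the shift, which removes the conflict that the LR(0) table had in state 3.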

31
Reduce LR(0) Table
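0 S' -> S $   1 S -> ( L )   2 S -> x   3 L -> S   4 L -> L , S

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> x •
State 3: S -> ( • L )   L -> • S   L -> • L , S   S -> • ( L )   S -> • x
State 5: S -> ( L • )   L -> L • , S
State 6: S -> ( L ) •
State 7: L -> S •
State 8: L -> L , • S   S -> • ( L )   S -> • x
State 9: L -> L , S •
Edge labels: x, (, S, L, ",", )

Follow sets: FOLLOW(S') = {}   FOLLOW(S) = {$, ",", )}   FOLLOW(L) = {",", )}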
32
Reduce LR(0) Table
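        action                      goto
state   (    )    x    ,    $       S    L
1       s3        s2                g4
2       r2   r2   r2   r2   r2
3       s3        s2                g7   g5
4                           accept
5            s6        s8
6       r1   r1   r1   r1   r1
7       r3   r3   r3   r3   r3
8       s3        s2                g9
9       r4   r4   r4   r4   r4

Under SLR, r1 and r2 are kept only in columns ), "," and $ (FOLLOW(S));
r3 and r4 are kept only in columns ) and "," (FOLLOW(L)).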
33
Resolve Shift-reduce Conflict
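0 S -> E $   1 E -> T + E   2 E -> T   3 T -> x

[Figure: LR(0) item-set automaton; the recoverable states are listed below]
State 2: S -> E • $
State 3: E -> T • + E   E -> T •
State 4: E -> T + • E   E -> • T + E   E -> • T   T -> • x
State 6: E -> T + E •
Edge labels: T, +, x, E

Follow sets: FOLLOW(S) = {}   FOLLOW(E) = {$}   FOLLOW(T) = {+, $}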
34
Resolve Shift-reduce Conflict
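        action                  goto
state   x    +        $         E    T
1       s5                      g2   g3
2                     accept
3       r2   s4, r2   r2
4       s5                      g6   g3
5       r3   r3       r3
6       r1   r1       r1

Under SLR, state 3 keeps s4 under + and r2 only under $ (FOLLOW(E) = {$}),
so the shift-reduce conflict disappears.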
35
Problems with SLR

S' -> S   S -> L = R | R   L -> * R | id   R -> L

[Figure: LR(0) automaton for this grammar; edge labels R, L, id]
36
Problems with SLR
  • Reduce on ALL terminals in the FOLLOW set
  • FOLLOW(R) = FOLLOW(L)
  • But, we should never reduce R -> L on =
  • Thus, there should be no reduction in state 2
  • Why does this happen, and how can we solve it?

S -> L = R | R   L -> * R | id   R -> L
37
LR(1) Items
  • [X -> α • β, a] means
  • α is at the top of the stack
  • the input string is derivable from β a
  • In other words, when we reduce X -> α β, a had
    better be the lookahead symbol.
  • Or, put "reduce by X -> α β" in action[s, a] only
38
LR(1) table construction
  • Construct LR(1) items
  • Item set Ii becomes state i
  • Parsing actions at state i are
  • [A -> α • a β, b] ∈ Ii and goto(Ii, a) = Ij,
    then action[i, a] = shift j
  • [A -> α •, b] ∈ Ii and A ≠ S', then action[i, b] =
    reduce by A -> α (for lookahead b only)
  • [S' -> S •, $] ∈ Ii, then action[i, $] =
    accept
  • GOTO table for non-terminals: GOTO[i, A] = j if
    GOTO(Ii, A) = Ij
  • Empty entries are errors
  • Initial state is the item set containing
    [S' -> • S, $]
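
The effect of carrying a lookahead in each item can be checked on the grammar from the "Problems with SLR" slides (S -> L = R | R, L -> * R | id, R -> L). The sketch below is illustrative only: it represents an LR(1) item as (production index, dot position, lookahead), augments the grammar with S' -> S, and assumes there are no empty productions so that FIRST(β a) is FIRST of the first symbol of β, or {a} when β is empty.

```python
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("L", "=", "R")),
    ("S", ("R",)),
    ("L", ("*", "R")),
    ("L", ("id",)),
    ("R", ("L",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST sets by fixpoint iteration (valid only without empty productions)."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first[rhs[0]] if rhs[0] in NONTERMINALS else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

FIRST = first_sets()

def closure(items):
    """For [A -> α • X β, a], add [X -> • γ, b] for every b in FIRST(β a)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot, la = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        rest = rhs[dot + 1:]
        if rest:
            lookaheads = FIRST[rest[0]] if rest[0] in NONTERMINALS else {rest[0]}
        else:
            lookaheads = {la}
        for q, (lhs, _) in enumerate(GRAMMAR):
            if lhs != rhs[dot]:
                continue
            for b in lookaheads:
                if (q, 0, b) not in items:
                    items.add((q, 0, b))
                    work.append((q, 0, b))
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol`, keeping each item's lookahead, then close."""
    moved = {(p, d + 1, la) for p, d, la in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def reduce_lookaheads(state):
    """Lookaheads on which the state reduces (items with the dot at the end)."""
    return {la for p, d, la in state if d == len(GRAMMAR[p][1])}

if __name__ == "__main__":
    start = closure({(0, 0, "$")})
    print(reduce_lookaheads(goto(start, "L")))
```

In the state reached on L, which holds S -> L • = R and R -> L •, the only reduce lookahead is $, so = triggers the shift and the SLR conflict does not arise.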

39
LR(1) Items (part)

S' -> S   S -> L = R | R   L -> * R | id   R -> L

[Figure: first part of the LR(1) automaton for this grammar; edge label L]
40
More

S -> L = R | R   L -> * R | id   R -> L

[Figure: further LR(1) states for this grammar; edge labels R, id, L, others]
41
Notice similar states?

S -> L = R | R   L -> * R | id   R -> L

[Figure: LR(1) automaton for this grammar; edge labels R, id, L, others]
42
Notice similar states?

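S -> L = R | R   L -> * R | id   R -> L

[Figure: LR(1) automaton for this grammar, with the similar states highlighted; edge labels R, id, L, others]
State 5:  L -> id •, =/$
State 8:  R -> L •, =/$
State 10: R -> L •, $
State 11: L -> id •, $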
43
LALR
S -> C C   C -> c C | d

[Figure: LR(1) automaton for this grammar; edge labels S, C, c, d]
44
LALR
S -> C C   C -> c C | d

[Figure: the automaton after merging some states with common cores; edge labels S, C, c, d]
45
LALR
S -> C C   C -> c C | d

[Figure: the resulting LALR automaton; edge labels S, C, c, d]
46
LALR Construction
  • Merge items with common cores
  • Change GOTO table to reflect merges
  • Can introduce reduce/reduce conflicts
  • Cannot introduce shift/reduce conflicts
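
Merging by core can be sketched as grouping LR(1) states whose items agree once lookaheads are ignored. The example below is a hedged illustration: the four states are the ones highlighted on the "Notice similar states?" slide as read from the garbled transcript (5: L -> id •, =/$; 8: R -> L •, =/$; 10: R -> L •, $; 11: L -> id •, $), items are written as (production, dot position, lookahead), and the function names are assumptions.

```python
from collections import defaultdict

# LR(1) states keyed by the state numbers used on the slide (assumed reading).
STATES = {
    5: {("L -> id", 1, "="), ("L -> id", 1, "$")},
    8: {("R -> L", 1, "="), ("R -> L", 1, "$")},
    10: {("R -> L", 1, "$")},
    11: {("L -> id", 1, "$")},
}

def core(items):
    """The core of a state: its items with the lookaheads dropped."""
    return frozenset((prod, dot) for prod, dot, _ in items)

def merge_by_core(states):
    """Group states with equal cores and union their items."""
    groups = defaultdict(list)
    for name, items in states.items():
        groups[core(items)].append(name)
    merged = {}
    for names in groups.values():
        union = frozenset().union(*(states[n] for n in names))
        merged[tuple(sorted(names))] = union
    return merged

if __name__ == "__main__":
    for names, items in sorted(merge_by_core(STATES).items()):
        print(names, sorted(items))
```

After merging, the GOTO and shift entries that pointed at either original state must be redirected to the merged one, which is the "change GOTO table" step above.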

47
Ambiguous Grammars
  • No ambiguous grammar can be LR(k)
  • hence none can be parsed deterministically bottom-up as is
  • Nevertheless, some ambiguous grammars are
    well understood, and can be parsed by LR(k) parsers with
    some tricks
  • precedence
  • associativity
  • dangling-else

48
Precedence
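E -> E + E | E * E | id

[Figure: part of the LR(0) automaton; recoverable item sets, linked by an edge on E:]
S -> • E   E -> • E + E   E -> • E * E   E -> • id
S -> E •   E -> E • + E   E -> E • * E
s/r on both + and *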
49
Precedence
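E -> E + E | E * E | id

[Figure: part of the LR(0) automaton; recoverable item sets, linked by an edge on E:]
S -> • E   E -> • E + E   E -> • E * E   E -> • id
S -> E •   E -> E • + E   E -> E • * E
What if we want both + and * right-associative?
reduce on +   reduce on *
reduce on +   shift on *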
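
The shift/reduce choice that precedence and associativity declarations encode can be written as a small decision function. This is a sketch of the common Yacc-style convention rather than the code of any particular tool: higher precedence wins, and on a tie a left-associative operator reduces while a right-associative one shifts. The tables PRECEDENCE and LEFT and the function resolve are hypothetical names for this example.

```python
# Assumed declarations: * binds tighter than +, both left-associative.
PRECEDENCE = {"+": 1, "*": 2}
LEFT = {"+": "left", "*": "left"}

def resolve(rule_op, lookahead, assoc=LEFT):
    """Resolve a conflict between reducing E -> E rule_op E and shifting lookahead."""
    if PRECEDENCE[rule_op] > PRECEDENCE[lookahead]:
        return "reduce"
    if PRECEDENCE[rule_op] < PRECEDENCE[lookahead]:
        return "shift"
    # Same precedence level: associativity decides.
    return "reduce" if assoc[rule_op] == "left" else "shift"

if __name__ == "__main__":
    for rule_op in "+*":
        for lookahead in "+*":
            print(f"E -> E {rule_op} E •, next {lookahead}: {resolve(rule_op, lookahead)}")
```

Passing a table that marks both operators as "right" changes only the tie cases (+ followed by +, and * followed by *) from reduce to shift.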
50
Parser Implementation
  • Implementation Options
  • Write a parser from scratch
  • not as boring as writing a lexer, but not exactly
    as simple as you may imagine
  • Use an automatic parser generator
  • Very general and robust; sometimes not quite as
    efficient as hand-written parsers.
  • Nevertheless, good for lazy compiler writers.
  • Both are used extensively in production compilers

51
Yacc Tool

specification -> Yacc -> parser -> semantic analyzer
Creates a parser from a declarative specification
involving a context-free grammar
52
Brief History
  • YACC stands for Yet Another Compiler-Compiler
  • It was first developed by Steve Johnson in 1975
    for Unix
  • There have been many later versions of YACC
    (e.g., GNU Bison), each offering minor
    improvements
  • Ported to many languages
  • YACC is now a standard tool, defined in IEEE
    Posix standard P1003.2

53
ML-Yacc
  • User Declarations: declare values available in
    the rule actions
  • ML-Yacc Definitions: declare terminals and
    nonterminals; special declarations to resolve
    conflicts
  • Rules: parser specified by CFG rules and
    associated semantic actions that generate abstract
    syntax

54
ML-Yacc Definitions (preliminaries)
  • Specify type of positions
  • %pos int * int
  • Specify terminal and nonterminal symbols
  • %term IF | THEN | ELSE | PLUS | MINUS ...
  • %nonterm prog | exp | stm
  • Specify end-of-parse token
  • %eop EOF
  • Specify start symbol (by default, the nonterminal on the
    LHS of the first rule)
  • %start prog

55
Example
  • %term ASSIGN | ID | PLUS | NUM | SEMICOLON | TIMES
  • %nonterm s | e | p
  • %pos int
  • %start p
  • %eop EOF
  • %left PLUS
  • %left TIMES
  • p -> s SEMICOLON p ()
  •    | ()
  • s -> ID ASSIGN e ()
  • e -> e PLUS e ()
  •    | e TIMES e ()
  •    | ID ()
  •    | NUM ()

56
Summary
  • Bottom-up parsing
  • reverse order of derivations
  • LR grammars are more powerful
  • use of stacks and parse tables
  • yet more complex
  • Bonus: tools do the hard work for you; read the
    online ML-Yacc manual