Title: Parsing
1Parsing
2Front-End Parser
[Figure: source code → scanner → tokens → parser → IR; the scanner and parser both report errors]
- Checks the stream of words and their parts of
speech for grammatical correctness
3Front-End Parser
- Determines if the input is syntactically well
formed
4Front-End Parser
- Guides context-sensitive (semantic) analysis
(type checking)
5Front-End Parser
- Builds IR for source program
6Syntactic Analysis
- Natural language analogy: consider the sentence "He wrote the program"
7Syntactic Analysis
- Words: He wrote the program
- Parts of speech: noun, verb, article, noun
8Syntactic Analysis
- Words: He wrote the program
- Parts of speech: noun, verb, article, noun
- Grammatical roles: subject (He), predicate (wrote), object (the program)
9Syntactic Analysis
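- Words: He wrote the program
- Parts of speech: noun, verb, article, noun
- Grammatical roles: subject (He), predicate (wrote), object (the program)
- Subject, predicate and object together form a sentence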
10Syntactic Analysis
if ( b < 0 ) a = b
- bool expr: b < 0
- assignment: a = b
- if-statement: the whole construct
11Syntactic Analysis
- Syntax errors:
    int foo(int i, int j))
    for(k=0; i j; )
      fi( i > j )
        return j
12Compiler Construction
13Syntactic Analysis
    int foo(int i, int j))      <- extra parenthesis
    for(k=0; i j; )             <- missing expression
      fi( i > j )               <- not a keyword
        return j
14Semantic Analysis
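- Words: He wrote the computer
- Parts of speech: noun, verb, article, noun
- Grammatical roles: subject (He), predicate (wrote), object (the computer)
- Subject, predicate and object together form a sentence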
15Semantic Analysis
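- Words: He wrote the computer
- Parts of speech: noun, verb, article, noun
- Grammatical roles: subject (He), predicate (wrote), object (the computer)
- Syntactically a valid sentence, but semantically (in meaning) wrong!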
16Semantic Analysis
    int foo(int i, int j)
    for(k=0; i < j; j++ )
      if( i < j-2 )
        sum = sum+i
    return sum
- Semantic errors: undeclared var, return type mismatch
17Role of the Parser
- Not all sequences of tokens are programs.
- Parser must distinguish between valid and invalid
sequences of tokens.
19Role of the Parser
- What we need:
- An expressive way to describe the syntax
- An acceptor mechanism that determines if the input
token stream satisfies the syntax
22Study of Parsing
- Parsing is the process of discovering a
derivation for some sentence
23Study of Parsing
- Mathematical model of syntax: a grammar G.
- Algorithm for testing membership in L(G).
25Context Free Grammars
- A CFG is a four-tuple G = (S, N, T, P)
- S is the start symbol
- N is a set of non-terminals
- T is a set of terminals
- P is a set of productions
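The four components can be written down directly as a data structure. The following is a minimal sketch, not taken from the slides: the names `Grammar` and `is_well_formed` and the small example grammar `S → a S b | a b` are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Grammar(NamedTuple):
    """A context-free grammar G = (S, N, T, P)."""
    start: str               # S: the start symbol
    nonterminals: frozenset  # N: the set of non-terminals
    terminals: frozenset     # T: the set of terminals
    productions: tuple       # P: (lhs, rhs) pairs; rhs is a tuple of symbols


def is_well_formed(g: Grammar) -> bool:
    """Check the basic constraints that tie the four components together."""
    if g.start not in g.nonterminals:
        return False
    if g.nonterminals & g.terminals:
        return False  # a symbol cannot be both terminal and non-terminal
    symbols = g.nonterminals | g.terminals
    for lhs, rhs in g.productions:
        if lhs not in g.nonterminals:
            return False  # context-free: the left side is one non-terminal
        if any(sym not in symbols for sym in rhs):
            return False
    return True


# Hypothetical example grammar (an assumption, not from the slides):
#   S -> a S b | a b
EXAMPLE = Grammar(
    start="S",
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=(
        ("S", ("a", "S", "b")),
        ("S", ("a", "b")),
    ),
)

print(is_well_formed(EXAMPLE))
```

Keeping productions as plain (lhs, rhs) pairs mirrors the definition and makes it easy to number the rules by their position.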
26Why Not Regular Expressions?
- Reason: regular languages do not have enough
power to express the syntax of programming languages.
27Limitations of Regular Languages
- A finite automaton can't remember the number of times
it has visited a particular state
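Balanced parentheses are a common illustration of this limit (the example is an assumption, not from the slides). The sketch below contrasts a checker with an unbounded counter against one restricted to a fixed number of states, as a finite automaton would be. The function names are hypothetical.

```python
def balanced(text: str) -> bool:
    """Check nesting with an unbounded counter, which a finite automaton lacks."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ')' with no matching '('
    return depth == 0


def balanced_up_to(text: str, max_depth: int) -> bool:
    """Same check, but with only max_depth + 1 states available."""
    state = 0
    for ch in text:
        if ch == "(":
            if state == max_depth:
                return False  # out of states: nesting is deeper than we can track
            state += 1
        elif ch == ")":
            if state == 0:
                return False
            state -= 1
    return state == 0


print(balanced("((()))"))           # True
print(balanced_up_to("((()))", 2))  # False: the input is balanced, but too deep
```

Whatever fixed `max_depth` is chosen, some balanced input nests one level deeper, so the bounded version rejects a valid string.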
28Example of CFG
- Context-free syntax is specified with a CFG
29Example of CFG
- Example: SheepNoise → SheepNoise baa | baa
- This CFG defines the set of noises sheep make
30Example of CFG
- We can use the SheepNoise grammar to create sentences
- We use the productions as rewriting rules
31Example of CFG
- Rule 1: SheepNoise → SheepNoise baa; rule 2: SheepNoise → baa
Rule  Sentential Form
-     SheepNoise
2     baa
32Example of CFG
- Rule 1: SheepNoise → SheepNoise baa; rule 2: SheepNoise → baa
Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
2     baa baa
33Example of CFG
Rule  Sentential Form
-     SheepNoise
1     SheepNoise baa
1     SheepNoise baa baa
2     baa baa baa
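The rewriting in these tables can be replayed mechanically. The following is a minimal sketch: rule 1 is taken to be SheepNoise → SheepNoise baa and rule 2 to be SheepNoise → baa, as the derivations indicate, and the names `RULES` and `derive` are illustrative.

```python
# Numbered productions: rule number -> (left-hand side, right-hand side)
RULES = {
    1: ("SheepNoise", ["SheepNoise", "baa"]),
    2: ("SheepNoise", ["baa"]),
}


def derive(rule_numbers, start="SheepNoise"):
    """Apply the numbered rules in order and return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        position = form.index(lhs)         # ValueError if no non-terminal is left
        form[position:position + 1] = rhs  # rewrite the non-terminal in place
        steps.append(" ".join(form))
    return steps


for line in derive([1, 1, 2]):
    print(line)
# SheepNoise
# SheepNoise baa
# SheepNoise baa baa
# baa baa baa
```

Applying rule 1 a total of n times before rule 2 yields n + 1 repetitions of baa, so the grammar describes every non-empty sequence of baa.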
34Example of CFG
- While it is cute, this example quickly runs out of
intellectual steam
- To explore uses of CFGs, we need a more complex
grammar
36More Useful Grammar
1  expr → expr op expr
2       | num
3       | id
4  op   → +
5       | -
6       | *
7       | /
37Backus-Naur Form (BNF)
- Grammar rules in a similar form were first used
in the description of the Algol60 Language.
38Backus-Naur Form (BNF)
- The notation was developed by John Backus and
adapted by Peter Naur for the Algol60 report.
- Thus the term Backus-Naur Form (BNF)
40Derivation
- Let us use the expression grammar to derive the
sentence x - 2 * y
41Derivation x - 2 * y
Rule  Sentential Form
-     expr
1     expr op expr
3     <id,x> op expr
5     <id,x> - expr
1     <id,x> - expr op expr
42Derivation x - 2 * y
Rule  Sentential Form
2     <id,x> - <num,2> op expr
6     <id,x> - <num,2> * expr
3     <id,x> - <num,2> * <id,y>
43Derivation
- Such a process of rewrites is called a
derivation.
- The process of discovering a derivation is called
parsing
45Derivation
We denote this derivation as expr ⇒* id - num * id
46Derivations
- At each step, we choose a non-terminal to replace
- Different choices can lead to different
derivations.
48Derivations
- Two derivations are of interest
- Leftmost derivation
- Rightmost derivation
49Derivations
- Leftmost derivation: replace the leftmost
non-terminal (NT) at each step
- Rightmost derivation: replace the rightmost NT at
each step
51Derivations
- The example on the preceding slides was a leftmost
derivation
- There is also a rightmost derivation
52Rightmost Derivation
Rule  Sentential Form
-     expr
1     expr op expr
3     expr op <id,y>
6     expr * <id,y>
1     expr op expr * <id,y>
53Derivation x - 2 * y
Rule  Sentential Form
2     expr op <num,2> * <id,y>
5     expr - <num,2> * <id,y>
3     <id,x> - <num,2> * <id,y>
54Derivations
- In both cases we have expr ⇒* id - num * id
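Both derivations can be replayed with a small rewriting routine. This is a minimal sketch, not the slides' own method: it uses the seven-rule expression grammar (rule 2 for num, rule 3 for id, rules 4 to 7 for +, -, *, /), works on token categories (id, num) rather than on <id,x> pairs, and the names `GRAMMAR`, `NONTERMINALS` and `derive` are illustrative.

```python
# Numbered productions of the expression grammar
GRAMMAR = {
    1: ("expr", ["expr", "op", "expr"]),
    2: ("expr", ["num"]),
    3: ("expr", ["id"]),
    4: ("op", ["+"]),
    5: ("op", ["-"]),
    6: ("op", ["*"]),
    7: ("op", ["/"]),
}
NONTERMINALS = {"expr", "op"}


def derive(rule_numbers, leftmost=True):
    """Rewrite the leftmost (or rightmost) non-terminal with each rule in turn."""
    form = ["expr"]
    steps = [" ".join(form)]
    for number in rule_numbers:
        lhs, rhs = GRAMMAR[number]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        target = positions[0] if leftmost else positions[-1]
        if form[target] != lhs:
            raise ValueError(f"rule {number} does not apply to {form[target]!r}")
        form[target:target + 1] = rhs
        steps.append(" ".join(form))
    return steps


leftmost_steps = derive([1, 3, 5, 1, 2, 6, 3], leftmost=True)
rightmost_steps = derive([1, 3, 6, 1, 2, 5, 3], leftmost=False)

print(leftmost_steps[-1])   # id - num * id
print(rightmost_steps[-1])  # id - num * id
```

The two rule sequences differ, yet both end in the same sentence. The difference lies only in which non-terminal is rewritten at each step.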
55Derivations
- The two derivations produce different parse
trees.
- The parse trees imply different evaluation orders!
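To make the difference concrete: in the leftmost derivation the first rewrite splits the sentence at the minus sign, which groups it as x - (2 * y). In the rightmost derivation the first rewrite splits it at the multiplication, giving (x - 2) * y. The sketch below evaluates both trees. The tuple encoding, the `evaluate` helper and the values chosen for x and y are assumptions for illustration.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(tree):
    """Evaluate a parse tree given as a number or an (op, left, right) tuple."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))


x, y = 5, 3  # arbitrary values chosen for illustration

leftmost_tree = ("-", x, ("*", 2, y))   # x - (2 * y)
rightmost_tree = ("*", ("-", x, 2), y)  # (x - 2) * y

print(evaluate(leftmost_tree))   # -1
print(evaluate(rightmost_tree))  # 9
```

The same sentence therefore yields two different results, depending on which tree the parser builds.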