Recap - PowerPoint PPT Presentation

Slides: 43
Provided by: mathT
Transcript and Presenter's Notes

Title: Recap


1
Recap
  • Mooly Sagiv

2
Outline
  • Subjects Studied
  • Questions & Answers

3
Lexical Analysis (Scanning)
  • Input: program text (file)
  • Output: sequence of tokens
  • Read input file
  • Identify language keywords and standard
    identifiers
  • Handle include files and macros
  • Count line numbers
  • Remove whitespace
  • Report illegal symbols
  • Produce symbol table

4
The Lexical Analysis Problem
  • Given a set of token descriptions and an input string
  • Partition the string into tokens (class, value)
  • Ambiguity resolution
  • Prefer the longest matching token
  • Between two equal-length tokens, select the one whose
    description comes first
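The two ambiguity rules above can be sketched in C as follows. This is a minimal illustration, not the way JLex or Lex implement scanning (they compile the descriptions into an automaton). The token classes, the matcher functions and `next_token` are hypothetical names introduced for the example; rule order in the `rules` array stands in for the order of the token descriptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes for this sketch; rule order encodes priority. */
enum { TOK_IF, TOK_IDENT, TOK_NUMBER, TOK_COUNT, TOK_ERROR = -1 };

/* Each matcher returns the length of the longest prefix of s it accepts. */
static size_t match_if(const char *s) {
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}
static size_t match_ident(const char *s) {
    size_t n = 0;
    if (isalpha((unsigned char)s[0]))
        while (isalnum((unsigned char)s[n])) n++;
    return n;
}
static size_t match_number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

typedef size_t (*Matcher)(const char *);
static const Matcher rules[TOK_COUNT] = { match_if, match_ident, match_number };

/* Returns the class of the next token and stores its length in *len.
   Returns TOK_ERROR with *len == 0 when no rule matches (illegal symbol). */
static int next_token(const char *s, size_t *len) {
    int best = TOK_ERROR;
    assert(s != NULL && len != NULL);
    *len = 0;
    for (int k = 0; k < TOK_COUNT; k++) {
        size_t n = rules[k](s);
        if (n > *len) {   /* strictly longer only, so the earlier rule wins ties */
            *len = n;
            best = k;
        }
    }
    return best;
}
```

On the input `ifx` the identifier rule wins because it matches three characters; on `if(` the keyword and identifier rules both match two characters and the keyword rule wins because it is listed first.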

5
JLex
  • Input: regular expressions and actions (Java code)
  • Output: a scanner program that reads the input and
    applies an action whenever its regular expression
    is matched

6
Summary
  • For most programming languages lexical analyzers
    can be easily constructed automatically
  • Exceptions
  • Fortran
  • PL/1
  • Lex/Flex/JLex are useful beyond compilers

7
Syntax Analysis (Parsing)
  • Input: sequence of tokens
  • Output: abstract syntax tree
  • Report syntax errors (e.g. unbalanced parentheses)
  • Create symbol-table
  • Create pretty-printed version of the program
  • In some cases the tree need not be generated
    (one-pass compilers)

8
Pushdown Automaton
[Figure: pushdown automaton. A control unit reads the input (u t w V), consults the parser table, and operates on a stack.]
9
Efficient Parsers
  • Pushdown automata
  • Deterministic
  • Report an error as soon as the input is not a
    prefix of a valid program
  • Not usable for all context free grammars

[Figure labels: cup, ambiguity errors, parse tree]
10
Kinds of Parsers
  • Top-Down (Predictive Parsing): LL
  • Construct parse tree in a top-down manner
  • Find the leftmost derivation
  • For every non-terminal and token predict the next
    production
  • Preorder tree traversal
  • Bottom-Up: LR
  • Construct parse tree in a bottom-up manner
  • Find the rightmost derivation in reverse order
  • For every potential right hand side and token
    decide when a production is found
  • Postorder tree traversal

11
Top-Down Parsing
[Figure: top-down construction of the parse tree over the input tokens t1 t2 ...]
12
Bottom-Up Parsing
[Figure: bottom-up construction of the parse tree over the input tokens t1 t2 t4 t5 t6 t7 t8]
13
Example Grammar for Predictive LL Top-Down Parsing
expression → digit | '(' expression operator expression ')'
operator → '+' | '*'
digit → '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
15
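static int Parse_Expression(Expression **expr_p) {
    Expression *expr = *expr_p = new_expression();

    /* try to parse a digit */
    if (Token.class == DIGIT) {
        expr->type = 'D';
        expr->value = Token.repr - '0';
        get_next_token();
        return 1;
    }
    /* try to parse a parenthesized expression */
    if (Token.class == '(') {
        expr->type = 'P';
        get_next_token();
        if (!Parse_Expression(&expr->left)) Error("missing expression");
        if (!Parse_Operator(&expr->oper)) Error("missing operator");
        /* right operand: required by the grammar but absent from the slide
           text; the field name `right` is an assumption */
        if (!Parse_Expression(&expr->right)) Error("missing expression");
        if (Token.class != ')') Error("missing )");
        get_next_token();
        return 1;
    }
    return 0;   /* neither alternative applies */
}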
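The routine above depends on a scanner and tree types that the slide does not show. As a self-contained sketch of the same recursive-descent idea, the version below parses directly from a string and computes the value instead of building a tree. The names `cursor`, `parse_operator`, `parse_expression` and `evaluate` are hypothetical, and the operators `+` and `*` are an assumption, since the operator alternatives are not legible in the transcript.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner state: a cursor into a NUL-terminated input string. */
static const char *cursor;

/* operator -> '+' | '*'   (assumed operator set) */
static int parse_operator(char *op) {
    if (*cursor == '+' || *cursor == '*') { *op = *cursor++; return 1; }
    return 0;
}

/* expression -> digit | '(' expression operator expression ')'
   Returns 1 and stores the value on success, 0 if no alternative applies,
   and -1 on a syntax error inside the parenthesized alternative. */
static int parse_expression(long *value) {
    if (*cursor >= '0' && *cursor <= '9') {
        *value = *cursor++ - '0';
        return 1;
    }
    if (*cursor == '(') {
        long left, right;
        char op;
        cursor++;
        if (parse_expression(&left) != 1) return -1;   /* missing expression */
        if (!parse_operator(&op)) return -1;           /* missing operator */
        if (parse_expression(&right) != 1) return -1;  /* missing expression */
        if (*cursor != ')') return -1;                 /* missing ) */
        cursor++;
        *value = (op == '+') ? left + right : left * right;
        return 1;
    }
    return 0;
}

/* Parses a whole string; succeeds only if all input is consumed. */
static int evaluate(const char *text, long *value) {
    assert(text != NULL && value != NULL);
    cursor = text;
    return parse_expression(value) == 1 && *cursor == '\0';
}
```

Each alternative is selected by looking at the current character only, which is what makes the parser predictive: `evaluate("(2*(3+4))", &v)` succeeds, while `evaluate("2+3", &v)` fails because this grammar requires parentheses around every operator application.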
16
Parsing Expressions
  • Try every alternative production
  • For P → A1 A2 … An | B1 B2 … Bm
  • If A1 succeeds
  • Call A2
  • If A2 succeeds
  • Call A3
  • If A2 fails report an error
  • Otherwise try B1
  • Recursive descent parsing
  • Can be applied to certain grammars
  • Generalization: LL(1) parsing

17
int P(...) {
    /* try to parse the alternative P → A1 A2 ... An */
    if (A1(...)) {
        if (!A2()) Error("Missing A2");
        if (!A3()) Error("Missing A3");
        ..
        if (!An()) Error("Missing An");
        return 1;
    }
    /* try to parse the alternative P → B1 B2 ... Bm */
    if (B1(...)) {
        if (!B2()) Error("Missing B2");
        if (!B3()) Error("Missing B3");
        ..
        if (!Bm()) Error("Missing Bm");
        return 1;
    }
    return 0;
}
18
Predictive Parser for Arithmetic Expressions
  • Grammar
  • 1 E → E + T
  • 2 E → T
  • 3 T → T * F
  • 4 T → F
  • 5 F → id
  • 6 F → (E)
  • C-code?
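One possible answer to the C-code question, as a hedged sketch: the grammar is left-recursive (E appears first on the right-hand side of its own production), so a recursive-descent routine for E would call itself forever. The usual remedy is to rewrite the left-recursive productions as iteration, E → T { + T } and T → F { * F }. The code below is a recognizer only; the function names, the use of a character cursor as the lookahead, and the choice of a single letter for `id` are assumptions made for the example, as are the `+` and `*` operators, which the transcript does not show.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *look;          /* one-symbol lookahead: the current character */

static int parse_E(void);

/* F -> id | ( E )     an id is assumed to be a single letter here */
static int parse_F(void) {
    if (isalpha((unsigned char)*look)) { look++; return 1; }
    if (*look == '(') {
        look++;
        if (!parse_E() || *look != ')') return 0;
        look++;
        return 1;
    }
    return 0;
}

/* T -> F { * F }      replaces the left-recursive T -> T * F | F */
static int parse_T(void) {
    if (!parse_F()) return 0;
    while (*look == '*') {
        look++;
        if (!parse_F()) return 0;
    }
    return 1;
}

/* E -> T { + T }      replaces the left-recursive E -> E + T | T */
static int parse_E(void) {
    if (!parse_T()) return 0;
    while (*look == '+') {
        look++;
        if (!parse_T()) return 0;
    }
    return 1;
}

/* Accepts text only if it is a complete expression. */
static int accepts(const char *text) {
    assert(text != NULL);
    look = text;
    return parse_E() && *look == '\0';
}
```

The loops keep `+` and `*` left-associative, and calling `parse_T` from `parse_E` gives `*` the higher precedence, matching the original grammar.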

19
Bottom-Up Syntax Analysis
  • Input
  • A context free grammar
  • A stream of tokens
  • Output
  • A syntax tree or error
  • Method
  • Construct parse tree in a bottom-up manner
  • Find the rightmost derivation (in reverse order)
  • For every potential right hand side and token
    decide when a production is found
  • Report an error as soon as the input is not a
    prefix of a valid program

20
Constructing LR(0) parsing table
  • Add a production S' → S
  • Construct a finite automaton accepting valid
    stack symbols
  • States are sets of items A → α • β
  • The states of the automaton become the states of
    the parsing table
  • Determine shift operations
  • Determine goto operations
  • Determine reduce operations
  • Report an error when conflicts arise
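The item sets that form the automaton states can be computed with two operations, closure and goto. The sketch below assumes the small grammar used on the automaton slides (E → T | E + T, T → i | (E)), writes the added start symbol as `Z`, and represents an item A → α • β as a production index plus a dot position. The type and function names are hypothetical, and the `+` in the second production is an assumption because the transcript does not preserve it.

```c
#include <assert.h>

/* Grammar from the automaton slides; 'Z' stands in for the added start symbol.
   Upper-case letters are non-terminals, everything else is a terminal. */
typedef struct { char lhs; const char *rhs; } Production;
static const Production grammar[] = {
    { 'Z', "E" }, { 'E', "T" }, { 'E', "E+T" }, { 'T', "i" }, { 'T', "(E)" }
};
enum { NPROD = 5, MAX_ITEMS = 32 };

typedef struct { int prod, dot; } Item;               /* A -> alpha . beta */
typedef struct { Item items[MAX_ITEMS]; int n; } ItemSet;

static int contains(const ItemSet *s, Item it) {
    for (int k = 0; k < s->n; k++)
        if (s->items[k].prod == it.prod && s->items[k].dot == it.dot) return 1;
    return 0;
}

static void add_item(ItemSet *s, Item it) {
    if (!contains(s, it)) {
        assert(s->n < MAX_ITEMS);
        s->items[s->n++] = it;
    }
}

/* closure: for every item A -> alpha . B beta, add B -> . gamma
   for each production B -> gamma */
static void closure(ItemSet *s) {
    for (int k = 0; k < s->n; k++) {           /* s->n grows as items are added */
        char next = grammar[s->items[k].prod].rhs[s->items[k].dot];
        for (int p = 0; p < NPROD; p++)
            if (grammar[p].lhs == next) add_item(s, (Item){ p, 0 });
    }
}

/* goto(I, X): move the dot over X wherever possible, then take the closure */
static ItemSet goto_set(const ItemSet *from, char x) {
    ItemSet to = { .n = 0 };
    for (int k = 0; k < from->n; k++) {
        Item it = from->items[k];
        if (grammar[it.prod].rhs[it.dot] == x)
            add_item(&to, (Item){ it.prod, it.dot + 1 });
    }
    closure(&to);
    return to;
}

/* The start state is the closure of the item Z -> . E */
static ItemSet initial_state(void) {
    ItemSet s = { .n = 0 };
    add_item(&s, (Item){ 0, 0 });
    closure(&s);
    return s;
}
```

Applying `goto_set` repeatedly from `initial_state()` until no new item set appears yields the automaton states; a state containing a completed item (dot at the end) together with another item is where an LR(0) conflict would be reported.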

21
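[Figure: LR(0) automaton for the grammar S → E, E → T | E + T, T → i | (E). Item sets, as far as the transcript preserves them, with the item numbers shown on the slide:
  • 1 S → • E, 4 E → • T, 6 E → • E + T, 10 T → • i, 12 T → • (E)
  • 2 S → E •, 7 E → E • + T
  • 5 E → T •
  • 11 T → i •
  • 13 T → ( • E), 4 E → • T, 6 E → • E + T, 10 T → • i, 12 T → • (E)
  • 7 E → E + • T, 10 T → • i, 12 T → • (E)
  • 8 E → E + T •
  • 15 T → (E) •
Transitions are labeled T, E, i, ( and ).]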
22
Parsing (i)
[Figure: the same LR(0) automaton as on the previous slide, used to trace the parse of the input (i)]
23
Summary (Bottom-Up)
  • LR is a powerful technique
  • Generates efficient parsers
  • Generation tools exist for LALR(1): Bison, yacc, CUP
  • But some grammars need to be tuned
  • Shift/Reduce conflicts
  • Reduce/Reduce conflicts
  • Efficiency of the generated parser

24
Summary (Parsing)
  • Context free grammars provide a natural way to
    define the syntax of programming languages
  • Ambiguity may be resolved
  • Predictive parsing is natural
  • Good error messages
  • Natural error recovery
  • But not expressive enough
  • LR bottom-up parsing is more expressive

25
Abstract Syntax
  • Intermediate program representation
  • Defines a tree - Preserves program hierarchy
  • Generated by the parser
  • Declared using an (ambiguous) context free
    grammar (relatively flat)
  • Not meant for parsing
  • Keywords and punctuation symbols are not stored
    (Not relevant once the tree exists)
  • Big programs can also be handled (possibly via
    virtual memory)

26
Semantic Analysis
  • Requirements related to the context in which a
    construct occurs
  • Examples
  • Name resolution
  • Scoping
  • Type checking
  • Escape
  • Implemented via AST traversals
  • Guides subsequent compiler phases

27
Abstract Interpretation / Static Analysis
  • Automatically identify program properties
  • No user provided loop invariants
  • Sound but incomplete methods
  • But can be rather precise
  • Non-standard interpretation of the program
    operational semantics
  • Applications
  • Compiler optimization
  • Code quality tools
  • Identify potential bugs
  • Prove the absence of runtime errors
  • Partial correctness

28
Constant Propagation
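                            [x ↦ ?, y ↦ ?, z ↦ ?]
z = 3
                            [x ↦ ?, y ↦ ?, z ↦ 3]
                            [x ↦ ?, y ↦ ?, z ↦ 3]   (loop head)
while (x > 0) {
                            [x ↦ ?, y ↦ ?, z ↦ 3]
    if (x == 1) {
                            [x ↦ 1, y ↦ ?, z ↦ 3]
        y = 7
                            [x ↦ 1, y ↦ 7, z ↦ 3]
    } else {
                            [x ↦ ?, y ↦ ?, z ↦ 3]
        y = z + 4
                            [x ↦ ?, y ↦ 7, z ↦ 3]
    }
    assert y == 7
}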
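The annotations in this example can be reproduced with a very small abstract domain. The sketch below is an illustration under simplifying assumptions, not a full analysis: it treats `?` as "not known to be constant", does not distinguish an unreached state from an unknown one, and hard-codes the loop body of the example. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Abstract value of one variable: a known constant or "unknown" (the '?'). */
typedef struct { bool known; int value; } AbsVal;

/* Abstract state for the three variables of the example. */
typedef struct { AbsVal x, y, z; } State;

static AbsVal constant(int v) { return (AbsVal){ true, v }; }
static AbsVal unknown(void)   { return (AbsVal){ false, 0 }; }

/* Join at a control-flow merge: keep a constant only if both paths agree. */
static AbsVal join_val(AbsVal a, AbsVal b) {
    if (a.known && b.known && a.value == b.value) return a;
    return unknown();
}
static State join(State a, State b) {
    return (State){ join_val(a.x, b.x), join_val(a.y, b.y), join_val(a.z, b.z) };
}

/* Abstract addition: the result is constant only if both operands are. */
static AbsVal add(AbsVal a, AbsVal b) {
    return (a.known && b.known) ? constant(a.value + b.value) : unknown();
}

/* Interprets the loop body of the example once on an incoming abstract state
   and returns the state at the assert. */
static State analyze_body(State in) {
    State then_s = in, else_s = in;
    then_s.x = constant(1);                   /* the branch condition x == 1 holds */
    then_s.y = constant(7);                   /* y = 7 */
    else_s.y = add(else_s.z, constant(4));    /* y = z + 4 */
    return join(then_s, else_s);              /* merge before the assert */
}
```

Starting from the loop-head state with z ↦ 3, both branches produce y ↦ 7, so the join keeps y as the constant 7 and the assertion is proved, while x becomes unknown because the two branches disagree about it.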
29
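                        /* c */
L0: a = 0
                        /* ac */
L1: b = a + 1
                        /* bc */
    c = c + b
                        /* bc */
    a = b * 2
                        /* ac */
    if c < N goto L1
                        /* c */
    return c
[Figure: control flow graph with one node per statement: a = 0; b = a + 1; c = c + b; a = b * 2; c < N goto L1; return c]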
30
a = 0               ∅
b = a + 1           ∅
c = c + b           ∅
a = b * 2           ∅
c < N goto L1       ∅
return c            ∅
31
a = 0               ∅
b = a + 1           ∅
c = c + b           ∅
a = b * 2           ∅
c < N goto L1       {c}
return c            ∅
32
a = 0               ∅
b = a + 1           ∅
c = c + b           ∅
a = b * 2           {c}
c < N goto L1       {c}
return c            ∅
33
a = 0               ∅
b = a + 1           ∅
c = c + b           {c, b}
a = b * 2           {c}
c < N goto L1       {c}
return c            ∅
34
a = 0               ∅
b = a + 1           {c, b}
c = c + b           {c, b}
a = b * 2           {c}
c < N goto L1       {c}
return c            ∅
35
a = 0               {c, a}
b = a + 1           {c, b}
c = c + b           {c, b}
a = b * 2           {c}
c < N goto L1       {c}
return c            ∅
36
a = 0               {c, a}
b = a + 1           {c, b}
c = c + b           {c, b}
a = b * 2           {c, a}
c < N goto L1       {c, a}
return c            ∅
37
Summary: Iterative Procedure
  • Analyze one procedure at a time
  • More precise solutions exist
  • Construct a control flow graph for the procedure
  • Initialize the values at every node to the most
    optimistic value
  • Iterate until convergence
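As a sketch of this procedure, the code below runs an iterative liveness analysis on the six-statement example from the preceding slides. It assumes the standard liveness equations (out[n] is the union of in[s] over the successors s of n, and in[n] = use[n] ∪ (out[n] − def[n])), encodes variable sets as bit masks, and treats N as a constant rather than a variable. The array names and the function `liveness` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { A = 1, B = 2, C = 4 };         /* one bit per variable */
enum { NODES = 6 };

/* Control flow graph of the example:
     0: a = 0       1: b = a + 1   2: c = c + b
     3: a = b * 2   4: if c < N goto L1 (node 1)   5: return c
   Each node has up to two successors; -1 means "no successor". */
static const unsigned use[NODES] = { 0, A, B | C, B, C, C };
static const unsigned def[NODES] = { A, B, C,     A, 0, 0 };
static const int succ[NODES][2]  = {
    { 1, -1 }, { 2, -1 }, { 3, -1 }, { 4, -1 }, { 5, 1 }, { -1, -1 }
};

/* Fills live_out[] by iterating the equations until nothing changes.
   Returns the number of passes over the graph. */
static int liveness(unsigned live_out[NODES]) {
    unsigned live_in[NODES] = { 0 };
    int passes = 0, changed = 1;
    assert(live_out != NULL);
    for (int n = 0; n < NODES; n++) live_out[n] = 0;    /* optimistic start: empty sets */
    while (changed) {
        changed = 0;
        passes++;
        for (int n = NODES - 1; n >= 0; n--) {          /* liveness flows backward */
            unsigned out = 0;
            for (int k = 0; k < 2; k++)
                if (succ[n][k] >= 0) out |= live_in[succ[n][k]];
            unsigned in = use[n] | (out & ~def[n]);
            if (out != live_out[n] || in != live_in[n]) changed = 1;
            live_out[n] = out;
            live_in[n] = in;
        }
    }
    return passes;
}
```

The sets only grow from one pass to the next and are bounded by the set of all variables, which is why the loop is guaranteed to terminate. Visiting the nodes in reverse order is a design choice that suits a backward problem and usually reduces the number of passes.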

38
Basic Compiler Phases
39
Overall Structure
40
Techniques Studied
  • Simple code generation
  • Basic blocks
  • Global register allocation
  • Activation records
  • Object Oriented
  • Assembler/Linker/Loader

41
Heap Memory Management
  • Part of the runtime system
  • Utilities for dynamic memory allocation
  • Utilities for automatic memory reclamation
  • Garbage Collection

42
Garbage Collection
  • Techniques
  • Mark and sweep
  • Copying collection
  • Reference counting
  • Modes
  • Generational
  • Incremental vs. Stop the world
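To make the first technique concrete, here is a minimal mark-and-sweep sketch over a toy heap. It assumes a fixed array of cells with two reference fields each, stored as indices, and a stop-the-world collector; real collectors work on actual memory layouts and usually avoid unbounded recursion in the mark phase. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { HEAP_SIZE = 8, NO_REF = -1 };

/* A toy heap cell: two outgoing references stored as indices into heap[]. */
typedef struct { bool allocated, marked; int ref[2]; } Cell;
static Cell heap[HEAP_SIZE];

/* Mark phase: depth-first traversal of everything reachable from cell i. */
static void mark(int i) {
    if (i == NO_REF) return;
    assert(i >= 0 && i < HEAP_SIZE && heap[i].allocated);
    if (heap[i].marked) return;          /* already visited: stops on cycles */
    heap[i].marked = true;
    mark(heap[i].ref[0]);
    mark(heap[i].ref[1]);
}

/* Sweep phase: free every unmarked cell and clear the marks on survivors.
   Returns the number of cells reclaimed. */
static int sweep(void) {
    int freed = 0;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].allocated && !heap[i].marked) {
            heap[i].allocated = false;
            freed++;
        }
        heap[i].marked = false;
    }
    return freed;
}

/* One full collection: mark from every root, then sweep. */
static int collect(const int *roots, int nroots) {
    for (int k = 0; k < nroots; k++) mark(roots[k]);
    return sweep();
}

/* Helper for building example heaps. */
static void alloc_cell(int i, int r0, int r1) {
    assert(i >= 0 && i < HEAP_SIZE);
    heap[i] = (Cell){ true, false, { r0, r1 } };
}
```

If cells 2 and 3 reference each other but are unreachable from the roots, `collect` reclaims both. Plain reference counting would not, because each cell in the cycle keeps the other's count above zero.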