1
CSCI 435 Compiler Design

Week 6 Class 2
Section 3.2.3 to section 3.2.3.2
(253-260)
Ray Schneider

2
Topics of the Day

Data Flow Equations
Setting them up
Solving them

3
Data-flow Equations

a half-way automation of full symbolic
interpretation
Stack Representation is replaced by a Collection
of Sets
the Semantics of the Node is described more
formally
Interpretation is replaced by a built-in and
fixed propagation mechanism
two set variables are associated with each node N
in the control flow graph (both start off empty)
the input set IN(N), and
the output set OUT(N)
they replace the stack representation and are
computed by the propagation mechanism

4
Node(s) of the Control Flow Graph

for each Node
IN(N) input set
OUT(N) output set
Input / Output sets contain static information
about the run-time situation at the node
Variable X is equal to 1 here
There has been no remote procedure call in any
path from the routine entry to here
Definitions for the variable y reach here from
nodes N1 and N2
Global variable line_count has been modified
since routine entry
GEN(N) contains items added by the node
KILL(N) contains items removed by the node N

5
Interpretation mechanism is missing so ...

nodes that modify the stack size are not handled
easily in setting up the data flow equations
ex. nodes occurring in expressions, like the node for a binary
operator such as '+', which removes two entries from the stack
and pushes one entry back on.
in practice this is dealt with by combining
groups of control flow nodes such that there is
no net stack effect
ex. for data-flow equation purposes this entire
set of control flow nodes is considered a single
node, with one IN, OUT, GEN, and KILL set.

6
Setting up the data-flow equations
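$$\mathrm{IN}(N) = \bigcup_{M \,=\, \text{dynamic predecessor of } N} \mathrm{OUT}(M)$$
$$\mathrm{OUT}(N) = (\mathrm{IN}(N) \setminus \mathrm{KILL}(N)) \cup \mathrm{GEN}(N)$$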

actual data-flow equations are the same for all
nodes
information at the ENTRANCE of a node N is equal
to the union of the information at the exit of
all dynamic predecessors to N
obviously true since no information is lost going
from Node to Node
information at the EXIT of a node N is in
principle equal to that at the entrance, except
that all the information in the KILL set has been
removed from it and all the information in the
GEN set has been added to it. (The order of
removing and adding is important: first the
information being invalidated is removed, then
the new information is added.)
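
The two equations on this slide can be sketched directly with Python sets. This is a minimal illustration, not code from the text: the function names and the dictionary layout (`predecessors`, `out_sets`) are assumptions made for the example.

```python
def in_set(node, predecessors, out_sets):
    """IN(N): union of OUT(M) over all dynamic predecessors M of N."""
    result = set()
    for pred in predecessors.get(node, ()):
        result |= out_sets[pred]
    return result


def out_set(in_n, kill_n, gen_n):
    """OUT(N): remove KILL(N) from IN(N) first, then add GEN(N)."""
    return (in_n - kill_n) | gen_n


# Order matters: item "c" is in both KILL and GEN, and it must survive
# because the GEN items are added after the KILL items are removed.
demo = out_set({"a", "b"}, kill_n={"b", "c"}, gen_n={"c"})
```

Here `demo` holds the items "a" and "c": "b" was killed, and "c" was generated after the kill step.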

7
example

Arrive at a node x := y with the IN set containing "Variable
x is equal to 0 here".
THEN the KILL set of the node contains the item
"Variable x is equal to * here" (where * stands for any value)
and the GEN set contains "Variable x equals y here"
1) all items in the IN set that are also in the
KILL set are erased, i.e. "Variable x is equal to
* here" subsumes "Variable x equals 0 here", so
that item is erased
2) next, items from the GEN set are added,
3) so the OUT set is "Variable x equals y here"
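
The three steps of this example can be traced in a few lines of Python. This is a sketch under assumptions: information items are modelled as (variable, value) pairs, and the wildcard marker, `killed` and `transfer` are names invented for the illustration.

```python
WILDCARD = "*"  # assumed marker: a KILL item that matches any value


def killed(item, kill_set):
    """True if some KILL item subsumes the given (variable, value) item."""
    var, value = item
    return any(kvar == var and kval in (WILDCARD, value)
               for kvar, kval in kill_set)


def transfer(in_items, kill_items, gen_items):
    """Step 1: erase killed items. Step 2: add the GEN items."""
    survivors = {item for item in in_items if not killed(item, kill_items)}
    return survivors | gen_items


in_items = {("x", "0")}          # "Variable x is equal to 0 here"
kill_items = {("x", WILDCARD)}   # kills any item about the value of x
gen_items = {("x", "y")}         # "Variable x equals y here"
out_items = transfer(in_items, kill_items, gen_items)
```

After the call, `out_items` contains only the item stating that x equals y.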

8
Interpreting the Data-flow Equations

While the operators for set union ∪ and set
difference \ are used, they really apply as
information union and information difference
operators
sometimes they behave as ordinary set union and
difference and can be implemented with binary,
Boolean representations (say bit sets), ex.
"Variable V may be uninitialized here" uses set union
(i.e. some predecessor node may be
uninitialized), or ex. "Variable V is guaranteed
to have a value here" uses set intersection (i.e. all
predecessor nodes must have a value)
sometimes the information is more complicated,
ex. Variable x has a value in the range M to N,
requiring ad hoc code be designed and written
that knows how to create, merge and examine such
ranges
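
A small sketch of the three kinds of merging mentioned on this slide, using Python integers as bit sets. The variable universe, the function names, and the choice to merge ranges by taking the smallest enclosing range are all assumptions made for the illustration.

```python
from functools import reduce

# Hypothetical universe of variables; each one gets a bit position.
VARIABLES = ["a", "b", "c"]
BIT = {name: 1 << i for i, name in enumerate(VARIABLES)}
ALL_BITS = (1 << len(VARIABLES)) - 1


def merge_may(pred_out_sets):
    """'May' information: set union (bitwise OR) over the predecessors."""
    return reduce(lambda acc, s: acc | s, pred_out_sets, 0)


def merge_must(pred_out_sets):
    """'Guaranteed' information: set intersection (bitwise AND).

    With no predecessors this returns ALL_BITS, the identity for AND.
    """
    return reduce(lambda acc, s: acc & s, pred_out_sets, ALL_BITS)


def merge_range(r1, r2):
    """Ad hoc merge for 'value in range M to N': smallest enclosing range."""
    return (min(r1[0], r2[0]), max(r1[1], r2[1]))


pred1 = BIT["a"] | BIT["b"]
pred2 = BIT["b"] | BIT["c"]
may = merge_may([pred1, pred2])    # bits for a, b and c
must = merge_must([pred1, pred2])  # bit for b only
```

The range case shows why bit sets are not always enough: the merge function has to know what a range is.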

9
Third Data-flow Equation

Zeroth Data Flow Equation
Defines the IN set of the first node of the
routine as the set of information items
established by the parameters of the routine
in particular each IN and INOUT parameter gives
rise to an item 'Parameter Pi has a value here'

IN = all value parameters have values
KILL = all local information
10
Solving the Data-flow equations (Closure)
First Data Flow equation tells us how to obtain
the IN set of all nodes when we know the OUT sets
of all nodes. Second Data Flow equation tells us
how to obtain the OUT set of a node if we know
its IN set (and its GEN and KILL sets, but they
are constants).
Closure Algorithm for Solving the Data-Flow Equations
Data definitions
1. Constant KILL and GEN sets for each node.
2. Variable IN and OUT sets for each node.
Initializations
1. The IN set of the top node is initialized with
information established externally.
2. For all other nodes N, IN(N) and OUT(N) are set to
empty.
Inference rules
$$\mathrm{IN}(N) = \bigcup_{M \,=\, \text{dynamic predecessor of } N} \mathrm{OUT}(M)$$
$$\mathrm{OUT}(N) = (\mathrm{IN}(N) \setminus \mathrm{KILL}(N)) \cup \mathrm{GEN}(N)$$
11
Implementation of the Closure Algorithm

implemented by traversing the control flow graph
repeatedly, computing the IN and OUT sets of the nodes
visited, until a complete traversal of the Control
Flow Graph produces no further change
Now we're ready to use the information for
context checking and code generation.
NOTE predecessors of a node are easy to find if
the Control Flow Graph is doubly linked as shown
earlier
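
The repeated traversal can be sketched as a loop that runs until a full pass changes nothing. This is a minimal illustration under assumptions: the graph is given as a `preds` dictionary, and the node names and information items in the example are invented.

```python
def solve(nodes, preds, gen, kill, entry, entry_info):
    """Iterate the two data-flow equations until nothing changes."""
    in_sets = {n: set() for n in nodes}
    out_sets = {n: set() for n in nodes}
    changed = True
    while changed:                      # one iteration = one full traversal
        changed = False
        for n in nodes:
            # Entry node starts from externally established information.
            new_in = set(entry_info) if n == entry else set()
            for m in preds.get(n, ()):
                new_in |= out_sets[m]
            new_out = (new_in - kill[n]) | gen[n]
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets, out_sets


# Hypothetical diamond-shaped graph: start -> then/else -> join.
nodes = ["start", "then", "else", "join"]
preds = {"then": ["start"], "else": ["start"], "join": ["then", "else"]}
gen = {"start": set(), "then": {"x"}, "else": {"y"}, "join": set()}
kill = {"start": set(), "then": set(), "else": {"p"}, "join": set()}
in_sets, out_sets = solve(nodes, preds, gen, kill, "start", {"p"})
```

For this graph the loop settles after two traversals. Visiting the nodes in control-flow order keeps the number of passes low, but any order reaches the same result.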

12
Trivalent Logic for initialization of variables
Note:
11 may or may not have a value
10 definitely does not have a value
01 definitely has a value
00 an error
x is guaranteed to have a value; y may or may not
have a value
x is guaranteed not to have a value; the
combination of 00 for y is an error
13
if y > 0 then x := y else y := 0 end if
[Figure: bit pairs for x and y at the nodes of the control flow graph for this statement]
Note:
11 may or may not have a value
10 definitely does not have a value
01 definitely has a value
00 an error
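
A sketch of how the two-bit encoding could be merged where the two branches of the if-statement join. Two things here are assumptions, not taken from the slide: the states before the if (x has no value, y has a value), and the use of bitwise OR as the merge operation, which fits the encoding because 01 OR 10 gives 11.

```python
NO_VALUE = 0b10   # definitely does not have a value
HAS_VALUE = 0b01  # definitely has a value
MAYBE = 0b11      # may or may not have a value
ERROR = 0b00      # no possibility left: an error


def merge(state_a, state_b):
    """Join two paths: keep every possibility seen on either path."""
    return state_a | state_b


def merge_envs(env_a, env_b):
    """Merge the per-variable states of two incoming paths."""
    return {var: merge(env_a[var], env_b[var]) for var in env_a}


# Assumed states on entry to: if y > 0 then x := y else y := 0 end if
before = {"x": NO_VALUE, "y": HAS_VALUE}
then_branch = dict(before, x=HAS_VALUE)  # x := y gives x a value
else_branch = dict(before, y=HAS_VALUE)  # y := 0 gives y a value
after_if = merge_envs(then_branch, else_branch)
```

Under these assumptions x ends up as 11 (assigned on one path only) and y stays 01.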
14
Summing Up a Little

Generally we visit all the nodes in Control Flow
Graph order this is not necessary but is
generally logical and convenient
The data-flow algorithm in itself only collects
information; it does no checking and does not
generate error messages or warnings
Additional traversals are needed to use the
information
ex. checking for uninitialized variables

15
Flex and Bison

Lex/Flex as we have seen previously is a program
generator for lexical processing of character
input streams
It accepts a high-level description for character
string matching and produces a program which
recognizes regular expressions
Lex written code recognizes the expressions in
the input
The Lex source file associates regular
expressions and program fragments provided by the
user which are executed as each expression
appears in the input

16
General Format of Lex
ex.
%%
[ \t]+$    ;
[ \t]+     printf(" ");
/* this lex input causes lex to ignore sequences
of 1 or more blanks or tabs up to the end of
line and for blanks not followed by the end
of line it will substitute a single blank */

definitions
%%
rules
%%
user subroutines

User definitions and user subroutines are often
omitted
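
The effect described in the example comment (drop blanks and tabs at the end of a line, collapse other runs to one blank) can be imitated with Python's re module. This is only an approximation for illustration: a Lex-generated scanner makes one pass over the input, while this sketch uses two substitutions, and the function name is invented.

```python
import re


def squeeze_blanks(text):
    """Imitate the two whitespace rules with two regex substitutions."""
    # Rule 1: blanks or tabs directly before the end of a line are dropped.
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Rule 2: any remaining run of blanks or tabs becomes a single blank.
    return re.sub(r"[ \t]+", " ", text)


cleaned = squeeze_blanks("a \t b  \nc\t\td\t\n")
```

The regular expressions are the same ones a Lex rule would use; only the driver around them differs.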
17
Uses of Lex

It can be used alone for simple transformations
of files, or for analysis and statistics
gathering at the lexical level
Lex generates lexical analyzers that are easy to
interface with Yacc/Bison
Lex programs recognize only regular expressions
Yacc writes parsers that accept a large class of
context free grammars, but requires a lower level
analyzer to recognize the input tokens

18
Combining Lex and Yacc

Lex is used to partition the input stream and
Yacc (the parser generator) assigns structure to
the remaining pieces

[Figure: Lexical Rules -> Lex -> yylex; Grammar Rules -> Yacc -> yyparse; Input -> yylex -> yyparse -> Parsed Input]
Note: all Yacc variables begin with 'yy', so you
can avoid collisions with the user generated code.
19
Yacc Specifications

Generally the Lexical Analyzer (ex. yylex.c) is
included as part of the Yacc Specification File
The full Yacc Specification File looks like
declarations
%%
rules
%%
programs
Where have we seen this before? Structure is
similar to Lex input, but what goes in the
sections is different.

20
Grammar Rules and Actions
Grammar Rules
Smallest legal Yacc Specification is
%%
rules
Grammar Rules look like
A : BODY ;
where A is a non-terminal name, BODY is a sequence of zero or
more names and literals, and the : and ;
are Yacc punctuation.
Actions Associated with Rules
With each grammar rule, the user may associate
actions to be performed each time the rule is
recognized in the input process. An action is
specified by one or more statements enclosed in
curly braces '{' and '}'
21
Examples

A : '(' B ')' { hello(1, "abc"); }
XXX : YYY ZZZ { printf("a message\n"); flag = 25; }
To facilitate easy communication between the
actions and the parser, Yacc uses the special '$'
symbol. '$$' is a pseudo-variable for the left
hand side of the grammar rule, and $1, $2, etc.
are pseudo-variables for the elements of the rhs
A : B C D ; $2 has the value returned by C,
etc.
default is $$ = $1, the value of the first element
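
One way to picture the pseudo-variables is as positions on a value stack at the moment a rule is reduced. The Python sketch below is only a model of that idea, not Yacc's generated code: `reduce_rule`, the list-based stack, and the sample values are all invented for the illustration.

```python
def reduce_rule(value_stack, rhs_len, action=None):
    """Pop the values of the rhs elements and push the value of the lhs.

    rhs_values[0] plays the role of the first rhs element's value,
    rhs_values[1] the second, and so on; the returned result plays
    the role of the left hand side's value.
    """
    start = len(value_stack) - rhs_len
    rhs_values = value_stack[start:]
    del value_stack[start:]
    if action is None:
        # Default action: the lhs takes the value of the first element.
        result = rhs_values[0] if rhs_values else None
    else:
        result = action(rhs_values)
    value_stack.append(result)
    return result


# Rule with three rhs elements and no action: the default applies.
stack = ["value of B", "value of C", "value of D"]
default_result = reduce_rule(stack, 3)

# Hypothetical action that combines the first and third rhs values.
numbers = [10, "+", 32]
total = reduce_rule(numbers, 3, lambda v: v[0] + v[2])
```

After each call the stack holds a single entry, the value computed for the left hand side.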

22
How the parser works

Yacc turns the specification file into a C
program which parses the input according to the
specification given.
The parser that is produced consists of a Finite
State Machine with a stack with a look ahead
token. The current state is the one on top of
the stack.
The machine has only four actions: shift, reduce,
accept and error
We'll LEAVE YACC THERE. You need to read about it
so you can use it.
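
To make the four actions concrete, here is a tiny table-driven parser in Python. It is a hand-built sketch, not Yacc output: the grammar (S : '(' S ')' | 'x'), the state numbers, and the tables were worked out by hand for this illustration. The current state is always the one on top of the stack, and "$" marks end of input.

```python
# Rules: 1) S : '(' S ')'   2) S : 'x'
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}
RULES = {1: ("S", 3), 2: ("S", 1)}  # rule number -> (lhs, rhs length)


def parse(tokens):
    """Return (accepted, list of actions taken)."""
    stack = [0]                       # stack of states
    tokens = list(tokens) + ["$"]
    pos = 0
    trace = []
    while True:
        lookahead = tokens[pos]
        kind, arg = ACTION.get((stack[-1], lookahead), ("error", None))
        trace.append(kind)
        if kind == "shift":           # push new state, consume the token
            stack.append(arg)
            pos += 1
        elif kind == "reduce":        # pop one state per rhs symbol
            lhs, rhs_len = RULES[arg]
            del stack[len(stack) - rhs_len:]
            stack.append(GOTO[(stack[-1], lhs)])
        else:                         # accept or error ends the parse
            return kind == "accept", trace


ok, actions = parse("(x)")
```

Each character of the input string is treated as one token, so no separate lexer is needed for this example.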

23
Homework for Week 8

Bison Familiarization
Read the entire 39 pages of "A Compact Guide To
Lex and Yacc" // you can skim through it the
first time
THEN concentrate first on getting the lex example
on page 10 running
THEN after you have that running go on to
Practice, Part 1 and strive to get the primitive
calculator running (pages 14 through 17)

24
References

Text: Modern Compiler Design (Figures)
Lex: A Lexical Analyzer Generator, by M.E. Lesk
and E. Schmidt
Yacc: Yet Another Compiler-Compiler, by Stephen C.
Johnson
see http://dinosaur.compilertools.net/yacc/index.html
and http://dinosaur.compilertools.net/lex/index.html
