Title: Compiler Design Chapter 1
1. Compiler Design - Chapter 1
Introduction
2. What is a Compiler?
3. What is a compiler?
- Translates source code to target code
- Source code is typically a high-level programming language (Java, C, etc.) but does not have to be
- Target code is often a low-level language like assembly or machine code but does not have to be
- Can you think of other compilers that you have used according to this definition?
4. What is a compiler?
- Javadoc -> HTML
- High-level description of a circuit -> machine instructions to fabricate the circuit
- SQL query output -> table
- PostScript -> PDF
5. This Course
A modern compiler has many phases. This course follows the organization of a compiler, each part covering a successive phase.
- Techniques
- Data structures
- Algorithms
6. The Compilation Phases
7. Lexical Analysis
Also called scanning or tokenization.
Input: double d1; double d2; d2 = d1 * 2.0;
Lexeme   Token             Description
double   TOK_DOUBLE        reserved word
d1       TOK_ID            variable name
;        TOK_PUNCT         has value of ";"
double   TOK_DOUBLE        reserved word
d2       TOK_ID            variable name
;        TOK_PUNCT         has value of ";"
d2       TOK_ID            variable name
=        TOK_OPER          has value of "="
d1       TOK_ID            variable name
*        TOK_OPER          has value of "*"
2.0      TOK_FLOAT_CONST   has value of 2.0
;        TOK_PUNCT         has value of ";"
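As a rough illustration of what a scanner does, the following hand-written sketch splits a few declarations and an assignment into tokens. The `Token`, `TinyLexer`, and `LexerDemo` classes are hypothetical names introduced here, and `TOK_INT_CONST` is an assumed token kind that does not appear on the slide. A real compiler would normally generate this code from a specification rather than write it by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token type: a kind (TOK_*) plus the matched text.
final class Token {
    final String kind;
    final String text;
    Token(String kind, String text) { this.kind = kind; this.text = text; }
    @Override public String toString() { return kind + "(" + text + ")"; }
}

final class TinyLexer {
    // Scans the input left to right, emitting one token per lexeme.
    static List<Token> tokenize(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isLetter(c)) {
                // Identifier or reserved word: a letter followed by letters or digits.
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                String word = src.substring(start, i);
                out.add(new Token(word.equals("double") ? "TOK_DOUBLE" : "TOK_ID", word));
            } else if (Character.isDigit(c)) {
                // Numeric constant: digits, optionally followed by '.' and more digits.
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                boolean isFloat = false;
                if (i < src.length() && src.charAt(i) == '.') {
                    isFloat = true;
                    i++;
                    while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                }
                // TOK_INT_CONST is an assumption of this sketch.
                out.add(new Token(isFloat ? "TOK_FLOAT_CONST" : "TOK_INT_CONST",
                                  src.substring(start, i)));
            } else if (c == ';') {
                out.add(new Token("TOK_PUNCT", ";"));
                i++;
            } else if ("=+-*/".indexOf(c) >= 0) {
                out.add(new Token("TOK_OPER", String.valueOf(c)));
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return out;
    }
}

final class LexerDemo {
    public static void main(String[] args) {
        for (Token t : TinyLexer.tokenize("double d1; double d2; d2 = d1 * 2.0;")) {
            System.out.println(t);
        }
    }
}
```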
8. Syntax and Semantics
- Syntax: the form or structure of expressions (which consist of tokens)
- Syntax analysis (also called parsing): determines whether an expression is well formed
- e.g. d1 * 2.0 is legal while d1 * * 2.0 is not
- Semantics: the meaning of an expression
- Semantic analysis
- interprets types, operations, etc.
- translates syntax into an intermediate representation (IR) for generating machine code
9. Why intermediate representation?
program in some source language -> front-end analysis -> Intermediate Representation -> back-end synthesis -> executable code for target machine
10. Compiler Phases and Interfaces
11. Reusable Modules
- Each phase is implemented by one or more software modules
- To change the target machine, replace the Frame Layout and Instruction Selection modules
- To change the source language, change the modules up through Translate
12. Interfaces as Data Structures
- Some interfaces are data structures
- e.g. Parsing Actions phase -> Abstract Syntax Tree (AST) -> Semantic Analysis phase
13. Parse tree and AST example
- Grammar
- defines syntactic structure declaratively using a set of productions of the form
- symbol -> symbol symbol symbol
- Expression grammar
- expression -> expression + term | expression - term | term
- term -> term * factor | term / factor | factor
- factor -> identifier | constant | ( expression )
- Example expression
- b*b - 4*a*c
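One possible way to turn the expression grammar into code is a recursive-descent parser, sketched below. The grammar is left recursive, which a recursive-descent parser cannot follow literally, so this sketch replaces the left recursion with loops, which still produce left-associative trees. `ExprParser` is a hypothetical class. For brevity it reads characters directly instead of tokens, treats any run of letters and digits as an identifier or constant, and returns the tree as a prefix-notation string.

```java
// Hypothetical recursive-descent parser for the expression grammar.
final class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String src) { this.src = src; }

    // Parses a complete expression and returns its tree in prefix form.
    static String parse(String src) {
        ExprParser p = new ExprParser(src);
        String tree = p.expression();
        p.skipSpaces();
        if (p.pos != p.src.length()) throw p.error("unexpected input");
        return tree;
    }

    // expression -> expression + term | expression - term | term
    private String expression() {
        String left = term();
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return left;
            pos++;
            left = "(" + op + " " + left + " " + term() + ")";
        }
    }

    // term -> term * factor | term / factor | factor
    private String term() {
        String left = factor();
        while (true) {
            char op = peek();
            if (op != '*' && op != '/') return left;
            pos++;
            left = "(" + op + " " + left + " " + factor() + ")";
        }
    }

    // factor -> identifier | constant | ( expression )
    private String factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = expression();
            if (peek() != ')') throw error("expected ')'");
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (start == pos) throw error("expected identifier, constant, or '('");
        return src.substring(start, pos);
    }

    // Returns the next non-blank character without consuming it, or '\0' at end of input.
    private char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at position " + pos);
    }
}

final class ExprParserDemo {
    public static void main(String[] args) {
        System.out.println(ExprParser.parse("b*b - 4*a*c"));
    }
}
```

An input that is not well formed, such as two operators in a row, makes `parse` throw an `IllegalArgumentException` instead of returning a tree.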
14. Parse tree: b*b - 4*a*c
expression
  expression
    term
      term
        factor
          identifier b
      *
      factor
        identifier b
  -
  term
    term
      term
        factor
          constant 4
      *
      factor
        identifier a
    *
    factor
      identifier c
15. AST: b*b - 4*a*c
-
  *
    b
    b
  *
    *
      4
      a
    c
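To show how an AST drops the grammar's bookkeeping nodes (expression, term, factor) and keeps only operators and operands, here is a small sketch. The node classes `Node`, `Num`, `Var`, and `BinOp` are hypothetical names chosen for this example, and the `eval` method is added only to show that the tree shape encodes evaluation order.

```java
import java.util.Map;

// Hypothetical AST node types; the names are illustrative.
abstract class Node {
    abstract double eval(Map<String, Double> env);
}

final class Num extends Node {
    final double value;
    Num(double value) { this.value = value; }
    @Override double eval(Map<String, Double> env) { return value; }
}

final class Var extends Node {
    final String name;
    Var(String name) { this.name = name; }
    @Override double eval(Map<String, Double> env) {
        Double v = env.get(name);
        if (v == null) throw new IllegalArgumentException("unbound variable " + name);
        return v;
    }
}

final class BinOp extends Node {
    final char op;
    final Node left, right;
    BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
    @Override double eval(Map<String, Double> env) {
        double l = left.eval(env), r = right.eval(env);
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new IllegalStateException("unknown operator " + op);
        }
    }
}

final class AstDemo {
    // Builds the AST for b*b - 4*a*c: '-' at the root, '*' nodes below it.
    static Node discriminant() {
        Node bb = new BinOp('*', new Var("b"), new Var("b"));
        Node fourAC = new BinOp('*', new BinOp('*', new Num(4), new Var("a")), new Var("c"));
        return new BinOp('-', bb, fourAC);
    }

    public static void main(String[] args) {
        System.out.println(discriminant().eval(Map.of("a", 1.0, "b", 5.0, "c", 6.0)));
    }
}
```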
16. Interfaces as Abstract Data Types
- Other interfaces are abstract data types
- Tokens interface: a function that the Parser calls to get the next token of the input program
- Translate interface: a set of functions that the Semantic Analysis phase can call
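The idea of an interface as an abstract data type can be sketched as follows: the consumer only knows a single function for fetching the next token and nothing about how tokens are produced. `TokenSource`, `ListTokenSource`, and `TokenCounter` are hypothetical names, and returning `null` at end of input is an assumption of this sketch, not something the slides specify.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical ADT-style interface: the parser sees only nextToken().
interface TokenSource {
    // Returns the next token, or null at end of input (an assumption of this sketch).
    String nextToken();
}

// One possible implementation, backed by a pre-computed list of tokens.
final class ListTokenSource implements TokenSource {
    private final Iterator<String> it;
    ListTokenSource(List<String> tokens) { this.it = tokens.iterator(); }
    @Override public String nextToken() { return it.hasNext() ? it.next() : null; }
}

final class TokenCounter {
    // Stand-in for a parser: pulls tokens one at a time through the interface.
    static int countTokens(TokenSource src) {
        int n = 0;
        while (src.nextToken() != null) n++;
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countTokens(new ListTokenSource(List.of("d2", "=", "d1"))));
    }
}
```

Because the consumer depends only on `TokenSource`, the list-backed implementation could be swapped for a real scanner without changing the consuming code.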
17. Tools and Software
- Two important abstractions
- Regular expressions for lexical analysis
- Context-free grammars for parsing
- To make use of these abstractions, use special tools
- Lex converts a declarative specification into a lexical analysis program
- Yacc converts a grammar into a parsing program
- Java versions: JLex, CUP, and JavaCC
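To give a feel for specifying tokens declaratively, the sketch below describes each token class with a regular expression and lets the standard `java.util.regex` library do the matching. This is not how Lex, JLex, or JavaCC are used; it is only a stand-in for the idea, and `RegexLexer` together with its group names is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexLexer {
    // One alternative per token class, written declaratively as a regular expression.
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(?<id>[A-Za-z][A-Za-z0-9]*)"
        + "|(?<num>[0-9]+(?:\\.[0-9]+)?)"
        + "|(?<op>[-=+*/])"
        + "|(?<punct>;))");

    // Returns tokens as "CLASS:text" strings, in input order.
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                if (src.substring(pos).trim().isEmpty()) break; // only trailing whitespace left
                throw new IllegalArgumentException("no token matches at position " + pos);
            }
            if (m.group("id") != null) out.add("ID:" + m.group("id"));
            else if (m.group("num") != null) out.add("NUM:" + m.group("num"));
            else if (m.group("op") != null) out.add("OP:" + m.group("op"));
            else out.add("PUNCT:" + m.group("punct"));
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = y1 + 2.5;"));
    }
}
```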
18. Why use Java?
- Java is object-oriented
- Java is safe: programs cannot circumvent the type system to violate abstractions
- Java has garbage collection, which simplifies management of dynamic storage allocation
19. A Straight-line Programming Language
Statements and expressions; no loops or if statements.
[Figure: grammar of the language, with a class name for each rule]
20. Informal Semantics
- Stm: statement; Exp: expression
- s1; s2 executes statement s1, then statement s2
- i := e evaluates the expression e, then stores the result in i
- print(e1, e2, ..., en) displays the values of all the expressions
- (s, e) is an expression sequence: it evaluates the statement s for side effects before returning the result of e
21. Tree Representation of Straight-line Program
one node for each s and e
22. Representation of Straight-line Program
Grammar can be translated directly into data structure definitions.
23. Abstract Classes
- Each grammar symbol: an abstract class
- Each grammar rule: a concrete class
- The constructor initializes the right-hand side (RHS) components
- RHS components are represented using data structure fields (instance variables)
24. Programming Style
- Trees are described by a grammar
- A tree is described by one or more abstract classes corresponding to grammar symbols
- Each abstract class is extended by subclasses, one per grammar rule
- For each symbol in the RHS of a rule, there is a field in the class
- Every class has a constructor that initializes all the fields
- Data structures are initialized when constructors create them and never modified after that
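The style rules above can be sketched for part of the straight-line language. The grammar figure itself is not reproduced in these notes, so the productions in the comments are inferred from the informal semantics, and the class names (`CompoundStm`, `AssignStm`, `PrintStm`, `IdExp`, `NumExp`, `EseqExp`) are illustrative assumptions. Binary operator expressions are left out for brevity, and a `java.util.List` stands in for whatever expression-list structure the course actually uses.

```java
import java.util.List;

// One abstract class per grammar symbol.
abstract class Stm {}
abstract class Exp {}

// Stm -> Stm ; Stm
final class CompoundStm extends Stm {
    final Stm stm1, stm2;
    CompoundStm(Stm stm1, Stm stm2) { this.stm1 = stm1; this.stm2 = stm2; }
}

// Stm -> id := Exp
final class AssignStm extends Stm {
    final String id;
    final Exp exp;
    AssignStm(String id, Exp exp) { this.id = id; this.exp = exp; }
}

// Stm -> print(Exp, ..., Exp)
final class PrintStm extends Stm {
    final List<Exp> exps;
    PrintStm(List<Exp> exps) { this.exps = List.copyOf(exps); } // unmodifiable copy
}

// Exp -> id
final class IdExp extends Exp {
    final String id;
    IdExp(String id) { this.id = id; }
}

// Exp -> num
final class NumExp extends Exp {
    final int num;
    NumExp(int num) { this.num = num; }
}

// Exp -> (Stm, Exp)
final class EseqExp extends Exp {
    final Stm stm;
    final Exp exp;
    EseqExp(Stm stm, Exp exp) { this.stm = stm; this.exp = exp; }
}

final class StraightLineDemo {
    // Builds the tree for:  a := 5; print(a, (b := 7, b))
    static Stm sample() {
        return new CompoundStm(
            new AssignStm("a", new NumExp(5)),
            new PrintStm(List.of(
                new IdExp("a"),
                new EseqExp(new AssignStm("b", new NumExp(7)), new IdExp("b")))));
    }
}
```

Every field is `final` and set only in the constructor, so a tree cannot be modified once it has been built.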
25. Modularity Principles
- Compilers are big programs; to prevent chaos:
- Each phase / module gets its own package
- No import-on-demand, e.g. import A.G.*
- Only single-type imports, e.g. import A.G.X
- Java is naturally multi-threaded
- Multiple compiler threads, therefore no static variables unless they are final (constant). Never want two compiler threads updating the same (static) instance of a variable!
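A minimal sketch of these rules, assuming a hypothetical `ErrorLog` module: it uses single-type imports only, its one static member is a final constant, and all mutable state lives in instance fields, so two compiler threads can each work with their own log.

```java
import java.util.ArrayList; // single-type import: names exactly one class
import java.util.List;      // rather than the on-demand form java.util.*

// Hypothetical compiler module that keeps its mutable state in instance fields.
final class ErrorLog {
    static final int MAX_ERRORS = 100;                       // final constant: safe to share
    private final List<String> messages = new ArrayList<>(); // per-instance, not static

    void report(String msg) {
        if (messages.size() < MAX_ERRORS) messages.add(msg);
    }

    int count() { return messages.size(); }
}

final class ThreadDemo {
    // Runs two "compilations" in parallel, each with its own ErrorLog.
    static int[] compileTwo() {
        ErrorLog logA = new ErrorLog();
        ErrorLog logB = new ErrorLog();
        Thread a = new Thread(() -> { for (int i = 0; i < 3; i++) logA.report("A" + i); });
        Thread b = new Thread(() -> { for (int i = 0; i < 5; i++) logB.report("B" + i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for compiler threads", e);
        }
        return new int[] { logA.count(), logB.count() };
    }

    public static void main(String[] args) {
        int[] counts = compileTwo();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Had `messages` been a static field, both threads would have appended to the same unsynchronized list, which is the situation the last bullet warns against.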
26. Assignment 0
- Read chapters 1 and 2
- JavaCC (available at https://javacc.dev.java.net/) has been installed at C:\javacc-3.2 in lab 303.
- Go through the simple examples located under the "examples" directory in a directory called "SimpleExamples".
- Read the file "README" in this directory for complete instructions.
- Read the tutorial (.pdf) and introduction notes (.ppt)
- Program: Straight-line program interpreter
- A warm-up exercise in Java programming
- No due date and will not be graded