Title: Course Overview
1Course Overview
- Mooly Sagiv
- msagiv_at_post.tau.ac.il
- Schreiber 317
- 03-640-7606
- Wed 10:00-12:00
- http://www.math.tau.ac.il/msagiv/courses/wcc02.html
- Textbook: Modern Compiler Implementation in C
- Andrew Appel
- ISBN 0-521-58390-X
- CS 0368-4452-01_at_listserv.tau.ac.il
2Outline
- High level programming languages
- Interpreter vs. Compiler
- Abstract Machines
- Why study compilers?
- Main Compiler Phases
3High Level Programming Languages
- Imperative
- Algol, PL1, Fortran, Pascal, Ada, Modula, and C
- Closely related to von Neumann Computers
- Object-oriented
- Simula, Smalltalk, Modula3, C++, Java, C#
- Data abstraction and an evolutionary form of program development
- Class: An implementation of an abstract data type (data + code)
- Objects: Instances of a class
- Fields: Data (structure fields)
- Methods: Code (procedures/functions with overloading)
- Inheritance: Refining the functionality of a class with different fields and methods
- Functional
- Lisp, Scheme, ML, Miranda, Hope, Haskell
- Logic Programming
- Prolog
4Other Languages
- Hardware description languages
- VHDL
- The program describes Hardware components
- The compiler generates hardware layouts
- Shell languages: Shell, C-shell, REXX
- Include primitive constructs from the current software environment
- Graphics and text processing: TeX, LaTeX, PostScript
- The compiler generates page layouts
- Web/Internet
- HTML, MAWL, Telescript, JAVA
- Intermediate-languages
- P-Code, Java bytecode, IDL, CLR
5Interpreter
- Input
- A program
- An input for the program
- Output
- The required output
[Diagram: the program and its input go into the interpreter, which produces the output]
6Example
C interpreter
7Compiler
- Input
- A program
- Output
- An object program that reads the input and
writes the output
[Diagram: the program goes into the compiler, which produces the object program]
8Example
Sparc-cc-compiler
add %fp,-8,%l1
mov %l1,%o1
call scanf
ld [%fp-8],%l0
add %l0,1,%l0
st %l0,[%fp-8]
ld [%fp-8],%l1
mov %l1,%o1
call printf
assembler/linker
object-program
9Interpreter vs. Compiler
- More efficient
- Compilation is done once for all the inputs, so many computations can be performed at compile-time
- Sometimes even compile-time + execution-time < interpretation-time
- Can report errors before input is given
- Conceptually simpler (the definition of the programming language)
- Easier to port
- Can provide more specific error report
- Normally faster
10Interpreters provide specific error report
- Input-program
- Input data: y = 0
scanf("%d", &y);
if (y < 0) x = 5;
...
if (y < 0) z = x + 1;
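One way an interpreter can give such a specific report is to track, at run time, whether each variable has been assigned. The following is a minimal sketch in C, not the course's implementation: the names `Var`, `var_set`, `var_get`, and `run_fragment` are hypothetical, and the fragment it interprets assumes both conditions read `y < 0`. An error is reported only if a read actually happens before an assignment for the given input.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical interpreter variable: a value plus a "has been assigned" flag. */
typedef struct {
    const char *name;
    int value;
    int defined;   /* 0 until the first assignment */
} Var;

void var_set(Var *v, int value) {
    v->value = value;
    v->defined = 1;
}

/* Reads v into *out. Returns 0 on success; returns -1 and names the exact
   variable and line if v is read before it was ever assigned. */
int var_get(const Var *v, int line, int *out) {
    assert(out != NULL);
    if (!v->defined) {
        fprintf(stderr, "line %d: %s used before set\n", line, v->name);
        return -1;
    }
    *out = v->value;
    return 0;
}

/* Interprets the fragment
       if (y < 0) x = 5;  ...  if (y < 0) z = x + 1;
   for one concrete input y. Returns 0 if no run-time error occurred. */
int run_fragment(int y, int *z_out) {
    Var x = { "x", 0, 0 };
    int tmp;
    if (y < 0) var_set(&x, 5);
    if (y < 0) {
        if (var_get(&x, 3, &tmp) != 0) return -1;   /* 3: illustrative line number */
        *z_out = tmp + 1;
    }
    return 0;
}
```

For the input y = 0 neither branch runs, so `run_fragment` reports nothing; for a negative input, x is assigned before it is read. The check costs a flag test on every read, which is one reason interpretation tends to be slower.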
11Compilers are usually more efficient
Sparc-cc-compiler
add %fp,-8,%l1
mov %l1,%o1
call scanf
mov 5,%l0
st %l0,[%fp-12]
mov 7,%l0
st %l0,[%fp-16]
ld [%fp-8],%l0
ld [%fp-8],%l0
add %l0,35,%l0
st %l0,[%fp-8]
ld [%fp-8],%l1
mov %l1,%o1
call printf
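The constant 35 in the listing suggests that the compiler evaluated a product such as 5 * 7 at compile time; this is an assumption, since the source program is not shown on the slide. The sketch below illustrates the general technique, constant folding, on a tiny expression tree. The types and the names `Exp`, `mk`, and `fold` are hypothetical, and integer overflow is ignored.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NUM, VAR, ADD, MUL } Kind;

typedef struct Exp {
    Kind kind;
    int value;                  /* used when kind == NUM */
    const char *name;           /* used when kind == VAR */
    struct Exp *left, *right;   /* used when kind == ADD or MUL */
} Exp;

/* Allocates one expression node. */
Exp *mk(Kind kind, int value, const char *name, Exp *left, Exp *right) {
    Exp *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->kind = kind;
    e->value = value;
    e->name = name;
    e->left = left;
    e->right = right;
    return e;
}

/* Folds constant subexpressions bottom-up, rewriting nodes in place:
   an ADD or MUL whose operands are both NUM becomes a single NUM. */
Exp *fold(Exp *e) {
    assert(e != NULL);
    if (e->kind == ADD || e->kind == MUL) {
        e->left = fold(e->left);
        e->right = fold(e->right);
        if (e->left->kind == NUM && e->right->kind == NUM) {
            int a = e->left->value, b = e->right->value;
            int v = (e->kind == ADD) ? a + b : a * b;
            free(e->left);
            free(e->right);
            e->kind = NUM;
            e->value = v;
            e->left = e->right = NULL;
        }
    }
    return e;
}
```

Folding `x + 5 * 7` this way leaves `x + 35`: the multiplication is paid for once, at compile time, instead of on every execution.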
12Compilers can provide errors before actual input is given
- Input-program
- Compiler-Output: line 4: improper pointer/integer combination: op "="
int a[100], x, y;
scanf("%d", &y);
if (y < 0)
/* line 4 */ y = a;
13Compilers can provide errors before actual input is given
- Input-program
- Compiler-Output: line 88: "x may be used before set"
scanf("%d", &y);
if (y < 0) x = 5;
...
if (y < 0)
/* line 88 */ z = x + 1;
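A warning of this kind can come from a conservative "definite assignment" check that never looks at the input. The sketch below is one simple way to do it, under stated assumptions: programs are reduced to three hypothetical statement kinds (`S_SET`, `S_USE`, `S_IF`), variables are small integer indices, and the condition of an `if` is ignored entirely. It is not the algorithm of any particular compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { S_SET, S_USE, S_IF } StmKind;

typedef struct Stm {
    StmKind kind;
    int var;                  /* variable index for S_SET and S_USE */
    int line;                 /* source line, used in warnings */
    const struct Stm *body;   /* for S_IF: the guarded statements */
    int body_len;
} Stm;

/* Returns the set (bitmask) of variables definitely assigned after
   stms[0..n-1], starting from `defined`. Each use of a variable that is not
   definitely assigned prints a warning and increments *warnings. */
uint32_t check(const Stm *stms, int n, uint32_t defined, int *warnings) {
    for (int i = 0; i < n; i++) {
        const Stm *s = &stms[i];
        switch (s->kind) {
        case S_SET:
            assert(s->var >= 0 && s->var < 32);
            defined |= UINT32_C(1) << s->var;
            break;
        case S_USE:
            assert(s->var >= 0 && s->var < 32);
            if ((defined & (UINT32_C(1) << s->var)) == 0) {
                printf("line %d: variable %d may be used before set\n",
                       s->line, s->var);
                (*warnings)++;
            }
            break;
        case S_IF:
            /* The body may or may not run, so its assignments are not
               definite afterwards: analyze it, then discard its result. */
            (void)check(s->body, s->body_len, defined, warnings);
            break;
        }
    }
    return defined;
}
```

On the slide's program this check warns at line 88 even though, at run time, x is read only when it has been assigned. The compiler reports before any input exists, at the price of a less specific report than an interpreter could give.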
14Abstract Machines
- A compromise between compilers and interpreters
- An intermediate program representation
- The intermediate representation is interpreted
- Example: Zurich P4 Pascal Compiler (1981)
- Other examples: Java bytecode, MS .NET
- The intermediate code can be compiled
[Diagram: Pascal compiler -> P-code -> interpreter, which also receives the program's input]
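To make the idea of an interpreted intermediate representation concrete, here is a minimal sketch of a stack-based abstract machine in C. The instruction set (`OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_HALT`) is invented for illustration and is not real P-code or Java bytecode.

```c
#include <assert.h>

/* Hypothetical instruction set for a tiny stack machine. */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT } OpCode;

typedef struct {
    OpCode op;
    int arg;   /* operand for OP_PUSH; ignored otherwise */
} Instr;

#define STACK_MAX 64

/* Interprets code until OP_HALT and returns the value on top of the stack. */
int run(const Instr *code) {
    int stack[STACK_MAX];
    int sp = 0;   /* number of values currently on the stack */
    for (const Instr *pc = code; ; pc++) {
        switch (pc->op) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = pc->arg;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}
```

A compiler front end would emit an `Instr` array such as PUSH 5, PUSH 3, ADD, PUSH 10, MUL, HALT for `(5 + 3) * 10`. Only the small `run` loop has to be ported to a new machine, which is where the portability of this scheme comes from.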
15Why Study Compilers
- Become a compiler writer
- New programming languages
- New machines
- New compilation modes: just-in-time
- Using some of the techniques in other contexts
- Design a very big software program using a reasonable effort
- Learn applications of many CS results (formal languages, decidability, graph algorithms, dynamic programming, ...)
- Better understanding of programming languages and machine architectures
- Become a better programmer
16Course Requirements
- Compiler Project: 35%
- Develop a Tiger Front-End in C
- Two parts
- Lex+Yacc (Chapters 2, 3, 4)
- Semantic analysis (Chapters 5, 12)
- Tight schedule
- Bonus: 10%
- Theoretical Exercises: 15%
- Final exam: 50%
17Compiler Phases
- The compiler program is usually written as a sequence of well-defined phases
- The interfaces between the phases are well defined (another language)
- It is sometimes convenient to use auxiliary global information (e.g., symbol table)
- Advantages of the phase separation
- Modularity
- Simplicity
- Reusability
18Basic Compiler Phases
Source program (string)
-> lexical analysis (finite automata)
Tokens
-> syntax analysis (pushdown automata)
Abstract syntax tree
-> semantic analysis
-> Translate (memory organization)
Intermediate representation
-> Instruction selection (dynamic programming)
Assembly
-> Register allocation (graph algorithms)
Fin. Assembly
19Example: straight-line programming
Stm -> Stm ; Stm // (CompoundStm)
Stm -> id := Exp // (AssignStm)
Stm -> print (ExpList) // (PrintStm)
Exp -> id // (IdExp)
Exp -> num // (NumExp)
Exp -> Exp Binop Exp // (OpExp)
Exp -> (Stm, Exp) // (EseqExp)
ExpList -> Exp, ExpList // (PairExpList)
ExpList -> Exp // (LastExpList)
Binop -> + // (Plus)
Binop -> - // (Minus)
Binop -> * // (Times)
Binop -> / // (Div)
20Example Input
a := 5 + 3;
b := (print(a, a-1), 10 * a);
print(b)
21Questions
- How to check that a program is correct?
- How to internally represent the compiled program?
22Lexical Analysis
a\b := 5 + 3;\nb := (print(a, a-1), 10 * a);\nprint(b)
id(a) assign num(5) + num(3) ; id(b) assign (print(id(a) , id(a) - num(1)), num(10) * id(a)) ; print(id(b))
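The step from characters to tokens can be sketched as a small hand-written scanner. This is an illustration only: the course project uses Lex, the token names below are hypothetical, and the spelling `:=` for the assign token is taken from the textbook's straight-line language.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    TOK_ID, TOK_NUM, TOK_PRINT, TOK_ASSIGN, TOK_PLUS, TOK_MINUS, TOK_TIMES,
    TOK_DIV, TOK_LPAREN, TOK_RPAREN, TOK_COMMA, TOK_SEMI, TOK_EOF, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* lexeme for TOK_ID and TOK_NUM, truncated if longer */
} Token;

/* Scans one token starting at *src and advances *src past it. */
Token next_token(const char **src) {
    assert(src != NULL && *src != NULL);
    Token t = { TOK_ERROR, "" };
    const char *p = *src;
    while (isspace((unsigned char)*p)) p++;   /* skip blanks and newlines */
    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (isalpha((unsigned char)*p)) {
        size_t n = 0;
        while (isalnum((unsigned char)*p)) {
            if (n < sizeof t.text - 1) t.text[n++] = *p;
            p++;
        }
        t.text[n] = '\0';
        t.kind = strcmp(t.text, "print") == 0 ? TOK_PRINT : TOK_ID;
    } else if (isdigit((unsigned char)*p)) {
        size_t n = 0;
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof t.text - 1) t.text[n++] = *p;
            p++;
        }
        t.text[n] = '\0';
        t.kind = TOK_NUM;
    } else if (p[0] == ':' && p[1] == '=') {
        t.kind = TOK_ASSIGN;
        p += 2;
    } else {
        switch (*p) {
        case '+': t.kind = TOK_PLUS;   break;
        case '-': t.kind = TOK_MINUS;  break;
        case '*': t.kind = TOK_TIMES;  break;
        case '/': t.kind = TOK_DIV;    break;
        case '(': t.kind = TOK_LPAREN; break;
        case ')': t.kind = TOK_RPAREN; break;
        case ',': t.kind = TOK_COMMA;  break;
        case ';': t.kind = TOK_SEMI;   break;
        default:  t.kind = TOK_ERROR;  break;
        }
        p++;   /* consume the single character, valid or not */
    }
    *src = p;
    return t;
}
```

Calling `next_token` repeatedly on the example input yields `TOK_ID` with text "a", then `TOK_ASSIGN`, then `TOK_NUM` with text "5", and so on until `TOK_EOF`. Each branch corresponds to a state of the finite automaton that a generator such as Lex would build from regular expressions.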
23Syntax Analysis
- Tokens
- Abstract Syntax tree
id(a) assign num(5) + num(3) ; id(b) assign (print(id(a) , id(a) - num(1)), num(10) * id(a)) ; print(id(b))
[Figure: abstract syntax tree of the example program. Recoverable node labels: CompoundStm, AssignStm, opExp, eseqExp, numExp, id, PrintStm, Plus; leaves: a, b, 5, 3]
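One way to represent such a tree inside the compiler is a tagged union per syntactic category, with one constructor function per grammar rule. The sketch below is loosely modeled on the style of the textbook but is not its code: the type names, field names, and the single `ExpListNode` constructor (standing in for PairExpList and LastExpList) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Stm Stm;
typedef struct Exp Exp;
typedef struct ExpList ExpList;
typedef enum { PLUS, MINUS, TIMES, DIV } Binop;

struct Stm {
    enum { COMPOUND_STM, ASSIGN_STM, PRINT_STM } kind;
    union {
        struct { Stm *stm1, *stm2; } compound;
        struct { const char *id; Exp *exp; } assign;
        struct { ExpList *exps; } print;
    } u;
};

struct Exp {
    enum { ID_EXP, NUM_EXP, OP_EXP, ESEQ_EXP } kind;
    union {
        const char *id;
        int num;
        struct { Exp *left; Binop oper; Exp *right; } op;
        struct { Stm *stm; Exp *exp; } eseq;
    } u;
};

/* PairExpList when tail != NULL, LastExpList when tail == NULL. */
struct ExpList { Exp *head; ExpList *tail; };

/* Allocation helpers: copy an initialized node to the heap. */
Stm *new_stm(Stm init) {
    Stm *s = malloc(sizeof *s);
    if (s == NULL) abort();
    *s = init;
    return s;
}
Exp *new_exp(Exp init) {
    Exp *e = malloc(sizeof *e);
    if (e == NULL) abort();
    *e = init;
    return e;
}

/* One constructor per grammar rule. */
Stm *CompoundStm(Stm *s1, Stm *s2) {
    Stm s = { .kind = COMPOUND_STM, .u = { .compound = { s1, s2 } } };
    return new_stm(s);
}
Stm *AssignStm(const char *id, Exp *e) {
    Stm s = { .kind = ASSIGN_STM, .u = { .assign = { id, e } } };
    return new_stm(s);
}
Stm *PrintStm(ExpList *exps) {
    Stm s = { .kind = PRINT_STM, .u = { .print = { exps } } };
    return new_stm(s);
}
Exp *IdExp(const char *id) {
    Exp e = { .kind = ID_EXP, .u = { .id = id } };
    return new_exp(e);
}
Exp *NumExp(int num) {
    Exp e = { .kind = NUM_EXP, .u = { .num = num } };
    return new_exp(e);
}
Exp *OpExp(Exp *left, Binop oper, Exp *right) {
    Exp e = { .kind = OP_EXP, .u = { .op = { left, oper, right } } };
    return new_exp(e);
}
Exp *EseqExp(Stm *s, Exp *x) {
    Exp e = { .kind = ESEQ_EXP, .u = { .eseq = { s, x } } };
    return new_exp(e);
}
ExpList *ExpListNode(Exp *head, ExpList *tail) {
    assert(head != NULL);   /* every list cell holds an expression */
    ExpList *l = malloc(sizeof *l);
    if (l == NULL) abort();
    l->head = head;
    l->tail = tail;
    return l;
}

/* a := 5 + 3; b := (print(a, a - 1), 10 * a); print(b) */
Stm *example_prog(void) {
    return CompoundStm(
        AssignStm("a", OpExp(NumExp(5), PLUS, NumExp(3))),
        CompoundStm(
            AssignStm("b", EseqExp(
                PrintStm(ExpListNode(IdExp("a"),
                         ExpListNode(OpExp(IdExp("a"), MINUS, NumExp(1)), NULL))),
                OpExp(NumExp(10), TIMES, IdExp("a")))),
            PrintStm(ExpListNode(IdExp("b"), NULL))));
}
```

A parser would call these constructors from its grammar actions; later phases then walk the tree with a `switch` on `kind`. Nothing is freed in this sketch, which is acceptable for a short-lived compiler run but would need an explicit cleanup pass elsewhere.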
24Summary
- Phases drastically simplify the problem of writing a good compiler
- The textbook offers a reasonable partition into phases with interface definitions (in C)
- Every week we will study a new compiler phase