Title: Overview of Compilation
1Overview of Compilation
Programming Language Concepts Lecture 2
- Prepared by
- Manuel E. Bermúdez, Ph.D.
- Associate Professor
- University of Florida
2Overview of Translation
- Definition: A translator is an algorithm that
converts source programs into equivalent target
programs.
- Definition: A compiler is a translator whose
target language is at a lower level than its
source language.
3Overview of Translation (contd)
- When is one language's level lower than
another's?
- Definition: An interpreter is an algorithm that
simulates the execution of programs written in a
given source language.
4Overview of Translation (contd)
- Definition: An implementation of a programming
language consists of a translator (or compiler)
for that language, and an interpreter for the
corresponding target language.
[Diagram: Source -> Compiler -> Target -> Interpreter; the Interpreter takes input and produces output]
5Translation
- A source program may be translated an arbitrary
number of times before the target program is
generated.
[Diagram: Source -> Translator1 -> Translator2 -> ... -> TranslatorN -> Target]
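A chain of translators behaves like function composition: the output of one translator is the input of the next. The following is a minimal sketch in Python; the three phases and their toy transformations are hypothetical and only illustrate the chaining.

```python
from functools import reduce

def compose(*translators):
    """Chain translators left to right: the output of one feeds the next."""
    return lambda program: reduce(lambda text, t: t(text), translators, program)

# Three hypothetical phases, each a translator between representations.
def strip_blanks(source):      # text -> text without spaces
    return source.replace(" ", "")

def to_symbols(source):        # text -> list of one-character symbols
    return list(source)

def to_target(symbols):        # list of symbols -> toy "target program"
    return [f"EMIT {s}" for s in symbols]

translate = compose(strip_blanks, to_symbols, to_target)
print(translate("a + b"))      # ['EMIT a', 'EMIT +', 'EMIT b']
```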
6Translation (contd)
- Each of these translations is called a phase, not
to be confused with a pass, i.e., a disk dump.
- Q: How should a compiler be divided into phases?
- A: So that each phase can be easily described by
some formal model of computation, and so that the
phase can be carried out efficiently.
7Translation (contd)
- Q: How is a compiler usually divided?
- A: Two major phases, with many possibilities for
subdivision.
- Phase 1: Analysis (determine correctness).
- Phase 2: Synthesis (produce target code).
- Another criterion:
- Phase 1: Syntax (form).
- Phase 2: Semantics (meaning).
8Typical Compiler Breakdown
- Scanning (Lexical analysis).
- Goal: Group sequences of characters that occur in
the source into logical atomic units called
tokens.
- Examples of tokens: Identifiers, keywords,
integers, strings, punctuation marks, white
spaces, end-of-line characters, comments, etc.
[Diagram: Source -> Scanner (Lexical analysis) -> Sequence of Tokens]
9Lexical Analysis
- Must deal with end-of-line and end-of-file
characters.
- A preliminary classification of tokens is made.
For example, both "program" and "Ex" are
classified as Identifier.
- Someone must give unambiguous rules for forming
tokens.
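One common way to state unambiguous token rules is with regular expressions. The sketch below is a minimal scanner in Python; the rule names and patterns are assumptions for a small Pascal-like language, not rules taken from the lecture. Note that keywords such as "program" come out as Identifier at this stage.

```python
import re

# Hypothetical token rules; order matters because earlier alternatives win.
TOKEN_RULES = [
    ("Comment",    r"\{[^}]*\}"),
    ("Integer",    r"\d+"),
    ("Identifier", r"[A-Za-z_]\w*"),
    ("Punct",      r":=|[-+*/<>=;:().,]"),
    ("EndOfLine",  r"\n"),
    ("WhiteSpace", r"[ \t]+"),
]
SCANNER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_RULES))

def scan(source):
    """Return a list of (kind, text) tokens; raise on a character no rule accepts."""
    tokens, pos = [], 0
    while pos < len(source):
        match = SCANNER.match(source, pos)
        if match is None:
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(scan("program Ex;\n"))
```

The scanner keeps white space and end-of-line tokens; deciding what to discard is left to a later step.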
11Screening
- Goals
- Remove unwanted tokens.
- Classify keywords.
- Merge/simplify tokens.
12Screening
- Keywords recognized.
- White spaces (and comments) discarded.
- The screener acts as an interface between the
scanner and the next phase, the parser.
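A screener can be sketched as a filter over the scanner's output. In the Python sketch below, the token kinds, the `(kind, text)` pair format, and the keyword set are all assumptions chosen for illustration.

```python
# Hypothetical keyword set for a small Pascal-like language.
KEYWORDS = {"program", "var", "begin", "end", "if", "then", "else"}
UNWANTED = {"WhiteSpace", "EndOfLine", "Comment"}

def screen(tokens):
    """Filter and reclassify (kind, text) tokens produced by a scanner."""
    for kind, text in tokens:
        if kind in UNWANTED:
            continue                  # remove unwanted tokens
        if kind == "Identifier" and text.lower() in KEYWORDS:
            kind = "Keyword"          # classify keywords
        yield kind, text

scanned = [("Identifier", "program"), ("WhiteSpace", " "),
           ("Identifier", "Ex"), ("Punct", ";"), ("EndOfLine", "\n")]
print(list(screen(scanned)))
# [('Keyword', 'program'), ('Identifier', 'Ex'), ('Punct', ';')]
```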
14Parsing (Syntax Analysis)
- Goals
- To group together the tokens into the correct
syntactic structures, if possible.
- To determine whether the tokens appear in
patterns that are syntactically correct.
15Parsing (Syntax Analysis)
- Syntactic structures
- Expressions
- Statements
- Procedures
- Functions
- Modules
- Methodology
- Use re-write rules (a.k.a. BNF).
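As an illustration of rewrite rules in use, here is a minimal recursive-descent recognizer in Python for a tiny, hypothetical expression grammar (the grammar is an assumption, not one from the lecture). It either rejects the token list or returns the rewrite rules it applied, in the order they were completed. The left-recursive rule `E -> E + T` is handled with a loop.

```python
# Hypothetical grammar, in BNF-like notation:
#   E -> T | E '+' T
#   T -> name | number | '(' E ')'

def parse(tokens):
    """Recognize a token list; return the rewrite rules applied, in order."""
    rules, pos = [], 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def term():
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            expr()
            expect(")")
            rules.append("T -> ( E )")
        elif tok is not None and tok.isdigit():
            pos += 1
            rules.append("T -> number")
        elif tok is not None and tok.isidentifier():
            pos += 1
            rules.append("T -> name")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def expr():
        term()
        rules.append("E -> T")
        while peek() == "+":
            expect("+")
            term()
            rules.append("E -> E + T")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return rules

print(parse(["a", "+", "1"]))
# ['T -> name', 'E -> T', 'T -> number', 'E -> E + T']
```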
16String-To-Tree Transduction
- Goal: To build a syntax tree from the sequence
of rewrite rules. The tree will be the functional
representation of the source.
- Method: Build the tree bottom-up, as the rewrite
rules are emitted. Use a stack of trees.
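The stack-of-trees method can be sketched in a few lines of Python. The assumption here is that each emitted rule is reduced to a build action `(label, arity)`: pop `arity` trees, push one new tree with those trees as children. The action format and the tuple representation of trees are hypothetical.

```python
def build_tree(actions):
    """Build a syntax tree bottom-up from (label, arity) build actions.

    Each action pops `arity` trees off the stack and pushes one new tree
    whose children are the popped trees, in left-to-right order.
    """
    stack = []
    for label, arity in actions:
        if arity > len(stack):
            raise ValueError(f"not enough trees on the stack for {label!r}")
        children = stack[len(stack) - arity:]
        del stack[len(stack) - arity:]
        stack.append((label, children))
    if len(stack) != 1:
        raise ValueError("action sequence does not describe a single tree")
    return stack[0]

# Actions for the expression a + 1: two leaves, then a binary node.
print(build_tree([("a", 0), ("1", 0), ("+", 2)]))
# ('+', [('a', []), ('1', [])])
```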
18Contextual Constraint Analysis
- Goal: To analyze static semantics, e.g.,
- Are variables declared before they are used?
- Is there assignment compatibility?
- e.g., a := 3
- Is there operator type compatibility?
- e.g., a + 3
- Do actual and formal parameter types match?
- e.g., int f(int n, char c)
- ...
- f('x', 3)
- Enforcement of scope rules.
19Contextual Constraint Analysis
- Method: Traverse the tree recursively, deducing
type information at the bottom, and passing it
up.
- Make use of a DECLARATION TABLE
to record information about names.
- Decorate the tree with reference information.
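The recursive traversal can be sketched as follows in Python. The tuple-based tree shape, the two types `int` and `bool`, and the error messages are assumptions made for illustration; the declaration table is a plain dictionary from names to types. Types are deduced at the leaves and passed up to the parent node.

```python
def check(node, dcln, errors):
    """Return the type of `node` ('int', 'bool', or None if unknown), recording errors."""
    kind = node[0]
    if kind == "int":                      # ("int", 5): literal leaf
        return "int"
    if kind == "name":                     # ("name", "x"): consult the declaration table
        if node[1] not in dcln:
            errors.append(f"{node[1]} not declared")
            return None
        return dcln[node[1]]
    if kind in ("+", ">"):                 # binary operators over integers
        for child in node[1:]:
            if check(child, dcln, errors) not in ("int", None):
                errors.append(f"operand of {kind} must be int")
        return "int" if kind == "+" else "bool"
    if kind == "assign":                   # ("assign", "x", expression)
        target = check(("name", node[1]), dcln, errors)
        value = check(node[2], dcln, errors)
        if None not in (target, value) and target != value:
            errors.append(f"cannot assign {value} to {node[1]} of type {target}")
        return None
    raise ValueError(f"unknown node kind {kind!r}")

dcln = {"x": "int"}                        # declaration table: x is an int
errors = []
check(("assign", "x", ("+", ("name", "x2"), ("int", 1))), dcln, errors)
print(errors)                              # ['x2 not declared']
```

Returning `None` for an undeclared name is a design choice that keeps one mistake from producing a cascade of follow-on errors.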
21Example
- Chronologically,
- Enter x into the DCLN table, with its type.
- Check type compatibility for x := 5.
- X2 not declared!
- Verify type of > is boolean.
- Check type compatibility for +.
- Check type compatibility between x and int, for
assignment.
22Code Generation
- Goal: Convert syntax tree to target code.
- Target code could be
- Machine language.
- Assembly language.
- Quadruples for a fictional machine
- label
- opcode
- operands (1 or 2)
23Code Generation
- Example
- pc on UNIX generates assembly code.
- pi on UNIX generates code for the p machine,
which is interpreted by an interpreter.
- pc: slow compilation, fast running code.
- pi: fast compilation, slow running code.
- Method: Traverse the tree again.
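For a stack machine, "traverse the tree again" typically means a post-order walk: generate code for the operands first, then emit the operator. The Python sketch below assumes a tuple-based tree and uses the mnemonics LOAD, STORE, BADD, and BGT in the style of the lecture's stack machine; the exact tree shape is hypothetical.

```python
def gen(node, code):
    """Append stack-machine instructions for `node` to `code` (post-order walk)."""
    kind = node[0]
    if kind in ("int", "name"):            # leaves: push a literal or a variable
        code.append(f"LOAD {node[1]}")
    elif kind in ("+", ">"):               # operands first, then the operator
        gen(node[1], code)
        gen(node[2], code)
        code.append("BADD" if kind == "+" else "BGT")
    elif kind == "assign":                 # ("assign", "X", expression)
        gen(node[2], code)
        code.append(f"STORE {node[1]}")
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    return code

tree = ("assign", "X", ("+", ("name", "X"), ("int", 1)))
print(gen(tree, []))
# ['LOAD X', 'LOAD 1', 'BADD', 'STORE X']
```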
24Code (for a stack machine)
- LOAD 5
- STORE X
- LOAD X
- LOAD 10
- BGT
- COND L1 L2
- L1 LOAD X
- LOAD 1
- BADD
- STORE X
- GOTO L3
- L2
- . . .
- L3
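To make the listing concrete, here is a small interpreter sketch in Python. The instruction semantics are assumptions inferred from the mnemonics: LOAD pushes a literal or a variable, STORE pops into a variable, BGT and BADD pop two values and push the result, COND pops a boolean and jumps to its first label if true and its second otherwise, and GOTO jumps unconditionally. Each instruction is a (label, opcode, operands) triple. The NOP opcode is a hypothetical stand-in for the elided code at L2 and for the empty label L3.

```python
def run(program):
    """Interpret (label, opcode, operands) triples; return the final variables."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    stack, memory, pc = [], {}, 0
    while pc < len(program):
        _, op, args = program[pc]
        pc += 1
        if op == "LOAD":                   # push a literal or a variable's value
            arg = args[0]
            stack.append(arg if isinstance(arg, int) else memory[arg])
        elif op == "STORE":                # pop into a variable
            memory[args[0]] = stack.pop()
        elif op in ("BADD", "BGT"):        # binary operators
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "BADD" else left > right)
        elif op == "COND":                 # two-way branch on the popped boolean
            pc = labels[args[0]] if stack.pop() else labels[args[1]]
        elif op == "GOTO":
            pc = labels[args[0]]
        elif op != "NOP":                  # NOP: hypothetical placeholder
            raise ValueError(f"unknown opcode {op!r}")
    return memory

PROGRAM = [
    (None, "LOAD", (5,)), (None, "STORE", ("X",)),
    (None, "LOAD", ("X",)), (None, "LOAD", (10,)), (None, "BGT", ()),
    (None, "COND", ("L1", "L2")),
    ("L1", "LOAD", ("X",)), (None, "LOAD", (1,)), (None, "BADD", ()),
    (None, "STORE", ("X",)), (None, "GOTO", ("L3",)),
    ("L2", "NOP", ()),                     # elided in the slide ("...")
    ("L3", "NOP", ()),
]
print(run(PROGRAM))                        # {'X': 5}
```

Under these assumed semantics, 5 > 10 is false, so control goes to L2 and X keeps the value 5.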
25Code Optimization
- Goals
- Reduce the size of the target program.
- Decrease the running time of the target.
- Note: Optimization is a misnomer. Code
improvement would be better.
- Two types of optimization:
- Peephole optimization (local).
- Global optimization (improve loops, etc.).
26Code Optimization (contd)
- Example (from previous slide)
- The sequence LOAD 5, STORE X, LOAD X can be
replaced with LOAD 5, STND X.
- STND: Store non-destructively, i.e., store in X, but do
not destroy value on top of stack.
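A peephole pass for this one pattern can be sketched in Python as a scan over adjacent instruction pairs. The representation of instructions as plain strings is an assumption for illustration.

```python
def peephole(code):
    """Replace each adjacent pair 'STORE v', 'LOAD v' with the single 'STND v'."""
    out, i = [], 0
    while i < len(code):
        cur = code[i].split()
        nxt = code[i + 1].split() if i + 1 < len(code) else []
        if len(cur) == 2 and cur[0] == "STORE" and nxt == ["LOAD", cur[1]]:
            out.append(f"STND {cur[1]}")   # store, but keep the value on the stack
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

print(peephole(["LOAD 5", "STORE X", "LOAD X", "LOAD 10", "BGT"]))
# ['LOAD 5', 'STND X', 'LOAD 10', 'BGT']
```

This sketch ignores labels. A real pass would have to leave the pair alone when the LOAD is a jump target, since another path could reach it without passing through the STORE.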
27Summary
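[Diagram: Source -> Scanner -> Tokens -> Screener -> Tokens -> Parser -> Tree -> Constrainer -> Tree -> Code Generator -> Code (for an abstract machine) -> Interpreter, which reads Input and produces Output. Error Routines and Table Routines appear alongside the phases.]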