Title: Compiler Construction
1Compiler Construction
2Today's Goals
- Summary of the subjects we've covered
- Perspectives and final remarks
3High-level View
- Definitions
- Compiler: consumes source code, produces target code
- usually translates high-level language programs into machine code
- Interpreter: consumes executables, produces results
- a virtual machine for the input code
4Why Study Compilers?
- Compilers are important
- Enabling technology for languages and software development
- Allow programmers to focus on problem solving, hiding the hardware complexity
- Responsible for good system performance
- Compilers are useful
- Language processing is broadly applicable
- Compilers are fun
- Combine theory and practice
- Overlap with other CS subjects
- Hard problems
- Engineering and trade-offs
- Got a taste in the labs!
5Structure of Compilers
6The Front-end
7Lexical Analysis
- Scanner
- Maps character stream into tokens
- Automate scanner construction
- Define tokens using Regular Expressions
- Construct NFA (nondeterministic finite automaton) to recognize REs
- Transform NFA to DFA
- Convert NFA to DFA through subset construction
- DFA minimization (set split)
- Building scanners from DFA
- Tools
- ANTLR, lex
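The NFA-to-DFA step above can be sketched in a few lines of Python. This is a minimal illustration of subset construction, not what ANTLR or lex do internally. The NFA encoding (a dict keyed by `(state, symbol)`), the empty string as the epsilon label, and the sample NFA for `a(b|c)*` are all assumptions made for this example.

```python
from collections import deque

EPS = ""  # label used here for epsilon transitions (an assumption of this sketch)

def eps_closure(states, nfa):
    """Return every NFA state reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(nfa, start, accepting, alphabet):
    """Build a DFA whose states are sets of NFA states."""
    d0 = eps_closure({start}, nfa)
    dfa, seen, work = {}, {d0}, deque([d0])
    while work:
        S = work.popleft()
        for c in alphabet:
            moved = {t for s in S for t in nfa.get((s, c), ())}
            if not moved:
                continue  # no transition on c: the DFA rejects here
            T = eps_closure(moved, nfa)
            dfa[(S, c)] = T
            if T not in seen:
                seen.add(T)
                work.append(T)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {S for S in seen if S & accepting}
    return d0, dfa, dfa_accepting

def dfa_accepts(d0, dfa, dfa_accepting, text):
    """Run the DFA over `text`, one table lookup per character."""
    state = d0
    for c in text:
        state = dfa.get((state, c))
        if state is None:
            return False
    return state in dfa_accepting

# Hypothetical NFA for the regular expression a(b|c)*
NFA = {
    (0, "a"): {1},
    (1, EPS): {2},
    (2, "b"): {2},
    (2, "c"): {2},
}
D0, DFA, ACC = subset_construction(NFA, 0, {2}, "abc")
```

A scanner built from this table would call something like `dfa_accepts(D0, DFA, ACC, "abcb")` per candidate lexeme. A DFA minimization pass would normally follow and is omitted here.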
8Syntax Analysis
- Parsing language using CFG (context-free grammar)
- CFG grammar theory
- Derivation
- Parse tree
- Grammar ambiguity
- Parsing
- Top-down parsing
- recursive descent
- table-driven LL(1)
- Bottom-up parsing
- LR(1) shift reduce parsing
- Operator precedence parsing
9Top-down Predictive Parsing
- Basic idea
- Build the parse tree from the root. Given A → α | β, use the look-ahead symbol to choose between α and β
- Recursive descent
- Table-driven LL(1)
- Left recursion elimination
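As a rough sketch of recursive descent on a grammar whose left recursion has been eliminated, consider the classic expression grammar: Expr → Term Expr', Expr' → + Term Expr' | - Term Expr' | ε, and likewise for Term and Factor. The `Parser` class, the tuple-shaped tree, and the `$` end marker are choices made for this example only. The regex-based tokenizer silently skips characters it does not recognize, which a real scanner would not do.

```python
import re

class Parser:
    def __init__(self, text):
        # Tokens are integers, operators and parentheses; "$" marks end of input.
        self.tokens = re.findall(r"\d+|[-+*/()]", text) + ["$"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected=None):
        tok = self.tokens[self.pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):
        # Expr -> Term Expr'
        return self.expr_tail(self.term())

    def expr_tail(self, left):
        # Expr' -> + Term Expr' | - Term Expr' | epsilon
        if self.peek() in ("+", "-"):
            op = self.eat()
            return self.expr_tail((op, left, self.term()))
        return left  # epsilon: the look-ahead is not + or -

    def term(self):
        # Term -> Factor Term'
        return self.term_tail(self.factor())

    def term_tail(self, left):
        # Term' -> * Factor Term' | / Factor Term' | epsilon
        if self.peek() in ("*", "/"):
            op = self.eat()
            return self.term_tail((op, left, self.factor()))
        return left

    def factor(self):
        # Factor -> num | ( Expr )
        tok = self.peek()
        if tok == "(":
            self.eat("(")
            node = self.expr()
            self.eat(")")
            return node
        if tok.isdigit():
            return int(self.eat())
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    parser = Parser(text)
    tree = parser.expr()
    parser.eat("$")  # the whole input must be consumed
    return tree
```

Passing the left operand into `expr_tail` and `term_tail` keeps the resulting tree left-associative even though the grammar itself is right-recursive. Each decision is made from a single look-ahead token, which is what makes the parser predictive.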
10Bottom-up Shift-Reduce Parsing
- Build reverse rightmost derivation
- The key is to find handle (rhs of production)
- All active handles include top of stack (TOS)
- Shift inputs until TOS is right end of a handle
- Language of handles is regular (finite)
- Build a handle-recognizing DFA
- ACTION and GOTO tables encode the DFA
11Semantic Analysis
- Analyze context and semantics
- types and other semantic checks
- Attribute grammar
- associate evaluation rules with grammar productions
- Ad-hoc
- build symbol table
12Intermediate Representation
13Intermediate Representation
- Front-end translates program into IR format for further analysis and optimization
- IR encodes the compiler's knowledge of the program
- Largely machine-independent
- Move closer to standard machine model
- AST (tree): high-level
- Linear IR: low-level
- ILOC: 3-address code
- Assembly-level operations
- Expose control flow, memory addressing
- unlimited virtual registers
14Procedure Abstraction
- Procedure is the key language construct for building large systems
- Name space
- Caller-callee interface: linkage convention
- Control transfer
- Context protection
- Parameter passing and return value
- Run-time support for nested scopes
- Activation record, access link, display
- Inheritance and dynamic dispatch for OO
- multiple inheritance
- virtual method table
15The Back-end
16The Back-end
- Instruction selection
- Mapping IR into assembly code
- Assumes a fixed storage mapping and code shape
- Combining operations, using address modes
- Instruction scheduling
- Reordering operations to hide latencies
- Assumes a fixed program (set of operations)
- Changes demand for registers
- Register allocation
- Deciding which values will reside in registers
- Changes the storage mapping, may add false sharing
- Concerns about placement of data and memory operations
17Code Generation
- Expressions
- Recursive tree walk on AST
- Direct integration with parser
- Assignment
- Array reference
- Boolean and relational values
- If-then-else
- Case
- Loop
- Procedure call
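The recursive tree walk for expressions might look roughly like the sketch below. It emits ILOC-like three-address code with an unlimited supply of virtual registers. The tuple-shaped AST, the `rarp` base register, and the `@name` offsets are assumptions for illustration, not a fixed code shape.

```python
from itertools import count

OPCODES = {"+": "add", "-": "sub", "*": "mult", "/": "div"}

def gen_expr(node, code, fresh):
    """Emit code for `node` and return the virtual register holding its value."""
    if isinstance(node, int):
        # Constant: load an immediate.
        reg = f"r{next(fresh)}"
        code.append(f"loadI {node} => {reg}")
        return reg
    if isinstance(node, str):
        # Variable: assumed to live in the activation record at offset @name.
        reg = f"r{next(fresh)}"
        code.append(f"loadAI rarp, @{node} => {reg}")
        return reg
    # Interior node: generate both operands first, then the operation.
    op, left, right = node
    r1 = gen_expr(left, code, fresh)
    r2 = gen_expr(right, code, fresh)
    reg = f"r{next(fresh)}"
    code.append(f"{OPCODES[op]} {r1}, {r2} => {reg}")
    return reg

def compile_expr(ast):
    code = []
    result = gen_expr(ast, code, count(1))
    return code, result

# a + 2 * b
CODE, RESULT = compile_expr(("+", "a", ("*", 2, "b")))
```

The walk never reuses a register, leaving that decision to the register allocator, and always evaluates the left operand first. A more careful generator could order subtrees by register demand.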
18Instruction Selection
- Hand-coded tree-walk code generator
- Automatic instruction selection
- Pattern matching
- Peephole Matching
- Tree-pattern matching through tiling
19Instruction Scheduling
- The Problem
- Given a code fragment for some target machine and the latencies for each individual operation, reorder the operations to minimize execution time
- Build Precedence Graph
- List scheduling
- NP-complete problem
- Heuristics work well for basic blocks
- forward list scheduling
- backward list scheduling
- Scheduling for larger regions
- EBB and cloning
- Trace scheduling
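Forward list scheduling for one basic block can be sketched as follows, assuming a single-issue machine and using the latency-weighted longest path to the end of the block as the priority. The operation names and latencies in the example are made up, and ties are broken by name only to keep the result deterministic.

```python
def list_schedule(latency, deps):
    """latency: {op: cycles}; deps: {op: set of ops it must wait for}.
    Returns {op: start cycle} for a machine that issues one op per cycle."""
    # Invert the dependence edges to get successors in the precedence graph.
    succs = {op: set() for op in latency}
    for op, preds in deps.items():
        for p in preds:
            succs[p].add(op)

    # Priority: latency-weighted longest path from op to the end of the block.
    priority = {}
    def rank(op):
        if op not in priority:
            priority[op] = latency[op] + max(
                (rank(s) for s in succs[op]), default=0)
        return priority[op]
    for op in latency:
        rank(op)

    start, cycle = {}, 1
    remaining = set(latency)
    while remaining:
        # An op is ready once all its predecessors have finished.
        ready = [op for op in remaining
                 if all(p in start and start[p] + latency[p] <= cycle
                        for p in deps.get(op, ()))]
        if ready:
            op = min(ready, key=lambda o: (-priority[o], o))
            start[op] = cycle
            remaining.remove(op)
        cycle += 1  # an empty ready list means a stall cycle
    return start

# Hypothetical block: store (load_a + load_b) * load_c
LATENCY = {"load_a": 3, "load_b": 3, "load_c": 3,
           "add": 1, "mult": 2, "store": 3}
DEPS = {"add": {"load_a", "load_b"},
        "mult": {"add", "load_c"},
        "store": {"mult"}}
SCHEDULE = list_schedule(LATENCY, DEPS)
```

In this example the three loads are issued back to back so that their latencies overlap, and the dependent operations follow as soon as their operands are available. Backward list scheduling applies the same idea starting from the end of the block.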
20Register Allocation
- Local register allocation
- top-down
- bottom-up
- Global register allocation
- Find live-range
- Build an interference graph G_I
- Construct a k-coloring of interference graph
- Map colors onto physical registers
21Web-based Live Ranges
- Connect common defs and uses
- Solve the reaching-definitions data-flow problem!
22Interference Graph
- The interference graph, G_I
- Nodes in G_I represent live ranges
- Edges in G_I represent individual interferences
- For x, y ∈ G_I, edge <x, y> ∈ G_I iff x and y interfere
- A k-coloring of G_I can be mapped into an allocation to k registers
23Key Observation on Coloring
- Any vertex n that has fewer than k neighbors in the interference graph (degree(n) < k) can always be colored!
- Remove a node n with degree(n) < k from G_I to get G_I'; any k-coloring of G_I' extends to a k-coloring of G_I
24Chaitin's Algorithm
1. While ∃ vertices with < k neighbors in G_I
- Pick any vertex n such that degree(n) < k and put it on the stack
- Remove that vertex and all edges incident to it from G_I
- This will lower the degree of n's neighbors
2. If G_I is non-empty (all vertices have k or more neighbors) then:
- Pick a vertex n (using some heuristic) and spill the live range associated with n
- Remove vertex n from G_I, along with all edges incident to it, and put it on the stack
- If this causes some vertex in G_I to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. If no spill, successively pop vertices off the stack and color them in the lowest color not used by some neighbor; otherwise, insert spill code, recompute G_I and start from step 1
25Briggs' Improvement
- Nodes can still be colored even with k or more neighbors if some neighbors have the same color
1. While ∃ vertices with < k neighbors in G_I
- Pick any vertex n such that degree(n) < k and put it on the stack
- Remove that vertex and all edges incident to it from G_I
- This may create vertices with fewer than k neighbors
2. If G_I is non-empty (all vertices have k or more neighbors) then:
- Pick a vertex n (using some heuristic condition), push n on the stack and remove n from G_I, along with all edges incident to it
- If this causes some vertex in G_I to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. Successively pop vertices off the stack and color them in the lowest color not used by some neighbor
- If some vertex cannot be colored, then pick an uncolored vertex to spill, spill it, and restart at step 1
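The difference between spilling eagerly and deferring the decision can be seen on a four-node ring with k = 2. The sketch below performs a single simplify/select pass only. It does not insert spill code or rebuild the graph, and the highest-degree spill heuristic is an arbitrary choice for this example. The graph encoding and the `optimistic` flag are assumptions of this sketch.

```python
def allocate(graph, k, optimistic):
    """graph: {vertex: set of neighbors}. Returns (colors, spilled).
    optimistic=False spills as soon as no vertex has degree < k;
    optimistic=True pushes the candidate and lets select decide."""
    g = {v: set(ns) for v, ns in graph.items()}  # working copy
    stack, spilled = [], []
    while g:
        low = sorted(v for v in g if len(g[v]) < k)
        if low:
            n = low[0]
            stack.append(n)
        else:
            # Spill candidate: highest degree (heuristic assumed for this sketch).
            n = max(sorted(g), key=lambda v: len(g[v]))
            if optimistic:
                stack.append(n)    # defer: neighbors may end up sharing colors
            else:
                spilled.append(n)  # commit to the spill immediately
        for m in g.pop(n):
            g[m].discard(n)

    # Select: pop vertices and give each the lowest color no neighbor uses.
    colors = {}
    for n in reversed(stack):
        used = {colors[m] for m in graph[n] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[n] = free[0]
        else:
            spilled.append(n)      # optimism failed for this vertex
    return colors, spilled

# Ring a-b-c-d-a: every vertex has exactly 2 neighbors.
RING = {"a": {"b", "d"}, "b": {"a", "c"},
        "c": {"b", "d"}, "d": {"a", "c"}}
EAGER = allocate(RING, 2, optimistic=False)
DEFERRED = allocate(RING, 2, optimistic=True)
```

With eager spilling, `a` is spilled because no vertex has fewer than two neighbors. With the deferred variant the ring is 2-colored with no spill, since the two neighbors of `a` receive the same color.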
26The Middle-end Optimizer
27Principles of Compiler Optimization
- safety
- Does applying the transformation change the results of executing the code?
- profitability
- Is there a reasonable expectation that applying the transformation will improve the code?
- opportunity
- Can we efficiently and frequently find places to apply the optimization?
- Optimizing compiler
- Program Analysis
- Program Transformation
28Program Analysis
- Control-flow analysis
- Data-flow analysis
29Control Flow Analysis
- Basic blocks
- Control flow graph
- Dominator tree
- Natural loops
- Dominance frontier
- the join points for SSA
- insert φ-nodes
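Dominator sets can be computed with a simple iterative fixed point, and back edges (which define natural loops) fall out directly. The sketch below assumes the CFG is given as a dict from block name to successor list, with every block present as a key. The block names are made up, and this set-based formulation favors clarity over speed.

```python
def dominators(cfg, entry):
    """cfg: {block: list of successors}. Returns {block: set of its dominators}."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in sorted(nodes - {entry}):
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

# Hypothetical CFG: a diamond (B1 -> B2/B3 -> B4) inside a loop B4 -> B1.
CFG = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B4"],
       "B3": ["B4"], "B4": ["B1", "B5"], "B5": []}
DOM = dominators(CFG, "B0")

# An edge n -> h is a back edge when h dominates n; h is then a loop header.
BACK_EDGES = [(n, h) for n in CFG for h in CFG[n] if h in DOM[n]]
```

In this example neither B2 nor B3 dominates B4, which is why B4 is a join point where SSA construction may need to merge values.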
30Data Flow Analysis
- compile-time reasoning about the runtime flow of values
- represent effects of each basic block
- propagate facts around control flow graph
31DFA (Data-Flow Analysis): The Big Picture
- Set up a set of equations that relate program properties at different program points in terms of the properties at "nearby" program points
- Transfer function
- Forward analysis: compute OUT(B) in terms of IN(B)
- Available expressions
- Reaching definitions
- Backward analysis: compute IN(B) in terms of OUT(B)
- Variable liveness
- Very busy expressions
- Meet function for join points
- Forward analysis: combine OUT(p) of predecessors to form IN(B)
- Backward analysis: combine IN(s) of successors to form OUT(B)
32Available Expression
- Basic block b
- IN(b): expressions available at b's entry
- OUT(b): expressions available at b's exit
- Local sets
- def(b): expressions defined in b and available on exit
- killed(b): expressions killed in b
- An expression is killed in b if any of its operands is assigned in b
- Transfer function
- $\mathrm{OUT}(b) = \mathrm{def}(b) \cup (\mathrm{IN}(b) - \mathrm{killed}(b))$
- Meet function
- $\mathrm{IN}(b) = \bigcap_{p \in \mathrm{pred}(b)} \mathrm{OUT}(p)$
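A small iterative solver for available expressions is sketched below: a forward problem in which IN(b) is the intersection of OUT over the predecessors of b. The CFG, the expression strings, and the parameter names `defs` and `killed` (standing in for def(b) and killed(b)) are assumptions for this example. Every block is expected to appear as a key of `cfg`.

```python
def available_expressions(cfg, entry, defs, killed, universe):
    """Forward data-flow analysis. Returns (IN, OUT) as dicts of sets."""
    preds = {b: [] for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)

    # Intersection problem: initialize OUT optimistically to the full set.
    IN = {b: set() for b in cfg}
    OUT = {b: set(universe) for b in cfg}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b != entry and preds[b]:
                # Meet: an expression must be available on every incoming path.
                IN[b] = set.intersection(*(OUT[p] for p in preds[b]))
            # Transfer: OUT(b) = def(b) U (IN(b) - killed(b))
            new_out = defs[b] | (IN[b] - killed[b])
            if new_out != OUT[b]:
                OUT[b], changed = new_out, True
    return IN, OUT

# Hypothetical diamond: B0 -> B1, B2 -> B3. B2 assigns `a`, which kills a+b.
CFG = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
DEFS = {"B0": {"a+b"}, "B1": {"c*d"}, "B2": {"c*d"}, "B3": set()}
KILLED = {"B0": set(), "B1": set(), "B2": {"a+b"}, "B3": set()}
IN, OUT = available_expressions(CFG, "B0", DEFS, KILLED, {"a+b", "c*d"})
```

At the join block B3 only `c*d` is available: both paths compute it, while `a+b` is killed along the path through B2.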
33More Data Flow Problems
- AVAIL Equations
- More data flow problems
- Reaching Definition
- Liveness
meet function | ∪ (union)           | ∩ (intersection)
forward       | reaching definition | available expression
backward      | variable liveness   | very busy expression
34Compiler Optimization
- Local optimization
- DAG CSE
- Value numbering
- Global optimization enabled by DFA
- Global CSE (AVAIL)
- Constant propagation (Def-Use)
- Dead code elimination (Use-Def)
- Advanced topic: SSA
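Local value numbering over a single basic block can be sketched as below. Operations are assumed to be `(dest, op, arg1, arg2)` tuples, and a redundant computation is rewritten as a `copy` from the variable that already holds the value. Both the tuple format and the `copy` pseudo-op are choices made for this illustration.

```python
def value_number(block):
    """block: list of (dest, op, arg1, arg2). Returns the rewritten block."""
    vn = {}       # variable name -> current value number
    table = {}    # (op, vn1, vn2) -> (value number, variable that computed it)
    counter = 0
    out = []

    def number_of(name):
        nonlocal counter
        if name not in vn:        # first use of an input value
            vn[name] = counter
            counter += 1
        return vn[name]

    for dest, op, a, b in block:
        n1, n2 = number_of(a), number_of(b)
        if op in ("+", "*"):      # commutative: canonical operand order
            n1, n2 = sorted((n1, n2))
        key = (op, n1, n2)
        hit = table.get(key)
        # Reuse only if the recorded variable still holds that value.
        if hit is not None and vn.get(hit[1]) == hit[0]:
            vn[dest] = hit[0]
            out.append((dest, "copy", hit[1], None))
        else:
            vn[dest] = counter
            table[key] = (counter, dest)
            counter += 1
            out.append((dest, op, a, b))
    return out

BLOCK = [("a", "+", "b", "c"),
         ("b", "-", "a", "d"),
         ("c", "+", "b", "c"),
         ("d", "-", "a", "d")]
REWRITTEN = value_number(BLOCK)
```

Here the second `a - d` is recognized as redundant, while the second `b + c` is not, because `b` was redefined in between. That is the distinction a purely textual comparison of expressions would miss.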
35Perspective
- Front end: essentially a solved problem
- Middle end: domain-specific languages
- Back end: new architectures
- Verifying compiler, reliability, security
36Interesting Stuff We Skipped
- Interprocedural analysis
- Alias (pointer) analysis
- Garbage collection
- Check the literature reference in EaC
37How will you use the knowledge?
- As informed programmer
- As informed small language designer
- As informed hardware engineer
- As compiler writer
38Informed Programmer
- Knowledge is power
- Compiler is no longer a black box
- Know how compiler works
- Implications
- Use of language features
- Avoid those that can cause problems
- Give compiler hints
- Code optimization
- Don't optimize prematurely
- Don't write complicated code
- Debugging
- Understand the compiled code
39Solving Problem the Compiler Way
- Solve problems from language/compiler perspective
- Implement simple language
- Extend language
40Informed Hardware Engineer
- Compiler support for programmable hardware
- pervasive computing
- new back-ends for new processors
- Design new architectures
- what can compiler do and not do
- how to expose and use compiler to manage hardware resources
41Compiler Writer
- Make a living by writing compilers!
- Theory
- Algorithms
- Engineering
- We have built
- scanner
- parser
- AST tree builder, type checker
- register allocator
- instruction scheduler
- Used compiler generation tools
- ANTLR, lex, yacc, etc
On track to jump into compiler development!
42Final Remarks
- Compiler construction
- Theory
- Implementation
- How to use what you learned in this lecture?
- As informed programmer
- As informed small language designer
- As informed hardware engineer
- As compiler writer
- and live happily ever after