Overview of Compiling - PowerPoint PPT Presentation Transcript
1
Overview of Compiling
  • Basics of Compilation
  • Main Components

2
Structure of a Compiler
Source code
Target code
Front End
Back End
  • FRONT END: Determine and represent the structure of the input program
  • Ensure that it is well-formed; report errors
  • BACK END: Generate corresponding object code for the target architecture
  • Optimize code according to compiler flags

3
Major Modules in Open64
[Diagram: Open64 compilation flow. The front ends gfec, gfecc, and f90 produce .B files; .I files feed the inliner, or local and main IPA run under -IPA. The IR is lowered in stages from Very High WHIRL through High, Mid, and Low WHIRL; LNO and the main optimizer run at -O2/-O3, MP lowering handles OpenMP under -mp, I/O lowering applies only to f90, and WHIRL2C/WHIRL2Fortran can emit .w2c.c/.w2c.h or .w2f.f source. Either path may be taken: at -O0 the "lower all" path goes straight to CG, which generates the final code. Other flags shown: -phase woff.]
4
The Back End (BE)
Object code
Back End
  • Main purpose of Back End (BE) is to generate
    target machine code
  • Match IR with hardware features
  • Translation details are machine-specific
  • But there are typical problems and strategies for
    overcoming them for classes of architectures
  • BE decides where to store data objects in program
  • Object code usually makes calls to a run time
    library
  • Handles common actions, improves efficiency
  • Part of compiler design is to decide what should be handled by the run-time system

Code Generator
Optimizer

5
Three-Address Code
IR
  • 3-address code is a popular IR for the interface between Front End and Back End
  • General form: instruction argument1 argument2 result
  • Two-address or one-address codes have also been used
  • one-address: the single argument is also the result location
  • e.g. an increment instruction updates the contents of a single location
  • these can save memory
  • Complex instructions are broken down into several simpler ones to generate 3-address code.
  • Compiler generates temporary variables as needed

a ← b + c * d becomes:
t1 ← c * d
a ← b + t1
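The lowering step above, introducing a temporary for each intermediate result, can be sketched in a few lines of Python. This is an illustrative sketch only (the tuple IR and the `lower` function are invented for the example, not taken from any real compiler):

```python
import itertools

# Sketch: lowering a nested expression into three-address quadruples,
# generating a fresh temporary for each intermediate result.
# Expressions are tuples ('op', left, right) or plain variable names.

def lower(expr, code, temps):
    """Emit quadruples for expr into code; return the name holding it."""
    if isinstance(expr, str):
        return expr                      # a variable needs no code
    op, left, right = expr
    l = lower(left, code, temps)
    r = lower(right, code, temps)
    t = f"t{next(temps)}"                # compiler-generated temporary
    code.append((op, l, r, t))           # op arg1 arg2 result
    return t

# a = b + c * d  becomes  t1 <- c * d ;  a <- b + t1
code = []
t = lower(('+', 'b', ('*', 'c', 'd')), code, itertools.count(1))
code.append(('=', t, None, 'a'))
```

The recursion naturally emits the innermost operation first, which is why `c * d` gets the first temporary.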
6
Translation Into Machine Code
Code Generator
  • Many translations are straightforward. But there
    are some tricky problems too
  • How do we deal with operations where the operands
    have different data types?
  • How do we figure out where variables in different
    storage classes should be stored?
  • How is high-level control flow (loops, switches, etc.) realized?
  • How are calls to procedures and functions
    implemented?

7
Selecting Instructions
Code Generator
  • Select instructions for each operation
  • Should be as efficient as possible
  • Difficulty of selection depends on machine
    instructions available.
  • Order of instructions may also affect efficiency
    of target code
  • But there is no optimal order.
  • We initially generate code in order produced by
    intermediate code generation
  • Latest technology requires compiler to generate
    bundles of instructions for concurrent execution

8
The Back End (BE)
SYMBOL TABLE
Optimizer
IR
  • Most of work goes into optimization
  • This is complex and there are many trade-offs and
    very hard problems
  • For this reason, back end is usually hand-coded
  • Some major optimization goals
  • Improve selection of instructions
  • Instruction scheduling (reordering for pipelines and other hardware features)
  • Assign data to registers

The challenge: these are interrelated. Moreover, there are no optimal solutions.
9
Instruction Selection and Optimization
  • Goal is to produce fast, efficient code
  • Take advantage of features of the target machine instruction set, such as its variety of addressing modes
  • Usually dealt with as a pattern matching problem
  • Patterns in the IR input to the back end are matched; early compilers took ad hoc approaches
  • The advent of RISC instruction sets simplified this greatly
  • Doing this well occupied compiler writers in the 1970s
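Viewed as pattern matching, instruction selection tries large IR patterns before falling back to one instruction per operation. A minimal "maximal munch" sketch, assuming a hypothetical target with a fused multiply-add (MADD); the tuple IR and mnemonics are invented for illustration:

```python
# Sketch: maximal-munch instruction selection over tuple IR.
# The hypothetical target has MADD dest <- a + (y * z), so the pattern
# ('+', x, ('*', y, z)) is covered by a single instruction.

def select(expr, out):
    """Return the name holding expr, appending instructions to out."""
    if isinstance(expr, str):
        return expr                                  # already available
    op, a, b = expr
    if op == '+' and isinstance(b, tuple) and b[0] == '*':
        ra = select(a, out)
        ry, rz = select(b[1], out), select(b[2], out)
        dest = f"r{len(out)}"
        out.append(('MADD', ra, ry, rz, dest))       # one fused instruction
        return dest
    ra, rb = select(a, out), select(b, out)
    dest = f"r{len(out)}"
    out.append(({'+': 'ADD', '*': 'MUL'}[op], ra, rb, dest))
    return dest

madd, naive = [], []
select(('+', 'b', ('*', 'c', 'd')), madd)   # matches the big pattern
select(('*', ('+', 'b', 'c'), 'd'), naive)  # falls back to ADD then MUL
```

Trying the bigger pattern first is what makes the approach "maximal munch": the richer the instruction set, the more patterns are worth checking, which is why RISC targets simplified selection.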

10
Low-Level Optimization
Optimizer
  • Improve selection of instructions
  • eliminate redundant operations
  • choose most efficient instructions
  • reorder (schedule) instructions
  • Allocate registers
  • keep most important (most heavily used) data in
    registers
  • optimize lifetime of data in registers

11
Example
Optimizer
  • a ← b + c ; d ← a + e
  • LDI b, Ri
  • LDI c, Rj
  • ADDI Ri, Rj
  • STI Rj, a (Ri, Rj are registers)
  • LDI a, Ri
  • LDI e, Rj
  • ADDI Ri, Rj
  • STI Rj, d
  • But we can avoid storing and reloading a.
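The store-then-reload of a is exactly the kind of pattern a peephole pass can catch. A sketch, using instruction tuples that mirror the slide's mnemonics (the pass itself is invented for illustration):

```python
# Sketch: peephole pass that replaces a load from a location that was
# just stored from a register with a register-to-register move.

def remove_redundant_loads(code):
    out = []
    for ins in code:
        prev = out[-1] if out else None
        if (ins[0] == 'LDI' and prev is not None and prev[0] == 'STI'
                and prev[2] == ins[1]):           # STI Rx, loc ; LDI loc, Ry
            out.append(('MOV', prev[1], ins[2]))  # value is still in Rx
        else:
            out.append(ins)
    return out

code = [('LDI', 'b', 'Ri'), ('LDI', 'c', 'Rj'), ('ADDI', 'Ri', 'Rj'),
        ('STI', 'Rj', 'a'), ('LDI', 'a', 'Ri'), ('LDI', 'e', 'Rj'),
        ('ADDI', 'Ri', 'Rj'), ('STI', 'Rj', 'd')]
optimized = remove_redundant_loads(code)   # LDI a, Ri becomes MOV Rj, Ri
```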

12
Register Optimization
Optimizer
  • Registers provide particularly fast access, thus
    code compiled with data in registers will execute
    faster.
  • Instructions with operands in registers take up less space; thus good use of registers also saves memory.
  • So it is important to use registers well when
    generating code.

The problem is to manage a limited set of
resources
13
Register Optimization
Optimizer
  • Goal in register allocation is to hold as many operands as possible in registers.
  • Save data to memory only when we run out of registers.
  • Allocation strategies also aim to reuse values in
    registers when possible.
  • During register allocation, we select values that
    will reside in registers at a point in the
    program.
  • Register assignment is an NP-complete problem, so
    there are no optimal solutions.
  • There are some popular strategies.

Compilers approximate solutions to NP-Complete
problems
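One popular approximation strategy is graph coloring over the interference graph: two variables interfere if they are live at the same time and therefore cannot share a register. A greedy sketch under that assumption (not a production allocator, which would add coalescing and smarter spill choices):

```python
# Sketch: greedy graph-coloring register allocation with k registers.
# Interfering (simultaneously live) variables must get different
# registers; if no register is free, the variable is spilled to memory.

def allocate(interference, k):
    assignment = {}
    # color the most constrained (highest-degree) variables first
    for v in sorted(interference, key=lambda v: -len(interference[v])):
        taken = {assignment[n] for n in interference[v] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        assignment[v] = free[0] if free else 'SPILL'
    return assignment

# a is live alongside both b and c; b and c never overlap,
# so two registers suffice and b, c can share one.
graph = {'a': {'b', 'c'}, 'b': {'a'}, 'c': {'a'}}
regs = allocate(graph, 2)
```

Because optimal coloring is NP-complete, heuristic orderings like this are the norm; they work well in practice but offer no optimality guarantee.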
14
Why are Optimizations hard?
Optimizer
  • The next problem is that instruction selection
    and register allocation are not independent
    problems
  • p ← w * 2
  • q ← p + r
  • s ← w * 2
  • Register optimization suggests we remove p (and w) from registers as soon as possible.
  • But we need the same value later. If we keep p in a register, we don't have to recompute it.
  • So to save an instruction, we need an additional
    register.
  • This kind of trade-off is typical!

15
Instruction Scheduling
  • Modern machines have multiple functional units
  • Need to avoid hardware stalls and interlocks
  • Use all functional units productively
  • Reordering can modify lifetime of variables
    (thus perhaps changing the register allocation)
  • Optimal scheduling is also NP-Complete
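A minimal list-scheduling sketch (illustrative only: single-issue, fixed latencies, names invented for the example). An instruction issues only when its operands are ready, and independent work fills what would otherwise be stall cycles:

```python
# Sketch: list scheduling. One instruction issues per cycle; an
# instruction is ready when every dependency's result is available.

def schedule(order, deps, latency):
    issue, done = {}, {}
    pending, cycle = list(order), 0
    while pending:
        for i in pending:
            if all(done.get(d, float('inf')) <= cycle for d in deps[i]):
                issue[i] = cycle
                done[i] = cycle + latency[i]   # result ready at this cycle
                pending.remove(i)
                break
        cycle += 1                             # next cycle (or a stall)
    return issue

# The add must wait for both loads (latency 2); the independent mul
# slots into what would otherwise be a stall cycle.
order = ['load_a', 'load_b', 'add', 'mul']
deps = {'load_a': set(), 'load_b': set(),
        'add': {'load_a', 'load_b'}, 'mul': set()}
latency = {'load_a': 2, 'load_b': 2, 'add': 1, 'mul': 1}
slots = schedule(order, deps, latency)
```

Note that reordering changed when values are produced and consumed, which in turn changes variable lifetimes, and hence the register allocation; this is the interdependence the slide describes.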

16
Setting Up Run Time Storage
Code Generator
  • a ← b + c * 2
  • MULI LOCc, 2, R1
  • To generate the correct instructions, we need to
    know how to access a, b and c.
  • Code generator uses information stored in symbol
    table in order to perform this translation
  • However, the symbol table is not around at run
    time

SYMBOL TABLE
17
Setting Up Run Time Storage
SYMBOL TABLE
  • At run time, memory is required to store the program's object code and its data objects.
  • An assignment to a variable will modify the contents of the corresponding storage location.
  • Intermediate representations use the symbol table as a means to refer to variables.
  • Before code is generated, these references will be replaced by the corresponding memory locations.

18
Setting Up Run Time Storage
  • So back end must also set up storage for program
    and its variables
  • deal with different storage classes
  • Adapt code to reflect the locations chosen
  • usually relocatable (i.e. with offsets, not absolute addresses)
  • Compiler assumes contiguous memory
  • Job of OS is to manage this

19
Preparing for Run Time
  • A program consists of a collection of procedures.
  • An invocation of a procedure results in its
    activation at run time.
  • Compiler must generate code to
  • begin and terminate execution of procedures
  • Pass arguments to and results from called routine
  • Ensure proper return to calling procedure and
    restore its environment
  • A call stack is used to save local data, pass
    arguments and results, and save state of caller
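The calling sequence above can be modeled as pushing and popping activation records. A toy sketch (the `Frame` class and `call` helper are invented for illustration; a real compiler emits code that manipulates the hardware stack directly):

```python
# Sketch: a call stack of activation records. Calling pushes a frame
# holding the arguments and locals; returning pops it, restoring the
# caller's environment, and passes the result back.

class Frame:
    def __init__(self, name, args):
        self.name = name
        self.args = dict(args)   # actual parameters
        self.locals = {}         # local data saved in the frame

call_stack = []

def call(name, args, body):
    frame = Frame(name, args)
    call_stack.append(frame)     # activation begins
    result = body(frame)         # execute the procedure body
    call_stack.pop()             # activation ends; caller's state restored
    return result                # result returned to the caller

# square(x): computes via a local temporary in its own frame
result = call('square', {'x': 7},
              lambda f: f.locals.setdefault('t', f.args['x'] ** 2))
```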

20
Setting Up Run Time Storage
  • Each storage class is considered separately when
    reserving memory
  • What is known at end of compile time
  • size of object code,
  • amount of storage required for some data objects,
  • storage class of each data object.

21
Run-Time Memory Allocation
  • Here is one possible organization of memory

22
Run Time System
  • Target code would be too large if all operations
    are coded entirely in machine code.
  • So compilers usually provide a run-time library
  • Performs functions that are always carried out the same way, e.g. initialization routines and termination code
  • Saves space by implementing repetitive non-trivial functions, e.g. input and output
  • Handles interrupts
  • The back end generates calls to run-time library routines

23
Role of the Run-time System
  • Memory management services
  • Allocate
  • In the heap or in an activation record (stack
    frame)
  • Deallocate
  • Collect garbage
  • Run-time type checking
  • Error processing
  • Interface to the operating system
  • Input and output
  • Support of parallelism
  • Parallel thread initiation
  • Communication and synchronization

24
Where do Optimizations Occur?
  • Back end translates intermediate code to
    machine-like code
  • Called lowering
  • Optimizations are performed
  • In practice lowering may occur several times,
    interleaved with the optimizations
  • Some optimizations require information that may
    later be lost
  • In other words, they may depend on a certain kind
    of IR
  • Some optimizations may be repeated
  • This is the process we are going to start looking
    at in more detail

25
Outlook
  • Next, we will extend our description of a compiler's structure
  • A more realistic description
  • Shows central role of optimizations
  • Other topics we will look into a bit more
  • Intermediate Representation
  • Symbol Tables
  • Preparing for Execution
  • Memory management and handling procedure calls

26
Run-Time Stack
  • Begin with the result and actual parameters, other data of known size, then fields whose size may not be fixed at compile time.
  [Diagram: layout of a stack frame. The result and parameter fields may be filled by the caller; local data is needed until the procedure ends, unless it lives in a fixed storage area; some fields are of initially unknown size.]
27
Run Time Stack
  • Local variables can be saved on stack
  • So can temporaries
  • Global and static variables outlive a procedure
    activation
  • so they must be stored separately
  • a fixed area is usually reserved for them
  • Dynamic variables require a different storage
    area
  • the heap
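The three cases above amount to a per-variable placement decision. A tiny sketch (the storage-class names are illustrative):

```python
# Sketch: choosing a storage area from a variable's storage class.

def storage_area(storage_class):
    if storage_class == 'dynamic':
        return 'heap'                # lifetime not tied to any activation
    if storage_class in ('global', 'static'):
        return 'fixed static area'   # outlives any single activation
    return 'stack'                   # locals and temporaries

areas = {c: storage_area(c) for c in ('local', 'static', 'dynamic')}
```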

28
Summary Code Generation
  • Code generation translates intermediate code into a form close to target machine code
  • Organizes memory usage
  • Optimization is essential for modern architectures.
  • Peephole optimizations try to improve target code in a small region of consecutive instructions.
  • Register allocation, instruction selection and
    instruction scheduling