Processor Modelling and Retargetable Compilation - PowerPoint PPT Presentation

Provided by: pcverh
Transcript and Presenter's Notes

1
Processor Modelling and Retargetable Compilation
2
Outline
  • Introduction to Retargetable Compilers
  • Processor Modelling (nML)
  • CHESS
  • Intermediate representation
  • Code Selection process
  • Compilation flow

3
Evolution of Compilers
[Timeline figure, 1970s to 1990s]
  • Around 1970 (microprogramming, CISC): code generation (dynamic programming), code selection (LR parsing), code selection (combiner)
  • Around 1980 (high-level synthesis for ASICs; RISC, VLIW, superscalar): register allocation (coloring), code selection (tree automata), scheduling (trace), scheduling (software pipelining)
  • Around 1990 (embedded processors): models for retargetability, phase coupling, register allocation (heterogeneous registers)
4
Why Retargetable Compilation?
  • DSP-oriented applications are increasing
  • Embedded processors
    • The architecture of embedded processors is subject to regular change
  • Program developers
    • Need compiler support for a varying target processor architecture
  • Conventional compilers
    • The back end has to be rewritten for each new target
    • Tedious

5
Conventional Compilers
[Flow diagram]
  • Application in HLL
  • Front end: syntactic and semantic checks
  • IR: refinement, processor-independent optimizations
  • Code generator: code selection, register allocation, scheduling
  • Machine code
Knowledge of the target architecture is built into the code generator.
6
Retargetable Compiler
[Flow diagram]
  • Inputs: application in HLL and processor specification, each with its own front end
  • IR(s): refinement, processor-independent optimizations
  • Code generator: code selection, register allocation, scheduling
  • Machine code
Knowledge of the target architecture is specified explicitly.
7
Retargetability
  • Types
    • Classified by the amount of work needed to accommodate a new target processor
  • Developer retargetable
    • Rewriting the back end
  • User retargetable
    • Writing compiler-specific processor models
    • Uses the aid of compiler-compilers
    • E.g. GCC, lcc
  • Automatically retargetable
    • Independent processor specification
    • Specification at the level of the programmer's manual
    • E.g. CHESS, MIMOLA

8
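The CHESS Environment
[Flow diagram]
  • Application input: C (nDL), with primitive datatypes and operations, and application algorithms
  • Processor input: nML processor description, with instruction set and high-level structure
  • Each input has its own front end; high-level optimization follows
  • Intermediate representations: CDFG, ISG, LIB
  • Phases: code selection, register allocation, scheduling
  • Output: machine code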
9
Processor Modelling
  • Basic features of the processor
    • Registers
    • Data path (connectivity)
    • Instruction set (execution behaviour)
  • For good-quality code, all architectural peculiarities must be modelled
    • Heterogeneous register structure
    • Addressing modes
  • Specified at a high level of abstraction
    • Language and associated grammar
  • Processor modelling languages
    • nML, ISPS, ISDL, LISA

10
nML
  • Specifies the syntax and semantics of the instruction set
  • 2 main parts
    • Declarations (structural skeleton)
      • Hardware entities of the target processor (storage)
    • Grammar
      • Instruction set
      • Execution behaviour
      • Topology (datapath connectivity)
  • State machine
    • State: the values in storage
    • Transition function: instruction execution

11
nML (Declarations)
  • Defines the structural skeleton by defining the connection points
  • All storage elements are declared globally
  • 2 types of storage elements
    • Static storage
    • Transitory storage

12
Static Storage
  • Definition: elements that store a value for one or more machine cycles, until it is explicitly overwritten
  • Components
    • Memories
    • Controllable registers
  • The capacity of the storage is also specified
  • E.g.
    • Memory
      • mem DM[1024] <num>
    • Registers
      • reg AX <num>     // ALU register
      • reg Ia2 <addr>   // address register

13
Transitory Storage
  • Definition: elements that pass a value on with a certain delay, specified in machine cycles
  • Components
    • Buses
    • Nets
    • Pipeline registers
  • Capacity is one value, and it can be read once
  • E.g.
    • trn A <num>       // ALU input
    • trn XD <num>      // data bus
    • trn T <num> d 1   // delay of 1 machine cycle
  • Memory and register ports are also specified as transitories, to identify hardware conflicts
    • E.g. reg AX <num> read (AX_rA, AX_rB) write (AX_w)
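
The difference between the two storage classes can be pictured with a small data model. The Python sketch below is only an illustration: the class names `StaticStorage` and `Transitory` are hypothetical, the `delay` field is recorded but not simulated, and none of this is nML or CHESS code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaticStorage:
    """Memory or controllable register: keeps its value until overwritten."""
    name: str
    dtype: str
    capacity: int = 1
    value: Optional[int] = None

    def write(self, v: int) -> None:
        self.value = v

    def read(self) -> Optional[int]:
        return self.value  # reading does not consume the value


@dataclass
class Transitory:
    """Bus, net or pipeline register: capacity one, readable once."""
    name: str
    dtype: str
    delay: int = 0  # in machine cycles (assumed field, not simulated here)
    value: Optional[int] = None

    def write(self, v: int) -> None:
        self.value = v

    def read(self) -> Optional[int]:
        v, self.value = self.value, None  # the value is gone after one read
        return v


ax = StaticStorage("AX", "num")
a = Transitory("A", "num")
ax.write(7)
a.write(ax.read())  # move the register value onto the ALU input
```

Reading `ax` repeatedly keeps returning the stored value, while a second read of `a` returns nothing, which mirrors the "can be read once" rule.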

14
Other declarations
  • Record-type storage
    • E.g. the accumulator of a fixed-point DSP
    • Record data type: class acc { public: num w0; num w1; }
    • Storage element: reg MR<acc> MR0 MR1
  • Functional units can also be modelled
    • fu alu
    • fu mult
  • Hardwired constants
    • cst C_3 <fact> xxx
    • cst one_8 <num> 00000001
15
nML (Grammar)
  • Describes the instruction set and its behaviour
  • The instruction set is analysed and its structure is captured in production rules (the grammar)
  • The topology (connectivity) of the datapath is captured by grammar attributes
  • Production rules
    • OR-rules
      • List all alternatives for an instruction part
      • Alternatives are mutually exclusive
      • opn jvp_core (arith_ls_instr | control_instr | direct_mv)
    • AND-rules
      • Compose instruction parts
      • Parts are orthogonal
      • opn arith_ls_instr (ar : arith_instr, ls : indirect_mv)

16
  • Each possible derivation from these rules represents a legal instruction
  • The structure (hierarchy) of the instruction set is captured by the production rules

[Figure: derivation tree. Nodes: Jvp_core; Direct_move, Control_instr, Arith_ls_instr; Dir_store, Reg_mov, Dir_ld, ...; Indirect_mov, Arith_instr. Legend: OR_rule, AND_rule.]
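
The statement that every derivation is a legal instruction can be made concrete by enumerating derivations. The sketch below is a minimal illustration in Python: the rule names `jvp_core`, `arith_ls_instr`, `arith_instr` and `indirect_mv` follow the slides, but the leaf alternatives (`add`, `sub`, `ld`, `st`) are invented for the example, and `control_instr` and `direct_mv` are treated as leaves.

```python
from itertools import product

# OR-rules: mutually exclusive alternatives (leaf names are hypothetical)
OR_RULES = {
    "jvp_core": ["arith_ls_instr", "control_instr", "direct_mv"],
    "arith_instr": ["add", "sub"],
    "indirect_mv": ["ld", "st"],
}
# AND-rules: orthogonal parts composed into one instruction
AND_RULES = {
    "arith_ls_instr": ["arith_instr", "indirect_mv"],
}


def derivations(symbol):
    """Return every derivation of `symbol` as a tuple of leaf names."""
    if symbol in OR_RULES:   # union of the alternatives
        return [d for alt in OR_RULES[symbol] for d in derivations(alt)]
    if symbol in AND_RULES:  # Cartesian product of the parts
        parts = [derivations(p) for p in AND_RULES[symbol]]
        return [sum(combo, ()) for combo in product(*parts)]
    return [(symbol,)]       # a leaf derives only itself


instrs = derivations("jvp_core")
```

With these assumed leaves, the AND-rule contributes 2 x 2 combinations and the two remaining OR alternatives contribute one each, so `instrs` holds six legal instructions.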
17
Grammar Attributes
  • OR-rules just pass attributes on
  • AND-rules define 4 types of attribute
    • Action attribute
      • Specifies what is executed by the instruction or instruction part
      • Each AND-rule has one action attribute
    • Syntax attribute
      • Specifies the assembler syntax (mnemonic)
      • Each AND-rule may have multiple syntax attributes
    • Image attribute
      • Defines the binary encoding of the instruction or instruction part
    • Value and mode attributes
      • Specify how a storage element is addressed

18
E.g. an instruction part performing an immediate shift:

opn shift_instr (al : alu_left_op, factor : c_3)
  action {
    A = al.value;            // read the operand
    C = pass(A, AS) @alu;    // pass it through the ALU
    AR = C << factor @sh;    // perform the shift
  }
  image : "11"::al.image::factor

Operation types that can be executed on the ALU are best specified using switch statements:

opn alu_op (op : alu)
  action {
    switch (op) {
      case add : C = add(A, B, AS) @alu;
      case sub : C = sub(A, B, AS) @alu;
      . .
    }
  }
  image : "0"::op
19
Control Instructions
  • Modelled using switch statements
  • Action attributes contain primitive operation types that model the controller

opn cond_jump (t : c_10, c : cond)
  action {
    switch (c) {
      case EQ : tC = eq(AS);
                jump(tC, t);
      case GT : tC = ge(AS);
                jump(tC, t);
      . .
    }
  }
  image : "100xxx"::c::t

20
The CHESS Environment
[Flow diagram repeated: C (nDL) application and nML processor description pass through their front ends to the CDFG, ISG and LIB, followed by code selection, register allocation, scheduling and machine code.]
21
Instruction Set Graph (ISG)
  • Intermediate processor model
  • Directed bipartite graph
    • $G_{ISG} = \langle V_{ISG}, E_{ISG} \rangle$
  • Vertices: $V_{ISG} = V_S \cup V_I$
    • $V_S$: storage elements
    • $V_I$: operation types
  • Edges: $E_{ISG} \subseteq (V_S \times V_I) \cup (V_I \times V_S)$
    • Connectivity
    • Data flow
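
The bipartite structure can be sketched as a small container in which every edge joins a storage element and an operation type. This is a minimal Python illustration, not the CHESS data structure; the class `ISG`, its method names and the operation names in the usage lines are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ISG:
    storages: set = field(default_factory=set)  # V_S
    op_types: set = field(default_factory=set)  # V_I
    edges: set = field(default_factory=set)     # (V_S x V_I) U (V_I x V_S)

    def add_op(self, op, inputs, outputs):
        """Add an operation type with its input and output storage elements."""
        self.op_types.add(op)
        for s in inputs:
            self.storages.add(s)
            self.edges.add((s, op))   # storage -> operation type
        for s in outputs:
            self.storages.add(s)
            self.edges.add((op, s))   # operation type -> storage

    def is_bipartite(self):
        """Check that every edge joins a storage element and an operation type."""
        return all(
            (u in self.storages and v in self.op_types)
            or (u in self.op_types and v in self.storages)
            for u, v in self.edges
        )


isg = ISG()
isg.add_op("read_AX", ["AX"], ["AX_r"])   # hypothetical operation names
isg.add_op("copy_A", ["AX_r"], ["A"])
isg.add_op("and", ["A", "B"], ["C"])
```

Because `add_op` is the only way edges are created, data can only flow storage to operation to storage, which is the alternation the edge definition requires.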

22
Partial ISG of a processor
[Figure: partial ISG. Legend: static storage, transitory, operation type.]
  • Static storage: AX(num), AR(num), MR0(num), MR1(num)
  • Transitories: AX_r(num), AR_r(num), A(num), B(num), AS_w
  • Operation types, with their instruction encodings, as labelled in the figure:
    • AX read_reg 00xxx0xx 01100xxx AR_r
    • AX read_reg 00xxx0xx 01100xxx AX_r
    • AX_r copy 00xxx0xx 01100xxx A
    • AR_r copy 00xxx1xx B
    • A B and 00010xxx C
23
ISG (contd)
  • ISG operation types
    • Definition: a primitive processor activity with a fixed number of ordered input and output arguments
    • Each argument is connected to one edge and one storage element
    • The implementation of the primitive operation types is defined in a header file
  • Enabling conditions
    • Each instruction of the processor executes many operation types
    • One operation type is enabled by many instructions
    • Definition: $enabling(i)$ is the set of all instructions that enable operation type $i$

24
Conflicts
  • 2 kinds
    • Encoding conflicts
    • Hardware (resource) conflicts
  • Encoding conflicts
    • Take a subset of ISG operation types $V_{Io} \subseteq V_I$
    • $enabling(V_{Io}) = \bigcap_{i \in V_{Io}} enabling(i)$
    • $V_{Io}$ has an encoding conflict if $enabling(V_{Io}) = \emptyset$
    • Used when packing 2 operation types into one instruction
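
The encoding-conflict test is a set intersection, which the following Python sketch spells out. The table `ENABLING` and its instruction names `i1` to `i4` are made-up data for illustration, and the functions assume a non-empty collection of operation types.

```python
from functools import reduce

# Hypothetical enabling sets: operation type -> instructions that enable it
ENABLING = {
    "add": {"i1", "i2"},
    "load": {"i2", "i3"},
    "jump": {"i4"},
}


def enabling(op_types):
    """Intersection of enabling(i) over all operation types i in op_types."""
    return reduce(set.intersection, (ENABLING[i] for i in op_types))


def has_encoding_conflict(op_types):
    """True if no single instruction enables all the given operation types."""
    return len(enabling(op_types)) == 0
```

Here `add` and `load` share instruction `i2`, so they can be packed together, whereas `add` and `jump` share none and therefore have an encoding conflict.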

25
  • Resource (hardware) conflicts
    • Several operation types contend for the same resource
    • $input(i, n): V_I \times \mathbb{N} \to V_S$
    • $output(i, n): V_I \times \mathbb{N} \to V_S$
    • Read and write ports are transitories
    • A hardware conflict is modelled as an access conflict on transitories
  • To check for hardware conflicts
    • $resources(i)$: the set of all transitories that operation type $i$ accesses
    • For $V_{Io} \subseteq V_I$: if for all $i_i, i_j \in V_{Io}$ with $i_i \neq i_j$, $resources(i_i) \cap resources(i_j) = \emptyset$,
    • then $V_{Io}$ is free of hardware conflicts
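
The pairwise check above translates directly into code. In this Python sketch the `RESOURCES` table is invented for illustration (the transitory names are borrowed loosely from earlier slides); only the disjointness test itself comes from the definition.

```python
from itertools import combinations

# Hypothetical resource sets: operation type -> transitories it accesses
RESOURCES = {
    "add": {"A", "B", "C"},
    "shift": {"C", "AR_w"},
    "load": {"XD", "AX_w"},
}


def free_of_hw_conflicts(op_types):
    """True if every pair of distinct operation types uses disjoint transitories."""
    return all(
        RESOURCES[a].isdisjoint(RESOURCES[b])
        for a, b in combinations(set(op_types), 2)
    )
```

With this data, `add` and `load` can share a cycle, while `add` and `shift` both access transitory `C` and so conflict.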

26
nML to ISG front end
  • The nML description is parsed into a parse tree
  • The parse tree goes through 3 passes
    • Pass 1: find the instruction word length and locate the position of each image attribute in the instruction
    • Pass 2: find the enabling conditions and all specified instructions
    • Pass 3: find the exact enabling conditions using the set of instructions found in pass 2

27
CDFG (IR for application)
  • Similar to the ISG
  • Directed bipartite graph
    • $G_{CDFG} = \langle V_{CDFG}, E_{CDFG} \rangle$
  • Vertices: $V_{CDFG} = V_O \cup V_V$
    • $V_O$: operations in the application
    • $V_V$: values that operations produce or consume
  • Edges: $E_{CDFG} \subseteq (V_O \times V_V) \cup (V_V \times V_O)$
    • Represent data flow from operations through values
  • Control flow is modelled by imposing a hierarchy of macronodes on the CDFG operations
    • Macronodes have a type
    • Basic block, if-stat, for-stat, do-stat

28
Eg of CDFG
[Figure: example CDFG. Macronodes: root, Block (init), Do-stat, Block (loop-init), If-stat, Block (then), Block (else), Block (loop-end). Values: a, b, c, d, x, t1, t2, t3, t4.]
Data flow of (a(bc))-((bc)d)
Control flow of a do-while loop
29
  • CDFG operation types
    • Operation types used in the application
    • Can be hierarchical
    • Different from the ISG operation types
  • All operation types, of both applications and processors, are declared in a header file
  • The operation types are linked by a library (LIB), which defines the operation hierarchy

30
Operation-Type hierarchy (LIB)
  • The LIB contains 3 parts
    • Processor-independent part
      • Defines operation properties
      • E.g. commutative, inline function, primitive
    • Processor-dependent part from the header file
    • Processor-dependent part from the nML description
  • Basic idea
    • Operation types in the LIB are organised hierarchically, representing the different ways in which a CDFG operation can be mapped to an ISG operation

31
Operation type Hierarchy (eg.)
[Figure: operation-type hierarchy. Nodes: func_opn, comm_opn, add, sub; leaf types add_XY, add_YX, sub_XY, sub_YX; ISG storage X, Y, C.]
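
One way to picture the hierarchy is as a tree whose leaves are the processor-specific operation types. The Python sketch below uses the node names from the figure, but the parent-child edges in `CHILDREN` are an assumed reading of it (in particular, placing `add` under `comm_opn` and `sub` directly under `func_opn`), not something the transcript confirms.

```python
# Assumed hierarchy: parent type -> child types
CHILDREN = {
    "func_opn": ["comm_opn", "sub"],
    "comm_opn": ["add"],
    "add": ["add_XY", "add_YX"],
    "sub": ["sub_XY", "sub_YX"],
}


def leaf_types(op_type):
    """All leaf (processor-level) types below op_type, in tree order."""
    kids = CHILDREN.get(op_type, [])
    if not kids:
        return [op_type]
    return [leaf for k in kids for leaf in leaf_types(k)]


def is_subtype(child, ancestor):
    """True if child equals ancestor or lies below it in the hierarchy."""
    if child == ancestor:
        return True
    return any(is_subtype(child, k) for k in CHILDREN.get(ancestor, []))
```

Each leaf returned for a CDFG operation type is one candidate way of mapping that operation onto the ISG.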
32
The CHESS Environment
[Flow diagram repeated: C (nDL) application and nML processor description pass through their front ends to the CDFG, ISG and LIB, followed by code selection, register allocation, scheduling and machine code.]
33
Code Generation Process
  • Mapping of $G_{CDFG}\langle V_O, V_V \rangle$ onto $G_{ISG}\langle V_I, V_S \rangle$
    • $V_V$ onto $V_S$
    • $V_O$ onto $V_I$
  • Assumptions
    • Proceeds basic block by basic block
    • Transitories have zero delay
    • So each operation type executes in 1 cycle
  • Phases
    • Code selection
      • Refinement
      • Bundling
      • Covering
    • Register allocation
    • Scheduling

34
Code Selection (Refinement)
  • Replaces CDFG operations by their children in the operation-type hierarchy
    • $refinement(o) = r$
  • Valid refinement
    • A CDFG operation $r \in V_{OR}$ is a valid refinement of a CDFG operation $o \in V_O$ with $type(o) \in L$, iff
      • $type(r) = i$, where $i$ is a subtype of $type(o)$
      • $datatype(input(o, n)) = datatype(input(i, n))$
      • $datatype(output(o, n)) = datatype(output(i, n))$
  • Valid mapping
    • $mapping(o) = i$, with $o \in V_O$ and $i \in V_I$
    • $i = type(r) = type(refinement(o))$
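
The validity conditions can be checked mechanically. The Python sketch below is a simplification under stated assumptions: the `Op` record carries the argument datatypes itself (the slides look them up through `input` and `output`), the `PARENT` table is a hypothetical fragment of the LIB, and the subtype test also accepts equal types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    type: str
    in_types: tuple   # datatype of each ordered input argument
    out_types: tuple  # datatype of each ordered output argument


# Hypothetical fragment of the hierarchy: child type -> parent type
PARENT = {"add_XY": "add", "add_YX": "add", "add": "comm_opn"}


def is_subtype(t, ancestor):
    """Walk up the hierarchy from t until ancestor is found or the root is passed."""
    while t is not None:
        if t == ancestor:
            return True
        t = PARENT.get(t)
    return False


def valid_refinement(o, r):
    """r refines o if its type is a subtype and all argument datatypes match."""
    return (
        is_subtype(r.type, o.type)
        and r.in_types == o.in_types
        and r.out_types == o.out_types
    )
```

For example, an `add` on two `num` inputs may be refined to `add_XY` with the same datatypes, but not to an `add_XY` variant whose first input is an `addr`.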

35
Refinement (contd)
  • Binding data dependencies
  • 2 types
    • Direct data dependency
    • Allocated data dependency
  • Direct data dependency
    • A data dependency between 2 refined CDFG operations $r_1$ and $r_2$ is direct if it is implemented as a valid direct path in the ISG
    • Direct path: a path in the ISG between 2 operations that does not involve any storage other than transitories
    • $direct(r_i, r_j)$ = true/false
  • Allocated data dependency
    • A data dependency between 2 refined CDFG operations $r_1$ and $r_2$ is allocated if it is implemented as a path in the ISG that contains one or more static storage elements

36
Bundling
  • Idea: find conflict-free CDFG operations that can be executed in the same cycle
  • Definition: a bundle is a set of CDFG operations that can be refined to form a refined bundle
  • Refined bundle
    • A set of refined operations $r_1, r_2 \in V_{OR}$ that are all coupled
    • $coupled(r_1, r_2)$ = true/false
    • True if
      • $r_1 = r_2$, or
      • $direct(r_1, r_2)$, or
      • $coupled(r_1, r_3) \wedge coupled(r_3, r_2)$
  • Defined for a given refinement function
  • Each bundle can be associated with a set of refinement functions
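
Since `coupled` is the reflexive and transitive closure of `direct`, refined bundles can be computed as connected groups. The Python sketch below uses a union-find structure for this; it assumes that coupling is symmetric (the slides do not say so explicitly), and the function name and the operation names in the usage lines are illustrative.

```python
def refined_bundles(ops, direct_pairs):
    """Group refined operations into the classes of the `coupled` relation."""
    parent = {r: r for r in ops}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    for a, b in direct_pairs:  # direct(a, b) implies coupled(a, b)
        parent[find(a)] = find(b)

    groups = {}
    for r in ops:
        groups.setdefault(find(r), set()).add(r)
    # Sort for a deterministic result
    return sorted(groups.values(), key=lambda s: sorted(s))


ops = ["r1", "r2", "r3", "r4"]
direct_pairs = [("r1", "r2"), ("r2", "r3")]
bundles = refined_bundles(ops, direct_pairs)
```

Here `r1` and `r3` end up in the same bundle through `r2`, which is the third clause of the definition, while `r4` forms a bundle on its own.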

37
Properties/constraints on bundles
  • Same-cycle theorem
    • 2 refined operations that have a direct data dependency belong to the same bundle; if they have an allocated data dependency, they cannot be in the same bundle
  • Operations in a bundle must not have encoding or resource conflicts
  • Bundles need to be convex
    • A bundle is convex if no operation path between 2 of its operations contains an operation external to the bundle

[Figure: operations <<1, x and >>2. Not a convex bundle.]
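
Convexity can be tested with plain graph reachability. The sketch below, in Python, treats the operations as a directed graph given by a successor map; the graph in the usage lines (`shl`, `mul`, `shr`) is a hypothetical example and is not claimed to match the figure.

```python
def reachable(succ, start):
    """All operations reachable from start by following successor edges."""
    seen, stack = set(), list(succ.get(start, []))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succ.get(n, []))
    return seen


def is_convex(succ, bundle):
    """A bundle is convex if no path between two members leaves the bundle."""
    bundle = set(bundle)
    from_bundle = set().union(*(reachable(succ, b) for b in bundle))
    for x in from_bundle - bundle:      # external operation fed by the bundle
        if reachable(succ, x) & bundle:  # ... that feeds back into the bundle
            return False
    return True


# Hypothetical data flow: shl -> mul -> shr, plus a direct edge shl -> shr
succ = {"shl": ["mul", "shr"], "mul": ["shr"]}
```

With this graph, the bundle of `shl` and `shr` is not convex because the path through `mul` leaves and re-enters it; adding `mul` to the bundle makes it convex.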
38
  • A refined bundle that satisfies these properties is called a valid refined bundle
  • A bundle is valid if its operations can be refined to form a valid refined bundle
  • E.g. of a valid bundle
  • In effect, each valid bundle corresponds to an instruction or instruction part

[Figure: operations <<1, x and >>2.]
39
Refinement and Bundling in a nutshell
[Figure. CDFG operations: sub, <<. Library: Func_oprn, Comm_opn, sub, add, <<, with leaf types Sub_BA, Add_AB, Sub_AB, <<_C. ISG: A, B, C, AR_w. Arrows labelled type, refinement and mapping.]
40
Code Selection (covering)
  • The previous stages give all possibilities for valid bundling
    • Each operation may be covered by one or more bundles
    • Each bundle covers one or more operations
  • Minimum graph cover
    • Given a collection of bundles B that induce patterns in the CDFG, the problem is to search for a minimum number of patterns that cover the whole $G_{CDFG}$
    • Cost function: number of bundles
  • Solution: branch-and-bound algorithm

41
Branch and Bound (Basic strategy)
  • Find essential bundles
    • A bundle is essential if some operation $o_i$ is covered by only that bundle
    • Add these to the cover C
  • For the rest, build a search tree
    • Each node is a partial cover of the CDFG
    • Branching at each node models the selection of bundles
    • A depth-first traversal gives a cover C

[Figure: example CDFG with candidate bundles B1, B2, B3, B4 and B5.]
42
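The search tree
[Figure: search tree rooted at "start"; successive levels choose a bundle for O2, then O3, then O1.]
  • Level O2: B4{o1, o2}, B2{o2}
  • Level O3: B5{o1, o3} (cost 2), B3{o3}, B5{o1, o3}, B4o3
  • Level O1: B1{o1} (cost 3), three times
Covers B5, B4, B1, B3, B4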
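
The strategy of the previous two slides can be sketched as a small recursive search. The Python code below is an illustration under assumptions: the function `min_cover` is hypothetical, the cost is simply the number of bundles, operations are covered in sorted order, and the bundle contents in the usage lines are one plausible reading of the figure rather than confirmed data.

```python
def min_cover(ops, bundles):
    """Smallest set of bundle names whose members cover all ops, or None."""
    ops = set(ops)
    # Essential bundles: the only bundle covering some operation
    cover = set()
    for o in ops:
        owners = [b for b, members in bundles.items() if o in members]
        if len(owners) == 1:
            cover.add(owners[0])
    best = None

    def search(chosen, uncovered):
        nonlocal best
        if best is not None and len(chosen) >= len(best):
            return                   # bound: cannot beat the best cover so far
        if not uncovered:
            best = set(chosen)       # complete and strictly cheaper cover
            return
        o = min(uncovered)           # next operation to cover
        for b in sorted(bundles):    # branch: one child per bundle covering o
            if o in bundles[b]:
                search(chosen | {b}, uncovered - bundles[b])

    covered = set().union(*(bundles[b] for b in cover))
    search(cover, ops - covered)
    return best


# Assumed bundle contents for illustration
bundles = {
    "B1": {"o1"}, "B2": {"o2"}, "B3": {"o3"},
    "B4": {"o1", "o2"}, "B5": {"o1", "o3"},
}
cover = min_cover(["o1", "o2", "o3"], bundles)
```

The first depth-first descent finds a cover of three bundles; that cost then prunes every later branch that reaches three bundles, and the search ends with a cover of two. The order in which operations are picked affects only the size of the tree, not the cost of the result.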
43
Issues in covering
  • Overlapping bundles
    • Operation duplication
  • Order of choosing the operations $o_i$
    • The size of the tree can be reduced
    • E.g. increasing order of $B_{o_i}$
  • Pruning and branching heuristics

44
Register Allocation and Scheduling
  • Register allocation
    • Binds values to registers or memory
    • Modelled as a data routing problem on the ISG
    • Makes sure the capacity of storage is not exceeded
      • Spilling values to memory
      • Fixing the execution order between bundles
  • Scheduling (compaction phase)
    • Operations are bound to time
    • Operations are packed into instructions
    • Operations in the same bundle execute in the same instruction
    • Different bundles may be scheduled in parallel in the same instruction

45
Other issues related to Code generation
  • Code generation beyond basic blocks
    • Bundling of operations beyond basic blocks
    • Scheduling done globally
    • Operations could still be moved across blocks
    • Loop unfolding or software pipelining
  • Phase coupling
    • Delayed binding
    • Common operands
    • Coupling by cost functions
      • The cost of each bundle is different
      • $cost(C) = \sum_i cost(B_i)$
      • Scheduling in parallel is emphasised

46
Compilation flow using CHESS environment
[Flow diagram; tools in capitals, files as labelled]
  • Processor modelling
    • Processor.h, NOODLE, Processor.cdfg
    • Proc.nml, Processor.lib, ANIMAL, Processor.isg
  • Application program
    • Processor.h and program.c, NOODLE, Program.cdfg
    • COSEL: Program_bndl.lib, Program_bndl.cdfg, Program_bndl.isg
    • AMNESIA: Program_dr.cdfg
    • MIST: Program_sch.cdfg
    • STATIC: Program.micro