Title: Field Modifiable Architecture and its Design Method
1 Field Modifiable Architecture and its Design Method
- Kenshu Seto, Yoshihisa Kojima, Hiroshi Saito, Satoshi Komatsu and Masahiro Fujita
- Dept. of Electronic Eng., University of Tokyo
- 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656 Japan
- Tel: 81-3-5841-6764
- E-mail: seto, kojima, saito, komatsu, fujita_at_cad.t.u-tokyo.ac.jp
2 Typical Architecture and its Problem
Critical loops in processor software are implemented entirely in hardware (i.e. ASICs).
- It delivers high performance, but ...
- Verification takes too much time (not easy to design).
- Bug fixes are impossible.
- Less reusable than software.
3 Field Modifiable Architecture
- It has specialized instructions for a given program.
- Specialized instructions are generated automatically.
- It is controlled by software.
- Much more flexible and debuggable.
- Much more easily designed.
- Much more reusable.
4 Design Method Overview
In this presentation, we focus on specialized instruction set generation and code generation.
5 Instruction Set Generation Flow
[Flow diagram: Special Instruction Library and DFG -> Match Generation -> Binate Covering Formulation -> Solving Binate Covering by MTBDD. Outputs: Optimum Instruction Set, Rule File, Hardware Netlist (ADD, MAC).]
Special Instruction Library:
- Function add: area 150, delay 0.5ns, cycle 1, power 0.1mW
- Function 3add: area 250, delay 0.7ns, cycle 1, power 0.15mW
- ...
Rule File:
- plus(mult(reg,reg),reg) -> reg  resource ALU  mac reg,reg,reg,reg
- plus(reg, reg) -> reg  resource ALU  add reg,reg,reg
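The rule-file entries on this slide pair a tree pattern with an instruction. A minimal sketch of how such patterns could be matched against an expression tree follows. The tuple encoding, the `RULES` table, and the function names are assumptions made for illustration; the slides do not describe the actual rule-file parser or matcher.

```python
# Patterns and expression trees are nested tuples: (operator, operand, ...).
# The string "reg" in a pattern matches any operand (hypothetical encoding).
RULES = [
    (("plus", ("mult", "reg", "reg"), "reg"), "mac"),  # more specific rule first
    (("plus", "reg", "reg"), "add"),
]


def matches(pattern, tree):
    """Return True if `tree` has the shape described by `pattern`."""
    if pattern == "reg":
        return True
    if not isinstance(tree, tuple) or len(tree) != len(pattern):
        return False
    if tree[0] != pattern[0]:
        return False
    return all(matches(p, t) for p, t in zip(pattern[1:], tree[1:]))


def first_rule(tree):
    """Return the mnemonic of the first rule whose pattern matches, else None."""
    for pattern, mnemonic in RULES:
        if matches(pattern, tree):
            return mnemonic
    return None
```

Listing the `mac` rule before the `add` rule makes the larger pattern win when both apply; a real generator would instead leave that choice to the covering step.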
6 Instruction Models
The following kinds of specialized instructions are generated automatically.
Serial instructions (like MAC in DSPs)
Parallel instructions (like 2-parallel ADD in SIMD processors)
In this presentation, only serial instructions are considered.
7 Match Generation
[Figure: the given DFG (nodes n1, n2, n3, n4) and the matches generated from it by the "Generate all matches" step]
- A match is a sub-graph of the given DFG.
- Each match consists of 1 to 4 DFG nodes.
- Each match is a candidate for a special instruction.
- The optimal set of instructions is selected.
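The match-generation rules above can be sketched as an enumeration of connected sub-graphs with 1 to 4 nodes. This is a minimal sketch under stated assumptions: the edge list is a hypothetical four-node DFG (not necessarily the one drawn on the slide), the names `is_connected` and `generate_matches` are illustrative, and no further filtering (for example, single-output matches only) is applied.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Return True if `nodes` induce a connected sub-graph (edges taken as undirected)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        cur = stack.pop()
        for a, b in edges:
            if a == cur:
                other = b
            elif b == cur:
                other = a
            else:
                continue
            if other in nodes and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == nodes


def generate_matches(nodes, edges, max_size=4):
    """Enumerate every connected sub-graph with 1..max_size nodes."""
    found = []
    for size in range(1, max_size + 1):
        for subset in combinations(sorted(nodes), size):
            if is_connected(subset, edges):
                found.append(frozenset(subset))
    return found


# Hypothetical DFG: n3 and n4 feed n1, and n4 also feeds n2.
NODES = ["n1", "n2", "n3", "n4"]
EDGES = [("n3", "n1"), ("n4", "n1"), ("n4", "n2")]
MATCHES = generate_matches(NODES, EDGES)
```

Exhaustive enumeration is only practical because matches are capped at four nodes; the cap keeps the candidate set small for the covering step.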
8 Variables Used in Binate Covering
[Figure: the example DFG annotated with node variables (n1,n2,n3,n4) and the generated matches annotated with match variables (m1,m2,m3,m4,m5,m6,m7)]
- The meanings of the variables are as follows.
- $m_i = 1$ when match $m_i$ is selected as a special instruction.
- $n_j = 1$ when node $j$ is covered as the primary output of a match.
9 Cost Functions
- Total cycle cost: $C_{step} = \sum_i C_{step,i} \, m_i$
- Total area cost: $C_{area} = \sum_j C_{area,j} \, func_j$
- $C_{step,i}$ is the cycle cost of match $m_i$.
- $C_{area,j}$ is the area cost of function unit $j$.
- $func_j = 1$ when function unit $j$ is used by an instruction.
Example: cycle cost 3 (3 instructions), area cost 1 (ADD function unit).
[Figure: example cover of the DFG (nodes n1, n2, n4) by matches m1, m2, m4]
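The two cost functions can be evaluated directly for a chosen set of matches. The sketch below assumes, for illustration only, that the three selected matches each take one cycle and all map to a single ADD function unit of area 1, which reproduces the example figures on this slide. The table names `CYCLE_COST`, `UNIT_OF`, and `AREA_COST` are hypothetical.

```python
def total_cycle_cost(selected, cycle_cost):
    """C_step: sum of the cycle costs of the selected matches (those with m_i = 1)."""
    return sum(cycle_cost[m] for m in selected)


def total_area_cost(selected, unit_of, area_cost):
    """C_area: each function unit is counted once, however many matches share it."""
    used_units = {unit_of[m] for m in selected}  # units with func_j = 1
    return sum(area_cost[u] for u in used_units)


# Assumed data: three single-cycle matches that share one ADD unit.
CYCLE_COST = {"m1": 1, "m2": 1, "m4": 1}
UNIT_OF = {"m1": "ADD", "m2": "ADD", "m4": "ADD"}
AREA_COST = {"ADD": 1}

SELECTED = ["m1", "m2", "m4"]
STEP = total_cycle_cost(SELECTED, CYCLE_COST)
AREA = total_area_cost(SELECTED, UNIT_OF, AREA_COST)
```

Counting a shared unit once is what lets the area cost stay at 1 while the cycle cost grows with the number of instructions.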
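10 Solving Binate Covering
- Relation between node and match variables:
- $F_1 = (\lnot n_1 + m_1 + m_5 + m_6)(\lnot n_2 + m_3 + m_4)(\lnot n_3 + m_3)(\lnot n_4 + m_4)$
- DFG output condition:
- $F_2 = n_1 n_2$
- Match input condition:
- $F_3 = (\lnot m_1 + n_3 n_4)(\lnot m_2 + n_4)(\lnot m_3 + 1)(\lnot m_4 + 1)(\lnot m_5 + n_4)(\lnot m_6 + n_3)(\lnot m_7 + 1)$
The condition $F_1 F_2 F_3$ and the cost functions are represented by an MTBDD (Multi-Terminal Binary Decision Diagram), and the best solution is computed.
An example solution is $m_1 = 0, m_2 = 1, m_3 = 0, m_4 = 1, m_5 = 1, m_6 = 0, m_7 = 0$ (3 instructions).
[Figure: the selected matches m2, m4, m5]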
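The three kinds of condition can be illustrated on a smaller, hypothetical problem: a two-node DFG (a multiply feeding an add) with three candidate matches. The sketch below solves it by exhaustive enumeration, which is a plain substitute for the MTBDD-based solver named on the slide, not a reproduction of it. All variable names and the unit cost per match are assumptions.

```python
from itertools import product

# Hypothetical problem: node variables n_mul, n_add; match variables
# m_mul (covers mul), m_add (covers add), m_mac (covers both nodes).
VARS = ["n_mul", "n_add", "m_mul", "m_add", "m_mac"]
MATCH_VARS = ["m_mul", "m_add", "m_mac"]


def feasible(v):
    """Evaluate F1 * F2 * F3 for one assignment of all variables."""
    # F1: a node covered as a primary output needs a selected match producing it.
    f1 = (not v["n_mul"] or v["m_mul"]) and (
        not v["n_add"] or v["m_add"] or v["m_mac"]
    )
    # F2: the DFG output (the add node) must be covered.
    f2 = v["n_add"]
    # F3: a selected match needs its inputs; m_add reads the multiply result.
    f3 = not v["m_add"] or v["n_mul"]
    return f1 and f2 and f3


def solve():
    """Return (cost, assignment) with the fewest selected matches."""
    best = None
    for bits in product([False, True], repeat=len(VARS)):
        v = dict(zip(VARS, bits))
        if not feasible(v):
            continue
        cost = sum(v[m] for m in MATCH_VARS)  # one cycle per selected match
        if best is None or cost < best[0]:
            best = (cost, v)
    return best


COST, SOLUTION = solve()
```

The covering is binate because match variables appear both positively (in F1) and negatively (in F3), so selecting a match can create new obligations rather than only discharging them.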
11 Code Generation by FSM
An efficient code generator is desired for complex architectures. (Traditional code generators produce poor-quality code.)
A CGFSM (Code Generation FSM)-based code generator has been developed.
12 Code Generation Flow by FSM
[Flow diagram: Rule File and DFG -> CGFSM Generation -> CGFSM -> Symbolic Analysis of CGFSM -> Assembly code]
Rule File:
- ld(addr) -> reg  resource read_port  ld reg,(addr)
- st(addr) -> reg  resource write_port  st (addr),reg
- plus(reg, reg) -> reg  resource ALU  add reg,reg,reg
Assembly code:
ld r1,(addr)
ld r2,(addr)
add r3,r1,r2
st (addr), r3
13 Variables used in CGFSM
- For this simple example DFG,
- Input variables are $(m_1, m_2, m_3, m_4)$.
- State variables are $(o_{1,reg}, o_{2,reg}, o_{3,reg}, c_4)$.
- $m_i = 1$ when executing match $m_i$.
- $o_{i,reg} = 1$ when the operand of node $i$ is available.
- $c_i = 1$ when node $i$ is covered.
14 State Transitions of CGFSM
[Figure: CGFSM state-transition diagram for the example DFG (nodes n1, n2, n3, n4). States are labeled with the state variables (o1,reg, o2,reg, o3,reg, c4); transitions are labeled with the input variables (m1,m2,m3,m4) and the emitted instruction (ld, st). The diagram marks the initial state and the final state.]
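The idea of searching a code-generation FSM for an instruction sequence can be sketched with an explicit breadth-first search. This is a simplified illustration under several assumptions: m1 and m2 are taken to be the two loads, m3 the add, and m4 the store; state bits only ever go from 0 to 1 (the diagram on the slide may reset operand bits once they are consumed); and explicit search stands in for the symbolic analysis used in the presentation.

```python
from collections import deque

# match -> (state bits required, state bit produced, mnemonic).
# State bit order follows the slide: (o1, o2, o3, c4). Mapping is assumed.
MATCHES = {
    "m1": ((), 0, "ld"),
    "m2": ((), 1, "ld"),
    "m3": ((0, 1), 2, "add"),
    "m4": ((2,), 3, "st"),
}
INITIAL = (0, 0, 0, 0)
FINAL = (1, 1, 1, 1)


def step(state, match):
    """Return the next state, or None if the match cannot fire in `state`."""
    needs, sets, _ = MATCHES[match]
    if state[sets] or not all(state[i] for i in needs):
        return None
    nxt = list(state)
    nxt[sets] = 1
    return tuple(nxt)


def shortest_schedule():
    """Breadth-first search for a shortest match sequence from INITIAL to FINAL."""
    queue = deque([(INITIAL, [])])
    seen = {INITIAL}
    while queue:
        state, path = queue.popleft()
        if state == FINAL:
            return path
        for match in MATCHES:
            nxt = step(state, match)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [match]))
    return None


SCHEDULE = shortest_schedule()
MNEMONICS = [MATCHES[m][2] for m in SCHEDULE]
```

Under these assumptions the search yields the same instruction order as the assembly code shown for the example: two loads, an add, then a store.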
15 Experimental Results
- Rijndael encryption functions are used.
- (sw): Implementation on a general-purpose processor (MIPS).
- (ours): Result for the field modifiable architecture.
- (vliw): Result for a conventional VLIW architecture.
- Area cost is not considered.
- All instructions are assumed to take 1 cycle.
The proposed architecture is 2.9 times faster than MIPS and 1.5 times faster than VLIW.
16 Conclusion and Future Work
- Conclusion
- Field modifiable architecture
- Specialized instruction generation
- FSM-based code generation
- The latest encryption program is mapped successfully to the field modifiable architecture.
- Future Work
- Specialized instruction generation considering resource constraints, power, etc.
- FSM-based code generation for CDFG.