Field Modifiable Architecture and its Design Method

1
Field Modifiable Architecture and its Design Method

Kenshu Seto, Yoshihisa Kojima, Hiroshi Saito,
Satoshi Komatsu and Masahiro Fujita
Dept. of Electronic Eng., University of Tokyo
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656 Japan
Tel: +81-3-5841-6764
E-mail: {seto, kojima, saito, komatsu, fujita}@cad.t.u-tokyo.ac.jp

2
Typical Architecture and its Problem
Critical loops in processor software are implemented entirely in hardware (i.e. ASICs).

This gives high performance, but ...
Verification takes too much time (it is not easy to design).
Bug fixes are impossible.
It is less reusable than software.

3
Field Modifiable Architecture

It has specialized instructions for a given
program.
Specialized instructions are generated
automatically.
It is controlled by software.

Much more flexible and debuggable.
Much more easily designed.
Much more reusable.

4
Design Method Overview
In this presentation, we focus on specialized instruction set generation and code generation.
5
Instruction Set Generation Flow

Inputs:
DFG
Special Instruction Library, e.g.
Function add: area 150, delay 0.5ns, cycle 1, power 0.1mW
Function 3add: area 250, delay 0.7ns, cycle 1, power 0.15mW
...

Steps:
Match Generation
Binate Covering Formulation
Solving Binate Covering by MTBDD

Outputs:
Optimum Instruction Set
Rule File, e.g.
plus(mult(reg,reg),reg) -> reg resource ALU mac reg,reg,reg,reg
plus(reg,reg) -> reg resource ALU add reg,reg,reg
Hardware Netlist (ADD, MAC)
6
Instruction Models
The following kinds of specialized instructions are generated automatically.
Serial instructions (like MAC in DSPs)
Parallel instructions (like 2-parallel ADD in SIMD processors)
In this presentation, only serial instructions are considered.
7
Match Generation
[Figure: a given DFG with nodes n1, n2, n3, n4, and the matches obtained from it by the "Generate all matches" step.]

A match is a sub-graph of the given DFG.
Each match consists of 1 to 4 DFG nodes.
Each match is a candidate for a special instruction.
The optimal set of instructions is selected from these candidates.
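
The enumeration of candidate matches can be sketched as follows. This is only an illustration, not the authors' generator: it assumes a match is any connected sub-graph of 1 to 4 nodes, and it uses a hypothetical 4-node DFG (the edge list `DFG_EDGES` is an assumption). A real generator would presumably also check each candidate against the special instruction library, which this sketch does not do.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Return True if `nodes` form one connected piece, ignoring edge direction."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == nodes

def generate_matches(nodes, edges, max_size=4):
    """Enumerate every connected sub-graph with 1..max_size nodes."""
    matches = []
    for k in range(1, max_size + 1):
        for subset in combinations(nodes, k):
            if is_connected(subset, edges):
                matches.append(frozenset(subset))
    return matches

# Hypothetical DFG (an assumption): n3 and n4 feed n1, and n4 also feeds n2.
DFG_NODES = ["n1", "n2", "n3", "n4"]
DFG_EDGES = [("n3", "n1"), ("n4", "n1"), ("n4", "n2")]

matches = generate_matches(DFG_NODES, DFG_EDGES)
for m in matches:
    print(sorted(m))
```

Brute-force enumeration over node subsets is exponential in the DFG size; it is used here only because the example is tiny.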

8
Variables Used in Binate Covering
[Figure: the example DFG and its seven matches.]
Node variables (n1,n2,n3,n4)
Match variables (m1,m2,m3,m4,m5,m6,m7)

The meanings of the variables are as follows.
$m_i = 1$ when match $m_i$ is selected as a special instruction.
$n_j = 1$ when node $j$ is covered as the primary output of a match.

9
Cost Functions

Total cycle cost: $C_{step} = \sum_i C_{step,i} \, m_i$
Total area cost: $C_{area} = \sum_j C_{area,j} \, func_j$

$C_{step,i}$ is the cycle cost of match $m_i$.
$C_{area,j}$ is the area cost of function unit $j$.
$func_j = 1$ when function unit $j$ is used by an instruction.

Example: cycle cost 3 (3 instructions), area cost 1 (ADD function unit).
[Figure: example cover, labeled with nodes n1, n2, n4 and matches m1, m2, m4.]
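
As a worked instance of the two sums, the short sketch below evaluates the total cycle cost and total area cost for one selection. The selected matches, the per-match cycle costs, and the per-unit area costs are all assumed values chosen so that the totals agree with the example figures on this slide (cycle cost 3, area cost 1); they are not taken from the authors' data.

```python
def total_cycle_cost(selected, cycle_cost):
    """Sum of C_step,i * m_i over all matches."""
    return sum(cycle_cost[m] * bit for m, bit in selected.items())

def total_area_cost(used, area_cost):
    """Sum of C_area,j * func_j over all function units."""
    return sum(area_cost[f] * bit for f, bit in used.items())

# Assumed selection: three matches chosen, each costing one cycle.
selected = {"m1": 1, "m2": 1, "m3": 0, "m4": 1}
cycle_cost = {m: 1 for m in selected}

# Assumed function units: only ADD is used; the MAC area is a made-up value.
used = {"ADD": 1, "MAC": 0}
area_cost = {"ADD": 1, "MAC": 2}

c_step = total_cycle_cost(selected, cycle_cost)
c_area = total_area_cost(used, area_cost)
print(c_step, c_area)
```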
10
Solving Binate Covering

Relation between node and match variables:
$F_1 = (!n_1 + m_1 + m_5 + m_6)(!n_2 + m_3 + m_4)(!n_3 + m_3)(!n_4 + m_4)$
DFG output condition:
$F_2 = n_1 n_2$
Match input condition:
$F_3 = (!m_1 + n_3 n_4)(!m_2 + n_4)(!m_3 + 1)(!m_4 + 1)(!m_5 + n_4)(!m_6 + n_3)(!m_7 + 1)$

The condition $F_1 F_2 F_3$ and the cost functions are represented by an MTBDD (Multi-Terminal Binary Decision Diagram), and the best solution is computed.
An example solution is $m_1=0, m_2=1, m_3=0, m_4=1, m_5=1, m_6=0, m_7=0$ (3 instructions).
[Figure: the matches m2, m4, m5 of the example solution.]
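
The sketch below checks a given match selection against the three conditions. It rests on one assumption about the slide's notation: inside each clause, `!` is negation, a sum is OR, and a product is AND, so each clause is an implication. It uses plain enumeration over the node variables instead of an MTBDD, so it illustrates the constraints only, not the authors' solver.

```python
from itertools import product

def implies(a, b):
    """Clause (!a + b) read as: a implies b."""
    return (not a) or bool(b)

def satisfies(m, n):
    """Evaluate F1 * F2 * F3 for match bits m1..m7 and node bits n1..n4."""
    m1, m2, m3, m4, m5, m6, m7 = m
    n1, n2, n3, n4 = n
    f1 = (implies(n1, m1 or m5 or m6) and implies(n2, m3 or m4)
          and implies(n3, m3) and implies(n4, m4))
    f2 = bool(n1 and n2)
    # Clauses of the form (!m + 1) for m3, m4, m7 are always true and omitted.
    f3 = (implies(m1, n3 and n4) and implies(m2, n4)
          and implies(m5, n4) and implies(m6, n3))
    return f1 and f2 and f3

def node_witnesses(m):
    """All node assignments that make the match selection m feasible."""
    return [n for n in product((0, 1), repeat=4) if satisfies(m, n)]

example = (0, 1, 0, 1, 1, 0, 0)   # m2, m4, m5 selected, as in the example solution
witnesses = node_witnesses(example)
print(witnesses, sum(example))
```

With every instruction costing one cycle, the cycle cost of a selection is simply the number of selected matches, which is what `sum(example)` reports.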
11
Code Generation by FSM
An efficient code generator is needed for complex architectures. (Traditional code generators produce poor-quality code.)
A code generator based on a CGFSM (Code Generation FSM) has been developed.
12
Code Generation Flow by FSM
Inputs:
DFG
Rule File, e.g.
ld(addr) -> reg resource read_port ld reg,(addr)
st(addr) -> reg resource write_port st (addr),reg
plus(reg,reg) -> reg resource ALU add reg,reg,reg

Steps:
CGFSM Generation (produces the CGFSM)
Symbolic Analysis of CGFSM

Output:
Assembly code, e.g.
ld r1,(addr)
ld r2,(addr)
add r3,r1,r2
st (addr),r3
13
Variables used in CGFSM

For this simple example DFG,
Input variables are (m1,m2,m3,m4).
State variables are (o1,reg, o2,reg, o3,reg, c4).

$m_i = 1$ when executing match $m_i$.
$o_{i,reg} = 1$ when the operand of node $i$ is available.
$c_i = 1$ when node $i$ is covered.
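
One way to picture how code could be read off such an FSM is the sketch below. It is a simplified illustration under several assumptions that are not stated in the slides: nodes 1 and 2 are loads, node 3 is an add, and node 4 is a store (inferred from the example assembly code); each match covers exactly one node; an operand stays available once produced; and registers are fixed rather than allocated. It also replaces symbolic analysis with an explicit breadth-first search, which is practical only for very small state spaces.

```python
from collections import deque

# Hypothetical rule table: (match, instruction, required state bits, bit it sets).
# State bit order follows (o1,reg, o2,reg, o3,reg, c4).
RULES = [
    ("m1", "ld r1,(addr)", (), 0),
    ("m2", "ld r2,(addr)", (), 1),
    ("m3", "add r3,r1,r2", (0, 1), 2),
    ("m4", "st (addr),r3", (2,), 3),
]

def step(state, rule):
    """Apply one match; return the next state, or None if it is not enabled."""
    _, _, needs, sets = rule
    if state[sets] or not all(state[i] for i in needs):
        return None
    nxt = list(state)
    nxt[sets] = 1
    return tuple(nxt)

def shortest_code(initial=(0, 0, 0, 0)):
    """Breadth-first search for the shortest instruction sequence reaching c4 = 1."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, code = queue.popleft()
        if state[3]:
            return code
        for rule in RULES:
            nxt = step(state, rule)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, code + [rule[1]]))
    return None

code = shortest_code()
print("\n".join(code))
```

Because breadth-first search visits states in order of path length, the first path that reaches a state with c4 = 1 uses the fewest instructions under this one-instruction-per-transition model.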

14
State Transitions of CGFSM
[Figure: state transition graph of the CGFSM for the example DFG with nodes n1 to n4 (ld and st operations), from the Initial State to the Final State. Legend: input variable (m1,m2,m3,m4); state variable (o1,reg, o2,reg, o3,reg, c4).]
15
Experimental Results

Rijndael encryption functions are used as the benchmark.
(sw) - Implementation on a general-purpose processor (MIPS).
(ours) - Result for the field modifiable architecture.
(vliw) - Result for a conventional VLIW architecture.

Area cost is not considered.
All instructions are assumed to take 1 cycle.

The proposed architecture is 2.9 times faster than MIPS and 1.5 times faster than VLIW.
16
Conclusion and Future Work

Conclusion
Field modifiable architecture
Specialized instruction generation
FSM-based code generation
A recent encryption program (Rijndael) was successfully mapped to the field modifiable architecture.