Title: Resource Awareness FPGA Design Practices for Reconfigurable Computing: Principles and Examples
1 Resource Awareness FPGA Design Practices for Reconfigurable Computing: Principles and Examples
- Wu, Jinyuan
- Fermilab, PPD/EED
- April 2007
2 Introduction
- Short Course (1/2 day)
  - How to Design Compact FPGA Functions
  - Resource awareness design practices.
  - http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/CompactFPGAdesign.pdf
- Refresher Course (45 min)
  - Resource Saving in Micro-Computer Software and FPGA Firmware Designs
  - http://www-ppd.fnal.gov/EEDOffice-W/Projects/ckm/comadc/ResourceSaving.ppt
- This Document
  - Resource Awareness FPGA Design Practices for Reconfigurable Computing: Principles and Examples
What can be done with an FPGA?
3 Example: ADC Using FPGA
[Figure: block diagram with labels FPGA, AMP Shaper, ADC]
- Analog signals from AMP Shapers are fed directly to FPGA pins.
- FPGA outputs and a passive RC network are used to generate a ramping reference voltage VREF.
- The input voltages and VREF are compared using FPGA differential input receivers.
- The times of transitions, which represent the input voltage values, are digitized by TDC blocks in the FPGA.
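The ramp-compare scheme described in the bullets above can be sketched numerically. The following is a minimal model, not the actual firmware: it assumes an ideal single-pole RC charging curve for VREF, and the drive voltage (2.5 V), time constant (100 ns), and TDC bin width (0.69 ns) are illustrative values chosen for this sketch. All function names are hypothetical.

```python
import math

def vref(t, v_drive=2.5, tau=100e-9):
    """Ideal RC charging curve: reference voltage at time t after the ramp starts."""
    return v_drive * (1.0 - math.exp(-t / tau))

def crossing_time(v_in, v_drive=2.5, tau=100e-9):
    """Time at which VREF crosses v_in, i.e. when the comparator output flips."""
    if not 0.0 <= v_in < v_drive:
        raise ValueError("input must lie inside the ramp range")
    return -tau * math.log(1.0 - v_in / v_drive)

def tdc_code(t, lsb=0.69e-9):
    """Digitize a transition time into an integer number of TDC bins."""
    return int(t // lsb)

def code_to_voltage(code, lsb=0.69e-9, v_drive=2.5, tau=100e-9):
    """Convert a TDC code back to a voltage, using the bin centre as the time estimate."""
    return vref((code + 0.5) * lsb, v_drive, tau)

v_in = 1.0                             # assumed input voltage in volts
code = tdc_code(crossing_time(v_in))   # time of transition, in TDC bins
v_rec = code_to_voltage(code)          # voltage recovered from the time code
```

With these assumed values, the recovered voltage differs from the input by less than one ramp step per TDC bin. Because the RC ramp is not linear, the voltage resolution is finer late in the ramp than at its start.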
[Figure: block diagram with labels AMP Shaper, ADC, FPGA, TDC, and a VREF network with components R1, R1, R2, C]
4 TDC Inside FPGA
[Figure: TDC block diagram. Stages: Multiple Sampling, Clock Domain Changing, Trans. Detection & Encode, Coarse Time Counter (4 Ch). Signal labels: c0, c90, c180, c270; Q0 to Q3; QD, QE, QF; DV; T0, T1; TS.]
- Sampling rate: 360 MHz x 4 phases = 1.44 GHz.
- LSB: 0.69 ns.
- Logic elements with critical timing are assigned as shown.
- Logic elements with non-critical timing are freely placed by the fitter of the compiler.
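The multiple-sampling idea can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions (ideal sampling, a single rising edge, no metastability); the function names are hypothetical and are not taken from the design. The 360 MHz clock and four phases come from the slide.

```python
F_CLK = 360e6             # sampling clock frequency from the slide
PHASES = 4                # clock phases c0, c90, c180, c270
T_CLK = 1.0 / F_CLK       # clock period in seconds
LSB = T_CLK / PHASES      # time bin, about 0.69 ns

def sample_phases(t_edge, coarse):
    """Four samples taken during clock period `coarse` of a signal rising at t_edge."""
    times = [(coarse + k / PHASES) * T_CLK for k in range(PHASES)]
    return [1 if t >= t_edge else 0 for t in times]

def tdc(t_edge, n_periods=16):
    """Detect the first 0 -> 1 transition; return (coarse count, fine phase index)."""
    prev = 0
    for coarse in range(n_periods):
        for k, q in enumerate(sample_phases(t_edge, coarse)):
            if q and not prev:
                return coarse, k
            prev = q
    return None  # no transition seen within n_periods

def to_time(coarse, fine):
    """Reconstruct the measured time from the coarse counter and the fine phase index."""
    return (coarse * PHASES + fine) * LSB
```

For example, `tdc(5.0e-9)` returns `(2, 0)`: the edge is first seen by the c0 sample of the third clock period, and `to_time(2, 0)` is within one LSB of the true edge time.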
5 ADC Test: Waveform Digitization on BD3_19
A lot can be done with an FPGA if one can imagine.
[Figure: test setup with FPGA, TDC blocks, and VREF network (component labels: 50, 50, 100, 1000 pF). Panels: Input Waveform, Overlap Trigger & Reference Voltage; Raw Data; Converted.]
6 Micro-computing vs. Reconfigurable Computing
[Figure: evaluation of an arithmetic expression on the data 100, 3, 4, 5, 7. CPU: Data and Program, with a control sequence beginning LD, (-). FPGA: Data, Program, and Configuration.]
- In a microprocessor, users specify a program that runs on fixed logic circuits.
- In an FPGA, users specify the logic circuits (as well as the program).
- FPGA computing need not follow microprocessor architectures. (But useful experience can be borrowed.)
- The usefulness of FPGA reconfigurable computing is still to be fully appreciated.
7 Example: Track Fitting
8 Relative Errors of Several Track Fitter Schemes
Least Square Fitter
Multiplier-less FPGA LS Fitter
9 Least Square Fitter
[Figure: three coefficient sets c1 to c7, d1 to d7, e1 to e7 and the hit coordinates y1 to y7 feeding three multiplier (X) and accumulator (S) pairs]
- The parameters can be described as inner products.
- Hit coordinates and coefficients are fed simultaneously.
- The inner products can be calculated with multiplier-accumulator structures.
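The inner-product form can be sketched for the simplest case. The example below assumes a straight-line fit with only two parameters (offset and slope) on seven equally spaced planes; the slide shows three coefficient sets, so the actual fitter presumably has three parameters. The function names and hit positions are hypothetical.

```python
def linear_fit_coefficients(xs):
    """Precompute weights so that offset = sum(c_i * y_i) and slope = sum(d_i * y_i)."""
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    d = [(x - x_mean) / sxx for x in xs]        # slope weights
    c = [1.0 / n - x_mean * di for di in d]     # offset weights
    return c, d

def mac_fit(ys, c, d):
    """One multiply-accumulate per parameter for each hit coordinate fed in."""
    offset = 0.0
    slope = 0.0
    for y, ci, di in zip(ys, c, d):
        offset += ci * y
        slope += di * y
    return offset, slope

xs = [1, 2, 3, 4, 5, 6, 7]               # assumed detector plane positions
c, d = linear_fit_coefficients(xs)       # computed once, stored as constants
ys = [2.0 + 0.5 * x for x in xs]         # hits on an exact straight line
offset, slope = mac_fit(ys, c, d)
```

Because the plane positions are fixed, the weights depend only on the geometry and can be stored as constants. The per-track work is then just the accumulation loop, which is the part that maps onto multiplier-accumulator hardware.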
10 Multiplier-less (ML) Quasi-Least Square Fitter
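[Figure: plane positions x1 to x7 and hit coordinates y1 to y7 feeding three shifter (<<), adder/subtractor (+/-), and accumulator (S) chains; coefficient terms shown: 1, -1, -16, 4, 8, 128]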
- The coefficients are described as two-bit numbers (sums or differences of two powers of two), e.g.:
  - 5 = 4 + 1, 7 = 8 - 1, 112 = 128 - 16
- The multiplication is replaced with two shift and add/sub operations.
- There are two clock cycles to fetch a measurement point (i.e., y1, y2, etc.), allowing two shift and add/sub operations.
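The shift and add/sub replacement for multiplication can be written out as a short sketch. The representation of a coefficient as a list of (sign, shift) pairs is an assumption made for this illustration, as are the names; only the three decompositions come from the slide.

```python
def shift_add_multiply(y, terms):
    """Multiply integer y by a coefficient given as signed powers of two.

    `terms` is a sequence of (sign, shift) pairs; each pair stands for one
    shift and add/sub operation, i.e. one clock cycle in the scheme above.
    """
    acc = 0
    for sign, shift in terms:
        if sign >= 0:
            acc += y << shift
        else:
            acc -= y << shift
    return acc

COEFF_5 = [(+1, 2), (+1, 0)]      # 5 = 4 + 1
COEFF_7 = [(+1, 3), (-1, 0)]      # 7 = 8 - 1
COEFF_112 = [(+1, 7), (-1, 4)]    # 112 = 128 - 16

products = [shift_add_multiply(13, coeff) for coeff in (COEFF_5, COEFF_7, COEFF_112)]
```

Each product uses exactly two operations, which matches the two clock cycles available per measurement point. Coefficients that cannot be written with two terms have to be rounded to a nearby value that can, which is why the fitter is "quasi" least square.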
11 Inaccuracy Doesn't Matter, a Lot of the Time
Multiplier-less Quasi-Least Square FPGA Fitter
Least Square Fitter
12 Fitting is easy. Matching hits is harder.
13 Resource Saving Tricks
Loop Reduction Tricks: The number of computations in a given task is reduced by (1) using fewer iterations in loops and/or (2) using fewer operations in each iteration.
Non-Loop Reduction Tricks: The number of computations in a given task is unchanged. FPGA resources are saved by (1) reusing the resources multiple times via sequencing and/or (2) using transistor-saving resources such as RAM.
14 Resource Saving Tricks: Loop-Reduction
Recursive Implementation of FIR Filter
Tiny Triplet Finder O(n)O(Nlog(N))
Multiplier-less (ML) Approaches
FFT O(n)O(log(N))
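The slide names the recursive implementation of an FIR filter without spelling it out. One common instance, assumed here for illustration, is a moving-sum filter with equal taps: instead of adding all taps for every output, each output is derived from the previous one with one addition and one subtraction. The function names are hypothetical.

```python
def moving_sum_direct(x, n_taps):
    """Direct form: about n_taps additions for every output sample."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - n_taps + 1)
        out.append(sum(x[lo:i + 1]))
    return out

def moving_sum_recursive(x, n_taps):
    """Recursive form: y[i] = y[i-1] + x[i] - x[i-n_taps], two operations per output."""
    out = []
    acc = 0
    for i, sample in enumerate(x):
        acc += sample                # newest sample enters the window
        if i >= n_taps:
            acc -= x[i - n_taps]     # oldest sample leaves the window
        out.append(acc)
    return out

samples = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
direct = moving_sum_direct(samples, 4)
recursive = moving_sum_recursive(samples, 4)
```

Both functions produce the same output sequence, but the work per output in the recursive form does not grow with the number of taps. This is an example of using fewer operations in each iteration.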
15 Resource Saving Tricks: Non-Loop-Reduction
Sequencing
Using RAM: Hash Sorter/Histogram
[Figure: sequencing diagram with blocks Initialization, Initialization 1 to 3, and repeated operation blocks OP1 to OP4]
16 An Example of Inexplicit Computing and Hidden Resource
[Figure: RAM with data input D, write address WA, read address RA, W/R control, and Input Ctrl block; labels 16 and 32]
- Data with random time stamps are re-ordered according to beam crossing (BCO).
- Data with the same BCO are output together, and the bandwidth becomes smaller.
- Inexplicit computing (sorting) is performed with a hidden resource (RAM; it should be static RAM, not dynamic RAM).
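The re-ordering described above can be sketched as a hash sorter: the BCO number selects a RAM bin directly, so no comparisons between data items are needed. This is a software model under assumed parameters; the number of bins, the bin depth, and the policy of dropping hits that overflow a bin are choices made for this illustration, not details taken from the design.

```python
def hash_sort(hits, n_bins=8, depth=4):
    """Group hits by beam crossing number using the BCO as a RAM address.

    `hits` is an iterable of (bco, data) pairs in arrival order.
    Returns one list per bin, preserving arrival order within a bin.
    """
    ram = [[None] * depth for _ in range(n_bins)]   # storage, one row per BCO bin
    count = [0] * n_bins                            # write pointer for each bin
    for bco, data in hits:
        b = bco % n_bins                            # the "hash": BCO selects the bin
        if count[b] < depth:                        # hits beyond the bin depth are dropped
            ram[b][count[b]] = data
            count[b] += 1
    return [ram[b][:count[b]] for b in range(n_bins)]

arrivals = [(3, "a"), (1, "b"), (3, "c"), (0, "d"), (1, "e")]
bins = hash_sort(arrivals)
```

Reading the bins out in address order delivers all data of one BCO together, so the sorting is a side effect of how the memory is addressed.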
17 Why Save Resources?
Why not?
18 The Fever of Moore's Law vs. Maxwell's Equations
[Figure: Op/sec vs. year, 1998 to 2010; annotations: WRW; MIT, 2002]
- During the hot days of Moore's Law, the rules of thumb are:
  - BRB: Buy Rather than Build
  - URU: Use Rather than Understand
  - WRW: Wait Rather than Work
- From fundamental principles like Maxwell's Equations, it is known that limits to Moore's Law exist. Technology advances should come from:
  - The I3 Law: Imagination, Innovation & Implementation.
19 Total Useful Work = (Clock Frequency) x (Silicon Size) x (Efficiency)
[Figure: diagram with labels E, F, S and the annotation "Primarily User's Responsibility"]
- There is plenty of room for improvement in computation efficiency in both micro-computer software and FPGA firmware.
- Resource awareness saves not only direct cost but also indirect costs such as power consumption, PC board layout, cooling, etc.
- Unnecessary artificial complexities confuse people, often including the designer.
- Resource saving helps today when technology stalls.
- Resource saving helps in the future as technology progresses.