Title: FPGA Power Reduction Using Configurable Dual-Vdd
1FPGA Power Reduction Using Configurable Dual-Vdd
- Fei Li, Yan Lin and Lei He
- EE Department, UCLA
- Partially supported by NSF CAREER award
CCF-0401682 and NSF grant CCR-0306682. Address
comments to lhe_at_ee.ucla.edu.
2Outline
- Background and Motivation
- Configurable Dual-Vdd FPGA
- Circuits
- Architectures
- Design Flow
- Experimental Results
- Conclusions
3Power Limitation of FPGAs
- Existing FPGAs are HIGHLY power inefficient
- Over 100X power overhead vs. ASIC [Kusse, ISLPED98]
- Power is likely the largest limitation for FPGAs
4Research to Reduce FPGA Power
Design of FPGA circuits and architectures
Quantitative Evaluation
System design and synthesis targeting new FPGA
circuits and fabrics
5FPGA Power Optimization
- Power efficient FPGA circuits and architectures
- Pre-defined dual-Vdd fabric [Li et al, FPGA04]
- Power-gating unused blocks [Gayasen et al, FPGA04]
- Power aware FPGA CAD algorithms for existing FPGA architectures
- CAD algorithms to minimize power-delay product [Lamoureux et al, ICCAD03]
- Configuration inversion for leakage reduction [Anderson et al, FPGA04]
6Review of Pre-defined Dual-Vdd
- FPGA fabric with two types of logic blocks
- VddH, VddL blocks
- Vdd level is pre-determined for each logic block
- New architectural parameters
- Dual-Vdd layout pattern
- Ratio between number of VddH blocks and number of VddL blocks
7Limitation of Pre-defined Dual-Vdd
- Architecture comparison
- Pre-defined dual-Vdd fabric introduces
- Extra layout constraint due to Vdd matching
- Extra interconnect delay and power
[Figure: total power (watt) vs. clock frequency (MHz) for circuits s38584 and alu4]
Configurable Dual-Vdd: Vdd programmability is required
8Outline
- Background and Motivation
- Configurable Dual-Vdd FPGA
- FPGA Circuits
- FPGA Architectures
- Design Flow
- Experimental Results
- Conclusions
9Main Idea of Configurable Dual-Vdd
- Three types of logic blocks
- H-block (VddH), L-block (VddL)
- P-block (Vdd-programmable)
- Vdd selection
- Power-gating unused blocks (two-bit control)
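A minimal sketch of how two configuration bits could select among the VddH, VddL, and power-gated states of a P-block. The slide states only that the control uses two bits; the encoding (one bit per power switch) and all names below are assumptions made for illustration.

```python
from enum import Enum


class PBlockState(Enum):
    VDD_H = "VddH"
    VDD_L = "VddL"
    GATED = "power-gated"


def decode_pblock(h_switch_on: bool, l_switch_on: bool) -> PBlockState:
    """Map two hypothetical configuration bits (one per power switch) to a state."""
    if h_switch_on and l_switch_on:
        # Assumed to be illegal: it would connect both supply rails to the block.
        raise ValueError("both supplies enabled: invalid configuration")
    if h_switch_on:
        return PBlockState.VDD_H
    if l_switch_on:
        return PBlockState.VDD_L
    # Both switches off: the unused block is power-gated.
    return PBlockState.GATED
```

Under this assumed encoding, an unused block is simply left with both bits cleared.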
10P-Block Design
- Power switch is similar to sleep transistor
- Normal Vt (instead of high Vt) is used for the power switch
- High Vt reduces leakage, but increases area to meet the bound on delay increase
11Gate-boosting Power Switches
- Gate-boosting is used in existing FPGAs
- Increase Vg to Vdd + Vt
- Compensate logic 1 degradation of NMOS pass transistor
- Gate-boosting for PMOS power switches
- Applied during power-gating mode
- Leakage reduced by two orders of magnitude
[Figure: pass transistor with gate voltage Vg; labels: Vdd, weak 1, strong 1]
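The logic 1 degradation that gate-boosting compensates can be illustrated with a first-order model: an NMOS pass transistor passes a high level of at most Vg - Vt. The sketch below uses that textbook approximation (it ignores body effect), and the voltage values are illustrative, not taken from the slides.

```python
def pass_transistor_high(vg: float, vdd: float, vt: float) -> float:
    """First-order output level when an NMOS pass transistor passes logic 1."""
    # The output cannot exceed the input (Vdd) or rise above Vg - Vt.
    return min(vdd, vg - vt)


vdd, vt = 1.0, 0.25  # illustrative values only

# Gate driven at Vdd: the output is a "weak 1", one Vt below Vdd.
weak_one = pass_transistor_high(vg=vdd, vdd=vdd, vt=vt)

# Gate boosted to Vdd + Vt: the output reaches full Vdd, a "strong 1".
strong_one = pass_transistor_high(vg=vdd + vt, vdd=vdd, vt=vt)
```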
12Low-leakage Configuration Cells
- Extra configuration bits are needed for Vdd-programmability
- Their power overhead is counted
- FPGA configuration cells during normal operation
- Only leakage power is consumed
- Read/write delay is irrelevant to runtime performance
- High-Vt SRAM cells for leakage reduction
- 100nm technology: Vt(NMOS) = 0.4693 V, Vt(PMOS) = -0.5454 V
- 15X SRAM leakage reduction
- 13% increase of configuration time but no runtime performance loss
13Level Converter
- Single supply level converter [Puri et al, DAC03]
- Needed only when a VddL block drives a VddH block
- Avoid excessive leakage power in direct connection
- No level converter is needed when a VddH block drives a VddL block
100nm technology
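The level-converter rule above reduces to a single check per connection. A minimal sketch, assuming each block's supply level is represented by the string "H" or "L" (a representation chosen here for illustration):

```python
def needs_level_converter(driver_vdd: str, sink_vdd: str) -> bool:
    """True only when a VddL block drives a VddH block."""
    return driver_vdd == "L" and sink_vdd == "H"


def count_level_converters(connections: list[tuple[str, str]]) -> int:
    """Count connections (driver_vdd, sink_vdd) that need a level converter."""
    return sum(needs_level_converter(d, s) for d, s in connections)
```

For example, the connections `[("L", "H"), ("H", "L"), ("L", "L")]` need one converter under this rule.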
14Fabrics Using Configurable Dual-Vdd
- Interleaved dual-Vdd layout pattern
- Interleaved sequence in each row (H-block, L-block, P-block)
- Ratio H-block/L-block/P-block is pre-determined
- 100% P-blocks
- Full Vdd-programmability for logic blocks
[Figure legend: level converter, H-block, L-block, P-block]
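A small sketch of how a row of the fabric could be generated from a pre-determined H/L/P ratio. The slides do not specify the exact ordering inside a row; repeating the unit "H...L...P..." is an assumption made here, and `row_pattern` is a hypothetical helper.

```python
from itertools import cycle, islice


def row_pattern(ratio_h: int, ratio_l: int, ratio_p: int, width: int) -> str:
    """Return one row of block types, repeating an assumed H, L, P unit."""
    unit = "H" * ratio_h + "L" * ratio_l + "P" * ratio_p
    if not unit:
        raise ValueError("at least one ratio must be positive")
    return "".join(islice(cycle(unit), width))


partial = row_pattern(1, 1, 3, 10)  # H/L/P ratio of 1/1/3
full = row_pattern(0, 0, 1, 4)      # only P-blocks: fully Vdd-programmable
```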
15Outline
- Background and Motivation
- Configurable Dual-Vdd FPGA
- FPGA Circuits
- FPGA Architectures
- Design Flow
- Experimental Results
- Conclusions
16Overall Design Flow
Tech Mapped Netlist (Single-Vdd)
Timing Driven Layout (Single-Vdd)
Dual-Vdd Assignment for Cells
H/L/P
Timing Driven Layout (Dual-Vdd)
Power-gating Unused Blocks
Local Refinement
Delay/Power Estimation
Delay
Power
17Placement Considering Configurable Dual-Vdd
- Simulated annealing similar to that in VPR
- Cost function
- Vdd-matching function
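The cost and Vdd-matching formulas from this slide are not present in the text, so the following is only a hypothetical illustration of the idea, not the authors' formulation. It assumes a matching penalty of 1 whenever a cell's assigned Vdd differs from the fixed Vdd of the block it is placed on, that a P-block matches either assignment, and that the penalty is added to a base placement cost with an assumed weight.

```python
def vdd_match(cell_vdd: str, block_type: str) -> int:
    """Assumed penalty: 0 if the block can supply the cell's Vdd, else 1."""
    if block_type == "P":
        return 0  # a Vdd-programmable block can be configured either way
    return 0 if cell_vdd == block_type else 1


def placement_cost(base_cost: float,
                   placement: list[tuple[str, str]],
                   weight: float = 1.0) -> float:
    """Base (e.g. timing and wiring) cost plus a weighted mismatch count.

    placement holds (cell_vdd, block_type) pairs; weight is a hypothetical knob.
    """
    mismatches = sum(vdd_match(cell, block) for cell, block in placement)
    return base_cost + weight * mismatches
```

In a simulated-annealing placer, a term of this kind would steer moves toward locations whose block type agrees with the cell's Vdd assignment.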
18Local Refinement
- Flip the Vdd configuration of a P-block
- VddH -> VddL if critical path delay is not affected
- VddL -> VddH if it is on critical path
Local Refinement increases number of VddL P-blocks by 6.52% and reduces critical path delay by 4.23%
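The VddH -> VddL half of local refinement can be sketched as a greedy pass over a timing graph. This is an illustration under simplifying assumptions, not the authors' algorithm: every block is treated as a P-block, each block has one delay per Vdd level, interconnect delay is ignored, and blocks are visited in name order. The VddL -> VddH direction is not shown.

```python
from graphlib import TopologicalSorter


def critical_delay(preds: dict, delay: dict) -> float:
    """Longest path delay in a DAG; preds maps node -> set of predecessors."""
    arrival = {}
    for node in TopologicalSorter(preds).static_order():
        latest_input = max((arrival[p] for p in preds.get(node, ())), default=0.0)
        arrival[node] = latest_input + delay[node]
    return max(arrival.values())


def refine(preds: dict, vdd: dict, delay_h: dict, delay_l: dict) -> dict:
    """Flip blocks from VddH to VddL while the critical delay does not grow."""
    vdd = dict(vdd)  # work on a copy

    def delays() -> dict:
        return {n: delay_l[n] if vdd[n] == "L" else delay_h[n] for n in vdd}

    target = critical_delay(preds, delays())
    for node in sorted(vdd):
        if vdd[node] != "H":
            continue
        vdd[node] = "L"
        if critical_delay(preds, delays()) > target:
            vdd[node] = "H"  # the flip lengthened the critical path: revert
    return vdd


# Tiny example: blocks a and b both feed block c.
preds = {"a": set(), "b": set(), "c": {"a", "b"}}
delay_h = {"a": 1.0, "b": 2.0, "c": 1.0}
delay_l = {"a": 1.5, "b": 3.0, "c": 1.5}
result = refine(preds, {"a": "H", "b": "H", "c": "H"}, delay_h, delay_l)
```

In this example only block a has enough slack to move to VddL; b and c lie on the critical path and stay at VddH.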
19Experimental Settings
- Comparison between four cases
- arch-SV (architecture with single Vdd scaling)
- arch-DV with 100% P-blocks (logic fabric is fully Vdd-programmable)
- arch-DV with H/L/P = 1/1/3 (logic fabric is partially Vdd-programmable)
- ideal-DV (ideal case without delay and power overhead for Vdd-programmability)
20Power vs. Performance for alu4
- Achieved 15.83% total power saving compared to single-Vdd
- At highest frequency
[Figure: total power (watt) vs. clock frequency (MHz) for alu4]
21Power vs. Performance for s38584
- arch-DV (100% P-blocks) beats ideal case due to power gating
[Figure: total power (watt) vs. clock frequency (MHz) for s38584]
22Power Reduction by Programmable Vdd
- Dual-Vdd fabric with full Vdd-programmability for logic blocks
- 35.46% logic power saving
- 14.29% total power saving
23Power Reduction by Partial Programmability
- 28.62% logic power saving (vs. 35% for full programmability)
- 9.04% total power saving (vs. 14% for full programmability)
24Power Saving by Power-gating
- FPGA chip is customized for each benchmark circuit
- Smallest FPGA that just fits the given circuit
- Much lower logic block utilization rate in reality
- Power-gating is more effective
25Ongoing Work: Extension to Interconnects
- Programmable Vdd can also be applied to routing switches
- Vdd selection and power gating unused buffers
- Configurable level conversion
- Over 50% total FPGA power reduction
26Conclusions and Discussions
- Vdd-programmability is needed for dual-Vdd FPGAs to obtain a desired power-performance tradeoff
- Configurable dual-Vdd reduces logic power by around 35%
- Consider power overhead for Vdd programmability
- Ongoing work
- Dual-Vdd interconnect fabric
- Over 50% total power reduction
- Power delivery network to support configurable Vdd