Title: Load-Sensitive Flip-Flop Characterization
1Load-Sensitive Flip-Flop Characterization
- Seongmoo Heo and Krste Asanovic
- MIT Laboratory for Computer Science
- http://www.cag.lcs.mit.edu/scale
- WVLSI 2001
- April 19, 2001
2Motivation
- Flip-flops are one of the most important components in synchronous VLSI designs.
  - Critical effect on cycle time
  - Large fraction of total system power
- Previously published work has failed to consider the effect of circuit loading on the relative ranking of flip-flop structures.
  - [Kawaguchi et al. 98], [Ko and Balsara 00], [Kong et al. 00], [Lang et al. 97], [Nikolic et al. 00], [Nogawa and Ohtomo 98], [Stojanovic and Oklobdzija 99], [Stollo et al. 00], [Yuan and Svensson 91], [Zyuban and Kogge 99], [H.P. et al. 96], [J.M. et al. 96]
  - Fixed and usually overly large output load
  - Large or non-specified input drive
  - No output buffering
3Observation
- 1. Different flip-flop designs have different inherent parasitics and output drive strength.
  - Different number and complexity of logic gates
  - Different kinds of feedback
5Observation
- 2. Output loads in a circuit vary significantly.
[Figure: histogram of flip-flop output load instances in a microprocessor datapath (a custom-designed 32-bit MIPS CPU in a 0.25 µm process). Vertical axis: # of instances, 0 to 120. Marked loads: 1.8 fF (min inv gate cap), 7.2 fF (4 min inv gate cap), 28.8 fF (16 min inv gate cap), 115.2 fF (64 min inv gate cap).]
6Our Proposal
- Load effects must be considered in flip-flop characterization to avoid sub-optimal selection.
- We will present energy and delay measurements for various flip-flops across a range of output loading conditions (EE and absolute load size) and show that the relative rankings of structures vary.
- We will show that output buffering at high load can lead to better performance and lower energy consumption for some structures.
7Related Work
- Traditional Buffer Sizing
- Logical Effort [Sutherland and Sproull]
  - Logical Effort (LE): drive strength of a circuit structure
  - Electrical Effort (EE): the ratio of output load to input load
  - Delay = intrinsic parasitic delay + LE × EE
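As a minimal sketch of the delay model above (not the authors' code), the function below evaluates delay = parasitic + LE × EE in normalized units. The function name, parameter names, and example values are illustrative assumptions; the inverter values (LE = 1, parasitic = 1) follow the usual logical-effort convention.

```python
def stage_delay(parasitic, logical_effort, c_out, c_in):
    """Normalized delay of one stage: parasitic + LE * EE, with EE = c_out / c_in."""
    electrical_effort = c_out / c_in
    return parasitic + logical_effort * electrical_effort

# A minimum inverter (assumed LE = 1, parasitic = 1) driving four copies of itself.
fo4 = stage_delay(parasitic=1.0, logical_effort=1.0, c_out=4.0, c_in=1.0)
print(fo4)  # 5.0
```

The same structure with a larger parasitic term but a smaller LE would be slower at low EE and faster at high EE, which is why the load matters when comparing structures.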
8Overview
- Flip-Flop Designs
- Test Bench Simulation Setup
- Delay and Energy Characterization
- Delay Analysis
- Energy-versus-Delay Analysis
- Summary
9Flip-Flop Designs
Fully static and single-ended
Nikolic et al 00
10Test Bench
- Sized clock buffer to give equal rise/fall time
- Used a fixed, realistic input driver
- Varied output load from 4 min inv cap (7.2 fF) to 64 min inv cap (115.2 fF)
- 4 Load and Drive Configurations
  - EE4-min: min input drive, 4 min inv load (7.2 fF)
  - EE16-min: min input drive, 16 min inv load (28.8 fF)
  - EE64-min: min input drive, 64 min inv load (115.2 fF)
  - EE4-big: 16x min input drive, 64 min inv load (115.2 fF)
[Figure: test bench schematic; the flip-flop (FF) drives an output load of 4, 16, or 64 min inv cap.]
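The four configurations can be tabulated with a short sketch. It reads each configuration name as the ratio of output load to input drive, which is an interpretation of the naming rather than something the slides state outright; the 1.8 fF minimum inverter gate capacitance is taken from the slides, and the variable and function names are illustrative.

```python
MIN_INV_CAP_FF = 1.8  # gate capacitance of a minimum inverter in fF (from the slides)

# (name, input drive in multiples of min drive, output load in min inverter caps)
CONFIGS = [
    ("EE4-min", 1, 4),
    ("EE16-min", 1, 16),
    ("EE64-min", 1, 64),
    ("EE4-big", 16, 64),
]

def describe(name, drive, load):
    """Return (name, electrical effort = load / drive, absolute load in fF)."""
    return name, load / drive, round(load * MIN_INV_CAP_FF, 1)

table = [describe(*cfg) for cfg in CONFIGS]
for name, ee, load_ff in table:
    print(f"{name:9s} EE = {ee:4.0f}  load = {load_ff:6.1f} fF")
```

Under this reading, EE4-min and EE4-big share an electrical effort of 4 but differ by 16x in absolute load, so comparing them separates the effect of EE from the effect of absolute load size.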
11Simulation Setup
- 0.25 µm TSMC CMOS process, Vdd = 2.5 V, T = 25 °C
- The Hspice Levenberg-Marquardt method was used for transistor size optimization.
  - Transistor widths were optimized for each load and drive configuration
    - to give min delay or min energy for a given delay
    - (transistor lengths were fixed at minimum.)
- Parasitic capacitances were included in the circuit netlists.
12Delay and Energy Characterization
- Minimum D-Q delay [Stojanovic et al. 99] (.Measure command)
- Total energy = input energy + internal energy + clock energy - output energy
  - A single test waveform with ungated clock and data toggling every cycle
  - For a full characterization of energy dissipation, more realistic activity patterns should be considered [Heo, Krashinsky, Asanovic ARVLSI01].
[Figure: test bench; the flip-flop (FF) drives a 4, 16, or 64 min inv load.]
13Speed Ranking Without Buffering
(Transistors sized at each load point, but only for min delay)
- Delay = const. intrinsic parasitic delay + output drive delay (∝ load size / driving capability)
- Driving capability = f(# of stages, complexity)
[Figure: speed ranking without buffering for PPCFF, SAFF, MSAFF, HLFF, and SSAPL.]
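The delay decomposition on this slide implies that rankings can flip as the load grows. The sketch below illustrates this with two made-up flip-flops; their names and parameter values are hypothetical, chosen only to produce a crossover, and the units are arbitrary. They are not measurements from the talk.

```python
# Hypothetical parameters (arbitrary units), chosen only to show a ranking crossover.
FLIP_FLOPS = {
    "ff_fast_weak": {"intrinsic": 2.0, "drive": 1.0},
    "ff_slow_strong": {"intrinsic": 4.0, "drive": 4.0},
}

def delay(ff, load):
    """Delay = constant intrinsic delay + load size / driving capability."""
    return ff["intrinsic"] + load / ff["drive"]

def ranking(load):
    """Flip-flop names ordered from fastest to slowest at the given load."""
    return sorted(FLIP_FLOPS, key=lambda name: delay(FLIP_FLOPS[name], load))

for load in (1, 4, 16, 64):
    print(load, ranking(load))
```

With these numbers the low-parasitic, weak-drive design wins only at the smallest load; the stronger design is faster from a load of 4 upward.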
14Influence of Buffering on Performance
(Assuming no penalty for inverting output. Min. input drive was used.)
[Figure: delay (ns, 0 to 1.5) versus load (min inv cap, 0 to 80) for SAFF, MSAFF, PPCFF, SSAPL, and HLFF; curves for unbuffered, one inverter, and two inverters.]
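Why buffering helps only at higher loads can be sketched with the standard equal-effort-per-stage result from logical effort. This is a simplified model under stated assumptions, not the authors' Hspice methodology: the flip-flop is treated as one stage with a fixed intrinsic delay and drive, each inverter adds an assumed parasitic delay of 1, and output polarity is ignored, matching the slide's no-penalty assumption. All names and values are illustrative.

```python
def buffered_delay(intrinsic, drive, load, n_inverters, p_inv=1.0):
    """Normalized delay with n inverters after the flip-flop, equal effort per stage."""
    stages = n_inverters + 1
    stage_effort = (load / drive) ** (1.0 / stages)
    return intrinsic + n_inverters * p_inv + stages * stage_effort

def best_buffering(intrinsic, drive, load, max_inverters=2):
    """Number of output inverters (0..max_inverters) that minimizes delay."""
    return min(range(max_inverters + 1),
               key=lambda n: buffered_delay(intrinsic, drive, load, n))

for load in (4, 16, 64):
    print(load, best_buffering(intrinsic=2.0, drive=1.0, load=load))
```

In this toy model the preferred option moves from unbuffered to one inverter to two inverters as the load grows from 4 to 16 to 64 min inverter caps.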
15Speed Ranking With Buffering Allowed
- Less speed variation compared to original flip-flops
[Figure: speed ranking with buffering allowed for PPCFF, SAFF, MSAFF, HLFF, and SSAPL.]
16Energy-Delay Curve EE4-min
Each point sized for min energy for a given delay.
[Figure: energy (fJ) versus delay (ns) under EE4-min: min. drive, 4 min inv load (7.2 fF).]
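An energy-delay curve of this kind is the lower envelope of all sizings: for each delay, only the lowest-energy point matters. The sketch below extracts that envelope from a set of (delay, energy) samples. The sample values are hypothetical placeholders, not data from the talk, and the function name is illustrative.

```python
def pareto_front(points):
    """Keep (delay, energy) points that no faster-or-equal point beats on energy."""
    front = []
    best_energy = float("inf")
    for delay, energy in sorted(points):  # increasing delay, then increasing energy
        if energy < best_energy:
            front.append((delay, energy))
            best_energy = energy
    return front

# Hypothetical (delay in ns, energy in fJ) results from different transistor sizings.
samples = [(0.30, 90.0), (0.35, 70.0), (0.40, 75.0), (0.50, 55.0), (0.50, 60.0)]
print(pareto_front(samples))  # [(0.3, 90.0), (0.35, 70.0), (0.5, 55.0)]
```

Comparing two flip-flops then amounts to comparing their envelopes at the delay of interest rather than comparing single sizing points.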
17Energy-Delay Curve EE4-min
[Figure: energy (fJ) versus delay (ns) under EE4-min: min. drive, 4 min inv load (7.2 fF). Labeled curve: PPCFF-unbuf.]
18Energy-Delay Curve EE16-min
[Figure: energy (fJ) versus delay (ns) under EE16-min: min. drive, 16 min inv load (28.8 fF). Labeled curve: SSAPL-unbuf.]
19Energy-Delay Curve EE16-min
[Figure: energy (fJ) versus delay (ns) under EE16-min: min. drive, 16 min inv load (28.8 fF). Labeled curves: SSAPL-unbuf, SSAPL-buf.]
20Energy-Delay Curve EE16-min
[Figure: energy (fJ) versus delay (ns) under EE16-min: min. drive, 16 min inv load (28.8 fF). Labeled curve: HLFF-unbuf.]
21Energy-Delay Curve EE16-min
[Figure: energy (fJ) versus delay (ns) under EE16-min: min. drive, 16 min inv load (28.8 fF). Labeled curves: HLFF-unbuf, HLFF-buf.]
22Energy-Delay Curve EE16-min
[Figure: energy (fJ) versus delay (ns) under EE16-min: min. drive, 16 min inv load (28.8 fF). Labeled curve: PPCFF-unbuf.]
23Energy-Delay Curve EE64-min
[Figure: energy (fJ) versus delay (ns) under EE64-min: min. drive, 64 min inv load (115.2 fF). Labeled curve: MSAFF-unbuf.]
24Energy-Delay Curve EE64-min
[Figure: energy (fJ) versus delay (ns) under EE64-min: min. drive, 64 min inv load (115.2 fF). Labeled curves: HLFF-buf, HLFF-unbuf.]
25Energy-Delay Curve EE4-big
[Figure: energy (fJ) versus delay (ns) under EE4-big: 16x min. drive, 64 min inv load (115.2 fF).]
26Energy-Delay Curve EE4-min vs EE4-big
[Figure: energy-delay curves in two panels, EE4-min and EE4-big. Labeled curves: PPCFF-unbuf, MSAFF-unbuf, SAFF-unbuf.]
27Energy-Delay Curve EE4-min vs EE4-big
[Figure: energy-delay curves in two panels, EE4-min and EE4-big. Labeled curves in each panel: MSAFF-unbuf, PPCFF-unbuf, SAFF-unbuf.]
28Summary
- Different flip-flops have different gains and parasitics.
- Real VLSI designs exhibit a variety of flip-flop output loads.
- The output load size affects the relative performance and energy consumption of different flip-flop designs.
- Therefore, output load effects should be accounted for when comparing flip-flops.
  - Electrical effort
  - Absolute output load size
  - Output buffering