Title: Lecture 13: Partially Reversible Logic and Quasi-Adiabatic Memories
1. Lecture 13: Partially Reversible Logic and Quasi-Adiabatic Memories
- Designs with partially reversible logic
- Supply clock generation
- Low-Power Memory System Design
- Summary
- Michael L. Bushnell
- CAIP Center and WINLAB
- ECE Dept., Rutgers U., Piscataway, NJ
2. Partially Reversible Logic
- Common logic gates (NAND, NOR, XOR) are irreversible
- Theoretical lower bound for an isolated irreversible operation is on the order of kT
- In CMOS, lower bound related to Vth and charges
- Partially reversible logic recovers most of its switching energy
- Based on self-control scheme
3. Diode-Based Logic Family
- Dissipates C Vdd Vth of energy (product of load capacitance, supply voltage, and threshold voltage)
- Uses differential signaling
- Each signal and its complement must be present
- Logical 0 represented by downward pulse on c
- Logical 1 represented by downward pulse on d
- Invert signals by exchanging wires
- Need 4 clock stages
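To give the C Vdd Vth figure some scale, the sketch below compares it with the C Vdd^2 / 2 dissipated when a conventional CMOS gate charges the same load. The capacitance and voltages are illustrative assumptions, not values from this slide, and the function names are hypothetical.

```python
def diode_logic_energy(c_load, vdd, vth):
    """Energy dissipated per operation by the diode-based family: C * Vdd * Vth."""
    return c_load * vdd * vth


def conventional_cmos_energy(c_load, vdd):
    """Energy dissipated when a conventional CMOS gate charges C to Vdd: C * Vdd^2 / 2."""
    return 0.5 * c_load * vdd ** 2


# Assumed illustrative values (not taken from the slide).
C_LOAD = 1e-12  # 1 pF load capacitance
VDD = 5.0       # supply voltage in volts
VTH = 1.0       # threshold voltage in volts

ratio = diode_logic_energy(C_LOAD, VDD, VTH) / conventional_cmos_energy(C_LOAD, VDD)
print(f"diode-based / conventional energy ratio: {ratio:.2f}")
```

Under these assumptions the ratio reduces to 2 Vth / Vdd, so the saving grows as the threshold voltage becomes small relative to the supply.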
4. 2N-2N2D Inverter/Buffer
5. Inverter/Buffer Behavior
- 1st phase (idle): Clk high and outputs (c and d) high
- 2nd phase (evaluate): clock ramps down to 0 V (inputs a and b must be valid)
- If a = 0 and b = 1, then d follows clock down
- 3rd phase (hold): outputs c and d are valid and can be sampled; input can now be changed
- 4th phase (recharge): Clk ramps up to Vdd
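The four phases can be sketched as a small state walk that records idealized output levels at the end of each phase. This is only a behavioral sketch: the function name and the 5 V default are assumptions, the case a = 1, b = 0 is filled in by symmetry (the slide states only a = 0, b = 1), and real outputs would not swing fully to 0 V.

```python
from enum import Enum


class Phase(Enum):
    IDLE = 1
    EVALUATE = 2
    HOLD = 3
    RECHARGE = 4


def buffer_outputs(a, b, vdd=5.0):
    """Return (phase name, c, d) at the end of each phase for differential inputs a, b."""
    if {a, b} != {0, 1}:
        raise ValueError("a and b must be complementary logic values")
    c = d = vdd  # idle: both outputs sit high with the clock
    trace = []
    for phase in Phase:
        if phase is Phase.EVALUATE:
            if (a, b) == (0, 1):
                d = 0.0  # stated on the slide: d follows the clock down
            else:
                c = 0.0  # assumed by symmetry for a = 1, b = 0
        elif phase is Phase.RECHARGE:
            c = d = vdd  # clock ramps back up, outputs return high
        # HOLD changes nothing: outputs stay valid while inputs may change
        trace.append((phase.name, c, d))
    return trace


for step in buffer_outputs(0, 1):
    print(step)
```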
6. Adiabatic AND/NAND Gate
- IS1 and IS2 transistors isolate input from output
- Replace with transmission gate: reduced channel resistance leads to lower power dissipation
- Drawback: requires 6 clock phases
- P and P only turn on during restoration
7. Adiabatic Serial Adder
- P1 = Φ2, P2 = Φ4 for serial adder
8. Measured Energy Savings
- Assume 100% efficient power supply
- Net energy flowing into circuit from power supply
- Difference in net energy between 2 consecutive cycles is energy loss in 1 full cycle
- Design 1: serial adder
- Design 2: serial adder with isolation transistor replaced by transmission gate
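The measurement rule above (loss per cycle equals the difference in net supplied energy between two consecutive cycles) can be sketched as follows. The function name is hypothetical and the sample values are made up purely for illustration; they are not measurements from the lecture.

```python
def energy_loss_per_cycle(net_energy_at_cycle_end):
    """Given cumulative net energy drawn from the supply, sampled at the end of
    each clock cycle, return the energy lost in each full cycle."""
    return [
        later - earlier
        for earlier, later in zip(net_energy_at_cycle_end, net_energy_at_cycle_end[1:])
    ]


# Made-up cumulative net energy samples in picojoules (illustrative only).
samples_pj = [0.0, 1.5, 3.0, 4.5]
print(energy_loss_per_cycle(samples_pj))
```

Within a cycle the net energy rises while the supply charges the circuit and falls as charge is recovered, so only the end-of-cycle samples enter the difference.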
9. Net Energy Function
10. Buffer Gate Energy Dissipation
11. 6-Buffer Chain Design with Reversible Logic
- Isi Fi 2 Isi Fi 1
- Ci Fi 2 Ci Fi 1
- 6 clock phases required for reversible logic example
- Reversible logic primitives may have redundant outputs
- Can recover 93% of energy at 1.1 MHz (100% efficient power supply)
12. 6-Buffer Chain
13. Simple Charge Recovery Logic
- Modified static CMOS circuit: simpler
- Adiabatic circuits so far have constant energy dissipation, regardless of input switching activity
- When input activity = 0, adiabatic circuit still dissipates constant energy, whereas static CMOS only has leakage current
- Necessary to switch between static CMOS and adiabatic circuit
- SEL selects which mode to operate in
- 2 supply phases: evaluation and hold
14. Adiabatic Adder
15. CMOS vs. Adiabatic CMOS 2 × 2 Multiplier Energy
16. Adiabatic Dynamic Logic Inverter
- For cascading, need constant voltage Vdd between precharge and evaluate: avoids non-adiabatic transitions
17. Cascaded Adiabatic Inverters
- When 1st stage latched, 2nd starts evaluating
- When 1st stage evaluating, 2nd must be adiabatic
- Need 4 clock phases, so all loops must have a
multiple of 4 gates in the logic
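The loop constraint can be checked mechanically. The sketch below assumes consecutive gates in a chain are driven by consecutive clock phases, so gate i uses phase i mod 4; a feedback loop closes cleanly only when the gate that feeds back sees the phase it expects. The function names are hypothetical.

```python
NUM_PHASES = 4  # clock phases required by the cascaded adiabatic inverters


def clock_phase(gate_index, num_phases=NUM_PHASES):
    """Phase driving the gate at this position (assumed: consecutive gates use consecutive phases)."""
    return gate_index % num_phases


def loop_is_valid(num_gates, num_phases=NUM_PHASES):
    """A loop is consistent only if walking all its gates returns to the starting phase."""
    return num_gates > 0 and num_gates % num_phases == 0


for length in (4, 6, 8):
    print(length, loop_is_valid(length))
```

A loop of 6 gates, for example, would need padding with 2 buffer stages before it satisfies the rule.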
18. Adiabatic Dynamic NAND Gate
19. Energy Recovery SRAM Core
- Can recover 75% of energy for both reads and writes
- No complexity increase and low area overhead
- Add row driver to drive memory core
- Replace sense amplifiers with voltage level shifters
- Vhi and Vlow ramp up and down like adiabatic clocks
- Generated by row driver circuit
20. Adiabatic Core Diagram
21. Adiabatic SRAM Cell
22. Adiabatic SRAM Core Operations
- Row address decoder generates row selection signals W0, W1, ..., WM-1
- Vhi, Vlow, Vword of enabled row controlled independently by global supply lines Ghi, Glow, Gword
- Vhi, Vlow, Vword of unselected rows connected to static supply lines Shi, Slow, and GROUND
- Does not need bit line precharger: replaced by bit line equalizer transistor
23. Row Driver Circuit
24. Read Operation
- Example supplies: Shi = Vdd = 5 V, Slow = 2 V, Vt = 1 V
- SRAM in rest state: all rows disabled
- Row driver: Vhi = 5 V, Vlow = 2 V, Vword = 0 V
- Bit lines precharged midway to 2 V
- Read operation
- Row selection applied: Vword smoothly ramped up to 3 V by Gword
- Vhi and Vlow ramped down to 3 and 0 V, respectively
- If 0 stored, node A is low, node B is high
- Both M1 and M5 on, so the bit line on the A side follows Vlow down to 0 V
- The other bit line remains at 2 V, since B is > 3 V and Vword = 3 V, so M6 cannot turn on
- Level shifter amplifies bit line differential and generates output
- Bit line reverts to rest state (2 V), driven by the same cells being read
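The read example can be checked numerically with a simplified switch model. The sketch below assumes an nMOS access transistor conducts only when its gate is more than Vt above its lower terminal, that M5 sits on the node A side and M6 on the node B side, and that a stored 1 is the mirror image of a stored 0. All names are hypothetical.

```python
VT = 1.0                 # threshold voltage from the example
V_WORD = 3.0             # word line level during a read
V_HI, V_LOW = 3.0, 0.0   # cell supplies after being ramped down
V_BIT_REST = 2.0         # rest level of both bit lines


def nmos_conducts(v_gate, v_term1, v_term2, vt=VT):
    """Simplified rule (assumption): conducts when the gate exceeds the lower terminal by more than vt."""
    return v_gate - min(v_term1, v_term2) > vt


def read_bit_lines(stored_bit):
    """Return (A-side bit line, B-side bit line) voltages at the end of a read."""
    # Stored 0: node A low (at Vlow), node B high (at Vhi); stored 1 assumed symmetric.
    node_a, node_b = (V_LOW, V_HI) if stored_bit == 0 else (V_HI, V_LOW)
    line_a = V_LOW if nmos_conducts(V_WORD, node_a, V_BIT_REST) else V_BIT_REST
    line_b = V_LOW if nmos_conducts(V_WORD, node_b, V_BIT_REST) else V_BIT_REST
    return line_a, line_b


print(read_bit_lines(0))
print(read_bit_lines(1))
```

With a stored 0, the B-side access transistor sees a 3 V gate and a 2 V lower terminal, exactly Vt of overdrive, so it stays off and that bit line holds at 2 V while the other follows Vlow to 0 V.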
25. Write Operation
- Cell bit information overwritten by bit lines
- Apply row selection, pull up Vword, pull Vhi down to 3 V
- Cell voltage difference now Vhi - Vlow = 1 V = Vt
- Cell state held for columns not being written
- Low enough to flip cell state for selected columns
- Simultaneously ramp Vlow and bit from 2 V to 0 V
- B pulled down by bit and cell state flips
- Return to rest state: Vlow and bit revert back to 2 V, word is disabled
26. Adiabatic SRAM Waveforms
27. Column-Activated Memory Core
- Only activate columns we actually will read/write
- Saves energy
- Vhi and Vlow now run vertically, generated by column driver, implemented like row driver
- Row driver still required
- Same cell structure
28. Column-Activated Core Diagram
29. Memory Core Energy Dissipation
- Energy dissipated when cell written, erasing data
- Design 1: row-based, Design 2: column-based
30. Column-Based Core Organization
- Core of M × N bits
- n bits read/written in each operation
- Ratio of effective capacitances
- In 256 × 256 block, row-activated uses 4 times the energy of column-activated design
31. Energy Recovery NOR Address Decoder
- Only change from conventional dynamic decoder:
- Precharge pMOSFETs replaced by nMOSFETs
- Rest state: Vlow = Vdd, all row lines at Vdd - Vth
- Evaluation
- Vlow swings from Vdd to 0, all row lines follow it down except selected line, which stays at Vdd - Vth
32. Address Decoder Waveforms
33. NOR and NAND Decoders
34. NAND Adiabatic Decoder
- Operates like NOR decoder, but only 1 row switches
- Uses far less energy than NOR decoder
- Can be used with pre-decoding, but requires more clock phases
- Same decoding scheme used for both row and column decoders
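A rough way to see why the NAND decoder uses far less energy is to count the row lines that swing per access. The sketch below assumes every row line has the same capacitance and voltage swing, so dissipation scales with the number of switching lines; this is a simplification, and the function names are hypothetical.

```python
def nor_switching_lines(num_rows):
    """NOR decoder: every row line follows Vlow down except the selected one."""
    return num_rows - 1


def nand_switching_lines(num_rows):
    """NAND decoder: only the selected row line switches."""
    return 1


def nor_to_nand_switching_ratio(num_rows):
    """Ratio of switching row lines, a proxy for relative dissipation under the stated assumption."""
    return nor_switching_lines(num_rows) / nand_switching_lines(num_rows)


for rows in (16, 256):
    print(rows, nor_switching_lines(rows), nand_switching_lines(rows))
```

The count ignores the extra clock phases the NAND scheme needs with pre-decoding, which the slide lists as its cost.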
35. Energy Recovery Level Shifter I/O Buffer
36. Level Shifter Operation
- Initially Vls = 0, shifter disabled
- A and its complement equalized
- Turn on access transistors
- bit and its complement arrive, build voltage difference in amplifier
- Vls ramped from 0 to Vdd: no short-circuit current
- Turn off access transistors: isolate level shifter
- Buffer drives I/O bus
- 90% energy recovery achieved
37. Overall SRAM Supply Schemes
38. Optimal Voltage Selection
- Bit line energy dissipation
- Optimal voltage swing lies between 3 Vth and 4 Vth
39. Row and Column-Activated Cells
- Energy recovery of 50% for both read and write
40. Energy Recovery in Adiabatic Cores
[Figure panels: Column-Activated, Row-Activated]
41. Decoder Transition Times
42. NAND/NOR Decoder Dissipation
43. Energy Efficiency of Clock Generator
44. Summary
- Energy recovery circuitry produces large power saving
- Cost of more area and slower clock speed
- Considerations
- Highly efficient switching power supply (high Q factor)
- Optimize redundant signals in reversible logic
- Quasi-static CMOS circuits are most practical
- Need to change standard CMOS process for adiabatic logic to add necessary diodes