Title: Lecture 11: Low-Power Static RAM Architectures
1 Lecture 11: Low-Power Static RAM Architectures
- Static SRAM organization and MOS SRAM cell
- Banked SRAM organization
- Reducing bit line voltage swing
- Reducing write driver power and sense amplifier power
- Low core supply voltage techniques
- Summary
- Michael L. Bushnell
- CAIP Center and WINLAB
- ECE Dept., Rutgers U., Piscataway, NJ
2 Introduction
- Portable computing requires low-power design
- System-on-a-Chip technology integrates memory onto the same chip as processors
- Memory power consumption is a major problem
- Focus on SRAM power reduction
- Major loss of power
- DRAM design benefits from low-power SRAM decoder
3 SRAM Organization
4 Problem
- Activating a word line for a row causes
- All columns in that row to be active
- Switches way too much power
- Must focus on activating only those columns in the row
- That are actually being read/written
5 MOS SRAM Cell
- Bi-stable inverter loop
- CMOS usually used for low-power design
- 6T SRAM cell
- However, nMOS SRAM has less area
- 4T SRAM cell
- nMOS may not use much more power than CMOS
- If properly designed
6 Cell Transfer Characteristic
7 4T nMOS SRAM Implementation
- High-valued Rs made from undoped polysilicon
- Extremely compact: fewer transistors, no nWells
- Extremely small currents required
8 4T Cell Design
- Higher R
- Reduces current consumption
- Lower standby power consumption
- Must get correct ratio of pass transistors T1/T2
- To pull-down transistors T3/T4
- Ratio called the aspect ratio
- Choose so that internal cell voltages never rise enough to upset cell state
- Energy needed to switch bit-line capacitances > energy to switch cell
9 6T SRAM Cell
- Resistors replaced by minimum size pFETs
10 6T Cell Design Considerations
- Supply current drawn is limited to stable-state transistor leakage currents
- Inverters have large nMOS width compared to pMOS width
- Inverter switching threshold close to nMOS Vtn
- Aspect ratio design similar to that for 4T cell
11 SRAM Power Reduction
- Reduce power (in general) by
- Lowering switched capacitance
- Lowering voltage swing
- Lowering activity factor
- Lowering operation frequency
- Easiest to lower voltage swing in memory on bit/word lines
- Limited by
- Inability to resolve small voltage differentials at adequate speed
- Increasing soft bit error rates and degraded signal integrity
12 Banked SRAM Organization
- Reduces switched capacitance, reduces power, increases speed
- R x C SRAM means that any access to row
- Enables R rows
- Enables all bit lines
- Ccell is individual cell capacitance
- Causes R x C x Ccell capacitance to be switched
- Solution: Split memory into B banks
- Only 1 bank enabled for an access, not all banks
- Switched capacitance now R x C x Ccell / B
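The banking arithmetic above can be sketched in a few lines of Python. This is only an illustration of the R x C x Ccell / B expression from the slide; the function name and the array size and cell capacitance in the example are hypothetical values, not figures from the lecture.

```python
def switched_capacitance(rows, cols, c_cell, banks=1):
    """Capacitance switched per access for an R x C array split into B banks.

    Implements R x C x Ccell / B; banks=1 is the unbanked case.
    """
    if banks < 1:
        raise ValueError("banks must be >= 1")
    return rows * cols * c_cell / banks


# Hypothetical example values: 256 x 256 cells, 2 fF per cell.
unbanked = switched_capacitance(256, 256, 2e-15)
banked = switched_capacitance(256, 256, 2e-15, banks=8)
print(f"unbanked: {unbanked:.3e} F, 8 banks: {banked:.3e} F")
```

Under this simple model the switched capacitance falls in direct proportion to the number of banks; it does not account for the extra bank-select logic a real design would need.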
13 SRAM Bank
14 Divided Word Lines
- Applies banking only to word lines
- Divided into groups, only 1 of which is enabled
- Need gate and local driver for each group
- Severely reduces driven C on word line
- Limitation is that it adds word line driving delay
15 Reduced Bit Line Voltage Swing
- Can end sense amplifier read operation as soon as differential voltage detection is complete
- Saves fraction of power needed to accomplish read
- $\Delta V$: bit line voltage swing
- $V_{core}$: core supply voltage
- $r$: fraction of operations that are reads
- $f$: frequency of core operations
- Read power $= \frac{1}{2} \, C_{eff} \, V_{core} \, \Delta V \, r \, f$
- Reducing $\Delta V$ often fails: increases noise sensitivity and sense amp complexity, reduces RAM performance
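As a rough numerical illustration of the read power expression on this slide, the sketch below evaluates it for a full and a reduced bit line swing. The function name and all example values (capacitance, voltages, read fraction, frequency) are assumptions chosen for illustration only.

```python
def read_power(c_eff, v_core, delta_v, read_fraction, freq):
    """Read power = 1/2 * Ceff * Vcore * dV * r * f, as given on the slide."""
    return 0.5 * c_eff * v_core * delta_v * read_fraction * freq


# Hypothetical values: 1 pF effective capacitance, 1.8 V core,
# 60% of operations are reads, 100 MHz core operation rate.
full_swing = read_power(1e-12, 1.8, 1.8, 0.6, 100e6)
small_swing = read_power(1e-12, 1.8, 0.2, 0.6, 100e6)
print(f"full swing: {full_swing:.3e} W, reduced swing: {small_swing:.3e} W")
```

Because the expression is linear in the swing, the read power in this model scales directly with the bit line voltage swing; the noise and sense amplifier penalties listed on the slide are not modeled.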
16 Early Word Line Termination
17 Pulsed Word Lines
- Enable word lines only for precise time
- Needed to develop bit cell voltage discharge
- Use pulse generator
- Gates word line and sense amplifier
- Need margin for worst-case pulse width
- Must estimate actual RAM access time
18 Self-Timed RAM Core
- Different rows have different access speeds
- Row closest to sense amps is fastest
- Columns closest to word line drivers enabled first
- Tailor pulse width to RAM access time
- Use dummy column to time signal flow
- Forced to known state by shorting one internal node
- Set SR flip-flop to trigger word line
- By time dummy column sense amp generates high
- Rest of columns have been sensed
- Dummy column sense amp resets SR flip-flop, turns off word line
- Dummy column adds insignificant chip area/power overhead
- Called word line kill circuit
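The word line kill mechanism can be illustrated with a small behavioral model. This is a sketch under simplifying assumptions, not a circuit-level description: each column is reduced to a single "time at which its sense amp resolves", and the names `SRLatch` and `self_timed_access` are hypothetical.

```python
class SRLatch:
    """Minimal set/reset latch; q=True means the word line is enabled."""

    def __init__(self):
        self.q = False

    def set(self):
        self.q = True

    def reset(self):
        self.q = False


def self_timed_access(column_sense_times, dummy_sense_time):
    """Return (word line pulse width, number of columns sensed in time)."""
    latch = SRLatch()
    latch.set()  # start of access: SR flip-flop turns the word line on

    # Order sense events by time; at equal times "col..." sorts before "dummy".
    events = [(t, f"col{i}") for i, t in enumerate(column_sense_times)]
    events.append((dummy_sense_time, "dummy"))
    events.sort()

    sensed = 0
    pulse_width = None
    for t, name in events:
        if not latch.q:
            break  # word line already off; later columns are not sensed
        if name == "dummy":
            latch.reset()  # dummy sense amp output kills the word line
            pulse_width = t
        else:
            sensed += 1
    return pulse_width, sensed


# Hypothetical delays in ns: the dummy column is the slowest, so all columns finish first.
print(self_timed_access([0.9, 1.0, 1.2], 1.5))
```

In this model the pulse width equals the dummy column delay, so the scheme only works if the dummy column is at least as slow as every real column, which is the design intent stated on the slide.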
19 Dummy Column for Self-Timing
20 Bit Line Precharge Voltage
- Two methods
- Use MOS device static load: enhancement nMOS, depletion nMOS, standard pMOS
- Use precharger transistor: no static power consumed
- Needs more power to drive clocked precharger
- Lower precharge V lowers power consumption (less bit line voltage swing)
- Enhancement nMOS most effective (precharge to VCC − Vtn)
- Optimal precharge voltage: VCC / 2
21 Problems with Lower Precharge V
- Read forces SRAM internal nodes towards bit line voltage
- Bit line precharges to VCC − Vtp
- Forces cell internal nodes low: counteracted by weak cell pMOS
- Read may destroy old cell data
- Avoid this problem: use different bit line voltages for read/write
22 Write Driver Power Reduction
- Reduce power in word line decoders and drivers
- Write line driver only drives 1 word line at a time
- Small contribution to overall power
- Want fast row decoding
- NAND decoder only changes word line output of 1 row
- Slower
- NOR decoder changes word line output of all but 1 row
- Faster but very bad for power
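The power contrast between the two decoder styles comes down to how many word line outputs change per access. The sketch below simply counts those outputs as described on the slide; the function name and the 256-row example are hypothetical, and no timing or capacitance is modeled.

```python
def word_lines_switched(num_rows, style):
    """Word line outputs that change per access, per the slide's description."""
    if num_rows < 1:
        raise ValueError("num_rows must be >= 1")
    if style == "nand":
        return 1  # only the selected row's output changes
    if style == "nor":
        return num_rows - 1  # every output except the selected row changes
    raise ValueError("style must be 'nand' or 'nor'")


rows = 256  # hypothetical array height
print(word_lines_switched(rows, "nand"), word_lines_switched(rows, "nor"))
```

For any array with more than two rows, the NOR style switches more word line outputs than the NAND style, and the gap grows linearly with the number of rows.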
23 Domino NAND Decoder
24 NOR Decoder
25 Improve NAND Decoder Speed
- Do not decode $A$ address lines into 1 of $2^A$ word lines
- Split decoding process
- Decode $A_1 < A$ address lines
- Use $2^{A_1}$ lines to activate one of second stage decoders
- Second stage decodes $A - A_1$ lines into $2^{A - A_1}$ word lines
- Get total of $2^{A_1} \times 2^{A - A_1} = 2^A$ lines
- Recursively repeat to get a tree of intermediate decoders: extreme is to decode 1 address line/stage
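The split decoding scheme can be checked with a short functional model. This is a logic-level sketch only: it assumes the first stage decodes the high-order address bits (the slide does not say which bits go to which stage), and the function names are hypothetical.

```python
def flat_decode(address, a_bits):
    """Single-stage decode: one-hot list of 2**a_bits word lines."""
    return [1 if i == address else 0 for i in range(2 ** a_bits)]


def two_stage_decode(address, a_bits, a1_bits):
    """Two-stage decode: a1_bits in stage 1, the remaining bits in stage 2."""
    low_bits = a_bits - a1_bits
    high = address >> low_bits             # assumed: stage 1 sees the top bits
    low = address & ((1 << low_bits) - 1)  # stage 2 sees the remaining bits

    # Stage 1: one-hot enable, one line per second-stage decoder.
    enables = [1 if i == high else 0 for i in range(2 ** a1_bits)]

    # Stage 2: only the enabled decoder can raise one of its word lines.
    word_lines = []
    for enable in enables:
        for i in range(2 ** low_bits):
            word_lines.append(1 if enable and i == low else 0)
    return word_lines


# Both schemes select the same word line for every address (A = 4, A1 = 2).
assert all(two_stage_decode(a, 4, 2) == flat_decode(a, 4) for a in range(16))
```

The final assertion confirms the line count identity on the slide: the two stages together still produce exactly one active line out of 2^A.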
26 Multistage NAND Decoder
27 Sense Amp Power Reduction
- Larger currents improve speed
- Becomes significant fraction of total power
- Have a sense amplifier enable signal
- Power reduction
- Limit current by enabling sense amp for minimum needed period
- Use self-timed RAM core
- Use sense amp that automatically cuts off after sensing
- Sets SR flip-flop; once dummy sense amp finished, resets SR flip-flop, turns off sense amplifier enable
- Alternative: shape tail current of amp by activating pull-down transistors of differential amps in sequence
28 Differential Sense Amp
29 Differential Charge Sense Amp
30 Self-Timed Sense Amp
31 Self-Latching Sense Amp
- Self-latching sense amp automatically limits currents after sense
- Cross-coupled amplifying inverter loop
- Extra transistors transfer bit line voltages to inverter loop
32 Latched Sense Amp
33 Low Core Voltage from Single Supply
- Memory core
- Square law relationship for both standby and dynamic power with respect to core voltage
- Commodity RAMs have a single external supply voltage
- Step this down to get lower core voltage
- Sakata method: achieve ½ core supply voltage
- Place 2 identical DRAM cores in series
- If average power consumption fairly constant
- Results in potential divider
- Top and bottom core supplies VCC / 2
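To put a number on the benefit, the sketch below applies the square law relationship stated on this slide to a core running at VCC / 2. It is a back-of-the-envelope model only: the function name and the 3.3 V supply are hypothetical, and the stacked cores are assumed to split the supply exactly in half.

```python
def relative_core_power(v_core, v_supply):
    """Core power relative to running at the full supply, assuming power ~ V**2."""
    return (v_core / v_supply) ** 2


vcc = 3.3  # hypothetical external supply voltage
per_core = relative_core_power(vcc / 2, vcc)
print(f"each stacked core uses about {per_core:.2f} of its full-supply power")
```

Under these assumptions each stacked core consumes about one quarter of the power it would draw from the full supply.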
34 Voltage Supply Step-Down Circuits
- Step-down circuit design with low DC voltage, significant current drain is hard
- Cannot use inductors and pulse width control circuits
- Implement by charging N series capacitors from VCC
- Achieve parallel capacitor connection by opening/closing switches
- Steps voltage down to VCC / N
- Run at high rate, to get smooth supply waveform
- Need near-ideal switches and capacitors: hard to get in CMOS
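The series-charge, parallel-discharge idea can be verified with ideal charge bookkeeping. This sketch assumes N identical, lossless capacitors and ideal switches, which is exactly the condition the slide flags as hard to meet in CMOS; the function name and example values are hypothetical.

```python
def series_charge_then_parallel(vcc, n, c=1.0):
    """Ideal N-capacitor step-down: charge in series from VCC, reconnect in parallel.

    Returns (voltage on each capacitor after charging, parallel output voltage).
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    c_series = c / n              # N equal capacitors in series
    q = c_series * vcc            # the same charge sits on every series capacitor
    v_each = q / c                # so each capacitor holds VCC / N

    q_total = n * q               # parallel connection pools the charge...
    c_parallel = n * c            # ...onto N times the capacitance
    v_out = q_total / c_parallel  # output is still VCC / N
    return v_each, v_out


# Hypothetical example: 3.3 V supply stepped down with N = 3.
print(series_charge_then_parallel(3.3, 3))
```

With no load and no losses the output settles at VCC / N regardless of the capacitor value; load current and switch resistance, which the model omits, are what make the high switching rate necessary.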
35 Summary
- Switched capacitance reduction techniques
- Reduces power
- Also improves performance
- Banked SRAM Organization
- Divided word lines
- Change decoder design (useful for DRAMs, too)
- Voltage swing reduction techniques
- Early word line cutoff
- Reduced bit line swings
- Self-timed RAM core