Power Consumption In High Performance Microprocessors

1
Power Consumption In High Performance
Microprocessors
  • Rajesh Kumar
  • Desktop Products Group, Circuit Technology
  • Intel, Hillsboro, Oregon
  • rajesh.kumar_at_intel.com
  • 03/13/2004

2
Outline
  • Where does the power go?
  • Clock power and frequency
  • Optimal pipelining for power/performance: wide
    vs. fast
  • Optimal leakage
  • Interconnect power
  • Summary

3
Silicon correlation: do we know where the power
goes?
IREM image of 130 nm Pentium® 4 silicon
running high-power floating point
Transistor-level power model of Pentium® 4: > 90%
correlation
4
Where does the power go?
Data for 3.2 GHz Intel® Pentium® 4 in 130 nm
5
Clock Power Components
  • Local clock power (distribution, flops and
    latches) dominates global
  • Global clock techniques (optical clock, resonant
    clock, transmission line clock, globally
    asynchronous/locally synchronous (GALS), etc.) can
    only have a moderate impact on total power

[Figure: clock power breakdown in Pentium® 4]
6
IA32 pipelining/frequency history
[Figure: relative pipelining and relative frequency (iso-process)]
  • Increasing pipelining increases frequency (e.g. 2X
    pipelining from P6 to P4 -> 1.7X frequency) but
    increases clock power
  • What is the optimal pipelining/frequency for
    power/performance efficiency?
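
One way to read the "2X pipelining -> 1.7X frequency" figure is through a simple cycle-time model in which each stage pays a fixed overhead (latches, clock skew) on top of its share of the logic. That model is a common textbook simplification and an assumption here, not something the slides state; the function names are illustrative. A minimal sketch:

```python
def overhead_fraction(pipelining_ratio: float, frequency_ratio: float) -> float:
    """Fixed per-stage overhead as a fraction of the ORIGINAL cycle time.

    Assumed model (not from the slides): cycle = logic / depth + overhead.
    With the original cycle normalized to 1 and overhead fraction o:
        (1 - o) / pipelining_ratio + o = 1 / frequency_ratio
    Solving for o gives the expression below.
    """
    k, s = pipelining_ratio, frequency_ratio
    return (1.0 / s - 1.0 / k) / (1.0 - 1.0 / k)


def speedup(pipelining_ratio: float, o: float) -> float:
    """Frequency gain predicted by the same model for a given overhead fraction."""
    return 1.0 / ((1.0 - o) / pipelining_ratio + o)


# Overhead implied by the slide's data point (2X pipelining, 1.7X frequency)
o = overhead_fraction(2.0, 1.7)

# Under this model, a further doubling of pipelining gains less than the first one did
next_doubling_gain = speedup(4.0, o) / speedup(2.0, o)
```

Under this assumed model the implied overhead is a bit under a fifth of the original cycle, and each further doubling of pipelining buys less frequency than the one before, while the clocked-element count (and hence clock power) keeps growing.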

7
Optimal pipelining: past research
  • FO4 inverter: universal, process-independent
    metric of logic depth
  • Pentium® 4 in 180 nm, 130 nm: 18 FO4 inverters
  • Will we see even faster (pipelined) machines in
    the future?
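
As a rough illustration of how an FO4 depth relates to clock frequency, the sketch below converts a frequency into a cycle time and divides by the logic depth. It combines the 18 FO4 figure with the 3.2 GHz, 130 nm data point quoted earlier in the talk; treating 18 FO4 as the entire cycle is an assumption made for this example, and the function names are illustrative.

```python
def cycle_time_ps(freq_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def implied_fo4_delay_ps(freq_ghz: float, depth_fo4: float) -> float:
    """FO4 delay implied if the whole cycle equals depth_fo4 FO4 delays (assumption)."""
    return cycle_time_ps(freq_ghz) / depth_fo4


period_ps = cycle_time_ps(3.2)                 # clock period at 3.2 GHz
fo4_delay_ps = implied_fo4_delay_ps(3.2, 18)   # FO4 delay implied by an 18 FO4 cycle
```

Because the depth is counted in FO4 delays rather than picoseconds, the same 18 FO4 figure can describe the design in both 180 nm and 130 nm even though the absolute cycle time differs.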

8
Communication-dominant structures experience
area and power explosion with width -> why we don't
build very slow/wide ST machines
[Figure: core block diagram (front end, rename, uop queues, schedulers, register file, AGUs, ALUs, L1 D-cache) compared at 2x frequency and at 2x width. Source: Doug Carmean, Intel]
9
Architectural Power Efficiency
Little's law of queuing theory:
Latency × Bandwidth = Required Parallelism
  • Latency is hard, bandwidth is easy(ier)
  • Wider is not necessarily better than pipelined if
    maintaining global dependencies
  • Much more power efficient to use easy
    parallelism rather than extract parallelism in
    hardware
  • Thread level (SMT, multiple cores), e.g.
    Hyper-Threading gives 20-30% performance for
    negligible hardware or power increase in
    Pentium® 4
  • Data level, e.g. media extensions MMX, SSE,
    SSE2
  • ILP/DLP/TLP in media/graphics engines, e.g.
    Imagine
  • Software usage/programming ease/enabling is the
    main bottleneck, not hardware design
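
Little's law as used on this slide can be sketched in a couple of lines: the number of independent operations that must be in flight equals latency times sustained bandwidth. The numbers below are hypothetical and chosen only to show the arithmetic.

```python
def required_parallelism(latency: float, bandwidth: float) -> float:
    """Little's law: operations in flight = latency * throughput.

    Units must agree, e.g. latency in cycles and bandwidth in operations per cycle.
    """
    return latency * bandwidth


# Hypothetical example: 200-cycle latency, 0.5 operations per cycle sustained
in_flight = required_parallelism(200, 0.5)
```

The same product can be met by many independent threads or data elements ("easy" parallelism) instead of by hardware that searches a single instruction stream for independent work.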

10
Source/Drain Leakage: Perception
  • VCC is reduced each process generation to reduce
    dynamic power (CV²f) and for reliability (gate
    oxide field, hot electrons, etc.)
  • Decreasing VCC decreases gate overdrive (VCC -
    Vt) and hence speed. Vt needs to be reduced to
    gain back the speed
  • Decreasing Vt increases S/D leakage exponentially
  • (Perception) Leakage power is growing fast and
    will be the dominant source (40-70%) of total
    power in future technologies
  • Flurry of leakage reduction arch/circuit/process
    proposals, even at the cost of switching power and
    chip performance
  • Our view: S/D leakage is a design knob, not a
    process constant
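
The two scaling relations in this argument can be sketched numerically: dynamic power follows CV²f, and subthreshold leakage grows by a factor of ten for each "subthreshold swing" of Vt reduction. The 100 mV/decade swing and the optional activity parameter are assumptions for illustration, not values from the slides.

```python
def dynamic_power(c_farads: float, vcc_volts: float, freq_hz: float,
                  activity: float = 1.0) -> float:
    """Switching power, activity * C * V^2 * f (activity factor is an added parameter)."""
    return activity * c_farads * vcc_volts ** 2 * freq_hz


def leakage_ratio(delta_vt_volts: float, swing_v_per_decade: float = 0.1) -> float:
    """Factor by which subthreshold leakage changes when Vt shifts by delta_vt_volts.

    swing_v_per_decade = 0.1 V (100 mV/decade) is an assumed round number.
    """
    return 10.0 ** (-delta_vt_volts / swing_v_per_decade)


# Dropping VCC by 10% at fixed C and f (hypothetical 1 nF, 3 GHz)
power_ratio = dynamic_power(1e-9, 0.9, 3e9) / dynamic_power(1e-9, 1.0, 3e9)

# Lowering Vt by 30 mV with the assumed swing
leak_up = leakage_ratio(-0.030)
```

With these assumptions, a 10% VCC reduction cuts switching power by roughly a fifth, while a 30 mV Vt reduction roughly doubles leakage, which is the tension the slide describes.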

11
Optimal Leakage
  • Optimize all variables (architectural pipelining,
    transistor (Vt, Leff, Tox), VCC, etc.) to provide
    maximum performance at desired cost or power

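At the optimal point, the cost/benefit ratio of
all variables is the same:

$$\frac{\partial \mathrm{Freq} / \partial L_{\mathrm{eff}}}{\partial \mathrm{Power} / \partial L_{\mathrm{eff}}} = \frac{\partial \mathrm{Freq} / \partial V_t}{\partial \mathrm{Power} / \partial V_t} = \frac{\partial \mathrm{Freq} / \partial V_{cc}}{\partial \mathrm{Power} / \partial V_{cc}}$$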
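
The claim that every knob has the same cost/benefit ratio at the optimum can be checked on a toy problem. The benefit and cost functions below are invented purely for illustration (they are not transistor models and do not come from the slides); the sketch brute-forces the best split of a fixed budget between two knobs and then compares their marginal benefit per marginal cost.

```python
import math

BUDGET = 10.0  # hypothetical cost (power) budget, arbitrary units


def benefit(x: float, y: float) -> float:
    """Made-up performance model with diminishing returns in each knob."""
    return 3.0 * math.sqrt(x) + 1.0 * math.sqrt(y)


def cost(x: float, y: float) -> float:
    """Made-up cost model: knob y is twice as expensive per unit as knob x."""
    return x + 2.0 * y


def best_split(steps: int = 100_000):
    """Scan points on the budget line cost(x, y) == BUDGET; return (benefit, x, y)."""
    best = (-math.inf, 0.0, 0.0)
    for i in range(1, steps):
        y = (BUDGET / 2.0) * i / steps
        x = BUDGET - 2.0 * y
        b = benefit(x, y)
        if b > best[0]:
            best = (b, x, y)
    return best


def marginal_ratio(x: float, y: float, knob: str, h: float = 1e-6) -> float:
    """Central-difference estimate of d(benefit)/d(cost) when moving one knob."""
    dx, dy = (h, 0.0) if knob == "x" else (0.0, h)
    d_benefit = benefit(x + dx, y + dy) - benefit(x - dx, y - dy)
    d_cost = cost(x + dx, y + dy) - cost(x - dx, y - dy)
    return d_benefit / d_cost


_, x_opt, y_opt = best_split()
ratio_x = marginal_ratio(x_opt, y_opt, "x")
ratio_y = marginal_ratio(x_opt, y_opt, "y")
```

At the point found by the scan, `ratio_x` and `ratio_y` agree to within the grid resolution; away from it they differ, meaning budget could be moved from the knob with the lower ratio to the one with the higher ratio for a net gain.
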
12
Impact Of Leakage On Speed
  • VCC >> Vt (practical case): 10-15% leakage change
    for 1% speed => 2X leakage for 5-7% speed

Any leakage reduction idea must exchange << 5%
speed for every 2X reduction in chip leakage
(not easy) to be practical
[Figure: change in speed / change in Ioff versus Vt (V), for VCC = 0.5 V, 0.75 V, and 1 V]
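
The step from "10-15% leakage per 1% speed" to "2X leakage for 5-7% speed" is a compounding calculation. The sketch below assumes each 1% of speed multiplies leakage by a constant factor (an interpretation of the slide, not something it spells out); the function name is illustrative.

```python
import math


def speed_pct_for_leakage_factor(leak_factor: float,
                                 leak_pct_per_speed_pct: float) -> float:
    """Speed change (in %) that goes with a leak_factor-fold leakage change,
    assuming every 1% of speed compounds leakage by leak_pct_per_speed_pct percent."""
    return math.log(leak_factor) / math.log(1.0 + leak_pct_per_speed_pct / 100.0)


low = speed_pct_for_leakage_factor(2.0, 15.0)    # steep exchange rate
high = speed_pct_for_leakage_factor(2.0, 10.0)   # shallow exchange rate
```

The two results bracket the slide's 5-7% range, which is why a leakage-reduction technique that costs more than about 5% speed per 2X leakage is worse than simply retuning Vt.
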
13
Optimal Leakage vs. Circuit Type
[Figure: optimal VCC and optimal Vt (V), and optimal leakage power, versus switching activity factor; register files, datapath, and clock marked]
  • Circuits with different activity factors want
    different transistors
  • Clock wants low VCC, low Vt (high leakage)
  • Register files and caches want high VCC and low
    leakage

Optimal leakage fraction is almost constant
across a 50-100X range in activity factors!
14
Optimal Leakage Scaling
  • Leakage consumes 20-30% of chip power at optimal
    setting
  • Optimal leakage is almost constant with respect
    to:
      • Process generation (130 nm, 90 nm, 65 nm) as
        long as VCC >> Vt
      • Total power budget (1 W or 100 W)
      • Frequency
      • Pipelining
      • Chip area
      • Signal switching probabilities (activity
        factors)
  • Leakage reduction proposals must lose << 5% speed
    for 2X leakage reduction to be practical

15
Interconnect Power Scaling
% of power in interconnect is increasing due to
increase in metal layers and (possibly) longer
wires
Diffusion is decreasing due to faster
scaling of area component and material changes
SOI capacitance benefit should decrease
16
Interconnect Power
[Figure: % of total wire power versus wire length (microns) in Pentium 4, 90 nm]
  • Only modest power in long wires (e.g. > 1000 µm in
    90 nm) -> modest gains for low-power signaling
    techniques
  • Low-power signaling may be useful for specialized
    situations: on-chip networks, multiple-core
    interfaces, etc.

17
Summary
  • Clock is the biggest component of dynamic power.
    Reducing local clock power is much more important
    than global clock
  • High performance on ST applications requires high
    power, whether through frequency or width
  • Optimal leakage is 20-30% of chip power and
    remains fairly constant with process generations
    and chip architectures
  • Wire power has increased significantly while
    diffusion has decreased
  • Most wire power is in short wires: not much to
    gain from advanced, low-power signaling
    techniques