Low Power MicroArchitecture - PowerPoint PPT Presentation

Provided by: YuHe8

Transcript and Presenter's Notes

1
Low Power Micro-Architecture
2
Power Dissipation Formula
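  • $\text{Power} = C_{\text{eff}} \, (V_{sw} \, V_{DD}) \, f$
  • $C_{\text{eff}} = \alpha C$: effective capacitance
  • $\alpha$: activity factor
  • $V_{sw}$: switched voltage ($V_{DD}$ for full-swing signals)
  • $V_{DD}$: supply voltage
  • $f$: clock frequency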
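
A minimal sketch of the power formula on this slide, useful for seeing how each term scales. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def dynamic_power(c_phys, alpha, v_sw, v_dd, f_clk):
    """Switching power in watts: P = (alpha * C) * V_sw * V_DD * f."""
    c_eff = alpha * c_phys  # effective capacitance = activity factor * physical capacitance
    return c_eff * v_sw * v_dd * f_clk


# Hypothetical chip: 1 nF of switched capacitance, 20% activity, 100 MHz clock.
p_nominal = dynamic_power(1e-9, 0.2, 3.3, 3.3, 100e6)
# Same chip with both voltages halved (full-swing logic, so V_sw follows V_DD).
p_scaled = dynamic_power(1e-9, 0.2, 1.65, 1.65, 100e6)

print(f"nominal: {p_nominal:.4f} W, half voltage: {p_scaled:.4f} W")
print(f"ratio: {p_scaled / p_nominal:.2f}")
```

Halving the voltage at a fixed clock cuts the power to one quarter in this model, because the voltage appears twice in the product.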

3
Circuit Delay Formula
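  • $\text{Delay} \approx (C_L / I_{AVE}) \, (\Delta V / 2)$
  • $\approx (C_L V_{DD}) / \big(k_v W (V_{DD} - V_T - V_{dsat})\big)$
  • Short channels cause velocity saturation of
    channel electrons.
  • $I_{AVE}$: average current, linear in $V_{DD}$ rather
    than quadratic as in the long-channel model.
  • When $V_{DD}$ falls to around $2V_T$, $V_{dsat} \approx V_{DD} - V_T$.
  • Thus the denominator approaches zero, and delay
    may become large at very low $V_{DD}$.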

4
Metrics Measuring Low Power
  • Energy/op = Power × clock_cycle/op
  • = Power/throughput
  • ETR (Energy-to-Throughput Ratio) = Max. energy/Max.
    throughput = $\text{Power}/(\text{throughput})^2$
  • $E_{IDLE}$ = total energy consumed idling/total ops.
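
A small sketch of the first two metrics, comparing two hypothetical designs. The function names and the power and throughput figures are assumptions made for this example; a lower ETR is taken as better, since it means less power for a given throughput.

```python
def energy_per_op(power_w, throughput_ops_per_s):
    """Energy per operation in joules: Power / throughput."""
    return power_w / throughput_ops_per_s


def etr(power_w, throughput_ops_per_s):
    """Energy-to-throughput ratio: Power / throughput^2, both taken at peak."""
    return power_w / throughput_ops_per_s ** 2


# Hypothetical designs: A is twice as fast and draws twice the power of B.
design_a = (2.0, 100e6)  # (watts, operations per second)
design_b = (1.0, 50e6)

for name, (power, throughput) in (("A", design_a), ("B", design_b)):
    print(name, energy_per_op(power, throughput), etr(power, throughput))
```

Both designs spend the same energy per operation, but A has half the ETR of B because it delivers that energy efficiency at a higher throughput.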

5
Low Power Design Principles
  • High performance design is energy efficient
  • High performance processor, if allowed to operate
    at low voltage, can be more energy efficient in
    delivering the same amount of throughput
  • Fast operations can decrease energy efficiency
  • If fast response time is critical, the processor
    can be left at nominal supply voltage and shut
    down when it is not needed.
  • Clock frequency reduction is NOT energy
    efficient!
  • The relative amount of time in idle versus
    maximum throughput determines the effect of clock
    frequency reduction.
  • Dynamic voltage scaling is energy efficient.
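
A back-of-the-envelope sketch of why slowing the clock alone does not save energy for a fixed amount of work, while lowering the voltage does. It ignores idle energy, assumes full-swing logic (switched voltage equal to the supply), and uses hypothetical values throughout.

```python
def task_energy(n_cycles, c_eff, v_dd):
    """Switching energy in joules for a task of n_cycles; independent of clock rate."""
    return n_cycles * c_eff * v_dd ** 2


def task_time(n_cycles, f_clk):
    """Run time in seconds for a task of n_cycles at clock frequency f_clk."""
    return n_cycles / f_clk


N_CYCLES, C_EFF = 1_000_000, 0.2e-9  # hypothetical workload and effective capacitance

e_full, t_full = task_energy(N_CYCLES, C_EFF, 3.3), task_time(N_CYCLES, 100e6)
# Halve the clock only: the task takes twice as long and costs the same energy.
e_slow_clock, t_slow = task_energy(N_CYCLES, C_EFF, 3.3), task_time(N_CYCLES, 50e6)
# Halve the clock and also lower the supply to a hypothetical 2.0 V.
e_scaled = task_energy(N_CYCLES, C_EFF, 2.0)

print(f"full speed: {e_full:.2e} J in {t_full:.3f} s")
print(f"slow clock: {e_slow_clock:.2e} J in {t_slow:.3f} s")
print(f"slow clock + low voltage: {e_scaled:.2e} J in {t_slow:.3f} s")
```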

6
Dynamic Voltage Scaling
  • Idle time: when $E_{IDLE}$ dominates total energy
    consumption, simultaneously reducing $f$ and $V_{DD}$
    during idle yields an energy-efficient
    implementation.
  • Burst-mode operation: the voltage is dynamically
    changed to deliver the minimum throughput
    required at the time of operation, so as to
    minimize energy consumption.
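
One way to picture burst-mode operation is a lookup over discrete operating points: pick the lowest supply voltage whose maximum clock rate still meets the throughput needed right now. The table of voltage and frequency pairs and the function name below are hypothetical, not taken from the presentation.

```python
# Hypothetical (supply voltage in V, maximum clock frequency in Hz) pairs.
OPERATING_POINTS = [(1.2, 20e6), (1.8, 50e6), (2.5, 80e6), (3.3, 100e6)]


def pick_operating_point(required_ops_per_s, ops_per_cycle=1.0):
    """Return the lowest-voltage (V_DD, f) pair that meets the required throughput."""
    for v_dd, f_max in sorted(OPERATING_POINTS):
        if f_max * ops_per_cycle >= required_ops_per_s:
            return v_dd, f_max
    raise ValueError("required throughput exceeds peak throughput")


for demand in (10e6, 45e6, 100e6):
    v_dd, f_clk = pick_operating_point(demand)
    print(f"need {demand / 1e6:.0f} Mops/s -> run at {v_dd} V, {f_clk / 1e6:.0f} MHz")
```

A real controller would also have to account for the time and energy spent changing voltage, which this sketch leaves out.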

7
Dynamic Voltage Scaling
  Full speed?   Fast (%)   Slow (%)   Idle (%)   E_AVE   ETR    Battery life
  Always        10         0          90         0.031   237    1 hour
  Sometimes     1          90         9          0.006   45     5.3 hours
  Rarely        0.1        99         0.9        0.003   25.8   9.2 hours
  • MIPS R4700 processor, peak throughput 130
    SPECint92.
  • Desired application requires either fast
    throughput at 130 SPECint92 or a slow throughput
    at 13 SPECint92.
  • The same number of operations is performed in
    each of the three cases.

8
Low Power Design Strategies
  • Reduce operating voltages ($V_{sw}$, $V_{DD}$)
  • This may cause slow operation
  • Can be compensated by designing performance
    enhanced circuits before lowering power.
  • Reduce on-chip operating frequency ($f$)
  • Compensate with a higher level of parallelism
  • Reduce effective capacitance ($C_{\text{eff}} = \alpha C$)
  • Reduce physical capacitance $C$
  • Reduce capacitance switching activity ($\alpha$) by
    reducing wasteful operations.

9
Performance Enhancement
  • Critical path reduction by re-timing
  • Shorten the critical path to compensate for the
    increased delay due to operating voltage
    reduction (to very low voltage)
  • ⇒ Goal: keep throughput constant

10
Reduce Wasteful Operations
  • Reducing switching activities
  • Preservation of data correlation
  • Distributed computing and locality of reference
  • Reduce long data buses
  • Reduce memory references
  • Application-specific hardware
  • Reduce less used options
  • Demand-Driven operations
  • Power down stand-by functional units

11
Reducing Switching Activities
  • Same Boolean function, different implementation
  • Some require fewer switching activities than
    others
  • Different input data representation
  • E.g., Gray code vs. binary code
  • Amount of temporary storage needed may be
    optimized by algorithm transformation
  • Number of accesses to register files
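
The Gray code versus binary code point can be checked by counting bit flips. The sketch below counts how many wires toggle when a 4-bit value steps through 0 to 15 in each encoding; the helper names are made up for this example, and a simple counting sequence is assumed as the data pattern.

```python
def bit_flips(a, b):
    """Number of bit positions that differ between two code words."""
    return bin(a ^ b).count("1")


def to_gray(n):
    """Standard reflected binary Gray code of n."""
    return n ^ (n >> 1)


def total_flips(codes):
    """Total bit transitions when the code words are sent one after another."""
    return sum(bit_flips(x, y) for x, y in zip(codes, codes[1:]))


binary_seq = list(range(16))
gray_seq = [to_gray(i) for i in binary_seq]

print("binary:", total_flips(binary_seq))  # 26 transitions
print("gray:  ", total_flips(gray_seq))    # 15 transitions
```

For sequential data such as addresses, the Gray-coded bus switches exactly one wire per step, so its activity factor is lower than that of the binary-coded bus. For random data the advantage would not hold.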

12
Energy Efficient VLSI Design
  • Instruction set architecture
  • Instruction word length
  • number of registers
  • addressing mode
  • Micro-architecture
  • Moderate pipelining is energy efficient
  • A VLIW processor is energy efficient compared to
    a superscalar one
  • A sectored cache is energy efficient

13
Locality and Parallelism
  • Buses and interconnect consume 25-50% of power
  • Separate local and long distance communications
  • Area-power trade-off
  • Use separate buses for short and long distance
    communications.
  • Use parallel architecture to reduce demand on
    speed
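
A rough sketch of the parallelism trade-off: replicate the hardware, run each copy at a fraction of the clock rate, and use the relaxed speed requirement to lower the supply voltage. The reduced voltage and the capacitance overhead factor are hypothetical inputs here; how far the voltage can really drop depends on the circuit's delay behavior.

```python
def parallel_power_ratio(n_units, v_nominal, v_reduced, overhead=1.0):
    """Power of n_units at f/n and v_reduced, relative to one unit at f and v_nominal.

    Assumes full-swing logic and the same activity factor in both designs.
    """
    capacitance_ratio = n_units * overhead  # replicated hardware plus extra routing
    frequency_ratio = 1.0 / n_units         # each unit runs n times slower
    voltage_ratio = (v_reduced / v_nominal) ** 2
    return capacitance_ratio * frequency_ratio * voltage_ratio


# Two units at half speed, 2.0 V instead of 3.3 V, 15% capacitance overhead (all assumed).
ratio = parallel_power_ratio(2, 3.3, 2.0, overhead=1.15)
print(f"parallel design uses {ratio:.0%} of the original power at equal throughput")
```

The area roughly doubles in this example, which is the area-power trade-off the slide refers to.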

14
InfoPad Power Model
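[Figure: InfoPad power model block diagram. Blocks: clock oscillator, ARM60, PLD, bus, 0.5 MB SRAM, I/O interface. Power labels shown: 45 mW, 400 mW, 120 mW, 45 mW, 600 mW, 40 mW. The mapping of labels to blocks is not recoverable from the transcript.]
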
15
Exploiting Locality
  • Task assignment to reduce communication overhead.
  • Ensure the majority of data transfers are local
    to a subset of hardware.
  • Shorter local buses are used more frequently.
  • Longer global buses are used sparingly.

16
References
  • R. Mehra, L. M. Guerra, and J. M. Rabaey,
    "Low-power architectural synthesis and the impact
    of exploiting locality," J. VLSI Signal Processing
    Systems, 13, 239-258 (1996).
  • T. D. Burd and R. W. Brodersen, "Processor
    design for portable systems," J. VLSI Signal
    Processing Systems, 13, 203-221 (1996).