Low power Design Strategies

Slides: 32
Provided by: DAC63
Learn more at: https://hpc.ac.upc.edu
Transcript and Presenter's Notes

Title: Low power Design Strategies


1
Low power Design Strategies
  • Daniele Folegnani

2
Talk outline
  • Why Low Power is Important
  • Power Consumption in CMOS Circuits
  • New Trends for Future Microprocessors
  • Low Power Strategies
  • Power Consumption Evaluation of a Superscalar
    Processor
  • An Architectural Technique to Reduce the Power
    Consumption of the Issue Logic
  • Conclusions

3
Why Low Power is Important
  • High performance microprocessors
  • PowerPC704 consumes 85 Watt
  • Alpha 21364 consumes 100 Watt
  • Problems involved: thermal runaway, gate
    dielectric, junction fatigue, electromigration
    diffusion, electrical parameter shift, silicon
    interconnection fatigue, package-related
    failure.
  • THE FUNCTIONALITY AND THE CLOCK SPEED CAN BE
    LIMITED

4
  • Thermal and Power dissipation costs

5
Low performance processors
  • High demand for portable devices (mobile phones,
    laptops, smart cards, video games, etc.) >>> 95% of
    production!!!
  • Extensive use of multimedia features
  • Problems involved >>> battery life!!!
  • Battery energy will not grow drastically in the
    near future due to technology and safety reasons
    (today's batteries hold the same energy as a
    grenade!!!)
  • One of the selling points is hours of use and
    hours of standby
  • Need for techniques to improve energy efficiency
    without penalizing performance

6
Power Consumption in CMOS Circuits
  • Static
  • Theoretically 0; in practice, leakage and
    threshold currents exist in transistors
  • Dynamic
  • Transients (the linear zone)
  • Capacitance switching: THE MOST IMPORTANT
    FACTOR
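
The capacitance-switching term is commonly written as P = a · C · Vdd² · f (activity factor, switched capacitance, supply voltage, clock frequency). The sketch below only evaluates that standard formula; the operating point is a made-up example and does not come from the slides.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power in watts: P = a * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz

# Hypothetical operating point (assumed values, for illustration only):
# 10 nF of switched capacitance, 1.8 V supply, 500 MHz clock, activity 0.2.
base = dynamic_power(0.2, 10e-9, 1.8, 500e6)

# Same circuit at half the supply voltage: the V^2 term dominates.
low_v = dynamic_power(0.2, 10e-9, 0.9, 500e6)

print(f"base: {base:.2f} W, half voltage: {low_v:.2f} W")
```

Because voltage enters squared, halving Vdd cuts switching power by a factor of four at the same frequency.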

7
  • New Trends for Future Microprocessors

8
  • Moore's Law
  • Transistor count doubles every 18 months
  • Power is proportional to DIE AREA and FREQUENCY
  • In the same technology, a new architecture has
    2-3X the die area
  • Changing technology implies 2X frequency
  • SCALING TECHNOLOGY ...
  • Decreasing voltage (0.7 scaling factor)
  • Decreasing die area (0.5 scaling factor)
  • Increasing C per unit area: 43%!!!

9
  • This implies that power density increases by
    40% every generation!!!
  • Temperature is a function of power density and
    determines the type of cooling system needed.
  • VARIABLES
  • PEAK POWER (worst case)
  • Today's packages can sustain a power dissipation
    over 100 W for up to 100 ms >>>
    a cheaper package is possible if peaks are reduced
  • ENERGY SPENT (for a workload)
  • More correlated to battery life
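
The power-density figure can be checked from the scaling factors given for a technology change (0.7 voltage, 2X frequency, higher capacitance per unit area). This is a rough check, assuming power per unit area scales as (C per area) · V² · f and reading the capacitance increase as 1/0.7 ≈ 1.43; that reading is an assumption, not something the slides state.

```python
VOLTAGE_SCALE = 0.7          # from the slides
AREA_SCALE = 0.5             # from the slides (same design, shrunk)
FREQ_SCALE = 2.0             # from the slides
CAP_PER_AREA_SCALE = 1 / VOLTAGE_SCALE  # assumed reading of the capacitance increase

def power_density_scale(cap_per_area, voltage, frequency):
    """Relative change in power per unit area: (C/area) * V^2 * f."""
    return cap_per_area * voltage ** 2 * frequency

density = power_density_scale(CAP_PER_AREA_SCALE, VOLTAGE_SCALE, FREQ_SCALE)

# Total power of the *same* design after the shrink: density times area.
shrunk_chip_power = density * AREA_SCALE

print(f"power density: x{density:.2f}, shrunk chip power: x{shrunk_chip_power:.2f}")
```

Under these assumptions the density grows by about 40% per generation even though a straight shrink of the same design would draw less total power; the extra die area of each new architecture is what pushes total power up.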

10
Low Power Strategies
  • OS level: PARTITIONING, POWER DOWN
  • Software level: REGULARITY, LOCALITY,
    CONCURRENCY
  • (Compiler technology for
    low power, instruction scheduling)
  • Architecture level: PIPELINING, REDUNDANCY,
    DATA ENCODING
  • (ISA, architectural design, memory
    hierarchy, HW extensions, etc.)
  • Circuit/logic level: LOGIC STYLES, TRANSISTOR
    SIZING, ENERGY RECOVERY
  • (Logic families,
    conditional clocking, adiabatic circuits,
    asynchronous design)
  • Technology level: threshold reduction,
    multi-threshold devices, etc.

11
Power Consumption Estimation

12
  • Architectural estimation has a relatively high
    error rate (no view of the total area, circuit
    types, technology, block activity, etc.)
  • IMPORTANT DESIGN DECISIONS MUST BE MADE AT THE
    ARCHITECTURAL LEVEL
  • Accurate power evaluation is done in late design
    phases
  • Need for good feedback between all the design
    phases
  • - Correlation between power estimation
    from low level to high level
  • TRY TO IMPROVE ACCURACY AT
    HIGH LEVEL
  • - Critical path based power consumption
    analysis
  • (CIRCUIT TYPES,
    TECHNOLOGY, ACTIVITY FACTOR)
  • - Thermal images based correlation
    analysis
  • (HOTTEST SPOTS LOCATION,
    COOLEST SPOTS LOCATION, TEMPERATURE
    DIFFERENCES, TEMPERATURE DISTRIBUTION)

13
  • Architectural Power Evaluation
  • G. Cai, Intel
  • Architectural design partition
  • Power consumption evaluation at block level
  • - Power density of blocks (SPICE
    simulation, statistical input set,
    technology and circuit types definition)
  • - Activity of blocks and sub-blocks
    (running benchmarks)
  • - Area (feedback from VLSI design,
    circuits and technology defined)
  • TRY TO DEFINE SCALING FACTORS THAT ALLOW REMAPPING
    THE ARCHITECTURAL POWER SIMULATOR WHEN
    TECHNOLOGY, AREA AND CIRCUIT TYPES CHANGE
  • TRY TO REDUCE THE ESTIMATION ERROR AT HIGH LEVEL

14
  • POW OUT ORDER
  • Technology assumed: CMOS 0.18 micron
  • 5 types of circuit logic (static, dynamic, SRAM,
    clock distribution, PLA)
  • 32 architectural blocks with their associated area
  • Blocks built with custom design
  • Two types of power density (active and inactive
    power density)
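
One way to read the block-level evaluation above is: each block contributes its area times a power density, where the density is the active one for the fraction of cycles the block is busy and the inactive one otherwise. The sketch below is only that reading; the `Block` type, the weighting formula, and all numbers are assumptions for illustration and are not taken from the simulator described in the slides.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    area_mm2: float          # from VLSI feedback (hypothetical values here)
    active_density: float    # W/mm^2 while the block is active
    inactive_density: float  # W/mm^2 while the block is idle
    activity: float          # fraction of cycles active, from benchmark runs

def block_power(block: Block) -> float:
    """Average power of one block: area times activity-weighted density."""
    density = (block.activity * block.active_density
               + (1.0 - block.activity) * block.inactive_density)
    return block.area_mm2 * density

def chip_power(blocks) -> float:
    """Sum the per-block estimates."""
    return sum(block_power(b) for b in blocks)

# Made-up blocks, not the 32 blocks of the original study.
blocks = [
    Block("issue_queue", 10.0, 0.5, 0.1, 0.5),
    Block("d_cache", 20.0, 0.2, 0.05, 0.25),
]
total = chip_power(blocks)
print(f"estimated power: {total:.3f} W")
```

Keeping area, density, and activity as separate inputs is what would allow scaling factors to be applied to each one when technology, area, or circuit types change.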

15
Power Consumption Evaluation of a Superscalar
Processor
  • Architectural parameters
  • 4-wide instruction fetch, issue and commit
  • Instruction queue size: 128 entries
  • I-Cache: 128 Kbytes, direct mapped, 32-byte line,
    1-cycle hit, 3-cycle miss
  • D-Cache: 128 Kbytes, 4-way set associative, 32-byte
    line, 1-cycle hit, 3-cycle miss
  • UL2-Cache: 1024 Kbytes, 4-way set associative,
    64-byte line, 3-cycle hit
  • Combined predictor of 1K entries: Gshare with
    1K 2-bit counters and 8-bit global history, and
    bimodal predictor of 2K entries with 2-bit counters
  • 4 int ALU, 4 fp ALU, 1 int mul/div, 1 fp mul/div
  • Out-of-order issue, oldest-ready-first selection
    policy

16

17
An Architectural Technique to Reduce the Power
Consumption of the Issue Logic
  • IQ & ROB responsible for about 53% of power
    consumption
  • Cache hierarchy is not the most important power
    consumption factor in the superscalar paradigm
  • Power consumption is almost independent of the
    instruction mix
  • TRENDS IN SUPERSCALAR
  • Increasing issue width
  • Size of the instruction window grows more
    than linearly with respect to the issue width (IW)
  • Area of the IQ grows more than linearly with respect
    to the number of entries
  • IQ power contribution may grow in the future

18
  • Every cycle the wakeup logic broadcasts the result
    tags through the result buses to all the entries,
    and each entry compares them with its own tags to
    find a match
  • THE ISSUE ENGINE SPENDS A LARGE AMOUNT OF POWER
    EVERY CYCLE ONLY TO CHECK WHETHER SOME
    INSTRUCTIONS ARE AVAILABLE FOR EXECUTION
  • Considering:
  • In periods of execution with high parallelism, just
    a subpart of the IQ may satisfy the IW
  • In periods of execution with poor parallelism, some
    parts of the IQ may not provide any useful
    instruction ready to execute
  • The issue engine is very power inefficient
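
The cost of the broadcast-and-compare step can be illustrated with a toy software model of one wake-up cycle. This is a sketch under simplifying assumptions: entries are plain dicts with two source tags, `None` marks an operand that is already available, and empty slots are skipped. None of these names come from the slides.

```python
def wakeup_cycle(queue, result_tags):
    """Broadcast result tags to every occupied entry and count comparisons.

    Returns (ready_entries, comparisons). Every occupied entry compares
    every broadcast tag against each of its source operands, whether or
    not a match is possible.
    """
    comparisons = 0
    for entry in queue:
        if entry is None:          # empty slot (assumed not to compare)
            continue
        for tag in result_tags:
            for i, src in enumerate(entry["srcs"]):
                comparisons += 1
                if src is not None and src == tag:
                    entry["srcs"][i] = None   # operand is now available
    ready = [e for e in queue
             if e is not None and all(s is None for s in e["srcs"])]
    return ready, comparisons

# Three-slot toy queue: two waiting instructions and one empty slot.
queue = [
    {"id": 0, "srcs": [5, 7]},
    {"id": 1, "srcs": [7, None]},
    None,
]
ready, comparisons = wakeup_cycle(queue, [7])
print([e["id"] for e in ready], comparisons)
```

The comparison count grows with occupied entries times operands times broadcast tags, regardless of how many instructions actually become ready, which is the inefficiency the slide points at.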

19
(No Transcript)
20
(No Transcript)
21
(No Transcript)
22
(No Transcript)
23
Dynamically Resizing the Instruction Queue
  • We propose a run-time mechanism that adapts the
    size of the IQ based on its contribution to IPC
  • We avoid the wake-up function in the parts that
    are temporarily disabled
  • Resize decisions are commit based
  • IQ implemented as a circular FIFO with head and
    tail pointers, no collapsing

24
What we do is ...
  • Partition the queue into 16 parts of 8 entries
  • Define a new pointer for the queue, called the
    limit pointer
  • At start time it has the same value as the head
    pointer and is updated along with the head pointer
  • When a resize decision is made, an offset (one
    portion) is added to/subtracted from it
  • The zone between the head and the limit pointer
    is the disabled zone (no wake-up)
  • If the tail grows beyond the limit, we allow
    the correct wake-up and we stop
    insertion until the limit reaches the tail
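
The limit pointer can be pictured with a small index calculation on the 128-entry circular queue. The slides do not say on which side of the head the disabled zone lies; the sketch below assumes it is the slots just behind the head in circular order (where the tail would otherwise wrap into), and it does not model the transient case where the tail has already passed the limit.

```python
QUEUE_SIZE = 128   # 16 portions of 8 entries, as in the slides
PORTION = 8

def disabled_zone(head, disabled_portions):
    """Return (limit, indices) for the wake-up-disabled zone.

    Assumption: the limit pointer trails the head by
    disabled_portions * PORTION slots, and the slots from the limit up to
    (but not including) the head get no wake-up.
    """
    if not 0 <= disabled_portions < QUEUE_SIZE // PORTION:
        raise ValueError("at least one portion must stay enabled")
    size = disabled_portions * PORTION
    limit = (head - size) % QUEUE_SIZE
    indices = [(limit + i) % QUEUE_SIZE for i in range(size)]
    return limit, indices

# Head at slot 4 with two portions disabled: the zone wraps around slot 0.
limit, zone = disabled_zone(head=4, disabled_portions=2)
print(limit, zone[0], zone[-1], len(zone))
```

Since the limit is derived from the head, it moves whenever the head moves, and a resize decision only changes `disabled_portions` by one.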

25
  • Heuristic to reduce size
  • Collect statistics about the instructions
    committed in the youngest portion of the queue
    every quantum of time (1000 cycles).
  • We propose to insert a bit in each ROB
    entry that will be set at dispatch time if the
    physical position of the instruction in the IQ is in
    the current youngest part
  • The resize decision is threshold-based >>>
    0.025 of IPC in the current portion
  • No limit to cut
  • Heuristic to increase size
  • Blind >>> grow by one portion every 5 quanta
    and let the cut approach decide whether
    the decision was correct or not (
    time of high parallelism or not)
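
The two heuristics can be combined into a small controller that is called once per quantum. This is a sketch, not the authors' implementation: it assumes the threshold compares the share of committed instructions that came from the youngest portion against 0.025, and it keeps at least one portion enabled, which the slides do not specify.

```python
THRESHOLD = 0.025     # from the slides; its exact meaning is assumed here
GROW_PERIOD = 5       # grow blindly every 5 quanta
MAX_PORTIONS = 16     # 16 portions of 8 entries
MIN_PORTIONS = 1      # assumption: never disable the whole queue

class ResizeController:
    def __init__(self):
        self.active_portions = MAX_PORTIONS
        self.quanta = 0

    def end_of_quantum(self, committed_total, committed_from_youngest):
        """Update the queue size at the end of a 1000-cycle quantum."""
        self.quanta += 1
        if self.quanta % GROW_PERIOD == 0:
            # Blind growth; a later quantum can cut again if it did not help.
            self.active_portions = min(MAX_PORTIONS, self.active_portions + 1)
            return self.active_portions
        share = (committed_from_youngest / committed_total
                 if committed_total else 0.0)
        if share < THRESHOLD:
            # The youngest portion barely contributes: disable it.
            self.active_portions = max(MIN_PORTIONS, self.active_portions - 1)
        return self.active_portions

# Five quanta in which the youngest portion contributes almost nothing.
ctrl = ResizeController()
sizes = [ctrl.end_of_quantum(4000, 10) for _ in range(5)]
print(sizes)
```

The counts fed to `end_of_quantum` correspond to the per-ROB-entry bit described above: instructions whose bit is set at commit are the ones counted as coming from the youngest portion.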

26
Results

27

28
Conclusions
  • Power consumption is a new constraint in the
    design of computer systems, like cost and
    performance
  • The problem must be attacked from different
    levels of abstraction
  • Power decisions must be made at early steps of the
    design
  • There is a need for power estimation models and
    tools, especially at the architectural level

29
Q&A?

30

31
(No Transcript)