Overview - PowerPoint PPT Presentation

Provided by: SkadronS
1
Overview
  1. Motivation (Kevin)
  2. Thermal issues (Kevin)
  3. Power modeling (David)
  4. Thermal management (David)
  5. Optimal DTM (Lev)
  6. Clustering (Antonio)
  7. Power distribution (David)
  8. What current chips do (Lev)
  9. HotSpot (Kevin)

2
PowerPC G3 Microprocessor
  • On-chip temperature sensor (junction temperature)
  • Based on differential voltage change across 2
    diodes of different sizes
  • Implemented in PowerPC G3/G4 processors
  • OS required for control
  • Instruction Cache Throttling used to dynamically
    lower junction temperature

3
Pentium III
  • On-die thermal diode
  • Coupled with board-level thermal diode sensor
  • Uses
  • Monitor long-term temperature and environmental
    trends
  • Provide indication of catastrophic failure

4
Pentium 4
  • Thermal ramp rates 50 °C/second (over whole
    package)
  • Much too high for coarse-grained solutions
  • Thermal Monitor
  • Highly-accurate on-die temperature sensing
    circuit
  • Fast acting temperature control circuit (50ns)

[Figure: Thermal Monitor circuit; labels: temperature sensing diode, reference current source, current comparator, PROCHOT]
5
Pentium 4 -- Thermal Monitor
  • Trip Point is calibrated at manufacturing time
  • Simple response
  • Turn processor clocks on/off at 50% duty cycle
  • For a 1.5 GHz processor: 2 µs on, 2 µs off
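
As a rough illustration of what the 50% duty cycle means in cycle counts, the sketch below does the arithmetic for the 1.5 GHz case. The 2 µs phase length is read from the slide; the function names are made up for this example and do not correspond to any Intel interface.

```python
def cycles_in_phase(clock_hz: float, phase_s: float) -> float:
    """Number of clock cycles that fit in one on (or off) phase."""
    return clock_hz * phase_s


def effective_clock_hz(clock_hz: float, on_s: float, off_s: float) -> float:
    """Average clock rate while the clocks are gated on/off periodically."""
    duty_cycle = on_s / (on_s + off_s)
    return clock_hz * duty_cycle


# 1.5 GHz part, clocks on for 2 us and off for 2 us (values from the slide).
print(cycles_in_phase(1.5e9, 2e-6))
print(effective_clock_hz(1.5e9, 2e-6, 2e-6))
```

While the monitor is triggered, the processor does useful work in bursts of roughly three thousand cycles and runs at half its nominal rate on average.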

6
Pentium 4 -- Results
  • For 200 traces (TPC-C, SPEC, Microsoft)
  • Thermal design point can be reduced to 75% of
    true max power with minimal performance loss

7
Pentium 4
  • Thermal monitors allow
  • Tradeoff between cost and performance
  • Cheaper package
  • More triggers, Less Performance
  • Expensive package
  • No triggers, no performance loss

8
Architecture-level Thermal Management
  • Dynamically adjust execution to control
    temperature
  • Avoid catastrophic failure (heat sink, fan)
  • Permit use of less expensive package
  • Design for less than the worst case
  • Package costs $1/W above 40 W
  • Heat sinks, heat pipes, thinned wafers, fans
  • Fans reduce battery life
  • Peak power as high as 150 W now and > 200 W in 1-2
    generations
  • Temperatures over 100 °C
  • More fundamentally -- there is a need for
    architecture-level thermal modeling
  • What's actually going on in there?

9
HotSpot project
  • Collaboration between HPLP and LAVA Labs (ECE and
    CS depts. UVa)
  • Deal with hot spots
  • Localized heating occurs much faster than
    chip-wide
  • microsec. to millisec.
  • Chip-wide treatment is too conservative
  • seconds to minutes
  • but there is significant lateral thermal
    coupling through the package
  • How do we model this?
  • Prove temperature will be
  • safely bounded

10
Hot spots in Power4
Temperature landscape: space and time. How to
estimate early in the design cycle?
11
Thermal modeling
  • Want a fine-grained, dynamic model of temperature
  • At a granularity architects can reason about
  • That accounts for adjacency and package
  • That does not require detailed designs
  • That is fast enough for practical use
  • HotSpot - a compact model based on thermal R, C
  • Parameterized to automatically derive a model
    based on various
  • Architectures
  • Power models
  • Floorplans
  • Thermal Packages

12
Dynamic compact thermal model
  • Electrical-thermal duality
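  • $V \leftrightarrow T$ (temperature)
  • $I \leftrightarrow P$ (power)
  • $R \leftrightarrow R_{th}$ (thermal resistance)
  • $C \leftrightarrow C_{th}$ (thermal capacitance)
  • RC time constant ($R_{th} C_{th}$)
  • Kirchhoff's Current Law
  • differential eq.: $I = C \, dV/dt + V/R$
  • thermal domain: $P = C_{th} \, dT/dt + T/R_{th}$
  • where $T = T_{hot} - T_{amb}$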
  • At higher granularities of P, Rth, Cth
  • P, T are vectors and Rth, Cth are circuit
    matrices
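
For a single node, the thermal-domain equation has a closed-form step response, which makes the role of the RC time constant concrete. The sketch below is only an illustration: the function name and the numeric values are assumptions chosen for the example, not HotSpot parameters.

```python
import math


def temperature_rise(power_w: float, r_th: float, c_th: float, t_s: float) -> float:
    """Rise above ambient (K) of one thermal RC node, t_s seconds after a
    constant power step is applied, starting from ambient.

    Solves P = C_th * dT/dt + T / R_th with T(0) = 0.
    """
    tau = r_th * c_th  # thermal time constant in seconds
    return power_w * r_th * (1.0 - math.exp(-t_s / tau))


# Hypothetical node: 10 W into R_th = 2 K/W and C_th = 0.5 J/K (tau = 1 s).
for t in (0.0, 1.0, 5.0):
    print(t, temperature_rise(10.0, 2.0, 0.5, t))
```

After one time constant the node has covered about 63% of the way to its steady-state rise of P × R_th.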

13
Package we model
[Figure: cross-section of the modeled package; labels: heat sink, heat spreader, interface material, die, IC package, pin, PCB]
14
Modeling the package
  • Thermal management allows for packaging
    alternatives/shortcuts/interactions
  • HotSpot needs a model of packaging
  • Basic thermal model
  • Heat spreader
  • Heatsink
  • Interface materials (e.g. phase-change films)
  • Fan/Active cooler (TEC)
  • Thermal resistance due to convection
  • Constriction and bulk resistance for fins
  • Spreading constriction and bulk resistance for
    heatsink base and heat spreader
  • Thermal resistance for bonding material
  • Thermal capacitance of heat spreader and heatsink

15
Optimal package
  • Default package is found using
  • Power dissipation
  • Target temperature on chip
  • Chip area
  • Clock speed: high or low performance
  • Power dissipation and target temperature used to
    determine resistance value needed
  • Needs more work: modern packages are incredibly
    complex, yet there is still a need to model at
    higher levels
  • Now what can we do with HotSpot?

16
Equivalent vertical network
  • Diagram is simplified: peripheral nodes

[Figure: equivalent vertical network; labels: chip, interface, spreader, peripheral spreader nodes, interface, sink, convection]
17
Vertical network parameters
  • Resistances
  • Determined by the corresponding areas and their
    cross sectional thickness
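  • $R = \text{resistivity} \times \text{thickness} / \text{area}$
  • Capacitances
  • $C = \text{specific heat} \times \text{thickness} \times \text{area}$
  • Peripheral node areas

[Figure: peripheral node areas; labels: spreader, north, east, west, chip, south]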
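
The two formulas on this slide are easy to try out numerically. The sketch below applies them to a silicon die; the material constants and die dimensions are rough, illustrative assumptions, and the function names are invented for the example. It also assumes "specific heat" means the volumetric value in J/(m³·K), since the formula multiplies it by a volume.

```python
def vertical_resistance(resistivity_mk_per_w: float, thickness_m: float, area_m2: float) -> float:
    """R = resistivity x thickness / area, in K/W."""
    return resistivity_mk_per_w * thickness_m / area_m2


def thermal_capacitance(specific_heat_j_per_m3k: float, thickness_m: float, area_m2: float) -> float:
    """C = (volumetric) specific heat x thickness x area, in J/K."""
    return specific_heat_j_per_m3k * thickness_m * area_m2


# Assumed values: silicon at about 100 W/(m.K), so resistivity 0.01 m.K/W,
# volumetric specific heat about 1.75e6 J/(m^3.K), die 16 mm x 16 mm x 0.5 mm.
area = 0.016 * 0.016
r_die = vertical_resistance(0.01, 0.0005, area)
c_die = thermal_capacitance(1.75e6, 0.0005, area)
print(r_die, c_die, r_die * c_die)  # K/W, J/K, time constant in seconds
```

With these assumed numbers the die alone has a time constant of a few milliseconds, in line with the millisecond-scale local heating mentioned earlier in the talk.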
18
Lateral resistances
  • Determined by the floorplan and the length of
    shared edges between adjacent blocks
  • "Heat Spreading and Conduction in Compressed
    Heatsinks", Jaana Behm and Jari Huttunen, in
    proceedings of the 10th International Flotherm
    User Conference, May 2001.
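
Since lateral coupling depends on the length of the edge two blocks share, a first step is extracting that length from block rectangles. The sketch below shows one way to do it; the `Block` type and `shared_edge_length` function are hypothetical helpers for illustration, not part of HotSpot, and the sketch stops short of turning the length into a resistance.

```python
from typing import NamedTuple


class Block(NamedTuple):
    name: str
    width: float
    height: float
    left_x: float
    bottom_y: float


def shared_edge_length(a: Block, b: Block, eps: float = 1e-9) -> float:
    """Length of the boundary shared by two axis-aligned blocks (0 if not adjacent)."""
    a_right, a_top = a.left_x + a.width, a.bottom_y + a.height
    b_right, b_top = b.left_x + b.width, b.bottom_y + b.height
    overlap_x = min(a_right, b_right) - max(a.left_x, b.left_x)
    overlap_y = min(a_top, b_top) - max(a.bottom_y, b.bottom_y)
    # Blocks touch side by side: a vertical edge is shared.
    if (abs(a_right - b.left_x) < eps or abs(b_right - a.left_x) < eps) and overlap_y > eps:
        return overlap_y
    # Blocks are stacked: a horizontal edge is shared.
    if (abs(a_top - b.bottom_y) < eps or abs(b_top - a.bottom_y) < eps) and overlap_x > eps:
        return overlap_x
    return 0.0


core = Block("core", 2.0, 2.0, 0.0, 0.0)
cache = Block("cache", 1.0, 1.0, 2.0, 0.5)
fpu = Block("fpu", 2.0, 1.0, 0.0, 2.0)
print(shared_edge_length(core, cache), shared_edge_length(core, fpu), shared_edge_length(cache, fpu))
```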

19
Lateral resistances (contd.)
  • Lengths used for silicon
  • Lengths used in the spreader

20
Our model (lateral and vertical)
Interface material (not shown)
21
Temperature equations
  • Fundamental RC differential equation
  • $P = C \, dT/dt + T/R$
  • Steady state
  • $dT/dt = 0$
  • $P = T/R$
  • When R and C are network matrices
  • Steady state: $T = R \times P$
  • Modified transient equation
  • $dT/dt + (RC)^{-1} \times T = C^{-1} \times P$
  • HotSpot software mainly solves these two equations
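
The step from the fundamental equation to the two matrix forms can be written out explicitly. This is a sketch of the algebra, assuming that for the network case $T/R$ is read as $R^{-1}T$ and that $C$ is invertible:

```latex
\begin{align*}
P &= C\,\frac{dT}{dt} + R^{-1}T
  && \text{(fundamental equation, matrix form)} \\
C^{-1}P &= \frac{dT}{dt} + C^{-1}R^{-1}T
  && \text{(multiply on the left by } C^{-1}\text{)} \\
\frac{dT}{dt} + (RC)^{-1}T &= C^{-1}P
  && \text{(since } C^{-1}R^{-1} = (RC)^{-1}\text{)} \\
T &= RP
  && \text{(steady state: } dT/dt = 0\text{)}
\end{align*}
```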

22
HotSpot
  • Time evolution of temperature is driven by unit
    activities and power dissipations averaged over
    10K cycles
  • Power dissipations can come from any power
    simulator, act as current sources in RC circuit
    ('P' vector in the equations)
  • Simulation overhead in Wattch/SimpleScalar < 1%
  • Requires models of
  • Floorplan: important for adjacency
  • Package: important for spreading and time
    constants
  • R and C matrices are derived from the above

23
Implementation
  • Primarily a circuit solver
  • Steady state solution
  • Mainly matrix inversion done in two steps
  • Decomposition of the matrix into lower and upper
    triangular matrices
  • Successive backward substitution of solved
    variables
  • Implements the pseudocode from CLR (Cormen,
    Leiserson, and Rivest, "Introduction to Algorithms")
  • Transient solution
  • Inputs: current temperature and power
  • Output: temperature for the next interval
  • Computed using a fourth order Runge-Kutta (RK4)
    method
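
The two-step steady-state solve described above (triangular decomposition, then substitution) can be sketched in a few lines. This is a minimal illustration, not HotSpot's code: it omits pivoting, which is a simplification, and the two-block conductance matrix and power vector are made-up values.

```python
def lu_decompose(a):
    """Split square matrix a into unit-lower L and upper U with a = L * U (no pivoting)."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [row[:] for row in a]
    for k in range(n):
        if upper[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        for i in range(k + 1, n):
            lower[i][k] = upper[i][k] / upper[k][k]
            for j in range(k, n):
                upper[i][j] -= lower[i][k] * upper[k][j]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve L * U * x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lower[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(upper[i][j] * x[j] for j in range(i + 1, n))) / upper[i][i]
    return x


# Hypothetical two-block network: conductances to ambient 1.0 and 0.5 W/K,
# lateral conductance 0.25 W/K between the blocks; powers of 10 W and 2 W.
G = [[1.25, -0.25], [-0.25, 0.75]]
P = [10.0, 2.0]
L, U = lu_decompose(G)
T = lu_solve(L, U, P)  # temperature rise above ambient for each block
print(T)
```

Factoring once and reusing L and U is attractive here because the network is fixed while the power vector changes from one query to the next.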

24
Transient solution
  • Solves differential equations of the form
    $dT/dt + AT = B$, where A and B are constants
  • In HotSpot, A is constant but B depends on the
    power dissipation
  • Solution: assume constant average power
    dissipation within an interval (10K cycles) and
    call RK4 at the end of each interval
  • In RK4, current temperature (at $t$) is advanced in
    very small steps ($t+h$, $t+2h$, ...) till the next
    interval (10K cycles)
  • RK4 because the error term is 4th order, i.e.,
    $O(h^4)$
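
The interval-by-interval scheme described above can be sketched for a single node as follows. This is an illustrative scalar version, assuming the equation is read as dT/dt + A·T = B with A = 1/(RC) and B = P/C; the function names and numeric values are invented for the example, and HotSpot itself works on vectors and matrices.

```python
import math


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def advance_interval(temp, a, b, interval_s, n_steps):
    """Advance dT/dt + a*T = b over one interval, holding b (power) constant."""
    h = interval_s / n_steps

    def f(_t, y):
        return b - a * y

    t = 0.0
    for _ in range(n_steps):
        temp = rk4_step(f, t, temp, h)
        t += h
    return temp


def exact(temp0, a, b, t):
    """Closed-form solution, used here only to check the numerical result."""
    steady = b / a
    return steady + (temp0 - steady) * math.exp(-a * t)


# Hypothetical node: R = 2 K/W, C = 0.5 J/K, P = 10 W, so a = 1.0 and b = 20.0.
approx = advance_interval(0.0, 1.0, 20.0, 1.0, 10)
print(approx, exact(0.0, 1.0, 20.0, 1.0))
```

Because the power is held constant within an interval, the only input that changes between calls is `b`, which matches the slide's remark that A is constant while B depends on power.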

25
Transient solution (contd.)
  • 4th order error has to be within the required
    precision
  • The step size (h) has to be small enough even for
    the maximum slope of the temperature evolution
    curve
  • Transient solution for the differential equation
    is of the form $Ae^{-Bt}$, where A and B depend
    on the RC network
  • Thus, the maximum value of the slope ($A \times B$)
    and the step size are computed accordingly

26
Validation
  • Validated and calibrated using MICRED test chips
  • 9x9 array of power dissipators and sensors
  • Compared to HotSpot configured with same grid,
    package
  • Within 7% for both steady-state and transient
    step-response
  • Interface material (chip/spreader) matters

27
Current features
  • Specification of arbitrary floorplans
  • Format of floorplan file
  • One line per unit
  • Line format: <unit-name>\t<width>\t<height>\t<left-x>\t<bottom-y>\n
  • Takes a power trace file as an input and outputs
    corresponding temperature trace
  • Ability to modify package specifications (type of
    interface material, size and type of heat
    spreader and heat sink etc.)
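
The tab-separated line format above is simple enough to read with a few lines of code. The parser below is a minimal sketch that follows only what the slide states (one unit per line, five tab-separated fields); the `Unit` type, the function name, and the sample unit names and dimensions are hypothetical.

```python
from typing import List, NamedTuple


class Unit(NamedTuple):
    name: str
    width: float
    height: float
    left_x: float
    bottom_y: float


def parse_floorplan(text: str) -> List[Unit]:
    """Parse lines of the form <unit-name>\\t<width>\\t<height>\\t<left-x>\\t<bottom-y>."""
    units = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        name, width, height, left_x, bottom_y = line.split("\t")
        units.append(Unit(name, float(width), float(height), float(left_x), float(bottom_y)))
    return units


sample = "icache\t0.004\t0.002\t0.0\t0.0\ndcache\t0.004\t0.002\t0.004\t0.0\n"
floorplan = parse_floorplan(sample)
for unit in floorplan:
    print(unit.name, unit.width * unit.height)
```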

28
Current floorplan
29
Current floorplan CPU core
30
Soon to be features
  • Grid model: RC network per grid cell instead of
    per block
  • Temperature models for wires, pads and interface
    material between heat sink and spreader
  • Better (more user friendly) floorplan
    specification
  • Automatic floorplan generation using classical
    floorplanning algorithms

31
Better floorplan specification
  • Floorplan of current microprocessors has a
    structural similarity
  • Floorplans similar to MIPS R10K, Pentium and the
    Alpha 21264
  • Pipeline order corresponds to floorplan adjacency

32
Better floorplan specification
  • Sample specification (with areas) that takes
    advantage of pipeline order

33
Automatic floorplan for architects
  • Why develop an architectural floorplanning tool?
  • Thermal modeling requires adjacency information.
  • Wire delays make performance depend on the
    floorplan.
  • Goal
  • Derive a realistic floorplan using only
    microarchitectural information
  • Trade off thermal efficiency against latency
  • Simulated annealing based floorplan optimization
    for thermal, delay and combined metrics
  • Current work. Results will be available soon

34
Sensors
  • Caveat emptor
  • We are not well-versed in sensor design; the
    following is a digest of information we have been
    able to collect from industry sources and the
    research literature.

35
Desirable Sensor Characteristics
  • Small area
  • Low Power
  • High Accuracy and Linearity
  • Easy access and low access time
  • Fast response time (slew rate)
  • Easy calibration
  • Low sensitivity to process and supply noise

36
PowerPC G3
  • (Sanchez et al., Symp. on VLSI Circuits '97,
    COMPCON '97)
  • 0.35 µm, 2.5 V
  • Area: 0.2 mm²
  • Power: 10 mW
  • Precision: 4.5 °C
  • Offset: 12 °C at process corners
  • Linearity: < 4
  • Based on thermal diodes and current mirrors

37
Types of Sensors
  • (In approx. order of increasing ease to build)
  • Thermocouples: voltage output
  • Junction between wires of different materials;
    voltage at terminals is $\propto T_{ref} - T_{junction}$
  • Often used for external measurements
  • Thermal diodes: voltage output
  • Biased p-n junction; voltage drop for a known
    current is temperature-dependent
  • Biased resistors (thermistors): voltage output
  • Voltage drop for a known current is
    temperature-dependent
  • You can also think of this as varying R
  • Example: 1 kΩ metal snake
  • BiCMOS, CMOS: voltage or current output
  • Rely on reference voltage or current generated
    from a reference band-gap circuit; current-based
    designs often depend on temperature dependence of
    threshold

38
Thermal Sensors in PowerPC
  • On-chip temperature sensor (junction temperature)
  • Based on differential voltage change across 2
    diodes of different sizes
  • Implemented in PowerPC G3/G4 processors
  • Instruction Cache Throttling used to dynamically
    lower junction temperature

39
Typical Sensor Configuration
PTAT: Proportional To Absolute Temperature
40
Absolute Sensor 1
Syal, Lee, Ivanov, Altet, Online Testing Workshop, 2001
[Figure: schematics of the Delta Vgs current reference generator (left) and delay cell (right)]
41
Sensors: Problem Issues
  • Poor control of CMOS transistor parameters
  • Noisy environment
  • Cross talk
  • Ground noise
  • Power supply noise
  • These can be reduced by making the sensor larger
  • This increases power dissipation
  • But we may want many sensors

42
Reasonable Values
  • Based on conversations with engineers at Sun,
    Intel, and HP (Alpha)
  • Linearity: not a problem for range of
    temperatures of interest
  • Slew rate: < 1 µs
  • This is the time it takes for the physical
    sensing process (e.g., current) to reach
    equilibrium
  • Sensor bandwidth: << 1 MHz, probably 100-200 kHz
  • This is the sampling rate: 100 kHz corresponds to
    a 10 µs period
  • Limited by slew rate but also A/D
  • Consider digitization using a counter
43
Reasonable Values: Precision
  • Mid-1980s: < 0.1° was possible
  • Precision
  • 3° is very reasonable
  • 2° is reasonable
  • 1° is feasible but expensive
  • < 1° is really hard
  • The limited precision of the G3 sensor seems to
    have been a design choice involving the
    digitization

P: 10s of mW
44
Calibration
  • Accuracy vs. Precision
  • Analogous to mean vs. stdev
  • Calibration deals with accuracy
  • The main issue is to reduce inter-die variations
    in offset
  • Typically requires per-part testing and
    configuration
  • Basic idea: measure offset, store it, then
    subtract this from dynamic measurements
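
The measure, store, subtract idea amounts to very little code. The sketch below is only an illustration of the arithmetic; the function names and the temperatures are invented, and a real part would store the offset in fuses or a register rather than a variable.

```python
def measure_offset(sensor_reading: float, reference_temp: float) -> float:
    """Per-part offset, measured once against a known reference temperature."""
    return sensor_reading - reference_temp


def corrected_reading(sensor_reading: float, stored_offset: float) -> float:
    """Apply the stored offset to a reading taken during normal operation."""
    return sensor_reading - stored_offset


# At test time this part reads 68.5 when the true temperature is 65.0.
offset = measure_offset(68.5, 65.0)
# Later, in the field, a raw reading of 91.5 is corrected with the stored offset.
print(offset, corrected_reading(91.5, offset))
```

This removes the systematic error (accuracy) for that part but does nothing about reading-to-reading spread (precision).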

45
Dynamic Offset Cancelation
  • Rich area of research
  • Build circuit to continuously, dynamically detect
    offset and cancel it
  • Typically uses an op-amp
  • Has the advantage that it adapts to changing
    offsets
  • Has the disadvantage of more complex circuitry

46
Role of Precision
  • Suppose
  • Junction temperature is J
  • Max variation in sensor is S
  • Thermal emergency is T
  • $T = J - S$
  • Spatial gradients
  • If sensors cannot be located exactly at hotspots,
    measured temperature may be G lower than true
    hotspot
  • $T = J - S - G$
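
A short worked example shows how sensor variation and spatial gradient eat into the usable temperature range, reading the slide's relation as threshold = J - S - G. The function name and the numbers are hypothetical.

```python
def trigger_threshold(junction_limit: float, sensor_variation: float,
                      spatial_gradient: float = 0.0) -> float:
    """Temperature reading at which thermal management must engage."""
    return junction_limit - sensor_variation - spatial_gradient


# Assumed values: junction limit 85, sensor may read up to 3 low,
# and the hotspot may be up to 2 hotter than the sensor location.
print(trigger_threshold(85.0, 3.0))       # sensor variation only
print(trigger_threshold(85.0, 3.0, 2.0))  # sensor variation plus spatial gradient
```

Every degree of imprecision or gradient lowers the trigger point by a degree, so throttling starts earlier than the true limit requires.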

47
Rate of change of temperature
  • Our FEM simulations suggest maximum 0.1° in about
    25-100 µs
  • This is for power density < 1 W/mm², die thickness
    between 0.2 and 0.7 mm, and contemporary packaging
  • This means slew rate is not an issue
  • But sampling rate is!

48
Sensors Summary
  • Sensor precision cannot be ignored
  • Reducing operating threshold by 1-2 degrees will
    affect performance
  • Precision of 1° is conceivable but expensive
  • Maybe reasonable for a single sensor or a few
  • Precision of 2-3° is reasonable even for a
    moderate number of sensors
  • Power and area are probably negligible from the
    architecture standpoint
  • Sampling period < 10-20 µs

49
HotSpot Summary
  • HotSpot is a simple, accurate and fast
    architecture level thermal model for
    microprocessors
  • Over 90 downloads till now
  • Ongoing active development: architecture-level
    floorplanning will be available soon
  • Download site
  • http://lava.cs.virginia.edu/HotSpot
  • Mailing list
  • www.cs.virginia.edu/mailman/listinfo/hotspot

50
  • Temperature-aware computing
  • Optimize performance subject to a thermal
    constraint