Title: Adaptive Designs for Power and Thermal Optimization
1 Adaptive Designs for Power and Thermal Optimization
Richard McGowen, Intel Corporation, Fort Collins, CO
2 Organization
- Introduction
- Motivation
- Implementation Details
- Results
- Future Directions
- CAD Challenges
3 Adaptive Designs
[Diagram labels: Sensor, Decision Logic, Control]
- Designs that monitor their operating conditions and adapt their behavior to handle those conditions
- Trend due to necessity and opportunity
- Process scaling plays a large role
- Increased density
- Increased variability
4 Examples
- Memories
- Correctness and reliability
- Using ECC to correct for soft errors
- Thermals
- Protection and reliability
- Using a thermal sensor to keep the processor from burning up
- Power
- TCO, battery longevity, and performance
- Using a power sensor to maintain a target power level, enabling a simple cooling solution at both the processor and data center level
5 Motivation
6 Power as a Function of Application
7 Power Motivation
- Large difference between best-case and worst-case switching power
- Minimum average -> maximum average
- 65 to 89
- Minimum average -> maximum peak
- 65 to 100
- When power limited, frequency binning a part for the worst-case application artificially limits the frequency of all other applications
- Transaction/integer code is limited by technical computing
8 Power Motivation
- What happens when you add more cores?
- Can you afford a power limit based on the sum of the worst-case power of each core?
- What are the chances of each core running worst-case code at the same time?
- Particularly as the number of cores increases
- Is the industry trending toward more or fewer cores?
- Need to address the gap between typical power and max power to avoid losing performance
- This is where adaptive design can help
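As a rough illustration of the "what are the chances" question, the sketch below assumes each core independently runs worst-case code with probability `p_single`. Both the independence assumption and the example probability are hypothetical and do not come from the talk.

```python
def prob_all_worst_case(p_single: float, n_cores: int) -> float:
    """Chance that every core runs worst-case code at once, assuming independence."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return p_single ** n_cores


# Hypothetical per-core probability of 0.2; the joint chance shrinks quickly.
for n in (1, 2, 4, 8):
    print(n, prob_all_worst_case(0.2, n))
```

Under these assumptions, a power limit set to the sum of per-core worst cases is provisioned for an event that becomes rarer with every added core.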
9 Implementation Details
10 Take Advantage of P ∝ V²F
[Chart: Power (W), 0 to 100, broken down into Core Switching, Core Leakage, and Other, versus Frequency (% of Fmax), 50 to 100; series labeled "Frequency" and "Frequency + Voltage"]
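A minimal sketch of why scaling voltage together with frequency saves more switching power than scaling frequency alone, using only the P ∝ V²F relation from the slide title. The function name and scale factors are illustrative assumptions. Leakage and the "Other" component from the chart are not modeled.

```python
def relative_switching_power(v_scale: float, f_scale: float) -> float:
    """Switching power relative to nominal, from P proportional to V^2 * F."""
    return v_scale ** 2 * f_scale


# Frequency-only scaling to 80% of Fmax: voltage stays at nominal.
freq_only = relative_switching_power(1.0, 0.8)

# Frequency and voltage both scaled to 80% (assumes voltage can track
# frequency linearly, which is a simplification).
freq_and_voltage = relative_switching_power(0.8, 0.8)

print(freq_only, freq_and_voltage)
```

With these example factors, frequency-only scaling leaves about 80% of nominal switching power, while combined scaling leaves about 51%.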
11 High Level View of System
[Block diagram labels: Voltage Control, Frequency Control; time scales: 10s of µs, 100s of ps]
12 High Level View of System
[Block diagram labels: Power Sensor, Supply VRM, Micro-Controller, Frequency Control; time scales: 10s of µs, 100s of ps]
13 High Level View of System
[Block diagram labels: Power Sensor, Supply VRM, Micro-Controller, Thermal Sensor, Frequency Control; time scales: 10s of µs, 100s of ps]
14 High Level View of System
[Block diagram labels: Power Sensor, Supply VRM, Micro-Controller, Thermal Sensor, Voltage to Freq. Converter, Voltage Sensor, Clock; time scales: 10s of µs, 100s of ps]
15 Power Control System
[Same block diagram as slide 14]
16 Power Control System
[Control-loop diagram labels: PLimit, PCalc, Calc, IIR, DAC, VID6, VConnector, VDie, RPackage; regions: Power Supply, Micro-Controller, Package/Die; legend: Sensor, Logic, Control]
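The diagram labels suggest a loop that compares a calculated power (PCalc) against a limit (PLimit), filters the result with an IIR stage, and drives a VID. The sketch below is one possible reading of that loop, not the actual controller: the class name, filter coefficient, deadband, one-step VID policy, and the assumption that a higher VID code means a higher voltage are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PowerLimitLoop:
    """Toy power-limiting loop: filter the power error, then nudge the VID."""
    p_limit_w: float              # target power limit (PLimit)
    vid: int                      # current voltage ID code
    alpha: float = 0.25           # IIR smoothing factor (hypothetical value)
    deadband_w: float = 1.0       # ignore small filtered errors (hypothetical)
    vid_min: int = 0
    vid_max: int = 63             # 6-bit VID range
    filtered_error_w: float = 0.0

    def step(self, p_calc_w: float) -> int:
        """Run one loop iteration with the latest power estimate (PCalc)."""
        error_w = self.p_limit_w - p_calc_w
        # First-order IIR low-pass: y += alpha * (x - y)
        self.filtered_error_w += self.alpha * (error_w - self.filtered_error_w)
        if self.filtered_error_w < -self.deadband_w:
            # Over the limit: request one voltage step lower.
            self.vid = max(self.vid_min, self.vid - 1)
        elif self.filtered_error_w > self.deadband_w:
            # Headroom available: request one voltage step higher.
            self.vid = min(self.vid_max, self.vid + 1)
        return self.vid


loop = PowerLimitLoop(p_limit_w=100.0, vid=32)
for p_calc in (110.0, 108.0, 95.0):
    print(loop.step(p_calc))
```

The filter keeps a single noisy power sample from moving the supply, at the cost of reacting a few iterations late.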
17 Power Sensor
[Same block diagram as slide 14]
18 Measuring Power
[Circuit diagram labels: VDie, VConnector, RPackage]
- Use package resistance to measure power
- Avoids burning extra power in measurement
- Portable, self-contained solution
- No dependence on external power supply
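One way to read the bullets above: the voltage drop across the package resistance gives the supply current, and multiplying by the die voltage gives die power. The sketch below works through that arithmetic. The function name and example values are assumptions for illustration, and the package resistance would need calibration in practice.

```python
def die_power_w(v_connector: float, v_die: float, r_package_ohm: float) -> float:
    """Estimate die power from the voltage drop across the package resistance."""
    if r_package_ohm <= 0:
        raise ValueError("r_package_ohm must be positive")
    # Ohm's law across the package: I = (VConnector - VDie) / RPackage
    current_a = (v_connector - v_die) / r_package_ohm
    # Power delivered to the die: P = VDie * I
    return v_die * current_a


# Hypothetical values: 1.2 V at the connector, 1.1 V at the die, 1 milliohm package.
print(die_power_w(1.2, 1.1, 0.001))
```

Because the package resistance is already in the current path, no extra sense resistor is needed, which matches the "avoids burning extra power" point.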
19 Voltage Measurement
[Diagram labels: VAnalog, VDigital, VCO, Counter]
- By using an analog mux, we can reuse the same VCO to measure voltages from multiple sources
- On-die voltmeter
- High-speed counter, >>10 GHz, for 100 µV measurement granularity
- 8 µs counting interval for filtering and resolution
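A hedged sketch of how a VCO plus counter can act as a voltmeter: count VCO edges over a fixed interval, convert the count to a frequency, and map frequency back to voltage. It assumes a linear VCO described by a reference point (`f0_hz` at `v0`) and a gain in Hz per volt. These parameters and the example gain are hypothetical, not figures from the talk.

```python
def count_to_voltage(count: int, interval_s: float,
                     f0_hz: float, v0: float, gain_hz_per_v: float) -> float:
    """Convert a counter reading to volts, assuming a linear VCO."""
    freq_hz = count / interval_s
    return v0 + (freq_hz - f0_hz) / gain_hz_per_v


def volts_per_count(interval_s: float, gain_hz_per_v: float) -> float:
    """Voltage change that moves the count by one over the interval."""
    return 1.0 / (interval_s * gain_hz_per_v)


# With the 8 microsecond interval from the slide and a hypothetical gain of
# 1.25 GHz/V, one count corresponds to roughly 100 microvolts.
print(volts_per_count(8e-6, 1.25e9))
```

A longer counting interval improves resolution and averages out noise, which is the trade-off the last bullet points at.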
20 Micro-Controller
[Same block diagram as slide 14]
21 Micro-Controller
Foxton Controller
- Burns less than 0.5 W, <0.5%
22 Supply VRM
[Same block diagram as slide 14]
23 Supply VRM
[Diagram labels: Connector, Connector]
- Three separate supplies
- Cache, Core, and Fixed
- Cache and Core can be set by the micro-controller
- 6-bit VIDs (12.5 mV increments)
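To make the VID arithmetic concrete, the sketch below maps a 6-bit code to a voltage in 12.5 mV steps. The base voltage and the assumption that higher codes mean higher voltage are hypothetical, since the slide gives only the bit width and step size.

```python
VID_BITS = 6
VID_STEP_V = 0.0125  # 12.5 mV per step, from the slide


def vid_to_voltage(vid: int, v_base: float) -> float:
    """Map a VID code to volts, assuming codes increase linearly from v_base."""
    if not 0 <= vid < 2 ** VID_BITS:
        raise ValueError("vid out of range for a 6-bit code")
    return v_base + vid * VID_STEP_V


# Span covered by the 64 codes, independent of the (hypothetical) base voltage.
span_v = vid_to_voltage(63, 0.0) - vid_to_voltage(0, 0.0)
print(span_v)
```

Whatever the base voltage, 64 codes at 12.5 mV cover a span of 787.5 mV.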
24 High Level View of System
[Same block diagram as slide 14]
25 Measuring Temperature
- Two thermal sensors per core
- Mux thermal diodes into VCOs to measure temp
26 High Level View of System
[Same block diagram as slide 14]
27 Frequency Control System
[Diagram annotations: several RVDs per DFD; 3 DFDs per core]
28 Results
29 Frequency vs. Power Limit
[Chart: Core 0, Core 1, and average frequency vs. PLimit; y-axis Frequency, 84 to 100; x-axis PLimit, 100 W down to 60 W in 5 W steps]
30 Frequency Benchmarks
[Frequency histogram for SpecInt.eon and SpecFp.galgel; x-axis Normalized Frequency, 1.00 down to 0.50; y-axis 0 to 100]
Chart annotations: "SpecInt.eon doesn't throttle"; "System placed in light-halt by Linux during idle prompt"
31 Power Benchmarks
[Power histogram; x-axis Power (Watts), 116 down to 70; y-axis 0 to 60]
32 Future Directions
33 Future Directions for Power Control
- Finer Grained
- More power planes
- Core or Sub-Core
- On demand cores
- Self-reconfiguring for high performance vs. throughput
- Tighter integration with supply
- Think of on-board memory controllers
- Potential to improve reliability and bandwidth
34 CAD Challenges
35 CAD Challenges
- Mixed-Mode/Analog Formal Verification
- Very good tools for RTL vs. Schematic for digital circuits
- RTL vs. Analog consists of comparing RTL simulations against SPICE simulations
- Error-prone
- Vector-based risk
- Lack of assertion-based checking
- Repeatability/Variation
- Stored-response testers are very unhappy
- Compiler tuning
- Benchmarking
36 CAD Challenges
- Mismatched time scales
- Power control system has 8 µs loop
- Full chip has < 1 ps cycle with digital simulation rate on the order of Hz
- Day vs. minutes
- Had to separate into full chip and system models
- Interface checking becomes painful
- Simulation in general without fixed reference
- Varying frequency
- Requires further subdivision -> kills performance
- Traditional Embedded IP/Multi-instantiation/Mixed-Mode simulation