Leakage Power Optimization With Dual-Vth Library In High-Level Synthesis

1
Leakage Power Optimization With Dual-Vth Library
In High-Level Synthesis
Hai Zhou, haizhou_at_ece.northwestern.edu, Northwestern University, Evanston, IL, USA (Presenter)
Prith Banerjee, prith_at_uic.edu, University of Illinois at Chicago, Chicago, IL, USA
Xiaoyong Tang, xtang_at_magma-da.com, Magma Design Automation, Inc., Santa Clara, CA, USA
2
Outline
  • Introduction
  • Related Work
  • Problem Formulation
  • MWIS-Based Algorithm for Leakage Power
    Optimization
  • Experimental Results
  • Conclusions and Future Work

3
Introduction
  • Low Power Design
    • Portable systems
    • Thermal considerations
    • Reliability issues
    • Environmental concerns
  • Leakage Power Consumption
    • Becomes dominant in digital designs under 90nm

4
Source of Static Power Consumption
  • Psub-threshold: due to sub-threshold leakage
    currents
  • Pgate: due to gate oxide tunneling, hot carrier
    injection into gates, gate induced drain leakage
    currents
  • Others: Ppn (PN junction reverse bias currents),
    Ppunch-thru (punch-through effects)

5
Basic Facts of Leakage Currents
  • Leakage current for an n-type MOSFET transistor
    with Vgs = 0
  • Leakage current decreases exponentially with the
    increase of the threshold voltage
  • For gates, leakage current also depends on the
    input state (stack effects)
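
The slide's own formula did not survive in this transcript. As a hedged reference point, the standard textbook form of the sub-threshold current is shown below; the symbols I_0 (a process-dependent prefactor), n (sub-threshold swing coefficient) and V_T = kT/q (thermal voltage) are assumptions of this sketch, not taken from the slide. Setting Vgs = 0 makes the exponential dependence on Vth explicit.

```latex
I_{sub} = I_0 \, e^{(V_{gs} - V_{th})/(n V_T)} \left(1 - e^{-V_{ds}/V_T}\right),
\qquad V_T = \frac{kT}{q}

% With V_{gs} = 0 (transistor nominally off):
I_{leak} = I_0 \, e^{-V_{th}/(n V_T)} \left(1 - e^{-V_{ds}/V_T}\right)
```

Under this model, raising Vth by an amount ΔV scales the leakage current by a factor of exp(-ΔV / (n V_T)), which is the exponential decrease the second bullet refers to.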

6
Related Work
  • Leakage Power Estimation
    • K. Roy et al. (IEEE '03), R. X. Gu et al. (J.
      of Solid-State Circuits '96), Khouri et al.
      (TVLSI '02), Bobba et al. (1999)
    • Device or Gate Level Estimation
  • Leakage Power Optimization
    • Srivastava et al. (ASPDAC '03), Khouri et al.
      (TVLSI '02), Ye et al. (Symp. on VLSI Cir. '98),
      Abdollahi et al. (TVLSI '04)
  • Algorithms based on MWIS
    • Chen et al. (DAC '96)

7
Dual Threshold Voltage Library in High-Level
Synthesis
  • High level synthesis has the biggest impact on
    the whole system design
  • Number and type of resources
  • Operation execution sequence
  • Early estimation of performance, area and power
  • High-Vth device has lower leakage power
    consumption but slower speed than low-Vth device
  • Replace module instances on non-critical paths
    with their high-Vth counterparts to minimize
    leakage power

8
Problem Definition
  • Given a synthesized data flow graph, timing
    constraints, and a dual-Vth library, replace some
    module instances with their corresponding
    high-Vth implementations, such that the data
    dependency and timing constraints are satisfied
    and the total leakage power consumption is
    minimized.
  • Challenges
    • Slack dependency
    • Dual-Vth library (incomplete library)
    • Resource sharing

9
Graph Representations Example

[Figure: sample DFG with operation nodes A to I, and the Module Instance Usage Graph (MIUG) of each module instance: fa_1, sa_1, sa_2, fs_1, ss_1, ss_2, ss_3]
  • Sample DFG
  • To be illustrative, use the following
    assumptions
  • For low-Vth implementations, the delay of fa_1
    and fs_1 is 1 cycle, others are 2 cycles
  • Their high-Vth designs will increase the delays
    by 1 cycle
  • Timing constraint is 8 cycles

10
Graph Representation
[Figure: Composite Constraint Graph and Slack Graph over operations A to I; each node is annotated with its delay in cycles and, in the Slack Graph, an ASAP/ALAP/slack triplet such as 1/4/3]
  • The CCG contains data dependency and resource
    constraints in one graph
  • The SG represents the delay of each node, as well
    as the time triplet ASAP/ALAP/Slack
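
A minimal sketch of how an ASAP/ALAP/slack triplet can be computed from a constraint graph. The function name, the four-node graph and its edges are hypothetical; only the 1- and 2-cycle delays and the 8-cycle timing constraint follow the assumptions stated for the sample DFG. Start cycles are 1-indexed, and slack is ALAP minus ASAP.

```python
from collections import defaultdict

def slack_triplets(delay, edges, deadline):
    """Return {node: (asap, alap, slack)} using 1-indexed start cycles."""
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm); the list grows while iterating.
    indeg = {v: len(preds[v]) for v in delay}
    order = [v for v in delay if indeg[v] == 0]
    for u in order:
        for v in succs[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                order.append(v)

    # ASAP: start as soon as every predecessor has finished.
    asap = {}
    for v in order:
        asap[v] = max((asap[u] + delay[u] for u in preds[v]), default=1)

    # ALAP: finish just before the earliest successor must start,
    # or at the deadline when there is no successor.
    alap = {}
    for v in reversed(order):
        latest_finish = min((alap[s] - 1 for s in succs[v]), default=deadline)
        alap[v] = latest_finish - delay[v] + 1

    return {v: (asap[v], alap[v], alap[v] - asap[v]) for v in delay}

# Hypothetical graph: A and F feed E, and E feeds G.
delay = {"A": 2, "F": 1, "E": 1, "G": 2}
edges = [("A", "E"), ("F", "E"), ("E", "G")]
triplets = slack_triplets(delay, edges, deadline=8)
for node, (asap, alap, slack) in triplets.items():
    print(f"{node}: {asap}/{alap}/{slack}")
```

In the CCG, resource-sharing constraints would appear as extra edges in the same `edges` list, so the same pass covers both kinds of constraint.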

11
Estimation of Leakage Energy Reduction by Using
Dual-Vth Library
  • MIUG(m): MIUG of module instance m
  • LPWT_v: leakage power consumption for the input
    states of node v
  • t_idle: idle time for the module instance
  • α: empirical coefficient for the effective idle
    portion during computation
  • D: delay for the computation

12
Greedy Approach for Replacement
[Figure: DFG with Resources and Leakage Power Reduction Labeled. Leakage power reduction per module instance: fa_1 34, ss_3 30, fs_1 26, ss_1 26, ss_2 26, sa_1 22, sa_2 22]
  • The greedy approach first replaces the instance
    with the largest leakage power reduction: fa_1
  • It uses only limited local information

13
Maximum Weight Independent Set Problem
  • Our goal
    • Simultaneous replacements with maximum leakage
      power reduction
    • Ensure that the slacks of the instances in the
      replacement set are independent of each other
  • Maximum Weight Independent Set (MWIS) Problem
    • Definition: if G = (V, E) is an undirected graph
      and w: V -> R is a weight function defined on the
      node set, find the independent set S that
      maximizes the weight w(S) = sum(w(s) : s in S)
    • NP-hard for general graphs
    • There are polynomial-time algorithms for
      comparability graphs (i.e. transitively
      orientable graphs, or partially orderable graphs)
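
To make the definition concrete, here is a minimal brute-force sketch that enumerates every subset of a tiny graph and keeps the heaviest independent one. It is exponential and only illustrates the problem statement; it is not the polynomial-time method for comparability graphs. The function name, node names and weights are hypothetical.

```python
from itertools import combinations

def max_weight_independent_set(nodes, edges, weight):
    """Exhaustively search for the independent set with the largest total weight."""
    edge_set = {frozenset(e) for e in edges}
    best, best_weight = set(), 0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            # Independent means no two chosen nodes share an edge.
            if any(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                continue
            total = sum(weight[v] for v in subset)
            if total > best_weight:
                best, best_weight = set(subset), total
    return best, best_weight

# Hypothetical path graph x - y - z.
nodes = ["x", "y", "z"]
edges = [("x", "y"), ("y", "z")]
weight = {"x": 22, "y": 34, "z": 22}
best, best_weight = max_weight_independent_set(nodes, edges, weight)
print(sorted(best), best_weight)
```

Here the single heaviest node y (34) is not part of the best set: x and z together weigh 44, which is why choosing replacements one at a time by largest weight can miss the best simultaneous choice.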

14
Heuristic Approach Based On Simultaneous
Replacements
  • Basic Ideas
    • Analyze the slack distributions
    • Estimate the reductions of leakage power for
      individual replacements of the module instances
    • Analyze the correlations between slack changes
      and module replacements
    • Perform multiple replacements with maximum
      leakage reduction while maintaining the validity
      of the synthesis result
  • Difficulties
    • Slack dependence analysis
    • Selection of the module instance set to be replaced

15
Slack Sensitive Graph and Its Transitive Closure
Graph
[Figure: Slack Sensitive Graph and Slack Sensitive Transitive Closure Graph over operations A to I; each node is annotated with its delay in cycles and its ASAP/ALAP/slack triplet]
  • Slack Sensitive Edge (u, v): ASAP critical or
    ALAP critical (Chen et al., DAC '96)
  • Slack Insensitive Set: e.g. E, F, I, A, B, D

16
MWIS Based Heuristic Algorithm
  • Step 1: Construct the module instance usage graphs
    (MIUGs)
  • Step 2: Construct the composite constraint graph
    (CCG)
  • Step 3: Construct a general transitive closure
    graph (TG) from the CCG
  • Step 4: Construct a module instance sensitive
    graph (MISG) from the TG
  • Step 5: Test whether the MISG is transitively
    orientable and, if so, find a transitive
    orientation for it
  • Step 6: For each module instance, perform a
    tentative replacement, build the slack graph (SG)
    from the CCG, and update the safety of the
    replacement. If there is no safe replacement,
    return.
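
The tentative-replacement check in Step 6 can be sketched as follows, under simplifying assumptions: safety is judged only by whether the longest path still fits the timing constraint, every high-Vth replacement adds a fixed one-cycle penalty to each operation bound to that instance, and the graph, binding and 5-cycle deadline are hypothetical. The function names are illustrative, not the authors' implementation.

```python
def critical_path_length(delay, edges):
    """Longest path through the DAG, measured as the sum of node delays."""
    preds = {v: [] for v in delay}
    for u, v in edges:
        preds[v].append(u)
    finish = {}

    def finish_time(v):
        # Memoized earliest finish time of node v.
        if v not in finish:
            finish[v] = delay[v] + max((finish_time(u) for u in preds[v]), default=0)
        return finish[v]

    return max(finish_time(v) for v in delay)

def safe_replacements(delay, edges, binding, penalty, deadline):
    """Instances whose tentative high-Vth replacement still meets the deadline."""
    safe = []
    for inst in sorted(set(binding.values())):
        # Slow down every operation bound to this instance, then re-check timing.
        trial = {v: d + (penalty if binding[v] == inst else 0)
                 for v, d in delay.items()}
        if critical_path_length(trial, edges) <= deadline:
            safe.append(inst)
    return safe

# Hypothetical DFG: A -> E -> G is the critical chain, F -> G has slack.
delay = {"A": 2, "E": 1, "G": 2, "F": 1}
edges = [("A", "E"), ("E", "G"), ("F", "G")]
binding = {"A": "sa_1", "E": "fs_1", "G": "ss_1", "F": "fa_1"}
safe = safe_replacements(delay, edges, binding, penalty=1, deadline=5)
print(safe)
```

Only the instance bound to the off-critical operation F passes the check here; slowing any operation on the A, E, G chain would exceed the assumed deadline.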

17
MWIS Based Heuristic Algorithm
  • Step 7: For each safe node U in the MISG,
    calculate and assign a leakage power reduction
    weight.
  • Step 8: If the MISG is a transitive graph, find
    its maximum weight independent set; otherwise,
    find a near-maximum independent set of the MISG
    using a greedy approach.
  • Step 9: Replace the module instances in the set
    found in Step 8 with their high-Vth designs.
  • Step 10: Update the delay of each operation node
    in the DFG
  • Step 11: Go to Step 6
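
The slides do not spell out the greedy fallback used in Step 8. One common choice, shown here purely as an assumption, is to scan nodes from heaviest to lightest and keep a node whenever none of its neighbours has been kept. Nodes with zero weight are skipped, on the assumption that a zero weight marks an instance that is unsafe or already replaced. All names and the example graph are hypothetical.

```python
def greedy_independent_set(weight, edges):
    """Heaviest-first greedy heuristic for a weighted independent set."""
    adjacent = {v: set() for v in weight}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)

    chosen, blocked = [], set()
    # Sort by descending weight; break ties by name for determinism.
    for v in sorted(weight, key=lambda n: (-weight[n], n)):
        if weight[v] <= 0 or v in blocked:
            continue
        chosen.append(v)
        blocked |= adjacent[v]  # neighbours can no longer be selected
    return chosen

# Hypothetical conflict graph: a path p - q - r - s.
weight = {"p": 10, "q": 20, "r": 25, "s": 5}
edges = [("p", "q"), ("q", "r"), ("r", "s")]
chosen = greedy_independent_set(weight, edges)
print(chosen, sum(weight[v] for v in chosen))
```

The result is always an independent set, but the heuristic carries no optimality guarantee, which matches the "near-maximum" wording of Step 8.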

Time Complexity: O(MV^3)
18
Graph Example

[Figure: sample DFG with operation nodes A to I, and the Module Instance Usage Graph (MIUG) of each module instance: fa_1, sa_1, sa_2, fs_1, ss_1, ss_2, ss_3]
  • Sample DFG
  • To be illustrative, use the following
    assumptions
  • For low-Vth implementations, the delay of fa_1
    and fs_1 is 1 cycle, others are 2 cycles
  • Their high-Vth designs will increase the delays
    by 1 cycle
  • Timing constraint is 8 cycles

19
Heuristic Algorithm Example
[Figure: General Transitive Closure Graph (TG) over operations A to I with node delays, and Module Instance Sensitive Graph (MISG) over module instances sa_1, fa_1, sa_2, fs_1, ss_3, ss_1, ss_2]
20
Heuristic Algorithm Example
[Figure: Induced Transitive Graph from MISG with weights labeled (fa_1 34, sa_1 22, sa_2 22, fs_1 26, ss_3 30, ss_1 26, ss_2 26), and New Slack Graph over operations A to I with updated delays and ASAP/ALAP/slack triplets]
  • MWIS heuristic approach
    • The input to the MWIS solver is the weighted
      transitive graph
    • First round: the solver selects fs_1, sa_2, and
      ss_3 for replacement

21
Heuristic Algorithm Example
[Figure: Induced Transitive Graph from MISG with weights labeled (sa_1 22, ss_1 26, ss_2 26, all other instances 0), and New Slack Graph over operations A to I with updated delays and ASAP/ALAP/slack triplets]
  • MWIS heuristic approach
    • Second round: ss_1, ss_2

22
Heuristic Algorithm Example
[Figure: Induced Transitive Graph from MISG with weights labeled (sa_1 22, all other instances 0), and New Slack Graph over operations A to I with updated delays and ASAP/ALAP/slack triplets]
  • MWIS heuristic approach
    • Third round: sa_1
  • Total Reduction: 152

23
Greedy Algorithm Example
[Figure: Induced Transitive Graph from MISG with weights labeled (fa_1 34, sa_1 22, sa_2 22, fs_1 26, ss_3 30, ss_1 26, ss_2 26), and New Slack Graph over operations A to I with updated delays and ASAP/ALAP/slack triplets]
  • The greedy approach replaces one instance per
    iteration: fa_1, fs_1, ss_3, sa_1, and sa_2
  • Total Reduction: 134
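
The two totals can be checked with a few lines of arithmetic. The per-instance weights below are read from the labels in the example figures (fa_1 34, ss_3 30, fs_1, ss_1 and ss_2 26 each, sa_1 and sa_2 22 each); the variable names are only illustrative.

```python
# Leakage power reduction per module instance, as labeled in the example.
weights = {"fa_1": 34, "ss_3": 30, "fs_1": 26, "ss_1": 26,
           "ss_2": 26, "sa_1": 22, "sa_2": 22}

# Replacement sets chosen in the three rounds of the MWIS heuristic.
mwis_rounds = [["fs_1", "sa_2", "ss_3"], ["ss_1", "ss_2"], ["sa_1"]]
# Instances replaced one per iteration by the greedy approach.
greedy_order = ["fa_1", "fs_1", "ss_3", "sa_1", "sa_2"]

mwis_total = sum(weights[m] for rnd in mwis_rounds for m in rnd)
greedy_total = sum(weights[m] for m in greedy_order)
print(mwis_total, greedy_total, mwis_total - greedy_total)  # 152 134 18
```

The greedy run spends slack on fa_1 first and ends with five replacements, while the MWIS heuristic leaves fa_1 alone and replaces the six other instances, for a gain of 18.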

24
Experiments
  • Benchmarks
  • Diffeq
  • A differential equation solver
  • Ellipf
  • Elliptical wave filter
  • FIR filter
  • Band pass filter
  • Laplace edge detection
  • Matrix multiplication
  • O(n^3)
  • Sobel edge detection
  • Basic Jacobi style algorithm (nearest neighbor)
  • 0.18µm 1.8V Technology Library

25
MWIS Heuristic Algorithm Optimization Results
[Chart comparing four series: Initial Results, Greedy, LP/ILP, and MWIS_Heuristic]
26
MWIS Heuristic Algorithm Optimization Results
[Chart comparing four series: Initial Results, Greedy, LP/ILP, and MWIS_Heuristic]
27
MWIS Heuristic Algorithm Optimization Results
[Chart comparing four series: Initial Results, Greedy, LP/ILP, and MWIS_Heuristic]
28
Conclusions
  • Problem Formulation
    • Leakage power minimization through dual-Vth
      re-binding
    • Iteratively apply a Maximum Weight Independent
      Set algorithm to find the resources whose
      simultaneous replacement gives the maximum power
      savings
    • Heuristic because of the power model, resource
      sharing, and the incomplete library
  • Experimental Results
    • Average 70.9% leakage power reduction
    • Close to the ILP approach but much faster

29
Future Work
  • The current approach starts with the minimal
    latency and iteratively reduces leakage
    • Investigate an approach that starts with the
      minimal leakage and iteratively reduces latency
  • The current approach assumes a fixed sharing and
    usage sequence
    • Investigate dual-Vth aware sharing, or
      combined sharing and dual-Vth binding

30
Thank you
  • Q&A?