RTL Power Optimization with Gate-level Accuracy

1
RTL Power Optimization with Gate-level Accuracy
Qi Wang Cadence Design Systems, Inc
Sumit Roy Calypto Design Systems, Inc
2
Outline
  • Introduction
  • Literature Review
  • Proposed Approach
  • Experimental Results
  • Conclusion
  • Future work

3
RTL Power Optimization Techniques
  • Clock gating
      • Shut off the clock signal when the outputs of
        the driven registers are not used.
      • Reduces the dynamic power dissipated by the
        clock tree network and the registers.
  • Sleep mode (operand isolation)
      • Shut off combinational blocks when the outputs
        of the blocks are not used.
      • Useful for designs with many datapath blocks
        and low average switching activity.
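The effect of operand isolation can be illustrated with a small toggle-counting sketch. It is not taken from the presentation: the 8-bit operand stream, the enable pattern, and the helper names `count_toggles` and `isolate` are all hypothetical, and the isolation style (forcing the operand to zero, as an AND gate would) is only one of several possible gating schemes.

```python
def count_toggles(values, width=8):
    """Count bit flips between consecutive values on a bus of the given width."""
    mask = (1 << width) - 1
    return sum(bin((prev ^ cur) & mask).count("1")
               for prev, cur in zip(values, values[1:]))


def isolate(operands, enables):
    """AND-style isolation: force the operand to 0 whenever enable is 0."""
    return [x if en else 0 for x, en in zip(operands, enables)]


# Hypothetical operand stream feeding a datapath block, one value per cycle.
a = [0x00, 0xFF, 0x0F, 0xF0, 0xAA, 0x55]
# Hypothetical enable: the block's result is only used in the first and last cycle.
en = [1, 0, 0, 0, 0, 1]

raw = count_toggles(a)                   # toggles seen by the block without isolation
gated = count_toggles(isolate(a, en))    # toggles seen by the block with isolation
print(raw, gated)
```

Fewer input toggles mean less switching inside the isolated block, but the gating cells themselves add delay and power, so the net effect depends on how often the enable is low.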

4
Example of Sleep Mode Transformation
5
Challenges
  • Timing closure
  • Achieving power saving
  • Identifying complex enable functions
[Figure: example circuit with inputs a and b, output out, and enable signals en1 and en2]
6
Literature Review
  • H. Kapadia et al., "Reducing Switching
    Activity on Datapath Buses with Control-Signal
    Gating," IEEE Journal of Solid-State Circuits,
    Vol. 34, No. 3, March 1999, pp. 405-414.
  • S. Dey et al., "Controller-Based Power
    Management for Control-Flow Intensive Designs,"
    IEEE Trans. on Computer-Aided Design of
    Integrated Circuits and Systems, Vol. 18, No. 10,
    October 1999, pp. 1496-1508.
  • M. Munch et al., "Automating RT-Level
    Operand Isolation to Minimize Power Consumption
    in Datapaths," Proceedings of Design, Automation
    and Test in Europe (DATE), Mar. 2000, pp.
    624-631.

7
Limitations of Previous Work
  • Insert gating logic before timing optimization
  • Poor accuracy for power estimation at RTL
  • Poor accuracy for timing analysis at RTL
  • Problems for timing closure
  • Timing constraints may be violated
  • Inserted logic may shift critical path
  • Simply undoing transformations is not enough
  • Possible loops between RTL and timing
    optimization to achieve timing closure
  • Results in long run times and poor QoR
  • Power may be increased

8
Proposed Approach
  • Objective
      • A robust solution that meets the challenges of
        both timing closure and power requirements for
        nanometer designs.
  • Two-step approach
      • RTL exploration
          • Behavioral-level observability analysis to
            derive complex enable functions from the CDFG.
          • Mark the netlist with candidates for sleep-mode
            transformations, together with their enable
            functions, but do not commit them.
      • Gate-level committing
          • Perform regular logic and timing optimization
            as if no sleep-mode logic had been inserted.
          • Commit the candidates after timing optimization.
          • Accurate power and delay trade-offs become
            possible.
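The mark-then-commit idea can be sketched as a simple filter over candidates. This is an illustration under stated assumptions, not the tool's actual algorithm: the `Candidate` fields, the units, the example numbers, and the acceptance rule (positive net power saving and non-negative slack after adding the gating delay) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A sleep-mode candidate marked during RTL exploration (hypothetical fields)."""
    name: str
    power_saving_mw: float   # estimated saving inside the isolated block
    gating_power_mw: float   # power consumed by the gating logic itself
    extra_delay_ns: float    # delay added by the gating logic
    slack_ns: float          # timing slack of the worst path through the block


def should_commit(c: Candidate) -> bool:
    """Commit only if power really goes down and timing is still met."""
    net_saving = c.power_saving_mw - c.gating_power_mw
    slack_after = c.slack_ns - c.extra_delay_ns
    return net_saving > 0 and slack_after >= 0


def commit_candidates(candidates):
    """Return the names of the candidates that would be committed."""
    return [c.name for c in candidates if should_commit(c)]


# Made-up numbers, evaluated after regular timing optimization has finished.
marked = [
    Candidate("mult0", power_saving_mw=1.2, gating_power_mw=0.1,
              extra_delay_ns=0.05, slack_ns=0.3),   # saves power, meets timing
    Candidate("add1", power_saving_mw=0.05, gating_power_mw=0.08,
              extra_delay_ns=0.05, slack_ns=0.5),   # gating would increase power
    Candidate("mult2", power_saving_mw=2.0, gating_power_mw=0.1,
              extra_delay_ns=0.2, slack_ns=0.1),    # gating would violate timing
]
committed = commit_candidates(marked)
print(committed)
```

Because the decision is made on the optimized gate-level netlist, the power and delay figures fed into such a filter can come from gate-level analysis rather than RTL estimates.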

9
Control Data Flow Graph (CDFG)
module m(a, b, en, clk, y);
  input en, clk;
  input [7:0] a, b;
  output [8:0] y;
  reg [8:0] y;
  always @(posedge clk)
    if (en) y <= a + b;
endmodule
10
Behavioral Level Observability
  • TOC: the Token Observable Condition of an edge is
    the condition under which the token on that edge
    can be observed at one or more output nodes of a
    CDFG.
  • NOC: the Node Observable Condition of a node is the
    condition under which the token on any input edge
    of the node can be observed at one or more output
    nodes of a CDFG.
  • TO: the Token Observability of an edge is the
    probability that the TOC of this edge is 1.
  • NO: the Node Observability of a node is the
    probability that the NOC of the node is 1.
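As a worked illustration of the last two definitions, the sketch below computes an observability as the probability that an observable condition evaluates to 1. The condition `en1 AND en2`, the signal probabilities, and the assumption that the signals are statistically independent are all hypothetical and chosen only for the example.

```python
from itertools import product


def probability(condition, signal_probs):
    """Return P(condition == 1), assuming the signals are independent.

    condition    -- function mapping {signal name: 0/1} to a truth value
    signal_probs -- {signal name: probability that the signal is 1}
    """
    names = sorted(signal_probs)
    total = 0.0
    for bits in product([0, 1], repeat=len(names)):
        env = dict(zip(names, bits))
        if condition(env):
            weight = 1.0
            for name, bit in env.items():
                p = signal_probs[name]
                weight *= p if bit else 1.0 - p
            total += weight
    return total


def toc(env):
    """Hypothetical TOC of an edge: the token is observable only when both enables are 1."""
    return env["en1"] and env["en2"]


to = probability(toc, {"en1": 0.5, "en2": 0.2})
print(to)
```

A low observability suggests that the logic driving the edge is idle most of the time, which makes it a more promising sleep-mode candidate.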

11
Computation of TOC/NOC
Output/register nodes:
$$\mathrm{NOC}(n_{out}) = 1, \qquad \mathrm{TOC}(i) = 1$$
12
Computation of TOC/NOC (cont.)
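Operation nodes:
$$\mathrm{NOC}(n_{op}) = \mathrm{TOC}(o_0) \lor \mathrm{TOC}(o_1) \lor \cdots \lor \mathrm{TOC}(o_{j-1})$$
$$\mathrm{TOC}(i_p) = \mathrm{NOC}(n_{op}), \quad \forall p \in \{0, 1, \ldots, k-1\}$$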
13
Computation of TOC/NOC (cont.)
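Merge nodes:
$$\mathrm{NOC}(n_{merge}) = \mathrm{TOC}(o)$$
$$\mathrm{TOC}(c) = \mathrm{NOC}(n_{merge})$$
$$\mathrm{TOC}(i_p) = c_p \land \mathrm{TOC}(o)$$
where $c_p$, $\forall p \in [0, k-1]$, is a Boolean encoding of the
variables for the value of the token at the control port that
selects input port $p$.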
14
Computation of TOC/NOC (cont.)
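Branch nodes:
$$\mathrm{NOC}(n_{branch}) = \mathrm{TOC}(o_0) \lor \mathrm{TOC}(o_1) \lor \cdots \lor \mathrm{TOC}(o_{j-1})$$
$$\mathrm{TOC}(c) = \mathrm{NOC}(n_{branch})$$
$$\mathrm{TOC}(i) = \big(c_0 \land \mathrm{TOC}(o_0)\big) \lor \big(c_1 \land \mathrm{TOC}(o_1)\big) \lor \cdots \lor \big(c_{j-1} \land \mathrm{TOC}(o_{j-1})\big)$$
where $c_p$, $\forall p \in [0, j-1]$, is a Boolean encoding of the
variables for the value of the token at the control port that
selects output port $p$.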
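The propagation rules for output, operation, merge, and branch nodes can be tried out on a tiny example. The sketch below is an assumption-laden illustration rather than the authors' implementation: conditions are modeled as Python functions of a signal assignment, the connectives lost in the transcript are taken to be OR across output edges and AND with the select condition, and the example graph (an adder feeding input 1 of a two-input merge node controlled by a hypothetical signal `sel`, which in turn feeds a register) is invented.

```python
def TRUE(env):
    """The constant condition 1."""
    return True


def OR(*conds):
    return lambda env: any(c(env) for c in conds)


def AND(*conds):
    return lambda env: all(c(env) for c in conds)


def output_node():
    """Output/register node: NOC = 1 and TOC(i) = 1."""
    return TRUE, TRUE


def operation_node(toc_outputs, num_inputs):
    """NOC is the OR of the output-edge TOCs; every input edge inherits the NOC."""
    noc = OR(*toc_outputs)
    return noc, [noc] * num_inputs


def merge_node(toc_o, select_conds):
    """NOC = TOC(o); TOC(c) = NOC; TOC(i_p) = c_p AND TOC(o)."""
    noc = toc_o
    return noc, noc, [AND(c, toc_o) for c in select_conds]


def branch_node(toc_outputs, select_conds):
    """NOC = OR of TOC(o_p); TOC(c) = NOC; TOC(i) = OR of (c_p AND TOC(o_p))."""
    noc = OR(*toc_outputs)
    toc_i = OR(*[AND(c, t) for c, t in zip(select_conds, toc_outputs)])
    return noc, noc, toc_i


# Hypothetical select conditions of a two-way merge (multiplexer-like) node.
def sel0(env):
    return env["sel"] == 0


def sel1(env):
    return env["sel"] == 1


# Walk backward from the register to the adder.
_, toc_reg_in = output_node()
_, toc_ctrl, toc_mux_in = merge_node(toc_reg_in, [sel0, sel1])
noc_add, toc_add_in = operation_node([toc_mux_in[1]], num_inputs=2)

print(noc_add({"sel": 1}), noc_add({"sel": 0}))
```

In this example the adder's NOC reduces to `sel == 1`, so `sel` is a candidate enable function for putting the adder into sleep mode.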
15
Fast Computation of TOC/NOC
16
RTL Exploration
[Figure: example circuit with inputs a and b, output out, and enable signals en1 and en2]
17
Gate Level Committing
[Figure: example circuit with inputs a and b, output out, and enable signals en1 and en2]
18
Partial Committing
[Figure: partial committing; only part of the enable function f (a function of a, b, and c) is committed, giving f_new = a]
19
Synthesis Flows
20
Experiment Setup
  • Implemented in Cadence PKS/LPS 5.0
  • A commercial low-power synthesis tool for both
    logical and physical power optimization.
  • Uses the incremental power analysis engine
    inside PKS/LPS to evaluate the power impact
    during the gate-level committing stage.
  • Uses the incremental timing analysis engine
    inside PKS to evaluate the timing impact
    during the gate-level committing stage.
  • The extra delay and power introduced by the
    gating logic are accurately accounted for.

21
Experiment Setup (cont.)
  • Six industrial blocks were chosen for
    experimentation
  • All except design 6 have a customer-provided
    simulation testbench, used to obtain the
    switching information for power estimation

22
Experimental Results
23
Experimental Results (cont.)
  • The proposed approach can achieve a wide range of
    power-delay trade-offs
  • Design 6 was chosen to run the flow several times
    with different timing constraints

24
Conclusion
  • Robust solution for applying sleep-mode
    transformation
  • Two-step approach toward achieving RTL
    transformation with gate-level accuracy
  • Accurate, full range of power-delay trade-offs
  • No impact on normal timing optimization
  • Fully automated
  • Ideal solution to meet the challenges of both
    timing closure and power management for modern
    nanometer designs.