Low Power Hardware Synthesis from Concurrent Action Oriented Specifications (CAOS)

Slides: 42
Provided by: csgCsa
Learn more at: http://csg.csail.mit.edu

1
Low Power Hardware Synthesis from Concurrent
Action Oriented Specifications (CAOS)
  • Sandeep K. Shukla
  • Gaurav Singh
  • FERMAT Lab, Virginia Tech.

2
Outline
  • CAOS Scheduling Problem
  • Complexity Analysis
  • Peak Power Problem
  • Complexity Analysis
  • Technique: Rescheduling (suppressing actions)
  • Dynamic Power Problem
  • Complexity Analysis
  • Techniques: Rescheduling, Operand Isolation,
    Clock Gating, Gated Guards.

3
  • CAOS Scheduling Problem
  • ( Complexity Analysis )

4
SCHEDULING PROBLEMS WITHOUT A PEAK POWER
CONSTRAINT
  • Maximum Non-conflicting Subset of actions (MNS)
  • Choosing actions which can execute in a clock
    cycle.
  • Minimum Length Schedule Construction (MLS)
  • Distributing actions over multiple clock cycles.

5
MAXIMUM NON-CONFLICTING SUBSET OF ACTIONS (MNS)
  • Instance - A set A = {a1, a2, ..., an} of enabled
    actions; a collection C of pairs of actions,
    where {ai, aj} ∈ C means that actions ai and aj
    conflict; an integer K ≤ n.
  • Question - Is there a subset A' ⊆ A such that
    |A'| ≥ K and no pair of actions in A' conflict?
  • MNS problem is NP-Complete.
  • Corresponds to Maximum Independent Set (MIS)
    Problem.

6
MAXIMUM NON-CONFLICTING SUBSET OF ACTIONS (MNS)
  • NOTE - For any ρ ≥ 1, a ρ-approximation
    algorithm for a combinatorial optimization
    problem is a heuristic that produces a solution
    which is within a factor ρ of the optimal
    solution value.
  • It is known that for any ε > 0, there is no
    O(n^(1-ε))-approximation algorithm for the MIS
    problem, unless P = NP.
  • Same holds for MNS Problem.

7
MAXIMUM NON-CONFLICTING SUBSET OF ACTIONS (MNS)
  • SOLUTION - Heuristics with good performance
    guarantees can be devised by exploiting the
    relationship between MNS and MIS problems.
  • SPECIAL CASES
  • Each action conflicts with at most Δ other
    actions for some constant Δ - An approximation
    algorithm exists that provides a performance
    guarantee of Δ + 1.
  • Planar graphs, near-planar graphs and unit disk
    graphs - Efficient approximation algorithms are
    known for such classes of graphs.
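One heuristic of this kind can be sketched as follows. This is only an illustration, assuming the conflict relation is treated as a graph and the well-known minimum-degree greedy rule for independent sets is applied; the function name greedy_mns and the input encoding are hypothetical and not taken from the slides.

```python
def greedy_mns(actions, conflicts):
    """Greedy heuristic for a non-conflicting subset (an independent set)."""
    # Build the conflict graph: neighbors[a] = actions that conflict with a.
    neighbors = {a: set() for a in actions}
    for a, b in conflicts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    remaining = set(actions)
    chosen = []
    while remaining:
        # Pick the remaining action with the fewest remaining conflicts
        # (ties broken by name so the result is deterministic).
        best = min(remaining, key=lambda a: (len(neighbors[a] & remaining), a))
        chosen.append(best)
        # The chosen action and everything that conflicts with it leave the pool.
        remaining -= neighbors[best] | {best}
    return chosen


# Hypothetical example: five actions whose conflicts form a chain.
actions = ["a1", "a2", "a3", "a4", "a5"]
conflicts = [("a1", "a2"), ("a2", "a3"), ("a3", "a4"), ("a4", "a5")]
selected = greedy_mns(actions, conflicts)
print(selected)
```

Each pick removes at most Δ + 1 actions from the pool, so when every action conflicts with at most Δ others the sketch selects at least n / (Δ + 1) actions.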

8
MINIMUM LENGTH SCHEDULE CONSTRUCTION (MLS)
  • Instance - A set A = {a1, a2, ..., an} of actions;
    a collection C of pairs of actions, where
    {ai, aj} ∈ C means that actions ai and aj
    conflict; an integer t ≤ n.
  • Question - Is there a partition of A into r
    subsets A1, A2, ..., Ar for some r ≤ t such that
    for each i, 1 ≤ i ≤ r, the actions in Ai are
    pair-wise non-conflicting?
  • MLS problem is NP-Complete.
  • Corresponds to Minimum K-coloring (MINCOLOR)
    Problem.

9
MINIMUM LENGTH SCHEDULE CONSTRUCTION (MLS)
  • It is known that for any ε > 0, there is no
    O(n^(1-ε))-approximation algorithm for the
    MINCOLOR problem, unless P = NP.
  • Same holds for MLS Problem.

10
MINIMUM LENGTH SCHEDULE CONSTRUCTION (MLS)
  • SOLUTION - Heuristics for graph coloring can be
    used in constructing schedules of near-minimum
    length.
  • SPECIAL CASES
  • Upper bound on the length of the schedule is
    two - Corresponds to the problem of determining
    whether a graph is 2-colorable. Efficient
    algorithms are known.
  • Each action conflicts with at most Δ other
    actions - For such instances, a schedule of
    length at most Δ + 1 can be constructed in
    polynomial time.
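The bounded-conflict case can be illustrated with greedy graph coloring, where each color is read as a clock cycle. The sketch below is an assumption about how such a schedule could be built, not the authors' implementation; greedy_schedule and the example data are hypothetical.

```python
def greedy_schedule(actions, conflicts):
    """Assign each action to the earliest slot holding none of its conflicts."""
    neighbors = {a: set() for a in actions}
    for a, b in conflicts:
        neighbors[a].add(b)
        neighbors[b].add(a)

    slot_of = {}
    for a in actions:
        # Slots already taken by actions that conflict with a.
        used = {slot_of[b] for b in neighbors[a] if b in slot_of}
        slot = 0
        while slot in used:
            slot += 1
        slot_of[a] = slot

    length = max(slot_of.values(), default=-1) + 1
    schedule = [[] for _ in range(length)]
    for a in actions:
        schedule[slot_of[a]].append(a)
    return schedule


# Hypothetical example: a3 conflicts with three other actions (max degree 3).
actions = ["a1", "a2", "a3", "a4"]
conflicts = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("a3", "a4")]
schedule = greedy_schedule(actions, conflicts)
print(schedule)
```

An action with at most Δ conflicts can find at most Δ slots blocked, so it always fits into one of the first Δ + 1 slots, which gives the schedule-length bound stated above.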

11
  • PEAK POWER PROBLEM
  • ( Complexity Analysis )

12
SCHEDULING PROBLEMS INVOLVING A POWER CONSTRAINT
  • Single Clock Cycle
  • Maximum Number of Actions in a Time Slot Subject
    to Peak Power Constraint (MNA-PP).
  • Maximizing Utility Subject to Peak Power
    Constraint (MU-PP).

13
Maximum Number of Actions in a Time Slot Subject
to Peak Power Constraint (MNA-PP).
  • Instance
  • a set A = {a1, a2, ..., an} of non-conflicting
    actions,
  • for each action ai, the power pi needed to
    execute that action,
  • a positive number P representing the peak power
    constraint.
  • Requirement - Find a subset A' ⊆ A such that -
  • the total power needed to execute the actions in
    A' is at most P, and
  • |A'| is a maximum over all subsets of A that
    satisfy the peak power constraint.
  • Optimal Solution -
  • Sort the actions in A into non-decreasing order
    by the amount of power.
  • Keep adding actions in order as long as the peak
    power constraint is satisfied.
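The sort-and-add procedure can be written in a few lines. This is a minimal sketch; the function name max_actions_under_peak, the dictionary encoding and the integer power values are assumptions made for the example.

```python
def max_actions_under_peak(powers, peak):
    """Select as many actions as possible with total power at most `peak`."""
    chosen, total = [], 0
    # Visit actions from cheapest to most expensive (ties broken by name).
    for name, p in sorted(powers.items(), key=lambda kv: (kv[1], kv[0])):
        if total + p > peak:
            break  # every later action needs at least as much power
        chosen.append(name)
        total += p
    return chosen, total


# Hypothetical power values for four non-conflicting actions.
powers = {"a1": 5, "a2": 2, "a3": 7, "a4": 3}
chosen, total = max_actions_under_peak(powers, peak=10)
print(chosen, total)
```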

14
Maximizing Utility Subject to Peak Power
Constraint (MU-PP)
  • Instance
  • a set A = {a1, a2, ..., an} of non-conflicting
    actions,
  • for each action ai, its consumed power pi and its
    utility ui,
  • a positive number P representing the peak power,
  • a positive number G representing the required
    utility.
  • Question - Is there a subset A' ⊆ A such that
    the total power needed to execute all the actions
    in A' is at most P and the utility of A' is at
    least G?
  • MU-PP problem is NP-Complete.
  • Corresponds to KNAPSACK Problem.

15
Maximizing Utility Subject to Peak Power
Constraint (MU-PP)
  • Any approximation algorithm for the KNAPSACK
    problem can be used as an approximation algorithm
    with the same performance guarantee for the
    optimization version of MU-PP.
  • When the weights and profits are integers, there
    is a polynomial time approximation scheme (PTAS)
    for the KNAPSACK problem.
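For small integer power values, the optimization version of MU-PP can also be solved exactly with the textbook 0/1 KNAPSACK dynamic program, which runs in pseudo-polynomial time O(n P). The sketch below swaps in that exact method for illustration, not one of the approximation schemes mentioned above; max_utility and the sample numbers are hypothetical, and powers are assumed to be positive integers.

```python
def max_utility(powers, utilities, peak):
    """Maximum total utility of a subset whose total power is at most `peak`."""
    # best[p] = best utility achievable with total power at most p.
    best = [0] * (peak + 1)
    for p_i, u_i in zip(powers, utilities):
        # Iterate downwards so that each action is used at most once.
        for p in range(peak, p_i - 1, -1):
            best[p] = max(best[p], best[p - p_i] + u_i)
    return best[peak]


# Hypothetical instance: three actions, peak power P = 7, required utility G = 9.
best = max_utility(powers=[3, 4, 5], utilities=[4, 5, 6], peak=7)
print(best, best >= 9)
```

The decision question of MU-PP is then answered by comparing the returned value with G.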

16
SCHEDULING PROBLEMS INVOLVING A POWER CONSTRAINT
  • Multiple Clock Cycles
  • Minimizing Makespan Subject to Peak Power
    Constraint (MM-PP).
  • Minimizing Peak Power Subject to Makespan
    Constraint (MPP-M).
  • Minimizing Makespan and Peak Power, Decision
    Version (MPP-DECISION).

17
Minimizing Makespan Subject to Peak Power
Constraint (MM-PP)
  • Instance
  • a set A = {a1, a2, ..., an} of non-conflicting
    actions,
  • for each action ai, the power pi needed to
    execute that action,
  • a positive number P representing the peak power.
  • Requirement
  • Find a schedule of minimum length for the
    actions in A such that the total power needed to
    execute the actions in each time slot is at most
    P.

18
Minimizing Peak Power Subject to a Makespan
Constraint (MPP-M)
  • Instance
  • a set A = {a1, a2, ..., an} of non-conflicting
    actions,
  • for each action ai, the power pi needed to
    execute that action,
  • a positive number L representing the makespan
    (number of slots used by a schedule).
  • Requirement
  • Find a schedule of length at most L for the
    actions in A such that the maximum total power
    used in any time slot is a minimum over all
    schedules of length at most L.
  • NOTE - MPP-M is the dual of MM-PP.

19
Minimizing Makespan and Peak Power
(MPP-DECISION): Decision Version of MM-PP and
MPP-M.
  • Instance
  • a set A = {a1, a2, ..., an} of non-conflicting
    actions,
  • for each action ai, the power pi needed to
    execute that action,
  • a positive number P representing the peak power,
  • a positive number L representing the makespan.
  • Question
  • Is there a schedule of length at most L for the
    actions in A such that the total power used in
    any time slot is at most P?
  • MPP-DECISION problem is Strongly NP-Complete.
  • Corresponds to 3-PARTITION problem.
  • No pseudo-polynomial time algorithm exists for
    the MPP-DECISION problem, unless P = NP.

20
Approximation Algorithms for MM-PP
  • Efficient approximation algorithms are possible
    by reducing the problem to the well-known BIN
    PACKING problem.
  • Example - A simple algorithm called First Fit
    Decreasing (FFD) provides a performance guarantee
    of 11/9.
  • Sort the items in non-increasing order of their
    sizes and then assign each item to the first bin
    in which it will fit.
  • A sophisticated implementation reduces the
    running time to O(n log n).
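Read in scheduling terms, each bin is a time slot with capacity P and each item is an action of size pi. The sketch below is the straightforward quadratic-time version of FFD, not the O(n log n) implementation mentioned above; first_fit_decreasing and the sample powers are hypothetical.

```python
def first_fit_decreasing(powers, peak):
    """Pack actions into time slots so that no slot exceeds `peak` power."""
    slots = []  # each slot: {"used": total power so far, "actions": [...]}
    # Largest power first (ties broken by name for a deterministic result).
    for name, p in sorted(powers.items(), key=lambda kv: (-kv[1], kv[0])):
        if p > peak:
            raise ValueError(f"{name} alone exceeds the peak power constraint")
        for slot in slots:
            if slot["used"] + p <= peak:  # first slot in which it fits
                slot["used"] += p
                slot["actions"].append(name)
                break
        else:
            # No existing slot has room: open a new clock cycle.
            slots.append({"used": p, "actions": [name]})
    return [slot["actions"] for slot in slots]


# Hypothetical power values with peak power P = 10.
powers = {"a1": 6, "a2": 5, "a3": 4, "a4": 3, "a5": 2}
schedule = first_fit_decreasing(powers, peak=10)
print(schedule)
```

The total power here is 20, so two slots of capacity 10 is the best possible and the sketch reaches it on this input.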

21
Approximation Algorithms for MPP-M
  • Efficient approximation algorithms are possible
    by reducing the problem to the classical
    multiprocessor scheduling problem.
  • Example - 4/3-approximation algorithm:
  • Sort the actions in non-increasing order of their
    power requirements.
  • Assign each action to the time slot for which the
    total power used is the smallest at that time.
  • Can be implemented to run in O(n log n) time.
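The two steps above can be sketched with a min-heap that always yields the currently least-loaded time slot, which is what gives the O(n log n) running time. The function name lpt_schedule and the sample data are assumptions made for this illustration.

```python
import heapq


def lpt_schedule(powers, num_slots):
    """Spread actions over `num_slots` slots, keeping the peak power low."""
    # Heap entries are (power already assigned to the slot, slot index).
    heap = [(0, slot) for slot in range(num_slots)]
    heapq.heapify(heap)
    slots = [[] for _ in range(num_slots)]
    # Largest power first (ties broken by name for a deterministic result).
    for name, p in sorted(powers.items(), key=lambda kv: (-kv[1], kv[0])):
        load, slot = heapq.heappop(heap)  # slot with the smallest total so far
        slots[slot].append(name)
        heapq.heappush(heap, (load + p, slot))
    peak = max(load for load, _ in heap)
    return slots, peak


# Hypothetical power values, makespan L = 2.
powers = {"a1": 7, "a2": 6, "a3": 5, "a4": 4, "a5": 3}
slots, peak = lpt_schedule(powers, num_slots=2)
print(slots, peak)
```

On this input the sketch reaches a peak of 14, while the best possible split ({a1, a2} and {a3, a4, a5}) has a peak of 13, which is within the 4/3 guarantee.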

22
LOW PEAK POWER TECHNIQUE
  • Re-scheduling: Suppress some actions in each
    cycle to reduce the peak power of the design.
  • Possible Ways
  • Conflict-based
  • Add extra conflicts for the sake of peak power.
  • Memory-based
  • Use memory to select how many actions to execute
    in each cycle.

23
MEMORY-BASED LOW PEAK POWER TECHNIQUE
  • ALGORITHM -
  • Arrange actions based on their TRS ordering.
  • Find possible combinations of non-conflicting
    actions which can violate the peak power
    constraint when executed concurrently.
  • For each violating combination -
  • find a satisfying combination by suppressing some
    actions.
  • give priority to actions which come earlier in
    TRS-ordering.
  • store the satisfying combinations in a memory.
  • In hardware, memory is used to execute
    appropriate actions in each clock cycle in order
    to satisfy the peak power constraint.
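The table-building part of the algorithm can be sketched as below. The slides do not specify how a satisfying combination is chosen beyond giving priority to earlier actions, so the greedy rule used here (keep an action whenever it still fits under the peak) is an assumption, as are the name build_suppression_table and the sample values.

```python
from itertools import combinations


def build_suppression_table(order, powers, conflicts, peak):
    """Map each violating combination to the actions allowed to execute."""
    conflict_set = {frozenset(c) for c in conflicts}
    table = {}
    for size in range(2, len(order) + 1):
        # combinations() keeps the actions of each combo in TRS order.
        for combo in combinations(order, size):
            if any(frozenset(pair) in conflict_set
                   for pair in combinations(combo, 2)):
                continue  # conflicting actions never execute together
            if sum(powers[a] for a in combo) <= peak:
                continue  # already satisfies the peak power constraint
            kept, total = [], 0
            for a in combo:  # earlier actions get priority
                if total + powers[a] <= peak:
                    kept.append(a)
                    total += powers[a]
            table[combo] = tuple(kept)
    return table


# Hypothetical design: three rules in TRS order, peak power 10.
order = ["r1", "r2", "r3"]
powers = {"r1": 6, "r2": 5, "r3": 3}
table = build_suppression_table(order, powers, conflicts=[], peak=10)
print(table)
```

The table enumerates subsets of actions, so its size can grow exponentially with the number of actions.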

24
MEMORY-BASED LOW PEAK POWER TECHNIQUE
  • Implemented in Bluespec Compiler
  • Around 10% peak-power savings achieved for small
    designs like a Vending Machine.
  • Larger power savings may be possible for larger
    designs.
  • Experiments ongoing.

25
MEMORY-BASED LOW PEAK POWER TECHNIQUE
  • LIMITATIONS -
  • Some designs written under the assumption that
    the maximum number of actions will execute in
    each clock cycle might not be able to use this
    technique.
  • Increases latency, so it is applicable mostly to
    latency-insensitive designs.
  • Designs with a large number of actions may result
    in a big memory.

26
  • DYNAMIC POWER PROBLEM
  • ( Complexity Analysis )

27
DYNAMIC POWER PROBLEM (DPP)
  • Instance
  • - a set A = {a1, a2, ..., an} of actions.
  • - a positive integer P representing the dynamic
    power consumed.
  • Requirement -
  • Select the ordering of execution of the actions
    in A such that P is minimized.
  • DPP is NP-Complete.
  • Corresponds to the Traveling Salesman Problem,
    which is a sub-problem of DPP.

28
LOW DYNAMIC POWER TECHNIQUES
  • Re-scheduling.
  • Operand Isolation.
  • Clock Gating.
  • Gated Guards.

29
RE-SCHEDULING
  • Actions can be re-scheduled such that switching
    at the inputs of the functional units is
    minimized.
  • Resource sharing - Conflicts can be created such
    that same functional units can be shared among
    actions consisting of same operations on same
    operands.

30
OPERAND ISOLATION
  • Operand Isolation
  • Computation corresponding to the body of an
    action is allowed only when its output is used in
    the present clock cycle.
  • Involves -
  • Insertion of gates at the appropriate points
    without affecting guards.
  • Selection of activation signal.
  • Guards of actions used as gating signals.
  • The algorithm implemented in the Bluespec
    Compiler saved up to 25% dynamic power.

31
OPERAND ISOLATION: SINGLE ACTION
  • Computations stay quiescent except when the
    action executes, i.e. its guard is True.
  • Example - action foo with guard cond (x < y);
    its body updates x from x and z.
  • [Figure: datapath of action foo. Labels: x, y, z,
    cond logic, body logic (F2), enable signals, EN,
    D, Q, current state, next state, next-state
    values.]

32
OPERAND ISOLATION: MULTIPLE ACTIONS
  • [Figure: block diagram. Labels: Rule1 ... RuleN,
    Cond1 ... CondN, Action1 ... ActionN, F2 ... FN,
    Scheduler, Rule Control, DataSelect, State.]
  • Isolating multiple actions of a design.

33
REGISTER CLOCK GATING
  • Register Clock-gating -
  • Registers having a common ENABLE signal can be
    provided the same gated clock.
  • CAOS - Registers being updated by same set of
    actions can be passed the same gated clock.
  • The algorithm implemented in the Bluespec
    Compiler saved up to 45% dynamic power.
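The grouping idea (registers updated by the same set of actions share one gated clock) can be sketched as a simple partitioning step. This is an illustrative sketch only, not the algorithm implemented in the Bluespec Compiler; group_registers_by_writers and the example write sets are hypothetical.

```python
from collections import defaultdict


def group_registers_by_writers(writes):
    """Group registers that are updated by exactly the same set of actions.

    `writes` maps an action name to the registers that action updates.
    """
    writers = defaultdict(set)
    for action, regs in writes.items():
        for reg in regs:
            writers[reg].add(action)

    groups = defaultdict(list)
    for reg in sorted(writers):
        # Registers with an identical writer set can share one gated clock.
        groups[frozenset(writers[reg])].append(reg)
    return dict(groups)


# Hypothetical design: a1 and a2 both update r0 and r1; only a2 updates r2.
writes = {"a1": ["r0", "r1"], "a2": ["r0", "r1", "r2"]}
groups = group_registers_by_writers(writes)
for writer_set, regs in groups.items():
    print(sorted(writer_set), "->", regs)
```

Each group would then need one clock-gating cell, enabled when any action in its writer set fires.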

34
REGISTER CLOCK GATING
  • [Figure: a register with DIN, EN, CLK and QOUT,
    and its clock-gated version in which EN and CLK
    produce GATED_CLK.]
  • In CAOS, guards of the actions provide the
    control for gating the clocks of the registers.

35
GATED GUARDS
  • In hardware, only the required guards should be
    computed in each clock cycle, to save power.
  • Static analysis can be done to figure out which
    guards should be computed.

36
Gated Guards
  • Rule 1: (x > y) && (y != 0) --> (x = y, y = x)
  • Rule 2: (x <= y) && (y != 0) --> (y = y - x)
  • Rule 3: (y == 0) --> (result = x)
  • Let P = (x > y), Q = (y == 0)
  • Then g1 = P && !Q
  • g2 = !P && !Q
  • g3 = Q
  • ------------------------------------------
  • g1 && g2 = false
  • g1 && g3 = false
  • g2 && g3 = false
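The pairwise exclusivity of the three guards can be checked by brute force over a small range of values. The sketch below reads the guards as g1 = P and not Q, g2 = not P and not Q, g3 = Q with P = (x > y) and Q = (y == 0); the helper name mutually_exclusive and the value range are assumptions made for the example.

```python
from itertools import product


def g1(x, y):
    return x > y and y != 0        # P && !Q


def g2(x, y):
    return (not x > y) and y != 0  # !P && !Q


def g3(x, y):
    return y == 0                  # Q


def mutually_exclusive(guards, values):
    """True if at most one guard holds for every (x, y) drawn from `values`."""
    return all(
        sum(bool(g(x, y)) for g in guards) <= 1
        for x, y in product(values, repeat=2)
    )


print(mutually_exclusive([g1, g2, g3], range(16)))
```

A check over a finite range is only a sanity test; the symbolic argument on the slide (each pair contains P with !P, or Q with !Q) covers all values.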

37
Gated Guards
  • What else can we infer?
  • (x > y), (y != 0), (x = y), (y = x)
  • ------------------------------------------------------
  • ((x < y) && (y != 0)) OR (y == 0)
  • So after Rule 1 executes, we know for sure that
    G1 cannot be true, but G2 or G3 may be true;
    hence G1 need not be evaluated. Also prioritize
    G3.

38
Gated Guard
  • Gcd (70, 42)
  • x = 70, y = 42 --> Rule 1
  • x = 42, y = 70 --> Rule 2
  • x = 42, y = 28 --> Rule 1
  • x = 28, y = 42 --> Rule 2
  • x = 28, y = 14 --> Rule 1
  • x = 14, y = 28 --> Rule 2
  • x = 14, y = 14 --> Rule 2
  • x = 14, y = 0 --> Rule 3
  • result = 14
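The trace above can be reproduced by simulating the three rules directly. In this sketch the guard of Rule 2 is read as x <= y, which is what g2 = !P && !Q implies and what the step from (14, 14) requires; the function name run_gcd and the input check are additions for the example.

```python
def run_gcd(x, y):
    """Fire one enabled rule per cycle; return the result and the trace."""
    if x <= 0 or y < 0:
        # With x == 0 and y != 0, Rule 2 would fire forever without progress.
        raise ValueError("expected x > 0 and y >= 0")
    trace = []
    while True:
        if x > y and y != 0:       # Rule 1: swap x and y
            trace.append((x, y, "Rule 1"))
            x, y = y, x
        elif x <= y and y != 0:    # Rule 2: subtract x from y
            trace.append((x, y, "Rule 2"))
            y = y - x
        else:                      # Rule 3: y == 0, so x holds the result
            trace.append((x, y, "Rule 3"))
            return x, trace


result, trace = run_gcd(70, 42)
for x, y, rule in trace:
    print(f"x = {x}, y = {y} --> {rule}")
print("result =", result)
```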

39
Gated Guard
  • Use a F/F that gets value 1, when Rule 1 is
    fired, and becomes 0, when other rules are fired.
  • If this F/F holds a value 1, evaluate only G3 and
    then G2.
  • Unless Rule 1 is fired, this F/F stays at 0, and
    hence can be clock gated most of the time.
  • This example may not be very useful, as the
    guards are simple to evaluate, but guard calculus
    on complex guards can lead to savings.

40
GATED GUARDS
  • Theorem proving techniques can be used for
    deductions.
  • Such analysis can be done for more complicated
    designs.
  • A memory in hardware can be used to store the
    information about which guards need not be
    computed in the present clock cycle.

41
Thank You !!
  • ?