Title: Clock Period Minimization with Minimum Delay Insertion
- Shih-Hsu Huang, Chun-Hua Cheng, Chia-Ming Chang, and Yow-Tyng Nieh
- Department of Electronic Engineering, Chung Yuan Christian University, Taiwan
2Outline
- Introduction
- Preliminaries
- Motivation
- Our Approach
- Linear Program
- Solution Space Reduction
- Experimental Result
- Conclusion
4Clock Period Minimization
- Clock period minimization is one of the most important objectives in the design of edge-triggered circuits.
- Although optimal clock skew scheduling can enhance circuit performance, it does not necessarily achieve the lower bound of the clock period.
- The hold constraints often limit the smallest feasible clock period that optimal clock skew scheduling can achieve.
- Hold violations can be resolved by delay insertion. Therefore, combining clock skew scheduling with delay insertion can reduce the clock period further.
5Why Minimize the Required Inserted Delay?
- In addition to minimizing the clock period, minimizing the required inserted delay is also an important objective of this problem, for two reasons:
  - Minimizing the required inserted delay makes design closure easier to achieve.
  - Minimizing the required inserted delay makes it easier to find a feasible solution from the cell library.
6Previous Work (1/2)
- Minimum Padding
  - Minimizes the required inserted delay under a given clock skew schedule.
  - Does NOT determine the clock skew schedule.
- DIANA Algorithm
  - Finds a good clock skew schedule for the minimum padding method.
  - Guarantees achieving the lower bound of the clock period.
  - Only heuristically reduces the required inserted delay.
7Previous Work (2/2)
- RCA Algorithm
  - Obtains the same result as the DIANA algorithm.
  - Has a lower time complexity than the DIANA algorithm.
- Data Path Level Delay Insertion in Clock Skew Scheduling
  - Does NOT achieve the lower bound of the clock period.
  - Does NOT attempt to minimize the required inserted delay.
8Our Contributions
- Obtains the Optimal Solution
  - Our paper is the first work that guarantees solving this problem optimally.
  - Achieves the lower bound of the clock period.
  - Achieves the lower bound of the required inserted delay (for working with the lower bound of the clock period).
- Shows This Problem Is Polynomial
  - We formulate this problem as a linear program. Note that a linear program can be solved in polynomial time.
  - Thus, our paper is the first to prove that this problem can be solved in polynomial time.
10Zero Clock Skew Circuit
Longest timing path: r → s → t → v → w → y → z → b → c → d
If the clock skew is zero, the clock period is $P = 15$.
11Constraint Graph (1/2)
- By properly scheduling the clock arrival times of the registers, the clock period can be made smaller than the delay of the longest timing path.
- Several graph-based algorithms use the constraint graph to solve optimal clock skew scheduling.
- A circuit works with the clock period P, provided that its constraint graph has no negative cycle when the clock period is P.
- If the clock period is the smallest feasible clock period, the constraint graph contains at least one critical cycle (i.e., a cycle whose weights sum to zero).
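The negative-cycle test can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes a common formulation in which an edge (u, v, w) encodes T_v - T_u <= w, each data path contributes one H-edge (weight: minimum delay) and one S-edge (weight: P minus maximum delay), and setup and hold times are zero. The two-register circuit and its delays are hypothetical, and the edge directions may differ from the figure on the next slide.

```python
def constraint_graph(paths, period):
    """Build difference-constraint edges (u, v, w), meaning T_v - T_u <= w."""
    edges = []
    for src, dst, d_min, d_max in paths:
        edges.append((src, dst, d_min))           # H-edge: T_dst - T_src <= d_min
        edges.append((dst, src, period - d_max))  # S-edge: T_src - T_dst <= P - d_max
    return edges


def has_negative_cycle(nodes, edges):
    """Bellman-Ford with every node starting at distance 0 (virtual source)."""
    dist = {n: 0.0 for n in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-9:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # distances converged, so no negative cycle
    return True  # still relaxing after |V| passes, so a negative cycle exists


# Hypothetical circuit: (source, destination, minimum delay, maximum delay).
NODES = ["A", "B"]
PATHS = [("A", "B", 2, 10), ("B", "A", 1, 4)]

for period in (8, 7):
    feasible = not has_negative_cycle(NODES, constraint_graph(PATHS, period))
    print(period, "feasible" if feasible else "infeasible")
```

In this toy circuit the setup constraints alone would allow P = 7, but the hold constraint of path A → B creates a negative cycle at P = 7, so the smallest feasible period without padding is 8.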
12Constraint Graph (2/2)
[Figure: the circuit graph of the example and its constraint graph]
13Optimal Clock Skew Scheduling
[Figure: constraint graph with the critical cycle highlighted]
$P = 13$
14Lower Bound Pseq (1/2)
- Since sequential timing optimization adjusts the timing slacks among data paths, the lower bound of sequential timing optimization, $P_{seq}$, is determined only by the setup constraints.
- Because of the hold constraints, optimal clock skew scheduling often does not achieve this lower bound.
15Lower Bound Pseq (2/2)
- The cycle consisting of S-edge $e_s(R_1, R_3)$ and S-edge $e_s(R_3, R_1)$ determines the lower bound of sequential timing optimization.
- Since $(15 + 5)/2 = 10$, we have $P_{seq} = 10$.
16Lower Bound Ppad (1/2)
- Hold violations may be resolved by delay insertion, which is referred to as the padding method.
- Note that the padding problem is not always solvable. The largest delay difference (maximum delay minus minimum delay) among all the timing paths gives a lower bound $P_{pad}$ of the clock period for a feasible padding solution.
17Lower Bound Ppad (2/2)
- Due to timing path r → s → t → v → w → y → z → b → c → d, we have $P_{pad} = (5 + 3 + 6 + 1) - (1 + 2 + 1 + 1) = 10$.
18Lower Bound PLB
- Let the clock period $P_{LB} = \max(P_{seq}, P_{pad})$. The clock period $P_{LB}$ gives a lower bound of the clock period for the combination of clock skew scheduling and delay insertion.
- In this example, we have $P_{LB} = \max(P_{seq}, P_{pad}) = \max(10, 10) = 10$.
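The arithmetic behind the three bounds can be replayed with a short Python sketch. It only covers the single cycle and the single timing path quoted on the slides. In general, the setup bound is the largest such mean over all cycles of S-edges, and the padding bound is the largest max-minus-min difference over all timing paths. The helper names are illustrative, and the pairing of minimum and maximum gate delays follows the order in which the slides list them, which is an assumption.

```python
def setup_cycle_bound(max_delays):
    """Mean of the maximum delays around one cycle of S-edges."""
    return sum(max_delays) / len(max_delays)


def padding_bound(gate_delays):
    """Maximum delay minus minimum delay along one timing path.

    gate_delays is a list of (d_min, d_max) pairs, one per gate on the path.
    """
    total_max = sum(d_max for _, d_max in gate_delays)
    total_min = sum(d_min for d_min, _ in gate_delays)
    return total_max - total_min


# Gates on r -> s -> t -> v -> w -> y -> z -> b -> c -> d, as (d_min, d_max).
LONGEST_PATH = [(1, 5), (2, 3), (1, 6), (1, 1)]

p_seq = setup_cycle_bound([15, 5])   # cycle R3 -> R1 -> R3
p_pad = padding_bound(LONGEST_PATH)  # (5+3+6+1) - (1+2+1+1)
p_lb = max(p_seq, p_pad)
print(p_seq, p_pad, p_lb)
```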
20Drawback of Data Path Level Delay Insertion (1/2)
- The main drawback of data path level delay insertion in clock skew scheduling is that it does NOT achieve the lower bound of the clock period.
- The reason is that, in that approach, the maximum delay and the minimum delay of a data path must be increased at the same time.
- Another drawback of data path level delay insertion in clock skew scheduling is that it does NOT attempt to minimize the required inserted delay.
- As a result, that approach often suffers from a large amount of delay insertion.
21Drawback of Data Path Level Delay Insertion (2/2)
- $T_{C3} - T_{C1} \le P - (15 + I_{3,1})$
- $T_{C1} - T_{C3} \le 3 + i_{3,1}$
- $i_{3,1} \le I_{3,1}$
- Adding the first two constraints gives $P \ge 12 + I_{3,1} - i_{3,1}$. Since the maximum delay and the minimum delay of data path R3 → R1 must be increased at the same time, the clock period P obtained by data path level delay insertion in clock skew scheduling is 12.
- However, the lower bound of the clock period is $P_{LB} = 10$.
22Limitation of Minimum Padding (1/4)
- The limitation of the minimum padding method is that it does not determine the clock skew schedule.
- Intuitively, we can derive a clock skew schedule by considering only the setup constraints. Then, all the hold violations can be resolved by applying the minimum padding method.
- However, many clock skew schedules satisfy all the setup constraints. Since different clock skew schedules have different hold violations, they require different amounts of delay insertion.
23Limitation of Minimum Padding (2/4)
Under the clock skew schedule $T_{host} = 0$, $T_{C1} = -2$, $T_{C2} = -11$, $T_{C3} = -7$, the required inserted delay is 16.
24Limitation of Minimum Padding (3/4)
Under the clock skew schedule $T_{host} = 0$, $T_{C1} = 4$, $T_{C2} = 5$, $T_{C3} = -1$, the required inserted delay is 9.
25Limitation of Minimum Padding (4/4)
Under the clock skew schedule $T_{host} = 0$, $T_{C1} = 1$, $T_{C2} = -5$, $T_{C3} = -4$, the required inserted delay is 5.
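As a partial sanity check, the Python sketch below evaluates the setup constraints between R1 and R3 under the three schedules at P = 10. It only covers these two data paths, because the slides give no maximum delays for the others. The maximum delay of 15 for R3 → R1 is stated on the slides, while the value 5 for R1 → R3 is inferred from the computation (15 + 5)/2 = 10 and is an assumption. The dictionary keys are the required inserted delays reported for each schedule.

```python
PERIOD = 10  # lower bound of the clock period in the example

# (source register, destination register, maximum delay)
SETUP_PATHS = [("R3", "R1", 15), ("R1", "R3", 5)]

# Clock arrival times per schedule, keyed by the reported inserted delay.
SCHEDULES = {
    16: {"host": 0, "R1": -2, "R2": -11, "R3": -7},
    9: {"host": 0, "R1": 4, "R2": 5, "R3": -1},
    5: {"host": 0, "R1": 1, "R2": -5, "R3": -4},
}


def setup_slack(schedule, src, dst, d_max, period):
    """Slack of the setup constraint T_src + d_max <= T_dst + period."""
    return schedule[dst] + period - (schedule[src] + d_max)


slacks = {
    padding: [setup_slack(s, src, dst, d, PERIOD) for src, dst, d in SETUP_PATHS]
    for padding, s in SCHEDULES.items()
}
for padding, values in slacks.items():
    print(f"inserted delay {padding}: setup slacks {values}")
```

All three schedules meet both setup constraints with zero slack, so on this cycle they differ only in how much padding the hold constraints then require.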
26Drawback of RCA algorithm (1/2)
- The RCA algorithm attempts to find one irredundant relaxed constraint graph, in which the hold constraints are not over-relaxed.
- However, the irredundant relaxed constraint graph may not be unique.
- Different irredundant relaxed constraint graphs lead to different clock skew schedules, which require different amounts of delay insertion.
- Therefore, the RCA algorithm only heuristically minimizes the required inserted delay.
27Drawback of RCA algorithm (2/2)
- One irredundant relaxed constraint graph: $T_{host} = 0$, $T_{C1} = 4$, $T_{C2} = 5$, $T_{C3} = -1$.
- Another irredundant relaxed constraint graph: $T_{host} = 0$, $T_{C1} = 1$, $T_{C2} = -5$, $T_{C3} = -4$.
28Our Motivation
- Although the combination of the RCA algorithm and the minimum padding method can minimize the clock period, it only heuristically reduces the required inserted delay.
- The main reason is that the RCA algorithm determines the clock skew schedule without knowledge of the minimum padding method.
- To guarantee achieving the lower bound of the required inserted delay, the minimum padding method should be integrated directly into the clock skew scheduling stage.
- This observation motivates us to study the simultaneous application of clock skew scheduling and minimum padding.
31Variables in Our Linear Program
- Variable $T_{Ci}$ denotes the clock arrival time of register $R_i$.
- For each pin $u$:
  - Variable $EA_u$ denotes the earliest data arrival time.
  - Variable $LA_u$ denotes the latest data arrival time.
- For each wire from pin $u$ to pin $v$, variable $X_{u,v}$ denotes the delay insertion.
32Objective Function
- Our objective is to minimize the required inserted delay for working with the lower bound of the clock period $P_{LB}$.
- Minimize $X_{i1,a} + X_{c,d} + X_{e,f} + X_{g,h} + X_{l,j} + X_{k,o1} + X_{e,n} + X_{l,m} + X_{p,q} + X_{r,s} + X_{t,o2} + X_{r,u} + X_{t,v} + X_{w,y} + X_{i2,x} + X_{z,b}$
33Formula 1
- The amount of delay insertion on each wire should be greater than or equal to 0.
- Thus, we have the following constraints:
- $0 \le X_{i1,a}$, $0 \le X_{i2,x}$, $0 \le X_{c,d}$, $0 \le X_{e,f}$, $0 \le X_{g,h}$, $0 \le X_{l,j}$, and so on.
34Formula 2
- For each output pin of a logic gate, its earliest data arrival time should not be greater than the earliest data arrival time of an input pin plus the minimum delay of the corresponding timing arc.
- Thus, we have the following constraints:
- $EA_c \le EA_a + 1$, $EA_c \le EA_b + 1$, $EA_p \le EA_m + 1$, and so on.
35Formula 3
- For each output pin of a logic gate, its latest data arrival time should not be less than the latest data arrival time of an input pin plus the maximum delay of the corresponding timing arc.
- Thus, we have the following constraints:
- $LA_a + 3 \le LA_c$, $LA_b + 1 \le LA_c$, $LA_m + 4 \le LA_p$, and so on.
36Formula 4
- For each wire, the earliest data arrival time of the ending pin should not be greater than the earliest data arrival time of the starting pin plus the amount of delay insertion.
- Thus, we have the following constraints:
- $EA_a \le EA_{i1} + X_{i1,a}$, $EA_x \le EA_{i2} + X_{i2,x}$, $EA_d \le EA_c + X_{c,d}$, and so on.
37Formula 5
- For each wire, the latest data arrival time of the ending pin should not be less than the latest data arrival time of the starting pin plus the amount of delay insertion.
- Thus, we have the following constraints:
- $LA_{i1} + X_{i1,a} \le LA_a$, $LA_{i2} + X_{i2,x} \le LA_x$, $LA_c + X_{c,d} \le LA_d$, and so on.
38Formula 6
- For the starting pin of each data path, its earliest data arrival time is the clock arrival time of the launching register.
- Thus, we have the following constraints:
- $EA_{i1} = T_{host}$, $EA_{i2} = T_{host}$,
- $EA_e = T_{C1}$, $EA_l = T_{C2}$, $EA_r = T_{C3}$.
39Formula 7
- For the starting pin of each data path, its latest data arrival time is the clock arrival time of the launching register.
- Thus, we have the following constraints:
- $LA_{i1} = T_{host}$, $LA_{i2} = T_{host}$,
- $LA_e = T_{C1}$, $LA_l = T_{C2}$, $LA_r = T_{C3}$.
40Formula 8
- Due to the hold constraint, for the ending pin of each data path, its earliest data arrival time should be greater than or equal to the clock arrival time.
- Thus, we have the following constraints:
- $T_{C1} \le EA_d$, $T_{C2} \le EA_h$, $T_{C3} \le EA_q$,
- $T_{host} \le EA_{o1}$, $T_{host} \le EA_{o2}$.
41Formula 9
- Due to the setup constraint, for the ending pin of each data path, its latest data arrival time should be less than or equal to the arrival time of the next clock edge.
- Thus, we have the following constraints:
- $LA_d \le T_{C1} + P_{LB}$, $LA_h \le T_{C2} + P_{LB}$, $LA_q \le T_{C3} + P_{LB}$,
- $LA_{o1} \le T_{host} + P_{LB}$, $LA_{o2} \le T_{host} + P_{LB}$.
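To see how Formulas 2 through 9 interact, the Python sketch below checks a candidate padding and clock schedule rather than solving the linear program. It propagates the tightest EA and LA values that Formulas 2 to 5 allow, then tests Formulas 8 and 9 at the ending pins. The single-path circuit, its delays, the clock arrival times, and the function names are all hypothetical, and setup and hold times are taken as zero.

```python
INF = float("inf")


def arrival_times(start_times, arcs, padding):
    """Propagate earliest (EA) and latest (LA) data arrival times.

    arcs lists (u, v, d_min, d_max) in topological order; a wire is an arc
    with d_min == d_max == 0. padding maps a wire (u, v) to its inserted
    delay X[u, v], which shifts both EA and LA by the same amount.
    """
    ea = dict(start_times)  # Formula 6: EA at a starting pin is the clock time
    la = dict(start_times)  # Formula 7: LA at a starting pin is the clock time
    for u, v, d_min, d_max in arcs:
        x = padding.get((u, v), 0)
        ea[v] = min(ea.get(v, INF), ea[u] + d_min + x)   # Formulas 2 and 4
        la[v] = max(la.get(v, -INF), la[u] + d_max + x)  # Formulas 3 and 5
    return ea, la


def violations(ea, la, end_times, period):
    """Return the hold (Formula 8) and setup (Formula 9) violations."""
    found = []
    for pin, t_clk in end_times.items():
        if t_clk > ea[pin]:
            found.append(("hold", pin))
        if la[pin] > t_clk + period:
            found.append(("setup", pin))
    return found


# Hypothetical path: start pin s, wire s-a, gate a-b (delay 1 to 3), wire b-t.
ARCS = [("s", "a", 0, 0), ("a", "b", 1, 3), ("b", "t", 0, 0)]
START = {"s": 0}  # clock arrival time of the launching register
END = {"t": 2}    # clock arrival time of the capturing register
PERIOD = 4

unpadded = violations(*arrival_times(START, ARCS, {}), END, PERIOD)
padded = violations(*arrival_times(START, ARCS, {("b", "t"): 1}), END, PERIOD)
print(unpadded)  # hold violation at pin t
print(padded)    # no violations after one unit of padding on wire b-t
```

Checking only the tightest values is sufficient here because the linear program bounds every EA variable from above and every LA variable from below.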
42Our Results
- After solving the linear program, we find that $X_{t,o2} = 3$, $X_{r,u} = 2$, $T_{host} = 0$, $T_{C1} = 1$, $T_{C2} = -5$, and $T_{C3} = -4$.
- The delay insertions of the other wires are 0.
- The required inserted delay is only 5.
- The padded circuit works with the lower bound of the clock period.
44Find Zero-Delay-Insertion Wires
- The delay insertion of many wires is 0. We call these wires zero-delay-insertion wires.
- If we can find zero-delay-insertion wires in advance, the solution space of our linear program can be greatly reduced.
- Our Strategies
  - Find zero-delay-insertion wires from the viewpoint of hold constraints.
  - Find zero-delay-insertion wires from the viewpoint of setup constraints.
45From Hold Constraints (1/4)
- Only the hold-critical data paths may require delay insertion.
- A data path is hold-critical if its hold constraint limits the circuit performance.
- We can use an iterative process to find the hold-critical data paths.
- The wires that are not in any hold-critical data path are zero-delay-insertion wires.
46From Hold Constraints (2/4)
- R2 → R3, R3 → host, and host → R1 are hold-critical.
$P = 12$
47From Hold Constraints (3/4)
$P = 10$
48From Hold Constraints (4/4)
- $X_{l,m}$, $X_{p,q}$, $X_{r,s}$, $X_{t,o2}$, $X_{t,v}$, $X_{w,y}$, $X_{z,b}$, $X_{i1,a}$, $X_{i2,x}$, $X_{c,d}$, and $X_{r,u}$ are in hold-critical data paths. They may require delay insertion.
- $X_{e,f}$, $X_{g,h}$, $X_{e,n}$, $X_{l,j}$, and $X_{k,o1}$ are not in any hold-critical data path. Thus, we have $X_{e,f} = X_{g,h} = X_{e,n} = X_{l,j} = X_{k,o1} = 0$.
49From Setup Constraints (1/2)
- If the lower bound of the clock period is the lower bound of sequential timing optimization (i.e., $P_{LB} = P_{seq}$):
  - The data paths that determine the lower bound of sequential timing optimization are setup-critical.
  - For each setup-critical data path, we cannot insert any delay into its longest timing paths.
  - Therefore, the wires in the longest timing paths of setup-critical data paths are zero-delay-insertion wires.
50From Setup Constraints (2/2)
- R1 → R3 and R3 → R1 are setup-critical.
- For R1 → R3, e → n → p → q is the longest timing path.
- For R3 → R1, r → s → t → v → w → y → z → b → c → d is the longest timing path.
- Thus, we have $X_{e,n} = X_{p,q} = X_{r,s} = X_{t,v} = X_{w,y} = X_{z,b} = X_{c,d} = 0$.
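The wires on a longest timing path can be read off its pin sequence, as the Python sketch below illustrates. It assumes that every path alternates between wires and gate arcs and starts with a wire, which matches the wire list on this slide. The function name is illustrative.

```python
def wires_on_path(pins):
    """Return the wires of a timing path given as an ordered list of pins.

    Assumes the path alternates wire, gate, wire, ... and starts with a wire,
    so pins 0-1, 2-3, 4-5, ... are connected by wires.
    """
    return set(zip(pins[0::2], pins[1::2]))


LONGEST_R1_R3 = ["e", "n", "p", "q"]
LONGEST_R3_R1 = ["r", "s", "t", "v", "w", "y", "z", "b", "c", "d"]

setup_zero_wires = wires_on_path(LONGEST_R1_R3) | wires_on_path(LONGEST_R3_R1)
print(sorted(setup_zero_wires))
```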
51Reduced Linear Program
- Zero-delay-insertion wires
  - From the viewpoint of hold constraints, we can let $X_{e,f} = X_{g,h} = X_{e,n} = X_{l,j} = X_{k,o1} = 0$ in advance.
  - From the viewpoint of setup constraints, we can let $X_{e,n} = X_{p,q} = X_{r,s} = X_{t,v} = X_{w,y} = X_{z,b} = X_{c,d} = 0$ in advance.
- Although many variables are pruned, solving the reduced linear program still yields the same optimal solution.
  - $X_{t,o2} = 3$, $X_{r,u} = 2$, $T_{host} = 0$, $T_{C1} = 1$, $T_{C2} = -5$, and $T_{C3} = -4$.
  - The required inserted delay is only 5.
  - The padded circuit works with the lower bound of the clock period.
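The effect of the pruning can be tallied with plain set operations. The Python sketch below uses the wire names from the slides and counts how many delay-insertion variables remain free; it does not solve the linear program. The variable names are illustrative.

```python
ALL_WIRES = {
    ("i1", "a"), ("c", "d"), ("e", "f"), ("g", "h"), ("l", "j"), ("k", "o1"),
    ("e", "n"), ("l", "m"), ("p", "q"), ("r", "s"), ("t", "o2"), ("r", "u"),
    ("t", "v"), ("w", "y"), ("i2", "x"), ("z", "b"),
}
ZERO_FROM_HOLD = {("e", "f"), ("g", "h"), ("e", "n"), ("l", "j"), ("k", "o1")}
ZERO_FROM_SETUP = {
    ("e", "n"), ("p", "q"), ("r", "s"), ("t", "v"),
    ("w", "y"), ("z", "b"), ("c", "d"),
}

zero_wires = ZERO_FROM_HOLD | ZERO_FROM_SETUP
free_wires = ALL_WIRES - zero_wires

# Nonzero delay insertions reported on the slides.
REPORTED = {("t", "o2"): 3, ("r", "u"): 2}

print(len(ALL_WIRES), len(zero_wires), sorted(free_wires))
print(sum(REPORTED.values()))
```

Of the 16 delay-insertion variables, 11 are fixed to zero in advance and 5 remain free, and both wires that receive padding in the reported solution are among those 5.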
53Benchmark Data
- Our platform is the Windows XP operating system running on an AMD K8-3000 processor.
- We use Extended-LINGO Release 8.0 as the linear program solver.
- We implement the solution space reduction process in the C programming language.
- We compare our approach with RCA, DIANA, and data path level delay insertion in clock skew scheduling.
- The circuits in the ISCAS89 benchmark suite are targeted to the UMC 0.18 µm cell library to test the effectiveness of our approach.
- The comparisons cover the clock period, the required inserted delay, and the CPU time.
[Slides 54 to 57: Clock Period results for S3384, S6669, S15850, and S35932 (figures not reproduced)]
[Slides 58 to 61: Required Inserted Delay results for S3384, S6669, S15850, and S35932 (figures not reproduced)]
[Slides 62 to 65: CPU Time results for S3384, S6669, S15850, and S35932 (figures not reproduced)]
67Concluding Remarks
- This paper investigates the simultaneous application of clock skew scheduling and minimum padding. We formulate this problem as a linear program.
- Our paper is the first work that guarantees solving this problem optimally. Our approach achieves not only the lower bound of the clock period but also the lower bound of the required inserted delay.
- Benchmark data show that the CPU time of our approach is comparable to that of state-of-the-art heuristic methodologies.
73CPU Time