Title: Buffer and FF Insertion
1Buffer and FF Insertion
- Slides from Charles J. Alpert
- IBM Corp.
2Talk Outline
- Introduction
- Buffer insertion
- Van Ginneken dynamic programming
- Extensions
- Interconnect planning
3Simple Buffer Insertion Problem
Given: source and sink locations, sink
capacitances and RATs (required arrival times), a buffer type, source
delay rules, and unit wire resistance and capacitance.
(Figure labels: source s0, one buffer, sinks with RAT1, RAT2, RAT3, RAT4)
4Simple Buffer Insertion Problem
Find: buffer locations and a routing tree such
that slack at the source is maximized.
5Slack Example
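Slack at the source is the minimum of (RAT - delay) over all sinks.
Case 1: sink RAT = 500, delay = 400; sink RAT = 400, delay = 600
slack = min(500 - 400, 400 - 600) = -200
Case 2: sink RAT = 500, delay = 350; sink RAT = 400, delay = 300
slack = min(500 - 350, 400 - 300) = 100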
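The slack numbers in this example can be checked with a few lines of Python. This is only a sketch: the function name `source_slack` and the (RAT, delay) tuple layout are assumptions made for illustration, not part of the slides.

```python
def source_slack(sinks):
    """Slack at the source: the minimum of (RAT - delay) over all sinks.

    `sinks` is a list of (rat, delay) pairs; this layout is illustrative.
    """
    return min(rat - delay for rat, delay in sinks)


case_1 = [(500, 400), (400, 600)]  # second sink misses its RAT by 200
case_2 = [(500, 350), (400, 300)]  # both sinks meet their RATs

print(source_slack(case_1))  # -200
print(source_slack(case_2))  # 100
```

The worst sink sets the slack, which is why improving the delay to the critical sink matters more than improving the average delay.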
6Elmore Delay
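The formula on this slide is not reproduced in the transcript. As a reminder, the sketch below uses the usual Elmore expressions for a uniform wire segment (wire capacitance lumped half at each end) and a linear gate delay. The function names, parameter names, and numeric values are assumptions chosen for illustration.

```python
def wire_delay(length, r_unit, c_unit, c_load):
    """Elmore delay of a uniform wire driving a lumped load c_load."""
    r = r_unit * length
    c = c_unit * length
    return r * (c / 2.0 + c_load)


def gate_delay(intrinsic, r_drive, c_load):
    """Linear gate delay: intrinsic delay plus drive resistance times load."""
    return intrinsic + r_drive * c_load


# Wire delay grows quadratically with length, which is why splitting a long
# wire with a buffer can pay for the buffer's own delay.
whole = wire_delay(4.0, 1.0, 1.0, 0.0)       # 8.0
halves = 2 * wire_delay(2.0, 1.0, 1.0, 0.0)  # 4.0, ignoring the buffer itself
print(whole, halves)
```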
7Common Approaches
- Iteratively insert buffers
- Closed-form solutions (2-pin nets)
- Dynamic programming
- Simultaneous constructions
8Van Ginneken's Classic Algorithm
- Optimal for multi-sink nets
- Quadratic runtime
- Bottom-up from sinks to source
- Generate list of candidates at each node
- At source, pick the best candidate in list
9Key Assumptions
- Given routing tree
- Given potential insertion points
10Generating Candidates
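A minimal sketch of how candidates might be generated, assuming each candidate is a (capacitance, slack) pair and delays follow the Elmore model. The helper names `add_wire` and `add_buffer` and their parameters are hypothetical. A wire updates every candidate, while a buffer contributes a single new candidate, because every buffered option presents the same input capacitance and only the one with the best slack is worth keeping.

```python
def add_wire(cands, r, c):
    """Move (cap, slack) candidates up a wire of resistance r, capacitance c."""
    return [(cap + c, q - r * (c / 2.0 + cap)) for cap, q in cands]


def add_buffer(cands, c_in, intrinsic, r_drive):
    """Return the one new candidate created by inserting a buffer here."""
    q_best = max(q - intrinsic - r_drive * cap for cap, q in cands)
    return (c_in, q_best)


cands = add_wire([(20.0, 400.0)], r=2.0, c=10.0)
cands.append(add_buffer(cands, c_in=5.0, intrinsic=10.0, r_drive=1.0))
print(cands)  # [(30.0, 350.0), (5.0, 310.0)]
```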
11Pruning Candidates
12Candidate Example Continued
13Candidate Example Continued
After pruning
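Pruning can be sketched as a dominance filter: a candidate is inferior when another candidate has no more capacitance and no less slack. The function name `prune` and the sort-then-sweep approach are illustrative choices, not necessarily how the original implementation does it.

```python
def prune(cands):
    """Keep only non-dominated (cap, slack) pairs, sorted by increasing cap."""
    kept = []
    best_q = float("-inf")
    # Sort by cap ascending; for equal cap, try the larger slack first.
    for cap, q in sorted(cands, key=lambda cq: (cq[0], -cq[1])):
        if q > best_q:  # strictly better slack than anything with less cap
            kept.append((cap, q))
            best_q = q
    return kept


print(prune([(45, 50), (5, 0), (20, 100), (5, 70)]))  # [(5, 70), (20, 100)]
```

After pruning, both capacitance and slack increase along the list, which is what makes the later merge step linear.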
14Merging Branches
15Pruning Merged Branches
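A sketch of merging two branches, assuming both candidate lists are already pruned and sorted by increasing capacitance (and therefore increasing slack). Capacitances add and the merged slack is the smaller of the two. Walking both lists in step and advancing the side that limits the slack yields at most len(left) + len(right) - 1 candidates, so no separate pruning pass is needed. The name `merge_branches` is hypothetical.

```python
def merge_branches(left, right):
    """Merge two pruned, cap-sorted candidate lists at a branch point."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        (c1, q1), (c2, q2) = left[i], right[j]
        merged.append((c1 + c2, min(q1, q2)))
        # Advance the branch that limits the slack; the other cannot help yet.
        if q1 < q2:
            i += 1
        elif q2 < q1:
            j += 1
        else:
            i += 1
            j += 1
    return merged


print(merge_branches([(5, 70), (20, 100)], [(10, 80), (30, 120)]))
# [(15, 70), (30, 80), (50, 100)]
```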
16Van Ginneken Example
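Candidates are (capacitance, slack) pairs.
Sink: (20,400)
Wire C=10, d=150: (20,400) -> (30,250)
Buffer C=5, d=30: (30,250) -> (5,220)
Wire C=15: (30,250) -> (45,50) with d=200; (5,220) -> (20,100) with d=120
Buffer C=5: (45,50) -> (5,0) with d=50; (20,100) -> (5,70) with d=30
Candidates so far: (45,50) (5,0) (20,100) (5,70)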
17Van Ginneken Example Contd
Candidates before pruning: (45,50) (5,0) (20,100) (5,70)
(5,0) is inferior to (5,70). (45,50) is inferior
to (20,100).
After pruning: (20,100) (5,70)
Wire C=10: (20,100) -> (30,10); (5,70) -> (15,-10)
Pick solution with largest slack, follow arrows
back toward the sink to get solution
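The example can be replayed in code. This sketch takes the wire and buffer delays directly from the slide instead of computing them from an Elmore model. The delays of the last wire (90 and 80) are not printed on the slide; they are inferred here from the resulting candidates (30,10) and (15,-10). All function names are hypothetical.

```python
def wire(cand, c, d):
    """Add a wire with capacitance c and delay d above the candidate."""
    cap, q = cand
    return (cap + c, q - d)


def buffer(cand, c_in, d):
    """Insert a buffer with input capacitance c_in and delay d."""
    _, q = cand
    return (c_in, q - d)


def prune(cands):
    """Drop candidates with no less cap and no more slack than another."""
    return sorted(
        a for a in set(cands)
        if not any(b != a and b[0] <= a[0] and b[1] >= a[1] for b in cands)
    )


sink = (20, 400)
a = wire(sink, 10, 150)      # (30, 250)
b = buffer(a, 5, 30)         # (5, 220)
c = wire(a, 15, 200)         # (45, 50)
d = buffer(c, 5, 50)         # (5, 0)
e = wire(b, 15, 120)         # (20, 100)
f = buffer(e, 5, 30)         # (5, 70)
kept = prune([c, d, e, f])   # [(5, 70), (20, 100)]

# Last wire, C=10; delays inferred from the slide's results.
final = [wire((20, 100), 10, 90), wire((5, 70), 10, 80)]
best = max(final, key=lambda cq: cq[1])
print(kept, final, best)
```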
18Van Ginneken Recap
- Generate candidates from sinks to source
- Quadratic runtime
- Adding a buffer adds only one new candidate
- Merging branches additive, not multiplicative
- Optimal for Elmore delay model
19Optimal Extensions
- Multiple buffer types
- Inverters
- Polarity constraints
- Controlling buffer resources
- Capacitance constraints
- Blockage recognition
- Wire sizing
20Multiple Buffer Types
21Inverters
22Polarity Constraints
- Some sinks are positive, some negative
- Put negative sinks into the "-" list
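One way to picture the two lists, as a sketch under assumptions: each node keeps a "+" list and a "-" list of (cap, slack) candidates, and inserting an inverter moves the best candidate of one list into the other. The function `add_inverter`, its parameters, and the linear delay model are hypothetical. At the source only the "+" list would be eligible.

```python
def add_inverter(plus, minus, c_in, intrinsic, r_drive):
    """Return new (plus, minus) lists after optionally inserting an inverter.

    An inverter driving a "-" candidate yields a "+" candidate, and vice versa.
    """
    def best_q(cands):
        return max(q - intrinsic - r_drive * cap for cap, q in cands)

    new_plus = list(plus)
    new_minus = list(minus)
    if minus:
        new_plus.append((c_in, best_q(minus)))
    if plus:
        new_minus.append((c_in, best_q(plus)))
    return new_plus, new_minus


plus, minus = add_inverter([(20, 100)], [(10, 80)], c_in=5, intrinsic=10, r_drive=1)
print(plus)   # [(20, 100), (5, 60)]
print(minus)  # [(10, 80), (5, 70)]
```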
23Controlling Buffering Resources
Before: maintain a list of (capacitance, slack) pairs
(C1, q1), (C2, q2), (C3, q3), (C4, q4), (C5, q5),
(C6, q6), (C7, q7), (C8, q8), (C9, q9)
Now: store an array of lists, indexed by # of
buffers
3: (C1, q1, 3), (C2, q2, 3), (C3, q3, 3)
2: (C4, q4, 2), (C5, q5, 2)
1: (C6, q6, 1), (C7, q7, 1), (C8, q8, 1)
0: (C9, q9, 0)
Prune candidates with inferior cap, slack, and # of
buffers
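A sketch of the three-way pruning rule, assuming candidates are (cap, slack, buffers) triples. A candidate is dropped only if some other candidate is at least as good in all three fields. The quadratic filter below favors clarity over speed, and the names `prune3` and `by_buffer_count` are hypothetical.

```python
def prune3(cands):
    """Remove candidates dominated in cap, slack, and number of buffers."""
    def dominates(a, b):
        return a != b and a[0] <= b[0] and a[1] >= b[1] and a[2] <= b[2]

    unique = list(dict.fromkeys(cands))  # drop exact duplicates, keep order
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


def by_buffer_count(cands, max_buffers):
    """Arrange candidates as an array of lists indexed by buffer count."""
    lists = [[] for _ in range(max_buffers + 1)]
    for cap, q, k in cands:
        if k <= max_buffers:
            lists[k].append((cap, q, k))
    return lists


cands = [(5, 70, 2), (20, 100, 1), (25, 90, 1), (30, 60, 0), (5, 60, 3)]
kept = prune3(cands)
print(kept)                      # [(5, 70, 2), (20, 100, 1), (30, 60, 0)]
print(by_buffer_count(kept, 3))  # [[(30, 60, 0)], [(20, 100, 1)], [(5, 70, 2)], []]
```

Keeping the buffer count in the candidate is what exposes the trade-off between slack and buffering resources at the source.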
24Buffering Resource Trade-off
25Capacitance Constraints
- Each gate g drives at most C(g) capacitance
- When inserting buffer g, check downstream capacitance
- If bigger than C(g), throw out candidate
(Figure label: total cap = 500 fF)
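The capacitance check fits naturally into the buffer-insertion step. In this sketch, `c_max` plays the role of C(g); the function name, the linear delay model, and the numbers are assumptions for illustration.

```python
def add_buffer_with_limit(cands, c_in, intrinsic, r_drive, c_max):
    """Insert a buffer, ignoring candidates whose load exceeds c_max.

    Returns the new (cap, slack) candidate, or None if no load is legal.
    """
    feasible = [(cap, q) for cap, q in cands if cap <= c_max]
    if not feasible:
        return None
    q_best = max(q - intrinsic - r_drive * cap for cap, q in feasible)
    return (c_in, q_best)


cands = [(600, 300), (400, 250)]
print(add_buffer_with_limit(cands, c_in=5, intrinsic=10, r_drive=0.5, c_max=500))
# (5, 40.0): the 600-cap candidate is thrown out despite its better slack
```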
26Blockage Recognition
- Delete insertion points that run over blockages
27Other Extensions
- Simultaneous driver sizing
- Modeling effective capacitance
- Higher-order interconnect delay
- Slew constraints
- Noise constraints
28Driver Sizing
29Driver Sizing
- Driver behaves like buffer
- Pick driver with the best slack
- Implications upstream in timing graph
- Delay penalty for large input capacitance
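Driver sizing can be sketched as one more buffer-like step at the source. Each driver option is tried against the final candidate list, and a penalty proportional to the driver's input capacitance stands in for the upstream delay impact. The penalty model, the tuple layout of `drivers`, and all names and values are hypothetical.

```python
def pick_driver(cands, drivers, penalty_per_cap=0.0):
    """Return (name, slack) of the driver option with the best slack.

    drivers: list of (name, c_in, intrinsic, r_drive) tuples.
    """
    best = None
    for name, c_in, intrinsic, r_drive in drivers:
        q = max(q - intrinsic - r_drive * cap for cap, q in cands)
        q -= penalty_per_cap * c_in  # larger drivers load the previous stage
        if best is None or q > best[1]:
            best = (name, q)
    return best


cands = [(30, 10), (15, -10)]
drivers = [("small", 2, 5, 1.0), ("large", 8, 5, 0.25)]
print(pick_driver(cands, drivers))                     # ('large', -2.5)
print(pick_driver(cands, drivers, penalty_per_cap=4))  # ('small', -33.0)
```

With no penalty the stronger driver wins; once its input capacitance is charged to the upstream stage, the smaller driver can become the better choice.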
30π-Models
- Van Ginneken candidate (Cap, slack)
- Replace Cap with π-model (Cn, R, Cf)
- Total capacitance preserved: Cn + Cf = C
- R represents degree of resistive shielding
31Computing Gate Delay
- When inserting buffer, compute effective
capacitance from π-model
- Use effective instead of lumped capacitance in
gate delay equation
- Optimality no longer guaranteed
32Higher-order Interconnect Delay
- Moment matching with first 3 moments
- Previously: candidate (π-model, slack)
- Now: candidate (π-model, m1, m2, m3)
- Given moments, compute slack on the fly
- Bottom-up, efficient moment computation
- Problem: slew rate must be guessed
33Slew Constraints
- When inserting buffer, compute slews to gates
driven by buffer
- If slew exceeds target, prune candidate
- Difficulty: unknown gate input slew
(Figure labels: slew = 300 ps, slew = 350 ps, unknown input slew "?")
34Noise Constraints
- Each gate has acceptable noise threshold
- Compute cumulative noise for each wire via Devgan
noise metric
- Throw out candidates that violate noise
- Not in production code
35Extensions Recap
- Multiple buffer types, including inverters
- Polarity constraints
- Controlling buffer resources
- Slew, capacitance, and noise constraints
- Blockage recognition
- Driver sizing
- Higher-order delay modeling
- Wire sizing
36Talk Outline
- Introduction
- Buffer insertion
- Van Ginneken dynamic programming
- Extensions
- Interconnect planning
37What is the Problem?
- DSM timing closure
- Squeeze buffers into tight spaces
- Alleviate hot spots, local wire congestion
- Getting worse
- Handle wire congestion, buffering resources early
- Acknowledge these constraints when floorplanning
38Which Floorplan Is Better?
- Timing analysis worthless
- Interconnect synthesis, electrical correction,
routing, extraction
- Days to find answer
39Buffer Explosion
- Number of buffers triples each generation
- 800K buffers in 0.05 micron technology
40Buffer Block Planning
- Create blocks between macros just for holding
buffers
- Adjust floorplan accordingly
- Computing size/location of blocks
- Analyze 2-pin nets
- Find feasible regions
- Assign buffers with smallest region
- Combine buffers into blocks
41Feasible Regions
(Figure: feasible region)
42Buffer Block Planning Trade-offs
- Goods
- Buffer locations flexible
- Global view: buffers most difficult nets first
- Bads
- Wire congestion around blocks
- Don't have timing information
- Some nets still cannot be buffered/routed