A Sliding Window Scheme for Accurate Clock Mesh Analysis - PowerPoint PPT Presentation

Provided by: hai95

1
A Sliding Window Scheme for Accurate Clock Mesh Analysis
  • Hongyu Chen (2), C.Y. Yeh (3), G. Wilke (4),
    S. Reddy (1), H. Nguyen (1), W. Walker (1),
    R. Murgai (1)
  • (1) Fujitsu Laboratories of America, Inc., CA, USA
  • (2) University of California, San Diego, CA, USA
  • (3) University of California, Santa Barbara, USA
  • (4) UFRGS, Brazil

2
Outline
  • Problem Statement
  • Mesh based architectures
  • Sliding Window Scheme
  • Improving the SWS Accuracy
  • Optimal Window Size Selection
  • Conclusions

3
Mesh-based Clock Architectures
  • Excellent for low skew, jitter
  • Used in modern processors
  • Difficult to analyze
  • vs. tree: better performance, but more routing
    resource usage and no existing tool support

4
Pure Mesh Architecture
  • Three components
    - n x n (uniform) mesh
      (Uniform) array of k x k buffers drives the mesh
      at grid nodes.
    - Global tree drives mesh buffers
    - Local distribution
      FFs directly connected to nearest mesh segment

5
Clock Net Analysis Problem
  • Goal: given a mesh-based clock architecture,
    compute the latency (delay) from the clock root to
    each flip-flop.
    - Needed to determine clock cycle and/or timing
      violations.
  • Skew imposes constraints on min and max logic
    path delays.
    - Long path analysis:
      $a_j - a_i \ge \text{logic}_{\max} + t_{\text{setup}} - T_{\text{cycle}}$
    - Short path analysis:
      $a_j - a_i \le \text{logic}_{\min} - t_{\text{hold}}$
  • Mesh architectures are difficult to analyze for a
    real design.
    - Huge number of circuit nodes in the model
      (needed for accuracy).
    - Large number of metal loops present in the mesh
      structure.
    - A design with a 64x64 clock mesh and 200K FFs
      exceeds HSPICE capacity.
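
The two skew constraints on this slide can be checked mechanically once the clock arrival times are known. The following is a minimal sketch, reading the long-path constraint as a_j - a_i >= logic_max + t_setup - T_cycle and the short-path constraint as a_j - a_i <= logic_min - t_hold, and assuming a_i and a_j are the clock arrival times at the launching and capturing flip-flops of a path. The function names and units are illustrative and do not come from the slides.

```python
def setup_ok(a_i, a_j, logic_max, t_setup, t_cycle):
    """Long-path check: a_j - a_i >= logic_max + t_setup - t_cycle."""
    return (a_j - a_i) >= logic_max + t_setup - t_cycle


def hold_ok(a_i, a_j, logic_min, t_hold):
    """Short-path check: a_j - a_i <= logic_min - t_hold."""
    return (a_j - a_i) <= logic_min - t_hold


def allowed_skew_range(logic_min, logic_max, t_setup, t_hold, t_cycle):
    """Return (lowest, highest) permitted value of a_j - a_i for one path."""
    return (logic_max + t_setup - t_cycle, logic_min - t_hold)


# Hypothetical numbers in picoseconds, chosen only for illustration.
low, high = allowed_skew_range(logic_min=100, logic_max=900,
                               t_setup=50, t_hold=30, t_cycle=1000)
```

Any path whose a_j - a_i falls outside [low, high] violates either the setup or the hold constraint, which is why per-flip-flop latency from the mesh analysis is needed.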

6
Previous Work on Clock Mesh Analysis
  • Break clock mesh into tree; apply smoothing
    algorithm to redistribute mesh loads [IBM patent,
    March 2001; Restle et al. 2001]
    - No accuracy results shown.
  • Interconnect reduction using AWE [Bailey et al.
    2001, DEC]
    - Moment matching technique.
    - Orthogonal to our scheme.
  • Sizing a clock mesh given latency constraints
    - [Desai et al. 1996, DEC]
      Break mesh into tree;
      use approximate model of delay.
    - [Vandenberghe et al. 1997]
      Use dominant time constant as measure of delay.
      Use semi-definite programming.
      Show results only for smaller meshes.

7
Sliding Window-based Simulation (SWS)
  • Proposed a new sliding window-based scheme for
    mesh analysis
  • Two nodes on the mesh that are far from each
    other have little electrical impact on each other
    - Insight: the RC mesh constitutes a cascaded
      low-pass filter; each driver has only a local
      effect
  • Model the mesh at two different resolutions
    - Detailed model for mesh elements close to the
      nodes being measured.
    - Simplified model for other nodes.

8
Sliding Window-based Simulation (SWS)
  • Preserve detailed circuit inside window
  • Lump capacitance and remove resistors outside
    window (except on the mesh itself)
  • Ca: total load inside the rectangle area a
[Figure labels: a, b]
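
The window construction on this slide can be sketched in a few lines. This is an illustration only: the flip-flop tuple format, the function name, and the rule of attaching each outside load to the nearest mesh grid node are assumptions made for the example, not details given in the slides.

```python
from collections import defaultdict


def build_window_model(ffs, window, pitch):
    """Split FF loads into detailed (inside window) and lumped (outside).

    ffs    : list of (x, y, cap) tuples, one per flip-flop (hypothetical format)
    window : (x_min, y_min, x_max, y_max) rectangle kept in full detail
    pitch  : spacing between adjacent mesh grid lines
    """
    x_min, y_min, x_max, y_max = window
    detailed = []                  # FFs that keep their full local wiring
    lumped = defaultdict(float)    # mesh node (col, row) -> total capacitance
    for x, y, cap in ffs:
        if x_min <= x <= x_max and y_min <= y <= y_max:
            detailed.append((x, y, cap))
        else:
            # Assumption: the load is attached to the nearest mesh grid node
            # and the local resistors are dropped.
            node = (round(x / pitch), round(y / pitch))
            lumped[node] += cap
    return detailed, dict(lumped)


# Hypothetical loads: one FF inside the window, three outside.
example_ffs = [(1.0, 1.0, 2.0), (5.2, 4.9, 1.0), (4.8, 5.1, 3.0), (9.0, 0.1, 4.0)]
detailed, lumped = build_window_model(example_ffs, window=(0, 0, 2, 2), pitch=1.0)
```

The mesh wires themselves are left untouched in this sketch, matching the slide's note that resistors on the mesh are preserved.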
9
Benefits of SWS
  • Reduces memory usage by simplifying the model to
    be simulated (for each window).
  • Example: 64x64 mesh, 100K FFs distributed
    uniformly over the chip.
    - Assuming 1-pi model for interconnect (2 nodes
      per segment).
    - Golden model needs 308K nodes (8K nodes for 4K
      mesh segments + 300K nodes for the FFs).
    - SWS with window size of 16x16 needs 29K nodes
      (8K for mesh + 21K for 7K FFs).
    - 10X reduction in model size!

[Figure: segment a modeled with Cw/2 at each end]
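
The node counts quoted on this slide can be reproduced with simple arithmetic. The sketch below assumes 3 nodes per detailed flip-flop, which is implied by the slide's 300K nodes for 100K FFs, and assumes that lumped flip-flops add no nodes of their own. The constant and function names are illustrative.

```python
NODES_PER_SEGMENT = 2   # 1-pi interconnect model, as stated on the slide
NODES_PER_FF = 3        # assumption implied by 300K nodes for 100K FFs


def model_nodes(mesh_segments, detailed_ffs):
    """Circuit nodes in a model with the given number of detailed FFs."""
    return mesh_segments * NODES_PER_SEGMENT + detailed_ffs * NODES_PER_FF


golden = model_nodes(mesh_segments=4_000, detailed_ffs=100_000)
windowed = model_nodes(mesh_segments=4_000, detailed_ffs=7_000)
reduction = golden / windowed
```

With the slide's rounded figures this gives 308K and 29K nodes, a reduction of slightly more than 10X.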
10
Benefits of SWS (contd)
  • Run-time
    - Assume SPICE run-time is $O(N^{1.5})$, where N is
      the number of nodes.
    - Each window simulation is $10^{1.5} \approx 32$X
      faster than the golden simulation.
    - 16 simulations cover the mesh.
    - Overall speed-up factor: 2.
  • Can complete on fine meshes.
  • Is very accurate.
  • Suited to parallelization or grid-computing.
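
The run-time estimate follows from the 10X model-size reduction and the assumed O(N^1.5) cost. A minimal sketch of that arithmetic is shown here; the `workers` parameter models ideal parallel execution with no overhead, which is an assumption added for illustration and not a measured result.

```python
import math


def sws_speedup(node_reduction, num_windows, exponent=1.5, workers=1):
    """Estimate speed-up of SWS over one golden simulation.

    Returns (per_window, overall): how much faster a single window runs,
    and the overall factor after all windows have been simulated.
    """
    per_window = node_reduction ** exponent
    rounds = math.ceil(num_windows / workers)   # sequential batches of windows
    return per_window, per_window / rounds


per_window, overall = sws_speedup(node_reduction=10, num_windows=16)
_, overall_4cpu = sws_speedup(node_reduction=10, num_windows=16, workers=4)
```

Run sequentially, the 16 windows give roughly the factor of 2 quoted on the slide; under the idealized assumption, four workers would raise it to about 8.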

11
SWS Accuracy
12
Improving Accuracy of SWS
  • Noticed large errors outside the window and in
    the window periphery.
  • Solution
    - Add a border to the window w to form a new
      window w'.
    - Detailed model within w'.
    - Delay measurement only for FFs inside w.
    - Ignore noisy FFs in the border w' - w.
  • New windows are overlapping.
  • Improves accuracy at the expense of runtime.
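
The bordered-window idea can be sketched as a tiling of the mesh in which each core window is grown by a border before simulation. The function name, the half-open index ranges, and the border width of 4 cells used below are assumptions for illustration; the slides do not state a border width.

```python
def bordered_windows(n, w, border):
    """Yield (core, expanded) index ranges tiling an n x n mesh.

    core     : (r0, r1, c0, c1) half-open ranges; only FFs here are measured
    expanded : core grown by `border` cells per side and clipped to the
               mesh; this whole region gets the detailed model
    """
    for r0 in range(0, n, w):
        for c0 in range(0, n, w):
            r1, c1 = min(r0 + w, n), min(c0 + w, n)
            expanded = (max(r0 - border, 0), min(r1 + border, n),
                        max(c0 - border, 0), min(c1 + border, n))
            yield (r0, r1, c0, c1), expanded


# 64x64 mesh, 16x16 core windows, hypothetical 4-cell border.
windows = list(bordered_windows(64, 16, 4))
```

The cores partition the mesh, so every flip-flop is measured exactly once, while the expanded regions overlap their neighbors. The extra detailed area in each expanded region is the runtime cost the slide mentions.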

13
Flow of Improved SWS
14
Accuracy of SWS with Border
15
Accuracy of SWS With Border
16
Optimum Window Size
  • Main concerns
    - Memory limit: fit one simulation into memory
    - Run-time reduction: SPICE runtime $O(N^a)$,
      $a \approx 1.5$
    - Parallelism: prefer smaller windows for parallel
      execution
  • Experimental studies on a 64 by 64 mesh
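
The three concerns can be combined into a rough cost model for comparing candidate window sizes. The sketch below is a toy estimate, not the authors' selection procedure: it assumes uniformly distributed flip-flops, 2 mesh nodes per grid cell and 3 nodes per detailed flip-flop (consistent with the node counts quoted earlier in the deck), ideal parallelism, and it ignores clipping of the border at the mesh edge. All names are hypothetical.

```python
import math


def pick_window(n, candidates, border, total_ffs, max_nodes, a=1.5, workers=1):
    """Return (window_size, estimated_cost) with the lowest cost, or None.

    Cost is (sequential rounds of windows) * nodes**a, in arbitrary units.
    Candidates whose per-window model exceeds max_nodes are skipped.
    """
    best = None
    for w in candidates:
        side = min(w + 2 * border, n)               # detailed region edge length
        detailed_ffs = total_ffs * (side / n) ** 2  # uniform FF density assumed
        nodes = 2 * n * n + 3 * detailed_ffs        # mesh nodes + FF nodes
        if nodes > max_nodes:
            continue                                # does not fit in memory
        rounds = math.ceil(math.ceil(n / w) ** 2 / workers)
        cost = rounds * nodes ** a
        if best is None or cost < best[1]:
            best = (w, cost)
    return best


choice = pick_window(64, [8, 16, 32, 64], border=0,
                     total_ffs=100_000, max_nodes=1_000_000)
```

The trade-off in this model is that small windows shrink each simulation but multiply the number of runs, while the mesh itself is a fixed cost in every run. Tightening `max_nodes` or raising `workers` shifts the choice toward smaller windows.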

17
Optimum Window: 64x64 mesh, 100K FFs
18
Simulation with a real industry design
  • About 300k FFs
  • Parallel execution on a 4 processor machine

19
Conclusions
  • Clock mesh analysis is a difficult problem.
  • Proposed a new sliding window-based scheme to
    analyze clock meshes with respect to latency to
    FFs.
    - Accurate to within 1% of HSPICE.
    - Can complete on large mesh designs when HSPICE
      could not.
    - Is parallelizable.
  • Determined a strategy for picking the optimum
    window size.
  • Future work
    - Combine the SWS with model order reduction
      techniques
    - Jitter analysis