Title: Architecture-Aware FPGA Placement Using Metric Embedding
1Architecture-Aware FPGA Placement Using Metric
Embedding
- Padmini Gopalakrishnan
- Xin Li
- Larry Pileggi
- Electrical & Computer Engineering
- Carnegie Mellon University
2FPGAs and Regular Architectures
- Regular architectures, easy to manufacture, shorter turn-around times
- FPGAs, CPLDs, FPGA-like architectures
- Metal-mask or via configurable arrays of logic
- Increasing number of design starts
- Performance and sizes of these chips rapidly improving
- e.g., Virtex-5: 550 MHz, > 300,000 logic cells (not including memory and IP cores)
4FPGAs
- Complex and heterogeneous routing architectures
- Delay dominated by switches on a route
- Not geometric length
- Physical design must exploit routing architecture
- Not just minimize wirelength
5Delay Contours
Find contours of locations that have equal delay
from this location in terms of number of hops
Array of logic islands
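A minimal sketch of how such hop-count contours could be computed, assuming the array is modeled as a grid graph in which every traversed switch costs one hop. The function names and the `long_wire` parameter (a hypothetical segment that spans several islands for a single hop) are illustrative and not taken from the slides.

```python
from collections import deque

def hop_delays(width, height, source, long_wire=0):
    """Breadth-first hop counts from `source` on a width x height island array.

    Every move costs one hop. If long_wire > 1, horizontal segments that
    jump `long_wire` islands are also available for one hop (assumption).
    """
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if long_wire > 1:
        steps += [(long_wire, 0), (-long_wire, 0)]
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x, y = queue.popleft()
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist

def contours(dist):
    """Group locations by hop count: contour h holds every location h hops away."""
    rings = {}
    for loc, hops in dist.items():
        rings.setdefault(hops, []).append(loc)
    return rings

# Contours around the center of a 5 x 5 array with single-length wires only.
rings = contours(hop_delays(5, 5, source=(2, 2)))
```

With single-length wires only, the contours are Manhattan diamonds; enabling `long_wire` stretches them along the direction of the longer segments.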
6Delay proportional to Euclidean distance
Equi-delay Contours
7Delay proportional to Manhattan distance
Equi-delay Contours
9Delay along simple heterogeneous FPGA
Equi-delay Contours
11Delay along Virtex-like FPGA grid
Equi-delay Contours
12The Challenge and the Opportunity
- Exploit relationship between delay and routing grid
- Complex and FPGA-specific
- Geometric measures are a poor approximation
- Prior work tries to do this via different models
- Lookup tables: Rose FPGA '00, Leaver FPGA '01
- Empirical or statistical models: Bazargan DAC '03, Karnik ICCAD '95
- Detailed delay calculation: Rutenbar TCAD '98
- However...
- Lookup tables and detailed models don't scale well
- Empirical models depend on statistical data
13The Challenge and the Opportunity
- FPGA placement is a discrete assignment problem
- Simulated annealing scales poorly
- e.g., VPR (Rose FPL '97)
- Partitioning or clustering move-based
- e.g., Selvakkumaran FPGA '04, Bazargan DAC '03, Cong FPL '04
17The Challenge and the Opportunity
- Require initial optimization that is global,
architecture-aware, and efficient
Need new techniques cognizant of both the complex delay model and the discrete assignment nature of the placement problem!
FPGA Placement
18Our Approach
- Placement as graph embedding into chosen metric space
- Metric space
- Set of objects, distance function representing distance between objects
- A superset of inner-product spaces
- First studied by Maurice Fréchet in 1906
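As a small illustration of what the distance function of a metric space must satisfy, the sketch below checks the usual axioms (zero self-distance, positivity, symmetry, triangle inequality) on a finite distance matrix. The function name `is_metric` and the tolerance are assumptions made for this example.

```python
def is_metric(D, tol=1e-12):
    """Return True if the square matrix D satisfies the metric axioms."""
    n = len(D)
    for i in range(n):
        if abs(D[i][i]) > tol:                  # d(x, x) = 0
            return False
        for j in range(n):
            if i != j and D[i][j] <= 0:         # distinct points are apart
                return False
            if abs(D[i][j] - D[j][i]) > tol:    # symmetry
                return False
            for k in range(n):
                # triangle inequality: going via k is never shorter
                if D[i][j] > D[i][k] + D[k][j] + tol:
                    return False
    return True

# Hop distances along a line of three locations form a metric.
line = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
# A table whose direct entry exceeds the two-step path does not.
broken = [[0, 1, 5],
          [1, 0, 1],
          [5, 1, 0]]
```

Shortest-path distances on a connected graph with positive, symmetric edge costs always pass this check, which is why a routing graph can define a metric space.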
19Our Approach
- Placement as graph embedding into chosen metric space
- Metric space
- Set of objects, distance function representing distance between objects
- Graph Embedding
- Assign coordinates to each vertex in a metric space
- Typically try to maintain some original notion of distance between vertices
- Most embeddings incur distortion
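To make "distortion" concrete, here is a minimal sketch that uses one common definition from the metric embedding literature: the largest expansion times the largest contraction over all vertex pairs. The slides do not state which distortion measure is used, so this choice and the name `distortion` are assumptions.

```python
import math
from itertools import combinations

def distortion(graph_dist, coords):
    """Multiplicative distortion of an embedding.

    graph_dist[u][v] is the original distance between vertices u and v,
    coords[u] is the point assigned to u. Embedded points must be distinct,
    otherwise the smallest ratio is zero and the result is undefined.
    """
    ratios = [
        math.dist(coords[u], coords[v]) / graph_dist[u][v]
        for u, v in combinations(range(len(coords)), 2)
    ]
    return max(ratios) / min(ratios)

# A 4-cycle (shortest-path distances) embedded as the corners of a unit square.
cycle = [[0, 1, 2, 1],
         [1, 0, 1, 2],
         [2, 1, 0, 1],
         [1, 2, 1, 0]]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
d = distortion(cycle, square)
```

Adjacent vertices keep their distance exactly, but opposite corners end up closer in the plane than in the graph, so even this tiny embedding is not distortion-free.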
23Our Approach
- Placement as graph embedding into chosen metric space
2-D plane
Most placement algorithms (e.g., quadratic placement, QP) try to minimize some measure of distortion
27Our Approach
- Placement as graph embedding into chosen metric space
- Choose appropriate metric space for FPGA routing grids
Minimize distortion with respect to the FPGA routing grid
FPGA routing grid
28Our Approach
- Placement as graph embedding into chosen metric space
- Choose appropriate metric space for FPGA routing grids
- Embedding itself is formulated as an assignment problem
31Overall Placement Methodology
Phase 1: Architecture-aware initial placement solution
Our Focus: CAPRI (Convex Assigned Placement for Regular ICs)
Phase 2: Local optimization to improve routability and specific critical paths
Apply existing FPGA placement technique
32CAPRI Overview
Embed netlist graph in metric space defined by
architecture graph
33CAPRI Formulation
Location IDs (1, 2, ..., n)
FPGA routing arch. abstracted as graph
Design netlist abstracted as graph
Define a metric space for routing architecture
Binary Quadratic Assignment Problem Formulation: convex objective to minimize distortion, non-convex solution space
Fast heuristic to find good assignment via matrix decomposition and graph matching
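One way to turn the architecture graph into a metric space is to take all-pairs shortest-path delays between locations. The sketch below assumes this choice and uses Floyd-Warshall on a small undirected graph; the edge list, the delay values, and the name `shortest_path_metric` are hypothetical.

```python
import math

def shortest_path_metric(n, edges):
    """All-pairs shortest-path distances between n locations.

    edges is an iterable of (u, v, delay) with positive delay; the graph
    is treated as undirected. Unreachable pairs stay at infinity.
    """
    D = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, delay in edges:
        D[u][v] = D[v][u] = min(D[u][v], delay)
    for k in range(n):              # allow paths through location k
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Four locations in a row with unit-delay neighbor wires, plus one
# hypothetical long wire from location 0 to location 3.
row = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
D_plain = shortest_path_metric(4, row)
D_long = shortest_path_metric(4, row + [(0, 3, 1)])
```

The long wire makes locations 0 and 3 neighbors in delay terms even though they are geometrically far apart, which is the kind of effect a purely geometric distance cannot capture.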
37CAPRI Formulation
Logic-node IDs (1, 2, ..., n)
[Figure: n x n matrix, entries indexed by i and j]
39CAPRI Formulation
Assignment represented by a permutation matrix P
P(k, i) = 1: logic node i is assigned to location k
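A minimal sketch of this encoding, assuming the assignment is given as a list `location_of` where `location_of[i]` is the location of logic node i (both the list representation and the function name are illustrative):

```python
def permutation_matrix(location_of):
    """Build P with P[k][i] = 1 when logic node i is assigned to location k."""
    n = len(location_of)
    if sorted(location_of) != list(range(n)):
        raise ValueError("each location must be used exactly once")
    P = [[0] * n for _ in range(n)]
    for node, location in enumerate(location_of):
        P[location][node] = 1
    return P

# Node 0 at location 2, node 1 at location 0, node 2 at location 1.
P = permutation_matrix([2, 0, 1])
```

Every row and every column of P contains exactly one 1, which is what makes the problem a binary (0/1) assignment problem.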
40CAPRI Formulation
Assignment represented by a permutation matrix P
Action of P on Design Graph represented by $P^{T} D_D P$
41CAPRI Formulation
Minimize Distortion
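The objective itself is not legible in this transcript. As a hedged illustration only, the sketch below assumes a sum-of-squared-differences cost between design distances and the architecture distances of the assigned locations, and finds the best assignment by exhaustive search. The names `placement_cost` and `best_assignment`, the cost form, and the index convention (`location_of[i]` is the location of logic node i) are all assumptions; exhaustive search is used here only to show what is being minimized, not as the heuristic the slides describe.

```python
from itertools import permutations

def placement_cost(D_arch, D_design, location_of):
    """Squared mismatch between design and assigned-location distances.

    With P[k][i] = 1 for k = location_of[i], the entry
    D_arch[location_of[i]][location_of[j]] equals (P^T D_arch P)[i][j].
    """
    n = len(location_of)
    return sum(
        (D_arch[location_of[i]][location_of[j]] - D_design[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )

def best_assignment(D_arch, D_design):
    """Try all n! assignments; feasible only for very small n."""
    n = len(D_arch)
    return min(
        permutations(range(n)),
        key=lambda p: placement_cost(D_arch, D_design, p),
    )

# Three locations in a row; design is the chain node0 - node2 - node1.
D_arch = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
D_design = [[0, 2, 1], [2, 0, 1], [1, 1, 0]]
best = best_assignment(D_arch, D_design)
```

In this toy case a zero-cost assignment exists (node 2 in the middle), but the n! growth of the search space is what makes the exact problem impractical at real design sizes.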
44CAPRI Formulation
- Integer optimization is hard!
- Convex relaxation does not scale
- Apply series of approximations to find good solution
48Implementation of CAPRI
Enables efficient distance matrix construction
Concurrent placement of the components: not yet legal on the FPGA surface
Each node is legalized to a legal location on the FPGA
50Implementation of CAPRI
Singular values of typical truncated distance
matrix
51Implementation of CAPRI
- Low-rank (rank-k) approximation of matrices
- Orthogonal iterations
- Choose k = 2
- Projecting nodes and locations onto a k-dimensional hyperplane
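The steps above can be sketched with a textbook orthogonal (subspace) iteration on a symmetric distance matrix, followed by a projection of each row onto the resulting k-dimensional subspace. This is a plain-Python illustration under the assumption that the matrix is symmetric with a gap after its k largest-magnitude eigenvalues; the function names, iteration count, and seed are hypothetical, and the slides do not give the exact procedure.

```python
import math
import random

def orthonormalize(vectors):
    """Gram-Schmidt: return an orthonormal basis spanning `vectors`."""
    basis = []
    for v in vectors:
        v = list(v)
        for q in basis:
            dot = sum(x * y for x, y in zip(v, q))
            v = [x - dot * y for x, y in zip(v, q)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis

def orthogonal_iteration(D, k=2, iters=200, seed=0):
    """Approximate the dominant k-dimensional invariant subspace of symmetric D."""
    rng = random.Random(seed)
    n = len(D)
    Q = orthonormalize(
        [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    )
    for _ in range(iters):
        # Multiply every basis vector by D, then re-orthonormalize.
        Z = [[sum(D[i][j] * q[j] for j in range(n)) for i in range(n)] for q in Q]
        Q = orthonormalize(Z)
    return Q

def project(D, Q):
    """k-D coordinates of each row of D: inner products with the basis vectors."""
    return [[sum(d * q for d, q in zip(row, qv)) for qv in Q] for row in D]

# Hop distances between five locations on a line, projected to 2-D.
D_line = [[abs(i - j) for j in range(5)] for i in range(5)]
coords = project(D_line, orthogonal_iteration(D_line, k=2))
```

Each iteration costs only k matrix-vector products, so for small k the projection is far cheaper than a full decomposition of the n x n matrix.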
52Implementation of CAPRI
Projection of Architecture Graph (simple 2D grid). Equivalent to a 2D visualization of the metric space of the graph.
53Implementation of CAPRI
Projection of Architecture Graph (Virtex-like grid). Equivalent to a 2D visualization of the metric space of the graph.
54Implementation of CAPRI
Projection of Design Graph (design name: des) into 2 dimensions. The graph itself inherently has a higher number of dimensions.
In general, small k works: empirically, only a small difference was found between k = 2, 3, and n.
58Implementation of CAPRI
[Figure: nodes and locations as points in k-D space, connected as a bipartite graph]
- Build bipartite graph dynamically
- Dynamic (online) matching
- Edge costs include congestion component
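A rough sketch of the idea, assuming a simple greedy rule: nodes arrive one at a time and each takes the cheapest free location, where the edge cost is the k-D distance plus a weighted congestion term. The greedy rule, the `congestion` list, and the `weight` parameter are assumptions for illustration; the slides do not specify the matching algorithm or the cost function.

```python
import math

def greedy_online_matching(node_pts, loc_pts, congestion=None, weight=1.0):
    """Assign each node to a distinct location, one node at a time.

    node_pts and loc_pts are lists of k-D points. congestion[k] is an
    optional penalty for location k. Returns {node index: location index}.
    """
    if len(node_pts) > len(loc_pts):
        raise ValueError("more nodes than locations")
    congestion = congestion or [0.0] * len(loc_pts)
    free = set(range(len(loc_pts)))
    assignment = {}
    for node, point in enumerate(node_pts):
        # Edges to the free locations are created only when the node arrives.
        best = min(
            free,
            key=lambda k: (math.dist(point, loc_pts[k]) + weight * congestion[k], k),
        )
        assignment[node] = best
        free.remove(best)
    return assignment

nodes = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.1)]
locations = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
placed = greedy_online_matching(nodes, locations)
```

Because every node ends up on its own location, the result of the matching is also a legal placement in this simplified setting.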
59Experimental Highlights
Phase 1: CAPRI initial architecture-aware placement solution
Phase 2: Local optimization to improve routability and specific critical paths
Cool anneal in VPR used for our experiments
- Compare detailed routing results for two architectures
- CAPRI followed by cool anneal in VPR
- VPR alone
60Experimental Highlights
- Identical placement footprints for CAPRI and VPR flows
- I/Os located along the periphery and placed simultaneously
- Delay based on smallest width of routing channel possible
61Delay Improvement (worst critical paths)
Minimal routing tracks (identical for both CAPRI
and VPR flows)
Simple Heterogeneous Grid
Virtex-like Routing Grid
62Delay Improvement (worst critical paths)
2 extra tracks added to elliptic
Simple Heterogeneous Grid
Virtex-like Routing Grid
63Runtime Comparison
CAPRI < 5% of total placement runtime (even with modest-sized problems)
Facilitates exploration
64Future Work
- Model routing resources
- Hierarchical methodology
- Explore different options for legalization and local optimization
- FPGA architecture exploration and evaluation
65Acknowledgements
- Prof. Rob Rutenbar (CMU), Abhishek Ranjan (Xilinx), and Brian Taylor (CMU) for their comments
- Prof. Radu Marculescu (CMU), Dr. Ruchir Puri (IBM), and Steven Teig (Tabula) for their feedback on an early version of this work