Title: Floorplan Design for Multi-Million Gate FPGAs
1. Floorplan Design for Multi-Million Gate FPGAs
Lei Cheng
Martin D.F. Wong
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
2. Outline
- Introduction
- Floorplanning for FPGAs with heterogeneous resources
- Realizations for a module
- Irreducible realization list
- Algorithm
- Calculate irreducible realization lists
- Reduce complexity
- Compaction and postprocessing
- Experimental results
3. FPGA Architecture
- Xilinx XC3S5000
  - 8320 CLBs
  - 104 RAMs
  - 104 multipliers
4. FPGA Floorplanning
- Place a set of modules onto an FPGA chip.
- Resource requirement vector of a module: <n1, n2, n3>
  - n1 is the number of CLBs
  - n2 is the number of RAMs
  - n3 is the number of multipliers
- Place each module in a rectangular region satisfying its resource requirement.
- No overlapping among modules.
5. Slicing Floorplan
[Figure: a slicing floorplan and its slicing tree]
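A slicing tree can be written down as a small nested structure. The sketch below is illustrative only: the tuple encoding, the helper names `postorder` and `modules`, and the five-module tree are assumptions, not the tree shown in the slide figure.

```python
# A slicing tree as nested tuples: ("h" | "v", left_subtree, right_subtree).
# Leaves are module names. This particular tree is a made-up example.
tree = ("v", ("h", "A", "B"), ("h", "C", ("v", "D", "E")))

def postorder(node):
    """Return the leaves and cut labels of a slicing tree in post-order."""
    if isinstance(node, str):          # leaf: a single module
        return [node]
    cut, left, right = node
    return postorder(left) + postorder(right) + [cut]

def modules(node):
    """Return the module names at the leaves, left to right."""
    return [tok for tok in postorder(node) if tok not in ("h", "v")]

print(postorder(tree))
print(modules(tree))
```

Each internal node is a horizontal (h) or vertical (v) cut, so evaluating a floorplan amounts to a bottom-up pass over this tree.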
6. Module Realizations
- Employ a coordinate system on the chip.
- A realization r for a module θ is a rectangular region <x, y, w, h>:
  - (x, y) is the coordinate of r's lower-left corner.
  - w is the width of r.
  - h is the height of r.
  - r satisfies the resource requirement of θ.
7. Example
Resource requirement of module θ: <10, 2, 1> (10 CLBs, 2 RAMs, and 1 multiplier)
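Checking whether a rectangle is a realization reduces to counting the resources it covers. The following is a minimal sketch on a toy chip; the column layout `COLUMNS`, the row count `ROWS`, and the function names are assumptions made for illustration and do not model the XC3S5000.

```python
# Toy chip (an assumption, not the XC3S5000 layout): each column holds one
# resource type, and every cell in a column provides one unit of it.
COLUMNS = "CCRCCMCC"   # C = CLB, R = RAM, M = multiplier
ROWS = 6
KIND = {"C": 0, "R": 1, "M": 2}

def resources(x, y, w, h):
    """Count (CLBs, RAMs, multipliers) inside the rectangle <x, y, w, h>.

    On this toy chip the counts do not depend on y, because columns are uniform.
    """
    counts = [0, 0, 0]
    for col in COLUMNS[x:x + w]:
        counts[KIND[col]] += h
    return tuple(counts)

def is_realization(x, y, w, h, need):
    """True if <x, y, w, h> lies on the chip and covers the requirement."""
    on_chip = (x >= 0 and y >= 0 and w > 0 and h > 0
               and x + w <= len(COLUMNS) and y + h <= ROWS)
    return on_chip and all(have >= want
                           for have, want in zip(resources(x, y, w, h), need))

need = (10, 2, 1)                         # the <10, 2, 1> requirement above
print(is_realization(0, 0, 6, 3, need))   # covers 12 CLBs, 3 RAMs, 3 multipliers
print(is_realization(0, 0, 6, 2, need))   # covers only 8 CLBs
```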
8. Dominance Relation
- S_θ: the set of all realizations for a module θ
- r1, r2 are two realizations from S_θ
- r1 dominates r2 iff r1 lies inside r2:
  - x(r1) ≥ x(r2)
  - y(r1) ≥ y(r2)
  - x(r1) + w(r1) ≤ x(r2) + w(r2)
  - y(r1) + h(r1) ≤ y(r2) + h(r2)
[Figure: realizations r1 and r2]
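The dominance test is a containment check on two rectangles. This sketch assumes the reading in which r1 dominates r2 when r1 is contained in r2, which is the reading consistent with keeping only the smallest realizations; the `Rect` type and the function name are illustrative.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x: int   # lower-left corner
    y: int
    w: int   # width
    h: int   # height

def dominates(r1: Rect, r2: Rect) -> bool:
    """True if r1 lies inside r2 (assumed reading of the dominance relation)."""
    return (r1.x >= r2.x and r1.y >= r2.y
            and r1.x + r1.w <= r2.x + r2.w
            and r1.y + r1.h <= r2.y + r2.h)

inner = Rect(1, 1, 2, 2)
outer = Rect(0, 0, 4, 4)
print(dominates(inner, outer))
print(dominates(outer, inner))
```

A dominated realization is redundant: it covers everything the smaller one does while occupying more of the chip.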
9. Irreducible Realization List (IRL)
- An irreducible realization list for module θ:
  - L_θ(x, y): a list of realizations.
  - The starting point of each realization is (x, y).
  - No other realization starting from (x, y) dominates any of these realizations.
  - Realizations of L_θ(x, y) are always sorted by height, from high to low.
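For a single module, an IRL at one point can be built by trying each width and keeping the smallest height that works. The sketch below does this by brute force on a toy chip; the layout `COLUMNS`, `ROWS`, and the helper names are assumptions for illustration, not the procedure used in the talk.

```python
# Toy chip (an assumption): one resource type per column, one unit per cell.
COLUMNS = "CCRCCMCC"   # C = CLB, R = RAM, M = multiplier
ROWS = 6
KIND = {"C": 0, "R": 1, "M": 2}

def satisfies(x, w, h, need):
    """True if a w-by-h rectangle starting at column x covers the requirement."""
    counts = [0, 0, 0]
    for col in COLUMNS[x:x + w]:
        counts[KIND[col]] += h
    return all(have >= want for have, want in zip(counts, need))

def irl_at(x, y, need):
    """Irreducible realizations <x, y, w, h> starting at (x, y), tall to short."""
    irl = []
    for w in range(1, len(COLUMNS) - x + 1):      # narrow to wide
        for h in range(1, ROWS - y + 1):          # smallest height first
            if satisfies(x, w, h, need):
                # Keep this width only if it is strictly shorter than every
                # narrower realization; otherwise a narrower one dominates it.
                if not irl or h < irl[-1][3]:
                    irl.append((x, y, w, h))
                break
    return irl

print(irl_at(0, 0, (10, 2, 1)))
```

Because widths are visited in increasing order and only strictly shorter shapes are kept, the list comes out sorted from high to low without a separate sort.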
10. Irreducible Realization List (IRL)
11. Algorithm
- Simulated annealing (SA) searches on slicing floorplans.
[Figure: a slicing floorplan of modules A, B, C, D, E and its slicing tree with h and v cut nodes]
12. Reduce Space Complexity
- Each FPGA chip is a two-dimensional array of a basic pattern.
- We only need to compute IRLs for the points on the basic pattern.
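One way to exploit the repeating pattern is to store IRLs per pattern point and translate them on lookup. The sketch below is an assumption-laden illustration: the pattern size, the cache layout, and the function names are hypothetical, and it ignores realizations that would be clipped by the chip boundary.

```python
# Hypothetical basic pattern size, in chip grid units.
PATTERN_W, PATTERN_H = 4, 2

def pattern_point(x, y):
    """Map a chip coordinate to its equivalent point on the basic pattern."""
    return (x % PATTERN_W, y % PATTERN_H)

def lookup_irl(cache, x, y):
    """Translate the (w, h) shapes stored for the pattern point to start at (x, y)."""
    return [(x, y, w, h) for (w, h) in cache[pattern_point(x, y)]]

# IRLs are stored once per pattern point, not once per chip point.
cache = {(1, 0): [(2, 3), (4, 1)]}
print(lookup_irl(cache, 9, 4))   # (9, 4) maps to pattern point (1, 0)
```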
13. Reduce Space Complexity
- It is practical to impose aspect ratio bounds.
- For performance reasons, we prefer each module to have an aspect ratio close to 1.
  - Short internal wire length.
Dr. Salil, "Multi-million gate FPGA physical design challenges," ICCAD'03
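An aspect ratio bound can be applied as a simple filter on candidate shapes, which also shortens every list that has to be stored. In this sketch the bounds 0.5 and 2.0 and the function name are assumptions; the talk does not state the values it used.

```python
def within_aspect_bounds(shapes, lo=0.5, hi=2.0):
    """Keep (w, h) shapes whose aspect ratio w / h lies in [lo, hi]."""
    return [(w, h) for (w, h) in shapes if lo <= w / h <= hi]

print(within_aspect_bounds([(1, 8), (2, 4), (3, 3), (8, 1)]))
```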
14. Compute IRLs at Internal Nodes
- Compute IRLs for all points on the pattern for an internal node.
- When computing the IRL for one point (x, y):
  - Let (r_1, r_2, ..., r_k) be the IRL of the left child starting from (x, y).
  - Combine each r_i with the set of irreducible realizations of the right child that have heights between h(r_i) and h(r_{i-1}).
  - Combine r_i with the irreducible realization of the right child whose height is less than but closest to h(r_i).
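The combination rule above can be sketched for a vertical cut, where the right child starts immediately to the right of the left child's realization. This is an interpretation, not the authors' code: shapes are (w, h) pairs, `right_irl_at` is a hypothetical callback returning the right child's IRL at a point, and the toy data assumes the right child has the same IRL everywhere.

```python
def prune(shapes):
    """Drop dominated (w, h) shapes; the result is sorted from tall to short."""
    kept = []
    for w, h in sorted(set(shapes)):          # narrow to wide
        if not kept or h < kept[-1][1]:       # must be strictly shorter
            kept.append((w, h))
    return kept

def combine_vertical(left_irl, right_irl_at, x, y):
    """IRL of a vertical-cut node at (x, y), as (w, h) shapes.

    left_irl      -- shapes of the left child at (x, y), tall to short
    right_irl_at  -- hypothetical callback: (x, y) -> shapes of the right child
    """
    candidates = []
    prev_h = float("inf")                     # plays the role of h(r_0)
    for wl, hl in left_irl:
        for wr, hr in right_irl_at(x + wl, y):    # tall to short
            if hr >= prev_h:
                continue                      # covered by the previous, taller left shape
            candidates.append((wl + wr, max(hl, hr)))
            if hr <= hl:
                break                         # shorter right shapes are only wider
        prev_h = hl
    return prune(candidates)

# Homogeneous toy data (an assumption): the right IRL is the same at every point.
left = [(1, 3), (2, 2), (3, 1)]

def right_irl(x, y):
    return [(1, 4), (2, 2), (4, 1)]

print(combine_vertical(left, right_irl, 0, 0))   # [(2, 4), (3, 3), (4, 2), (7, 1)]
```

On this data the pruned search examines 5 pairs instead of all 9 and returns the same list as combining every pair and then removing dominated shapes.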
15. Compute IRLs at Internal Nodes
- Why do we only consider realizations with heights between h(r_i) and h(r_{i-1})?
[Figure: left-child realizations r_{i-1} and r_i]
16. Compute IRLs at Internal Nodes
- Why do we only consider the tallest realization among all realizations with heights less than h(r_i)?
[Figure: left-child realization r_i]
17. Time Complexity
- The complexity of computing one IRL for an internal node is [formula missing].
  - The term l is the maximum of the width and height of the chip.
- The complexity of evaluating a slicing tree is [formula missing].
  - The term p is the number of points on the pattern.
  - The term m is the number of modules.
18. Experimental Result
- Xilinx XC3S5000 FPGA
  - 8320 CLBs
  - 104 RAMs
  - 104 multipliers
  - l = 104
  - p = 352
- The result
  - m = 21
  - 72 CLBs
  - 88 RAMs
  - 86 multipliers
  - Runtime: 59 s
19. Compaction
- A rectangular realization may contain more resources than the module requires.
- Some problems cannot be solved if we only allow rectangular realizations.
- Compact vertically.
20. Example
- We have 3 modules θ1, θ2, θ3, and their resource requirements are <7,1,1>, <8,1,1>, <12,1,1>.
The RAM and multiplier cannot be used by other modules.
21. Experimental Result
- Xilinx XC3S5000 FPGA
- The result
  - m = 23
  - 72 CLBs
  - 84 RAMs
  - 84 multipliers
  - Runtime
    - Slicing only: 27 s
    - With compaction: 2 s
22. Postprocessing after Compaction
- Problems with compaction
  - Some modules are placed in undesirable shapes.
  - A large amount of white space is left at the top of the chip.
- Our observation
  - A module can be placed freely between two lines.
  - The lower contour line records the placements of modules placed before this module.
  - The upper contour line records the placements of modules placed after this module.
- Postprocess all modules in reverse order of the compaction.
23. Experimental Result
[Figure: floorplan before and after postprocessing]
24. Experimental Results
25. Conclusions
- We propose the first FPGA floorplanning algorithm targeted for FPGAs with heterogeneous resources.
- Our algorithm uses simulated annealing to search on slicing floorplans.
- For each slicing floorplan, our algorithm computes IRLs for all the internal nodes efficiently on a basic pattern.
- Our algorithm uses compaction and postprocessing to solve more problems and to optimize floorplans.