Title: CS184a: Computer Architecture (Structure and Organization)
1CS184a: Computer Architecture (Structure and Organization)
- Day 14: February 7, 2005
- Interconnect 2: Wiring Requirements and Implications
2Previously
- Need for Interconnect
- Why the simplest things don't work
- Bus
- Crossbar
- Need to understand/exploit structure in our
interconnect problem
3Today
- Wiring Requirements
- Rent's Rule
- A model of structure
- Implications
4Wires and VLSI
- Have a small, finite number of wiring layers
- E.g.
- one for horizontal wiring
- one for vertical wiring
- Assume wires can run over gates
5Visually: Wires and VLSI
6Important Consequence
7Thompson's Argument
- The minimum area of a VLSI component is bounded by the larger of
- The area to hold all the gates
- $A_{chip} \ge N \times A_{gate}$
- The area required by the wiring
- $A_{chip} \ge (N_{horizontal} W_{wire}) \times (N_{vertical} W_{wire})$
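As a rough illustration of the bound above, the sketch below takes the larger of the gate area and the wiring area. The function name and the example numbers are hypothetical and chosen only to show the arithmetic.

```python
def min_chip_area(n_gates, a_gate, n_horizontal, n_vertical, w_wire):
    """Lower bound on chip area: the larger of gate area and wiring area."""
    gate_area = n_gates * a_gate
    # Horizontal wires stack along one side and vertical wires along the other,
    # so the wiring needs a rectangle of these two extents.
    wire_area = (n_horizontal * w_wire) * (n_vertical * w_wire)
    return max(gate_area, wire_area)


# Illustrative units only (assumed values, not from the lecture).
print(min_chip_area(n_gates=1000, a_gate=100,
                    n_horizontal=500, n_vertical=500, w_wire=1))
```

With these assumed numbers the wiring term (250000) exceeds the gate term (100000), so wiring sets the bound.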
8How many wires?
- We can get a lower bound on the total number of horizontal (vertical) wires by considering the bisection of the computational graph
- Cut the graph of gates in half
- Minimize connections between halves
- Count number of connections in cut
- Gives a lower bound on number of wires
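The cut-and-count procedure can be sketched by brute force for small graphs. This is only an illustration: it assumes an even number of nodes and tries every equal split, which is exponential in the graph size. The function name and the ring example are hypothetical.

```python
from itertools import combinations


def bisection_width(nodes, edges):
    """Minimum number of edges cut over all equal-size splits (brute force).

    Assumes an even number of nodes.
    """
    nodes = list(nodes)
    half = len(nodes) // 2
    first = nodes[0]
    best = None
    # Pin the first node to the left half so each split is tried only once.
    for rest in combinations(nodes[1:], half - 1):
        left = {first, *rest}
        cut = sum((u in left) != (v in left) for u, v in edges)
        if best is None or cut < best:
            best = cut
    return best


# An 8-node ring: any way of halving it must cut at least 2 edges.
ring = [(i, (i + 1) % 8) for i in range(8)]
print(bisection_width(range(8), ring))
```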
9Bisection
- Bisection width = 3 (example figure)
10Next Question
- In general, if we
- Cut design in half
- Minimizing cut wires
- How many wires will be in the bisection?
11Arbitrary Graph
- Graph with N nodes
- Cut in half
- N/2 gates on each side
- Worst-case
- Every gate output on each side is used somewhere on the other side
- Cut contains N wires
12Arbitrary Graph
- For a random graph
- Something proportional to this is likely
- That is
- Given a random graph with N nodes
- The number of wires in the bisection is likely to be $c \times N$
13Particular Computational Graphs
- Some important computations have exactly this property
- FFT (Fast Fourier Transform)
- Sorting
14FFT
15FFT
- Can implement with N/2 nodes
- Group row together
- Any bisection will cut N/2 wire bundles
- True for any reordering
16Assembling what we know
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (N_{horizontal} W_{wire}) \times (N_{vertical} W_{wire})$
- $N_{horizontal} = c \times N$
- $N_{vertical} = c \times N$
- Bound holds recursively in the graph
- $A_{chip} \ge (cN W_{wire}) \times (cN W_{wire})$
17Assembling
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (cN W_{wire}) \times (cN W_{wire})$
- $A_{chip} \ge (cN W_{wire})^2$
- $A_{chip} \ge N^2 \times c'$, where $c' = (c W_{wire})^2$
18Result
- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge N^2 \times c'$
- Wire area grows faster than gate area
- Wire area grows with the square of gate area
- For sufficiently large N,
- Wire area dominates gate area
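To make "sufficiently large N" concrete, one can solve for the point where the wiring bound overtakes the gate bound: $N^2 c' \ge N A_{gate}$ once $N \ge A_{gate}/c'$. The sketch below assumes the constant in the wiring bound is $(c W_{wire})^2$; the function name and numbers are hypothetical.

```python
def crossover_gate_count(a_gate, c, w_wire):
    """N at which the wiring bound (c*N*w_wire)**2 equals the gate bound N*a_gate."""
    c_prime = (c * w_wire) ** 2  # assumed form of the wiring constant
    return a_gate / c_prime


# Illustrative values only: beyond this N, wire area sets the chip size.
print(crossover_gate_count(a_gate=100, c=0.5, w_wire=1))
```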
19Intuitive Version
- Consider a square region of a chip with side $s$
- Gate capacity in the region goes as area ($s^2$)
- Wiring capacity into the region goes as perimeter ($4s$)
- Perimeter grows more slowly than area
- Wire capacity saturates before gate capacity
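A small numeric sketch of the area-versus-perimeter argument, assuming a square region of side s measured in gate pitches with one wire per unit of perimeter (both simplifying assumptions, not from the slides):

```python
def region_capacity(s):
    """Return (gates that fit, wires that can cross the boundary) for side s."""
    gates = s * s   # capacity goes as area
    wires = 4 * s   # wiring into the region goes as perimeter
    return gates, wires


for s in (4, 16, 64):
    gates, wires = region_capacity(s)
    # Gates per boundary wire grows as s/4, so wiring runs out first.
    print(s, gates, wires, gates / wires)
```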
20Result
- $A_{chip} \ge N^2 \times c'$
- Wire area grows with the square of gate area
- Troubling
- To double the size of our computation
- Must quadruple the size of our chip!
21So what?
- What do we do with this observation?
22First Observation
- Not all designs have this large a bisection
- What is typical?
23Array Multiplier
[Figure: array multiplier built from 16 "Mpy bit" cells]
24Shift Register
[Figure: shift register built from 16 "reg" cells]
25Architecture ⇔ Structure
- Typical architecture trick
- exploit expected problem structure
- What structure do we have?
- Impact on resources required?
26Bisection Bandwidth
- Bisection bandwidth of design
- ⇒ lower bound on network bisection bandwidth
- Important, first-order property of a design
- Measure to characterize
- Rather than assume worst case
- Design with more locality
- ⇒ lower bisection bandwidth
- Enough?
27Characterizing Locality
- A single cut does not capture locality within the halves
- Cut again
- ⇒ recursive bisection
28Regularizing Growth
- How do bisection bandwidths shrink (grow) at different levels of the bisection hierarchy?
- Basic assumption: geometric
- $1$
- $1/\alpha$
- $1/\alpha^2$
29Geometric Growth
- $(F, \alpha)$-bifurcator
- $F$ bandwidth at root
- Geometric regression $\alpha$ at each level
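Read this way, the bandwidth at each level of the bisection hierarchy is the root bandwidth divided by a further factor of α per level. A minimal sketch, with a hypothetical function name and example values:

```python
def bifurcator_bandwidths(f_root, alpha, levels):
    """Bandwidth at each level of the bisection hierarchy, root first."""
    return [f_root / alpha ** level for level in range(levels)]


# Assumed example: F = 64 at the root, alpha = 2, four levels.
print(bifurcator_bandwidths(64, 2, 4))
```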
30Good Model?
Log-log plot ⇒ straight lines represent geometric growth
31Rent's Rule
- In the world of circuit design, an empirical relationship to capture
- $IO = c N^p$
- $0 \le p \le 1$
- $p$ characterizes interconnect richness
- Typical: $0.5 \le p \le 0.7$
- High-speed logic: $p = 0.67$
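Because the relationship is a power law, it is a straight line on a log-log plot, and c and p can be estimated with an ordinary least-squares fit of log(IO) against log(N). The sketch below is one way to do that; the function name is hypothetical and the data are synthetic, generated from assumed values c = 3 and p = 0.6, not measurements.

```python
import math


def fit_rent(sizes, io_counts):
    """Least-squares fit of log(IO) = log(c) + p*log(N); returns (c, p)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(io) for io in io_counts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                        # slope of the log-log line
    c = math.exp(mean_y - p * mean_x)    # intercept, mapped back from log space
    return c, p


sizes = [16, 64, 256, 1024]
io_counts = [3 * n ** 0.6 for n in sizes]  # synthetic data for illustration
c, p = fit_rent(sizes, io_counts)
print(round(c, 3), round(p, 3))
```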
32Rent's Rule
- In the world of circuit design, an empirical relationship to capture
- $IO = c N^p$
- Compare $(F, \alpha)$-bifurcator
- $\alpha = 2^p$
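One way to see the correspondence, assuming each bisection level halves the gate count and Rent's Rule holds at every level, is to take the ratio of IO across one level:

```latex
\[
\alpha \;=\; \frac{IO(N)}{IO(N/2)}
       \;=\; \frac{c\,N^{p}}{c\,(N/2)^{p}}
       \;=\; 2^{p}
\]
```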
33Rent and Locality
- Rent and IO capture/quantify locality
- local consumption
- local fanout
34What does this tell us about design?
- Recursive bandwidth requirements in network
35As a function of Bisection
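- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge (N_{horizontal} W_{wire}) \times (N_{vertical} W_{wire})$
- $N_{horizontal} = N_{vertical} = IO = c N^p$
- $A_{chip} \ge (c N^p W_{wire})^2 \propto N^{2p}$
- If $p < 0.5$
- $A_{chip} \propto N$
- If $p > 0.5$
- $A_{chip} \propto N^{2p}$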
36In terms of Rent's Rule
- If $p < 0.5$, $A_{chip} \propto N$
- If $p > 0.5$, $A_{chip} \propto N^{2p}$
- Typical designs have $p > 0.5$
- interconnect dominates
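The two cases can be folded into a single growth exponent: area grows as N to the power max(1, 2p). The sketch below uses that to estimate how much the area bound grows when N grows by a given factor. The function names are hypothetical, and constant factors are ignored.

```python
def area_growth_exponent(p):
    """Exponent e such that the chip-area bound grows as N**e."""
    # Gate area grows as N; wiring area grows as N**(2p). The larger one wins.
    return max(1.0, 2.0 * p)


def area_ratio(p, growth):
    """Factor by which the area bound grows when N grows by `growth`."""
    return growth ** area_growth_exponent(p)


# Doubling N: gate-dominated (p = 0.4) versus wire-dominated (p = 0.7).
print(area_ratio(0.4, 2), area_ratio(0.7, 2))
```

At p = 1 doubling N quadruples the area, which is the worst case derived earlier from the arbitrary-graph bisection.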
37What does this tell us about design?
- Recursive bandwidth requirements in network
- Lower bound on resource requirements
- N.B. necessary but not sufficient condition on network design
- I.e. design must also be able to use the wires
38What does this tell us about design?
- Interconnect lengths
- Intuition
- If $p > 0.5$, everything cannot be nearest neighbor
- As $p$ grows, so do wire distances
Can think of $p$ as dimensionality: $p = 1 - 1/d$
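A short sketch of the dimensionality reading, converting between d and p in both directions. The function names are hypothetical; the inverse simply rearranges p = 1 - 1/d and is only defined for p < 1.

```python
def rent_exponent_for_dimension(d):
    """Rent exponent p associated with a d-dimensional layout: p = 1 - 1/d."""
    return 1 - 1 / d


def dimension_for_rent_exponent(p):
    """Inverse mapping: d = 1 / (1 - p), defined for p < 1."""
    return 1 / (1 - p)


# d = 2 gives p = 0.5 (area vs. perimeter); d = 3 gives p of about 0.67.
print(rent_exponent_for_dimension(2), rent_exponent_for_dimension(3))
```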
39Interconnect Lengths
- Side is $\sqrt{N}$
- IO crossing it is $N^p$
- What's the minimum length for the longest wires?
40What does this tell us about design?
- Interconnect lengths (more general)
- $IO = (n^2)^p$ crossing distance $n$
- End at exactly distance $n$
41What does this tell us about design?
- $E(length) = 1 \times (\text{number at length } 1) + 2 \times (\text{number at length } 2) + 3 \times (\text{number at length } 3) + \dots$
Assume i.i.d. sources: equally likely to originate at any point in the area.
42Math
43Math continued
44Math continued
45Math Continued
46What does this tell us about design?
- Interconnect lengths
- $IO = (n^2)^p$ cross distance $n$
- At exactly distance $n$
- $E(length) = \Omega(N^{p-0.5})$
- $p > 0.5$
True even with multiple metal layers.
47Delays
Recall from Day 6
- Logical capacities growing
- Wirelengths?
- No locality → $k$
- Rent's Rule
- $L \to n^{p-0.5}$
- $p > 0.5$
48Capacity
- Rent: $IO = C N^p$
- $p > 0.5$
- $A = C N^{2p}$
- $N = (A/C)^{1/(2p)}$
- Logical Area → $k^2$
- $N_2 = ((k^2 A)/C)^{1/(2p)}$
- $N_2 = (A/C)^{1/(2p)} (k^2)^{1/(2p)}$
- $N_2 = N (k^2)^{1/(2p)}$
- $N_2 = N k^{1/p}$
- Sanity Check
- $p = 1$
- $N_2 = k N$
- $p = 0.5$
- $N_2 = k^2 N$
49Capacity (alternate)
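- Rent: $IO = C N^p$
- $p > 0.5$
- $A = C N^{2p}$
- Logical Area → $k^2$
- $k^2 A = C N_2^{2p}$
- $k^2 C N_1^{2p} = C N_2^{2p}$
- $k^2 N_1^{2p} = N_2^{2p}$
- $k N_1^p = N_2^p$
- $N_2 = k^{1/p} N_1$
- Sanity Check
- $p = 1$
- $N_2 = k N_1$
- $p = 0.5$
- $N_2 = k^2 N_1$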
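The end result of the derivation can be checked numerically. The sketch below assumes the wire-dominated regime (p at or above 0.5) and reproduces the two sanity checks; the function name and the gate count of 1000 are hypothetical.

```python
def scaled_gate_count(n1, k, p):
    """Gates that fit after logical area grows by k**2, for Rent exponent p."""
    # From k**2 * N1**(2p) = N2**(2p): N2 = k**(1/p) * N1.
    return k ** (1 / p) * n1


# Sanity checks with an assumed starting point of 1000 gates and k = 2.
print(scaled_gate_count(1000, 2, 1.0))   # p = 1:   grows as k
print(scaled_gate_count(1000, 2, 0.5))   # p = 0.5: grows as k**2
```

For a typical p between these extremes, the usable gate count grows by a factor between k and k squared.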
50What does this tell us about design?
- $IO \propto N^p$
- Bisection BW $\propto N^p$
- Side length $\propto N^p$
- $\sqrt{N}$ if $p < 0.5$
- Area $\propto N^{2p}$
- $p > 0.5$
- Average Wire Length $\propto N^{p-0.5}$
- $p > 0.5$
N.B. 2D VLSI world has natural Rent of $p = 0.5$ (area vs. perimeter)
51Rent's Rule Caveats
- Modern systems on a chip are likely to contain subcomponents of varying Rent complexity
- Less I/O at certain natural boundaries
- System close
- (Does Rent's Rule apply to a workstation, PC, PDA?)
52Area/Wire Length
- Bad news
- Area $\Omega(N^{2p})$
- Faster than $N$
- Avg. Wire Length $\Omega(N^{p-0.5})$
- Grows with $N$
- Can designers/CAD control $p$ (locality) once they appreciate its effects?
- I.e. maybe this cost changes design style/criteria so we mitigate the effects?
53What Rent didn't tell us
- Bisection bandwidth purely geometrical
- No constraint for delay
- I.e. a partition may leave critical path weaving
between halves
54Critical Path and Bisection
Minimum cut may cross critical path multiple times. Minimizing long wires in critical path ⇒ increased cut size.
55Rent Weakness
- Does not account for path topology
- ⇒ Can we define a Temporal Rent that takes this into consideration?
- Promising research topic
56Big Ideas [MSB Ideas]
- Rent's Rule characterizes locality
- Fixed wire layers
- Area growth $\Omega(N^{2p})$
- ⇒ Wire Length $\Omega(N^{p-0.5})$
- $p > 0.5$ ⇒ interconnect growing faster than compute elements
- Expect interconnect to dominate other resources