The fat-tree topology and its performance issues

Slides: 31

Provided by: rache82

1
The fat-tree topology and its performance issues

Different names for fat-tree
folded Clos networks
constant bisection bandwidth (CBB) networks
Trees with multiple roots
multi-stage networks
The fat-tree is the de facto topology in high-speed system area networks.
Almost all medium and large clusters (> 100 ports) are connected with some kind of fat-tree topology.

2
Why is fat-tree so popular?

Clusters with nodes connected by one centralized
switch have many desirable properties for
building large (scalable) systems.
Constant latency
Bisection bandwidth scales linearly with the
number of nodes
Fat-trees approximate a centralized switch with a large number of ports (the Clos network).

3
Fat-tree construction

The fat-tree, as originally defined by C.E. Leiserson, is very flexible regarding bisection bandwidth.
C.E. Leiserson, Fat-trees: Universal Networks for Hardware-Efficient Supercomputing, IEEE Transactions on Computers, 34(10):892-901, Oct. 1985.
The ones used in current system area networks are mostly constant bisection bandwidth (CBB) networks.
We will introduce two sub-classes of fat-trees.

4
Some example fat-trees
5
General idea in fat-tree construction

A perfect fat-tree has the same functionality as
a crossbar.
Use smaller switches to approximate large
switches.
Connectivity is reduced, but the topology is
implementable
Constant bisection bandwidth is maintained by
having the same number of links in each level.

6
FT(m, n): m-port n-tree

Reference: X. Lin, Y. Chung and T. Huang, A Multiple LID Routing Scheme for Fat-tree Based InfiniBand Networks, IEEE IPDPS 2004.
Fat-trees built with m-port switches: all internal nodes are of degree m.
FT(m, n) is built over sub fat-trees (SUBFT): fat-trees with open up-links.

7
SUB-fat-trees

SUBFT(m, h) has (m/2)^h open up-links and connects (m/2)^h machines (leaves). It is built from:
(m/2)^(h-1) top-level switches
m/2 copies of SUBFT(m, h-1)
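
The recursive SUBFT construction can be checked with a short counting sketch. This is only an illustration: the function name `subft` and the base case (SUBFT(m, 0) taken to be a single machine with one open up-link) are assumptions, not something the slides state.

```python
def subft(m: int, h: int) -> dict:
    """Count leaves, open up-links and switches of SUBFT(m, h) recursively."""
    if m < 2 or m % 2:
        raise ValueError("m must be an even port count >= 2")
    if h == 0:
        # Assumed base case: one machine with a single open up-link.
        return {"leaves": 1, "up_links": 1, "switches": 0}
    child = subft(m, h - 1)
    half = m // 2
    top = half ** (h - 1)  # top-level switches added at this height
    return {
        "leaves": half * child["leaves"],            # m/2 copies of SUBFT(m, h-1)
        "up_links": top * half,                      # each top switch leaves m/2 ports open
        "switches": half * child["switches"] + top,  # switches below plus the new top level
    }


if __name__ == "__main__":
    print(subft(4, 2))
    print(subft(24, 2))
```

Under these assumptions the recursion gives (m/2)^h for both the leaves and the open up-links, matching the counts stated for SUBFT(m, h).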

8
FT(m, h)

(m/2)^(h-1) top-level switches
m copies of SUBFT(m, h-1)

9
FT(m, h)

Number of machines: m(m/2)^(h-1)
Number of switches: (2h-1)(m/2)^(h-1)
Typical value for m: 24
Typical value for h: 2 or 3
FT(24, 3): 3456 ports, 720 switches
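
The two size formulas are easy to evaluate directly. A minimal sketch, assuming the exponents lost in the transcript are (h-1); the function name `ft_size` is made up for illustration.

```python
def ft_size(m: int, h: int) -> tuple[int, int]:
    """Return (machines, switches) for an m-port h-tree FT(m, h)."""
    if m < 2 or m % 2 or h < 1:
        raise ValueError("m must be even and >= 2, h must be >= 1")
    base = (m // 2) ** (h - 1)
    machines = m * base            # m * (m/2)^(h-1)
    switches = (2 * h - 1) * base  # (2h-1) * (m/2)^(h-1)
    return machines, switches


if __name__ == "__main__":
    for m, h in [(4, 3), (24, 2), (24, 3)]:
        machines, switches = ft_size(m, h)
        print(f"FT({m}, {h}): {machines} machines, {switches} switches")
```

With m = 24 and h = 3 this gives 3456 machines and 720 switches, the figures quoted on the slide.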

10
FT(4, 3)
11
Generalized fat-tree: GFT(h, m, w)

Reference: S. R. Ohring, M. Ibel, S. K. Das, M. J. Kumar, On Generalized Fat-tree, IEEE IPPS 1995.
FT(m, n) is a constant bisection bandwidth (CBB) network: each node has m/2 children and m/2 parents.
GFT(h, m, w): each node has m children and w parents.
m:w is the bisection bandwidth ratio.
m:w = 1:1 is sometimes called full bisection bandwidth (FBB).

12
GFT(h, m, w)

GFT(0, m, w): a single node
GFT(h+1, m, w) is built from:
w^(h+1) top-level switches (each having m children)
m copies of GFT(h, m, w)
This is similar to how FT(m, n) is constructed.
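
The recursive GFT definition can be unrolled into node counts per level. This is a sketch under the assumption that the garbled exponent reads w^(h+1); the helper name `gft_level_counts` is hypothetical.

```python
def gft_level_counts(h: int, m: int, w: int) -> list[int]:
    """Number of nodes at each level of GFT(h, m, w), from level 0 (leaves) up."""
    if h < 0 or m < 1 or w < 1:
        raise ValueError("h must be >= 0 and m, w must be >= 1")
    if h == 0:
        return [1]  # GFT(0, m, w) is a single node
    below = gft_level_counts(h - 1, m, w)
    # m copies of GFT(h-1, m, w), plus w^h new top-level switches
    return [m * count for count in below] + [w ** h]


if __name__ == "__main__":
    for params in [(2, 2, 1), (2, 2, 2), (2, 4, 2), (1, 2, 4)]:
        print(f"GFT{params}: {gft_level_counts(*params)}")
```

For GFT(2, 2, 1) the counts come out as 4, 2, 1, an ordinary binary tree, while GFT(2, 2, 2) keeps 4 nodes at every level.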

13
GFT(x, 2, 1)
GFT(2, 2, 1)
GFT(1, 2, 1)
GFT(0, 2, 1)
14
GFT(x, 2, 2)
GFT(0, 2, 2)
GFT(1, 2, 2)
GFT(2, 2, 2)
15
GFT(3, 2, 2)
GFT(2, 2, 2)
GFT(1,2,2)
How is this different from FT(4, 3)?
16
GFT(2, 4, 4)
17
GFT(2, 3, 3)
18
GFT(2, 2, 3)
19
GFT(2, 4, 2)
20
Performance issues in fat-trees

A Clos network is non-blocking when n >= 2m-1.
2-level fat-trees (e.g. GFT(1, 2, 4)) are equivalent to Clos networks, hence the name folded Clos.
Can 2-level fat-trees achieve non-blocking communication?

21
Can 2-level fat-trees achieve non-blocking communication?

Clos networks are non-blocking when:
The system knows all current ongoing traffic. This needs a centralized controller.
The source is able to use any path to the destination. This needs support for a large number of paths.
Are these conditions practical in large computer clusters?

22
Practical fat trees

2-level CBB networks, or folded Clos networks with n = m, are the minimum required to achieve non-blocking communication (rearrangeable non-blocking).
Network contention is possible due to the lack of a centralized controller.
Techniques are needed to minimize the possibility of network contention.
What kinds of techniques can do this?
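
The two Clos conditions mentioned so far can be put side by side in a small classifier. The slides do not define n and m; this sketch assumes n is the number of middle-stage (top-level) switches and m is the number of host-facing ports per edge switch, which matches the classical Clos results. The function name `clos_class` is made up.

```python
def clos_class(n: int, m: int) -> str:
    """Classify a three-stage Clos network.

    Assumed meaning of the parameters (not defined in the slides):
    n = middle-stage switches, m = host-facing ports per edge switch.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    if n >= 2 * m - 1:
        return "strictly non-blocking"
    if n >= m:
        return "rearrangeably non-blocking"
    return "blocking"


if __name__ == "__main__":
    for n, m in [(4, 2), (12, 12), (6, 12)]:
        print(f"n={n}, m={m}: {clos_class(n, m)}")
```

A CBB network sits at n = m, so it is only rearrangeably non-blocking: a conflict-free assignment of paths exists for any permutation, but finding it requires global knowledge of the traffic.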

23
Practical fat trees

What kinds of techniques can reduce contention?
Routing: spread traffic among all links.
Adaptive routing (Quadrics): requires multiple paths and avoids links currently in use. Limited applicability: used on up-links, but not on down-links.
Source routing (Myrinet): similar idea to adaptive routing, but less flexible.
Deterministic routing (InfiniBand): worst performer, but simple to implement.
Congestion control: slow down when the network is in trouble.
This is a reactive approach: is it good for high-speed networks?
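
The difference between deterministic and adaptive up-link selection can be illustrated on a single edge switch with k up-links. This is a toy model, not the actual InfiniBand or Quadrics algorithm: the destination-mod-k rule and the least-loaded rule are assumptions chosen for illustration, and all function names are hypothetical.

```python
from collections import Counter


def deterministic_uplinks(dests: list[int], k: int) -> list[int]:
    """Static, destination-based choice: up-link index = destination id mod k."""
    return [d % k for d in dests]


def least_loaded_uplinks(dests: list[int], k: int) -> list[int]:
    """Idealised adaptive choice: each flow takes the currently least-loaded up-link."""
    load = [0] * k
    choice = []
    for _ in dests:
        link = min(range(k), key=lambda i: load[i])
        load[link] += 1
        choice.append(link)
    return choice


def max_load(choice: list[int]) -> int:
    """Largest number of flows sharing one up-link."""
    return max(Counter(choice).values())


if __name__ == "__main__":
    k = 4
    dests = [0, 4, 8, 12]  # four flows whose destinations collide modulo k
    print("deterministic:", max_load(deterministic_uplinks(dests, k)))
    print("adaptive:     ", max_load(least_loaded_uplinks(dests, k)))
```

In this example the static rule places all four flows on one up-link, while the adaptive rule spreads them over four. The adaptive rule only helps on the way up; on the way down there is a single path to each destination.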

24
Routing issue in fat trees
Can we compute routes that achieve non-blocking communication for any permutation?
25
A case study for the current fat-tree
interconnection networks

Reference: T. Hoefler, T. Schneider, and A. Lumsdaine, Multistage Interconnection Networks are not Crossbars: Effects of Static Routing in High Performance Networks, IEEE Cluster, 2008.
Many large-scale fat-tree based networks have been built. How are they doing?

26
Performance metrics

User perceived bisection bandwidth
4X DDR InfiniBand: 20 Gbps between each pair.
What happens when half of the machines send to the other half simultaneously?
In a crossbar, all pairs should get 20 Gbps!
How about a fat-tree?
Due to the routing constraints, the user-perceived bisection bandwidth depends on the permutation.
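
The dependence on the permutation can be reproduced with a toy simulation. This is a simplified sketch, not the simulator used in the referenced study: it assumes a 2-level CBB fat-tree with p hosts and p up-links per leaf switch, static destination-mod-p routing, and that a flow receives 1/c of the link bandwidth when the busiest link on its path carries c flows. All names are hypothetical.

```python
import random
from collections import Counter


def effective_bandwidth(perm: list[int], p: int) -> float:
    """Average per-flow bandwidth fraction when host src sends to perm[src]."""
    load = Counter()
    paths = {}
    for src, dst in enumerate(perm):
        src_leaf, dst_leaf = src // p, dst // p
        if src_leaf == dst_leaf:
            paths[src] = []  # traffic stays inside one leaf switch
            continue
        spine = dst % p  # static, destination-based up-link choice
        links = [("up", src_leaf, spine), ("down", spine, dst_leaf)]
        paths[src] = links
        for link in links:
            load[link] += 1
    # A flow is limited by the most heavily shared link on its path.
    shares = [1.0 / max((load[l] for l in links), default=1)
              for links in paths.values()]
    return sum(shares) / len(shares)


def average_over_random_permutations(leaves: int, p: int,
                                     trials: int = 100, seed: int = 0) -> float:
    """Mean effective bandwidth over random permutations (seeded for repeatability)."""
    rng = random.Random(seed)
    hosts = list(range(leaves * p))
    total = 0.0
    for _ in range(trials):
        perm = hosts[:]
        rng.shuffle(perm)
        total += effective_bandwidth(perm, p)
    return total / trials


if __name__ == "__main__":
    p, leaves = 4, 4
    shift = [(i + p) % (p * leaves) for i in range(p * leaves)]
    print("shift permutation:", effective_bandwidth(shift, p))
    print("random average:   ", average_over_random_permutations(leaves, p))
```

In this model a shift by one leaf switch is contention-free, while permutations that send several flows from one leaf to destinations with the same residue modulo p share a single up-link and lose bandwidth.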

27
User perceived bisection bandwidth on some systems

Results obtained by simulation: average over many random permutations.
Ranger (3908 nodes): 57.5%
Atlas (1142 nodes): 55.6%
Thunderbird (4390 nodes): 40.6%
40% to 60% of a crossbar's bandwidth seems not too bad.
But these results are the average case, not the worst case.

28
Other effects of network contention

Bandwidth varies with communication pattern.
Performance prediction and modeling is not easy.
Message latency is also affected.

29
Conclusion

Fat-trees can only approximate a crossbar.
Are there better topologies than fat-trees under practical constraints?
In the current fat-tree topology, what are the best routing schemes for adaptive, source, and single-path routing?
It is commonly believed that adaptive routing is good for fat-trees, but is adaptive routing good enough?

30
References

Fat-tree origins:
C.E. Leiserson, Fat-trees: Universal Networks for Hardware-Efficient Supercomputing, IEEE Transactions on Computers, 34(10):892-901, Oct. 1985.
Fat-tree construction:
S. R. Ohring, M. Ibel, S. K. Das, M. J. Kumar, On Generalized Fat-tree, IEEE IPPS 1995.
X. Lin, Y. Chung and T. Huang, A Multiple LID Routing Scheme for Fat-tree Based InfiniBand Networks, IEEE IPDPS 2004.
Fat-tree routing and performance issues:
T. Hoefler, T. Schneider, and A. Lumsdaine, Multistage Interconnection Networks are not Crossbars: Effects of Static Routing in High Performance Networks, IEEE Cluster, 2008.
P. Geoffray and T. Hoefler, Adaptive Routing Strategies for Modern High Performance Networks, 16th Annual IEEE Symposium on High Performance Interconnects (HOTI 2008), pages 165-172, Aug. 2008.