Pipelined Broadcast on Ethernet Switched Clusters

Slides: 29
Provided by: csF2
Learn more at: http://www.cs.fsu.edu

Transcript and Presenter's Notes

1
Pipelined Broadcast on Ethernet Switched Clusters
  • Pitch Patarasuk, Ahmad Faraj, Xin Yuan
  • Department of Computer Science
  • Florida State University
  • Tallahassee, FL 32306

2
Broadcast communication(MPI_Bcast)
[Figure: before the broadcast, only the root holds the message A B C D; after the broadcast, n0, n1, n2, and n3 each hold A B C D]
Let T(msize) = time to send a message of size msize
Broadcast(msize) >= T(msize)
3
Ethernet Switched Cluster
[Figure: cluster topology with four interconnected switches]
4
  • Problem statement
  • How to efficiently realize the broadcast
    operation with large message sizes on Ethernet
    switched clusters.
  • Using pipelined broadcast can achieve near
    optimal results (T(msize) time for broadcasting a
    message of size msize).
  • Finding a contention-free broadcast tree
  • Finding a good segment size

5
Traditional Broadcast algorithms
  • Linear tree
    [Figure: chain 0 → 1 → 2 → ... → 7]
    Time = (P-1) x T(msize)
  • Flat tree
    [Figure: root 0 sends directly to nodes 1 through 7]
    Time = (P-1) x T(msize)
6
  • Binary tree
  • k-ary tree
    [Figure: two example trees, each over nodes 0-7]
  • Time = 2 x (log2(P+1) - 1) x T(msize)

7
  • Binomial tree
    [Figure: binomial tree over nodes 0-7]
  • Time = log2(P) x T(msize)

8
  • Scatter/Allgather
    [Figure: the message A B C D is first scattered so that n0, n1, n2, and n3 hold A, B, C, and D respectively; the allgather then leaves every node with A B C D]
    Time = 2 x T(msize)
9
Time Complexity for large messages
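The costs quoted on the preceding slides can be put side by side for a given number of processes P. The following is a small Python sketch, not part of the original slides; the function name traditional_broadcast_times is an assumption, and each entry simply restates the formula given for that algorithm above.

```python
import math


def traditional_broadcast_times(p, t_msize):
    """Large-message cost of each non-pipelined algorithm for p processes.

    t_msize is T(msize), the time to send one message of size msize.
    The formulas are the ones stated on the slides; nothing is measured.
    """
    return {
        "linear tree": (p - 1) * t_msize,
        "flat tree": (p - 1) * t_msize,
        "binary tree": 2 * (math.log2(p + 1) - 1) * t_msize,
        "binomial tree": math.log2(p) * t_msize,
        "scatter/allgather": 2 * t_msize,
    }


if __name__ == "__main__":
    # Express every cost as a multiple of T(msize) by passing t_msize=1.
    for name, cost in traditional_broadcast_times(p=16, t_msize=1.0).items():
        print(f"{name:18s} {cost:6.2f} x T(msize)")
```

Only scatter/allgather has a cost that does not grow with P, which is why it is the natural baseline for the pipelined algorithms that follow.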
10
Pipelined Broadcast Algorithm
  • Linear pipeline

    [Figure: linear pipeline 0 → 1 → 2 → 3]
11
  • Performance of pipelined broadcast
  • Assume no network contention
  • A message of size msize is broken into X segments
    of size msize/X.
  • H = tree height, D = number of children
  • Time of one pipeline stage = D x T(msize/X)
  • Total time T = (X + H - 1) x (D x T(msize/X))
  • Linear tree: H = P, D = 1, T ≈ T(msize)
    when X is much larger than H
  • Binary tree: H = log2(P), D = 2, T ≈ 2 x T(msize)
    when X is much larger than H
  • K-ary tree: H = log_k(P), D = k, in general not
    as efficient as binary tree.
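The total-time formula above can be evaluated numerically to see how the number of segments X trades pipeline depth against per-segment overhead. This is a minimal Python sketch under stated assumptions: the cost model T(m) = startup + m / bandwidth, the default values for startup and bandwidth, and the names t_send, pipelined_time, and best_segment_count are all hypothetical and do not come from the slides.

```python
def t_send(msize, startup=1e-4, bandwidth=12.5e6):
    """Assumed cost model: fixed startup (s) plus msize (bytes) / bandwidth (bytes/s)."""
    return startup + msize / bandwidth


def pipelined_time(msize, x, height, children):
    """Total time (X + H - 1) x (D x T(msize / X)), as given on the slide."""
    stage = children * t_send(msize / x)
    return (x + height - 1) * stage


def best_segment_count(msize, height, children, candidates):
    """Return the candidate X that minimizes the modeled total time."""
    return min(candidates, key=lambda x: pipelined_time(msize, x, height, children))


if __name__ == "__main__":
    msize, p = 1_000_000, 16
    base = t_send(msize)
    # Linear tree: H = P, D = 1. Binary tree: H = log2(P) = 4, D = 2.
    for label, height, children in (("linear", p, 1), ("binary", 4, 2)):
        x = best_segment_count(msize, height, children, [1, 10, 100, 1000])
        ratio = pipelined_time(msize, x, height, children) / base
        print(f"{label}: best X = {x}, time = {ratio:.2f} x T(msize)")
```

With X = 1 the model reduces to H x D x T(msize), the unpipelined cost of the same tree. Very large X makes the startup term dominate, so the best X lies in between.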

12
Time Complexity for large messages
13
Pipelined broadcast
  • How to find a contention-free broadcast tree?
  • How to select the best segment size?

14
Example of network contention
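  • Binary tree
    [Figure: two connected switches, one attached to n0,n1,n2,n3 and the other to n4,n5,n6,n7, with a binary broadcast tree over nodes 0-7]
  • There is link contention caused by the communications
    (1→4), (2→5), (2→6), and (3→7), which all cross
    the link between the two switches in the same direction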
15
  • Linear tree
    [Figure: two connected switches, one attached to n0,n1,n4,n5 and the other to n2,n3,n6,n7]
    The linear tree 0→1→2→3→...→7 has contention
    caused by (1→2) and (5→6)
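The two contention examples above can be checked mechanically. The following is a minimal Python sketch, not the authors' tool. It assumes the switches form a tree, counts how many broadcast-tree edges cross each inter-switch link in each direction, and reports the links that are used more than once. The names switch_path and contended_links and the switch labels "A" and "B" are assumptions made for illustration.

```python
from collections import Counter, deque


def switch_path(links, src, dst):
    """Directed switch-to-switch hops from src to dst (BFS; assumes a connected switch tree)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            break
        for nxt in links.get(cur, ()):
            if nxt not in parent:
                parent[nxt] = cur
                queue.append(nxt)
    hops = []
    node = dst
    while parent[node] is not None:
        hops.append((parent[node], node))
        node = parent[node]
    return hops[::-1]


def contended_links(links, switch_of, tree_edges):
    """Map each directed inter-switch link used by more than one tree edge to its use count."""
    usage = Counter()
    for sender, receiver in tree_edges:
        for hop in switch_path(links, switch_of[sender], switch_of[receiver]):
            usage[hop] += 1
    return {hop: count for hop, count in usage.items() if count > 1}


if __name__ == "__main__":
    links = {"A": ["B"], "B": ["A"]}
    # Linear-tree example: nodes 0,1,4,5 on one switch, 2,3,6,7 on the other.
    switch_of = {0: "A", 1: "A", 4: "A", 5: "A", 2: "B", 3: "B", 6: "B", 7: "B"}
    chain = [(i, i + 1) for i in range(7)]
    print(contended_links(links, switch_of, chain))
    # Reordering the chain switch by switch removes the conflict.
    order = [0, 1, 4, 5, 2, 3, 6, 7]
    print(contended_links(links, switch_of, list(zip(order, order[1:]))))
```

Only inter-switch links are counted here. A machine sending to several children over its own link is already covered by the factor D in the pipelined time model, so it is not treated as contention in this sketch.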
16
Algorithm for constructing contention free linear
tree
  • Step 1: Traverse all switches using the
    depth-first search (DFS) algorithm and name the
    switches S0, S1, S2, ... in the order in which
    DFS first reaches them
  • Step 2: The linear tree consists of all machines
    on switch S0, followed by all machines on S1, then
    S2, and so on
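The two steps can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' routine generator. The function names, the switch identifiers "a" through "d", and in particular the links between switches are assumptions. The machine placement follows the 16-node example on the next slide, whose wiring is not recoverable from the transcript.

```python
def dfs_switch_order(links, root):
    """Step 1: return the switches in the order a depth-first search first reaches them."""
    order, seen = [], set()

    def visit(switch):
        seen.add(switch)
        order.append(switch)
        for neighbor in links.get(switch, ()):
            if neighbor not in seen:
                visit(neighbor)

    visit(root)
    return order


def contention_free_linear_tree(links, machines, root):
    """Step 2: list all machines of S0, then all machines of S1, and so on."""
    chain = []
    for switch in dfs_switch_order(links, root):
        chain.extend(machines.get(switch, ()))
    return chain


if __name__ == "__main__":
    # Hypothetical wiring: "b" is connected to "a", "c", and "d".
    links = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
    machines = {
        "a": ["n0", "n1", "n4", "n5"],
        "b": ["n2", "n3", "n6", "n7"],
        "c": ["n8", "n9", "n10", "n11"],
        "d": ["n12", "n13", "n14", "n15"],
    }
    print(" -> ".join(contention_free_linear_tree(links, machines, "a")))
```

Because DFS leaves a subtree of switches only after visiting all of it, the chain crosses each inter-switch link at most once in each direction, which is the property the algorithm relies on.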

17
Example of contention free linear tree
[Figure: four switches; S0 connects n0,n1,n4,n5; S1 connects n2,n3,n6,n7; S2 connects n8,n9,n10,n11; S3 connects n12,n13,n14,n15]
Linear tree: n0→n1→n4→n5→n2→n3→n6→n7→n8→n9→...→n15
18
Algorithm for constructing contention free binary
tree
  • Start with a contention free linear tree
  • Recursively divide the tree into 2 sub-trees
  • Make sure that there cannot be contention
  • The sub-trees are chosen such that the height of
    the whole tree will be minimal

[Figure: binary tree built over nodes 0-15]
19
Binary tree height
  • Performance of binary pipeline broadcast depends
    on the height of a binary tree
  • Even though a contention-free binary tree may not
    be a complete binary tree, its height is not much
    greater than that of a complete binary tree

20
Average tree heights for 20 randomly generated
topologies
21
Evaluation
  • Contention free pipelined algorithms
  • Routine generators from topology information
  • The generated routines are based on MPICH p2p
    primitives.
  • Linear tree
  • Binary tree
  • 3-ary tree
  • Targets for comparison
  • MPICH Binomial tree, Scatter/allgather
  • LAM Flat-tree, Binomial
  • Topology unaware pipelined linear and binary
    algorithms

22
Evaluation
23
Performance of different pipelined trees
(topology 1)
24
Comparing pipelined broadcast with other schemes
25
Topology unaware and contention-free pipelined
broadcast
26
Segment size for pipelined broadcast
27
Conclusions
  • Pipelined broadcast is faster than the current
    broadcast algorithms for medium and large messages
  • Linear pipeline has a completion time roughly
    equal to T(msize)
  • Binary pipeline broadcast is best for medium
    messages
  • Contention free broadcast tree is necessary for
    pipelined algorithms
  • A good segment size for pipelined broadcast is
    not difficult to find.

28
Questions?