Title: Dynamic Load Balancing in Scientific Simulation
Static Load Balancing: No Data Dependency
- Distribute the load evenly across processing units.
- Is this good enough? It depends!
- No data dependency!
- The load distribution remains unchanged!
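A minimal sketch (illustrative, not from the slides): with no data dependency, a static block distribution can split the work items evenly across PUs once, up front.

    def block_distribution(num_items: int, num_pus: int) -> list:
        """Statically assign contiguous, nearly equal blocks of work to PUs."""
        base, extra = divmod(num_items, num_pus)
        blocks, start = [], 0
        for pu in range(num_pus):
            size = base + (1 if pu < extra else 0)  # spread the remainder
            blocks.append(range(start, start + size))
            start += size
        return blocks

    # 10 items over 3 PUs -> blocks of sizes 4, 3, 3
    print(block_distribution(10, 3))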
Static Load Balancing: Data Dependency
- Distribute the load evenly across processing units.
- Minimize inter-processing-unit communication!
- By collocating the most frequently communicating data on a single PU.
Load Balancing in Scientific Simulation
PUs need to communicate with each other to carry out the computation.
- Distribute the load evenly across processing units.
- Minimize inter-processing-unit communication!
- By collocating the most frequently communicating data on a single PU.
- Minimize data migration among processing units.
Dynamic Load Balancing 
Dynamic Load Balancing: (Hyper)graph Partitioning
- Given a (hyper)graph G(V, E).
- (Hyper)graph Partitioning
- Partition V into k parts P0, P1, ..., Pk-1, such that all parts are:
- Disjoint: P0 ∪ P1 ∪ ... ∪ Pk-1 = V and Pi ∩ Pj = ∅ where i ≠ j.
- Balanced: |Pi| ≤ (|V| / k) · (1 + ε).
- Minimal edge-cut: the total weight of edges crossing between different parts is minimized.
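A small illustrative sketch of the two quality metrics just defined, edge-cut and balance, for a plain weighted graph (real codes use partitioners such as METIS or Zoltan, but the metrics are exactly these):

    def edge_cut(edges, part):
        """Total weight of edges whose endpoints lie in different parts."""
        return sum(w for (u, v, w) in edges if part[u] != part[v])

    def imbalance(part, k):
        """Max part size relative to the ideal size |V| / k (equals 1 + eps)."""
        sizes = [0] * k
        for p in part.values():
            sizes[p] += 1
        return max(sizes) / (len(part) / k)

    # Toy graph: 4 vertices, weighted edges, 2 parts.
    edges = [(0, 1, 2), (1, 2, 1), (2, 3, 2), (0, 3, 1)]
    part = {0: 0, 1: 0, 2: 1, 3: 1}
    print(edge_cut(edges, part))   # edges (1,2) and (0,3) cross: 1 + 1 = 2
    print(imbalance(part, 2))      # perfectly balanced -> 1.0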
Dynamic Load Balancing: (Hyper)graph Repartitioning
- Given a partitioned (hyper)graph G(V, E).
- (Hyper)graph Repartitioning
- Repartition V into k parts P0, P1, ..., Pk-1, such that all parts are:
- Disjoint.
- Balanced.
- Minimal edge-cut.
- Minimal migration.
[Figure: initial partitioning]
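Repartitioning adds a fourth objective on top of partitioning: the data that must move between the old and the new assignment. A sketch of that migration volume (assuming per-vertex data sizes are known; names are mine, not the slides'):

    def migration_volume(old_part, new_part, data_size):
        """Total size of vertex data that must move to a different part."""
        return sum(data_size[v] for v in old_part
                   if old_part[v] != new_part[v])

    old = {0: 0, 1: 0, 2: 1, 3: 1}
    new = {0: 0, 1: 1, 2: 1, 3: 1}       # vertex 1 moves from P0 to P1
    size = {0: 10, 1: 20, 2: 10, 3: 10}  # bytes per vertex
    print(migration_volume(old, new, size))  # 20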
Dynamic Load Balancing: (Hyper)graph Repartition-Based
- Building the (Hyper)graph
- Vertices represent data.
- Vertex object size reflects the amount of data per vertex.
- Vertex weight accounts for the computation per vertex.
- Edges reflect data dependencies.
- Edge weight represents the communication among vertices.
Reduce dynamic load balancing to a (hyper)graph repartitioning problem.
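A sketch of this modeling step. The structure mirrors the bullets above, but the input names (cells, flops_estimate, state_bytes, neighbor_exchange) are placeholder assumptions for a mesh-style simulation, not the slides' API:

    from dataclasses import dataclass, field

    @dataclass
    class BalancingGraph:
        """Vertices model simulation data; edges model data dependencies."""
        vertex_weight: dict = field(default_factory=dict)  # computation per vertex
        vertex_size: dict = field(default_factory=dict)    # bytes to migrate if moved
        edge_weight: dict = field(default_factory=dict)    # communication per dependency

    def build_graph(cells):
        g = BalancingGraph()
        for c in cells:
            g.vertex_weight[c.id] = c.flops_estimate  # computation load
            g.vertex_size[c.id] = c.state_bytes       # migration cost if moved
            for nbr, volume in c.neighbor_exchange.items():
                g.edge_weight[frozenset((c.id, nbr))] = volume  # comm volume
        return g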
(Hyper)graph Repartition-Based Dynamic Load Balancing: Cost Model
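The body of this slide (likely an equation or figure) did not survive extraction. A cost model common in the repartitioning literature, stated here as an assumption rather than as this slide's exact formula, charges communication once per simulation iteration and migration once per rebalancing step:

    % Assumed form: alpha = number of iterations run between two rebalancings
    \[ \mathrm{TotalCost} \;=\; \alpha \cdot \mathrm{CommCost} \;+\; \mathrm{MigCost} \]

Under this model, the more iterations run between rebalancings (larger alpha), the more it pays to accept extra migration in exchange for lower communication.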
(Hyper)graph Repartition-Based Dynamic Load Balancing: Network Topology
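This slide's body was also lost in extraction. The idea, hedged as my reading of the title, is that cut edges are not all equal: communication should be weighted by the network distance between the nodes hosting the two parts (names below are placeholders):

    def topology_aware_comm_cost(edges, part, host, dist):
        """Communication weighted by network distance between hosting nodes.
        host[p] -> compute node of part p; dist[a][b] -> hops between nodes."""
        return sum(w * dist[host[part[u]]][host[part[v]]]
                   for (u, v, w) in edges if part[u] != part[v])

    # Example: parts 0 and 1 share node A; part 2 is on node B, one hop away.
    edges = [(0, 1, 5), (1, 2, 2)]
    part = {0: 0, 1: 1, 2: 2}
    host = {0: "A", 1: "A", 2: "B"}
    dist = {"A": {"A": 0, "B": 1}, "B": {"A": 1, "B": 0}}
    print(topology_aware_comm_cost(edges, part, host, dist))  # 5*0 + 2*1 = 2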
(Hyper)graph Repartition-Based Dynamic Load Balancing: Cache Hierarchy
Hierarchical Topology-Aware (Hyper)graph Repartition-Based Dynamic Load Balancing
- Inter-Node Repartitioning
- Goal: group the most communicating data onto compute nodes close to each other.
- Solution:
- Regrouping.
- Repartitioning.
- Refinement.
- Intra-Node Repartitioning
- Goal: group the most communicating data onto cores sharing more levels of cache.
- Solution 1: hierarchical repartitioning.
- Solution 2: flat repartitioning.
(The overall two-level flow is sketched below.)
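A schematic of that two-level flow; the function names are placeholders, and the two repartitioners are passed in so any (hyper)graph partitioner fits:

    def hierarchical_rebalance(graph, nodes, cores_per_node,
                               inter_node_repartition, intra_node_repartition):
        """Two-level rebalancing: across nodes first, then within each node."""
        # Level 1: place data across compute nodes, minding network topology
        # (regrouping, repartitioning, refinement happen inside this call).
        node_parts = inter_node_repartition(graph, nodes)
        # Level 2: within each node, place data on cores, minding the caches.
        return {node: intra_node_repartition(sub, cores_per_node)
                for node, sub in node_parts.items()}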
Hierarchical Topology-Aware (Hyper)graph Repartition-Based Dynamic Load Balancing
- Inter-Node (Hyper)graph Repartitioning
- Regrouping.
- Repartitioning.
- Refinement.
Example from the figure: migration cost = 2 (inter-node) + 2 (intra-node); communication cost = 3 (inter-node).
Topology-Aware Inter-Node (Hyper)graph Repartitioning
- Inter-Node (Hyper)graph Repartitioning
- Regrouping.
- Repartitioning.
- Refinement.
Example from the figure: migration cost = 2 (intra-node); communication cost = 3 (inter-node). Compared with the previous slide, the topology-aware pass eliminates the inter-node migration.
Hierarchical Topology-Aware Intra-Node (Hyper)graph Repartitioning
- Main idea: repartition the subgraph assigned to each node hierarchically, following the cache hierarchy (see the sketch below).
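One way to realize this (my sketch, not the slides' algorithm): recursively split the node's subgraph at each level of the cache tree, so vertices that communicate most end up under the same shared cache. The splitter is passed in; a real one would minimize edge-cut at each split, and halve below is just a stand-in:

    def hierarchical_intra_node(vertices, cache_tree, bipartition):
        """Recursively assign vertices to cores by walking the cache tree.
        cache_tree: nested lists with core ids at the leaves,
        e.g. [[0, 1], [2, 3]] for two L2 groups of two cores each;
        bipartition(vertices, k) -> list of k vertex groups."""
        if isinstance(cache_tree, int):          # leaf: a single core
            return {cache_tree: vertices}
        groups = bipartition(vertices, len(cache_tree))
        assignment = {}
        for sub_vertices, subtree in zip(groups, cache_tree):
            assignment.update(hierarchical_intra_node(sub_vertices, subtree, bipartition))
        return assignment

    def halve(vs, k):  # trivial stand-in splitter for illustration
        step = max(1, len(vs) // k)
        return [vs[i * step:(i + 1) * step] if i < k - 1 else vs[(k - 1) * step:]
                for i in range(k)]

    # Eight vertices onto four cores in two L2 groups.
    print(hierarchical_intra_node(list(range(8)), [[0, 1], [2, 3]], halve))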
Flat Topology-Aware Intra-Node (Hyper)graph Repartitioning
Flat Topology-Aware Intra-Node (Hyper)graph Repartitioning
[Figure: old partition, with parts P1-P3 assigned to Core0-Core2]
Flat Topology-Aware Intra-Node (Hyper)graph Repartitioning
[Figure: old partition vs. new partition]
Flat Topology-Aware Intra-Node (Hyper)graph Repartitioning
[Figure: old partition (P1-P3 on Core0-Core2) and new partition (P1-P4)]

Partition Migration Matrix:
        Core0  Core1  Core2  Core3
    P1    0      4      4      4
    P2    2      2      4      4
    P3    4      4      0      4
    P4    4      4      0      4

Partition Communication Matrix:
        P1  P2  P3  P4
    P1   0   1   0   0
    P2   1   0   3   0
    P3   0   3   0   0
    P4   0   0   0   0
Flat Topology-Aware Intra-Node (Hyper)graph Repartitioning
[Figure: the new parts are mapped P1 -> Core0, P2 -> Core1, P3 -> Core2, P4 -> Core3, chosen using the partition migration and communication matrices above]
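As an illustration of how such a mapping could be computed (my sketch; the slides may use a different heuristic), brute-force the part-to-core permutation that minimizes migration volume from the matrix above:

    from itertools import permutations

    # Migration matrix from the slide: migration[p][c] = data moved
    # if new part p is placed on core c.
    migration = [[0, 4, 4, 4],
                 [2, 2, 4, 4],
                 [4, 4, 0, 4],
                 [4, 4, 0, 4]]

    best = min(permutations(range(4)),
               key=lambda perm: sum(migration[p][c] for p, c in enumerate(perm)))
    print(best)  # (0, 1, 2, 3): P1->Core0, P2->Core1, P3->Core2, P4->Core3, volume 6

For larger core counts the migration term alone is a linear assignment problem (solvable with scipy.optimize.linear_sum_assignment); adding the communication matrix weighted by core-to-core cache distance turns it into a quadratic assignment problem, which is why heuristics are used in practice.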
Thanks!