Title: Automatic Clustering of Grid Nodes
1 Automatic Clustering of Grid Nodes
- Nov 14, 2005
- Qiang Xu, Jaspal Subhlok
- University of Houston
2 Grid Scheduler
Network link: latency, bandwidth
Computational resource: CPU, memory
I will decide which group of nodes is best for an application!!!
Network Topology
3 Network Topology
- Fine-grained physical network topology --- Hard!
- heterogeneous, dynamic, and distributed nature of a grid system
- We focus on the logical network topology
- logical network topology: the connectivity between nodes based on the observed behavior.
- 1) Easier to compute
- 2) Sufficient to tackle the resource selection problem
4 Discover Clusters/Logical Topology
A set of nodes with IP addresses / hostnames. Connectivity?
5 Discover Clusters/Logical Topology
[Figure: Clusters A, B, and C, with pairwise distances Dist(AB), Dist(AC), Dist(BC)]
nodes close to each other → same cluster
6 Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Result
- Conclusion
7 Internet Topology Map 1
A macroscopic snapshot of the Internet, 4 April 2005 - 17 April 2005.
8 Internet Topology Map 2
Internet map as of 1998 by Bill Cheswick (Bell Labs) and Hal Burch (CMU)
9 Why Geometric Space?
Internet Topology Map --- Complex!
Geometric Space (N-Dimensional Euclidean Space)
GNP (Global Network Positioning) --- T. S. Eugene Ng and Hui Zhang, INFOCOM'02
I can't tell the distance between nodes!!
10 Magic Landmarks!
[Figure: a node and its landmarks, with measured latencies 12, 3, and 8]
Landmarks: a set of distributed nodes across the Internet
11 Geometric Space
- One axis per landmark
- Coordinates of a node: latency from each landmark.
X = 12
Z = 3
Y = 8
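A minimal sketch of the mapping described on this slide, in Python. It assumes the three figures on the slide (12, 8, 3) are one node's latencies to landmarks X, Y, and Z; the second node and the helper names `coordinates` and `euclidean_distance` are hypothetical and only there for illustration.

```python
import math

def coordinates(latency_by_landmark, landmarks):
    """Build a coordinate vector: one axis per landmark, value = measured latency."""
    return [latency_by_landmark[name] for name in landmarks]

def euclidean_distance(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

landmarks = ["X", "Y", "Z"]

# Latencies read off the slide (interpreted as one node's coordinates).
node_a = coordinates({"X": 12.0, "Y": 8.0, "Z": 3.0}, landmarks)
# Hypothetical second node, assumed to sit close to the first.
node_b = coordinates({"X": 11.0, "Y": 8.5, "Z": 3.5}, landmarks)

print(node_a)
print(euclidean_distance(node_a, node_b))
```

Neither node probes the other here: the distance is computed purely from the two landmark-latency vectors.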
12 Internet → Geometric Space
Simple Geometric Space
Complex Internet Structure
13 Advantage of Geometric Space
- Simple --- distance in Geometric Space is well defined, e.g. the Euclidean distance.
- Scalable --- for M nodes:
- Pairwise distance among M nodes → M×M probes
- Mapping to Geometric Space → M×N probes
- N is the number of landmarks; about 7 is known to be sufficient.
- Easy to manage --- only need to control the landmarks
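The scalability argument can be checked with a little arithmetic. The sketch below uses the slide's own counts (M×M for pairwise probing, M×N for landmark probing); the function names are hypothetical, M = 36 is the number of compute nodes used in the experiments, and M = 1000 is an arbitrary larger example.

```python
def pairwise_probes(m):
    """Probes needed to measure every node against every node (M x M)."""
    return m * m

def landmark_probes(m, n):
    """Probes needed to measure every node against every landmark (M x N)."""
    return m * n

N_LANDMARKS = 7  # the number of landmarks the slide cites as sufficient

for m in (36, 1000):
    print(m, pairwise_probes(m), landmark_probes(m, N_LANDMARKS))
```

With N fixed, the landmark approach grows linearly in M, while pairwise probing grows quadratically.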
14 Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Result
- Conclusion
15 Again the problem!
16 Place Nodes in Geometric Space!
How do I cluster?
Simple Geometric Space
17 Distance and Threshold
- Network Distance
- Threshold
- If Distance < Threshold, nodes belong to the same logical cluster
- N is the number of landmarks
- The T parameter describes how close nodes have to be to be in the same cluster
- For a typical domain to be one cluster, T is about 1 ms
18 Build Undirected Graph
- All grid nodes are graph nodes
- Add an edge between nodes if Distance < Threshold
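A rough sketch of the graph construction, assuming each node already has a coordinate vector in the geometric space. The threshold is passed in as a plain number because the slide's threshold formula is not reproduced here; the node names, coordinates, and the `build_graph` helper are all hypothetical.

```python
import math
from itertools import combinations

def build_graph(coords, threshold):
    """Return an undirected graph as {node: set(neighbours)}.

    An edge joins two nodes when their Euclidean distance is
    strictly below the threshold.
    """
    graph = {name: set() for name in coords}
    for a, b in combinations(coords, 2):
        if math.dist(coords[a], coords[b]) < threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

# Hypothetical coordinates: n1 and n2 are close, n3 is far away.
coords = {
    "n1": (12.0, 8.0, 3.0),
    "n2": (12.2, 8.1, 3.1),
    "n3": (30.0, 25.0, 20.0),
}
graph = build_graph(coords, threshold=1.0)
print(graph)
```

Storing neighbours as sets keeps the graph symmetric and makes the later clique search a matter of set intersections.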
19 Typical Case
- An edge exists if Distance < Threshold
Clusters are obvious and easy to distinguish!
20 Pathological Case
- Where are the clusters?
- General case: find maximal cliques in the graph; each clique is a cluster
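The slides do not say which clique algorithm is used. As one possibility, the sketch below enumerates maximal cliques with the basic Bron-Kerbosch recursion (no pivoting), on a hypothetical graph in which node `c` is close to two otherwise separate groups.

```python
def maximal_cliques(graph):
    """Enumerate maximal cliques of {node: set(neighbours)} via Bron-Kerbosch."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques

# Hypothetical pathological case: c has edges into both triangles, f is isolated.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
    "f": set(),
}
cliques = maximal_cliques(graph)
print(sorted(sorted(c) for c in cliques))
```

Note that `c` appears in two maximal cliques; how such overlaps are resolved is not covered by this sketch. Worst-case running time is exponential in the number of nodes, which is usually acceptable only for modest graph sizes.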
21 Summary of Inter-domain Clustering
- Place nodes in the geometric space.
- Calculate the Euclidean distance.
- Build a graph based on distance and threshold.
- Find the maximal cliques.
Inter-domain clustering --- good!
Intra-domain clustering --- not good enough!
22 Intra-domain Clustering
- Nodes in the same domain but in different subnets.
- Short latency --- less than 1 ms.
- Landmark-based approach --- resolution is not sufficient!
- Measurement error is comparable to the real latency
- We need to change the approach for intra-domain clustering!
23 Intra-domain Clustering
- Distance between nodes is directly measured latency instead of projected geometrical distance.
- (M×M probes, but M is smaller and measurements are quick.)
- Basis for clustering is relative
- Distance between any two nodes inside a cluster is within β of the smallest distance in the cluster.
24 Intra-domain Clustering Procedure
Initially, each node is a cluster. Each edge is a measured latency.
REPEAT
  Select the least-cost edge, say connecting clusters A and B.
  If A and B are not the same cluster, and this edge cost is within β of the least-cost edges inside A and B, then combine them into one cluster.
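One way to read this procedure is as a Kruskal-style pass over the edges in increasing cost. The sketch below makes two assumptions that the slide does not spell out: "within β" is treated as a multiplicative bound (edge cost ≤ β × the cheapest edge already inside a cluster), and a single-node cluster, having no internal edge, places no constraint. The node names and latencies are hypothetical.

```python
def cluster_by_relative_latency(nodes, edges, beta):
    """Greedy merge of clusters; edges are (node_u, node_v, latency) tuples."""
    parent = {n: n for n in nodes}
    cheapest_inside = {n: None for n in nodes}  # keyed by cluster root

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Assumption: the edge must be within a factor beta of each cluster's
        # cheapest internal edge; singletons accept any edge.
        if all(cheapest_inside[r] is None or cost <= beta * cheapest_inside[r]
               for r in (a, b)):
            known = [c for c in (cheapest_inside[a], cheapest_inside[b])
                     if c is not None]
            parent[b] = a
            cheapest_inside[a] = min(known + [cost])

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())

# Hypothetical latencies in ms.
nodes = ["p1", "p2", "o1", "i1", "s1"]
edges = [
    ("p1", "p2", 0.09), ("o1", "i1", 0.10),
    ("p2", "o1", 0.30), ("p2", "i1", 0.31),
    ("p1", "o1", 0.32), ("p1", "i1", 0.32),
    ("s1", "p1", 0.40), ("s1", "p2", 0.40),
    ("s1", "o1", 0.50), ("s1", "i1", 0.60),
]
clusters = cluster_by_relative_latency(nodes, edges, beta=2.0)
print(sorted(sorted(c) for c in clusters))
```

Because edges are visited in increasing cost, a cluster's cheapest internal edge never changes once set, so a single sorted pass is enough under these assumptions.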
25 Outline
- Introduction
- Internet → Geometric Space
- Automatic Clustering
- Experiments and Result
- Conclusion
26 Experiments
- Inter-Domain Clustering
- 3 Landmarks: UT (Austin), Rice, CMU
- 36 Compute Nodes: Rice, UT-Dallas, TAMU-College Station, TAMU-Galveston
- Intra-Domain Clustering
- 4 clusters at University of Houston
- PGH201, Itanium, Opteron, Stokes
- TCP Ping (not ICMP Ping) to measure latency
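The slides do not describe how the TCP ping is implemented. A common approximation, sketched below as an assumption, is to time a TCP connection setup, which takes roughly one round trip. The function names, the sample count, and taking the minimum over samples are illustrative choices, not details from the experiments.

```python
import socket
import time

def tcp_ping(host, port, timeout=2.0):
    """Return the time in milliseconds to establish a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0

def min_latency(host, port, samples=5):
    """Take several samples and keep the smallest to reduce scheduling noise."""
    return min(tcp_ping(host, port) for _ in range(samples))

# Example call (hypothetical host and port; requires a reachable TCP service):
# print(min_latency("node1.example.org", 22))
```

Unlike ICMP ping, this needs no raw-socket privileges, but it does require an open TCP port on the target node.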
27 Inter-domain Cluster (2 landmarks)
- Cannot distinguish between UT Dallas and TAMU Galveston
[Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice]
28 Inter-domain Cluster (3 landmarks)
- 4 clusters are well distinguished
[Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice]
29 Inter-domain Cluster (2 landmarks)
[Plot legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice]
30 Intra-domain Cluster Latency
Clusters   PGH201   Opteron   Itanium   Stokes
PGH201     0.09     0.32      0.32      0.30
Opteron    0.25     0.09      0.09      0.50
Itanium    0.30     0.10      0.10      0.35
Stokes     0.40     0.50      0.60      0.10
Latency between Nodes (ms)
31 Illustration of Intra-domain Clusters
[Figure; extracted legend: UT Dallas, TAMU Galveston, TAMU College Station, Rice]
32 Future Work
- Integrate into a grid scheduling system
- Use bandwidth as a factor for clustering
- Dynamically update logical clusters
- Nodes behind a NAT (network address translation) -- nodes with local IP addresses
33 Conclusions
- Efficient and scalable procedure to hierarchically group distributed nodes into logical clusters
- Validation with experiments on nodes distributed across Texas
- An important step for scheduling in a grid environment.
34 Questions?
Thank you!