Title: ASM 2003, slide 1
1. Communication Pattern Based Node Selection for Shared Networks
Srikanth Goteti, Interactive Data Corp
Jaspal Subhlok, University of Houston
AMS Symposium 2003
2. Resource Selection for Network/Grid Applications
[Figure: an application with components Model, Data, GUI, Sim 1, Pre, and Stream to be placed on a network. Where is the best performance?]
3. Current Approaches to Node Selection
[Figure: application components Model, Data, GUI, Sim 1, Pre, and Stream]
- Measure and model network properties, such as available bandwidth and CPU loads (with tools like NWS)
- Find best nodes for execution based on network status
- But expected application performance based on measured network status may not be accurate
  - depends on application characteristics
  - translation, e.g., unused bandwidth vs. expected throughput
  - data may be stale, as frequent measurements are expensive
4. Performance Skeleton
- A performance skeleton is a synthetic, short-running program whose execution characteristics mirror those of the application it represents
- An application and its skeleton have similar
  - communication pattern
  - synchronization pattern
  - CPU usage
  - memory usage
- Goal: the performance of a skeleton is directly related to the performance of the application under any condition
  - e.g., a skeleton executes in .1 of the time the application takes to execute on any part of a shared network
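The proportionality goal can be illustrated with a small calculation. This is a minimal sketch, assuming the skeleton-to-application time ratio is calibrated once on a reference run and then holds elsewhere on the network; the function names and all timing values are hypothetical and do not come from the slides.

```python
def calibrate_ratio(skeleton_time_ref: float, app_time_ref: float) -> float:
    """Ratio of skeleton runtime to application runtime on a reference run."""
    if app_time_ref <= 0:
        raise ValueError("application time must be positive")
    return skeleton_time_ref / app_time_ref


def predict_app_time(skeleton_time: float, ratio: float) -> float:
    """Estimate application runtime from a skeleton run, assuming the ratio holds."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return skeleton_time / ratio


# Hypothetical timings in seconds, for illustration only.
ratio = calibrate_ratio(skeleton_time_ref=2.0, app_time_ref=100.0)
estimate = predict_app_time(skeleton_time=3.0, ratio=ratio)
print(f"estimated application time: {estimate:.1f} s")
```

If the ratio drifts under load, the estimate drifts with it, so the ranking of node sets by skeleton time is the more robust use of the measurement.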
5. Node Selection with Performance Skeletons
[Figure: application components Model, Data, GUI, Sim 1, Pre, and Stream]
- Select candidate node sets based on network status
- Execute the skeleton on them
- Select the node set with the best skeleton performance to schedule the actual application
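The three steps above can be sketched as a small selection loop. This is only an illustration: `run_skeleton` stands in for whatever mechanism launches the skeleton on a node set and returns its runtime, and the node names and timings below are made up.

```python
from typing import Callable, Dict, Sequence, Tuple

NodeSet = Tuple[str, ...]


def pick_best_node_set(
    candidates: Sequence[NodeSet],
    run_skeleton: Callable[[NodeSet], float],
) -> Tuple[NodeSet, Dict[NodeSet, float]]:
    """Run the skeleton on every candidate and return the fastest node set."""
    if not candidates:
        raise ValueError("need at least one candidate node set")
    timings = {nodes: run_skeleton(nodes) for nodes in candidates}
    best = min(timings, key=timings.get)
    return best, timings


# Stand-in for a real skeleton launch: a lookup of hypothetical runtimes (seconds).
fake_times = {
    ("n1", "n2", "n3", "n4"): 1.9,
    ("n2", "n5", "n6", "n7"): 1.4,
    ("n3", "n4", "n8", "n9"): 2.3,
}
best, timings = pick_best_node_set(list(fake_times), fake_times.__getitem__)
print("schedule application on:", best)
```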
6. Node Selection Procedure
- Construct a performance skeleton
  - mostly by hand in this paper; subject of ongoing work
- Select candidate node sets
  - identify the communication graph of the application
    - typically a chain, ring, or all-to-all structure
  - obtain available bandwidth between nodes with NWS (Network Weather Service) and build a graph
  - select nodes to maximize the minimum available bandwidth between pairs of communicating nodes
    - best possible node sets based on application structure and network status
- Execute the skeleton on each candidate node set
- Select the node set with the best skeleton performance; map one process to each node
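The candidate-selection step (maximize the minimum available bandwidth between communicating pairs) can be sketched with a brute-force search, which is feasible for a small cluster. This is not the authors' algorithm, only one straightforward way to compute the same objective; the bandwidth values, node names, and function names are hypothetical, and in practice the bandwidth table would come from a measurement tool such as NWS.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Sequence, Tuple

Edge = Tuple[int, int]  # pair of communicating process ranks


def bottleneck(mapping: Sequence[str], comm_edges: Sequence[Edge],
               bandwidth: Dict[FrozenSet[str], float]) -> float:
    """Minimum available bandwidth over all pairs of communicating processes."""
    return min(bandwidth[frozenset((mapping[a], mapping[b]))] for a, b in comm_edges)


def candidate_node_sets(nodes: Sequence[str], num_procs: int,
                        comm_edges: Sequence[Edge],
                        bandwidth: Dict[FrozenSet[str], float],
                        top_k: int = 2) -> List[Tuple[float, Tuple[str, ...]]]:
    """Rank node sets by their best achievable bottleneck bandwidth."""
    best_by_set: Dict[FrozenSet[str], Tuple[float, Tuple[str, ...]]] = {}
    for mapping in permutations(nodes, num_procs):  # mapping[rank] = node
        score = bottleneck(mapping, comm_edges, bandwidth)
        key = frozenset(mapping)
        if key not in best_by_set or score > best_by_set[key][0]:
            best_by_set[key] = (score, mapping)
    ranked = sorted(best_by_set.values(), key=lambda item: item[0], reverse=True)
    return ranked[:top_k]


# Hypothetical available bandwidth between node pairs (arbitrary units).
raw = {("a", "b"): 90, ("a", "c"): 30, ("a", "d"): 50, ("a", "e"): 20,
       ("b", "c"): 80, ("b", "d"): 40, ("b", "e"): 25,
       ("c", "d"): 70, ("c", "e"): 35, ("d", "e"): 60}
bandwidth = {frozenset(pair): float(bw) for pair, bw in raw.items()}

chain = [(0, 1), (1, 2), (2, 3)]  # 4-process chain communication graph
results = candidate_node_sets(["a", "b", "c", "d", "e"], 4, chain, bandwidth)
for score, mapping in results:
    print(score, mapping)
```

The search is exponential in the number of processes, so it is only suitable for clusters of roughly the size used here; a larger system would need a heuristic.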
7. Communication Structure of NAS Benchmarks
[Figure: communication graphs, each drawn over nodes 0-3, with panels labeled BT, IS, LU, MG, SP, and EP]
9. Validation Experiments
- Best nodes to execute benchmarks selected by each of the following methods
  - skeleton based: full framework discussed
  - all to all: based on maximizing the minimum available bandwidth between nodes on the network graph
  - random
- Compare performance of the application on nodes selected by each of these procedures on a busy network
- Experiments repeated a large number of times to get statistically meaningful results
10. Experimental Framework
- Linux cluster of 10 dual-CPU 1.7 GHz Pentium nodes connected by 100 Mbps links and a crossbar switch
- Experiments with Class B NAS MPI benchmark suite
- Class W NAS benchmarks (average runtime 1.5 seconds on our cluster) used as skeletons for Class B benchmarks
- Available bandwidth between nodes is varied with Linux iproute2 for the duration of experiments as follows
  - path between a pair of nodes is shared by S streams, i.e., available bandwidth is set to 1/S of peak
  - one stream is randomly added to or removed from the cluster every 30 seconds
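The load model above (available bandwidth equal to 1/S of peak, with one stream added or removed at each step) can be simulated in a few lines. This sketch only models the numbers; it does not configure real links. It assumes that S counts the application's own stream and so never drops below 1, and that the affected path is chosen uniformly at random, neither of which is stated on the slide. The peak value and all names are illustrative.

```python
import random
from typing import Dict, Tuple

Path = Tuple[str, str]
PEAK = 100.0  # assumed peak link rate, in the same units as the cluster links


def available_bandwidth(peak: float, streams: int) -> float:
    """Bandwidth left for one stream when `streams` streams share the path."""
    if streams < 1:
        raise ValueError("a path is shared by at least one stream")
    return peak / streams


def step(stream_counts: Dict[Path, int], rng: random.Random) -> None:
    """Add or remove one stream on a randomly chosen path (never below 1)."""
    path = rng.choice(sorted(stream_counts))
    delta = rng.choice((-1, 1))
    stream_counts[path] = max(1, stream_counts[path] + delta)


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {("n1", "n2"): 1, ("n1", "n3"): 1, ("n2", "n3"): 1}
for _ in range(20):  # each iteration stands for one 30-second interval
    step(counts, rng)

for path, s in sorted(counts.items()):
    print(path, s, available_bandwidth(PEAK, s))
```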
11. Performance Results: slowdown due to network traffic
- skeleton based has average slowdown of 20, versus 40 for random and 27 for all to all
- significant variation across benchmarks; most benefit for CG, as it is communication heavy and uses only 3 links
12. Conclusions type slide
- Performance skeletons have a role in resource management for grids
  - removes limitations of using NWS type systems (the "what you measure versus what you get" problem)
- A lot more experimentation is needed to establish and validate the concepts
- Automatic construction of performance skeletons is a major open challenge
- Skeletons may have other uses: a fast way of estimating the performance of an application
  - e.g., on a slow simulated future system