Title: Wrekavoc: a Tool for Emulating Heterogeneity
1Wrekavoc: a Tool for Emulating Heterogeneity
- Louis-Claude Canon (ESEO, Angers)
- Emmanuel Jeannot (INRIA-LORIA-ICL, U. Tennessee, Knoxville)
- HCW, 04/25/2006
2Outline
- On the Importance of Experiments in Computer Science
- Wrekavoc: a Tool for Emulating Heterogeneity
- Design
- Configuration of the nodes
- Experiments
3Modern Infrastructures are Hard to Model
- Modern infrastructures are complex
- Processors have very nice features
- Cache
- Hyperthreading
- Dual core
- Operating system impacts the performance (process scheduling, socket implementation, etc.)
- The runtime environment plays a role (MPICH vs. OpenMPI)
- The same for middleware (Globus vs. GridSolve)
- Various parallel architectures that can be
- Heterogeneous
- Hierarchical
- Distributed
- Dynamic
- Analytical validation of algorithms is hard and sometimes impossible.
- Can we still design and validate algorithms?
4Experimental Validation
- A good alternative to analytical validation
- Provides a comparison between algorithms
- Provides a validation of the model or helps to
define the validity domain of the model
5Methodologies for Doing Experiments
[Figure: experimental methodologies plotted by log(cost) against log(realism)]
- Math: models of systems, applications, platforms, conditions
- Simulation: key system mechanisms; algorithms and application kernels; virtual platforms; synthetic conditions
- Emulation: real systems; real applications; in-lab platforms; synthetic conditions
- Live systems: real systems; real applications; real platforms; real conditions
6Wrekavoc: a Tool for Emulating Heterogeneity
- We target heterogeneous distributed environments
- Goal: experiment with distributed algorithms on a homogeneous and centralized cluster
- How to transform this cluster into a heterogeneous environment and control the heterogeneity?
7Goal
- Turning a cluster into a heterogeneous environment
- CPU speed.
- Memory.
- Network bandwidth.
- Network latency.
- [Figure: a real node and an emulated node]
- Two solutions
- Increase the performance (update the hardware)
- Degrade the performance (by software means)
- Solution 2: costless and allows for performance control.
8State of the Art
9A Client-Server Approach
- The client
- Reads a configuration file,
- Contacts each node to be configured.
- One server per node on the cluster
- Runs a daemon,
- Configures itself according to client orders,
- Is able to test itself (and send back results to the client).
10Logical Architecture
- The cluster is decomposed into islets.
- 1 islet = union of IP address intervals
- 152.81.2.12-152.81.2.25-152.81.2.151-152.81.2.176
- Each node of a given islet shares the same characteristics
- Network characteristics are defined between and inside islets.
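The islet definition above can be illustrated with a minimal sketch. It assumes that the dash-separated address list is read as consecutive start-end pairs, which is an interpretation of the slide and not a documented Wrekavoc format; the function names `parse_islet` and `in_islet` are hypothetical.

```python
import ipaddress


def parse_islet(spec):
    """Parse 'a-b-c-d' into [(a, b), (c, d)] inclusive address intervals.

    Assumption: addresses come in start-end pairs, as the slide suggests.
    """
    parts = [ipaddress.IPv4Address(p) for p in spec.split("-")]
    if len(parts) % 2 != 0:
        raise ValueError("expected an even number of addresses")
    intervals = list(zip(parts[0::2], parts[1::2]))
    for low, high in intervals:
        if low > high:
            raise ValueError(f"interval {low}-{high} is reversed")
    return intervals


def in_islet(address, intervals):
    """Return True if the address falls inside any interval of the islet."""
    ip = ipaddress.IPv4Address(address)
    return any(low <= ip <= high for low, high in intervals)


if __name__ == "__main__":
    islet = parse_islet("152.81.2.12-152.81.2.25-152.81.2.151-152.81.2.176")
    print(in_islet("152.81.2.20", islet))
    print(in_islet("152.81.2.100", islet))
```

A node whose address falls in none of the intervals simply belongs to no islet in this sketch.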
[Figure: logical view. Islets of machines sharing the same characteristics; each islet has its own sub-network, and the islets are connected by an inter-islet network.]
11Characteristics Definitions
- For each islet one defines
- A seed,
- Internal characteristics of each node of the islet (CPU, memory, BW (in/out), latency) according to
- A uniform law: inf. bound - up. bound
- A Gaussian law: avg, variance
- Between each pair of islets one defines
- A seed
- BW from islet 1 to islet 2.
- BW from islet 2 to islet 1.
- Latency between the islets.
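The role of the seed and of the two laws can be sketched as follows. This is an illustration under assumptions, not Wrekavoc's code: the function `sample_islet` and the `laws` dictionary layout are hypothetical, and the Gaussian law is taken to be parameterized by mean and variance as the slide states.

```python
import random


def draw(rng, law):
    """Draw one value from ('uniform', low, high) or ('gaussian', mean, variance)."""
    kind, a, b = law
    if kind == "uniform":
        return rng.uniform(a, b)
    if kind == "gaussian":
        return rng.gauss(a, b ** 0.5)  # random.gauss expects a standard deviation
    raise ValueError(f"unknown law: {kind}")


def sample_islet(seed, n_nodes, laws):
    """Draw the characteristics of every node of one islet, reproducibly."""
    rng = random.Random(seed)  # same seed, same emulated platform
    return [{name: draw(rng, law) for name, law in laws.items()}
            for _ in range(n_nodes)]


if __name__ == "__main__":
    laws = {"CPU": ("uniform", 1000, 1200), "LAT": ("gaussian", 20, 1)}
    for node in sample_islet(123, 3, laws):
        print(node)
```

Fixing the seed makes the drawn configuration repeatable, so two experiments can run on the same emulated platform.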
12Example of Configuration File
- islet1 152.81.15.207-152.81.15.209-152.81.15.123-152.81.15.254
- SEED 123
- CPU 1000-1200
- BPOUT 10000
- BPIN 10000
- LAT 15-15
- USER ME
- MEM 800000
-
- islet2 152.81.3.100-152.81.3.100
- SEED -1
- CPU 100-300
- BPOUT 120-185
- BPIN 12-18
- LAT 201
- USER OTHER
- MEM 80000-100000
-
- !INTER islet1islet2 5-5 100-100 10 -1
- Units
- CPU: MHz
- BW: KB/s
- Latency: ms
- Memory: KB
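As a rough illustration of how such parameter lines could be read, the sketch below parses `KEY value` lines where a value is either a single number or a `low-high` range. It works on the lines as they appear on the slide; the real file syntax may differ, and `parse_value` and `parse_islet_block` are hypothetical names.

```python
import re

# A value is either "N" or "LOW-HIGH" (non-negative integers).
RANGE = re.compile(r"^(\d+)(?:-(\d+))?$")
NUMERIC_KEYS = {"CPU", "BPOUT", "BPIN", "LAT", "MEM"}


def parse_value(text):
    """Return (low, high); a single number N becomes (N, N)."""
    match = RANGE.match(text)
    if match is None:
        raise ValueError(f"cannot parse value: {text!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) is not None else low
    return low, high


def parse_islet_block(lines):
    """Parse the parameter lines of one islet into a dictionary."""
    params = {}
    for line in lines:
        if not line.strip():
            continue
        key, _, value = line.strip().partition(" ")
        value = value.strip()
        if key == "SEED":
            params[key] = int(value)  # may be negative, e.g. -1
        elif key == "USER":
            params[key] = value
        elif key in NUMERIC_KEYS:
            params[key] = parse_value(value)
        else:
            raise ValueError(f"unknown key: {key!r}")
    return params


if __name__ == "__main__":
    block = ["SEED 123", "CPU 1000-1200", "BPOUT 10000", "LAT 15-15", "MEM 800000"]
    print(parse_islet_block(block))
```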
13Tools for Configuring Nodes
- We want to degrade
- CPU speed
- Allocatable Memory
- Network bandwidth
- Network latency
14CPU Speed Degradation
- 3 approaches
- CPU-freq (Linux kernel module that changes the CPU frequency)
- Advantage: very precise.
- Drawback: requires an ACPI-enabled CPU; few usable frequencies (coarse management).
- CPU-burning (a process takes some CPU cycles)
- Advantage: works on any architecture; fine management
- Drawback: calibrating is hard; degrades network performance in the same proportion
- CPU-scheduling (a user-level scheduler suspends or resumes process execution according to the desired degradation).
- Advantage: very precise (default method)
- Drawback: uses /proc (not portable)
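The CPU-scheduling idea can be sketched with a duty cycle: a controller alternately suspends and resumes a process so that it runs for only a chosen fraction of each period. The sketch below uses `SIGSTOP`/`SIGCONT` as an assumed mechanism; the slides do not say which signals or period Wrekavoc uses, and `duty_cycle` and `throttle` are hypothetical names.

```python
import os
import signal
import time


def duty_cycle(fraction, period=0.1):
    """Split one period into (run_seconds, stopped_seconds) for a target CPU share."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    run = fraction * period
    return run, period - run


def send(pid, sig):
    """Send a signal; return False if the process no longer exists."""
    try:
        os.kill(pid, sig)
        return True
    except ProcessLookupError:
        return False


def throttle(pid, fraction, period=0.1, cycles=50):
    """Let `pid` run about `fraction` of the time for `cycles` periods (Unix only)."""
    run, stopped = duty_cycle(fraction, period)
    try:
        for _ in range(cycles):
            if not send(pid, signal.SIGCONT):
                return
            time.sleep(run)
            if stopped > 0:
                if not send(pid, signal.SIGSTOP):
                    return
                time.sleep(stopped)
    finally:
        send(pid, signal.SIGCONT)  # never leave the process suspended
```

For example, `throttle(pid, 0.25)` would aim at emulating a CPU four times slower for that process. A shorter period gives smoother degradation at the cost of more signalling overhead.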
15Memory Degradation
- Use the PAM limits module (pam_limits)
- Limits the maximum amount of memory that a malloc can allocate.
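The effect of such a limit can be reproduced on a single process with the standard resource-limit interface. The sketch below is a swapped-in technique for illustration: it calls `setrlimit` on the address space directly from Python rather than going through PAM, and `run_with_memory_limit` is a hypothetical helper. It assumes a Linux system.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, limit_kb):
    """Run a command whose address space is capped at limit_kb kilobytes."""
    limit_bytes = limit_kb * 1024

    def apply_limit():
        # Executed in the child just before exec; allocations beyond the cap fail.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Try to allocate 4 GB under an 800000 KB cap (the MEM value of islet1).
    child = [sys.executable, "-c", "x = bytearray(4 * 1024**3)"]
    result = run_with_memory_limit(child, 800000)
    print("exit code:", result.returncode)
```

Under the cap, the oversized allocation in the child fails and the child exits with a non-zero status, which is the behaviour an emulated low-memory node is meant to expose.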
16Network Management
- We use tc (Traffic Control) from iproute2
- Limit ingoing and outgoing bandwidth
- Set latency (ver. 2.6.8.1 or later).
- Traffic control depends on IP addresses
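To make the tc usage concrete, here is a minimal sketch that builds (without executing) two tc commands: a `netem` qdisc that adds delay, with a `tbf` child that caps the outgoing rate. It covers egress only and does not filter by IP address; the `burst` and `latency` values of `tbf` are placeholder assumptions to be tuned, and `egress_commands` is a hypothetical helper, not the commands Wrekavoc actually issues.

```python
def egress_commands(dev, bandwidth_kb_per_s, latency_ms):
    """Return tc command lines (as argv lists) shaping outgoing traffic on `dev`."""
    # The slides give bandwidth in KB/s; tc's "kbit" unit is kilobits per second.
    rate_kbit = bandwidth_kb_per_s * 8
    return [
        # Root qdisc: netem adds a fixed delay to every outgoing packet.
        ["tc", "qdisc", "add", "dev", dev, "root", "handle", "1:0",
         "netem", "delay", f"{latency_ms}ms"],
        # Child qdisc: token bucket filter caps the rate (placeholder burst/latency).
        ["tc", "qdisc", "add", "dev", dev, "parent", "1:1", "handle", "10:",
         "tbf", "rate", f"{rate_kbit}kbit", "burst", "32kbit", "latency", "400ms"],
    ]


if __name__ == "__main__":
    for argv in egress_commands("eth0", 120, 20):
        print(" ".join(argv))
```

Running the printed commands requires root privileges; they are printed here so they can be inspected first.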
17Experiments
- Several experiments on GdX
- Configuration time
- Micro-benchmark
- Impact of CPU degradation against available bandwidth
- Algorithms of the literature
18Configuration Time
19Micro-benchmark
20CPU Degradation vs. Bandwidth Degradation
21Matrix Multiply Algorithms of the Literature
20 machines: 5 at 100, 5 at 75, 5 at 50, 5 at 25
22Conclusion
- The complexity of modern infrastructures makes modeling difficult and sometimes impossible.
- The analytical validation of algorithms is problematic in this context.
- Hence, experiments are required to validate algorithms and models.
- We propose Wrekavoc, a tool for emulating heterogeneity.
- Controllable configuration of the nodes of a homogeneous cluster
- Independent degradation of the characteristics
- Allows for a quantitative comparison of algorithms.