1
UAB
Paradyn Week 2006, March 2006
  • Dynamic Monitoring and Tuning in Multicluster
    Environment

Genaro Costa, Anna Morajko, Paola Caymes Scutari,
Tomàs Margalef and Emilio Luque
Universitat Autònoma de Barcelona
2
Outline
  • Introduction
  • Multicluster Systems
  • Applications on Wide Systems
  • MATE
  • New Requirements
  • Design
  • Conclusions

3
Introduction
  • System performance
  • New problems require more computation power.
    Performance is a key issue.
  • New wide systems are built over the available
    resources and the user does not have total
    control of where the application will run.
  • It has become more difficult to reach high
    performance and efficiency on these wide systems.

4
Introduction (II)
  • To reach performance goals, users need to find
    and solve bottlenecks.
  • Dynamic Monitoring and Tuning is a promising
    approach.
  • Because system properties change dynamically,
    efficient resource use is hard to achieve even
    for expert users.

5
Multicluster Systems
  • New systems are built using existing resources.
    Examples are NOW and HNOW linked with multistage
    network interconnections.
  • Intra cluster communications have different
    latencies than inter cluster communications.
  • Generally, multiclusters are built of clusters
    (homogeneous or heterogeneous) interconnected by
    a WAN.

6
Multicluster Systems (II)
  • Each cluster can have its own scheduler and can
    be exposed either through a head node or through
    all of its nodes.

7
Applications on Wide Systems
  • Hierarchical Master/Worker Applications
  • Raise the possibility of performance bottlenecks
  • Load imbalance problems
  • Inefficient resource use
  • Non-deterministic inter cluster bandwidth

[Figure: Cluster A (Master and Workers) and Cluster B (Sub Master and Workers). Labels: "Common data are transmitted once"; "Sub Master explores data locality".]
8
Applications on Wide Systems (II)
  • Hierarchical Master/Worker Applications
  • Sub master is seen as a high processing node by
    the master.
  • Work distribution from master to sub master
    should be based on
  • Available bandwidth
  • Computing power
  • These characteristics may have dynamic behavior.
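
A minimal sketch of how a master could weight work by bandwidth and computing power, as the bullets above suggest. It is not taken from the presentation: the `Target` class, the `split_work` function, and the cost model (send time plus compute time per task, with no overlap) are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Target:
    """A worker or sub master as seen by the master (hypothetical model)."""
    name: str
    compute_rate: float  # tasks per second the target can process
    bandwidth: float     # bytes per second available from the master


def split_work(total_tasks: int, task_bytes: float,
               targets: list[Target]) -> dict[str, int]:
    """Split tasks in proportion to each target's effective throughput.

    Assumed cost model: one task costs task_bytes / bandwidth seconds to
    send plus 1 / compute_rate seconds to process, with no overlap.
    """
    rates = [1.0 / (task_bytes / t.bandwidth + 1.0 / t.compute_rate)
             for t in targets]
    total_rate = sum(rates)
    shares = [int(total_tasks * r / total_rate) for r in rates]
    # Hand out the tasks lost to rounding down, fastest targets first.
    leftover = max(total_tasks - sum(shares), 0)
    order = sorted(range(len(targets)), key=lambda i: rates[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return {t.name: s for t, s in zip(targets, shares)}
```

Because bandwidth and computing power may change at run time, a tuning system would re-evaluate such a split with fresh measurements rather than compute it once.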

9
MATE
  • Monitoring, Analysis and Tuning Environment
  • Dynamic automatic tuning of parallel/distributed
    applications.

[Figure labels: Modifications, DynInst, Instrumentation]
10
MATE (II)
  • Application Controller - AC
  • Dynamic Monitoring Library - DMLib
  • Analyzer

[Figure: Machines 1 and 2 each run an AC and application tasks (Task1, Task2, Task3) linked with DMLib. The Analyzer runs on Machine 3. Labels: instr., modif., events.]
11
MATE (III)
[Figure labels: Analyzer, DTAPI]
  • Each tuning technique is implemented in MATE as a
    tunlet, a C/C++ library dynamically loaded into
    the Analyzer process.
  • measure points: what events are needed
  • performance model: how to determine bottlenecks
    and solutions
  • tuning actions/points/synchronization: what to
    change, where, when
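
The three parts of a tunlet listed above can be sketched as follows. This is an illustration only: real tunlets are C/C++ libraries loaded into the Analyzer, and none of the names here (`Event`, `TuningAction`, `LoadBalanceTunlet`, `chunk_scale`, `get_work`) come from MATE or its DTAPI. The performance model used, flagging the slowest task when it exceeds a multiple of the mean, is likewise an assumed stand-in.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    task: str     # which process produced the event
    name: str     # hypothetical event name, e.g. "iteration_end"
    value: float  # e.g. elapsed seconds for the iteration


@dataclass
class TuningAction:
    task: str        # where: the process to modify
    variable: str    # what: the application variable to change
    new_value: float
    sync_point: str  # when: the point at which the change is applied


class LoadBalanceTunlet:
    """Hypothetical tunlet: measure points, performance model, tuning action."""

    # Measure points: the events this tunlet needs.
    measure_points = ["iteration_end"]

    def __init__(self, threshold: float = 1.5) -> None:
        self.threshold = threshold
        self.times: Dict[str, float] = {}

    def on_event(self, event: Event) -> None:
        # Keep only the latest iteration time per task.
        if event.name in self.measure_points:
            self.times[event.task] = event.value

    def evaluate(self) -> Optional[TuningAction]:
        # Performance model: a bottleneck exists if the slowest task takes
        # more than threshold * mean iteration time.
        if len(self.times) < 2:
            return None
        mean = sum(self.times.values()) / len(self.times)
        slowest = max(self.times, key=self.times.get)
        if self.times[slowest] <= self.threshold * mean:
            return None
        # Tuning action: shrink the slow task's share of the work.
        return TuningAction(task=slowest, variable="chunk_scale",
                            new_value=mean / self.times[slowest],
                            sync_point="get_work")
```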

12
New Requirements
  • Transparent process tracking
  • AC should follow application process to any
    cluster.
  • Lower inter cluster instrumentation communication
    overhead
  • Inter cluster communications generally have higher
    latency and lower bandwidth.

13
Transparent process tracking
DESIGN
  • System Service
  • A machine or cluster can have MATE enabled as a
    daemon that detects the startup of new processes.

[Figure: two MATE-enabled machines, each with an AC and a task (Taskn) linked with DMLib. Labels: startup detection, attach, control, Analyzer subscription.]
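
The startup detection performed by such a daemon can be sketched as a comparison of successive process snapshots. This is an assumption about one possible mechanism, not a description of how MATE does it. `detect_new_processes`, `StartupWatcher`, and the `on_start` callback (standing in for the AC attaching to the process) are hypothetical names, and snapshots are plain dictionaries mapping PID to executable name.

```python
from typing import Callable, Dict, Iterable, List


def detect_new_processes(previous: Dict[int, str],
                         current: Dict[int, str],
                         watched: Iterable[str]) -> List[int]:
    """Return PIDs that appeared since the last snapshot and run a watched binary."""
    watched_set = set(watched)
    return sorted(pid for pid, name in current.items()
                  if pid not in previous and name in watched_set)


class StartupWatcher:
    """Hypothetical daemon core: remembers the last snapshot between polls."""

    def __init__(self, watched: Iterable[str],
                 on_start: Callable[[int, str], None]) -> None:
        self.watched = set(watched)
        self.on_start = on_start  # stand-in for "AC attaches to the process"
        self.known: Dict[int, str] = {}

    def poll(self, snapshot: Dict[int, str]) -> List[int]:
        new_pids = detect_new_processes(self.known, snapshot, self.watched)
        for pid in new_pids:
            self.on_start(pid, snapshot[pid])
        self.known = dict(snapshot)
        return new_pids
```

A real daemon would build each snapshot from the operating system's process table and call `poll` periodically; that part is left out to keep the sketch self-contained.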
14
Transparent process tracking
DESIGN (II)
  • Application plug-in
  • The AC can be packaged together with the
    application binary.

[Figure: a Task with its AC and DMLib, and two remote machines on which new Tasks are created, each with its own AC and DMLib. Labels: Job submission, detects (Dyninst), create new Task, control, Analyzer subscription.]
15
Lower communication overhead
DESIGN (III)
  • Smart event collection
  • A full application trace may generate too much
    overhead.
  • Event aggregation
  • Remote trace events should be aggregated to trace
    event abstractions, saving bandwidth.
  • Inter Cluster Trace Event Routing
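
Event aggregation as described above can be sketched like this: raw trace events are reduced to one summary record per time window, task, and event name before they cross the inter cluster link. The record layout, the `aggregate_events` name, and the choice of count, mean, and maximum as the summary are assumptions for illustration; the presentation leaves the semantics of aggregation as future work.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# A raw event is assumed to be (timestamp_seconds, task, event_name, value).
RawEvent = Tuple[float, str, str, float]


def aggregate_events(events: Iterable[RawEvent],
                     window: float) -> List[Dict[str, object]]:
    """Collapse raw events into one abstract event per (window, task, name)."""
    # Each bucket holds [count, running total, running maximum].
    buckets = defaultdict(lambda: [0, 0.0, float("-inf")])
    for timestamp, task, name, value in events:
        bucket = buckets[(int(timestamp // window), task, name)]
        bucket[0] += 1
        bucket[1] += value
        bucket[2] = max(bucket[2], value)
    return [
        {"window": key[0], "task": key[1], "event": key[2],
         "count": count, "mean": total / count, "max": peak}
        for key, (count, total, peak) in sorted(buckets.items())
    ]
```

With this scheme the number of records sent to a remote Analyzer depends on the window length and the number of tasks, not on how many raw events each task emits.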

16
Analyzer Approaches
  • Centralized
  • Requires modifying tunlets so that they can
    distinguish the instrumentation data of local
    application processes.
  • Hierarchical
  • Requires splitting tunlets into local tunlets
    and global tunlets.
  • Distributed
  • Requires that tunlet instances located on
    different Analyzer instances cooperate to tune an
    application.

17
Lower communication overhead (II)
DESIGN (IV)
  • Centralized Analyzer Approach

[Figure: Clusters A and B with Machines A1-A3 and B1-B3 running Tasks 1-4 under ACs, plus a single Analyzer and an Event Router.]
18
Local Performance Model Analysis
DESIGN (V)
  • Hierarchical Analyzer Approach

[Figure: Clusters A and B with Machines A1-A4 and B1-B3 running Tasks 1-4 under ACs, plus two Local Analyzers, one Global Analyzer, and a link labeled Abstract Events.]
19
Distributed Monitoring, Analysis and Tuning
Environment
DESIGN (VI)
  • Distributed Analyzer Approach

[Figure: Clusters A and B with Machines A1-A3 and B1-B3 running Tasks 1-4 under ACs, plus two Analyzers linked by "Tunlet instances cooperation".]
20
Conclusions and future work
  • Conclusions
  • Interference of instrumentation information on
    inter cluster communication should be minimal.
  • Process tracking enables MATE for multicluster
    systems.
  • The centralized Analyzer approach is convenient
    for tunlet developers but does not scale.
  • The distributed Analyzer approach scales but
    requires a different kind of model-based analysis.

21
Conclusions and future work (II)
  • Future Work
  • Development of new tunlets for the distributed and
    hierarchical Analyzer approaches.
  • Tuning based only on local instrumentation data.
  • Semantics of aggregation for instrumentation
    events.
  • Patterns of distributed tunlet cooperation.
  • Scenarios of distributed Analyzer cooperation in
    multiclusters.

22
Thank you