Title: Automatic Online Performance Analysis for the Grid

1
Automatic Online Performance Analysis for the Grid

Michael Gerndt
Technische Universität München
gerndt_at_in.tum.de

2
Grid Computing

Grids enable communities (virtual organizations) to share geographically distributed resources as they pursue common goals -- assuming the absence of
central location,
central control,
omniscience,
existing trust relationships.
Globus Tutorial
Major differences to parallel systems
Dynamic system of resources
Large number of diverse systems
Sharing of resources
Transparent resource allocation

3
Requirements for Grid Performance Analysis

Two ways to attack the question
Scenarios
Application types

4
Scenarios for Performance Monitoring and Analysis

Post-mortem application analysis
Self-tuning applications
Grid scheduling
Grid management
GGF performance working group, DataGrid, CrossGrid

5
Self-Tuning Applications

Chris submits job
Application adapts to assigned resources
Application starts
Application monitors performance and adapts to resource changes

Requires
Integration of system and application monitoring
On-the-fly performance analysis
API for accessing monitor data (if PA is done by the application)
Performance model and interface to steer adaptation (if PA and tuning decisions are made by an external component)
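A minimal sketch of the monitor-and-adapt loop described on this slide, assuming the application does its own PA. The Sample fields, the target rate, and the one-worker-at-a-time policy are hypothetical illustrations, not part of the talk.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sample:
    """One monitoring sample (hypothetical fields)."""
    iterations_per_sec: float  # application-level measurement
    available_nodes: int       # system-level measurement


def choose_workers(sample: Sample, current: int, target_rate: float) -> int:
    """Pick the worker count for the next interval (assumed policy)."""
    # Resources were withdrawn: shrink to what is still available.
    if sample.available_nodes < current:
        return max(1, sample.available_nodes)
    # Too slow and spare nodes exist: grow by one worker.
    if sample.iterations_per_sec < target_rate and sample.available_nodes > current:
        return current + 1
    return current


def run_adaptive(samples: Iterable[Sample], start_workers: int,
                 target_rate: float) -> List[int]:
    """Replay a stream of samples and record the chosen worker counts."""
    workers = start_workers
    history = []
    for sample in samples:
        workers = choose_workers(sample, workers, target_rate)
        history.append(workers)
    return history
```

The decision function sees application-level and system-level data in the same sample, which is the integration the slide asks for.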

6
Post-Mortem Application Analysis

George submits job to the Grid
Job is executed on some resources
George receives performance data
George analyzes performance

Requires
either resources with known performance characteristics (QoS)
or system-level information to assess performance data
scalability of performance tools
Focus will be on interacting components

7
Grid-Scheduling

Gloria determines performance-critical application properties
She specifies a performance model
Grid scheduler selects resources
Application is started

Requires
PA of the grid application
Possibly benchmarking the application
Access to current performance capabilities of resources
Even better: access to predicted capabilities
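One way a scheduler could combine a user-supplied performance model with resource capabilities, as a hedged sketch. The two-term cost model, the function names, and the site names are assumptions made for illustration. The talk does not prescribe a model.

```python
from typing import Callable, Dict, Tuple

# (flops per second, bytes per second): current or predicted values
Capability = Tuple[float, float]
Model = Callable[[Capability], float]


def simple_model(flops: float, comm_bytes: float) -> Model:
    """Build a hypothetical model: compute time plus communication time."""
    def predict(cap: Capability) -> float:
        flops_per_sec, bytes_per_sec = cap
        return flops / flops_per_sec + comm_bytes / bytes_per_sec
    return predict


def select_resource(model: Model, capabilities: Dict[str, Capability]) -> str:
    """Return the resource with the smallest predicted runtime."""
    return min(capabilities, key=lambda name: model(capabilities[name]))
```

For example, with 1e9 flops and 1e8 bytes of communication, a site offering (1e9, 1e7) is predicted at 11 s and a site offering (5e8, 1e8) at 3 s, so the slower CPU with the faster network is selected.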

8
Grid-Management

George reports that performance has been bad for the past week.
The helpdesk runs the Grid performance analysis software.
Periodic saturation of connections is detected.

Requires
PA of historical system information
Needs to be done in a distributed fashion
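A small sketch of how periodic saturation might be found in historical utilization data for one connection. The 0.9 threshold, the regular sampling interval, and the requirement of at least three saturation onsets are assumptions. In a distributed setting each site could run such a check on its own links.

```python
from typing import Optional, Sequence


def saturation_period(utilization: Sequence[float],
                      threshold: float = 0.9) -> Optional[int]:
    """Return the constant gap (in samples) between saturation onsets.

    Returns None if there are fewer than three onsets or the gaps differ.
    """
    # An onset is a sample at or above the threshold whose predecessor is below it.
    onsets = [
        i for i, u in enumerate(utilization)
        if u >= threshold and (i == 0 or utilization[i - 1] < threshold)
    ]
    gaps = {later - earlier for earlier, later in zip(onsets, onsets[1:])}
    if len(onsets) >= 3 and len(gaps) == 1:
        return gaps.pop()
    return None
```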

9
Application Types

Remote site access
Parameter studies
Workflow applications
Metacomputing applications
Data-intensive applications

10
Deployment of PA Tools

11
Requirements for Grid PA Tools

Scalability to large systems
Multiple HPC systems or large volumes of historical system data
Integration of application- and system-level info
Tuning for intersite communication, improving resource allocation, dynamic adaptation, post-mortem clarification
Online analysis
Only way to handle performance data set size
Needed for dynamic tuning
Automatic analysis
Needed for dynamic tuning, inspection of large historical data sets, online analysis, model generation of applications

12
New Aspect of Performance Analysis

Transparent resource allocation
Dynamism in resource availability
Even larger and geographically dispersed systems
Approaches in the following projects
Damien
Datagrid
Crossgrid
GrADS

13
Managing Dynamism: The GrADS Approach

GrADS (Grid Application Development Software)
Funded by the National Science Foundation, started in 2000
Goal
Provide application development technologies that make it easy to construct and execute applications with reliable and often high performance in the constantly-changing environment of the Grid.
Major techniques to handle transparency and dynamism
Dynamic configuration to available resources (configurable object programs)
Performance contracts and dynamic reconfiguration

14
GrADS Software Architecture
[Figure: GrADS software architecture, spanning the Program Preparation System and the Execution Environment. Components shown: source application, software components, libraries, PSE, whole program compiler, configurable object program, scheduler/service negotiator, Grid runtime system (Globus), dynamic optimizer, realtime performance monitor. Labeled flows: negotiation, performance feedback.]

15
Configurable Object Programs

Integrated mapping strategy and cost model
Performance enhanced by context-dependent variants
Context includes potential execution platforms
Dynamic Optimizer performs final binding
Implements mapping strategy
Chooses machine-specific variants
Inserts sensors and actuators
Performs final compilation and optimization

16
Performance Contracts

A performance contract specifies the measurable performance of a grid application.
Given
set of resources,
capabilities of resources,
problem parameters
the application will achieve a specified, measurable performance

17
Creation of Performance Contracts
[Figure: creation of a performance contract. Elements shown: Program, Developer, Compiler, Measurements, Performance Model, MDS, Resource Broker, NWS, Resource Assignment, Performance Contract.]

18
History-Based Contracts

Resources given by broker
Capabilities of resources given by
Measurements of this code on those resources
Possibly scaled by the Network Weather Service
e.g. Flops/second and Bytes/second
Problem parameters
Given by the input data set
Application intrinsic parameters
Independent of execution platform
Measurements of this code with same problem parameters
e.g. floating point operation count, message count, message bytes count
Measurable Performance Prediction
Combining application parameters and resource capabilities
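The combination of application-intrinsic parameters and resource capabilities could look like the following sketch. The additive time formula and the scaling fractions (standing in for Network Weather Service forecasts) are assumptions. The message count is left out because the slide lists no per-message capability.

```python
from dataclasses import dataclass


@dataclass
class AppParams:
    """Application-intrinsic parameters, independent of the platform."""
    flop_count: float
    message_bytes: float


@dataclass
class Capabilities:
    """Measured resource capabilities for this code."""
    flops_per_sec: float
    bytes_per_sec: float

    def scaled(self, cpu_fraction: float, bandwidth_fraction: float) -> "Capabilities":
        """Scale by forecast availability (hypothetical NWS-style fractions)."""
        return Capabilities(self.flops_per_sec * cpu_fraction,
                            self.bytes_per_sec * bandwidth_fraction)


def predict_seconds(app: AppParams, cap: Capabilities) -> float:
    """Predicted runtime: compute time plus communication time (assumed model)."""
    return app.flop_count / cap.flops_per_sec + app.message_bytes / cap.bytes_per_sec
```

With 2e9 flops and 4e8 message bytes on a resource measured at 1e9 flops/second and 1e8 bytes/second, the prediction is 6 seconds. If both capabilities are forecast at half availability, it becomes 12 seconds.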

19
Application and System Space Signature

Application Signature
trajectory of values through N-dimensional metric space
one trajectory per process
e.g. one point per iteration
e.g. metric iterations/flop
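As an illustration of a signature as a trajectory, the sketch below builds one point per iteration in a two-dimensional metric space and compares two trajectories. The chosen metrics (messages per flop, bytes per flop) and the maximum-distance comparison are hypothetical and are not taken from the talk.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signature(counters: Sequence[Tuple[float, float, float]]) -> List[Point]:
    """Build one process's trajectory from per-iteration counters.

    Each counter tuple is (flops, messages, message_bytes) for one iteration.
    """
    return [(msgs / flops, nbytes / flops) for flops, msgs, nbytes in counters]


def max_deviation(expected: Sequence[Point], observed: Sequence[Point]) -> float:
    """Largest Euclidean distance between corresponding trajectory points."""
    return max(math.dist(p, q) for p, q in zip(expected, observed))
```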

20
Verification of Performance Contracts
[Figure: verification of performance contracts. Elements shown: Execution, Sensor Data, Contract Monitor, Violation detection, Fault detection, Rescheduling, Steer Dynamic Optimizer.]
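A hedged sketch of violation detection in a contract monitor. The 20% tolerance and the rule of three consecutive violations before triggering rescheduling are assumptions chosen to show the idea of filtering out transient dips.

```python
from typing import Optional, Sequence


def check_contract(predicted_rate: float, measured_rates: Sequence[float],
                   tolerance: float = 0.2, patience: int = 3) -> Optional[int]:
    """Return the sample index at which rescheduling would be triggered.

    A sample violates the contract if it falls below
    predicted_rate * (1 - tolerance). Rescheduling is triggered after
    `patience` consecutive violations. Returns None if that never happens.
    """
    limit = predicted_rate * (1 - tolerance)
    consecutive = 0
    for index, rate in enumerate(measured_rates):
        if rate < limit:
            consecutive += 1
            if consecutive >= patience:
                return index
        else:
            consecutive = 0  # a conforming sample resets the count
    return None
```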
21
Peridot

Goal
Develop a scalable automatic performance analysis system
Main target system: Hitachi SR8000
Partners
Leibniz Computer Center
Research Center Jülich
Technical University Dresden
Technical University of Munich

22
Hierarchy of analysis agents

Agents are autonomous but cooperate
Agents are responsible for components
Whole system
Nodes
Processes
Work distribution based on ASL specification
Performance data are processed by leaf nodes
Reducing communication in analysis hierarchy
Cooperation is done via higher level information
Talk by Karl Fürlinger, Session 02.2 on Friday
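The idea that leaf agents process raw data and pass only higher-level information upward can be sketched as follows. The Summary fields and the merge rule are hypothetical. Peridot's actual agent protocol is not described in this transcript.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class Summary:
    """Higher-level information exchanged between agents (assumed fields)."""
    count: int
    total: float
    worst: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def leaf_summary(raw_times: Sequence[float]) -> Summary:
    """A process-level agent reduces its raw measurements locally."""
    return Summary(len(raw_times), sum(raw_times), max(raw_times))


def merge(children: Iterable[Summary]) -> Summary:
    """A node- or system-level agent combines its children's summaries."""
    children = list(children)
    return Summary(sum(c.count for c in children),
                   sum(c.total for c in children),
                   max(c.worst for c in children))
```

Only one fixed-size Summary per agent travels up the hierarchy, regardless of how many raw measurements each leaf collected.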

24
Summary

Scalability to large systems
Multiple HPC systems or large volumes of historical system data
Integration of application- and system-level info
Tuning for intersite communication, improving resource allocation, dynamic adaptation, post-mortem clarification
Online analysis
Only way to handle performance data set size
Needed for dynamic tuning
Automatic analysis
Needed for dynamic tuning, inspection of large historical data sets, online analysis, model generation of applications