Automatic Online Performance Analysis for the Grid - PowerPoint PPT Presentation

1
Automatic Online Performance Analysisfor the Grid
  • Michael Gerndt
  • Technische Universität München
  • gerndt@in.tum.de

2
Grid Computing
  • Grids
  • enable communities (virtual organizations) to
    share geographically distributed resources as
    they pursue common goals -- assuming the absence
    of
  • central location,
  • central control,
  • omniscience,
  • existing trust relationships.
  • Globus Tutorial
  • Major differences from parallel systems
  • Dynamic system of resources
  • Large number of diverse systems
  • Sharing of resources
  • Transparent resource allocation

3
Requirements for Grid Performance Analysis
  • Two ways to attack the question
  • Scenarios
  • Application types

4
Scenarios for Performance Monitoring and Analysis
  • Post-mortem application analysis
  • Self-tuning applications
  • Grid scheduling
  • Grid management
  • GGF performance working group, DataGrid,
    CrossGrid

5
Self-Tuning Applications
  • Chris submits job
  • Application adapts to assigned resources
  • Application starts
  • Application monitors performance and adapts to
    resource changes
  • Requires
  • Integration of system and application monitoring
  • On-the-fly performance analysis
  • API for accessing monitor data (if performance
    analysis is done by the application)
  • Performance model and interface to steer
    adaptation (if performance analysis and tuning
    decisions are made by an external component)

6
Post-Mortem Application Analysis
  • George submits job to the Grid
  • Job is executed on some resources
  • George receives performance data
  • George analyzes performance
  • Requires
  • either resources with known performance
    characteristics (QoS)
  • or system-level information to assess performance
    data
  • scalability of performance tools
  • Focus will be on interacting components

7
Grid-Scheduling
  • Gloria determines performance critical
    application properties
  • She specifies a performance model
  • Grid scheduler selects resources
  • Application is started
  • Requires
  • PA of the grid application
  • Possibly benchmarking the application
  • Access to current performance capabilities of
    resources
  • Even better: access to predicted capabilities
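Resource selection driven by a performance model, as on this slide, can be sketched as below. The resource records and the model function are illustrative assumptions, not the interface of any actual Grid broker.

```python
# Sketch: select the resource whose predicted runtime, according to a
# user-supplied performance model, is smallest. Resource records and
# the model are hypothetical.

def select_resource(resources, perf_model):
    """Return the resource with the smallest predicted runtime."""
    return min(resources, key=perf_model)

def model(resource):
    """Example model: runtime dominated by compute and communication,
    using performance-critical application properties (assumed values)."""
    flops_needed = 1e12
    bytes_moved = 1e10
    return (flops_needed / resource["flops_per_sec"]
            + bytes_moved / resource["bytes_per_sec"])
```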

8
Grid-Management
  • George reports that he has been seeing bad
    performance for a week.
  • The helpdesk runs the Grid performance analysis
    software.
  • Periodical saturation of connections is detected.
  • Requires
  • PA of historical system information
  • Needs to be done in a distributed fashion

9
Application Types
  • Remote site access
  • Parameter studies
  • Workflow applications
  • Metacomputing applications
  • Data-intensive applications

10
Deployment of PA Tools
11
Requirements for Grid PA Tools
  • Scalability to large systems
  • Multiple HPC systems or tons of historic system
    data
  • Integration of application- and system-level info
  • Tuning for intersite communication, improving
    resource allocation, dynamic adaptation,
    post-mortem clarification
  • Online analysis
  • Only way to handle performance data set size
  • Needed for dynamic tuning
  • Automatic analysis
  • Needed for dynamic tuning, inspection of large
    historic data sets, online analysis, model
    generation of applications

12
New Aspects of Performance Analysis
  • Transparent resource allocation
  • Dynamism in resource availability
  • Even larger and geographically dispersed systems
  • Approaches in the following projects
  • Damien
  • Datagrid
  • Crossgrid
  • GrADS

13
Managing Dynamism: The GrADS Approach
  • GrADS (Grid Application Development Software)
  • Funded by National Science Foundation, started
    2000
  • Goal
  • Provide application development technologies
    that make it easy to construct and execute
    applications with reliable and often high
    performance in the constantly-changing
    environment of the Grid.
  • Major techniques to handle transparency and
    dynamism
  • Dynamic configuration to available resources
    (configurable object programs)
  • Performance contracts and dynamic reconfiguration

14
GrADS Software Architecture
[Architecture diagram: the Program Preparation System (PSE, whole-program compiler, libraries) turns the source application into a configurable object program; a scheduler/service negotiator negotiates resources; the Execution Environment (Grid runtime system (Globus), dynamic optimizer, real-time performance monitor) runs the program and returns performance feedback to the software components.]
15
Configurable Object Programs
  • Integrated mapping strategy and cost model
  • Performance enhanced by context-dependent variants
  • Context includes potential execution platforms
  • Dynamic Optimizer performs final binding
  • Implements mapping strategy
  • Chooses machine-specific variants
  • Inserts sensors and actuators
  • Performs final compilation and optimization

16
Performance Contracts
  • A performance contract specifies the measurable
    performance of a grid application.
  • Given
  • set of resources,
  • capabilities of resources,
  • problem parameters
  • the application will
  • achieve a specified, measurable performance

17
Creation of Performance Contracts
[Diagram: the program, characterized by the developer, the compiler, and measurements, yields a performance model; a resource broker, drawing on MDS and NWS information, produces a resource assignment; together, model and assignment form the performance contract.]
18
History-Based Contracts
  • Resources given by broker
  • Capabilities of resources given by
  • Measurements of this code on those resources
  • Possibly scaled by the Network Weather Service
  • e.g. Flops/second and Bytes/second
  • Problem parameters
  • Given by the input data set
  • Application intrinsic parameters
  • Independent of execution platform
  • Measurements of this code with same problem
    parameters
  • e.g. floating point operation count, message
    count, message bytes count
  • Measurable Performance Prediction
  • Combining application parameters and resource
    capabilities
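Combining application-intrinsic parameters with resource capabilities, as described above, amounts to a simple runtime prediction. The sketch below assumes the units named on the slide (FLOP counts, FLOP/s, bytes, bytes/s); it is an illustration, not the actual GrADS contract format.

```python
# Sketch: predict measurable performance by combining application
# parameters (from prior measurements with the same problem size)
# with resource capabilities (possibly scaled by NWS forecasts).

def predict_runtime(app_params, capabilities):
    """app_params: flop_count, message_bytes (platform-independent).
    capabilities: flops_per_sec, bytes_per_sec of the assigned resource.
    Returns the predicted runtime in seconds."""
    compute = app_params["flop_count"] / capabilities["flops_per_sec"]
    comm = app_params["message_bytes"] / capabilities["bytes_per_sec"]
    return compute + comm
```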

19
Application and System Space Signature
  • Application Signature
  • trajectory of values through N-dimensional
    metric space
  • one trajectory per process
  • e.g. one point per iteration
  • e.g. metric iterations/flop
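Recording such a signature, one metric-space point per iteration per process, can be sketched as follows, using the iterations/flop metric named on the slide. The trace format is an assumption for illustration.

```python
# Sketch: an application signature as a trajectory through a metric
# space, one point per iteration (one trajectory per process).

def signature_point(iterations_done, flops):
    """One trajectory point: the metric iterations/flop."""
    return iterations_done / flops

def signature(trace):
    """trace: per-iteration (iterations_done, flops) samples for one
    process. Returns the trajectory as a list of metric values."""
    return [signature_point(i, f) for i, f in trace]
```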

20
Verification of Performance Contracts
[Diagram: sensor data from the execution feeds a contract monitor, which performs violation detection and fault detection, reacting by rescheduling or by steering the dynamic optimizer.]
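The contract monitor's check can be sketched as a tolerance test over a window of sensor samples. The tolerance value and the three-way classification are assumptions for illustration, not the GrADS monitor's actual logic.

```python
# Sketch: a contract monitor compares sensor data against the
# contracted rate. A sustained shortfall counts as a contract
# violation (trigger rescheduling); a zero reading suggests a fault.

def check_contract(sensor_samples, contracted_rate, tolerance=0.7):
    """Classify a window of rate samples as 'fault', 'violation',
    or 'ok' relative to the contracted rate."""
    if any(s == 0.0 for s in sensor_samples):
        return "fault"                 # no progress: suspect a failure
    mean = sum(sensor_samples) / len(sensor_samples)
    if mean < tolerance * contracted_rate:
        return "violation"             # contract violated: reschedule
    return "ok"
```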
21
Peridot
  • Goal
  • Develop a scalable automatic performance analysis
    system
  • Main target system: Hitachi SR8000
  • Partners
  • Leibniz Computer Center
  • Research Center Jülich
  • Technical University Dresden
  • Technical University of Munich

22
Hierarchy of analysis agents
  • Agents are autonomous but cooperate
  • Agents are responsible for components
  • Whole system
  • Nodes
  • Processes
  • Work distribution based on ASL specification
  • Performance data are processed by leaf nodes
  • Reducing communication in analysis hierarchy
  • Cooperation is done via higher level information
  • Talk by Karl Fürlinger, Session 02.2 on Friday
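The reduction performed by this hierarchy, with raw data processed at the leaves and only higher-level information passed upward, can be sketched as below. The two-level structure and the summary fields are a simplified illustration of the agent design.

```python
# Sketch: leaf agents reduce raw per-process performance samples
# locally; node agents aggregate only the leaf summaries, so raw
# data never travels up the analysis hierarchy.

def leaf_summary(samples):
    """A leaf agent reduces one process's raw samples to a summary."""
    return {"max": max(samples), "mean": sum(samples) / len(samples)}

def node_summary(leaf_summaries):
    """A node agent combines the summaries of its leaf agents into
    higher-level information for the next level up."""
    return {
        "max": max(s["max"] for s in leaf_summaries),
        "mean": sum(s["mean"] for s in leaf_summaries) / len(leaf_summaries),
    }
```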

23
(No Transcript)
24
Summary
  • Scalability to large systems
  • Multiple HPC systems or tons of historic system
    data
  • Integration of application- and system-level info
  • Tuning for intersite communication, improving
    resource allocation, dynamic adaptation,
    post-mortem clarification
  • Online analysis
  • Only way to handle performance data set size
  • Needed for dynamic tuning
  • Automatic analysis
  • Needed for dynamic tuning, inspection of large
    historic data sets, online analysis, model
    generation of applications