Title: Interactive and semiautomatic performance evaluation
Interactive and semiautomatic performance evaluation
- W. Funika, B. Balis
- M. Bubak, R. Wismueller
Outline
- Motivation
- Tools Environment Architecture
- Tools Extensions for GRID
- Semiautomatic Analysis
- Prediction model for Grid execution
- Summary
Motivation
- Large number of tools, but mainly off-line and non-Grid-oriented ones
- Highly dynamic character of Grid-bound performance data
- Tool development needs a monitoring system
- accessible via a well-defined interface
- with a comprehensive range of capabilities
- not only to observe but also to control
- Recent initiatives (DAMS: no performance analysis, PARMON: no message passing, OMIS)
- Re-usability of existing tools
- Enhancing the functionality to support new programming models
- Interoperability of tools to support each other
- When interactive tools are difficult or impossible to apply, (semi)automatic ones are of help
Component Structure of Environment
Task 2.4 - Workflow and Interfaces
(workflow diagram: requirements from WP 1, 3, 4; Task 2.1)
Application analysis
- Basic blocks of all applications: dataflow for input and output
- CPU-intensive cores
- Parallel tasks / threads
- Communication
- Basic structures of the (Cross-)Grid
- Flow charts, diagrams, basic blocks from the applications
- Optional information on application design patterns, e.g. SPMD, master/worker, pipeline, divide & conquer
Categories of performance evaluation tools
- Interactive, manual performance analysis
- Off-line tools
- trace based (combined with visualization)
- profile based (no time reference)
- problem: strong intrusion with fine-grained measurements
- On-line tools
- possible definition (restriction) of the measurements at run-time
- suitable for cyclic programs: new measurements based on the previous results -> automation of the bottleneck search is possible
- Semi-automatic and automatic tools
- batch-oriented use of the computational environment (e.g. Grid)
- a search model as a basis enables refining of the measurements
Defining new functionality of a performance tool
- Types of measurements
- Types of presentation
- Levels of measurement granularity
- Measurement scopes
- Program
- Procedure
- Loop
- Function call
- Statement
- Code region identification
- Object types to be handled within an application (a sketch of how these dimensions combine follows below)
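To make the list above concrete, here is a minimal sketch of a measurement request combining these dimensions; every name (MeasurementRequest, Scope, the metric strings) is a hypothetical illustration, not an actual tool API:

```python
from dataclasses import dataclass
from enum import Enum

# All names below are hypothetical; this only illustrates how measurement
# type, scope, code region, and target object might combine in one request.

class Scope(Enum):
    PROGRAM = "program"
    PROCEDURE = "procedure"
    LOOP = "loop"
    FUNCTION_CALL = "function_call"
    STATEMENT = "statement"

@dataclass
class MeasurementRequest:
    metric: str    # type of measurement, e.g. "cpu_time" or "bytes_sent"
    scope: Scope   # level of measurement granularity
    region: str    # code region identification, e.g. "solver.c:120-180"
    target: str    # object handled within the application, e.g. a process

# Example: measure CPU time in one loop of one process.
req = MeasurementRequest("cpu_time", Scope.LOOP, "solver.c:120-180", "p_1")
```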
Definition and design work
- architecture of the tools, based on their functional description
- hierarchy and naming policy of objects to be monitored
- the tool/monitor interface, based on expressing measurement requests in terms of the monitoring specification's standard services
- the filtering and grouping policy for the tools
- functions for handling the measurement requests and the modes of their operation
- granularity of measurement representation and visualization modes
- the modes of delivering performance data for particular measurements
Modes of delivering performance data
Interoperability of tools
- Capability to run multiple tools concurrently and apply them to the same application
- Motivation
- concurrent use of tools for different tasks
- combined use can lead to additional benefits
- enhanced modularity
- Problems
- structural conflicts due to incompatible monitoring modules
- logical conflicts, e.g. a tool modifies the state of an object while another tool still keeps outdated information about it
Semiautomatic Analysis
- Why (semi-)automatic on-line performance evaluation?
- ease of use: guide programmers to performance problems
- Grid: exact performance characteristics of computing resources and network often unknown to the user
- tool should assess actual performance w.r.t. achievable performance
- interactive applications not well suited for tracing
- applications run 'all the time'
- detailed trace files would be too large
- on-line analysis can focus on specific execution phases
- detailed information via selective refinement
The APART approach
- object-oriented performance data model
- available performance data
- different kinds and sources, e.g. profiles, traces, ...
- make use of existing monitoring tools
- formal specification of performance properties
- possible bottlenecks in an application
- specific to programming paradigm
- APART specification language (ASL)
- specification of automatic analysis process
APART specification language
- specification of a performance property has three parts (see the sketch after this list)
- CONDITION: when does a property hold?
- CONFIDENCE: how sure are we? (depends on data source) (0-1)
- SEVERITY: how important is the property?
- basis for determining the most important performance problems
- specification can combine different types of performance data
- data from different hosts -> global properties, e.g. load imbalance
- templates for simplified specification of related properties
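A minimal sketch of the three-part property structure, rendered in Python for illustration (ASL has its own textual syntax; the property name, thresholds, and data fields here are assumptions):

```python
from dataclasses import dataclass

# Illustrative stand-in for an ASL performance property; the class and
# field names and the thresholds are assumptions, only the three-part
# structure (condition/confidence/severity) follows the ASL idea.

@dataclass
class RegionData:
    comm_time: float   # time spent in communication
    exec_time: float   # total execution time of the region
    from_trace: bool   # data source influences CONFIDENCE

class CommunicationCost:
    """Hypothetical property: communication dominates a code region."""

    def condition(self, d: RegionData) -> bool:
        # CONDITION: when does the property hold?
        return d.comm_time / d.exec_time > 0.2

    def confidence(self, d: RegionData) -> float:
        # CONFIDENCE (0-1): trace data is exact, profile data an estimate
        return 1.0 if d.from_trace else 0.8

    def severity(self, d: RegionData) -> float:
        # SEVERITY: used to rank the most important problems
        return d.comm_time / d.exec_time
```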
Supporting different performance analysis goals
- a performance analysis tool may be used to
- optimize an application (independent of the execution platform)
- find out how well it runs on a particular Grid configuration
- can be supported via different definitions of SEVERITY
- e.g. communication cost (formalized after this list) as
- relative amount of execution time spent on communication
- relative amount of available bandwidth used for communication
- also provides hints why there is a performance problem (resources not well used vs. resources exhausted)
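A hedged formalization of the two SEVERITY variants; the symbols ($t_{comm}$: time in communication, $t_{exec}$: total execution time, $V_{comm}$: communicated data volume, $B_{avail}$: available bandwidth) are illustrative, not from the original:

$$\mathrm{SEVERITY}_{app} = \frac{t_{comm}}{t_{exec}}, \qquad \mathrm{SEVERITY}_{grid} = \frac{V_{comm}/t_{comm}}{B_{avail}}$$

A high first value suggests the application uses the resources poorly; a high second value suggests the network resource is nearly exhausted.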
Analytical model for predicting performance on the Grid
- Extract the relationship between the application and execution features and the actual execution time.
- Focus on the relevant kernels in the applications included in WP1.
- Assuming a message-passing paradigm (in particular MPI).
Taking features into a model
- HW features
- Network speeds
- CPU speeds
- Memory bandwidth
- Application features
- Matrix and vector sizes
- Number of required communications
- Size of these communications
- Memory access patterns
Building a model
- Through statistical analysis, a model to predict the influence of several aspects on the execution of the kernels will be extracted.
- Then a particular model for each aspect will be obtained. A linear combination of these models will be used to predict the whole execution time (see the sketch after this list).
- Every particular model will be a function of the above features.
- Aspects to be included in the model:
- computation time as a function of the above features
- memory access time as a function of the features
- communication time as a function of the features
- synchronization time as a function of the features
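Under these assumptions, the linear combination can be written as (symbols illustrative, $f$ a feature vector):

$$T_{exec}(f) \approx T_{comp}(f) + T_{mem}(f) + T_{comm}(f) + T_{sync}(f)$$

A minimal fitting sketch in Python, assuming measured per-aspect times are available for a set of kernel runs (the function names are hypothetical):

```python
import numpy as np

# Illustrative sketch: fit one linear model per aspect from measured kernel
# runs, then predict the whole execution time as their linear combination.
# X has one row per run and one column per feature (CPU speed, message
# count, message size, ...).

def fit_aspect(X, t_aspect):
    """Least-squares fit of one aspect's time against the features."""
    coef, *_ = np.linalg.lstsq(X, t_aspect, rcond=None)
    return coef

def predict_total(x_new, aspect_coefs):
    """Sum of the per-aspect predictions for a new feature vector."""
    return sum(float(x_new @ c) for c in aspect_coefs)
```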
WP2.4 Tools w.r.t. DataGrid WP3
Summary
- New requirements for performance tools in the Grid
- Adaptation of the interactive performance evaluation tool to the Grid
- New measurements
- New dialogue window
- New presentations
- New objects
- Need for semiautomatic performance analysis
- Performance properties
- APART specification language
- Search strategy
- Prediction model construction
Performance Measurements with PATOP
- Possible Types of Measurement
- CPU time
- Delay in Remote Procedure Calls (system calls executed on the front-end)
- Delay in send and receive calls
- Amount of data sent and received
- Time in marked areas (code regions)
- Number of executions of a specific point in the source code
- Scope of Measurement
- System-related
- Whole computing system,
- Individual nodes,
- Individual threads,
- Pairs of nodes (communication partners, for send/receive),
- Set of nodes specified by a performance condition
- Program-related
- Whole program,
- Individual functions
PATOP
Performance evaluation tools on top of the OCM
On-line Monitoring Interface Specification
- The interface should provide the following properties:
- support for interoperable tools
- efficiency (minimal intrusion, scalability)
- support for on-line monitoring (new objects, control)
- platform independence (HW, OS, programming library)
- usability for any kind of run-time tool (observing/manipulating, interactive/automatic, centralized/distributed)
Object-based approach to monitoring
- observed system is a hierarchical set of objects
- classes: nodes, processes, threads, messages, and message queues
- node/process model suitable for DMPs, NOWs, SMPs, and SMP clusters
- access via abstract identifiers (tokens)
- services observe and manipulate objects
- OMIS core services are platform-independent
- other services: platform-specific (HW, OS, environment) extensions
- tools define their own view of the observed system (a sketch follows below)
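A minimal sketch of the token hierarchy and a request, assuming a hypothetical Python binding; the function, token, and service names are illustrative, and only the idea of services applied to object tokens comes from OMIS:

```python
# Hypothetical illustration: omis_request(), the token names, and the
# service name are assumptions, not the actual OMIS binding.

# Hierarchy of observed objects, each addressed by an abstract token:
system = {
    "n_1": {                    # node token
        "p_1": ["t_1", "t_2"],  # process token -> thread tokens
    },
}

def omis_request(request: str) -> str:
    """Stand-in for handing a request string to the monitoring system."""
    print("monitor <-", request)
    return "ok"

# A request applies a service to a list of tokens; a tool's "view" of the
# system is simply the set of tokens it chooses to observe.
omis_request(":node_get_info([n_1])")
```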
Classification of overheads
- Synchronisation (e.g. barriers and locks)
- coordination of data access, maintaining consistency
- Control of parallelism (e.g. fork/join operations and loop scheduling)
- control and manage the parallelism of a program (user, compiler)
- Additional computation: changes to the sequential code to increase parallelism or data locality
- e.g. eliminating data dependences
- Loss of parallelism: imperfect parallelisation
- un- or partially parallelised code, replicated code
- Data movement
- any data transfer within a process or between processes
Interoperability of PATOP and DETOP
- PATOP provides high-level performance measurement and visualisation
- DETOP provides source-code-level debugging
- Possible scenarios
- Erroneous behaviour observed via PATOP
- suspend the application with DETOP, examine the source code
- Measurement of execution phases
- start/stop a measurement at a breakpoint
- Measurement on dynamic objects
- start a measurement at a breakpoint when the object is created