Title: Performance Analysis and Monitoring of Grid Applications in CG, Part I: Performance Analysis
1 Performance Analysis and Monitoring of Grid applications in CG, Part I: Performance Analysis
- Marian Bubak (1,2), Wlodzimierz Funika (1), Roland Wismueller (3,4), Tomasz Arodz (1), Marcin Kurdziel (1)
(1) Institute of Computer Science, AGH, Cracow, Poland
(2) ACC CYFRONET AGH, Cracow, Poland
(3) TUM, Munich, Germany
(4) University of Vienna, Vienna, Austria
2 Task 2.4 Goals
- Provide the G-PM tool for evaluating the performance of grid applications
- Providing a rich set of predefined measurements
- Allowing for user-defined measurements
- Allowing for probe-based measurements
- Providing on-line performance measurement visualization
- Compliant with the OMIS 2.0 monitoring standard interface
3 Performance Measurement (1)
- User links application with instrumented libraries (MPI ...)
- User submits job with additional command line arguments
- monitoring (yes/no), block at beginning (yes/no)
- monitoring system address, application identifier
- User starts G-PM tool (as an X11 application) on the user's workstation
- user provides monitoring system address; G-PM connects to it
- G-PM shows a list of started applications; user selects one
- or user provides application identifier; G-PM waits until this application has started
4 Performance Measurement (2)
- User does performance analysis of running application
- User exits G-PM, before or after application terminates
- exiting G-PM will not terminate application
- all measurements will be deleted
- User can re-attach to application and do new measurements
5 Metrics vs. measurements
- Measurements are based on
- Standard metrics
- Higher Level Analysis metrics
- Custom metrics
- Probe-based metrics
Metric (def.): The defined measurement method
with its measurement scale. Metrics can be
internal or external, and direct or indirect.
Synonymous with measure. A quantitative measure
of the degree to which a system, component, or
process possesses a given attribute.
6 Standard metrics (1)
- Wall clock/CPU time
- Total
- In communication
- Send, Receive, Collective, Barrier
- In I/O
- Read, Write
- Data volume
- communication
- I/O
- Number of library calls
- communication
- I/O
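As an illustration of how the standard metrics on this slide relate to one another, the sketch below derives time spent, data volume and number of calls per category from a list of library-call records. The record layout (`LibCall`), the category names and the `summarize` helper are assumptions made for this example; they are not G-PM's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LibCall:
    """Hypothetical record of one instrumented library call."""
    category: str  # e.g. "send", "receive", "collective", "barrier", "read", "write"
    start: float   # wall-clock time at call entry, in seconds
    end: float     # wall-clock time at call exit, in seconds
    nbytes: int    # data volume moved by the call


def summarize(calls):
    """Return {category: (time_in_calls, data_volume, number_of_calls)}."""
    acc = defaultdict(lambda: [0.0, 0, 0])
    for call in calls:
        entry = acc[call.category]
        entry[0] += call.end - call.start  # time spent inside the call
        entry[1] += call.nbytes            # data volume
        entry[2] += 1                      # number of library calls
    return {category: tuple(values) for category, values in acc.items()}


calls = [
    LibCall("send", 0.0, 0.5, 1024),
    LibCall("send", 1.0, 1.25, 2048),
    LibCall("barrier", 2.0, 3.0, 0),
]
summary = summarize(calls)
```

Here `summary["send"]` holds the time, volume and call count accumulated over both send calls.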
7 Standard metrics (2)
- Host metrics
- CPU load
- Available memory
- Network metrics
- Load
- Bandwidth
- Benchmark metrics
- CPU, Network
8 Higher Level Analysis metrics
- Custom metrics
- Defined on the basis of standard metrics
- Providing higher level of abstraction
- Programmed in the ASL specification language
- Probes
- Special function calls inserted into source code by the programmer
- Define events that can be used in definition of custom metrics
- Provide a way of passing arguments to G-PM
9 Measurement parametrisation
- Measurements can be restricted to specific
- Objects
- Sites, hosts, processes, files
- Partner objects
- Sites, hosts, processes
- Locations in source code
- Modules, functions
- Time resolution
- Integral, Mean value, Current value
- Virtual time
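The parametrisation listed on this slide can be pictured as a filter plus an aggregation mode. The following is a minimal sketch under stated assumptions: `Sample`, `Restriction` and `evaluate` are hypothetical names, only three of the listed restrictions (host, process, function) are modelled, and "integral" is taken to mean the accumulated sum of the sampled values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    """Hypothetical measured value tagged with where it came from."""
    host: str
    process: int
    function: str
    value: float


@dataclass(frozen=True)
class Restriction:
    """None means 'do not restrict on this attribute'."""
    host: Optional[str] = None
    process: Optional[int] = None
    function: Optional[str] = None

    def matches(self, sample):
        pairs = (
            (self.host, sample.host),
            (self.process, sample.process),
            (self.function, sample.function),
        )
        return all(want is None or want == got for want, got in pairs)


def evaluate(samples, restriction, mode):
    """Aggregate the samples that pass the restriction."""
    values = [s.value for s in samples if restriction.matches(s)]
    if not values:
        return None
    if mode == "integral":  # assumption: accumulated sum of sampled values
        return sum(values)
    if mode == "mean":
        return sum(values) / len(values)
    if mode == "current":   # most recent value
        return values[-1]
    raise ValueError(f"unknown mode: {mode}")


samples = [
    Sample("hostA", 1, "solve", 2.0),
    Sample("hostA", 1, "solve", 4.0),
    Sample("hostB", 2, "solve", 10.0),
    Sample("hostA", 1, "solve", 6.0),
]
on_host_a = Restriction(host="hostA")
```

Calling `evaluate(samples, on_host_a, "mean")` considers only the three `hostA` samples.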
10 The G-PM User Interface
- Maintains a list of measurements
- Provides a set of predefined visualization widgets
- Different measurements can be visualized simultaneously
- Single measurement can be visualized with different widgets simultaneously.
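The many-to-many relation between measurements and widgets described on this slide can be sketched with a simple publish/subscribe arrangement. The classes `Measurement` and `Widget` below are hypothetical and only illustrate the relation; they do not describe how the G-PM user interface is implemented.

```python
class Widget:
    """Hypothetical visualization widget that remembers the latest values."""

    def __init__(self, kind):
        self.kind = kind
        self.latest = {}  # measurement name -> most recent value

    def update(self, measurement_name, value):
        self.latest[measurement_name] = value


class Measurement:
    """Hypothetical measurement that forwards new values to its widgets."""

    def __init__(self, name):
        self.name = name
        self.widgets = []

    def attach(self, widget):
        self.widgets.append(widget)

    def publish(self, value):
        for widget in self.widgets:
            widget.update(self.name, value)


send_volume = Measurement("send volume")
cpu_time = Measurement("cpu time")
curve = Widget("multicurve")
bars = Widget("bar")

send_volume.attach(curve)  # one measurement shown in two widgets
send_volume.attach(bars)
cpu_time.attach(curve)     # one widget showing two measurements

send_volume.publish(10)
cpu_time.publish(3)
```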
11 G-PM Main window
List of active measurements: after definition, measurements are displayed in the list
Selected measurements are the objects of actions taken from the menu
Selected visualizations are the objects of actions taken from the menu
List of active visualizations: after definition, visualizations are displayed in the list
12 Measurement definition window
Specifies which part of the code should be measured, e.g. a particular function
Specifies where the measurement should be done, e.g. on which site, host, process etc.
In measurements that involve two processes, such as traffic between process A and process B, it specifies the second partner.
Specifies what should be measured, e.g. Send Volume
13 Visualization definition window
Various scale parameters
Type of display can be chosen here
Time mode
Refresh rate
14 Example visualization widget (1)
Multicurve plot: measurement value vs. time
A time interval can be specified for additional computations (e.g. integration)
The computed result is displayed here
Each measurement has its own curve. The curves scroll as new values arrive.
This widget displays several measurements simultaneously
Other parameters of the visualization (e.g. the scale) can be adjusted here.
The measurement on which the computation is done can be selected here. Its curve will be displayed in red.
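The integration over a user-selected time interval mentioned on this slide could look roughly like the sketch below. It uses the trapezoidal rule over the sampled points that fall inside the interval, and it assumes the points are sorted by time. The slide does not say which integration rule the widget uses, so the rule and the `integrate` helper are assumptions for illustration.

```python
def integrate(points, t0, t1):
    """Trapezoidal integral over the (time, value) points with t0 <= time <= t1.

    Assumes `points` is sorted by time. Values are not interpolated at the
    interval boundaries; only samples inside the interval are used.
    """
    inside = [(t, v) for t, v in points if t0 <= t <= t1]
    total = 0.0
    for (ta, va), (tb, vb) in zip(inside, inside[1:]):
        total += 0.5 * (va + vb) * (tb - ta)
    return total


# Hypothetical curve of one measurement: (time in seconds, measured value).
curve = [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0), (3.0, 4.0)]
result = integrate(curve, 1.0, 3.0)
```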
15 Example visualization widget (2)
- Bar plot: compare current measurement values
This widget displays several measurements simultaneously on one scale, allowing for easy comparison
Visualization parameters (e.g. automatic/manual sorting of graphs).
16 Using G-PM: sequence diagram (1)
Measurement definition window is created
The type of measurement and its properties are
specified
A display window for new measurement is created
17 Using G-PM: sequence diagram (2)
Visualization widget and its properties are
specified
The measurement is activated
Measured values are displayed
18 Use case description
- Medical simulation application with visualization kernel (VK)
- Simulation on different site (server) than the visualization (client)
- Task
- analyse performance of simulation-to-visualization communication
19 Use case: code instrumentation
- Programmer inserts three probes
- In the source code on server
- Probe A
- After server asks client to visualize frame
- Probe B
- After data is sent to client
- In the source code on client
- Probe C
- Before data is passed to graphics engine
- Programmer recompiles the application
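The placement of the three probes can be sketched as follows. This is only an illustration: the `probe` function, the event list and the `serve_frame` / `show_frame` functions are hypothetical stand-ins, written in Python rather than in the application's own language, and the real probe calls report to the monitoring system instead of appending to a list.

```python
import time

events = []  # recorded probe events: (probe name, timestamp, arguments)


def probe(name, *args):
    """Hypothetical probe call: record an event together with its arguments."""
    events.append((name, time.monotonic(), args))


def serve_frame(frame_id, payload, request_visualization, send):
    """Server side of one frame."""
    request_visualization(frame_id)
    probe("A", frame_id)             # after the server asks the client to visualize the frame
    send(payload)
    probe("B", frame_id)             # after the data is sent to the client


def show_frame(frame_id, data, render):
    """Client side of one frame."""
    probe("C", frame_id, len(data))  # before the data is passed to the graphics engine
    render(data)


# No-op stand-ins for the real communication and rendering code.
serve_frame(1, b"abc", request_visualization=lambda f: None, send=lambda p: None)
show_frame(1, b"abc", render=lambda d: None)
```

Each probe passes arguments (here the frame number and, for probe C, the data size), which is what makes them usable in metric definitions.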
20 Use case: new metrics
- Three new custom metrics
- Generated frames/sec
- 1/(time between invocations of probe A)
- Compression factor
- (data passed to probe C) / (sent volume between execution of probe A and probe B)
- VK processing time/frame
- (time interval between execution of probe A and probe B)
- New metrics can be used in the same way as the built-in ones
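The three formulas on this slide can be written out as a short worked example. The function names and the sample numbers are assumptions made for illustration, and the code is plain Python, not the metric specification language in which G-PM custom metrics are programmed.

```python
def frames_per_sec(probe_a_times):
    """1 / (time between consecutive invocations of probe A)."""
    return [1.0 / (t1 - t0) for t0, t1 in zip(probe_a_times, probe_a_times[1:])]


def compression_factor(bytes_at_probe_c, bytes_sent_between_a_and_b):
    """(data passed to probe C) / (sent volume between probe A and probe B)."""
    return bytes_at_probe_c / bytes_sent_between_a_and_b


def vk_processing_time(t_probe_a, t_probe_b):
    """Time interval between the execution of probe A and probe B for one frame."""
    return t_probe_b - t_probe_a


# Hypothetical values: probe A fires every 0.5 s, 1024 bytes are sent
# between A and B, and 4096 bytes reach probe C on the client.
fps = frames_per_sec([0.0, 0.5, 1.0])
factor = compression_factor(4096, 1024)
vk_time = vk_processing_time(0.0, 0.125)
```

With these inputs the frame rate is constant because probe A fires at regular intervals.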
21 OCM-G: lower monitoring layer
- Underlying measurement of built-in metrics
- Application related
- Grid related
- Aggregation of measurements from multiple hosts
- Handling of probes inserted in the source code
- Handle probe events
- Handle custom probe parameters
- Access to results of micro-benchmarks
- Host related
- Network related
22 First Prototype
- supports (MPI) applications on local cluster
- assumes common file system
- Standard metrics
- data sent / received
- CPU usage of process
- delays due to communication
- Higher Level Analysis metrics
- application specific events / data (probes)
- specialized examples (demonstrators)
- Visualization windows
- bar graph / multicurve diagram
23 Using G-PM: custom metrics
Edit Metrics window is created, with all
user-defined metrics shown
Metric to be edited is selected
Measurements based on the metric are stopped and
removed
Metric is being edited
The metrics list is updated with new metric
version replacing the old version
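The editing workflow on this slide implies that measurements depending on a metric do not survive an edit of that metric. A minimal sketch of that bookkeeping follows; the `MetricRegistry` class, its method names and the sample definitions are hypothetical and not taken from G-PM.

```python
class MetricRegistry:
    """Hypothetical bookkeeping for user-defined metrics and their measurements."""

    def __init__(self):
        self.metrics = {}       # metric name -> definition text
        self.measurements = []  # (measurement id, metric name)

    def define(self, name, definition):
        self.metrics[name] = definition

    def start_measurement(self, measurement_id, metric_name):
        if metric_name not in self.metrics:
            raise KeyError(metric_name)
        self.measurements.append((measurement_id, metric_name))

    def edit(self, name, new_definition):
        """Stop and remove dependent measurements, then replace the metric."""
        removed = [m for m, metric in self.measurements if metric == name]
        self.measurements = [
            (m, metric) for m, metric in self.measurements if metric != name
        ]
        self.metrics[name] = new_definition  # new version replaces the old one
        return removed


registry = MetricRegistry()
registry.define("fps", "version 1")
registry.define("compression", "version 1")
registry.start_measurement("m1", "fps")
registry.start_measurement("m2", "compression")
registry.start_measurement("m3", "fps")
removed = registry.edit("fps", "version 2")
```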
24 Monitoring system OCM-G
- Basis for G-PM performance analysis tool
- Features
- Based on OMIS 2.0 specification
- Extended to allow for measurement of both application- and grid-related metrics
- Interface based on request/reply paradigm
- Manipulation requests
- pa_counter_global_create()
- Event requests
- patop_lib_call_started(,pvm_send,,)
- pa_counter_global_increment(cnt1,par3)
- Information requests
- pa_counter_global_read(cnt1,0)
- Synchronous data requests, asynchronous data
replies with callbacks
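Annotations on the request arguments:
Empty arguments of patop_lib_call_started: all target processes, all code functions, all threads
pvm_send: function name
cnt1 (in pa_counter_global_increment): counter ID
par3: increment value - send volume
cnt1 (in pa_counter_global_read): counter ID
0 (in pa_counter_global_read): do not clear counter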
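The three request types above combine into a simple pattern: create a counter, increment it whenever a library call starts, and read it on demand. The toy model below mimics that pattern only. `ToyMonitor` and its methods are hypothetical names, this is not the OMIS or OCM-G API, and the event is triggered by hand instead of by an instrumented library.

```python
class ToyMonitor:
    """Toy stand-in for a monitoring system; not the OMIS/OCM-G interface."""

    def __init__(self):
        self.counters = {}  # counter id -> accumulated value
        self.actions = {}   # function name -> counter ids to increment
        self._next_id = 1

    def counter_create(self):
        """Manipulation request: create a counter and return its id."""
        counter_id = f"cnt{self._next_id}"
        self._next_id += 1
        self.counters[counter_id] = 0
        return counter_id

    def on_lib_call_started(self, function_name, counter_id):
        """Event request: increment the counter when the function is called."""
        self.actions.setdefault(function_name, []).append(counter_id)

    def lib_call_started(self, function_name, send_volume):
        """Simulate the event; the increment value is the send volume."""
        for counter_id in self.actions.get(function_name, []):
            self.counters[counter_id] += send_volume

    def counter_read(self, counter_id, clear):
        """Information request: read the counter, optionally clearing it."""
        value = self.counters[counter_id]
        if clear:
            self.counters[counter_id] = 0
        return value


monitor = ToyMonitor()
cnt1 = monitor.counter_create()
monitor.on_lib_call_started("pvm_send", cnt1)
monitor.lib_call_started("pvm_send", 100)
monitor.lib_call_started("pvm_send", 28)
monitor.lib_call_started("pvm_recv", 500)  # no action registered, ignored
volume = monitor.counter_read(cnt1, 0)     # 0: do not clear the counter
```

Reading with a clear flag of 0 leaves the accumulated send volume in place, so a later read returns the same total plus anything sent in between.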