Monitoring of Interactive Grid Applications

Dagstuhl Seminar 02341: Performance Analysis and Distributed Computing, August ...

Provided by: jacekn


1
Monitoring of Interactive Grid Applications
Marian Bubak with Bartosz Balis, Wlodek Funika, Tomasz Szepieniec, Roland Wismueller
Institute of Computer Science and ACC CYFRONET AGH, Cracow, Poland
LRR-TUM, Muenchen, Germany
Institute for Software Science, University of Vienna, Austria
EU CrossGrid Project www.eu-crossgrid.org

2
Outline
  • Motivation - CrossGrid in a nutshell
  • Applications and their requirements
  • Architecture
  • Tools for applications development
  • Monitoring system
  • Concept of Grid application monitoring
  • Grid extensions for OMIS
  • Design of OCM-G
  • Security
  • Status

3
EU Funded Grid Project Space (Kyriakos Baxevanidis)
4
CrossGrid Collaboration
Ireland: TCD Dublin
Poland: Cyfronet INP Cracow, PSNC Poznan, ICM IPJ Warsaw
Germany: FZK Karlsruhe, TUM Munich, USTU Stuttgart
Netherlands: UvA Amsterdam
Slovakia: II SAS Bratislava
Austria: U.Linz
Spain: CSIC Santander Valencia, RedIris, UAB Barcelona, USC Santiago, CESGA
Greece: Algosystems, Demo Athens, AuTh Thessaloniki
Portugal: LIP Lisbon
Italy: DATAMAT
Cyprus: UCY Nikosia
5
Biomedical Application
[Figure labels: CT / MRI scan, Segmentation, Visualization, LB flow simulation, Medical DB, HDB, VE, WD, PC, PDA, Interaction; annotation: 10 simulations/day, 60 GB, 20 MB/s]
6
VR-Interaction
7
Cascade of Flood Simulations
[Figure labels: Data sources, Meteorological simulations, Hydrological simulations, Hydraulic simulations, Output visualization, Users]
8
Example of the Flood Simulation - Flow and Water Depth
9
Distributed Data Analysis in High Energy Physics
  • Objectives
    • Distributed data access
    • Distributed data mining techniques with neural networks
  • Issues
    • Typical interactive requests will run on O(TB) of distributed data
    • Transfer/replication times for the whole data set: about one hour
    • Data are transferred once, in advance of the interactive session
    • Allocation, installation and set-up of the corresponding database servers
      before the interactive session
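
As a rough illustration of why the transfer must happen before the interactive session, the following sketch computes the sustained rate needed to move a data set in a given time. The figure of 1 TB is an assumption standing in for the slide's "O(TB)", decimal units are assumed, and the function name `required_rate_mb_s` is hypothetical.

```python
def required_rate_mb_s(volume_tb: float, hours: float) -> float:
    """Sustained rate in MB/s needed to move `volume_tb` terabytes in `hours` hours."""
    megabytes = volume_tb * 1_000_000      # decimal units: 1 TB = 10^6 MB
    seconds = hours * 3600
    return megabytes / seconds


# Assumed example: 1 TB replicated in one hour.
rate = required_rate_mb_s(1.0, 1.0)
print(f"{rate:.0f} MB/s")
```

Under these assumptions the required rate is a few hundred MB/s, which is far more than an interactive request could wait for, hence a single transfer in advance.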

10
Weather Forecast and Air Pollution Modeling
  • Distributed/parallel codes on the Grid
  • Coupled Ocean/Atmosphere Mesoscale Prediction
    System
  • STEM-II Air Pollution Code
  • Integration of distributed databases
  • Data mining applied to downscaling weather
    forecast

11
Key Features of CrossGrid Applications
  • Data
    • Data sources and databases geographically distributed
    • To be selected on demand
  • Processing
    • Large processing capacity required, both HPC and HTC
    • Interactive
  • Presentation
    • Complex data requires versatile 3D visualisation
    • Support for interaction and feedback to other components

12
Overview of the CrossGrid Architecture
[Figure: layered architecture diagram; labels grouped by layer]
Applications: 1.1 BioMed; 1.2 Flooding; 1.3 Interactive Distributed Data Access; 1.3 Data Mining on Grid (NN); 1.4 Meteo Pollution
Supporting Tools: 2.2 MPI Verification; 2.3 Metrics and Benchmarks; 2.4 Performance Analysis; 3.1 Portal Migrating Desktop
Applications Development Support: MPICH-G; 1.1, 1.2 HLA and others
App. Spec Services: 1.1 Grid Visualisation Kernel; 1.1 User Interaction Services
Generic Services: 3.1 Roaming Access; 3.2 Scheduling Agents; 3.3 Grid Monitoring; 3.4 Optimization of Grid Data Access; DataGrid Replica Manager; Globus Replica Manager; GRAM; GSI; Replica Catalog (shown twice); GIS / MDS; GridFTP; Globus-IO; DataGrid Job Submission Service
Fabric: Resource Manager (CE); Resource Manager (SE); Resource Manager (two further instances); 3.4 Optimization of Local Data Access; CPU; Secondary Storage; Instruments (Satellites, Radars); Tertiary Storage
13
Tool Environment
manual information transfer
14
Tools Environment and Grid Monitoring
Applications
Portals (3.1)
G-PM Performance Measurement Tools (2.4)
MPI Debugging and Verification (2.2)
Metrics and Benchmarks (2.3)
Grid Monitoring (3.3) (OCM-G, RGMA)
The application programming environment requires information from the Grid
about the current status of applications, and it should be able to
manipulate them.
15
Monitoring of Grid Applications
  • To monitor: obtain information on, or manipulate, the target application
    • e.g. read the status of the application's processes, suspend the
      application, read / write memory, etc.
  • A monitoring module is needed by tools
    • Debuggers
    • Performance analyzers
    • Visualizers
    • ...

16
CrossGrid Monitoring System
17
Concept of Grid Applications Monitoring
  • OCM-G: Grid-enabled OMIS-Compliant Monitor
  • OMIS: On-line Monitoring Interface Specification
  • Application-oriented
    • information about running applications
  • On-line
    • information collected at runtime
    • immediately delivered to consumers
  • Information collected via instrumentation
    • activated / deactivated on demand
    • information of interest defined at runtime (lower overhead)
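
The idea of instrumentation that is activated and deactivated on demand can be illustrated with a minimal sketch. The names `Probe`, `enable`, `disable` and `hit` are hypothetical and are not part of OMIS or the OCM-G; the sketch only shows that an inactive probe costs a single flag check and that an active one delivers events to its consumer immediately.

```python
from typing import Callable, List


class Probe:
    """A hypothetical instrumentation point that can be switched on and off at runtime."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.active = False
        self._consumers: List[Callable[[str, dict], None]] = []

    def enable(self, consumer: Callable[[str, dict], None]) -> None:
        # Activate the probe and register a consumer that receives events on-line.
        self._consumers.append(consumer)
        self.active = True

    def disable(self) -> None:
        # Deactivate the probe; later hits cost only the flag check in hit().
        self._consumers.clear()
        self.active = False

    def hit(self, **data) -> None:
        # Called by the instrumented code each time execution passes the probe.
        if not self.active:
            return
        for consumer in self._consumers:
            consumer(self.name, data)


send_probe = Probe("send")
received: List[tuple] = []

send_probe.hit(size=10)      # ignored: probe not yet activated
send_probe.enable(lambda name, data: received.append((name, data)))
send_probe.hit(size=20)      # delivered to the consumer immediately
send_probe.disable()
send_probe.hit(size=30)      # ignored again
```

Only the middle call reaches the consumer, so the tool pays for exactly the information it asked for.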

18
Monitoring Autonomous System
  • Separate monitoring system
  • Tool / Monitor interface: OMIS

19
Why OMIS ?
  • Universal, generic interface supporting different tools
  • May be extended to add new grid-oriented functionality
  • Fits the GGF's Grid Monitoring Architecture (GMA)
    • e.g., the event-action paradigm enables a data-subscription scenario

20
Very Short Overview of OMIS
  • Target system view
    • hierarchical set of objects
    • nodes, processes, threads
    • for the Grid, a new object type: sites
    • objects identified by tokens, e.g. n_1, p_1, etc.
  • Three types of services
    • information services
    • manipulation services
    • event services
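
A minimal sketch of such a hierarchical, token-addressed view of the target system follows. The class names and the `descendants` helper are hypothetical; the token prefixes "n", "p" and "t" follow the examples used in the slides, while "s" for sites is an assumption.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional

# "n", "p" and "t" follow the slide examples (n_1, p_1, t_1); "s" is assumed.
PREFIX = {"site": "s", "node": "n", "process": "p", "thread": "t"}


@dataclass
class MonitoredObject:
    kind: str
    token: str
    parent: Optional["MonitoredObject"] = field(default=None, repr=False)
    children: List["MonitoredObject"] = field(default_factory=list)


class TargetSystem:
    """Hands out tokens and keeps the site > node > process > thread hierarchy."""

    def __init__(self) -> None:
        self._counters = {kind: count(1) for kind in PREFIX}
        self.by_token: Dict[str, MonitoredObject] = {}

    def add(self, kind: str, parent: Optional[MonitoredObject] = None) -> MonitoredObject:
        token = f"{PREFIX[kind]}_{next(self._counters[kind])}"
        obj = MonitoredObject(kind, token, parent)
        if parent is not None:
            parent.children.append(obj)
        self.by_token[token] = obj
        return obj

    def descendants(self, token: str, kind: str) -> List[str]:
        # Breadth-first walk collecting tokens of the given kind below `token`.
        found, queue = [], list(self.by_token[token].children)
        while queue:
            obj = queue.pop(0)
            if obj.kind == kind:
                found.append(obj.token)
            queue.extend(obj.children)
        return found


system = TargetSystem()
site = system.add("site")
node = system.add("node", site)
proc = system.add("process", node)
system.add("thread", proc)
system.add("thread", proc)
thread_tokens = system.descendants(site.token, "thread")
```

A request addressed to a site token can then be expanded to the nodes, processes or threads beneath it.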

21
OMIS Services
  • Information services
    • obtain information on the target system
    • e.g. node_get_info: obtain information on nodes in the target system
  • Manipulation services
    • perform manipulations on the target system
    • e.g. thread_stop: stop specified threads
  • Event services
    • detect events in the target system
    • e.g. thread_started_libcall: detect invocations of specified functions
  • Information services + manipulation services = actions

22
OMIS Requests
  • Services are combined into two types of monitoring requests
  • Unconditional requests
    • to be executed immediately
    • executed only once
  • Conditional requests
    • to execute actions whenever an event occurs
    • actions can be executed multiple times

23
OMIS Unconditional Requests
  • thread_stop(t_1)
    • Action: thread_stop
    • Operand: t_1
    • Meaning: stop thread t_1
24
OMIS Conditional Requests
  • Event: thread_started_libcall(t_1, MPI_Send)
  • Action: counter_inc(c_1)
  • Operands: t_1, MPI_Send, c_1
  • Meaning: whenever thread t_1 invokes MPI_Send, increment counter c_1
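
The difference between unconditional and conditional requests can be sketched with a toy monitor. This is not the OCM or OCM-G API: `ToyMonitor`, `unconditional`, `conditional` and the `libcall` hook are hypothetical names, and only the service names thread_stop, counter_inc and thread_started_libcall are taken from the slides. A second thread, t_2, is used for the unconditional request so that t_1 keeps running.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Event = Tuple[str, str, str]   # (event service, thread token, function name)


class ToyMonitor:
    """A toy stand-in for an OMIS-style monitor (illustration only)."""

    def __init__(self) -> None:
        self.stopped: Set[str] = set()
        self.counters: Dict[str, int] = defaultdict(int)
        self._conditional: Dict[Event, List[Callable[[], None]]] = defaultdict(list)

    # Actions that requests may invoke.
    def thread_stop(self, thread: str) -> None:
        self.stopped.add(thread)

    def counter_inc(self, counter: str) -> None:
        self.counters[counter] += 1

    def unconditional(self, action: Callable[[], None]) -> None:
        # Executed immediately and only once.
        action()

    def conditional(self, event: Event, action: Callable[[], None]) -> None:
        # Registered now; executed every time the event is detected.
        self._conditional[event].append(action)

    def libcall(self, thread: str, function: str) -> None:
        # Hypothetical hook: instrumentation reports that `thread` entered `function`.
        for action in self._conditional.get(("thread_started_libcall", thread, function), []):
            action()


monitor = ToyMonitor()
monitor.unconditional(lambda: monitor.thread_stop("t_2"))
monitor.conditional(("thread_started_libcall", "t_1", "MPI_Send"),
                    lambda: monitor.counter_inc("c_1"))

for _ in range(3):
    monitor.libcall("t_1", "MPI_Send")   # matches the registered event
monitor.libcall("t_1", "MPI_Recv")       # no request registered for this event
```

The unconditional request takes effect once at submission, while the conditional request fires once per matching event.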
25
New OMIS Services for Grid (1/3)
  • Services related to the new object type, site
    • site_attach: attach to a site
    • site_get_info: return information on a site
    • site_get_nodelist: return a list of nodes on a site
  • Services for application-related metrics
    • hardware_read_counter: return the value of a hardware performance counter

26
New OMIS Services for Grid (2/3)
  • Services for infrastructure-related metrics
    • network_get_info: return information on a network connection
  • Benchmark-related services
    • benchmark_get_result: return a result of a benchmark
    • benchmark_execute: execute a benchmark

27
New OMIS Services for Grid (3/3)
  • Services for application handling
    • app_attach: attach to an application
    • app_attach2: attach to an application
    • app_get_list: get a list of running applications
    • app_get_proclist: return the process list of an application
  • Services related to probes
    • thread_executes_probe: a probe has been executed

28
Grid-enabled OMIS-Compliant Monitor
  • Features
    • Permanent Grid service
    • External interface: OMIS
  • Architecture: two types of components
    • Local Monitors
    • Service Managers

29
Components of OCM-G
  • Service Managers
    • one per site in the system
    • permanent
    • request distribution
    • reply collection
  • Local Monitors
    • one per (node, user) pair
    • transient (created or destroyed when needed)
    • handle local objects
    • actual execution of requests
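
The division of labour between the two component types can be sketched as follows. This is an illustration under simplifying assumptions (a single site, in-process calls instead of network messages); the method names `handle`, `execute` and `release` and the reply format are hypothetical.

```python
from typing import Dict, List, Tuple


class LocalMonitor:
    """Handles the objects of one user on one node and executes requests there."""

    def __init__(self, node: str, user: str) -> None:
        self.node, self.user = node, user

    def execute(self, request: str) -> str:
        return f"{request}@{self.node}/{self.user}: ok"


class ServiceManager:
    """One per site: distributes requests to Local Monitors and collects replies."""

    def __init__(self, site: str, nodes: List[str]) -> None:
        self.site, self.nodes = site, nodes
        self.monitors: Dict[Tuple[str, str], LocalMonitor] = {}

    def _monitor_for(self, node: str, user: str) -> LocalMonitor:
        # Local Monitors are transient: create one only when first needed.
        key = (node, user)
        if key not in self.monitors:
            self.monitors[key] = LocalMonitor(node, user)
        return self.monitors[key]

    def handle(self, user: str, request: str, targets: List[str]) -> List[str]:
        # Request distribution followed by reply collection.
        unknown = [n for n in targets if n not in self.nodes]
        if unknown:
            raise ValueError(f"nodes not on site {self.site}: {unknown}")
        return [self._monitor_for(n, user).execute(request) for n in targets]

    def release(self, user: str) -> None:
        # Destroy the user's Local Monitors once they are no longer needed.
        self.monitors = {k: m for k, m in self.monitors.items() if k[1] != user}


sm = ServiceManager("site_a", ["n_1", "n_2"])
replies = sm.handle("alice", "node_get_info", ["n_1", "n_2"])
sm.handle("bob", "node_get_info", ["n_1"])
```

Keying the Local Monitors by (node, user) keeps one user's requests from ever being executed by a monitor that belongs to another user.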

30
Monitoring Environment
  • OCM-G components
    • Service Managers
    • Local Monitors
  • Application processes
  • Tool(s)
  • External name service
    • Component discovery

31
OCM-G Unconditional Requests
  • Immediate response from the OCM-G

32
OCM-G Conditional Request
  • Two stages
  • Request registration (msgs 1-1.2.2)
  • Request executed when event occurs (msgs 2-2.3.1)

33
OCM-G SM and LM Modules
  • Core
  • Initialization of the OCM-G components
  • Initial preprocessing of all messages

34
OCM-G SM and LM Modules
  • Communication
  • Uniform Interface for component-to-component
    communication

35
OCM-G SM and LM Modules
  • Internal localization
  • Internal name service
  • Tokens

36
OCM-G SM and LM Modules
  • External localization
  • Uniform access to external information services

37
OCM-G SM and LM Modules
  • Services
  • Implementation of OMIS services

38
OCM-G SM and LM Modules
  • Request management
  • OMIS requests analysis and distribution
  • Reply handling

39
OCM-G SM and LM Modules
  • Application context
  • Represents information about applications

40
OCM-G SM and LM Modules
  • User
  • User management
  • Authentication and authorization

41
OCM-G - SM and LM Modules
  • Application module
  • Part of OCM-G linked to the application

42
Security Issues
  • OCM-G components handle multiple users, tools and applications
    • possibility to issue a fake request (e.g., posing as a different user)
    • authentication and authorization needed
  • LMs are allowed to perform manipulations
    • an unauthorized user could do anything

43
Security - Solutions
  • LMs are user-bound
    • Run as user processes
    • Security ensured by OS mechanisms
  • Service Managers are permanent
    • Run as unprivileged processes (nobody)
    • User Grid ID checked internally (partial security)
    • Grid certificates for users, tools and SMs incorporated (ultimate security)
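
The internal Grid ID check in a Service Manager might look roughly like the sketch below. All names (`Request`, `authorize`, `AuthorizationError`, the `owners` table) are hypothetical. The sketch assumes the ID is simply claimed by the requester, which would explain why the slide calls this only partial security; certificate-based authentication is not shown.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Request:
    grid_id: str    # identity claimed by the tool issuing the request (assumed unverified)
    app: str        # target application
    service: str    # requested service, e.g. "thread_stop"


class AuthorizationError(Exception):
    pass


def authorize(request: Request, owners: Dict[str, str]) -> None:
    """Reject requests whose Grid ID does not match the application's owner."""
    owner = owners.get(request.app)
    if owner is None or owner != request.grid_id:
        raise AuthorizationError(f"{request.grid_id!r} may not access {request.app!r}")


owners = {"app_1": "alice"}                                   # hypothetical ownership table
authorize(Request("alice", "app_1", "thread_stop"), owners)   # accepted: returns None
```

A request from any other Grid ID, or for an unknown application, raises `AuthorizationError` before it is distributed to a Local Monitor.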

44
Status
  • OCM implementation for clusters
  • Software requirements specification
  • OMIS extensions for the Grid
  • OCM-G concept and OO design
  • 1st prototype in December 2002
  • Available via a public software licence
  • More: www.eu-crossgrid.org