CrossGrid After the First Year: A Technical Overview - PowerPoint PPT Presentation

About This Presentation
Title:

CrossGrid After the First Year: A Technical Overview

Description:

Title: Tools and Services for Interactive Applications in CrossGrid Author: MaQ Last modified by: bubak Created Date: 9/24/2002 8:45:43 AM Document presentation format – PowerPoint PPT presentation

Slides: 31
Provided by: Maq3

Transcript and Presenter's Notes

Title: CrossGrid After the First Year: A Technical Overview


1
CrossGrid After the First Year: A Technical Overview
Marian Bubak, Maciej Malawski, and Katarzyna Zajac
X TAT
Institute of Computer Science
ACC CYFRONET AGH, Kraków, Poland
www.eu-crossgrid.org

2
Main Objectives
  • A new category of Grid-enabled applications
      • compute- and data-intensive
      • distributed
      • near real-time response (person in a loop)
      • layered
  • New programming tools
  • Grid more user-friendly, secure and efficient
  • Interoperability with other Grids
  • Implementation of standards

3
CrossGrid in a Nutshell
Interactive, Compute and Data Intensive Applications
  • Interactive simulation and visualization of
    a biomedical system
  • Flooding crisis team support
  • Distributed data analysis in HEP
  • Weather forecasting and air pollution modeling

Tool Environment
  • MPI code debugging and verification
  • Metrics and benchmarks
  • Interactive and semiautomatic performance
    evaluation tools

Application Specific Services
  • User Interactive Services
  • Grid Visualization Kernel

New Generic Grid Services
  • Portals and roaming access
  • Scheduling agents
  • Application and Grid monitoring
  • Optimization of data access

DataGrid Services
Globus Middleware
Fabric
4
Key Features of CG Applications
  • Data
      • Data generators and databases geographically
        distributed
      • Selected on demand
  • Processing
      • Interactive
      • Requires large processing capacity, both HPC
        and HTC
  • Presentation
      • Complex data requires versatile 3D visualisation
      • Support interaction and feedback to other
        components

5
Biomedical Application
  • Adding small modifications to the proposed
    structure results in immediate changes in the
    blood flow.
  • Online presentation of simulation results via a
    3D environment.
  • The progress of the simulation and the estimated
    time of convergence should be available for
    inspection.

[Diagram labels: LB flow simulation; Visualization; Interaction; VE; WD; PC; PDA]
6
Basic Characteristics of Flood Simulation
  • Meteorological
      • Intensive simulation (HPC), large input/output
        data sets, high availability of resources
  • Hydrological
      • Parametric simulations (HTC) may require
        different models (heterogeneous simulations)
  • Hydraulic
      • Many 1-D simulations (HTC); 2-D hydraulic
        simulations require HPC

7
Distributed Data Analysis in HEP
  • Objectives
      • Distributed data access
      • Distributed data mining techniques with neural
        networks
  • Issues
      • Typical interactive requests will run on O(TB) of
        distributed data
      • Transfer/replication times for the whole data set
        are on the order of one hour
      • Data are transferred once, in advance of the
        interactive session
      • Allocation, installation and setup of the
        corresponding database servers before the
        interactive session starts
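
As a rough check on the transfer estimate above, the sustained rate needed to replicate a dataset within an hour can be worked out directly. The sketch below assumes exactly 1 TB (10^12 bytes) and exactly 3600 s; both figures are illustrative, since the slide gives only orders of magnitude.

```python
def required_rate_mb_per_s(size_bytes: float, seconds: float) -> float:
    """Sustained rate in MB/s (1 MB = 10**6 bytes) needed to move size_bytes in the given time."""
    return size_bytes / seconds / 1e6

ONE_TB = 1e12      # assumed dataset size in bytes
ONE_HOUR = 3600.0  # assumed transfer window in seconds

rate = required_rate_mb_per_s(ONE_TB, ONE_HOUR)
# Convert MB/s to Gbit/s: multiply by 8 bits per byte, divide by 1000
print(f"{rate:.0f} MB/s, about {rate * 8 / 1000:.1f} Gbit/s")
```

Under these assumptions the link must sustain a few hundred MB/s for the whole hour, which is consistent with the slide's point that data should be moved once, before the interactive session begins.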

[Diagram labels: Portal (XML in/out); Interactive Session Manager; DB Installation; On-line output; Interactive Session; Database server; DISTRIBUTED PROCESSING]
8
Weather Forecasting and Air Pollution Modeling
  • Distributed/parallel code on Grid
  • Coupled Ocean/Atmosphere Mesoscale Prediction
    System
  • STEM-II Air Pollution Code
  • Integration of distributed databases
  • Data mining applied to downscaling weather
    forecasts

9
Initial version of X architecture
Applications
  • 1.1 BioMed
  • 1.2 Flooding
  • 1.3 Interactive Distributed Data Access
  • 1.3 Data Mining on Grid (NN)
  • 1.4 Meteo Pollution
Supporting Tools
  • 2.2 MPI Verification
  • 2.3 Metrics and Benchmarks
  • 2.4 Performance Analysis
  • 3.1 Portal and Migrating Desktop
Applications Development Support
  • MPICH-G
  • 1.1, 1.2 HLA and others
App. Spec Services
  • 1.1 Grid Visualisation Kernel
  • 1.1 User Interaction Services
  • 1.3 Interactive Session Services
Generic Services
  • 3.1 Roaming Access
  • 3.2 Scheduling Agents
  • 3.3 Grid Monitoring
  • 3.4 Optimization of Grid Data Access
  • DataGrid Replica Manager
  • Globus Replica Manager
  • GRAM
  • GSI
  • Replica Catalog
  • GIS / MDS
  • GridFTP
  • Globus-IO
  • DataGrid Job Submission Service
  • Replica Catalog
Fabric
  • Resource Manager (CE)
  • Resource Manager
  • Resource Manager (SE)
  • Resource Manager
  • 3.4 Optimization of Local Data Access
  • CPU
  • Secondary Storage
  • Instruments (Satellites, Radars)
  • Tertiary Storage
10
Project Phases
  • M 1 - 3: requirements definition and merging
  • M 4 - 12: first development phase (design, 1st
    prototypes, refinement of requirements)
  • M 13 - 24: second development phase (integration
    of components, 2nd prototypes)
  • M 25 - 32: third development phase (complete
    integration, final code versions)
  • M 33 - 36: final phase (demonstration and
    documentation)
11
Tools
[Diagram labels: Benchmarks; G-PM (High Level Analysis Component, Performance Measurement Component, Performance Prediction Component, User Interface and Visualization Component); Grid Monitoring; MPI Verification (MARMOT); Applications executing on Grid testbed; Application source code; RMD; PMD]
  • MPI code debugging and verification
  • Metrics and benchmarks for the Grid environment
  • Grid-enabled Performance Measurement
  • Performance Prediction Component

12
MPI Verification
  • verifies the correctness of parallel, distributed
    Grid applications (MPI)
  • technical basis: the MPI profiling interface,
    which allows a detailed analysis of the MPI
    application
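
The MPI profiling interface lets a tool supply its own version of each MPI call, inspect the arguments, and then hand over to the underlying implementation. The Python sketch below only illustrates that interception pattern; `FakeComm` and `CheckedComm` are hypothetical names, the checks are arbitrary examples, and none of this is taken from the verification tool described on the slide.

```python
class FakeComm:
    """Stand-in for a real communicator (hypothetical, for illustration only)."""

    def __init__(self, size):
        self.size = size   # number of ranks in the communicator
        self.sent = []     # messages that reached the "real" layer

    def send(self, data, dest, tag=0):
        self.sent.append((data, dest, tag))


class CheckedComm:
    """Intercepts send() calls, checks the arguments, then forwards them."""

    def __init__(self, inner):
        self.inner = inner
        self.errors = []

    def send(self, data, dest, tag=0):
        # Verification step: reject calls that could never succeed
        if not 0 <= dest < self.inner.size:
            self.errors.append(f"send: invalid destination rank {dest}")
            return
        if tag < 0:
            self.errors.append(f"send: negative tag {tag}")
            return
        # Forwarding step: hand over to the wrapped implementation
        self.inner.send(data, dest, tag)


comm = CheckedComm(FakeComm(size=4))
comm.send("hello", dest=1)  # valid, forwarded
comm.send("oops", dest=7)   # rejected: only ranks 0..3 exist
print(comm.errors)
```

The application code calls `send` exactly as before; only the object it talks to changes, which is what makes this style of checking transparent to the program under test.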



[Diagram labels: Application or Test Tool; Profiling Interface; Core Tool; Additional Process (Debug Server); Server Side]


13
Benchmark Categories
  • Micro-benchmarks
  • For identifying basic performance properties of
    Grid services, sites, and constellations
  • Micro-kernels
  • Generic HPC/HTC kernels, including general and
    often-used kernels in Grid environments
  • Application kernels
  • Characteristic of representative CG applications
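
A micro-benchmark in the sense above isolates one basic operation and times it repeatedly. The sketch below is a generic illustration, not part of the benchmark suite on this slide: `dummy_service_call` is a placeholder workload, and the choice of the median over 20 repetitions is an assumption.

```python
import statistics
import time


def micro_benchmark(operation, repetitions=20):
    """Return the median wall-clock time in seconds of calling operation()."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def dummy_service_call():
    """Placeholder workload standing in for a single Grid service request."""
    sum(range(10_000))


median_s = micro_benchmark(dummy_service_call)
print(f"median latency: {median_s * 1e6:.1f} us")
```

The median is used here because a few slow outliers (for example, a first call that warms a cache) would distort a mean.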

[Diagram labels: Grid Bench suite; Portal; Embedding; gbView; gbControl; gbARC; gbRMP; Invocation; Retrieval; Storage/Retrieval; Direct Invocation; Invocation/Collection through GPM; SE storage]
14
Performance Measurement Tool G-PM
  • Components
      • performance measurement component (PMC),
      • component for high-level analysis (HLAC),
      • component for performance prediction (PPC), based
        on analytical performance models of application
        kernels,
      • user interface and visualization component (UIVC).
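
To make "analytical performance model" concrete, here is a minimal sketch of one common form: run time as a serial part, a parallel part divided among processes, and a communication cost per message. The formula and every parameter value are assumptions chosen for illustration; the slide does not say which models the prediction component uses.

```python
def predict_runtime(p, t_serial, t_parallel, n_msgs, msg_bytes, latency, bandwidth):
    """Predicted run time in seconds on p processes.

    compute     = t_serial + t_parallel / p
    communicate = n_msgs * (latency + msg_bytes / bandwidth)
    """
    compute = t_serial + t_parallel / p
    communicate = n_msgs * (latency + msg_bytes / bandwidth)
    return compute + communicate


# Hypothetical kernel parameters (seconds, bytes, bytes per second)
kernel = dict(t_serial=2.0, t_parallel=96.0, n_msgs=100,
              msg_bytes=1e6, latency=0.01, bandwidth=1e7)

t1 = predict_runtime(p=1, **kernel)
t8 = predict_runtime(p=8, **kernel)
print(f"p=1: {t1:.1f} s, p=8: {t8:.1f} s, speedup {t1 / t8:.2f}")
```

With these numbers the communication term does not shrink as processes are added, so the predicted speedup on 8 processes stays well below 8. Exposing that kind of limit before a run is the purpose of such a model.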

[Diagram labels: UIVC; HLAC; PMC; OCM-G; Measurement Interface; OCM-G Interface]
15
User Interactive Service
[Diagram labels: Interaction GridService; RTIExec GridService; Simulation GridService; Visualisation GridService; Registry; OGSA WSDL description; RTI Tuple Space functionality; Dynamic discovery of OGSA Services; Large On-line Data transfer; Short Messages and Events; GridFTP; SOAP/IIOP; TCP or UDP/IP]
  • enables end users to run distributed simulations
    in the Grid environment and to steer those
    simulations in near real time
  • uses OGSA mechanisms to call external resource
    brokers and job submission services, for
    efficient and transparent execution of the
    simulation on the Grid.

16
Grid Visualization Kernel
  • addresses the problems of distributed
    visualization on heterogeneous devices
  • allows Grid applications to be interconnected
    easily and transparently with existing
    visualisation tools (AVS, OpenDX, VTK, ...)
  • handles multiple concurrent input data streams
  • multiplexes compressed data and images
    efficiently across long-distance networks

[Diagram labels: GVK Portal Server; GVK Visualization Planner; GVK Visualization pipeline; GRAM; GASS; MDS; Simulation Data]
17
New Grid Services
  • Portals and roaming access
  • Grid resource management
  • Grid monitoring
  • Optimization of data access

18
Roaming Access Current Design
[Diagram labels: Web Browser; Application Portal Server; Desktop Portal Server; LDAP DataBase; Roaming Access Server; Replica Manager; Scheduling Agent; Command Line; Benchmarks]
  • Portal - easier access and use of the Grid by
    applications
  • Migrating Desktop - a transparent, independent
    user environment
  • Roaming Access Server - responsible for managing
    user profiles, job submission, file transfers and
    Grid monitoring


19
Scheduling Agents - Current Design
  • scheduling user jobs over the CrossGrid testbed
    infrastructure,
  • submission based on Condor-G,
  • support for sequential and MPI parallel jobs,
    batch jobs and interactive jobs,
  • priorities and preferences determined by the user
    for each job
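
The selection step implied by the list above (match a job against a resource list, then honour the user's preferences) can be sketched as follows. The `Resource` and `Job` classes, their fields, and the ranking rule are all hypothetical; this is not the Condor-G interface nor the project's scheduling algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    free_cpus: int
    supports_mpi: bool


@dataclass
class Job:
    cpus: int
    needs_mpi: bool = False
    preferred_sites: list = field(default_factory=list)


def select_resource(job, resources):
    """Pick a resource that satisfies the job, or None if none does."""
    candidates = [r for r in resources
                  if r.free_cpus >= job.cpus
                  and (r.supports_mpi or not job.needs_mpi)]
    if not candidates:
        return None
    # Rank: user-preferred sites first, then the most free CPUs
    return min(candidates,
               key=lambda r: (r.name not in job.preferred_sites, -r.free_cpus))


resource_list = [Resource("ce-a", 4, True),
                 Resource("ce-b", 16, True),
                 Resource("ce-c", 32, False)]

mpi_job = Job(cpus=8, needs_mpi=True)
small_job = Job(cpus=2, preferred_sites=["ce-a"])
print(select_resource(mpi_job, resource_list).name)
print(select_resource(small_job, resource_list).name)
```

Filtering before ranking keeps hard requirements (CPU count, MPI support) separate from soft preferences, so a preferred site is never chosen if it cannot actually run the job.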

[Diagram labels: Web Portal; Resource Broker; Resource list; Scheduling Agent; Logging and Bookkeeping; Job monitoring; JSS commands; JSS / CondorG; CE (three instances)]





20
Application Monitoring
  • OCM-G Components
      • Service Managers
      • Local Monitors
  • Application processes
  • Tool(s)
  • External name service
      • Component discovery



[Diagram labels: Tool; OMIS; ServiceManager; ExternalLocalization; OMIS; LocalMonitor; SharedMemory; ApplicationProcess]
21
Infrastructure Monitoring
[Diagram labels: Jiro Services; Jiro info; Globus MDS; MDS info; Static info; Infrastructure Information DB; Non-invasive Monitoring System; Instruments; Performance Information Post-processing]
  • Infrastructure monitoring
      • Invasive monitoring (based on Jiro technology)
      • Non-invasive monitoring (Santa-G)

22
Data Access Design
  • Selection of specialized components best suited
    for data access operations
  • Estimation of data access latency and bandwidth
    inside the storage elements
  • Faster access to large tape-resident files
    through fragmentation
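
A simple way to reason about the points above is to estimate access time as latency plus size divided by bandwidth, then compare storage components. The sketch below does that with invented figures for a disk and a tape source. It also shows one plausible reading of the fragmentation point: fetching only the needed fragment of a tape-resident file instead of the whole file. All names and numbers are assumptions.

```python
def access_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated seconds to read size_bytes from one storage component."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def best_source(size_bytes, sources):
    """Name of the source with the lowest estimated access time."""
    return min(sources, key=lambda name: access_time(size_bytes, *sources[name]))


# Hypothetical components: (latency in s, bandwidth in bytes/s)
SOURCES = {"disk": (0.01, 50e6), "tape": (60.0, 10e6)}

whole_file = access_time(10e9, *SOURCES["tape"])  # read a full 10 GB file
fragment = access_time(100e6, *SOURCES["tape"])   # read a 100 MB fragment only
print(f"whole file: {whole_file:.0f} s, fragment: {fragment:.0f} s")
print("best source for 100 MB:", best_source(100e6, SOURCES))
```

On tape the fixed latency (mount and seek) dominates small reads while bandwidth dominates large ones, so both terms have to be estimated to choose well.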

23
Current status of CG Architecture
Applications
Supporting Tools
Application Specific Services
Generic Services
24
Application-centric view
25
The Current Testbed
  • The current CrossGrid testbed is based on
      • EDG distribution releases 1.2.2 and 1.2.3
        (production)
      • EDG distribution release 1.4.3 (validation)
  • The current infrastructure permits
      • installation of initial prototypes of CrossGrid
        software releases (described in M12
        Deliverables)
      • testing applications using
          • Globus and EDG middleware
          • MPI
      • achieving compatibility with DataGrid and
        therefore extending Grid coverage in Europe

26
Grid Service
  • Transient, stateful Web Service (created
    dynamically)
  • Described by WSDL
  • Identified by Grid Service Handle (GSH) in the
    form of URI
  • Can be queried for configuration and state in a
    standard way (Service Data mechanism)
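
The properties listed above can be mimicked in a few lines: a factory creates instances on demand, each gets a URI-style handle and a limited lifetime, and each answers queries about its state. This is a conceptual sketch only. The class and method names are hypothetical and do not reproduce any OGSA toolkit API; `http://example.org/...` is a placeholder URI.

```python
import itertools


class GridServiceInstance:
    """A transient, stateful service identified by a handle."""

    def __init__(self, handle, lifetime_s, now):
        self.handle = handle
        self.expires_at = now + lifetime_s
        self.service_data = {"state": "created"}

    def find_service_data(self, name):
        """Query one named element of the instance's state."""
        return self.service_data.get(name)


class Factory:
    """Creates service instances dynamically and removes expired ones."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self, lifetime_s, now=0.0):
        handle = f"{self.base_uri}/instance/{next(self._ids)}"
        instance = GridServiceInstance(handle, lifetime_s, now)
        self.instances[handle] = instance
        return instance

    def sweep(self, now):
        """Destroy instances whose lifetime has expired; return their handles."""
        expired = [h for h, i in self.instances.items() if i.expires_at <= now]
        for handle in expired:
            del self.instances[handle]
        return expired


factory = Factory("http://example.org/simulation")
sim = factory.create_service(lifetime_s=60.0, now=0.0)
sim.service_data["state"] = "running"
print(sim.handle, sim.find_service_data("state"))
print(factory.sweep(now=120.0))
```

Time is passed in explicitly as `now` so the lifetime behaviour can be followed without waiting for a real clock.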

27
Why use OGSA
  • Standards
  • being part of the Grid means implementing OGSA
    Grid protocols
  • Interoperability in heterogeneous environments
  • Possible contribution to future Grid activities

28
Grid Services where?
  • Dynamic service creation and lifetime management
    to control the state of some process, e.g.
      • user session in a portal
      • data transfer
      • running simulation
  • Service data model can be applied to monitoring
    systems that can be used as information providers
    for other services.
  • Service discovery to solve the bootstrap
    problem
      • to connect the modules of a distributed
        simulation
      • to connect the application to a monitoring
        system
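
The bootstrap problem mentioned above is that components must find each other without prior knowledge of addresses. A registry reduces this to knowing one well-known location. The sketch below is a minimal in-memory illustration; the `Registry` class, the service type strings, and the example handles are all hypothetical.

```python
class Registry:
    """Maps a service type to the handles of services that registered under it."""

    def __init__(self):
        self._by_type = {}

    def register(self, service_type, handle):
        self._by_type.setdefault(service_type, []).append(handle)

    def find(self, service_type):
        """Return a copy of the handles registered for service_type."""
        return list(self._by_type.get(service_type, []))


registry = Registry()
registry.register("monitoring", "http://example.org/monitor/1")
registry.register("simulation-module", "http://example.org/sim/flow")
registry.register("simulation-module", "http://example.org/sim/mesh")

# An application that knows only the registry can now locate its peers
print(registry.find("monitoring"))
print(registry.find("simulation-module"))
```

Returning a copy from `find` keeps callers from accidentally modifying the registry's own list.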

29
Steps towards OGSA
  • Using Web Service interfaces and XML where
    possible
  • Experimenting with prototyping services using
    OGSA alpha releases
  • Applying Grid Service extensions to services
  • Solving GT2 - GT3 transition and compatibility
    issues

30
Summary
  • Achievements of the first project year
      • Software Requirements Specifications together
        with use cases written
      • CrossGrid Architecture defined
      • Detailed Design documents for tools and new Grid
        services (OO approach, UML) written
      • First prototype of software running and
        documented
      • Detailed description of the test and integration
        procedures created
      • Testbed set up