The EU DataGrid - PowerPoint PPT Presentation

Provided by: Wils231

Title: The EU DataGrid


1
The EU DataGrid Information and Monitoring
Services
  • The European DataGrid Project Team
  • http://www.eu-datagrid.org

2
Information and Monitoring Services
  • EDG information providers
  • Software that provides information about
    resources and infrastructure
  • Provided by the work packages responsible for the
    resource
  • Globus MDS (Metacomputing Directory Service or
    Monitoring and Discovery Service as it is now
    called)
  • Based on OpenLDAP, a hierarchical database
  • R-GMA (Relational Grid Monitoring Architecture)
  • A relational implementation of the Global Grid
    Forum's GMA
  • Overview
  • Uses within the testbed

3
EDG Information Providers
4
LDAP - Directory Information Tree
  • computing element
  • storage element
  • site information
  • network information between this and other sites
  • status
  • file statistics
  • supported protocols
  • storage elements that are close (not necessarily
    at the same site)
5
Siteinfo
  • in=siteinfo,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: SiteInfo
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • siteName: RALDEV
  • sysAdminContact: grid.sysadmin_at_rl.ac.uk
  • userSupportContact: grid.support_at_rl.ac.uk
  • siteSecurityContact: grid.security_at_rl.ac.uk
  • dataGridVersion: 1.2
  • installationDate: 20020704142800Z

6
Computing Element
  • ceId=dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M,hn=dev01.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: DataGridTop
  • objectClass: ComputingElement
  • CEId: dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
  • GlobusResourceContactString: dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs:/O=Grid/O=UKHEP/CN=dev01.hepgrid.clrc.ac.uk
  • GRAMVersion: ?
  • Architecture: intel
  • OpSys: RH 6.2
  • MinPhysicalMemory: 258
  • MinLocalDiskSpace: 2048
  • TotalCPUs: 1
  • FreeCPUs: 1
  • NumSMPs: 0
  • MinSPUProcessors: 0
  • MaxSPUProcessors: 0
  • TotalJobs: 0
  • RunningJobs: 0
  • IdleJobs: 0
  • MaxTotalJobs: 1
  • MaxRunningJobs: 1
  • WorstTraversalTime: 108000
  • EstimatedTraversalTime: 0
  • Active: TRUE
  • Priority: 20
  • MaxCPUTime: 108000
  • MaxWallClockTime: 432000
  • AverageSI00: 300
  • MinSI00: 300
  • MaxSI00: 300
  • AuthorizedUser: /O=Grid/O=UKHEP/OU=hepgrid.clrc.ac.uk/CN=Tim Eves
  • AuthorizedUser: /O=Grid/O=UKHEP/OU=hepgrid.clrc.ac.uk/CN=Tim Folkes
  • RunTimeEnvironment: RALDEV
  • AFSAvailable: FALSE
  • OutboundIP: TRUE
  • InboundIP: FALSE
  • QueueName: M

7
Close Storage Element
  • closeSE=dev02.hepgrid.clrc.ac.uk,ceId=dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M,hn=dev01.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: CloseStorageElement
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • CEId: dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
  • CloseSE: dev02.hepgrid.clrc.ac.uk
  • MountPoint: /flatfiles

8
Storage Element
  • seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: StorageElement
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • SEId: dev02.hepgrid.clrc.ac.uk
  • CloseCE: dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
  • SEtypearchitecture: disk
  • SEsize: 13177
  • SEResourceContactString: grid.support_at_rl.ac.uk
  • SEvo: wpsix

9
Storage Element Protocols
  • seProtocol=gridftp,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: StorageElementProtocol
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • SEId: dev02.hepgrid.clrc.ac.uk
  • SEProtocol: gridftp
  • Port: 2811
  • seProtocol=rfio,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: StorageElementProtocol
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • SEId: dev02.hepgrid.clrc.ac.uk
  • SEProtocol: rfio
  • Port: 3147
  • seProtocol=file,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: StorageElementProtocol
  • objectClass: DataGridTop

10
Storage Element Status
  • in=status,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
  • objectClass: StorageElementStatus
  • objectClass: DataGridTop
  • objectClass: DynamicObject
  • SEfreespace: 12031
  • SEId: dev02.hepgrid.clrc.ac.uk

11
MDS
  • Monitoring and Discovery Service

12
LDAP search
  • ldapsearch
  • -x
  • -H ldap://lxshare0225.cern.ch:2135
  • -b 'Mds-Vo-name=datagrid,o=grid'
  • 'objectclass=*'

13
GRIS/GIIS Hierarchy
  • Mds-Vo-name=datagrid,o=grid
  • This will look at all the data
  • Mds-Vo-name=countryA,Mds-Vo-name=datagrid,o=grid
  • This will look at all the data from countryA
  • Mds-Vo-name=countryA,o=grid
  • This will look at all the data from countryA
  • Mds-Vo-name=siteB,Mds-Vo-name=countryA,o=grid
  • This will look at all the data from siteB
  • Mds-Vo-name=siteB,o=grid
  • This will look at all the data from siteB
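
The search bases on this slide all follow one pattern: a chain of Mds-Vo-name components, innermost first, ending in the organisation. A minimal Python sketch of that pattern follows. The helper names search_base and ldapsearch_argv are hypothetical, and the command is only assembled, not run; the host and port are the ones from the LDAP search slide.

```python
import shlex


def search_base(*vo_names, org="grid"):
    """Build a search base from VO names, innermost (e.g. site) first."""
    parts = [f"Mds-Vo-name={name}" for name in vo_names]
    parts.append(f"o={org}")
    return ",".join(parts)


def ldapsearch_argv(host, port, base, search_filter="objectclass=*"):
    """Assemble (but do not run) an ldapsearch command line."""
    return ["ldapsearch", "-x", "-H", f"ldap://{host}:{port}",
            "-b", base, search_filter]


if __name__ == "__main__":
    # Query siteB through its country-level index.
    base = search_base("siteB", "countryA")
    print(shlex.join(ldapsearch_argv("lxshare0225.cern.ch", 2135, base)))
```

Passing fewer names widens the scope: search_base("datagrid") covers all the data, while search_base("siteB", "countryA") narrows it to one site.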

14
Map Centre WP7
  • Alternatively the information can be viewed using
    WP7's Map Centre
  • http://ccwp7.in2p3.fr/mapcenter/

15
R-GMA
  • Relational - Grid Monitoring Architecture
  • An Overview

16
The Consumer Producer Model
  • Uses the Grid Monitoring Architecture (GMA) from the
    Global Grid Forum
  • A relational implementation
  • Applied to both information and monitoring
  • Creates the impression that you have one RDBMS per
    Virtual Organization

[Diagram: Producer, Registry and Consumer, connected by command flow and information flow]
17
Relational Approach
  • Not a general distributed RDBMS system, but a way
    to use the relational model in a distributed
    environment where ACID properties are not
    generally important.
  • Producers announce: SQL CREATE TABLE
  • Producers publish: SQL INSERT
  • Consumers collect: SQL SELECT
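
The three verbs above map onto ordinary SQL statements. The sketch below illustrates that mapping with Python's sqlite3 module and a single in-memory database. This is an assumption made for illustration only: it is not the R-GMA API, and in R-GMA there is no single database behind the table. The column names and sample rows follow the CPULoad example used later in the slides.

```python
import sqlite3


def demo():
    db = sqlite3.connect(":memory:")

    # Producer "announces" the table: SQL CREATE TABLE.
    db.execute(
        "CREATE TABLE cpuLoad ("
        "country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
    )

    # Producer "publishes" tuples: SQL INSERT.
    db.executemany(
        "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
        [
            ("UK", "RAL", "CDF", 0.3, "19055711022002"),
            ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
            ("CH", "CERN", "CDF", 0.6, "19055511022002"),
        ],
    )

    # Consumer "collects" tuples: SQL SELECT.
    rows = db.execute(
        "SELECT site, facility, load FROM cpuLoad "
        "WHERE country = ? ORDER BY load",
        ("UK",),
    ).fetchall()
    db.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```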

18
R-GMA
  • API-Servlet communication
  • http(s) in
  • XML back

[Diagram: Application Code, Consumer API, Consumer Servlet, Registry API, Registry Servlet, Schema API, Schema Servlet, Producer API, ProducerServlet and Sensor Code, with command flow and information flow arrows numbered 1 to 9]
19
Schema Contributions
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
20
Contributions are Views
CPULoad (Producer 1)
UK  RAL  CDF    0.3  19055711022002
UK  RAL  ATLAS  1.6  19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'

CPULoad (Producer 2)
UK  GLA  CDF    0.4  19055811022002
UK  GLA  ALICE  0.5  19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
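
The idea that each producer's contribution is a view on the global table can be sketched with SQL views. The Python example below uses sqlite3 and, for simplicity, stores the global table physically and derives the producers from it. That is the reverse of R-GMA, where the global table is virtual and the producers hold the data, so treat this as an illustration of the predicates only. The view names producer1 and producer2 are hypothetical.

```python
import sqlite3

# Rows taken from the CPULoad example on the slides.
ROWS = [
    ("UK", "RAL", "CDF", 0.3, "19055711022002"),
    ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
    ("UK", "GLA", "CDF", 0.4, "19055811022002"),
    ("UK", "GLA", "ALICE", 0.5, "19055611022002"),
]


def build():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cpuLoad ("
        "country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
    )
    db.executemany("INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)", ROWS)
    # Each producer corresponds to a fixed predicate on the global table.
    db.execute(
        "CREATE VIEW producer1 AS "
        "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'"
    )
    db.execute(
        "CREATE VIEW producer2 AS "
        "SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'"
    )
    return db


def facilities(db, view):
    """Facilities visible through one producer's view, sorted by name."""
    # The view name is interpolated, so only pass trusted names.
    cursor = db.execute(f"SELECT facility FROM {view} ORDER BY facility")
    return [row[0] for row in cursor]


if __name__ == "__main__":
    db = build()
    print(facilities(db, "producer1"))
    print(facilities(db, "producer2"))
```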
21
The Mediator
  • r: a table (in a virtual database)
  • S: a relational schema (for a virtual database)
  • q: queries posed against S
  • p: Producers, associated with views on S
  • Currently views have the form SELECT * FROM r WHERE <predicate>
  • The Mediator
  • How to match q with the ps?
  • It is the mediator which makes R-GMA work
  • It is hidden inside the ConsumerServlet

22
The Mediator cont.
  • Recall that you can view R-GMA as a huge
    relational database of all the information
    produced
  • Producers register which partition of the dataset
    they have by means of a partitioning predicate
  • Some data
  • may not be accessible
  • may not be kept
  • may be duplicated
  • The mediator is hidden inside the Consumer
  • but is an essential part of R-GMA
  • The final mediator will take any SQL statement
    and from the information in the registry find the
    right producers
  • We can now merge information from several
    producers
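
The matching step can be sketched for the simplest case, where every predicate is a conjunction of column = value tests. The Python below is an assumption-laden illustration, not the mediator's actual algorithm: the registry layout, the function names and the restriction to equality tests are all hypothetical. A producer is kept unless its registered predicate contradicts the query on some column.

```python
# Hypothetical registry: producer name -> partitioning predicate,
# expressed as a mapping of column -> required value.
REGISTRY = {
    "producer1": {"country": "UK", "site": "RAL"},
    "producer2": {"country": "UK", "site": "GLA"},
    "producer3": {"country": "CH", "site": "CERN"},
}


def may_hold(producer_predicate, query_constraints):
    """True unless a column fixed by both sides has different values."""
    return all(
        producer_predicate.get(column, value) == value
        for column, value in query_constraints.items()
    )


def select_producers(registry, query_constraints):
    """Names of the producers that could contribute rows to the query."""
    return sorted(
        name
        for name, predicate in registry.items()
        if may_hold(predicate, query_constraints)
    )


if __name__ == "__main__":
    # WHERE country = 'UK' can be answered by merging two producers.
    print(select_producers(REGISTRY, {"country": "UK"}))
```

A constraint on a column that no producer fixes, such as facility, rules nobody out, so every producer has to be contacted and the results merged.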

23
Not just one Producer
  • CircularBufferProducer
  • Fast
  • Uses an SQL parser; no RDBMS involved
  • Information will be lost if not consumed in time
  • Streaming well defined
  • 1 write pointer for buffer and 1 read pointer for
    each Consumer
  • DataBaseProducer
  • Slower
  • Information not lost
  • Clean up strategy needed
  • Streaming needs to be defined
  • Archiver
  • Just Consumers and a DataBaseProducer
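
The buffer behaviour described for the CircularBufferProducer (one write pointer, one read pointer per Consumer, data lost if not consumed in time) can be sketched as below. This is a minimal Python illustration under assumed names, not the R-GMA implementation.

```python
class CircularBuffer:
    """Fixed-size buffer with one write pointer and per-consumer read pointers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.written = 0   # write pointer: total number of tuples ever written
        self.read = {}     # read pointer for each attached consumer

    def attach(self, consumer):
        # A new consumer starts streaming from the current write position.
        self.read[consumer] = self.written

    def publish(self, item):
        # Overwrites the oldest slot once the buffer is full.
        self.slots[self.written % self.capacity] = item
        self.written += 1

    def collect(self, consumer):
        """Return (unread items still in the buffer, number of items lost)."""
        oldest = max(0, self.written - self.capacity)
        start = max(self.read[consumer], oldest)
        lost = start - self.read[consumer]
        items = [self.slots[i % self.capacity]
                 for i in range(start, self.written)]
        self.read[consumer] = self.written
        return items, lost


if __name__ == "__main__":
    buf = CircularBuffer(3)
    buf.attach("slow")
    for value in range(1, 6):
        buf.publish(value)
    print(buf.collect("slow"))
```

A consumer that falls more than one buffer length behind the write pointer loses the overwritten tuples, which is the trade-off the DataBaseProducer avoids at the cost of speed.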

24
R-GMA
  • Uses Within the Testbed

25
Deployment For Release 1.3
[Diagram: Information Catalogue, Tomcat, Registry Servlet, Schema Servlet, Resource Broker, Storage Element, Computing Element, LDAP SE information provider, LDAP CE information provider, LDAP server and Site A, with command flow and information flow arrows numbered 1 to 17]
26
R-GMA Resource Broker (WP1)
[Diagram: GOUT, Archiver, Consumer (CE), Consumer (SE), Consumer, Consumer (..), LDAP server, DataBaseProducer, RDBMS, Clean up]
27
R-GMA Logging Bookkeeping (WP1)
  • BUT
  • CircularBufferProducer may lose information
  • Need something fast, capable of streaming, but
    which keeps data safe
  • Need to deal with active jobs
  • Implies a triggering mechanism
  • Only want latest info from a job
  • Will provide an overwrite mode


[Diagram: Bookkeeping Servers (BS), each with a CircularBufferProducer; Archiver of A-H, Archiver of I-N, Archiver of O-Z]
  • Each Bookkeeping Server publishes to a
    CircularBufferProducer
  • Archiver has a where clause to collect jobs
    belonging to a subset of users
  • Most queries will be satisfied by one Archiver
  • The consumer registration enables the consumer
    part of the Archiver to be notified when a new BS
    appears
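
The Archiver partitioning and the planned overwrite mode can be sketched together. The Python below assumes, purely for illustration, that the A-H, I-N and O-Z ranges refer to the first letter of the user name and that a job event is a (user, job id, status) tuple. Neither detail is stated on the slide, and all names are hypothetical.

```python
# Hypothetical where clauses: archiver name -> inclusive range of initials.
ARCHIVERS = {"A-H": ("A", "H"), "I-N": ("I", "N"), "O-Z": ("O", "Z")}


def archiver_for(user):
    """Pick the archiver whose range contains the user's initial, if any."""
    initial = user[:1].upper()
    for name, (low, high) in ARCHIVERS.items():
        if low <= initial <= high:
            return name
    return None


def archive(events):
    """Route events to archivers, keeping only the latest status per job."""
    stores = {name: {} for name in ARCHIVERS}
    for user, job_id, status in events:
        name = archiver_for(user)
        if name is not None:
            # Overwrite mode: a later event replaces the earlier one.
            stores[name][job_id] = (user, status)
    return stores


if __name__ == "__main__":
    events = [
        ("eves", "job1", "submitted"),
        ("wilson", "job2", "running"),
        ("eves", "job1", "running"),
    ]
    print(archive(events))
```

Because all of one user's jobs land in the same store, a query about a single user only needs to contact one Archiver.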

28
R-GMA Network Monitoring (WP7)
[Diagram: NM, NSE, SE-GIN, NCE and CE-GIN, each with a CircularBufferProducer; Archiver; Consumer; WP7 cost fn]
  • Each NetworkMonitor publishes to a
    CircularBufferProducer
  • Archiver collects all information needed to
    satisfy the cost function query which WP7
    provides for WP2
  • Offers a number of interesting queries to deduce
    information for paths which are not measured

29
R-GMA End Users (WP8-10)
  • GRM/PROVE for parallel applications (from release
    2.0)
  • Also depends upon WP1 offering parallel job
    submission
  • May work with LB people to handle user fields
  • From release 2 users can publish anything with
    R-GMA
  • A general table is available (release 1.3):
    userTable
  • Structure
  • userId VARCHAR(255)
  • aString VARCHAR(255)
  • aReal REAL
  • anInteger INT
  • timeStamp CHAR(15)
  • Publish with WHERE userId = xxx
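
The userTable structure above translates directly into a table definition. The Python sketch below creates it in an in-memory SQLite database and filters by userId. The sample values and the function name are invented for illustration, and SQLite stands in for R-GMA here.

```python
import sqlite3

# Column names and types as listed on the slide.
CREATE_USER_TABLE = """
CREATE TABLE userTable (
    userId    VARCHAR(255),
    aString   VARCHAR(255),
    aReal     REAL,
    anInteger INT,
    timeStamp CHAR(15)
)
"""


def publish_and_query(user_id):
    db = sqlite3.connect(":memory:")
    db.execute(CREATE_USER_TABLE)
    # Invented sample tuples; the timestamps use a 15-character form.
    db.executemany(
        "INSERT INTO userTable VALUES (?, ?, ?, ?, ?)",
        [
            (user_id, "events processed", 0.75, 1200, "20021102190556Z"),
            ("someone-else", "events processed", 0.10, 40, "20021102190557Z"),
        ],
    )
    rows = db.execute(
        "SELECT aString, aReal, anInteger FROM userTable WHERE userId = ?",
        (user_id,),
    ).fetchall()
    db.close()
    return rows


if __name__ == "__main__":
    print(publish_and_query("alice"))
```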

30
R-GMA End Users (WP8-10) cont.
  • APIs
  • Java
  • C++
  • C
  • based on libwww style of OO C
  • Python and Perl
  • (GIN is currently in Perl)
  • Perl and Python derived via SWIG
  • Probably not a good idea
  • Not like hand-generated code, so may do it again
  • Easy installation and configuration
  • For developers
  • Installers
  • Users
  • Pulse
  • GUI to browse (and manipulate) R-GMA data

31
R-GMA Future Work
  • OGSAfication
  • Consider how to handle time better in queries
  • GRM/PROVE integration
  • Security
  • Will be based on WP2 (Spitfire)
  • Will need to do our own authorisation work
  • Replication
  • We need to be able to distribute the schema and
    the registry
  • For performance
  • For reliability

32
The End
  • Information and Monitoring Services
  • http://hepunx.rl.ac.uk/edg/wp3/