Title: The EU DataGrid
1 The EU DataGrid Information and Monitoring Services
- The European DataGrid Project Team
- http://www.eu-datagrid.org
2 Information and Monitoring Services
- EDG information providers
- Software that provides information about resources and infrastructure
- Provided by the work packages responsible for the resource
- Globus MDS (Metacomputing Directory Service, now called Monitoring and Discovery Service)
- Based on OpenLDAP, a hierarchical database
- R-GMA (Relational Grid Monitoring Architecture)
- A relational implementation of the Global Grid Forum's GMA
- Overview
- Uses within the testbed
3 EDG Information Providers
4 LDAP - Directory Information Tree
- computing element
- storage element
- site information
- network information between this and other sites
- status
- file statistics
- supported protocols
- storage elements that are close (not necessarily at the same site)
5 Siteinfo
- in=siteinfo,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass SiteInfo
- objectClass DataGridTop
- objectClass DynamicObject
- siteName RALDEV
- sysAdminContact grid.sysadmin_at_rl.ac.uk
- userSupportContact grid.support_at_rl.ac.uk
- siteSecurityContact grid.security_at_rl.ac.uk
- dataGridVersion 1.2
- installationDate 20020704142800Z
6 Computing Element
- ceId=dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M,hn=dev01.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass DataGridTop
- objectClass ComputingElement
- CEId dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
- GlobusResourceContactString dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs:/O=Grid/O=UKHEP/CN=dev01.hepgrid.clrc.ac.uk
- GRAMVersion ?
- Architecture intel
- OpSys RH 6.2
- MinPhysicalMemory 258
- MinLocalDiskSpace 2048
- TotalCPUs 1
- FreeCPUs 1
- NumSMPs 0
- MinSPUProcessors 0
- MaxSPUProcessors 0
- TotalJobs 0
- RunningJobs 0
- IdleJobs 0
- MaxTotalJobs 1
- MaxRunningJobs 1
- WorstTraversalTime 108000
- EstimatedTraversalTime 0
- Active TRUE
- Priority 20
- MaxCPUTime 108000
- MaxWallClockTime 432000
- AverageSI00 300
- MinSI00 300
- MaxSI00 300
- AuthorizedUser /O=Grid/O=UKHEP/OU=hepgrid.clrc.ac.uk/CN=Tim Eves
- AuthorizedUser /O=Grid/O=UKHEP/OU=hepgrid.clrc.ac.uk/CN=Tim Folkes
- RunTimeEnvironment RALDEV
- AFSAvailable FALSE
- OutboundIP TRUE
- InboundIP FALSE
- QueueName M
7 Close Storage Element
- closeSE=dev02.hepgrid.clrc.ac.uk,ceId=dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M,hn=dev01.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass CloseStorageElement
- objectClass DataGridTop
- objectClass DynamicObject
- CEId dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
- CloseSE dev02.hepgrid.clrc.ac.uk
- MountPoint /flatfiles
8 Storage Element
- seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass StorageElement
- objectClass DataGridTop
- objectClass DynamicObject
- SEId dev02.hepgrid.clrc.ac.uk
- CloseCE dev01.hepgrid.clrc.ac.uk:2119/jobmanager-pbs-M
- SEtypearchitecture disk
- SEsize 13177
- SEResourceContactString grid.support_at_rl.ac.uk
- SEvo wpsix
9 Storage Element Protocols
- seProtocol=gridftp,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass StorageElementProtocol
- objectClass DataGridTop
- objectClass DynamicObject
- SEId dev02.hepgrid.clrc.ac.uk
- SEProtocol gridftp
- Port 2811
- seProtocol=rfio,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass StorageElementProtocol
- objectClass DataGridTop
- objectClass DynamicObject
- SEId dev02.hepgrid.clrc.ac.uk
- SEProtocol rfio
- Port 3147
- seProtocol=file,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass StorageElementProtocol
- objectClass DataGridTop
10 Storage Element Status
- in=status,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid
- objectClass StorageElementStatus
- objectClass DataGridTop
- objectClass DynamicObject
- SEfreespace 12031
- SEId dev02.hepgrid.clrc.ac.uk
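Each entry in the directory information tree is identified by a distinguished name made of comma-separated attribute=value pairs, most specific first. A minimal Python sketch of splitting such a name into its components follows. The function name `parse_dn` is hypothetical, and the sketch assumes no value contains an escaped comma.

```python
def parse_dn(dn):
    """Split a distinguished name into (attribute, value) pairs, most specific first."""
    pairs = []
    for rdn in dn.split(","):
        # Split only on the first "=" so values may themselves contain "=".
        attribute, _, value = rdn.strip().partition("=")
        pairs.append((attribute, value))
    return pairs

dn = "in=status,seId=dev02.hepgrid.clrc.ac.uk,Mds-Vo-name=ral-dev,Mds-Vo-name=uk,o=Grid"
components = parse_dn(dn)
for attribute, value in components:
    print(attribute, value)
```

Reading the pairs from right to left walks down the tree: the organisation, the country-level and site-level Mds-Vo-name entries, the storage element, and finally its status entry.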
11 MDS
- Monitoring and Discovery Service
12 LDAP search
- ldapsearch
- -x
- -H ldap://lxshare0225.cern.ch:2135
- -b 'Mds-Vo-name=datagrid,o=grid'
- 'objectclass=*'
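The options on this slide combine into a single command line. The sketch below assembles it in Python, assuming the OpenLDAP `ldapsearch` client is the tool in use. The helper name `build_ldapsearch` is hypothetical, and the match-all filter `objectclass=*` is an assumption about what the slide intended. The command is only built and printed, not executed, since the host named on the slide may no longer be reachable.

```python
import shlex

def build_ldapsearch(host, port, base_dn, ldap_filter="objectclass=*"):
    """Return the ldapsearch argument list for an anonymous MDS query."""
    return [
        "ldapsearch",
        "-x",                            # simple (anonymous) bind instead of SASL
        "-H", f"ldap://{host}:{port}",   # URI of the server to contact
        "-b", base_dn,                   # search base: where in the tree to start
        ldap_filter,                     # which entries to return
    ]

args = build_ldapsearch("lxshare0225.cern.ch", 2135, "Mds-Vo-name=datagrid,o=grid")
print(shlex.join(args))                  # shell-quoted form, ready to paste
```

Keeping the arguments as a list avoids shell quoting problems if the command is later passed to `subprocess.run`.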
13 GRIS/GIIS Hierarchy
- Mds-Vo-name=datagrid,o=grid
- This will look at all the data
- Mds-Vo-name=countryA,Mds-Vo-name=datagrid,o=grid
- This will look at all the data from countryA
- Mds-Vo-name=countryA,o=grid
- This will look at all the data from countryA
- Mds-Vo-name=siteB,Mds-Vo-name=countryA,o=grid
- This will look at all the data from siteB
- Mds-Vo-name=siteB,o=grid
- This will look at all the data from siteB
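The search bases above all follow one pattern: a chain of Mds-Vo-name components, most specific first, ending at the organisation. A small Python sketch of that pattern follows. The helper name `mds_base_dn` is hypothetical; `countryA` and `siteB` are the placeholder names used on the slide.

```python
def mds_base_dn(*vo_names, root="o=grid"):
    """Join Mds-Vo-name components, most specific first, onto the root."""
    parts = [f"Mds-Vo-name={name}" for name in vo_names]
    return ",".join(parts + [root])

print(mds_base_dn("datagrid"))               # everything in the testbed
print(mds_base_dn("countryA", "datagrid"))   # one country, via the top level
print(mds_base_dn("siteB", "countryA"))      # one site, via its country
print(mds_base_dn("siteB"))                  # one site, asked directly
```

Which of these bases a given server accepts depends on where it sits in the hierarchy, so the shorter forms are meant for queries sent directly to the country or site level.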
14 Map Centre WP7
- Alternatively the information can be viewed using WP7's Map Centre
- http://ccwp7.in2p3.fr/mapcenter/
15 R-GMA
- Relational Grid Monitoring Architecture
- An Overview
16 The Consumer Producer Model
- Use the Grid Monitoring Architecture from the Global Grid Forum
- A relational implementation
- Applied to both information and monitoring
- Creates the impression that you have one RDBMS per Virtual Organization
[Diagram: Producer, Registry and Consumer, linked by command flow and information flow]
17 Relational Approach
- Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where ACID properties are not generally important.
- Producers announce: SQL CREATE TABLE
- Producers publish: SQL INSERT
- Consumers collect: SQL SELECT
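The three verbs above map onto three ordinary SQL statements. The sketch below illustrates the mapping with Python's built-in `sqlite3` module as a local stand-in. It does not use the R-GMA API, and the single in-memory database only imitates the impression of one RDBMS that R-GMA creates across distributed producers.

```python
import sqlite3

db = sqlite3.connect(":memory:")   # local stand-in for the virtual database

# Producer announces the table it will publish to (SQL CREATE TABLE).
db.execute("CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, load REAL)")

# Producer publishes tuples (SQL INSERT).
db.execute("INSERT INTO cpuLoad VALUES ('UK', 'RAL', 'CDF', 0.3)")
db.execute("INSERT INTO cpuLoad VALUES ('UK', 'RAL', 'ATLAS', 1.6)")

# Consumer collects tuples (SQL SELECT).
rows = db.execute(
    "SELECT facility, load FROM cpuLoad WHERE site = 'RAL' ORDER BY facility"
).fetchall()
print(rows)
```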
18 R-GMA
- API-Servlet communication
- http(s) in
- XML back
[Diagram: Application Code, Consumer API, Consumer Servlet, Registry API, Registry Servlet, Schema API, Schema Servlet, Producer API, Producer Servlet and Sensor Code, connected by numbered command-flow and information-flow arrows (1 to 9)]
19 Schema Contributions
CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002
20 Contributions are Views
CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'
CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
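The idea that each producer's contribution is a view on the global table can be tried out locally. The sketch below uses Python's `sqlite3` module and the UK rows from the slide. It is an illustration only: here the global table is materialised and the views are derived from it, whereas in R-GMA the global table is virtual and is assembled from the producers. The view names `producer1` and `producer2` are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cpuLoad (country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
)
db.executemany(
    "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
    [
        ("UK", "RAL", "CDF", 0.3, "19055711022002"),
        ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
        ("UK", "GLA", "CDF", 0.4, "19055811022002"),
        ("UK", "GLA", "ALICE", 0.5, "19055611022002"),
    ],
)

# Each contribution is the global table restricted by the producer's predicate.
db.execute("CREATE VIEW producer1 AS SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'")
db.execute("CREATE VIEW producer2 AS SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'")

producer1_rows = db.execute("SELECT facility, load FROM producer1 ORDER BY facility").fetchall()
producer2_rows = db.execute("SELECT facility, load FROM producer2 ORDER BY facility").fetchall()
print(producer1_rows)
print(producer2_rows)
```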
21 The Mediator
- r: a table (in a virtual database)
- S: a relational schema (for a virtual database)
- q: queries posed against S
- p: Producers, associated with views on S
- Currently views have the form SELECT * FROM r WHERE <???>
- The Mediator
- How to match q with the p's?
- It is the mediator which makes R-GMA work
- It is hidden inside the ConsumerServlet
22 The Mediator cont.
- Recall that you can view R-GMA as a huge relational database of all the information produced
- Producers register which partition of the dataset they have by means of a partitioning predicate
- Some data
- may not be accessible
- may not be kept
- may be duplicated
- The mediator is hidden inside the Consumer
- but is an essential part of R-GMA
- The final mediator will take any SQL statement and from the information in the registry find the right producers
- We can now merge information from several producers
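The matching step can be illustrated with a toy mediator. The Python sketch below is not the R-GMA implementation: it assumes that both the partitioning predicates and the query are simple conjunctions of column = value tests, and the names `Producer`, `relevant` and `mediate` are hypothetical. A producer is consulted unless its predicate contradicts the query, and the matching rows are merged.

```python
from dataclasses import dataclass, field

@dataclass
class Producer:
    name: str
    predicate: dict                      # e.g. {"country": "UK", "site": "RAL"}
    rows: list = field(default_factory=list)

def relevant(producer, query):
    """A producer can contribute unless its predicate contradicts the query."""
    return all(query.get(col, val) == val for col, val in producer.predicate.items())

def mediate(registry, query):
    """Find the matching producers and merge the rows that satisfy the query."""
    merged = []
    for producer in registry:
        if relevant(producer, query):
            merged += [row for row in producer.rows
                       if all(row[col] == val for col, val in query.items())]
    return merged

registry = [
    Producer("Producer 1", {"country": "UK", "site": "RAL"},
             [{"country": "UK", "site": "RAL", "facility": "CDF", "load": 0.3},
              {"country": "UK", "site": "RAL", "facility": "ATLAS", "load": 1.6}]),
    Producer("Producer 2", {"country": "UK", "site": "GLA"},
             [{"country": "UK", "site": "GLA", "facility": "CDF", "load": 0.4},
              {"country": "UK", "site": "GLA", "facility": "ALICE", "load": 0.5}]),
]

# Roughly: SELECT * FROM cpuLoad WHERE country = 'UK' AND facility = 'CDF'
result = mediate(registry, {"country": "UK", "facility": "CDF"})
print([(row["site"], row["load"]) for row in result])
```

A query that names a site, such as `{"site": "GLA"}`, rules out Producer 1 from its predicate alone, so that producer is never contacted.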
23 Not just one Producer
- CircularBufferProducer
- Fast
- Uses an SQL parser; no RDBMS involved
- Information will be lost if not consumed in time
- Streaming well defined
- 1 write pointer for the buffer and 1 read pointer for each Consumer
- DataBaseProducer
- Slower
- Information not lost
- Clean up strategy needed
- Streaming needs to be defined
- Archiver
- Just Consumers and a DataBaseProducer
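The CircularBufferProducer trade-off (fast, but data is lost if not consumed in time) follows from the pointer arrangement described above. The Python sketch below is an illustrative model, not R-GMA code. The class name `CircularBuffer` and its methods are hypothetical. It keeps one write pointer for the buffer and one read pointer per consumer, and reports how many tuples a slow consumer missed.

```python
class CircularBuffer:
    """Fixed-size buffer: one write pointer, one read pointer per consumer."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.written = 0              # write pointer: total tuples ever written
        self.read = {}                # read pointer for each consumer

    def register(self, consumer):
        self.read[consumer] = self.written   # start at the next tuple to arrive

    def publish(self, item):
        self.slots[self.written % self.size] = item   # overwrites the oldest slot
        self.written += 1

    def consume(self, consumer):
        """Return (unread tuples still held, number of tuples lost)."""
        oldest = max(0, self.written - self.size)     # oldest tuple still in the buffer
        start = max(self.read[consumer], oldest)
        lost = start - self.read[consumer]
        items = [self.slots[i % self.size] for i in range(start, self.written)]
        self.read[consumer] = self.written
        return items, lost

buf = CircularBuffer(size=3)
buf.register("fast")
buf.register("slow")
for n in (1, 2):
    buf.publish(n)
fast_first = buf.consume("fast")      # reads in time, nothing lost
for n in (3, 4, 5):
    buf.publish(n)
fast_second = buf.consume("fast")
slow_all = buf.consume("slow")        # too late: tuples 1 and 2 were overwritten
print(fast_first, fast_second, slow_all)
```

A DataBaseProducer avoids the loss by storing every tuple, at the cost of speed and of needing a clean-up strategy.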
24 R-GMA
25 Deployment For Release 1.3
[Diagram: Information Catalogue, Tomcat, Registry Servlet, Schema Servlet, Storage Element, Resource Broker, Computing Element, LDAP server, LDAP SE information provider and LDAP CE information provider at Site A, linked by numbered command-flow and information-flow arrows (1 to 17)]
26 R-GMA Resource Broker (WP1)
[Diagram: GOUT, Archiver, Consumer (CE), Consumer (SE), Consumer (..), Consumer, DataBaseProducer, RDBMS, Clean up and LDAP server]
27 R-GMA Logging and Bookkeeping (WP1)
[Diagram: Bookkeeping Servers (BS), each with a CircularBufferProducer, and Archivers of A-H, I-N and O-Z]
- Each Bookkeeping Server publishes to a CircularBufferProducer
- Archiver has a where clause to collect jobs belonging to a subset of users
- Most queries will be satisfied by one Archiver
- The consumer registration enables the consumer part of the Archiver to be notified when a new BS appears
- BUT
- CircularBufferProducer may lose information
- Need something fast, capable of streaming, but which keeps data safe
- Need to deal with active jobs
- Implies a triggering mechanism
- Only want latest info from a job
- Will provide an overwrite mode
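Two of the ideas on this slide, archivers that each collect a subset of users and an overwrite mode that keeps only the latest information for a job, can be sketched together. The Python below is a toy model, not R-GMA code. It assumes the A-H, I-N and O-Z labels refer to the first letter of the user name, and the user names, job identifiers and status strings are hypothetical.

```python
ARCHIVER_RANGES = {"A-H": ("A", "H"), "I-N": ("I", "N"), "O-Z": ("O", "Z")}

def archiver_for(user):
    """Pick the archiver whose letter range covers the user's initial."""
    initial = user[0].upper()
    for name, (low, high) in ARCHIVER_RANGES.items():
        if low <= initial <= high:
            return name
    raise ValueError(f"no archiver covers user {user!r}")

# Overwrite mode: each archiver keeps only the latest status per job id.
archives = {name: {} for name in ARCHIVER_RANGES}

def archive(user, job_id, status):
    archives[archiver_for(user)][job_id] = status

archive("alice", "job-1", "SUBMITTED")
archive("alice", "job-1", "RUNNING")      # replaces the earlier record
archive("mallory", "job-2", "DONE")
print(archives)
```

Because a user's jobs all land in one archiver, a query about that user only needs to consult that archiver.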
28 R-GMA Network Monitoring (WP7)
[Diagram: NM, NSE (SE-GIN) and NCE (CE-GIN), each with a CircularBufferProducer; Archiver; Consumer; WP7 cost fn]
- Each NetworkMonitor publishes to a CircularBufferProducer
- Archiver collects all information needed to satisfy the cost function query which WP7 provides for WP2
- Offers a number of interesting queries to deduce information for paths which are not measured
29 R-GMA End Users (WP8-10)
- GRM/PROVE for parallel applications (from release 2.0)
- Also depends upon WP1 offering parallel job submission
- May work with L&B people to handle user fields
- From release 2 users can publish anything with R-GMA
- A general table is available (release 1.3): userTable
- Structure
- userId VARCHAR(255)
- aString VARCHAR(255)
- aReal REAL
- anInteger INT
- timeStamp CHAR(15)
- Publish with WHERE userId = 'xxx'
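The userTable structure above is small enough to try out locally. The sketch below recreates it with Python's `sqlite3` module as a stand-in; it does not use the R-GMA API. The inserted values and the second user id `yyy` are hypothetical, and restricting by userId in the query plays the role of the predicate mentioned on the slide.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE userTable (
        userId    VARCHAR(255),
        aString   VARCHAR(255),
        aReal     REAL,
        anInteger INT,
        timeStamp CHAR(15)
    )
""")

# Two users publish into the same general-purpose table.
insert = "INSERT INTO userTable VALUES (?, ?, ?, ?, ?)"
db.execute(insert, ("xxx", "events processed", 0.75, 1200, "19055611022002"))
db.execute(insert, ("yyy", "events processed", 0.10, 90, "19055711022002"))

# Restricting on userId separates one user's data from everyone else's.
mine = db.execute(
    "SELECT aString, aReal, anInteger FROM userTable WHERE userId = 'xxx'"
).fetchall()
print(mine)
```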
30 R-GMA End Users (WP8-10) cont.
- APIs
- Java
- C++
- C
- based on libwww style of OO C
- Python and Perl
- (GIN is currently in Perl)
- Perl and Python derived via SWIG
- Probably not a good idea
- Not like hand generated code so may do it again
- Easy installation and configuration
- For developers
- Installers
- Users
- Pulse
- GUI to browse (and manipulate) R-GMA data
31 R-GMA Future Work
- OGSAfication
- Consider how to handle time better in queries
- GRM/PROVE integration
- Security
- Will be based on WP2 (Spitfire)
- Will need to do our own authorisation work
- Replication
- We need to be able to distribute the schema and the registry
- For performance
- For reliability
32 The End
- Information and Monitoring Services
- http://hepunx.rl.ac.uk/edg/wp3/