CrossGrid Testbed Status

1
CrossGrid Testbed Status
  • Jorge Gomes jorge_at_lip.pt
  • LIP Computer Centre / X WP4

2
X testbeds
  • According to the plans, the initial monolithic
    testbed was recently split into two
    infrastructures.

Initial testbed: EDG 1.2.2/3
Production testbed: EDG 1.2.2/3
Validation testbed: EDG 1.4.3
Deployed in the context of tasks 4.1 and 4.4.
  • Production testbed
      • Used to test application prototypes.
  • Validation testbed
      • Mostly used to test new production middleware.

3
X testbed resources
Resource            Production testbed  Validation testbed
Computing Elements                  15                   3
Worker nodes                        69                   4
CPUs                               115                   5
Storage Elements                    14                   3
Storage capacity                2.7 TB              1.2 TB
  • Of the 16 sites foreseen:
      • 10 are fully available.
      • 2 are deployed and being tested.
      • 2 are currently in the validation testbed.
      • 2 are deployed but not available (not tested).
  • The X testbeds already offer considerable
    computing and storage resources.

4
X production sites status
5
The X production Resource Broker
1. Job requests are submitted from remote UIs.
2. Jobs are sent to the RB located at LIP.
3. The RB uses site information in the matchmaking.
4. The RB submits the job to a CE using GRAM.
[Diagram labels: User Interface (remote site), JDL job request, Resource Broker with JSS and II (central site, Lisbon), any CrossGrid resource; arrows numbered 1 to 4.]
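The matchmaking step can be pictured with a small sketch. This is not the EDG Resource Broker code: the site records, the field names (`free_cpus`, `authorized_vos`), the host names and the ranking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    """Hypothetical site record, as an information index might publish it."""
    name: str
    free_cpus: int
    authorized_vos: set = field(default_factory=set)


def match(job_vo, min_cpus, elements):
    """Return the CEs that satisfy the job, best candidate first."""
    candidates = [
        ce for ce in elements
        if job_vo in ce.authorized_vos and ce.free_cpus >= min_cpus
    ]
    # Rank by free CPUs so that load is shared across sites.
    return sorted(candidates, key=lambda ce: ce.free_cpus, reverse=True)


# Made-up example data; these numbers are not from the testbed.
sites = [
    ComputingElement("ce-a.example.org", 4, {"crossgrid"}),
    ComputingElement("ce-b.example.org", 12, {"crossgrid", "atlas"}),
    ComputingElement("ce-c.example.org", 20, {"cms"}),
]
ranked = match("crossgrid", 1, sites)
print([ce.name for ce in ranked])
```

The filter stands in for the requirements check and the sort for the ranking; a real broker would take both from the job description and from live site information.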
6
Central X production services (1)
The CrossGrid production central services are
located in Lisbon and maintained by LIP.
Service     Description
MyProxy     Certification proxy
RB          Resource broker
RC          Replica catalogue
VO          Virtual organisation server
UI          User interface
Monitoring  Grid monitoring
7
Central X production services (2)
  • Resource Broker
      • Matchmaking and load balancing scheduler.
      • Performs load sharing across X sites.
  • Certificate Proxy server
      • Short-lived certificates for long-lived
        processes.
      • Used by the applications portal and by the RB.
  • Virtual Organizations server
      • Database for user authentication.
      • Used to build the authorization databases of
        all X sites.
  • Replica Catalogue
      • Database for physical replica file location.
      • Central service to find the location of files in
        SEs.
  • Network and Grid Monitoring
      • Early detection of problems in the X testbed.

8
The X production Replica Catalogue
  • The production RC
      • This is basically an LDAP server.
      • Hosted at lngrid08.lip.pt, port 9980.
      • Used by the RB and RM.

VO         Collection  Description
crossgrid  cgtst0      CrossGrid collection
wpsix      wpsixtst0   Being used in tests
atlas      atlastst0   Being used in tests
cms        cmstst0     Not used

URL: ldap://lngrid08.lip.pt:9980/rc=CrossGridReplicaCatalogue,dc=lngrid08,dc=lip,dc=pt
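As a rough illustration of how the catalogue URL breaks down, the sketch below splits it into host, port and base DN with the Python standard library. The URL is written with the `://`, `:` and `=` separators restored, on the assumption that they were lost when the slide was converted to text; `parse_rc_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def parse_rc_url(url):
    """Split an LDAP URL into host, port and the list of RDNs in its base DN."""
    parts = urlsplit(url)
    base_dn = parts.path.lstrip("/")
    rdns = [tuple(rdn.split("=", 1)) for rdn in base_dn.split(",")]
    return parts.hostname, parts.port, rdns


host, port, rdns = parse_rc_url(
    "ldap://lngrid08.lip.pt:9980/"
    "rc=CrossGridReplicaCatalogue,dc=lngrid08,dc=lip,dc=pt"
)
print(host, port)
print(rdns)
```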
9
X validation sites status
  • The validation testbed was created in the context
    of task 4.4 (testbed quality assurance).
  • Currently EDG 1.4.3 is being tested.
  • All three sites have been successfully deployed.
  • The central services for the validation testbed
    have been successfully deployed at LIP.

10
The X validation Resource Broker
1. Job requests are submitted from remote UIs.
2. Jobs are sent to the RB located at LIP.
3. The RB uses site information in the matchmaking.
4. The RB submits the job to a CE using GRAM.
[Diagram labels: User Interface (remote site), JDL job request, Resource Broker with JSS (central site, Lisbon), Information Index (new server), self registration, Athens, Karlsruhe, any CrossGrid resource; arrows numbered 1 to 4.]
11
Central X validation services (1)
The CrossGrid validation central services are
located in Lisbon and maintained by LIP.
Service     Description
MyProxy     Certification proxy
RB          Resource broker
RC          Replica catalogue
VO          Virtual organisation server
UI          User interface
Monitoring  Grid monitoring
II          Information Index
12
Central X validation services (2)
  • Resource Broker
      • Matchmaking and load balancing scheduler.
      • Performs load sharing across X sites.
  • Certificate Proxy server
      • Shared with the production testbed.
  • Virtual Organizations server
      • Shared with the production testbed.
  • Replica Catalogue
      • Database for physical replica file location.
      • Central service to find the location of files in
        SEs.
  • Network and Grid Monitoring
      • Shared with the production testbed.
  • Information Index
      • Top-level MDS information server; contains
        pointers to the site information servers.

13
The X validation Replica Catalogue
  • The validation RC
      • This is basically an LDAP server.
      • Hosted at rc01.lip.pt, port 9980.
      • Used by the RB and RM.

VO         Collection  Description
crossgrid  cg          CrossGrid collection

URL: ldap://rc01.lip.pt:9980/rc=CG Replica Catalog,dc=rc01,dc=lip,dc=pt
14
Production and validation systems hosted at LIP

Local resources
System      Test and production  Shared      Validation
LCFG                             lngrid01
CA                               (OFFLINE)
Gatekeeper  lngrid02                         ce01
WN          (...)                            (...)
SE          lngrid03                         se01
UI          lngrid05                         ui01

Central services
System      Test and production  Shared      Validation
RB          lngrid06                         rb01
MyProxy                          lngrid07
RC          lngrid08                         rc01
VO                               lnnet05
II                                           ii01
Monitoring                       lnnet07

Some X central systems will soon be moved to the
FCCN NOC in Lisbon.

15
Central Services Hosting
  • LIP and the Portuguese academic network (FCCN)
    have established a protocol for hosting
    LIP/CrossGrid systems in the Lisbon NOC.
  • The contract allows:
      • LIP to install servers in the Lisbon NOC.
      • Higher bandwidth.
      • The systems to be in the same room as the Géant
        router (only one hop in the middle).
      • Continuous power supply (diesel generator).
  • The systems will be under full control of LIP.
  • This is the result of a collaboration between LIP
    and FCCN on Grid and network technologies.

16
Virtual Organizations
  • CrossGrid has its own VO server
      • The VO server is used to build the authorization
        databases of the X testbed systems.
      • Currently it is an LDAP server (VOMS is being
        tested).
      • Hosted at grid-vo.lip.pt, port 9990.
  • CrossGrid users can send their VO membership
    requests to vo.admin_at_lip.pt
  • 43 users are registered in the crossgrid VO.

VO           Group     Description
crossgrid    testbed1  All CrossGrid users
cgTV         alpha     Test and validation experts
cgTV         beta      Test and validation users
gdmpservers  apptb     All production GDMP servers
gdmpservers  tvtb      All validation GDMP servers
gdmpservers  devtb     Not used
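To make "building the authorization databases" concrete, here is a minimal sketch that turns a list of VO member DNs into grid-mapfile style lines (a quoted certificate DN followed by a local account). It does not query the VO LDAP server; the DNs, the `cguser` account name and the `gridmap_lines` helper are invented for illustration.

```python
def gridmap_lines(member_dns, local_account):
    """Format one grid-mapfile style line per VO member DN."""
    lines = []
    for dn in sorted(set(member_dns)):  # de-duplicate, stable order
        lines.append(f'"{dn}" {local_account}')
    return lines


# Invented DNs for illustration only.
members = [
    "/C=PT/O=ExampleCA/O=LIP/CN=Example User One",
    "/C=ES/O=ExampleCA/O=IFCA/CN=Example User Two",
    "/C=PT/O=ExampleCA/O=LIP/CN=Example User One",
]
for line in gridmap_lines(members, "cguser"):
    print(line)
```

In a real site the member list would come from the VO server for each VO group the site accepts, and the result would be regenerated periodically.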
17
Certification Authorities
  • Five new CAs were created and are now recognized
    by CrossGrid.
  • All CAs are operational, issuing certificates and
    CRLs.
  • All CAs are recognized by DataGrid, with one
    exception that is finishing the acceptance
    process.

18
Certification Authorities (2)
  • However, the work is not complete (it never will
    be).
  • Sometimes CRLs expire, causing denial of service.
  • A tool to monitor CRL issuance is being
    developed.
  • The same may happen with the issued
    certificates, since they have a one-year lifetime.
  • A tool to monitor the validity of the host
    certificates is being developed.
  • The new Cyprus CA is not installed everywhere.
  • Security policies and procedures to deal with
    certificate compromise are required.
  • A draft was written (to be discussed in the
    security team).
  • Probably the CRL download period must be shorter.
  • A manual explaining the theory behind
    certificates and how they should be used is
    required.
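The core check behind a CRL or certificate monitor is a date comparison. The sketch below is not the tool under development at LIP; the function name, the two-day warning window and the timestamps are assumptions, and reading the expiry time out of a real CRL or certificate file is left out.

```python
from datetime import datetime, timedelta


def crl_status(next_update, now, warn_before=timedelta(days=2)):
    """Classify a CRL by comparing its nextUpdate time with the current time."""
    if now >= next_update:
        return "expired"
    if next_update - now <= warn_before:
        return "expiring soon"
    return "ok"


# Hypothetical timestamps; a real monitor would read nextUpdate from each CRL.
now = datetime(2003, 1, 20, 12, 0, 0)
print(crl_status(datetime(2003, 1, 19, 0, 0, 0), now))
print(crl_status(datetime(2003, 1, 21, 0, 0, 0), now))
print(crl_status(datetime(2003, 2, 15, 0, 0, 0), now))
```

The same comparison applies to host certificates by passing the certificate's expiry date and a longer warning window.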

19
Monitoring and verification
  • Grid and network monitoring services have been
    deployed to monitor the X testbed.
  • http://mapcenter.lip.pt
  • An installation and verification tool was
    developed at LIP to verify X testbed sites.
  • An interim version can be consulted at
    http://www.lip.pt/computing/cg-services/site_check

20
Testbed support
  • The CrossGrid helpdesk application is being
    tested and is almost ready.
  • The current sources of support are still:
      • crossgrid-wp4-support_at_lists.cesga.es
      • http://grid.ifca.unican.es/crossgrid/wp4
  • The support for the central services is currently
    provided by LIP:
      • grid.support_at_lip.pt
      • http://www.lip.pt/computing/cg-services
      • http://www.lip.pt/computing/cg-tv-services
  • Installation manual:
      • http://gridportal.fzk.de/cgi-bin/viewcvs.cgi/crossgrid/crossgrid/wp4/sites/demo/documents/install_guide_v1.0.pdf

21
Testbed monitoring
Mapcenter: grid monitoring framework. Mapcenter
was developed by DataGrid and adapted to
CrossGrid by LIP. Enhancements are being
implemented by LIP in cooperation with DataGrid.
http://mapcenter.lip.pt
22
X host check tool
Host Check: grid host checker. Host Check was
developed by LIP to support the CrossGrid testbed
deployment. Host Check produces a detailed
report for each testbed CE and SE.
http://www.lip.pt/computing/cg-services/site_check
23
Production RB statistics
Total users              33
Jobs submitted         2094
Jobs accepted          1951
Jobs with good match   1836
Jobs submitted by JSS  1817
Jobs run               1651
Jobs done              1101
  • The peak usage of the RB was between last
    November and December.
  • Since the current RB doesn't support parallel
    jobs, MPI job submissions go unnoticed by the
    RB.

24
Validation RB statistics
Total users               8
Jobs submitted         4173
Jobs accepted          4173
Jobs with good match   4010
Jobs submitted by JSS  4007
Jobs run               3964
Jobs done              3954
Summary: 163 matching failures, 3 not submitted, 43
didn't run, 219 jobs lost, 94.8% success.
  • The test and validation RB was established
    recently.
  • The validation RB also doesn't support parallel
    applications.
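The summary figures for the validation RB follow from differences between consecutive counters. A short sketch of that arithmetic, using the numbers from the slide (the stage labels are shortened here):

```python
# Validation RB counters as listed on the slide.
stages = [
    ("submitted", 4173),
    ("accepted", 4173),
    ("good match", 4010),
    ("submitted by JSS", 4007),
    ("run", 3964),
    ("done", 3954),
]

# Jobs lost between each pair of consecutive stages.
losses = {
    f"{a} -> {b}": na - nb
    for (a, na), (b, nb) in zip(stages, stages[1:])
}
total_lost = stages[0][1] - stages[-1][1]
success = 100.0 * stages[-1][1] / stages[0][1]

print(losses)
print(total_lost, round(success, 1))
```

The success rate is taken as jobs done over jobs submitted, which reproduces the 94.8% quoted on the slide.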

25
Production CEs statistics
(LCAS, CRL exp, Jobman and GSS are counts of failed jobs by cause.)
Site      Connections  Pings  Jobs OK  LCAS  CRL exp  Jobman    GSS
LIP              6556    462     2836    50       17      92   3099
IFIC             5326    655     2649   100       97      45   1780
Cyfronet         4516    306     2522     0       20     111   1557
II SAS           1404      6     1185     0       15      99     99
FZK              1799     11     1112   118        7     123    428
Demo             9481      5     1111    36        0      51   8278
ICM               705     34      604     8       24       2     33
CESGA            7321      1      544    78       28      13   6657
UAB               600     14      519     0        9      14     44
INS               592      2      517     6       20      20     27
PSNC              582      0      496    15       14      11     46
TCD               145      0      131     0        0       2     12
AUTH              141      0      127     0        3       0     11
TOTAL           39168   1496    14353   411      254     583  22071
26
Validation CEs statistics
(LCAS, CRL exp, Jobman and GSS are counts of failed jobs by cause.)
Site      Connections  Pings  Jobs OK  LCAS  CRL exp  Jobman    GSS
LIP             67365   2319    64995    21        0       4     26
FZK              8883     64     8671    38       12      50     48
Demo            10665      0     6170     4        6       2   4483
TOTAL           86913   2383    79836    63       18      56   4557
  • The validation testbed has been heavily
    exercised.
  • More than 80,000 jobs have been submitted since
    the end of November.
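One way to read the validation CE table: in every row shown, the connection count equals pings plus good jobs plus failed jobs. The sketch below checks that and derives a per-site success share. Treating each connection as exactly one of those outcomes is an observation about these numbers, not something the slides state.

```python
# Validation CE rows: connections, pings, jobs OK, then failures by cause
# (LCAS, CRL expired, job manager, GSS), copied from the table.
rows = {
    "LIP":  (67365, 2319, 64995, 21, 0, 4, 26),
    "FZK":  (8883, 64, 8671, 38, 12, 50, 48),
    "Demo": (10665, 0, 6170, 4, 6, 2, 4483),
}

for site, (connections, pings, ok, *failures) in rows.items():
    # Each connection appears to be counted once: ping, good job or failure.
    assert connections == pings + ok + sum(failures)
    attempts = ok + sum(failures)
    print(site, f"{100.0 * ok / attempts:.1f}% of job attempts succeeded")

total_ok = sum(row[2] for row in rows.values())
print("Jobs OK in total:", total_ok)
```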

27
X in the DataGrid testbed
  • CrossGrid sites in the DataGrid testbed as seen by
    Mapcenter (Europe view).

28
Test of X applications (1)
  • The tests of the X HEP example application using
    MPICH-G2 across sites started in November.
  • Tests were performed:
      • Using dedicated systems (IFCA).
      • Using the CrossGrid production testbed (LIP,
        Demokritos).
  • The tests over the testbed have shown that:
      • It is possible to run MPI jobs in the testbed.
      • MPI across sites with MPICH-G2 works.
      • However, problems were detected at sites using
        private IP addresses.

29
Test of X applications (2)
  • It was possible to run the application in up to
    seven sites at the same time.
  • The application was compiled statically.
  • Both PBS and FORK job managers were used in the
    tests.
  • Issues
      • There is no support for parallel jobs in the RB
        (yet), so matchmaking must be performed by the
        user:
          • Check that the user is authorized at the
            testbed sites.
          • Check that there are free CPUs available.
      • PBS jobs may end up waiting in a queue.
      • Sometimes processes hang in the queues.
      • Sometimes the execution hangs at start.
      • Problems with private IP addresses.
      • Possible problems with firewalls.
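The manual matchmaking described under Issues could look roughly like the sketch below: keep only the sites where the user is authorized and CPUs are free, then spread the MPI processes over them. The site names, the `free_cpus` and `vos` fields, and the greedy placement rule are assumptions for illustration, not the procedure used in the tests.

```python
def plan_mpi_run(processes, sites, user_vo):
    """Greedily spread MPI processes over sites where the user may run."""
    plan = {}
    remaining = processes
    # Prefer the sites with most free CPUs to keep the number of sites low.
    for name, info in sorted(sites.items(), key=lambda kv: -kv[1]["free_cpus"]):
        if remaining == 0:
            break
        if user_vo not in info["vos"] or info["free_cpus"] == 0:
            continue  # not authorized here, or nothing free
        take = min(remaining, info["free_cpus"])
        plan[name] = take
        remaining -= take
    if remaining:
        raise ValueError(f"{remaining} processes could not be placed")
    return plan


# Hypothetical snapshot of site state; not measured values.
sites = {
    "site-a": {"free_cpus": 6, "vos": {"crossgrid"}},
    "site-b": {"free_cpus": 10, "vos": {"atlas"}},
    "site-c": {"free_cpus": 3, "vos": {"crossgrid"}},
}
print(plan_mpi_run(8, sites, "crossgrid"))
```

The sketch ignores queue waiting times, private IP addresses and firewalls, which the slide lists as separate problems.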

30
IST Demonstration
  • World grid demonstration involving European and
    US sites from CrossGrid, DataGrid, GriPhyN and
    PPDG.
  • Took place in November 2002.
  • It was the largest grid testbed in the world.
  • Applications from the CERN/LHC experiments CMS
    and ATLAS were used.
  • CrossGrid participated with 3 sites:
      • LIP - Lisbon
      • FZK - Karlsruhe
      • IFIC - Valencia

31
Future
  • Test and deploy the first release of CrossGrid
    middleware.
  • Initiate the security group activities.
  • Policies, guidelines, tracking of problems,
    patches.
  • Support the extension of the testbed to new
    sites.
  • More sites internal to the project.
  • Possible external sites and users (policy
    needed).
  • Support clusters already running other Linux
    flavours.
  • Light installation.
  • Establish a development testbed.
  • Prepare the test and possible migration to EDG
    2.x and Linux 7.x.
  • Study the usage of QoS in CrossGrid.
  • Create a QoS test infrastructure.

32
E N D
LIP
FZK
IFIC
IFCA