Title: LCG experience in Integrating Grid Toolkits
1 LCG experience in Integrating Grid Toolkits
- Zdenek Sekera
- IT Division, CERN
2 Outline
- What is LCG?
- LCG project goals
- Who do we work with?
- Basic LCG milestones
- What is LCG-0?
- What are the plans for LCG-1?
- What is the process to get to LCG-1?
- Conclusions
- Authors: Piera Bettini, Ian Bird, Flavia Donno, Maarten Litmaath, Di Qing, Louis Poncet, Andrea Sciaba, Zdenek Sekera, Marco Serra, David Smith
3 What is LCG?
- LCG: LHC Computing Grid
- Creating the infrastructure for the computing needs of the LHC HEP experiments (ATLAS, CMS, ALICE, LHCb)
- All are worldwide collaborations; consequently, the software has to work worldwide
4 Project Goals
Goal: Prepare and deploy the LHC computing environment to help the experiments analyze the data coming from the detectors
- applications: tools, frameworks, environment, persistency
- computing system → global grid service
- cluster → automated fabric
- collaborating computer centres → grid
- CERN-centric analysis → global analysis environment
- central role of data challenges
This is not another grid technology project; it is a grid deployment project.
5 LCG goals
- The LCG goal is to bring the physics world together by creating a user-friendly, production-quality environment for data processing and physics analysis.
- How? By integrating different grid toolkits or grid middleware into a homogeneous package to guarantee interoperability among different ways of doing things.
- LCG does NOT write grid middleware.
6 What is production quality?
- It is all of the following, in no particular order:
- availability 24 x 7
- performance
- stability, robustness
- user friendliness
- maintainability
- user support
7 Who are our partners?
- LCG is currently a customer of:
- iVDGL: VDT toolkit (including the Globus and Condor toolkits)
- EDG: European DataGrid project
- EDT: European DataTag project (monitoring)
- Globus: underlying software
- GLUE schema: HICB (DataTag and iVDGL) product
- HICB: HEP Intergrid Coordination Board
- LCG can be considered a joint effort of all of them.
- It pulls together the needed components from existing projects.
8 Basic LCG milestones
- LCG is focused on the experiments' batch data challenges (phase 1)
- LCG release milestones:
- February: LCG-0 (deployment test, not publicly available)
- July: LCG-1 (production pilot)
- First publicly available LCG service
- November: LCG-1 (production system)
- Performance release needed for the data challenges in 2004.
- We have a first set of integrated software and currently use it for deployment tests (LCG-0)
9 LCG-0
- The WorldGrid demo last year (e.g. at Supercomputing 2002) demonstrated interoperability between the VDT and EDG toolkits
- We went a step further by integrating VDT and EDG into a single package consisting of:
- EDG 1.4.3 + VDT 1.1.6 + GLUE schema
- Its purpose is to set up the infrastructure and release process rather than Grid functionality itself
- The important task is to determine the work laboratories must do to get integrated
10 LCG-0 deployment
- We are deploying it at Tier 1 centers; the goal is about 10 sites in the initial deployment
- Currently deployed by:
- CERN, RAL, CNAF, Taiwan
- Being installed by:
- FNAL, U of Tokyo
- We expect a few more sites later
11 What did we learn with LCG-0?
- The release process is difficult
- Many site-specific issues (how many service machines, which services run on which machines, security issues)
- Packaging is an issue (USA vs. Europe, rpm vs. tar, pacman vs. LCFG)
- Installation is complicated; we cannot impose our installation tools on sites that already have their own
- Configuration is a major problem; it is far too complex
- Testing is a huge issue; we need to test different architectures, features, networking, interoperability
12 Research? Deployment?
- If we want or need to reconfigure service nodes, all necessary changes are currently propagated to configuration files manually. It cannot stay this way; we have to find much more automatic ways.
- Compare Linux installation 3-4 years ago and today.
- We must probably go the same way of understanding what the real configuration issues are, only we would like to be much faster.
- The integration of worker nodes must be resolved by the sites themselves; we will help.
- These issues (and more) make the difference between a research project ending with a demo and a product to be used in production.
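The point about manually propagated configuration changes can be illustrated with a minimal sketch: a single site description from which every per-service configuration file is rendered, so a change is made once and the affected files are regenerated. The keys, host names, file names, and file format below are hypothetical and are not the LCG, EDG, or LCFG formats.

```python
from string import Template

# Hypothetical site description: the one place to edit when a node changes.
SITE = {
    "site_name": "example-site",
    "ce_host": "ce.example.org",
    "se_host": "se.example.org",
}

# Hypothetical per-service templates; real middleware files look different.
TEMPLATES = {
    "ce.conf": Template("site=$site_name\nhost=$ce_host\nstorage=$se_host\n"),
    "se.conf": Template("site=$site_name\nhost=$se_host\n"),
}


def render_all(site, templates):
    """Render every configuration file from the single site description."""
    # substitute() raises KeyError on a missing value, so gaps surface early.
    return {name: tpl.substitute(site) for name, tpl in templates.items()}


def changed_files(old, new):
    """Return the sorted names of files whose rendered content differs."""
    return sorted(name for name in new if old.get(name) != new[name])


if __name__ == "__main__":
    before = render_all(SITE, TEMPLATES)
    # Moving the storage service to another host is a one-line change here.
    after = render_all({**SITE, "se_host": "se2.example.org"}, TEMPLATES)
    print(changed_files(before, after))
```

Because both templates mention the storage host, renaming it marks both files as changed, which is the kind of dependency that is easy to miss when files are edited by hand.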
13 What will LCG-1 be?
- Expected to be:
- EDG 2.0 + VDT 1.1.7 (Globus 2.2.4)
- We are just beginning LCG-1.
- We have a Certification Testing testbed at CERN consisting of 40 machines that we can configure as desired.
- Certification Testing is a joint project (e.g. VDT is testing Globus and Condor).
14 LCG-1 Certification Testbed
[Figure: certification testbed sites, labelled "U of Wisconsin" and "40 machines at CERN"]
15 LCG-1 certification
- basic grid functionality
- connectivity
- grid services
- security
- resource brokering
- data management (replication, catalog)
- configurability
- error recovery
- real world applications
- site verification suite
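A site verification suite of the kind listed above could be organised as a list of named checks grouped by certification area. The following is only a sketch under that assumption: the `Check` structure, the runner, and the example host and port are hypothetical and are not the actual LCG certification tools.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Check:
    area: str                  # certification area, e.g. "connectivity"
    name: str                  # short human-readable description
    run: Callable[[], bool]    # returns True when the check passes


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Connectivity probe: True if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def run_suite(checks: List[Check]) -> Dict[Tuple[str, str], bool]:
    """Run every check; a crashing check counts as a failure, not an abort."""
    results = {}
    for check in checks:
        try:
            ok = bool(check.run())
        except Exception:
            ok = False
        results[(check.area, check.name)] = ok
    return results


if __name__ == "__main__":
    # Hypothetical host and port; substitute a real service endpoint.
    suite = [
        Check("connectivity", "service reachable",
              lambda: port_open("ce.example.org", 2119)),
    ]
    for (area, name), ok in run_suite(suite).items():
        print(f"{area:15} {name:25} {'PASS' if ok else 'FAIL'}")
```

Treating an exception as a failed check keeps one broken probe from hiding the results of the others, which matters when the same suite is run unattended at many sites.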
16 Test and Validation process
[Figure: test and validation process diagram; recoverable labels listed below]
- Stages: Unit Test, Build, Integration, Certification, Production
- Environments: developers' machines, build system, Development Testbed (15 CPUs), Certification Testbed (40 CPUs), Production
- Participants: WPs, Integration Team, Test Group, Appl. Representatives, Users
- Activities: WPs add unit-tested code to the CVS repository; run nightly build and automated tests; individual WP tests; overall release tests; grid certification; application certification; fix problems
- Artifacts: tagged package, tagged releases, release candidates, certified releases
- Selection steps: tagged release selected for certification; certified release selected for deployment; certified public release for use by applications
- Support: office hours, 24x7; anomaly reports in Bugzilla
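The flow through the stages named on this slide can be sketched as a simple promotion model in which a release advances one stage at a time and stays where it is when its tests fail. The stage names come from the slide; the strict linear ordering, the `Release` class, and the tag name are assumptions made for illustration.

```python
from enum import Enum


class Stage(Enum):
    # Stage names from the slide; the linear ordering is an assumption.
    UNIT_TEST = 1
    BUILD = 2
    INTEGRATION = 3
    CERTIFICATION = 4
    PRODUCTION = 5


class Release:
    """Hypothetical record of one tagged release moving through the stages."""

    def __init__(self, tag: str):
        self.tag = tag
        self.stage = Stage.UNIT_TEST
        self.history = [Stage.UNIT_TEST]

    def promote(self, tests_passed: bool) -> Stage:
        """Advance one stage if tests passed; otherwise stay put for fixes."""
        if self.stage is Stage.PRODUCTION:
            raise ValueError(f"{self.tag} is already in production")
        if tests_passed:
            self.stage = Stage(self.stage.value + 1)
            self.history.append(self.stage)
        return self.stage


if __name__ == "__main__":
    release = Release("example-tag")   # hypothetical tag name
    release.promote(True)              # unit tests pass, move to build
    release.promote(False)             # build tests fail, stay for fixes
    print(release.stage.name, [s.name for s in release.history])
```

A failed promotion leaves the release in place rather than sending it backwards, mirroring the "fix problems" loop in the diagram.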
17 Conclusions
- With LCG-0 we have proved that it is possible to integrate different toolkits (VDT and EDG) into a single package that can be repeatably deployed.
- We have learned which issues are difficult (installation, configuration, testing) and will require special attention for LCG-1.
- The process for getting the first (July) LCG-1 release out has been set up and is operational.
- Incremental releases will be required to correct problems according to their priorities, up to the November LCG-1 release and possibly beyond.
- There is a long and difficult road ahead towards LCG-1 because of the complex software and the aggressive schedule, but we believe it can be managed with a systematic, determined approach.
- It is a highly collaborative project, so everybody has to contribute.