Title: Jürgen Knobloch, CERN IT
1. The LHC Computing Grid Project - Status and Plans
LCG
- CERN-US Co-operation Committee
- 18 June 2004
- Jürgen Knobloch
- IT Department, CERN
- This file is available at http://cern.ch/lcg/presentations/LCG_JK_DOE_NSF.ppt
2. The LHC Computing Grid Project - LCG
- Collaboration
  - LHC experiments
  - Grid projects in Europe and the US
  - Regional and national centres worldwide
- Choices
  - Adopt Grid technology
  - Go for a Tier hierarchy (see the sketch after the diagram below)
  - Use Intel CPUs in standard PCs
  - Use the Linux operating system
- Goal
  - Prepare and deploy the computing environment for the analysis of data from the LHC detectors
[Diagram: the LCG Tier hierarchy - CERN Tier-0 (also a Tier-1 centre); Tier-1 centres in the UK, France, Italy, Germany, Japan and the US (FNAL, BNL); Tier-2 centres at universities and laboratories; Tier-3 physics-department clusters and desktops; overlapping grids serving regional groups and physics study groups.]
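To make the Tier hierarchy chosen above concrete, here is a minimal Python sketch that models the diagram as a small tree of centres. The Tier-0 and Tier-1 names are taken from the diagram; which universities and laboratories attach to which Tier-1 is purely illustrative, and no real capacities are implied.

```python
# Minimal sketch of the LCG Tier model shown in the diagram above.
# Tier-0/Tier-1 names follow the diagram; the example Tier-2/Tier-3
# attachments are illustrative only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Centre:
    name: str
    tier: int                      # 0 = CERN, 1 = regional, 2 = university/lab, 3 = department
    children: List["Centre"] = field(default_factory=list)

cern = Centre("CERN Tier-0 (also a Tier-1)", 0, [
    Centre("UK Tier-1", 1, [Centre("Uni x (Tier-2)", 2), Centre("Lab a (Tier-3)", 3)]),
    Centre("France Tier-1", 1),
    Centre("Italy Tier-1", 1),
    Centre("Germany Tier-1", 1),
    Centre("Japan Tier-1", 1),
    Centre("USA FNAL Tier-1", 1),
    Centre("USA BNL Tier-1", 1),
])

def walk(centre: Centre, depth: int = 0) -> None:
    """Print the hierarchy with indentation showing who serves whom."""
    print("  " * depth + centre.name)
    for child in centre.children:
        walk(child, depth + 1)

walk(cern)
```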
3. Operational Management of the Project
- Applications Area: development environment, joint projects, data management, distributed analysis
- Middleware Area (now EGEE): provision of a base set of grid middleware - acquisition, development, integration, testing, support
- ARDA: A Realisation of Distributed Analysis for LHC
- CERN Fabric Area: large cluster management, data recording, cluster technology, networking, computing service at CERN
- Grid Deployment Area: establishing and managing the Grid Service - middleware certification, security, operations, registration, authorisation, accounting
- Joint with a new European project: EGEE (Enabling Grids for e-Science in Europe)
- Phase 1 (2002-05): development of common software, prototyping and operation of a pilot computing service
- Phase 2 (2006-08): acquire, build and operate the LHC computing service
4. Sites in LCG-2/EGEE-0 - 18 June 2004
- 22 Countries
- 62 sites (48 in Europe, 2 in the US, 5 in Canada, 6 in Asia, 1 HP)
- Coming: New Zealand, China, other HP sites (Brazil, Singapore)
- 4000 CPUs
5. LCG-2 sites - status on 17 June 2004
6. LCG Service Status
- Certification and distribution process established (sketched after this list)
  - Middleware package combines components from the European DataGrid (EDG) and from the US (Globus, Condor, PPDG, GriPhyN) → the Virtual Data Toolkit
- Principles for registration and security agreed
- Grid Operations Centre at Rutherford Lab (UK); a second centre is coming online at Academia Sinica in Taipei
- Call Centre at FZK (Karlsruhe)
- LCG-2 software released in February 2004
- 62 centres connected, with more than 4000 processors
- Four collaborations run data challenges on the grid
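As a purely illustrative sketch of the certification and distribution process mentioned above (and of the integration, certification and site-validation steps on the later "Grid operation" slide), the Python snippet below strings the stages named on the slides into a simple pipeline. The stage functions and their behaviour are hypothetical placeholders, not the actual LCG tooling.

```python
# Illustrative only: the staged release process named on the slides
# (integrate components -> certify -> distribute -> install and validate
# at sites). The functions are hypothetical placeholders.

from typing import Callable, List

Stage = Callable[[str], None]

def integrate(release: str) -> None:
    print(f"{release}: integrate EDG and VDT components into one package")

def certify(release: str) -> None:
    print(f"{release}: run certification tests before release")

def distribute(release: str) -> None:
    print(f"{release}: distribute the certified package to sites")

def install_and_validate(release: str) -> None:
    print(f"{release}: install at a new site and validate it before it joins")

PIPELINE: List[Stage] = [integrate, certify, distribute, install_and_validate]

def deploy(release: str) -> None:
    """Run a release through every stage in order."""
    for stage in PIPELINE:
        stage(release)

deploy("LCG-2")
```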
7. Data challenges
- The four LHC experiments are currently running data challenges using the LHC Computing Grid
- Part 1 (now): world-wide production of simulated data
  - Job submission, resource allocation and monitoring
  - Catalogue of distributed data
- Part 2 (summer 2004): test of Tier-0 operation
  - Continuous (24 x 7) recording of data at up to 450 MB/s per experiment (target for ALICE in 2005: 750 MB/s); see the volume estimate after this list
  - First-pass data reconstruction and analysis
  - Distribution of data in real time to Tier-1 centres
- Part 3 (fall 2004): distributed analysis on the Grid
  - Access to data from anywhere in the world, in organized as well as chaotic access patterns
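To put the Part 2 recording rates into perspective, here is a back-of-envelope estimate of the data volumes they imply under sustained 24 x 7 operation. It uses decimal units and only the rates quoted above; it is not an official LCG figure.

```python
# Rough data-volume estimate from the Tier-0 recording rates quoted above
# (450 MB/s per experiment in 2004; 750 MB/s target for ALICE in 2005).
# Decimal units: 1 TB = 1e6 MB.

SECONDS_PER_DAY = 24 * 3600

def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Terabytes accumulated per day at a sustained rate in MB/s."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1e6

for label, rate in [("2004 target per experiment", 450.0),
                    ("ALICE target for 2005", 750.0)]:
    print(f"{label}: {rate:.0f} MB/s -> {daily_volume_tb(rate):.1f} TB/day")

# 450 MB/s corresponds to roughly 39 TB/day; 750 MB/s to roughly 65 TB/day.
```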
8. Grid operation
- After 3 months of intensive use, the basic middleware of LCG-2 is proving to be reliable, although there are many open issues of functionality, scaling, performance and scheduling
- The grid deployment process is working well
  - Integration, certification, debugging
  - Installation and validation of new sites
- Constructive and useful feedback from the first data challenges, especially on data issues from CMS
- Interoperability with Grid3 in the US is being studied (presentation by FNAL at the Grid Deployment Board in May)
- Implementation of a common interface to Mass Storage completed or planned at all Tier-1 centres (illustrated after this list)
- Proposal for convergence on a single Linux flavour based on the FNAL Scientific Linux package
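The point of the common Mass Storage interface mentioned above is that experiment code is written against one API while each Tier-1 centre plugs in its own storage back end. The sketch below only illustrates that idea: the method names and the CASTOR-style adapter are hypothetical and do not represent the interface actually adopted.

```python
# Illustrative sketch of a "common interface to Mass Storage": one abstract
# API, with each Tier-1 supplying its own back-end adapter. Method names and
# the example adapter are hypothetical.

from abc import ABC, abstractmethod
from typing import List

class MassStorage(ABC):
    @abstractmethod
    def put(self, local_path: str, grid_path: str) -> None: ...

    @abstractmethod
    def get(self, grid_path: str, local_path: str) -> None: ...

class TapeArchiveStorage(MassStorage):
    """Hypothetical adapter for a CASTOR-style tape/disk system at one site."""
    def put(self, local_path: str, grid_path: str) -> None:
        print(f"archive {local_path} as {grid_path}")

    def get(self, grid_path: str, local_path: str) -> None:
        print(f"stage {grid_path} back to {local_path}")

def archive_run(storage: MassStorage, files: List[str]) -> None:
    # Experiment code sees only the common interface, not the site specifics.
    for f in files:
        storage.put(f, f"/grid/data/{f}")

archive_run(TapeArchiveStorage(), ["run001.raw", "run002.raw"])
```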
9. Service Challenges
- Exercise the operations and support infrastructure
- Gain experience in service management
- Uncover problems with long-term operation
- Focus on
  - Data management, batch production and analysis
  - Reliable data transfer
  - Integration of high-bandwidth networking
  - Also massive job submission and testing the reaction to security incidents
- Targets by end 2004
  - Robust and reliable data management services in continuous operation between CERN, Tier-1 and large Tier-2 centres
  - Sufficient experience with sustained high-performance data transfer to guide wide-area network planning
- The Service Challenges complement the experiment Data Challenges
10. Networking
- A key element of the LHC computing strategy
- Aiming at sustained 500 MB/s by end 2004, requiring 10 Gb/s networks to some Tier-1 centres, based on existing facilities (see the unit check below)
- 5.44 Gb/s (1.1 TB in 30 min); 6.25 Gb/s (20 April 2004)
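A quick unit check, using only the numbers quoted on this slide and decimal units, shows why sustained 500 MB/s points to 10 Gb/s links and that the transfer figures above are self-consistent.

```python
# Unit check for the networking targets quoted above (decimal units).

def mb_per_s_to_gb_per_s(rate_mb_s: float) -> float:
    """Convert a byte rate in MB/s to a bit rate in Gb/s."""
    return rate_mb_s * 8 / 1000

# 500 MB/s of payload is 4 Gb/s, which is why 10 Gb/s links (leaving
# headroom for protocol overhead and other traffic) are needed to some
# Tier-1 centres.
print(f"500 MB/s = {mb_per_s_to_gb_per_s(500):.1f} Gb/s")

# Sanity check of the quoted 5.44 Gb/s figure over 30 minutes:
volume_tb = 5.44 / 8 * 30 * 60 / 1000   # Gb/s -> GB/s, times seconds, then GB -> TB
print(f"5.44 Gb/s for 30 min = {volume_tb:.2f} TB")
# About 1.2 TB, consistent with the quoted 1.1 TB if that figure is in binary units.
```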
11. LCG-2 and Next-Generation Middleware
- LCG-2: focus on production and large-scale data handling
  - The service for the 2004 data challenges
  - Provides experience in operating and managing a global grid service
  - Strong development programme driven by data-challenge experience
  - Evolves to LCG-3 as components are progressively replaced with new middleware
- Next-generation middleware: focus on analysis
  - Developed by the EGEE project in collaboration with VDT (US)
  - LHC applications and users closely involved in prototyping and development (ARDA project)
  - Short development cycles
  - Completed components integrated into LCG-2
12. US participation - some examples
- US LHC funds Miron Livny's work on ARDA/EGEE middleware
- US CMS is working on end-to-end ARDA prototypes for the fall
- US CMS will participate in the CCS milestone for analysis across grids
- NSF funds the VDT/EGEE joint effort
- Wisconsin certification testbeds for national NMI testing, VDT, LCG and EGEE middleware, etc., encourage coherence across the middleware versions
- PPDG helps fund the ATLAS ARDA representative
- The PPDG extension proposal is funded, enabling some immediate attention to Open Science Grid needs
- The Open Science Grid Technical Groups' scope explicitly includes cooperation with the EGEE/LCG peer groups
- A single Linux flavour based on the FNAL Scientific Linux package
13. Phase 2 Planning Outline
- June 2003: establish editorial board for the LCG TDR
- September 2004: consolidated Tier-1, Tier-2 Regional Centre plan
  - Background for the draft MoU → October C-RRB
  - Revised version of the basic computing models
  - Revised estimates of overall Tier-1 and Tier-2 resources
  - Current state of commitments of resources: Regional Centres → Experiments
  - High-level plan for ramping up the Tier-1 and large Tier-2 centres
- October 2004: C-RRB - draft MoU
- End 2004: initial computing models agreed by experiments
- April 2005: C-RRB - final MoU
- End June 2005: TDR
14. Planning groups
- High-level planning group (chair: Les Robertson)
  - Overall picture and implied resources at centres
  - Representatives from the experiments and LCG management
  - Representatives from three regional centres: Italy, Spain, US
- Grid Deployment Area steering group (chair: Ian Bird)
  - Interoperability, resources, overview of Service Challenges
  - Representatives from regional centres
- Service challenge group (chair: Bernd Panzer)
  - Carrying out the service challenges
  - People working on service challenges
- Networking (chair: David Foster)
  - Working towards the required end-to-end bandwidth
- MoU taskforce (chair: David Jacobs)
  - Drafting the LCG MoU and one MoU for each of the four experiments
  - Representatives from experiments, six countries and LCG management
- TDR editorial board (chair: Jürgen Knobloch)
15. 102 FTE-years missing for Phase 2
16. Preparing for 2007
- 2003 demonstrated event production
- In 2004 we must show that we can also handle the data, even if the computing model is very simple
  - This is a key goal of the 2004 Data Challenges and Service Challenges
- Targets for the end of this year
  - Basic model demonstrated using current grid middleware
  - All Tier-1s and 25% of the Tier-2s operating a reliable service
  - Validate the security model, understand the storage model
  - Clear idea of the performance, scaling and management issues
Progressive evolution
17. Running on several Grids
Aiming for common interfaces and interoperability
18. Conclusions
- We are now in a position to move from an operational environment towards a computing system for LHC by
  - Completing functionality
  - Improving reliability
  - Ramping up capacity and performance
- Data challenges and service challenges are essential to keep us on the right track
- All partners are committed to achieving interoperability of the grids involved
- We are grateful for the very significant contributions from, and the effective collaboration with, the US
- The full funding for LCG Phase 2 still needs to be secured