Title: Grids for the LHC
1. Grids for the LHC
- Paula Eerola
- Lund University, Sweden
- Four Seas Conference
- Istanbul
- 5-10 September 2004
Acknowledgement: much of the material is from Ian Bird, Lepton-Photon Symposium 2003, Fermilab.
2. Outline
- Introduction
- What is a Grid?
- Grids and high-energy physics?
- Grid projects
- EGEE
- NorduGrid
- LHC Computing Grid project
- Using grid technology to access and analyse LHC data
- Outlook
3. Introduction
4. About the Grid
- WEB: get information from any computer in the world
- GRID: get CPU resources, disk resources and tape resources on any computer in the world
- Grid needs advanced software, middleware, which connects the computers together
- Grid is the future infrastructure of computing and data management
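To make the role of the middleware concrete, the toy sketch below (Python, purely illustrative and not any real Grid middleware) shows the matchmaking step a resource broker performs: a job states its CPU and disk requirements and the broker picks a site that can satisfy them. All names and numbers are invented for illustration.

    # Toy illustration of Grid "matchmaking": a broker picks a site whose free
    # resources satisfy a job's requirements. Not real middleware - the class
    # and site names are invented for this example.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Site:
        name: str
        free_cpus: int
        free_disk_tb: float

    @dataclass
    class Job:
        cpus: int
        disk_tb: float

    def match(job: Job, sites: List[Site]) -> Optional[Site]:
        """Return the first site that can run the job, or None if no site fits."""
        for site in sites:
            if site.free_cpus >= job.cpus and site.free_disk_tb >= job.disk_tb:
                return site
        return None

    sites = [Site("Lund", 64, 2.0), Site("CERN", 1000, 50.0)]
    chosen = match(Job(cpus=200, disk_tb=5.0), sites)
    print(chosen.name if chosen else "no site found")  # -> CERN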
5. Short history
- 1996: start of the Globus project for connecting US supercomputers together (funded by the US Defense Advanced Research Projects Agency...)
- 1998: early Grid testbeds in the USA - supercomputing centers connected together
- 1998: Ian Foster and Carl Kesselman publish "The Grid: Blueprint for a New Computing Infrastructure"
- 2000: PC capacity increases and prices drop → supercomputers become obsolete → Grid focus moves from supercomputers to PC clusters
- 1990s the WEB, 2000s the GRID?
- Huge commercial interest: IBM, HP, Intel, ...
6. Grid prerequisites
- Powerful PCs are cheap
- PC-clusters are everywhere
- Networks are improving even faster than CPUs
- Network, storage and computing exponentials:
- CPU performance (transistors) doubles every 18 months
- Data storage (bits per area) doubles every 12 months
- Network capacity (bits per sec) doubles every 9 months
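As a back-of-the-envelope illustration of these doubling times (using only the figures quoted above), the growth factor after t months for a quantity that doubles every d months is 2^(t/d); over five years this gives roughly a factor 10 for CPU, 32 for storage and 100 for networks:

    # Growth factor after `months` for a quantity doubling every `doubling_months`,
    # using only the doubling times quoted on this slide.
    def growth(months: float, doubling_months: float) -> float:
        return 2 ** (months / doubling_months)

    years = 5
    for name, d in [("CPU (18-month doubling)", 18),
                    ("Storage (12-month doubling)", 12),
                    ("Network (9-month doubling)", 9)]:
        print(f"{name}: x{growth(12 * years, d):.0f} in {years} years")
    # CPU: x10, Storage: x32, Network: x102 (approximately)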
7. Grids and high-energy physics?
- The Large Hadron Collider, LHC, starts in 2007
- 4 experiments - ATLAS, CMS, ALICE, LHCb - with physicists from all over the world
- LHC computing: data processing, data storage, production of simulated data
- LHC computing is of unprecedented scale: a massive data flow - the 4 experiments will accumulate 5-8 PetaBytes of data/year
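For a sense of scale, the quoted accumulation rate corresponds to a sustained average of roughly 150-250 MB/s when spread evenly over a year (a rough estimate derived only from the 5-8 PB/year figure; instantaneous rates during data taking are much higher):

    # Rough average data rate implied by 5-8 PB/year (illustrative arithmetic only).
    PB = 1e15                          # bytes, decimal petabyte
    SECONDS_PER_YEAR = 365 * 24 * 3600

    for pb_per_year in (5, 8):
        rate_mb_per_s = pb_per_year * PB / SECONDS_PER_YEAR / 1e6
        print(f"{pb_per_year} PB/year is about {rate_mb_per_s:.0f} MB/s averaged over the year")
    # 5 PB/year ~ 159 MB/s, 8 PB/year ~ 254 MB/s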
8. Needed capacity
- Storage: 10 PetaBytes of disk and tape
- Processing: 100,000 of today's fastest PCs
- World-wide data analysis
- Physicists are located on all continents
- Computing must be distributed, for many reasons:
- Not feasible to put all the capacity in one place
- Political, economic, staffing: easier to get funding for resources in the home country
- Faster access to data for all physicists around the world
- Better sharing of computing resources, required by physicists
9. LHC Computing Hierarchy
Tier 0: CERN. Tier 0 receives the raw data from the experiments and records it on permanent mass storage. First-pass reconstruction of the data produces summary data.
Tier 1 centres: large computer centres (about 10). Tier 1s provide permanent storage and management of the raw, summary and other data needed during the analysis process.
Tier 2 centres: smaller computer centres (several tens). Tier 2 centres provide disk storage and concentrate on simulation and end-user analysis.
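A minimal sketch of how this hierarchy could be represented in code (purely illustrative: the tier roles follow the slide, while the centre names and the structure are assumed):

    # Illustrative model of the tiered LHC computing hierarchy described above.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Centre:
        name: str
        tier: int
        roles: List[str]
        children: List["Centre"] = field(default_factory=list)

    tier2 = Centre("Example Tier 2", 2,
                   ["disk storage", "simulation", "end-user analysis"])
    tier1 = Centre("Example Tier 1", 1,
                   ["permanent storage", "management of raw and summary data"],
                   children=[tier2])
    tier0 = Centre("CERN", 0,
                   ["record raw data on mass storage", "first-pass reconstruction"],
                   children=[tier1])

    def show(centre: Centre, indent: int = 0) -> None:
        print(" " * indent + f"Tier {centre.tier}: {centre.name} ({'; '.join(centre.roles)})")
        for child in centre.children:
            show(child, indent + 2)

    show(tier0)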
10. Grid technology as a solution
- Grid technology can provide optimized access to, and use of, the computing and storage resources
- Several currently running HEP experiments (BaBar, CDF/D0, STAR/PHENIX), with significant data and computing requirements, have already started to deploy grid-based solutions
- Grid technology is not yet an off-the-shelf product → it requires development of middleware, protocols, services, ...
Grid development and engineering projects: EDG, EGEE, NorduGrid, Grid3, ...
11. Grid projects
12. US, Asia, Australia
- USA
- NASA Information Power Grid
- DOE Science Grid
- NSF National Virtual Observatory
- NSF GriPhyN
- DOE Particle Physics Data Grid
- NSF TeraGrid
- DOE ASCI Grid
- DOE Earth Systems Grid
- DARPA CoABS Grid
- NEESGrid
- DOH BIRN
- NSF iVDGL
- Asia, Australia
- Australia: ECOGRID, GRIDBUS, ...
- Japan: BIOGRID, NAREGI, ...
- South Korea: National Grid Basic Plan, Grid Forum Korea, ...
13. Europe
- EGEE
- NorduGrid
- EDG, LCG
- UK GridPP
- INFN Grid, Italy
- Cross-grid projects in order to link together Grid projects
- Many Grid projects have particle physics as the initiator
- Other fields are joining in: healthcare, bioinformatics, ...
- Address different aspects of grids:
- Middleware
- Infrastructure
- Networking, cross-Atlantic interoperation
14. A seamless international Grid infrastructure to provide researchers in academia and industry with a distributed computing facility
PARTNERS: 70 partners organized in nine regional federations; coordinating and lead partner CERN. Regions: Central Europe, France, Germany and Switzerland, Italy, Ireland and UK, Northern Europe, South-East Europe, South-West Europe, Russia, USA.
- STRATEGY
- Leverage current and planned national and regional Grid programmes
- Build on existing investments in Grid technology by the EU and US
- Exploit the international dimensions of the HEP-LCG programme
- Make the most of planned collaboration with the NSF CyberInfrastructure initiative
- ACTIVITY AREAS
- SERVICES
- Deliver production-level grid services (manageable, robust, resilient to failure)
- Ensure security and scalability
- MIDDLEWARE
- Professional Grid middleware re-engineering activity in support of the production services
- NETWORKING
- Proactively market Grid services to new research communities in academia and industry
- Provide necessary education
15. EGEE goals and partners
- Create a Europe-wide Grid infrastructure for the support of research in all scientific areas, on top of the EU research network infrastructure (GEANT)
- Integrate regional grid efforts
- 9 regional federations covering 70 partners in 26 countries
- http://public.eu-egee.org/
16. EGEE project
- Project funded by EU FP6, 32 MEuro for 2 years
- Project start 1 April 2004
- Activities
- Grid Infrastructure: provide a Grid service for science research
- Next generation of Grid middleware → gLite
- Dissemination, Training and Applications (initially HEP and Bio)
17. EGEE timeline
18. Grid in Scandinavia: the NorduGrid Project
- Nordic Testbed for Wide Area Computing and Data Handling
- www.nordugrid.org
19. NorduGrid original objectives and current status
- Status 2004:
- The project has grown world-wide: nodes in Germany, Slovenia, Australia, ...
- 39 nodes, 3500 CPUs
- Created its own NorduGrid middleware, ARC (Advanced Resource Connector), which is operating in a stable way (see the job-description sketch after this slide)
- Applications: massive production of ATLAS simulation and reconstruction
- Other applications: AMANDA simulation, genomics, bioinformatics, visualization (for meteorological data), multimedia applications, ...
- Goals 2001 (project start)
- Introduce the Grid to Scandinavia
- Create a Grid infrastructure in Nordic countries
- Apply available Grid technologies/middleware
- Operate a functional Testbed
- Expose the infrastructure to end-users from different scientific communities
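For flavour, ARC jobs are described in xRSL (extended Resource Specification Language) and submitted with the NorduGrid client tools. The sketch below is written from memory; the xRSL attribute names and the client invocation are assumptions and should be checked against the NorduGrid ARC documentation.

    # Minimal sketch of preparing an ARC job description in xRSL.
    # The xRSL attribute names and the ngsub invocation below are recalled from
    # memory and should be verified against the NorduGrid ARC manual.
    from pathlib import Path

    xrsl = """&
     (executable="run_simulation.sh")
     (jobName="atlas-sim-test")
     (stdout="stdout.txt")
     (stderr="stderr.txt")
     (cpuTime="120 minutes")
    """

    Path("job.xrsl").write_text(xrsl)
    # The job would then be submitted with the classic NorduGrid client, e.g.:
    #   ngsub -f job.xrsl        (assumed invocation)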
20. Current NorduGrid status
21. The LHC Computing Grid, LCG
- The distributed computing environment to analyse the LHC data
- lcg.web.cern.ch
22. LCG goals
- Goal: prepare and deploy the computing environment that will be used to analyse the LHC data
- Phase 1 (2003-2005):
- Build a service prototype
- Gain experience in running a production grid service
- Phase 2 (2006-2008):
- Build and commission the initial LHC computing environment
23. LCG composition and tasks
- The LCG Project is a collaboration of:
- The LHC experiments
- The Regional Computing Centres
- Physics institutes
- Development and operation of a distributed computing service:
- computing and storage resources in computing centres, physics institutes and universities around the world
- a reliable, coherent environment for the experiments
- Support for applications:
- provision of common tools, frameworks, environment, data persistency
24. Resource targets 2004
Centre  CPU (kSI2K)  Disk (TB)  Support (FTE)  Tape (TB)
CERN 700 160 10.0 1000
Czech Rep. 60 5 2.5 5
France 420 81 10.2 540
Germany 207 40 9.0 62
Holland 124 3 4.0 12
Italy 507 60 16.0 100
Japan 220 45 5.0 100
Poland 86 9 5.0 28
Russia 120 30 10.0 40
Taiwan 220 30 4.0 120
Spain 150 30 4.0 100
Sweden 179 40 2.0 40
Switzerland 26 5 2.0 40
UK 1656 226 17.3 295
USA 801 176 15.5 1741
Total 5600 1169 120.0 4223
25. LCG status, September 2004
- Tier 0
- CERN
- Tier 1 Centres
- Brookhaven
- CNAF Bologna
- PIC Barcelona
- Fermilab
- FZK Karlsruhe
- IN2P3 Lyon
- Rutherford (UK)
- Univ. of Tokyo
- CERN
- Tier 2 centers
- South-East Europe: HellasGrid, AUTH, Tel-Aviv, Weizmann
- Prague
- Krakow
- Warsaw
- Moscow region
- Italy
26. LCG status, September 2004
- First production service for the LHC experiments is operational
- Over 70 centres and over 6000 CPUs, although many of these sites are small and cannot run big simulations
- LCG-2 middleware: testing, certification, packaging, configuration, distribution and site validation
- Grid operations centres at RAL and Taipei: performance monitoring and problem solving, 24x7 globally
- Grid call centres at FZK Karlsruhe and Taipei
- Progress towards inter-operation between LCG, NorduGrid and Grid3 (US)
27. Outlook
EU vision of e-infrastructure in Europe
28. Moving towards an e-infrastructure
29. Moving towards an e-infrastructure
30. Summary
- Huge investment in e-science and Grids in Europe: regional, national, cross-national, EU
- Emerging vision of a Europe-wide e-science infrastructure for research
- High Energy Physics is a major application that needs this infrastructure today and is pushing the limits of the technology