Title: UK Plans for LHC Grid
1. UK Plans for LHC Grid
- John Gordon
- HEP-CCC, Bologna
- June 2001
2. LHC Computing Model
Diagram: the hierarchical LHC computing model. CERN sits at the centre, feeding national Tier-1 centres (e.g. FermiLab and Brookhaven in the USA, plus centres in the UK, France, Italy, Germany and the Netherlands), which in turn serve Tier-2 centres at labs and universities, down to physics department resources and desktops.
3. UK Grid - not just LHC
4. US experiments - Grid Plans
5. BaBar
- 8 x 80-CPU farms
- 10 sites with 12 TB of disk and Sun servers
- simulation
- data mirroring from SLAC - Kanga, Objectivity
- data movement and mirroring across the UK
- data location discovery across the UK - MySQL
- remote job submission - Globus and PBS
- common usernames across the UK - GSI grid-mapfiles
- find data - submit job to the data - register output (see the sketch below)
- BaBar is planning a distributed computing model
- Tier-A centres
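A minimal sketch of that find / submit / register loop, assuming a hypothetical UK-wide MySQL catalogue table file_catalogue(lfn, site, path) and GSI-authenticated submission through the standard Globus globus-job-submit command to a PBS jobmanager. The host names, table schema and analysis executable are illustrative, not the actual BaBar setup:

    import subprocess
    import MySQLdb  # classic Python MySQL bindings

    CATALOGUE = "catalogue.example.ac.uk"   # hypothetical catalogue host

    def find_data(lfn):
        # Look the logical file name up in the UK-wide catalogue.
        db = MySQLdb.connect(host=CATALOGUE, user="reader", db="babar")
        cur = db.cursor()
        cur.execute("SELECT site, path FROM file_catalogue WHERE lfn = %s", (lfn,))
        return cur.fetchone()   # e.g. ("ral.ac.uk", "/data/babar/run123.root")

    def submit_to_data(site, path):
        # Run the job where the data lives: Globus gatekeeper + PBS jobmanager.
        # GSI maps the user's certificate to a local account via the grid-mapfile.
        return subprocess.call(["globus-job-submit", site + "/jobmanager-pbs",
                                "/usr/local/bin/analyse", path])

    def register_output(lfn, site, path):
        # Record the produced file in the catalogue so others can find it.
        db = MySQLdb.connect(host=CATALOGUE, user="writer", db="babar")
        db.cursor().execute("INSERT INTO file_catalogue (lfn, site, path) "
                            "VALUES (%s, %s, %s)", (lfn, site, path))
        db.commit()

    site, path = find_data("run123.root")
    submit_to_data(site, path)
    register_output("run123-ntuple.root", site, "/data/babar/out/run123-ntuple.root")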
6. CDF
- Similar model to BaBar, with disk and CPU resources at RAL and universities, plus a farm for simulation
- Development of Grid access to CDF databases led by the UK
- Data replication from FNAL to the UK and around the UK (see the sketch below)
- Data location discovery through metadata
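The slides do not say which transfer tool CDF uses for this; purely as an illustration, here is a sketch of FNAL-to-UK replication built on the standard Globus globus-url-copy GridFTP client. The endpoint hosts and paths are hypothetical, and a valid GSI proxy (grid-proxy-init) is assumed:

    import subprocess

    # Hypothetical GridFTP endpoints; real CDF hosts and paths will differ.
    SOURCE = "gsiftp://fcdfdata.fnal.gov/cdf/raw/"
    UK_MIRRORS = ["gsiftp://gridftp.ral.ac.uk/cdf/raw/",
                  "gsiftp://gridftp.gla.ac.uk/cdf/raw/"]

    def replicate(filename):
        # Copy one file from FNAL to each UK mirror over GridFTP.
        for mirror in UK_MIRRORS:
            rc = subprocess.call(["globus-url-copy",
                                  SOURCE + filename, mirror + filename])
            if rc != 0:
                print("transfer to %s failed (rc=%d)" % (mirror, rc))

    replicate("run100001.dat")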
7. D0
- Large data centre at Lancaster
- ship data from FNAL to the UK
- simulation in the UK, shipping data back to FNAL
- Gridify SAM access to data
- data at FNAL and Lancaster
8. GridPP History
Collaboration formed by all UK PP experimental groups in 1999 to submit a £5.9M JIF bid for a prototype Tier-1 centre at RAL (later withdrawn).
Added some Tier-2 support to become part of the PPARC LTSR "The LHC Computing Challenge", input to SR2000.
From Jan 2001, handling PPARC's commitment to the EU DataGrid.
Formed GridPP in Dec 2000; it includes CERN, CLRC and the UK PP theory groups.
10. UK Strengths
- Wish to build on UK strengths:
- Information Services
- Networking - world leaders in monitoring
- Security
- Mass Storage
- UK major Grid leadership roles:
- Lead DataGrid Architecture Task Force (Steve Fisher)
- Lead DataGrid WP3 Information Services (Robin Middleton)
- Lead DataGrid WP5 Mass Storage (John Gordon)
- Strong networking role in WP7 (Peter Clarke, Robin Tasker)
- ATLAS Software Coordinator (Norman McCubbin)
- LHCb Grid Coordinator (Frank Harris)
- Strong UK collaboration with Globus:
- Globus people gave a 2-day tutorial at RAL to the PP community
- Carl Kesselman attended a UK Grid technical meeting
- 3 UK people visited Globus at Argonne
Natural UK collaboration with US PPDG and GriPhyN
11. Proposal Summary
- £40M 3-year programme
- LHC Computing Challenge
- Grid technology
- Five components:
- Foundation
- Production
- Middleware
- Exploitation
- Value-added Exploitation
- Emphasis on Grid services and core middleware
- Integrated with EU DataGrid, PPDG and GriPhyN
- Facilities at CERN, RAL and up to four UK Tier-2 sites
- Centres, Dissemination
- LHC developments integrated into the current programme (BaBar, CDF, D0, ...)
- Robust management structure
- Deliverables in March 2002, 2003, 2004
Approved in principle, May 2001. Financial details to be confirmed.
12. GridPP Workgroups
Technical work is broken down into several workgroups, with broad overlap with the EU DataGrid:
A - Workload Management: provision of software that schedules application processing requests amongst resources
B - Information Services and Data Management: provision of software tools to provide flexible, transparent and reliable access to the data
C - Monitoring Services: all aspects of monitoring Grid services
D - Fabric Management and Mass Storage: integration of heterogeneous resources into a common Grid framework
E - Security: security mechanisms from Certification Authorities to low-level components
F - Networking: network fabric provision through to integration of network services into middleware
G - Prototype Grid: implementation of a UK Grid prototype tying together new and existing facilities
H - Software Support: provision of services to enable the development, testing and deployment of middleware and applications at institutes
I - Experimental Objectives: responsible for ensuring the development of GridPP is driven by the needs of UK PP experiments
J - Dissemination: ensure good dissemination of developments arising from GridPP into other communities and vice versa
13. Major Deliverables
- Prototype I - March 2002
- Performance and scalability testing of components
- Testing of the job scheduling and data replication software from the first DataGrid release
- Prototype II - March 2003
- Prototyping of the integrated local computing fabric, with emphasis on scaling, reliability and resilience to errors
- Performance testing of LHC applications
- Distributed HEP and other science application models using the second DataGrid release
- Prototype III - March 2004
- Full-scale testing of the LHC computing model, with fabric management and Grid management software for Tier-0 and Tier-1 centres, with some Tier-2 components
14. GridPP and CERN
This investment will:
- Allow operation of a production-quality prototype of the distributed model prior to acquisition of the final LHC configuration
- Train staff for the management and operation of distributed computing centres
- Provide an excellent training ground for young people
- Enable the technology to be re-used by other sciences and industry
15. GridPP and CERN
For the UK to exploit the LHC to the full requires substantial investment at CERN to support LHC computing.
- UK involvement through GridPP will boost CERN investment in key areas:
- Fabric management software
- Grid security
- Grid data management
- Networking
- Adaptation of physics applications
- Computer Centre fabric (Tier-0)
16. GridPP Collaboration Meeting
1st GridPP Collaboration Meeting - Cosener's House, 24/25 May 2001
17. Current UK testbed
18. UK Sites
Map of UK sites. Legend: clusters - Scotland, North West, Midlands, London; testbed site (integrated).
Sites: Glasgow, Edinburgh, Durham, Lancaster, Liverpool, Manchester, Dublin, Sheffield, Birmingham, Oxford, Cambridge, RAL, QMUL, UCL, IC, Brunel, RHUL, Bristol
19. Globus MDS Explorer
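This slide presumably shows a browser view of the Globus MDS information service; as a rough illustration of the kind of query such a tool issues, here is a minimal sketch that searches an MDS GRIS/GIIS index over LDAP using python-ldap. Port 2135 and the "Mds-Vo-name=local, o=Grid" base DN are common Globus defaults; the host name is hypothetical:

    import ldap  # python-ldap bindings

    # Hypothetical GIIS host; 2135 and the base DN are common MDS defaults.
    SERVER = "ldap://giis.gridpp.example.ac.uk:2135"
    BASE = "Mds-Vo-name=local, o=Grid"

    con = ldap.initialize(SERVER)
    # Anonymous subtree search: list every entry the index server publishes.
    for dn, attrs in con.search_s(BASE, ldap.SCOPE_SUBTREE, "(objectclass=*)"):
        print(dn)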
20. External Resources
External funds (additional to PPARC grants and central facilities) have provided computing equipment for several experiments and institutes:
- BaBar (Birmingham, Bristol, Brunel, Edinburgh): 12 TB disk, 10 Suns; (Imperial, Liverpool, Manchester, QMUL, RAL, RHUL): 8 Linux farms
- MAP (Liverpool): 300-node farm
- ScotGrid (Edinburgh, Glasgow): farm, disk, tape
- D0 (Lancaster): 200-node farm, 30-200 TB tape
- Dark Matter (Sheffield): tape
- CDF/MINOS (Glasgow, Liverpool, Oxford, UCL): disk, servers and farm
- CMS (Imperial): farm
- ALICE (Birmingham): farm
- Total: £5.4M
All these resources will contribute directly to GridPP.
Many particle physics groups are involved in large SRIF bids in collaboration with other disciplines, mostly to form e-Science centres. The amount of resource available to GridPP from this SRIF round could be several £M.
21. GridPP Organisation
Hardware development is organised around a number of Regional Centres:
- Likely Tier-2 Regional Centres
- Focus for dissemination and collaboration with other disciplines and industry
- Clear mapping onto Core Regional e-Science Centres
Software development is organised around a number of Workgroups.
22. Tier-1/2 Plans
- RAL already has 300 CPUs, 10 TB of disk, and an STK tape silo which can hold 330 TB
- Install significant capacity at RAL this year to meet BaBar Tier-A Centre requirements
- Integrate with worldwide BaBar work
- Integrate with the DataGrid testbed
- Integrate Tier-1 and Tier-2 within GridPP
- Upgrade Tier-2 centres through SRIF (UK university funding programme)
23. Tier-1 Resources
24. Tier-1 Integrated Resources
25. Liverpool
- MAP - 300 CPUs, several TB of disk
- has delivered simulation for LHCb and others for several years
- upgrades of CPUs and storage planned for 2001 and 2002
- currently adding Globus
- will be developed to allow analysis work as well
26. Imperial College
- Currently:
- 180 CPUs
- 4 TB disk
- 2002:
- adding a new cluster in 2002
- shared with Computational Engineering
- 850 nodes
- 20 TB disk
- 24 TB tape
- CMS, BaBar, D0
27. Lancaster
Diagram: Lancaster farm layout - 196 worker CPUs behind switches, two 500 GB bulk servers, controller nodes, and a tape library of 30 TB capacity; not yet fully installed.
Finalising installation of the mass storage system: approximately 2 months.
28. Lancaster
- Currently D0:
- analysis data from FNAL for the UK
- simulation
- Future:
- upgrades planned
- Tier-2 Regional Centre
- ATLAS-specific
29. ScotGrid
- Tendering now
- 128 CPUs at Glasgow
- 5 TB datastore server at Edinburgh
- ATLAS/LHCb
- Plans for future upgrades to 2006
- Linked with the UK Grid National Centre
30. Wider UK Grid
- Prof. Tony Hey leading the Core Grid Programme
- UK National Grid
- National Centre
- 9 Regional Centres
- Computer Science led
- includes many sites with PP links
- Grid Support Centre (CLRC)
- Grid Starter Kit
- version 1 based on Globus, Condor, ...
- Common software
- e-Science Institute
- Grid Network Team
- Strong industrial links
- All research areas have their own e-Science plans
31. Network
- The UK academic network, SuperJANET, entered phase 4 in 2001
- 2.5 Gbit/s backbone, December 2000 - April 2001
- 622 Mbit/s to RAL, April 2001
- Most MANs have plans for 2.5 Gbit/s on their backbones
- Peering with GEANT planned at 2.5 Gbit/s
35. TEN-155 → GEANT
- 2.5 Gbps to 10 Gbps - doubling every year for 4 years
- Consolidated global connectivity
- Geographic expansion
- Managed bandwidth, QoS, VPN
36. Summary
- The UK has plans for a national grid for particle physics
- to deliver the computing for several virtual organisations (LHC and non-LHC)
- Collaboration established, proposal approved, plan in place
- Will deliver:
- the UK commitment to DataGrid
- prototype Tier-1 and Tier-2
- the UK commitment to US experiments
- Work closely with other disciplines