Title: GridPP
1GridPP
- LHC Computing Challenge - Methodology?
- Hierarchical Information in a Global Grid Supernet (HIGGS) - Aspiration?
- DataGRID-UK - Aspiration?
- ALL Data Intensive Computation
2Outline
- Starting Points
- Physics Motivation
- The LHC Computing Challenge
- ATLAS
- Data Hierarchy
- GridPP
- First Grid Developments
- Monte Carlo Production
- Grid Tools
- LHC Computing Grid Startup
- Summary
3Starting Point
Something Missing... Mass Generation via the Higgs Boson
Solution: Build a Collider - The Large Hadron Collider at CERN
Problem: Large Datasets - Petabytes per year (a one-mile-high stack of CDs)
4The Higgs Mechanism
Theory: the vacuum potential
Experiment: direct searches and electroweak fits
(figure: vacuum potential, energy axis)
The LHC aims to
1. Discover a Higgs particle
2. Measure its properties, e.g. mass, spin, lifetime, branching ratios.
5LHC - pp collisions at Ecms = 14 TeV
6ATLAS detector
ATLAS: a large international collaboration to find the Higgs (and much more) in the range 0.1 TeV < mH < 1 TeV
The ATLAS experiment is 26 m long, stands 20 m high, weighs 7000 tons and has 200 million read-out channels
7ATLAS Parameters
- Running conditions at startup
- Raw event size ~2 MB (recently revised upwards...)
- 2.7x10^9 event sample -> 5.4 PB/year, before data processing
- Reconstructed events plus Monte Carlo data -> 9 PB/year (2 PB disk)
- CPU: 2M SpecInt95
- CERN alone can handle only a fraction of these resources (see the worked estimate below)
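The yearly volumes above follow directly from the event count and event size quoted on this slide; a minimal sketch of the arithmetic in Python, using only those numbers:

  # Back-of-the-envelope check of the ATLAS startup data volumes quoted above.
  raw_event_size_mb = 2.0      # MB per raw event (recently revised upwards)
  events_per_year = 2.7e9      # triggered event sample per year
  raw_pb_per_year = events_per_year * raw_event_size_mb / 1e9   # MB -> PB (decimal)
  print(f"Raw data: {raw_pb_per_year:.1f} PB/year")              # -> 5.4 PB/year
  # With reconstructed and Monte Carlo data the slide quotes ~9 PB/year in total,
  # of which ~2 PB is expected to live on disk.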
8LHC Computing Challenge
1 TIPS = 25,000 SpecInt95; a PC (1999) is ~15 SpecInt95
- Online System: one bunch crossing per 25 ns, ~100 triggers per second, each event ~1 MB; ~PBytes/sec off the detector, ~100 MBytes/sec to the Offline Farm (~20 TIPS)
- Tier 0: CERN Computer Centre (>20 TIPS) with HPSS mass storage, fed at ~100 MBytes/sec; Gbits/sec links (or air freight) out to the regional centres
- Tier 1: Regional Centres (RAL, US, French and Italian centres), each with HPSS mass storage
- Tier 2: Tier-2 centres (~1 TIPS each), connected at Gbits/sec
- Tier 3: Institute servers (~0.25 TIPS) with a physics data cache, connected at 100-1000 Mbits/sec; physicists work on analysis channels, each institute has ~10 physicists working on one or more channels, and the data for these channels should be cached by the institute server
- Tier 4: Workstations
(A data-model sketch of this hierarchy follows.)
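To keep the tier figures in one place, here is a minimal sketch that encodes the hierarchy above as plain data in Python; the dictionary structure and field names are illustrative, only the numbers come from the slide:

  # Illustrative encoding of the tiered computing model described above.
  TIPS_IN_SPECINT95 = 25_000   # 1 TIPS = 25,000 SpecInt95; a 1999 PC is ~15 SpecInt95

  tiers = {
      "Tier 0": {"site": "CERN Computer Centre", "capacity_tips": 20, "storage": "HPSS",
                 "link": "100 MB/s in from the offline farm, Gbit/s (or air freight) out"},
      "Tier 1": {"site": "Regional Centres (RAL, US, France, Italy)", "storage": "HPSS",
                 "link": "Gbit/s"},
      "Tier 2": {"site": "Tier-2 centres", "capacity_tips": 1, "link": "Gbit/s"},
      "Tier 3": {"site": "Institute servers", "capacity_tips": 0.25,
                 "link": "100-1000 Mbit/s", "role": "physics data cache for analysis channels"},
      "Tier 4": {"site": "Workstations"},
  }

  for name, tier in tiers.items():
      print(name, "-", tier["site"], "-", tier.get("capacity_tips", "n/a"), "TIPS")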
9A Physics Event
- Gated electronics response from a proton-proton collision
- Raw data: hit addresses, digitally converted charges and times
- Marked by a unique code
- proton bunch crossing number, RF bucket
- event number
- Collected, Processed, Analyzed, Archived
- A variety of data objects become associated with the event
- The event migrates through the analysis chain
- may be reprocessed
- selected for various analyses
- replicated to various locations (a sketch of such an event record follows)
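As a concrete picture of the identifiers and payload described above, a minimal sketch of an event record as a Python dataclass; the class and field names are illustrative, not an experiment's actual data model:

  from dataclasses import dataclass, field
  from typing import List

  @dataclass
  class RawHit:
      address: int      # detector channel that fired
      adc_charge: int   # digitally converted charge
      tdc_time: int     # digitally converted time

  @dataclass
  class EventRecord:
      # The unique code described on the slide:
      bunch_crossing: int   # proton bunch crossing number
      rf_bucket: int
      event_number: int
      hits: List[RawHit] = field(default_factory=list)
      # Further data objects (ESD, AOD, tags, ...) become associated with the
      # event as it migrates through the analysis chain.

  evt = EventRecord(bunch_crossing=12345, rf_bucket=7, event_number=1)
  print(evt)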
10Data Structure
Trigger System and Data Acquisition (with Run Conditions) -> Level 3 trigger -> Raw Data + Trigger Tags
Reconstruction (using Calibration Data) -> Event Summary Data (ESD) + Event Tags
REAL and SIMULATED data are required throughout
11Data Hierarchy
RAW, ESD, AOD, TAG
RAW: recorded by the DAQ (triggered events, detector digitisation), ~2 MB/event
12Physics Analysis
(Analysis chain; diagram label: INCREASING DATA FLOW)
- Tier 0/1, collaboration wide: ESD data or Monte Carlo, Event Tags and Calibration Data drive Event Selection against the Raw Data
- Tier 2, analysis groups: Analysis and Skims produce Physics Objects
- Tier 3/4, individual physicists: Physics Analysis of the Physics Objects
13 Making the Grid Work for the Experiments
16GridPP Context
Provide architecture and middleware
- Future LHC Experiments: use the Grid with simulated data
- Running US Experiments: use the Grid with real data
- Build Tier-A/prototype Tier-1 and Tier-2 centres in the UK and join the worldwide effort to develop middleware for the experiments
17SR2000 e-Science Allocation
- DG Research Councils -> E-Science Steering Committee -> Director (Tony Hey); Grid TAG
- Director's Management Role / Director's Awareness and Co-ordination Role
- Generic Challenges: EPSRC (£15m), DTI (£15m)
- Academic Application Support Programme: Research Councils (£74m), DTI (£5m) - PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m), ESRC (£3m), EPSRC (£17m), CLRC (£5m)
- Neil Geddes
- £80m Collaborative projects; Industrial Collaboration (£40m)
18Grid Architecture
For more info: www.globus.org/research/papers/anatomy.pdf
19GridPP
£17m, 3-year project funded by PPARC
- LCG (start-up phase): funding for staff and hardware...
- EDG - UK Contributions
- Architecture
- Testbed-1
- Network Monitoring
- Certificates & Security
- Storage Element
- R-GMA
- LCFG
- FTree
- MDS deployment
- GridSite
- SlashGrid
- Spitfire
- Applications (start-up phase)
- BaBar
- CDF/D0 (SAM)
- ATLAS/LHCb
- CMS
- (ALICE)
- UKQCD
http://www.gridpp.ac.uk
20UK Tier1/A Status
Current setup: 14 dual 1 GHz PIII, 500 MB RAM, 40 GB disks - Compute Element (CE), Storage Element (SE), User Interface (UI), Information Node (IN)
Central Facilities (non-Grid): 250 CPUs, 10 TB disk, 35 TB tape (capacity 330 TB)
Hardware purchase for delivery in March 2002: 156 dual 1.4 GHz, 1 GB RAM, 30 GB disks; 26 disk servers (dual 1.266 GHz), each with 1.9 TB disk; expand the capacity of the tape robot by about 35 TB (see the capacity sketch below)
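A quick sanity check of what the March 2002 purchase adds, using only the figures above; the aggregate totals are derived here in Python, not quoted on the slide:

  # Aggregate capacity of the March 2002 Tier1/A hardware purchase listed above.
  worker_nodes = 156                        # dual 1.4 GHz boxes
  cpus = worker_nodes * 2
  local_disk_tb = worker_nodes * 30 / 1000  # 30 GB per node -> TB
  server_disk_tb = 26 * 1.9                 # 26 disk servers, 1.9 TB each
  tape_expansion_tb = 35
  print(f"{cpus} CPUs, {local_disk_tb:.1f} TB local disk, "
        f"{server_disk_tb:.1f} TB server disk, +{tape_expansion_tb} TB tape")
  # -> 312 CPUs, 4.7 TB local disk, 49.4 TB server disk, +35 TB tape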
21UK Tier-2 Example Site - ScotGRID
- ScotGrid processing nodes at Glasgow
- 59 IBM X Series 330: dual 1 GHz Pentium III, 2 GB memory
- 2 IBM X Series 340: dual 1 GHz Pentium III, 2 GB memory, dual ethernet
- 3 IBM X Series 340: dual 1 GHz Pentium III, 2 GB memory, 100/1000 Mbit/s ethernet
- 1 TB disk
- LTO/Ultrium Tape Library
- Cisco ethernet switches
- ScotGrid storage at Edinburgh
- IBM X Series 370: PIII Xeon, 32 x 512 MB RAM
- 70 x 73.4 GB IBM FC Hot-Swap HDD
- Griddev test rig at Glasgow
- 4 x 233 MHz Pentium II
- BaBar UltraGrid system at Edinburgh
- 4 UltraSparc 80 machines in a rack: 450 MHz CPUs, each with 4 MB cache and 1 GB memory
- Fast Ethernet and MirrorNet switching
- CDF equipment at Glasgow
- 8 x 700 MHz Xeon IBM xSeries 370, 4 GB memory, 1 TB disk
One of (currently) 10 GridPP sites running in the UK
23Network
- Tier1 internal networking will be a hybrid of
- 100 Mbit/s to the nodes of the CPU farms, with 1 Gbit/s up from the switches
- 1 Gbit/s to disk servers
- 1 Gbit/s to tape servers
- UK academic network: SuperJANET4
- 2.5 Gbit/s backbone, upgrading to 20 Gbit/s in 2003
- RAL has 622 Mbit/s into SJ4 (see the transfer-time sketch after this list)
- SJ4 has a 2.5 Gbit/s interconnect to Geant
- New 2.5 Gbit/s link to ESnet and Abilene, just for research users
- UK involved in networking development
- internal: with Cisco on QoS
- external: with DataTAG
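To put the link speeds above in context against the multi-petabyte yearly volumes quoted earlier, a small illustrative calculation per petabyte in Python; it assumes the link is fully and exclusively available, which is clearly optimistic:

  # How long does it take to move 1 PB over the links quoted above?
  def transfer_days(data_pb: float, link_mbit_s: float) -> float:
      bits = data_pb * 1e15 * 8            # decimal petabytes -> bits
      return bits / (link_mbit_s * 1e6) / 86400

  for name, mbit in [("RAL into SJ4 (622 Mbit/s)", 622),
                     ("SJ4 backbone (2.5 Gbit/s)", 2500),
                     ("2003 backbone (20 Gbit/s)", 20000)]:
      print(f"{name}: {transfer_days(1.0, mbit):.0f} days per PB")
  # -> roughly 149, 37 and 5 days per petabyte respectively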
24Distributed MC Production, Today
- Submit jobs remotely via Web
- Transfer data to the CASTOR mass store at CERN
- Update the bookkeeping database (Oracle at CERN)
- Execute on farm
- Data Quality Check on the data stored at CERN
- Monitor performance of the farm via Web (a workflow sketch follows)
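Read as a pipeline, the steps above might be orchestrated roughly as follows; this is a hypothetical Python sketch whose helper functions are printing stubs, not the experiments' actual web submission, CASTOR, Oracle bookkeeping or monitoring interfaces:

  def submit_via_web(spec):          print("submit", spec); return "job-001"
  def execute_on_farm(job_id):       print("run", job_id); return ["evt.0.dat"]
  def copy_to_castor(path):          print("castor <-", path)
  def update_bookkeeping_db(j, p):   print("oracle <-", j, p)
  def quality_check(paths):          print("DQ check", paths)
  def publish_farm_monitoring(j):    print("monitor", j)

  def run_mc_production(spec):
      job_id = submit_via_web(spec)          # submit jobs remotely via Web
      files = execute_on_farm(job_id)        # execute on the farm
      for f in files:
          copy_to_castor(f)                  # transfer data to CASTOR at CERN
          update_bookkeeping_db(job_id, f)   # update the Oracle bookkeeping database
      quality_check(files)                   # data quality check on the stored data
      publish_farm_monitoring(job_id)        # monitor farm performance via Web

  run_mc_production({"channel": "demo", "events": 1000})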
25Validation of Middleware via Distributed MC Production, Tomorrow
Steps, with the EDG work packages expected to provide the middleware:
- Submit jobs remotely via Web - WP 1 job submission tools
- Execute on farm - WP 4 environment
- Transfer data to CASTOR (and HPSS, RAL Datastore) - WP 2 data replication, WP 5 API for mass storage
- Update bookkeeping database - WP 1 tools, WP 2 meta data tools
- Online histogram production using GRID pipes - WP 3 monitoring tools
- Data Quality Check online
- Monitor performance of the farm via Web
26GANGA - Gaudi ANd Grid Alliance
GANGA provides a GUI that links the GAUDI program (JobOptions, Algorithms) to collective resources and Grid services, returning histograms, monitoring information and results.
Making the Grid Work for the Experiments
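For flavour, a hypothetical snippet of what configuring and submitting such a job might look like; this is not the GANGA interface itself, just a Python illustration of the idea of wrapping GAUDI job options behind a job object that targets Grid services:

  # Purely illustrative stand-in for a GANGA-style job object.
  class GridJob:
      def __init__(self, application, options, backend="Grid"):
          self.application, self.options, self.backend = application, options, backend
          self.status = "new"

      def submit(self):
          # In reality this would go through the collective Grid services;
          # here we only record the state change.
          self.status = "submitted"
          print(f"{self.application} job sent to the {self.backend} backend")

  job = GridJob(application="GAUDI",
                options={"JobOptions": "myAnalysis.opts", "Algorithms": ["HiggsSearch"]})
  job.submit()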
27 CMS Data in 2001
- OBJECTIVITY DATA TOTAL: 29 TB
- TYPICAL EVENT SIZES
- Simulated: 1 CMSIM event + 1 OOHit event = 1.4 MB
- Reconstructed: 1.2 MB/event at luminosity 10^33, 1.6 MB/event at 2x10^33, 5.6 MB/event at 10^34
28A CMS Data Grid Job
2003 CMS data grid system vision
29ALICE Data Challenge
- COTS for mass storage?
- An order of magnitude increase in disk access speed is required
- over 5 years: from 100 MB/s to >1 GB/s
30D0
31CDF
32Overview of SAM
SAM components
- Shared globally: Database Server(s) (Central Database), Name Server, Global Resource Manager(s), Log Server
- Shared locally: Station 1..n Servers, Mass Storage System(s)
- Arrows in the diagram indicate control and data flow
33Overview of SAM
SAM
SAM and DataGrid should use common (lower) middleware
35Experiment Deployment
36LHC computing at a glance
- The investment in LHC computing will be massive
- The LHC Review estimated 240 MCHF (before the LHC delay)
- plus roughly 80 MCHF/year afterwards
- These facilities will be distributed
- for political as well as sociological and practical reasons
Europe: 267 institutes, 4603 users. Elsewhere: 208 institutes, 1632 users.
37Access Grid
- A collection of resources that support group-to-group interaction across the Grid
- Supports large-scale distributed meetings and collaborative work sessions
- VRVS (VIC/RAT tools) and H.323 are commonly used in GridPP
39How Large is Large?
- Is the LHC Grid
- just the O(10) Tier 0/1 sites and O(20,000) CPUs?
- plus the O(50) Tier 2 sites and O(40,000) CPUs?
- or the collective computing power of O(300) LHC institutions, perhaps O(60,000) CPUs in total?
- Are the LHC Grid users
- the experiments and their relatively few, well-structured production computing activities?
- or the curiosity-driven work of 1000s of physicists?
- Depending on our answer, the LHC Grid is either
- a relatively simple deployment of today's technology, or
- a significant information technology challenge
40Service Graph
Allowed? -> Hierarchical Model
All Nodes Grid Aware?
Optimisation? Directory Hierarchical?
Relational? Heterogeneous?
41Resource Discovery/Monitoring
(Diagram: many resources "R" spread across a network, grouped into virtual organisations VO-A and VO-B and queried by dispersed users)
- Large numbers of distributed sensors with different properties and varying status
- Different views of this information are needed, depending on community membership, security constraints, intended purpose, sensor types, etc.
42R-GMA (Relational Grid Monitoring Architecture)
43R-GMA Schema
CPULoad (Global View)

Timestamp        Load  Facility  Site  Country
19055711022002   0.3   CDF       RAL   UK
19055611022002   1.6   ATLAS     RAL   UK
19055811022002   0.4   CDF       GLA   UK
19055611022002   0.5   LHCb      GLA   UK
19055611022002   0.9   ALICE     CERN  CH
19055511022002   0.6   CMS       CERN  CH

CPULoad (Producer3)

Timestamp        Load  Facility  Site  Country
19055611022002   1.6   ATLAS     CERN  CH
19055511022002   0.6   CMS       CERN  CH
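R-GMA presents monitoring data through relational views like the CPULoad table above, so consumers can select just the rows they need. A minimal Python sketch of that idea; the rows are the ones from the slide, and the list filter stands in for an SQL-style query rather than the R-GMA client API:

  # The global CPULoad view from the slide, as plain records.
  cpuload_global = [
      ("19055711022002", 0.3, "CDF",   "RAL",  "UK"),
      ("19055611022002", 1.6, "ATLAS", "RAL",  "UK"),
      ("19055811022002", 0.4, "CDF",   "GLA",  "UK"),
      ("19055611022002", 0.5, "LHCb",  "GLA",  "UK"),
      ("19055611022002", 0.9, "ALICE", "CERN", "CH"),
      ("19055511022002", 0.6, "CMS",   "CERN", "CH"),
  ]

  # Equivalent in spirit to: SELECT * FROM CPULoad WHERE Site = 'RAL'
  ral_rows = [row for row in cpuload_global if row[3] == "RAL"]
  for ts, load, facility, site, country in ral_rows:
      print(ts, load, facility, site, country)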
44Overview
Grid Application Layer: Application Management, Database Management, Algorithm Registry, Job Management, Data Registering, Job Decomposition, Job Prediction, Data Reclustering
Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler, Service Index, Network Monitoring, Time Estimation, Replica Catalog, Grid Information, Load Balancing, Replica Optimisation
Underlying Grid Services: Remote Job Execution Services (GRAM), Security Services (Authentication & Access Control), Messaging Services (MPI), File Transfer Services (GridFTP), SQL Database Service (metadata storage)
45Overview
46Document submitted by Wolfgang Hoschek and Gavin McCance
47Meta Data Service
- Provides generic RDB access
- User access via HTTPS
- XML decoded with WSDL/SOAP
- A security servlet maps roles
- Command translator: from generic to specific
- Backend types: Oracle, PostgreSQL, MySQL (a request sketch follows)
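Conceptually, a client talks to such a service by posting an XML-encoded command over HTTPS and letting the service translate it for the chosen backend. A hypothetical Python sketch of the client side; the host name, path, XML payload and role header are invented for illustration and are not the Spitfire interface:

  import http.client, ssl

  # Illustrative XML command; the real service defines its own schema.
  request_xml = """<command>
    <select table="FileMetadata" where="run=1234"/>
  </command>"""

  # Placeholder endpoint; this will only connect if such a server exists.
  conn = http.client.HTTPSConnection("metadata.example.org", 443,
                                     context=ssl.create_default_context())
  conn.request("POST", "/metadata/query", body=request_xml,
               headers={"Content-Type": "text/xml",
                        "X-Role": "analysis-user"})  # role mapped by the security servlet
  response = conn.getresponse()
  print(response.status, response.read()[:200])
  conn.close()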
48Query Optimisation
- Local minimisation of execution time by replica selection
- Two-phase minimisation
- the Resource Broker selects a Computing Element based on a speculative cost
- the job then contacts the Replica Manager (Replica Optimiser) and pins the file
- The peer-to-peer starting point is inefficient
- Optimisation -> an economic model: a backwards (Vickrey) auction (sketched below)
- Status: simulation currently under development
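In a backwards (Vickrey-style) auction, storage sites bid the cost of serving a file replica; the cheapest bidder wins but is paid the second-lowest bid, which encourages truthful bidding. A minimal Python sketch of that rule; the site names and bid values are made up for illustration:

  # Reverse Vickrey (second-price) auction for replica selection.
  def reverse_vickrey(bids: dict) -> tuple:
      ordered = sorted(bids.items(), key=lambda kv: kv[1])  # cheapest first
      winner, _ = ordered[0]
      price = ordered[1][1] if len(ordered) > 1 else ordered[0][1]
      return winner, price

  bids = {"SE-RAL": 4.0, "SE-CERN": 6.5, "SE-CNAF": 5.0}  # arbitrary cost units
  winner, price = reverse_vickrey(bids)
  print(f"{winner} serves the replica at price {price}")   # SE-RAL at price 5.0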
49Resource Broker
Computing Element
Optor
50DataGrid Demonstrator
- Sites involved: CERN, CNAF, LYON, NIKHEF, RAL
- User interface in X; dg-job-submit demo.jdl -> job sent to the Workload Management System at CERN
- The WMS selects a site according to the resource attributes given in the Job Definition Language (jdl) file and to the resources published via the Information System (currently MDS)
- The job is sent to one of the sites and a data file is written
- The file is copied to the nearest Mass Storage and replicated to all other sites
- dg-job-get-output is used to retrieve the files
51First steps...
(Diagram of a single site, SITE-X: a user submits a job from the User ITF Node (client) to the Workload Manager / Information System; a Computing Element gatekeeper (Jobmanager-PBS/LSF/BQS) publishes CPU resources and dispatches to the Worker Nodes (CPUs); a Storage Element gatekeeper publishes storage resources in front of the Storage; a File Catalog server tracks the files; the site acts as a resources provider)
52demo.jdl

  Executable    = "demo.csh";
  Arguments     = "none";
  StdInput      = "none";
  StdOutput     = "demo.out";
  StdError      = "demo.err";
  InputSandbox  = {"demo.csh"};
  OutputSandbox = {"demo.out", "demo.err", "demo.log"};
  Requirements  = other.OpSys == "RH 6.2";
(Diagram: dg-job-submit demo.jdl sends the input sandbox from the User ITF Node to the Workload Manager / Information System; the job runs at one of the sites (CERN, CNAF, LYON, NIKHEF or RAL), each offering computing and storage; the output data are registered with the File Catalog server and replicated across the storage sites; dg-job-get-output job-id retrieves the output sandbox)
53GRID issues Coordination
- The technical part is not the only problem
- Sociological problems: resource sharing
- short-term productivity loss but long-term gain
- Key: communication and coordination between people, centres and countries
- This kind of worldwide close coordination across multi-national collaborations has never been done in the past
- We need mechanisms to make sure that all centres are part of a global planning
- in spite of different conditions of funding, internal planning, timescales etc.
- The Grid organisation mechanisms should be complementary to, and not parallel with or conflicting with, the existing experiment organisation
- LCG-DataGRID-eSC-GridPP
- BaBar-CDF-D0-ALICE-ATLAS-CMS-LHCb-UKQCD
- Local perspective: build upon existing strong PP links in the UK to build a single Grid for all experiments
54Latest Info
55Summary
Motivation: Experimental Particle Physics
- Unique funding environment
- Particle Physics needs the Grid - mutual interest (leads to teamwork)
- Emphasis on
- Software development
- CERN lead - unique local identity
- Extension of Open Source ideas - a Grid culture spanning academia and industry
- Multidisciplinary approach on a university/regional basis
- Use of existing structures
- Large distributed databases: a common problem and challenge
- Now: LHC
- Early days: an opportunity to be involved in the first Grid prototypes
GRIDPP