1
LHC Computing: a Challenge for Grids
  • Swiss Computer Science Conference SCSC'02
  • 22 February 2002, ETH Zürich
  • Les Robertson
  • CERN - IT Division
  • les.robertson@cern.ch

2
  • LHC Computing
  • Grid technology
  • Grids for LHC

3
  • On-line System
  • Multi-level trigger
  • Filter out background
  • Reduce data volume
  • 24 x 7 operation

40 MHz (1000 TB/sec) → Level 1: special hardware
75 kHz (75 GB/sec) → Level 2: embedded processors
5 kHz (5 GB/sec) → Level 3: farm of commodity CPUs
100 Hz (100 MB/sec) → data recording and offline analysis
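
Taken at face value, these figures imply an overall event-rate reduction of about 400,000x between the Level 1 input and what is finally recorded. A minimal sketch of that arithmetic in Python (the rates are those quoted above; the script itself is purely illustrative, not part of the trigger system):

```python
# Trigger cascade figures as quoted on the slide.
# Each tuple: (stage name, event rate in Hz, data rate in bytes/sec).
STAGES = [
    ("Level 1 input (detector)",  40e6, 1000e12),  # 40 MHz, 1000 TB/sec
    ("Level 1 output",            75e3,   75e9),   # 75 kHz,   75 GB/sec
    ("Level 2 output",             5e3,    5e9),   #  5 kHz,    5 GB/sec
    ("Level 3 output (recorded)",  100,   100e6),  # 100 Hz,  100 MB/sec
]

def print_reductions(stages):
    """Show the per-level and overall event-rate rejection factors."""
    for (name_a, rate_a, _), (name_b, rate_b, _) in zip(stages, stages[1:]):
        print(f"{name_a} -> {name_b}: rejection x{rate_a / rate_b:,.0f}")
    overall = stages[0][1] / stages[-1][1]
    print(f"overall event-rate reduction: x{overall:,.0f}")

if __name__ == "__main__":
    print_reductions(STAGES)
```
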
4
The Large Hadron Collider Project: 4 detectors (ATLAS, CMS, LHCb, ALICE)
  • Storage: raw recording rate 0.1 - 1 GByte/sec,
    accumulating at 5-8 PetaBytes/year (plus copies)
  • 10 PetaBytes of disk
  • Processing: 200,000 of today's fastest PCs
5
2.4 PetaBytes Today
6
Problem 1: Cost
Problem 2: Number of components
7
Data Handling and Computation for Physics Analysis
[Data-flow diagram: detector → event filter (selection and
reconstruction) → raw data → reconstruction → event summary
data → event reprocessing and batch physics analysis →
analysis objects (extracted by physics topic) → interactive
physics analysis; event simulation feeds the same chain
with simulated data]
8
LHC Physics Data - Summary
  • 40 MHz collision rate
  • Reduced by real-time triggers/filters to a few
    100 Hz
  • Digitised collision data: a few MBytes per
    collision
  • Collisions recorded: ~10^11 per year
  • Embarrassingly parallel: each collision is
    independent
  • Reconstruction: transform from the detector view
    to the physics view (tracks, energies, particles, ..)
  • Analysis
  • Find collisions with similar features
  • Physics extracted by collective, iterative
    discovery
  • Keys are dynamically defined by professors and
    students using complex algorithms
  • Simulation: start from the theory and detector
    characteristics and compute what the detector
    should have seen

9
High Energy Physics Computing = High Throughput Computing
  • Large numbers of independent events - trivial
    parallelism
  • Small records - mostly read-only
  • Modest I/O rates - a few MB/sec per fast processor
  • Modest floating-point requirement - SPECint
    performance

Good fit for clusters of PCs
  • Chaotic workload
  • research environment → unpredictable, no limit to
    the requirements
  • Very large aggregate requirements: computation,
    data, I/O
  • exceeds the capabilities of a single geographical
    installation
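
Because each collision is independent, high throughput comes from farming events out across many processors rather than from tightly coupled parallelism. A minimal, hypothetical sketch of that pattern (the reconstruct_event function is a stand-in, not real reconstruction code):

```python
from multiprocessing import Pool

def reconstruct_event(event_id: int) -> dict:
    """Stand-in for reconstruction: turn one collision from the detector
    view into the physics view (tracks, energies, particles)."""
    # A real job would read the raw event and run reconstruction algorithms;
    # here we just return a placeholder summary record.
    return {"event": event_id, "n_tracks": event_id % 7}

if __name__ == "__main__":
    event_ids = range(1000)           # every collision is independent
    with Pool(processes=8) as pool:   # trivial parallelism across CPUs
        summaries = pool.map(reconstruct_event, event_ids)
    print(f"reconstructed {len(summaries)} events")
```
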

10
CERN's Users in the World
Europe: 267 institutes, 4603 users
Elsewhere: 208 institutes, 1632 users
11
  • Problem 1: Cost
  • Problem 2: Number of components
  • Problem 3: Global community
  • Solution 1: Global community → access to
    national and regional computer centres

12
  • Worldwide distributed computing system
  • Small fraction of the analysis at CERN
  • High-level analysis using 12-20 large regional
    centres
  • how to use the resources efficiently
  • establishing and maintaining a uniform physics
    environment
  • Data exchange with tens of smaller regional
    centres, universities, labs
  • Importance of cost containment
  • components and architecture
  • utilisation efficiency
  • maintenance, capacity evolution
  • personnel management costs
  • ease of use (usability efficiency)

13
  • Moore's law
  • capacity growth with:
  • a fixed CPU count
  • or a fixed annual budget
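
To illustrate the point of this slide: if price/performance doubles roughly every 18 months (an assumed Moore's-law-like rate, not a figure from the talk), a fixed annual budget buys exponentially more capacity each year:

```python
# Assumption (for illustration only): performance per unit cost doubles
# every 18 months, and the same budget is spent each year.
DOUBLING_MONTHS = 18

def capacity_per_budget(year: int) -> float:
    """Relative capacity one budget unit buys in a given year (year 0 = 1.0)."""
    return 2 ** (12 * year / DOUBLING_MONTHS)

if __name__ == "__main__":
    for year in range(7):
        print(f"year {year}: the same budget buys "
              f"x{capacity_per_budget(year):.1f} the capacity of year 0")
```
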

14
(No Transcript)
15
The MONARC Multi-Tier Model (1999)
16
Are Grids a solution?
  • Analogous to the electrical power grid
  • Unlimited, ubiquitous, distributed computing
  • Transparent access to multi-petabyte distributed
    databases
  • Easy to plug in
  • Hidden complexity of the infrastructure

Ian Foster and Carl Kesselman, editors, The Grid:
Blueprint for a New Computing Infrastructure,
Morgan Kaufmann, 1999, http://www.mkp.com/grids
17
What is the Grid for?
  • creates a virtual computer centre spanning
    different geographical locations that
    are managed independently
  • frees the end-user from the details of the
    different centres' resources, policies, data
    location, ...
  • enables flexible sharing of large-scale,
    distributed resources by users from different
    communities

18
LHC Computing Model (2002 - evolving)
The opportunity of Grid technology
The LHC Computing Centre
19
The Promise of Grid Technology
What should the Grid do for you?
  • you submit your work
  • and the Grid
  • Finds convenient places for it to be run
  • Optimises use of the widely dispersed resources
  • Organises efficient access to your data
  • Caching, migration, replication
  • Deals with authentication to the different sites
    that you will be using
  • Interfaces to local site authorisation and
    resource allocation mechanisms, policies
  • Runs your jobs
  • Monitors progress
  • Recovers from problems
  • ... and tells you when your work is complete

20
The DataGRID Project
www.eu-datagrid.org
21
DataGRID Partners
  • Managing partners
  • UK: PPARC; Italy: INFN
  • France: CNRS; Holland: NIKHEF
  • Italy: ESA/ESRIN; CERN (proj. mgt.: Fabrizio
    Gagliardi)
  • Industry
  • IBM (UK), Communications Systems (F),
    Datamat (I)
  • Associate partners

University of Heidelberg, CEA/DAPNIA (F), IFAE
Barcelona, CNR (I), CESNET (CZ), KNMI (NL), SARA
(NL), SZTAKI (HU), Finland: Helsinki Institute of
Physics and CSC, Swedish Natural Science Research
Council (Parallelldatorcentrum KTH, Karolinska
Institute), Istituto Trentino di Cultura, Zuse
Institut Berlin
22
DataGRID Challenges
  • Data
  • Scaling
  • Reliability
  • Manageability
  • Usability

23
Part i - Grid Middleware
  • Building on an existing framework (Globus)
  • Develop enhanced Grid middleware for data
    intensive computing
  • workload management
  • data management
  • application and grid monitoring

24
Datagrid Middleware Architecture
25
Workload Management
  • Optimal co-allocation of data, CPU and network
    for specific grid/network-aware jobs
  • Distributed scheduling (data/code migration) of
    unscheduled/scheduled jobs
  • Uniform interface to various local resource
    managers
  • Priorities, policies on resource usage (CPU,
    Data, Network)
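
As a loose illustration of the "priorities, policies" bullet above, the sketch below orders pending jobs so that users who have consumed less of their allocation run first. The accounting numbers and the fair-share policy itself are invented; this is not the DataGrid workload manager:

```python
import heapq

# Invented accounting data: fraction of each user's CPU allocation already used.
USAGE = {"prod-alice": 0.90, "user-atlas-42": 0.15, "user-cms-7": 0.40}

def fair_share_order(jobs):
    """Yield jobs so that less-served users go first (illustrative policy only)."""
    heap = [(USAGE.get(user, 0.0), user, job) for user, job in jobs]
    heapq.heapify(heap)
    while heap:
        _, user, job = heapq.heappop(heap)
        yield user, job

if __name__ == "__main__":
    pending = [("prod-alice", "reco-batch-9"),
               ("user-cms-7", "ntuple-skim"),
               ("user-atlas-42", "analysis-pass")]
    for user, job in fair_share_order(pending):
        print(f"run {job} for {user}")
```
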

26
Data Management
  • Data Transfer
  • Efficient, secure and reliable transfer of data
    between sites
  • Data Replication
  • Replicate data consistently across sites
  • Data Access Optimization
  • Optimize data access using replication and remote
    open
  • Data Access Control
  • Authentication, ownership, access rights on data
  • Metadata Storage
  • Grid-wide persistent metadata store for all
    kinds of Grid information

Data granularity: files, not objects.
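
A minimal sketch of file-granularity access through a replica catalogue: a logical file name is resolved to its physical replicas and the cheapest one (by an assumed network cost) is chosen. The catalogue contents, hostnames and cost table are invented for illustration:

```python
# Hypothetical replica catalogue: logical file name -> physical replicas.
REPLICA_CATALOGUE = {
    "lfn:ozone-2001.dat": [
        "gsiftp://se.cern.ch/data/ozone-2001.dat",
        "gsiftp://se.sara.nl/data/ozone-2001.dat",
    ],
}

# Invented network "cost" from the submitting site to each storage element.
NETWORK_COST = {"se.cern.ch": 1, "se.sara.nl": 5}

def select_replica(lfn: str) -> str:
    """Resolve a logical file name and pick the cheapest physical replica."""
    def cost(url: str) -> int:
        host = url.split("/")[2]              # crude hostname extraction
        return NETWORK_COST.get(host, 100)    # unknown hosts assumed remote
    return min(REPLICA_CATALOGUE[lfn], key=cost)

if __name__ == "__main__":
    print(select_replica("lfn:ozone-2001.dat"))
```
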
27
A Job Submission Example
[Diagram components: Resource Broker, Information Service,
Replica Catalogue, Input Sandbox, Job Submission Service,
Compute Element, Storage Element, Logging and Book-keeping]
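
A hedged sketch of the flow these components suggest: the user hands a job and its input sandbox to the resource broker, the broker consults the information service and replica catalogue to pick a compute element, and progress is recorded in logging and book-keeping. All class and method names below are invented; this is not the DataGrid API:

```python
class ResourceBroker:
    """Toy stand-in for the broker in the diagram above (not the real API)."""

    def __init__(self, information_service, replica_catalogue, log):
        self.info = information_service   # reports free CPUs per compute element
        self.replicas = replica_catalogue # maps datasets to nearby storage
        self.log = log                    # stands in for logging/book-keeping

    def submit(self, job, input_sandbox):
        self.log(f"received {job['name']} with sandbox {input_sandbox}")
        site = self.match(job)
        self.log(f"dispatching {job['name']} to compute element at {site}")
        return site   # a job submission service would take over from here

    def match(self, job):
        """Prefer compute elements close to the data, then the least loaded."""
        candidates = self.info()   # e.g. {"ce.cern.ch": 12, "ce.nikhef.nl": 40}
        near_data = [s for s in candidates
                     if s in self.replicas.get(job["dataset"], [])]
        pool = near_data or list(candidates)
        return max(pool, key=lambda s: candidates[s])

if __name__ == "__main__":
    broker = ResourceBroker(
        information_service=lambda: {"ce.cern.ch": 12, "ce.nikhef.nl": 40},
        replica_catalogue={"run17.raw": ["ce.nikhef.nl"]},
        log=print,
    )
    broker.submit({"name": "reco-001", "dataset": "run17.raw"}, ["steering.txt"])
```
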
28
Part ii - Local Computing Fabrics
  • Fabric management
  • Mass storage

29
Management of Grid Nodes
  • Before tackling the Grid, we had better know how
    to manage giant local clusters efficiently → fabrics
  • commodity components: processors, disks, network
    switches
  • massive mass storage
  • new level of automation required
  • self-diagnosing
  • self-healing
  • Key Issues
  • scale
  • efficiency, performance
  • resilience, fault tolerance
  • cost: acquisition, maintenance, operation
  • usability
  • security
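
A minimal sketch of the "new level of automation" idea: a periodic sweep over thousands of nodes that detects failures and triggers an automated repair action. The health check, the repair step and the node names are placeholders, not the actual fabric-management tools:

```python
import random

def check_node(node: str) -> bool:
    """Placeholder health check; a real fabric would probe daemons, disks, load."""
    return random.random() > 0.02        # ~2% of checks fail, for illustration

def self_heal(node: str) -> None:
    """Placeholder recovery; in practice: restart services, drain, or reinstall."""
    print(f"{node}: unhealthy -> scheduling automated repair")

def sweep(nodes) -> int:
    """One monitoring pass over the whole fabric; returns the number of repairs."""
    repairs = 0
    for node in nodes:
        if not check_node(node):
            self_heal(node)
            repairs += 1
    return repairs

if __name__ == "__main__":
    fabric = [f"node{i:04d}" for i in range(1, 1001)]   # a 1000-node fabric
    print(f"repairs this sweep: {sweep(fabric)}")
```
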

30
Part iii - Applications
  • HEP
  • The four LHC experiments
  • Live proof-of-concept prototype of the Regional
    Centre model
  • Earth Observation
  • ESA-ESRIN
  • KNMI (Dutch meteorological institute): climatology
  • Processing of atmospheric ozone data derived from
    ERS GOME and ENVISAT SCIAMACHY sensors
  • Biology
  • CNRS (France), Karolinska (Sweden)

31
Part iv - Deployment
32
Datagrid Monitor
http://ccwp7.in2p3.fr/mapcenter/
33
The Data Grid Project - Summary
  • European dimension
  • EC funding: 3 years, 10M Euro
  • Closely coupled to several national initiatives
  • Multi-science
  • Technology leverage
  • Globus, Condor, HEP farming and MSS, Monarc,
    INFN-Grid, Géant
  • Emphasis
  • Data, scaling, reliability
  • Rapid deployment of working prototypes -
    production quality
  • Collaboration with other European and US projects
  • Status
  • Started 1 January 2001
  • Testbed 1 in operation now
  • Open
  • Open source and communication
  • Global Grid Forum
  • Industry and Research Forum

34
Trans-Atlantic Testbeds with High Energy Physics
Leadership
US projects
European projects
35
  • EU-funded project
  • Partners: CERN, PPARC (UK), Amsterdam (NL),
    INFN (IT)
  • Géant research network backbone
  • Extending Datagrid to collaboration with US
    projects
  • iVDGL, GriPhyN
  • EU-, DoE- and NSF-funded HEP transatlantic network
  • Main aims
  • Compatibility and interoperability between US
    and EU projects
  • Transatlantic testbed for advanced network
    research
  • 2.5 Gbps wavelength-based US-CERN link -
    6/2002 → 10 Gbps in 2003 or 2004

36
DataTAG Project
[Network map: a triangle of links between Geneva and New
York, connecting to ABILENE, STARLIGHT, ESNET, MREN and
STAR-TAP]
37
  • LHC Computing Environment
  • Many Grid Projects
  • Very many Regional Computing Centres
  • How can this be moulded into a coherent, reliable
    computing facility for LHC?

38
The LHC Computing Grid Project
Goal: Prepare and deploy the LHC computing
environment
  • applications - tools, frameworks, environment,
    persistency
  • computing system → evolution of traditional
    services
  • cluster → automated fabric
  • collaborating computer centres → grid
  • CERN-centric analysis → global analysis
    environment

Borrow, buy or build
  • foster collaboration and coherence of the LHC
    regional computing centres
  • central role of production testbeds

39
This is not yet another grid technology project -
it is a grid deployment project
Two phases
  • Phase 1: 2002-05
  • Development and prototyping
  • Build a 50% model of the grid needed for one
    experiment
  • Phase 2: 2006-08
  • Installation and operation of the full worldwide
    initial production Grid

40
The LHC Computing Grid Project
  • Multi-disciplinary, collaborative, open
    environment
  • Grid projects, other sciences, Globus, Global
    Grid Forum
  • Mission oriented
  • LHC computing facility focus
  • Succession of prototypes
  • Acquire technology, rather than develop
  • Industrial participation
  • www.cern.ch/openlab

41
Final Remarks
  • Rapid pace of grid technology development
  • public investments
  • enthusiasm of potential user communities
  • industrial interest
  • LHC has a real need and a good application
  • International networks available to research will
    be critical to success
  • Importance of reliability and usability in early
    grids

42
(No Transcript)