Transcript and Presenter's Notes

Title: José SALT


1

EGEE-II and its implications for the LHC Computing GRID
  • José SALT


  • 22 June 2006



Postgraduate Course on GRID and e-Science
Kick-off meeting of Int.eu.grid
Instituto de Física de Cantabria
1
2
Overview
  • 1. EGEE-II: a brief description of the project
  • 2. GRID Computing in the LHC Experiments
  • 3. The EGEE vision and its relationship with the ATLAS TIER-2
  • 4. Conclusions and Perspectives

2
3
1. EGEE-II: a brief description of the project
  • EGEE brings together scientists and engineers from 90 institutions
  • in over 30 countries worldwide
  • to provide a seamless GRID infrastructure for e-Science,
  • available 24 hours a day, 7 days a week
  • Funded by the EU (European Commission)
  • Two original scientific fields, HEP and Life Sciences, but it now integrates many other fields, from Geology to Computational Chemistry
  • Infrastructure: ~30,000 CPUs, ~5 PB of storage
  • Maintains 10,000 concurrent jobs on average

3
4
EGEE-II Activity Packages
  • SA (Specific Service Activities)
  • SA1: GRID Operations, Support and Management (IFIC-IFCA)
  • SA2: Networking Support
  • SA3: Integration, Testing and Certification (IFIC)
  • Networking Activities
  • NA1: Management of the Project
  • NA2: Dissemination, Outreach and Communication (IFIC-IFCA)
  • NA3: Training and Induction (IFIC-IFCA)
  • NA4: Application Identification and Support (CNB)
  • NA5: Policy and International Cooperation
  • Joint Research Activities
  • JRA1: Middleware Re-Engineering
  • JRA2: Quality Assurance

4
5
EGEE-II activities
  • Operation of the GRID Infrastructure (ROC Manager)
  • SA1 (Infrastructure Operations): European GRID Support, Operation and Management; includes tasks such as GRID monitoring and control, and resource and user support
  • Resource Operation Centre (ROC): activities are coordinated in Federations. SWE (South West Europe): LIP, IFIC, IFCA, PIC
  • SA3 (Integration, Testing and Certification): to manage the process of building deployable and documented middleware (MW) distributions, starting by integrating MW packages and components from a variety of sources
  • NA2 and NA3: Dissemination and Training
  • NA4: Applications (HEP, Biomed)

5
6
(Map of participating Spanish centres, including BIFI (Zaragoza) and CIEMAT (Madrid))
6
7
Why so many GRID e-Science projects?
GRID has different approaches and perspectives: a) from the infrastructure point of view, b) from the GRID development and deployment point of view, c) from the applications point of view
  • Antecedents: during the last 6 years IFIC (and IFCA)/CSIC has participated in GRID projects in the EU Framework Programme: DATAGRID (2001-2004), CROSSGRID (2002-2005) and, now, EGEE (phases I and II)
  • DATAGRID tried to cover all these aspects; CROSSGRID invested more effort in incorporating new applications
  • Joining efforts to establish e-Science in Spain
  • EGEE and LCG (LHC Computing GRID) are strongly coupled, but they offer complementary visions of a given problem (e-Science)
  • EGEE continues the effort on the 3 fronts, but the complexity of the different aspects generates related projects
  • The HEP community has had a leading role in several GRID and e-Science initiatives

7
8
  • EGEE as an incubator of GRID and e-Science projects

Int.eu.grid
8
9
2. GRID computing in the LHC experiments
  • High Energy Physics: there is a list of experiments (accelerator and non-accelerator) with different scientific objectives within the field of Elementary Particles
  • Accelerator: Fermilab, SLAC, CERN, etc.
  • Non-accelerator: Astroparticles (AMS, ANTARES, K2K, MAGIC, ...)
  • Problems in computing: computing power, data storage and access to data
  • Main challenge: the LHC experiments

9
10
Application in High Energy Physics
  • The LHC Computing Grid
  • A Global Computing Facility for Physics

Where? CERN. Name of the accelerator: LHC (Large Hadron Collider)
Less than 2 years left until the first collisions in the LHC
10
José Salt
11
The 4 LHC experiments: ALICE, ATLAS, CMS and LHCb
11
12
  • Detector: study of p-p collisions at high energies
  • Start of data taking: Spring 2007
  • Level 3 trigger: 200 events/s, with an event size of 1.6 MB/event
  • Data volume: 2 PB/year during 10 years (a rough data-rate estimate is sketched below)
  • Estimated CPU to process the LHC data: 100,000 PCs
  • This generates 3 problems:
  • Data storage
  • Processing
  • Users scattered worldwide

Solution:
GRID TECHNOLOGIES
12
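As a quick cross-check of the figures on this slide, a minimal sketch in Python; the only number not quoted above is the assumed ~1e7 seconds of effective data taking per year:

  # Rough data-rate estimate from the slide's figures.
  # Assumption: ~1e7 s of effective data taking per year (not stated on the slide).
  event_rate_hz = 200            # events/s after the Level 3 trigger
  event_size_mb = 1.6            # MB per event
  seconds_per_year = 1e7         # assumed effective run time

  rate_mb_per_s = event_rate_hz * event_size_mb                 # 320 MB/s
  volume_pb_per_year = rate_mb_per_s * seconds_per_year / 1e9   # ~3.2 PB/year of raw data

  print(f"{rate_mb_per_s:.0f} MB/s, ~{volume_pb_per_year:.1f} PB/year")
  # Same order of magnitude as the 2 PB/year quoted on the slide; the exact
  # figure depends on the live time and on what is kept permanently.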
13
The ATLAS Computing Model
  • To cover a wide range of activities, from the storage of Raw Data up to providing the possibility of performing data analysis in a University Department (member of the ATLAS Collaboration)
  • The data undergo several transformations in order to obtain a reduction in size and to extract the relevant information

Data reduction chain (a sketch follows below)
  • The analysis physicist will navigate the data across its different formats in order to extract the needed information. This activity will have a big influence on the fine-tuning of the Computing Model

13
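A minimal sketch of the data reduction chain idea. Only the RAW event size (1.6 MB) comes from this talk; the downstream format name AOD and the ESD/AOD sizes are illustrative assumptions:

  # Sketch of a data reduction chain: each derived format is smaller per
  # event than the previous one. Sizes other than RAW are assumed.
  REDUCTION_CHAIN = [
      ("RAW", 1.6),   # raw detector data, MB/event (from the slides)
      ("ESD", 0.5),   # Event Summary Data, assumed size
      ("AOD", 0.1),   # Analysis Object Data, assumed name and size
  ]

  def chain_volumes_tb(n_events):
      """Total volume in TB of each format for n_events events."""
      return {name: n_events * size_mb / 1e6 for name, size_mb in REDUCTION_CHAIN}

  # One nominal year at 200 events/s over ~1e7 s gives about 2e9 events:
  print(chain_volumes_tb(2_000_000_000))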
14
(Diagram: the tiered hierarchy, from Tier-1 and Tier-2 centres down to Tier-3 centres and individual PCs and laptops; Tier-1 examples shown include RAL and IN2P3.)
  • LHC Computing Model (in a nutshell!)
  • Tier-0: the CERN centre
  • To filter the Raw Data
  • Reconstruction -> Event Summary Data (ESD)
  • Registration of Raw Data and ESD
  • Distribution of Raw Data and ESD to the Tier-1s
  • Tier-1
  • Permanent storage and organization of Raw Data, ESD, calibration data, metadata, analysis data and databases -> data services using GRID
  • Massive data analysis
  • Reprocessing of Raw Data -> ESD
  • Centre at the national/regional level
  • -> high availability for online data acquisition, massive storage and managed data, with long-term commitments

Tier-0 (CERN)
(Diagram, continued: further Tier-1 centres shown include FNAL, CNAF, FZK, PIC, ICEPP and BNL.)
  • Tier-2
  • Disk-based data storage services provided via GRID
  • To provide simulated data on demand from the experiment
  • To provide analysis capacity to the physics groups: operation of a data analysis system installation (20 working lines in parallel)
  • To provide the network services for data exchange with the TIER-1 (the tier roles above are summarized in the sketch below)

14
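The tier responsibilities listed on this slide can be collected into a simple data structure; a sketch only, paraphrasing the bullets above:

  # Sketch: the LHC tiered computing model as described on this slide.
  TIER_ROLES = {
      "Tier-0": ["filter raw data",
                 "first-pass reconstruction (RAW -> ESD)",
                 "register RAW and ESD",
                 "distribute RAW and ESD to the Tier-1s"],
      "Tier-1": ["permanent storage of RAW, ESD, calibration data, metadata",
                 "massive data analysis",
                 "reprocessing (RAW -> ESD)",
                 "high-availability services with long-term commitments"],
      "Tier-2": ["disk-based data services via GRID",
                 "simulated data production on demand",
                 "analysis capacity for the physics groups",
                 "network services for exchange with the Tier-1"],
  }

  for tier, roles in TIER_ROLES.items():
      print(tier + ": " + "; ".join(roles))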
15
High Energy Physics e-Infrastructures for the LHC

The TIERs:
  • TIER-1: PIC (CIEMAT, IFAE)
  • TIER-2:
    • CMS Tier-2: CIEMAT, IFCA
    • ATLAS Tier-2: UAM, IFAE and IFIC
    • LHCb Tier-2: USC, UB
  • (the underlined centres are the coordinators)
  • TIER-3: University departments, research centres, etc.
Recent funding by the Spanish HEP Programme (2005-2007)
15
16
230 PCs (172 IFIC + 58 ICMOL)
96 Athlon 1.2 GHz, 1 GB SDRAM
96 Athlon 1.4 GHz, 1 GB DDR
30 PE850 (DELL) nodes, Dual Core at 3.2 GHz
Local disk: 40 GB/160 GB
Fast Ethernet, aggregated with Gigabit Ethernet
IFIC
Manpower: 6 FTE
STK L700e700 tape robot: up to 134 TB capacity
4 disk servers (5 TB), 2 tape servers
16
17
UAM
IFAE
  • The computing nodes and the disks will be hosted in racks in the PIC computer room.
  • Location reserved for the ATLAS Tier-2

Disk servers: 4.5 TB
17
18
3. The EGEE vision and its relationship with the ATLAS TIER-2
An important issue: a System of Distributed Analysis based on GRID
Open Issues from LHC experiments
18
19
19
20
An important issue: a System of Distributed Analysis based on GRID
  • Has to be performed in parallel with ATLAS production (up to 50% of ATLAS resources)
  • Differences:
  • Production jobs are typically long simulation jobs, CPU-dominated, and have large memory requirements
  • DA jobs are much more I/O-oriented, with considerably smaller memory requirements
  • Plans according to the 3 flavours:
  • LCG plans to use the gLite Resource Broker and Condor-G to submit jobs to sites; support DA by providing a special CE for analysis or short jobs
  • Prototype of a DA system at the TIER-2

20
21
The need for Distributed Data Management in ATLAS
  • GRID provides services and tools for Distributed Data Management:
  • low-level file catalogues, storage and transfer services
  • ATLAS uses different GRID flavours (LCG, OSG, NorduGrid), and each one has its own version of these services
  • It is therefore necessary to implement a specific layer over the GRID middleware
  • What is the objective? To manage the data flow of ATLAS according to the computing model, providing a single entry point for all the distributed data of ATLAS (a sketch of this idea follows below)
  • The DDM (Distributed Data Management) system aims to achieve this objective by means of a piece of software called Don Quijote (DQ)

21
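A minimal sketch of the single-entry-point idea behind the DDM layer; the classes and method names here are hypothetical and are not the real DQ/DQ2 interfaces:

  # Sketch: one entry point over several GRID flavours (the DDM/DQ idea).
  # All names are hypothetical illustrations.

  class CatalogClient:
      """Hypothetical per-flavour catalogue client (LCG, OSG, NorduGrid)."""
      def __init__(self, flavour):
          self.flavour = flavour

      def lookup(self, lfn):
          # A real client would query that flavour's own replica catalogue
          # (e.g. an RLS instance); here we just return a fake replica.
          return [f"{self.flavour}://some.site/{lfn}"]

  class DataManagementLayer:
      """Single entry point: hides which flavour holds the replicas."""
      def __init__(self):
          self.backends = [CatalogClient(f) for f in ("LCG", "OSG", "NorduGrid")]

      def find_replicas(self, lfn):
          return [r for b in self.backends for r in b.lookup(lfn)]

  ddm = DataManagementLayer()
  print(ddm.find_replicas("atlas/mc/evgen.0001.root"))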
22
Don Quijote
  • The first version of DQ simply provided an interface to the different catalogues of the 3 GRID flavours, in order to locate the data, plus a simple file transfer system
  • DQ was tested in the Data Challenge 2 (DC2) programme of ATLAS, which had to validate the software and data model of the experiment
  • Due to (a) scalability problems and (b) the progress in GRID MW, DQ had to be re-engineered into DQ2

(Diagram: DQ queries the LCG EDG RLS, the OSG Globus RLS and the NorduGrid Globus RLS.)
22
23
Open Issues from LHC experiments
  • Security, authorization, authentication
  • VOMS available and stable (Prio: high)
  • VOMS groups and roles used by all middleware (Prio: high)
  • Information System
  • Stable access to static information (Prio: medium)
  • Access to the static information
  • Storage Management
  • SRM interface provided by all Storage Element services (done)
  • Support for disk quota management (Prio: low): disk quota management at both group and user level should be offered by all Storage Services
  • Checking of the integrity/validity after new replica creation (Prio: critical)

25
24
  • Data Management
  • FTS improvements and feature requests as specified in the FTS workshop (Prio: critical)
  • Central entry point for all transfers: FTS should provide a single central entry point for all the required transfer channels, including T0-T1, T1-T1 and T1-T2/T2-T1 transfers, and for the T2 sites running analysis tasks (Prio: critical)
  • Support for priorities, with the possibility of late reshuffling (Prio: low)
  • POSIX file access based on the LFN (sketched below)
  • File access API (GFAL library) using multiple instances of the LFC (Prio: high)

26
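The LFN-based POSIX access requirement can be illustrated with a small sketch; the resolver and the open helper below are hypothetical stand-ins, not the actual GFAL/LFC API:

  # Sketch: POSIX-style file access driven by a Logical File Name (LFN).
  # resolve_lfn() is a hypothetical placeholder for what a real client
  # would obtain from the LFC catalogue through the GFAL library.

  def resolve_lfn(lfn):
      """Hypothetical LFN -> local/physical path resolution."""
      return "/grid/storage/replicas/" + lfn.rsplit("/", 1)[-1]

  def open_by_lfn(lfn, mode="rb"):
      """Open a file POSIX-style, given only its logical name."""
      return open(resolve_lfn(lfn), mode)   # ordinary POSIX open on the resolved path

  # Usage (assuming the resolved path exists on a mounted file system):
  # with open_by_lfn("lfn:/grid/atlas/user/data.root") as f:
  #     header = f.read(64)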
25
  • Workload Management
  • Capability of handling 1 million short jobs (~30 min) in 1 day with the RB service: feature needed for SC4; the final short-job number is estimated to be 1 million (Prio: high)
  • Efficient use of the information system in the matchmaking: the capability of sending jobs to the sites where the input files are present and which have enough free CPU slots (Prio: high) (a sketch follows below)
  • Support for different priorities based on VOMS groups/roles (Prio: high)
  • The RB should reschedule the jobs in its internal task queue, using a prioritization system -> this feature is already available in the gLite RB (Prio: high)

27
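A minimal sketch of the matchmaking idea from the bullet above (prefer sites that already host the input files and still have free CPU slots); the site attributes and the ranking are illustrative assumptions, not the gLite Resource Broker algorithm:

  # Sketch: rank candidate sites by data locality, then by free CPU slots.
  sites = [
      {"name": "siteA", "free_slots": 120, "files": {"aod.0001.root", "aod.0002.root"}},
      {"name": "siteB", "free_slots": 0,   "files": {"aod.0001.root"}},
      {"name": "siteC", "free_slots": 40,  "files": set()},
  ]

  def match(job_inputs, candidates):
      """Return usable sites, best match first."""
      usable = [s for s in candidates if s["free_slots"] > 0]
      return sorted(usable,
                    key=lambda s: (len(job_inputs & s["files"]), s["free_slots"]),
                    reverse=True)

  job_inputs = {"aod.0001.root", "aod.0002.root"}
  print([s["name"] for s in match(job_inputs, sites)])   # siteA ranked first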
26
  • Workload Management (cont.)
  • CE service directly accessible by services/clients other than the RB
  • Allow for changing the identity of a job running on the WN
  • Monitoring Tools and Accounting
  • A scalable tool to collect VO-specific information
  • Publish/subscribe to logging and bookkeeping and local batch-system events for all jobs in the VO
  • Support for accounting, with site, user and group granularity (DGAS or equivalent)
  • Possibility to aggregate by a VO- (user-) specified tag

28
27
LCG Deployment Schedule
29
28
4. Conclusions and Perspectives
  • EGEE-II
  • Provides the MW, the general GRID framework, etc.
  • User support, dissemination and training
  • High level of synergy
  • TIER-2
  • The progress achieved so far has been very important: a Production System is working in an acceptable way
  • Next problem to be solved: access to a powerful system of GRID Distributed Data Analysis
  • The success of GRID in HEP (LHC) will be very important for the e-Science programmes
  • FINAL OBJECTIVE: every physicist at any ATLAS centre should be able to do her/his analysis from his/her home institute in an effective and fast way
  • EGEE-II and TIER-2
  • Very good relationship with TIER-2 operation: collaborative framework, user support, GRID middleware progress, GRID Distributed Analysis
  • To go beyond the ATLAS TIER-2 vision: GRID for High Energy and Nuclear Physics (Theoretical and Experimental)
  • To extend to industrial partners and National GRID Initiatives

30
29
Slide Backup
30
Don Quijote 2
  • Due to (a) scalability problems and (b) the progress in GRID MW, DQ had to be re-engineered into DQ2
  • The DQ2 architecture consists of datasets, central catalogues and site services
  • DQ2 is based on the concept of dataset versions
  • A dataset is defined as a collection of files or of other datasets
  • DQ2 relies on the ATLAS central catalogues (global catalogue), which define the datasets and their locations
  • The dataset is also the unit of data movement
  • To permit the movement of data, site services have been distributed; they use a subscription mechanism to move data from one place to another (sketched below)
  • More information:
  • https://uimon.cern.ch/twiki/bin/view/Atlas/DDM

23
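A minimal sketch of the dataset/subscription idea described above; the classes and methods are hypothetical and are not the real DQ2 site-services code:

  # Sketch: DQ2-style datasets and subscriptions. A dataset groups files,
  # a central catalogue records where each dataset lives, and a site
  # subscribes to a dataset to have it replicated locally.

  class Dataset:
      def __init__(self, name, files):
          self.name, self.files = name, list(files)

  class CentralCatalog:
      def __init__(self):
          self.locations = {}                      # dataset name -> set of sites

      def register(self, dataset, site):
          self.locations.setdefault(dataset.name, set()).add(site)

  class SiteServices:
      """Per-site agent that fulfils subscriptions by pulling missing data."""
      def __init__(self, site, catalog):
          self.site, self.catalog = site, catalog
          self.subscriptions = []

      def subscribe(self, dataset):
          self.subscriptions.append(dataset)

      def run_once(self):
          for ds in self.subscriptions:
              # A real agent would trigger the file transfers here.
              self.catalog.register(ds, self.site)
              print(f"{self.site}: replicated {ds.name} ({len(ds.files)} files)")

  catalog = CentralCatalog()
  agent = SiteServices("IFIC-Tier2", catalog)
  agent.subscribe(Dataset("mc.evgen.0001", ["f1.root", "f2.root"]))
  agent.run_once()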
31
Example of ATLAS GRID work (July 2004 - March 2005)
SCIENTIFIC APPLICATIONS ON THE GRID
  • 660K jobs in total (LCG, NorduGrid, US Grid3)
  • 400 kSI2k-years of CPU
  • In the latest period, an average of 7K jobs/day, with 5K in LCG

Mix of jobs: preparation for Rome, DC2 (short-jobs period), DC2 (long-jobs period)
José Salt
20