Transcript and Presenter's Notes

Title: José SALT


1

EGEE-II and its implications for the LHC Computing GRID
  • José SALT


  • 22 June 2006



Postgraduate Course on GRID and e-Science
Kick-off meeting of Int.eu.grid
Instituto de Física de Cantabria
1
2
Overview
  • 1. EGEE-II: a brief description of the project
  • 2. GRID Computing in the LHC Experiments
  • 3. The EGEE vision and its relationship with the ATLAS TIER-2
  • 4. Conclusions and Perspectives

2
3
1. EGEE-II: a brief description of the project
  • EGEE brings together scientists and engineers from 90 institutions
  • in over 30 countries worldwide
  • to provide a seamless GRID infrastructure for e-Science,
  • available 24 hours a day, 7 days a week
  • Funded by the EU (European Commission)
  • Two original scientific fields, HEP and Life Sciences, but it now integrates many other fields, from Geology to Computational Chemistry
  • Infrastructure: ~30,000 CPUs, ~5 PB of storage
  • Maintains 10,000 concurrent jobs on average

3
4
EGEE-II Activity Packages
  • SA (Specific Service Activities)
  • SA1: GRID Operations, Support and Management (IFIC-IFCA)
  • SA2: Networking Support
  • SA3: Integration, Testing and Certification (IFIC)
  • Networking Activities
  • NA1: Management of the Project
  • NA2: Dissemination, Outreach and Communication (IFIC-IFCA)
  • NA3: Training and Induction (IFIC-IFCA)
  • NA4: Application Identification and Support (CNB)
  • NA5: Policy and International Cooperation
  • Joint Research Activities
  • JRA1: Middleware Re-Engineering
  • JRA2: Quality Assurance

4
5
EGEE-II activities
  • Operation of the GRID Infrastructure (ROC Manager)
  • SA1 (Infrastructure Operations): European GRID Support, Operation and Management; includes tasks such as GRID monitoring and control, and resource and user support
  • Resource Operation Centre (ROC): activities are coordinated in Federations. SWE (South West Europe): LIP, IFIC, IFCA, PIC
  • SA3 (Integration, Testing and Certification): to manage the process of building deployable and documented middleware (MW) distributions, starting by integrating MW packages and components from a variety of sources
  • NA2 and NA3: Dissemination and Training
  • NA4: Applications (HEP, Biomed)

5
6
(Map of participating Spanish centres, including BIFI (Zaragoza) and CIEMAT (Madrid))
6
7
Why so many GRID e-Science projects?
GRID has different approaches and perspectives: a) from the infrastructure point of view, b) from the GRID development and deployment point of view, c) from the applications point of view
  • Antecedents: during the last 6 years IFIC (and IFCA)/CSIC has participated in GRID projects in the EU Framework Programme: DATAGRID (2001-2004), CROSSGRID (2002-2005) and, now, EGEE (phases I and II)
  • DATAGRID tried to cover all these aspects; CROSSGRID invested more effort in incorporating new applications
  • Joining efforts to establish e-Science in Spain
  • EGEE and LCG (LHC Computing GRID) are strongly coupled, but they offer complementary visions of a given problem (e-Science)
  • EGEE continues the effort on the 3 fronts, but the complexity of the different aspects generates related projects
  • The HEP community has had a leading role in several GRID and e-Science initiatives

7
8
  • EGEE as an incubator of GRID and e-Science projects

Int.eu.grid
8
9
2. GRID computing in the LHC experiments
  • High Energy Physics: there is a list of experiments (accelerator and non-accelerator) with different scientific objectives within the field of Elementary Particles
  • Accelerator: Fermilab, SLAC, CERN, etc.
  • Non-accelerator: Astroparticles (AMS, ANTARES, K2K, MAGIC, ...)
  • Problems in computing: computing power, data storage and access to data
  • Main challenge: the LHC experiments

9
10
Application in High Energy Physics
  • The LHC Computing Grid
  • A Global Computing Facility for Physics

Where? CERN. Name of the accelerator: LHC (Large Hadron Collider)
Less than 2 years left until the first collisions in the LHC
10
José Salt
11
The 4 LHC experiments: ALICE, ATLAS, CMS and LHCb
11
12
  • Detector: study of p-p collisions at high energies
  • Start of data taking: Spring 2007
  • Level 3 trigger: 200 events/s, with an event size of 1.6 MB/event
  • Data volume: 2 PB/year during 10 years (a rough data-rate estimate is sketched below)
  • Estimated CPU to process the LHC data: 100,000 PCs
  • This generates 3 problems:
  • Data storage
  • Processing
  • Users scattered worldwide

Solution:
GRID TECHNOLOGIES
12
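As a quick cross-check of the figures on this slide, a minimal sketch in Python; the only number not quoted above is the assumed ~1e7 seconds of effective data taking per year:

  # Rough data-rate estimate from the slide's figures.
  # Assumption: ~1e7 s of effective data taking per year (not stated on the slide).
  event_rate_hz = 200            # events/s after the Level 3 trigger
  event_size_mb = 1.6            # MB per event
  seconds_per_year = 1e7         # assumed effective run time

  rate_mb_per_s = event_rate_hz * event_size_mb                 # 320 MB/s
  volume_pb_per_year = rate_mb_per_s * seconds_per_year / 1e9   # ~3.2 PB/year of raw data

  print(f"{rate_mb_per_s:.0f} MB/s, ~{volume_pb_per_year:.1f} PB/year")
  # Same order of magnitude as the 2 PB/year quoted on the slide; the exact
  # figure depends on the live time and on what is kept permanently.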
13
The ATLAS Computing Model
  • To cover a wide range of activities, from the storage of Raw Data up to providing the possibility of performing data analysis in a University Department (member of the ATLAS Collaboration)
  • The data undergo several transformations in order to obtain a reduction in size and to extract the relevant information

Data reduction chain (a sketch follows below)
  • The analysis physicist will navigate the data across its different formats in order to extract the needed information. This activity will have a big influence on the fine-tuning of the Computing Model

13
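A minimal sketch of the data reduction chain idea. Only the RAW event size (1.6 MB) comes from this talk; the downstream format name AOD and the ESD/AOD sizes are illustrative assumptions:

  # Sketch of a data reduction chain: each derived format is smaller per
  # event than the previous one. Sizes other than RAW are assumed.
  REDUCTION_CHAIN = [
      ("RAW", 1.6),   # raw detector data, MB/event (from the slides)
      ("ESD", 0.5),   # Event Summary Data, assumed size
      ("AOD", 0.1),   # Analysis Object Data, assumed name and size
  ]

  def chain_volumes_tb(n_events):
      """Total volume in TB of each format for n_events events."""
      return {name: n_events * size_mb / 1e6 for name, size_mb in REDUCTION_CHAIN}

  # One nominal year at 200 events/s over ~1e7 s gives about 2e9 events:
  print(chain_volumes_tb(2_000_000_000))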
14
(Diagram: the tiered hierarchy, from Tier-1 and Tier-2 centres down to Tier-3 centres and individual PCs and laptops; Tier-1 examples shown include RAL and IN2P3.)
  • LHC Computing Model (in a nutshell!)
  • Tier-0: the CERN centre
  • To filter the Raw Data
  • Reconstruction -> Event Summary Data (ESD)
  • Registration of Raw Data and ESD
  • Distribution of Raw Data and ESD to the Tier-1s
  • Tier-1
  • Permanent storage and organization of Raw Data, ESD, calibration data, metadata, analysis data and databases -> data services using GRID
  • Massive data analysis
  • Reprocessing of Raw Data -> ESD
  • Centre at the national/regional level
  • -> high availability for online data acquisition, massive storage and managed data, with long-term commitments

Tier-0 (CERN)
(Diagram, continued: further Tier-1 centres shown include FNAL, CNAF, FZK, PIC, ICEPP and BNL.)
  • Tier-2
  • Disk-based data storage services provided via GRID
  • To provide simulated data on demand from the experiment
  • To provide analysis capacity to the physics groups: operation of a data analysis system installation (20 working lines in parallel)
  • To provide the network services for data exchange with the TIER-1 (the tier roles above are summarized in the sketch below)

14
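The tier responsibilities listed on this slide can be collected into a simple data structure; a sketch only, paraphrasing the bullets above:

  # Sketch: the LHC tiered computing model as described on this slide.
  TIER_ROLES = {
      "Tier-0": ["filter raw data",
                 "first-pass reconstruction (RAW -> ESD)",
                 "register RAW and ESD",
                 "distribute RAW and ESD to the Tier-1s"],
      "Tier-1": ["permanent storage of RAW, ESD, calibration data, metadata",
                 "massive data analysis",
                 "reprocessing (RAW -> ESD)",
                 "high-availability services with long-term commitments"],
      "Tier-2": ["disk-based data services via GRID",
                 "simulated data production on demand",
                 "analysis capacity for the physics groups",
                 "network services for exchange with the Tier-1"],
  }

  for tier, roles in TIER_ROLES.items():
      print(tier + ": " + "; ".join(roles))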
15
High Energy Physics e-Infrastructures for the LHC

The TIERs:
  • TIER-1: PIC (CIEMAT, IFAE)
  • TIER-2:
    • CMS Tier-2: CIEMAT, IFCA
    • ATLAS Tier-2: UAM, IFAE and IFIC
    • LHCb Tier-2: USC, UB
  • (the underlined centres are the coordinators)
  • TIER-3: University departments, research centres, etc.
Recent funding by the Spanish HEP Programme (2005-2007)
15
16
230 PCs (172 IFIC + 58 ICMOL)
96 Athlon 1.2 GHz, 1 GB SDRAM
96 Athlon 1.4 GHz, 1 GB DDR
30 PE850 (DELL) nodes, Dual Core at 3.2 GHz
Local disk: 40 GB/160 GB
Fast Ethernet, aggregated with Gigabit Ethernet
IFIC
Manpower: 6 FTE
STK L700e700 tape robot: up to 134 TB capacity
4 disk servers (5 TB), 2 tape servers
16
17
UAM
IFAE
  • The computing nodes and the disks will be hosted in racks in the PIC computer room.
  • Location reserved for the ATLAS Tier-2

Disk servers: 4.5 TB
17
18
3. The EGEE vision and its relationship with the ATLAS TIER-2
An important issue: a System of Distributed Analysis based on GRID
Open Issues from LHC experiments
18
19
19
20
An important issue: a System of Distributed Analysis based on GRID
  • Has to be performed in parallel with ATLAS production (up to 50% of ATLAS resources)
  • Differences:
  • Production jobs are typically long simulation jobs, CPU-dominated, and have large memory requirements
  • DA jobs are much more I/O-oriented, with considerably smaller memory requirements
  • Plans according to the 3 flavours:
  • LCG plans to use the gLite Resource Broker and Condor-G to submit jobs to sites; support DA by providing a special CE for analysis or short jobs
  • Prototype of a DA system at the TIER-2

20
21
The need for Distributed Data Management in ATLAS
  • GRID provides services and tools for Distributed Data Management:
  • low-level file catalogues, storage and transfer services
  • ATLAS uses different GRID flavours (LCG, OSG, NorduGrid), and each one has its own version of these services
  • It is therefore necessary to implement a specific layer over the GRID middleware
  • What is the objective? To manage the data flow of ATLAS according to the computing model, providing a single entry point for all the distributed data of ATLAS (a sketch of this idea follows below)
  • The DDM (Distributed Data Management) system aims to achieve this objective by means of a piece of software called Don Quijote (DQ)

21
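A minimal sketch of the single-entry-point idea behind the DDM layer; the classes and method names here are hypothetical and are not the real DQ/DQ2 interfaces:

  # Sketch: one entry point over several GRID flavours (the DDM/DQ idea).
  # All names are hypothetical illustrations.

  class CatalogClient:
      """Hypothetical per-flavour catalogue client (LCG, OSG, NorduGrid)."""
      def __init__(self, flavour):
          self.flavour = flavour

      def lookup(self, lfn):
          # A real client would query that flavour's own replica catalogue
          # (e.g. an RLS instance); here we just return a fake replica.
          return [f"{self.flavour}://some.site/{lfn}"]

  class DataManagementLayer:
      """Single entry point: hides which flavour holds the replicas."""
      def __init__(self):
          self.backends = [CatalogClient(f) for f in ("LCG", "OSG", "NorduGrid")]

      def find_replicas(self, lfn):
          return [r for b in self.backends for r in b.lookup(lfn)]

  ddm = DataManagementLayer()
  print(ddm.find_replicas("atlas/mc/evgen.0001.root"))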
22
Don Quijote
  • The first version of DQ simply provided an interface to the different catalogues of the 3 GRID flavours, in order to locate the data, plus a simple file transfer system
  • DQ was tested in the Data Challenge 2 (DC2) programme of ATLAS, which had to validate the software and data model of the experiment
  • Due to (a) scalability problems and (b) the progress in GRID MW, DQ had to be re-engineered into DQ2

(Diagram: DQ queries the LCG EDG RLS, the OSG Globus RLS and the NorduGrid Globus RLS.)
22
23
Open Issues from LHC experiments
  • Security, authorization, authentication
  • VOMS available and stable (Prio: high)
  • VOMS groups and roles used by all middleware (Prio: high)
  • Information System
  • Stable access to static information (Prio: medium)
  • Access to the static information
  • Storage Management
  • SRM interface provided by all Storage Element services (done)
  • Support for disk quota management (Prio: low): disk quota management at both group and user level should be offered by all Storage Services
  • Checking of the integrity/validity after new replica creation (Prio: critical)

25
24
  • Data Management
  • FTS improvements and feature requests as specified in the FTS workshop (Prio: critical)
  • Central entry point for all transfers: FTS should provide a single central entry point for all the required transfer channels, including T0-T1, T1-T1 and T1-T2/T2-T1 transfers, and for the T2 sites running analysis tasks (Prio: critical)
  • Support for priorities, with the possibility of late reshuffling (Prio: low)
  • POSIX file access based on the LFN (sketched below)
  • File access API (GFAL library) using multiple instances of the LFC (Prio: high)

26
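The LFN-based POSIX access requirement can be illustrated with a small sketch; the resolver and the open helper below are hypothetical stand-ins, not the actual GFAL/LFC API:

  # Sketch: POSIX-style file access driven by a Logical File Name (LFN).
  # resolve_lfn() is a hypothetical placeholder for what a real client
  # would obtain from the LFC catalogue through the GFAL library.

  def resolve_lfn(lfn):
      """Hypothetical LFN -> local/physical path resolution."""
      return "/grid/storage/replicas/" + lfn.rsplit("/", 1)[-1]

  def open_by_lfn(lfn, mode="rb"):
      """Open a file POSIX-style, given only its logical name."""
      return open(resolve_lfn(lfn), mode)   # ordinary POSIX open on the resolved path

  # Usage (assuming the resolved path exists on a mounted file system):
  # with open_by_lfn("lfn:/grid/atlas/user/data.root") as f:
  #     header = f.read(64)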
25
  • Workload Management
  • Capability of handling 1 million short jobs (~30 min) in 1 day with the RB service: feature needed for SC4; the final short-job number is estimated to be 1 million (Prio: high)
  • Efficient use of the information system in the matchmaking: the capability of sending jobs to the sites where the input files are present and which have enough free CPU slots (Prio: high) (a sketch follows below)
  • Support for different priorities based on VOMS groups/roles (Prio: high)
  • The RB should reschedule the jobs in its internal task queue, using a prioritization system -> this feature is already available in the gLite RB (Prio: high)

27
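A minimal sketch of the matchmaking idea from the bullet above (prefer sites that already host the input files and still have free CPU slots); the site attributes and the ranking are illustrative assumptions, not the gLite Resource Broker algorithm:

  # Sketch: rank candidate sites by data locality, then by free CPU slots.
  sites = [
      {"name": "siteA", "free_slots": 120, "files": {"aod.0001.root", "aod.0002.root"}},
      {"name": "siteB", "free_slots": 0,   "files": {"aod.0001.root"}},
      {"name": "siteC", "free_slots": 40,  "files": set()},
  ]

  def match(job_inputs, candidates):
      """Return usable sites, best match first."""
      usable = [s for s in candidates if s["free_slots"] > 0]
      return sorted(usable,
                    key=lambda s: (len(job_inputs & s["files"]), s["free_slots"]),
                    reverse=True)

  job_inputs = {"aod.0001.root", "aod.0002.root"}
  print([s["name"] for s in match(job_inputs, sites)])   # siteA ranked first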
26
  • Workload Management (cont.)
  • CE service directly accessible by services/clients other than the RB
  • Allow for changing the identity of a job running on the WN
  • Monitoring Tools and Accounting
  • A scalable tool to collect VO-specific information
  • Publish/subscribe to logging and bookkeeping and local batch-system events for all jobs in the VO
  • Support for accounting, with site, user and group granularity (DGAS or equivalent)
  • Possibility to aggregate by a VO- (user-) specified tag

28
27
LCG Deployment Schedule
29
28
4. Conclusions and Perspectives
  • EGEE-II
  • Provides the MW, the general GRID framework, etc.
  • User support, dissemination and training
  • High level of synergy
  • TIER-2
  • The progress achieved so far has been very important: a Production System is working in an acceptable way
  • Next problem to be solved: access to a powerful system of GRID Distributed Data Analysis
  • The success of GRID in HEP (LHC) will be very important for the e-Science programmes
  • FINAL OBJECTIVE: every physicist at any ATLAS centre should be able to do her/his analysis from his/her home institute in an effective and fast way
  • EGEE-II and TIER-2
  • Very good relationship with TIER-2 operation: collaborative framework, user support, GRID middleware progress, GRID Distributed Analysis
  • To go beyond the ATLAS TIER-2 vision: GRID for High Energy and Nuclear Physics (Theoretical and Experimental)
  • To extend to industrial partners and National GRID Initiatives

30
29
Slide Backup
30
Don Quijote 2
  • Due to (a) scalability problems and (b) the progress in GRID MW, DQ had to be re-engineered into DQ2
  • The DQ2 architecture consists of datasets, central catalogues and site services
  • DQ2 is based on the concept of dataset versions
  • A dataset is defined as a collection of files or of other datasets
  • DQ2 relies on the ATLAS central catalogues (global catalogue), which define the datasets and their locations
  • The dataset is also the unit of data movement
  • To permit the movement of data, site services have been distributed; they use a subscription mechanism to move data from one place to another (sketched below)
  • More information:
  • https://uimon.cern.ch/twiki/bin/view/Atlas/DDM

23
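A minimal sketch of the dataset/subscription idea described above; the classes and methods are hypothetical and are not the real DQ2 site-services code:

  # Sketch: DQ2-style datasets and subscriptions. A dataset groups files,
  # a central catalogue records where each dataset lives, and a site
  # subscribes to a dataset to have it replicated locally.

  class Dataset:
      def __init__(self, name, files):
          self.name, self.files = name, list(files)

  class CentralCatalog:
      def __init__(self):
          self.locations = {}                      # dataset name -> set of sites

      def register(self, dataset, site):
          self.locations.setdefault(dataset.name, set()).add(site)

  class SiteServices:
      """Per-site agent that fulfils subscriptions by pulling missing data."""
      def __init__(self, site, catalog):
          self.site, self.catalog = site, catalog
          self.subscriptions = []

      def subscribe(self, dataset):
          self.subscriptions.append(dataset)

      def run_once(self):
          for ds in self.subscriptions:
              # A real agent would trigger the file transfers here.
              self.catalog.register(ds, self.site)
              print(f"{self.site}: replicated {ds.name} ({len(ds.files)} files)")

  catalog = CentralCatalog()
  agent = SiteServices("IFIC-Tier2", catalog)
  agent.subscribe(Dataset("mc.evgen.0001", ["f1.root", "f2.root"]))
  agent.run_once()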
31
Example of ATLAS GRID work (July 2004 - March 2005)
SCIENTIFIC APPLICATIONS ON THE GRID
  • 660K jobs in total (LCG, NorduGrid, US Grid3)
  • 400 kSI2k-years of CPU
  • In the latest period, an average of 7K jobs/day, with 5K in LCG

Mix of jobs: preparation for Rome, DC2 (short-jobs period), DC2 (long-jobs period)
José Salt
20