Title: José SALT
1 EGEE-II and the ATLAS TIER-2
- José SALT
- 26 April 2006
- IV Reunión de la Red Temática en GRID Middleware
- Universidad Complutense de Madrid
2 Overview
- 1.- GRID computing in the LHC experiments
- 2.- The Spanish Distributed TIER-2 (ATLAS)
- 3.- The EGEE (I and II) vision and its relationship with the ATLAS TIER-2
- 4.- An Important Issue: System of Distributed Data Analysis based on GRID
- 5.- Conclusions and Perspectives
3 1.- GRID computing in the LHC experiments
- High Energy Physics: there is a list of experiments (accelerator and non-accelerator) with varied scientific objectives within the field of Elementary Particles
  - Accelerator: Fermilab, SLAC, CERN, etc.
  - Non-accelerator: Astroparticles (AMS, ANTARES, K2K, MAGIC, ...)
- Problems in Computing: computing power, data storage and access to data
- Main Challenge: the LHC experiments
I am biased towards ATLAS, sorry!
4 Where? CERN. Name of the accelerator: LHC (Large Hadron Collider)
350 days left until the first collisions in the LHC
5 The 4 LHC experiments: ALICE, ATLAS, CMS and LHCb
6
- Detector: study of p-p collisions at high energies
- Start of data taking: Spring 2007
- Level 3 trigger: 200 events/s, with an event size of 1.6 MB/event
- Data volume: 2 PB/year during 10 years
- Estimated CPU needed to process the LHC data: 100,000 present-day PCs
- This generates 3 problems
  - Data storage
  - Processing
  - Users scattered worldwide
Possible solution:
GRID TECHNOLOGIES
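The figures quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal units (1 PB = 10^9 MB) and treats the yearly volume as raw trigger output only. The function names are illustrative and do not come from the talk.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
# Assumption (not from the talk): decimal units, 1 PB = 1e9 MB.

EVENT_RATE_HZ = 200        # Level 3 trigger output, events per second
EVENT_SIZE_MB = 1.6        # size of one event in MB
YEARLY_VOLUME_PB = 2.0     # quoted data volume per year
MB_PER_PB = 1e9
SECONDS_PER_DAY = 86_400


def raw_rate_mb_per_s(rate_hz: float = EVENT_RATE_HZ,
                      size_mb: float = EVENT_SIZE_MB) -> float:
    """Sustained output rate of the trigger in MB/s."""
    return rate_hz * size_mb


def live_days_for_volume(volume_pb: float = YEARLY_VOLUME_PB) -> float:
    """Days of continuous data taking needed to accumulate volume_pb."""
    seconds = volume_pb * MB_PER_PB / raw_rate_mb_per_s()
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"trigger output: {raw_rate_mb_per_s():.0f} MB/s")
    print(f"live time for {YEARLY_VOLUME_PB} PB: {live_days_for_volume():.1f} days")
```

Under these assumptions the trigger writes about 320 MB/s, so 2 PB corresponds to roughly 72 days of continuous data taking per year.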
7 The ATLAS Computing Model
- To cover a wide range of activities, from the storage of Raw Data up to the possibility of performing Data Analysis in a university department (member of the ATLAS Collaboration)
- The data undergo several transformations in order to reduce their size and extract the relevant information
Data reduction chain
- The analysis physicist will navigate the data through their different formats in order to extract the needed information. This activity will have a big influence on the fine tuning of the Computing Model
8 LHC Computing Model (in a nutshell!)
[Diagram labels: Tier-1, Tier-2, small centres, PCs and laptops; sites shown: RAL, IN2P3, FNAL, CNAF, FZK, PIC, ICEPP, BNL]
- Tier-0: the CERN centre
  - To filter the Raw Data
  - Reconstruction → Event Summary Data (ESD)
  - Recording of Raw Data and ESD
  - Distribution of Raw Data and ESD to the Tier-1 centres
- Tier-1
  - Permanent storage and organization of Raw Data, ESD, calibration data, metadata, analysis data and databases → data services by using GRID
  - Massive data analysis
  - Reprocessing: Raw Data → ESD
  - Call centre at the national/regional level
  - High availability for the online data acquisition, mass storage and data management under long-term commitments
- Tier-2
  - Disk data storage services provided through the GRID
  - To provide simulated data on experiment demand
  - To provide analysis capacity to the Physics Groups. Operation of an installation of a data analysis system (20 working lines in parallel)
  - To provide the network services for the interchange with the TIER-1
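The division of labour between the tiers can be written down as a small lookup table. This is only an illustrative encoding of the roles listed on the slide; the role keywords (`"reprocessing"`, `"simulation"` and so on) are hypothetical labels chosen for this sketch, not ATLAS terminology.

```python
# Illustrative encoding of the tier roles listed on the slide.
# The role keywords are hypothetical labels invented for this sketch.

TIER_ROLES = {
    "Tier-0": {"filter_raw", "reconstruction", "record_raw_and_esd",
               "distribute_to_tier1"},
    "Tier-1": {"permanent_storage", "massive_analysis", "reprocessing",
               "regional_call_centre"},
    "Tier-2": {"disk_storage", "simulation", "group_analysis",
               "network_to_tier1"},
}


def tiers_for(role: str) -> list[str]:
    """Return the tiers that carry out the given role, in name order."""
    return sorted(tier for tier, roles in TIER_ROLES.items() if role in roles)


if __name__ == "__main__":
    for role in ("reconstruction", "reprocessing", "simulation"):
        print(role, "->", tiers_for(role))
```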
9 2.- The Spanish Distributed TIER-2 of ATLAS
- After a 3-year project (LCG-ES), 2002-2005
- 4 TIER infrastructures launched in the last request to the PNFAE (Plan Nacional de Física de Altas Energías): 1 TIER-1 and 3 TIER-2
IFAE
UAM
IFIC
10 IFIC
- 192 PCs (134 IFIC + 58 ICMOL)
- 96 Athlon 1.2 GHz, 1 GB SDRAM
- 96 Athlon 1.4 GHz, 1 GB DDR
- Local hard disk: 40 GB
- Fast Ethernet aggregated with Gigabit Ethernet
- Pre-existing manpower: 6 FTE
- Robot STK L700e700: up to 134 TB capacity
- 4 disk servers (5 TB), 2 tape servers
11 Monte Carlo production for the Rome Workshop
Total number of simulation jobs: 100209 (5 million events)
Distribution among sites
Private production of the UAM group: 30,000 background events (Zbb in Higgs production)
GRID services used: 2 RB (IFAE, IFIC), 2 BDII (IFAE, IFIC), 2 proxy servers (IFIC, IFAE)
12 3.- The EGEE vision and its relationship with the ATLAS TIER-2
- Background: during the last 5 years IFIC/CSIC has participated in GRID projects of the EU Framework Programme: DATAGRID (2001-2004), CROSSGRID (2002-2005) and, now, EGEE (phases I and II)
- To join the efforts to establish e-Science in Spain
- EGEE and LCG (LHC Computing GRID) are strongly coupled, but they offer complementary visions of a given problem (e-Science)
- The HEP community has had a leading role in several GRID and e-Science initiatives
14 EGEE-II Activity Packages
- SA (Specific Service Activities)
  - SA1: GRID Operations, Support and Management (IFIC-IFCA)
  - SA2: Networking support
  - SA3: Integration, testing and certification (IFIC)
- Networking Activities
  - NA1: Management of the Project
  - NA2: Dissemination, Outreach and Communication (IFIC-IFCA)
  - NA3: Training and Induction (IFIC-IFCA)
  - NA4: Application Identification and Support (CNB)
  - NA5: Policy and International Cooperation
- Joint Research Activities
  - JRA1: Middleware Re-Engineering
  - JRA2: Quality Assurance
16 Open Issues from Experiments
- Stability in the Data Services
- Production and Distributed Analysis System
  - To use the Resource Broker to permit (a) multiple users, (b) a permanent service
  - To define thousands of jobs (analysis phase)
- Interoperability: there are several GRID infrastructures which should interoperate (OSG/NG/LCG)
- SUSTAINABILITY
  - To provide computing resources that remain available through the peaks and valleys of the cycles of different projects
  - Minimum human resources to maintain the services
  - To establish links with companies/industrial partners
17 TIER-2 Activity Workpackages
- PM: Project Management
- T2OP: TIER2 Operation
- T2RE: TIER2 Resource Optimization
- FMS: Fabric Management Support
- HSA: Hardware Support
- GS: GRID Services
- ESS: Experiment Software Support
- US: User Support
- SEP: Simulated Event Production
- DA: Distributed Analysis
- EF: Event Filter
- CHC: Calibration of Had. Cal.
18 4.- An Important Issue: System of Distributed Analysis based on GRID
- Main features of a DA system
- The Distributed Data Management
19 GRID Distributed Analysis
- Has to be performed in parallel to the ATLAS production (up to 50% of ATLAS resources)
- Differences
  - Production jobs are typically long simulation jobs, CPU dominated, with large memory requirements
  - DA jobs are much more I/O oriented, with considerably smaller memory requirements
- Plans according to the 3 flavours
  - LCG: plans to use the gLite Resource Broker and CondorG to submit jobs to sites; supports DA by providing a special CE for analysis or short jobs
- Prototype of a DA system at the TIER-2
20 Need of a Distributed Data Management in ATLAS
- GRID provides services and tools for the Distributed Data Management
  - Low-level file cataloguing, storage and transfer services
- ATLAS uses different GRID flavours (LCG, OSG, NorduGrid), each of which has its own version of these services
- It is necessary to implement a specific layer over the GRID middleware
- What is the objective? To manage the data flow of ATLAS according to the Computing Model, providing a single entry point for all the distributed data of ATLAS
- The DDM (Distributed Data Management) aims to achieve this objective by means of a software called Don Quijote (DQ)
21 Don Quijote
- The first version of DQ simply provided an interface to the different catalogs of the 3 GRID flavours in order to locate the data, plus a simple file transfer system
- DQ was tested in the Data Challenge 2 (DC2) program of ATLAS, which has to validate the software and data model of the experiment
[Diagram: DQ queries three catalogs: LCG (EDG RLS), OSG (Globus RLS), NG (Globus RLS)]
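The idea of one interface in front of three flavour-specific catalogs can be sketched as a simple facade. This is not the DQ code and does not use any real RLS client: the catalogs are in-memory stand-ins, and every class name, file name and URL below is made up for illustration.

```python
# Sketch of a single lookup interface over per-flavour replica catalogs.
# Hypothetical: the classes, file names and URLs are invented for this
# example and do not correspond to the real DQ or RLS interfaces.


class ReplicaCatalog:
    """Minimal in-memory stand-in for one flavour's replica catalog."""

    def __init__(self, flavour, entries):
        self.flavour = flavour
        self._entries = dict(entries)  # logical file name -> list of replica URLs

    def lookup(self, lfn):
        """Return the replica URLs known for a logical file name."""
        return list(self._entries.get(lfn, []))


class CatalogFacade:
    """Queries every registered catalog and merges the answers."""

    def __init__(self, catalogs):
        self._catalogs = list(catalogs)

    def locate(self, lfn):
        """Map flavour name -> replica URLs, skipping flavours with no copy."""
        found = {}
        for catalog in self._catalogs:
            replicas = catalog.lookup(lfn)
            if replicas:
                found[catalog.flavour] = replicas
        return found


def build_demo_facade():
    """Build a facade over three toy catalogs (made-up contents)."""
    return CatalogFacade([
        ReplicaCatalog("LCG", {"dc2.file.0001": ["srm://lcg.example/dc2.file.0001"]}),
        ReplicaCatalog("OSG", {}),
        ReplicaCatalog("NG", {"dc2.file.0001": ["gsiftp://ng.example/dc2.file.0001"]}),
    ])


if __name__ == "__main__":
    print(build_demo_facade().locate("dc2.file.0001"))
```

The caller asks one object where a file lives and never needs to know which flavour holds it, which is the "single entry point" role described for DQ.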
22 Don Quijote 2
- Due to (a) scalability problems and (b) the progress in GRID middleware, DQ has had to be re-engineered as DQ2
- The DQ2 architecture consists of datasets, central catalogs and site services
- DQ2 is based on the concept of dataset versions
  - A dataset is defined as a collection of files or of other datasets
- DQ2 relies on the ATLAS central catalogs (global catalog), which define the datasets and their locations
  - The dataset is also the unit of data movement
- To permit the movement of data, site services have been deployed; they use the subscription mechanism to move data from one place to another
- More information
  - https://uimon.cern.ch/twiki/bin/view/Atlas/DDM
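The interplay of datasets, central catalog and site services can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not the DQ2 implementation: all class and method names are hypothetical, dataset versions and nested datasets are left out, and a "transfer" is just adding a file name to a set.

```python
# Toy model of dataset subscriptions. Hypothetical names throughout;
# versions and nested datasets are deliberately omitted.
from dataclasses import dataclass, field


@dataclass
class CentralCatalog:
    contents: dict = field(default_factory=dict)   # dataset -> set of file names
    locations: dict = field(default_factory=dict)  # dataset -> set of site names

    def register(self, dataset, files, site):
        """Declare a dataset, its files, and a site that already holds it."""
        self.contents[dataset] = set(files)
        self.locations.setdefault(dataset, set()).add(site)


@dataclass
class SiteService:
    site: str
    catalog: CentralCatalog
    storage: set = field(default_factory=set)        # files present at this site
    subscriptions: set = field(default_factory=set)  # datasets this site wants

    def subscribe(self, dataset):
        self.subscriptions.add(dataset)

    def process_subscriptions(self):
        """Fetch files missing locally; return the names that were transferred."""
        transferred = []
        for dataset in sorted(self.subscriptions):
            if dataset not in self.catalog.contents:
                continue  # dataset not registered yet: nothing to fetch
            missing = self.catalog.contents[dataset] - self.storage
            transferred.extend(sorted(missing))
            self.storage |= missing
            self.catalog.locations[dataset].add(self.site)
        return transferred


if __name__ == "__main__":
    catalog = CentralCatalog()
    catalog.register("demo.dataset", ["file_a", "file_b"], site="CERN")
    ific = SiteService("IFIC", catalog)
    ific.subscribe("demo.dataset")
    print(ific.process_subscriptions())      # files pulled on the first pass
    print(catalog.locations["demo.dataset"])
```

The point of the subscription design is that a site states which datasets it wants once, and the site service keeps pulling whatever is still missing; a second pass over an already complete dataset transfers nothing.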
23 Total data transferred using DQ2 during the Tier-0 exercise
24 5.- Conclusions and Perspectives
- TIER2
  - The progress achieved until now has been very important: a Production System is working in an acceptable way
  - Next problem to be solved: to have access to a powerful system of GRID Distributed Data Analysis
  - The success of GRID in HEP (LHC) will be very important for the e-Science programs
  - FINAL OBJECTIVE: every physicist of any ATLAS centre should be able to do her/his analysis from his/her home institute in an effective and fast way
- EGEE-II
  - Very good relationship with TIER-2 operation, collaborative framework, User Support, GRID Middleware progress
  - To go beyond the ATLAS TIER-2 vision: GRID for High Energy and Nuclear Physics (Theoretical and Experimental)
  - To extend to the industrial partners and National GRID initiatives
25 BACKUP SLIDES
26
- The infrastructures have to increase their resources over time according to the growth forecasts, and they also have to provide the services they have committed to in order to reach the objectives of maximum scientific output
- Establishment of service agreements and coordination of the resource centres
- Relationship of EGEE with the new TIER projects
- Policy to follow with the non-LHC experiment groups of our centres (CDF, BaBar, AMS, MAGIC, etc.) or of other centres
- Low level of GRID utilization
DISCUSSION and summary of proposals
27 LCG
28 CERN Collaborators
Europe: 267 institutes, 4603 users. Elsewhere: 208 institutes, 1632 users
CERN has over 6,000 users from more than 450 institutes from around the world
LHC Computing → uniting the computing resources of particle physicists in the world!
29 Production system of the ATLAS experiment
- Data Challenges (DC) programme
  - In 2002 the ATLAS collaboration designed the Data Challenges (DC) programme with the aim of validating its Computing Model, its software and its data model
  - To start using and testing GRID technologies
- With DC1 (the GRID was not used) the following was achieved
  - Development and deployment of the software needed for large-scale event production
  - Institutes from all over the world took part in the production
- With DC2, large-scale event production has been achieved using the GRID middleware developed in three projects (Grid flavours)
  - LHC Computing Grid project (LCG), in which CERN and IFIC participate
  - GRID3
  - NorduGRID
Santiago González de la Hoz
30 ATLAS Activities at the centres
ATLAS Physics: NOW; it will be enlarged by the time the LHC starts
- UAM
  - Construction of the ATLAS Electromagnetic Calorimeter
  - ATLAS Physics: MC studies on Higgs production using 2 decay modes: 2 gammas and 4 leptons
- IFIC
  - Construction of the ATLAS Hadronic Calorimeter (TileCal)
  - Construction of the ATLAS SCT-Fwd
  - ATLAS Physics: b-tagging algorithms for event selection; MC studies of different processes beyond the SM (Little Higgs and Extra Dimensions models)
- IFAE
  - Construction and Commissioning of the ATLAS Hadronic Calorimeter (TileCal)
  - Development and Deployment of the ATLAS third level trigger (Event Filter Farm)
  - ATLAS Physics: TileCal calibration, reconstruction and calibration of Jet/Tau/Missing Transverse Energy, neutral and charged Higgs search
31 The LCG Project (LHC Computing GRID)
- The LCG aims to deploy a Global Computing Facility for Physics
- Large and very expert international community
  - Involved in many worldwide projects and users of several GRIDs (for example, all the LHC experiments use multiple grids at the same time for their DCs)
- Production infrastructure (LCG/EGEE)
  - Intensive use
  - For example, LHCb: > 3500 concurrent jobs during long periods (weeks); ATLAS: more than 6000 jobs using multiple GRIDs (LCG, GRID3, NorduGrid)
  - Complex simulation campaigns using high-level services developed in the HEP community for overall coordination, data distribution, monitoring and the use of heterogeneous grids
32 The LCG Project (LHC Computing GRID)
- The use of the LCG2 infrastructure for analysis has started (system exposed to end users). Example in CMS
- Activity in other HEP experiments using LCG2 (BaBar, CDF, D0, ...)
- The construction and operation of the LCG infrastructure requires
  - Physicists and computing specialists from the LHC experiments
  - Projects in Europe and in the United States that have developed the GRID middleware
  - Regional and national computing centres that provide resources to the LHC
  - Research networks
[Diagram labels: researchers, software engineers, service providers]
33
- Conditions for deploying scientific (or other) applications in GRID environments
  - Possibility and need of sharing resources
  - Secure access
  - Efficient use of resources
  - Reliable communication networks
  - Open source
- These characteristics are not exclusive to particular subject areas, whether scientific-technological or in the Humanities (for example, philological studies, archaeology, etc.)
- Two possible perspectives on the applications: interactive or non-interactive
34 ATLAS example: pre-existing resources, 2005
UAM
- Pre-existing manpower: 2.5 FTE
- 46 PCs, P IV 2.8 GHz, 1 GB RAM
- Disk servers: 4.5 TB
35 IFAE
Pre-existing manpower: 2 FTE
- Protected power with UPS: 220 kVA
- Diesel power generator: 500 kVA
- Access control with microchip cards
- Castor robotic storage
- Internal and external Gigabit Ethernet infrastructure
- The computer nodes and the disks will be hosted in racks at the PIC computer room
- Location reserved for the ATLAS Tier-2
36 SCIENTIFIC APPLICATIONS ON THE GRID
ARDA: A Realisation of Distributed Analysis for LHC
- Role of ARDA in application development and middleware testing
  - Helping experiment-specific middleware evolve towards use for analysis
  - Large effort on the prototypes of the 4 experiments
  - CMS prototype migrated to gLite version 1 and exposed to several users
- Immediate feedback on the use of the gLite prototype right from the start of EGEE
  - Contribution to the common testing effort together with JRA1, SA1 and NA4-testing
  - Detailed performance/functionality measurements
  - Helping new colleagues gain experience (mini tutorial)
37 Progress and results of the LCG-ES Project
- LCG-ES: 3-year project of the PNFAE (Oct 2002 to Sept 2005)
- Progress in two areas: GRID development and deployment, and Operations
- Participation in the ATLAS Data Challenges (DC1, DC2)
38 Results of the LCG-ES Project
- LCG-ES: 3-year project to start deploying the GRID for the Spanish groups participating in LHC experiments (Oct 2002 to Sept 2005)
- Progress has taken place in two areas: Development and Integration, and Operation of the Infrastructure
- Participation in the ATLAS Data Challenges (DC1, DC2)
39 Results obtained from the SWE-EGEE accounting system
[Plot: time evolution of the number of jobs for the ATLAS VO]
[Plot: total CPU time for all the VOs of the SWE (June 2005)]
40 High Energy Physics e-infrastructures for the LHC: the TIERs
- TIER-1: PIC (CIEMAT, IFAE)
- TIER-2
  - CMS Tier-2: CIEMAT, IFCA
  - ATLAS Tier-2: UAM, IFAE and IFIC
  - LHCb Tier-2: USC, UB
  - (On the original slide, the underlined centre is the coordinating centre)
- TIER-3: university departments, research centres, etc.
Recent request to the Programa Nacional de Altas Energías 2005. Plan for 2 years (2005-2007)
41 Data flow of the ATLAS experiment
[Diagram labels: Detector; RAW data; CERN Computer Centre (Tier 0); Reconstructed RAW data; GRID; Small data products; Reprocessing; Simulated data; Tier 1 centres; Tier 2 centres]