1
CMS Computing: Results and Prospects
  • Outline
  • Schedule
  • Pre-Data Challenge 04 Production
  • Data Challenge 04
  • Design and purpose
  • SW and MW components
  • Results
  • Lessons learned
  • Prospects and upcoming activities
  • Conclusions

Note: little on the pre-Challenge Production (PCP), but an update of
what was presented in September at Lecce
2
CMS Computing schedule
  • 2004
  • Mar/Apr. DC04 to study T0 Reconstruction, Data Distribution,
    Real-time analysis; 25% of startup scale
  • May/Jul. Data available and usable by the PRS groups
  • Sep. PRS analysis feedback
  • Sep. Draft CMS Computing Model in CHEP papers
  • Nov. ARDA prototypes
  • Nov. Milestone on Interoperability
  • Dec. Computing TDR in initial draft form. NEW
    milestone date
  • 2005
  • July. LCG TDR and CMS Computing TDR NEW
    milestone date
  • Post July?... DC05, 50% of startup scale. NEW milestone date
  • Dec. Physics TDR Based on Post-DC04
    activities
  • 2006
  • DC06 Final readiness tests
  • Fall. Computing Systems in place for LHC
    startup
  • Continuous testing and preparations for data

3
CMS permanent production
T. Wildish
The system is evolving into a permanent production effort.
Strong contribution of INFN and the CNAF Tier-1 to CMS past and future
productions: 252 assids in PCP-DC04, for all production steps, both
local and (when possible) Grid
4
PCP @ INFN statistics (4/4)
CMS production steps: Generation, Simulation, ooHitformatting,
Digitisation; continued through the DC!
[Plots: 2x10^33 digitisation step, all CMS and INFN only, Feb 04 to
May 04; 24 Mevents in 6 weeks, overlapping with DC04]
Note the strong contribution to all steps by the CNAF T1, but only
outside DC04 (during the DC it was too hard for the CNAF T1 to also be
a RC!!)
43 Mevts in CMS, 7.8 Mevts (18%) done by INFN
D. Bonacorsi
5
PCP grid-based prototypes
Constant integration work in CMS between
  • CMS software and production tools
  • evolving EDG-X → LCG-Y middleware
in several phases:
  • CMS Stress Test, stressing EDG < 1.4, then
  • PCP on the CMS/LCG-0 testbed
  • PCP on LCG-1, towards DC04 with LCG-2
EU-CMS submit to the LCG scheduler → CMS-LCG virtual Regional Center
  • 0.5 Mevts heavy-pythia Generation (2000 jobs, 8 hours each, 10
    KSI2000 months)
  • 2.1 Mevts CMSIM+OSCAR Simulation (8500 jobs, 10 hours each, 130
    KSI2000 months), 2 TB of data
[Plots: OSCAR, 0.6 Mevts on LCG-1; CMSIM, 1.5 Mevts on CMS/LCG-0;
PIII 1 GHz]
D. Bonacorsi
6
Purpose of Data Challenge 04
  • Aim of DC04:
  • reach a sustained 25 Hz reconstruction rate in the Tier-0 farm
    (25% of the target conditions for LHC startup)
  • register data and metadata to a catalogue
  • transfer the reconstructed data to all Tier-1 centers
  • analyze the reconstructed data at the Tier-1s as they arrive
  • publicize to the community the data produced at the Tier-1s
  • monitor and archive the performance criteria of the ensemble of
    activities for debugging and post-mortem analysis
  • Not a CPU challenge, but a full-chain demonstration!
  • Pre-challenge production in 2003/04
  • 70M Monte Carlo events (30M with Geant-4) produced
  • Classic and grid (CMS/LCG-0, LCG-1, Grid3) productions

It was a challenge, and every time a scalability limit of some
component was found, that was a Success!
7
Data Challenge 04 layout
By C. Grandi
[Layout diagram of the participating sites; the only Tier-2 in DC04
was LNL (INFN)]
Full chain (except the Tier-0 reconstruction) done in LCG-2, but only
for INFN and PIC. Not without pain.
8
Data Challenge 04 numbers
  • Pre-Challenge Production (PCP04), Jul03-Feb04
  • Simulated events: 75 M events; 750k jobs, 800k files, 5000 KSI2000
    months, 100 TB of data (30 M with Geant4)
  • Digitised events (raw): 35 M events; 35k jobs, 105k files
  • Where: INFN, USA, CERN, ...
  • In Italy: 10-15 M events (20%)
  • For whom (Physics and Reconstruction Software groups): Muons,
    B-tau, e-gamma, Higgs
  • Data Challenge 04, Mar04-Apr04
  • Events reconstructed (DST) at the CERN Tier-0: 25 M events; 25k
    jobs, 400k files, 150 KSI2000 months, 6 TB of data
  • Events distributed to the Tier1-CNAF and Tier2-LNL: the same 25 M
    events and files
  • Events analysed at the Tier1-CNAF and Tier2-LNL: > 10 M events;
    15k jobs, each taking 30 min of CPU
  • Post Data Challenge 04, May04-
  • Events to reprocess (DST): 25 M events
  • Events to analyse in Italy: 50% of 75 M events
  • Events to produce and distribute: 50 M

9
Data Challenge 04: MW and SW components
  • CMS specific
  • Transfer Agents to transfer the DST files (at CERN and at the
    Tier-1s)
  • Mass Storage Systems on tape (Castor, Enstore, etc.) (at CERN and
    at the Tier-1s)
  • RefDB, database of dataset requests and assignments (at CERN)
  • Cobra, the CMS software framework (CMS wide)
  • ORCA, OSCAR (Geant4), CMS reconstruction and simulation (CMS wide)
  • McRunJob, job preparation system (CMS wide)
  • BOSS, job tracking system (CMS wide)
  • SRB, file replication and catalogue system (at CERN, RAL, Lyon and
    FZK)
  • MySQL-POOL, POOL backend on a MySQL database (at FNAL)
  • ORACLE database (at CERN and at the Tier1-INFN)
  • LCG common
  • User Interfaces including the Replica Manager (at CNAF, Padova,
    LNL, Bari, PIC)
  • Storage Elements (at CNAF, LNL, PIC)
  • Computing Elements (at CNAF, LNL and PIC)
  • Replica Location Service (at CERN and at the Tier1-CNAF)
  • Resource Broker (at CERN and at the CNAF-Tier1-Grid-it)
  • Storage Replica Manager (at CERN and at FNAL)
  • Berkeley Database Information Index (at CERN)
  • Virtual Organization Management System (at CERN)
  • GridICE, monitoring system (on the CEs, SEs, WNs, ...)
  • POOL, persistency catalogue (in the CERN RLS)
  • US specific
  • Monte Carlo distributed production system (MOP) (at FNAL,
    Wisconsin, Florida, ...)
  • MonaLisa, monitoring system (CMS wide)
  • Custom McRunJob, job preparation system (at FNAL and perhaps
    Florida)

10
Data Challenge 04 Processing Rate
  • Processed about 30M events
  • But DST errors make this pass not useful for
    analysis
  • Generally kept up at T1s in CNAF, FNAL, PIC
  • Got above 25Hz on many short occasions
  • But only one full day above 25Hz with full system
  • Working now to document the many different
    problems

11
Data Challenge 04 data transfer from CERN to
INFN
  • A total of > 500k files and 6 TB of data transferred from the CERN
    T0 to the CNAF T1
  • max number of files per day: 45,000, on March 31st
  • max size per day: 400 GB, on March 13th (> 700 GB considering the
    Zips)

GARR network use: 340 Mbps (> 42 MB/s) sustained for 5 hours (the
maximum was 383.8 Mbps)
D. Bonacorsi
12
DC04 Real-Time (fake) Analysis
  • CMS software installation
  • CMS Software Manager (M. Corvo) installs software
    via a grid job provided by LCG
  • RPM distribution based on CMSI or DAR
    distribution
  • Used at CNAF, PIC, Legnaro, Ciemat and Taiwan
    with RPMs
  • Site manager installs RPMs via LCFGng
  • Used at Imperial College
  • Still inadequate for general CMS users
  • Real-time analysis at Tier-1
  • Main difficulty is to identify complete file sets
    (i.e. runs)
  • Information today in TMDB or via findColls
  • Job processes single runs at the site close to
    the data files
  • File access via rfio
  • Output data registered in RLS

A. Fanfani C. Grandi
13
DC04 Fake Analysis Architecture


[Architecture diagram: the data transfer drops files; a Drop agent and
a Fake Analysis agent then hand jobs to the LCG Resource Broker, which
runs them on LCG Worker Nodes]
  • The Drop agent triggers job preparation/submission when all files
    are available
  • The Fake Analysis agent prepares the XML catalog, orcarc and JDL
    script and submits the job
  • Jobs record start/end timestamps in a MySQL DB (a minimal sketch
    of this agent pair is given below)

J. Hernandez
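To make the agent mechanics above concrete, here is a minimal Python
sketch of a Drop/Fake Analysis agent loop, under stated assumptions:
the `drop_files` table, the file names and the direct `edg-job-submit`
call are illustrative stand-ins, not the actual DC04 agents (which
worked through BOSS and the TMDB schema).

```python
# Minimal sketch of a Drop / Fake Analysis agent pair. The schema
# (drop_files), file names and submission command are assumptions.
import subprocess
import time

import MySQLdb  # MySQL client library, as the agents talk to a local MySQL DB


def complete_runs(db):
    """Return runs whose files have all arrived and are not yet analysed."""
    cur = db.cursor()
    cur.execute(
        "SELECT run FROM drop_files GROUP BY run "
        "HAVING SUM(arrived) = COUNT(*) AND MAX(analysed) = 0"
    )
    return [row[0] for row in cur.fetchall()]


def prepare_job(run):
    """Write a JDL for one run (the XML catalog and orcarc are prepared alike)."""
    jdl = (
        'Executable    = "run_orca.sh";\n'
        f'Arguments     = "{run}";\n'
        f'InputSandbox  = {{"run_orca.sh", "catalog_{run}.xml", "orcarc_{run}"}};\n'
        'OutputSandbox = {"stdout.log", "stderr.log"};\n'
    )
    path = f"analysis_{run}.jdl"
    with open(path, "w") as f:
        f.write(jdl)
    return path


def main_loop():
    db = MySQLdb.connect(host="localhost", db="dc04_agents")
    while True:
        for run in complete_runs(db):
            jdl = prepare_job(run)
            # DC04 submission went via BOSS and the LCG Resource Broker;
            # a plain edg-job-submit call stands in for that here.
            subprocess.run(["edg-job-submit", "-o", "jobids.txt", jdl])
            db.cursor().execute(
                "UPDATE drop_files SET analysed = 1 WHERE run = %s", (run,)
            )
            db.commit()
        time.sleep(60)  # poll the drop area every minute
```

The point is only the control flow: poll the drop area, act once a
run's file set is complete, then mark the run so it is not resubmitted.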
14
Real-time DC04 analysis: turn-around time from T0
  • The minimum time from T0 to T1 analysis was 10 minutes
  • Different problems contributed to the spread in times:
  • the dataset-oriented analysis made the results dependent on which
    datasets were sent in real time from CERN
  • tuning of the Tier-1 Replica Agent
  • Replica Agent operation affected by a CASTOR problem
  • Analysis Agents were not always up, due to debugging
  • for 1 dataset the zipped metadata arrived late with respect to the
    data
  • a few problems with submission

Preliminary
N. De Filippis, A. Fanfani, F. Fanzago
15
DC04 Real-time Analysis
  • Maximum rate of analysis jobs: 194 jobs/hour
  • Maximum rate of analysed events: 26 Hz
  • Total of 15,000 analysis jobs via Grid tools in 2 weeks (95-99%
    efficiency)
  • Dataset examples:
  • B0s → J/ψ φ
  • Bkg: mu03_tt2mu, mu03_DY2mu
  • ttH, H → bbbar; t → Wb, W → lν; t → Wb, W → had.
  • Bkg: bt03_ttbb_tth
  • Bkg: bt03_qcd170_tth
  • Bkg: mu03_W1mu
  • H → WW → 2μ2ν
  • Bkg: mu03_tt2mu, mu03_DY2mu

N. De Filippis, A. Fanfani, F. Fanzago
16
Reconstruction software and DST
Without the PRS (b-tau, muon, e-gamma) work on the reconstruction
software there would be neither analysis nor a Data Challenge (04).
INFN is the major contributor: Ba, Bo, Fi, Pi, Pd, Pg, Rm1, To.
  • Last CMS week → today: prototype DST in place
  • Huge effort by large number of people, especially
    S. Wynhoff, N. Neumeister, T. Todorov, V.
    Innocente for base. Also from
  • Emilio Meschi, David Futyan, George Daskalakis,
    Pascal Vanlaer, Stefano Lacaprara, Christian
    Weiser, Arno Heister, Wolfgang Adam, Marcin
    Konecki, Andre Holzner, Olivier van der Aa,
    Christophe Delaere, Paolo Meridiani, Nicola
    Amapane, Susanna Cucciarelli, Haifeng Pi
  • The DST constitutes the first CMS summary data
  • Examples of doing physics with it are in place, but it is not
    complete

P. Sphicas
17
PRS analysis contributions
  • ttH, H → bb and related backgrounds
  • S. Cucciarelli, F. Ambroglini, C. Weiser, S. Kappler, A. Bocci, R.
    Ranieri, A. Heister ...
  • Bs → J/ψ φ and related backgrounds
  • V. Ciulli, N. Magini, Dubna group...
  • A/H(SUSY) → ττ, established channel for the SUSY H HLT
  • People/channels:
  • A/H → 2τ → τ-jet τ-jet: S. Gennai, S. Lehti, L. Wendland
  • Reconstruction: full track reconstruction starting from raw data,
    several algorithms already implemented
  • Studies of RecHits, sensor positions, B field, material
    distribution
  • W. Adam, M. Konecki, S. Cucciarelli, A. Frey, T. Todorov
  • H → γγ
  • People: G. Anagnostou, G. Daskalakis, A. Kyriakis, K. Lassila, N.
    Marinelli, J. Nysten, K. Armour, S. Bhattacharya, J. Branson, J.
    Letts, T. Lee, V. Litvin, H. Newman, S. Shevchenko
  • H → ZZ(*) → 4e
  • People: David Futyan, Paolo Meridiani, Kate Mackay, Emilio Meschi,
    Ivica Puljak, Claude Charlot, Nikola Godinovic, Federico Ferri,
    Stephane Bimbot
  • H → WW → 2μ2ν
  • Zanetti, Lacaprara

And many others!!!!
  • Calibrations and alignments
  • Higgs studies

18
Data Challenge 04: lessons (1/2)
  • Many of the components used do not scale (both CMS and non-CMS
    ones)
  • RLS
  • Castor
  • dCache
  • Metadata
  • SRB
  • catalogues of various types and kinds
  • the job submission system at the Tier-0
  • etc.
  • Many functions/components were missing
  • Data Transfer Management
  • global data location across (at least) all the Tier-1s
  • Nothing wrong with that: it was a challenge, made exactly for this!
  • But the real lesson was (surprise?) that
  • there was (and is) NO organisation, neither for LCG nor for CMS
    nor for Grid3
  • there was (and is) NO consistent design of either a Data Model or
    a Computing Model
  • except, partially, in Italy and in the USA!

19
Data Challenge 04: lessons (2/2)
Indeed, for example...
D. Bonacorsi
20
INFN prospects
  • Short term
  • Re-create the DSTs with a version of ORCA (CMS sw)
  • validated by the analyses while production is ongoing
  • wherever possible (Tier-0, Tier-1s and Tier-2s)
  • Distribute the DSTs, the other data formats (Digi, SimHits) and
    the metadata
  • to the Tier-1s and, from there, to the Tier-2s
  • Enable locally distributed analysis
  • in a way that is consistent for data access (few tools allow it)
  • Medium term
  • Build a Data Model
  • Build a Computing Model
  • Build a consistent, distributed architecture
  • Build controlled (and semi-transparent) access to the data
  • with the components that exist and that have a prospect of
    scalability (to be measured again, in an organic way)

21
Post Data Challenge 04 activities
  • June 04 - July 04
  • Re-creation of the DSTs
  • Distribution of the files (data and metadata) needed for analysis
  • First results for the PRS groups and for the Physics TDR
  • July 04 - July 05
  • Production of new (or old) datasets (including DSTs)
  • Target: 10 M events/month, steady, for the Physics TDR
  • Continuous analysis of the produced data
  • Sep 04 - Oct 04
  • Data Challenge 04 results for CHEP04
  • First definition of the Data/Computing Model
  • Definition of the MoUs
  • Jul 05 -
  • CMS Computing TDR (and LCG TDR)
  • Data Challenge 05, to verify the Computing Model
  • Resources needed (2005):
  • storage for analysis and production at the Tier-1, Tier-2 and
    Tier-3 sites
  • CPUs for production and analysis at the Tier-1 and Tier-2 sites

22
Possible evolution of CCS tasks (Core Computing and Software)
  • CCS will reorganize to match the new requirements and the move
    from R&D to implementation for physics
  • Meet the PRS Production Requirements (Physics TDR
    Analysis)
  • Build the Data Management and Distributed
    Analysis infrastructures
  • Production Operations group NEW
  • Outside of CERN. Must find ways to reduce
    manpower requirements.
  • Using predominantly (only?) GRID resources.
  • Data Management Task NEW
  • Project to respond to DM RTAG
  • Physicists/ Computing to define CMS Blueprint,
    relationships with suppliers (LCG/EGEE), CMS DM
    task in Computing group
  • Expect to make major use of manpower and
    experience from CDF/D0 Run II
  • Workload Management Task NEW
  • Make the Grid useable to CMS users
  • Make major use of manpower with EDG/LCG/EGEE
    experience
  • Distributed Analysis Cross Project (DAPROM) NEW
  • Coordinate and harmonize analysis activities
    between CCS and PRS
  • Work closely with Data and Workload Management
    tasks
  • Establish high-level Physics/Computing panel
    between T1 countries to ensure Collaboration
    Ownership of Computing Model for MoU and RRB
    discussions

23
Conclusions
  • The CMS Data Challenge 04 was a success
  • Many functionalities measured in a scientific way
  • Many failures and bottlenecks discovered (but the 25 Hz were
    reached!)
  • Many things understood (??)
  • The Italian (INFN) contribution was decisive
  • The CMS Data Challenge 04 was not a success
  • It was not planned sufficiently in advance
  • It required the continuous (two months) presence and intervention
    of willing people (20 hours per day, week-ends included) for
    on-the-fly solutions: about 30 people, world-wide
  • There is NOT yet an objective evaluation of the results
  • Everything that worked (for better or for worse) gets criticised a
    priori, without realistic alternative proposals
  • Nevertheless CMS, having got over the stress of DC04, is
    recovering

The CMS system is evolving into a permanent Production and Analysis
effort
24
Specific 2004 milestones (1/2)
  • Participation of at least three sites in DC04 (March)
  • Import into Italy (Tier1-CNAF) all the events reconstructed at the
    T0
  • Distribute the selected streams to at least three sites (about 6
    streams, 20 M events, 5 TB of AOD)
  • The selection covers the analysis of at least 4 signal channels
    and their backgrounds, plus the calibration studies
  • Deliverable: Italian contribution to the DC04 report, as input to
    the C-TDR and to the preparation of the P-TDR. Results of the
    analysis of the channels assigned to Italy (at least 3 streams and
    4 signal channels)
  • Integration of the CMS Italy computing system into LCG (June)
  • The Tier-1, half of the Tier-2s (LNL, Ba, Bo, Pd, Pi, Rm1) and a
    third of the Tier-3s (Ct, Fi, Mi, Na, Pg, To) have the LCG
    software installed and are able to work in the LCG environment
  • This entails installing the software packages coming from LCG AA
    and LCG GDA (from POOL to RLS etc.)
  • Completion of the analysis using the LCG infrastructure and
    further productions of about 2 M events
  • Deliverable: CMS Italy is integrated into LCG for more than half
    of its resources

Status: the end of DC04 slipped to April. Sites: Ba, Bo, Fi, LNL, Pd,
Pi, CNAF-Tier1. 2 streams, but 4 analysis channels. DONE, 90%.
Sites integrated into LCG: CNAF-Tier1, LNL, Ba, Pd, Bo, Pi. The longer
than expected analysis of the DC04 results delays this by at least 3
months. In progress, 30%.
25
Specific 2004 milestones (2/2)
  • Participation in the C-TDR (October)
  • Includes the definition of the Italian participation in the C-TDR
    in terms of:
  • resources and sites (possibly all of them)
  • man-power
  • funding and intervention plan
  • Deliverable: drafts of the C-TDR with the Italian contribution
  • Participation in the PCP of DC05 by at least the Tier-1 and the
    Tier-2s (December)
  • The Tier-1 is CNAF and the Tier-2s are LNL, Ba, Bo, Pd, Pi, Rm1
  • Production of 20 M events for the P-TDR studies, or the equivalent
    (the studies might require fast-MC or special programs)
  • Contribution to the definition of the LCG-TDR
  • Deliverable: production of the events needed to validate the
    fast-simulation tools and for the P-TDR studies (20 M events on
    the Tier-1 and the Tier-2/3s)

Status: the Computing TDR is now due in July 2005, so the milestone
slips accordingly. Stand-by/progress, 10%.
Data Challenge 05 slips to July 2005, so the milestone slips
accordingly. Stand-by, 0%.
26
Back-up Slides
27
CMS Computing Model
  • Computing Model design
  • Data location and access Model
  • Analysis (user) Model
  • CMS Software and Tools
  • Infrastructure Organization (Tiers and LCG)

28
(No Transcript)
29
CPU Power Ramp Up
[Chart: CPU power ramp-up, average slope x2.5/year, from the DAQ TDR
through DC04/C-TDR, DC05/P-TDR/LCG-TDR and DC06 readiness, up to LHC
running at 2E33 and 1E34; the actual PCP and DC04 levels are marked;
time-shared resources evolve into dedicated CMS resources]
30
NO HEAVY IONS INCLUDED YET!
Estimates prepared as input to the MoU Task Force; computing models
are under active development
31
Tier-1 Centers are Crucial to CMS
  • CMS expects to have (External) T1 centers at
  • CNAF, FNAL, Lyon, Karlsruhe, PIC, RAL
  • And a Tier-1 center at CERN (Still discussing
    role of CERN T1)
  • Current Computing model gives total External T1
    requirements
  • Assumed over 6 centers, but not necessarily 6
    equal centers
  • Tier-1 centers will be crucial for
  • Calibration, Reprocessing, Data-Serving
  • To service the requirements of the Tier-2 centers
  • Both from the region and via explicit
    relationships with external T2 centers.
  • Servicing the analysis requirements of their
    regions
  • Next step is to iterate with the T1 centers/CMS
    Country managements to understand what they can
    realistically hope to propose and to possibly
    succeed in obtaining

32
Possible Sizing of Regional T1s
  • Assume 1 T1 at CERN and the sum of 6 external T1s
  • Take the truncated sum of the collaboration in the T1 countries
    and calculate the fraction in each of those countries
  • Share the 6+1 T1s according to this algorithm to get an opening
    scenario for discussions (see the sketch below)
  • CERN: 1 T1 for CMS (by definition)
  • France: 0.5 T1 for CMS
  • Germany: 0.4 T1
  • Italy: 1.7 T1
  • Spain: 0.2 T1
  • UK: 0.4 T1
  • USA: 2.6 T1
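A minimal sketch of the sharing algorithm just described: CERN is
fixed at one T1 and the six external T1s are split in proportion to
each T1 country's share of the collaboration. The author counts below
are placeholders for illustration, not the real CMS numbers.

```python
# Sketch of the sharing algorithm: 1 T1 at CERN by definition, and 6
# external T1s split in proportion to each T1 country's share of the
# collaboration. Author counts are hypothetical placeholders.
authors = {"France": 80, "Germany": 65, "Italy": 270,
           "Spain": 30, "UK": 65, "USA": 420}      # hypothetical counts
total = sum(authors.values())
shares = {country: 6 * n / total for country, n in authors.items()}
shares["CERN"] = 1.0                               # fixed by definition
for country, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{country:8s} {share:.1f} T1")
```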

33
Tier-2
  • Ask Now for intentions from all CMS Agencies
  • I have an old list, I request that you contact
    me with your intentions so I can bring this up to
    date.
  • T1 countries are making a very heavy commitment
  • They may need to demonstrate sharing of costs
    with the dependent T2s
  • T2s need to start defining with which T1 they will enter into
    service agreements, and negotiating with them how costs will be
    distributed.

34
RLS performance
0.16 files/s ↔ 10 Hz
0.4 files/s ↔ 25 Hz
April 2nd, 18:00
  • Time to register the output of a single job (16 files): left axis
  • Load on the client machine at the time of registration: right axis

35
RLS issues
  • Total number of files registered in the RLS during DC04:
  • ~570K LFNs, each with ~5-10 PFNs and 9 metadata attributes
  • Inserting information into RLS:
  • inserting a PFN (file catalogue) was fast enough when using the
    appropriate tools, produced along the way
  • LRC C API programs (~0.1-0.2 sec/file), POOL CLI with GUID
    (seconds/file)
  • inserting files together with their attributes (file and metadata
    catalogue) was slow
  • We more or less survived; higher data rates would be troublesome
    (see the back-of-envelope check below)

Sometimes the load on the RLS increases and requires an intervention
on the server (e.g. log partition full, switch of server node,
un-optimized queries) → able to keep up in optimal conditions, only
just otherwise
Time to register the output of a Tier-0 job (16
files)
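A back-of-envelope check of these numbers; the 2.0 s/file figure below
is an assumed value for the "seconds per file" quoted above for the
POOL CLI.

```python
# Registration budget at the DC04 goal: 0.4 files/s (25 Hz) means each
# file must be fully registered in at most 2.5 s on average.
target_rate = 0.4                       # files/s, from the previous slide
budget = 1.0 / target_rate              # 2.5 s per file
for tool, t_per_file in [("LRC C API", 0.2), ("POOL CLI with GUID", 2.0)]:
    print(f"{tool}: {t_per_file} s/file, headroom x{budget / t_per_file:.1f}")
```

With ~0.1-0.2 s/file there is an order of magnitude of headroom; at
seconds per file the margin essentially disappears, which matches the
"we more or less survived" assessment above.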
36
PCP set-up: a hybrid model
by C. Grandi
[Workflow diagram: a physics group asks for a new dataset; the
Production Manager defines assignments in RefDB; a Site Manager starts
an assignment; McRunJob with the CMSProd plug-in makes a data-level
query to RefDB and prepares the jobs as shell scripts for the Local
Batch Manager; job tracking and job-level queries go through the BOSS
DB. A toy job-tracking wrapper in the spirit of BOSS is sketched
below.]
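Since BOSS appears in the diagram as the job-tracking piece, here is a
toy wrapper in that spirit: run the real executable, record start/end
timestamps, the exit code and a tail of the output in a MySQL table.
The table schema and connection parameters are hypothetical; this is
not the actual BOSS interface.

```python
# Toy job wrapper in the spirit of BOSS: run the real executable and
# record start/end timestamps, exit code and an output tail in MySQL.
# The schema and connection parameters are hypothetical.
import subprocess
import sys
import time

import MySQLdb


def run_tracked(job_id, cmd):
    db = MySQLdb.connect(host="localhost", db="job_tracking")
    cur = db.cursor()
    cur.execute("UPDATE jobs SET t_start = %s WHERE id = %s",
                (int(time.time()), job_id))
    db.commit()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    cur.execute(
        "UPDATE jobs SET t_end = %s, exit_code = %s, stdout_tail = %s "
        "WHERE id = %s",
        (int(time.time()), proc.returncode, proc.stdout[-500:], job_id),
    )
    db.commit()
    return proc.returncode


if __name__ == "__main__":
    # usage: wrapper.py <job_id> <executable> [args...]
    sys.exit(run_tracked(int(sys.argv[1]), sys.argv[2:]))
```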
37
PCP @ INFN statistics (1/4)
CMS production steps: Generation, Simulation, ooHitformatting,
Digitisation
[Plots: Generation step, all CMS and INFN only, Jun to mid-Aug 03;
annotation: "contribute to this slope"]
79 Mevts in CMS, 9.9 Mevts (13%) done by INFN (strong contribution by
LNL)
38
PCP @ INFN statistics (2/4)
CMS production steps: Generation, Simulation, ooHitformatting,
Digitisation
[Plots: Simulation step (CMSIM+OSCAR), all CMS and INFN only, Jul to
Sep 03]
75 Mevts in CMS, 10.4 Mevts (14%) done by INFN (strong contribution by
CNAF T1 and LNL)
39
PCP @ INFN statistics (3/4)
CMS production steps: Generation, Simulation, ooHitformatting,
Digitisation
[Plots: ooHitformatting step, all CMS and INFN only, Dec 03 to end-Feb
04]
37 Mevts in CMS, 7.8 Mevts (21%) done by INFN
D. Bonacorsi
40
OSCAR
41
Evolution of Transfer Requirements
42
From GDB to analysis at T1
Stages: Transfer → Replication → Job preparation → Job submission
43
Real-Time (Fake) Analysis
  • Goals
  • Demonstrate that data can be analyzed in real time at the T1
  • Fast feedback to reconstruction (e.g. calibration, alignment,
    check of the reconstruction code, etc.)
  • Establish automatic data replication to T2s
  • Make data available for offline analysis
  • Measure the time elapsed between reconstruction at T0 and analysis
    at T1 (a measurement sketch follows below)
  • Architecture
  • Set of software agents communicating via a local MySQL DB
  • Replication, data-set completeness, job preparation and submission
  • Use LCG to run the jobs
  • Private Grid Information System for CMS DC04
  • Private Resource Broker

J. Hernandez
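For the "time elapsed between reconstruction at T0 and analysis at T1"
goal, a measurement sketch assuming the agents' MySQL DB keeps one row
per run with the two timestamps as epoch seconds; the table and column
names are hypothetical.

```python
# Sketch of the T0 -> T1 turn-around measurement, assuming one row per
# run with two epoch-second timestamps (hypothetical table/columns).
import MySQLdb


def turnaround_minutes():
    db = MySQLdb.connect(host="localhost", db="dc04_agents")
    cur = db.cursor()
    cur.execute(
        "SELECT t1_analysis_start - t0_reco_done FROM run_times "
        "WHERE t1_analysis_start IS NOT NULL"
    )
    deltas = sorted(float(row[0]) / 60.0 for row in cur.fetchall())
    if not deltas:
        return {}
    return {
        "runs": len(deltas),
        "min": deltas[0],                   # DC04 saw a ~10 minute best case
        "median": deltas[len(deltas) // 2],
        "max": deltas[-1],
    }


print(turnaround_minutes())
```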
44
From GDB to analysis at T1
[Diagram: the path from reconstruction and the EB/GDB at the T0 to
analysis at the T1 and T2, driven by the EB agent, transfer and
replication agents, Drop and Fake Analysis agents, and publisher and
configuration agents]
J. Hernandez
45
Real-time DC04 analysis: summary
  • Real-time analysis: two weeks of quasi-continuous running!
  • Total number of analysis jobs submitted: 15,000
  • Overall Grid efficiency: 95-99%
  • Problems
  • the RLS query used to prepare a POOL XML catalog had to be done
    via the file GUIDs, otherwise it was much slower
  • the Resource Broker disk filling up made the RB unavailable for
    several hours; this was related to large input/output sandboxes.
    Possible solutions:
  • set quotas on the RB space for sandboxes
  • configure the use of RBs in cascade
  • a network problem at CERN prevented connections to the RLS and the
    CERN RB
  • the Legnaro CE/SE disappeared from the Information System during
    one night
  • failures in updating the BOSS database due to overload of the
    MySQL server (30%); the BOSS recovery procedure was used

N. De Filippis, A. Fanfani, F. Fanzago
46
Description of RLS usage in DC04
[Diagram of RLS usage in DC04, roughly:
1. the XML Publication Agent registers files in the RLS (local POOL
   catalogue at the Tier-0);
2. the Configuration agent finds the Tier-1 location (based on
   metadata);
3. the RM/SRM/SRB EB agents copy/delete files to/from the export
   buffers;
4. the Tier-1 Transfer agents (Replica Manager, SRB GMCAT) copy the
   files to the Tier-1s and update the TMDB;
5. the analysis job is submitted through the Resource Broker;
6. the LCG ORCA Analysis Job processes the DST and registers its
   private data.
A CNAF RLS replica is kept via ORACLE mirroring.]
Specific client tools: POOL CLI, Replica Manager CLI, LRC C API based
programs, LRC Java API tools (SRB/GMCAT), Resource Broker
47
Context for the agent system
Global system management/ steering
Replica managers
Configuration agent
Resource brokers?
Agents (and TMDB)
File catalogue
Metadata
Analysis: a separate world?
Grid transfer tools
48
DST files: b/tau datasets and muon datasets
Replica Agent
  1. Replicate the data to disk SEs at the T1/T2
  2. Notify that new files are available for analysis

ORCA 8.0.1 on the UI to compile the analysis code
Real-time Analysis Agent
  1. Check whether a file-set (run) is ready to be analyzed
     (greenlight)
  2. Prepare the job to analyze the run
  3. Submit the job via BOSS to the RB

The CMS software (ORCA 8.0.1) is installed by the CMS software manager
using a GRID job based on the xcmsi tool
49
Muon and Neutrino Information
  • Missing transverse energy
  • Muon pT
  • Isolated muon pT
  • Isolation efficiency
  • Single muon: 88% (98% with respect to the selection)

50
Jet Information
  • Total number of jets
  • Number of b jets
  • ET of non-b jets
  • ET of b jets

51
[Plots of reconstructed masses: hadronic top, hadronic W, leptonic
top]
52
Data transfer and job preparation
[Diagram: DST files from the b/tau and muon datasets are replicated to
the site; a notification is sent when new files are available for
analysis; ORCA_8_0_1 is available on the UI to compile the analysis
code; submission goes via BOSS. The CMS software is installed by the
CMS Software Manager using a GRID job based on the xcmsi tool. Only if
the collection file has the greenlight does the agent prepare and
submit a job to analyse one run.]
53
(No Transcript)
54
An example: replicas to disk-SEs
[Monitoring plots for a single day (April 19th): CNAF T1 Castor SE
(eth I/O input from the SE-EB, TCP connections, RAM memory), CNAF T1
disk-SE (eth I/O input from the Castor SE, in green) and Legnaro T2
disk-SE (eth I/O input from the Castor SE)]
D. Bonacorsi
55
Data Transfer
[Diagram: Castor at CERN behind the CERN EB (3 disk SEs), feeding the
Tier-1s (CNAF and PIC, each with a Castor back-end, an SE and a disk
SE) and, from there, the Tier-2s (Legnaro and CIEMAT disk SEs)]
  • Transfer tools
  • Replica Manager CLI used for EB → CNAF and CNAF → Legnaro
  • the Java-based CLI introduces a non-negligible overhead at
    start-up
  • globus-url-copy + LRC C API used for EB → PIC and PIC → CIEMAT
  • faster
  • Performance was good with both tools
  • Total network throughput limited by the small file size (see the
    back-of-envelope estimate below)
  • Some transfer problems were caused by the performance of the
    underlying MSS
  • Always use a disk SE in front of an MSS in the future?

A. Fanfani
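A back-of-envelope estimate of why the small file size caps the
throughput: with roughly 6 TB spread over more than 500k files (about
12 MB per file, from the transfer numbers earlier), any fixed per-file
start-up cost starts to dominate. The 40 MB/s link speed and the two
overhead values below are assumptions for illustration only.

```python
# Effective per-stream rate with a fixed per-file start-up overhead:
#   rate = size / (size / link + overhead)
# Average file size from the DC04 transfer numbers (~6 TB over >500k
# files); the link speed and overhead values are assumed.
avg_file_mb = 6e6 / 500e3                  # ~12 MB per file
link_mb_s = 40.0
for label, overhead_s in [("C-API-like start-up", 0.5),
                          ("Java-CLI-like start-up", 5.0)]:
    eff = avg_file_mb / (avg_file_mb / link_mb_s + overhead_s)
    print(f"{label}: {eff:.1f} MB/s per stream")
```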
56
Real-time DC04 analysis: job time statistics
Dataset bt03_ttbb_ttH analysed with the executable ttHWmu
Total execution time: 28 minutes
ORCA execution time: 25 minutes
Time for staging input and output files: 170 s
Job waiting time before starting: 120 s
(Overhead of the GRID: waiting time in the queue)
N. De Filippis, A. Fanfani, F. Fanzago