Title: From DØ To ATLAS
1. From DØ To ATLAS
- Jae Yu
- ATLAS Grid Test-Bed Workshop
- Apr. 4-6, 2002, UTA
- Introduction
- DØGrid / DØRACE
- DØ Progress
- UTA DØGrid Activities
- Conclusions
2. Introduction
- DØ has been taking data since Mar. 1, 2001
- Accumulated over 20 pb⁻¹ of collider data
- Current maximum output rate is about 20 Hz, averaging 10 Hz
- This will improve to an average rate of 25-30 Hz with a 50% duty factor, eventually reaching about 75 Hz in Run IIb
- The resulting Run IIa (2-year) data sample is about 400 TB including reconstruction (1×10⁹ events)
- Run IIb will follow (4-year run) → data to increase to 3-5 PB (10¹⁰ events)
- DØ institutions are scattered across 19 countries
- About 40 European collaborators → natural demand for remote analysis
- The data size poses serious issues
- Time taken for re-processing of data (10 sec/event at 40 SpecInt95)
- Easy access through the available network bandwidth
- DØ management recognized the issue and, in Nov. 2001,
- Created a position for remote analysis coordination
- Restructured the DØ Computing team to include a D0Grid team (3 FTEs)
3. DØRACE
- DØ Remote Analysis Coordination Efforts
- Exists to accomplish:
- Setting up and maintaining the remote analysis environment
- Promoting institutional contributions made remotely
- Allowing remote institutions to participate in data analysis
- Preparing for the future of data analysis:
- More efficient and faster delivery of multi-PB data
- More efficient sharing of processing resources
- Preparation for possible massive re-processing and MC production to expedite the process
- Expeditious physics analyses
- Maintaining self-sustained support among the remote institutions to build a broader base of knowledge
- Addressing sociological issues of HEP people at their home institutions and within the field
- Integrating the various remote analysis efforts into one working piece
- The primary goal is to allow individual desktop users to make significant contributions without being at the lab
4. From a Survey
- Difficulties
- Hard time setting up initially
- Lack of updated documentation
- Rather complicated set-up procedure
- Lack of experience → no forum to share experiences
- OS version differences (RH 6.2 vs. 7.1), let alone different OSs
- Most of the established sites have an easier time updating releases
- Network problems affecting successful completion of large releases (a 4 GB release takes a couple of hours, e.g. SA)
- No specific responsible persons to ask questions
- Availability of all necessary software via UPS/UPD
- Time differences between continents affecting efficiency
5. Progress in DØGrid and DØRACE
- Documentation on the DØRACE home page (http://www-hep.uta.edu/d0race)
- Lowers the barrier posed by the difficulties of initial set-up
- Updated and simplified set-up instructions available on the web → many institutions have participated in refining the instructions
- Tools for DØ software download and installation made available
- Release Ready notification system activated
- Success is defined by institutions depending on what they do
- Build error log and dependency tree utility available
- Change of release procedure to alleviate unstable network dependencies
- SAM (Sequential Access via Metadata) station set up for data access → 15 institutions transfer files back and forth
- Script for automatic download, installation, component verification, and reinstallation in preparation (see the sketch after this list)
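The download-and-install script mentioned above was still in preparation at the time of this talk; the sketch below only illustrates the kind of steps such a wrapper might perform. The release tag, download URL, directory layout, and component list are placeholders, not the actual DØ tooling.

```python
"""Illustrative sketch only: a wrapper for automated release installation.

The release tag, download URL, and component list below are placeholders;
the real DØ release tools (UPS/UPD products, d0 release layout) are not
reproduced here.
"""
import subprocess
import sys
from pathlib import Path

RELEASE = "p10.15.01"                                                # hypothetical release tag
TARBALL_URL = f"http://example.fnal.gov/releases/{RELEASE}.tar.gz"   # placeholder URL
INSTALL_DIR = Path("/d0dist") / RELEASE                              # placeholder layout

def run(cmd: list) -> bool:
    """Run a command, return True on success."""
    return subprocess.run(cmd).returncode == 0

def download_and_unpack() -> bool:
    tarball = Path(f"{RELEASE}.tar.gz")
    if not run(["wget", "-c", TARBALL_URL, "-O", str(tarball)]):
        return False
    INSTALL_DIR.mkdir(parents=True, exist_ok=True)
    return run(["tar", "-xzf", str(tarball), "-C", str(INSTALL_DIR)])

def verify_components() -> list:
    """Return the components whose expected files are missing (placeholder list)."""
    expected = ["bin/d0reco", "lib/libd0reco.so"]
    return [c for c in expected if not (INSTALL_DIR / c).exists()]

if __name__ == "__main__":
    if not download_and_unpack():
        sys.exit("download/unpack failed; check the network and retry")
    missing = verify_components()
    if missing:
        print("missing components, reinstalling:", missing)
        download_and_unpack()   # naive retry; a real script would be finer-grained
    print("release", RELEASE, "installed at", INSTALL_DIR)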
6. DØRACE Strategy
- Remote analysis system set-up categorized by functionality (a sketch of this phase/component mapping follows the phase list):
- Desktop only
- A modest analysis server
- Linux installation
- UPS/UPD installation and deployment
- External package installation via UPS/UPD
- CERNLIB
- KAI-lib
- Root
- Download and install a DØ release
- Tar-ball for ease of initial set-up
- Use of existing utilities for latest release download
- Installation of cvs
- Code development
- KAI C++ compiler
- SAM station setup
- Phase 0: Preparation
- Phase I: Root-tuple Analysis
- Phase II: Executables
- Phase III: Code Dev.
- Phase IV: Data Delivery
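To make the phased set-up concrete, the sketch below encodes the phases and the components each one assumes as a simple table and reports which phase a site has reached. The phase/component grouping follows the list above; the file-system checks used to detect each component are purely illustrative.

```python
"""Illustrative sketch: which DØRACE set-up phase has a site reached?

The phase/component grouping follows the slide; the path-based detection
of each component is a placeholder.
"""
from pathlib import Path

# Components assumed present for each phase (cumulative), per the slide.
PHASES = [
    ("Phase 0: Preparation",         ["linux", "ups_upd"]),
    ("Phase I: Root-tuple Analysis", ["cernlib", "kailib", "root"]),
    ("Phase II: Executables",        ["d0_release"]),
    ("Phase III: Code Dev.",         ["cvs", "kai_compiler"]),
    ("Phase IV: Data Delivery",      ["sam_station"]),
]

# Placeholder detection: a path that should exist if the component is installed.
MARKERS = {
    "linux": "/etc/redhat-release",
    "ups_upd": "/usr/local/etc/setups.sh",
    "cernlib": "/cern/pro/lib",
    "kailib": "/opt/kai/lib",
    "root": "/opt/root/bin/root",
    "d0_release": "/d0dist/pro",
    "cvs": "/usr/bin/cvs",
    "kai_compiler": "/opt/kai/bin/KCC",
    "sam_station": "/sam/station/config",
}

def reached_phase() -> str:
    """Return the highest phase whose components are all present."""
    highest = "none"
    for phase, components in PHASES:
        if all(Path(MARKERS[c]).exists() for c in components):
            highest = phase
        else:
            break
    return highest

if __name__ == "__main__":
    print("site has reached:", reached_phase())
```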
8. Where are we?
- DØRACE is entering the next stage
- Compilation and running
- Active code development
- Propagation of the setup to all institutions
- The instructions are taking shape well
- Need to maintain them and keep them up to date
- Support to help with problems people encounter
- 35 institutions (about 50%) ready for code development
- 15 institutions ready for data transfer
- DØGrid TestBed formed at the Feb. workshop
- Some 10 institutions participating
- UTA's own Mark Sosebee is taking charge
- Globus job submission testing in progress → UTA participates in LSF testing
- DØ has SAM in place for data delivery and cataloguing services → SAM is deep in the DØ analysis fabric, so D0Grid must be built on SAM
- We will also need to establish regional analysis centers
9. Proposed DØRAM Architecture
- (Diagram) Tiered architecture, from top to bottom (a sketch of the implied lookup order follows):
- Central Analysis Center (CAC)
- Regional Analysis Centers (RACs): store and process 10-20% of all data
- Institutional Analysis Centers
- Desktop Analysis Stations
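The tiering above implies that a request is satisfied as locally as possible and escalated to the next level only when the data are not held there. The sketch below is one way to express that lookup order; the tier names follow the diagram, while the catalogue contents, dataset names, and function names are invented for illustration.

```python
"""Illustrative sketch: resolve a dataset request through the DØRAM tiers.

Tier names follow the architecture diagram; the catalogue contents and
escalation logic are assumptions made for illustration only.
"""

# Which tier permanently holds which (hypothetical) datasets.
CATALOGUE = {
    "desktop":       set(),
    "institutional": {"thumbnails"},
    "regional":      {"thumbnails", "dst_10_20_percent"},   # RAC holds 10-20% of data
    "central":       {"thumbnails", "dst_10_20_percent", "raw", "dst_full"},
}

# Escalation order: try the most local tier first.
ORDER = ["desktop", "institutional", "regional", "central"]

def locate(dataset: str) -> str:
    """Return the most local tier that can serve the dataset."""
    for tier in ORDER:
        if dataset in CATALOGUE[tier]:
            return tier
    raise LookupError(f"{dataset!r} not found at any tier")

if __name__ == "__main__":
    print(locate("dst_10_20_percent"))  # -> 'regional'
    print(locate("raw"))                # -> 'central'
```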
10. Regional Analysis Centers
- A few geographically selected sites that satisfy the requirements
- Provide almost the same level of service as FNAL to a few institutional analysis centers
- Analyses carried out within the regional center
- Store 10-20% of a statistically random data sample permanently
- Most of the analyses performed on these samples within the regional network
- Refine the analyses using the smaller but unbiased data set
- When the entire data set is needed → the underlying Grid architecture provides access to the remaining data set
11. Regional Analysis Center Requirements
- Become a mini-CAC
- Sufficient computing infrastructure
- Large bandwidth (gigabit or better)
- Sufficient storage space to hold 10-20% of the data permanently, expandable to accommodate the data increase (a rough sizing example follows this list)
- >30 TB just for Run IIa RAW data
- Sufficient CPU resources to serve regional or institutional analysis requests and reprocessing
- Geographically located to avoid unnecessary network traffic overlap
- Software distribution and support
- Mirror copy of the CVS database for synchronized updates between RACs and the CAC
- Keep relevant copies of the databases
- Act as a SAM service station
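A back-of-the-envelope check of the storage figure, using only numbers quoted earlier in this talk (about 400 TB for Run IIa including reconstruction, 3-5 PB for Run IIb) and the 10-20% RAC share; the split between raw and reconstructed data is not spelled out here, so the >30 TB RAW figure is simply taken from the slide.

```python
"""Back-of-the-envelope RAC storage sizing, using numbers from this talk only."""

RUN2A_TOTAL_TB = 400           # Run IIa sample incl. reconstruction (slide 2)
RUN2B_TOTAL_TB = (3000, 5000)  # Run IIb estimate, 3-5 PB (slide 2)
RAC_SHARE = (0.10, 0.20)       # each RAC stores 10-20% of the data

run2a_rac = [RUN2A_TOTAL_TB * f for f in RAC_SHARE]
run2b_rac = [tot * f for tot, f in zip(RUN2B_TOTAL_TB, RAC_SHARE)]

print(f"Run IIa RAC share: {run2a_rac[0]:.0f}-{run2a_rac[1]:.0f} TB")  # 40-80 TB
print(f"Run IIb RAC share: {run2b_rac[0]:.0f}-{run2b_rac[1]:.0f} TB")  # 300-1000 TB
# Consistent with the stated requirement of >30 TB for Run IIa RAW data alone,
# plus room to grow by an order of magnitude for Run IIb.
```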
12. What has the UTA team been doing?
- The UTA DØGrid team consists of:
- 2 faculty (Kaushik and Jae)
- 2 senior scientists (Tomasz and Mark)
- 1 external software designer
- 3 CSE students
- Computing resources for MC farm activities:
- 25 dual P-III 850 MHz machines for the HEP MC farm
- 5 dual P-III 850 MHz machines on the CSE farm
- ACS: 16 dual P-III 900 MHz machines under LSF
- Installed Globus 2.0b on a few machines
- Completed "Hello world!" testing a few weeks ago (see the sketch after this list)
- A working version of the MC farm bookkeeper (see the demo) run from an ATLAS machine against our HEP farm machine through Globus → proof of principle
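For reference, the kind of "Hello world" test mentioned above can be driven from Python as below. globus-job-run is the standard Globus Toolkit 2 client for running a simple command on a remote gatekeeper; the hostname used here is a placeholder, and a valid grid proxy is assumed to exist already.

```python
"""Minimal 'Hello world' check over Globus, driven from Python.

globus-job-run is the standard GT2 client for running a command on a
remote gatekeeper; the contact string below is a placeholder hostname.
"""
import subprocess

GATEKEEPER = "hepfarm01.uta.edu"   # placeholder contact string

def globus_hello(contact: str) -> str:
    """Run /bin/echo on the remote resource and return its output."""
    result = subprocess.run(
        ["globus-job-run", contact, "/bin/echo", "Hello world!"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    # Assumes a valid grid proxy already exists (e.g. via grid-proxy-init).
    print(globus_hello(GATEKEEPER))
```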
13. Short- and Long-Term Goals
- Plan to take job submission further:
- Submit MC jobs from remote nodes to our HEP farm for a more complicated chain of job executions
- Build a prototype high-level interface for job submission and distribution for DØ analysis jobs (next 6 months):
- Store the reconstructed files in a local cache
- Submit an analysis job from one of the local machines to Condor (see the sketch after this list)
- Put the output into either the requestor's area or the cache
- Reduced-output analysis job processing from a remote node through Globus
- Plan to become a DØRAC
- Submitted a $1M MRI proposal for RAC hardware purchase in Jan.
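One way the Condor step above could look: generate a classic submit description, hand it to condor_submit, and direct the output to the requestor's area or the cache. Only the submit-file keywords (universe, executable, output, queue, etc.) are standard Condor syntax; the executable name, directories, and arguments are placeholders.

```python
"""Illustrative sketch: submit a DØ analysis job to Condor from Python.

Only the submit-description keywords are standard Condor; the executable,
directories, and arguments are placeholders.
"""
import subprocess
from pathlib import Path

def submit_analysis(exe: str, args: str, dest_dir: Path) -> None:
    """Write a classic Condor submit file and hand it to condor_submit."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    submit = f"""\
universe   = vanilla
executable = {exe}
arguments  = {args}
output     = {dest_dir}/job.out
error      = {dest_dir}/job.err
log        = {dest_dir}/job.log
queue
"""
    submit_file = dest_dir / "analysis.submit"
    submit_file.write_text(submit)
    subprocess.run(["condor_submit", str(submit_file)], check=True)

if __name__ == "__main__":
    # Placeholder executable, input, and requestor area.
    submit_analysis("/home/d0/bin/analyze_reco",
                    "--input /cache/reco_001.root",
                    Path("/home/requestor/output"))
```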
14. DØ Monte Carlo Production Chain
- Generator job (Pythia, Isajet, …) → DØsim (detector response) → DØreco (reconstruction) → RecoA (root tuple), as sketched below
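A sketch of how this chain might be driven stage by stage, with each stage consuming the previous stage's output file. The stage names follow the slide, but the command-line interfaces and file naming are invented for illustration and do not reflect the real executables.

```python
"""Illustrative sketch: run the DØ MC production chain stage by stage.

Stage names follow the slide; the command-line options and file naming
are invented and do not reflect the real executables.
"""
import subprocess

# (stage executable, output suffix) in production order.
CHAIN = [
    ("pythia_gen", "gen"),       # event generation
    ("d0sim",      "sim"),       # detector response
    ("d0reco",     "reco"),      # reconstruction
    ("recoA",      "rootuple"),  # root-tuple production
]

def run_chain(job_id: str) -> str:
    """Run every stage, feeding each one the previous stage's output file."""
    infile = None
    for exe, suffix in CHAIN:
        outfile = f"{job_id}.{suffix}"
        cmd = [exe, "--output", outfile]      # hypothetical interface
        if infile is not None:
            cmd += ["--input", infile]
        subprocess.run(cmd, check=True)       # a crashed stage aborts the chain
        infile = outfile
    return infile                             # final root-tuple file

if __name__ == "__main__":
    print("produced", run_chain("mcjob_0001"))
```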
15. UTA MC farm software daemons and their control
- (Diagram) Components of the MC farm control system:
- WWW interface
- Root daemon
- Lock manager
- Bookkeeper
- Monitor daemon
- Distribute daemon
- Execute daemon
- Gather daemon
- Job archive
- Cache disk
- SAM
- Remote machine
16. Job Life Cycle
- (Diagram) A job moves through: distribute queue → execute queue → gatherer queue → cache, SAM, archive; crashed jobs go to the error queue (sketched below)
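The queue structure above suggests a simple state machine. The sketch below models it with in-memory queues purely to make the flow concrete; the queue names follow the slide, while the job representation and the success/failure handling are invented.

```python
"""Illustrative sketch: the MC farm job life cycle as a queue state machine.

Queue names follow the slide; the job representation and success/failure
handling are invented for illustration.
"""
from collections import deque

class Farm:
    def __init__(self):
        self.distribute = deque()   # jobs waiting to be assigned to a worker
        self.execute = deque()      # jobs running on worker nodes
        self.gather = deque()       # finished jobs awaiting output collection
        self.error = deque()        # crashed jobs, to be inspected/restarted
        self.done = []              # output stored to cache / SAM / archive

    def submit(self, job: str):
        self.distribute.append(job)

    def step(self, job_succeeds) -> None:
        """Advance one job per queue, one stage per call."""
        if self.gather:
            self.done.append(self.gather.popleft())   # cache, SAM, archive
        if self.execute:
            job = self.execute.popleft()
            (self.gather if job_succeeds(job) else self.error).append(job)
        if self.distribute:
            self.execute.append(self.distribute.popleft())

    def restart_errors(self) -> None:
        """Crashed jobs are re-queued (cf. the production bookkeeping slide)."""
        while self.error:
            self.distribute.append(self.error.popleft())

if __name__ == "__main__":
    farm = Farm()
    for i in range(3):
        farm.submit(f"mcjob_{i:04d}")
    for _ in range(10):
        farm.step(job_succeeds=lambda job: job != "mcjob_0001")  # pretend one job crashes
    farm.restart_errors()
    print("done:", farm.done, "pending:", list(farm.distribute) + list(farm.execute))
```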
17. Production Bookkeeping
- During a running period the farms produce a few thousand jobs
- Some jobs crash and need to be restarted
- Users must be kept up to date about the status of their MC requests (waiting? running? done?)
- Dedicated bookkeeping software is needed
18. The new, Globus-enabled bookkeeper
- (Diagram) A machine on the ATLAS farm runs the bookkeeper and a WWW server; it reaches the HEP farm and the CSE farm inside the Globus domain via globus-job-run and GridFTP
19. The new bookkeeper
- One dedicated bookkeeper machine can serve any number of MC production farms running the mcfarm software (see the sketch below)
- Communication with the remote centers is done using Globus tools only
- No need to install the bookkeeper on every farm, which makes life simpler if many farms participate!
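A rough sketch of how a central bookkeeper could poll remote farms using only Globus tools, as described above: job status is queried with globus-job-run and log files are pulled back with globus-url-copy (GridFTP). Both are standard GT2 clients; the gatekeeper names, remote script path, and status format are placeholders.

```python
"""Illustrative sketch: a central bookkeeper polling remote mcfarm sites via Globus.

globus-job-run and globus-url-copy are standard GT2 clients; the gatekeeper
names, remote script path, and status format are placeholders.
"""
import subprocess

# Placeholder gatekeeper contact strings for the participating farms.
FARMS = ["hepfarm01.uta.edu", "csefarm01.uta.edu"]

REMOTE_STATUS_SCRIPT = "/usr/local/mcfarm/bin/job_status"   # hypothetical remote script

def poll_farm(contact: str) -> dict:
    """Ask one farm for its job counts, e.g. 'waiting=12 running=40 done=300'."""
    out = subprocess.run(
        ["globus-job-run", contact, REMOTE_STATUS_SCRIPT],
        capture_output=True, text=True, check=True,
    ).stdout
    return dict(item.split("=") for item in out.split())

def fetch_log(contact: str, remote_path: str, local_path: str) -> None:
    """Pull a job log back to the bookkeeper machine over GridFTP."""
    subprocess.run(
        ["globus-url-copy",
         f"gsiftp://{contact}{remote_path}", f"file://{local_path}"],
        check=True,
    )

if __name__ == "__main__":
    for farm in FARMS:
        print(farm, poll_farm(farm))   # summary for the WWW status page
```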
20. DØ→ATLAS RAC Computing Room
- A 2,000 ft² computing room in the new building
- Specifications given to the designers:
- Multi-gigabit fiber network
- Power and cooling sufficient for:
- 250 processing PCs
- 100 IDE-RAID arrays → providing over 50 TB of cache
21. Conclusions
- DØRACE is making significant progress
- Preparation for software release and distribution
- About 50% of the institutions are ready for code development and 20% for data delivery
- The DØGrid TestBed is being organized
- DØGrid software is being developed based on SAM
- UTA has been the only US institution with massive MC generation capabilities → sufficient expertise in MC job distribution and resource management
- UTA is playing a leading role in DØGrid activities
- Major player in DØGrid software design and development
- Successful in MC bookkeeper job submission and reporting
- DØGrid should be the testing platform for the ATLAS Grid
- What we do for DØ should be applicable to ATLAS with minimal effort
- What you do for ATLAS should be easily transferable to DØ and tested with actual data taking
- The hardware for a DØRAC should be the foundation for ATLAS use