1
From DØ To ATLAS
  • Jae Yu
  • ATLAS Grid Test-Bed Workshop
  • Apr. 4-6, 2002, UTA
  • Introduction
  • DØ-Grid DØRACE
  • DØ Progress
  • UTA DØGrid Activities
  • Conclusions

2
Introduction
  • DØ has been taking data since Mar. 1, 2001
  • Accumulated over 20 pb⁻¹ of collider data
  • Current maximum output rate is about 20 Hz,
    averaging 10 Hz
  • This will improve to an average rate of 25-30 Hz
    with a 50% duty factor, eventually reaching about
    75 Hz in Run IIb
  • Resulting Run IIa (2-year) data sample is about
    400 TB including reco. (1×10⁹ events); see the
    rough estimate after this list
  • Run IIb will follow (4-year run) → data to
    increase to 3-5 PB (10¹⁰ events)
  • DØ Institutions are scattered in 19 countries
  • About 40% European collaborators → natural demand
    for remote analysis
  • The data size poses serious issues
  • Time taken for re-processing of data (10 sec/event
    with 40 SpecInt95)
  • Easy access through the given network bandwidth
  • DØ Management recognized the issue and, in Nov.
    2001, created
  • A position for remote analysis coordination
  • Restructured the DØ Computing team to include a
    DØGrid team (3 FTEs)
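A rough consistency check using only the figures above: the quoted Run IIa volume implies roughly 0.4 MB per event, and the same per-event size applied to the Run IIb event count lands in the quoted few-PB range.

\[
\frac{400\ \text{TB}}{1\times 10^{9}\ \text{events}} \approx 0.4\ \text{MB/event},
\qquad
1\times 10^{10}\ \text{events} \times 0.4\ \text{MB/event} \approx 4\ \text{PB}.
\]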

3
DØRACE
  • DØ Remote Analysis Coordination Efforts
  • Exists to accomplish:
  • Setting up and maintaining remote analysis
    environment
  • Promote institutional contribution remotely
  • Allow remote institutions to participate in data
    analysis
  • To prepare for the future of data analysis
  • More efficient and faster delivery of multi-PB
    data
  • More efficient sharing of processing resources
  • Prepare for possible massive re-processing and MC
    production to expedite the process
  • Expeditious physics analyses
  • Maintain self-sustained support amongst the
    remote institutions to construct a broader base
    of knowledge
  • Sociological issues of HEP people at the home
    institutions and within the field
  • Integrate various remote analysis efforts into one
    working piece
  • Primary goal is to allow individual desktop users
    to make significant contributions without being at
    the lab

4
From a Survey
  • Difficulties
  • Having a hard time setting up initially
  • Lack of updated documentation
  • Rather complicated set up procedure
  • Lack of experience → no forum to share experiences
  • OS version differences (RH 6.2 vs. 7.1), let alone
    different OSs
  • Most of the established sites have an easier time
    updating releases
  • Network problems affecting successful completion
    of large release downloads (4 GB), which take a
    couple of hours (SA)
  • No specific responsible persons to ask questions
  • Availability of all necessary software via
    UPS/UPD
  • Time difference between continents affecting
    efficiencies

5
Progress in DØGrid and DØRACE?
  • Documentation on the DØRACE home page
    (http://www-hep.uta.edu/d0race)
  • To ease the barrier over the difficulties in
    initial set up
  • Updated and simplified set-up instructions
    available on the web → many institutions have
    participated in refining the instructions
  • Tools for DØ software download and installation
    made available
  • Release Ready notification system activated
  • Success is defined by institutions depending on
    what they do
  • Build Error log and dependency tree utility
    available
  • Change of release procedure to alleviate unstable
    network dependencies
  • SAM (Sequential data Access via Metadata) station
    set up for data access → 15 institutions transfer
    files back and forth
  • Script for automatic download, installation,
    component verification, and reinstallation in
    preparation (a rough sketch follows this list)
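A minimal sketch of what such a download-and-verify script could look like, assuming a release is published as a tarball with a known checksum; the URL, checksum, and install location below are hypothetical placeholders, not the actual DØ release layout.

```python
# Hypothetical download/verify/install helper; URL, checksum, and paths are placeholders.
import hashlib
import tarfile
import urllib.request

RELEASE_URL = "http://example.fnal.gov/releases/d0release.tar.gz"   # placeholder
EXPECTED_MD5 = "0123456789abcdef0123456789abcdef"                   # placeholder
INSTALL_DIR = "/opt/d0software"                                      # placeholder

def download(url: str, dest: str) -> None:
    urllib.request.urlretrieve(url, dest)

def verify(path: str, expected_md5: str) -> bool:
    """Component verification: compare the tarball checksum before installing."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest() == expected_md5

def install(path: str, dest: str) -> None:
    with tarfile.open(path) as tar:
        tar.extractall(dest)

if __name__ == "__main__":
    # Retry on checksum mismatch to cope with an unstable network.
    for attempt in range(3):
        download(RELEASE_URL, "release.tar.gz")
        if verify("release.tar.gz", EXPECTED_MD5):
            install("release.tar.gz", INSTALL_DIR)
            break
```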

6
DØRACE Strategy
  • Categorized remote analysis system set-up by
    functionality
  • Desktop only
  • A modest analysis server
  • Linux installation
  • UPS/UPD Installation and deployment
  • External package installation via UPS/UPD
  • CERNLIB
  • Kai-lib
  • Root
  • Download and Install a DØ release
  • Tar-ball for ease of initial set up?
  • Use of existing utilities for latest release
    download
  • Installation of cvs
  • Code development
  • KAI C compiler
  • SAM station setup

Phase 0: Preparation
Phase I: Rootuple Analysis
Phase II: Executables
Phase III: Code Dev.
Phase IV: Data Delivery
7
(No Transcript)
8
Where are we?
  • DØRACE is entering the next stage
  • The compilation and running
  • Active code development
  • Propagation of setup to all institutions
  • Instructions seem to take their shape well
  • Need to maintain and to keep them up to date
  • Support to help with problems people encounter
  • 35 institutions (about 50%) ready for code
    development
  • 15 institutions for data transfer
  • DØGRID TestBed formed at the Feb. workshop
  • Some 10 institutions participating
  • UTA's own Mark Sosebee is taking charge
  • Globus job submission testing in progress → UTA
    participates in LSF testing
  • DØ has SAM in place for data delivery and
    cataloguing services → SAM runs deep in the DØ
    analysis fabric, so the DØGrid must be built on SAM
  • We will also need to establish regional analysis
    centers

9
Proposed DØRAM Architecture
Central Analysis Center (CAC)
Regional Analysis Centers: store/process 10-20% of all data
Institutional Analysis Centers
Desktop Analysis Stations
10
Regional Analysis Centers
  • A few geographically selected sites that satisfy
    requirements
  • Provide almost the same level of service as FNAL
    to a few institutional analysis centers
  • Analyses carried out within the regional center
  • Store 10-20% of statistically random data
    permanently
  • Most of the analyses performed on these samples
    within the regional network
  • Refine the analyses using the smaller but
    unbiased data set
  • When the entire data set is needed → the
    underlying Grid architecture provides access to
    the remaining data set

11
Regional Analysis Center Requirements
  • Become a Mini-CAC
  • Sufficient computing infrastructure
  • Large bandwidth (gigabit or better)
  • Sufficient storage space to hold 10-20% of the
    data permanently, expandable to accommodate data
    increases (see the rough estimate after this list)
  • >30 TB just for Run IIa RAW data
  • Sufficient CPU resources to serve regional or
    institutional analysis requests and reprocessing
  • Geographically located to avoid unnecessary
    network traffic overlap
  • Software Distribution and Support
  • Mirror copy of CVS database for synchronized
    update between RACs and CAC
  • Keep the relevant copies of databases
  • Act as SAM service station
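A rough sizing estimate, taking the ~400 TB Run IIa volume quoted in the Introduction together with the 10-20% fraction above (only an order-of-magnitude check, since the 400 TB figure also includes reconstruction output):

\[
(0.1\text{--}0.2)\times 400\ \text{TB} \approx 40\text{--}80\ \text{TB},
\]

which is consistent with the >30 TB quoted above for Run IIa RAW data alone.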

12
What has the UTA team been doing?
  • UTA DØGrid team consists of
  • 2 faculty (Kaushik and Jae)
  • 2 senior scientists (Tomasz and Mark)
  • 1 External software designer
  • 3 CSE students
  • Computing Resources for MC Farm activities
  • 25 Dual P-III 850MHz machines for HEP MC Farm
  • 5 Dual P-III 850MHz machines on CSE farm
  • ACS 16 Dual P-III 900MHz machines under LSF
  • Installed Globus 2.0b on a few machines
  • Completed "Hello world!" testing a few weeks ago
  • A working version of the MC Farm bookkeeper (see
    the demo) run from an ATLAS machine to our HEP
    Farm machine through Globus → proof of principle
    (a sketch of this kind of submission follows)
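A minimal sketch of driving a Globus 2.0 job submission from a script, in the spirit of the "Hello world!" test above; the gatekeeper hostname and paths are hypothetical placeholders, not the actual UTA farm configuration.

```python
# Sketch only: wraps the globus-job-run command-line tool via subprocess.
# The hostname below is a placeholder, not a real UTA gatekeeper.
import subprocess

def globus_run(contact: str, executable: str, *args: str) -> str:
    """Run an executable on a remote Globus gatekeeper and return its stdout."""
    cmd = ["globus-job-run", contact, executable, *args]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    # "Hello world!"-style test against a hypothetical HEP farm gatekeeper
    print(globus_run("hep-farm.example.edu", "/bin/echo", "Hello world!"))
```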

13
Short and Long Term Goals
  • Plan to take job submission further
  • Submit MC jobs from remote nodes to our HEP farm
    for more complicated chain of job executions
  • Build a prototype high level interface for job
    submission and distribution for DØ analysis jobs
    (Next 6 months)
  • Store the reconstructed files in local cache
  • Submit an analysis job from one of the local
    machines to Condor
  • Put the output into either the requestor's area
    or the cache
  • Reduced output analysis job processing from a
    remote node through Globus
  • Plan to become a DØRAC
  • Submitted a $1M MRI proposal for RAC hardware
    purchase in Jan.

14
DØ Monte Carlo Production Chain
Generator job (Pythia, Isajet, )
DØsim (Detector response)
DØreco (reconstruction)
RecoA (root tuple)
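A sketch of chaining the four stages above into a single job script; the executable names, options, and file names are illustrative placeholders, not the actual DØ binaries or mcfarm conventions.

```python
# Illustrative stage chain: generator -> DØsim -> DØreco -> RecoA.
# Executable and file names are placeholders.
import subprocess

STAGES = [
    ("generator", ["run_pythia", "cards.in", "gen.out"]),   # event generation
    ("d0sim",     ["d0sim", "gen.out", "sim.out"]),         # detector response
    ("d0reco",    ["d0reco", "sim.out", "reco.out"]),       # reconstruction
    ("recoA",     ["recoA", "reco.out", "tuple.root"]),     # root-tuple output
]

def run_chain() -> None:
    for name, cmd in STAGES:
        print(f"running stage: {name}")
        subprocess.run(cmd, check=True)   # stop the chain on the first failure

if __name__ == "__main__":
    run_chain()
```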
15
UTA MC farm software daemons and their control
WWW
Root daemon
Lock manager
Bookkeeper
Monitor daemon
Distribute daemon
Execute daemon
Gather daemon
Job archive
Cache disk
SAM
Remote machine
16
Job Life Cycle
Distribute queue → Execute queue → Gatherer queue →
Cache, SAM, archive; failed jobs go to the Error
queue (a sketch of these transitions follows)
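A minimal sketch of the life cycle as movement between per-queue directories; the directory layout and state names are assumptions for illustration, not the actual mcfarm implementation.

```python
# Jobs advance distribute -> execute -> gather -> archive; failures go to error.
# Directory layout is a hypothetical placeholder.
import shutil
from pathlib import Path

BASE = Path("/farm/queues")                      # hypothetical queue root
SUCCESS_PATH = ["distribute", "execute", "gather", "archive"]

def move_job(job_id: str, src: str, dst: str) -> None:
    """Move a job's working directory from one queue to another."""
    shutil.move(str(BASE / src / job_id), str(BASE / dst / job_id))

def advance(job_id: str, current: str, succeeded: bool) -> str:
    """Advance a job along the success path, or park it in the error queue."""
    if not succeeded:
        move_job(job_id, current, "error")
        return "error"
    nxt = SUCCESS_PATH[SUCCESS_PATH.index(current) + 1]
    move_job(job_id, current, nxt)
    return nxt
```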
17
Production bookkeeping
  • During a running period the farms produce a few
    thousand jobs
  • Some jobs crash and need to be restarted
  • Users must be kept up to date about the status of
    their MC requests (waiting? running? done?)
  • Dedicated bookkeeping software is needed (a
    minimal sketch follows)
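A minimal sketch of the bookkeeping idea: keep one record per MC request, track every job's state, and flag crashed jobs for restart. The field and state names are illustrative, not the actual mcfarm bookkeeper schema.

```python
# Toy bookkeeping record; states and fields are illustrative placeholders.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class MCRequest:
    request_id: str
    jobs: Dict[str, str] = field(default_factory=dict)  # job_id -> waiting/running/done/crashed

    def update(self, job_id: str, state: str) -> None:
        self.jobs[job_id] = state

    def needs_restart(self) -> List[str]:
        """Jobs that crashed and should be resubmitted."""
        return [j for j, s in self.jobs.items() if s == "crashed"]

    def summary(self) -> Dict[str, int]:
        """Per-state counts, e.g. for a user-facing status report."""
        counts: Dict[str, int] = {}
        for s in self.jobs.values():
            counts[s] = counts.get(s, 0) + 1
        return counts
```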

18
The new, Globus-enabled bookkeeper
A machine from the ATLAS farm runs the bookkeeper and
a WWW server, and talks to the HEP farm and the CSE
farm inside the Globus domain via globus-job-run and
GridFTP (a sketch of this polling scheme follows)
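A sketch of how a single bookkeeper could poll several remote farms using only Globus tools, as described on the next slide: query status with globus-job-run and copy job logs back with globus-url-copy (GridFTP). Hostnames, paths, and the status command are hypothetical placeholders.

```python
# Sketch only: bookkeeper-side polling of remote farms via Globus tools.
# Gatekeeper hostnames, remote paths, and the status command are placeholders.
import subprocess

FARMS = ["hep-farm.example.edu", "cse-farm.example.edu"]

def poll_farm(contact: str) -> str:
    """Ask a remote farm for its queue status via globus-job-run."""
    cmd = ["globus-job-run", contact, "/usr/local/bin/mcfarm-status"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def fetch_log(contact: str, remote_path: str, local_path: str) -> None:
    """Copy a job log back to the bookkeeper machine over GridFTP."""
    src = f"gsiftp://{contact}{remote_path}"
    dst = f"file://{local_path}"
    subprocess.run(["globus-url-copy", src, dst], check=True)

if __name__ == "__main__":
    for farm in FARMS:
        print(farm, poll_farm(farm))
```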
19
The new bookkeeper
  • One dedicated bookkeeper machine can serve any
    number of MC production farms running the mcfarm
    software
  • The communication with remote centers is done
    using Globus tools only
  • No need to install the bookkeeper on every farm →
    makes life simpler if many farms participate!

20
DØ → ATLAS RAC Computing Room
  • A 2000 ft² computing room in the new building
  • Specifications given to the designers
  • Multi-gigabit fiber network
  • Power and cooling sufficient for
  • 250 processing PCs
  • 100 IDE-RAID arrays → providing over 50 TB of
    cache

21
Conclusions
  • The DØRACE is making significant progress
  • Preparation for software release and distribution
  • About 50% of the institutions are ready for code
    development and 20% for data delivery
  • DØGrid TestBed being organized
  • DØGrid software being developed based on SAM
  • UTA has been the only US institution with massive
    MC generation capabilities → sufficient expertise
    in MC job distribution and resource management
  • UTA is playing a leading role in DØGrid
    Activities
  • Major player in DØGrid software design and
    development
  • Successful in MC bookkeeper job submission and
    report
  • DØGrid should be the testing platform for ATLAS
    Grid
  • What we do for DØ should be applicable to ATLAS
    with minimal effort
  • What you do for ATLAS should be easily
    transferable to DØ and tested with actual data
    taking
  • The Hardware for DØRAC should be the foundation
    for ATLAS use