1
WP8 Status and Plans
  • F Harris (Oxford/CERN)

2
Outline of presentation
  • Overview of experiment plans for use of Grid
    facilities/services for tests and data challenges
  • ATLAS
  • ALICE
  • CMS
  • LHCb
  • BaBar
  • D0
  • Status of ATLAS/EDG Task Force work
  • Essential requirements for making 1.2.n usable by
    broader physics user community
  • Future activities of WP8 and some questions
  • Summary

3
ATLAS
  • Currently in the middle of Phase 1 of DC1 (Geant3
    simulation, Athena reconstruction, analysis). Many
    sites in Europe, US, Australia, Canada, Japan,
    Taiwan, Israel and Russia are involved
  • Phase2 of DC1 will begin Oct-Nov 2002 using new
    event model
  • Plans for use of Grid tools in DCs
  • Phase 1: Atlas-EDG Task Force to repeat, with EDG
    1.2.1, part of the simulations already done.
  • Using CERN, CNAF, Nikhef, RAL, Lyon
  • 9 GB input, 100 GB output, 2000 CPU hrs
  • Phase2 will make larger use of Grid tools. Maybe
    different sites will use different tools. There
    will be (many?) more sites. This to be defined
    Sep 16-20.
  • 10^6 CPU hrs, 20 TB input to reconstruction,
    5 TB output (? how much on testbed?)

4
ALICE
  • ALICE assumes that as soon as a stable version of
    1.2.n is tested and validated it will be
    progressively installed on all EDG testbed
    sites
  • As new sites come online, an automatic tool will
    be used for submission of test jobs of increasing
    output size and duration
  • At the moment ALICE does not plan a "data
    challenge" with EDG. However, a data transfer test
    is planned, as close as possible to the expected
    data transfer rate for a real production and
    analysis
  • Will concentrate on the AliEn/EDG interface and on
    the AliRoot/EDG interface, in particular for items
    concerning Data Management
  • Will use CERN, CNAF, Nikhef, Lyon, Turin, Catania
    for first tests
  • CPU and store requirements can be tailored to
    availability of facilities in testbed but will
    need some scheduling and priorities

5
CMS
  • Currently running production for the DAQ Technical
    Design Report (TDR)
  • Requires the full chain of CMS software and
    production tools. This includes use of
    Objectivity (licensing problem in hand)
  • The 5% Data Challenge (DC04) will start Summer
    2003 and will last 7 months. This will produce
    5×10^7 events. In the last month all data will be
    reconstructed and distributed to Tier1/2 centres
    for analysis.
  • 1000 CPUs for 5 months,
    100 TB output (LCG prototype)
  • Use of GRID tools and facilities
  • Will not be used for current production
  • Plan to use in DC04 production
  • EDG 1.2 will be used to make scale and
    performance tests (proof of concept). Tests on
    RB, RC and GDMP. Will need Objectivity for tests.
  • IC, RAL, CNAF/BO, Padova, CERN, Nikhef, IN2P3,
    Ecole Poly, ITEP
  • Some sites will do EDT GLUE tests
  • CPU: 50 CPUs distributed; Store: 200 GB
    per site
  • V2 will be necessary for DC04 starting summer
    2003 (has the functionality required by CMS)

6
LHCb
  • First intensive Data Challenge starts Oct 2002;
    currently doing intensive pre-tests at all sites.
  • Participating sites for 2002
  • CERN, Lyon, Bologna, Nikhef, RAL
  • Bristol, Cambridge, Edinburgh, Imperial, Oxford,
    ITEP Moscow, Rio de Janeiro
  • Use of EDG Testbed
  • Install latest OO environment on testbed sites.
    Flexible job submission Grid/non-Grid (see the
    sketch at the end of this slide)
  • First tests (now) for MC reconstruction and
    analysis, with data stored to Mass Store
  • Large scale production tests(by October)
  • Production (if tests OK)
  • Aim to do a percentage of production on the Testbed
  • Total requirement is 500 CPUs for 2 months, 10 TB
  • (10% should be OK on testbed?)
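As a rough illustration of what "flexible job submission Grid/non-Grid" could look like, the sketch below routes the same job either to the EDG testbed or to a local batch system depending on a flag. The wrapper itself, the toy JDL content, and the choice of "dg-job-submit" (EDG 1.2 CLI) and "bsub" (LSF) as the underlying commands are assumptions for illustration, not LHCb's actual production tools.

"""Illustrative sketch only: route the same job to the Grid or a local batch
system.  'dg-job-submit' (EDG 1.2 CLI) and 'bsub' (LSF) are assumed commands;
the JDL content is a toy example, not the real LHCb job description."""
import subprocess
import tempfile

def submit(executable: str, arguments: str, use_grid: bool) -> None:
    if use_grid:
        # Write a minimal JDL file and hand it to the EDG Resource Broker.
        jdl = (
            f'Executable    = "{executable}";\n'
            f'Arguments     = "{arguments}";\n'
            'StdOutput     = "job.out";\n'
            'StdError      = "job.err";\n'
            'OutputSandbox = {"job.out", "job.err"};\n'
        )
        with tempfile.NamedTemporaryFile("w", suffix=".jdl", delete=False) as f:
            f.write(jdl)
            jdl_path = f.name
        subprocess.run(["dg-job-submit", jdl_path], check=True)
    else:
        # Fall back to the local (non-Grid) batch system.
        subprocess.run(["bsub", executable, arguments], check=True)

# Example: the same simulation job, first on the Grid, then locally.
# submit("run_simulation.sh", "events=500", use_grid=True)
# submit("run_simulation.sh", "events=500", use_grid=False)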

7
BaBar Grid and EDG
  • Target: have some production environment ready
    for all users by the end of this year
  • with attractive interface tools
  • Customised to SLAC site
  • Have implemented local hacks to overcome
    problems with:
  • use of the LSF Batch Scheduler (uses AFS)
  • AFS File System used for User Home Directories
  • Batch Workers located inside the IFZ (security
    issue)
  • Three parts of the Globus/EDG software were
    installed at SLAC: CE, WN and UI
  • The exercise clearly showed that they all run
    fine together, and also with the RB at IC
  • Had problems with old version of RB. Problems
    should largely go away with latest version
  • BaBar now have D.Boutigny on WP8/TWG

8
D0 (Nikhef)
  • Have already run many events on the testbeds of
    NIKHEF and SARA
  • Wish to extend tests to the whole testbed
  • D0 RPMs are already in the EDG releases and will
    be installed on all sites. Will set up a special
    VO and RC for D0 at NIKHEF on a rather short time
    scale.
  • Jeff Templon, NIKHEF rep. in WP8, will report
    on work

9
ATLAS-EDG task force members and sympathizers
  • ATLAS: Jean-Jacques Blaising, Frederic Brochu,
    Alessandro De Salvo, Michael Gardner, Luc Goossens,
    Marcus Hardt, Roger Jones, Christos Kanellopoulos,
    Guido Negri, Fairouz Ohlsson-Malek, Steve O'Neale,
    Laura Perini, Gilbert Poulard, Alois Putzer,
    Di Qing, David Rebatto, Zhongliang Ren,
    Silvia Resconi, Oxana Smirnova, Stan Thompson,
    Luca Vaccarossa
  • EDG: Ingo Augustin, Stephen Burke, Frank Harris,
    Bob Jones, Emanuele Leonardi, Mario Reale,
    Markus Schulz, Jeffrey Templon
10
  • Achievements so far
  • see http://s.home.cern.ch/s/smirnova/www/atlas-edg/
  • A team of hard-working people across Europe in
    Atlas and EDG (middleware WP6 and WP8) has been
    set up (led by O Smirnova with help from R Jones
    and F Harris)
  • ATLAS software (release 3.2.1) is packed into
    relocatable RPMs, distributed and validated
    elsewhere
  • Following a fix in EDG removing the GASS Cache
    problem, 50% of the planned challenge has been
    performed (5 researchers × 10 jobs); only the CERN
    testbed was fully available to start, but this is
    changing fast

11
In progress
  • New set of challenges, including smaller input
    files
  • Presentation and first results: Luc Goossens
  • All the core Testbed sites (1.2.2) plus FZK are
    becoming available, so the rest of the challenge
    has a chance to be really distributed
  • Big file replication can be done, avoiding GDMP /
    Replica Manager
  • With distributed input files, several jobs
    already have been steered by the RB to NIKHEF,
    following the requested input data. The rest of
    the batch went to CERN
  • Report in preparation

12
Bottom line for Task Force
  • Major obstacles:
  • GASS Cache limitations (long jobs vs frequent
    submission): being worked on
  • File transfer time limit in data management tools:
    hopefully can be addressed soon
  • Still, the ways around are known and quick fixes
    are deployed, allowing production-like jobs to be
    run
  • The whole EDG middleware is pretty much still in
    development, and things are changing (improving!)
    on a daily basis

13
Essential requirements for making 1.2.n usable by
broader physics user community
  • Top level requirements:
  • Production testbed to be stable for weeks, not
    hours, and to allow a spectrum of job submissions
  • Have reasonably easy-to-use basic functions for
    job submission, replica handling and mass
    storage utilisation
  • Good, concise user documentation for all functions
  • Easy for the user to get certificates and to get
    into the correct VO working environment
  • So what happens now, in today's reality?
  • Having had very positive discussions at Budapest
    in joint meetings with Work Packages 1, 2, 5, 6:
  • The gass-cache and 20-min file limit problems
    are the absolute top priority, being pursued with
    patches right now. Let's hope we don't need a new
    version of Globus!
  • Wrap data management complexity while waiting
    for version 2 (GDMP is too complex for the
    average user); trying out an interim RM for
    single files (see the sketch at the end of this
    slide)
  • We need to clarify use of mass store (Castor,
    HPSS, RAL store) by multiple VOs
  • e.g. how is the store partitioned between VOs, and
    how does a non-Grid user access the data?
  • Discussions ongoing and interim solutions being
    worked on
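As a rough illustration of "wrapping the data management complexity" for single files, the sketch below hides the two manual steps (copy a file to a Storage Element, then register the replica) behind one call. "globus-url-copy" is the standard GridFTP client of that era; the wrapper itself, the destination path, and the "register_replica" helper are hypothetical placeholders for illustration, not the actual interim RM interface.

"""Illustrative sketch only: a single-call wrapper around the two steps a user
otherwise does by hand (copy to a Storage Element, register the replica).
'globus-url-copy' is the GridFTP client; 'register_replica' is a hypothetical
placeholder for the catalogue-registration step of the interim RM."""
import subprocess

def register_replica(lfn: str, physical_url: str) -> None:
    # Placeholder: a real wrapper would call the replica-catalogue
    # registration tool here; this sketch only records the mapping locally.
    print(f"register: {lfn} -> {physical_url}")

def replicate_file(source_url: str, dest_se: str, lfn: str) -> None:
    """Copy one file to a Storage Element and register it under a logical name."""
    dest_url = f"gsiftp://{dest_se}/flatfiles/{lfn}"
    # Step 1: transfer the file with GridFTP.
    subprocess.run(["globus-url-copy", source_url, dest_url], check=True)
    # Step 2: make the new copy known to the replica catalogue.
    register_replica(lfn, dest_url)

# Example (hypothetical host and path):
# replicate_file("file:///data/ntuple_001.root",
#                "se01.example.cern.ch", "atlas/dc1/ntuple_001.root")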

14
More essential requirements on use of 1.2
  • We must put people and procedures in place for
    mapping VO organisation onto test bed sites (e.g.
    quotas, priorities)
  • We must clarify user support at sites (middleware
    and applications)
  • Installation of applications software
  • should not be combined with the system
    installation
  • Authentication and authorisation
  • Can we streamline this procedure? (40-odd
    countries to accommodate for Atlas!)
  • Documentation (and training: EDG tutorials for
    the experiments)
  • Has to be user-oriented and concise
  • Much good work going on here (user guide and
    examples). About to be released

15
Some longer term requirements
  • Job Submission to take into account availability
    of space on SEs and the quota assigned to the user
    (e.g. for macro-jobs, say 500 jobs each generating
    1 GB; see the sketch at the end of this slide)
  • Mass Store should be on the Grid in a transparent
    way (space management, archiving, staging)
  • Need easy to use replica management system
  • Comments
  • Are some of these 1.2.n rather than 2, i.e.
    increments in functionality in successive
    releases?
  • Task Force people should maintain continuing
    dialogue with developers
  • (should include data challenge managers from all
    VOs in dialogue)
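To make the macro-job example above concrete: 500 jobs each writing about 1 GB need roughly 500 GB of Storage Element space and quota before submission is worthwhile. Below is a minimal sketch of the kind of pre-submission check this requirement implies; the free-space and quota figures are hypothetical inputs, since the testbed did not yet expose them to the submitter.

"""Illustrative sketch only: the kind of check a smarter job submission could
make before dispatching a 'macro-job'.  The free-space and quota numbers are
hypothetical inputs, not values the 1.2 testbed actually provided."""

def can_submit(n_jobs: int, output_per_job_gb: float,
               se_free_gb: float, user_quota_gb: float) -> bool:
    """Return True only if both the SE and the user's quota can hold the output."""
    required = n_jobs * output_per_job_gb          # e.g. 500 * 1 GB = 500 GB
    return required <= se_free_gb and required <= user_quota_gb

# The example from the slide: 500 jobs, 1 GB each.
print(can_submit(n_jobs=500, output_per_job_gb=1.0,
                 se_free_gb=800.0, user_quota_gb=600.0))   # True
print(can_submit(n_jobs=500, output_per_job_gb=1.0,
                 se_free_gb=800.0, user_quota_gb=300.0))   # False: quota too small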

16
Future activities of WP8 and some questions
  • The mandate of WP8 is to facilitate the
    interfacing of applications to EDG middleware,
    and participate in the evaluation and produce
    the evaluation reports (start writing very
    soon!).
  • Loose Cannons have been heavily involved in
    testing middleware components, and have produced
    test software and documentation. This should be
    packaged for use by the Test Group (now
    strengthened and formalised).
  • LCs will be involved in liaising with the
    experiments testing their applications. The
    details of how this relates to the new EDG/LCG
    Testing/Validation procedure have to be worked
    out.
  • WP8 has been involved in the development of
    application use cases and participates in
    current ATF activities. This is continuing. LCG
    via the GDB is to carry this on in a broader
    sense.
  • We are interested in the feasibility of a common
    application layer running over middleware
    functions. This issue goes into the domain of
    current LCG deliberations.

17
Summary
  • Current WP8 top priority activity is Atlas/EDG
    Task Force work
  • This has been very positive. It focuses attention
    on the real user problems, and as a result we
    review our requirements, design etc. Remember the
    eternal cycle! We must maintain flexibility with
    continuing dialogue between users and developers.
  • Will continue Task Force flavoured activities
    with the other experiments
  • Current use of the Testbed is focused on the main
    sites (CERN, Lyon, Nikhef, CNAF, RAL); this is
    mainly for reasons of support, given the unstable
    situation
  • Once stability is achieved (see Atlas/EDG work)
    we will expand to other sites. But we should be
    careful in selection of these sites in the first
    instance. Local support would seem essential.
  • WP8 will maintain a role in architecture
    discussions, and maybe be involved in some common
    application layer developments
  • THANKS to members of IT and the middleware WPs
    for heroic efforts in past months, and to
    Federico for laying the WP8 foundations