Title: WP8 Status and Plans
Outline of presentation
- Overview of experiment plans for use of Grid facilities/services for tests and data challenges
  - ATLAS
  - ALICE
  - CMS
  - LHCb
  - BaBar
  - D0
- Status of ATLAS/EDG Task Force work
- Essential requirements for making 1.2.n usable by a broader physics user community
- Future activities of WP8 and some questions
- Summary
ATLAS
- Currently in the middle of Phase 1 of DC1 (Geant3 simulation, Athena reconstruction, analysis). Many sites in Europe, US, Australia, Canada, Japan, Taiwan, Israel and Russia are involved
- Phase 2 of DC1 will begin Oct-Nov 2002 using the new event model
- Plans for use of Grid tools in DCs
  - Phase 1: ATLAS-EDG Task Force to repeat with EDG 1.2 1% of the simulations already done
    - Using CERN, CNAF, Nikhef, RAL, Lyon
    - 9 GB input, 100 GB output, 2000 CPU hrs
  - Phase 2 will make larger use of Grid tools. Maybe different sites will use different tools. There will be (many?) more sites. This is to be defined Sep 16-20
    - 10^6 CPU hrs, 20 TB input to reconstruction, 5 TB output (? How much on testbed?); a back-of-envelope check follows this slide
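To put the Phase 2 figure in perspective, a minimal back-of-envelope check: the 10^6 CPU-hour total is the slide's own number, while the facility sizes tried in the loop are purely illustrative assumptions, not ATLAS planning numbers.

    # How long 10^6 CPU hours takes on facilities of various assumed sizes.
    cpu_hours = 1e6                            # total from the slide
    for n_cpus in (500, 1000, 2000):           # illustrative facility sizes
        days = cpu_hours / n_cpus / 24.0
        print(f"{n_cpus:5d} CPUs -> {days:5.1f} days of continuous running")

On 1000 CPUs running flat out this is roughly six weeks, which makes the question of how much can go on the testbed a real one.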
ALICE
- ALICE assume that as soon as a stable version of 1.2.n is tested and validated, it will be progressively installed on all EDG testbed sites
- As new sites come online, an automatic tool will be used to submit test jobs of increasing output size and duration (see the sketch after this slide)
- At the moment ALICE do not plan a "data challenge" with EDG. However, they plan a data transfer test, as close as possible to the expected data transfer rate for a real production and analysis
- Will concentrate on the AliEn/EDG interface and on the AliRoot/EDG interface, in particular for items concerning Data Management
- Will use CERN, CNAF, Nikhef, Lyon, Turin, Catania for the first tests
- CPU and storage requirements can be tailored to the availability of facilities in the testbed, but will need some scheduling and priorities
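The slide does not describe the automatic tool itself, so the following is only a minimal sketch of such a submission ladder. The dg-job-submit command and the JDL attributes follow the EDG 1.2 user interface; the script name alice_testjob.sh and the size/duration steps are assumptions.

    import subprocess, tempfile

    # (output MB, run seconds) ladder -- illustrative values only
    LADDER = [(10, 60), (100, 600), (1000, 3600)]

    JDL = """\
    Executable    = "alice_testjob.sh";
    Arguments     = "{mb} {secs}";
    StdOutput     = "test.out";
    StdError      = "test.err";
    InputSandbox  = {{"alice_testjob.sh"}};
    OutputSandbox = {{"test.out", "test.err"}};
    """

    for mb, secs in LADDER:
        # write a JDL for this rung of the ladder, then hand it to the broker
        with tempfile.NamedTemporaryFile("w", suffix=".jdl", delete=False) as f:
            f.write(JDL.format(mb=mb, secs=secs))
        subprocess.run(["dg-job-submit", f.name], check=True)   # EDG 1.2 UI command

Checking the returned output sizes and durations against expectations is left out of the sketch.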
CMS
- Currently running production for the DAQ Technical Design Report (TDR)
- Requires the full chain of CMS software and production tools. This includes use of Objectivity (licensing problem in hand)
- The 5% Data Challenge (DC04) will start Summer 2003 and will last 7 months. This will produce 5x10^7 events. In the last month all data will be reconstructed and distributed to Tier1/2 centres for analysis
  - 1000 CPUs for 5 months, 100 TB output (LCG prototype); a consistency check of these figures follows this slide
- Use of Grid tools and facilities
  - Will not be used for the current production
  - Plan to use in DC04 production
  - EDG 1.2 will be used to make scale and performance tests (proof of concept). Tests on RB, RC and GDMP. Will need Objectivity for the tests
    - IC, RAL, CNAF/BO, Padova, CERN, Nikhef, IN2P3, Ecole Polytechnique, ITEP
    - Some sites will do EDT GLUE tests
    - CPU: 50 CPUs distributed; storage: 200 GB per site
- V2 will be necessary for DC04, starting Summer 2003 (it has the functionality required by CMS)
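A quick check that the DC04 figures above hang together, assuming only ~30-day months; everything else is from the slide.

    events    = 5e7       # 5x10^7 events (slide)
    cpus      = 1000      # (slide)
    months    = 5         # (slide)
    output_tb = 100       # (slide)

    cpu_secs = cpus * months * 30 * 24 * 3600    # ~30-day months assumed
    print(f"{cpu_secs / events:.0f} s/event")                  # ~259 s/event
    print(f"{output_tb * 1e12 / events / 1e6:.1f} MB/event")   # ~2.0 MB/event

That is roughly four CPU-minutes and 2 MB per event, a plausible scale for full simulation plus reconstruction.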
LHCb
- First intensive Data Challenge starts Oct 2002; currently doing intensive pre-tests at all sites
- Participating sites for 2002
  - CERN, Lyon, Bologna, Nikhef, RAL
  - Bristol, Cambridge, Edinburgh, Imperial, Oxford, ITEP Moscow, Rio de Janeiro
- Use of EDG Testbed
  - Install the latest OO environment on testbed sites. Flexible job submission, Grid/non-Grid
  - First tests (now) for MC reconstruction and analysis with data stored to Mass Store
  - Large-scale production tests (by October)
  - Production (if tests OK)
- Aim to do a percentage of production on the Testbed
  - Total requirement is 500 CPUs for 2 months and 10 TB
  - (10% should be OK on the testbed?)
BaBar Grid and EDG
- Target: have some production environment ready for all users by the end of this year
  - with attractive interface tools
  - customised to the SLAC site
- Have implemented local hacks to overcome problems with
  - use of the LSF Batch Scheduler (uses AFS)
  - the AFS File System used for User Home Directories
  - Batch Workers located inside the IFZ (security issue)
- Three parts of the Globus/EDG software were installed at SLAC: CE, WN and UI
- The exercise clearly showed that they run fine together, and also with the RB at IC
- Had problems with an old version of the RB; these should largely go away with the latest version
- BaBar now have D. Boutigny on WP8/TWG
D0 (Nikhef)
- Have already run many events on the testbeds of NIKHEF and SARA
- Wish to extend tests to the whole testbed
- D0 RPMs are already in the EDG releases and will be installed on all sites. Will set up a special VO and RC for D0 at NIKHEF on a rather short time scale
- Jeff Templon, NIKHEF rep. in WP8, will report on this work
ATLAS-EDG Task Force members and sympathizers
- ATLAS: Jean-Jacques Blaising, Frederic Brochu, Alessandro De Salvo, Michael Gardner, Luc Goossens, Marcus Hardt, Roger Jones, Christos Kanellopoulos, Guido Negri, Fairouz Ohlsson-Malek, Steve O'Neale, Laura Perini, Gilbert Poulard, Alois Putzer, Di Qing, David Rebatto, Zhongliang Ren, Silvia Resconi, Oxana Smirnova, Stan Thompson, Luca Vaccarossa
- EDG: Ingo Augustin, Stephen Burke, Frank Harris, Bob Jones, Emanuele Leonardi, Mario Reale, Markus Schulz, Jeffrey Templon
Achievements so far
- See http://s.home.cern.ch/s/smirnova/www/atlas-edg/
- A team of hard-working people across Europe in ATLAS and EDG (middleware, WP6, WP8) has been set up (led by O. Smirnova with help from R. Jones and F. Harris)
- ATLAS software (release 3.2.1) is packed into relocatable RPMs, distributed and validated elsewhere
- Following the GASS Cache fix in EDG, 50% of the planned challenge has been performed (5 researchers x 10 jobs). Only the CERN testbed was fully available at the start, but this is changing fast
In progress
- New set of challenges, including smaller input files
  - Presentation and first results: Luc Goossens
- All the core Testbed sites (1.2.2) are becoming available, FZK included, so the rest of the challenge has a chance to be really distributed
- Big file replication can be done, avoiding GDMP (using the Replica Manager)
- With distributed input files, several jobs have already been steered by the RB to NIKHEF, following the requested input data; the rest of the batch went to CERN (an illustrative JDL follows this slide)
- Report in preparation
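For readers unfamiliar with how the data-driven steering above works: when a job's JDL carries an InputData attribute, the RB matches the job to a site holding a replica of that file. A minimal sketch, with attribute names as in EDG 1.x JDL but a made-up executable, logical file name and catalogue URL:

    # Illustrative JDL (shown as a Python string so it could be written to
    # a file and fed to dg-job-submit as in the earlier sketch); all
    # values below are invented, not actual DC1 names.
    STEERED_JDL = """\
    Executable         = "atlsim.sh";
    InputData          = {"LF:dc1.002000.simul.0001.zebra"};
    ReplicaCatalog     = "ldap://rc.example.org:2010/rc=ATLAS RC,dc=example,dc=org";
    DataAccessProtocol = {"file", "gridftp"};
    OutputSandbox      = {"atlsim.log"};
    """

With this in place, the broker schedules the job at a site, such as NIKHEF, that holds a replica of the requested logical file.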
Bottom line for the Task Force
- Major obstacles
  - GASS Cache limitations (long jobs vs frequent submission): being worked on
  - File transfer time limit in the data management tools: hopefully can be addressed soon
- Still, the ways around are known and quick fixes are deployed, allowing us to run production-like jobs
- The whole EDG middleware is still very much in a development state, and things are changing (improving!) on a daily basis
Essential requirements for making 1.2.n usable by a broader physics user community
- Top-level requirements
  - Production testbed to be stable for weeks, not hours, and to allow a spectrum of job submissions
  - Have reasonably easy-to-use basic functions for job submission, replica handling and mass storage utilisation
  - Good, concise user documentation for all functions
  - Easy for the user to get certificates and to get into the correct VO working environment
- So what happens now, in today's reality?
  - Having had very positive discussions at Budapest in joint meetings with Workpackages 1, 2, 5 and 6
  - The gass-cache and 20 min file transfer limit problems are absolute top priority and are being pursued with patches right now. Let's hope we don't need a new version of Globus!
  - Wrap data management complexity while waiting for version 2 (GDMP is too complex for the average user); trying out the interim RM for single files (a sketch of such a wrapper follows this slide)
  - We need to clarify the use of mass store (Castor, HPSS, RAL store) by multiple VOs
    - e.g. how is the store partitioned between VOs, and how does a non-Grid user access the data?
  - Discussions are ongoing and interim solutions are being worked on
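A minimal sketch of the kind of wrapper meant above: hide the replica machinery behind a single call for the common single-file case. globus-url-copy is the standard Globus mover; the registration step is left as a placeholder, since it depends on whatever command the interim Replica Manager finally provides.

    import subprocess

    def copy_and_register(src_url, dest_url, lfn):
        """Copy one file between storage elements, then record the replica."""
        subprocess.run(["globus-url-copy", src_url, dest_url], check=True)
        register_replica(lfn, dest_url)

    def register_replica(lfn, pfn):
        # Placeholder: insert the (lfn -> pfn) mapping into the replica
        # catalogue using the interim RM's command, once that is settled.
        raise NotImplementedError

The point is the shape of the interface, not the implementation: one verb for the user, instead of GDMP's multi-step publish/subscribe cycle.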
More essential requirements on use of 1.2
- We must put people and procedures in place for mapping VO organisation onto testbed sites (e.g. quotas, priorities)
- We must clarify user support at sites (middleware, applications)
- Installation of applications software
  - should not be combined with the system installation
- Authentication/authorisation
  - Can we streamline this procedure? (40-odd countries to accommodate for ATLAS!)
- Documentation and training (EDG tutorials for the experiments)
  - Has to be user-oriented and concise
  - Much good work going on here (user guide and examples), about to be released
Some longer-term requirements
- Job submission to take into account the availability of space on SEs and the quota assigned to the user (e.g. for macro-jobs, say 500 each generating 1 GB); a toy pre-check follows this slide
- Mass Store should be on the Grid in a transparent way (space management, archiving, staging)
- Need an easy-to-use replica management system
- Comments
  - Are some of these 1.2.n rather than 2, i.e. increments in functionality in successive releases?
  - Task Force people should maintain a continuing dialogue with the developers
  - (should include the data challenge managers from all VOs in the dialogue)
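A toy illustration of the first requirement, using the slide's own example of 500 macro-jobs of ~1 GB each. The two query functions are hypothetical hooks: EDG 1.2 offers no such space or quota query, which is exactly the gap the requirement points at.

    N_JOBS, GB_PER_JOB = 500, 1.0          # the slide's macro-job example

    def free_space_gb(se):                 # hypothetical information-service query
        return 750.0                       # assumed value for illustration

    def user_quota_gb(user, se):           # hypothetical VO quota lookup
        return 600.0                       # assumed value for illustration

    def can_submit(se, user):
        needed = N_JOBS * GB_PER_JOB       # 500 GB in this example
        return needed <= free_space_gb(se) and needed <= user_quota_gb(user, se)

    print(can_submit("some-se.example.org", "someuser"))   # True: 500 <= 750 and 600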
Future activities of WP8 and some questions
- The mandate of WP8 is to facilitate the interfacing of applications to EDG middleware, to participate in the evaluation, and to produce the evaluation reports (start writing very soon!)
- The Loose Cannons have been heavily involved in testing middleware components, and have produced test software and documentation. This should be packaged for use by the Test Group (now strengthened and formalised)
- The LCs will be involved in liaising with the experiments testing their applications. The details of how this relates to the new EDG/LCG Testing/Validation procedure have to be worked out
- WP8 has been involved in the development of application use cases and participates in current ATF activities. This is continuing; LCG via the GDB will carry this on in a broader sense
- We are interested in the feasibility of a common application layer running over the middleware functions. This issue goes into the domain of the current LCG deliberations
Summary
- The current WP8 top-priority activity is the ATLAS/EDG Task Force work
- This has been very positive. It focuses attention on the real user problems, and as a result we review our requirements, design etc. Remember the eternal cycle! We must maintain flexibility, with a continuing dialogue between users and developers
- Will continue Task Force flavoured activities with the other experiments
- Current use of the Testbed is focused on the main sites (CERN, Lyon, Nikhef, CNAF, RAL); this is mainly for reasons of support, given the unstable situation
- Once stability is achieved (see the ATLAS/EDG work) we will expand to other sites. But we should be careful in the selection of these sites in the first instance; local support would seem essential
- WP8 will maintain a role in the architecture discussions, and may be involved in some common application layer developments
- THANKS to the members of IT and the middleware WPs for heroic efforts in the past months, and to Federico for laying the WP8 foundations