Title: Grid Production Experience in the ATLAS Experiment
1. Grid Production Experience in the ATLAS Experiment
- Kaushik De
- University of Texas at Arlington
- BNL Technology Meeting
- March 29, 2004
2. ATLAS Data Challenges
- Original goals (Nov 15, 2001)
  - Test the computing model, its software and its data model, and ensure the correctness of the technical choices to be made
  - Data Challenges should be executed at the prototype Tier centres
  - Data Challenges will be used as input for a Computing Technical Design Report due by the end of 2003 (?) and for preparing a MoU
- Current status
  - Goals are evolving as we gain experience
  - Computing TDR end of 2004
  - DCs are a yearly sequence of increasing scale and complexity
  - DC0 and DC1 (completed)
  - DC2 (2004), DC3, and DC4 planned
  - Grid deployment and testing is a major part of the DCs
3. ATLAS DC1 (July 2002 - April 2003)
- Goals
  - Produce the data needed for the HLT TDR
  - Get as many ATLAS institutes involved as possible
  - Worldwide collaborative activity
- Participation: 56 institutes
- Australia
- Austria
- Canada
- CERN
- China
- Czech Republic
- Denmark
- France
- Germany
- Greece
- Israel
- Italy
- Japan
- Norway
- Poland
- Russia
- Spain
- Sweden
- Taiwan
- UK
- USA
- using Grid
4. DC1 Statistics (G. Poulard, July 2003)
5. U.S. ATLAS DC1 Data Production
- Year-long process, Summer 2002-2003
- Played the 2nd largest role in ATLAS DC1
- Exercised both farm- and grid-based production
- 10 U.S. sites participating
  - Tier 1: BNL; prototype Tier 2s: BU, IU/UC; Grid Testbed sites: ANL, LBNL, UM, OU, SMU, UTA (UNM and UTPA will join for DC2)
- Generated 2 million fully simulated, piled-up and reconstructed events
- U.S. was the largest grid-based DC1 data producer in ATLAS
- Data used for the HLT TDR, the Athens physics workshop, reconstruction software tests...
6. U.S. ATLAS Grid Testbed
- BNL - U.S. Tier 1, 2000 nodes, 5 for ATLAS, 10 TB, HPSS through Magda
- LBNL - pdsf cluster, 400 nodes, 5 for ATLAS (more if idle, 10-15 used), 1 TB
- Boston U. - prototype Tier 2, 64 nodes
- Indiana U. - prototype Tier 2, 64 nodes
- UT Arlington - new 200 cpus, 50 TB
- Oklahoma U. - OSCER facility
- U. Michigan - test nodes
- ANL - test nodes, JAZZ cluster
- SMU - 6 production nodes
- UNM - Los Lobos cluster
- U. Chicago - test nodes
7. U.S. Production Summary
- Exercised both farm- and grid-based production
- Valuable large-scale grid-based production experience
- Total of 30 CPU-years delivered to DC1 from the U.S.
- Total produced file size: 20 TB on the HPSS tape system, 10 TB on disk
- (In the accompanying table: black - majority grid produced, blue - majority farm produced)
8. Grid Production Statistics
- These are examples of some datasets produced on the Grid. Many other large samples were produced, especially at BNL using batch.
9. DC1 Production Systems
- Local batch systems - bulk of production
- GRAT - grid scripts, 50k files produced in the U.S.
- NorduGrid - grid system, 10k files in Nordic countries
- AtCom - GUI, 10k files at CERN (mostly batch)
- GCE - Chimera based, 1k files produced
- GRAPPA - interactive GUI for individual users
- EDG/LCG - test files only
- ...and systems I forgot
- More systems coming for DC2
  - Windmill
  - GANGA
  - DIAL
10. GRAT Software
- GRid Applications Toolkit
- Developed by KD, Horst Severini, Mark Sosebee, and students
- Based on Globus, Magda and MySQL
- Shell and Python scripts, modular design (see the sketch below)
- Rapid development platform
  - Quickly develop packages as needed by the DC
  - Physics simulation (GEANT/ATLSIM)
  - Pile-up production and data management
  - Reconstruction
- Test grid middleware, test grid performance
- Modules can be easily enhanced or replaced, e.g. EDG resource broker, Chimera, replica catalogue (in progress)
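GRAT's production scripts wrap the standard Globus job-submission clients. The fragment below is a hedged, minimal sketch of such a wrapper in Python; the gatekeeper contact, script path and arguments are invented for illustration and this is not GRAT's actual code.

    # Hedged sketch of a GRAT-style submission wrapper (illustrative only).
    import subprocess

    def submit_simulation(gatekeeper, run_script, arguments):
        """Submit one simulation partition through a Globus gatekeeper.

        Uses the standard globus-job-submit client (Globus Toolkit 2); the
        run script that stages software and inputs is assumed to already
        exist on the remote site.
        """
        cmd = ["globus-job-submit", gatekeeper, run_script] + list(arguments)
        out = subprocess.check_output(cmd)
        return out.decode().strip()   # job contact string, used later to poll status

    if __name__ == "__main__":
        contact = submit_simulation("atlas.dpcc.uta.edu/jobmanager-pbs",   # hypothetical site
                                    "/share/grat/atlsim-run.sh",           # hypothetical wrapper script
                                    ["dataset=002000", "partition=0017"])
        print("submitted: " + contact)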
11. GRAT Execution Model
1. Resource discovery
2. Partition selection
3. Job creation
4. Pre-staging
5. Batch submission
6. Job parameterization
7. Simulation
8. Post-staging
9. Cataloging
10. Monitoring
(A minimal sketch of this chain follows below.)
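The ten-step chain can be read as a simple pipeline. The sketch below is runnable but purely illustrative: the step functions are hypothetical stubs standing in for the real GRAT scripts, and only their ordering follows the slide.

    # Illustrative sketch of the GRAT execution chain; stubs are hypothetical.
    def discover_resources():      return "gatekeeper.example.org"   # 1. resource discovery
    def select_partition(site):    return 17                         # 2. next free partition from the production DB
    def create_job(site, part):    return {"site": site, "partition": part}   # 3. job creation
    def prestage_inputs(job):      pass        # 4. pre-staging (e.g. input files via Magda)
    def submit_to_batch(job):      return "job-contact"              # 5. batch submission via the gatekeeper
    def parameterize(job):         pass        # 6. random seeds, output filenames
    def wait_for_simulation(h):    pass        # 7. simulation runs on the worker node
    def poststage_outputs(job):    pass        # 8. post-staging of output files
    def catalog_outputs(job):      pass        # 9. register outputs in Magda / AMI
    def update_monitoring(job):    pass        # 10. update monitoring / production DB status

    def run_one_job():
        site = discover_resources()
        part = select_partition(site)
        job = create_job(site, part)
        prestage_inputs(job)
        handle = submit_to_batch(job)
        parameterize(job)
        wait_for_simulation(handle)
        poststage_outputs(job)
        catalog_outputs(job)
        update_monitoring(job)

    run_one_job()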
12. Databases Used in GRAT
- Production database
  - define logical job parameters and filenames
  - track job status, updated periodically by scripts (see the sketch below)
- Data management (Magda)
  - file registration/catalogue
  - grid-based file transfers
- Virtual Data Catalogue
  - simulation job definition
  - job parameters, random numbers
- Metadata catalogue (AMI)
  - post-production summary information
  - data provenance
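As an illustration of the first item, a GRAT-style script might claim a partition and update its status in the MySQL production database roughly as below. The connection details and the jobs table schema are invented for the sketch, not GRAT's actual schema.

    # Hedged sketch of claiming and updating a partition in the production DB.
    import MySQLdb   # MySQL was used for the GRAT production database

    def claim_next_partition(db, dataset):
        cur = db.cursor()
        # pick one unassigned partition for this dataset
        cur.execute("SELECT partition_nr FROM jobs "
                    "WHERE dataset=%s AND status='defined' LIMIT 1", (dataset,))
        row = cur.fetchone()
        if row is None:
            return None
        # mark it as running so other scripts do not pick it up
        cur.execute("UPDATE jobs SET status='running', last_update=NOW() "
                    "WHERE dataset=%s AND partition_nr=%s", (dataset, row[0]))
        db.commit()
        return row[0]

    db = MySQLdb.connect(host="proddb.example.org", user="grat",
                         passwd="...", db="dc1_production")   # hypothetical credentials
    print(claim_next_partition(db, "dc1.002000.simul"))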
13. U.S. Middleware Evolution
- Globus - used for 95% of DC1 production
- Condor-G - used successfully for simulation (complex pile-up workflow not yet)
- DAGMan - tested for simulation, used for all grid-based reconstruction
- Chimera
- LCG
14. DC1 Production Experience
- The grid paradigm works, using Globus
  - Opportunistic use of existing resources; run anywhere, from anywhere, by anyone...
- Successfully exercised grid middleware with increasingly complex tasks
  - Simulation - create physics data from pre-defined parameters and input files; CPU intensive
  - Pile-up - mix 2500 min-bias data files into physics simulation files; data intensive
  - Reconstruction - data intensive, multiple passes
  - Data tracking - multiple steps, one -> many -> many more mappings (see the toy example below)
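A toy illustration of the one -> many -> many-more bookkeeping that data tracking implies; all filenames are invented.

    # Toy provenance map: one generated file fans out into many simulated
    # partitions, each of which fans out again after pile-up/reconstruction.
    provenance = {
        "dc1.002000.evgen._0001.root": [        # one generator-level input file
            "dc1.002000.simul._0001.zebra",     # -> many simulated partitions
            "dc1.002000.simul._0002.zebra",
        ],
        "dc1.002000.simul._0001.zebra": [       # each -> many piled-up / reconstructed outputs
            "dc1.002000.pileup._0001.zebra",
            "dc1.002000.recon._0001.ntuple",
        ],
    }

    def downstream(filename, depth=0):
        """Print everything derived from a file - what the metadata catalogue must track."""
        for child in provenance.get(filename, []):
            print("  " * depth + child)
            downstream(child, depth + 1)

    downstream("dc1.002000.evgen._0001.root")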
15. Grid Quality of Service
- Anything that can go wrong WILL go wrong
  - During an 18-day run, every system died at least once
  - Local experts were not always accessible
  - Examples: scheduling machines died 5 times (thrice power failure, twice system hung), network outages multiple times, the gatekeeper died at every site at least 2-3 times
  - All three databases died at least once!
  - Scheduled maintenance - HPSS, Magda server, LBNL hardware...
  - Poor cleanup, lack of fault tolerance in Globus
- These outages should be expected on the grid - software design must be robust (see the retry sketch below)
- We managed > 100 files/day (80% efficiency) in spite of these problems!
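In practice the robustness this argues for mostly means retrying every grid operation instead of aborting the run. A minimal, generic sketch (not the actual DC1 scripts):

    # Generic retry wrapper for flaky grid operations (gatekeeper down,
    # network outage, database restart...). Purely illustrative.
    import time

    def retry(operation, attempts=5, delay=300):
        """Run a grid operation, retrying with a delay on each failure."""
        for i in range(attempts):
            try:
                return operation()
            except Exception as err:
                print("attempt %d failed: %s" % (i + 1, err))
                time.sleep(delay)        # wait for the site/service to come back
        raise RuntimeError("giving up after %d attempts" % attempts)

    # usage (hypothetical): retry(lambda: submit_simulation(gatekeeper, script, args))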
16. Software Issues
- ATLAS software distribution worked well for DC1 farm production, but was not well suited for grid production
- No integration of databases - caused many problems
- Magda and AMI very useful - but we are missing a data management tool for truly distributed production
- Required a lot of people to run production in the U.S., especially with so many sites on both grid and farm
- Startup of grid production was slow - but we learned useful lessons
- Software releases were often late - leading to a chaotic last-minute rush to finish production
17. New Production System for DC2
- Goals
  - Automated data production system for all ATLAS facilities
  - Common database for all production - Oracle currently
  - Common supervisor run by all facilities/managers - Windmill
  - Common data management system - Don Quichote
  - Executors developed by middleware experts (Capone, LCG, NorduGrid, batch systems, CanadaGrid...)
  - Final verification of data done by the supervisor
18. Windmill - Supervisor
- Supervisor development / U.S. DC production team
  - UTA: Kaushik De, Mark Sosebee, Nurcan Ozturk, students
  - BNL: Wensheng Deng, Rich Baker
  - OU: Horst Severini
  - ANL: Ed May
- Windmill web page
  - http://www-hep.uta.edu/windmill
- Windmill status
  - version 0.5 released February 23
  - includes a complete library of XML messages between agents
  - includes sample executors for local, PBS and web services
  - can run on any Linux machine with Python 2.2
  - development continuing - Oracle production DB, DMS, new schema
19. Windmill Messaging
- All messaging is XML based (see the sketch below)
- Agents communicate using the Jabber (open chat) protocol
- Agents have the same command line interface - GUI in future
- Agent web servers can run at the same or different locations
- Executor accesses the grid directly and/or through web services
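For illustration, an XML request/reply exchange between agents might look like the sketch below. The element and attribute names are invented, not Windmill's actual schema; only the numJobsWanted message name comes from the talk.

    # Hedged sketch of building and parsing an XML agent message.
    import xml.etree.ElementTree as ET

    def build_num_jobs_wanted(requested):
        msg = ET.Element("message", type="numJobsWanted")    # hypothetical schema
        ET.SubElement(msg, "numJobs").text = str(requested)
        return ET.tostring(msg).decode()

    def parse_reply(xml_text):
        root = ET.fromstring(xml_text)
        return int(root.findtext("numJobs"))

    print(build_num_jobs_wanted(50))   # supervisor asks how many jobs the executor wants
    # an executor would answer with how many jobs it can actually accept
    print(parse_reply('<message type="numJobsWanted"><numJobs>20</numJobs></message>'))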
20. Intelligent Agents
- Supervisor/executor are intelligent communication agents
  - uses the Jabber open source instant messaging framework
  - Jabber server routes XMPP messages - acts as an XML data switch
  - reliable p2p asynchronous message delivery through firewalls
  - built-in support for dynamic directory, discovery, presence
  - extensible - we can add monitoring and debugging agents easily
  - provides chat capability for free - collaboration among operators
  - Jabber grid proxy under development (LBNL - Agarwal)
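A minimal sketch of such an agent exchanging XML payloads over Jabber/XMPP, written against the modern slixmpp library purely for illustration; the 2004 agents used the Jabber client libraries of that era, and the JIDs and payload here are invented.

    # Illustrative Jabber agent: receives a message, replies with an XML payload.
    import slixmpp

    class ExecutorAgent(slixmpp.ClientXMPP):
        def __init__(self, jid, password, supervisor_jid):
            slixmpp.ClientXMPP.__init__(self, jid, password)
            self.supervisor_jid = supervisor_jid
            self.add_event_handler("session_start", self.on_start)
            self.add_event_handler("message", self.on_message)

        def on_start(self, event):
            self.send_presence()     # announce availability (presence/discovery)
            self.get_roster()

        def on_message(self, msg):
            if msg["type"] in ("chat", "normal"):
                print("received:", msg["body"])   # body carries the XML payload
                self.send_message(mto=self.supervisor_jid,
                                  mbody="<message type='numJobsWanted'><numJobs>20</numJobs></message>",
                                  mtype="chat")

    agent = ExecutorAgent("executor@jabber.example.org", "secret",
                          "supervisor@jabber.example.org")   # hypothetical accounts
    agent.connect()
    agent.process(forever=True)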
21. Core XML Messages
- numJobsWanted
  - supervisor-executor negotiation of the number of jobs to process
- executeJobs
  - supervisor sends XML based job definitions
- getExecutorData
  - job acceptance, handle exchange (supports stateless executors)
- getStatus
  - polling of job status
- fixJobs
  - post-reprocessing and cleanup
- killJob
  - forced job abort
(A dispatch sketch for these messages follows below.)
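A hedged sketch of how an executor might dispatch these six messages to handler functions; only the message names come from the slide, the handlers and their replies are invented.

    # Illustrative message dispatch table on the executor side.
    def num_jobs_wanted(body):   return {"numJobs": 20}           # how many jobs we can take
    def execute_jobs(body):      return {"accepted": True}        # receive XML job definitions
    def get_executor_data(body): return {"handles": ["job-0001"]} # hand back job handles (stateless executor)
    def get_status(body):        return {"job-0001": "running"}   # poll job status
    def fix_jobs(body):          return {"cleaned": ["job-0001"]} # post-processing and cleanup
    def kill_job(body):          return {"killed": body}          # forced abort

    HANDLERS = {
        "numJobsWanted":   num_jobs_wanted,
        "executeJobs":     execute_jobs,
        "getExecutorData": get_executor_data,
        "getStatus":       get_status,
        "fixJobs":         fix_jobs,
        "killJob":         kill_job,
    }

    def handle(message_type, body=None):
        return HANDLERS[message_type](body)

    print(handle("numJobsWanted"))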
22. Core Windmill Libraries
- interact.py - command line interface library
- agents.py - common intelligent agent library
- xmlkit.py - XML creation (generic) and parsing library
- messages.py - XML message creation (specific)
- proddb.py - production database methods for Oracle, MySQL, local, dummy, and possibly other options
- supervise.py - supervisor methods to drive production (see the loop sketch below)
- execute.py - executor methods to run facilities
23. Capone Executor
- Various executors are being developed
  - Capone - U.S. VDT executor by U. of Chicago and Argonne
  - Lexor - LCG executor, mostly by Italian groups
  - NorduGrid, batch (Munich), Canadian, Australian(?)
- Capone is based on GCE (Grid Computing Environment)
  - (VDT Client/Server, Chimera, Pegasus, Condor, Globus)
- Status
  - Python module
  - Process thread for each job (see the sketch below)
  - Archive of managed jobs
  - Job management
  - Grid monitoring
  - Aware of key parameters (e.g. available CPUs, jobs running)
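A minimal sketch of the thread-per-job idea with a shared archive of managed jobs; purely illustrative, not Capone's actual code.

    # Each accepted job gets its own worker thread; a shared dict records status.
    import threading, time

    archive = {}                        # archive of managed jobs: handle -> status
    archive_lock = threading.Lock()

    def process_job(handle):
        for status in ("translate", "submission", "running", "stageOut", "end"):
            with archive_lock:
                archive[handle] = status
            time.sleep(0.1)             # stand-in for the real grid work at each step

    threads = [threading.Thread(target=process_job, args=("job-%04d" % i,)) for i in range(3)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(archive)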
24. Capone Architecture
- Message interface
  - Web Service
  - Jabber
- Translation level
  - Windmill
- CPE (Capone Process Engine)
- Processes
  - Grid
  - Stub
  - DonQuixote
- (from Marco Mambelli)
25. Capone Processing
- Processing is driven by supervisor requests (executeJob received, fixJob, ...)
- Each job moves through a sequence of statuses:
  'received', 'translate', 'DAXgen', 'RLSreg', 'scheduling', 'cDAGgen', 'submission', 'running', 'checking', 'stageOut', 'cleaning', 'end', 'killing'
- Each step is marked completed/failed; completion/failure is also tracked per job
- Failed steps (e.g. stageOut) go through recovery before the job reaches end/fixJob
(A state-machine sketch of this life cycle follows below.)
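A hedged sketch of this life cycle as a simple state machine, using the status names above; the recovery and killing branches of the real engine are reduced to a single failure check here.

    # Illustrative linear state machine for a Capone-style job.
    STATES = ['received', 'translate', 'DAXgen', 'RLSreg', 'scheduling', 'cDAGgen',
              'submission', 'running', 'checking', 'stageOut', 'cleaning', 'end']

    def advance(job):
        """Move a job to its next status, or to recovery if the current step failed."""
        if job.get("failed"):
            job["status"] = "recovery"    # e.g. a failed stageOut is retried via fixJob
            return job
        i = STATES.index(job["status"])
        job["status"] = STATES[min(i + 1, len(STATES) - 1)]
        return job

    job = {"status": "received"}
    while job["status"] != "end":
        advance(job)
        print(job["status"])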
26. Windmill Screenshots
28. Web Services Example
29. Conclusion
- Data Challenges are important for ATLAS software and computing infrastructure readiness
- Grids will be the default testbed for DC2
- U.S. playing a major role in DC2 planning and production
- 12 U.S. sites ready to participate in DC2
- Major U.S. role in production software development
- Test of the new grid production system is imminent
- Physics analysis will be the emphasis of DC2 - new experience
- Stay tuned