Title: ATLAS Software
1 ATLAS Software & Computing: Status and Developments
17 October 2003
- Dario Barberis (CERN & Genoa University/INFN)
- David Quarrie (CERN & LBNL)
2 ATLAS Computing Organization
- The ATLAS Computing Organization was revised at the beginning of 2003 in order to adapt it to current needs
- Basic principles
- Management Team consisting of
- Computing Coordinator (Dario Barberis)
- Software Project Leader (David Quarrie)
- Small(er) executive bodies
- Shorter, but more frequent, meetings
- Good information flow, both horizontal and vertical
- Interactions at all levels with the LCG project
- The new structure is now in place and working well
- A couple of areas still need some thought (this month)
4 ATLAS Computing Timeline
5 Simulation
- Geant4 geometry of the full ATLAS detector completed
- a few details to sort out (services etc.)
- Combined test beam simulation being assembled
- Global performance tests and tuning on-going
- memory OK (300 MB for full geometry)
- CPU roughly 2 x Geant3 for calorimetry (dominates the total CPU)
- Hit definition completed for almost all detectors
- hit persistency in progress
- Still missing (active work in progress): MC truth
- particles produced by G4 processes
- connection between hits and particles (a generic illustration is sketched after this slide)
- Could do with more people working on global simulation!
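As a rough illustration of the hit-to-particle connection mentioned above: in Geant4, a sensitive detector can record the track ID of the particle that produced each energy deposit, which is the basic hook later used to navigate from hits back to MC truth. The sketch below is a minimal, generic Geant4 example; SimpleHit, SimpleSD and the collection name are invented here and are not ATLAS classes.

```cpp
// Minimal, generic Geant4 sketch (not ATLAS code): a sensitive detector whose
// hits store the Geant4 track ID alongside the energy deposit, so hits can be
// associated afterwards with the MC-truth particles that produced them.
#include "G4VSensitiveDetector.hh"
#include "G4VHit.hh"
#include "G4THitsCollection.hh"
#include "G4HCofThisEvent.hh"
#include "G4SDManager.hh"
#include "G4Step.hh"
#include "G4Track.hh"
#include "G4TouchableHistory.hh"
#include "G4ThreeVector.hh"

class SimpleHit : public G4VHit {
public:
  G4int         trackID = 0;  // link back to the generating particle
  G4double      edep    = 0.; // energy deposited in this step
  G4ThreeVector pos;          // global position of the deposit
};

using SimpleHitsCollection = G4THitsCollection<SimpleHit>;

class SimpleSD : public G4VSensitiveDetector {
public:
  explicit SimpleSD(const G4String& name) : G4VSensitiveDetector(name) {
    collectionName.insert("SimpleHits");
  }

  void Initialize(G4HCofThisEvent* hce) override {
    fHits = new SimpleHitsCollection(SensitiveDetectorName, collectionName[0]);
    const G4int id = G4SDManager::GetSDMpointer()->GetCollectionID(fHits);
    hce->AddHitsCollection(id, fHits);
  }

  G4bool ProcessHits(G4Step* step, G4TouchableHistory*) override {
    const G4double edep = step->GetTotalEnergyDeposit();
    if (edep <= 0.) return false;                   // skip steps with no deposit
    auto* hit    = new SimpleHit();
    hit->trackID = step->GetTrack()->GetTrackID();  // hit-to-particle association
    hit->edep    = edep;
    hit->pos     = step->GetPreStepPoint()->GetPosition();
    fHits->insert(hit);
    return true;
  }

private:
  SimpleHitsCollection* fHits = nullptr;
};
```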
6 Digitization & Pile-up
- Digitization implemented at some level in the Athena framework for all detectors
- tuning with respect to test beam data still needed in most cases
- Pile-up procedure tested (for a long time now) for Pixels and LAr (the basic idea is sketched after this slide)
- work on-going for the other detectors; needs careful checking and performance evaluation
- Simulation of ROD algorithms still incomplete for some detectors
- Persistency of RDOs (Raw Data Objects) in POOL next month
- ByteStream (raw data) format to be revisited in view of the 2004 combined test beam
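The pile-up procedure mentioned above can be pictured as overlaying hits from randomly chosen minimum-bias events on top of the signal event, bunch crossing by bunch crossing, before digitization. The code below is a self-contained sketch of that idea only; the types, function names and parameters (mu, bunch spacing, number of crossings) are illustrative assumptions, not the ATLAS implementation.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hedged sketch: overlay minimum-bias hits onto the signal event before
// digitization. Each overlaid event is shifted in time by the offset of its
// bunch crossing, since slow detectors integrate over several crossings.
struct Hit { int channel; double energy; double time; };
using Event = std::vector<Hit>;

void addPileUp(Event& signal, const std::vector<Event>& minBiasPool,
               double mu, int nCrossings, double bunchSpacingNs,
               std::mt19937& rng) {
  std::poisson_distribution<int> nOverlaid(mu);   // interactions per crossing
  std::uniform_int_distribution<std::size_t> pick(0, minBiasPool.size() - 1);
  for (int bc = -nCrossings; bc <= nCrossings; ++bc) {
    const double tShift = bc * bunchSpacingNs;    // time offset of this crossing
    const int n = nOverlaid(rng);
    for (int i = 0; i < n; ++i) {
      for (Hit h : minBiasPool[pick(rng)]) {      // copy hits, shift their time
        h.time += tShift;
        signal.push_back(h);
      }
    }
  }
}
```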
7 Reconstruction
- RTF (Reconstruction Task Force) report delivered
- sets guidelines for the Reconstruction Event Data Model and the interfaces between (sub)algorithms (a generic algorithm skeleton is sketched after this slide)
- implementation already started for detector reconstruction
- combined reconstruction will follow soon
- co-ordination between tracking detectors (InDet & Muons) also started in view of a common (ATLAS) Track model
- Validation team feedback showed one significant problem with the current major software release (7.0.1)
- Release 7.0.2 to be built and kept stable for end-users
- The next few developer releases (until end 2003) may be unstable, as some technical code reshuffling is needed to conform to the new common RTF-defined interfaces
- end users should use release 7.0.2 (plus bug fixes if needed)
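For readers less familiar with Athena, the (sub)algorithm interfaces referred to above build on the standard Gaudi algorithm pattern: each algorithm implements initialize/execute/finalize and exchanges data with other algorithms through the event store rather than by direct calls, which is why agreed EDM interfaces matter. The skeleton below is a hedged, generic illustration of that pattern; MyRecoAlg and everything inside it are hypothetical, not ATLAS code.

```cpp
// Hedged sketch of a Gaudi/Athena-style algorithm: per-event work happens in
// execute(); inputs are read from and outputs recorded to the event store so
// that downstream (sub)algorithms see only the common EDM interfaces.
#include "GaudiKernel/Algorithm.h"
#include "GaudiKernel/MsgStream.h"

class MyRecoAlg : public Algorithm {          // hypothetical example algorithm
public:
  MyRecoAlg(const std::string& name, ISvcLocator* svcLoc)
    : Algorithm(name, svcLoc) {}

  StatusCode initialize() override {
    MsgStream log(msgSvc(), name());
    log << MSG::INFO << "initializing" << endmsg;
    return StatusCode::SUCCESS;
  }

  StatusCode execute() override {
    // Retrieve input objects (e.g. RIOs) from the event store, run this
    // reconstruction step, and record output objects (e.g. Tracks) for
    // downstream algorithms.
    return StatusCode::SUCCESS;
  }

  StatusCode finalize() override { return StatusCode::SUCCESS; }
};
```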
8 Event Persistency
- POOL integrated in release 7.0.0 in September
- not all the functionality needed for our Data Model was available
- Release 7.1.0 (today) contains newer POOL/ROOT versions
- enough for most data classes
- being tested with generator output (HepMC)
- still missing: support for DataLinks (used for object associations; the underlying idea is sketched after this slide)
- Release 7.2.0 (end October) will have full functionality
- Hits, RDOs, RIOs, Tracks, CaloClusters etc.
- The data dictionary is generated automatically: once the machinery works, it works for all data types
- Full testing plan being discussed, several options
- test directly with HepMC, Geant4 hits, Athena RDOs and the Reco EDM (full chain for functionality)
- convert DC1 hits to POOL and run a large-scale test
- in any case a few more people could help a lot!
9 Data Management Issues
- ATLAS Database Coordination Board recently set up
- coordination of
- Production & Installation DBs (TCn)
- Configuration DB (online)
- Conditions DB (online and offline)
- with respect to
- data transfer
- synchronization
- data transformation algorithms
- i.e. from survey measurements of reference marks on muon chambers to wire positions in space (usable online and offline); a toy example of such a transformation is sketched after this slide
- members
- Richard Hawkings (Alignment & Calibration Coordinator), chair
- David Malon (Offline Data Management Coordinator)
- Igor Soloviev (Online DB contact person)
- Kathy Pommes (TCn Production & Installation DB)
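To make the "survey marks to wire positions" example concrete, such a transformation amounts to composing the surveyed placement of a chamber (a rotation plus a translation derived from the reference marks) with the nominal wire offsets inside the chamber frame. The toy code below illustrates only that geometric step; every name and number is invented and bears no relation to the real ATLAS muon alignment algorithms.

```cpp
// Toy illustration: place a wire in the global frame by applying a surveyed
// chamber placement (rotation R and translation T) to the wire's nominal
// position in the chamber frame. Purely illustrative, not ATLAS code.
#include <array>
#include <cstdio>

struct Placement {
  double R[3][3];  // rotation of the chamber frame into the global frame
  double T[3];     // surveyed position of the chamber origin (from the marks)
};

std::array<double, 3> toGlobal(const Placement& p, const std::array<double, 3>& local) {
  std::array<double, 3> g{};
  for (int i = 0; i < 3; ++i)
    g[i] = p.R[i][0] * local[0] + p.R[i][1] * local[1] + p.R[i][2] * local[2] + p.T[i];
  return g;
}

int main() {
  // Identity rotation and an arbitrary surveyed offset, just to show the call.
  const Placement survey{{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}, {1250.0, -30.0, 7200.0}};
  const std::array<double, 3> wireInChamber{0.0, 15.0, 0.0};  // nominal offset
  const auto wireGlobal = toGlobal(survey, wireInChamber);
  std::printf("wire at (%.1f, %.1f, %.1f) mm\n", wireGlobal[0], wireGlobal[1], wireGlobal[2]);
  return 0;
}
```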
10 Conditions Data Working Group
11 Computing Model Working Group (1)
- Work on the Computing Model has been done in several different contexts
- online to offline data flow
- world-wide distributed reconstruction and analysis
- computing resource estimates
- The time has come to bring all these inputs together coherently
- A small group of people has been put together to start collecting all existing information and defining further work in view of the Computing TDR, with the following backgrounds:
- Resources
- Networks
- Data Management
- Grid applications
- Computing farms
- Distributed physics analysis
- Distributed productions
- Alignment and Calibration procedures
- Data Challenges and tests of the computing model
12 Computing Model Working Group (2)
- This group is
- first assembling existing information and digesting it
- acting as the contact point for input into the Computing Model from all ATLAS members
- preparing a running Computing Model document with up-to-date information, to be used for resource bids etc.
- preparing the Computing Model Report for the LHCC/LCG by end 2004
- contributing the Computing Model section of the Computing TDR (2005)
- The goal is to come up with a coherent model for
- the physical hardware configuration
- e.g. how much disk should be located at the experiment hall, between the Event Filter and the Prompt Reconstruction Farm (a back-of-envelope illustration follows after this slide)
- data flows
- processing stages
- latencies
- resources needed at CERN and in Tier-1 and Tier-2 facilities
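For a flavour of the kind of estimate involved, the disk buffer between the Event Filter and prompt reconstruction scales as output rate times event size times the required buffering time. The numbers below are illustrative assumptions made for this sketch only; they are not figures from this talk or from the Computing Model group.

```cpp
// Back-of-envelope sketch of a Computing Model estimate: disk needed at the
// experiment hall to buffer Event Filter output before prompt reconstruction.
// All input numbers are assumptions for illustration.
#include <cstdio>

int main() {
  const double rate_hz       = 200.0;   // assumed EF output rate
  const double event_size_mb = 1.6;     // assumed raw event size
  const double buffer_hours  = 24.0;    // assumed buffering depth
  const double disk_tb = rate_hz * event_size_mb * buffer_hours * 3600.0
                         / (1024.0 * 1024.0);
  std::printf("buffer disk needed: %.1f TB\n", disk_tb);  // about 26 TB here
  return 0;
}
```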
13 Combined Test Beam
- Most of the software (infrastructure and algorithms) is common to the full ATLAS simulation and the combined test beam
- the geometry is different (for Geant4 simulation and for reconstruction), but built out of the same basic units
- the alignment procedure is completely different
- tracking for the InDet is somewhat different (geometry and B field)
- Offline Test Beam group assembled
- includes contact people from the detector test beam communities and the major software activities
- coordinated by Ada Farilla
- Offline test beam support in parallel to DC2
- same timescale
- different constraints and priorities
- again, would need more people to support concurrent activities
14 Data Challenges
- DC1 completed earlier this year
- some productions are still going on using the DC1 software and organization
- they are now called continuous productions
- we foresee a semi-automatic system for these medium-sized productions
- DC2 operation in 2004
- distributed production of > 10^7 simulated events in April-June
- events sent to CERN in ByteStream (raw data) format to Tier-0
- (possibly) prompt alignment/calibration and (certainly) reconstruction processes run on the prototype Tier-0 in a short period of time (10 days, "10%" test)
- reconstruction results distributed to Tier-1s and analysed on the Grid
- needs delivery of well-tested software by March for simulation and May for reconstruction
15 ATLAS DC1 (July 2002-April 2003)
- Driven by the production of data for the High Level Trigger (HLT) and Physics communities
- HLT-TDR in June 2003
- Athens Physics Workshop in May 2003
- Main goals achieved
- Put in place the full software chain from event generation to reconstruction
- Put in place the distributed production
- Exercised production with Grid tools
- NorduGrid for all phases of DC1
- US-ATLAS
- simulation
- pile-up and reconstruction in a very significant way
- > 10K jobs, 10K CPU-days, > 10 TB of data
- EDG (pioneer role) and UK-Grid in test mode
- Exercised data replication and transfer (Magda)
16 Grid in ATLAS DC1
- (Diagram) Grid flavours used in DC1: NorduGrid (full DC1 production), US-ATLAS (DC1 simulation, pile-up and reconstruction), EDG -> LCG (part of DC1 simulation, several tests)
- DC2: same basic components for job definition, job submission, recipe bookkeeping, data catalog, transfer and replication
17 DC2 (April-July 2004)
- At this stage the goal includes
- Full use of Geant4, POOL and the LCG applications
- Pile-up and digitization in Athena
- Deployment of the complete Event Data Model and the Detector Description
- Simulation of full ATLAS and the 2004 combined test beam
- Test of the calibration and alignment procedures
- Wide use of the Grid middleware and tools
- Large-scale physics analysis
- Computing model studies (document by end 2004)
- Run as much as possible of the production on LCG-1
18 Task Flow for DC2 data
- (Diagram) For each sample the chain is: Event generation (Pythia 6 -> HepMC), Detector Simulation (Geant4 in Athena -> Hits & MCTruth), Digitization & Pile-up (Athena -> Digits, also written out as Byte-stream), Reconstruction (Athena -> ESD & AOD), with Athena-POOL persistency between the stages
19 DC2: Scenario & Time scale
- Milestones
- September 03: Release 7
- Mid-November 03: pre-production release
- February 27th 04: Release 8 (production)
- April 1st 04
- June 1st 04: DC2
- July 15th 04
- Associated activities (as laid out on the slide timeline)
- Put in place, understand & validate: Geant4, POOL, LCG applications, Event Data Model, digitization, pile-up & byte-stream
- Conversion of DC1 data to POOL; large-scale persistency tests and reconstruction
- Testing and validation
- Run test-production
- Start final validation
- Start simulation, pile-up & digitization
- Event mixing
- Transfer data to CERN
- Intensive reconstruction on Tier-0
- Distribution of ESD & AOD
- Calibration & alignment
- Start physics analysis
- Reprocessing
20 DC2 and Grid tools
- Much work already done
- Magda (BNL) and AMI (Grenoble) were already used in DC1
- Other tools are in different stages of development and testing
- GANGA (ATLAS-LHCb, main effort in the UK)
- Chimera (US), exploiting Virtual Data ideas (DC1), and DIAL (BNL)
- AtCom, used to generate the jobs (batch and grid flavors)
- GRAT (US), Grappa (US)
- A coherent view of tool usage and integration is emerging, but needs more work and thinking
- LCG-1
- We intend to use LCG-1 components, and to contribute to validating them, as they become available
- The ATLAS-EDG task force has become the ATLAS-LCG testing task force
21 DC2 resources (based on Geant3 numbers)
- (Resource table not reproduced; footnote: to be kept if no zero suppression in the calorimeters)
22 Grid activities
- LCG-1 system being deployed by the LCG-GD group
- so far at several sites around the world, expanding fast
- somewhat later than expected and with somewhat reduced functionality
- a new release in November should improve functionality and robustness
- ATLAS-LCG testing team, led by Oxana Smirnova, is actively testing the system in close contact with the LCG-GD group
- we could foresee opening the service to general users in December
- the timescale in any case matches our Data Challenge plans
- Grid-3: ATLAS-CMS joint tests in the US in October-November to test the internal infrastructure
- including interoperability with the LCG-1 system
- The ARDA (Architectural Roadmap for Distributed Analysis) RTAG will submit its final report by end October
- an LCG common project will probably follow; we will participate
23 Commissioning
- Commissioning of the software
- Data Challenges of increasing complexity and scope
- tests of ATLAS offline software
- tests of common LCG software
- tests of the environment and infrastructure (LCG Grid)
- Test Beam operation
- real-time tests
- framework services for the Level-2 trigger and the Event Filter
- HLT and offline algorithms in the EF and prompt reconstruction environment
- Software for commissioning
- a few simulation/reconstruction activities on cosmic ray, beam-gas and beam-halo events have already started
- now need coordination of these efforts (similar to the test beam coordination)
24 Planning & Milestones
- We recently revised completely the planning of the Software & Computing project, including
- WBS (resource-loaded)
- Schedule with internal and external (LCG) dependencies
- Milestones (all levels)
- Integration with ATLAS Technical Coordination scheduling tools (schedule and PPT system)
- The last planning exercise was 3 years ago (with no integration with TC tools at that time)
- several things have changed since
- LHC t0 shifted to 2007
- LCG project started (resources moved from ATLAS to LCG common projects)
- Tom LeCompte (Planning Coordinator since June 2003) actively gathered and harmonized all the information
25 Planning, Schedule, PPT
- Planning now kept up-to-date very actively by Tom LeCompte (Planning Coordinator)
- Schedule being improved with more detailed task specifications
- Level-1 milestones submitted to the EB and LHCC last month
- Lower-level milestones being defined (or revised) next week
- Schedule integration with the TC master schedule in progress
- Full integration with PPT by end November
- Level-1 and Level-2 tasks and milestones to be followed by PPT
26 High-Level Milestones
- 10 Sept. 2003: Software Release 7 (POOL integration)
- 31 Dec. 2003: Geant4 validation for DC2 complete
- 27 Feb. 2004: Software Release 8 (ready for DC2/1)
- 1 April 2004: DC2 Phase 1 starts
- 1 May 2004: Ready for combined test beam
- 1 June 2004: DC2 Phase 2 starts
- 31 July 2004: DC2 ends
- 30 Nov. 2004: Computing Model paper
- 30 June 2005: Computing TDR
- 30 Nov. 2005: Computing MoU
- 30 June 2006: Physics Readiness Report
- 2 October 2006: Ready for Cosmic Ray Run
27 Computing Manpower Review
- Final report received last week
- Main points for ATLAS according to the panel
- Must strengthen the core s/w team at CERN soon
- or we will have to recruit more people and sustain a larger level of effort compared to the other experiments
- The software plan needs the backing of the Collaboration, its members and the funding agencies
- The ATLAS Collaboration needs to take action
- or they will need twice the resources compared to the other experiments, and may fail to have their software ready in time for the start-up of the LHC
- If they establish a core group at CERN, the level of effort missing from the core team is estimated to be about 8 FTEs of software engineers and computing professionals
28 Conclusions and Perspectives
- We have an organizational structure adapted to the needs of the Software & Computing Project until LHC start-up
- clearly-identified managerial roles, responsibilities and work packages
- The current planning update has helped us to identify
- critical-path developments
- missing manpower
- Data Challenges, Test Beams and Commissioning runs help us to deploy ATLAS software, assess it and use it in realistic environments of increasing complexity