Title: The D0 Monte Carlo Challenge
1 The D0 Monte Carlo Challenge
- Gregory E. Graham
- University of Maryland
- (for the D0 Collaboration)
- February 8, 2000, CHEP 2000
2 D0 Monte Carlo Challenge Goals
- Testing the D0 GEANT Monte Carlo Program (D0GSTAR)
- Performance: CPU time, event size, memory
- Accuracy of detector simulation
- Testing the D0 Reconstruction Program (D0RECO)
- Performance: CPU time, event size, memory
- Efficiency and rejection power of reconstruction algorithms
3 D0 Monte Carlo Challenge Goals
- Testing the infrastructure for running D0GSTAR and D0RECO
- remote site Monte Carlo generation
- storage testing (Sequential Access Method - SAM) (C241 - these proceedings)
- farm processing (E60 - these proceedings)
- Testing the integration of these systems with large numbers of events
- Stress testing
4 D0 Monte Carlo Challenge Goals
- Evaluating the Physics Potential of the D0 Detector in Fermilab Tevatron Run II
- detailed studies of different physics processes
- trigger studies
- algorithm performance
- background rejection / signal efficiency
- discovery potential
5 The D0GSTAR Program
- Based on GEANT 3.21 (Fortran) with C wrapper
- Linux and IRIX Platforms
- Input is ISAJET, PYTHIA (with QQ libraries and TAUOLA)
- Output is Hits and Digi information
- Typical event size 1.0 - 1.5 MB
- Typical detector simulation time 5 - 7 minutes per event
- Timing shown for an SGI R12000 processor at 300 MHz
6 D0 Monte Carlo Challenge Objectives
- Phase I: Initial Stage (Summer 1999)
- Generate 100,000 events
- Test and develop D0GSTAR, D0RECO
- Phase II: Intermediate Stage (Winter 2000)
- Generate 300,000 - 500,000 events
- Use remote sites for MC generation
- Further testing and development of D0GSTAR, D0RECO
- Integration of Systems
- Initial Physics and Trigger studies
7 D0 Monte Carlo Challenge Objectives
- Phase III: Final Stage (Fall 2000)
- Further develop remote MC generation sites
- Further testing of software systems integration
- Trigger studies
- Physics Studies
- Double Blind tests
8 D0 Monte Carlo Challenge Phase II
- How do we efficiently generate 500,000 D0GSTAR MC events?
- This corresponds to about 55,000 CPU hours (on a 300 MHz CPU)
- How do we store 500,000 MC events?
- This corresponds to about 0.6 TB of data
- How do we further improve our capacity for MC generation while ensuring homogeneity of generated samples at remote sites?
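The totals quoted on this slide can be turned into per-event figures with a little arithmetic. The sketch below is only a back-of-envelope check; it assumes decimal units (1 TB = 1,000,000 MB), and the variable names are illustrative.

```python
# Back-of-envelope check of the Phase II totals quoted on the slide.
N_EVENTS = 500_000     # events to generate (from the slide)
CPU_HOURS = 55_000     # total CPU time on a 300 MHz CPU (from the slide)
STORAGE_TB = 0.6       # total data volume (from the slide)

# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000

minutes_per_event = CPU_HOURS * 60 / N_EVENTS
mb_per_event = STORAGE_TB * MB_PER_TB / N_EVENTS

print(f"CPU time per event: {minutes_per_event:.1f} minutes")
print(f"Size per event:     {mb_per_event:.1f} MB")
```

Both values fall inside the typical ranges quoted for D0GSTAR earlier in the talk (5 - 7 minutes and 1.0 - 1.5 MB per event).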
9 MCC Event Generation and Storage
- Use many remote computing facilities to generate Monte Carlo
- MC generation must be homogenized
- Tools to collect statistics on MC generation
- Use SAM / ENSTORE system at FNAL to store event files
- Currently configured to use a tape robot using Mammoth-I tapes (18 GB/tape)
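As a rough illustration of what the tape capacity means in practice, the sketch below estimates how many Mammoth-I tapes the 0.6 TB Phase II sample would occupy. It assumes decimal units (1 TB = 1000 GB), completely filled tapes, and no file-system or robot overhead, none of which is stated in the talk.

```python
import math

STORAGE_GB = 600        # 0.6 TB Phase II estimate, assuming 1 TB = 1000 GB
TAPE_CAPACITY_GB = 18   # Mammoth-I capacity quoted on the slide

# Round up: a partially filled tape still counts as a tape.
tapes_needed = math.ceil(STORAGE_GB / TAPE_CAPACITY_GB)
print(f"Tapes needed: {tapes_needed}")
```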
10 MCC Remote Processing Sites
11 MCC Remote Processing Sites
- FNAL (Batavia, IL)
- 17 processors on 48 processor SGI R12000
- 300 MHz, memory 250 MB/processor
- Lyon (IN2P3, France)
- 15 dual Pentium II/III PC Farm
- 450/500 MHz, memory 250 MB/processor
- Amsterdam (NIKHEF, Netherlands)
- 6/128 processors on 128 processor SGI R10000 (now 32/128 - Thanks, Kors!)
- 250 MHz, memory 450 MB/processor
12 MCC Remote Processing Sites
- Prague (Czech Rep. Acad. Sciences)
- 3 Pentium III based PCs
- 450/500 MHz, memory 128/256 MB/processor
- University of Texas at Arlington
- 7 dual Pentium III PCs
- 500 MHz, memory 250 MB/processor
- Further Sites in the Planning Stage
- Lancaster (UK), Nijmegen, and others
- Hardware upgrades are also being planned
13 MC Homogeneity
- Event Generation is by isajet or pythia
- these are D0 standard tools (cvs package)
- supplemented by QQ and tauola
- D0GSTAR is also standard
- D0GSTAR based on 3 cvs packages
- Linux and Irix platforms
- Python based tools for MC generation
- Standardized input files
- Random number seeds updated automatically
- Built in scripting tool for multi-job generation
- No D0 environment necessary
- Distributed as executables, libraries, input files, and tools
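The idea of scripted multi-job generation with automatically updated random number seeds can be sketched as follows. This is not the D0 tool itself, whose code is not shown in the talk; `JobConfig`, `make_jobs`, and all parameter names are hypothetical and only illustrate how each job can be given a distinct, reproducible seed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    """Hypothetical description of one MC generation job."""
    job_id: int
    generator: str   # e.g. "pythia" or "isajet"
    n_events: int
    seed: int


def make_jobs(generator, n_jobs, events_per_job, master_seed):
    """Return one JobConfig per job, each with a distinct random seed."""
    rng = random.Random(master_seed)
    # sample() draws without replacement, so no two jobs share a seed.
    seeds = rng.sample(range(1, 2**31 - 1), n_jobs)
    return [JobConfig(i, generator, events_per_job, s)
            for i, s in enumerate(seeds)]


jobs = make_jobs("pythia", n_jobs=4, events_per_job=250, master_seed=12345)
for job in jobs:
    print(job)
```

Deriving every seed from one master seed keeps a production run reproducible while ensuring that jobs at the same or different sites do not generate identical events, provided each site uses a different master seed.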
14 Collecting the MC Data
- After generation, MC data is imported into SAM
- Directly into SAM via sam store commands (possible at FNAL site)
- Via ftp connection to SAM import site
- Via Mammoth-I tapes (plus suitcase and airplane ticket)
- Statistics for generation are now collected locally at the remote sites and compiled by hand at the central site
15 Results on Generated Events
Snapshot: January 2000
QCD dijet events in each of 7 different Et thresholds . . . 50K
ttbar events . . . . . . . . . . . . . . . . . . . . . . . . 50K
b → J/ψ (→ ee or μμ) . . . . . . . . . . . . . . . . . . . . 50K
Z → ee, μμ, ττ (each) . . . . . . . . . . . . . . . . . . . 10K
Z → bb . . . . . . . . . . . . . . . . . . . . . . . . . . . 5K
W → eν, μν, τν (each) . . . . . . . . . . . . . . . . . . . 1K
γ + jets . . . . . . . . . . . . . . . . . . . . . . . . . . 1K
Υ → ee . . . . . . . . . . . . . . . . . . . . . . . . . . . 1K
Other Signal Events . . . . . . . . . . . . . . . . . . . . 10K
Total Generated . . . . . . . . . . . . . . . . . . . . . . 500K
And we continue to generate events!
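The snapshot can be cross-checked by summing the individual samples. The sketch below assumes that each count marked "each" (the 7 QCD dijet Et thresholds, the three Z leptonic channels, and the three W channels) applies once per threshold or channel; the sample labels are an interpretation of the symbols on the slide.

```python
# (sample label, events per sample in thousands, number of samples)
SAMPLES = [
    ("QCD dijet, 7 Et thresholds", 50, 7),
    ("t tbar", 50, 1),
    ("b -> J/psi (-> ee or mumu)", 50, 1),
    ("Z -> ee, mumu, tautau", 10, 3),
    ("Z -> bb", 5, 1),
    ("W -> e nu, mu nu, tau nu", 1, 3),
    ("gamma + jets", 1, 1),
    ("Upsilon -> ee", 1, 1),
    ("Other signal events", 10, 1),
]

# Multiply the per-sample count by the number of samples, then add up.
total_k = sum(k * n for _, k, n in SAMPLES)
print(f"Total: {total_k}K events")
```

Under that reading, the per-sample counts add up to the quoted total of 500K events.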
16 Results on Generated Events
Snapshot: January 2000
MCC production by site
FNAL . . . . . . . . . . . . . 240K
Lyon . . . . . . . . . . . . . 210K
NIKHEF . . . . . . . . . . . . 30K
Prague . . . . . . . . . . . . 20K
UTA . . . . . . . . . . . . . (just starting)
Total . . . . . . . . . . . . 500K
17 MCC Event Generation Capacity
- Generation Capacity
- Phase II has achieved a capacity of about 5,000 events per day across all existing remote sites
- Can expand to at least 25,000 events per day with further hardware upgrades and new centers
- Storage Capacity
- Network bandwidth less than desirable (ftp), but not a bottleneck
- 0.6 TB stored so far in SAM, also not a bottleneck
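To put the two capacity figures in perspective, the sketch below estimates how long a 500,000-event sample takes at each rate and how much data arrives per day. The 1.2 MB per event is an assumption derived from 0.6 TB for 500,000 events in decimal units; the function names are illustrative.

```python
N_EVENTS = 500_000     # Phase II sample size (from the talk)
MB_PER_EVENT = 1.2     # assumption: 0.6 TB / 500,000 events, decimal units


def days_to_generate(n_events, events_per_day):
    """Days needed to produce n_events at a constant daily rate."""
    return n_events / events_per_day


def daily_volume_gb(events_per_day, mb_per_event=MB_PER_EVENT):
    """Data volume produced per day, in GB (1 GB = 1000 MB)."""
    return events_per_day * mb_per_event / 1000


for label, rate in [("achieved in Phase II", 5_000),
                    ("projected after upgrades", 25_000)]:
    print(f"{label}: {days_to_generate(N_EVENTS, rate):.0f} days, "
          f"{daily_volume_gb(rate):.0f} GB/day")
```

At the Phase II rate a sample of this size takes roughly 100 days; at the projected rate it takes about 20 days, with a correspondingly larger daily load on the network and on storage.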
18 MCC - Debugging D0
- Integration issues are being addressed
- SAM was exercised extensively. Problems were uncovered and debugged in the software.
- Debugging D0RECO
- Six pass minor releases in current production release
- Development of tools
- micro-DST, Analyze tools
- Problems were uncovered and fixed in the MC generation tools at remote sites
- Beginning to look at D0TRIGSIM, uDSTs, etc.
19 What Did We Learn
- Every remote site is different
- porting and verifying released code was easy
- porting the MC generation tools was hard
- Python/Tk
- UNIX shells
- Customization
- Compiling statistics by hand is hard
- we need a tool to do this
20 Future Plans for MCC Phase III
- Develop tools to better control remote production
- Presently, remote production is controlled locally
- Software is being developed to control MC generation over the network from a central server
- Develop tools to collect MC generation statistics
- Software is being developed to automate collection of MC generation statistics
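A statistics collector of the kind planned here could, in its simplest form, merge per-job reports from the remote sites into per-site totals. The sketch below is purely illustrative: the report format, the field names, and `compile_statistics` are hypothetical, and the records are made-up placeholders rather than real MCC numbers.

```python
from collections import defaultdict


def compile_statistics(reports):
    """Sum events and CPU hours per site from a list of job reports."""
    totals = defaultdict(lambda: {"events": 0, "cpu_hours": 0.0})
    for report in reports:
        site_totals = totals[report["site"]]
        site_totals["events"] += report["events"]
        site_totals["cpu_hours"] += report["cpu_hours"]
    return dict(totals)


# Made-up example records, for illustration only (not real MCC numbers).
reports = [
    {"site": "site_a", "events": 250, "cpu_hours": 27.5},
    {"site": "site_b", "events": 500, "cpu_hours": 55.0},
    {"site": "site_a", "events": 250, "cpu_hours": 27.5},
]

summary = compile_statistics(reports)
for site, stats in sorted(summary.items()):
    print(f"{site}: {stats['events']} events, {stats['cpu_hours']:.1f} CPU hours")
```

If every site writes its reports in one agreed format, the central site can run a merge like this automatically instead of compiling the numbers by hand.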
21 Conclusions
- Integration of D0 MC production, storage, and access was extensively tested (and debugged!)
- 500K Events generated for D0 Experiment
- 55,000 distributed CPU hours logged since 10/1/99 at 5 remote sites
- 0.6 TB of data successfully stored on SAM
- Certification of the D0RECO program is currently underway using the MCC samples
- Other projects (e.g., trigger simulation) are ramping up