Spring 2002 CMS Monte Carlo Production: What? How? What Next?

1
Spring 2002 CMS Monte Carlo Production: What? How? What Next?
  • Véronique Lefébure (CERN-HIP), CERN-IT Seminar, 25 September 2002

2
  • Content

What? Physics applications, production steps, data products
Resource constraints: CPU, RAM, persistency
How much data: number of events, TB of data, delivery deadline
How? World-wide distributed production: where, who, coordination
Production tools: RefDB, IMPALA, DAR, BOSS; data transfer, data storage, data validation
Success and difficulties
What next? Possible improvements, coming major production, 2004 Data Challenge
3
Introduction: CMS
  • On-line System
  • Multi-level trigger
  • Filter out background
  • Reduce data volume

40 MHz (1000 TB/sec) → Level 1 Trigger → 75 kHz (50 GB/sec) → Level 2 Trigger → 5 kHz (5 GB/sec) → Level 3 Trigger → 100 Hz (100 MB/sec) → Data recording and offline analysis
4
Data Simulation Needs
  • Spring 2002 Production for the CMS Physics Community
  • need a large amount of simulated data to prepare the CMS DAQ TDR (Data Acquisition Technical Design Report), due for the end of 2002
  • need the most up-to-date physics software to be used
  • need the data before the June 2002 CMS week

5
Monte Carlo Production Steps
  • The full Production Chain consists of 4 steps
  • 3 Logical Monte Carlo Simulation Steps
  • Generation
  • Simulation
  • Digitisation
  • 1 Reconstruction and Analysis Step
  • Production was performed step by step for many different p-p physics channels

RAW data, as produced by the real detector, is stored in Objectivity/DB
6
Monte Carlo Production Steps 1) Generation
Primary interactions in the vacuum of the beam-pipe
  • Generation of one p-p interaction at a time
  • for a selected physics channel
  • In reality, 4 or 20 interactions per beam-crossing, depending on the beam luminosity
  • (2×10^33 or 10^34 cm^-2 s^-1), i.e. interactions are superimposed: pile-up

7
Monte Carlo Production Steps 2) Simulation
Secondary interactions in detector material and magnetic field
  • Individual hits
  • Crossing points
  • Energy deposition
  • Time of flight
  • In reality, one beam-crossing every 25 ns << time of flight and electric-signal development
  • i.e. superimposition of signals from particles from different beam-crossings: pile-up

8
Monte Carlo Production Steps 3) Digitisation
Response of the sensitive detector elements, taking into account the two sources of pile-up
  • 4 or 20 interactions per beam-crossing
  • Beam-crossings −5 to +3
  • For 1 signal p-p event of 1 MB
  • we have 70 MB of pile-up events @ 10^34 cm^-2 s^-1
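A rough cross-check of the 70 MB figure, assuming (this size is not quoted on the slide) about 0.4 MB per minimum-bias pile-up event:

    # Back-of-the-envelope pile-up volume per signal event at 10^34 cm^-2 s^-1
    interactions_per_crossing = 20     # from the slide, high-luminosity case
    crossings = 9                      # beam-crossings -5 to +3, from the slide
    mb_per_pileup_event = 0.4          # MB, assumed typical minimum-bias event size
    pileup_events = interactions_per_crossing * crossings   # ~180 events
    print(pileup_events * mb_per_pileup_event)               # ~72 MB, close to the quoted 70 MB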

9
Monte Carlo Production Steps 4) Reconstruction and Analysis
Higher-level physics reconstruction and histogramming
  • Level-1 trigger filtering
  • Track, cluster and vertex reconstruction
  • First-pass physics analysis
  • Histogramming

10
Physics Applications
Application (per step) | Input | Output (for jobs of 500 events)
  • Generation — CMKIN/PYTHIA (ISAJET, COMPHEP): Fortran77, very fast, 5 sec/event.
    Input: ascii file. Output: PAW ntuple (size 30 MB).
  • Simulation — CMSIM/GEANT3: Fortran77, very slow, 1 to >10 min/event.
    Input: ascii file, geometry and magnetic-field ZEBRA file (size 14 MB).
    Output: PAW ntuple, ZEBRA file (size 0.5 GB).
  • ooHit formatting — ORCA-COBRA: C++ object-oriented, very fast (I/O bound), executable size <200 MB, multi-threaded.
    Input: ascii file, geometry and magnetic-field ZEBRA file, ZEBRA file.
    Output: Objectivity/DB data and metadata, ooHit files (size 0.5 GB).
  • Digitisation — ORCA-COBRA: C++ object-oriented, 10^34 PU (200 PU events), 1 min/event, executable size <200 MB, multi-threaded.
    Input: ascii file, Objectivity/DB ooHit files (data and metadata) for signal and pile-up events.
    Output: Objectivity/DB data and metadata, Digi files (size 2 GB).
  • Reconstruction and Analysis — ORCA-COBRA: C++ object-oriented, executable size <200 MB, multi-threaded.
    Input: ascii file, Objectivity/DB ooHit and Digi files (data and metadata) for signal and pile-up events.
    Output: Objectivity/DB data and metadata files, or PAW ntuple, or ROOT files.
11
More Production Steps
  • Filtering (Level-1 trigger, ...)
  • Add digits (e.g. first calorimeter digits, then Tracker digits after filtering)
  • Cloning of ooHits and/or Digis (smaller
    collection of data to handle, less staging at
    analysis time)
  • Re-digitisation with different algorithms or
    parameters

12
Resource Constraints
  • Long CMSIM jobs can take 2 days or more
  • RAM > 512 MB for dual processors (ORCA)
  • RedHat 6.1(.1) for the Objectivity/DB license
  • Data server
  • 80 GB of Pile-Up events (re-used, otherwise 300 TB!)
  • Typically 1 server per 12 CPUs
  • Disk space: one typical dataset @ 10^34 is 50K events × (1 MB fz + 1 MB ooHits + 4 MB digis)/event ≈ 300 GB
  • Lockserver, AMS server: number of file handles may reach 3000
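The 300 GB figure follows directly from the per-event sizes quoted above; a minimal check:

    # Size of one typical dataset at 10^34 (numbers from the slide)
    events = 50_000
    mb_per_event = 1 + 1 + 4                  # fz + ooHits + digis, in MB
    print(events * mb_per_event / 1000.0)     # 300.0 GB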

13
Job Complexity
  • Generation and Simulation jobs: the easy part
  • ORCA-COBRA jobs: trickier
  • Closely-coupled jobs
  • Shared federation/lockserver, output server, AMS
  • 5 jobs write in parallel to 1 DB
  • 1 job may populate many DBs (~10)
  • One stale lock can bring everything to a halt
  • Massive I/O system @ 10^34
  • 100 jobs in parallel
  • Input: 70 MB of pile-up events per 1 MB signal event, at 1 event/minute ≈ 1 MB/sec/job
  • Output: ≈ 4 MB/minute/job
  • Physics software not yet fully robust: need to recover from crashes and to spot infinite loops
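A quick estimate of the aggregate pile-up input bandwidth implied by these numbers (the 100 parallel jobs and the per-event figures are from the slide; the aggregate is derived):

    # Aggregate pile-up input bandwidth at 10^34
    jobs = 100
    pileup_mb_per_event = 70       # MB of pile-up read per 1 MB signal event
    events_per_minute = 1
    per_job = pileup_mb_per_event * events_per_minute / 60.0   # ~1.2 MB/s per job
    print(per_job, per_job * jobs)                              # ~1.2 MB/s, ~120 MB/s on the LAN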

14
How Much Data ?
  • Generation/Simulation
  • 4 months
  • 6 M events, 150 physics channels
  • ORCA production
  • 2 months
  • 19 000 files, 500 collections, 20 TB
  • No PU: 2.5 M; 2×10^33 PU: 4.4 M; 10^34 PU: 3.8 M; filtered: 2.9 M
  • 300 TB of pile-up movement on the LAN
  • 100 000 jobs, 45 years of CPU (wall-clock)
  • More than 10 TB traveled over the WAN
  • Production completed just in time

Successful Production at a regular global rate !
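For scale, the average job length implied by the totals above (a derived number, not one quoted on the slide):

    # Average job length from the quoted totals
    jobs = 100_000
    cpu_years = 45                          # wall-clock CPU time
    print(cpu_years * 365 * 24 / jobs)      # ~3.9 hours per job on average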
15
CMSIM
6 million events
1.2 seconds per event for 4 months
Feb. 8th
June 6th
16
2×10^33 PU
4 million events
1.2 seconds per event, 2 months
April 12th
June 6th
17
10^34 PU
3.5 million events
1.4 seconds per event, 2 months
April 10th
June 6th
18
Physics Results
Data is used for physics studies, not only for
computing performance studies
19
How ?
  • Production
  • Distribution
  • Coordination
  • Production Tool Suite
  • Success and Difficulties

20
World-wide Distributed Production
CMS Production Regional Centre
21
World-wide Distributed Production
  • 11 Regional Centres (RC), > 20 sites in USA, Europe and Russia, ~1000 CPUs: Bristol/RAL (UK), Caltech, CERN, Fermilab, Imperial College (UK), IN2P3-Lyon, INFN (Bari, Catania, Bologna, Firenze, Legnaro, Padova, Perugia, Pisa, Roma, Torino), Moscow (ITEP, JINR, SINP MSU, IHEP), UCSD (San Diego), UFL (Florida), Wisconsin. Note: still more sites joining (RICE, Korea, Karlsruhe, Pakistan, Spain, Greece, ...)
  • > 30 Production Operators: Maria Damato, Alessandra Fanfani, Daniele Bonacorsi, Catherine MacKay, Dave Newbold, Suresh Singh, Vladimir Litvine, Salvatore Costa, Julia Andreeva, Tony Wildish, Veronique Lefebure, Greg Graham, Shafqat Aziz, Nicolo Magini, Olga Kodolova, David Colling, Philip Lewis, Claude Charlot, Philippe Mine, Giovanni Organtini, Nicola Amapane, Victor Kolosov, Elena Tikhonenko, Massimo Biasotto, Stefano Lacaprara, Alexander Kryukov, Nikolai Kruglov, Leonello Servoli, Livio Fano, Simone Gennai, Ian Fisk, Dimitri Bourilkov, Jorge Rodriguez, Pamela Chimney, Shridara Dasu, Iyer Radhakrishna, Wesley Smith, plus probably many more people behind the scenes!
  • > 20 physicists as production requestors

22
Coordination Issues
  • Physicists side
  • Handle four Physics groups
  • Check uniqueness of requests
  • Check number of requested events is reasonable
  • Take care of request priorities
  • Producers side
  • Deploy and support production tools
  • Distribute physics executables
  • Distribute adequately requests to RCs
  • Ensure uniqueness of produced data
  • Track progress of data production and transfer

23
Coordination Means
  • Physicists side
  • 1 Coordinator per Physics group
  • 1 Coordinator for the 4 Physics groups
  • Meetings
  • Use of MySQL CMS DB for recording and managing
    the production requests (RefDB)
  • Producers side
  • 1 Production Manager
  • 1 Production Coordinator in contact with the
    Physics Coordinators
  • 1 or 2 Contact Persons per Regional Centre
  • Meetings and mailing list
  • Use of MySQL CMS DB for assigning production
    requests to Regional Centres and progress
    tracking (RefDB)
  • Pre-allocation of run numbers, random seeds,
    DBIDs
  • Automatic file naming provided by RefDB

24
RefDB: Central Reference Database
  • Production Requests
  • Submission forms for each production step
  • List of recorded Requests
  • Modification/Correction of submitted Requests
  • Production Assignments
  • Selection of a set of Requests for Assignment to
    an RC
  • Re-assignment of a Request to another RC or
    production site
  • List and Status of Assignments

25
RefDB: Central Reference Database
  • Metadata catalogue
  • Browse Datasets according to
  • Physics Channel
  • Software Version
  • Get Production Status
  • Get Data Location
  • Get Input Parameters
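Since RefDB is a MySQL database, the dataset look-ups listed above can be pictured as simple SQL queries. The sketch below is purely illustrative: the table and column names (datasets, channel, sw_version, status, location) are hypothetical, not the real RefDB schema.

    # Illustrative only: browsing a RefDB-like MySQL metadata catalogue
    import MySQLdb  # assumes the MySQL client module is available

    conn = MySQLdb.connect(host="refdb.example.cern.ch", db="refdb",
                           user="reader", passwd="...")
    cur = conn.cursor()
    cur.execute("SELECT name, status, location FROM datasets "
                "WHERE channel = %s AND sw_version = %s",
                ("jets_pt50", "ORCA_6"))
    for name, status, location in cur.fetchall():
        print(name, status, location)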

26
How ?
  • Production
  • Distribution
  • Coordination
  • Production Tool Suite
  • Success and Difficulties

27
Production Tools Spring02 Components
  • IMPALA: job-scripts generator
  • RefDB: central input-parameters DB and central output-metadata DB
  • BOSS: local job-monitoring DB, monitoring schema and scripts
  • Job scheduler
28
DAR: Distribution After Release
  • CMS software distribution tool
  • allows creating and installing the binaries
  • Distribution tar files published at FNAL and at CERN
  • Local installation: dar -i Distribution_Tar_File Installation_Directory
  • Used for distribution of ALL physics executables and the geometry file

29
BOSS: Batch Object Submission System
  • tool for job monitoring and book-keeping
    developed by CMS
  • not a job scheduler, but can be interfaced with
    any scheduler
  • LSF (CERN, INFN)
  • PBS (Bristol, Caltech, UFL, Imperial College,
    INFN)
  • FBSNG (Fermilab)
  • Condor (INFN, Wisconsin)
  • Uses a database (MySQL)

30
BOSS
  • User registers a scheduler
  • Scripts for job submission, deletion and query
    (DB blobs)
  • User registers a job type
  • Schema for the information to be monitored (new
    DB table)
  • Algorithms to retrieve the information from the
    job (DB blobs)
  • User submits jobs of a defined type
  • A new entry is created for the job in the BOSS
    database tables
  • The running job fetches the user monitoring
    programs and updates the BOSS database
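To make the last step concrete, here is a minimal sketch of what a runtimeprocess-style monitoring filter could look like: the running job pipes its output through a small script that extracts the values declared in the job-type schema and writes them to the BOSS MySQL database. All specifics (the log-line pattern, the boss_job table, the BOSS_JOBID variable) are hypothetical, chosen only to illustrate the mechanism.

    # Illustrative BOSS-style runtime monitoring filter (hypothetical names)
    import os, re, sys
    import MySQLdb

    jobid = os.environ.get("BOSS_JOBID", "0")           # assumed to be set for the job
    pattern = re.compile(r"processed event\s+(\d+)")    # hypothetical log-line format

    conn = MySQLdb.connect(host="bossdb.example", db="boss", user="boss", passwd="...")
    cur = conn.cursor()
    for line in sys.stdin:
        sys.stdout.write(line)                          # pass the job output through unchanged
        m = pattern.search(line)
        if m:
            cur.execute("UPDATE boss_job SET events_done = %s WHERE jobid = %s",
                        (int(m.group(1)), jobid))
            conn.commit()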

31
BOSS
32
BOSS for Spring02 Production
BOSS job-type registration components (Job Type table):

  Job type   Production step    Registered components
  KIN        Generation         cmkin.schema, preprocess, runtimeprocess, postprocess
  SIM        Simulation         cmsim.schema, preprocess, runtimeprocess, postprocess
  OOHit      ooHit formatting   oohit.schema, preprocess, runtimeprocess, postprocess
  OODigi     Digitisation       oodigi.schema, preprocess, runtimeprocess, postprocess
33
From BOSS to RefDB: Summary scripts
  • Updating RefDB with current status of assignment
    progress
  • Book-keeping of the monitored values
  • Checking of uniqueness of generation and
    simulation run numbers and random seeds
  • Warning for duplicate runs
  • Warning for missing or incomplete runs
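A minimal sketch of the kind of uniqueness check such a summary script performs, assuming the monitored values have already been extracted from BOSS as (run number, random seed) pairs; the function and data layout are only illustrative:

    # Illustrative duplicate-run / duplicate-seed check
    from collections import Counter

    def report_duplicates(runs):
        # runs: iterable of (run_number, random_seed) pairs taken from the BOSS DB
        for label, values in (("run number", [r for r, _ in runs]),
                              ("random seed", [s for _, s in runs])):
            for value, count in Counter(values).items():
                if count > 1:
                    print("WARNING: duplicate %s %s seen %d times" % (label, value, count))

    report_duplicates([(1001, 42), (1002, 43), (1002, 44)])   # warns about run number 1002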

34
Data Validation Scripts
  • After storage of the data: final validation at the metadata level
  • Basically, checks that the warnings given by the summary scripts have been corrected
  • Correct number of events
  • No duplicates
  • Closure of DB files (in the COBRA sense: no more data will be written to that DB file)
  • All DB files of a Collection are attached to the
    Federation

35
  • IMPALA: job-scripts generator
  • RefDB: central input-parameters DB and central output-metadata DB
  • BOSS: local job-monitoring DB, monitoring schema and scripts
  • Job scheduler
36
IMPALA: Intelligent Monte Carlo Production Local Actuator
  • Automated script-generation tool developed by CMS for MC Production
  • Job splitting: 50 000 events → 100 jobs of 500 events
  • Interfaces defined for
  • Parameter Handling
  • Input source discovery and enumeration
  • Tracking (declared, created, submitted,
    running, done, problems, logs)
  • Job Submission
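A minimal sketch of the job-splitting idea described above (the event counts match the slide; run numbers and random seeds are pre-allocated, as RefDB provides, but the function and naming scheme here are hypothetical):

    # Illustrative job splitting: 50 000 events -> 100 jobs of 500 events
    def split_assignment(dataset, total_events=50_000, events_per_job=500,
                         first_run=1, first_seed=12345):
        jobs = []
        for i in range(total_events // events_per_job):
            jobs.append({
                "script":      "%s_run%04d.sh" % (dataset, first_run + i),
                "run_number":  first_run + i,        # pre-allocated, unique per job
                "random_seed": first_seed + i,       # pre-allocated, unique per job
                "first_event": i * events_per_job,
                "n_events":    events_per_job,
            })
        return jobs

    print(len(split_assignment("jets_pt50")))        # 100 job descriptions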

37
IMPALA
IMPALA Tracking/Production files
IMPALA Tracking/Batch files
38
IMPALA Configuration
  • Executable location (DAR file)
  • Output data location (boot file for the Objectivity/DB federation, output disk, ...)
  • BOSS (or scheduler) installation location
  • Local functions (CopyLogFiles, StageIn, StageOut, ...)
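The configuration items above can be pictured as a small site-local settings block; this is purely illustrative (keys and paths are hypothetical, not the real IMPALA configuration format):

    # Purely illustrative site configuration for an IMPALA-like setup
    site_config = {
        "dar_file":        "/data/dar/ORCA_digi.dar",        # executable location (DAR file)
        "objy_boot_file":  "/objy/federation/prod.boot",     # Objectivity/DB federation boot file
        "output_disk":     "/data/prod/output",
        "boss_dir":        "/opt/boss",                      # BOSS (or scheduler) installation
        "local_functions": ["CopyLogFiles", "StageIn", "StageOut"],
    }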

39
Data Transfer (Tony's scripts)
  • Transfer tool developed by CMS: Tony's scripts
  • For CERN/Europe
  • Many US sites use GDMP (Grid) and globus-url-copy
  • Simple HTTP server publishes a list of files
  • Files on disk (find) or on tape (flat list)
  • Client searches the list for new files
  • Compares to the list of files already retrieved, selects by pattern-matching (to select datasets)
  • Client asks the server to push n files
  • Server pushes the files in m parallel streams
  • using a designated copy agent: scp, bbcp, rfcp
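A minimal sketch of the client side of this pull model, assuming the server publishes a plain-text file list over HTTP; the URL, file names and request step are hypothetical, shown only to illustrate the "compare, pattern-match, ask for a push" logic:

    # Illustrative transfer client for a publish-and-pull file server
    import re
    from urllib.request import urlopen

    def new_files(server_url, done_list_path, dataset_pattern):
        published = urlopen(server_url + "/filelist.txt").read().decode().split()
        already = set(open(done_list_path).read().split())
        wanted = re.compile(dataset_pattern)
        return [f for f in published if f not in already and wanted.search(f)]

    files = new_files("http://exporter.example.cern.ch", "retrieved.txt", r"jets_pt50")
    print("asking the server to push %d files" % len(files))
    # The real scripts then have the server push n files in m parallel streams (scp, bbcp or rfcp).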

40
Spring02 Transferred Data
  • To CERN: 3 or 4 exporters in parallel, 7 TB in total
  • To FNAL: 5 TB
  • Sustained rate network-to-disk higher than
    sustained rate disk-to-tape

From                                                                  To     Rate
Bristol, RAL, IC, IN2P3, INFN, Caltech, FNAL, UFL, Wisconsin, Moscow  CERN   200 GB/day (150 GB disk limit); slow
Caltech, UFL, Wisconsin, UCSD, Moscow                                 FNAL   1 TB/day
Bristol                                                               RAL    1 TB/day
INFN                                                                  INFN   300 GB/day
41
Data Storage
  • CASTOR (CERN)
  • ENSTORE (Fermilab)
  • Basic tape system (RAL)

42
Success and Difficulties
  • Coordination
  • Farm Setup
  • Running Jobs
  • Data Transfer
  • Data Storage and Publication

43
Success and Difficulties: Coordination
  • Use of a Central Reference DB: RefDB
  • Uniform format of input parameter files (NEW)
  • Storage and indexing of parameter files (NEW)
  • Automatic retrieval of the parameters by IMPALA (NEW)
  • Tracking of the global CMS production rate (NEW)
  • Test-assignments for validation of software installation (NEW)
  • Where GRID tools can help us:
  • Assignment of requests to RCs is still done by hand
  • Need for a CMS-wide Resource Monitoring System
  • Update of RefDB has to be done by hand
  • Should be automated and incorporated into the Job Monitoring System

44
Success and Difficulties: Farm Setup
  • We have a Production Tool Suite (NEW)
  • But a lot to learn the first time
  • At the system level (MySQL, disk-server configuration for Pile-Up, AMS, Lockserver, ...)
  • At the software level (test-assignments to play with)
  • Heavy support task: rapidly evolving production software, new releases, bug fixes (but excellent team spirit)
  • Different farm configurations: not possible to test the tools for all of them
  • (Different job schedulers, MSS or not, distributed or central disks, shared or dedicated CPUs, firewalls or not, data servers on CPU nodes or not, ...)
  • Where GRID tools can help us:
  • installation-in-one-command toolkit

45
Success and Difficulties: Running Jobs
  • ORCA Digitisation Job Resume System (NEW)
  • Highly helpful (~10% of jobs fail; they can now be easily resumed)
  • Still need more robustness in the user-analysis part of ORCA
  • Invalidation of bad runs to be automated
  • Objectivity/DB read-only option (NEW)
  • Many fewer locking problems than before
  • System problem recovery
  • Cleaning of stale Objectivity/DB locks
  • 2 GB file-size limit to be controlled on Solaris disks (CERN)
  • Network failures (no more disk failures)
  • Disk space
  • Scaling problems in the way we use BOSS
  • Where GRID tools can help us:
  • Farm Monitoring System, with discovery of crash reasons and actions for recovery

46
Success and Difficulties: Data Transfer
  • We have transfer tools: Tony's scripts and GDMP
  • Much more data movement than before: over half the data has traveled over the WAN
  • still problems to be handled by hand
  • Transfers interrupted (time limit)
  • Data corruption
  • Disk-space limitations
  • Missing files: datasets spread over up to 500 files for one collection (typically 100 files), but we must have every file before analysis can start safely
  • Where GRID tools can help us:
  • Replica Manager

47
Success and Difficulties: Storage and Publication
  • Validation scripts for dataset integrity checks (NEW)
  • Should be part of the data transfer tool
  • Tape failures (RAL)
  • Archive failures in Castor: rare but difficult to spot
  • Stage-in time from Castor can be very long for a few files (>1 hour)
  • Interaction between Castor and (multiple) analyses not well understood → needs studying

48
Success and Difficulties: Summary
  • Major improvements in the physics code and in the
    production machinery with respect to previous
    years
  • ORCA Resume System
  • Use of RefDB and BOSS made better automation
    and book-keeping possible
  • Our CMS production tools can be improved: more automation
  • GRID tools may help to make it even better
  • Tool for Installation/Configuration of Production
    Tools
  • Resource Monitoring System
  • Replica Manager
  • Anything that can help reduce the manpower needs
  • Data access for user analysis has to be improved
  • Problems have been addressed by the Production
    team and the Production Tools Review team

49
More and Faster
  • 1999:        1 TB,  1 month,    1 person
  • 2000-2001:  27 TB, 12 months,  30 persons
  • 2002:       20 TB,  2 months,  30 persons
  • 2003:      175 TB,  6 months, <30 persons

50
Coming Data Challenge
  • 2004 Data Challenge (DC04)
  • Analysis of data produced at 25% of LHC start-up luminosity (2×10^33 cm^-2 s^-1), at a data-taking rate of 25 Hz during 1 month ≈ 5×10^7 events
  • = 5% of LHC final luminosity (10^34 cm^-2 s^-1)
  • To validate the software baseline
  • new LCG persistency framework (POOL, ROOT)
  • new simulation software (OSCAR/Geant4)
  • new GRID tools and resources
  • 2003: pre-challenge production of the 5×10^7 events @ 2×10^33 cm^-2 s^-1
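An order-of-magnitude check of the event count (assuming roughly one calendar month of continuous running; the duty factor is not stated on the slide):

    # DC04 event count at 25 Hz over about one month
    rate_hz = 25
    seconds = 30 * 24 * 3600
    print(rate_hz * seconds)   # ~6.5e7, consistent with the quoted ~5e7 once downtime is allowed for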

51
Two Phases
  • Pre-Challenge (2003, Q3-Q4) (must be successful)
  • Large-scale simulation and digitisation
  • Will prepare the samples for the challenge
  • Will prepare the samples for the Physics TDR
  • Progressive shakedown of tools and centres
  • All centres taking part in the challenge should participate in the pre-challenge
  • The Physics TDR and the Challenge depend on its successful completion
  • Challenge (2004, Q1-Q2) (may fail, i.e. not be completed on schedule)
  • Reconstruction at T0 (CERN)
  • Distribution to T1s
  • Subsequent distribution to T2s

52
Pre-challenge Resource Needs
  • Simulation: 100 TB, 5 months, 1000 CPUs
  • Digitisation: 75 TB, 2 months, 150 CPUs
  • An 800 MHz PIII is 33 SI95
  • Working assumption that most farms will be at 50 SI95/CPU in late 2003
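Expressed in the units used above (a quick conversion from the two numbers quoted on the slide):

    # Assumed 2003 CPU power in 800 MHz PIII equivalents
    si95_per_cpu_2003 = 50      # working assumption for late-2003 farm CPUs
    si95_p3_800 = 33            # one 800 MHz PIII
    print(si95_per_cpu_2003 / si95_p3_800)   # ~1.5x an 800 MHz PIII per CPU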

Challenge Resource Needs
  • Reconstruction: 25 TB, 1 month, 460 CPUs
  • at CERN, @ 50 SI95/CPU
  • World-wide distributed analysis

53
Summary and Conclusions
  • Very successful MC Production
  • 20 TB of data delivered on time to the Physicists
  • Smooth production over 4 months
  • 20 production sites, 30 persons
  • More automation for next Data Challenge
  • Improvements of our CMS tools
  • Expecting help from GRID tools

54
More Information
  • GRID/production Workshop (June 2002): http://documents.cern.ch/age?a02826
  • The Spring02 DAQ TDR Production, CMS Note CMS-IN 2002/034
  • CMS MC Production web page
  • RefDB, BOSS, IMPALA, DAR
  • http://cmsdoc.cern.ch/cms/production/www/html/general/index.html

55
Acknowledgements
  • Thanks to the CERN-IT division for the invitation
    to give this talk
  • Thanks to David Stickland and Tony Wildish for letting me present it
  • Thanks to the whole CMS Production Team for
    achieving these nice results, and to everyone on
    the CERN CASTOR, Tape, Objectivity, LSF, AFS and
    CMS support lists !