1
US-CMS Core Application Software Demonstration
  • DOE NSF Review
  • November 27, 2001

2
Introduction
  • Progress through all levels of the CMS production
    chain
  • Simulated Event Production
  • Job Specification
  • Generation and Simulation
  • Reconstruction
  • Digitization and Combination with minimum bias
  • Distributed Computing
  • Grid Scheduling
  • Use of Distributed Computing Resources
  • Automatic Return of results using grid
    applications
  • Data Analysis
  • User interactions with the database
  • Plotting/Fitting
  • Event Visualization
  • Some of the pieces will be shown in abbreviated
    form.

3
Specification and Job Creation
  • US-CMS has taken the lead on job specification,
    creation, and submission.
  • Two products were created:
  • IMPALA: specifies job parameters for all
    elements of the CMS production chain
  • MC_RUNJOB: specifies scripts, provides an
    interface for job tracking and parameter
    storage, and allows chaining of steps (see the
    sketch after this list)
  • IMPALA has made production smoother and more
    reproducible.
  • Used by almost all production sites
  • MC_RUNJOB is being released for production this
    week
  • A joint development effort between CMS and D0
  • Builds on the IMPALA specification
  • Will make production more automated
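
To make the chaining idea concrete, here is a minimal Python sketch of
MC_RUNJOB-style step chaining with stored parameters. The class and
method names (ProductionStep, Chain, script) are invented for
illustration; they are not the real MC_RUNJOB API.

```python
class ProductionStep:
    """One element of the production chain, with its parameters stored
    so the job can be reproduced later."""
    def __init__(self, name, executable, params):
        self.name = name
        self.executable = executable
        self.params = dict(params)

    def script(self, run_number):
        """Render a shell command line for this step and run number."""
        args = " ".join(f"-{k} {v}" for k, v in self.params.items())
        return f"{self.executable} -run {run_number} {args}"

class Chain:
    """Chains steps in order, so one step's output feeds the next."""
    def __init__(self, steps):
        self.steps = steps

    def scripts(self, run_number):
        return [s.script(run_number) for s in self.steps]

# executable names are real CMS tools; the parameters are invented
chain = Chain([
    ProductionStep("generation", "cmkin", {"events": 500}),
    ProductionStep("simulation", "cmsim", {"geometry": "cms125"}),
])
for line in chain.scripts(run_number=1042):
    print(line)
```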

4
CMS Production with IMPALA
The IMPALA scripts (1) discover input sources and
parameters, (2) fix parameters, create production
scripts, and track production jobs, and (3) define
input parameters, executable production scripts,
and the tracking-DB interface. (These three phases
are sketched below.)
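
A minimal sketch of the three phases as plain Python functions. The
real IMPALA is a set of shell scripts; the function names, the file
pattern, and the dict-based "tracking database" here are all invented
for illustration.

```python
import glob

def discover_inputs(pattern):
    """Phase 1: discover input sources, e.g. generator files on disk."""
    return sorted(glob.glob(pattern))

def create_jobs(inputs, fixed_params):
    """Phase 2: fix parameters and create one production job per input."""
    return [{"input": f, **fixed_params} for f in inputs]

def register_jobs(jobs, tracking_db):
    """Phase 3: record each job in a tracking 'database' (here a dict)."""
    for i, job in enumerate(jobs):
        tracking_db[i] = {"status": "created", **job}

db = {}
jobs = create_jobs(discover_inputs("gen/*.ntpl"), {"version": "V2"})
register_jobs(jobs, db)
print(len(db), "jobs registered")
```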
6
Production Chain
  • CMKIN is the Pythia-based generator step
  • CMSIM is the last piece of CMS Fortran code,
    performing the GEANT3 detector simulation
  • FZ (ZEBRA) files are created
  • ORCA is the reconstruction code; reconstructed
    events are stored in Objectivity
  • Signal and minimum-bias events are combined (the
    chain is sketched as sequential job steps below)
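
A sketch of the chain as sequential job steps in Python. The executable
names are the real CMS tools named above, but every flag and file name
is invented, so the script defaults to a dry run that only prints the
commands.

```python
import shlex
import subprocess

DRY_RUN = True  # set to False only on a machine with the CMS binaries

def run(cmd):
    """Run one step of the chain, aborting the chain on any failure."""
    print("step:", cmd)
    if not DRY_RUN:
        subprocess.run(shlex.split(cmd), check=True)

run("cmkin -events 500 -o run1.ntpl")  # Pythia generation (flags invented)
run("cmsim -i run1.ntpl -o run1.fz")   # GEANT3 simulation -> FZ (ZEBRA) file
run("orca_hitformat -i run1.fz")       # reconstruction into Objectivity
```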

7
Farm Setup
  • Almost any computer can run the CMKIN and CMSIM
    steps.

8
Farm Setup
  • The first step of reconstruction is hit
    formatting, where simulated data is taken from
    the FZ files, formatted, and entered into the
    Objectivity database.
  • The process is fast enough, and moves enough
    data, that more than 10-20 concurrent jobs will
    bog down the database server (a throttling
    sketch follows).
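
One way to respect such a limit is to cap job concurrency at submission
time. This Python sketch throttles stand-in hit-formatting jobs with a
fixed-size worker pool; the 15-job cap comes from the FNAL experience
on a later slide, and hit_format is a placeholder, not real code.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 15  # FNAL's observed AMS-server limit (see FNAL slide)

def hit_format(fz_file):
    """Placeholder for formatting one FZ file into the database."""
    time.sleep(0.1)
    return fz_file

files = [f"run{i}.fz" for i in range(100)]
with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
    for done in pool.map(hit_format, files):
        print("formatted", done)
```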

9
Farm Setup
  • The most advanced production step is digitization
    with pile-up.
  • The response of the detector is digitized, the
    physics objects are reconstructed and stored
    persistently, and at full luminosity 200 minimum-
    bias events are combined with each signal event
    (see the mixing sketch below).

Because of the large volume of minimum-bias events,
multiple Objectivity AMS data servers are needed.
Several configurations have been tried.
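
A toy Python sketch of the mixing step: each signal event is combined
with 200 minimum-bias events drawn from a pool. The event records are
stand-in dicts, not real ORCA objects; only the 200-event figure comes
from the slide.

```python
import random

PILEUP_PER_SIGNAL = 200  # minimum-bias events per signal event (full lumi)

def digitize(signal_event, minbias_pool):
    """Combine one signal event with randomly drawn minimum-bias events."""
    # drawn with replacement: the same minimum-bias event may be reused
    pileup = random.choices(minbias_pool, k=PILEUP_PER_SIGNAL)
    return {"signal": signal_event, "pileup": pileup}

minbias_pool = [{"id": i} for i in range(10_000)]
event = digitize({"id": "signal-1"}, minbias_pool)
print(len(event["pileup"]), "minimum-bias events mixed in")
```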
10
CMS Fall 2000 Production (FNAL)
  • FNAL Hardware
  • 40 dual-CPU 750 MHz Intel-based worker nodes
  • 3 quad-CPU 650 MHz Intel-based server nodes
  • 1 250 GB RAID5 partition (Dell PowerVault) per
    server
  • (1 soon-to-be-retired dual-CPU server with 1.5 TB
    RAID; the RAID will be salvaged)
  • 100 Mb/s Ethernet (soon to be upgraded to Gb
    Ethernet)
  • 1 eight-CPU 400 MHz Sun server with 1 TB RAID for
    the user federation
  • FNAL Experience (or why we use multiple
    federations)
  • Limited by the AMS server to 15 concurrent
    formatting jobs, but overcame this by going to
    multiple federations.
  • More than 3 hit-formatting jobs per federation
    will starve digitization
  • The file-descriptor limit for the AMS server was
    raised to 4096.
  • Pileup, pileup, and more pileup
  • FNAL farm
  • 60 CPUs processing digitization jobs require
    about 833 Mb/s of pileup data on average.
  • Use 9 pileup servers on the 100 Mb/s network for
    full pileup (the arithmetic is checked below).
  • But we didn't reach the network limit
  • The FBSNG batch manager is used to configure the
    farm
  • Pileup-intensive jobs are required NOT to run on
    pileup-serving worker nodes.
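
A back-of-the-envelope check of the pileup bandwidth figures quoted
above, using only the numbers from this slide:

```python
cpus = 60
total_demand_mbps = 833                  # average pileup demand quoted above
per_job_mbps = total_demand_mbps / cpus  # ~13.9 Mb/s per digitization job

servers = 9
link_mbps = 100                          # 100 Mb/s Ethernet per pileup server
capacity_mbps = servers * link_mbps      # 900 Mb/s aggregate

print(f"{per_job_mbps:.1f} Mb/s per job; "
      f"{capacity_mbps} Mb/s capacity vs {total_demand_mbps} Mb/s demand")
```

With 900 Mb/s of aggregate serving capacity against an average demand of
833 Mb/s, the nine servers sit just above the average load, consistent
with the observation that the network limit was not reached.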

11
Example Objy Server Deployment at FNAL
4 production federations at FNAL (the catalog is
used only to locate database files). 3 FNAL servers
plus several worker nodes are used in this
configuration:
  • 3 federation hosts with attached RAID partitions
  • 2 lock servers
  • 4 journal servers
  • 9 pileup servers
(The same layout is written out as a small config
sketch below.)
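
The slide's server layout, written out as a Python config dict. This is
purely illustrative bookkeeping; Objectivity federations are not
actually configured this way.

```python
# counts taken directly from the slide
deployment = {
    "federation_hosts": 3,  # each with attached RAID partitions
    "lock_servers": 2,
    "journal_servers": 4,
    "pileup_servers": 9,
}
print(sum(deployment.values()), "server roles across 3 FNAL servers",
      "plus several worker nodes")
```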
12
Distributed Computing
  • The Production required to complete the TDRs and
    Data Challenges rapidly overwhelms any single
    production facility.
  • To complete the required production, CMS must
    enlist the help of many centers.

Site         Simulation         Digitization (no PU)  Digitization (PU)  GDMP         Production tools
CERN         Fully operational  Fully operational     Fully operational  ?            ?
FNAL         Fully operational  Fully operational     Fully operational  ?            ?
Moscow       Fully operational  Fully operational     Fully operational  ?
INFN         Fully operational  Fully operational     Fully operational  In progress  ?
UCSD         Fully operational  Fully operational     Fully operational  ?            ?
Caltech      Fully operational  Fully operational     Fully operational  ?            ?
Wisconsin    Operational        Operational           Starting           ?
IN2P3        Operational        Operational           Not operational    In progress  ?
Bristol/RAL  Operational        Operational           Starting           ?            ?
Helsinki     Operational        Not operational       Not operational
UFL          Starting           Not operational       Not operational
13
CMS-PPDG SuperComputing 2001 Demo
14
User Analysis
  • Once data is simulated and in the database, the
    analysis can begin.
  • A separate summary format can be created:
  • ROOT file example created at Fermilab
  • Ntuple files: JetMet PRS group analysis ntuples
    generated and stored at FNAL.
  • Both of these have the disadvantage of breaking
    the connection to the database.
  • A TAG summary format is created:
  • Ntuple-like data is stored in the database
  • The connection is maintained; the user can access
    higher levels of the database (sketched below)
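
A toy Python sketch of why TAGs keep the database connection while
plain ntuples break it: each tag row carries a reference back to the
full event. The FullEvent and Tag classes are invented for
illustration, not real CMS classes.

```python
class FullEvent:
    """Stand-in for the full reconstructed event in the database."""
    def __init__(self, event_id, tracks):
        self.event_id = event_id
        self.tracks = tracks

class Tag:
    """Ntuple-like summary that still points at the full event."""
    def __init__(self, event):
        self.n_tracks = len(event.tracks)  # summary quantity
        self.event = event                 # the maintained connection

events = [FullEvent(i, tracks=list(range(i % 5))) for i in range(10)]
tags = [Tag(e) for e in events]

# A plain ntuple would keep only n_tracks; from a tag we can still
# reach the deeper event data for the selected events:
selected = [t for t in tags if t.n_tracks >= 3]
print(selected[0].event.event_id)
```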

15
Creation of TAGS
  • Users can create tags or use tags generated at
    production time.
  • To create tags, a shallow copy of the database is
    created (the idea is sketched below).
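
The shallow-copy idea in miniature, using the Python standard library:
the copy holds references to the same event objects rather than
duplicating the event data (the event dicts here are stand-ins).

```python
import copy

events = [{"id": i, "data": [0.0] * 1000} for i in range(3)]
shallow = copy.copy(events)     # new list, same underlying event objects

print(shallow[0] is events[0])  # True: no event data was duplicated
```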

16
Processing Tags
  • Tags can be used to select events and perform
    basic end-game analysis steps:
  • Making plots
  • Applying cuts
  • Performing fits
  • The tags are fairly small
  • The shallow copy is stored locally
  • Allows the user to access higher levels of the
    database for selected events (tag processing is
    sketched below)
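
A toy Python sketch of these end-game steps on tags: apply a cut, draw
a crude text histogram, and estimate fit parameters. The tag contents
(et, n_jets) are invented; only the workflow mirrors the slide.

```python
import statistics

tags = [{"et": 20 + 3 * (i % 7), "n_jets": i % 4} for i in range(200)]

# applying a cut
selected = [t for t in tags if t["n_jets"] >= 2]

# making a (crude, text-mode) plot of ET
for lo in range(20, 40, 5):
    n = sum(lo <= t["et"] < lo + 5 for t in selected)
    print(f"{lo:2d}-{lo + 4:2d} GeV | " + "#" * (n // 2))

# 'performing a fit': estimate Gaussian mean and width of ET
ets = [t["et"] for t in selected]
print("mean =", round(statistics.mean(ets), 2),
      " sigma =", round(statistics.stdev(ets), 2))
```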

17
Event Visualization
  • Once a small set of events has been selected,
    they can be visualized using IGUANA.