Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid

1
Analysis of CMS Heavy Ion Simulation Data Using
ROOT/PROOF/Grid
  • Jinghua Liu
  • for
  • Pablo Yepes, Jinghua Liu
  • Rice University, Houston, TX
  • Maarten Ballintijn, Gunther Roland,
  • Bolek Wyslouch, Jinlong Zhang
  • MIT, Cambridge, MA
  • Supported by NSF grants 0218603, 0219063

2
Outline
  • From a data analysis user's point of view
  • Why: ROOT/PROOF/Grid
  • How: step by step
  • What: test results
  • Summary

Other PROOF talks at this conference: Fons Rademakers, Maarten Ballintijn
3
ROOT/PROOF
  • ROOT as a data analysis tool
  • PROOF: Parallel ROOT Facility, based on and part
    of ROOT
  • on clusters of heterogeneous machines
  • parallel analysis of objects in a set of files
  • parallel execution of scripts
  • Transparency, Scalability, Adaptability, Error
    handling, Authentication
  • Bring the KB to the PB, not the PB to the KB
  • KB: code --> CPU; PB: data
  • Use distributed CPUs to analyze distributed data

4
PROOF/Grid Interface
  • Use a Grid Resource Broker to detect which nodes
    in a cluster can be used in the parallel session
  • Use Grid File Catalogue and Replication Manager
  • Utilize Grid Monitoring Services
  • Support Globus Authentication
  • Abstract Grid interface

5
Step by Step
  • Set up PC cluster(s) (for PROOF/Grid)
  • Prepare the data files
  • Write analysis code (algorithm)
  • Compile a data set for PROOF
  • Run a PROOF job
  • Get the results

6
PC Clusters
  • Client machine (desktop)
    • P4 @ 1.8 GHz / 512 MB / 40 GB
  • Cluster 1
    • 2 dual Xeon @ 2.4 GHz / 1 GB / 360 GB
    • 1 dual Athlon @ 1.73 GHz / 1 GB / 240 GB
    • 8 dual PIII @ 400 MHz / 512 MB / 60 GB
  • Cluster 2
    • 3 dual Athlon @ 1.67 GHz / 2 GB / 200 GB
  • Operating systems
    • RedHat 6.1, RedHat 7.3, Slackware 8.1
  • Globus version 2.2

7
CMS Heavy Ion Simulation
  • Jet high-pT particle angular correlation
  • Use Calorimeters only

8
CMS Heavy Ion Simulation
  • Pythia (event generator): 10,000 jet events
  • Hijing (heavy ion event generator): 1,000 events
  • Each Hijing event (dN/dy = 5000) was divided into
    500 sub-events
  • Randomly recombine 500 sub-events (from
    different events) to form a new Hijing event, a
    cheap way to obtain more Monte Carlo events
  • CMSIM (GEANT 3 based simulation program for CMS)
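
The sub-event recombination above can be sketched in plain C++. This is only an illustration under stated assumptions: the talk does not say how a sub-event is defined, so the SubEvent type, the idea of fixed "slots", and the function name mixEvent are all hypothetical. The sketch keeps each slot in place and takes every slot from a different, randomly chosen parent event.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for one slice of a generated event.
struct SubEvent {
    int parentEvent;  // index of the event this slice was cut from
    int slot;         // position of the slice inside its parent
};

using Event = std::vector<SubEvent>;

// Build one mixed event: slot k is copied from a randomly chosen parent,
// and no parent is used twice (assumes at least as many parents as slots).
Event mixEvent(const std::vector<Event>& parents, std::mt19937& rng) {
    assert(!parents.empty());
    const std::size_t nSlots = parents.front().size();
    assert(parents.size() >= nSlots);

    // Shuffle the parent indices; the first nSlots entries are distinct donors.
    std::vector<std::size_t> order(parents.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);

    Event mixed;
    mixed.reserve(nSlots);
    for (std::size_t slot = 0; slot < nSlots; ++slot) {
        const Event& donor = parents[order[slot]];
        assert(donor.size() == nSlots);
        mixed.push_back(donor[slot]);
    }
    return mixed;
}
```

With 1,000 parent events and 500 slots, as in the slide, each call would yield a new combination at the cost of one shuffle.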

9
Data Production Globus Jobs
  • Globus used to submit and manage the jobs
  • No data replication (files were intentionally
    stored locally)

10
Build ROOT Tree
  • Superimpose jet events on top of Hijing events
    and generate ROOT Tree
  • Standalone code linked with ROOT libraries
  • CMS Ecal (Electromagnetic Calorimeter)
    • barrel 61200 cells, endcap 14648 cells
  • Hcal (Hadronic Calorimeter)
    • 14616 cells (multi-layer)
    • 4032 towers
  • calotree --> Ecal cells (energy, position),
    Hcal towers (energy, position)
  • 10,000 events were split into 100 files, 100
    events each
  • File size 160 MB, total data 16 GB
  • Data distributed: each node got some local files
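
As a rough illustration of the superposition step, the sketch below adds the energy deposits of a jet event and a background event cell by cell. The flat one-value-per-cell layout and the function name superimpose are assumptions made for this example; the standalone code used in the talk is not shown in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: one energy value per calorimeter cell, indexed by cell number.
// Superimposing two events then amounts to summing the energies in each cell.
std::vector<double> superimpose(const std::vector<double>& jetCells,
                                const std::vector<double>& backgroundCells) {
    assert(jetCells.size() == backgroundCells.size());
    std::vector<double> sum(jetCells.size());
    for (std::size_t i = 0; i < sum.size(); ++i) {
        sum[i] = jetCells[i] + backgroundCells[i];
    }
    return sum;
}
```

The summed cells are what would then be written to the tree, one entry per event.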

11
TSelector: The Algorithms
  • Create a TSelector from the TTree

root
root [0] TFile f("heavyion001.root")
root [1] calotree->MakeSelector("myselector")
root [2] .q
ls
myselector.C  myselector.h
  • Add the analysis code (algorithm) into the TSelector

vi myselector.h
vi myselector.C
12
TSelector: The Algorithms
  • myselector.h

class myselector : public TSelector {
public:
   TTree *fChain;
   . . .
private:
   TH1F *hist1d;
   TH2F *hist2d;
   . . .
};
13
TSelector: The Algorithms
  • myselector.C

void myselector::Begin(TTree *tree)
{
   hist1d = new TH1F("DeltaPhi", "DeltaPhi", 100, -180., 180.);
   hist2d = new TH2F("EtaPhi", "EtaPhi", 100, -5., 5., 100, -4., 4.);
   fOutput->Add(hist1d);
   fOutput->Add(hist2d);
}

Bool_t myselector::Process(Int_t entry)
{
   // user's analysis code goes here!
   for (i = 0; i < nclusters; i++) {
      if (Et1 > 5) {
         for (j = i + 1; j < nclusters; j++) {
            if (Et2 > 5) {
               DeltaPhi = . . .
               hist1d->Fill(DeltaPhi);
            }
         }
      }
   }
   return kTRUE;
}
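
The pair loop in Process can be tried out without ROOT. The sketch below is a stand-in, not the selector from the talk: the Cluster and Hist1D types and the function names are hypothetical, angles are assumed to be in degrees (suggested by the histogram range of -180 to 180), and the DeltaPhi definition, which the slide leaves out, is assumed to be the azimuthal difference wrapped into [-180, 180).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical cluster record: transverse energy and azimuth in degrees.
struct Cluster {
    double et;
    double phiDeg;
};

// Azimuthal difference wrapped into [-180, 180) degrees.
double deltaPhiDeg(double phi1Deg, double phi2Deg) {
    double d = std::fmod(phi1Deg - phi2Deg, 360.0);  // result lies in (-360, 360)
    if (d >= 180.0) d -= 360.0;
    if (d < -180.0) d += 360.0;
    return d;
}

// Minimal fixed-width histogram standing in for TH1F; out-of-range values are dropped.
struct Hist1D {
    double lo;
    double hi;
    std::vector<long> bins;

    Hist1D(std::size_t nBins, double low, double high)
        : lo(low), hi(high), bins(nBins, 0) {
        assert(nBins > 0 && high > low);
    }

    void fill(double x) {
        if (x < lo || x >= hi) return;
        std::size_t i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
        if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the edge
        ++bins[i];
    }
};

// Fill the histogram with DeltaPhi for every pair of clusters above the Et threshold.
void fillPairs(const std::vector<Cluster>& clusters, double etMin, Hist1D& h) {
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        if (clusters[i].et <= etMin) continue;
        for (std::size_t j = i + 1; j < clusters.size(); ++j) {
            if (clusters[j].et > etMin) {
                h.fill(deltaPhiDeg(clusters[i].phiDeg, clusters[j].phiDeg));
            }
        }
    }
}
```

Starting the inner loop at i + 1 counts each pair once, matching the loop bounds on the slide.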
14
TDSet: Data Location
  • Specify a collection of TTrees or files

TDSet *ds = new TDSet("TTree", "calotree");
ds->Add("/data1/cms/cmsim/heavyion001.root");
ds->Add("/data1/cms/cmsim/heavyion002.root");
ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
ds->Print();
  • It's better to put these into a macro
  • The list can also be returned by a DB or File Catalog query, etc.
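
One way to avoid typing a long run of Add calls is to generate the file names in a loop. The helper below is a hypothetical sketch in standard C++; it assumes the three-digit, zero-padded numbering seen in the file names above. Inside a ROOT macro, each returned name would be handed to ds->Add.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds names such as <prefix>001.root ... <prefix>100.root.
std::vector<std::string> makeFileList(const std::string& prefix, int first, int last) {
    assert(first <= last);
    std::vector<std::string> files;
    for (int i = first; i <= last; ++i) {
        std::ostringstream name;
        // setw applies only to the number; setfill pads it with zeros to three digits.
        name << prefix << std::setw(3) << std::setfill('0') << i << ".root";
        files.push_back(name.str());
    }
    return files;
}
```

For example, makeFileList("/data1/cms/cmsim/heavyion", 1, 100) would produce the 100 local file names in one call.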

15
Running a PROOF Job
root
gROOT->Proof("proofmaster.rice.edu");
TDSet *ds = new TDSet("TTree", "calotree");
ds->Add(. . .);
. . .
ds->Process("myselector.C", options, nentries, first);
(note: options must be pre-coded in myselector.C)
TH1F *h1 = (TH1F *)gProof->GetOutput("DeltaPhi");
h1->Draw();
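
Conceptually, each worker fills its own copy of the output histograms and the copies are combined before the client retrieves them. The sketch below shows only that combining step, for plain bin-count vectors. The Histogram alias and the function name mergeHistograms are hypothetical, and this is not PROOF's own merging code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<long>;  // hypothetical: one count per bin

// Combine per-worker histograms by adding the counts bin by bin.
// All partial histograms must use the same binning.
Histogram mergeHistograms(const std::vector<Histogram>& partials) {
    assert(!partials.empty());
    Histogram merged(partials.front().size(), 0);
    for (const Histogram& part : partials) {
        assert(part.size() == merged.size());
        for (std::size_t bin = 0; bin < merged.size(); ++bin) {
            merged[bin] += part[bin];
        }
    }
    return merged;
}
```

Because bin counts simply add, the merged result does not depend on how the events were divided among the workers.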
16
Angular Correlation
17
Scale plot
  • Analysis speed vs. CPUs (PIII 1GHz equivalent)
  • CPU power/data size balanced
  • CPU intensive calculations
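
The talk does not say how the "PIII 1GHz equivalent" CPU count was obtained. One simple possibility, shown here purely as an assumption, is to weight each CPU by the ratio of its clock frequency to the reference clock. This ignores architectural differences between processor families. The NodeGroup type and the function name are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of a group of identical nodes.
struct NodeGroup {
    int nodes;
    int cpusPerNode;
    double clockGHz;
};

// Assumed normalization: every CPU counts as (its clock / reference clock) reference CPUs.
double equivalentCpus(const std::vector<NodeGroup>& groups, double referenceGHz) {
    assert(referenceGHz > 0.0);
    double total = 0.0;
    for (const NodeGroup& g : groups) {
        total += g.nodes * g.cpusPerNode * g.clockGHz / referenceGHz;
    }
    return total;
}
```

Under this assumption, the Cluster 1 hardware listed earlier (2 dual Xeon at 2.4 GHz, 1 dual Athlon at 1.73 GHz, 8 dual PIII at 400 MHz) would count as roughly 19.5 CPUs at 1 GHz.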

18
Summary
  • CMS Heavy Ion Analysis implemented and tested
    with PROOF
  • Scales well with the number of CPUs
  • PROOF/Grid can provide data analysis power
    that is unavailable otherwise, without much
    extra effort
  • The PROOF/Grid interface is under rapid development.
    The plan is to extend the presented study to use
    the Grid interface

19
  • The End