Title: Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid
1 Analysis of CMS Heavy Ion Simulation Data Using ROOT/PROOF/Grid
- Jinghua Liu, for:
- Pablo Yepes, Jinghua Liu (Rice University, Houston, TX)
- Maarten Ballintijn, Gunther Roland, Bolek Wyslouch, Jinlong Zhang (MIT, Cambridge, MA)
- Supported by NSF grants 0218603, 0219063
2 Outline
- From the data analysis user's point of view
- Why: ROOT/PROOF/Grid
- How: step by step
- What: test results
- Summary
Other PROOF talks at this conference: Fons Rademakers, Maarten Ballintijn
3 ROOT/PROOF
- ROOT as a data analysis tool
- PROOF: Parallel ROOT Facility, based on and part of ROOT
- Runs on clusters of heterogeneous machines
- Parallel analysis of objects in a set of files
- Parallel execution of scripts
- Transparency, scalability, adaptability, error handling, authentication
- Bring the KB to the PB, not the PB to the KB
- KB: code --> CPU; PB: data
- Use distributed CPUs to analyze distributed data
4 PROOF/Grid Interface
- Use a Grid Resource Broker to detect which nodes in a cluster can be used in the parallel session
- Use Grid File Catalogue and Replication Manager
- Utilize Grid Monitoring Services
- Support Globus Authentication
- Abstract Grid interface
5 Step by Step
- Set up PC cluster(s) (for PROOF/Grid)
- Prepare the data files
- Write analysis code (algorithm)
- Compile a data set for PROOF
- Run a PROOF job
- Get the results
6 PC Clusters
- Client machine (desktop)
- P4 @ 1.8GHz / 512MB / 40GB
- Cluster 1
- 2 Dual Xeon @ 2.4GHz / 1GB / 360GB
- 1 Dual Athlon @ 1.73GHz / 1GB / 240GB
- 8 Dual PIII @ 400MHz / 512MB / 60GB
- Cluster 2
- 3 Dual Athlon @ 1.67GHz / 2GB / 200GB
- Operating systems
- RedHat 6.1, RedHat 7.3, Slackware 8.1
- Globus version 2.2
7 CMS Heavy Ion Simulation
- Jet high-pT particle angular correlation
- Use Calorimeters only
8 CMS Heavy Ion Simulation
- Pythia (event generator): 10,000 jet events
- Hijing (heavy ion event generator): 1000 events
- Each Hijing event (dN/dy = 5000) was divided into 500 sub-events
- Randomly re-combine 500 sub-events (from different events) to form a new Hijing event, a cheap way to obtain more Monte Carlo events
- CMSIM (GEANT 3 based simulation program for CMS)
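The sub-event re-combination can be sketched as below. This is a minimal illustration, not the authors' code: it assumes each mixed event takes its k-th sub-event from a different, randomly chosen source event, and the names `SubEventRef` and `mixEvent` are hypothetical. How the sub-events were actually defined and picked is not stated on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Reference to one sub-event: which source event it comes from and which slot it fills.
// (Hypothetical type, introduced only for this sketch.)
struct SubEventRef {
    int event;  // index of the source Hijing event
    int slot;   // sub-event index within the event (0 .. nSub-1)
};

// Build one mixed event: slot k is taken from a randomly chosen source event,
// and no source event is used twice (assumed reading of "from different events").
std::vector<SubEventRef> mixEvent(int nEvents, int nSub, std::mt19937 &rng) {
    assert(nSub > 0 && nEvents >= nSub);  // need enough distinct source events
    std::vector<int> events(nEvents);
    std::iota(events.begin(), events.end(), 0);       // 0, 1, ..., nEvents-1
    std::shuffle(events.begin(), events.end(), rng);  // random order of source events
    std::vector<SubEventRef> mixed;
    mixed.reserve(nSub);
    for (int k = 0; k < nSub; ++k) {
        mixed.push_back({events[k], k});
    }
    return mixed;
}
```

With the numbers on the slide, `mixEvent(1000, 500, rng)` would describe one new Hijing event; calling it repeatedly with the same generator gives further mixed events.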
9 Data Production: Globus Jobs
- Globus used to submit and manage the jobs
- No data replication (files were intentionally
stored locally)
10 Build ROOT Tree
- Superimpose jet events on top of Hijing events and generate ROOT Tree
- Standalone code linked with ROOT libraries
- CMS Ecal (Electromagnetic Calorimeter)
- Barrel: 61200 cells, endcap: 14648 cells
- Hcal (Hadronic Calorimeter)
- 14616 cells (multi-layer)
- 4032 towers
- calotree: Ecal cells (energy, position), Hcal towers (energy, position)
- 10,000 events were split into 100 files, 100 events each
- File size 160MB, total data 16GB
- Data distributed, each node got some local files
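The bookkeeping behind these numbers can be checked with a small sketch. The totals follow directly from the slide (100 files, 100 events each, 160MB per file). The round-robin placement of files on nodes is only an assumed policy for illustration, since the slide does not say how files were assigned; `DatasetLayout`, `totalEvents`, `totalSizeMB` and `assignFiles` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Layout of the produced data set (values on the slide: 100 files, 100 events, 160 MB).
struct DatasetLayout {
    int nFiles;
    int eventsPerFile;
    int fileSizeMB;
};

// Total number of events across all files.
long totalEvents(const DatasetLayout &d) {
    return static_cast<long>(d.nFiles) * d.eventsPerFile;
}

// Total data volume in MB.
long totalSizeMB(const DatasetLayout &d) {
    return static_cast<long>(d.nFiles) * d.fileSizeMB;
}

// Assign file numbers 1..nFiles to nodes in round-robin order
// (assumed policy, not taken from the slides).
std::vector<std::vector<int>> assignFiles(int nFiles, int nNodes) {
    assert(nNodes > 0);
    std::vector<std::vector<int>> perNode(nNodes);
    for (int f = 1; f <= nFiles; ++f) {
        perNode[(f - 1) % nNodes].push_back(f);
    }
    return perNode;
}
```

For `DatasetLayout{100, 100, 160}` this gives 10,000 events and 16,000 MB, matching the 16GB quoted above.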
11 TSelector: The Algorithms
- Create TSelector from TTree
    root
    root [0] TFile f("heavyion001.root")
    root [1] calotree->MakeSelector("myselector")
    root [2] .q
    ls myselector.C myselector.h
- Add the analysis code (algorithm) into TSelector
    vi myselector.h
    vi myselector.C
12 TSelector: The Algorithms
    class myselector : public TSelector {
    public:
       TTree *fChain;
       . . .
    private:
       TH1F *hist1d;
       TH2F *hist2d;
       . . .
    };
13 TSelector: The Algorithms
    void myselector::Begin(TTree *tree)
    {
       hist1d = new TH1F("DeltaPhi", "DeltaPhi", 100, -180., 180.);
       hist2d = new TH2F("EtaPhi", "EtaPhi", 100, -5., 5., 100, -4., 4.);
       fOutput->Add(hist1d);
       fOutput->Add(hist2d);
    }

    Bool_t myselector::Process(Int_t entry)
    {
       // user's analysis code goes here!
       for (i = 0; i < nclusters; i++) {
          if (Et1 > 5) {
             for (j = i + 1; j < nclusters; j++) {
                if (Et2 > 5) {
                   DeltaPhi = . . . ;
                   hist1d->Fill(DeltaPhi);
                }
             }
          }
       }
       return kTRUE;
    }
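The pair loop elides how DeltaPhi is computed. A minimal standalone sketch of one common choice follows: wrap the azimuthal difference into [-180, 180) degrees and count every pair of clusters that both pass the Et cut. Degrees as the unit, the `Cluster` struct, and the functions `deltaPhiDeg` and `fillDeltaPhi` are assumptions for illustration, and a plain `std::vector<int>` stands in for the ROOT histogram.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One calorimeter cluster (hypothetical type for this sketch).
struct Cluster {
    double et;      // transverse energy; the slide cuts at Et > 5
    double phiDeg;  // azimuthal angle in degrees (assumed unit)
};

// Wrap phi1 - phi2 into the range [-180, 180) degrees.
double deltaPhiDeg(double phi1, double phi2) {
    double d = std::fmod(phi1 - phi2 + 180.0, 360.0);
    if (d < 0.0) d += 360.0;
    return d - 180.0;
}

// Count DeltaPhi for every pair (i < j) in which both clusters pass the Et cut.
// The result has nBins equal-width bins covering [-180, 180).
std::vector<int> fillDeltaPhi(const std::vector<Cluster> &clusters,
                              double etMin, int nBins) {
    assert(nBins > 0);
    std::vector<int> hist(nBins, 0);
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        if (clusters[i].et <= etMin) continue;
        for (std::size_t j = i + 1; j < clusters.size(); ++j) {
            if (clusters[j].et <= etMin) continue;
            double d = deltaPhiDeg(clusters[i].phiDeg, clusters[j].phiDeg);
            int bin = static_cast<int>((d + 180.0) / 360.0 * nBins);
            hist[std::min(bin, nBins - 1)] += 1;  // clamp guards against rounding at the upper edge
        }
    }
    return hist;
}
```

Inside a TSelector the same wrapped value would be passed to `hist1d->Fill(...)` instead of being counted by hand.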
14 TDSet: Data Location
- Specify a collection of TTrees or files
    TDSet *ds = new TDSet("TTree", "calotree");
    ds->Add("/data1/cms/cmsim/heavyion001.root");
    ds->Add("/data1/cms/cmsim/heavyion002.root");
    ds->Add("lfn://pcs21.rice.edu/data5/heavyion110.root");
    ds->Add("lfn://pcs11.rice.edu/cms/cmsim/heavyion230.root");
    ds->Print();
- It's better to put these into a macro
- The file list can also be returned by a DB or File Catalog query, etc.
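Putting the file list into a macro mostly means generating the numbered paths in a loop. The sketch below builds such a list in standalone C++. The `heavyionNNN.root` naming with three zero-padded digits follows the paths shown on the slide, while the helper name `makeFileList` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Build paths of the form <dir>/heavyionNNN.root for file numbers first..last.
std::vector<std::string> makeFileList(const std::string &dir, int first, int last) {
    assert(first <= last);
    std::vector<std::string> paths;
    for (int i = first; i <= last; ++i) {
        std::ostringstream name;
        // Zero-pad the file number to three digits, e.g. 1 -> "001".
        name << dir << "/heavyion" << std::setw(3) << std::setfill('0') << i << ".root";
        paths.push_back(name.str());
    }
    return paths;
}
```

In a ROOT macro, each returned path would then be passed to `ds->Add(...)`.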
15 Running a PROOF Job
    root
    gROOT->Proof("proofmaster.rice.edu");
    TDSet *ds = new TDSet("TTree", "calotree");
    ds->Add(. . .);
    . . .
    ds->Process("myselector.C", options, nentries, first);
    (note: options must be pre-coded in myselector.C)
    TH1F *h1 = (TH1F *) gProof->GetOutput("DeltaPhi");
    h1->Draw();
16 Angular Correlation
17 Scale Plot
- Analysis speed vs. CPUs (PIII 1GHz equivalent)
- CPU power/data size balanced
- CPU intensive calculations
18 Summary
- CMS Heavy Ion Analysis implemented and tested with PROOF
- Scales well with CPUs
- PROOF/Grid can provide data analysis power unavailable otherwise. This power can be achieved without much extra effort
- The PROOF/Grid interface is under rapid development. The plan is to extend the presented study to use the Grid interface