Title: PROOF Parallel ROOT Facility
1 PROOF - Parallel ROOT Facility
- Kilian Schwarz
- Robert Manteufel
- Carsten Preuß
- GSI
- http://root.cern.ch
Bring the KB to the PB not the PB to the KB
2 Introduction: A step towards a solution (Ali)
ROOT
- ROOT is becoming the most popular physics analysis toolkit available.
  - interactive analysis work in familiar C-style syntax
  - data visualisation, an object-oriented I/O system
  - crucial role for the LCG project ?!
  - is successfully used within the AliROOT framework of the ALICE experiment as an all-in-one solution
- PROOF extends the workstation-based concept of ROOT to the 'Parallel ROOT Facility'.
  - user procedures are kept identical during an analysis session
  - tasks are distributed automatically in the background
AliEn
- AliEn as a GRID analysis platform provides two key elements:
  - a global filesystem
    - files are indexed and tagged in a virtual file catalogue and are globally accessible everywhere
  - a global queue system
    - global job scheduling according to resource requirements
3 Parallel Analysis of Event Data
[Diagram: a local PC running ROOT with the macro ana.C, connected to a remote PROOF cluster of four nodes (node1 to node4), each holding its own .root files.]
proof.conf:
  slave node1
  slave node2
  slave node3
  slave node4
Session on the local PC:
  $ root
  root [0] tree.Process("ana.C")
  root [1] gROOT->Proof("remote")
  root [2] dset->Process("ana.C")
4 PROOF - Scalability
5 GSI environment: the prooflogin script
- scanning user parameters for errors
- processing user parameters
- scanning the LSF cluster for PROOF jobs
- testing .rootrc
- building scripts (cleanup.sh, proofstarter.C and proofd.sh)
- getting / setting the ROOT version
- starting the requested number of PROOF daemons
- building .proof.conf and .rootauthrc
- starting a local rootd
- starting ROOT and executing proofstarter.C
- starting PROOF, uploading packages and starting the analysis
- killing jobs and processes
- removing all generated files
6 User parameters: /usr/local/bin/prooflogin
- -s slave-count
- -t termination-time
- -v ROOT-version
- -f ROOT-files
- -lib library-files
- -par file-packages
- -mol starts the master on the localhost
- -? / -h / -help help for proof.sh
- optional: -file a text file in which all parameters are written
7 Dedicated batch queue for PROOF
- only PROOF jobs are started in the dedicated PROOF queue
- Quick Response Queue
- currently in test operation on 30 nodes of the GSI batch farm
- in this queue PROOF jobs have priority; non-PROOF jobs are put on hold
8 PROOF configuration files
proofserv lxb108kschwarz0 lxb109kschwarz0 lxb110kschwarz0
- $HOME/.proof.conf (e.g.)
  node lxb108 port=1095 usrpwd
  slave lxb108 port=1096 usrpwd
  slave lxb109 port=1095 usrpwd
  slave lxb109 port=1096 usrpwd
  slave lxb110 port=1095 usrpwd
  slave lxb110 port=1096 usrpwd
9 Interactive Analysis with PROOF
- Basic requirements
  - Analysis data has to be stored as objects derived from TObject in ROOT trees
  - proofds have to load extension libraries for user-specific objects to access the data members
  - Analysis code has to be inserted in the automatically generated selector macro for the object to be analyzed
  - <classobject>->MakeSelector()
- Recipe
  - store your objects in trees
  - use the selector macro for analysis code
10 Interactive Analysis with AliEn ROOT/PROOF
Work Distribution
- interactive analysis with PROOF is steered by a data packetizer
- in a local cluster
  - cluster-wide accessible data can be processed by all slaves
  - packet takeover by all slaves!
- in a grid environment
  - site-wide accessible data can be processed by all slaves
  - packet takeover by all slaves within one site!
11 Short explanation for creating an analysis script
- 4 steps
- Open ROOT-file.
- make Selector-files (analysis files)
- edit header-file
- edit source-file
12 Open ROOT-file and create Selector files
- Open your file
  - TFile f("/u/dvgamma/hitfile.root")
- Create Selector files
  - hittree->MakeSelector("Anaproof")
  - the ROOT-file contains a tree called hittree
13 Edit header file
- Add your branch
  - TBranch *b_myHit;
- Set branch address
  - fChain->SetBranchAddress("myHit", &myHit);
- Add user-defined objects and some data members
  - (will be explained later)
14 Class TCounter
- Dummy class designed for collecting analysis results from slaves
- Keeps the data until it is picked up in SlaveTerminate()
15 Edit source file (1/2)
- For an explanation of each function, see the top of the file
- The analysis is embedded in Anaproof::Process(Int_t entry)
  - the analysis checks the hits in the chambers and counts them
  - the counter object collects the hit counters
- In Anaproof::SlaveTerminate() add
  - fOutput->Add(counterobj)
16 Edit source file (2/2)
- First you get your counterobj as a TObject from the outList
- Convert it back before it can be used as a TCounter object
17 Libraries and packages
- TMytrackerhit and TCounter are two user-defined classes.
- To use them in a PROOF session you have to build a package that can be uploaded by a ROOT daemon.
- You need the following directory/file mix:
  - libTMytrackerhit/PROOF-INF/SETUP.C
- A look into SETUP.C:
  Int_t SETUP()
  {
     gSystem->Load("/u/dvgamma/projects/globus/onestep/ROOT/libTMytrackhit.so");
     gSystem->Load("/u/dvgamma/projects/globus/onestep/PROOF/TCounter/TCounter.so");
     return 1;
  }
- To create a package:
  - tar czf libTMytrackerhit.par libTMytrackerhit
18 Finally launch a PROOF session and start the analysis
- Start the PROOF-session via script or manually
- Inside of the session
- Upload packages
- Enable packages
- Create TDSet
- Add file
- Start analysis
19 At last, a screenshot
20 Andreas J. Peters, CERN/Geneva, at ACAT03, Tokyo/Japan
Joint venture of AliEn and ROOT
[Architecture diagram; labels: Client Host; Client or Remote Host; VO Service DB Hosts; TGrid Class; <other> Plugin; TAlien Plugin; AliEn C API; gSOAP Client; API Service; AliEn Services; DB Proxy; DBI; Catalogue DB]
- AliEn Services Catalogue are accessible via the TAlien (TGrid) class and the global gGrid variable in ROOT
- TAlien uses a SOAP-based AliEn C API
Examples
- TGrid::Connect("alien://aliendb1:15000/?direct", "") // initiate gGrid with AliEn plugin (API server at aliendb1, port 15000)
- gGrid->mkdir("/alice/acat03") // create directory in virtual file catalogue
21 AliEn Job Description Example: Running a ROOT macro on registered data
  Executable = "root"
  Packages = "ROOT::3.10.01"
  Arguments = "Command::ROOT -x macro.C"
  InputData = "/alice/production/peters/Tree.root"
  InputFile = "LF:/alice/user/p/peters/macro.C"
  OutputFile = "myhisto.root"
Simple and readable!
22 Interactive Analysis with AliEn ROOT/PROOF
AliEn Grid PROOF Setup
[Diagram; labels: USER SESSION; PROOF MASTER SERVER; PROOF SLAVE SERVERS]
- Guaranteed site access through multiplexing TcpRouters
23 Interactive Analysis with AliEn ROOT/PROOF
Sample Session: Connect/Query
24 Interactive Analysis with AliEn ROOT/PROOF
Sample Session: connection to assigned proofds
25 Interactive Analysis with AliEn ROOT/PROOF
Sample Session: data processing
26 Unification of Batch and Interactive Analysis with AliEn ROOT/PROOF
- current implementation
  - datasets are represented by objects of the type TDSet in ROOT
  - a GRID data query assigns data files to TDSet objects
  - the process method initiates the interactive processing on the assigned GRID PROOF cluster
- to come
  - the same process method initiates the batch processing of the same data set and automatic merging of results
ALICE will test the analysis facilities during the physics data challenge at the end of 2004.
27 Thanks to
- Fons Rademakers
- Andreas Joachim Peters
for their contributions to the transparencies
28 http://www-w2k.gsi.de/root/