Title: Status of LHCb Data GRID
1. Status of LHC(b) Data GRID
- Eric van Herwijnen
- Thursday, 11 July 2002
2. Contents
- LHCb distributed computing for Monte Carlo (SLICE)
- Participation in the European DataGrid (EDG) and LHC Computing Grid (LCG)
- Status of EDG
- Current use of EDG middleware
- Use of the DataGrid during the data challenge
- Current LHCb Grid application R&D (Ganga)
- Conclusions
3. Simulation for LHCb and Integrated Computing Environment (SLICE)
- SLICE has been running for 3 years and has processed many millions of events for the LHCb design
- Main production sites (Tier-1)
  - CERN (>200), Bologna (150), Lyon CCIN2P3 (60), RAL (150)
- Secondary sites (Tier-2)
  - Liverpool (300), Amsterdam VU (20)
- For the 2002 Data Challenges
  - Bristol (20), Cambridge (10), Oxford (10), ScotGrid (128) (Edinburgh/Glasgow), Imperial College, Rio de Janeiro, Moscow
- In 2003, add Barcelona, Germany, Switzerland, Poland
4. Example: the Bologna Beowulf cluster
- Set up at INFN-CNAF
- 100 CPUs hosted in dual-processor machines (ranging from 866 MHz to 1.2 GHz PIII), 512 MB RAM
- 2 Network Attached Storage systems
  - 1 TB in RAID5, with 14 IDE disks + hot spare
  - 1 TB in RAID5, with 7 SCSI disks + hot spare
- Linux disk-less processing nodes with the OS centralized on a file server (root file system mounted over NFS, as sketched below)
- Use of private-network IP addresses and an Ethernet VLAN
- High level of network isolation
- Access to external services (AFS, mccontrol, bookkeeping DB, Java servlets of various kinds, ...) provided by means of a NAT mechanism on a gateway (GW) node
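For illustration only, the root-over-NFS arrangement on the disk-less nodes might look roughly like the sketch below; the host name, export path, address range and mount options are hypothetical, not the actual CNAF configuration:

  # On the file server: export one root file system per disk-less node
  # to the private farm network (hypothetical path and address range)
  echo '/exports/nodes/node01  192.168.1.0/255.255.255.0(rw,no_root_squash,sync)' >> /etc/exports
  exportfs -ra

  # On a disk-less node the kernel mounts its root file system over NFS
  # at boot time; done by hand this is roughly equivalent to:
  mount -t nfs -o rw,nolock fileserver:/exports/nodes/node01 /mnt/root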
5. Farm Configuration (Bologna)
6. Performance (Bologna)
- Farm capable of simulating and reconstructing about 700 LHCb events/day per CPU × 100 CPUs ≈ 70,000 LHCb events/day
- Data transfer over the WAN to the CASTOR tape library at CERN realised using bbftp (see the sketch below)
- Very good throughput (up to 70 Mbits/s over the currently available 100 Mbits/s)
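As an illustration of the bbftp transfer step, a minimal sketch is given below; the remote user, dataset name, CASTOR path and server host are hypothetical, and the exact bbftp options (here -u for the remote user and -e for the control commands, with setnbstream selecting parallel TCP streams) should be checked against the installed client version:

  # Push one produced dataset from the Bologna farm to the CASTOR
  # tape library at CERN over the WAN (all names hypothetical)
  bbftp -u lhcbprod \
        -e "setnbstream 4; put D1600063 /castor/cern.ch/lhcb/mc/D1600063" \
        bbftp-server.cern.ch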
7. Logical flow (diagram)
- Submit jobs remotely via the Web
- Execute on the farm
- Data quality check
- Update the bookkeeping database
- Transfer the data to the mass store
- Analysis
- A wrapper-script sketch of the farm-side steps is given below
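The farm-side part of this flow can be pictured as a wrapper script of roughly the following shape; every script and command name here (run_sicb, check_quality, update_bookkeeping) is a hypothetical stand-in for the actual SLICE tools, shown only to make the sequence of steps concrete:

  #!/bin/sh
  # Hypothetical production wrapper executed on a farm node after a job
  # has been submitted remotely via the Web interface
  RUN=1600061

  # 1. Run the simulation/reconstruction job
  ./run_sicb $RUN > job$RUN.log 2>&1 || exit 1

  # 2. Data quality check on the produced dataset
  ./check_quality D$RUN || exit 1

  # 3. Update the central bookkeeping database
  ./update_bookkeeping $RUN job$RUN.log

  # 4. Transfer the dataset to the mass store at CERN
  #    (e.g. with bbftp, as sketched on the Performance slide)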
8. Monitoring and Control
- LHCb has adopted PVSS II as the prototype control and monitoring system for MC production.
- PVSS is a commercial SCADA (Supervisory Control And Data Acquisition) product developed by ETM.
- Adopted as the control framework for the LHC Joint Controls Project (JCOP).
- Available for Linux and Windows platforms.
9. (No transcript)
10. European DataGrid and LHC Computing Grid
- LHCb takes part in WP8 (Applications) of the European DataGrid (EDG) project
- EDG is an R&D project to develop Grid middleware, in EDG's case on top of Globus
- The middleware enables applications such as HEP to profit from the Grid
- The LHC Computing Grid (LCG) project will design and build the Grid infrastructure required by the LHC
- Research and Technology Assessment groups, e.g. common HEP use cases for the Grid
11. Status of the DataGrid
- Successful demos given by LHCb
  - To the EU reviewers (March)
  - At the opening of the National e-Science Centre in Edinburgh (April)
- Tests of job submission, job monitoring, simple resource broker features, and writing output datasets to Castor
- Version 1.2 of the middleware is being debugged and installed
12. Current use of EDG middleware
- Authentication
  - grid-proxy-init
- Job submission to the DataGrid
  - dg-job-submit
- Monitoring and control
  - dg-job-status
  - dg-job-cancel
  - dg-job-get-output
- Data publication and replication
  - globus-url-copy, GDMP
- Resource scheduling and use of the CERN MSS
  - JDL, sandboxes, storage elements
- A walkthrough of these commands is sketched below
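A hedged end-to-end walkthrough of these commands, reusing the file names from Example 1 below; the job-identifier argument shown for the monitoring commands follows the usual EDG 1.x pattern (the identifier is printed by dg-job-submit) but should be checked against the installed middleware version:

  # 1. Authentication: create a short-lived proxy from the user's Grid certificate
  grid-proxy-init

  # 2. Job submission: the resource broker chooses a Compute Element
  dg-job-submit /home/evh/sicb/sicb/bbincl1600061.jdl -o /home/evh/logsub/

  # 3. Monitoring and control, using the job identifier returned at submission
  dg-job-status <jobId>
  dg-job-cancel <jobId>          # only if the job must be abandoned
  dg-job-get-output <jobId>      # retrieve the OutputSandbox when the job is done

  # 4. Data publication and replication: see the sketch under Example 2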
13. Example 1: Job submission
- dg-job-submit /home/evh/sicb/sicb/bbincl1600061.jdl -o /home/evh/logsub/
- bbincl1600061.jdl:

  Executable    = "script_prod";
  Arguments     = "1600061,v235r4dst,v233r2";
  StdOutput     = "file1600061.output";
  StdError      = "file1600061.err";
  InputSandbox  = {"/home/evhtbed/scripts/x509up_u149", "/home/evhtbed/sicb/mcsend",
                   "/home/evhtbed/sicb/fsize", "/home/evhtbed/sicb/cdispose.class",
                   "/home/evhtbed/v235r4dst.tar.gz", "/home/evhtbed/sicb/sicb/bbincl1600061.sh",
                   "/home/evhtbed/script_prod", "/home/evhtbed/sicb/sicb1600061.dat",
                   "/home/evhtbed/sicb/sicb1600062.dat", "/home/evhtbed/sicb/sicb1600063.dat",
                   "/home/evhtbed/v233r2.tar.gz"};
  OutputSandbox = {"job1600061.txt", "D1600063", "file1600061.output", "file1600061.err",
                   "job1600062.txt", "job1600063.txt"};
14. Example 2: Data replication (diagram)
- On the CERN testbed, a job running on a Compute Element writes its data to local disk, copies it to a Storage Element (in front of the MSS) with globus-url-copy, then registers the file (register-local-file) and publishes it to the Replica Catalogue hosted at NIKHEF, Amsterdam.
- A job elsewhere on the Grid (rest-of-Grid) uses replica-get against the Replica Catalogue to pull a copy of the data onto its local Storage Element.
- A command-level sketch of this flow is given below.
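A command-level sketch of the flow, as seen from the Compute Element side; the Storage Element host and path are placeholders, and the GDMP registration, publication and replica-get steps are only described in comments because their exact command names and options depend on the GDMP version deployed on the testbed:

  # The job has written its dataset to the local disk of the Compute Element.
  # Copy it to the local Storage Element, which sits in front of the MSS:
  globus-url-copy file://$PWD/D1600063 \
      gsiftp://<cern-se-host>/<se-path>/D1600063

  # Next, the file is registered locally and published ("register-local-file"
  # and "publish" in the figure) so that the Replica Catalogue at NIKHEF,
  # Amsterdam records the new replica. A job elsewhere on the Grid then issues
  # a "replica-get" against the catalogue to pull the file from the nearest
  # Storage Element onto its own. Both steps are carried out with GDMP tools.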
15. Data Challenge 1 (July-August 2002)
- Physics Data Challenge (PDC) for detector, physics and trigger evaluations
  - Based on the existing MC production system, with the DataGrid treated as a separate distributed centre
- Computing Data Challenge (CDC) for checking and developing software
- Tests of the new production, bookkeeping and configuration databases
16. Use of the DataGrid during the data challenge
- The DataGrid has been integrated into SLICE as a separate center
- The software is not yet of production quality
- The CERN testbed that we will use for the data challenge has nothing to do with the Grid
17. Specific tests of EDG middleware
- Installation of the LHCb production runtime environment
- Production of data
- Copying of data to the local (i.e. Castor, HPSS and RAL) mass store
- Updating of the bookkeeping database
18. Future tests
- Tests of the resource broker: let the Grid sort out where to run the job
- Transparent use of replicas (and the replica catalogue): let the Grid sort out where to store the data and how to retrieve it
- Integrate Grid facilities with the new data management databases (production, configuration and bookkeeping)
19. GANGA: Gaudi ANd Grid Alliance (architecture diagram)
- A GUI sits on top of GANGA, which in turn talks to the collective and resource Grid services.
- GANGA handles the JobOptions and Algorithms of the GAUDI program and collects its histograms, monitoring information and results.
20. Needed functionality
- Before using the GUI
  - Obtaining certificates and credentials (graphical tools)
  - Revocation of Grid authorisation (if the user no longer needs to access the Grid)
  - Web access to the GANGA server (for the remote user)
  - GUI for resource browsing
  - GUI for data management tools
- Three phases of a job lifetime
  - Preparation of the user job
  - Tasks during user program execution
  - Tasks after job execution
21. Python Bus Design (a possible model for implementation)
22. Conclusions
- LHCb already has distributed MC production using Grid facilities for job submission
- We are embarking on large-scale data challenges commencing July 2002
- Grid middleware will be progressively integrated into our production environment as it matures (starting with EDG, and looking forward to GLUE)
- R&D projects are in place for interfacing users and the Gaudi/Athena software framework to Grid services, and for putting the production system into an integrated Grid environment with monitoring and control
- All work is being conducted in close collaboration with the EDG and LCG projects