1
The D0 NIKHEF Farm
Kors Bos
Fermilab, May 23 2001
2
Layout of this talk
  • D0 Monte Carlo needs
  • The NIKHEF D0 farm
  • The data we produce
  • The SAM database
  • A Grid intermezzo
  • The network
  • The next steps
Fermilab, May 23 2001
3
D0 Monte Carlo needs
  • D0 trigger rate is 100 Hz, 10^7 seconds/yr → 10^9 events/yr
  • We want 10% of that to be simulated → 10^8 events/yr
  • Simulating 1 QCD event takes 3 minutes (size 2 MByte)
  • On an 800 MHz PIII
  • So 1 CPU can produce 10^5 events/yr (200 GByte)
  • Assuming a 60% overall efficiency
  • So our 100-CPU farm can produce 10^7 events/yr (20 TByte)
  • And this is only 10% of the goal we set ourselves
  • Not counting the Nijmegen D0 farm yet
  • So we need another 900 CPUs
  • UTA (50), Lyon (200), Prague (10), BU (64),
  • Nijmegen (50), Lancaster (200), Rio (25), ...

(A quick arithmetic check of these numbers follows.)
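
A minimal Python check of this arithmetic. The 10^7 s/yr is the usual accelerator live-time figure; for CPU capacity the sketch assumes the machines run the full calendar year (about 3.15 x 10^7 s), which is what makes the 10^5 events/yr per CPU come out.

    # Sanity check of the capacity numbers on this slide
    LIVE_SECONDS = 1e7        # accelerator live time per year (slide figure)
    WALL_SECONDS = 3.15e7     # full calendar year, assumed for CPU availability

    print(100 * LIVE_SECONDS)                 # 100 Hz trigger -> 1e9 events/yr
    print(0.10 * 100 * LIVE_SECONDS)          # 10% to simulate -> 1e8 events/yr

    per_cpu = 0.60 * WALL_SECONDS / (3 * 60)  # 60% efficiency, 3 min/event
    print(per_cpu)                            # ~1.05e5 events/yr per CPU
    print(per_cpu * 2 / 1e3)                  # ~210 GByte at 2 MByte/event
    print(100 * per_cpu)                      # ~1.05e7 events/yr for 100 CPUs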

4
How it looks
5
The NIKHEF D0 Farm
 

6
50 farm nodes (100 CPUs): Dell Precision Workstation 220
  • Dual Pentium III processor 800 MHz / 256 kB
    cache each
  • 512 MB PC800 ECC RDRAM
  • 40 GB (7200 rpm) ATA-66 disk drive
  • no screen
  • no keyboard
  • no mouse
  • wake-on-LAN functionality

7
The File Server: Elonex EIDE Server
  • Dual Pentium III 700 MHz
  • 512 MB SDRAM
  • 20 GByte EIDE disk
  • 1.2 TByte: 75 GB EIDE disks
  • 2 x Gigabit Netgear GA620 network cards

The Farm Server: Dell Precision 620 workstation
  • Dual Pentium III Xeon 1 GHz
  • 512 MB RDRAM
  • 72.8 GByte SCSI disk
  • Will also serve as D0 software server for the NIKHEF/D0 people

8
Software on the farm
  • Boot via the network
  • Standard Red Hat Linux 6.2
  • ups/upd on the server
  • D0 software on the server
  • FBSNG on the server, daemon on the nodes
  • SAM on the file server
  • The network boot is also used to test new machines

9
What we run on the farm
  • Particle generator: Pythia or Isajet
  • GEANT detector simulation: d0gstar
  • Digitization, adding min. bias: psim
  • Check the data: mc_analyze
  • Reconstruction: preco
  • Analysis: reco_analyze

(A sketch of chaining these stages follows.)
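On each node these stages run back to back; a minimal sketch of such a chain, assuming each program is on the PATH and picks up the previous stage's output from the working directory (the real command-line invocations are not shown in this talk).

    import subprocess

    # MC chain as named on this slide; arguments are omitted because the
    # real invocations are not shown in the talk
    STAGES = ["isajet", "d0gstar", "psim", "mc_analyze", "preco", "reco_analyze"]

    def run_chain(workdir):
        for stage in STAGES:
            # each stage reads the previous stage's output from workdir
            subprocess.run([stage], cwd=workdir, check=True)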

10
Example Min.bias
  • Did a run with 1000 events on all CPUs
  • Took 2 min/event
  • So ~1.5 days for the whole run
  • Output file size: 575 MByte
  • We left those files on the nodes
  • A reason to have enough local disk space
  • We intend to repeat that from time to time

(A quick check of the runtime follows.)
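The quoted runtime checks out if each CPU works through its own 1000 events:

    # 1000 events at 2 min/event, each CPU processing its own set in parallel
    print(1000 * 2 / 60 / 24)   # ~1.4 days -> the slide's "1.5 days"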

11
Output data
  -rw-r--r--  1 a03 computer        298 Nov  5 19:25 RunJob_farm_qcdJob308161443.params
  -rw-r--r--  1 a03 computer 1583995325 Nov  5 10:35 d0g_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
  -rw-r--r--  1 a03 computer        791 Nov  5 19:25 d0gstar_qcdJob308161443.params
  -rw-r--r--  1 a03 computer        809 Nov  5 19:25 d0sim_qcdJob308161443.params
  -rw-r--r--  1 a03 computer   47505408 Nov  3 16:15 gen_mcp03_pmc03.00.01_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000
  -rw-r--r--  1 a03 computer       1003 Nov  5 19:25 import_d0g_qcdJob308161443.py
  -rw-r--r--  1 a03 computer        912 Nov  5 19:25 import_gen_qcdJob308161443.py
  -rw-r--r--  1 a03 computer       1054 Nov  5 19:26 import_sim_qcdJob308161443.py
  -rw-r--r--  1 a03 computer        752 Nov  5 19:25 isajet_qcdJob308161443.params
  -rw-r--r--  1 a03 computer        636 Nov  5 19:25 samglobal_qcdJob308161443.params
  -rw-r--r--  1 a03 computer  777098777 Nov  5 19:24 sim_mcp03_psim01.02.00_nikhef.d0farm_isajet_qcd-incl-PtGt2.0_mb-poisson-2.5_p1.1_308161443_2000
  -rw-r--r--  1 a03 computer       2132 Nov  5 19:26 summary.conf

12
Output data translated
  • 0.047 Gbyte gen_
  • 1.5 Gbyte d0g_
  • 0.7 Gbyte sim_
  • import_gen_.py
  • import_d0g_.py
  • import_sim_.py

isajet_.params RunJob_Farm_.params d0gstar_.par
ams d0sim_.params samglobal_.params Summary.conf

12 files for generatord0gstarpsim But of course
only 3 big ones Total 2 Gbyte
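The long data-file names encode their metadata in underscore-separated fields; a small Python sketch of how one could unpack them. The field names are my reading of the examples on the previous slide, not an official scheme.

    # Field names inferred from the example file names, not official
    FIELDS = ["tier", "production", "version", "site", "generator",
              "process", "minbias", "pass", "jobid", "nevents"]

    def parse_name(name):
        return dict(zip(FIELDS, name.split("_")))

    info = parse_name("d0g_mcp03_pmc03.00.01_nikhef.d0farm_isajet_"
                      "qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000")
    print(info["tier"], info["nevents"])   # d0g 2000
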
13
Data management
parameters
Import_gen.py
geant data (hits)
Import_d0g.py
sim data (digis)
Import_sim.py
Import_reco.py
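Each import script pairs one data file with its metadata for SAM. The talk does not show their contents or the SAM interface, so this is only a hypothetical sketch of the idea; declare_file() is a made-up stand-in, not a real SAM call.

    # Hypothetical sketch of an import script's job: pair one output file
    # with its metadata and hand both to SAM. declare_file() is a made-up
    # stand-in; the real SAM interface is not shown in this talk.
    metadata = {
        "tier": "d0g",            # GEANT output (hits)
        "generator": "isajet",
        "process": "qcd-incl-PtGt2.0",
        "events": 2000,
    }

    def declare_file(path, meta):
        print("would declare", path, "to SAM with", meta)

    declare_file("d0g_mcp03_pmc03.00.01_nikhef.d0farm_isajet_"
                 "qcd-incl-PtGt2.0_mb-none_p1.1_308161443_2000", metadata)
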
14
Automation
  • mc_runjob (modified)
  • Prepares MC jobs (gen + sim + reco + anal)
  • e.g. 300 events per job/CPU
  • Repeated e.g. 500 times
  • Submits them into the batch system (FBS)
  • They run on the nodes
  • Output is copied to the fileserver after completion
  • A separate batch job on the fileserver
  • Submits the files into SAM
  • SAM does the file transfers to Fermilab and SARA
  • The whole chain runs for a week

(A sketch of this submission loop follows.)
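A minimal Python sketch of that loop. The flow (prepare ~500 jobs of ~300 events, submit each to the batch system) is from the slide, but the job-file contents and the fbs command line are placeholders, not the real mc_runjob or FBSNG syntax.

    import subprocess

    EVENTS_PER_JOB = 300   # per job/CPU (slide figure)
    N_JOBS = 500           # repeated e.g. 500 times (slide figure)

    def prepare_job(i):
        # placeholder for mc_runjob's preparation step; the real FBSNG
        # job-file syntax is not shown in the talk
        path = f"job_{i:03d}.fbs"
        with open(path, "w") as f:
            f.write(f"# gen+sim job {i}: {EVENTS_PER_JOB} events\n")
        return path

    for i in range(N_JOBS):
        # placeholder command line for the FBS submission
        subprocess.run(["fbs", "submit", prepare_job(i)], check=True)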

15
[Diagram: job flow — an mcc request arrives at the farm server, which submits an FBS job with three steps: 1 mcc (runs on one of the 50 nodes, 40 GB local disk each), 2 rcp (copies the output to the 1.2 TB file server), 3 sam (ships the data to the datastore at FNAL and SARA); the farm server keeps control, and metadata goes to the SAM DB]
16
This is a grid!
17
The Grid
  • Not just D0, but the LHC experiments too
  • Not just SAM, but any database
  • Not just farms, but any CPU resource
  • Not just SARA, but any mass storage
  • Not just FBS, but any batch system
  • Not just HEP, but any science: EO, ...

18
European Datagrid Project
  • A 3-year project for 10 M€
  • Manpower to develop grid tools
  • CERN, IN2P3, INFN, PPARC, ESA, FOM
  • (in NL: NIKHEF, SARA, KNMI)
  • Farm management
  • Mass storage management
  • Network management
  • Testbed
  • HEP and EO applications

19
LHC - Regional Centres
[Diagram: LHC regional centres — CERN is Tier 0; Tier 1 centres: KEK, INFN, BNL, IN2P3, NIKHEF/SARA, RAL, FNAL; Tier 2 (connected via SURFnet): Utrecht, Vrije Univ., Nijmegen, Amsterdam, Brussel, Leuven; departments below that; for Atlas, LHCb, and possibly Alice]
20
DataGrid Test bed sites
[Map: European DataGrid testbed sites, NIKHEF among them]
21
The NL-Datagrid Project
22
NL-Datagrid Goals
  • National testbed for middleware development
  • (WP4, WP5, WP6, WP7, WP8, WP9)
  • To become an LHC Tier-1 center (ATLAS, LHCb, Alice)
  • To use it for the existing program (D0, Antares)
  • To use it for other sciences (EO, astronomy, biology)
  • For tests with other (transatlantic) grids: D0, PPDG, GriPhyN

23
NL-Datagrid Testbed Sites
  • Univ. Amsterdam (Atlas)
  • Vrije Univ. (LHCb)
  • Nijmegen Univ. (Atlas)
  • Univ. Utrecht (Alice)
  • CERN, RAL, FNAL, ESA
24
Dutch Grid topology
[Diagram: Dutch Grid topology — Utrecht Univ. (Alice), Nijmegen Univ. (D0, Atlas), plus an LHCb site; the central hub serves D0, Atlas, LHCb, and Alice]
25
End of the Grid intermezzo
Back to the NIKHEF D0 farm and Fermilab: the network
26
Network bandwidth
  • NIKHEF → SURFnet: 1 Gbit/s
  • SURFnet Amsterdam → Chicago: 622 Mbit/s
  • ESnet Chicago → Fermilab: 155 Mbit/s ATM
  • But ftp gives us 4 Mbit/s
  • bbftp gives us 25 Mbit/s
  • bbftp processes in parallel: 45 Mbit/s
  • For 2002:
  • NIKHEF → SURFnet: 2.5 Gbit/s
  • SURFnet Amsterdam → Chicago: from 622 Mbit/s to 2.5 Gbit/s optical
  • Chicago → Fermilab: ?, but more ..

(A transfer-time comparison at these rates follows.)
27
ftp
  • ftp gives you 4 Mbit/s to Fermilab
  • bbftp: increased buffers, parallel streams
  • gsiftp: security layer, increased buffers, ..
  • grid_ftp: increased buffers, streams, sockets, fail-over protection, security
  • bbftp → 20 Mbit/s
  • grid_ftp → 25 Mbit/s
  • Multiple ftp sessions in parallel → a factor 2 seen
  • Should get to > 100 Mbit/s?
  • That is about 1 GByte/minute

(A sketch of running transfers in parallel follows.)
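Running several transfers in parallel is simple to script; a minimal sketch, assuming a bbftp-style command-line client is available (its actual options are not covered in the talk, so the command is a placeholder).

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    files = ["file1", "file2", "file3", "file4"]   # outputs queued for transfer

    def transfer(path):
        # placeholder command line; each parallel session gets its own
        # TCP connection, which is where the "factor 2" comes from
        subprocess.run(["bbftp", path], check=True)

    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(transfer, files))   # list() surfaces any failures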

28
SURFnet5 access capacity
29
TA access capacity
[Diagram: transatlantic access capacity — links to New York (Abilene) and Chicago (STAR-LIGHT, STAR-TAP, MREN, ESNET), plus Geneva; capacities 2.5 Gb and 622 Mb]
30
Network load last week
  • Needed for 100 MC CPUs: 10 Mbit/s (200 GB/day)
  • Available to Chicago: 622 Mbit/s
  • Available to FNAL: 155 Mbit/s
  • Needed next year (double capacity): 25 Mbit/s
  • Available to Chicago: 2.5 Gbit/s, a factor 100 more than needed!
  • Available to FNAL: ??

31
New nodes for D0
  • In a 2U 19-inch mounting
  • Dual 1 GHz PIII
  • 1 GByte RAM
  • 40 GByte disk
  • 100 Mbit ethernet
  • Cost: 2k
  • The Dell machines were 4k (tax incl.) → a factor 2 cheaper!
  • Assembly time: about 1 per hour
  • 1 switch: 2.5k (24 ports)
  • 1 rack: 2k (46U high)
  • Requested for 2001: 60k
  • 22 dual-CPU nodes
  • 1 switch
  • 1 19-inch rack

(A quick cost check follows.)
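A quick check that the request covers the parts list (currency unit "k" as on the slide):

    # 22 dual-CPU nodes + 24-port switch + 46U rack, in "k" as on the slide
    total = 22 * 2.0 + 2.5 + 2.0
    print(total)   # 48.5k against the 60k requested, leaving some margin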

32
(No Transcript)
33
The End
Kors Bos
Fermilab, May 23 2001