Transcript and Presenter's Notes

Title: les robertson cernit0899 1


1
Offline Computing Farms for LHC
  • Summary of the requirements of the LHC
    experiments
  • Current ideas about components
  • Strawman LHC computing farm
  • Space, power and cooling requirements
  • Questions to ST

2
Units
  • Data storage
  • PetaByte (PB) = 10^15 Bytes
  • TeraByte (TB) = 10^12 Bytes
  • 1 PetaByte =
  • 20,000 Redwood tapes (>3 StorageTek silos)
  • 30,000 Cheetah 36 disks (largest hard disk used
    today)
  • 100,000 dual-sided DVD-RAM disks
  • 1,500,000 sets of the Encyclopaedia Britannica
  • Processors - SPECint95 (SI95)
  • 1 SI95 ≈ 10 CERN-units ≈ 40 MIPS
  • 400 MHz Pentium II ≈ 8 SI95 (CERN benchmark)
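
As a quick cross-check of these equivalences, a minimal Python sketch. The media counts come from the slide; the per-unit capacities it prints are implied by those counts rather than quoted vendor figures.

```python
# Back-of-envelope check of the "1 PetaByte" equivalences above.
# Media counts are taken from the slide; per-unit capacities are derived.
PB = 10**15  # bytes
GB = 10**9   # bytes

media_counts = {
    "Redwood tape": 20_000,
    "Cheetah 36 disk": 30_000,
    "dual-sided DVD-RAM disk": 100_000,
}

for name, count in media_counts.items():
    implied_gb = PB / count / GB
    print(f"1 PB / {count:>7,} {name}s -> ~{implied_gb:.0f} GB per unit")

# Processor rule of thumb from the slide: 1 SI95 ~ 40 MIPS,
# so a 400 MHz Pentium II at ~8 SI95 is roughly 320 MIPS.
print("400 MHz Pentium II ~", 8 * 40, "MIPS")
```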

3
Raw Data requirements - recording via the network to B.513
  • CMS, ATLAS
  • 100 MB/sec
  • 1 PetaByte per year during the proton run
  • LHCb
  • 50 MB/sec
  • 500 TeraBytes per year during the proton run
  • ALICE
  • 1 GigaByte/sec
  • 1 PetaByte per year during the ions run

Current data recording rates: NA48 - 25 MB/sec; COMPASS (next year) - 35
MB/sec (about 30% of the CMS rate and 3% of the ALICE rate)
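
The rates and yearly volumes above are consistent if one assumes an effective data-taking period of roughly 10^7 seconds for the proton run and about 10^6 seconds for the ion run; those run lengths are assumptions for the sketch below, not figures from the slide.

```python
# Sanity check: sustained recording rate x effective run time ~ yearly volume.
# Run lengths (1e7 s proton run, 1e6 s ion run) are assumed, not quoted.
MB, GB, TB = 10**6, 10**9, 10**12

experiments = [
    # (experiment, recording rate in bytes/s, effective seconds per year)
    ("CMS/ATLAS", 100 * MB, 1e7),
    ("LHCb",       50 * MB, 1e7),
    ("ALICE",       1 * GB, 1e6),
]

for name, rate, seconds in experiments:
    print(f"{name:9s}: {rate * seconds / TB:5,.0f} TB per year")
```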
4
Offline Capacity Estimates (i.e. capacity in B.513)
1998 estimates
  • Estimate uses figures from CMS in mid-98; ATLAS
    would be similar, ALICE and LHCb about half the size

5
Evolution of Computing Capacity - SPECint95
[Chart: projected computing capacity in K SPECint95 units, 1997-2005, split
into LHC, COMPASS and Others; scale 0-1,000 K SI95; note: 5K SI95 ≈ 1,100
processors]
6
Long Term Tape Storage Estimates
[Chart: long-term tape storage estimates in TeraBytes, 1995-2006, split into
LHC, COMPASS and current experiments; scale 0-14,000 TB]
7
Basic Principles
  • HEP computing has the property of event
    independence, so we can process any number of
    events in parallel
  • CERN distributed architecture - SHIFT '99
  • simplest components (hyper-sensitive to cost,
    aversion to complication)
  • throughput (before performance)
  • resilience (mostly up all of the time)
  • computing fabric for flexibility, scalability

Mass Computing rather than Supercomputing
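
Event independence is what makes a farm of loosely coupled PCs viable: each event can be processed on its own, with no communication between workers. Below is a minimal sketch of the idea, using a stand-in reconstruct() function (no particular HEP framework is implied).

```python
# Minimal illustration of event independence: each event is processed in
# isolation, so a farm can simply spread events across workers.
from multiprocessing import Pool

def reconstruct(event):
    # Placeholder: a real job would run tracking, calorimetry, etc.
    return {"id": event["id"], "ntracks": len(event["hits"])}

if __name__ == "__main__":
    events = [{"id": i, "hits": list(range(i % 7))} for i in range(1000)]
    with Pool(processes=8) as pool:   # e.g. one worker per CPU in a box
        results = pool.map(reconstruct, events)
    print(len(results), "events reconstructed")
```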
8
Components (i)
  • off-the-shelf, mass market components whenever
    possible
  • Processors
  • low-end PCs (simple boxes intended for the home
    or small office)
  • assembled into clusters and sub-farms
  • according to practical considerations like
  • throughput of first level LAN switch
  • rack capacity
  • power and cooling, ...
  • each cluster comes with a suitable chunk of I/O
    capacity
  • each sub-farm fits in a rack

9
Processor cluster
sub-farm: 36 boxes, 144 CPUs, 5 m²
basic box: four 100 SI95 processors, standard network connection (2 Gbps);
15% of systems configured as I/O servers (disk server, disk-tape mover,
Objy AMS, ...) with an additional connection to the storage network
cluster: 9 basic boxes with a network switch (<10 Gbps)
sub-farm: 4 clusters, with a second-level network switch (<50 Gbps);
one sub-farm fits in one rack
cluster and sub-farm sizing adjusted to fit conveniently the capabilities
of the network switch, racking and power distribution components
lmr - for Monarc study, April 1999
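
The sub-farm figures on this slide follow directly from the basic-box parameters; a small sketch reproducing the arithmetic (the 15% I/O-server share, link speeds and switch limits are taken from the slide):

```python
# Reproduce the cluster / sub-farm arithmetic from the slide.
cpus_per_box         = 4      # four ~100 SI95 processors per basic box
si95_per_cpu         = 100
boxes_per_cluster    = 9
clusters_per_subfarm = 4
io_server_share      = 0.15   # fraction of boxes acting as I/O servers

boxes = boxes_per_cluster * clusters_per_subfarm
cpus  = boxes * cpus_per_box
print(f"sub-farm: {boxes} boxes, {cpus} CPUs, {cpus * si95_per_cpu / 1000:.1f}K SI95")
print(f"I/O servers per sub-farm: ~{round(boxes * io_server_share)}")

# Aggregate link bandwidth vs. switch capacity: 9 x 2 Gbps of box links
# exceed the <10 Gbps first-level switch, so some oversubscription of the
# cluster switch is implicitly assumed.
print("box links per cluster     :", boxes_per_cluster * 2, "Gbps (switch < 10 Gbps)")
print("cluster links per sub-farm:", clusters_per_subfarm * 10, "Gbps (switch < 50 Gbps)")
```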
10
(No Transcript)
11
Components (ii)
  • Disks
  • inexpensive disks - designed for the PC market
  • packaged with a smart controller (probably a PC)
    to provide data caching, data redundancy,
    recovery
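
The "smart controller" idea amounts to doing RAID-style caching and redundancy in software on a commodity PC. As a purely illustrative sketch (the slide does not commit to a particular RAID level or implementation), single-parity recovery looks like this:

```python
# Toy illustration of the redundancy/recovery a software RAID controller
# provides: one XOR parity block per stripe lets any single lost data
# block be reconstructed. Illustrative only.
from functools import reduce

def parity(blocks):
    """XOR equal-length blocks into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

stripe = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disks
p = parity(stripe)                     # parity block on a fourth disk

lost_index = 1                         # pretend disk 1 failed
survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
recovered = parity(survivors + [p])    # XOR of survivors with the parity
assert recovered == stripe[lost_index]
print("recovered block:", recovered)
```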

12
disk sub-system
rack: integral number of arrays, with first-level network switches.
In the main model, half-height 3.5" disks are assumed, 16 per shelf of a
19" rack. With space for 18 shelves in the rack (two-sided), half of the
shelves are populated with disks, the remainder housing controllers,
network switches and power distribution.
array: two RAID controllers, dual-attached disks; controllers connect to
the storage network; sizing of the array subject to the components available
disk size restricted to give a disk count which matches the number of
processors (and thus the number of active processes)
lmr - for Monarc study, April 1999
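
A short sketch of the rack arithmetic implied by these figures: 9 disk shelves of 16 drives give 144 disks per rack, matching the 144 processors of a sub-farm rack, which is the point of the disk-count constraint above.

```python
# Disk-rack arithmetic from the slide: 18 shelf positions (two-sided 19" rack),
# half of them populated with 16 half-height 3.5" drives each; the rest holds
# controllers, network switches and power distribution.
shelves_total   = 18
disk_shelves    = shelves_total // 2
disks_per_shelf = 16

disks_per_rack = disk_shelves * disks_per_shelf
print("disks per rack:", disks_per_rack)   # 144, matching the 144 CPUs of a
                                           # processor sub-farm rack
```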
13
Components (iii)
  • Tapes
  • a mass market solution will probably NOT be
    available
  • possibly we shall still be using robots like the
    ones installed today
  • General disclaimer
  • these are just estimates
  • how the technology evolves is only one
    component; the market decides the capacity
    of the products

14
[Diagram: CMS Offline Farm at CERN circa 2006 - 0.5 M SPECint95 in 5,600
processors / 1,400 boxes / 160 clusters / 40 sub-farms; 0.5 PByte of disk in
5,400 disks / 340 arrays; about 100 tape drives; farm network ~960 Gbps;
LAN-WAN routers 250 Gbps; 0.8 Gbps from the DAQ; individual storage-network
links of 0.8 to 24 Gbps]
lmr - for Monarc study, April 1999
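
The farm totals in the diagram can be roughly reconstructed from the per-box and per-cluster figures of the earlier slides; the sketch below does that, treating the rounding up to the quoted 160 clusters and 40 sub-farms as the slide's own choice of round numbers.

```python
# Rough reconstruction of the circa-2006 CMS farm totals from earlier slides.
import math

processors           = 5_600
cpus_per_box         = 4
boxes_per_cluster    = 9
clusters_per_subfarm = 4
si95_per_cpu         = 100

boxes    = processors // cpus_per_box                  # 1,400 boxes
clusters = math.ceil(boxes / boxes_per_cluster)        # ~156 (quoted: 160)
subfarms = math.ceil(clusters / clusters_per_subfarm)  # ~39  (quoted: 40)
capacity = processors * si95_per_cpu / 1e6             # ~0.56 M SI95 (quoted: 0.5 M)

disks, disk_pb = 5_400, 0.5
print(boxes, "boxes,", clusters, "clusters,", subfarms, "sub-farms")
print(f"{capacity:.2f} M SI95; ~{disk_pb * 1e15 / disks / 1e9:.0f} GB per disk")
```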
15
Layout and power - CMS or ATLAS
[Floor plan, 24 m x 18 m: tape area 120 m² at 14 kW; main equipment area at
245 kW; totals about 400 kW and 370 m²]
lmr - for Monarc study, April 1999
16
Caution
  • These are only estimates
  • requirements
  • technology
  • http://nicewww.cern.ch/les/pasta/welcome.html
  • http://nicewww.cern.ch/omartin/nt3-99-ohm.html

17
Space available in B.513
  • Total space in technical rooms in B.513
  • Computer room: 1,400 m²
  • Tape vault: 1,100 m²
  • MG room: 200 m²
  • Total: 2,700 m²
  • Estimate for LHC: 1,600 m² of cleared space

18
Questions
  • Power
  • Required: about 2 MW, surviving short cuts (UPS)
  • what infrastructure needs to be changed?
  • power distribution within the building, rooms?
  • what about backup generators?
  • cost estimates?
  • Cooling (52-weeks-per-year usage)
  • How much cooling capacity is required, how much
    exists?
  • Power requirements for cooling?
  • Advice on packaging of the equipment (e.g. should
    we buy cards in racks rather than use
    flow-through office systems on shelves?)
  • Cooling in the B.513 sous-sol?
  • Cost estimates?

19
Questions (ii)
  • smoke/fire detection
  • current discussion between IT (Dave Underhill)
    and ST, TIS
  • is the current method sufficient?
  • what is the best practice for computer halls?
  • can smoke/heat sources be localised in an open
    hall?
  • does it make sense to tie detection to power?
  • What questions should we be asking?

20
Each silo has 6,000 slots, each of which can hold
a 50 GB cartridge => theoretical capacity 1.2
PetaBytes
21
(No Transcript)
22
(No Transcript)
23
About 250 PCs, with 500 Pentium processors, are
currently installed for offline physics data
processing