Title: Tape Storage Issues
1 Tape Storage Issues
Bernd Panzer-Steindel, LCG Fabric Area Manager, CERN/IT
2 Concentrate on linear tape technology, not helical scan (Exabyte, AIT, ...)
Today's choices:
- IBM 3592 drives → 40 MB/s and 300 GB cartridges, 25 KCHF and 0.7 CHF/GB
- StorageTek 9940B drives → 30 MB/s and 200 GB cartridges, 35 KCHF and 0.6 CHF/GB
- LTO consortium (HP, IBM, Certance), LTO-2 drives → 20 MB/s and 200 GB cartridges, 15 KCHF and 0.4 CHF/GB (decreased by a factor of 2 during the last 18 months)
LTO-3 drives have been available for about 3 weeks: 40-60 MB/s and 400 GB cartridges → 0.4 CHF/GB. The standalone drive costs about 6 KCHF; modified robotic drives become available 3-6 months later, and the modifications (fibre channel, extra mechanics, etc.) add another 10 KCHF to the cost.
The error on the drive costs is up to 50%; media costs vary by 10%.
3 There are very few choices in the large robotic tape storage area.
- a 6500-cartridge silo including robotics costs 1 MCHF ± 30%
  (the best and most flexible on the market is currently probably the STK 8500 tape library)
- an old STK Powderhorn silo (5500 slots) costs about 200 KCHF, but does not support LTO or the new IBM drives
- to be considered
  - single large installation or distributed, separate installations
  - locality of data, load balancing for reading is defined at writing time
  - regular physical movement of tapes is not really an option
  - the pass-through mechanism between silos still has locality restrictions
    → one move of a tape = one mount of a tape
Prices of robots and drives have large error margins (50%), because these are non-commodity products and depend heavily on the negotiations with the vendors (level of discount).
4 Time matrix of drives

Vendor  Drive  Cartridge  Speed     Availability
STK     9940B   200 GB     30 MB/s  today
STK     STKA    500 GB    120 MB/s  mid 2005
STK     STKB   1000 GB    240 MB/s  beg. 2008 ?
IBM     3592A   300 GB     40 MB/s  today
IBM     3592B   600 GB     80 MB/s  beg. 2006
LTO     LTO2    200 GB     20 MB/s  today
LTO     LTO3    400 GB     60 MB/s  mid 2005
LTO     LTO4    800 GB    120 MB/s  beg. 2008

Boundaries
- would like to see the drive for about one year in the market
- 5 years max lifetime for a drive technology
- one year overlap of old and new tape drive installations
- have a new service ready for 2007
→ not easy to achieve
5 Copying of tape data
If one assumes a 4-year effective lifetime of tape drives and linear data growth, then the average lifetime of tape data is less than 3 years (≈ disk lifetime).
Cost of the copy, example:
Assume 10 PB of data, new cartridges cost 0.2 CHF/GB, the copy needs to be done within 300 days, and drive performance is 25 MB/s (inefficiencies included).
Gain: double-density tapes, 10 PB free space → 2 MCHF
Loss: 10 PB in 300 days → 390 MB/s copy speed
- 16 input drives (model A) + 16 output drives (model B): 1.3 MCHF
- 1 FTE, 48h disk buffer, extra tapes/slots: 0.3 MCHF
→ Gain 2 MCHF, Loss 1.6 MCHF
Copy is mandatory at the end of a tape drive technology cycle. In between, it might be cost-effective to re-use the already existing tapes at double density. There are some uncertainties in the calculations and it might not be cost-effective.
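The sizing in the copy example can be reproduced with a few lines of arithmetic. The sketch below is only an illustration of the slide's numbers: the function name `copy_campaign` and the use of decimal units (1 PB = 1e9 MB = 1e6 GB) are assumptions, not part of the original talk.

```python
import math

def copy_campaign(data_pb, days, drive_mb_s, media_chf_per_gb):
    """Rough sizing of a tape-to-tape copy (hypothetical helper, decimal units)."""
    seconds = days * 86400
    # Sustained speed needed to copy all data within the time window
    copy_speed_mb_s = data_pb * 1e9 / seconds
    # Drives needed on each side (reading old tapes, writing new ones)
    drives_per_side = math.ceil(copy_speed_mb_s / drive_mb_s)
    # Value of the space freed on the old cartridges, priced as new media
    gain_chf = data_pb * 1e6 * media_chf_per_gb
    return copy_speed_mb_s, drives_per_side, gain_chf

speed, drives, gain = copy_campaign(10, 300, 25, 0.2)
print(f"copy speed {speed:.0f} MB/s, {drives} input + {drives} output drives, "
      f"gain {gain / 1e6:.1f} MCHF")
```

With the slide's inputs this gives about 386 MB/s (rounded to 390 MB/s on the slide), 16 drives per side and a gain of 2 MCHF.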
6 Comparing disk and tape storage
A few example calculations (a bit provocative)

Need 1 PB of storage with 1 GB/s performance (today)
- Tape
  - 50 tape drives with 35 MB/s at 40 KCHF each (including infrastructure), 50% efficiency
  - 5000 cartridges, 200 GB, 0.5 CHF/GB
  - 5000 slots in a robotic silo, 1 MCHF
  → total 3.5 MCHF
- Disk (1): NAS servers with 2 TB each, RAID5, 3 CHF/GB → 3 MCHF, 500 servers, 50 GB/s
- Disk (2): NAS servers, each with 100 x 200 GB USB/Firewire disks, RAID5, 1.5 CHF/GB
  65 servers + 6500 disks → 325 KCHF + 1.5 MCHF = 1.8 MCHF (6.5 GB/s)
  (feasibility study under way)

Need 10 PB with 5 GB/s
- Tape: 10 MCHF (silos + robotics) + 5 MCHF (tapes) + 10 MCHF (drives) = 25 MCHF
- Disk (1), NAS servers: 30 MCHF
- Disk (2), NAS servers with USB/Firewire disks: 18 MCHF

But the cost of complexity is NOT included and the cost uncertainties are large. This just shows that disk and tape storage costs are already comparable.
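The 1 PB comparison can be checked with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the 5 KCHF per server in the second disk option is inferred from the slide's 325 KCHF for 65 servers rather than quoted directly.

```python
PB_IN_GB = 1_000_000  # decimal units, as on the slides

def tape_cost_chf(n_drives, drive_chf, capacity_gb, media_chf_per_gb, silo_chf):
    """Drives + media + robotic silo (hypothetical helper)."""
    return n_drives * drive_chf + capacity_gb * media_chf_per_gb + silo_chf

def tape_throughput_mb_s(n_drives, drive_mb_s, efficiency):
    """Aggregate tape throughput after applying the drive efficiency."""
    return n_drives * drive_mb_s * efficiency

tape = tape_cost_chf(50, 40_000, PB_IN_GB, 0.5, 1_000_000)
tape_speed = tape_throughput_mb_s(50, 35, 0.5)
disk1 = PB_IN_GB * 3.0                 # NAS servers at 3 CHF/GB
disk2 = 65 * 5_000 + PB_IN_GB * 1.5    # assumed 5 KCHF per server + disks at 1.5 CHF/GB

print(f"tape   {tape / 1e6:.2f} MCHF at {tape_speed:.0f} MB/s")
print(f"disk 1 {disk1 / 1e6:.2f} MCHF")
print(f"disk 2 {disk2 / 1e6:.2f} MCHF")
```

The tape option comes out at 3.5 MCHF with roughly 0.9 GB/s of effective throughput, which the slide rounds to the 1 GB/s requirement.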
7 Disk space versus tape space
- Application access pattern dependent
- Re-use of cached files
  - if low (e.g. random sampling over large data sets, reprocessing production)
    minimum disk space = average aggregate drive speed x 24h (e.g. 100 MB/s = 10 TB)
- Correlated to the efficient use of tape drives at speeds higher than 50 MB/s
- Number of I/O streams from batch jobs
  - ranges from sequential access of large files to sequential access of small files, hopping through a large number
  - random access on disks
- Consequence: moving from a fully distributed, averaged access to a more synchronous access with guaranteed stream speeds
  - much more sophistication needed in file system, disk server and HSM tuning and organization
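The rule of thumb for the minimum disk space (aggregate drive speed held for 24 hours) is simple to evaluate. A minimal sketch, with a hypothetical function name and decimal units assumed:

```python
def min_disk_cache_tb(aggregate_drive_mb_s, hours=24):
    """Disk space needed to buffer `hours` of tape traffic, in TB (1 TB = 1e6 MB)."""
    return aggregate_drive_mb_s * hours * 3600 / 1e6

print(f"{min_disk_cache_tb(100):.2f} TB")
```

For 100 MB/s this yields 8.64 TB, which the slide rounds to 10 TB.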
8 Mass storage performance
First set of parameters defining the access performance of an application to the mass storage system:
- number of running batch jobs
- internal organization of jobs (exp.) (e.g. just request file before usage)
- priority policies (between exp. and within exp.)
- HSM scheduling implementation
- speed of the robot
- distribution of tapes in silos (at the time of writing the data)
- HSM database performance
- tape drive speed
- tape drive efficiency
- disk server filesystem, OS, driver
- HSM load balancing mechanism
- monitoring
- fault tolerance
- disk server optimization
- data layout on disk
- exp. policy
- access patterns (exp.)
- performance
- overall file size
- bugs and features
9 Tape storage
Example: influence on batch system
Analyzing the Lxbatch inefficiency trends:
- wait time due to tape queues
- stager hits the limit on the number of files
10 Example: File sizes
Average file size on disk

Experiment  Average file size
ATLAS        43 MB
ALICE        27 MB
CMS          67 MB
LHCb        130 MB
COMPASS     496 MB
NA48         93 MB

large amounts < 10 MB
11 Analytical calculation of tape drive efficiencies
[Plot: tape drive efficiency versus file size (MB)]
Parameters:
- average number of files per mount: 3 (large number of batch jobs requesting files one-by-one)
- tape mount time: 120 s
- file overhead: 4.4 s
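The slide does not give the formula behind the plot, so the following is only one plausible model, offered as a sketch: efficiency is taken as pure transfer time divided by total time per file, where the mount time is shared among the files read per mount. The drive speed of 30 MB/s (the 9940B figure quoted earlier) and the function name are assumptions.

```python
def tape_drive_efficiency(file_size_mb, drive_mb_s=30.0, mount_s=120.0,
                          files_per_mount=3.0, file_overhead_s=4.4):
    """Fraction of drive time spent transferring data (assumed model)."""
    transfer_s = file_size_mb / drive_mb_s
    # Per-file overhead: a share of the mount time plus the fixed file overhead
    overhead_s = mount_s / files_per_mount + file_overhead_s
    return transfer_s / (transfer_s + overhead_s)

for size_mb in (10, 100, 1000, 10000):
    print(f"{size_mb:>6} MB  {100 * tape_drive_efficiency(size_mb):5.1f} %")
```

Under these assumptions a 100 MB file reaches only about 7% efficiency, and files of several GB are needed to get above 80%, which illustrates why small files are so costly for tape.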
12 Storage performance
- combination of problems, example: small files + randomness of access
- possible solutions
  - concatenation of files on application or MSS level
  - extra layer of disk cache, vendor or home-made
  - hierarchy of fast and slow access tape drives
  - very large amounts of disk space
  - ...
Currently quite some effort is put into the analysis of all the available monitoring information to understand much better the influence of the different parameters on the overall performance. The goal is to be able to calculate the cost of data transfers from tape to the application → CHF per MB/s for a volume of X TB.
13 Summary
- cost calculations are difficult and are site specific
- access patterns are very important, but hard to predict
- disk versus tape needs more investigation
- the ratio disk space / tape space depends on the access patterns
- storage performance efficiencies depend on a lot of parameters
- more work needed, close collaboration between experiments (computing model implications) and the IT side (Tiers and CERN)