POW: System optimisations for data management

Slides: 19
Provided by: csc98
1
11 November 2004
POW: System optimisations for data management
2
It says probable user error..
3
Before optimising, you need a workable foundation
  • Hardware: optimisation of use feasible
  • May mean virtualisation is soon needed
  • Software: optimisation is possible
  • Timescales long and rather unpredictable
  • Undermanning a principal cause
  • User applications: who knows?

4
Plenty of previous experience: mostly bad choices
IBM 3494, 95-00 (too small)
TL820, 95-99 (too small)
IBM 3495, 90-95 (no SCSI)
Sony Petasite test, 98 (wrong drive)
GRAU J multimedia test, 98 (don't need it)
5
Current usage of devices is inefficient
  • You have seen JvE's evidence
  • Write: 75% effective duty cycle feasible, because
  • ? s drive assigned but waiting for robotics
  • 30 s pick/place in drive
  • 45 s to first byte
  • 10 GB migration, 300 s writing
  • 45 s rewind/unload
  • ? s wait for robotics to dismount and free drive
  • Drive reassigned to next task
  • Read: 20% or less effective duty cycle seen
  • As above, but typically read 1 file of < 1 GB
  • 120 s overhead, 30 s reading
  • Savage marshalling of reads?
  • Faster robotics?
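
The duty-cycle figures above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the original slides: the function name is illustrative, and the two unknown "? s" robotics waits are assumed to be zero, so the write result is an upper bound.

```python
def duty_cycle(transfer_s, overhead_s):
    """Fraction of the time a drive is assigned that it spends moving data."""
    return transfer_s / (transfer_s + overhead_s)

# Write case from the slide: 30 s pick/place, 45 s to first byte and
# 45 s rewind/unload around 300 s of writing (10 GB migration).
# ASSUMPTION: the two unknown "? s" robotics waits are taken as 0 s.
write_overhead_s = 30 + 45 + 45
write_cycle = duty_cycle(300, write_overhead_s)

# Read case from the slide: 120 s overhead around 30 s of reading.
read_cycle = duty_cycle(30, 120)

print(f"write duty cycle: {write_cycle:.0%}")
print(f"read duty cycle:  {read_cycle:.0%}")
```

With these inputs the write case comes out a little over 70%, roughly in line with the 75% quoted as feasible, and the read case gives the 20% quoted as seen. Any robotics wait lowers both.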

6
Easy things
  • Current devices could do (part/most of) the job
  • 4 GBytes/s write is 120 9940B, 6 MF once
  • 15 PB/yr is:
  • 15 Powderhorns/yr, 4 MF/yr
  • 90 K vols/yr, 9 MF/yr
  • Reading, as done today, 4 GBytes/s
  • 500 drives, 25 MF once
  • Need more space (2 new buildings?) and 30 MF
  • Need 15 MF/yr
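
The sizing above can be roughly re-derived. This is a sketch under stated assumptions, not the author's own calculation: it uses the 9940B and Powderhorn figures quoted elsewhere in this deck (200 GB, 30 Mbytes/s, 5.5 K vols), the duty cycles quoted for today's usage (75% write, 20% read), and decimal units. The function names are illustrative.

```python
import math

DRIVE_RATE_MB_S = 30    # STK 9940B rate quoted in this deck
CART_GB = 200           # STK 9940B cartridge capacity quoted in this deck
LIBRARY_SLOTS = 5500    # STK Powderhorn volume count quoted in this deck

def drives_needed(target_mb_s, duty_cycle):
    """Drives required to sustain target_mb_s at a given effective duty cycle."""
    return math.ceil(target_mb_s / (DRIVE_RATE_MB_S * duty_cycle))

def volumes_per_year(pb_per_year):
    """Cartridges filled per year, assuming decimal units and full cartridges."""
    return math.ceil(pb_per_year * 1_000_000 / CART_GB)

write_drives = drives_needed(4000, 0.75)   # 4 GBytes/s write
read_drives = drives_needed(4000, 0.20)    # 4 GBytes/s read, as done today
vols = volumes_per_year(15)                # 15 PB/yr
libraries = math.ceil(vols / LIBRARY_SLOTS)

print(write_drives, read_drives, vols, libraries)
```

The results land in the same range as the slide's rounded figures without reproducing them exactly. The slide's numbers presumably rest on assumptions (compression, fill factor, rounding) that are not stated here.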

7
Hard things
  • All current drives will be old by 2007
  • IBM 3592, 300 GB, 40 Mbytes/s, already out for 1 yr
  • STK 9940B, 200 GB, 30 Mbytes/s, successor 2005?
  • All current libraries will be old, too
  • IBM 3584, 5 K vols, maybe 100 EPH
  • STK Powderhorn, 5.5 K vols, 190 EPH
  • IBM 3584 obsolete (slow, no passthrough..)

8
Major uncertainties 2007-2022
  • Will it soon all be on disk? 15 PB/yr? 15 MF?
  • If so, use drives designed for backup/recall:
    LTO x
  • LTO use means must replace Powderhorns
  • Or, will tape reads still be very important, if
    not the predominant load?
  • If so, need very many drives: 500-1000
  • What high end commercial devices will be
    available then? This is NOT commodity, no mass
    market
  • Will HP, IBM, STK or Quantum still be making
    drives?
  • What sort?
  • All directed to corporate backup/recall
  • All seem designed to be used via FC fabrics,
    virtualisation.

9
Is the current model tenable?
  • Three layers of separated functions.
  • CPU servers, 10K
  • Disk servers, 1 K at n TB each
  • Tape servers, n x 100 at 100-200 MB/s
  • Tape devices, n x 100 at 100-200 MB/s
  • Tape libraries, 2, 10 year lifetime
  • Tapes, 30 K vols of 500 GB/yr, 3-5 year
    lifetime
  • Current disk servers cannot really drive tape
    units capable of > 100 Mbyte/s
  • Current GBit networking and FC HBAs can't handle
    this either
  • Maybe obliged to use a virtualised tape layer?

10
Sensible to have a flexible start point
  • Install at least two suppliers on the floor
  • This means in practice IBM and STK
  • Both offer their own brand and at least LTO as
    well
  • This means at least 2 types of library
  • ADIC (multimedia, Scalar 1000, bought out GRAU):
    too expensive?
  • IBM library offering is of no use, but can't be
    avoided at present
  • STK SL8500 does not offer IBM 3592, but is
    SDLT/LTO compatible, very fast and very
    extensible
  • And then there is virtualisation: this should be
    tried

11
Why virtualise?
  • Backup notoriously does not share resources and
    uses a large amount of them; virtualising the
    resources used solves this for good. Today backup
    claims:
  • 2 silos of 3590, 12 3590E, 8 9940B
  • 1 STK Timberwolf 9710 library, 7 DLT7000s,
  • CASTOR-based AFS backup
  • Soon the L700 (in Prevessin)..
  • User applications and user disk servers are not
    very efficient, and would struggle to use high
    performance devices
  • There are products that can virtualise all these
    libraries and drives, and the real resources are
    shared (STK VMS Open, Indigo)
  • The real drives would be exploited at maximum
    capability, and..
  • No problematic tape driver maintenance or
    performance mods needed
  • No high speed disk server to drive attach is
    needed
  • An efficient hidden disk buffer resides between
    user and real drives
  • Media and drive migration not a user (i.e. CASTOR)
    concern
  • Multicopy, remote vaulting, policies are
    offered

STK presentation excerpt
12
A New Foundation for our Data Storage Services
  • Common Platform
  • As much off the shelf technology as practical
  • I/F via generally available HBAs
  • Intel processors, GigE interconnect (mesh)
  • StorageTek specific value add
  • Cluster optimized for high data throughput and
    failover
  • High performance write biased disk buffer
  • Software
  • Linux based
  • StorageTek IP
  • Journaling, virtual tape/library, data lifecycle
    and policy management

STK
13
New Storage Services
Common Platform
14
Common Platform Node Architecture
[Diagram labels: Customer Tape applications; 4x Gigabit Ethernet; two Data Movers, each 2x 2Gb FC; Disk Controller; SATA disk buffer, up to 45 TB; 600 MB/s]
15
Scalability via the Common Platform
[Diagram labels: Client Systems; Gigabit Switches (Mesh); Node pair; Disk; Tape drives]
16
Hardware Configurations
Configuration               1-node      2-node        4-node            8-node            24-node
                                        (node pair)   (dual node pair)  (quad-node pair)  (12 node pairs)
Throughput max (MB/s)       300         600           1200              2400              >7200
Throughput max (TB/hr)      1.2         2             4                 >8                >25
Front end FC ports          2           4             8                 16                48
Max virtual devices         512         1024          2048              4192              12,576
B.E. disk FC ports          2           4             8                 16                48
B.E. tape FC ports          2           4             8                 16                48
Minimum usable disk buffer  6 TB        6 TB          12 TB             24 TB             72 TB
Maximum usable disk buffer  45 TB       45 TB         90 TB             180 TB            540 TB
RAS (%)                     99.9        99.999        99.999            99.999            99.999
17
So next
  • Will we get an SL8500?
  • Will we get enough equipment to use the IBM 3592s
    after the initial test?
  • 8 drives (if bought after test) vs 46 9940B
  • Must provide 10% of workload support
  • This means 2,500 cartridges (0.5 MF)
  • This means a much larger 3584, or an IBM
    maintained and separate Powderhorn..

18
Another project, which turned out unexpectedly..