Storage: Alternate Futures - PowerPoint PPT Presentation


1
Storage: Alternate Futures
Yotta Zetta Exa Peta Tera Giga Mega Kilo
  • Jim Gray
  • Microsoft Research
  • Research.Microsoft.com/Gray/talks
  • NetStore 99
  • Seattle WA, 14 Oct 1999

2
Acknowledgments: Thank You!!
  • Dave Patterson
  • Convinced me that processors are moving to the
    devices.
  • Kim Keeton and Erik Riedel
  • Showed that many useful subtasks can be done by
    disk-processors, and quantified execution
    interval
  • Remzi Dusseau
  • Re-validated Amdahl's laws

3
Outline
  • The Surprise-Free Future (5 years)
  • 500 mips cpus for $10
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • 10 GBps SANs are ubiquitous
  • 1 GBps WANs are ubiquitous
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk

4
The Surprise-free Storage Future
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • Drives shrink one quantum
  • Standard IO
  • 10 GBps SANs are ubiquitous
  • 1 Gbps WANs are ubiquitous
  • 5 bips cpus for $1K and 500 mips cpus for $10

5
1 Gb RAM Chips
  • Moving to 256 Mb chips now
  • 1 Gb will be standard in 5 years, 4 Gb will
    be premium product.
  • Note
  • 256 Mb = 32 MB: the smallest memory
  • 1 Gb = 128 MB: the smallest memory

6
MAD at 50 Gbpsi
  • MAD: Magnetic Areal Density
  • 3-10 Gbpsi in products
  • 20 Gbpsi in lab
  • 50 Gbpsi = paramagnetic limit,
    but people have ideas.
  • Capacity rise 10x in 5 years (conservative)
  • Bandwidth rise 4x in 5 years (density + rpm)
  • Disk: 50 GB to 500 GB,
  • 60-80 MBps
  • $1k/TB
  • 15 minute to 3 hour scan time.
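
A quick check of why scans get slower: scan time is capacity divided by bandwidth, so a 10x capacity rise against a 4x bandwidth rise stretches the scan by 2.5x. This is a minimal sketch. The function name is illustrative, and the 15 MBps starting bandwidth is an assumption borrowed from the Disk vs Tape slide; the range quoted on this slide depends on which drives are compared.

```python
GB = 1e9
MB = 1e6

def scan_seconds(capacity_bytes, bandwidth_bytes_per_s):
    """Time to read the whole device sequentially."""
    return capacity_bytes / bandwidth_bytes_per_s

# Growth factors quoted on the slide.
capacity_growth = 10.0
bandwidth_growth = 4.0
scan_growth = capacity_growth / bandwidth_growth

# Assumed example drives: 50 GB at 15 MBps now, 500 GB at 60 MBps later.
now = scan_seconds(50 * GB, 15 * MB)
later = scan_seconds(500 * GB, 60 * MB)

print(f"scan time grows {scan_growth:.1f}x")
print(f"now: {now / 60:.0f} min, later: {later / 3600:.1f} hr")
```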

7
Disk vs Tape
  • Disk
  • 47 GB
  • 15 MBps
  • 10 ms seek time
  • 5 ms rotate time
  • $9/GB for drive, $3/GB for ctlrs/cabinet
  • 4 TB/rack
  • Tape
  • 40 GB
  • 5 MBps
  • 30 sec pick time
  • Many minute seek time
  • $5/GB for media, $10/GB for drive + library
  • 10 TB/rack

Guesstimates: Cern: 200 TB on 3480 tapes; 2 col = 50 GB; rack = 1 TB = 20 drives
The price advantage of tape is narrowing, and
the performance advantage of disk is growing.

8
System On A Chip
  • Integrate Processing with memory on one chip
  • chip is 75% memory now
  • 1 MB cache >> 1960 supercomputers
  • 256 Mb memory chip is 32 MB!
  • IRAM, CRAM, PIM, projects abound
  • Integrate Networking with processing on one chip
  • system bus is a kind of network
  • ATM, FiberChannel, Ethernet,.. Logic on chip.
  • Direct IO (no intermediate bus)
  • Functionally specialized cards shrink to a chip.

9
500 mips System On A Chip for $10
  • 486 now $7; 233 MHz ARM for $10 system on a
    chip http://www.cirrus.com/news/products99/news-product14.html
    AMD/Celeron 266 ~ $30
  • In 5 years, today's leading edge will be
  • System on chip (cpu, cache, mem ctlr, multiple
    IO)
  • Low cost
  • Low-power
  • Have integrated IO
  • High end is 5 BIPS cpus

10
Standard IO in 5 Years
  • Probably
  • Replace PCI with something better; will still
    need a mezzanine bus standard
  • Multiple serial links directly from processor
  • Fast (10 GBps/link) for a few meters
  • System Area Networks (SANs) ubiquitous (VIA
    morphs to SIO?)

11
Ubiquitous 10 GBps SANs in 5 years
  • 1 Gbps Ethernet is a reality now.
  • Also FiberChannel, MyriNet, GigaNet, ServerNet,
    ATM, ...
  • 10 Gbps x4 WDM deployed now (OC192)
  • 3 Tbps WDM working in lab
  • In 5 years, expect 10x, progress is astonishing
  • Gilder's law: Bandwidth grows 3x/year
    http://www.forbes.com/asap/97/0407/090.htm

[Figure labels: 5 MBps, 20 MBps, 40 MBps, 80 MBps, 120 MBps (1 Gbps), 1 GBps]

12
Thin Clients mean HUGE servers
  • AOL hosting customer pictures
  • Hotmail allows 5 MB/user, 50 M users
  • Web sites offer electronic vaulting for SOHO.
  • IntelliMirror: replicate client state on server
  • Terminal server: timesharing returns
  • ... Many more.

13
Standard Storage Metrics
  • Capacity
  • RAM: MB and $/MB: today at 512 MB and $3/MB
  • Disk: GB and $/GB: today at 50 GB and $10/GB
  • Tape: TB and $/TB: today at 50 GB and
    $12k/TB (nearline)
  • Access time (latency)
  • RAM: 100 ns
  • Disk: 10 ms
  • Tape: 30 second pick, 30 second position
  • Transfer rate
  • RAM: 1 GB/s
  • Disk: 15 MB/s - - - Arrays can go to 1 GB/s
  • Tape: 5 MB/s - - - striping is
    problematic, but works

14
New Storage Metrics: Kaps, Maps, SCAN?
  • Kaps: How many kilobyte objects served per second
  • The file server, transaction processing metric
  • This is the OLD metric.
  • Maps: How many megabyte objects served per second
  • The Multi-Media metric
  • SCAN: How long to scan all the data
  • The data mining and utility metric
  • And
  • Kaps/$, Maps/$, TBscan/$
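
To make the three metrics concrete, here is a rough sketch that derives them for a single disk. It assumes the figures from the Disk vs Tape slide (47 GB, 15 MBps, 10 ms seek, 5 ms rotate) and a simple model where every access pays one seek, one rotate, and the transfer time. The `Disk` class and its method names are illustrative, not from the talk, and queuing is ignored.

```python
from dataclasses import dataclass

KB = 1e3
MB = 1e6
GB = 1e9

@dataclass
class Disk:
    capacity_bytes: float
    bandwidth_bytes_per_s: float
    seek_s: float
    rotate_s: float

    def accesses_per_second(self, object_bytes):
        """Objects served per second: each access pays seek + rotate + transfer."""
        service_s = (self.seek_s + self.rotate_s
                     + object_bytes / self.bandwidth_bytes_per_s)
        return 1.0 / service_s

    def scan_seconds(self):
        """Time to read the whole disk sequentially."""
        return self.capacity_bytes / self.bandwidth_bytes_per_s

disk = Disk(capacity_bytes=47 * GB, bandwidth_bytes_per_s=15 * MB,
            seek_s=0.010, rotate_s=0.005)

kaps = disk.accesses_per_second(1 * KB)   # kilobyte objects per second
maps = disk.accesses_per_second(1 * MB)   # megabyte objects per second
scan = disk.scan_seconds()                # seconds to scan everything

print(f"Kaps = {kaps:.0f}, Maps = {maps:.1f}, SCAN = {scan / 60:.0f} min")
```

Under this model Kaps is dominated by arm movement while Maps is dominated by transfer, which is why the two metrics reward different devices.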

15
For the Record (good 1999 devices packaged in
system: http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)
[Table not recovered; surviving label: X 100]
Tape is 1 TB with 4 DLT readers at 5 MBps each.

16
For the Record (good 1999 devices packaged in
system: http://www.tpc.org/results/individual_results/Compaq/compaq.5500.99050701.es.pdf)
[Table not recovered]
Tape is 1 TB with 4 DLT readers at 5 MBps each.

17
The Access Time Myth
  • The Myth: seek or pick time dominates
  • The reality: (1) Queuing dominates
  • (2) Transfer dominates BLOBs
  • (3) Disk seeks often short
  • Implication: many cheap servers better than
    one fast expensive server
  • shorter queues
  • parallel transfer
  • lower cost/access and cost/byte
  • This is obvious for disk arrays
  • This is even more obvious for tape arrays

[Figure labels: access time split into Seek, Rotate, Transfer, and Wait]

18
Storage Ratios Changed
  • DRAM/disk media price ratio changed
  • 1970-1990 100:1
  • 1990-1995 10:1
  • 1995-1997 50:1
  • today ~ $0.1/MB disk, $3/MB DRAM: 30:1
  • 10x better access time
  • 10x more bandwidth
  • 4,000x lower media price

19
Data on Disk Can Move to RAM in 8 years
[Figure labels: 30:1, 6 years]
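
One possible reading of the 30:1 and 6 years labels, as a rough sketch: if RAM must get 30x cheaper to match today's disk price, and that takes 6 years at a steady rate, the implied yearly improvement follows from a root. The steady-rate assumption and both helper names are mine, not from the slide.

```python
import math

def yearly_factor(price_ratio, years):
    """Constant yearly price improvement that closes price_ratio in the given years."""
    return price_ratio ** (1.0 / years)

def years_to_close(price_ratio, factor_per_year):
    """Years needed to close price_ratio at a constant yearly improvement."""
    return math.log(price_ratio) / math.log(factor_per_year)

# Assumed inputs: the 30:1 DRAM/disk ratio and the 6 year label.
factor = yearly_factor(30, 6)
print(f"RAM must get about {factor:.2f}x cheaper per year")
print(f"check: {years_to_close(30, factor):.1f} years to close 30:1")
```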
20
Outline
  • The Surprise-Free Future (5 years)
  • 500 mips cpus for $10
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • 10 GBps SANs are ubiquitous
  • 1 GBps WANs are ubiquitous
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk.

21
The (absurd?) consequences
  • 256-way NUMA?
  • Huge main memories: now 500 MB - 64 GB memories;
    then 10 GB - 1 TB memories
  • Huge disks: now 5-50 GB 3.5" disks; then 50-500
    GB disks
  • Petabyte storage farms
  • (that you can't back up or restore).
  • Disks >> tapes
  • Small disks: one platter, one inch, 10 GB
  • SAN convergence: 1 GBps point to point is easy
  • 1 Gb RAM chips
  • MAD at 50 Gbpsi
  • Drives shrink one quantum
  • 10 GBps SANs are ubiquitous
  • 500 mips cpus for $10
  • 5 bips cpus at high end

22
The Absurd? Consequences
  • Further segregate processing from storage
  • Poor locality
  • Much useless data movement
  • Amdahl's laws: bus: 10 B/ips, io: 1 b/ips

[Figure labels: Processors (1 Tips), Disks (100 TB), 10 TBps, 100 GBps]

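
The bandwidth labels follow from the Amdahl ratios on this slide (10 bytes of bus traffic and 1 bit of IO per instruction per second). A small sketch of that arithmetic, with an illustrative function name; note that 1 bit per instruction at 1 Tips works out to 125 GBps, which the figure appears to round to 100 GBps.

```python
def amdahl_balance(ips, bus_bytes_per_ips=10, io_bits_per_ips=1):
    """Bus and IO bandwidth (bytes/s) a balanced system needs for a given instruction rate."""
    return {
        "bus_bytes_per_s": ips * bus_bytes_per_ips,
        "io_bytes_per_s": ips * io_bits_per_ips / 8,
    }

balance = amdahl_balance(1e12)  # 1 Tips of aggregate processing
print(f"bus: {balance['bus_bytes_per_s'] / 1e12:.0f} TBps")
print(f"io:  {balance['io_bytes_per_s'] / 1e9:.0f} GBps")
```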
23
Storage Latency: How Far Away is the Data?
[Figure: relative access time, with a travel analogy]
  • Registers: 1 (My Head, 1 min)
  • On Chip Cache: 2 (This Room)
  • On Board Cache: 10 (This Hotel, 10 min)
  • Memory: 100 (Olympia, 1.5 hr)
  • Disk: 10^6 (Pluto, 2 Years)
  • Tape / Optical Robot: 10^9 (Andromeda, 2,000 Years)

24
Consequences
  • AutoManage Storage
  • Sixpacks (for arm-limited apps)
  • Raid5 -> Raid10
  • Disk-to-disk backup
  • Smart disks

25
Auto Manage Storage
  • 1980 rule of thumb
  • A DataAdmin per 10GB, SysAdmin per mips
  • 2000 rule of thumb
  • A DataAdmin per 5TB
  • SysAdmin per 100 clones (varies with app).
  • Problem
  • 5 TB is $60k today, $10k in a few years.
  • Admin cost >> storage cost???
  • Challenge
  • Automate ALL storage admin tasks

26
The Absurd Disk
  • 2.5 hr scan time (poor sequential access)
  • 1 aps / 5 GB (VERY cold data)
  • It's a tape!

[Figure labels: 1 TB, 100 MB/s, 200 Kaps]

27
Extreme case 1 TB disk: Alternatives
  • Use all the heads in parallel
  • Scan in 30 minutes
  • Still one Kaps/5GB
  • Use one platter per arm
  • Share power/sheetmetal
  • Scan in 30 minutes
  • One KAPS per GB

[Figure labels: all heads in parallel: 1 TB, 500 MB/s, 200 Kaps; one platter per arm: 200 GB each, 500 MB/s, 1,000 Kaps]

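
The numbers for the absurd 1 TB disk and its two alternatives can be rechecked with a few lines. This is a sketch under the figures shown on these slides (100 MB/s and 200 accesses per second for one arm, 500 MB/s with parallel heads, 1,000 accesses per second with five independent arms); the helper names are illustrative. The exact results are 2.8 hours and 33 minutes, which the slides round to 2.5 hours and 30 minutes.

```python
GB = 1e9
MB = 1e6

def scan_hours(capacity_gb, mb_per_s):
    """Hours to read the whole device sequentially."""
    return capacity_gb * GB / (mb_per_s * MB) / 3600

def gb_per_access(capacity_gb, accesses_per_s):
    """GB of stored data behind each access per second (higher means colder data)."""
    return capacity_gb / accesses_per_s

absurd = (scan_hours(1000, 100), gb_per_access(1000, 200))
all_heads = (scan_hours(1000, 500), gb_per_access(1000, 200))
arm_per_platter = (scan_hours(1000, 500), gb_per_access(1000, 1000))

for name, (hours, gb) in [("one arm", absurd),
                          ("all heads in parallel", all_heads),
                          ("one platter per arm", arm_per_platter)]:
    print(f"{name}: scan {hours * 60:.0f} min, 1 access/s per {gb:.0f} GB")
```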
28
Drives shrink (1.8", 1")
  • 150 kaps for 500 GB is VERY cold data
  • 3 GB/platter today, 30 GB/platter in 5 years.
  • Most disks are ½ full
  • TPC benchmarks use 9 GB drives (need arms or
    bandwidth).
  • One solution: smaller form factor
  • More arms per GB
  • More arms per rack
  • More arms per Watt

29
Prediction: 6-packs
  • One way or another, when disks get huge
  • Will be packaged as multiple arms
  • Parallel heads give bandwidth
  • Independent arms give bandwidth & aps
  • Package shares power, package, interfaces

30
Stripes, Mirrors, Parity (RAID 0, 1, 5)
  • RAID 0: Stripes
  • bandwidth
  • RAID 1: Mirrors, Shadows, ...
  • Fault tolerance
  • Reads faster, writes 2x slower
  • RAID 5: Parity
  • Fault tolerance
  • Reads faster
  • Writes 4x or 6x slower.

[Figure: block placement across disks]
RAID 0 (stripes):  0,3,6,..    1,4,7,..    2,5,8,..
RAID 1 (mirrors):  0,1,2,..    0,1,2,..
RAID 5 (parity):   0,2,P2,..   1,P1,4,..   P0,3,5,..

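
The block placements in the figure can be generated mechanically. The sketch below reproduces those three layouts for a three-disk array; the function names are illustrative, and the RAID 5 parity rotation (parity starts on the last disk and moves one disk left per stripe) is simply the rule that matches the figure, not the only rotation in use.

```python
def raid0_layout(n_disks, n_blocks):
    """Stripe blocks round-robin across the disks."""
    disks = [[] for _ in range(n_disks)]
    for block in range(n_blocks):
        disks[block % n_disks].append(str(block))
    return disks

def raid1_layout(n_copies, n_blocks):
    """Every disk holds a full copy of every block."""
    return [[str(block) for block in range(n_blocks)] for _ in range(n_copies)]

def raid5_layout(n_disks, n_stripes):
    """One parity block per stripe; the parity disk rotates from last to first."""
    disks = [[] for _ in range(n_disks)]
    block = 0
    for stripe in range(n_stripes):
        parity_disk = (n_disks - 1 - stripe) % n_disks
        for disk in range(n_disks):
            if disk == parity_disk:
                disks[disk].append(f"P{stripe}")
            else:
                disks[disk].append(str(block))
                block += 1
    return disks

for name, layout in [("RAID 0", raid0_layout(3, 9)),
                     ("RAID 1", raid1_layout(2, 3)),
                     ("RAID 5", raid5_layout(3, 3))]:
    print(name, ["," .join(disk) for disk in layout])
```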
31
RAID 10 (stripes of mirrors) wins: "wastes space,
saves arms"
  • RAID 5
  • Performance
  • 225 reads/sec
  • 70 writes/sec
  • Write
  • 4 logical IO,
  • 2 seek + 1.7 rotate
  • SAVES SPACE
  • Performance degrades on failure
  • RAID 1
  • Performance
  • 250 reads/sec
  • 100 writes/sec
  • Write
  • 2 logical IO
  • 2 seek + 0.7 rotate
  • SAVES ARMS
  • Performance improves on failure
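
Where the 4 versus 2 logical IOs come from: a small RAID 5 write must read the old data and old parity before it can write the new data and new parity, while a mirror just writes both copies. The sketch below counts those IOs on in-memory blocks. The class names are illustrative, and it models only the IO count and the XOR parity update, not seek or rotation time.

```python
from functools import reduce

def xor(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

class Raid5Stripe:
    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = reduce(xor, self.data)
        self.io_count = 0

    def small_write(self, index, new_block):
        old_block = self.data[index]      # IO 1: read old data
        self.io_count += 1
        old_parity = self.parity          # IO 2: read old parity
        self.io_count += 1
        # New parity cancels the old data and folds in the new data.
        new_parity = xor(xor(old_parity, old_block), new_block)
        self.data[index] = new_block      # IO 3: write new data
        self.io_count += 1
        self.parity = new_parity          # IO 4: write new parity
        self.io_count += 1

class MirrorPair:
    def __init__(self, blocks):
        self.copies = [list(blocks), list(blocks)]
        self.io_count = 0

    def write(self, index, new_block):
        for copy in self.copies:          # one write per mirror copy
            copy[index] = new_block
            self.io_count += 1

stripe = Raid5Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.small_write(1, b"\xaa\xbb")
mirror = MirrorPair([b"\x01\x02", b"\x10\x20"])
mirror.write(1, b"\xaa\xbb")
print("RAID 5 write IOs:", stripe.io_count, "RAID 1 write IOs:", mirror.io_count)
```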

32
The Storage Rack Today
  • 140 arms
  • 4 TB
  • 24 racks, 24 storage processors, 6+1 in rack
  • Disks: 2.5 GBps IO
  • Controllers: 1.2 GBps IO
  • Ports: 500 MBps IO

33
Storage Rack in 5 years?
  • 140 arms
  • 50TB
  • 24 racks, 24 storage processors, 6+1 in rack
  • Disks: 2.5 GBps IO
  • Controllers: 1.2 GBps IO
  • Ports: 500 MBps IO
  • My suggestion move the processors into the
    storage racks.
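
Why moving processors into the rack helps: data leaving the rack is limited by the slowest stage, and that stage does not grow while capacity does. A rough sketch using the rack figures from these two slides; the dictionary and function names are illustrative.

```python
TB = 1e12

# Stage bandwidths in bytes per second, as listed on the slide.
stage_bandwidth = {"disks": 2.5e9, "controllers": 1.2e9, "ports": 0.5e9}

def bottleneck(stages):
    """Return the name and bandwidth of the slowest stage."""
    name = min(stages, key=stages.get)
    return name, stages[name]

def rack_scan_hours(capacity_bytes, stages):
    """Hours to pull the whole rack through its slowest stage."""
    return capacity_bytes / bottleneck(stages)[1] / 3600

today = rack_scan_hours(4 * TB, stage_bandwidth)
in_five_years = rack_scan_hours(50 * TB, stage_bandwidth)
print("bottleneck:", bottleneck(stage_bandwidth)[0])
print(f"scan today: {today:.1f} hr, in 5 years: {in_five_years:.1f} hr")
```

A processor inside the rack would see the disk-level bandwidth instead of the port-level bandwidth.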

34
It's hard to archive a PetaByte.
It takes a LONG time to restore it.
  • Store it in two (or more) places online (on
    disk?).
  • Scrub it continuously (look for errors)
  • On failure, refresh lost copy from safe copy.
  • Can organize the two copies differently
    (e.g. one by time, one by space)
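
The scrub-and-refresh loop above can be sketched in a few lines. This is a minimal illustration, assuming each copy is a dictionary from a logical block id to bytes and that a checksum was recorded when the block was written. The use of SHA-256 and all names here are my assumptions, not something the talk specifies. Keying by logical id is what lets the two copies be laid out differently on disk.

```python
import hashlib

def checksum(block):
    """Digest recorded at write time and rechecked during scrubbing."""
    return hashlib.sha256(block).hexdigest()

def is_good(copy, key, expected):
    return key in copy and checksum(copy[key]) == expected

def scrub(copy_a, copy_b, checksums):
    """Check every block in both copies; refresh a bad block from the good copy.

    Returns (repaired, lost): repaired lists (copy_name, key) pairs that were
    refreshed, lost lists keys that are bad in both copies.
    """
    repaired, lost = [], []
    for key, expected in checksums.items():
        a_ok = is_good(copy_a, key, expected)
        b_ok = is_good(copy_b, key, expected)
        if a_ok and not b_ok:
            copy_b[key] = copy_a[key]
            repaired.append(("b", key))
        elif b_ok and not a_ok:
            copy_a[key] = copy_b[key]
            repaired.append(("a", key))
        elif not a_ok and not b_ok:
            lost.append(key)
    return repaired, lost

blocks = {"k1": b"alpha", "k2": b"beta", "k3": b"gamma"}
checksums = {key: checksum(value) for key, value in blocks.items()}
copy_a, copy_b = dict(blocks), dict(blocks)
copy_a["k2"] = b"bXta"   # simulate silent corruption in one copy
del copy_b["k3"]         # simulate a lost block in the other copy

repaired, lost = scrub(copy_a, copy_b, checksums)
print("repaired:", repaired, "lost:", lost)
```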

35
Crazy Disk Ideas
  • Disk Farm on a card: surface mount disks
  • Disk (magnetic store) on a chip: (micro machines
    in Silicon)
  • Full Apps (e.g. SAP, Exchange/Notes,..) in the
    disk controller (a processor with 128 MB dram)

[Figure label: ASIC]
The Innovator's Dilemma: When New Technologies
Cause Great Firms to Fail. Clayton M.
Christensen. ISBN 0875845851

36
The Disk Farm On a Card
  • The 500GB disc card
  • An array of discs
  • Can be used as
  • 100 discs
  • 1 striped disc
  • 50 Fault Tolerant discs
  • ....etc
  • LOTS of accesses/second
  • bandwidth

[Figure label: 14"]

37
Functionally Specialized Cards
  • Storage
  • Network
  • Display

[Figure: each card is an ASIC plus a P mips processor and M MB DRAM]
Today: P = 50 mips, M = 2 MB
In a few years: P = 200 mips, M = 64 MB

38
It's Already True of Printers: Peripheral =
CyberBrick
  • You buy a printer
  • You get a
  • several network interfaces
  • A Postscript engine
  • cpu,
  • memory,
  • software,
  • a spooler (soon)
  • and a print engine.

39
All Device Controllers will be Cray 1s
  • TODAY
  • Disk controller is 10 mips risc engine with 2MB
    DRAM
  • NIC is similar power
  • SOON
  • Will become 100 mips systems with 100 MB DRAM.
  • They are nodes in a federation (can run Oracle
    on NT in disk controller).
  • Advantages
  • Uniform programming model
  • Great tools
  • Security
  • Economics (cyberbricks)
  • Move computation to data (minimize traffic)

[Figure labels: Central Processor & Memory; Tera Byte Backplane]

40
With Tera Byte Interconnect and Super Computer
Adapters
  • Processing is incidental to
  • Networking
  • Storage
  • UI
  • Disk Controller/NIC is
  • faster than device
  • close to device
  • Can borrow device package power
  • So use idle capacity for computation.
  • Run app in device.
  • Both Kim Keeton (UCB) and Erik Riedel (CMU)
    theses investigate this and show the benefits
    of this approach.
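
A toy illustration of "run app in device": filtering at the host moves every record across the interconnect, while filtering in the disk controller moves only the matches. This is a sketch with made-up data and illustrative function names; it counts bytes moved and nothing else.

```python
def scan_at_host(records, predicate):
    """Ship every record to the host, then filter there."""
    bytes_moved = sum(len(record) for record in records)
    matches = [record for record in records if predicate(record)]
    return matches, bytes_moved

def scan_at_disk(records, predicate):
    """Filter inside the device and ship only the matching records."""
    matches = [record for record in records if predicate(record)]
    bytes_moved = sum(len(record) for record in matches)
    return matches, bytes_moved

# Hypothetical data: 1,000 records of 100 bytes, 10 of which match.
records = [b"hit:" + b"x" * 96 if i % 100 == 0 else b"row:" + b"x" * 96
           for i in range(1000)]

def wanted(record):
    return record.startswith(b"hit:")

host_matches, host_bytes = scan_at_host(records, wanted)
disk_matches, disk_bytes = scan_at_disk(records, wanted)
print(f"host filter moved {host_bytes} bytes, disk filter moved {disk_bytes} bytes")
```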

41
Implications
Conventional
  • Offload device handling to NIC/HBA
  • higher level protocols: I2O, NASD, VIA, IP, TCP
  • SMP and Cluster parallelism is important.
Radical
  • Move app to NIC/device controller
  • higher-higher level protocols: CORBA / COM.
  • Cluster parallelism is VERY important.

42
How Do They Talk to Each Other?
  • Each node has an OS
  • Each node has local resources: a federation.
  • Each node does not completely trust the others.
  • Nodes use RPC to talk to each other
  • CORBA? COM? RMI?
  • One or all of the above.
  • Huge leverage in high-level interfaces.
  • Same old distributed system story.

[Figure: two nodes, each stacking Applications over datagrams, streams, RPC, and ?, over SIO, joined by a SAN]

43
Outline
  • The Surprise-Free Future (5 years)
  • Astonishing hardware progress.
  • Some consequences
  • Absurd (?) consequences.
  • Auto-manage storage
  • Raid10 replaces Raid5
  • Disc-packs
  • Disk is the archive media of choice
  • A surprising future?
  • Disks (and other useful things) become
    supercomputers.
  • Apps run in the disk