Designing for 20TB Disk Drives and Enterprise Storage



1
Designing for 20TB Disk Drives and Enterprise Storage

Jim Gray, Microsoft Research

2
Disk Evolution
Kilo Mega Giga Tera Peta Exa Zetta Yotta

Capacity: 100x in 10 years
1 TB 3.5" drive in 2005
20 TB? in 2012?!
System on a chip
High-speed SAN
Disk replacing tape
Disk is a supercomputer!
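
The growth figures on this slide can be checked with a little arithmetic. The sketch below assumes smooth compound growth, which is an assumption: the slide gives only the 100x-per-decade rate and two data points. The function names are illustrative.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Per-year multiplier implied by growing `total_factor`-fold over `years`."""
    return total_factor ** (1.0 / years)


def projected_capacity_tb(start_tb: float, start_year: int, target_year: int,
                          total_factor: float = 100.0, years: float = 10.0) -> float:
    """Capacity at `target_year`, assuming smooth compound growth (an assumption)."""
    rate = annual_growth_factor(total_factor, years)
    return start_tb * rate ** (target_year - start_year)


if __name__ == "__main__":
    print(f"annual growth: {annual_growth_factor(100, 10):.2f}x per year")
    print(f"2012 projection from 1 TB in 2005: "
          f"{projected_capacity_tb(1, 2005, 2012):.1f} TB")
```

Under that assumption the rate is about 1.58x per year, and 1 TB in 2005 grows to roughly 25 TB by 2012, the same order of magnitude as the slide's 20 TB guess.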

3
Disks are becoming computers

Smart drives
Camera with micro-drive
Replay / Tivo / Ultimate TV
Phone with micro-drive
MP3 players
Tablet
Xbox
Many more

Applications: Web, DBMS, Files
OS
Disk Ctlr: 1 GHz CPU, 1 GB RAM
Comm: Infiniband, Ethernet, radio
4
Intermediate Step: Shared Logic
Snap: 1 TB, 12 x 80 GB NAS

Brick with 8-12 disk drives
200 mips/arm (or more)
2 x Gbps Ethernet
General purpose OS
$10k/TB to $100k/TB
Shared:
Sheet metal
Power
Support/Config
Security
Network ports
These bricks could run applications (e.g. SQL or Mail or ...)

NetApp: 0.5 TB, 8 x 70 GB NAS
Maxtor: 2 TB, 12 x 160 GB NAS
IBM TotalStorage: 360 GB, 10 x 36 GB NAS
5
Hardware

Homogeneous machines lead to quick response through reallocation
HP desktop machines, 320 MB RAM, 3U high, 4 x 100 GB IDE drives
$4k/TB (street), 2.5 processors/TB, 1 GB RAM/TB
3 weeks from ordering to operational

Slide courtesy of Brewster Kahle @ Archive.org
6
Disk as Tape

Tape is unreliable, specialized, slow, low density, not improving fast, and expensive
Using removable hard drives to replace tape's function has been successful
When a tape is needed, the drive is put in a machine and it is online. No need to copy from tape before it is used.
Portable, durable, fast, media cost = raw tapes, dense. Unknown longevity: suspected good.

Slide courtesy of Brewster Kahle @ Archive.org
7
Disk As Tape: What format?

Today I send NTFS/SQL disks.
But that is not a good format for Linux.
Solution: Ship NFS/CIFS/ODBC servers (not disks)
Plug disk into LAN.
DHCP, then file or DB server via standard interface.
Web Service in long term

8
State is Expensive

Stateless clones are easy to manage
App servers are middle tier
Cost goes to zero with Moore's law.
One admin per 1,000 clones.
Good story about scaleout.
Stateful servers are expensive to manage
1TB to 100TB per admin
Storage cost is going to zero ($2k to $200k).
Cost of storage is management cost

9
Databases (SQL)

VLDB survey (Winter Corp).
10 TB to 100TB DBs.
Size doubling yearly
Riding disk Moore's law
10,000 disks at 18GB is 100TB cooked.
Mostly DSS and data warehouses.
Some media managers

10
Interesting facts

No DBMSs beyond 100TB.
Most bytes are in files.
The web is file centric
eMail is file centric.
Science (and batch) is file centric.
But:
SQL performance is better than CIFS/NFS.
CISC vs RISC

11
BaBar: the biggest DB

500 TB
Uses Objectivity
SLAC events
Linux cluster scans DB looking for patterns

12
300 TB (cooked): Hotmail / Yahoo

Clone front ends: 10,000 @ Hotmail.
Application servers
100 @ Hotmail
Get mail box
Get/put mail
Disk bound
30,000 disks
20 admins

13
AOL (msn) (1PB?)

10 B transactions per day (10% of that)
Huge storage
Huge traffic
Lots of eye candy
DB used for security/accounting.
GUESS: AOL is a petabyte
(40M x 10 MB = 400 x 10^12)
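
The guess above is a one-line multiplication. As a quick check, here is a minimal sketch assuming decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes); the constant and function names are illustrative.

```python
MB = 10**6   # decimal megabyte (assumption: the slide uses decimal units)
PB = 10**15  # decimal petabyte


def total_bytes(users: int, bytes_per_user: int) -> int:
    """Total storage if every user holds `bytes_per_user` bytes."""
    return users * bytes_per_user


aol_bytes = total_bytes(40_000_000, 10 * MB)

if __name__ == "__main__":
    print(f"{aol_bytes:.1e} bytes, or {aol_bytes / PB:.1f} PB")
```

The product is 400 x 10^12 bytes, or 0.4 PB, which is why the slide rounds the guess to a petabyte.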

14
Google: 1.5 PB as of last spring

8,000 no-name PCs
Each 1/3U, 2 x 80 GB disk, 2 CPUs, 256 MB RAM
1.4 PB online.
2 TB ram online
8 TeraOps
Slice price is $1K, so $8M.
15 admins (!) (1 per 100 TB).
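
The totals on this slide follow from the per-node figures. The sketch below recomputes them; the class and field names are illustrative, the units are decimal, and the slice price is assumed to be in US dollars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Farm:
    """A farm of identical PCs (field names are illustrative, not from the slide)."""
    nodes: int
    disks_per_node: int
    disk_gb: int
    ram_mb_per_node: int
    price_per_node: int  # assumed to be US dollars
    admins: int

    def disk_tb(self) -> float:
        """Raw disk capacity in decimal terabytes."""
        return self.nodes * self.disks_per_node * self.disk_gb / 1_000

    def ram_tb(self) -> float:
        """Aggregate RAM in decimal terabytes."""
        return self.nodes * self.ram_mb_per_node / 1_000_000

    def total_price(self) -> int:
        """Hardware price for the whole farm."""
        return self.nodes * self.price_per_node

    def tb_per_admin(self) -> float:
        """Raw terabytes managed per administrator."""
        return self.disk_tb() / self.admins


google = Farm(nodes=8_000, disks_per_node=2, disk_gb=80,
              ram_mb_per_node=256, price_per_node=1_000, admins=15)

if __name__ == "__main__":
    print(f"disk: {google.disk_tb():.0f} TB, RAM: {google.ram_tb():.2f} TB")
    print(f"price: {google.total_price():,}, TB/admin: {google.tb_per_admin():.0f}")
```

These per-node figures give about 1.28 PB of raw disk, somewhat below the 1.4 PB quoted on the slide, so the slide's per-node numbers are probably rounded.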

15
Astronomy

I've been trying to apply DB to astronomy
Today they are at 10TB per data set
Heading for Petabytes
Using Objectivity
Trying SQL (talk to me offline)

16
Scale Out: Buy Computing by the Slice
709,202 tpmC! = 1 billion transactions/day

Slice: 8 CPUs, 8 GB, 100 disks (1.8 TB)
20 ktpmC per slice, $300k/slice
Clients and 4 DTC nodes not shown
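
The headline numbers can be reproduced from the per-slice figures. This is a rough sketch that treats tpmC as a steady per-minute rate sustained all day and assumes every slice delivers the quoted 20 ktpmC; both are simplifications, and the function names are illustrative.

```python
import math

MINUTES_PER_DAY = 24 * 60


def transactions_per_day(tpmc: int) -> int:
    """Daily transaction count if the per-minute rate is sustained all day."""
    return tpmc * MINUTES_PER_DAY


def slices_needed(target_tpmc: int, tpmc_per_slice: int) -> int:
    """Whole slices required to reach the target rate (rounded up)."""
    return math.ceil(target_tpmc / tpmc_per_slice)


if __name__ == "__main__":
    print(f"{transactions_per_day(709_202):,} transactions/day")
    print(f"{slices_needed(709_202, 20_000)} slices at 20 ktpmC each")
```

At 709,202 tpmC the daily total is just over one billion, and about 36 slices of 20 ktpmC would be needed to reach that rate.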

17
ScaleUp: A Very Big System!

UNISYS Windows 2000 Data Center Limited Edition
32 CPUs with
32 GB of RAM and
1,061 disks (15.5 TB)
Will be helped by 64-bit addressing

24 fiber channel
18
Hardware

8 Compaq DL360 Photon Web Servers
4 Compaq ProLiant 8500 DB Servers
Fiber SAN Switches
One SQL database per rack
Each rack contains 4.5 TB
261 total drives / 13.7 TB total
Meta Data: stored on 101 GB Fast, Small Disks (18 x 18.2 GB)
Imagery Data: stored on 4 x 339 GB Slow, Big Disks (15 x 73.8 GB)
SQL instances: SQL\Inst1, SQL\Inst2, SQL\Inst3, Spare
To add: 90 x 72.8 GB disks in Feb 2001 to create an 18 TB SAN
19
Amdahl's Balance Laws

Parallelism law: If a computation has a serial part S and a parallel component P, then the maximum speedup is (S+P)/S.
Balanced system law: A system needs a bit of IO per second per instruction per second: about 8 MIPS per MBps.
Memory law: α = 1. The MB/MIPS ratio (called alpha (α)) in a balanced system is 1.
IO law: Programs do one IO per 50,000 instructions.
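
The four laws above are simple ratios, so they can be written as one-line functions. A minimal sketch with illustrative names follows; `speedup_with_n` is the standard finite-processor form of the parallelism law, added here for comparison, and is not on the slide.

```python
def max_speedup(serial: float, parallel: float) -> float:
    """Parallelism law: best-case speedup (S + P) / S with unlimited processors."""
    if serial <= 0:
        raise ValueError("serial part must be positive")
    return (serial + parallel) / serial


def speedup_with_n(serial: float, parallel: float, n: int) -> float:
    """Standard finite-processor form: (S + P) / (S + P / n)."""
    return (serial + parallel) / (serial + parallel / n)


def balanced_io_mbps(mips: float) -> float:
    """Balanced system law: one bit of IO per instruction, so 8 MIPS per MBps."""
    return mips / 8.0


def balanced_memory_mb(mips: float, alpha: float = 1.0) -> float:
    """Memory law: alpha MB of memory per MIPS (alpha = 1 when balanced)."""
    return alpha * mips


def ios_per_second(mips: float, instructions_per_io: int = 50_000) -> float:
    """IO law: one IO per 50,000 instructions."""
    return mips * 1_000_000 / instructions_per_io


if __name__ == "__main__":
    mips = 1_000  # hypothetical 1,000 MIPS processor
    print(max_speedup(1, 9), speedup_with_n(1, 9, 9))
    print(balanced_io_mbps(mips), balanced_memory_mb(mips), ios_per_second(mips))
```

For a hypothetical 1,000 MIPS processor, the laws call for 125 MBps of IO, 1,000 MB of memory, and 20,000 IOs per second.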

20
Amdahl's Laws Valid 35 Years Later?

Parallelism law is algebra, so SURE!
Balanced system laws?
Look at TPC results (TPC-C, TPC-H) at http://www.tpc.org/
Some imagination needed:
What's an instruction (CPI varies from 1-3)?
RISC, CISC, VLIW, clocks per instruction,
What's an I/O?

21
TPC systems

Normalize for CPI (clocks per instruction)
TPC-C has about 7 ins/byte of IO
TPC-H has 3 ins/byte of IO
TPC-H needs ½ as many disks, sequential vs random
Both use 9GB 10 krpm disks (need arms, not bytes)
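
One way to arrive at an instructions-per-byte figure like those above is to divide the CPI-normalized instruction rate by the disk bandwidth. The sketch below shows that calculation under stated assumptions: the formula is a plausible reading of the slide, not a confirmed method, and every input value is made up for illustration rather than taken from a TPC report.

```python
def mips(clock_mhz: float, cpi: float) -> float:
    """Instruction rate after normalizing the clock rate by clocks per instruction."""
    return clock_mhz / cpi


def io_mb_per_sec(disks: int, ios_per_disk_per_sec: float, kb_per_io: float) -> float:
    """Aggregate disk bandwidth in decimal MB/s."""
    return disks * ios_per_disk_per_sec * kb_per_io / 1_000


def instructions_per_io_byte(clock_mhz: float, cpi: float, disks: int,
                             ios_per_disk_per_sec: float, kb_per_io: float) -> float:
    """MIPS divided by MB/s gives instructions executed per byte of IO."""
    return mips(clock_mhz, cpi) / io_mb_per_sec(disks, ios_per_disk_per_sec, kb_per_io)


if __name__ == "__main__":
    # Hypothetical per-CPU configuration, for illustration only.
    ratio = instructions_per_io_byte(clock_mhz=500, cpi=2, disks=50,
                                     ios_per_disk_per_sec=100, kb_per_io=8)
    print(f"{ratio:.2f} instructions per byte of IO")
```

With these made-up inputs the ratio is 6.25, which falls in the range of the TPC figures on the slide, but real values depend on the measured CPI and the IO size.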

22
TPC systems: What's alpha (MB/MIPS)?

Hard to say
Intel 32-bit addressing (4 GB limit). Known CPI.
IBM, HP, Sun have 64 GB limit. Unknown CPI.
Look at both, guess CPI for IBM, HP, Sun
Alpha is between 1 and 6

23
Performance (on current SDSS data)

Run times on $15k COMPAQ Server (2 CPUs, 1 GB, 8 disks)
Some take 10 minutes
Some take 1 minute
Median 22 sec.
GHz processors are fast!
(10 mips/IO, 200 ins/byte)
2.5 m rec/s/cpu

1,000 IO/cpu sec, 64 MB IO/cpu sec
24
How much storage do we need?
Yotta Zetta Exa Peta Tera Giga Mega Kilo
Everything! Recorded

Soon everything can be recorded and indexed
Most bytes will never be seen by humans.
Data summarization, trend detection, and anomaly detection are key technologies
See Mike Lesk, How much information is there: http://www.lesk.com/mlesk/ksg97/ksg.html
See Lyman & Varian, How much information: http://www.sims.berkeley.edu/research/projects/how-much-info/

Scale markers: All Books MultiMedia, All LoC books (words), Movie, A Photo, A Book
Small prefixes (negative powers of ten): 24 yocto, 21 zepto, 18 atto, 15 femto, 12 pico, 9 nano, 6 micro, 3 milli
25
Standard Storage Metrics