Transcript and Presenter's Notes

Title: John Sing


1
IBM System x GPFS Storage Server: A Revolution in HPC Intelligent Cluster Storage
  • John Sing
  • IBM Systems Technology Group
  • A New Era in Technical Computing: Powerful. Comprehensive. Intuitive.

2
IBM innovation for server and storage convergence enables Big Data
convergence with High Performance Computing (HPC).
IBM System x GPFS Storage Server
  • Improved Data Availability: Declustered RAID with GPFS reduces
    overhead and speeds data rebuilds by 4-6x
  • Data Integrity, Reliability, Flexibility: End-to-end checksum, 2- and
    3-fault tolerance, application-optimized RAID
  • Superior Performance: Uses more powerful x86 cores instead of
    special-purpose disk controller chips
  • Higher Efficiency: Server and storage co-packaging improves density
    and efficiency
  • Better Value: A software-based controller reduces hardware overhead
    and cost
  • High-Speed Interconnect: Clusters storage traffic, including failover
    (InfiniBand, 10 GbE)
3
SONAS and GPFS Storage Server Positioning
  • Target Market Segment
    SONAS: General-purpose, scale-out file and object storage
    GSS: Technical / High Performance Computing where GPFS is installed
  • Applicable Industries
    SONAS: Universities, government, healthcare/life sciences, oil/gas
    GSS: High Performance Compute (aka grid) workloads within industries
    that require them
  • Workloads
    SONAS: High-performance IOPS, sequential streaming, general file
    serving for Windows and *NIX
    GSS: Focused on high-bandwidth sequential streaming for GPFS grids
  • Key Value Proposition
    SONAS: GUI-driven administration, clustered Windows CIFS support,
    TSM and NDMP integration, flexible storage array options (Storwize
    V7000, XIV, DCS3700), encapsulates GPFS controls
    GSS: Performance-oriented, grows to extremely large capacity, full
    administrative GPFS control
  • Protocol Support
    SONAS: NFS, CIFS, HTTPS, FTP
    GSS: GPFS NSD, NFS
  • Connectivity
    SONAS: 1 GbE, 10 GbE
    GSS: InfiniBand, 10 GbE, 40 GbE
  • Capacity Range
    SONAS: 100 TB to 15 PB; flexible backend options (V7000, XIV, DCS3700)
    GSS: 464 TB to dozens of PBs; fixed building blocks of 60-drive
    increments
  • Sales / Go-to-Market (who owns it, how it's sold, who gets credit,
    where it can be sold)
    SONAS: System Storage, AAS, storage sellers; all markets, no
    restrictions
    GSS: System x, high-volume resellers; HPC / technical computing,
    managed availability in 2013
4
GPFS Storage Server: Scalable Building Block Approach to Storage
Complete storage solution: data servers (x3650 M4), disk (SSD and
NL-SAS), software, InfiniBand and Ethernet. No storage controllers!
  • Model 24 (Light and Fast): 4 enclosures, 20U, 232 NL-SAS + 6 SSD,
    10 GB/second
  • Model 26 (HPC Workhorse): 6 enclosures, 28U, 348 NL-SAS + 6 SSD,
    12 GB/second
  • High Density HPC option: 18 enclosures in 2 standard 42U racks,
    1044 NL-SAS + 18 SSD, 36 GB/second
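A quick consistency check on the configurations above (a reader-side
calculation, not part of the original slide): the NL-SAS drive counts
scale linearly with enclosure count.

```python
# Sanity-check the building-block scaling of the three configurations.
configs = {
    "Model 24":         (4, 232),     # (enclosures, NL-SAS drives)
    "Model 26":         (6, 348),
    "High Density HPC": (18, 1044),
}
for name, (enclosures, drives) in configs.items():
    print(f"{name}: {drives / enclosures:.0f} NL-SAS drives per enclosure")
# All three work out to 58 NL-SAS drives per enclosure, consistent with
# a fixed, repeatable building block.
```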

5

What makes this different
[Architecture diagram: GPFS clients connect over the network to standard
NSD file servers (x3650) running GPFS Native RAID, which attach directly
to JBOD disk enclosures. RAID and disk management migrate from custom
dedicated disk controllers to standard file/data servers.]
6
Feature Detail
  • Declustered RAID: Data and parity stripes are uniformly partitioned
    and distributed across a disk array
  • Arbitrary number of disks per array (not constrained to an integral
    number of RAID stripe widths)
  • 2-fault and 3-fault tolerance (RAID-D2, RAID-D3)
  • Reed-Solomon parity encoding: 2- or 3-fault-tolerant stripes of
    8 data strips + 2 or 3 parity strips
  • 3- or 4-way mirroring
  • End-to-end checksum, from disk surface to GPFS user/client; detects
    and corrects off-track and lost/dropped disk writes (see the sketch
    after this list)
  • Asynchronous error diagnosis while affected I/Os continue: if media
    error, verify and restore if possible; if path problem, attempt
    alternate paths; if a disk is unresponsive or misbehaving, power-cycle
    the disk
  • Supports service of multiple disks on a carrier; I/O operations
    continue for tracks whose disks have been removed during carrier
    service
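As a rough illustration of the end-to-end checksum idea, here is a
minimal Python sketch. The `write_block`/`read_block` helpers and the
version tag are hypothetical names, and this is not GNR's actual on-disk
format; it only shows the principle of catching corrupted payloads and
lost/dropped writes.

```python
import zlib

# Minimal sketch: the writer stores a checksum plus a write-version tag
# with each block; the reader verifies both, so a corrupted payload
# (e.g. an off-track write) or stale data (a lost/dropped write) is
# detected instead of silently returned.

def write_block(store, block_id, data, version):
    """Persist data together with its checksum and write version."""
    store[block_id] = (data, version, zlib.crc32(data))

def read_block(store, block_id, expected_version):
    data, version, checksum = store[block_id]
    if zlib.crc32(data) != checksum:
        raise IOError("payload corrupted (e.g. off-track write)")
    if version != expected_version:
        raise IOError("stale data: a disk write was lost or dropped")
    return data

store = {}
write_block(store, 42, b"stripe payload", version=7)
assert read_block(store, 42, expected_version=7) == b"stripe payload"
```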

7
De-clustering: Bringing Parallel Performance to Disk Maintenance
  • Traditional RAID: Narrow data+parity arrays. Example: 20 disks as
    4 traditional RAID arrays of 5 disks each, 4x4 RAID stripes (data
    plus parity)
  • Rebuild uses the I/O capacity of only the array's 4 surviving disks;
    with striping across all arrays, all file accesses are throttled by
    array 2's rebuild overhead
  • Declustered RAID: Data+parity distributed over all disks. Example:
    20 disks in 1 declustered array, 16 RAID stripes (data plus parity)
  • Rebuild uses the I/O capacity of all 19 surviving disks; the load on
    file accesses is reduced by 4.8x (19/4) during array rebuild, as
    worked below
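The 4.8x figure follows directly from the surviving-disk counts on this
slide; a one-line check:

```python
# Traditional RAID: a failed disk is rebuilt from its array's 4 survivors.
# Declustered RAID: the same rebuild is spread over all 19 survivors.
traditional_survivors = 4    # 5-disk array minus the failed disk
declustered_survivors = 19   # 20-disk declustered array minus the failed disk

reduction = declustered_survivors / traditional_survivors
print(f"Rebuild load reduced by {reduction:.1f}x")  # -> 4.8x (19/4)
```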
8
Low-Penalty Disk Rebuild: Reduces Rebuild Overhead by 3.5x
9
Non-Intrusive Disk Diagnostics
  • Disk Hospital: Background determination of problems (sketched below)
  • While a disk is in the hospital, GNR non-intrusively and immediately
    returns data to the client using the error correction code
  • For writes, GNR non-intrusively marks write data and reconstructs it
    later in the background, after problem determination is complete
  • Advanced fault determination: statistical reliability and SMART
    monitoring, neighbor check, drive power cycling
  • Media error detection and correction
  • Supports concurrent disk firmware updates
10
Mean Time to Data Loss: 8+2P vs. 8+3P
Parity   50 disks            200 disks           50,000 disks
8+2P     200,000 years       50,000 years        200 years
8+3P     250 billion years   60 billion years    230 million years
These figures assume uncorrelated failures and hard read errors.
Simulation assumptions: disk capacity 600 GB, MTTF 600k hours, hard
error rate 1 in 10^15 bits, 47-HDD declustered arrays, uncorrelated
failures. These MTTDL figures are due to hard errors; AFR (2-FT) =
5 x 10^-6, AFR (3-FT) = 4 x 10^-12.
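A back-of-the-envelope check of why the parity level dominates these
numbers, using only the stated assumptions (600-GB disks, a hard error
rate of 1 in 10^15 bits). This calculation is a reader-side illustration,
not taken from the slide itself:

```python
# Expected hard read errors when reading one full disk, e.g. during a
# rebuild after a drive failure.
disk_bits = 600e9 * 8          # 600-GB disk, in bits
hard_error_rate = 1e-15        # 1 hard read error per 10^15 bits

p = disk_bits * hard_error_rate
print(f"~{p:.4f} expected hard errors per full-disk read (~1 in {1/p:.0f})")
# ~0.0048: roughly 1 in 200 rebuild-sized reads hits a hard error, which
# is why adding a second and third parity strip changes MTTDL so
# dramatically.
```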
11
Data Management Portfolio for High Performance Technical Computing
[Positioning chart: IBM's data management portfolio spans three focus
areas: raw performance / I/O bandwidth, managed building block (GPFS
Storage Server), and ease of use / reliability (SONAS). Also placed on
the chart: GPFS, DCS3860 and DCS3700, and direct-attached DS3500 / V3700
for smaller installations. Industries shown include government, high-end
research, petroleum, media/entertainment, financial services, bio/life
science, higher-end universities, and CAE. The portfolio is underpinned
by IBM Tape, Tivoli Storage Manager, and HPSS.]
12
Summary: IBM GPFS Storage Server
  • What's New
  • Replaces an external hardware controller with software-based RAID
  • Modular upgrades improve TCO
  • Non-intrusive disk diagnostics
  • Client Business Value
  • Integrated and ready to go for GPFS applications
  • 3 years of maintenance and support
  • Improved storage affordability
  • Delivers end-to-end data integrity
  • Faster rebuild and recovery times
  • Reduces rebuild overhead by 3.5x
  • Key Features
  • Declustered RAID (8+2P, 8+3P)
  • 2- and 3-fault-tolerant erasure codes
  • End-to-end checksum
  • Protection against lost writes
  • Off-the-shelf JBODs
  • Standardized in-band SES management
  • Built-in SSD acceleration

13
IBM GPFS: Doing Big Data Since 1998