1
THE HP AUTORAID HIERARCHICAL STORAGE SYSTEM
  • J. Wilkes, R. Golding, C. Staelin, T. Sullivan
  • HP Laboratories, Palo Alto, CA

2
INTRODUCTION
  • Must protect data against disk failures, which are
    too frequent and too hard to repair
  • Possible solutions
  • For small numbers of disks: mirroring
  • For larger numbers of disks: RAID

3
RAID
  • Typical RAID organizations (parity sketched below)
  • Level 3: bit- or byte-level interleaving with a
    dedicated parity disk
  • Level 5: block-level interleaving with parity blocks
    distributed across all disks
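
A minimal Python sketch of the parity idea behind these layouts: the parity
block is the XOR of the data blocks in a stripe, so any single lost block can
be rebuilt from the survivors. The block size and stripe width below are
illustrative values, not AutoRAID parameters.

  # Parity = XOR of the data blocks in a stripe; any one lost block
  # can be rebuilt from the remaining blocks plus the parity block.
  from functools import reduce

  def xor_blocks(blocks):
      """XOR a list of equal-sized byte blocks together."""
      return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

  stripe = [bytes([i] * 8) for i in range(1, 5)]   # four 8-byte data blocks
  parity = xor_blocks(stripe)

  # Simulate losing block 2 and rebuilding it from the survivors + parity.
  survivors = stripe[:2] + stripe[3:]
  assert xor_blocks(survivors + [parity]) == stripe[2]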

4
LIMITATIONS OF RAID (I)
  • Each RAID level performs well only for a narrow
    range of workloads
  • Too many parameters to configure: data and parity
    layout, stripe depth, stripe width, cache sizes,
    write-back policies, ...

5
LIMITATIONS OF RAID (II)
  • Changing from one layout to another or adding
    capacity requires unloading and reloading the data
  • Spare disks remain unused until a failure occurs

6
A BETTER SOLUTION
  • A managed storage hierarchy (sketched below)
  • Mirror active data
  • Store less active data in RAID 5
  • This requires locality of reference
  • The active subset must be rather stable (found to be
    true in several studies)
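
A toy Python sketch of the two-level policy described above (not HP's actual
algorithm): recently written blocks stay mirrored, and the least recently
written ones are demoted to RAID 5 when the mirrored space fills up. The
capacity limit and block names are made up for illustration.

  import time

  MIRRORED_CAPACITY = 4      # illustrative limit, in relocation blocks
  mirrored = {}              # block id -> time of last write
  raid5 = set()              # block ids currently stored in RAID 5

  def write(block_id):
      if block_id in raid5:  # a write promotes the block back to mirrored
          raid5.discard(block_id)
      mirrored[block_id] = time.monotonic()
      while len(mirrored) > MIRRORED_CAPACITY:
          coldest = min(mirrored, key=mirrored.get)   # approximate LRU
          del mirrored[coldest]
          raid5.add(coldest)                          # demote to RAID 5

  for b in ["a", "b", "c", "d", "e", "a"]:
      write(b)
  print(sorted(mirrored), sorted(raid5))   # rewritten "a" is mirrored, "b" demoted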

7
IMPLEMENTATION LEVEL
  • The storage hierarchy could be implemented
  • Manually: can use the most knowledge but cannot
    adapt quickly
  • In the file system: offers the best balance of
    knowledge and implementation freedom but is specific
    to a particular file system
  • In a smart array controller: easiest to deploy
    (the HP AutoRAID approach)

8
MAJOR FEATURES (I)
  • Mapping of host block addresses to physical disk
    locations
  • Mirroring of write-active data
  • Adaptation to changes in the amount of data
    stored
  • Starts using RAID 5 when the array becomes full
  • Adaptation to workload changes
  • Hot-pluggable disks, fans, power supplies and
    controllers

9
MAJOR FEATURES (II)
  • On-line storage capacity expansion: the system can
    then switch more data back to mirroring
  • Can mix or match disk capacities
  • Controlled fail-over: can have dual controllers
    (primary/standby)
  • Active hot spares: used for additional mirroring
  • Simple administration and setup: appears to the
    host as one or more logical units
  • Log-structured RAID 5 writes

10
RELATED WORK (I)
  • Storage Technology Corporation Iceberg
  • Also uses redirection, but is based on RAID 6
  • Handles variable-size records
  • Emphasis on very high reliability

11
RELATED WORK (II)
  • Floating parity scheme from IBM Almaden
  • Relocates parity blocks and uses distributed
    sparing
  • Work at U.C. Berkeley on log-structured file
    systems and their cleaning policies

12
RELATED WORK (III)
  • A whole literature on hierarchical storage systems
  • Schemes for compressing inactive data
  • Use of non-volatile memory (NVRAM) to optimize
    writes
  • Allows reliable delayed writes

13
OVERVIEW
[Controller block diagram: the host computer attaches through a 20 MB/s
SCSI controller; internally, dual 10 MB/s buses connect the processor,
RAM and control logic, the parity logic with its matching RAM, the DRAM
read cache, the NVRAM write cache, and other RAM]
14
PHYSICAL DATA LAYOUT
  • Data space on disks is broken up into large
    Physical EXTents (PEXes)
  • Typical size is 1 MB
  • PEXes are combined into Physical Extent Groups
    (PEGs) containing at least three PEXes on three
    different disks (sketched below)
  • PEGs can be assigned to the mirrored storage
    class or to the RAID 5 storage class
  • Segments are the units of contiguous space on a
    disk (128 KB in the prototype)
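
As a concrete illustration of these terms, a small Python sketch of the layout
metadata; the real controller structures are of course not Python, and only
the 1 MB PEX size and the three-disk rule come from the slides.

  from dataclasses import dataclass, field

  PEX_SIZE = 1 * 1024 * 1024          # 1 MB physical extent, as in the prototype

  @dataclass(frozen=True)
  class PEX:                          # Physical EXTent: a region of one disk
      disk: int
      offset: int                     # byte offset of the extent on that disk

  @dataclass
  class PEG:                          # Physical Extent Group
      storage_class: str              # "mirrored" or "raid5"
      pexes: list = field(default_factory=list)

      def is_valid(self):
          # A PEG needs at least three PEXes on three different disks.
          return len(self.pexes) >= 3 and len({p.disk for p in self.pexes}) >= 3

  peg = PEG("raid5", [PEX(disk=d, offset=0) for d in (0, 1, 2)])
  assert peg.is_valid()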

15
LOGICAL DATA LAYOUT
  • The logical allocation and migration unit is the
    Relocation Block (RB)
  • Size in the prototype was 64 KB
  • Smaller RBs require more mapping information, while
    larger RBs increase migration costs after small
    updates (see the arithmetic below)
  • Each PEG holds a fixed number of RBs
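
Illustrative arithmetic for this trade-off, in Python: halving the RB size
doubles the number of map entries. The total capacity matches the twelve 2 GB
disks mentioned later, but the 8-byte entry size is an assumption, not a
figure from the paper.

  capacity = 12 * 2 * 10**9        # twelve 2 GB disks
  entry_bytes = 8                  # assumed size of one mapping entry

  for rb_kb in (16, 64, 256):
      entries = capacity // (rb_kb * 1024)
      print(f"RB = {rb_kb:3d} KB -> {entries:>9,} map entries "
            f"(~{entries * entry_bytes / 2**20:.1f} MB of map)")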

16
MAPPING STRUCTURES
  • Map addresses from virtual volumes to PEGs, PEXes
    and physical disk addresses
  • Optimized for quickly finding the physical address
    of an RB given its logical address (sketched below)
  • Each logical unit has a virtual device table
    listing all RBs in the logical unit and pointing
    to their PEGs
  • Each PEG has a PEG table listing all RBs in the
    PEG and the PEXes used to store them
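
A minimal Python sketch of the two-level lookup these tables support. The
64 KB RB size is from the prototype; the table contents and identifiers
("peg7", "pex12", ...) are invented for illustration.

  RB_SIZE = 64 * 1024                     # relocation block size (prototype)

  # virtual device table: RB index within the logical unit -> (PEG id, slot)
  virtual_device_table = {0: ("peg7", 0), 1: ("peg7", 1), 2: ("peg3", 5)}

  # PEG tables: for each PEG, slot -> (PEX id, offset of the RB in that PEX)
  peg_tables = {
      "peg7": {0: ("pex12", 0), 1: ("pex12", RB_SIZE)},
      "peg3": {5: ("pex41", 3 * RB_SIZE)},
  }

  def resolve(logical_byte_addr):
      """Translate a logical byte address to (PEX id, byte offset in PEX)."""
      rb_index, rb_offset = divmod(logical_byte_addr, RB_SIZE)
      peg_id, slot = virtual_device_table[rb_index]
      pex_id, pex_offset = peg_tables[peg_id][slot]
      return pex_id, pex_offset + rb_offset

  print(resolve(2 * RB_SIZE + 100))       # -> ('pex41', 3 * RB_SIZE + 100)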

17
NORMAL OPERATIONS (I)
  • Requests are sent to the controller in SCSI
    Command Descriptor Blocks (CDBs)
  • Up to 32 CDBs can be simultaneously active, with
    2,048 more queued
  • Long requests are broken into 64 KB segments (see
    the sketch below)
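
A small Python sketch of that request splitting. The 64 KB chunk size is from
the slide; the function itself is just for illustration and ignores any
alignment to RB boundaries.

  SEGMENT = 64 * 1024          # long host requests are split into 64 KB pieces

  def split_request(start, length, segment=SEGMENT):
      """Yield (offset, size) pieces covering [start, start + length)."""
      end = start + length
      while start < end:
          size = min(segment, end - start)
          yield start, size
          start += size

  # A 200 KB read starting at byte 10,000 becomes four back-end requests.
  print(list(split_request(10_000, 200 * 1024)))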

18
NORMAL OPERATIONS (II)
  • Read requests (sketched below)
  • First check whether the data are already in the
    read cache or in the non-volatile write cache
  • Otherwise allocate cache space and issue one or
    more requests to the back-end storage classes
  • Write requests return as soon as the data have
    been updated in the non-volatile write cache
  • The cache uses a delayed-write policy
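
A schematic Python sketch of this read path. Plain dictionaries stand in for
the DRAM read cache, the NVRAM write cache and the back-end storage classes;
all names and the data values are invented.

  read_cache, nvram_cache, back_end = {}, {}, {}

  def read(addr):
      data = read_cache.get(addr)              # 1. already in the read cache?
      if data is None:
          data = nvram_cache.get(addr)         # 2. in the NVRAM write cache?
      if data is None:
          data = back_end[addr]                # 3. fetch from the mirrored or
          read_cache[addr] = data              #    RAID 5 class, cache the result
      return data

  back_end["rb0"] = b"cold data"
  nvram_cache["rb1"] = b"recently written"
  print(read("rb0"), read("rb1"), read("rb0"))   # last call hits the read cache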

19
NORMAL OPERATIONS (III)
  • Flushing data from the cache can involve
  • A back-end write to a mirrored storage class, or
  • Promotion of the RB from RAID 5 to mirrored
    storage before the write (sketched below)
  • Mirrored reads and writes are straightforward
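
Continuing the toy model, a sketch of the flush decision. The slides only say
that a flush may first promote the RB from RAID 5 to mirrored storage; the
data structures and names here are invented.

  # Dirty data in the NVRAM cache is written to the mirrored storage class;
  # if the RB currently lives in RAID 5 it is promoted first.
  mirrored_class, raid5_class = {}, {"rb2": b"old contents"}

  def flush(rb_id, dirty_data):
      if rb_id in raid5_class:                 # promote before the write
          mirrored_class[rb_id] = raid5_class.pop(rb_id)
      mirrored_class[rb_id] = dirty_data       # back-end write, now mirrored

  flush("rb2", b"new contents")
  print("rb2" in raid5_class, mirrored_class["rb2"])   # False b'new contents'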

20
NORMAL OPERATIONS (IV)
  • RAID 5 reads are straightforward
  • RAID 5 writes can be done
  • On a per-RB basis: requires two reads and two
    writes (see the parity update below)
  • As batched writes: more complex but cheaper
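
The per-RB cost follows from the standard RAID 5 small-write parity update,
sketched here in Python: read the old data and old parity (two reads), then
write the new data and the recomputed parity (two writes), using
new_parity = old_parity XOR old_data XOR new_data. The byte values are
arbitrary.

  def xor(a, b):
      return bytes(x ^ y for x, y in zip(a, b))

  old_data   = bytes([0x0F] * 4)     # read 1: old contents of the data block
  old_parity = bytes([0x55] * 4)     # read 2: old contents of the parity block
  new_data   = bytes([0xF0] * 4)

  new_parity = xor(xor(old_parity, old_data), new_data)
  # write 1: new_data to the data block; write 2: new_parity to the parity block
  print(new_parity.hex())            # 'aaaaaaaa': 0x55 ^ 0x0F ^ 0xF0 = 0xAA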

21
BACKGROUND OPERATIONS
  • Triggered when the array has been idle for some
    time
  • Include
  • Compaction of empty RB slots (sketched below),
  • Migration between storage classes (using an
    approximate LRU algorithm) and
  • Load balancing between disks
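
A toy Python sketch of hole compaction; the layout and policy are invented for
illustration: RBs from a sparsely used PEG are moved into free slots elsewhere
so that the emptied PEG's PEXes can be reclaimed.

  # Each PEG is modelled as a fixed number of slots; None marks an empty slot.
  pegs = {
      "peg1": ["rb0", None, "rb2", None],      # sparsely used
      "peg2": ["rb4", "rb5", None, None],      # has free slots
  }

  def compact(src, dst):
      """Move every RB out of pegs[src] into free slots of pegs[dst]."""
      for i, rb in enumerate(pegs[src]):
          if rb is not None:
              j = pegs[dst].index(None)        # first free slot (capacity check
              pegs[dst][j] = rb                # omitted in this toy version)
              pegs[src][i] = None
      return all(slot is None for slot in pegs[src])   # True: PEG reclaimable

  print(compact("peg1", "peg2"), pegs)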

22
MONITORING
  • System also includes
  • An I/O logging tool and
  • A management tool for analyzing the array
    performance

23
PERFORMANCE RESULTS (I)
  • HP AutoRAID configuration with
  • 16 MB of controller data cache
  • Twelve 2.0 GB Seagate Barracuda disks (7,200 rpm)
  • Compared with
  • Data General RAID array with 64 MB front-end
    cache
  • Eleven individual disk drives implementing disk
    striping but without any redundancy

24
PERFORMANCE RESULTS (II)
  • Results on an OLTP database workload
  • AutoRAID was better than the RAID array and
    comparable to the set of non-redundant drives
  • But the whole database was stored in mirrored
    storage!
  • Microbenchmarks
  • AutoRAID is always better than the RAID array but
    achieves lower I/O rates than the set of drives

25
SIMULATION RESULTS (I)
  • Increasing the disk speed improves the
    throughput
  • Especially if density remains constant
  • Transfer rates matter more than rotational
    latency
  • 64 KB seems to be a good size for the Relocation
    Blocks
  • Around the size of a disk track

26
SIMULATION RESULTS (II)
  • The best heuristic for selecting which mirrored
    copy to read is shortest queue (sketched below)
  • Allowing write-cache overwrites has a HUGE impact
    on performance
  • RBs demoted to RAID 5 should fill existing holes
    when the system is not too heavily loaded
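
A minimal Python sketch of the shortest-queue read heuristic; the queue depths
are made-up numbers.

  # Pick whichever disk holding a copy of the mirrored RB has the shortest
  # queue of outstanding requests.
  queue_depth = {0: 3, 1: 7, 2: 1, 3: 4}      # outstanding requests per disk

  def pick_mirror_copy(disks_with_copy):
      return min(disks_with_copy, key=queue_depth.get)

  print(pick_mirror_copy([0, 3]))             # -> 0 (queue of 3 beats 4)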

27
SUMMARY (I)
  • The system is very easy to set up
  • Dynamic adaptation is a big win, but it will not
    work for all workloads
  • The software is what makes AutoRAID, not the
    hardware
  • Being auto-adaptive makes AutoRAID hard to
    benchmark

28
SUMMARY (II)
  • Future work includes
  • System tuning, especially
  • Idle period detection
  • Front-end cache management algorithms
  • Developing better techniques for synthesizing
    traces