EC/MIS 619 - PowerPoint PPT Presentation

Provided by: cretsonl

Transcript and Presenter's Notes

1
EC/MIS 619
  • RAID Data Protection

2
RAID
  • RAID: a redundant array of inexpensive disks
  • Uses multiple fixed-disk drives, high-speed
    controllers and special software drivers to
    control the safety of your data and improve the
    performance of a fixed-disk system.
  • Commercial RAID systems typically use the Small
    Computer System Interface (SCSI, pronounced
    "scuzzy").

3
  • RAID levels 1 and higher protect your data by
    storing redundant information on multiple disks:
    a mirror copy in RAID 1, calculated parity
    information in the higher levels.
  • This redundancy allows one drive to fail without
    causing the array itself to fail.
  • RAID 1 can also increase read performance: the
    same data is held on more than one drive, so it
    can be retrieved from whichever drive has its
    read head(s) closest to the data.

4
RAID Levels
  • Several levels of RAID exist. RAID 0 increases
    read and write performance but does not provide
    data protection.
  • RAID was initially designed for mainframe and
    microcomputers. Until recently its deployment was
    limited by price.
  • Plummeting disk prices have made RAID more
    common of late.
  • Disk drives provide two functions: read and
    write. A RAID level may be chosen to optimize
    either function.

5
RAID 0
  • High performance, zero redundancy array.
  • Isn't truly RAID at all.
  • Data is lost if one drive fails.
  • It stripes blocks of data across multiple disks
    to improve subsystem throughput.
  • RAID 0 is used for applications needing the
    highest possible read and write rates.

6
RAID 0 with 2 disks
Disk 1: Sector 1, Sector 3, Sector 5, Sector 7, Sector 9
Disk 2: Sector 2, Sector 4, Sector 6, Sector 8, Sector 10

7
  • RAID 0 is important because the same striping
    mechanism used in RAID 0 is used to improve
    performance in other RAID levels.
  • RAID 0 is inexpensive because:
    • no additional disk space is needed for parity
      data
    • it uses simple algorithms that don't add much
      overhead or require a dedicated processor.
  • RAID 0 uses striping to store data:
    • data blocks are alternately written to the
      different physical drives that make up the
      logical drive used by the array.
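
The alternating placement described above can be sketched in a few lines of Python. This is only an illustration: the function name `locate_block`, the zero-based numbering, and the one-block stripe unit are assumptions, not part of the slides.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    disk = logical_block % num_disks      # blocks alternate across the drives
    offset = logical_block // num_disks   # position of the block on that drive
    return disk, offset


# Ten logical blocks on a two-disk array, numbered from 0.
layout = [locate_block(block, 2) for block in range(10)]
for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because consecutive blocks land on different drives, a request that spans several blocks can be served by all drives at once.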

8
  • Each read request is directed to the individual
    drive on which the requested block is stored.
  • Multiple read requests are generated serially,
    but striping allows the data transfers to occur
    in parallel.
  • Overall read time is significantly reduced.
  • Efficiency is influenced by the size of the data
    blocks.

9
RAID 1 with 2 disks
File Data: Sector 1, Sector 2, Sector 3, Sector 4, Sector 5
Disk 1: Sector 1, Sector 2, Sector 3, Sector 4, Sector 5
Disk 2: Sector 1, Sector 2, Sector 3, Sector 4, Sector 5

10
RAID 1
  • We avoid losing data by making copies of it.
  • RAID 1 provides 100% redundancy - if a disk is
    lost in a RAID 1 array, there's another drive
    with an exact duplicate of the failed drive's
    contents.
  • RAID 1 offers the highest level of redundancy
    (but at the highest cost).

11
  • Mirroring - each drive has a twin drive. Write
    functions take place simultaneously.
  • Disadvantages
    • twice as many disks are needed for storage
    • slow writes - because of the overhead
      introduced by the need to write to twin drives
      and maintain coherency of their contents.
  • Advantages
    • duplicates mean data loss is less likely
    • faster reads - reads can be made from the
      drive whose head is closest to the data.

12
  • Duplexing is similar to mirroring but adds a
    second host adapter to control the second set of
    drives.
  • Introduces the cost of a second adapter.
  • Duplexing with two cards eliminates the host
    adapter as a single point of failure.
  • In a large server, duplicating every disk gets
    very expensive. Costs are reasonable for smaller
    systems.

13
  • Read performance is faster than that of a
    standalone drive.
  • Offers two read alternatives:
    • Circular queue or round-robin scheduling:
      reads are alternated between the two physical
      drives, with each drive handling every second
      request.
    • Geometric, regional, or assigned circular
      scheduling: each of the two drives is given
      responsibility for only half of the disk area,
      so head positioning time is minimized.
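
The two read alternatives can be contrasted with a small Python sketch. It is a simplified model under stated assumptions: the class `MirroredPair`, its method names, and the rule "lower half of the block range goes to drive 0" are hypothetical and only illustrate the idea.

```python
class MirroredPair:
    """Chooses which of two mirrored drives should serve a read."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        self._next_drive = 0  # state for round-robin scheduling

    def round_robin(self, block: int) -> int:
        """Alternate drives regardless of where the block lives."""
        drive = self._next_drive
        self._next_drive = 1 - self._next_drive
        return drive

    def regional(self, block: int) -> int:
        """Assign each drive one half of the block range (assumed split)."""
        return 0 if block < self.total_blocks // 2 else 1


pair = MirroredPair(total_blocks=100)
requests = [7, 90, 3, 55]
print([pair.round_robin(b) for b in requests])  # drives alternate
print([pair.regional(b) for b in requests])     # drive depends on region
```

With regional scheduling each head stays within its own half of the disk, which is what keeps positioning time low.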

14
  • RAID 1 write performance is more problematic.
    Data has to be written twice (like going through
    the checkout line twice).
  • It is this double write that provides the high
    level of data safety.
  • Overall RAID 1 performance: in most server
    environments, reads greatly outnumber writes.
    Therefore any factor that benefits read
    performance at the expense of write performance
    will improve server performance most of the time.

15
  • Secondly, writing to two drives does not cut
    write performance in half. Performance is cut by
    10-20%.
  • Write requests are generated serially, but the
    actual writes are performed in parallel. The
    extra time for the second write has little impact
    on total time.
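
As a rough illustration of "requests generated serially, writes performed in parallel", the sketch below submits one write per drive and then waits for both. The drives are modelled as plain dictionaries and `mirrored_write` is a hypothetical helper; a real controller does this in hardware or in a driver.

```python
from concurrent.futures import ThreadPoolExecutor


def mirrored_write(drives: list[dict], block: int, data: bytes) -> None:
    """Issue the same write to every drive, then wait for all to finish."""
    with ThreadPoolExecutor(max_workers=len(drives)) as pool:
        # Requests are submitted one after another (serially)...
        futures = [pool.submit(drive.__setitem__, block, data) for drive in drives]
        # ...but the writes themselves may proceed at the same time.
        for future in futures:
            future.result()


drives = [{}, {}]  # two in-memory stand-ins for the twin drives
mirrored_write(drives, block=7, data=b"payload")
print(drives[0] == drives[1])
```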

16
RAID 2
  • Distributes data at the bit level
  • Uses multiple bits to store parity data and uses
    a large number of individual drives
  • Large amount of processing overhead makes it
    unsuitable for database servers.
  • RAID 2 is useful for special purpose servers such
    as digital video servers.

17
RAID 3
  • Distributes data at the byte level
  • Dedicates one drive for storage of parity bits
  • In a 4-disk system, the first three disks store
    data and the fourth handles the parity bits.
  • RAID 3 is optimized for special purpose servers
    such as digital video servers and is
    inappropriate for random-access environments such
    as PC LANs.
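
A minimal Python sketch of the 4-disk arrangement, assuming XOR parity (the usual choice for a single parity drive) and zero padding of the last stripe. The function name `stripe_with_parity` is made up for this example.

```python
from functools import reduce


def stripe_with_parity(data: bytes, data_disks: int = 3) -> tuple[list[bytes], bytes]:
    """Spread data byte by byte over the data disks and build the parity stream."""
    # Pad with zero bytes so every disk receives the same number of bytes.
    padded = data + bytes(-len(data) % data_disks)
    disks = [padded[i::data_disks] for i in range(data_disks)]
    # Each parity byte is the XOR of the bytes at the same position on each disk.
    parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))
    return disks, parity


disks, parity = stripe_with_parity(b"RAID3!")
print(disks)        # three data disks, two bytes each
print(parity.hex())  # contents of the dedicated parity disk
```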

19
RAID 4
  • Similar to RAID 3, except it stripes data at the
    block rather than byte level - providing better
    read performance than RAID 3.
  • The small chunk size of RAID 3 means every read
    requires participation from every disk in the
    array. As such, RAID 3 is referred to as being
    synchronized or coupled.
  • RAID 4's larger chunk size means a single disk
    may be accessed on its own (unsynchronized or
    decoupled).
  • RAID 4 is not for PC LANs.
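
The effect of chunk size on how many disks a read touches can be checked with a short calculation. The helper `disks_touched`, the 512-byte read, and the 4096-byte block size are illustrative assumptions.

```python
def disks_touched(offset: int, length: int, chunk_size: int, data_disks: int) -> set[int]:
    """Return the data disks holding any byte of [offset, offset + length)."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return {chunk % data_disks for chunk in range(first_chunk, last_chunk + 1)}


# A 512-byte read on an array with three data disks.
byte_striped = disks_touched(0, 512, chunk_size=1, data_disks=3)      # RAID 3 style
block_striped = disks_touched(0, 512, chunk_size=4096, data_disks=3)  # RAID 4 style
print(sorted(byte_striped), sorted(block_striped))
```

With byte-sized chunks every disk takes part in the read; with block-sized chunks the same read stays on one disk, leaving the others free for other requests.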

20
RAID 5
21
  • Most common RAID for PC LANs
  • Parity data is striped across all disks taking up
    the equivalent of one disk.
  • The equivalent of one drive is unavailable to the
    operating system.
  • RAID 5 is optimized for transaction-processing
    activity, in which users frequently read and
    write small amounts of data.
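
The "equivalent of one drive" cost can be expressed as a small capacity calculation. This sketch assumes equal-sized disks and treats RAID 1 as mirrored pairs; the function name `usable_capacity` is hypothetical.

```python
def usable_capacity(level: int, disks: int, disk_size_gb: float) -> float:
    """Capacity visible to the operating system, in GB."""
    if level == 0:
        return disks * disk_size_gb          # no redundancy at all
    if level == 1:
        return disks * disk_size_gb / 2      # every disk has a twin
    if level in (3, 4, 5):
        return (disks - 1) * disk_size_gb    # one disk's worth holds parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")


for level, disks in [(0, 4), (1, 4), (5, 4)]:
    print(f"RAID {level}, {disks} x 100 GB -> {usable_capacity(level, disks, 100)} GB")
```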

22
  • RAID 5 read performance: reads do not need to
    access parity information unless one or more
    stripes are unreadable.
  • Both data and parity bits are optimized for
    sequential reads.
  • Allows parallel reads and writes, attaining
    better performance on random reads.
  • Matches RAID 0 on sequential reads.

23
  • RAID 5 write performance: writing is more
    problematic.
  • A write requires 4 disk operations:
    • one read for the existing data and another for
      the existing parity information. Parity
      information is recalculated based on the reads
      and the pending write. Two writes are then
      performed (one for data, one for parity info).
  • This compares to one operation for RAID 0.
  • The array must guard against a system failure
    that occurs after the data is written but before
    the parity information is written.
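
The four operations of a small write can be traced in Python. The sketch assumes XOR parity, for which the new parity equals the old parity XOR the old data XOR the new data; the stripe is modelled as a list of byte strings and `small_write` is a hypothetical name.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe: list[bytes], data_index: int, parity_index: int,
                new_data: bytes) -> int:
    """Update one data block and its parity; return the disk operations used."""
    old_data = stripe[data_index]         # operation 1: read existing data
    old_parity = stripe[parity_index]     # operation 2: read existing parity
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[data_index] = new_data         # operation 3: write new data
    stripe[parity_index] = new_parity     # operation 4: write new parity
    return 4


# Two data blocks followed by their parity block.
stripe = [b"\x01\x02", b"\x10\x20", b"\x11\x22"]
operations = small_write(stripe, data_index=0, parity_index=2, new_data=b"\xff\x00")
print(operations, stripe[2] == xor_bytes(stripe[0], stripe[1]))
```

Note that the untouched data block is never read: only the block being changed and the parity block are involved.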

24
Stacked RAID
[Diagram: file data (Sectors 1-10) is divided between Array 1 and Array 2; within each array every sector is stored twice, across Disks 1-4.]

25
Stacked RAID
  • RAID arrays are seen as a single logical disk by
    the operating system.
  • This allows us to stack arrays, using one level
    to control an array of arrays (individual disks
    are replaced by arrays operating at the same or
    different levels)

26
  • This allows us to gain the benefits of multiple
    levels while offsetting their individual
    disadvantages.
  • The high performance RAID level is visible, while
    the lower performance RAID level providing
    redundancy is hidden.
  • E.g. RAID 0/1 - combines RAID 0 striping with
    RAID 1 redundancy.
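
A RAID 0/1 stack can be sketched by striping across mirrored pairs. The disk numbering (pair k uses disks 2k and 2k+1, counted from 0) and the name `locate_mirrored_block` are assumptions made for this illustration.

```python
def locate_mirrored_block(logical_block: int, num_pairs: int) -> tuple[tuple[int, int], int]:
    """Return ((disk, twin disk), offset) for a block in a RAID 0/1 stack."""
    pair = logical_block % num_pairs      # RAID 0: stripe across the arrays
    offset = logical_block // num_pairs   # position inside the chosen array
    disks = (2 * pair, 2 * pair + 1)      # RAID 1: both disks of the pair
    return disks, offset


# Ten logical blocks on four disks arranged as two mirrored pairs.
for block in range(10):
    disks, offset = locate_mirrored_block(block, num_pairs=2)
    print(f"block {block} -> disks {disks}, offset {offset}")
```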

27
Array Selection (failure)
  • RAID 0 - failure results in data loss.
  • RAID 1 - data is read from the mirrored drive.
  • RAID 3 - parity drive failure results in no data
    loss. If a data drive fails, reads of the other
    drives plus the parity drive are used to
    reconstruct the data.
  • RAID 5 - a drive failure results in the loss of
    some data and some parity information. Reads of
    the other drives are used to reconstruct the
    data.
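
Reconstruction from the surviving drives can be sketched as follows, assuming XOR parity: the missing block of a stripe is the XOR of all the blocks that remain. The helper names `xor_blocks` and `rebuild` and the sample bytes are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe: list[bytes], failed: int) -> bytes:
    """Recompute the block of the failed drive from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
stripe = data + [xor_blocks(data)]   # three data blocks plus one parity block

print(rebuild(stripe, failed=1) == data[1])    # a data drive failed
print(rebuild(stripe, failed=3) == stripe[3])  # the parity drive failed
```

The same routine covers both cases in the list above: a lost data block and a lost parity block are each recovered from the remaining drives.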

28
Specifying RAID Implementation
  • The decision is made based on:
    • the type of data to be stored (large vs. small
      writes)
    • the importance of performance vs. safety of
      data
    • cost
  • E.g.:
    • RAID 0 offers high performance.
    • RAID 1 offers redundancy.
    • RAID 3 offers good performance and safety but
      is poor for PC LANs.
    • Stacked RAID offers good performance and safety
      but is expensive.