Reliability of Disk Systems - PowerPoint PPT Presentation

Slides: 26
Provided by: scie232
Transcript and Presenter's Notes

1
Reliability of Disk Systems
2
Reliability
  • So far, we looked at ways to improve the
    performance of disk systems.
  • Next, we will look at ways to improve the
    reliability of disk systems.
  • What is reliability?
  • Essentially, it is the availability of data when
    there is a disk failure of some sort.
  • This is achieved at the cost of some redundancy:
    redundant data and/or disks.

3
Intermittent Failures
  • In an intermittent failure, we may get several
    bad reads, for example, but with repeated
    attempts we may eventually get a good one.
  • Disk sectors are stored with some redundant bits
    that can be used to tell us if an I/O operation
    was successful.
  • For writes, we may also want to check the status.
  • We can, of course, re-read the sector and compare
    it to the original, but this is expensive.
  • Instead, we simply re-read the sector and check
    the status bits.

4
Checksums for failure detection
  • A useful tool for status validation is the
    checksum: one or more bits that, with high
    probability, verify the correctness of the
    operation.
  • The checksum is written by the disk controller.
  • A simple form of checksum is the parity bit.
  • Here, a bit is added to the data so that the
    number of 1s among the data bits plus the parity
    bit is always even.
  • A disk read (per sector) would return a status
    value of "good" if the bit string has an even
    number of 1s; otherwise, the status is "bad".
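
The even-parity check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a disk controller implements it; the names `add_parity` and `status` are hypothetical.

```python
def add_parity(data_bits: str) -> str:
    """Append one parity bit so that the total number of 1s is even."""
    parity = data_bits.count("1") % 2
    return data_bits + str(parity)


def status(stored_bits: str) -> str:
    """Return 'good' if data plus parity bit contain an even number of 1s."""
    return "good" if stored_bits.count("1") % 2 == 0 else "bad"


sector = add_parity("01110110")   # five 1s in the data, so the parity bit is 1
flipped_one = "1" + sector[1:]    # corrupt the first bit only
flipped_two = "10" + sector[2:]   # corrupt the first two bits

print(status(sector))       # good
print(status(flipped_one))  # bad: a single-bit error is detected
print(status(flipped_two))  # good: a two-bit error goes unnoticed
```

The last line shows the weakness of a single parity bit: any even number of flipped bits leaves the parity unchanged.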

5
(Interleaved) Parity bits
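  • It is possible that more than one bit in a sector
    is corrupted.
  • With a single parity bit, such errors may not be
    detected.
  • Suppose bits are corrupted randomly. The
    probability of an undetected error (i.e. an even
    number of 1s) is then 50%. (Why?)
  • Let's have 8 parity bits, one for each bit
    position of the bytes in the sector:
      01110110
      11001101
      00001111
      10110100   (parity byte: the XOR of the three
                  bytes above)
  • The probability of an undetected error is then
    1/2^8 = 1/256.
  • With n parity bits, the probability of an
    undetected error is 1/2^n.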
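
As a rough sketch of interleaved parity, the snippet below treats the first three bytes listed on this slide as data and computes one parity bit per bit position. The helper names `parity_byte` and `status` are hypothetical.

```python
from functools import reduce
from operator import xor


def parity_byte(data_bytes):
    """Bit i of the result is the even-parity bit over bit i of every byte."""
    return reduce(xor, data_bytes, 0)


def status(data_bytes, stored_parity):
    """'good' if all eight parity bits agree with the data, else 'bad'."""
    return "good" if parity_byte(data_bytes) == stored_parity else "bad"


data = [0b01110110, 0b11001101, 0b00001111]
parity = parity_byte(data)
print(f"{parity:08b}")  # 10110100

# Two flipped bits in different positions are now caught ...
two_positions = [data[0] ^ 0b00000011, data[1], data[2]]
print(status(two_positions, parity))  # bad

# ... but two flips in the same bit position still cancel out.
same_position = [data[0] ^ 0b00000001, data[1] ^ 0b00000001, data[2]]
print(status(same_position, parity))  # good
```

An error goes undetected only if every one of the eight positions sees an even number of flips, which is where the 1/2^8 figure for random corruption comes from.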

6
Recovery from disk crashes
  • Mean time to failure (MTTF): the time by which
    50% of the disks have crashed, typically 10 years.
  • Simplified (assuming this happens linearly):
      • in the 1st year, 5% of the disks fail,
      • in the 2nd year, 5%,
      • ...
      • in the 20th year, 5%.
  • However, the mean time to a disk crash doesn't
    have to be the same as the mean time to data
    loss; there are solutions.

7
Redundant Array of Independent Disks, RAID
  • RAID 1: mirror each disk (data/redundant disks).
  • If a disk fails, restore it using the mirror.
  • Assume:
      • a 5% failure rate per year, i.e. MTTF = 10 years
        (for disks);
      • 3 hours to replace and restore a failed disk.
  • If one disk fails, then the other had better not
    fail in the next three hours.
  • Probability of that second failure:
    5% × 3/(24 × 365) = 1/58,400.
  • If one disk fails every 10 years, then one of two
    will fail every 5 years.
  • One in 58,400 of those failures results in data
    loss, so MTTF = 5 × 58,400 = 292,000 years. This
    is the mean time to failure for the data.
  • Drawback: we need one redundant disk for each
    data disk.

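
The arithmetic on this slide can be reproduced exactly with Python's `fractions` module. The variable names are illustrative; the inputs (5% per year, 10-year disk MTTF, 3-hour repair window) are the slide's own assumptions.

```python
from fractions import Fraction

annual_failure_rate = Fraction(5, 100)  # 5% of disks fail per year
disk_mttf_years = 10                    # as assumed on the slide
repair_hours = 3
hours_per_year = 24 * 365

# Chance that the mirror also fails during the 3-hour repair window.
p_second_failure = annual_failure_rate * Fraction(repair_hours, hours_per_year)

# With two disks, some disk fails twice as often as a single disk does.
years_between_failures = Fraction(disk_mttf_years, 2)

# Only one in 58,400 of those failures loses data.
data_mttf_years = years_between_failures / p_second_failure

print(p_second_failure)  # 1/58400
print(data_mttf_years)   # 292000
```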
8
RAID 4
  • RAID 4: one redundant disk only.
  • n data disks + 1 redundant disk (for any n).
  • We'll refer to the expression x ⊕ y as the
    modulo-2 sum of x and y (XOR).
  • E.g. 11110000 ⊕ 10101010 = 01011010.
  • Now, each block in the redundant disk holds the
    modulo-2 sum of the corresponding blocks in the
    other disks:
      i-th block of disk 1:         11110000
      i-th block of disk 2:         10101010
      i-th block of disk 3:         00111000
      i-th block of redundant disk: 01100010
  • In effect this is just a distributed form of the
    block-interleaved parity discussed earlier.
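
A minimal sketch of how the redundant block is derived, using the three data blocks from this slide. Blocks are modelled as 8-bit integers purely for illustration; real blocks are much larger, but the XOR is applied the same way.

```python
from functools import reduce
from operator import xor

# i-th block of data disks 1, 2 and 3, as on the slide.
data_blocks = [0b11110000, 0b10101010, 0b00111000]

# The redundant disk stores the modulo-2 sum (XOR) of the data blocks.
redundant_block = reduce(xor, data_blocks)

print(f"{0b11110000 ^ 0b10101010:08b}")  # 01011010
print(f"{redundant_block:08b}")          # 01100010
```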

9
Properties of XOR (⊕)
  • Commutativity: x ⊕ y = y ⊕ x
  • Associativity: x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
  • Identity: x ⊕ 0 = 0 ⊕ x = x (0 is the all-zeros
    vector)
  • Self-inverse: x ⊕ x = 0
  • As a useful consequence, if x ⊕ y = z, then we can
    add x to both sides and get y = x ⊕ z.
  • More generally, if
      0 = x_1 ⊕ ... ⊕ x_n
    then adding x_i to both sides, we get
      x_i = x_1 ⊕ ... ⊕ x_(i-1) ⊕ x_(i+1) ⊕ ... ⊕ x_n
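
These properties are easy to confirm by brute force on short bit vectors. The sketch below checks every combination of 4-bit values; the function name `check_xor_properties` is hypothetical, and an exhaustive check on small vectors is a sanity test rather than a proof.

```python
from itertools import product


def check_xor_properties(bits: int = 4) -> bool:
    """Exhaustively test the XOR properties on all `bits`-bit vectors."""
    values = range(2 ** bits)
    for x, y, z in product(values, repeat=3):
        assert x ^ y == y ^ x              # commutativity
        assert x ^ (y ^ z) == (x ^ y) ^ z  # associativity
        assert x ^ 0 == x and 0 ^ x == x   # identity
        assert x ^ x == 0                  # self-inverse
        # Consequence: from x ^ y == s we recover y as x ^ s.
        s = x ^ y
        assert y == x ^ s
    return True


print(check_xor_properties())  # True
```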

10
Failure recovery in RAID 4
  • We must be able to restore whatever disk crashes.
  • Just compute the modulo-2 sum of the corresponding
    blocks of the other disks.
  • Use the equation
      x_i = x_1 ⊕ ... ⊕ x_(i-1) ⊕ x_(i+1) ⊕ ... ⊕ x_n
  • Example:
      i-th block of disk 1:         11110000
      i-th block of disk 2:         10101010
      i-th block of disk 3:         00111000
      i-th block of redundant disk: 01100010
  • Suppose disk 2 crashes. Compute its block by
    taking the modulo-2 sum of the rest:
      11110000 ⊕ 00111000 ⊕ 01100010 = 10101010
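
The recovery step can be sketched as follows, using the slide's example with disk 2 lost. The function name `rebuild` and the use of `None` to mark the crashed disk are assumptions made for this illustration.

```python
from functools import reduce
from operator import xor


def rebuild(blocks, failed):
    """Recover the block at index `failed` as the XOR of all other blocks."""
    survivors = [b for i, b in enumerate(blocks) if i != failed]
    return reduce(xor, survivors)


# Disks 1-3 are data, the last entry is the redundant disk.
# Disk 2 (index 1) has crashed, so its block is unknown.
stripe = [0b11110000, None, 0b00111000, 0b01100010]

recovered = rebuild(stripe, failed=1)
print(f"{recovered:08b}")  # 10101010
```

The same function works unchanged when the redundant disk itself is the one that failed.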
11
RAID 4 (Cont'd)
  • Reading: as usual.
  • Interesting possibility: if we want to read from
    disk i, but it is busy and all other disks are
    free, then we can instead read the corresponding
    blocks from all other disks and take their
    modulo-2 sum.
  • Writing:
      • write the block;
      • update the redundant block.
12
How do we get the value for the redundant block?
  • Naively: read the corresponding blocks of all the
    other data disks and recompute the sum.
  • → n+1 disk I/Os, which is
      • n-1 block reads,
      • 1 data block write,
      • 1 redundant block write.
  • Better: how?

13
How do we get the value for the redundant block?
  • Better: to write block j of data disk i with the
    new value v:
      • Read the old value of that block, say o.
      • Read the j-th block of the redundant disk, say r.
      • Compute w = v ⊕ o ⊕ r.
      • Write v in block j of disk i.
      • Write w in block j of the redundant disk.
  • Total: 4 disk I/Os (true for any number of data
    disks).
  • Problem: why does this work?
  • Intuition: v ⊕ o is the change to the parity, and
    the redundant disk must change to compensate.

14
Example
      i-th block of disk 1:         11110000
      i-th block of disk 2:         10101010
      i-th block of disk 3:         00111000
      i-th block of redundant disk: 01100010
  • Suppose we change 10101010 into 01101110.
  • New redundant block from the update rule
    (o ⊕ v ⊕ r):
        10101010   (old value o)
        01101110   (new value v)
        01100010   (old redundant block r)
        --------
        10100110
  • Check by recomputing the sum over all data disks:
        11110000
        01101110
        00111000
        --------
        10100110
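
The four-I/O update rule and the example above can be checked with a short sketch. The function name `updated_parity` is hypothetical; the values are the ones from the example.

```python
from functools import reduce
from operator import xor


def updated_parity(old_value, new_value, old_parity):
    """New redundant block after one data block changes: w = v XOR o XOR r."""
    return new_value ^ old_value ^ old_parity


data = [0b11110000, 0b10101010, 0b00111000]
parity = reduce(xor, data)  # 01100010

new_value = 0b01101110      # replaces the block on disk 2
new_parity = updated_parity(data[1], new_value, parity)

# Cross-check against a full recomputation over all data disks.
data[1] = new_value
full_recompute = reduce(xor, data)

print(f"{new_parity:08b}")           # 10100110
print(new_parity == full_recompute)  # True
```

The shortcut needs only the old data block and the old redundant block, which is why the cost stays at two reads and two writes however many data disks there are.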

15
RAID 5
  • RAID 4 problem: the redundant disk is involved
    in every write → bottleneck!
  • Solution is RAID 5: vary the redundant disk for
    different blocks.
  • Example: with n disks, block j is redundant on
    disk i if i = j mod n.
  • Example: n = 4, so there are 4 disks.
  • The first disk, numbered 0, would be the redundant
    one when considering cylinders numbered 0, 4, 8,
    12, etc. (because they leave remainder 0 when
    divided by 4).
  • The disk numbered 1 would be the redundant one for
    cylinders numbered 1, 5, 9, etc.
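
A small sketch of this rotation rule, assuming the slide's rule is read as "disk number = block (or cylinder) number mod n". The function name `redundant_disk` is hypothetical.

```python
def redundant_disk(block: int, n: int) -> int:
    """Disk (numbered from 0) that holds the redundant copy for `block`."""
    return block % n


n = 4
for disk in range(n):
    blocks = [j for j in range(13) if redundant_disk(j, n) == disk]
    print(disk, blocks)
# 0 [0, 4, 8, 12]
# 1 [1, 5, 9]
# 2 [2, 6, 10]
# 3 [3, 7, 11]
```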

16
RAID 5 (Cont'd)
  • The reading/writing load for each disk is the
    same.
  • In one block write, what's the probability that a
    given disk is involved?
  • Each of the n+1 disks has probability 1/(n+1) of
    holding the block being written.
  • If not, i.e. with probability n/(n+1), it has a
    1/n chance of holding the redundant block for
    that block number.
  • So, each disk is involved in
    1/(n+1) × 1 + (n/(n+1)) × (1/n) = 2/(n+1) of the
    writes (1/2 of them for four disks).
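
The 2/(n+1) figure can be confirmed by enumeration. The sketch below assumes, as the slide does, that the written block and its redundant block are equally likely to sit on any pair of distinct disks; the function name `write_share` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def write_share(num_disks: int, disk: int = 0) -> Fraction:
    """Fraction of single-block writes that touch `disk`."""
    # Every ordered pair (data disk, redundant disk) of distinct disks.
    pairs = list(permutations(range(num_disks), 2))
    touching = [p for p in pairs if disk in p]
    return Fraction(len(touching), len(pairs))


print(write_share(4))  # 1/2, i.e. 2/(n+1) with n+1 = 4 disks
```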

17
RAID 6 - for multiple disk crashes
  • Let's focus on recovering from two disk crashes.
  • Setup:
      • 7 disks, numbered 1 through 7.
      • The first 4 are data disks, and disks 5 through 7
        are redundant.
  • The relationship between data and redundant disks
    is summarized by a 3 x 7 matrix of 0's and 1's:

        Data disks   Redundant disks
        1  2  3  4   5  6  7
        --------------------
        1  1  1  0   1  0  0
        1  1  0  1   0  1  0
        1  0  1  1   0  0  1

  • The disks with 1 in a given row of the matrix are
    treated as if they were the entire set of disks
    in a RAID level 4 scheme.
  • The columns for the redundant disks have a single
    1. All columns are different. There is no all-0s
    column.
18
RAID 6 - example
  • Data disks:
      1) 11110000
      2) 10101010
      3) 00111000
      4) 01000001
  • Redundant disks:
      5) 01100010   (modulo-2 sum of disks 1, 2, 3)
      6) 00011011   (modulo-2 sum of disks 1, 2, 4)
      7) 10001001   (modulo-2 sum of disks 1, 3, 4)
  • Redundancy matrix:

        Data disks   Redundant disks
        1  2  3  4   5  6  7
        --------------------
        1  1  1  0   1  0  0
        1  1  0  1   0  1  0
        1  0  1  1   0  0  1
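
The redundant blocks in this example can be derived directly from the matrix rows. This is a minimal sketch with 8-bit integers standing in for blocks; `MATRIX` and `redundant_blocks` are names chosen for the illustration.

```python
# Rows of the redundancy matrix; columns are disks 1..7.
MATRIX = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]


def redundant_blocks(data):
    """Given the blocks of data disks 1-4, return the blocks for disks 5-7."""
    result = []
    for row in MATRIX:
        value = 0
        for disk, bit in enumerate(row[:4]):  # data columns only
            if bit:
                value ^= data[disk]
        result.append(value)
    return result


data = [0b11110000, 0b10101010, 0b00111000, 0b01000001]
for number, block in enumerate(redundant_blocks(data), start=5):
    print(f"{number}) {block:08b}")
# 5) 01100010
# 6) 00011011
# 7) 10001001
```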
19
RAID 6 Failure Recovery
  • Why is it possible to recover from two disk
    crashes?
  • Let the failed disks be a and b.
  • Since all columns of the redundancy matrix are
    different, we must be able to find some row r in
    which the columns for a and b are different.
  • Suppose that a has 0 in row r, while b has 1
    there.
  • Then we can compute the correct b by taking the
    modulo-2 sum of corresponding bits from all the
    disks other than b that have 1 in row r.
  • Note that a is not among these, so none of them
    have failed.
  • Having done so, we must recompute a, with all
    other disks available.
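
The recovery argument above can be turned into a short sketch. It uses the 3 x 7 matrix and the example blocks from the earlier RAID 6 slides; the function names, 0-based disk indices and `None` markers for crashed disks are assumptions of this illustration, not part of the slides.

```python
MATRIX = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]


def rebuild_from_row(blocks, row, target):
    """XOR all disks that have a 1 in `row`, except `target` itself."""
    value = 0
    for disk, bit in enumerate(row):
        if bit and disk != target:
            value ^= blocks[disk]
    return value


def recover_two(blocks, a, b):
    """Return a copy of `blocks` with crashed disks a and b (0-based) rebuilt."""
    blocks = list(blocks)
    for first, second in ((a, b), (b, a)):
        for row in MATRIX:
            # A row where `first` has 1 and `second` has 0 does not need `second`.
            if row[first] == 1 and row[second] == 0:
                blocks[first] = rebuild_from_row(blocks, row, first)
                # Now only `second` is missing; any row containing it will do.
                row2 = next(r for r in MATRIX if r[second] == 1)
                blocks[second] = rebuild_from_row(blocks, row2, second)
                return blocks
    raise ValueError("the two columns are identical, cannot recover")


full = [0b11110000, 0b10101010, 0b00111000, 0b01000001,
        0b01100010, 0b00011011, 0b10001001]

damaged = list(full)
damaged[0] = damaged[1] = None  # disks 1 and 2 crash
print(recover_two(damaged, 0, 1) == full)  # True
```

Because all columns are different and none is all zeros, the search for a distinguishing row succeeds for any pair of distinct disks.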

20
RAID 6: How many redundant disks?
  • The number of disks can be one less than any
    power of 2, say 2^k - 1.
  • Of these disks, k are redundant, and the
    remaining 2^k - 1 - k are data disks, so the
    redundancy grows roughly as the logarithm of the
    number of data disks.
  • For any k, we can construct the redundancy matrix
    by writing all possible columns of k 0's and 1's,
    except the all-0's column.
  • The columns with a single 1 correspond to the
    redundant disks, and the columns with more than
    one 1 are the data disks.
  • Note finally that we can combine RAID 6 with RAID
    5 to reduce the performance bottleneck on the
    redundant disks.
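
The construction for general k can be sketched as below. The function name `redundancy_matrix` and the chosen column order (data disks first, then redundant disks) are assumptions; for k = 3 the result contains the same set of columns as the matrix used earlier, though not necessarily in the same order.

```python
from itertools import product


def redundancy_matrix(k: int):
    """Return (matrix, data_count, redundant_count) for 2**k - 1 disks."""
    # All columns of k bits except the all-zeros column.
    columns = [c for c in product((0, 1), repeat=k) if any(c)]
    data = [c for c in columns if sum(c) > 1]  # more than one 1
    redundant = sorted((c for c in columns if sum(c) == 1), reverse=True)
    ordered = data + redundant
    # Transpose the list of columns into k rows.
    matrix = [[col[r] for col in ordered] for r in range(k)]
    return matrix, len(data), len(redundant)


matrix, data_count, redundant_count = redundancy_matrix(3)
for row in matrix:
    print(*row)
print(data_count, redundant_count)  # 4 3
```

For k = 4 the same call describes 15 disks, of which 11 hold data and 4 are redundant.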
21
RAID level 0: Disk Striping
22
Nested levels: RAID 01
23
Nested levels: RAID 10
24
Nested levels RAID 50
  • [Diagram: one RAID 0 array striped across three
    RAID 5 arrays]


25
Nested levels RAID 60
  • [Diagram: one RAID 0 array striped across two
    RAID 6 arrays]
