Title: Operating Systems
1Operating Systems
- RAID: Redundant Array of Independent Disks
Thanks to Yoram Dahan
2Motivation -)
3What does RAID stand for?
4The Problem
- The Mean Time Between Failures (MTBF) of a
non-redundant array is roughly the MTBF of an
individual drive divided by the number of drives
in the array, assuming the drives fail
independently. Because of this, the MTBF of a
large array of drives is too low for many
application requirements.
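As a rough worked example of the rule above, the sketch below divides a drive MTBF by the number of drives. It assumes independent drives with a constant failure rate, and the 500,000-hour drive MTBF is an illustrative figure, not a value from these slides.

```python
HOURS_PER_YEAR = 24 * 365


def array_mtbf(drive_mtbf_hours: float, num_drives: int) -> float:
    """MTBF of a non-redundant array, where any single drive failure
    counts as a failure of the whole array."""
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    return drive_mtbf_hours / num_drives


if __name__ == "__main__":
    # Hypothetical drive MTBF, chosen only to make the trend visible.
    for n in (1, 4, 16, 64):
        hours = array_mtbf(500_000, n)
        print(f"{n:3d} drives: {hours:9.0f} h (~{hours / HOURS_PER_YEAR:.1f} years)")
```

Under these assumptions the array MTBF shrinks in direct proportion to the drive count, which is the motivation for adding redundancy.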
5The Solution
- Disk arrays can be made fault-tolerant by
redundantly storing information in various ways.
- Five types of array architectures, RAID-1 through
RAID-5, were defined by the Berkeley paper, each
providing disk fault-tolerance and each offering
different trade-offs in features and performance.
- In addition to these five redundant array
architectures, it has become popular to refer to
a non-redundant array of disk drives as a RAID-0
array.
6Data Striping
- Fundamental to RAID is "striping", a
method of combining multiple drives into one
logical storage unit.
- Striping involves partitioning each drive's
storage space into strips, which may be as small
as one sector (512 bytes) or as large as several
megabytes.
7Logical to physical data mapping for striping
Physical Disk 0   Physical Disk 1   Physical Disk 2   Physical Disk 3
strip 0           strip 1           strip 2           strip 3
strip 4           strip 5           strip 6           strip 7
strip 8           strip 9           strip 10          strip 11
strip 12          strip 13          strip 14          strip 15
(each row of strips across the four disks forms one stripe)
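The round-robin placement shown in this figure can be sketched as a small mapping function. This is a minimal illustration, assuming strips are numbered from 0 and dealt out to the disks in order; the function name `locate_strip` is hypothetical.

```python
def locate_strip(strip: int, num_disks: int) -> tuple[int, int]:
    """Map a logical strip number to (disk index, stripe index).

    Assumes plain round-robin striping: strip 0 goes to disk 0,
    strip 1 to disk 1, and so on, wrapping to the next stripe.
    """
    if strip < 0:
        raise ValueError("strip must be non-negative")
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return strip % num_disks, strip // num_disks


if __name__ == "__main__":
    # Reproduce the 4-disk, 16-strip layout, one stripe per line.
    for stripe in range(4):
        row = [s for s in range(16) if locate_strip(s, 4)[1] == stripe]
        print(f"stripe {stripe}: strips {row}")
```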
8RAID Idea
- Several improvements in disk-use techniques
involve the use of multiple disks working
cooperatively.
- Disk striping uses a group of disks as one
storage unit.
- RAID schemes improve the performance and the
reliability of the storage system by storing
redundant data.
- Mirroring or shadowing keeps a duplicate of each
disk.
- Block-interleaved parity uses much less
redundancy.
9RAID Common Characteristics
- A set of physical disk drives viewed by the OS as
a single logical drive.
- Data are distributed across the array of disk
drives.
- Redundant disk capacity is used to store parity
information, which guarantees data recoverability
in case of a disk failure.
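To illustrate how parity makes data recoverable, the sketch below uses bytewise XOR parity, the scheme commonly used by the parity-based RAID levels. It assumes a single failed disk whose identity is known; the helper names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving) + [parity])


if __name__ == "__main__":
    data = [b"RAID", b"disk", b"fail"]      # one block per data disk
    parity = xor_blocks(data)               # stored on the redundant disk
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

Because XOR is its own inverse, XOR-ing the parity with every surviving block leaves exactly the contents of the missing block.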
10RAID Structure
- RAID provides reliability via redundancy.
- RAID is arranged into six different levels.
11RAID Levels
12RAID 0
13RAID 0 (non-redundant)
14Data Mapping for RAID Level 0 Array
Physical Disk 0   Physical Disk 1   Physical Disk 2   Physical Disk 3
strip 0           strip 1           strip 2           strip 3
strip 4           strip 5           strip 6           strip 7
strip 8           strip 9           strip 10          strip 11
strip 12          strip 13          strip 14          strip 15
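With this RAID 0 mapping, a single logical request can touch several disks at once. The sketch below splits a logical byte range into per-disk segments; the function name and the tiny 4-byte strip size in the usage lines are illustrative assumptions, not values from the slides.

```python
def split_request(offset: int, length: int, num_disks: int,
                  strip_size: int) -> list[tuple[int, int, int]]:
    """Split a logical byte range into (disk, disk_offset, length) segments.

    Assumes round-robin striping with a fixed strip size in bytes.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if num_disks < 1 or strip_size < 1:
        raise ValueError("num_disks and strip_size must be at least 1")
    segments = []
    end = offset + length
    while offset < end:
        strip = offset // strip_size          # logical strip number
        within = offset % strip_size          # position inside that strip
        chunk = min(strip_size - within, end - offset)
        disk = strip % num_disks
        stripe = strip // num_disks
        segments.append((disk, stripe * strip_size + within, chunk))
        offset += chunk
    return segments


if __name__ == "__main__":
    # An 8-byte request starting at logical byte 2, on 4 disks.
    for segment in split_request(2, 8, num_disks=4, strip_size=4):
        print(segment)
```

A request that spans several strips is served by several drives in parallel, which is where the RAID 0 transfer-rate gain comes from.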
15RAID 1
16RAID 1 (mirrored)
17RAID 2
- RAID-2
- RAID Level 2 uses Hamming error correction codes.
- It is intended for use with drives which do not
have built-in error detection.
- All SCSI drives support built-in error detection,
so this level is of little use when using SCSI
drives.
18 RAID 2
(Redundancy through Hamming code)
[Figure: data disks b0, b1, b2, ... alongside check disks holding the Hamming code bits f0(b), f1(b), f2(b)]
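The slides do not say which Hamming code is used, so the sketch below assumes the classic Hamming(7,4) code: four data bits plus three check bits, with each bit imagined as living on its own drive. It shows how a single flipped bit can be located and corrected. All function names are hypothetical.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Check bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code: list[int]) -> tuple[list[int], int]:
    """Correct at most one flipped bit.

    Returns (data bits, syndrome); the syndrome is the 1-based position
    of the corrected bit, or 0 if no error was detected.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    damaged = list(word)
    damaged[4] ^= 1                      # flip the bit at position 5
    print(hamming74_correct(damaged))
```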
19RAID 3
- RAID-3
- RAID Level 3 stripes data at a byte level across
several drives, with parity stored on one
drive. It is otherwise similar to level 4.
- Byte-level striping requires hardware support for
efficient use.
20RAID 3 (bit-interleaved parity)
[Figure: data disks b0, b1, b2, ... and a single parity disk P(b)]
21RAID 4
- RAID-4
- RAID Level 4 stripes data at a block level across
several drives, with parity stored on one drive.
- The parity information allows recovery from the
failure of any single drive.
- The performance of a level 4 array is very good
for reads (the same as level 0).
- Writes require that parity data be updated each
time. This slows small random writes, but large
writes or sequential writes are fairly fast.
- Because only one drive in the array stores
redundant data, the cost per megabyte of a level
4 array can be fairly low.
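The small-write penalty mentioned above comes from the parity update: changing one strip means reading the old data and old parity, then writing the new data and new parity. A minimal sketch of that read-modify-write step, assuming XOR parity and using hypothetical helper names:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """new parity = old parity XOR old data XOR new data.

    Only the changed strip and the parity strip are touched; the other
    data strips in the stripe do not need to be read.
    """
    return xor_bytes(old_parity, xor_bytes(old_data, new_data))


if __name__ == "__main__":
    strips = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
    parity = xor_bytes(xor_bytes(strips[0], strips[1]), strips[2])

    new_strip = b"\x7f\x00"
    parity = updated_parity(strips[1], new_strip, parity)
    strips[1] = new_strip

    # The shortcut agrees with recomputing parity over the whole stripe.
    full = xor_bytes(xor_bytes(strips[0], strips[1]), strips[2])
    print(parity == full)
```

In a level 4 array every such update lands on the same parity drive, which is why that drive limits small random writes.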
22RAID 4 (block-level parity)
23RAID 5
- RAID-5
- RAID Level 5 is similar to level 4, but
distributes parity among the drives.
- This can speed small writes in multiprocessing
systems, since no single parity disk becomes a
bottleneck.
- Read performance is broadly comparable to a
level 4 array, since the data strips are still
spread over the drives.
- The cost per megabyte is the same as for level
4.
24 RAID 5 (block-level distributed
parity)
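One way to picture distributed parity is to rotate the parity strip to a different disk on each stripe. The sketch below assumes one simple rotation (parity starts on the last disk and moves one disk to the left per stripe); real implementations offer several layouts, and the function names here are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk index that holds the parity strip for a given stripe."""
    if stripe < 0:
        raise ValueError("stripe must be non-negative")
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (num_disks - 1) - (stripe % num_disks)


def raid5_stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label what each disk stores in one stripe: a data strip or parity."""
    p = parity_disk(stripe, num_disks)
    next_strip = stripe * (num_disks - 1)   # first data strip of this stripe
    row = []
    for disk in range(num_disks):
        if disk == p:
            row.append(f"P{stripe}")
        else:
            row.append(f"strip {next_strip}")
            next_strip += 1
    return row


if __name__ == "__main__":
    for stripe in range(4):
        print(raid5_stripe_layout(stripe, 4))
```

Over any run of `num_disks` consecutive stripes each disk holds parity exactly once, so parity updates are spread evenly instead of queuing on one drive.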
25RAID 6 (dual redundancy)
26RAID (0 1) and (1 0)
27RAID Levels
28Summary (0)
29Summary (1)
30Summary (2)
31Summary (3)
32Summary (4)
33Summary (5)
34Hardware RAID
- The hardware-based system manages the RAID
subsystem independently from the host and
presents to the host only a single disk per RAID
array. This way the host doesn't have to be aware
of the RAID subsystem(s).
- Two solutions:
- Controller-based hardware solution.
- External hardware solution (SCSI-to-SCSI RAID).
35RAID System (1)
36RAID System (2)
37The controller based hardware solution
- DPT's SCSI controllers are a good example of a
controller-based RAID solution.
- The intelligent controller manages the RAID
subsystem independently from the host.
- The advantage over an external SCSI-to-SCSI RAID
subsystem is that the controller is able to span
the RAID subsystem over multiple SCSI channels
and thereby remove the limiting factor of
external RAID solutions: the transfer rate over
the SCSI bus.
38The external hardware solution
- Solution: SCSI-to-SCSI RAID
- An external RAID box moves all RAID handling
"intelligence" into a controller that is sitting
in the external disk subsystem.
- The whole subsystem is connected to the host via
a normal SCSI controller and appears to the host
as a single disk.
39Comparison of both solutions
- The external hardware solution has drawbacks
compared to the controller-based solution.
- The single SCSI channel used in this solution
creates a bottleneck.
- Four SCSI drives can already completely flood a
SCSI bus, since the average transfer size is
around 4 KB and the command transfer overhead
(which even in Ultra SCSI is still done
asynchronously) takes most of the bus time.
40Software RAID
- RAID solution that is completely hardware
independent.
41Examples for Software RAID
- The MD driver in the Linux kernel
- The MD driver in the Linux kernel is an example
of a RAID solution that is completely hardware
independent.
- Early versions were limited and provided only
RAID level 0; later versions also support
redundant levels such as RAID 1, 4, 5, 6 and 10.
42Examples for Software RAID
- Adaptec RAID controllers
- They have no RAID functionality whatsoever on the
controller.
- They depend on external drivers to provide all
RAID functionality.
- They are basically only multiple single AHA2940
controllers which have been integrated on one
card. Linux detects them as AHA2940 and treats
them accordingly.
- Every OS needs its own special driver for this
type of RAID solution, which is error-prone and
not very compatible.
43Hardware vs. Software RAID
- Software-based arrays occupy host system memory,
consume CPU cycles and are operating system
dependent.
- Software-based arrays degrade overall server
performance.
- Unlike hardware-based arrays, the performance of
a software-based array is directly dependent on
server CPU performance and load.
- Software-based implementations commonly require a
separate boot drive, which is NOT included in the
array.
44Hardware vs. Software RAID
- Hardware arrays do not occupy any host
system memory.
- They are not operating system dependent.
- Since the host CPU can execute user applications
while the array adapter's processor
simultaneously executes the array functions, the
result is true hardware multi-tasking.
- Hardware arrays are also highly fault tolerant.
45DISK/TREND NEWS
- Enterprise storage is a changing landscape,
but RAID sales continue to rise, topping 14
billion in 1999.
Source: 1999 DISK/TREND Report
46Bibliography
http://www.raid-advisory.com/
http://www.sparcproductdirectory.com/raid2.html
http://www.storagesearch.com/products.html
http://www.uni-mainz.de/neuffer/scsi/what_is_raid.html
http://www.disktrend.com/newsarry.htm
http://www.raidinc.com/RAID/xanmed_popup.html
47Disk Attachment
- Disks may be attached in one of two ways:
- Host-attached, via an I/O port.
- Network-attached, via a network connection.
48Network-Attached Storage
49Storage-Area Network