RAID

RAID: Redundant Arrays of Inexpensive Disks

This discussion is based on the paper:


A Case for Redundant Arrays of Inexpensive Disks (RAID), David A. Patterson, Garth Gibson, and Randy H. Katz, in Proceedings of the ACM SIGMOD International Conference on Management of Data (Chicago, IL), pp. 109-116, 1988.


Motivation

Single-chip computers improved in performance by about 40% per year
RAM capacity quadrupled every 2-3 years
Disks (magnetic technology):
    capacity doubled every 3 years
    price cut in half every 3 years
    raw seek time improved only about 7% per year

Note: the values presented in Patterson's paper are dated!
Note: the paper discusses pure RAID, not smarter implementations, e.g. caching.


Amdahl's Law: Effective Speedup


Effective speedup = 1 / ((1 - f) + f/k), where
    f = fraction of work in fast mode
    k = speedup while in fast mode

Example: assume 10% of the work is I/O (f = 0.9)
    if the CPU is made 10x faster  => effective speedup is about 5
    if the CPU is made 100x faster => effective speedup is under 10
    => 90% of the potential speedup is wasted
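A worked instance of the formula above, with f = 0.9 (10% of the work is unaccelerated I/O):

```latex
S_{\mathrm{eff}} = \frac{1}{(1-f) + f/k}
\qquad
k = 10:\ \frac{1}{0.1 + 0.09} \approx 5.3
\qquad
k = 100:\ \frac{1}{0.1 + 0.009} \approx 9.2
```

So even an unbounded CPU speedup is capped near 10x as long as 10% of the work remains I/O-bound.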


Motivation

Compare the mainframe mentality with today's possibilities, e.g. cost and configuration

[Figure: Mainframe I/O path (CPU, Channel, Controller, Memory) vs. Small Computer I/O path (CPU, DMA, Memory, SCSI)]


Reliability
Bad news!

e.g. MTTF_disk = 30,000 h
     MTTF_100  = 300 h (< 2 weeks)
     MTTF_1000 = 30 h
Note that these numbers are very dated; today's drives are much better, with MTTF of 300,000 to 800,000 hours. But even if we assume a higher MTTF for the individual disks, the problem remains.
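The figures above follow from the usual independence assumption, under which the array's MTTF is the single-disk MTTF divided by the number of disks:

```latex
\mathrm{MTTF}_{\mathrm{array}} \approx \frac{\mathrm{MTTF}_{\mathrm{disk}}}{N},
\qquad
\frac{30{,}000\ \mathrm{h}}{100} = 300\ \mathrm{h},
\qquad
\frac{30{,}000\ \mathrm{h}}{1000} = 30\ \mathrm{h}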


RAID Reliability

Partition the disks into reliability groups, each containing data disks and check disks


D = total number of data disks
G = # data disks in a group
C = # check disks in a group
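From these quantities the paper estimates the reliability of a redundant group; a rough sketch of that estimate, assuming independent failures and a mean time to repair (MTTR) for a failed disk:

```latex
\mathrm{MTTF}_{\mathrm{group}}
  \approx \frac{\mathrm{MTTF}_{\mathrm{disk}}}{G+C}
          \cdot \frac{\mathrm{MTTF}_{\mathrm{disk}}}{(G+C-1)\,\mathrm{MTTR}}
  = \frac{\mathrm{MTTF}_{\mathrm{disk}}^{2}}{(G+C)(G+C-1)\,\mathrm{MTTR}}
```

The first factor is the expected time to the first failure in a group; the second is the reciprocal of the chance that a second disk in the same group fails before the first is repaired. The MTTF of the whole array is then this value divided by the number of groups, D/G.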


Target Systems

Different RAID solutions will benefit different target system configurations.

Supercomputers
    large blocks of data, i.e. high data rate

Transaction processing
    small blocks of data, high I/O rate
    read-modify-write sequences


5 RAID levels

RAID 1: mirrored disks
RAID 2: Hamming code for ECC
RAID 3: single check disk per group
RAID 4: independent reads/writes
RAID 5: no single check disk


RAID level 1: Mirrored Disks


Most expensive option
Tandem doubles the controllers too
Writes go to both disks, reads come from one disk

Characteristics (Pat88 Table II, pg. 112):
    S = slowdown factor. In synchronous disks the spindles are synchronized so that the corresponding sectors of a group of disks can be accessed simultaneously; for synchronous disks S = 1.
    Reads             = 2D/S, i.e. concurrent reads are possible
    Writes            = D/S,  i.e. no overhead for the concurrent write of the same data
    Read-Modify-Write = 4D/(3S)



RAID level 2: Hamming Code

DRAM => problem with alpha particles; solution: e.g. parity for SED, a Hamming code for SEC
Recall the Hamming code: same idea here, using one disk drive per bit
Smallest accessible unit per disk is one sector
    => access G sectors, where G = # data disks in a group

If an operation on only a portion of a group is needed:
    1) read all data
    2) modify the desired portion
    3) write the full group, including the check information


Recall Hamming Code

m = number of data bits
k = number of check (parity) bits
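The k check bits must be able to encode the position of any single-bit error plus the "no error" case, which gives the standard condition:

```latex
m + k + 1 \le 2^{k}
\qquad\text{e.g. } m = 8 \Rightarrow k = 4,\quad m = 32 \Rightarrow k = 6
```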



Compute Check


RAID 2 allows soft errors to be corrected on the fly
Useful for supercomputers, not useful for transaction processing
e.g. used in the Thinking Machines (Connection Machine) DataVault with G = 32, C = 8

Characteristics: Pat88 Table III (pg. 112)



RAID level 3: Single Check Disk per Group


Parity is SED, not SEC! However, the controller can often detect which disk has failed
    extra redundancy on the disk, i.e. extra info on the sectors etc.
    so the information of the failed disk can be reconstructed

If the check disk fails
    read the data disks to rebuild the replacement (recompute the parity)

If a data disk fails
    compute the parity of the remaining data disks and compare it with the check disk
    if the parity bits are equal => the missing data bit = 0, otherwise => data bit = 1
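A minimal sketch of that reconstruction rule, generalized from single bits to byte strings via XOR (the function names and example data are illustrative, not from the paper):

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def reconstruct_failed_disk(surviving_data, parity):
    """Rebuild the contents of a single failed data disk.

    The check disk stores parity = d0 ^ d1 ^ ... ^ dn, so XOR-ing the parity
    with all surviving data disks leaves exactly the missing disk's contents.
    """
    return reduce(xor, surviving_data, parity)

# Example: three data disks plus one parity disk; disk 1 fails.
d = [b"\x0f\x01", b"\xf0\x02", b"\x33\x04"]
parity = reduce(xor, d)
assert reconstruct_failed_disk([d[0], d[2]], parity) == d[1]
```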



Less overhead, i.e. only one check disk => effective performance increases
The reduction in disks compared to Level 2 decreases maintenance
Performance is the same as Level 2; however, the effective performance per disk increases due to the smaller number of check disks
Better for supercomputers, not good for transaction processing
Maxtor and Micropolis introduced the first RAID-3 products in 1988

Characteristics: Pat88 Table IV (pg. 113)



RAID level 4: Independent Reads/Writes


Pat88 fig. 3 (pg. 113) compares data locations; disk interleaving has advantages and disadvantages

Advantages of the previous levels:
    large transfer bandwidth
    all disks in a group are accessed on each operation (read or write)
    spindle synchronization

Disadvantages of the previous levels:
    if there is no spindle synchronization => probably close to worst-case average seek and access times (tracking + rotation)

RAID 4 interleaves data on the disks at the sector level and uses one parity disk


For small accesses
    only 2 disks need to be accessed, i.e. 1 data disk and the parity disk
    the new parity can be computed from the old parity and the old/new data:
    compute P_new = data_old XOR data_new XOR P_old

e.g. small write
    1) read the old data and the old parity
    2) write the new data and the new parity in parallel

e.g. small read
    only one drive (the data disk) is read

The bottleneck is the parity disk

Characteristics: Pat88 Table V (pg. 114)
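A minimal sketch of the small-write parity update; only the XOR rule comes from the slide, the helper names and example values are illustrative:

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """P_new = data_old XOR data_new XOR P_old: only the target data disk
    and the parity disk need to be read and rewritten."""
    return xor(xor(old_data, new_data), old_parity)

# Stripe with three data blocks; update d1 without touching d0 or d2.
d0, d1, d2 = b"\x01", b"\x02", b"\x04"
parity = xor(xor(d0, d1), d2)
d1_new = b"\xff"
parity_new = small_write_parity(d1, d1_new, parity)
assert parity_new == xor(xor(d0, d1_new), d2)   # stripe parity stays consistent
```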



RAID level 5: No Single Check Disk


Distributes data and check information across all disks, i.e. there are no dedicated check disks
Supports multiple individual writes per group
Best of both worlds:
    small read-modify-write performance
    large transfer performance
    one more disk per group holds data => increases read performance

Characteristics: Pat88 Table VI (pg. 114)
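A minimal sketch of how the check information can be rotated across the disks so that no single disk holds all the parity (the specific rotation below is one illustrative choice, not the layout prescribed by the paper):

```python
def raid5_layout(num_disks, num_stripes):
    """layout[stripe][disk] is 'P' for the parity block or 'Dn' for data block n."""
    layout, block = [], 0
    for stripe in range(num_stripes):
        parity_disk = stripe % num_disks          # parity position rotates per stripe
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout

for row in raid5_layout(num_disks=5, num_stripes=5):
    print(row)    # every disk carries data in some stripes and parity in others
```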



Patterson Paper

Discusses all levels as a pure hardware problem
Refers to software solutions and alternatives, e.g.:
    disk buffering with a transfer buffer the size of a track, so spindle synchronization of groups is not necessary
    improving MTTR by using spares
    low power consumption allows the use of a UPS
Relative performance is shown in Pat88 fig. 5 (pg. 115)



Summary

Data Striping for improved performance

distributes data transparently over multiple disks to make them appear as a single fast, large disk
improves aggregate I/O performance by allowing multiple I/Os to be serviced in parallel:
    independent requests can be serviced in parallel by separate disks
    a single multiple-block request can be serviced by multiple disks acting in coordination

Redundancy for improved reliability


a large number of disks lowers the overall reliability of the disk array
thus redundancy is necessary to tolerate disk failures and allow continuous operation without data loss



Other RAIDs

RAID 0

employs striping with no redundancy at all; its claim to fame is speed alone
has the best write performance, but not the best read performance
    why? (RAIDs with redundant copies can schedule a read on the disk with the shortest expected seek and rotational delay)

RAID 6 (P + Q Redundancy)

uses Reed-Solomon code to protect against up to 2 disk failures using the bare minimum of 2 redundant disks.

[Figure: source Che94]


String management



Case Studies

Thinking Machines Corp.: TMC ScaleArray


RAID level 3 for the CM-5 massively parallel processor (MPP)
high bandwidth for large files
the OS provides a file system that can deliver data from a single file to multiple processors from multiple disks
uses 4 SCSI-2 strings with 2 disks each (= 8 disks)
these 4 strings are attached to an 8 MB disk buffer
3 of these units are attached to the backbone (=> 3 x 8 = 24 disks)
normal configuration: 22 data disks, 1 parity disk, 1 spare



Case Studies

HP: TickerTAIP/DataMesh

Material shown is from: The TickerTAIP Parallel RAID Architecture, Cao et al., ACM Trans. on Computer Systems, Vol. 12, No. 3, August 1994, pp. 236-269.

Traditional RAID architecture
    host interface
        bottleneck
        single point of failure



Case Studies cont.

TickerTAIP/DataMesh Issues

getting away from the centralized architecture
different algorithms for computing RAID parity
techniques for establishing request atomicity, sequencing, and recovery
disk-level request-scheduling algorithms inside the array



Case Studies

HP: TickerTAIP/DataMesh

[Figures: TickerTAIP array architecture; TickerTAIP system environment]



Case Studies

HP: AutoRAID

Goal: a RAID that provides excellent performance and storage efficiency in the presence of dynamically changing workloads
provides both RAID level 1 and level 5
dynamically shifts data to the appropriate level
dynamically shifts data to level 5 when approaching maximum array capacity
parity logging
hot-pluggable disks, a spare controller, dynamically adapts to added capacity
Wilkes, J. et al., The HP AutoRAID hierarchical storage system, ACM Trans. on Computer Systems, 14, 1 (Feb. 1996), 108-136.



Case Studies

StorageTek: Iceberg 9200 Disk Array Subsystem

uses 5.25-inch disks to look like traditional IBM mainframe disks
implements an extended RAID level 5 and level 6 disk array
an array consists of 13 data drives, P and Q drives, and a hot spare
data, parity, and Reed-Solomon coding are striped across the 15 active drives



Other RAIDs

Because of the limitations of each RAID level on its own, several flavors of RAID have appeared which attempt to combine the best performance attributes

e.g. RAID 0+1
    combines RAID 0 striping with RAID 1 mirroring

e.g. RAID 3/5
    write coalescing uses write buffering to accumulate or coalesce multiple data blocks and writes the data in one chunk
