Disksraid 09 PDF
50 Years Old!
[Figure: disk geometry, showing a track, a sector, rotational delay, and seek time]
Modern disks
                     Barracuda 180   Cheetah X15 36LP
Capacity             181 GB          36.7 GB
Disks/Heads          12/24           4/8
Cylinders            24,247          18,479
Sectors/track        ~609            ~485
Speed                7,200 RPM       15,000 RPM
Latency (ms)         4.17            2.0
Avg seek (ms)        7.4/8.2         3.6/4.2
Track-to-track (ms)  0.8/1.1         0.3/0.4
Disks vs. Memory
                    Disk                          Memory
Smallest write      sector                        (usually) bytes
Atomic write        sector                        byte, word
Random access       5 ms (not on a good curve)    50 ns (faster all the time)
Sequential access   200 MB/s                      200-1000 MB/s
Cost                $.002/MB                      $.10/MB
Crash               contents preserved            contents gone
                    ("non-volatile")              ("volatile")
Disk Structure
• Disk drives addressed as 1-dim arrays of logical blocks
– the logical block is the smallest unit of transfer
• This array mapped sequentially onto disk sectors
– Address 0 is 1st sector of 1st track of the outermost cylinder
– Addresses incremented within a track, then across the tracks of a
cylinder, then across cylinders, from outermost to innermost
• Translation is theoretically possible, but usually difficult
– Some sectors might be defective
– Number of sectors per track is not a constant
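As a sketch of that translation, assuming an idealized fixed geometry (the head and sectors-per-track counts below are made-up, Cheetah-like values; real drives have zoned tracks and remapped defective sectors, which is exactly why the translation is difficult in practice):

```python
# Idealized logical-block-address (LBA) to (cylinder, head, sector)
# translation. Assumes every track has the same number of sectors --
# false on real drives, as the slide notes.

HEADS = 8                 # tracks per cylinder (one per head); assumed
SECTORS_PER_TRACK = 485   # assumed constant

def lba_to_chs(lba):
    sectors_per_cylinder = HEADS * SECTORS_PER_TRACK
    cylinder = lba // sectors_per_cylinder
    head = (lba % sectors_per_cylinder) // SECTORS_PER_TRACK
    sector = lba % SECTORS_PER_TRACK    # 0-based sector within the track
    return cylinder, head, sector

print(lba_to_chs(0))          # address 0: 1st sector of 1st track: (0, 0, 0)
print(lba_to_chs(485))        # next track of the same cylinder: (0, 1, 0)
print(lba_to_chs(8 * 485))    # first sector of the next cylinder: (1, 0, 0)
```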
Non-uniform #sectors / track
• Keep the rotational speed constant and reduce the bit density on
outer tracks, so every track holds the same number of sectors
(Constant Angular Velocity, typical of HDDs)
• Keep the bit density uniform, give outer tracks more sectors, and
vary the rotational speed (slower over outer tracks) so the medium
passes the head at constant linear speed (Constant Linear Velocity,
typical of CDs and DVDs)
Disk Scheduling
• The operating system tries to use hardware efficiently
– for disk drives, this means fast access time and high disk bandwidth
• Access time has two major components
– Seek time is time to move the heads to the cylinder containing the
desired sector
– Rotational latency is additional time waiting to rotate the desired
sector to the disk head.
• Minimize seek time
• Seek time is roughly proportional to seek distance
• Disk bandwidth is total number of bytes transferred, divided by
the total time between the first request for service and the
completion of the last transfer.
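As a back-of-the-envelope check of these components, using Cheetah X15-like numbers from the table above (the 50 MB/s sustained transfer rate is an assumption):

```python
# Average time for one 4 KB request = seek + rotational latency + transfer.
# Seek and RPM figures follow the Cheetah X15 table; the sustained
# transfer rate is an assumed, illustrative value.

avg_seek_ms = 3.6
rpm = 15000
transfer_mb_s = 50.0                              # assumed

rotational_latency_ms = 0.5 * 60_000 / rpm        # half a rotation on average
transfer_ms = (4 / 1024) / transfer_mb_s * 1000   # 4 KB at 50 MB/s

access_ms = avg_seek_ms + rotational_latency_ms + transfer_ms
print(f"{access_ms:.2f} ms")   # seek + rotation dominate; transfer is tiny
```

The point of the exercise: for small requests, almost all of the time is mechanical positioning, which is why scheduling to minimize seeks matters.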
Disk Scheduling (Cont.)
• Several scheduling algorithms exist to service disk I/O
requests.
• We illustrate them with a request queue of cylinders in the
range 0-199: 98, 183, 37, 122, 14, 124, 65, 67
• Head pointer starts at cylinder 53
FCFS
Illustration shows total head movement of 640 cylinders.
SSTF
• Selects request with minimum seek time from current
head position
• SSTF scheduling is a form of SJF scheduling
– may cause starvation of some requests.
• Illustration shows total head movement of 236
cylinders.
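A minimal simulation of FCFS and SSTF, assuming the classic example queue (98, 183, 37, 122, 14, 124, 65, 67 with the head at cylinder 53), which is consistent with the 640- and 236-cylinder totals quoted in these slides:

```python
# Total head movement (in cylinders) under FCFS and SSTF scheduling.

QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]

def fcfs(head, queue):
    total = 0
    for req in queue:                    # serve strictly in arrival order
        total += abs(req - head)
        head = req
    return total

def sstf(head, queue):
    pending, total = list(queue), 0
    while pending:                       # greedily pick the nearest request
        nxt = min(pending, key=lambda r: abs(r - head))
        total += abs(nxt - head)
        head = nxt
        pending.remove(nxt)
    return total

print(fcfs(53, QUEUE))   # 640 cylinders
print(sstf(53, QUEUE))   # 236 cylinders
```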
SSTF (Cont.)
SCAN
• The disk arm starts at one end of the disk,
– moves toward the other end, servicing requests
– head movement is reversed when it reaches the other end of the
disk
– servicing continues.
• Sometimes called the elevator algorithm.
• Illustration shows total head movement of 208
cylinders.
SCAN (Cont.)
C-SCAN
• Provides a more uniform wait time than SCAN.
• The head moves from one end of the disk to the
other.
– servicing requests as it goes.
– When it reaches the other end, it immediately returns to the
beginning of the disk
• No requests serviced on the return trip.
• Treats the cylinders as a circular list
– that wraps around from the last cylinder to the first one.
C-SCAN (Cont.)
C-LOOK
• Version of C-SCAN
• Arm only goes as far as last request in each
direction,
– then reverses direction immediately,
– without first going all the way to the end of the disk.
C-LOOK (Cont.)
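A sketch of the C-LOOK service order, assuming the same example queue (98, 183, 37, 122, 14, 124, 65, 67) and a head at cylinder 53 moving toward higher cylinders:

```python
# C-LOOK: serve requests at or above the head while moving toward
# higher cylinders, then jump back to the lowest pending request and
# continue upward. No requests are serviced during the jump.

def c_look(head, queue):
    upper = sorted(r for r in queue if r >= head)   # served on the way up
    lower = sorted(r for r in queue if r < head)    # served after the wrap
    return upper + lower

order = c_look(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order)   # [65, 67, 98, 122, 124, 183, 14, 37]
```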
Selecting a Good Algorithm
• SSTF is common and has a natural appeal
• SCAN and C-SCAN perform better under heavy load
• Performance depends on number and types of requests
• Requests for disk service can be influenced by the file-allocation
method.
• Disk-scheduling algorithm should be a separate OS module
– allowing it to be replaced with a different algorithm if necessary.
• Either SSTF or LOOK is a reasonable default algorithm
Disk Formatting
• After manufacturing, a disk contains no information
– It is just a stack of platters coated with magnetizable metal oxide
• Before use, each platter receives low-level format
– Format has series of concentric tracks
– Each track contains some sectors
– There is a short gap between sectors
Raid Level 1
• Mirrored Disks
• Data is written to two places
– On failure, just use surviving disk
• On a read, either copy can be used; choose the faster one
– Write performance is the same as a single drive; read performance
can be up to 2x better
• Expensive
Raid Level 4
• Combines Levels 0 and 3: block-level parity with striping
• A read accesses only the data disk holding the requested block
• A small write accesses the target data disk plus the parity disk
(a read-modify-write of the parity block)
• Heavy load on the parity disk
Raid Level 5
• Block Interleaved Distributed Parity
• Like parity scheme, but distribute the parity info over all
disks (as well as data over all disks)
• Better read performance, and large writes perform well
– Reads can outperform SLEDs (single large expensive disks) and RAID 0
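The parity scheme behind RAID 4/5 can be sketched with bytewise XOR (the block size and contents below are made up):

```python
# The parity block is the bytewise XOR of the data blocks, so any one
# lost block can be rebuilt by XOR-ing all the survivors together.

def xor_blocks(*blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"   # tiny example blocks
parity = xor_blocks(d0, d1, d2)

# The disk holding d1 fails; reconstruct it from the rest plus parity.
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1
```

This is also why a small write is expensive: changing one data block requires updating the parity block to keep the XOR invariant.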
Stable Storage
• Model:
– An incorrect disk write can be detected by checking the ECC
– It is very rare for the same sector to go bad on multiple disks
– The CPU is fail-stop
• Approach: use 2 identical disks
– corresponding blocks on both drives should be the same
• 3 operations:
– Stable write: write to the 1st disk, retrying until it succeeds, then
write the 2nd disk the same way
– Stable read: read from the 1st disk; on an ECC error, read from the 2nd
– Crash recovery: scan corresponding blocks on both disks
• If one block is bad, replace it with the good copy
• If both are good but differ, replace the block on the 2nd disk
with the one from the 1st
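The three operations above can be sketched as a toy model, with a boolean flag standing in for the ECC check (entirely illustrative; a real implementation works at the device-driver level):

```python
# Toy stable storage: two "disks" mapping block number -> (data, ok),
# where ok stands in for "the ECC check passed".

class StableStorage:
    def __init__(self):
        self.disks = [{}, {}]

    def stable_write(self, blk, data):
        for d in self.disks:             # write disk 1, then disk 2
            d[blk] = (data, True)        # a real driver retries until ECC-clean

    def stable_read(self, blk):
        data, ok = self.disks[0].get(blk, (None, False))
        if ok:
            return data
        data, ok = self.disks[1].get(blk, (None, False))  # fall back to disk 2
        return data if ok else None

    def recover(self):                   # crash recovery scan
        for blk in set(self.disks[0]) | set(self.disks[1]):
            b0 = self.disks[0].get(blk, (None, False))
            b1 = self.disks[1].get(blk, (None, False))
            if b0[1] and not b1[1]:
                self.disks[1][blk] = b0          # replace bad copy with good one
            elif b1[1] and not b0[1]:
                self.disks[0][blk] = b1
            elif b0[1] and b1[1] and b0 != b1:
                self.disks[1][blk] = b0          # both good but differ: copy 1 -> 2

s = StableStorage()
s.stable_write(7, b"hello")
s.disks[0][7] = (b"garbage", False)   # simulate a bad write on disk 1
print(s.stable_read(7))               # b'hello' (served from disk 2)
s.recover()
print(s.disks[0][7])                  # repaired from disk 2's good copy
```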
CD-ROMs