NguyenDucHuy CHAP5 (1)
- A block device is one that stores information in fixed-size blocks, each one with its own
address.
- Example: Hard disks, Blu-ray discs, and USB sticks are common block devices.
- A character device delivers or accepts a stream of characters, with no regard to any block structure; it is not addressable and has no seek operation.
- Example: Printers, network interfaces, mice (for pointing), rats (for psychology lab experiments), and most other devices that are not disk-like can be seen as character devices.
Q2. How does the CPU communicate with the control registers and device data buffers?
1. CPU with control register
In the first approach, each control register is assigned an I/O port number, an 8- or 16-bit integer. The set of all the I/O ports forms the I/O port space, which is protected so that ordinary user programs cannot access it (only the operating system can). Using a special I/O instruction such as
IN REG,PORT
the CPU can read control register PORT and store the result in CPU register REG. Similarly, using
OUT PORT,REG
the CPU can write the contents of REG to a control register. Most early computers, including nearly all mainframes, such as the IBM 360 and all of its successors, worked this way.
- In this scheme, the address spaces for memory and I/O are different, as shown in Fig. 5-2(a). The instructions
IN R0,4
and
MOV R0,4
are completely different in this design. The former reads the contents of I/O port 4 and puts it in R0, whereas the latter reads the contents of memory word 4 and puts it in R0. The 4s in these examples refer to different and unrelated address spaces.
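The separate-address-spaces idea can be sketched in a few lines of Python. The port contents and values below are made up purely for illustration:

```python
# Minimal sketch of separate I/O-port and memory address spaces.
# "Address 4" in each space holds unrelated data.
io_ports = {4: 0x5A}   # hypothetical device status byte at I/O port 4
memory   = {4: 0x99}   # unrelated data at memory word 4

def in_instr(port):       # models: IN R0,4
    return io_ports[port]

def mov_from_mem(addr):   # models: MOV R0,4
    return memory[addr]

r0 = in_instr(4)       # reads the I/O port space
r1 = mov_from_mem(4)   # reads the memory space
assert r0 != r1        # same numeric address, different spaces
```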
2. CPU with device data buffers
How it works (port-mapped I/O, using IN/OUT instructions):
1. The CPU uses the OUT instruction to send data to the device by writing to the
device’s I/O port.
2. When receiving data, the CPU issues an IN instruction to read from the device’s
buffer via its port.
3. Each time the CPU accesses the I/O port, it polls the device to see if it’s ready to
send or receive data.
Example:
The CPU sends a character to the printer by writing to the printer’s data port (e.g.,
OUT 0x3F8, 'A').
The printer stores the data in its buffer and prints it.
How it works (memory-mapped I/O, using ordinary load/store instructions):
1. The CPU reads data from the device buffer by loading from a specific memory
address (e.g., LOAD R1, [0x4000]).
2. It writes data to the device buffer by storing data into the designated address (e.g.,
STORE [0x4000], 'A').
3. The device monitors the memory address and retrieves or stores data accordingly.
Example:
A network card might use a range of memory addresses, and the CPU writes data
to the card’s buffer by storing it in these memory locations.
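A minimal Python sketch of the memory-mapped scheme, using the made-up buffer address 0x4000 from the example above (the "device" is simulated by a function that reads the shared memory):

```python
# Memory-mapped I/O sketch: the device buffer occupies ordinary memory
# addresses, so plain load/store instructions reach it.
memory = {}

def store(addr, value):   # models: STORE [0x4000], 'A'
    memory[addr] = value

def load(addr):           # models: LOAD R1, [0x4000]
    return memory[addr]

def device_poll():
    # The (hypothetical) network card watches its address range and
    # picks up whatever the CPU stored there.
    return memory.get(0x4000)

store(0x4000, ord('A'))             # CPU writes to the device buffer
assert device_poll() == ord('A')    # device sees the CPU's store
assert load(0x4000) == ord('A')     # CPU reads it back the same way
```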
Q3. What is a DMA controller? What does it contain? Describe how it works.
The CPU can request data from an I/O controller one byte at a time, but doing so wastes the CPU’s time, so a different scheme, called DMA (Direct Memory Access), is often used.
-A DMA controller contains several registers that the CPU can read and write: a memory address register, a byte count register, and one or more control registers specifying the I/O port to use, the direction of the transfer, the transfer unit (byte or word), and the number of bytes to transfer per burst.
-How it works:
1. The CPU initializes the DMA controller with the source and destination addresses, the transfer size, and the mode.
2. The peripheral sends a DMA request (DRQ) to signal that it is ready for a data transfer.
3. The DMA controller requests the bus from the CPU via a bus request (BRQ) and, upon receiving the bus grant, takes control of the bus.
4. It transfers data directly between the peripheral and memory, updating the addresses and decrementing the transfer count at each step.
5. Once the transfer is complete, the DMA controller sends an interrupt to notify the CPU and releases the bus so normal CPU operation can resume.
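The sequence above can be modeled with a toy Python DMA controller. The class, its register names, and the "Hello" transfer are invented for illustration; a real controller does this in hardware without any CPU involvement:

```python
class DMAController:
    """Toy DMA controller: the attributes model the registers the CPU programs."""
    def __init__(self, memory):
        self.memory = memory            # system memory the controller can reach
        self.src = 0                    # source address register
        self.dst = None                 # destination: a device buffer (a list here)
        self.count = 0                  # transfer count register
        self.interrupt_pending = False  # set when the transfer completes

    def program(self, src, dst, count):
        # The CPU initializes source, destination, and transfer size.
        self.src, self.dst, self.count = src, dst, count

    def run(self):
        # The transfer proceeds without CPU involvement.
        while self.count > 0:
            self.dst.append(self.memory[self.src])
            self.src += 1               # update the address each step
            self.count -= 1             # decrement the remaining count
        self.interrupt_pending = True   # single interrupt at completion

memory = list(b"Hello")
printer_buffer = []
dma = DMAController(memory)
dma.program(src=0, dst=printer_buffer, count=5)
dma.run()
assert bytes(printer_buffer) == b"Hello" and dma.interrupt_pending
```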
Q4. What are the four properties of a precise interrupt? In these properties, a PC is
mentioned. What is the PC? Explain how it works.
An interrupt that leaves the machine in a well-defined state is called a precise interrupt (Walker and Cragon, 1995). Such an interrupt has four properties:
1. The PC (Program Counter) is saved in a known place.
2. All instructions before the one pointed to by the PC have completed.
3. No instruction beyond the one pointed to by the PC has finished.
4. The execution state of the instruction pointed to by the PC is known.
-The Program Counter (PC) is a special register in the CPU that holds the memory address of the next instruction to be executed. It plays a crucial role in controlling the flow of a program by ensuring that instructions are fetched and executed in the correct order.
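A toy fetch loop in Python illustrates how the PC works; the instruction names are made up, and "addresses" are just list indices:

```python
# Sketch of the program counter: it holds the address (here, the index)
# of the next instruction, and normal execution just increments it.
program = ["LOAD", "ADD", "STORE", "HALT"]   # made-up instruction list
pc = 0
executed = []
while program[pc] != "HALT":
    instr = program[pc]   # fetch the instruction the PC points to
    pc += 1               # advance the PC to the next instruction
    executed.append(instr)

# A precise interrupt would save this pc value in a known place so that
# execution can resume at exactly the next unexecuted instruction.
assert executed == ["LOAD", "ADD", "STORE"] and pc == 3
```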
There are three fundamentally different ways that I/O can be performed. In this section
we will look at the first one (programmed I/O). In the next two sections we will examine
the others (interrupt-driven I/O and I/O using DMA). The simplest form of I/O is to have
the CPU do all the work. This method is called programmed I/O.
It is simplest to illustrate how programmed I/O works by means of an example.
Consider a user process that wants to print the eight-character string ‘‘ABCDEFGH’’ on
the printer via a serial interface. Displays on small embedded systems sometimes work
this way. The software first assembles the string in a buffer in user space, as shown in
Fig. 5-7(a).
1. Step (a): User Space to Kernel Space Transfer Begins
The string to be printed ("ABCDEFGH") is stored in user space (the part of
memory allocated for applications).
The kernel space handles low-level interactions with devices such as
printers.
The printing process is initiated by transferring data from the user space to
the printer's buffer managed by the kernel (the printer is not directly
accessed by user programs).
2. Step (b): First Character Sent to Printer
The kernel sends the first portion of data (the letter "A") to the printer for
printing.
The "Next" pointer keeps track of where the next character to be printed is
located.
The printed page contains the character "A" that has been printed so far.
3. Step (c): Second Character Sent
The process continues with the next characters (now "AB") being sent to the
printer.
The "Next" pointer updates again, moving to the next part of the string for
the following transfer.
The printed page now shows "AB," representing the cumulative progress of
the printing.
-Programmed I/O is simple but has the disadvantage of tying up the CPU full time until all
the I/O is done. If the time to ‘‘print’’ a character is very short (because all the printer is doing
is copying the new character to an internal buffer), then busy waiting is fine. Also, in an
embedded system, where the CPU has nothing else to do, busy waiting is fine. However, in
more complex systems, where the CPU has other work to do, busy waiting is inefficient. A
better I/O method is needed.
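The busy-wait loop described above can be sketched in Python. The printer here is simulated (it reports READY immediately), so the loop never actually spins; on real hardware the inner while would burn CPU cycles until the status register changed:

```python
READY, BUSY = 1, 0

class Printer:
    """Simulated printer with a status register and an output page."""
    def __init__(self):
        self.status = READY
        self.page = []
    def write(self, ch):
        self.page.append(ch)
        self.status = READY   # instantly ready again in this toy model

printer = Printer()
for ch in "ABCDEFGH":
    while printer.status != READY:   # busy wait: the CPU does nothing useful
        pass
    printer.write(ch)                # CPU copies one character itself

assert "".join(printer.page) == "ABCDEFGH"
```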
With interrupt-driven I/O, the same printing example proceeds as follows:
Initialization: CPU stores the string (e.g., "Hello") and enables printer interrupts.
First Transfer: CPU sends the first character (H) to the printer and continues other tasks.
Interrupt Request: Printer prints H and sends an interrupt when ready for the next character.
Interrupt Handler: The CPU’s interrupt handler sends the next character (e) and updates
the pointer.
Repeat: This process continues for each character (l, l, o), with interrupts managing the
flow.
Completion: After the last character is sent, the handler disables further interrupts, and the
CPU resumes normal operation.
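These steps can be sketched in Python, with the printer interrupt simulated by calling the handler directly; in reality the hardware would invoke it asynchronously while the CPU runs other work:

```python
# Interrupt-driven printing: the CPU sends one character, then does
# other work; each (simulated) interrupt sends the next character.
buffer = list("Hello")
printed = []
next_index = 0   # the "Next" pointer from the text

def interrupt_handler():
    """Send the next character; return False when the string is done."""
    global next_index
    if next_index < len(buffer):
        printed.append(buffer[next_index])
        next_index += 1
        return True    # more characters remain; interrupts stay enabled
    return False       # done: the handler disables further interrupts

interrupts_enabled = interrupt_handler()   # CPU kicks off the first transfer
while interrupts_enabled:                  # each iteration models one interrupt
    interrupts_enabled = interrupt_handler()

assert "".join(printed) == "Hello"
```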
The big win with DMA is reducing the number of interrupts from one per character to one per
buffer printed. If there are many characters and interrupts are slow, this can be a major
improvement. On the other hand, the DMA controller is usually much slower than the main
CPU. If the DMA controller is not capable of driving the device at full speed, or the CPU
usually has nothing to do anyway while waiting for the DMA interrupt, then interrupt-driven
I/O or even programmed I/O may be better. Most of the time, though, DMA is worth it.
1. Initialization:
The CPU sets up the DMA controller by specifying:
The source address (where the string, e.g., "Hello", is stored in
memory).
The destination address (the printer’s buffer or I/O port).
The transfer size (number of characters to be printed).
The transfer mode (e.g., byte-by-byte transfer).
2. DMA Request:
The DMA controller takes over once initialized. The CPU instructs the DMA to
start transferring data, then continues with other tasks.
3. Data Transfer:
The DMA controller transfers the string from memory directly to the
printer’s buffer (character-by-character or in chunks).
For each character, the DMA checks if the printer buffer is ready before
sending the next byte.
4. Interrupt on Completion:
After the entire string is transferred, the DMA controller sends an interrupt
to the CPU, notifying it that the print operation is complete.
5. CPU Resumes Normal Operation:
The CPU, which was free to perform other tasks during the transfer, handles
the interrupt and resumes normal operations.
Q8. What are the differences between device controllers and device drivers?
There is a great deal more work involved for the operating system. We will now give an
outline of this work as a series of steps that must be performed in software after the
hardware interrupt has completed. It should be noted that the details are highly system
dependent, so some of the steps listed below may not be needed on a particular machine,
and steps not listed may be required. Also, the steps that do occur may be in a different
order on some machines.
1. Save any registers (including the PSW) that have not already been saved by the
interrupt hardware.
2. Set up a context for the interrupt-service procedure. Doing this may involve setting
up the TLB, MMU and a page table.
3. Set up a stack for the interrupt-service procedure.
4. Acknowledge the interrupt controller. If there is no centralized interrupt controller, reenable interrupts.
5. Copy the registers from where they were saved (possibly some stack) to the
process table.
6. Run the interrupt-service procedure. It will extract information from the interrupting
device controller’s registers.
7. Choose which process to run next. If the interrupt has caused some high-priority
process that was blocked to become ready, it may be chosen to run now.
8. Set up the MMU context for the process to run next. Some TLB setup may also be
needed.
(b) Buffering in user space
A buffer in user space stores multiple characters before the user process is woken
up.
Issue: If the buffer is paged out to disk when a new character arrives, the system
cannot write to it immediately.
Solution: Locking the buffer in memory can prevent this, but if many processes do
this, memory resources are exhausted.
(c) Buffering in the kernel, followed by copying to user space
A buffer in the kernel temporarily holds incoming data. When full, the kernel copies it
to the user buffer in one operation.
Improvement: Reduces context switching and avoids issues with paging.
Issue: Data may be lost if new data arrives while the user buffer is being paged in.
(d) Double Buffering (Improved Reliability)
Step 1: User process writes a packet to the kernel buffer so it can continue working
without waiting.
Reason: Avoids blocking the user process until transmission completes.
Step 2: The kernel copies the packet from its buffer to the network controller’s buffer.
Reason: The controller ensures consistent speed during transmission,
avoiding interruptions from CPU or other devices.
Step 3: The packet is transmitted over the network.
Problem: Multiple copies introduce latency.
Step 4: At the receiver side, the packet is first buffered in the receiver’s controller buffer to
avoid dropping data.
Step 5: The packet is copied to the receiver’s kernel buffer and then to the user
process’s buffer.
Acknowledgment: After the packet is processed, the receiver sends an acknowledgment
back to the sender, indicating it can send the next packet.
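The alternation at the heart of double buffering can be shown with a toy Python model. This sketch only shows the buffers being used in turn; real double buffering overlaps in time the user's copy into one buffer with the kernel's transmission from the other:

```python
# Double buffering on the send path: two kernel buffers are used
# alternately, so the user process can fill one while the kernel
# drains the other to the (simulated) network.
user_packets = ["pkt1", "pkt2", "pkt3"]
kernel_buffers = [None, None]   # the two buffers, used round-robin
wire = []                       # what actually went out on the network

for i, pkt in enumerate(user_packets):
    slot = i % 2                         # alternate between the buffers
    kernel_buffers[slot] = pkt           # user process copies into the kernel
    wire.append(kernel_buffers[slot])    # kernel drains it to the controller
    kernel_buffers[slot] = None          # buffer is free for the next packet

assert wire == ["pkt1", "pkt2", "pkt3"]
```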
Q11. Describe the seven standard configurations of RAID.
-RAID level 0 is illustrated in Fig. 5-20(a). It consists of viewing the virtual single disk
simulated by the RAID as being divided up into strips of k sectors each, with sectors 0 to k −
1 being strip 0, sectors k to 2k − 1 strip 1, and so on. For k = 1, each strip is a sector; for k = 2
a strip is two sectors, etc. The RAID level 0 organization writes consecutive strips over the
drives in round-robin fashion, as depicted in Fig. 5-20(a) for a RAID with four disk drives.
-The next option, RAID level 1, shown in Fig. 5-20(b), is a true RAID. It duplicates all the
disks, so there are four primary disks and four backup disks. On a write, every strip is written
twice. On a read, either copy can be used, distributing the load over more drives.
Consequently, write performance is no better than for a single drive, but read performance
can be up to twice as good. Fault tolerance is excellent: if a drive crashes, the copy is simply
used instead. Recovery consists of simply installing a new drive and copying the entire
backup drive to it.
-Unlike levels 0 and 1, which work with strips of sectors, RAID level 2 works on a word basis,
possibly even a byte basis. Imagine splitting each byte of the single virtual disk into a pair of
4-bit nibbles, then adding a Hamming code to each one to form a 7-bit word, of which bits 1,
2, and 4 were parity bits. Further imagine that the seven drives of Fig. 5-20(c) were
synchronized in terms of arm position and rotational position. Then it would be possible to
write the 7-bit Hamming coded word over the seven drives, one bit per drive.
-RAID level 3 is a simplified version of RAID level 2. It is illustrated in Fig. 5-20(d). Here a
single parity bit is computed for each data word and written to a parity drive. As in RAID level
2, the drives must be exactly synchronized, since individual data words are spread over
multiple drives.
-RAID level 4 [see Fig. 5-20(e)] is like RAID level 0, with a strip-for-strip parity written onto an
extra drive. For example, if each strip is k bytes long, all the strips are EXCLUSIVE ORed
together, resulting in a parity strip k bytes long. If a drive crashes, the lost bytes can be
recomputed from the parity drive by reading the entire set of drives.
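The XOR parity mechanism behind RAID levels 4 and 5 can be demonstrated in a few lines of Python; the strip contents below are made up, and real strips would of course be much longer:

```python
from functools import reduce

def xor_strips(strips):
    """XOR equal-length byte strips together, byte by byte."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips)

data = [b"AAAA", b"BBBB", b"CCCC"]   # k-byte strips on three data drives
parity = xor_strips(data)            # parity strip, stored on the parity drive

# Drive 1 crashes: its strip is recomputed by XORing the surviving
# data strips with the parity strip (A ^ C ^ (A ^ B ^ C) == B).
rebuilt = xor_strips([data[0], data[2], parity])
assert rebuilt == b"BBBB"
```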
-As a consequence of the heavy load on the parity drive, it may become a bottleneck. This
bottleneck is eliminated in RAID level 5 by distributing the parity bits uniformly over all the
drives, round-robin fashion, as shown in Fig. 5-20(f). However, in the event of a drive crash,
reconstructing the contents of the failed drive is a complex process.
-RAID level 6 is similar to RAID level 5, except that an additional parity block is used. In other
words, the data is striped across the disks with two parity blocks instead of one. As a result,
writes are a bit more expensive because of the parity calculations, but reads incur no
performance penalty. It does offer more reliability (imagine what happens if RAID level 5
encounters a bad block just when it is rebuilding its array).