ALL CSC 417 NOTE
• b. Cache Memory
• Levels:
• L1 Cache: Closest to the CPU, smallest in size, but fastest.
• L2 Cache: Larger than L1, slightly slower.
• L3 Cache: Shared among multiple CPU cores, larger and slower than L2.
• Purpose: Reduces latency by storing frequently accessed instructions/data.
• Features:
• Faster than RAM but more expensive.
• Operates using principles like spatial locality (access data close to recently accessed data) and temporal
locality (reuse recently accessed data).
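Both locality principles can be seen in even a simple loop. A minimal Python sketch of the access pattern (the array name and size are illustrative; in a low-level language the elements would be contiguous in memory, which is what makes the spatial effect strong):

```python
# Temporal locality: `total` and the loop variable are reused on every
# iteration, so they stay in registers or the fastest cache level.
# Spatial locality: consecutive elements of `data` are adjacent, so a
# cache line fetched for one element also brings in its neighbours.
data = list(range(1000))
total = 0
for x in data:        # sequential pass: a cache-friendly access pattern
    total += x        # reuses `total` on every iteration (temporal locality)
```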
• c. Primary Memory (RAM)
• Dynamic RAM (DRAM): Used in main memory. Slower but cheaper than SRAM; needs periodic refreshing to retain data.
• Static RAM (SRAM): Used in cache. Faster and more expensive than DRAM.
• d. Secondary Memory
• Purpose: Provides non-volatile, large-scale storage for data and programs.
• Types:
• HDDs: Mechanical storage; slower read/write speeds.
• SSDs: Flash-based, faster than HDDs, but costlier.
• Hybrid Drives: Combine HDD and SSD technology for a balance of capacity and speed.
• e. Tertiary Memory
• Purpose: Used for archival storage and backups.
• Examples: Optical drives and magnetic tapes.
• f. Virtual Memory
• Purpose: Expands the apparent memory available to applications by using the hard disk as an extension of RAM.
• Implementation: Uses paging to divide memory into fixed-size blocks and maps logical addresses to physical addresses.
Characteristics of Memory Operations
• Memory operations determine how efficiently data is stored, accessed, and
manipulated.
• a. Key Characteristics
• a. Access Time
• Time taken to access data from memory.
• Registers: Few nanoseconds.
• RAM: 10-100 nanoseconds.
• HDD: 10 milliseconds.
• b. Memory Bandwidth
• The rate at which data is read or written, measured in GB/s.
• Important for high-performance systems like gaming PCs or data-intensive servers.
• c. Volatility
• Volatile: Requires power to retain data (e.g., RAM).
• Non-Volatile: Retains data without power (e.g., SSDs, HDDs, NVRAM).
• d. Latency vs Throughput
• Latency: Time delay between request and delivery.
• Throughput: Amount of data processed in a given time.
• e. Power Efficiency
• Power consumption varies by memory type.
• Registers consume the least energy.
• HDDs consume more energy during spinning and seek operations.
Key Memory Operations
• a. Read Operation - Involves fetching data stored at a specific address.
• Sequential Access: Data is read in sequence (e.g., tapes).
• Random Access: Any location can be accessed directly (e.g., RAM).
• b. Write Operation - Saves new data to memory. May overwrite existing data in writable memory (e.g., RAM, SSD).
• c. Erase Operation - Applies to non-volatile memory like flash. Flash memory requires data blocks to be erased before rewriting.
• d. Memory Mapping - Used by CPUs to map logical addresses to physical memory locations. Performed by the Memory Management Unit (MMU).
Performance Enhancements
• a. Caching Techniques
1. Write-Through Cache:
1. Data written to both cache and main memory.
2. Pros: Data consistency.
3. Cons: Slower write performance.
2. Write-Back Cache:
1. Data written only to cache initially and main memory updated later.
2. Pros: Faster writes.
3. Cons: Risk of data loss in cache failures.
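The two write policies can be sketched in Python. This is a simplified model under stated assumptions (a real cache operates on lines and evicts under a replacement policy; here single addresses and an explicit flush stand in for that):

```python
class WriteThroughCache:
    """Every write goes to both the cache and main memory: consistent, but slower writes."""
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}

    def write(self, addr, value):
        self.cache[addr] = value
        self.memory[addr] = value      # main memory is always up to date


class WriteBackCache:
    """Writes go to the cache only; memory is updated later, on flush or eviction."""
    def __init__(self, memory):
        self.memory = memory
        self.cache = {}
        self.dirty = set()             # addresses not yet written back

    def write(self, addr, value):
        self.cache[addr] = value
        self.dirty.add(addr)           # fast write, but memory is stale until flush

    def flush(self):
        for addr in self.dirty:        # data here is lost if the cache fails first
            self.memory[addr] = self.cache[addr]
        self.dirty.clear()
```

The write-back model makes the trade-off visible: between `write` and `flush`, the only copy of the new value lives in the cache, which is exactly the data-loss risk listed above.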
• b. Prefetching
• Predictively loading data into cache based on anticipated future needs.
• c. Parallelism in Memory
• Multithreading to allow simultaneous memory access.
Emerging Memory Technologies
• a. 3D NAND Structure: Stacks memory cells vertically.
• Benefits: Increased density. Lower cost per GB. Higher durability for SSDs.
• b. Phase Change Memory (PCM) - Uses heat-induced phase changes in
materials to store data.
Advantages: Faster than flash. High endurance.
• c. Magnetoresistive RAM (MRAM) - Uses magnetic states to store data.
Non-volatile and offers near-RAM speeds.
• d. Neuromorphic Memory - Mimics biological neural networks for AI and
machine learning applications.
• e. Quantum Memory - Uses quantum mechanics principles to store data in
quantum states. Still in research but promises revolutionary capacity and
speed.
Advanced Applications of Memory Systems
• a. High-Performance Computing (HPC)
• Requires large, fast memory systems for simulations and data analysis.
• b. Cloud Computing
• Relies on distributed memory systems for scalable storage and
performance.
• c. Internet of Things (IoT)
• Embedded memory systems with low power consumption are critical for
IoT devices.
• d. AI and Machine Learning
• Demand high-speed memory with massive bandwidth for real-time data
processing.
Cache Memory
Characteristics
• Location
• Capacity
• Unit of transfer
• Access method
• Performance
• Physical type
• Physical characteristics
• Organisation
Location
• CPU
• Internal
• External
Capacity
• Word size
— The natural unit of organisation
• Number of words
— or Bytes
Unit of Transfer
• Internal
— Usually governed by data bus width
• External
— Usually a block which is much larger than a
word
• Addressable unit
— Smallest location which can be uniquely
addressed
— Word internally
— Cluster on Microsoft Windows disks
Access Methods (1)
• Sequential
— Start at the beginning and read through in
order
— Access time depends on location of data and
previous location
— e.g. tape
• Direct
— Individual blocks have unique address
— Access is by jumping to vicinity plus sequential
search
— Access time depends on location and previous
location
— e.g. disk
Access Methods (2)
• Random
— Individual addresses identify locations exactly
— Access time is independent of location or
previous access
— e.g. RAM
• Associative
— Data is located by a comparison with contents
of a portion of the store
— Access time is independent of location or
previous access
— e.g. cache
Memory Hierarchy
• Registers
— In CPU
• Internal or Main memory
— May include one or more levels of cache
— “RAM”
• External memory
— Backing store
Memory Hierarchy - Diagram
Performance
• Access time
— Time between presenting the address and
getting the valid data
• Memory Cycle time
— Time may be required for the memory to
“recover” before next access
— Cycle time is access + recovery
• Transfer Rate
— Rate at which data can be moved
Physical Types
• Semiconductor
— RAM
• Magnetic
— Disk & Tape
• Optical
— CD & DVD
• Others
— Bubble
— Hologram
Physical Characteristics
• Decay
• Volatility
• Erasable
• Power consumption
Organisation
• Physical arrangement of bits into words
• Not always obvious
• e.g. interleaved
The Bottom Line
• How much?
— Capacity
• How fast?
— Time is money
• How expensive?
Hierarchy List
• Registers
• L1 Cache
• L2 Cache
• Main memory
• Disk cache
• Disk
• Optical
• Tape
So you want fast?
• It is possible to build a computer which
uses only static RAM (see later)
• This would be very fast
• This would need no cache
— How can you cache cache?
• This would cost a very large amount
Locality of Reference
• During the course of the execution of a
program, memory references tend to
cluster
• e.g. loops
Cache
• Small amount of fast memory
• Sits between normal main memory and
CPU
• May be located on CPU chip or module
Cache and Main Memory
• The use of multiple levels of cache is depicted in part (b) of the figure above.
• The L2 cache is slower and typically larger
than the L1 cache, and the L3 cache is
slower and typically larger than the L2
cache.
Cache/Main Memory Structure
• The structure of a cache/main-memory system is shown below.
• Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address.
• For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each. That is, there are M = 2^n / K blocks in main memory. The cache consists of m blocks, called lines. Each line contains K words plus a tag of a few bits.
Cache/Main Memory Structure
• Each line also includes control bits (not
shown), such as a bit to indicate whether
the line has been modified since being
loaded into the cache.
• The length of a line, not including tag and
control bits, is the line size. The line size
may be as small as 32 bits, with each
“word” being a single byte; in this case
the line size is 4 bytes.
• If a word in a block of memory is read,
that block is transferred to one of the lines
of the cache. Because there are more
blocks than lines, an individual line cannot
be uniquely and permanently dedicated to
a particular block.
• Thus, each line includes a tag that
identifies which particular block is
currently being stored.
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from
main memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which
block of main memory is in each cache
slot
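The read sequence above can be sketched as a tiny block-based cache model in Python (block size and memory contents are illustrative; the dictionary key plays the role of the tag):

```python
class SimpleCache:
    """Toy model of the cache read sequence: check the cache, fetch a whole block on a miss."""
    def __init__(self, block_size=4):
        self.block_size = block_size
        self.lines = {}                      # block number (acting as the tag) -> block data

    def read(self, memory, addr):
        block, offset = divmod(addr, self.block_size)
        if block not in self.lines:          # miss: read the required block from main memory
            start = block * self.block_size
            self.lines[block] = memory[start:start + self.block_size]
        return self.lines[block][offset]     # hit: deliver the word from the cache

main_memory = list(range(100, 164))          # 64 'words' of main memory
cache = SimpleCache()
v = cache.read(main_memory, 10)              # miss: loads block 2 (addresses 8-11)
w = cache.read(main_memory, 9)               # hit: same block is already cached
```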
Cache Read Operation - Flowchart
• The processor generates the read address (RA)
of a word to be read. If the word is contained in
the cache, it is delivered to the processor.
Otherwise, the block containing that word is
loaded into the cache, and the word is delivered
to the processor.
• The figure below shows a typical contemporary
cache organization. In this organization, the
cache connects to the processor via data, control,
and address lines. The data and address lines
also attach to data and address buffers, which
attach to a system bus from which main memory
is reached.
Typical Cache Organization
• When a cache hit occurs, the data and
address buffers are disabled and
communication is only between processor
and cache, with no system bus traffic.
When a cache miss occurs, the desired
address is loaded onto the system bus
and the data are returned through the
data buffer to both the cache and the
processor.
Cache Design
• Addressing
• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches
Cache Addressing
• Where does cache sit?
— Between processor and virtual memory management
unit
— Between MMU and main memory
• Logical cache (virtual cache) stores data using
virtual addresses
— Processor accesses cache directly, without going
through the MMU
— Cache access faster, before MMU address translation
— Virtual addresses use same address space for different
applications
– Must flush cache on each context switch
• Physical cache stores data using main memory
physical addresses
• Almost all non-embedded processors, and
many embedded processors, support
virtual memory. In essence, virtual
memory is a facility that allows programs
to address memory from a logical point of
view, without regard to the amount of
main memory physically available. When
virtual memory is used, the address fields
of machine instructions contain virtual
addresses.
• For reads from and writes to main memory,
a hardware memory management unit
(MMU) translates each virtual address into
a physical address in main memory.
• When virtual addresses are used, the
system designer may choose to place the
cache between the processor and the
MMU or between the MMU and main
memory as shown below.
Logical and Physical Caches
• A logical cache, also known as a virtual
cache, stores data using virtual addresses.
The processor accesses the cache directly,
without going through the MMU. A
physical cache stores data using main
memory physical addresses
• Question ()
What are the advantages and disadvantages
of logical cache over physical cache?
Cache Size
• Cost
— More cache is expensive
• Speed
— More cache is faster (up to a point)
— Checking cache for data takes time
• The size of the cache should be small enough
so that the overall average cost per bit is close to
that of main memory alone and large enough so
that the overall average access time is close to
that of the cache alone.
• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
— 8 bit tag (=22-14)
— 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
Direct Mapping
Cache Line Table
Cache line 0: blocks 0, m, 2m, …, 2^s − m
Cache line 1: blocks 1, m+1, 2m+1, …, 2^s − m + 1
…
Cache line m−1: blocks m−1, 2m−1, 3m−1, …, 2^s − 1
Direct Mapping Cache Organization
Direct Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = m = 2^r
• Size of tag = (s − r) bits
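Splitting an address into its tag, line, and word fields is just bit slicing. A sketch using the 24-bit example above (s = 22, r = 14, w = 2; the function name is illustrative):

```python
def direct_map_fields(addr, r=14, w=2):
    """Split an (s+w)-bit address into (tag, line, word) for a direct-mapped cache."""
    word = addr & ((1 << w) - 1)             # low w bits: word within the block
    line = (addr >> w) & ((1 << r) - 1)      # next r bits: cache line number
    tag  = addr >> (w + r)                   # remaining s - r bits: the tag
    return tag, line, word

tag, line, word = direct_map_fields(0xFFFFFF)   # all-ones 24-bit address
# tag is 8 bits, line is 14 bits, word is 2 bits
```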
Direct Mapping pros & cons
• Simple
• Inexpensive
• Fixed location for given block
— If a program accesses 2 blocks that map to the
same line repeatedly, cache misses are very
high
— Its main disadvantage is that there is a fixed
cache location for any given block. Thus, if a
program happens to reference words
repeatedly from two different blocks that map
into the same line, then the blocks will be
continually swapped in the cache, and the hit
ratio will be low (a phenomenon known as
thrashing).
Associative Mapping
• Associative mapping overcomes the
disadvantage of direct mapping by
permitting each main memory block to be
loaded into any line of the cache.
• A main memory block can load into any
line of cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive
• The cache control logic interprets a
memory address simply as a Tag and a
Word field.
• The Tag field uniquely identifies a block of
main memory. To determine whether a
block is in the cache, the cache control
logic must simultaneously examine every
line’s tag for a match
Associative Mapping from
Cache to Main Memory
Fully Associative Cache Organization
Associative Mapping
Address Structure
Tag: 22 bits | Word: 2 bits
• 22 bit tag stored with each 32 bit block of data
• Compare tag field with tag entry in cache to
check for hit
• Least significant 2 bits of address identify which
16 bit word is required from 32 bit data block
• e.g.
— Address: FFFFFC, Tag: 3FFFFF, Data: 24682468, Cache line: 3FFF
Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
• With associative mapping, there is
flexibility as to which block to replace
when a new block is read into the cache.
• The principal disadvantage of associative
mapping is the complex circuitry required
to examine the tags of all cache lines in
parallel.
Set Associative Mapping
• Set- associative mapping is a compromise
that exhibits the strengths of both the
direct and associative approaches while
reducing their disadvantages.
• Cache is divided into a number of sets
• Each set contains a number of lines
• A given block maps to any line in a given
set
— e.g. Block B can be in any line of set i
• e.g. 2 lines per set
— 2 way associative mapping
— A given block can be in one of 2 lines in only
one set
• In this case, the cache consists of a number of
sets, each of which consists of a number of lines.
• This is referred to as k-way
set-associative mapping.
Set Associative Mapping
Example
• 13 bit set number
• Block number in main memory is modulo 2^13
Mapping From Main Memory to Cache:
v Associative
Mapping From Main Memory to Cache:
k-way Associative
K-Way Set Associative Cache
Organization
Set Associative Mapping
Address Structure
Tag: 9 bits | Set: 13 bits | Word: 2 bits
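Decomposing an address for a set-associative cache follows the same bit-slicing pattern as the other mappings. A sketch using the 9/13/2 split above (the function name is illustrative):

```python
def set_assoc_fields(addr, set_bits=13, word_bits=2):
    """Split an address into (tag, set, word) for a set-associative cache."""
    word = addr & ((1 << word_bits) - 1)                    # word within the block
    set_no = (addr >> word_bits) & ((1 << set_bits) - 1)    # block number mod 2^13
    tag = addr >> (word_bits + set_bits)                    # remaining 9 bits
    return tag, set_no, word

tag, set_no, word = set_assoc_fields(0xFFFFFF)   # all-ones 24-bit address
# tag is 9 bits, set is 13 bits, word is 2 bits
```

Only the tags of the k lines in set `set_no` need to be compared, which is the compromise between direct mapping (one comparison) and fully associative mapping (compare every line).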
• Disadvantages of RAID 0
• RAID 0 cannot be used in critical systems because it is unable to
tolerate a fault.
• If one disk fails in RAID 0, then the data on all the other disks is also lost.
• RAID 1
• RAID 1 can also be called Mirroring. It takes all data written to one disk
and writes it to a second disk, in parallel with the first disk.
In RAID 1, there is very high redundancy because each disk contains
an exact copy of the data on the other disk.
• It needs a minimum of two disks to work. The setup of RAID 1 provides
protection against data loss, or we can say that it has fault-tolerance
capability.
• If one disk fails, then the copy on the other disk provides the required data.
• Here, the system can read the data from both disks simultaneously.
Because of this feature, it also speeds up read performance and
availability.
• The performance of the write operation, however, is affected: it takes more
time than the read operation because RAID 1 has two disks writing in
parallel, the write operation uses the capacity of only one disk, and the
same data has to be written twice.
• In RAID 1, the downside is the high cost of disks, because twice the
capacity that is actually needed must be built at this level.
• Advantages of RAID 1
• Compared to a single disk, RAID 1 provides excellent read and
write speed.
• It has fault-tolerance ability. If one disk fails, we don't need to rebuild
the data; we simply copy the data onto the replacement disk.
• It is a very simple technology, and the implementation of RAID 1 is also
very simple.
• Disadvantages of RAID 1
• In RAID 1, the data has to be written twice. That's why the effective
storage capacity is only half of the total disk capacity, and this is the main
disadvantage of RAID 1.
• RAID 1 is more expensive than RAID 0 because it needs twice as many
disks to mirror the data.
• Hot-swapping of failed disks is not always allowed by software RAID 1:
often the failed disk can only be replaced after powering down the
computer to which it is attached.
• A lot of people use the servers simultaneously, and this power-down
process may not be acceptable to them. That's why these types of
systems typically use hardware controllers, because they support
hot-swapping.
• RAID 2
• When we write data to the disks, RAID 2 evaluates the ECC code (error
correction code) for the data on the fly. After that, it stripes the data
bits across the data disks and, finally, writes the ECC code to the
redundancy disks. When we read data from the disks, it uses the
redundancy disks to read the corresponding ECC code and verifies
whether the data is consistent. If needed, it performs appropriate
corrections on the fly.
• This process uses many disks and can be configured in various disk
configurations. Nowadays, RAID 2 is no longer useful because it is
costly and its implementation in a RAID controller is difficult. ECC is
also now redundant, because hard disks are capable of doing the ECC
work themselves.
• Advantages of RAID 2
• RAID 2 uses a Hamming code for error correction.
• It can store the parity with the help of one designated drive.
• Disadvantages of RAID 2
• RAID 2 needs an extra drive for error detection.
• Because it contains an extra drive, it is expensive and has a
complex structure.
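The Hamming code mentioned above can detect and correct any single-bit error. A minimal illustrative Hamming(7,4) sketch (the source only says RAID 2 uses a Hamming code; this particular 7-bit layout, p1 p2 d1 p3 d2 d3 d4, is the textbook form, not something specified in these notes):

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4            # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4            # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4            # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Recompute the parity checks; the syndrome gives the 1-based error position."""
    c = c[:]
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:                 # nonzero syndrome = position of the flipped bit
        c[syndrome - 1] ^= 1
    return c
```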
• RAID 3
• RAID 3 can also be called byte-level striping. RAID 3 works in the same way
as RAID 0 in that it stripes data, but at byte level, and it also needs an extra
disk in the array. RAID 3 uses a dedicated disk to support the parity code
calculations, which can be called the 'parity disk'. In RAID 3, we stripe
bytes across the disks in place of striping blocks across the disks. At this
level, we require multiple data disks and a dedicated disk on which we
store the parity. In the configuration process of RAID 3, the data is divided
into individual bytes and then saved to the disks. For each row of data,
the parity is determined and then saved on the designated parity disk. If
there is any failure, RAID 3 can recover the data with the help of the
parity bytes that correspond to it and the appropriate calculation from
the remaining bytes.
• Although this level is rarely used in practice, it has a number of benefits: it can withstand
damage to one disk in the arrangement, and it has a very high read speed. Unfortunately,
RAID 3 also has a lot of drawbacks. First, compared to the read speed, the write speed is
very slow because of the need to calculate checksums (RAID hardware controllers are also
unable to solve this problem). Second, if there is any disk failure, the whole system works
very slowly. RAID 3 can resist a breakdown, meaning that if any disk in the array fails, the
damaged disk can be replaced, but the replacement process is very costly. Third, the disk
used for calculating checksums is the bottleneck of the entire array's performance.
• Given the description above, RAID 3 is not a good, reliable, and cheap solution. That's why
RAID 3 is rarely used in practice. Systems based on RAID 3 are mostly used for
implementations where very large files are accessed by a small number of users.
• Advantages of RAID 3
• RAID 3 provides high throughput for transferring huge amounts of data.
• It solves RAID 2's main disadvantage: it is resistant to disk failure and
breakdown.
• Disadvantages of RAID 3
• If we only need to transfer a small file, the configuration may be excessive.
• Any disk failure significantly decreases the throughput.
• RAID 4
• RAID 4 is known as block-level striping. RAID 4 works in the same way as RAID 3; the
main difference between them is how the data is shared. It is divided into blocks of, for
example, 16, 32, 64, or 128 KB. As in RAID 0, the blocks are written across the disks, and
for each row of written data, a parity disk is used to record a parity block. That means this
level uses block-level striping of data in place of byte-level striping. RAID 5 and RAID 4
have a lot of similarities, but RAID 4 confines all parity data to a single disk. So we can say
that it does not use distributed parity.
• In RAID 4, we can complete the implementation and configuration with a minimum of
three disks. RAID 4 also requires hardware support to perform the parity calculations,
which make it possible to recover data with the help of the appropriate mathematical
operations.
• Advantages of RAID 4
• RAID 4 allows block-level striping, which provides the facility to serve I/O requests
simultaneously.
• It provides low storage overhead, which becomes even lower as more disks are added.
• This level does not need a synchronized controller or spindles.
• Disadvantages of RAID 4
• It contains a parity drive, which may become a bottleneck.
• Simultaneous write operations are slower because the parity information is written
to one disk.
• RAID 5
• RAID 5 can be called striping with parity. It uses block-level data striping and also uses
distributed parity. RAID 5 needs a minimum of three disks but can work with up to 16
disks. It is the most secure RAID level. Parity is a type of raw binary data. The RAID
system calculates parity values and, using these values, creates a parity block. If any
disk fails in the RAID system, it uses the parity block to recover the striped data. Most
RAID systems with a parity function use the array to store the parity blocks across the
disks. At this level, data blocks are striped across the drives, and the parity checksum of
all the data blocks in a stripe is written to only one drive. The parity checksums do not
use a fixed drive but are spread across all the drives. If the data of any data block is no
longer available, the computer can recalculate the data with the help of the parity data.
That means that in the event of a single drive failure, RAID 5 can survive the failure of
any one disk in the array without losing data or access to it.
• Advantages of RAID 5
• In RAID 5, the read data transactions are very fast, while the write data transactions are
slower because of the calculation of parity.
• If there is any disk failure in RAID 5, we still have access to all the data even while the
failed drive is being replaced and the data is rebuilt by the storage controller on a new disk.
• Disadvantages of RAID 5
• Any disk failure affects the throughput, but the degradation is still acceptable.
• RAID 5 is a complex technology. Suppose there is a 4 TB disk in an array of several disks,
and it fails. In this case, replacing the disk and restoring its data may take a day or more,
depending on the speed of the controller and the load on the array. If another disk goes
bad during that time, the data will be lost forever.
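The parity recovery described above is a byte-wise XOR across the stripe: the parity block is the XOR of the data blocks, so XOR-ing the surviving blocks with the parity reproduces the lost block. A minimal sketch (block contents are illustrative):

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks, as used for RAID 5 parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

d0, d1, d2 = b"\x01\x02", b"\xff\x00", b"\x10\x20"   # three striped data blocks
parity = xor_blocks([d0, d1, d2])                    # parity block for this stripe

# The disk holding d1 fails: XOR the surviving blocks with the parity to rebuild it.
recovered = xor_blocks([d0, d2, parity])
assert recovered == d1
```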
• RAID 6
• RAID 6 can also be called striping with double parity. RAID 6 works in the same way as
RAID 5; the difference between them is that the system stores an additional parity
block on each disk in RAID 6. This enables a configuration in which two disks may fail
before the array becomes unavailable. It needs two different sets of parity calculations,
and it has the ability to rebuild an array even if two drives fail simultaneously. RAID 6
needs a minimum of four disks and can withstand two disks dying simultaneously: two
disks are used for the data, and the remaining two disks are used for parity information.
If the number of disks rises, the chance of multiple failures increases, and the
complexity of rebuilding the disk set also increases.
• Compared to RAID 5, it offers higher redundancy and also increases read
performance. Under intensive write operations, this level also suffers from the
same server performance overhead. The impact depends on the architecture of
the RAID system, i.e., software or hardware: if the system performs the
high-performance parity calculation in included processing software, or if it is
located in the firmware, the performance will be affected.
• In RAID 6, the chance of two disks failing at the same time is very low. In a
RAID 5 system, if any disk fails, it takes hours, days, or more to replace it with
a new disk; at that time, if another disk fails, we lose all of our data forever.
But in RAID 6, the RAID array will survive even that second failure.
• Advantages of RAID 6
• In RAID 6, the read data transactions are very fast, just like RAID 5.
• It is more secure than RAID 5 because if two disks fail, we can still access all our data
even while the failed disks are being replaced.
• Disadvantages of RAID 6
• In RAID 6, we have to calculate the additional parity. That's why write data transactions
in RAID 6 are slower than in RAID 5; they can be slower by as much as 20%.
• Any disk failure affects the throughput, but the degradation is still acceptable.
• RAID 6 is a complex technology. If there is a disk failure in a RAID array, rebuilding the
array can take a long time.
• Optical Memory
• Optical memory was released in 1982, developed by Sony and Philips. These
memories perform their operations with the help of light beams, and they
need an optical drive for the operations. We can use optical memory to store
backups, audio, and video, and also for carrying data. Its read/write speed is
slower than that of a flash drive or a hard drive. Examples of optical memory
are the Compact Disk (CD), the Blu-ray Disk (BD), and the Digital Versatile
Disk (DVD).
• Compact Disk (CD)
• It is a type of digital audio system which is used to store data. It is composed of circular
plastic, in which an aluminium alloy coats a single side of the plastic and is used to store
the data. It also contains an additional thin plastic covering, which is used to protect the
data. A CD performs its operations with the help of a CD drive. The compact disk can be
called a non-erasable disk. Here we use a laser beam to imprint the data on the disk.
Originally, CDs were used to hold 60 to 75 minutes of audio information and can store
about 700 MB of data. Since then, many devices have been developed which offer lower
cost and higher capacity compared to the CD.
• Types of Compact Disk
• CD-ROM:
• CD-ROM is also known as CD read-only memory. It is mainly used to store computer data. As we
saw earlier, compact disks were first used to store video and audio data, but since they store data
in digital form, compact disks can also be used to store computer data.
• If there is some error in audio or video content, the appliance ignores it, and the error does not
reflect in the reproduced video or audio. But if computer data contains any error, a CD-ROM will
not tolerate it, and the error will reflect in the reproduced data. When indenting pits on compact
disks, it is impossible to prevent physical imperfections. So in order to detect and correct the
errors, we have to add some extra bits.
• The compact disk (CD) and the compact disk read-only memory (CD-ROM) contain one spiral
track, beginning at the track's centre and spiralling out towards the outer edge. CD-ROM uses
blocks, or sectors, to store the data.
• CD-R:
• CD-R is also known as CD-Recordable. It is a type of write-once-read-many medium,
or we can say that it allows a single recording on a disk. It is used in the types of
applications that require one or a small number of copies of a set of data. A
CD-Recordable is composed of a polycarbonate plastic substrate, a coating of thin
reflective metal, and a protective outer coating.
• CD-RW:
• CD-RW is also known as CD-Rewritable. It is a type of compact disk format which
allows us to record on a disk repeatedly. CD-Rewritable and CD-Recordable are
composed of the same materials: a polycarbonate plastic substrate, a coating of
thin reflective metal, and a protective outer coating. In the CD-RW, the dye is
replaced by an alloy.
• Digital Versatile Disk (DVD)
• DVD (digital versatile disk) technology was first launched in 1996. The CD
(compact disk) and the DVD (digital versatile disk) have the same
appearance; the storage size is the main difference between the CD and
the DVD. The storage size of a DVD is much larger than that of a CD.
When designing DVDs, several changes were made to the design to make
the storage larger.
• Blu-Ray DVD
• A Blu-ray disk is a type of high-capacity optical disk medium which is used to store a huge amount of
data and to record and play back high-definition video. Blu-ray was designed to supersede the DVD.
While a CD can store 700 MB of data and a DVD can store 4.7 GB of data, a single Blu-ray disk can store
up to 25 GB of data. Dual-layer Blu-ray disks can hold 50 GB of data. That amount of storage is
equivalent to 4 hours of HDTV. There is also a double-sided dual-layer DVD, which is commonly used
and can store 17 GB of data.
• Blu-ray disks use blue lasers, which help them hold more information compared to other optical media.
The laser is actually known as 'blue-violet', but the developers shortened 'Blue-violet-ray' to 'Blu-ray' so
that it rolls off the tongue a little more easily. CDs and DVDs use a red laser, whose wavelength (650 nm)
is greater than that of the blue-violet laser (405 nm). With the help of a smaller wavelength, the laser can
focus on a smaller area. In Blu-ray disks, we can therefore use the same physical size as a CD or DVD
and store a much larger amount of data on a disk. Blu-ray is able to provide very high resolution
compared to the DVD. On the basis of standard definition, a DVD can provide a definition of 720x480
pixels. In contrast, Blu-ray high definition offers 1920x1080 pixel resolution.
• Magnetic Tape
• Reading and writing techniques in a tape system are the same as in a disk system.
Here the medium is flexible polyester tape coated with a magnetizable
material. Data on the tape can be structured as a number of parallel tracks
running lengthwise; recording in this form is called parallel
recording. Most modern systems instead use serial recording, which
lays out the data as a sequence of bits along each track, as is done on a
magnetic disk. In serial recording, data on the tape are read and written in
contiguous blocks called physical records.
• A tape drive is a sequential-access device. If the current
position of the head is beyond the desired record, the tape must be rewound
a certain distance before reading forward again. The tape is in motion only
during read and write operations. The difference from a tape drive is that a
disk drive is a direct-access device: it can reach a desired sector without
sequentially reading all sectors on the disk. It has only to wait until the
intervening sectors on one track have passed under the head, after which it
can access any track directly.
• Magnetic tape is also a type of secondary memory; it is the slowest-speed and
lowest-cost member of the memory hierarchy.
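The sequential-versus-direct access contrast above can be sketched in a toy model (the class names and cost units are illustrative, not a real device model):

```python
class Tape:
    """Sequential-access device: the head moves one block at a time."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.head = 0

    def read(self, target):
        # Cost is the distance the head must travel (rewind or wind forward).
        moves = abs(target - self.head)
        self.head = target
        return self.blocks[target], moves

class Disk:
    """Direct-access device: any sector reachable at roughly constant cost."""
    def __init__(self, sectors):
        self.sectors = sectors

    def read(self, target):
        return self.sectors[target], 1  # one seek, regardless of position

data = list(range(100))
tape, disk = Tape(data), Disk(data)
_, tape_cost = tape.read(90)   # long forward wind from the start
_, back_cost = tape.read(10)   # must travel back 80 blocks
_, disk_cost = disk.read(10)   # constant-cost direct access
print(tape_cost, back_cost, disk_cost)  # 90 80 1
```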
Virtual Memory is a storage allocation scheme in which secondary memory can be addressed
as part of the main memory.
Virtual memory uses both hardware and software to enable a computer to compensate for
physical memory shortages, temporarily transferring data from random access memory (RAM)
to disk storage. Mapping chunks of memory to disk files enables a computer to treat secondary
memory as though it were main memory.
The size of virtual storage is limited by the addressing scheme of the computer system and the
amount of secondary memory available, not by the actual number of main storage locations.
It is a technique that is implemented using both hardware and software. It maps memory
addresses used by a program, called virtual addresses, into physical addresses in computer
memory.
Advantages:
It frees applications from managing shared memory and saves users from having to add
memory modules when RAM space runs out.
It increases speed when only a segment of a program is needed for execution.
Pages in the original process can be shared during a fork system call operation that creates a
copy of itself.
Disadvantages:
Data must be mapped between virtual and physical memory, which requires extra hardware
support for address translation, slowing the computer down.
The size of virtual storage is limited by the amount of secondary storage, as well as by the
addressing scheme of the computer system.
Thrashing can occur if there is not enough RAM, which will make the computer perform
slower.
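The paging scheme described above can be illustrated with a minimal sketch of virtual-to-physical address translation (the page size and page-table contents are invented for the example):

```python
PAGE_SIZE = 4096  # bytes per page; a common choice, assumed here

# Page table: virtual page number -> physical frame number.
# Missing entries model pages currently swapped out to disk.
page_table = {0: 5, 1: 9, 3: 2}

def translate(vaddr):
    """Map a virtual address to a physical address via the page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # In a real system this triggers a page fault and a disk read.
        raise LookupError(f"page fault: virtual page {vpn} not resident")
    return page_table[vpn] * PAGE_SIZE + offset

print(hex(translate(0x1004)))  # VPN 1 maps to frame 9 -> 0x9004
```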
When talking about the differences between virtual and physical memory, the biggest distinction
commonly made is speed. RAM is considerably faster than virtual memory. RAM, however,
tends to be more expensive.
When a computer requires storage, RAM is the first used. Virtual memory, which is slower, is
used only when the RAM is filled.
• Definition:
• Virtual memory: an abstraction that extends the available memory by using disk storage.
• Physical memory: the actual hardware (RAM) that stores data and instructions currently being used by the CPU.
Topic: Input/Output
[Diagram: the processor connects over the bus to I/O modules; each module's device interfaces link to external devices, I/O sensors, and controls]
External Devices
• External devices are needed as a means of communication
to the outside world (both input and output – I/O)
• Types
• Human readable – communication with user (monitor, printer,
keyboard, mouse)
• Machine readable – communication with equipment (hard drive,
CDROM, sensors, and actuators)
• Communication – communication with remote
computers/devices (can be any of the first two or a network
interface card or modem)
Generic Device Interface Configuration
Device Interface Components
• The control logic is the I/O module's interface to the device
• The data channel passes the collected data from or the data to be
output to the device. On the opposite end is the I/O module, but
eventually it is the processor.
• The transducer acts as a converter between the digital data of the
I/O module and the signals of the outside world.
• Keyboard converts motion of a key into data representing the key
pressed or released
• Temperature sensor converts amount of heat into a digital value
• Disk drive converts data to electronic signals for controlling the
read/write head
I/O Module Functions
• Control & Timing
• Processor Communication
• Device Communication
• Data Buffering
• Error Detection
I/O Module: Control and Timing
• Required because of multiple devices all communicating on the same
channel
• Example
• CPU checks I/O module device status
• I/O module returns status
• If ready, CPU requests data transfer
• I/O module gets data from device
• I/O module transfers data to CPU
• Variations for output, DMA, etc.
I/O Module: Processor Communication
• Commands from processor – Examples: READ SECTOR, WRITE
SECTOR, SEEK track number, and SCAN record ID.
• Data – passed back and forth over the data bus
• Status reporting – request from the processor for the I/O module's
status. May be as simple as BUSY and READY
• Address recognition – the I/O device is set up as a block of one or more
addresses unique to itself
Other I/O Module Functions
• Device Communication – specific to each device
• Data Buffering – due to the differences in speed (the device
is usually orders of magnitude slower), the I/O module
needs to buffer data to keep from tying up the CPU's bus
with slow reads or writes
• Error Detection – simply distributes the job of
watching for errors to the module. Errors may include:
• Malfunctions by the device (paper jam)
• Data errors (parity checking at the device level)
• Internal errors to the I/O module such as buffer overruns
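As an illustration of the device-level parity checking mentioned above, here is a minimal even-parity sketch (the function names are invented for the example):

```python
def parity_bit(byte):
    """Even-parity bit: chosen so the total number of 1 bits is even."""
    return bin(byte).count("1") % 2

def check(byte, bit):
    """True if the byte plus its parity bit has an even number of 1s."""
    return (bin(byte).count("1") + bit) % 2 == 0

b = 0b1011001              # four 1 bits -> parity bit is 0
p = parity_bit(b)
assert check(b, p)         # transfer arrives intact

corrupted = b ^ 0b0000100  # a single flipped bit on the bus
assert not check(corrupted, p)  # the module flags a data error
print("parity check works")
```

A single-bit parity check detects any odd number of flipped bits but cannot locate or correct them, which is why it is a device-level error *detection* mechanism.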
I/O Module Structure
I/O Module Level of Operation
• How much control will the CPU be required to handle?
• How much will the CPU be allowed to handle?
• What will the interface look like, e.g., Unix treats everything like a file
• Support multiple or single device
• Will additional control be needed for multiple devices on a single port
(e.g., serial port versus USB)
Input/Output Techniques
• Programmed I/O – poll and response
• Interrupt driven – module calls for CPU when needed
• Direct Memory Access (DMA) – module has direct access to a specified
block of memory
Programmed I/O –
CPU has direct control over I/O
• Processor requests an operation with commands sent to the I/O
module
• Control – telling a peripheral what to do
• Test – used to check the condition of the I/O module or device
• Read – obtains data from the peripheral so the processor can read it
from the data bus
• Write – sends data using the data bus to the peripheral
• I/O module performs the operation
• When completed, the I/O module updates its status registers
• Sensing status – involves polling the I/O module's status
registers
Programmed I/O (continued)
• I/O module does not inform the CPU directly
• CPU may wait, or do something else and come back later
• Wastes CPU time because the processor is typically much faster than I/O
• CPU acts as a bridge for moving data between the I/O module and main memory,
i.e., every piece of data goes through the CPU
• CPU waits for the I/O module to complete the operation
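The poll-and-response pattern above can be sketched as a small simulation (the IOModule class and its status register are invented stand-ins for real hardware):

```python
import random

class IOModule:
    """Simulated I/O module with a status register the CPU must poll."""
    BUSY, READY = 0, 1

    def __init__(self, data):
        self.status = self.BUSY
        self._data = data
        self._ticks = random.randint(1, 5)  # device latency in poll cycles

    def tick(self):
        """Advance the device by one cycle; go READY when done."""
        self._ticks -= 1
        if self._ticks <= 0:
            self.status = self.READY

    def read_data(self):
        return self._data

def programmed_read(module):
    """CPU busy-waits on the status register: wasted cycles are counted."""
    polls = 0
    while module.status != IOModule.READY:
        module.tick()
        polls += 1
    return module.read_data(), polls

value, wasted = programmed_read(IOModule(0x42))
print(hex(value), "after", wasted, "wasted polls")
```

Every poll in the loop is a cycle the CPU could have spent on other work, which is exactly the inefficiency that interrupt-driven I/O removes.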
Interrupt Driven I/O
• Overcomes CPU waiting
• Requires setup code and an interrupt service routine
• No repeated CPU checking of the device
• I/O module interrupts when ready
• Still requires the CPU to be the go-between for moving data between the I/O
module and main memory
Basic Interrupt I/O Operation
• CPU initializes the process
• I/O module gets data from the peripheral while the CPU does other work
• I/O module interrupts the CPU
• CPU requests data
• I/O module transfers data
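The sequence above can be mimicked in software, with a thread standing in for the I/O module and a callback standing in for the interrupt service routine (all names here are illustrative, not a real driver API):

```python
import threading
import time

received = []

def isr(data):
    """Interrupt service routine: runs when the module signals completion."""
    received.append(data)

def io_module(data, interrupt_handler):
    """I/O module: fetches data on its own, then 'raises the interrupt'."""
    time.sleep(0.05)          # device latency; CPU is free meanwhile
    interrupt_handler(data)   # stands in for asserting the interrupt line

t = threading.Thread(target=io_module, args=(0x7F, isr))
t.start()
other_work = sum(range(1000))  # CPU does useful work instead of polling
t.join()
print(received)  # [127]
```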
Design Issues
• Resolution of multiple interrupts – How do you identify the module
issuing the interrupt?
• Priority – How do you deal with multiple interrupts at the same time,
or with interrupting in the middle of an interrupt?
Identifying Interrupting Module
• Different interrupt line for each module
• Limits the number of devices
• Even with this method, there are often multiple interrupts still on a
single interrupt line
• Priority is set by hardware
Software poll
• Single interrupt line – when an interrupt occurs, the CPU then goes out to
check who needs attention
• Slow
• Priority is set by order in which CPU polls devices
Daisy Chain or Hardware Poll
• Interrupt Acknowledge sent down a chain
• Module responsible places a unique vector on the bus
• CPU uses the vector to identify the handler routine
• Priority is set by the order in which the interrupt
acknowledge reaches the I/O modules, i.e., the order of
devices on the chain
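Vector selection by chain position can be sketched as follows (the module names and vector values are invented for the example):

```python
class Module:
    """One device on the daisy chain with its interrupt vector."""
    def __init__(self, name, vector, requesting=False):
        self.name, self.vector, self.requesting = name, vector, requesting

def daisy_chain_ack(chain):
    """Pass the interrupt acknowledge down the chain; the first
    requesting module answers with its vector, so priority follows
    position on the chain."""
    for module in chain:
        if module.requesting:
            return module.vector
    return None  # spurious interrupt: nobody was requesting

chain = [
    Module("disk", 0x20),
    Module("uart", 0x24, requesting=True),
    Module("timer", 0x28, requesting=True),
]
print(hex(daisy_chain_ack(chain)))  # 0x24: uart wins, being earlier in chain
```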
Bus Arbitration
• Allow multiple modules to control the bus
• I/O module must claim the bus before it can raise an
interrupt
• Can do this with:
• Bus controller/arbiter
• Distributed control among devices
• Must be one master, either processor or other device
• Device that "wins" places a vector on the bus uniquely
identifying the interrupt
• Priority is set by priority in arbitration, i.e., whoever is
currently in control of the bus
Direct Memory Access (DMA)
• Impetus behind DMA – interrupt-driven and programmed I/O require
active CPU intervention (all data must pass through the CPU)
• Transfer rate is limited by the processor's ability to service the device
• CPU is tied up managing the I/O transfer
DMA (continued)
• Additional module (hardware) on the bus
• DMA controller takes over the bus from the CPU for I/O
• Waiting for a time when the processor doesn't need the bus
• Cycle stealing – seizing the bus from the CPU (more common)
DMA Operation
• CPU tells DMA controller:
• whether it will be a read or write operation
• the address of the device to transfer data from
• the starting address of the memory block for the data transfer
• the amount of data to be transferred
• DMA performs transfer while CPU does other processes
• DMA sends interrupt when completed
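The setup above can be modeled in a small simulation (the register names and memory layout are invented; a real controller moves data over the bus, not through a Python list):

```python
class DMAController:
    """Toy DMA controller: the CPU programs four 'registers', then the
    controller moves the data without further CPU involvement."""
    def __init__(self, memory):
        self.memory = memory

    def program(self, write, device, mem_addr, count):
        # The four items the CPU supplies: direction, device,
        # starting memory address, and transfer length.
        self.write, self.device = write, device
        self.mem_addr, self.count = mem_addr, count

    def run(self, raise_interrupt):
        if self.write:  # device -> memory transfer
            for i in range(self.count):
                self.memory[self.mem_addr + i] = self.device.pop(0)
        raise_interrupt()  # signal completion to the CPU

memory = [0] * 16
done = []
dma = DMAController(memory)
dma.program(write=True, device=[1, 2, 3], mem_addr=4, count=3)
dma.run(lambda: done.append("interrupt"))
print(memory[4:7], done)  # [1, 2, 3] ['interrupt']
```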
Evolutions of I/O Methods
Growth of more sophisticated I/O devices
1. Processor directly controls device
2. Processor uses Programmed I/O
3. Processor uses Interrupts
4. Processor uses DMA
5. Some processing moved to processors in the I/O module
that access programs in memory and execute them on
their own without CPU intervention (I/O module
referred to as an I/O Channel)
6. Distributed processing where the I/O module is a computer
in its own right (I/O module referred to as an I/O
Processor)
Computer Organization
Asynchronous input output
synchronization
• Asynchronous input/output (I/O) synchronization is a technique used
in computer organization to manage the transfer of data between the
central processing unit (CPU) and external devices. In asynchronous
I/O synchronization, data transfer occurs at an unpredictable rate,
with no fixed timing or synchronization between the CPU and external
devices.
• This approach differs from synchronous I/O synchronization, which
uses a clock signal to synchronize the transfer of data between the
CPU and external devices, and ensures that data is transferred at a
fixed rate.
• Asynchronous I/O synchronization is typically used in situations where
data transfer rates are variable or unpredictable, such as in serial
communication or with slow devices. In these cases, the use of a clock
signal to synchronize data transfer can waste resources
or slow down the transfer of data.
• To manage the transfer of data in asynchronous I/O synchronization,
the CPU typically uses interrupt-driven I/O, where it waits for an
interrupt signal from the device to indicate that data is ready for
transfer. The CPU can then initiate the transfer of data, and the device
will send data back in an asynchronous manner.
• Asynchronous input/output is a form of I/O processing that
allows other devices to do processing before the transmission or
data transfer is done. Problem faced in asynchronous input/output
synchronization – there is no guarantee that the data on the data bus is
fresh, because there is no fixed time slot for sending or receiving data.
This problem is solved by the following mechanisms:
• Strobe
• Handshaking
• Data is transferred from the source to the destination through the data bus in
between.
1. Strobe Mechanism:
• Source-initiated strobe – when the source initiates the process of data
transfer. A strobe is just a signal.
• (i) First, the source puts data on the data bus and turns the strobe signal ON. (ii)
The destination, on seeing the strobe ON, reads the data from the data
bus. (iii) After the destination has read the data, the strobe
goes OFF. The signals can be seen as:
• This shows that first the data is put on the data bus and then the strobe signal
becomes active.
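The three steps can be mimicked with a shared structure standing in for the bus (a single-threaded sketch for illustration only; real strobe timing is electrical, not function calls):

```python
bus = {"data": None, "strobe": 0}
log = []

def source_put(value):
    """Source side: (i) put data on the bus, then raise the strobe."""
    bus["data"] = value
    bus["strobe"] = 1
    log.append("strobe on")

def destination_read():
    """Destination side: (ii) read on seeing the strobe,
    (iii) strobe goes off after the read."""
    if bus["strobe"]:
        value = bus["data"]
        bus["strobe"] = 0
        log.append("read, strobe off")
        return value
    return None  # nothing fresh on the bus

source_put(0xAB)
print(hex(destination_read()), "strobe =", bus["strobe"])
```

Note the limitation the notes point out next: the source cannot tell from these signals alone whether the destination actually performed the read.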
• Destination-initiated strobe – when the destination initiates the process of data
transfer.
• (i) First, the destination turns the strobe signal ON to ask the source to put
fresh data on the data bus. (ii) The source, on seeing the ON signal, puts fresh
data on the data bus. (iii) The destination reads the data from the data bus and
the strobe goes OFF. The signals can be seen as:
• This shows that first the strobe signal becomes active and then the data is put on the data bus.
• Problems faced in strobe-based asynchronous input/output –
• In source-initiated strobe, it is assumed that the destination has read the
data from the data bus, but there is no surety.
• In destination-initiated strobe, it is assumed that the source has put the
data on the data bus, but there is no surety.
• This problem is overcome by handshaking.
2. Handshaking Mechanism:
• Source-initiated handshaking – when the source initiates the data transfer
process. It uses two signals. DATA VALID: if ON, the data on the data
bus is valid; otherwise it is invalid. DATA ACCEPTED: if ON, the data has been
accepted; otherwise it has not.
• (i) The source places data on the data bus and enables the Data Valid signal. (ii)
The destination accepts the data from the data bus and enables the Data Accepted
signal. (iii) After this, the Data Valid signal is disabled, meaning the data on the data bus
is no longer valid. (iv) The Data Accepted signal is disabled and the process ends.
Now there is surety, through the Data Accepted signal, that the destination has read
the data from the data bus. The signals can be seen as:
• This shows that first the data is put on the data bus, then the Data Valid signal
becomes active, and then the Data Accepted signal becomes active. After the data is
accepted, first the Data Valid signal goes off and then the Data Accepted signal.
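Steps (i)-(iv) can be sketched the same way, with the two handshake lines as flags (again a single-threaded illustration, not real bus timing):

```python
bus = {"data": None, "data_valid": 0, "data_accepted": 0}
trace = []

def source_send(value):
    """(i) Place data on the bus and raise DATA VALID."""
    bus["data"] = value
    bus["data_valid"] = 1
    trace.append("valid on")

def destination_receive():
    """(ii)-(iv) Accept the data and complete the handshake."""
    assert bus["data_valid"], "no valid data on the bus"
    value = bus["data"]
    bus["data_accepted"] = 1   # (ii) DATA ACCEPTED raised
    trace.append("accepted on")
    bus["data_valid"] = 0      # (iii) source drops DATA VALID
    trace.append("valid off")
    bus["data_accepted"] = 0   # (iv) destination drops DATA ACCEPTED
    trace.append("accepted off")
    return value

source_send(0xC3)
print(hex(destination_receive()))
print(trace)  # signal order matches steps (i)-(iv)
```

The trace makes the interlocking visible: each side only changes its line after seeing the other side's line change, which is what gives the surety that strobe alone lacks.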
• Destination-initiated handshaking – when the destination initiates the
process of data transfer. REQUEST FOR DATA: if ON, requests that
data be put on the data bus. DATA VALID: if ON, the data on
the data bus is valid; otherwise it is invalid.
• (i) When the destination is ready to receive data, the Request for Data signal
is activated. (ii) The source, in response, puts data on the data bus and
enables the Data Valid signal. (iii) The destination then accepts the data from the
data bus and, after accepting it, disables the Request for Data signal.
(iv) Finally, the Data Valid signal is disabled, meaning the data on the data bus
is no longer valid. Now there is surety, through the Data Valid signal, that the
source has put the data on the data bus. The signals can be seen as:
• This shows that first the Request for Data signal becomes active, then the data is put on the data bus, and then the Data Valid
signal becomes active. After the data is read, first the Request for Data signal goes off and then the Data Valid signal.
• Features:
• Callback functions: A callback function is a function that is called by the operating system or
device driver when a data transfer operation is completed. The CPU can continue with other tasks
while the device is performing the data transfer operation. Once the operation is complete, the
device driver calls the callback function, which can be used to notify the CPU that the data transfer
has finished.
• Interrupts: Interrupts are signals sent by devices to the CPU to indicate that an event has
occurred. In the case of asynchronous I/O, interrupts can be used to signal the CPU that a data
transfer operation has completed. When an interrupt occurs, the CPU stops executing its current
task and transfers control to an interrupt service routine (ISR) that is responsible for handling the
interrupt.
• Polling: Polling is a technique used to check the status of a device
periodically to determine if it has completed a data transfer operation.
With asynchronous I/O, the CPU can poll the device periodically to check
if the data transfer has finished. If the transfer is complete, the CPU can
then retrieve the data from the device.
• Select function: The select function is a system call used to monitor
multiple file descriptors for input or output readiness. With asynchronous
I/O, the select function can be used to monitor the status of a device and
notify the CPU when a data transfer operation has completed.
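A concrete use of select, with a pipe standing in for a slow device (POSIX-style; assumes a Unix-like system):

```python
import os
import select

# A pipe stands in for a slow device; select() tells the CPU when the
# read end is actually ready, so it need not block on the read call.
r, w = os.pipe()

ready_before, _, _ = select.select([r], [], [], 0)  # poll with zero timeout
print(ready_before)                                 # [] -- nothing to read yet

os.write(w, b"done")                                # "device" finishes its transfer
ready_after, _, _ = select.select([r], [], [], 0)
data = os.read(r, 4) if ready_after else None
print(data)                                         # b'done'

os.close(r)
os.close(w)
```

With a non-zero timeout, select can also sleep until any of several devices becomes ready, which is how one CPU thread can service many asynchronous devices at once.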
• Advantages of Asynchronous input/output synchronization:
• Some advantages of asynchronous input/output (I/O) synchronization include:
• Flexibility: Asynchronous I/O synchronization allows for flexible data transfer rates
and can adapt to varying transfer speeds without the need for synchronization.
This is particularly useful when dealing with slow or intermittent devices.
• Resource efficiency: Because data transfer is not synchronized to a clock signal,
asynchronous I/O synchronization can be more resource-efficient than
synchronous I/O synchronization. It can reduce the overhead of synchronization
and improve the utilization of system resources.
• Reduced latency: Asynchronous I/O synchronization can help reduce latency, or
the delay between the initiation of a data transfer and its completion. This can
improve the responsiveness and overall performance of the system.
• Better error handling: Asynchronous I/O synchronization can provide better error
handling, as it allows for the detection and handling of errors during data
transfer. This can help ensure that data is transferred accurately and reliably.
• Compatibility: Asynchronous I/O synchronization is compatible with a wide range
of devices and systems, making it a flexible and widely used technique for
managing data transfer.
• Disadvantages of Asynchronous input/output synchronization:
• Some disadvantages of asynchronous input/output (I/O) synchronization include:
• Complexity: Asynchronous I/O synchronization can be more complex to
implement than synchronous I/O synchronization, as it requires interrupt-driven
I/O and other techniques to manage data transfer.
• Overhead: Asynchronous I/O synchronization can result in higher overhead
than synchronous I/O synchronization, as the CPU must constantly monitor
for interrupt signals and initiate data transfer when necessary.
• Latency: Although asynchronous I/O synchronization can help reduce
latency in some cases, it can also introduce additional latency when waiting
for interrupt signals from devices.
• Synchronization issues: Asynchronous I/O synchronization can introduce
synchronization issues, particularly when dealing with multiple devices or
large data transfers. It can be difficult to ensure that data is transferred in
the correct order and that all devices are properly synchronized.
• Compatibility issues: Asynchronous I/O synchronization may not be
compatible with all devices and systems, particularly those that
require fixed data transfer rates or specific synchronization protocols.