Notes on Lecture 7
CACHE MEMORY
The goal of the lecture: to analyze and study the principles of cache operation, the elements of cache design, the mapping function (direct, associative, and set-associative techniques), and cache organization in Pentium and PowerPC processors.
Contents
Literature.
1.http://home.ustc.edu.cn/~leedsong/reference_books_tools/Computer
%20Organization%20and%20Architecture%2010th%20-%20William
%20Stallings.pdf.
2. Pustovarov V. I. Assembler. Programming and analysis of machinery
programs correctness, - Kiev: “Irina”, 2010. - 476
3. Э. Таненбаум. Т. Остин. Архитектура компьютера. 6-е изд.
Издательство: Питер, 2016. — 816 стр.
Keywords.
Principle of locality of reference, cache design, mapping function, direct, associative and set-associative techniques, replacement algorithm, data integrity, out-of-order execution.
Principle of cache operation: the CPU first checks the cache for the requested word. If the word is present, this is called a hit and the word is delivered from the cache to the CPU. If the word is absent, this is called a miss: the block containing the word is first read from main memory into the cache and then delivered to the CPU. The cache includes tags to identify which block of main memory is in each cache slot.
Cache Design.
Cache Size
Mapping Function
Direct
Associative
Set associative
Replacement Algorithm
Least recently used (LRU)
First in first out (FIFO)
Least frequently used (LFU)
Random
Write Policy
Write through
Write back
Line Size
Number of Caches
Single or two level
Unified or split
Analysis of Cache Design Elements.
Mapping Function
Cache of 64kByte
Cache slot of 4 bytes
i.e. the cache holds 16K (2^14) lines (slots) of 4 bytes
16MBytes main memory
24-bit address (2^24 = 16M)
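The example parameters above can be checked with a short Python sketch (my own arithmetic check; the sizes are those given in the lecture):

```python
# Verify the example cache parameters from the lecture.
cache_size = 64 * 1024           # 64 KByte cache
line_size = 4                    # 4-byte slots
memory_size = 16 * 1024 * 1024   # 16 MByte main memory

num_lines = cache_size // line_size              # number of cache lines
address_bits = memory_size.bit_length() - 1      # bits needed to address memory

print(num_lines)      # 16384 lines = 16K = 2^14
print(address_bits)   # 24
```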
Direct Mapping
Each block of main memory maps to only one cache line
i.e., if a block is in cache, it must be in one specific place
Address is in two parts
The least significant w bits identify a unique word within a block
The most significant s bits specify one memory block
The s MSBs are split into a cache line field of r bits and a tag of s-r bits (most significant)
24-bit address
the low-order 2 bits select one of 4 words in a 4-byte block
22-bit block identifier
8-bit tag (= 22-14) (the high-order 8 bits of the memory address of the block are stored in the 8 tag bits associated with its location in the cache)
14-bit slot or line (determines the cache line in which the block is placed)
No two blocks that map to the same line have the same tag field
Check the contents of the cache by finding the line and comparing the tag
Simple
Inexpensive
Fixed location for given block
If a program repeatedly accesses 2 blocks that map to the same line, cache misses are very high
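The direct-mapping address split can be sketched in Python (an illustration of my own, using the 8-bit tag / 14-bit line / 2-bit word split from the example; the two sample addresses are chosen to show the collision problem just described):

```python
# Decompose a 24-bit address under direct mapping: tag | line | word.
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2

def split_address(addr):
    """Return the (tag, line, word) fields of a 24-bit address."""
    word = addr & ((1 << WORD_BITS) - 1)
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word

# Two blocks with different tags map to the same cache line, so
# alternating accesses to them would miss every time:
print(split_address(0x000004))  # (0, 1, 0)
print(split_address(0x010004))  # (1, 1, 0)
```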
Associative Mapping
A main memory block can load into any line of cache
Memory address is interpreted as tag and word
Tag uniquely identifies block of memory
Every line’s tag is examined for a match
Cache searching gets expensive
Associative Mapping Address Structure
Tag: 22 bits | Word: 2 bits
Set Associative Mapping Address Structure
Tag: 9 bits | Set: 13 bits | Word: 2 bits
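A small Python sketch makes the two address structures concrete (my own illustration; the field widths 22/2 and 9/13/2 come from the lecture example, and the sample address is arbitrary):

```python
# Split a 24-bit address into fields, most-significant first,
# according to a tuple of bit widths.
def fields(addr, widths):
    out = []
    for w in reversed(widths):           # peel fields off the low end
        out.append(addr & ((1 << w) - 1))
        addr >>= w
    return tuple(reversed(out))          # restore MSB-first order

addr = 0xABCDEF
print(fields(addr, (22, 2)))     # associative: (tag, word)
print(fields(addr, (9, 13, 2)))  # set associative: (tag, set, word)
```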
Replacement Algorithms
Replacement algorithms are only needed for associative and set associative
techniques.
1. Least Recently Used (LRU) – replace the cache line that has been in the cache the longest with no references to it.
2. First-in First-out (FIFO) – replace the cache line that has been in the cache the longest.
3. Least Frequently Used (LFU) – replace the cache line that has experienced the fewest references.
4. Random – pick a line at random from the candidate lines.
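Of the four policies, LRU is the most common; a minimal Python sketch of LRU replacement for a fully associative cache (my own illustration; the capacity and tag values are arbitrary):

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative cache with LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # tag -> block; insertion order = recency

    def access(self, tag):
        """Return True on a hit; on a miss, load the block, evicting the LRU line."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict least recently used
        self.lines[tag] = "block"
        return False

cache = LRUCache(2)
print(cache.access(1))  # False (miss, block loaded)
print(cache.access(2))  # False
print(cache.access(1))  # True  (hit; tag 1 becomes most recent)
print(cache.access(3))  # False (evicts tag 2, the LRU line)
print(cache.access(2))  # False (tag 2 was evicted)
```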
Write Policy
If a cache line has not been modified, then it can be overwritten immediately;
however, if one or more words have been written to a cache line, then main
memory must be updated before replacing the cache line.
There are two main potential write problems:
• If an I/O module can read/write main memory directly, then a modified cache line means a memory read does not return the current data; conversely, if the I/O module writes to memory, the corresponding cache line becomes invalid.
• If multiple processors each have their own cache, if one processor modifies its
cache, then the cache lines of the other processors could be invalid.
1. write through – this is the simplest technique where all write operations are
made to memory as well as cache ensuring main memory is always valid. This
generates a lot of main memory traffic and creates a potential bottleneck;
2. write back – updates are made only to the cache and not to main memory until
the line is replaced.
Cache coherency – keeps the same word in other caches up to date using some
technique. This is an active field of research.
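The two write policies can be contrasted in a minimal Python sketch (my own illustration; plain dicts stand in for the cache lines and main memory, and the `dirty` set plays the role of the update bit):

```python
memory = {}  # stand-in for main memory: address -> value

class WriteThroughCache:
    def __init__(self):
        self.lines = {}
    def write(self, addr, value):
        self.lines[addr] = value
        memory[addr] = value          # every write also goes to main memory

class WriteBackCache:
    def __init__(self):
        self.lines = {}
        self.dirty = set()            # update bits: lines modified in cache only
    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)          # memory is NOT updated yet
    def evict(self, addr):
        if addr in self.dirty:        # write to memory only if update bit is set
            memory[addr] = self.lines[addr]
            self.dirty.discard(addr)
        del self.lines[addr]
```

With write-through, memory is valid after every write; with write-back, memory becomes valid only when the dirty line is replaced, which is why I/O and other processors can observe stale data in the meantime.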
Write through
All writes go to main memory as well as cache
Multiple CPUs can monitor main memory traffic to keep local (to CPU) cache
up to date
Lots of traffic
Slows down writes
Write back
Updates initially made in cache only
Update bit for cache slot is set when update occurs
If block is to be replaced, write to main memory only if update bit is set
Other caches get out of sync
I/O must access main memory through cache
15% of memory references are writes
Line Size
Cache line sizes between 8 and 64 bytes seem to produce optimum results.
Number of Caches
An on-chip cache reduces the processor's external bus activity. In addition, an off-chip cache is usually desirable. This is the typical level 1 (L1) and level 2 (L2) cache design, where the L2 cache is composed of static RAM. As chip densities have increased, the L2 cache has been moved onto the processor chip and an additional L3 cache has been added.
Problems.
1. What is the main purpose of implementing cache memory?
2. Describe the principles of cache memory operation.
3. Enumerate the elements of cache design.
4. Analyze the block diagram of the Pentium 4 processor.
5. How is data cache consistency ensured?