
EIE3343 Lab: Cache System Principles

The document describes a laboratory exercise on cache systems. The objectives are to study how cache systems work and their organization. It introduces cache structure, operation, and performance metrics like hit ratio and effective access time. The lab uses a CacheSim program to simulate different cache configurations (e.g. 4-way set associative, direct mapped) and analyze their performance on sample data files. Students are asked to trace sample runs, calculate metrics, and summarize the results in tables to compare the configurations.


The Hong Kong Polytechnic University

Department of Electrical and Electronic Engineering

EIE3343 Computer Systems Principles

Laboratory Exercise 2: Cache system

Objectives: To study how a cache system works.

After completing this experiment, you should know the organization of different cache systems and how a
cache system operates with its cache management system.

Software: The CacheSim simulation program

Introduction:

A cache is a high-speed memory system placed between the microprocessor and the DRAM memory system. Acting as a buffer, it allows the computer system to work efficiently with slower DRAM, and it improves the overall performance of the memory system when data are accessed more than once.

The structure of a cache system is as follows. The system memory space is partitioned into several blocks.
Blocks are then grouped into several sets, and the blocks in the same set compete to occupy a fixed number
of block buffers (= the number of ways) in the cache. The data movement between the cache and system
memory is block-oriented.

As an example, Figure 1 shows a 4-way set-associative cache system. A cache entry consists of a cache directory entry and the corresponding cache memory entry. The cache directory records what is stored in each cache memory entry, principally the tag of the resident block; the data themselves are stored in the cache memory entry (block buffer). The 3-bit LRU entry determines which buffer in the selected set should be replaced after a cache miss. In the CacheSim program, a counter is kept for each block (slot) of the cache memory to record how long the block has gone without being accessed.

When a datum is accessed, a cache system operates as follows:

1. Determine the set address and the tag address of the datum.
2. Check the tag fields of all cache directory entries of the corresponding set against the tag address
simultaneously.
3. If any one of the tag fields matches and the corresponding valid bit is active, the access is a
cache hit; otherwise it is a cache miss.
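Step 1 above can be sketched in a few lines of Python (the simulator itself is written in MATLAB). The field widths here assume the default CacheSim configuration of 16 sets and four-word blocks with 16-bit word addresses, so this is an illustration of the address split, not code from CacheSim.m:

```python
# Sketch of step 1: splitting a 16-bit word address into tag / set / offset
# for the default CacheSim configuration (16 sets, 4-word blocks).
# Field widths are derived from the lab's parameters (illustrative only).

OFFSET_BITS = 2   # 4 words per block -> log2(4) offset bits
SET_BITS = 4      # 16 sets          -> log2(16) set bits

def split_address(addr):
    """Return (tag, set_index, offset) for a 16-bit word address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    set_index = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    return tag, set_index, offset

# Example: word address 6 (06H) -> tag 0, set 1, offset 2
print(split_address(6))
```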

Figure 1: A 4-way set-associative cache memory.

Figure 2 summarizes the policy adopted in a cache system to operate according to different accessing results.

Figure 2: Cache read and write policy

The performance of a cache memory is measured by Hit Ratio (HR) and Effective Access Time (EAT):

HR = (No. of hits) / (Total No. of memory accesses)

EAT = ((No. of hits) × Th + (No. of misses) × Tm) / (Total No. of memory accesses)

where Th and Tm are, respectively, the access times for a hit and a miss.

If the access times for read and write operations are different, EAT can be defined as

EAT = ((No. of read hits) × Th−r + (No. of write hits) × Th−w + (No. of read misses) × Tm−r + (No. of write misses) × Tm−w) / (Total No. of memory accesses)

where
Th−r : the access time for a read hit
Th−w : the access time for a write hit
Tm−r : the access time for a read miss
Tm−w : the access time for a write miss
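As a quick sanity check, these definitions can be evaluated in a few lines of Python. The cycle counts used here (1, 4, 81, 84) are the ones quoted later in this manual; the hit/miss counts in the example call are made up for illustration:

```python
# Sketch: computing HR and EAT from hit/miss counts, using the
# read/write access times quoted later in this lab (1, 4, 81, 84 cycles).

T_H_R, T_H_W = 1, 4    # read hit, write hit
T_M_R, T_M_W = 81, 84  # read miss, write miss

def hit_ratio(hits, total):
    return hits / total

def eat(read_hits, write_hits, read_misses, write_misses):
    total = read_hits + write_hits + read_misses + write_misses
    cycles = (read_hits * T_H_R + write_hits * T_H_W
              + read_misses * T_M_R + write_misses * T_M_W)
    return cycles / total

# Illustrative counts: 30 read hits, 10 write hits,
# 7 read misses, 3 write misses out of 50 accesses.
print(hit_ratio(40, 50))   # 0.8
print(eat(30, 10, 7, 3))   # (30 + 40 + 567 + 252) / 50 = 17.78
```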

Method and details:

In this lab, you will study the behavior of cache memory systems and cache management systems with a
simulation program called CacheSim. The default cache simulated in CacheSim is a 4-way, 16-set
set-associative cache. It consists of 64 slots, each containing four 16-bit data words plus bits for a tag
and replacement-algorithm statistics. The cache uses a write-through write policy and a least-recently-used
(LRU) replacement algorithm. Memory is addressable on a 16-bit word boundary (word-addressable, i.e., each main
memory address references one 16-bit word). Main memory consists of a maximum of 128K bytes (or 64K
words); a memory address therefore requires 16 bits.

The simulator reads memory references from a default data file called '[Link]'. Each line in the
file starts with a “W” to indicate a write or an “R” to indicate a read. Write lines are of the form:

W address word

where “address” is a 16-bit address in decimal and “word” is the 16-bit word in decimal to be written. Read
lines are of the form

R address

where “address” is the main memory address where the read needs to come from. The access times in the
cache system are given as follows:
1. If a referenced word is in the cache, it takes one cycle to read (Th−r) and four cycles to write (Th−w).
2. If a referenced word is not in the cache, it takes 81 cycles to read (Tm−r) and 84 cycles to write (Tm−w).
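The behaviour described so far (set lookup, LRU replacement, and the four access times) can be sketched as a minimal Python model. This is an illustration of the mechanism, not the CacheSim.m implementation; in particular, whether a write miss allocates a block in the cache is an assumption here:

```python
# Minimal sketch of the simulated cache: 4-way, 16-set, LRU replacement.
# A miss (read or write) loads the referenced block here; CacheSim.m may differ.
# Trace entries follow the format described above: ("W", address) / ("R", address).

WAYS, SETS, OFFSET_BITS, SET_BITS = 4, 16, 2, 4
T = {("R", True): 1, ("W", True): 4, ("R", False): 81, ("W", False): 84}

cache = [[] for _ in range(SETS)]  # per set: list of tags, most recent last

def access(op, addr):
    """Return (hit, cycles) for one R/W reference and update LRU state."""
    s = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    ways = cache[s]
    hit = tag in ways
    if hit:
        ways.remove(tag)           # move to most-recently-used position
    elif len(ways) == WAYS:
        ways.pop(0)                # evict the least-recently-used tag
    ways.append(tag)
    return hit, T[(op, hit)]

# Tiny hypothetical trace (not the contents of the lab's data file).
trace = [("W", 6), ("R", 6), ("R", 70), ("R", 6)]
hits = cycles = 0
for op, addr in trace:
    h, c = access(op, addr)
    hits += h
    cycles += c
print(hits, cycles)  # 2 hits; 84 + 1 + 81 + 1 = 167 cycles
```

Note that addresses 6 and 70 map to the same set (set 1) but carry different tags, so they occupy two of the four ways without evicting each other.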
Figure 3 shows the information provided by the simulator. Most of it is self-explanatory. The
“Tag”, “Set”, and “Offset” fields are computed from the instructions in the data file.
“ICtr” stands for instruction counter. All cache memory entries are hidden, as they are not of concern in the
simulation. The numbers in blue give the LRU information of the corresponding cache entries. The color of
the tag field in a cache entry indicates whether the data in the associated cache memory entry is valid (red) or not
(black).

Figure 3: A snapshot of what the simulator provides.

The Hong Kong Polytechnic University

Department of Electrical and Electronic Engineering

EIE3343 Computer Systems Principles

Laboratory Exercise 2: Cache system

Student Name: ___________________

Student No.: ___________________

Date: ___________________

1. Using the provided small data file ‘[Link]’ (50 read/write operations), modify the filename in
the MATLAB program CacheSim.m if necessary (Line 145: fid = fopen('[Link]','r');). Trace
the simulation program to understand how the set-associative cache system operates with the default
parameter setting: a 4-way 16-set cache system. Complete Table 1.

Table 1: Tracing read/write operations (a 4-way 16-set cache system)


ICtr | Address (HEX) | Tag Field (DEC) | Set Field (DEC) | Offset (Word Field, DEC) | Accumulated Number of Hits | Access Time (cycles)
0    | 0             | 0               | 0               | 0                        | 0                          | 0
5    | 6H            |                 |                 |                          |                            |
10   |               |                 |                 |                          |                            |
15   |               |                 |                 |                          |                            |
20   |               |                 |                 |                          |                            |
25   |               |                 |                 |                          |                            |
30   |               |                 |                 |                          |                            |
35   |               |                 |                 |                          |                            |
40   |               |                 |                 |                          |                            |
45   |               |                 |                 |                          |                            |
49   |               |                 |                 |                          |                            |
50   |               |                 |                 |                          |                            |

Based on the results shown in Table 1, calculate HR and EAT:

Hit Ratio (HR) =


Effective Access Time (EAT) =

2. Run the simulation program with the small data file ‘[Link]’ and complete Tables 2-1 and 2-
2.
Table 2-1
Cache Memory Organization       | ICtr | Address (HEX) | Tag Field (Binary) | Set Field (Binary) | Offset (Word Field, Binary)
4-way 8-set                     | 14   | 20            | 00000000001        | 000                | 00
2-way 16-set                    | 24   |               |                    |                    |
2-way 8-set                     | 34   |               |                    |                    |
Direct mapped 16 sets (one way) | 39   |               |                    |                    |
Direct mapped 8 sets (one way)  | 44   |               |                    |                    |

Table 2-2
Cache system organization                                                      | 4-way 16-set | 4-way 8-set | 2-way 16-set | 2-way 8-set | Direct mapped 16 sets (one way) | Direct mapped 8 sets (one way)
Number of tag bits                                                             |              |             |              |             |                                 |
Number of set bits                                                             |              |             |              |             |                                 |
Number of offset (word) bits                                                   | 2            | 2           | 2            | 2           | 2                               | 2
Size of the cache memory (in bytes) for storing the data from the main memory  |              |             |              |             |                                 |
Hit Ratio (HR)                                                                 |              |             |              |             |                                 |
Effective Access Time (EAT) (unit: clocks)                                     |              |             |              |             |                                 |
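The field-width and size rows of Table 2-2 can be sanity-checked with a short Python helper. The function name and structure are ours; the fixed parameters (16-bit addresses, 2 offset bits, four 16-bit words per block) come from the lab description:

```python
# Helper for sanity-checking Table 2-2: field widths and data-store size
# for a w-way, s-set cache with 4-word (16-bit) blocks and 16-bit addresses.
from math import log2

ADDR_BITS, OFFSET_BITS, WORD_BYTES, BLOCK_WORDS = 16, 2, 2, 4

def fields_and_size(ways, sets):
    """Return (tag bits, set bits, data-store size in bytes)."""
    set_bits = int(log2(sets))
    tag_bits = ADDR_BITS - set_bits - OFFSET_BITS
    data_bytes = ways * sets * BLOCK_WORDS * WORD_BYTES
    return tag_bits, set_bits, data_bytes

# Default organization: 4-way, 16 sets -> 10 tag bits, 4 set bits, 512 bytes.
print(fields_and_size(4, 16))
```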

3. Run the simulation program with the large data file ‘[Link]’ (Line 415 of CacheSim.m) and
complete Table 3.

Table 3: Simulation results with the large data file ‘[Link]’ (about 50,000 read/write operations)

Case                                       | 1            | 2           | 3            | 4           | 5                               | 6
Cache system organization                  | 4-way 16-set | 4-way 8-set | 2-way 16-set | 2-way 8-set | Direct mapped 16 sets (one way) | Direct mapped 8 sets (one way)
Hit Ratio (HR)                             |              |             |              |             |                                 |
Effective Access Time (EAT) (unit: clocks) |              |             |              |             |                                 |

Questions:

(1) Based on Table 3, comment on the effect of the size of cache memory on the HR when running a
relatively large program.

(2) Based on Table 3, comment on how the organization of cache memory affects the cache performance
(HR and EAT).

(3) What is the relationship between HR and EAT?

(4) Comment on how the size of a program affects the cache performance (HR and EAT) of a cache
system by comparing the results of the data file ‘[Link]’ with those of the data file
‘[Link]’. Why?

4. Assume that a w-way (w = 2, 4, or 8) set-associative cache memory (take Figure 1 as an example) is
used. Each set has three LRU bits, and each cache way (block) has one valid bit and one write-protect bit.
Derive the formulas for the size of the cache directory in bits (Sd), the size of the cache memory in bits
(Sm), and the total number of bits in the cache system (Sc = Sd + Sm) in terms of the number of bits in
the set field (Ns), the offset (byte) field (No), and the address (N). Assume that it is a byte-addressable
computer.

Hint:

The number of bits in the tag field = N – Ns – No.

The total number of bits for a set in the cache directory


= (The number of bits in the tag field + 1 valid bit + 1 write-protect bit) × The number of ways

There are three LRU bits for each set, not for each way. The LRU bits of a set identify which of the
blocks (ways) in the set is the LRU block; thus, three LRU bits are enough to identify the LRU
block provided w is not more than eight.

The size of the cache directory, Sd


= (The total number of bits for a set in the cache directory + 3 LRU bits) × The total number of sets

The size of the cache memory, Sm


= The total number of ways × The total number of sets in the cache × The total number of bits in a block
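The hint's formulas can be checked numerically with a short Python sketch. The parameter values in the example call (N = 17, Ns = 4, No = 3, w = 4) are illustrative assumptions for a 128K-byte byte-addressable memory with 8-byte blocks, not the required answers:

```python
# Numeric check of the hint's formulas. The example parameters
# (N = 17 address bits, Ns = 4 set bits, No = 3 offset bits, w = 4 ways)
# are illustrative assumptions only.

def cache_bits(w, N, Ns, No):
    """Return (Sd, Sm, Sc) in bits, per the hint's formulas."""
    tag_bits = N - Ns - No
    sets = 2 ** Ns
    block_bits = (2 ** No) * 8                 # 2^No bytes per block, 8 bits each
    Sd = ((tag_bits + 1 + 1) * w + 3) * sets   # +1 valid, +1 write-protect per way; +3 LRU per set
    Sm = w * sets * block_bits
    return Sd, Sm, Sd + Sm

print(cache_bits(w=4, N=17, Ns=4, No=3))  # (816, 4096, 4912)
```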

- End -
Lawrence Cheung
January 2024
