Memory Hierarchy - Introduction: Cost Performance of Memory Reference
Programs demand large amounts of fast memory. An economical solution to that demand is a memory hierarchy, which takes advantage of the principle of locality and of the cost-performance of memory references.
The principle of locality says that most programs do not access all code or data uniformly.
A memory hierarchy is organized into several levels, each smaller, faster, and more expensive per byte than the next lower level.
The goal is to provide a memory system whose cost per byte is almost as low as the cheapest level of memory and whose speed is almost as high as the fastest level.
Each level maps addresses from a slower, larger memory to a smaller but faster memory higher in the hierarchy.
Terminologies
Cache: the name given to the highest or first level of the memory hierarchy encountered once the address leaves the processor. A cache is a temporary storage area where frequently accessed data can be stored for rapid access.
Cache hit: when the processor finds the requested data item in the cache, it is called a cache hit; when it does not, it is a cache miss.
1) Compulsory: the very first access to a block cannot be in the cache, so the block must be brought into the cache. Compulsory misses are those that would occur even with an infinite cache.
2) Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses occur because of blocks being discarded and later retrieved.
3) Conflict: if the block placement strategy is not fully associative, conflict misses will occur because a block may be discarded and later retrieved if conflicting blocks map to its set.
The time required for a cache miss depends on both the latency and bandwidth of the memory. Latency determines the time to retrieve the first word of the block, and bandwidth determines the time to retrieve the rest of the block.
Cache misses are handled by hardware and cause processors using in-order execution to stall until the data are available. With out-of-order execution, an instruction using the result must wait, but other instructions may proceed during the miss.
Conti..
Block: a fixed-size collection of data containing the requested word, retrieved from main memory and placed into the cache.
Pages: the address space is usually broken into fixed-size blocks called pages. At any time, each page resides either in memory or on disk.
Page fault: when the processor references an item within a page that is not present in memory, a page fault occurs and the entire page is moved from disk to memory. Since page faults take so long, they are handled in software and the processor is not stalled; the processor usually switches to some other task while the page is fetched.
Cache performance
A memory hierarchy can substantially improve performance. One measure of its benefit is the number of cycles during which the processor is stalled waiting for a memory access, called memory stall cycles, used as given below:
CPU execution time = (CPU clock cycles + Memory stall clock cycles) * Clock cycle time
This equation assumes that the CPU clock cycles include the time to handle a cache hit and that the processor is stalled during a cache miss.
Contin.
The number of memory stall cycles depends on both the number of misses and the cost per miss, which is called the miss penalty:
Memory stall cycles = Number of misses * Miss penalty
                    = IC * (Misses / Instruction) * Miss penalty
                    = IC * (Memory accesses / Instruction) * Miss rate * Miss penalty
The component miss rate is simply the fraction of cache accesses that result in a miss (i.e., the number of accesses that miss divided by the number of accesses).
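As a quick illustration, here is a small Python sketch of these formulas; the instruction count, miss rate, and penalty below are made-up numbers, not values from the slides.

```python
# Hypothetical figures chosen only to exercise the formulas above.
ic = 1_000_000               # instruction count (IC)
cpi_base = 1.0               # CPU clock cycles per instruction, ignoring memory stalls
mem_accesses_per_instr = 1.5
miss_rate = 0.02             # fraction of cache accesses that miss
miss_penalty = 100           # clock cycles per miss
clock_cycle_time = 0.5e-9    # seconds

# Memory stall cycles = IC * (Memory accesses / Instruction) * Miss rate * Miss penalty
memory_stall_cycles = ic * mem_accesses_per_instr * miss_rate * miss_penalty

# CPU execution time = (CPU clock cycles + Memory stall clock cycles) * Clock cycle time
cpu_clock_cycles = ic * cpi_base
cpu_time = (cpu_clock_cycles + memory_stall_cycles) * clock_cycle_time

print(f"Memory stall cycles: {memory_stall_cycles:,.0f}")
print(f"CPU execution time: {cpu_time * 1e3:.3f} ms")
```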
This formula is an approximation, since miss rates and miss penalties are often different for reads and writes. Memory stall cycles can also be defined in terms of the number of memory accesses per instruction, the miss penalty (in clock cycles) for reads and writes, and the miss rate for reads and writes:
Memory stall cycles = IC * Reads per instruction * Read miss rate * Read miss penalty
                    + IC * Writes per instruction * Write miss rate * Write miss penalty
Miss rate is often reported as misses per instruction rather than misses per memory reference. The two are related:
Misses / Instruction = (Miss rate * Memory accesses) / Instruction count = Miss rate * (Memory accesses / Instruction)
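A similar sketch, again with made-up per-instruction statistics, splits the stall cycles into read and write components and derives misses per instruction from the combined miss rate.

```python
# Hypothetical per-instruction statistics, only to illustrate the read/write split.
ic = 1_000_000
reads_per_instr, writes_per_instr = 1.0, 0.5
read_miss_rate, write_miss_rate = 0.02, 0.05
read_miss_penalty = write_miss_penalty = 100

memory_stall_cycles = (ic * reads_per_instr * read_miss_rate * read_miss_penalty
                       + ic * writes_per_instr * write_miss_rate * write_miss_penalty)

# Misses per instruction = Miss rate * (Memory accesses / Instruction)
mem_accesses_per_instr = reads_per_instr + writes_per_instr
combined_miss_rate = ((reads_per_instr * read_miss_rate
                       + writes_per_instr * write_miss_rate) / mem_accesses_per_instr)
misses_per_instr = combined_miss_rate * mem_accesses_per_instr

print(f"Memory stall cycles: {memory_stall_cycles:,.0f}")
print(f"Misses per instruction: {misses_per_instr:.3f}")
```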
These four questions show how different trade-offs of memories help us at different levels of the hierarchy:
1. Where can a block be placed in the upper level? (block placement)
2. How is a block found if it is in the upper level? (block identification)
3. Which block should be replaced on a miss? (block replacement)
4. What happens on a write? (write strategy)
1. If each block has only one place it can appear in the cache, the cache is said to be direct mapped. The mapping is usually (block address) MOD (number of blocks in cache).
2. If a block can be placed anywhere in the cache, the cache is said to be fully associative.
3. If a block can be placed in a restricted set of places in the cache, the cache is set associative. A set is a group of blocks in the cache. A block is first mapped onto a set, and then it can be placed anywhere within that set. The set is usually chosen by bit selection; that is, (block address) MOD (number of sets in cache).
Conti..
If there are n blocks in a set, the cache placement is called n-way set associative. Direct mapped is simply one-way set associative, and a fully associative cache with m blocks could be called m-way set associative. Equivalently, direct mapped can be thought of as having m sets and fully associative as having one set. The vast majority of processor caches today are direct mapped, two-way set associative, or four-way set associative.
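The bit-selection mapping can be sketched in a few lines of Python; the cache size and block addresses below are arbitrary examples, not values from the slides.

```python
def cache_set_index(block_addr: int, num_blocks: int, ways: int) -> int:
    """Return the set a block maps to, using the bit-selection rule
    (block address) MOD (number of sets in cache)."""
    num_sets = num_blocks // ways     # direct mapped: ways = 1, so sets = blocks
    return block_addr % num_sets      # fully associative: ways = num_blocks, one set

# An 8-block cache: direct mapped vs. 2-way set associative vs. fully associative.
for ways in (1, 2, 8):
    print(f"{ways}-way:", [cache_set_index(b, 8, ways) for b in (0, 8, 12, 13)])
```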
Caches have an address tag on each block frame that gives the block address. The tag of every cache block is checked to see whether it matches the block address from the processor. All possible tags are searched in parallel because speed is critical. A valid bit is added to the tag to say whether or not this entry contains a valid address; if this bit is not set, there cannot be a match on this address.
An address is divided into the block frame address and the block offset. The block frame address is further divided into the tag field and the index field. The block offset field selects the desired data from the block, the index field selects the set, and the tag field is compared against the tags in the selected set for a hit.
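A minimal sketch of this address split, assuming power-of-two block size and set count; the sizes and the address are arbitrary examples.

```python
def split_address(addr: int, block_size: int, num_sets: int):
    """Split an address into (tag, index, block offset) by bit selection.
    block_size and num_sets are assumed to be powers of two."""
    offset_bits = block_size.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (block_size - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# 64-byte blocks, 128 sets: the index selects the set, the offset selects the data
# within the block, and the tag is compared against the tags in that set.
print(split_address(0x12345678, 64, 128))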
Memory Hierarchy - Review
By Chandru, 1RV08SCS05
The hierarchical arrangement of storage in current computer architectures is called the memory hierarchy. It is designed to take advantage of memory locality in computer programs.
Most modern CPUs are so fast that, for most program workloads, the locality of reference of memory accesses and the efficiency of the caching and memory transfer between different levels of the hierarchy are the practical limitations on processing speed.
The L1 cache holds cache lines retrieved from the L2 cache; the L2 cache holds cache lines retrieved from main memory.
[Figure: memory-hierarchy pyramid continuing to lower levels L3 and L4.]
Two Different Types of Locality:
Temporal Locality (Locality in Time): if an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse).
Spatial Locality (Locality in Space): if an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access).
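For instance, a simple loop over an array exhibits both kinds of locality:

```python
# Illustrative only: summing a list touches consecutive elements (spatial locality)
# and reuses the accumulator on every iteration (temporal locality).
data = list(range(1024))

total = 0
for x in data:      # spatial: elements are accessed in address order
    total += x      # temporal: 'total' is referenced again and again
print(total)
```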
[Figure: the larger, slower, cheaper storage device at level k+1 is partitioned into fixed-size blocks, numbered 0-15.]
Cold (compulsory) miss: cold misses occur because the cache is empty.
Conflict miss: if the block placement strategy is not fully associative, conflict misses will occur because a block may be discarded and later retrieved when conflicting blocks map to the same set. Conflict misses occur when the level k cache is large enough, but multiple data objects all map to the same level k block. E.g., referencing blocks 0, 8, 0, 8, 0, 8, ... would miss every time.
Capacity miss: if the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur.
Replacement policies:
Random: candidate blocks are randomly selected; some systems generate pseudorandom block numbers.
Least Recently Used (LRU): relies on a corollary of locality; if recently used blocks are likely to be used again, then the LRU block is a good candidate for eviction.
First In, First Out (FIFO): used in highly associative caches.
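A minimal sketch of LRU replacement for a single set, assuming a simple tag-only model (no data, no valid bits); the access sequence is an arbitrary example.

```python
from collections import OrderedDict

class LRUSet:
    """One cache set with LRU replacement (illustrative sketch only)."""
    def __init__(self, ways: int):
        self.ways = ways
        self.blocks = OrderedDict()           # block tag -> True, ordered by recency

    def access(self, tag):
        if tag in self.blocks:                # hit: move block to most-recently-used end
            self.blocks.move_to_end(tag)
            return "hit"
        if len(self.blocks) >= self.ways:     # miss with a full set: evict the LRU block
            self.blocks.popitem(last=False)
        self.blocks[tag] = True
        return "miss"

s = LRUSet(ways=2)
print([s.access(t) for t in (0, 8, 0, 8, 12, 0)])
```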
On a miss, the level k cache must fetch the requested block from level k+1, e.g., block 12. If the level k cache is full, then some current block must be replaced (evicted). Which one is the victim?
[Figure: level k holding a few blocks (e.g., 4*, 12, 14) and level k+1 holding blocks 0-15, with requests for blocks 14 and 12 being serviced by fetching from level k+1.]
Placement policy: where can the new block go? E.g., b mod 4.
Replacement policy: which block should be evicted? E.g., LRU.
Write-through: write both the cache and memory; generally higher traffic, but it simplifies cache coherence.
Write-back: write the cache only (memory is written only when the entry is evicted); a dirty bit per block can further reduce the traffic.
Example: Assume a fully associative write-back cache with many cache entries that starts empty. Below is a sequence of five memory operations (the address is in square brackets):
WriteMem[100]; WriteMem[100]; ReadMem[200]; WriteMem[200]; WriteMem[100].
What are the number of hits and misses when using no-write allocate versus write allocate?
Answer: For no-write allocate, the address 100 is not in the cache, and
there is no allocation on write, so the first two writes will result in misses. Address 200 is also not in the cache, so the read is also a miss. The subsequent write to address 200 is a hit. The last write to 100 is still a miss. The result for no-write allocate is four misses and one hit. For write allocate, the first accesses to 100 and 200 are misses, and the rest are hits since 100 and 200 are both found in the cache. Thus, the result for write allocate is two misses and three hits.
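The example can be checked with a small simulation sketch; the cache is modeled as a simple set of allocated addresses, and write-back traffic is ignored since it does not affect the hit/miss counts.

```python
def simulate(ops, write_allocate: bool):
    """Count hits and misses for a fully associative cache that starts empty.
    ops is a list of ("read" | "write", address) pairs; illustrative sketch only."""
    cache, hits, misses = set(), 0, 0
    for op, addr in ops:
        if addr in cache:
            hits += 1
        else:
            misses += 1
            # Reads always allocate; writes allocate only under write allocate.
            if op == "read" or write_allocate:
                cache.add(addr)
    return hits, misses

ops = [("write", 100), ("write", 100), ("read", 200), ("write", 200), ("write", 100)]
print("no-write allocate:", simulate(ops, write_allocate=False))  # (1 hit, 4 misses)
print("write allocate:   ", simulate(ops, write_allocate=True))   # (3 hits, 2 misses)
```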
Average memory access time = Hit time + Miss rate * Miss penalty
Three Categories of Cache Optimizations:
Reducing the miss rate: larger block size, larger cache size, and higher associativity.
Reducing the miss penalty: multilevel caches and giving reads priority over writes.
Reducing the time to hit in the cache: avoiding address translation when indexing the cache.
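The average memory access time formula in code form, with hypothetical numbers (1-cycle hit time, 2% miss rate, 100-cycle miss penalty):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time = Hit time + Miss rate * Miss penalty."""
    return hit_time + miss_rate * miss_penalty

print(amat(1, 0.02, 100))   # 3.0 cycles
```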
Compulsory: the very first access to a block cannot be in the cache, so the block must be brought into the cache. These are also called cold-start misses or first-reference misses. Compulsory misses are those that would occur even in an infinite cache.
Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur because of blocks discarded and later retrieved. Capacity misses are those that occur in a fully associative cache, beyond the compulsory misses.
Conflict: if the block placement strategy is set associative or direct mapped, conflict misses will occur because a block may be discarded and later retrieved if too many blocks map to its set. These misses are also called collision misses. Conflict misses are those that occur in going from fully associative to eight-way associative, four-way associative, and so on.
Four Divisions of Conflict Misses
Eight-way: conflict misses due to going from fully associative to eight-way associative.
Four-way: conflict misses due to going from eight-way associative to four-way associative.
Two-way: conflict misses due to going from four-way associative to two-way associative.
One-way: conflict misses due to going from two-way associative to one-way associative (direct mapped).
First Optimization: Larger Block Size to Reduce Miss Rate
The simplest way to reduce the miss rate is to increase the block size; a larger block size also reduces compulsory misses. This reduction occurs because the principle of locality has two components, temporal locality and spatial locality, and larger blocks take advantage of spatial locality. But larger blocks also increase the miss penalty, and since they reduce the number of blocks in the cache, larger blocks may increase conflict misses and even capacity misses if the cache is small.
Cont
The selection of block size depends on both the latency and
bandwidth of the lower-level memory. High latency and high bandwidth encourage large block size since the cache gets many more bytes per miss for a small increase in miss penalty. Low latency and low bandwidth encourage smaller block size since there is little time saved from a larger block.
Second Optimization: Larger Caches to Reduce Miss Rate
The obvious way to reduce capacity misses is to increase the capacity of the cache. The drawback is potentially longer hit time and higher cost and power. This technique has been especially popular in off-chip caches.
Third Optimization: Higher Associativity to Reduce Miss Rate
Two general rules of thumb apply. The first is that eight-way set associative is, for practical purposes, as effective in reducing misses for these sized caches as fully associative. The second rule, called the 2:1 cache rule of thumb, is that a direct-mapped cache of size N has about the same miss rate as a two-way set-associative cache of size N/2. The drawback is that higher associativity can increase hit time and thus average memory access time.
Fourth Optimization: Multilevel Caches to Reduce Miss Penalty
To reduce the miss penalty, the designer can add another level of cache between the original cache and memory. The first-level cache can be small enough to match the clock cycle time of the fast processor, while the second-level cache can be large enough to capture many accesses that would otherwise go to main memory, thereby lessening the effective miss penalty.
Average memory access time for a two-level cache:
Average memory access time = Hit time(L1) + Miss rate(L1) * Miss penalty(L1)
Miss penalty(L1) = Hit time(L2) + Miss rate(L2) * Miss penalty(L2)
So:
Average memory access time = Hit time(L1) + Miss rate(L1) * (Hit time(L2) + Miss rate(L2) * Miss penalty(L2))
Local miss rate: the number of misses in a cache divided by the total number of memory accesses to this cache. For the first-level cache it is equal to Miss rate(L1), and for the second-level cache it is Miss rate(L2).
Global miss rate: the number of misses in a cache divided by the total number of memory accesses generated by the processor. The global miss rate of the first-level cache is still just Miss rate(L1), but for the second-level cache it is Miss rate(L1) * Miss rate(L2).
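A small sketch combining the two-level AMAT formula with the local/global miss-rate distinction; all numbers are hypothetical.

```python
def two_level_amat(hit_l1, miss_l1, hit_l2, miss_l2_local, miss_penalty_l2):
    """AMAT = Hit time(L1) + Miss rate(L1) * (Hit time(L2) + Miss rate(L2) * Miss penalty(L2)).
    miss_l2_local is the local L2 miss rate: L2 misses / accesses that reach L2."""
    return hit_l1 + miss_l1 * (hit_l2 + miss_l2_local * miss_penalty_l2)

# Hypothetical figures, only to exercise the formulas above.
hit_l1, miss_l1 = 1, 0.04
hit_l2, miss_l2_local, miss_penalty_l2 = 10, 0.5, 200

print("AMAT:", two_level_amat(hit_l1, miss_l1, hit_l2, miss_l2_local, miss_penalty_l2))
print("Global L2 miss rate:", miss_l1 * miss_l2_local)   # Miss rate(L1) * Miss rate(L2)
```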
Fifth Optimization: Giving Priority to Read Misses over Writes to Reduce Miss Penalty
With a write-through cache, the most important improvement is a write buffer of the proper size. Write buffers, however, do complicate memory accesses because they might hold the updated value of a location needed on a read miss. The simplest way out of this is for the read miss to wait until the write buffer is empty. The alternative is to check the contents of the write buffer on a read miss, and if there are no conflicts and the memory system is available, let the read miss continue.
Sixth Optimization: Avoiding Address Translation during Indexing of the Cache to Reduce Hit Time
We could use virtual addresses for the cache, since hits are much more common than misses. Such caches are termed virtual caches, with physical cache used to identify the traditional caches that use physical addresses. Two tasks are important here: indexing the cache and comparing addresses. Full virtual addressing for both the index and the tags eliminates address translation time from a cache hit.
Some reasons for not building virtually addressed caches:
Protection: page-level protection is checked as part of the virtual-to-physical address translation, and it must be enforced. One solution is to copy the protection information from the TLB on a miss, add a field to hold it, and check it on every access to the virtually addressed cache.
Another reason is that every time a process is switched, the virtual addresses refer to different physical addresses, requiring the cache to be flushed. One solution is to increase the width of the cache address tag with a process-identifier tag (PID). If the operating system assigns these tags to processes, it only needs to flush the cache when a PID is recycled; that is, the PID distinguishes whether or not the data in the cache are for this program.
A third reason is that operating systems and user programs may use two different virtual addresses for the same physical address. These duplicate addresses, called synonyms or aliases, could result in two copies of the same data in a virtual cache; if one is modified, the other will have the wrong value. With a physical cache this would not happen, since the accesses would first be translated to the same physical cache block. Hardware solutions to the synonym problem, called antialiasing, guarantee every cache block a unique physical address. Software can make the problem much easier by forcing aliases to share some address bits; this restriction is called page coloring.
The final concern is I/O, which typically uses physical addresses and thus would require mapping to virtual addresses to interact with a virtual cache. One alternative is to use part of the page offset (the part that is identical in both virtual and physical addresses) to index the cache. At the same time as the cache is being read using that index, the virtual part of the address is translated, and the tag match uses physical addresses. This alternative allows the cache read to begin immediately, and yet the tag comparison is still with physical addresses. The limitation of this virtually indexed, physically tagged alternative is that a direct-mapped cache can be no bigger than the page size.
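This size limit can be expressed as a small check; the page and cache sizes below are hypothetical, and the rule that higher associativity relaxes the limit (cache size per way must not exceed the page size) is a standard corollary, not stated on the slide.

```python
def vipt_fits(cache_size: int, ways: int, page_size: int) -> bool:
    """A virtually indexed, physically tagged cache can be indexed before translation
    only if its index and block-offset bits lie entirely within the page offset,
    i.e., cache_size / ways <= page_size."""
    return cache_size // ways <= page_size

# With 4 KB pages: a direct-mapped 4 KB cache qualifies, a direct-mapped 8 KB cache
# does not, but an 8 KB two-way set-associative cache does.
print(vipt_fits(4096, 1, 4096))   # True
print(vipt_fits(8192, 1, 4096))   # False
print(vipt_fits(8192, 2, 4096))   # True
```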