ELT3047 Computer Architecture: Lecture 11: Virtual Memory

The document discusses virtual memory in computer architecture, explaining how it allows multiple programs to run simultaneously by providing each process with the illusion of a full memory address space. It details the benefits of virtual memory, such as running larger programs than physical memory, simplifying memory management, and enhancing memory protection. The document also covers address mapping, page tables, and page replacement strategies to manage memory efficiently.

ELT3047 Computer Architecture

Lecture 11: Virtual Memory

Hoang Gia Hung

Faculty of Electronics and Telecommunications

University of Engineering and Technology, VNU Hanoi


Introduction

 Modern computers can run multiple programs simultaneously

 How can we run programs exceeding physical memory?

 How can we prevent each program from interfering with other’s memory?

[Figure: Program 1 and Program 2 each assume a full 4 GB program memory, but physical memory is only 1 GB. A program crashes if it tries to access an address beyond physical memory (> 0x3FFFFFFF), and memory is corrupted if each process can access any memory address.]
Virtual Memory

 Virtual memory is the next level in the memory hierarchy

 Gives each process the illusion of its own full memory address space.

 Parts of the program (the working set) reside in RAM; the rest is on disk.

 The physical memory layout is hidden from processes.

Virtual Memory

[Figure: two processes, each with virtual pages 0–3. Process 1's pages A–D and Process 2's pages E–H map to frames scattered across DRAM, with the remainder held on disk. The mapping is hidden from the processes.]
Virtual Address Space Illusion

[Figure: two copies of the processor datapath (different processes run simultaneously by context switching). Each processor's Program Counter, control, ALU, and registers issue virtual addresses to the instruction and data caches, while memory itself is accessed with physical addresses. Every process sees the same virtual layout: code and static data at 0000 0000 hex, the heap growing upward, unused space, and the stack growing downward from ffff ffff hex. Many processes all use the same (conflicting) virtual addresses, so the hardware must map them to different physical addresses.]
Benefits of Virtual Memory

 Ability to run programs larger than the physical memory

 By moving old data to the disk to free up RAM for active processes.

 Simplifies memory management for programmers

 Each process sees a contiguous block of memory in its virtual address space, making development easier.

 Memory management unit (MMU) takes care of the mapping between virtual and physical addresses.

 Enhances memory protection

 Each process has its own virtual address space → one program cannot directly access the memory of another program.

 What makes it work?

 Principle of Locality: a program often accesses a small portion of its address space during a period of time.
Physical Addresses

 Physical memory is partitioned into equal-sized page frames

 Avoids external fragmentation

 A memory address is a pair (f, o)

 f: frame number (f_max frames)

 o: frame offset (o_max bytes/frame)

 Physical address = o_max × f + o

 Example: o_max = 512 bytes/frame, 16-bit addresses

 Addressing location (3, 6) gives 512 × 3 + 6 = 1,542

 The address bit pattern isn't changing, just the interpretation of the bit pattern:

PA: 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0
    (bits 16–10: frame number f = 3; bits 9–1: offset o = 6)

[Figure: physical memory drawn as a grid of frames from (0, 0) to (f_max−1, o_max−1), with location (f, o) marked.]
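The formula above can be sanity-checked with a minimal sketch (my own, not from the lecture), assuming o_max = 512 bytes/frame as in the example:

```python
# Compose a physical address from a (frame, offset) pair,
# assuming o_max = 512 bytes/frame, i.e. 9 offset bits.
O_MAX = 512

def physical_address(f: int, o: int) -> int:
    assert 0 <= o < O_MAX, "offset must fit within one frame"
    return O_MAX * f + o        # equivalently (f << 9) | o

pa = physical_address(3, 6)
print(pa)               # 1542
print(f"{pa:016b}")     # 0000011000000110 -- frame bits = 3, offset bits = 6
```

Because o_max is a power of two, the multiply-and-add is just a bit concatenation, which is why hardware can reinterpret the same bit pattern without changing it.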
Virtual Addresses

 A virtual address space is partitioned into equal-sized pages

 A virtual address is a pair (p, o)

 p: page number (p_max pages)

 o: page offset (o_max bytes/page)

 Virtual address = o_max × p + o

 Observations

 o_max is the same for both pages and frames.

 p_max and f_max in principle have no relation.

 Normally, we have more pages than frames.

[Figure: virtual memory drawn as a grid of pages from (0, 0) to (p_max−1, o_max−1), with location (p, o) marked.]
Address mapping

 Each memory request requires a mapping from virtual space to physical space.

 It is possible for a virtual page to be absent from the main memory (DRAM).

 Physical pages can be shared, enabling data sharing between two programs.

[Figure: the virtual address spaces of Program 1 and Program 2 both map into main memory. Page sharing: two virtual addresses point to the same physical address.]
Address translation: introduction

 Translating a Virtual Address (VA) to a Physical Address (PA) is done by a combination of hardware and software.

 Maps a Virtual Page Number (VPN) to a Physical Page Number (PPN)

 Result of a translation can point to a page frame in DRAM or a block on disk.

 Accessing a disk block is very costly, often ~millions of clock cycles.


Address translation: implementation

[Figure: a virtual address (virtual page # | offset) maps to a physical address (physical page # | offset). The page table register points to the page table in main memory; the virtual page number indexes into the table, and each entry holds a valid bit V and the physical page base address.]

 If the valid bit is off, the page is not in DRAM but on disk: a page fault.

 Each page table entry is 32 bits wide = V + 18-bit PPN + extra bits.
Page table details

 Each process has a dedicated page table.

 2^32 virtual addresses ÷ (2^12 B/page) = 2^20 virtual page numbers → 2^20 entries (1 Mi pages).

 Page table contents:

 One entry per virtual page number.

 Each entry has a physical page number (or disk address) as well as status bits.

 Page table is NOT a cache!!

 Page tables don't hold the program's data: a page table is just a lookup table.

 Each VPN has a valid entry: no tags, the VPN is used as an index.

[Figure: a page table indexed by VPN from 0x00000 to 0xFFFFF; some entries point to DRAM frames, others (e.g. VPN 0x60000) point to disk. Each entry holds status bits (more later) and a memory page/disk address.]
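The "no tags, VPN as index" point can be sketched as a flat, VPN-indexed table (an assumed toy layout, not the lecture's code):

```python
# A page table as a flat list indexed directly by VPN: every VPN has an
# entry, so no tags are needed (unlike a cache).
PAGE_BITS = 12                      # 4 KiB pages
NUM_VPNS = 1 << (32 - PAGE_BITS)    # 2^20 entries for a 32-bit address space

# Each entry: (valid, location). Invalid entries mean "on disk".
page_table = [(False, None)] * NUM_VPNS
page_table[0xFFFFF] = (True, 1)     # e.g. VPN 0xFFFFF resident in frame 1

def lookup(vpn):
    valid, loc = page_table[vpn]
    return ("DRAM", loc) if valid else ("disk", loc)

print(lookup(0xFFFFF))   # ('DRAM', 1)
print(lookup(0x60000))   # ('disk', None)
```

Note the table itself holds only mappings and status bits, never the program's data.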
The role of the OS

 The OS takes over when page faults occur.

 Maintains a data structure (called a swap table or backing store map).

 It maps virtual page numbers to locations in the swap space (disk blocks).

 Use both memory and disk.

 Give illusion of larger memory by storing some content on disk.

 Disk is usually much larger and slower than DRAM

 Protection:

 Isolate memory between processes by assigning different pages in DRAM

 Prevents user programs from messing with OS’s memory.

 Errors in one program won’t corrupt memory of other programs.

 Keeps track of which process is active

 Page table entry also includes a write protection bit; if on, then page is “protected”.
Address translation: Page hit example

1. Program executes a load specifying a virtual address (VA).

2. Computer translates the VA to the physical address (PA) in memory:

 Extract the virtual page number (VPN) from the VA, e.g. the top 20 bits if the page size is 4 KiB.

 Look up the physical page number (PPN) in the page table.

 Page hit: PA = PPN (from the page table) + offset (from the VA).

3. Memory is read at the PA & the data is returned to the program.

[Figure: the CPU executes lb t0, 0xFFFFF004(x0) in a 32-bit virtual address space. The page table maps VPN 0xFFFFF to PPN 1 (while VPN 0x60000 points to disk), so the data for t0 is read from physical page 1 of DRAM (assume 4 × 4 KiB pages).]
Address translation: Page fault example

1. Program executes a load specifying a virtual address (VA).

2. Computer translates the VA to the physical address (PA) in memory:

 Extract the virtual page number (VPN) from the VA, e.g. the top 20 bits if the page size is 4 KiB.

 Look up the physical page number (PPN) in the page table.

 Page fault: the OS uses the swap table to find the disk block.

3. The OS reads the disk, loads the data into RAM & returns it to the program.

[Figure: after lb t0, 0xFFFFF004(x0), the CPU executes lb t1, 0x60000030(x0). The page table entry for VPN 0x60000 points to disk → go to disk! (assume 4 × 4 KiB pages)]
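Both outcomes can be sketched in a few lines (my own illustration, with an assumed toy page table, not the lecture's code):

```python
# Translate a 32-bit VA: a resident VPN gives a page hit, a missing one
# raises a page fault that the OS would service from the swap table.
PAGE_BITS = 12  # 4 KiB pages: offset = low 12 bits, VPN = top 20 bits

page_table = {0xFFFFF: 1}   # VPN -> PPN; VPN 0x60000 is on disk

class PageFault(Exception):
    pass

def translate(va: int) -> int:
    vpn = va >> PAGE_BITS
    offset = va & ((1 << PAGE_BITS) - 1)
    if vpn not in page_table:
        raise PageFault(f"VPN {vpn:#x} not resident; OS must go to disk")
    return (page_table[vpn] << PAGE_BITS) | offset  # PA = PPN || offset

print(hex(translate(0xFFFFF004)))   # 0x1004  (page hit)
try:
    translate(0x60000030)
except PageFault as e:
    print("page fault:", e)
```

The millions-of-cycles cost of the fault path is exactly why the OS switches to another process while the disk access is in flight.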
Dealing with Large Page Tables

 When the OS starts a new process, it creates space on disk for all the pages of the process (all valid bits in the page table = zero)

 This is called Demand Paging: pages of the process are loaded from disk only when needed.

 With demand paging, physical memory fills quickly → the overall page table size becomes too big.

 4 GB virtual address space ÷ 4 KB pages = 1 million page table entries ≈ 4 MB just for a single process!

 Variety of solutions trade page table size for slower translation

 E.g., multi-level page tables, paging the page tables, etc.


Two-level Page Table Example

[Figure: a 32-bit virtual address vaddr is split into three fields: bits 31–22 (10 bits, page directory index), bits 21–12 (10 bits, page table index), and bits 11–0 (12 bits, page offset). The PTBR points to the Page Directory; a PDEntry ("where is my translation?") points to a Page Table, whose PTEntry ("where is my physical page?") supplies the PPN. The Page Directory and Page Tables are also referred to as Level 1 and Level 2 Page Tables.]
Multi-level Page Table

 Doesn't this take up more memory than before?

 Yes, e.g. for a two-level page table:

 Page directory size: 2^10 entries × 4 bytes = 4 KB

 Total page table size: 2^10 tables × 2^10 entries/table × 4 bytes = 4 MB

 Benefits

 Don't need 4 MB of contiguous physical memory

 Don't need to allocate every page table, only those containing valid entries

 Drawbacks

 Longer lookups.
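The 10/10/12 walk can be sketched with dictionaries standing in for the directory and tables (an assumed illustration, not the lecture's code); note that only the one L2 table with valid entries is allocated:

```python
# Two-level page table walk: PTBR -> page directory -> page table -> PPN.
def split(va):
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF

# Only one L2 page table exists; unmapped regions have no table at all.
page_directory = {0x3FF: {0x3FF: 1}}     # PDE index -> L2 table (PT idx -> PPN)

def walk(va):
    pd_idx, pt_idx, offset = split(va)
    l2 = page_directory.get(pd_idx)      # PDEntry: "where is my translation?"
    if l2 is None or pt_idx not in l2:
        return None                      # would be a page fault
    return (l2[pt_idx] << 12) | offset   # PTEntry: "where is my physical page?"

print(hex(walk(0xFFFFF004)))   # 0x1004
print(walk(0x60000030))        # None -- no L2 table allocated for this region
```

The sparse dictionary mirrors the benefit above: untouched 4 MB regions of the address space cost nothing.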
Disk Write and Load control

 Disk writes take millions of cycles

 A whole block is accessed at once, not individual locations

 Write through is impractical

 Use write-back

 Dirty bit in the page table entry set when page is written

 To improve CPU utilization, the CPU must switch to service other processes during a disk write

 The system is thrashing when a chain of page faults occurs → CPU utilization falls: the system spends all of its time paging.

 Load control: determining how many jobs can be in memory at one time

 Load control in the small: how many pages of a process to be loaded?

 Load control in the large: how many processes to be loaded?


Page Replacement

 Now we have lots of programs in memory

 When a process faults & memory is full, some page must be swapped out

 Which page should be replaced?

 Local replacement: replace a page of the faulting process.

 Global replacement: possibly replace the page of another process.

 Replacement algorithms

 FIFO: simple to implement, worst performance.

 Clairvoyant: replace the page that won't be needed for the longest time in the future – optimal but impractical (can't look forward in time).

 LRU: replace the page that hasn’t been referenced for the longest time

 Look backwards and use the recent past to predict the near future.

 Approximation of clairvoyant replacement: “back to the future” heuristic.


LRU Replacement: Example

 Four page frames, initially holding pages a, b, c, d:

Time       1  2  3  4  5  6  7  8  9
Requests   c  a  d  b  e  b  a  b  c

Frame 0    a  a  a  a  a  a  a  a  a
Frame 1    b  b  b  b  b  b  b  b  b
Frame 2    c  c  c  c  e  e  e  e  e
Frame 3    d  d  d  d  d  d  d  d  c

Faults                 •           •

 At time 5 the last-used times are a=2, b=4, c=1, d=3 → evict c; at time 9 they are a=7, b=8, e=5, d=3 → evict d (leaving a=7, b=8, e=5, c=9).

 Note: LRU did a bad job at time 9

 It replaced page d, which we're just going to reference 1 (virtual) time unit later.
LRU Replacement: Implementation

 Maintain a "stack" of recently used pages: each reference moves the page to the top, so the bottom of the stack is always the LRU page

 Impractical: stack size depends on the number of frames allocated to the process

Time              1  2  3  4  5  6  7  8  9  10
Requests          c  a  d  b  e  b  a  b  c  d

LRU page stack    c  a  d  b  e  b  a  b  c  d   (most recent)
(top to bottom)      c  a  d  b  e  b  a  b  c
                        c  a  d  d  e  e  a  b
                           c  a  a  d  d  e  a   (least recent)

 Page to replace (the stack bottom): c at time 5, d at time 9, e at time 10.
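The stack scheme above can be written as a short simulation (my own sketch, not the lecture's code), reproducing the evictions c, d, e from the table:

```python
# LRU "stack": every reference moves the page to the top of the list;
# on a fault with all frames full, the bottom (least recently used) is
# evicted.
def lru_stack_sim(requests, num_frames):
    stack = []            # index 0 = most recently used, last = LRU
    evictions = []
    for page in requests:
        if page in stack:
            stack.remove(page)               # hit: lift the page out...
        elif len(stack) == num_frames:
            evictions.append(stack.pop())    # fault: evict the LRU page
        stack.insert(0, page)                # ...and place it on top
    return evictions

print(lru_stack_sim("cadbebabcd", 4))   # ['c', 'd', 'e']
```

The `page in stack` search and `remove` are why a hardware implementation is impractical: every single reference would have to reshuffle the structure.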
LRU Replacement: Approximation

 Maintain a circular list of pages resident in memory

 Use a clock (used/referenced) bit to track how often a page is accessed

 The bit is set whenever a page is referenced

 The clock hand sweeps over pages looking for one with used bit = 0

 The search starts with the 1st valid entry in the page table and thereafter continues where it left off last time.

func Clock_Replacement
begin
  while (victim page not found) do
    if (used bit for current page = 0) then
      replace current page
    else
      reset used bit
    end if
    advance clock pointer
  end while
end Clock_Replacement

[Figure: pages 7, 1, 4, 3, 0 arranged in a circle, each labeled with its resident bit, used bit, and frame number; the clock hand advances around the circle.]
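The pseudocode above translates directly into a runnable sketch (variable names are my own):

```python
# Clock replacement: sweep the circular list of used bits until a page
# with used bit = 0 is found; pages with the bit set get a second chance.
def clock_replace(used_bits, hand):
    """Return (victim_index, new_hand); mutates used_bits in place."""
    while True:
        if used_bits[hand] == 0:
            victim = hand
            return victim, (hand + 1) % len(used_bits)  # advance past victim
        used_bits[hand] = 0                 # reset used bit (second chance)
        hand = (hand + 1) % len(used_bits)  # advance clock pointer

bits = [1, 1, 0, 1]
victim, hand = clock_replace(bits, 0)
print(victim)   # 2 -- the first page reached with used bit 0
```

Pages 0 and 1 had their used bits cleared on the way past, so they become candidates on the next sweep unless they are referenced again first.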
TLB: Making Address Translation Fast

 A virtual memory request requires extra memory references

 one to translate Virtual Address into Physical Address (page table lookup)

 one to transfer the actual data (hopefully cache hit).

 But access to page tables has good locality

 So use a fast cache of page tables within the CPU

 Called a Translation Look-aside Buffer (TLB)

TLB entry: V | Dirty | Ref | Tag (VPN) | PPN

 Typical design: fully associative, 16–512 entries, 0.5–1 cycle for a hit, 10–100 cycles for a miss, 0.01%–1% miss rate, random/FIFO replacement policy.

 Misses could be handled by hardware or software.
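A TLB in front of the page table can be sketched as a small cache of translations (an assumed toy model, ignoring capacity limits and replacement):

```python
# TLB lookup: a hit returns the cached PPN immediately; a miss falls
# back to the page table and fills the TLB for next time.
page_table = {0xFFFFF: 1, 0x00000: 0}   # VPN -> PPN
tlb = {}                                # small cache of recent translations

def tlb_translate(vpn):
    if vpn in tlb:
        return tlb[vpn], "TLB hit"      # ~0.5-1 cycle
    ppn = page_table[vpn]               # page table walk, ~10-100 cycles
    tlb[vpn] = ppn                      # fill the TLB
    return ppn, "TLB miss"

print(tlb_translate(0xFFFFF))   # (1, 'TLB miss')
print(tlb_translate(0xFFFFF))   # (1, 'TLB hit')
```

Good locality in page table accesses is what makes such a small cache effective: the second reference to any page in the working set hits.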


TLB: Example
Full Memory Hierarchy

[Figure: the CPU sends a VA to the TLB lookup; a hit (¼ t) sends the PA to the cache, and a cache miss continues to main memory (¾ t). A TLB miss falls back to the page table for the translation before the data is returned.]

 A TLB miss – is it a page fault or merely a TLB miss?

 Merely a TLB miss, if the page is already loaded into main memory → the translation is found in the page table and loaded into the TLB (takes ~10's of cycles).

 A true page fault, if the page is not in main memory (takes ~millions of cycles to service).

 TLB misses are much more frequent than true page faults
Summary: steps in memory access
