13-vm-notes
Vishal Shrivastav
CS 3410
Computer System Organization & Programming
These slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer.
Where are we now and where are we going?
• a) 1
• b) 2
• c) 3-5
• d) 6-10
• e) 11+
Big Picture: Multiple Processes
How to run multiple processes?
[Figure: Memory holding a program's Heap, Data, and Text segments, with Text nearest address 0x000…0]
Multiple Processes
• Q: What happens when another program is executed concurrently on another processor?
• A: The addresses will conflict, even though the CPUs may take turns using the memory bus.
[Figure: two CPUs, each with a cache ($$), attached to one Memory spanning 0x000…0 to 0xfff…f (0x7ff…f marked); each program has its own Stack, Heap, Data, and Text segments]
Multiple Processes
• Q: Can we relocate the second program?
[Figure: two CPUs and one Memory spanning 0x000…0 to 0xfff…f (0x7ff…f marked), holding two sets of Stack, Heap, Data, and Text segments]
Solution? Multiple processes/processors
Give each process an illusion that it has exclusive access to entire main memory.

Address   Process 1   Process 2
3         A           E
2         B           F
1         C           G
0         D           H
But In Reality…
The data of Process 1 and Process 2 is scattered across Physical Memory (addresses 0 to 14):

Physical address   Contents
13                 D
11                 E
9                  C
7                  B
6                  G
5                  H
3                  A
1                  F
How do we create the illusion?
[Figure: the Process 1 and Process 2 views (addresses 0 to 3, data A to D and E to H) drawn next to Physical Memory (addresses 0 to 14), where the same data sits at scattered physical addresses]
How do we create the illusion?
"All problems in computer science can be solved by another level of indirection." – David Wheeler
[Figure: the Process 1 and Process 2 views (addresses 0 to 3) next to Physical Memory (addresses 0 to 14)]
How do we create the illusion?
Map virtual address to physical address.
Memory management unit (MMU) takes care of the mapping.
Virtual Memory (just a concept; does not exist physically)
[Figure: each process issues a virtual address (0 to 3); the MMU turns it into a physical address in Physical Memory (0 to 14)]
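A minimal sketch of the mapping idea, assuming the placements drawn on these slides (Process 1 holds A to D, Process 2 holds E to H, and the physical addresses are the ones shown in Physical Memory). The names MAPPINGS, PHYSICAL_MEMORY, and mmu_read are hypothetical, chosen only for illustration; a real MMU is hardware, not a dictionary lookup.

```python
# Virtual address -> physical address, one table per process.
# Values are read off the slide figure; the representation is an assumption.
MAPPINGS = {
    "Process 1": {0: 13, 1: 9, 2: 7, 3: 3},
    "Process 2": {0: 5, 1: 6, 2: 1, 3: 11},
}

# Physical address -> data item, as drawn in Physical Memory.
PHYSICAL_MEMORY = {13: "D", 11: "E", 9: "C", 7: "B", 6: "G", 5: "H", 3: "A", 1: "F"}


def mmu_read(process, vaddr):
    """Translate a process's virtual address, then read physical memory."""
    paddr = MAPPINGS[process][vaddr]
    return PHYSICAL_MEMORY[paddr]


if __name__ == "__main__":
    # Both processes use virtual addresses 0..3 but see different data.
    for process in MAPPINGS:
        print(process, [mmu_read(process, v) for v in range(4)])
```

Both processes use the same virtual address 1, yet Process 1 reads C and Process 2 reads G, because each lookup goes through that process's own table.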
How do we create the illusion?
14
Process 1 wants to
D 13
access data C
12
Process 1 thinks it
3 A E
Process 1 2 B
is stored at addr 1 11
So CPU generates 10
1 C addr 1
C 9
Physical address
0 D This addr is 8
intercepted by
Virtual address
MMU
B 7
Memory
B 7
management unit G 6
(MMU) takes care H 5
3 E of the mapping
Process 2 2 F
4
3
1 G
2
0 H
1
Virtual Memory
0
A F
Disk Physical Memory
Big Picture: (Virtual) Memory
• From a process’s perspective –
Picture Memory as… ?
Byte Array: one data byte per address, from 0x00000000 (x00) up to 0xffffffff (xaa).

Segments (high to low addresses):
  0xfffffffc  system reserved (down to 0x80000000)
  0x7ffffffc  stack
              heap
  0x10000000  data
  0x00400000  text
  0x00000000  system reserved

Page Array (starting address of each page):
  0xfffff000  page n
  0xffffe000
  0xffffd000
  ...
  0x00004000
  0x00003000
  0x00002000  page 2
  0x00001000  page 1
  0x00000000  page 0
A Little More About Pages
Memory size = depends on system, say 4GB
Page size = 4KB (by default)
Then, # of pages = 2^20
Any data in page #2 has an address of the form 0x00002xxx

vaddr 0x00002ABC splits into:
  bits 31-12: 0x00002 (index into the page table)
  bits 11-0:  0xABC (page offset)

Page table, starting at 0x90000000:
  Entry address   Contents
  0x9000000c      0xC20A3
  0x90000008      0x4123B
  0x90000004      0x10044
  0x90000000      0x00000

Entry 2 (at 0x90000008) holds 0x4123B, so paddr = 0x4123B followed by the offset 0xABC = 0x4123BABC, which lies in the physical page starting at 0x4123B000.
[Figure: physical pages at 0x4123B000, 0x10045000, and 0x10044000]
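The lookup on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not how the hardware is built: PAGE_TABLE is a hypothetical Python dict holding the four entries shown on the slide, and translate is a made-up helper name.

```python
PAGE_OFFSET_BITS = 12  # 4KB pages, so the low 12 bits are the page offset

# Virtual page number -> physical page number, copied from the slide's table.
PAGE_TABLE = {3: 0xC20A3, 2: 0x4123B, 1: 0x10044, 0: 0x00000}


def translate(vaddr):
    """Split vaddr into (VPN, offset), look up the PPN, and rebuild paddr."""
    vpn = vaddr >> PAGE_OFFSET_BITS
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    ppn = PAGE_TABLE[vpn]
    return (ppn << PAGE_OFFSET_BITS) | offset


print(hex(translate(0x00002ABC)))  # 0x4123babc
```

The page offset passes through untouched; only the upper 20 bits are replaced.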
Clicker Question:
Page size is 16KB. How many bits is the page offset?
(a) 12 (b) 13 (c) 14 (d) 15 (e) 16
• What if Main Memory is not 4GB?
Physical page number is no longer 20 bits
Clicker Question:
Page size 4KB, Main Memory 512 MB. How many bits is the PPN?
(a) 15 (b) 16 (c) 17 (d) 18 (e) 19
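Both clicker questions reduce to taking a base-2 logarithm. The sketch below shows the arithmetic; the helper names log2_exact, offset_bits, and ppn_bits are hypothetical, and it assumes all sizes are powers of two.

```python
KB, MB, GB = 2**10, 2**20, 2**30


def log2_exact(n):
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1


def offset_bits(page_size):
    """Bits needed to address every byte within one page."""
    return log2_exact(page_size)


def ppn_bits(memory_size, page_size):
    """Bits needed to number every physical page."""
    return log2_exact(memory_size // page_size)


print(offset_bits(4 * KB), ppn_bits(4 * GB, 4 * KB))    # 12 20
print(offset_bits(16 * KB))                             # 14
print(ppn_bits(512 * MB, 4 * KB))                       # 17
```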
Virtual Memory: Summary
Virtual Memory: a Solution for All Problems
Page Table Overhead
• How large is PageTable?
• Virtual address space (for each process):
  Given: total virtual memory: 2^32 bytes = 4GB
  Given: page size: 2^12 bytes = 4KB
  # entries in PageTable? 2^20 = 1 million entries
  size of PageTable? PTE size = 4 bytes
  PageTable size = 4 x 2^20 = 4MB
• Physical address space (512 MB, shared by 10 processes):
  10 x 4MB = 40 MB of overhead!
  • 40 MB / 512 MB = 7.8% of physical memory is taken up by PageTables
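The overhead arithmetic can be checked directly. This sketch assumes the figures used on the slide (4GB virtual space, 4KB pages, 4-byte PTEs, 10 processes, 512 MB of physical memory); page_table_bytes is a hypothetical helper name.

```python
def page_table_bytes(virtual_bytes, page_bytes, pte_bytes):
    """Size of a single-level page table: one PTE per virtual page."""
    return (virtual_bytes // page_bytes) * pte_bytes


MB = 2**20
per_process = page_table_bytes(2**32, 2**12, 4)  # 4MB per process
total = 10 * per_process                         # 10 processes
fraction = total / (512 * MB)

print(per_process // MB, "MB per process")  # 4 MB per process
print(total // MB, "MB total")              # 40 MB total
print(f"{fraction:.1%} of physical memory") # 7.8% of physical memory
```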
But Wait... There’s more!
• Page Table Entry won’t be just an integer
• Meta-Data
Valid Bits
• What PPN means “not mapped”? No such number…
• At first: not all virtual pages will be in physical memory
• Later: might not have enough physical memory to map
all virtual pages
Page Permissions
• R/W/X permission bits for each PTE
• Code: read-only, executable
• Data: writeable, not executable
Less Simple Page Table
Page table rows, top to bottom as drawn on the slide:

V  R  W  X  Physical Page Number
0
1  1  1  0  0xC20A3
0
0
1  1  0  0  0xC20A3
1           0x4123B
1           0x10044
0

[Figure: physical memory with pages marked at 0xC20A3000, 0x90000000, 0x4123B000, 0x10045000, 0x10044000, and 0x00000000]

Process tries to access a page without proper permissions: Segmentation Fault
Example: Write to read-only? Process killed
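A permission check of this kind might look like the sketch below. It is only an illustration: the PTE class, the SegmentationFault exception, and check_access are hypothetical names, and an entry with V = 0 is simply treated as a fault here (bringing pages in from disk is ignored).

```python
from dataclasses import dataclass


@dataclass
class PTE:
    """One page table entry with the metadata bits from the slide."""
    valid: bool = False
    read: bool = False
    write: bool = False
    execute: bool = False
    ppn: int = 0


class SegmentationFault(Exception):
    """Raised when an access is not allowed by the PTE."""


def check_access(pte, kind):
    """Return the PPN if `kind` ('read', 'write', 'execute') is permitted."""
    if not pte.valid:
        raise SegmentationFault("page not mapped")
    allowed = {"read": pte.read, "write": pte.write, "execute": pte.execute}[kind]
    if not allowed:
        raise SegmentationFault(f"{kind} not permitted")
    return pte.ppn


# The two fully specified rows from the slide's table.
READ_WRITE = PTE(valid=True, read=True, write=True, execute=False, ppn=0xC20A3)
READ_ONLY = PTE(valid=True, read=True, write=False, execute=False, ppn=0xC20A3)

print(hex(check_access(READ_WRITE, "write")))  # 0xc20a3
try:
    check_access(READ_ONLY, "write")
except SegmentationFault as fault:
    print("Segmentation Fault:", fault)
```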
Now how big is this Page Table?
struct pte_t page_table[2^20]
Each PTE = 8 bytes
How many pages in memory will the page table take up?
[Figure: PTBR points to the Page Table]
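One way to work this out, assuming the 4KB page size used earlier in these notes (the constant names are made up for the example):

```python
NUM_ENTRIES = 2**20      # one PTE per virtual page
PTE_BYTES = 8            # each PTE now carries metadata bits as well
PAGE_BYTES = 4 * 2**10   # 4KB pages (assumed, as earlier in the notes)

table_bytes = NUM_ENTRIES * PTE_BYTES
pages_needed = table_bytes // PAGE_BYTES

print(table_bytes // 2**20, "MB")  # 8 MB
print(pages_needed, "pages")       # 2048 pages
```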
Multi-Level Page Table
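vaddr: 10 bits (bits 31-22) | 10 bits (bits 21-12) | 12 bits (bits 11-0)
[Figure: PTBR points to the Page Directory; a PDEntry points to a Page Table; a PTEntry in that Page Table holds the PPN]
The Page Directory and the Page Tables are also referred to as Level 1 and Level 2 Page Tables.
* Indirection to the Rescue, AGAIN!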
Multi-Level Page Table
vaddr: 10 bits (bits 31-22) | 10 bits (bits 21-12) | 12 bits (bits 11-0)
Assuming each entry is 4 bytes, what is the size of the Page Directory?
A: 2KB B: 2MB
C: 4KB D: 4MB
Multi-Level Page Table
vaddr: 10 bits (bits 31-22) | 10 bits (bits 21-12) | 12 bits (bits 11-0)
Assuming each entry is 4 bytes, what is the total size of ALL Page Tables?
A: 2KB B: 2MB
C: 4KB D: 4MB
Multi-Level Page Table
vaddr: 10 bits (bits 31-22) | 10 bits (bits 21-12) | 12 bits (bits 11-0)
Page Directory: Size = 2^10 * 4 bytes = 4KB
Page Tables: Size = 2^10 * 2^10 * 4 bytes = 4MB
  (# page tables * # entries per page table * entry size)
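A two-level lookup with the 10/10/12 split can be sketched as follows. The nested-dict layout and the names split_vaddr and walk are assumptions made for illustration; real page directories and page tables are arrays in physical memory located through the PTBR.

```python
def split_vaddr(vaddr):
    """Split a 32-bit vaddr into (directory index, table index, offset)."""
    pd_index = (vaddr >> 22) & 0x3FF  # bits 31-22
    pt_index = (vaddr >> 12) & 0x3FF  # bits 21-12
    offset = vaddr & 0xFFF            # bits 11-0
    return pd_index, pt_index, offset


def walk(page_directory, vaddr):
    """Return the physical address, or None if either level has no entry."""
    pd_index, pt_index, offset = split_vaddr(vaddr)
    page_table = page_directory.get(pd_index)  # level 1: PDEntry
    if page_table is None:
        return None
    ppn = page_table.get(pt_index)             # level 2: PTEntry
    if ppn is None:
        return None
    return (ppn << 12) | offset


# Hypothetical contents: only one second-level table exists.
page_directory = {0: {2: 0x4123B}}

print(split_vaddr(0x00002ABC))                 # (0, 2, 2748)
print(hex(walk(page_directory, 0x00002ABC)))   # 0x4123babc
print(walk(page_directory, 0x00402ABC))        # None

# Sizes from the slide, with 4-byte entries.
print(2**10 * 4, 2**10 * 2**10 * 4)            # 4096 4194304
```

Each translation now needs two lookups instead of one, which is the performance cost paid for the more flexible layout.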
Multi-Level Page Table
Doesn’t this take up more memory than before?
- YES, but..
Benefits
• Don’t need 4MB contiguous physical memory
• Don’t need to allocate every PageTable, only
those containing valid PTEs
Drawbacks
• Performance: Longer lookups
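To see both sides of the "YES, but.." answer, the sketch below counts the bytes a two-level scheme needs for a given set of mapped virtual pages, assuming 4-byte entries and the 10/10/12 split. The function name two_level_bytes and the three example page numbers are hypothetical.

```python
ENTRY_BYTES = 4
ENTRIES_PER_TABLE = 2**10
TABLE_BYTES = ENTRIES_PER_TABLE * ENTRY_BYTES  # 4KB per directory or table


def two_level_bytes(mapped_vpns):
    """Page directory plus only the page tables that hold a valid PTE."""
    used_tables = {vpn >> 10 for vpn in mapped_vpns}  # top 10 bits of the VPN
    return TABLE_BYTES * (1 + len(used_tables))


# A process touching three far-apart pages (illustrative VPNs only).
sparse = [0x00400, 0x10000, 0x7FFFF]
print(two_level_bytes(sparse))           # 16384 bytes = 16KB

# Every virtual page mapped: slightly worse than the 4MB single-level table.
print(two_level_bytes(range(2**20)))     # 4198400 bytes = 4MB + 4KB
```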
Virtual Memory Agenda
What is Virtual Memory?
How does Virtual Memory work?
• Address Translation
• Overhead
• Paging
• Performance
Paging
What if process requirements > physical memory?
Virtual starts earning its name
More Meta-Data:
• Dirty Bit, Recently Used, etc.
• OS may access this meta-data to choose a victim
Paging
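PageTable (entry 7 at the top, entry 0 at the bottom):

Entry  V  R W X  D  Physical Page Number
7      0            --
6      1  1 0 1  0  0x10045
5      0            --
4      0            --
3      0         0  disk sector 200
2      0         0  disk sector 25
1      1  1 1 0  1  0x00000
0      0            --

[Figure: physical memory with pages marked at 0xC20A3000, 0x90000000, 0x4123B000, 0x10045000, and 0x00000000, plus a disk holding sectors 200 and 25]

Example: accessing an address beginning with 0x00003 (PageTable[3]) results in a Page Fault, which will page the data in from disk sector 200.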
Page Fault
Valid bit in Page Table = 0
means page is not in memory
OS takes over:
• Choose a physical page to replace
“Working set”: refined LRU, tracks page usage
• If dirty, write to disk
• Read missing page from disk
Takes so long (~10ms), OS schedules another task
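The steps the OS takes can be simulated in a few lines. This is a simplified sketch: it uses plain LRU as a stand-in for the "working set" policy named on the slide, it only counts disk writes instead of performing I/O, and the class name PagedMemory and its fields are hypothetical.

```python
from collections import OrderedDict


class PagedMemory:
    """Tracks which virtual pages are resident in a fixed number of frames."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()  # vpn -> dirty bit, least recently used first
        self.page_faults = 0
        self.disk_writes = 0

    def access(self, vpn, write=False):
        if vpn in self.resident:
            self.resident.move_to_end(vpn)       # mark as recently used
        else:
            self.page_faults += 1                # valid bit was 0: OS takes over
            if len(self.resident) >= self.num_frames:
                _victim, dirty = self.resident.popitem(last=False)  # choose LRU victim
                if dirty:
                    self.disk_writes += 1        # if dirty, write to disk
            self.resident[vpn] = False           # read missing page from disk (clean)
        if write:
            self.resident[vpn] = True            # set the dirty bit


memory = PagedMemory(num_frames=2)
memory.access(1, write=True)
memory.access(2)
memory.access(1)
memory.access(3)   # evicts page 2, which is clean
memory.access(4)   # evicts page 1, which is dirty
print(memory.page_faults, memory.disk_writes)  # 4 1
```

Only the dirty victim costs a disk write; the clean one is simply dropped because the copy on disk is still current.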
Watch Your Performance Tank!
For every instruction:
• MMU translates address (virtual → physical)
  Uses PTBR to find Page Table in memory
  Looks up entry for that virtual page
• Fetch the instruction using physical address
  Access Memory Hierarchy (I$ → L2 → Memory)
Performance
• Virtual Memory Summary
• PageTable for each process:
  • Single-level (e.g. 4MB contiguous in physical memory)
  • or multi-level (e.g. less mem overhead due to page table)
  • …
  every load/store translated to physical addresses
  page table miss: load a swapped-out page and retry instruction, or kill program
• Performance?
  terrible: memory is already slow, and translation makes it slower
• Solution?
  A cache, of course
Next Goal
• How do we speedup address translation?
Translation Lookaside Buffer (TLB)
• Small, fast cache
• Holds VPN → PPN translations
• Exploits temporal locality in page table
• TLB Hit: huge performance savings
• TLB Miss: invoke TLB miss handler
• Put translation in TLB for later
[Figure: the CPU sends a VA to the MMU; the TLB inside the MMU stores VPN ("tag") and PPN ("data") pairs; a PA comes out]
TLB Parameters
Typical
• very small (64 – 256 entries), very fast
• fully associative, or at least set associative
TLB to the Rescue!
For every instruction:
• Translate the address (virtual → physical)
  CPU checks TLB
  That failing, walk the Page Table
  • Use PTBR to find Page Table in memory
  • Look up entry for that virtual page
  • Cache the result in the TLB
• Fetch the instruction using physical address
  Access Memory Hierarchy (I$ → L2 → Memory)
[Flowchart: TLB Hit? (yes / no), then Physical Address, then $ Access, then $ Hit? (yes / no), then DRAM Access]
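The TLB check followed by a page table walk can be sketched as a small cache in front of the page table. Assumptions in this sketch: the TLB is modelled as fully associative with LRU replacement (the slides do not specify a replacement policy), the page table is a plain dict, pages are 4KB, and the class name TLB and its counters are hypothetical.

```python
from collections import OrderedDict


class TLB:
    """Small cache of VPN -> PPN translations in front of a page table."""

    def __init__(self, page_table, capacity=64):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()  # vpn ("tag") -> ppn ("data"), LRU first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = vaddr >> 12, vaddr & 0xFFF
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)         # refresh LRU position
        else:
            self.misses += 1
            ppn = self.page_table[vpn]            # TLB miss: walk the page table
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[vpn] = ppn               # cache the result for later
        return (self.entries[vpn] << 12) | offset


page_table = {1: 0x10044, 2: 0x4123B, 3: 0xC20A3}
tlb = TLB(page_table, capacity=2)
for vaddr in (0x2ABC, 0x2000, 0x2FFC, 0x1004):
    print(hex(vaddr), "->", hex(tlb.translate(vaddr)))
print(tlb.hits, tlb.misses)  # 2 2
```

The three accesses to page 2 cost one page table walk between them, which is the temporal locality the TLB exploits.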