Memory
Chapter 8
Brett H. Meyer
Winter 2024
Revision history:
Warren Gross – 2017
Christophe Dubach – W2020, F2020, F2021, F2022, F2023
Brett H. Meyer – W2021, W2022, W2023, W2024
Some material from Hamacher, Vranesic, Zaky, and Manjikian, Computer Organization and Embedded Systems, 6th ed., 2012, McGraw Hill, and "Introduction to the ARM Processor using Altera Toolchain."
Disclaimer
Introduction
What is Memory? What is Storage?
source: www.ifixit.com
Random Access Memory (RAM)
Memory Technology
1024x1 RAM
Static RAM
[Figure: n-type and p-type MOSFET cross-sections. Source: VectorVoyager (PNG version: user:rogerb), CC BY-SA 3.0, via Wikimedia Commons]
6T SRAM Bit Cell
Dynamic RAM
DRAM Bit Cell
A DRAM cell stores a '1' when the voltage across C is VDD*. The charge in C leaks through T even when T is off.
* In practice, voltages less than VDD are recognized as '1', too.
Reading DRAM
Reading a DRAM cell refreshes its contents. Note that an entire row
is read and refreshed at the same time.
To refresh the entire DRAM, each row must be periodically read.
Refresh Overhead
Assume that each row needs to be refreshed every 64 ms, that the minimum time between two row accesses is 50 ns, and that the DRAM has 8192 rows, so a full refresh takes 8192 row accesses.
Read/write operations have to be delayed until refresh is finished.
What is the refresh overhead?
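One way to work it out: a full refresh takes 8192 × 50 ns = 409,600 ns ≈ 0.41 ms, and it must happen once every 64 ms. The overhead is therefore 0.41 ms / 64 ms ≈ 0.0064, i.e., reads and writes are delayed about 0.64% of the time.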
256 Mb Asynchronous DRAM (32M x 8)
The 25-bit address is broken into 14 bits for row select and 11 for column select (a sketch of the split follows below).
• First, A24-11 is driven and RAS (row address strobe) asserted, reading a row.
• Then, A10-0 is driven and CAS (column address strobe) asserted, selecting a byte.
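The same split in C (the helper names are made up for illustration; the field positions come from the slide):

#include <stdint.h>

/* Split a 25-bit address into the DRAM's row and column fields. */
uint32_t row_field(uint32_t addr) { return (addr >> 11) & 0x3FFF; } /* A24-11, 14 bits */
uint32_t col_field(uint32_t addr) { return addr & 0x7FF; }          /* A10-0, 11 bits  */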
Synchronous DRAM
• Synchronous DRAM (SDRAM) integrates an on-chip memory controller
• A clock helps generate internal timing signals (i.e., RAS and CAS)
• Refresh is also built-in
• The "dynamic" nature of the chip is invisible to the user
Efficient Block Transfers
Memory Latency and Bandwidth
Double-Data-Rate (DDR) SDRAM
Modern SDRAM uses both rising and falling edges of the clock
(“double data rate”).
E.g., DDR4-2400 uses a 1200 MHz bus clock and, transferring on both clock edges, supports up to 2400 M transfers per second.
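To connect the numbers: DDR transfers twice per clock cycle, so a 1200 MHz clock yields 2 × 1200 = 2400 MT/s; with the standard 64-bit (8 B) DIMM data path, that is a peak of 2400 M × 8 B = 19.2 GB/s.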
Multi-chip Memories
Memory Technology
Read-only Memories, Non-volatile Memories
Textbook §8.3
Read-only Memory (ROM)
PROM, EPROM, and EEPROM
Flash Memory
Direct Memory Access (DMA)
Textbook §8.4
Direct Memory Access
DMA Controller
DMA controllers may be shared; individual I/O devices may also have their own DMA controllers.
• The CPU writes the control registers (starting address, count, R/W) and initiates the transfer; a sketch follows below.
• The controller keeps track of progress with a counter.
• An interrupt can be used to signal transfer completion.
• DMA can also be invoked to make repeated transfers triggered by a timer.
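A minimal sketch of the CPU side of such a transfer, assuming a hypothetical memory-mapped controller (the base address, register offsets, and control bits below are all made up for illustration):

#include <stdint.h>

#define DMA_BASE  0x40001000u
#define DMA_ADDR  (*(volatile uint32_t *)(DMA_BASE + 0x0)) /* starting address */
#define DMA_COUNT (*(volatile uint32_t *)(DMA_BASE + 0x4)) /* transfer counter */
#define DMA_CTRL  (*(volatile uint32_t *)(DMA_BASE + 0x8)) /* control/status   */

#define CTRL_WRITE (1u << 0) /* direction: memory to device      */
#define CTRL_IEN   (1u << 1) /* raise an interrupt when done     */
#define CTRL_GO    (1u << 2) /* initiate the transfer            */

void dma_start(const uint32_t *buf, uint32_t words) {
    DMA_ADDR  = (uint32_t)(uintptr_t)buf;
    DMA_COUNT = words; /* the controller decrements this as it goes */
    DMA_CTRL  = CTRL_WRITE | CTRL_IEN | CTRL_GO;
    /* the CPU is now free; the completion interrupt signals the end */
}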
Caches
Textbook §8.5, 8.6
The Memory Problem
Solution: use both DRAM and SRAM such that the memory appears
to the CPU to be large, and fast.
The solution should be transparent to the programmer.
The Memory Problem
Library: large, slow access. Desk: small, fast access.
Unlimited amounts of fast memory?
Memory Hierarchy
Even when two different systems have the same number of levels of hierarchy, different use cases may mean different sizes for each memory.
Locality, Locality, Locality
Cache Basics
Caches are too small to store copies of the entire address space; at any given time, some recently accessed things will be in the cache, and other things will not.
Each time the CPU (a) fetches an instruction or (b) accesses data, the cache is checked first: a hit is serviced quickly from the cache, while a miss requires fetching the block from the next level of the hierarchy.
Hit and Miss Rate
Where are items put in the cache?
Some mapping functions are simple; others are more complex, but
result in a higher hit rate.
Direct-mapped Cache
Direct-mapped Cache
What happens when a cache block is accessed for the first time? The
tag could match, but the data would be invalid.
Each cache block also has a valid bit, initialized to '0' and set to '1' whenever a block is copied into the cache.
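A sketch of the address decoding in C, for a hypothetical cache with 16 B blocks and 128 blocks (the sizes are made up; only the tag/index/offset structure comes from the slides):

#include <stdint.h>

#define OFFSET_BITS 4 /* log2(16 B block)  */
#define INDEX_BITS  7 /* log2(128 blocks)  */

uint32_t offset(uint32_t addr) { return addr & ((1u << OFFSET_BITS) - 1); }
uint32_t index_(uint32_t addr) { return (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); }
uint32_t tag(uint32_t addr)    { return addr >> (OFFSET_BITS + INDEX_BITS); }

/* A hit requires both a set valid bit and a matching tag. */
int is_hit(uint32_t addr, uint32_t stored_tag, int valid) {
    return valid && tag(addr) == stored_tag;
}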
Direct-mapped Cache Hardware Design
Fully-associative Cache
This is slower and more expensive, but achieves the highest hit rate.
Fully-associative Cache
[Figure: fully-associative cache. The tag of every cache line is compared in parallel (one comparator per line); multiplexers then select the matching cache line and the requested word on a hit.]
Set-associative Cache
Set-associative Cache
[Figure: set-associative cache. A decoder selects one set; the tags within that set are compared in parallel, and multiplexers select the hit word.]
Every Cache is Set-associative
Block Replacement Policies
Each policy choice has pros and cons related to hardware complexity and resulting miss rate; many other policies exist beyond those shown here.
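As one example, a minimal sketch of least-recently-used (LRU) replacement for a single 4-way set; the age counters are an illustrative software model, not necessarily how hardware implements LRU:

#include <stdint.h>

#define WAYS 4
typedef struct { uint32_t tag; int valid; int age; } Line;

/* On each access, reset the used way's age and age the others. */
void touch(Line set[WAYS], int used) {
    for (int w = 0; w < WAYS; w++) set[w].age++;
    set[used].age = 0;
}

/* Pick a victim: any invalid line first, otherwise the oldest. */
int victim(Line set[WAYS]) {
    int v = 0;
    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid) return w;
        if (set[w].age > set[v].age) v = w;
    }
    return v;
}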
Writes to Cache
Write-through
Write-back
• Hit: write to the cache. Update main memory only when that cache block is removed from the cache. A dirty bit (or modified bit) is set to indicate that the cache block has been modified and is no longer identical to the block in main memory (sketched below).
• Miss: first copy the block containing the addressed word from main memory into the cache, and then write the new word into the cache block.
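A sketch of this hit/miss handling, with hypothetical stand-ins for the memory-side operations:

#include <stdint.h>

typedef struct { uint32_t tag; int valid; int dirty; uint8_t data[16]; } Block;

static void write_block_to_memory(Block *b) { (void)b; /* device-specific */ }
static void fetch_block(Block *b, uint32_t tag) { b->tag = tag; b->valid = 1; b->dirty = 0; }

void write_byte(Block *blk, uint32_t tag, uint32_t off, uint8_t value) {
    if (!(blk->valid && blk->tag == tag)) {   /* miss */
        if (blk->valid && blk->dirty)
            write_block_to_memory(blk);       /* evicted block is stale in memory */
        fetch_block(blk, tag);                /* copy the block into the cache */
    }
    blk->data[off] = value;                   /* hit path: write the cache only */
    blk->dirty = 1;                           /* memory is updated later, on eviction */
}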
Caching Example
Caching Example: Direct-mapped Cache Results
Caching Example: Fully-associative Cache Results
Caching Example: Set-associative Cache Results
Split L1 Cache
L1 is usually split into instruction and data caches; later levels are unified.
• Harvard architecture: a unified L1 would slow things down
• Instruction and data access patterns are quite different
• Instruction accesses are predictable: loops; basic blocks
• Instruction accesses are read-only
• Splitting the L1 cache results in higher hit rates
Secondary Storage
Textbook §8.10
Secondary Storage
Magnetic Hard Disk Drives
Each disk is divided into concentric tracks, and each track into sectors. A cylinder is a set of tracks on a stack of disks; such tracks can be accessed simultaneously without moving the read/write heads.
• Data is written
sector-by-sector (e.g., 512 B)
• Formatting information
(including track/sector
markers) and error-correcting
code (ECC) information is
stored on disk
• The file system is on disk, too:
data structures that the OS
uses to keep track of files
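For a sense of scale (illustrative numbers, not from the slides): a drive with 8 recording surfaces, 100,000 tracks per surface, 500 sectors per track, and 512 B sectors stores 8 × 100,000 × 500 × 512 B ≈ 205 GB, before formatting and ECC overhead.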
HDD Access Time
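The usual decomposition: access time = seek time + rotational latency + transfer time. For illustration, at 7200 RPM one rotation takes 60/7200 s ≈ 8.33 ms, so the average rotational latency is about 4.17 ms; with an average seek of roughly 9 ms and negligible transfer time for a single sector, a random access costs on the order of 13 ms.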
Virtual Memory
Textbook §8.8, 8.9
Virtual Memory
Memory Management Unit
Virtual Memory Organization
Address Translation
Page Table
The page table stores all translations from virtual to physical pages.
• The MMU stores the start address of the page table in the page table base register (PTBR)
• PTBR + VPN gives the address of the page table entry (PTE) for the given virtual page number (VPN); see the sketch below
• Each PTE maintains control bits (valid? modified?)
• Each PTE also stores the page frame number if the page is in memory
• Otherwise, it may indicate where on disk the page can be found
• PTEs also track process information, read/write permissions, etc.
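A minimal sketch of that lookup, assuming a flat (single-level) table of 4-byte PTEs and 4 KiB pages; the field layout is illustrative, not that of any particular MMU:

#include <stdint.h>

#define PAGE_BITS 12          /* 4 KiB pages */
#define PTE_VALID (1u << 31)

uint32_t *ptbr;               /* page table base register (set by the OS) */

/* Translate a virtual address; returns 0 on a page fault. */
int translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> PAGE_BITS;        /* virtual page number        */
    uint32_t pte = ptbr[vpn];                 /* PTBR + VPN locates the PTE */
    if (!(pte & PTE_VALID)) return 0;         /* not in memory: page fault  */
    uint32_t pfn = pte & 0x000FFFFFu;         /* page frame number (assumed field) */
    *paddr = (pfn << PAGE_BITS) | (vaddr & ((1u << PAGE_BITS) - 1));
    return 1;
}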
Translation Lookaside Buffer (TLB)
The MMU must perform translation for each memory access (i.e., every fetch and every load or store). If each translation requires a reference to the page table in main memory, translation is slow!
• When physical memory is large, the page table has many entries
• It isn't practical to store the page table in the MMU
• The translation lookaside buffer (TLB) in the MMU caches recently accessed PTEs; a sketch follows below
• The TLB is fully associative; on a miss, the full table is accessed and the TLB updated (e.g., using LRU replacement)
• Split L1 caches? Two TLBs: one for instruction accesses, another for data accesses
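A sketch of the TLB lookup, with a loop standing in for the hardware's parallel comparators (sizes and layout illustrative):

#include <stdint.h>

#define TLB_ENTRIES 16
typedef struct { uint32_t vpn; uint32_t pfn; int valid; } TlbEntry;

/* Fully associative: the VPN is compared against every entry. */
int tlb_lookup(TlbEntry tlb[TLB_ENTRIES], uint32_t vpn, uint32_t *pfn) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) { *pfn = tlb[i].pfn; return 1; }
    return 0; /* miss: walk the page table, then update the TLB (e.g., LRU) */
}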
Page Faults
Conclusions