Parallel & Distributed Computing
Lecture #07
What is a GPU?
• A GPU (Graphics Processing Unit) is a specialized processor
designed to perform many operations at the same time. This is
called parallel processing.
• Example: If a CPU is like a smart person solving problems one
by one, a GPU is like a large group of people solving smaller
problems simultaneously.
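The analogy above can be sketched in code. The following is a minimal Python illustration, not GPU code: it splits one large sum into smaller chunks and hands each chunk to a separate worker, which is the same kind of decomposition a GPU applies across its threads. The names `chunked`, `serial_sum`, and `parallel_sum`, and the worker count of 4, are hypothetical choices for this example. Python threads are used only to show the structure of the work; they do not reproduce GPU-level speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split data into at most n_chunks contiguous pieces of similar size."""
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def serial_sum(data):
    """CPU-style: one worker processes every element in order."""
    total = 0
    for x in data:
        total += x
    return total


def parallel_sum(data, n_workers=4):
    """GPU-style decomposition: each worker sums its own small chunk,
    then the partial results are combined."""
    chunks = chunked(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(sum, chunks))
    return sum(partial)


data = list(range(1, 1001))
print(serial_sum(data))    # 500500
print(parallel_sum(data))  # 500500
```

Both functions return the same result; only the way the work is divided differs.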
GPU Architecture
• GPUs are built to keep thousands of threads (lightweight
sequences of instructions) running at once. The main building
blocks are:
CUDA Cores (or ALUs)
• These are simple processors inside the GPU.
• There are hundreds or thousands of these in one GPU.
• They execute arithmetic and logic operations.
Streaming Multiprocessor (SM)
• An SM groups many CUDA cores together with shared resources
such as registers, shared memory, and warp schedulers.
• Each SM can run many threads in parallel.
• Think of it like a mini-CPU with multiple cores.
Warp and Warp Scheduler
• A warp is a group of 32 threads (the warp size on NVIDIA GPUs).
• All threads in a warp run the same instruction at the
same time (SIMT: Single Instruction, Multiple Threads).
• The warp scheduler selects which ready warp issues its next
instruction.
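The grouping of threads into warps is plain integer arithmetic. The sketch below is a Python model, not CUDA code, and assumes the warp size of 32 given above. The helper names `warp_of`, `warps_needed`, and `simt_step` are hypothetical and exist only for this illustration. `simt_step` mimics the SIMT idea by applying one instruction to every lane of a warp.

```python
WARP_SIZE = 32  # threads per warp, as stated above


def warp_of(thread_id):
    """Return (warp index, lane index) for a thread within its block."""
    return thread_id // WARP_SIZE, thread_id % WARP_SIZE


def warps_needed(threads_per_block):
    """Number of warps required to cover a block (ceiling division)."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE


def simt_step(warp_values, instruction):
    """Apply the same instruction to every thread's value in a warp."""
    return [instruction(v) for v in warp_values]


print(warp_of(0))         # (0, 0)
print(warp_of(31))        # (0, 31)
print(warp_of(32))        # (1, 0)
print(warps_needed(100))  # 4 (the last warp is only partly filled)

lanes = list(range(WARP_SIZE))
doubled = simt_step(lanes, lambda v: v * 2)
print(doubled[:4])        # [0, 2, 4, 6]
```

A block of 100 threads occupies 4 warps even though the fourth warp has only 4 active threads, which is one reason block sizes are usually chosen as multiples of 32.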
What is Memory Hierarchy?
• The computer memory hierarchy is a pyramid-shaped
arrangement of memory types. It ranks each kind of storage by
speed, capacity, and cost, with the fastest and smallest
memory at the top.
Registers
• Registers are small, high-speed memory units located in the
CPU. They are used to store the most frequently used data
and instructions.
• Registers have the fastest access time and the smallest
storage capacity; each register typically holds 16 to 64 bits.
Cache Memory
• Cache memory is a small, fast memory unit located close to
the CPU. It stores frequently used data and instructions that
have been recently accessed from the main memory.
• Cache memory is designed to minimize the time it takes to
access data by providing the CPU with quick access to
frequently used data.
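The effect of keeping recently accessed data close to the CPU can be sketched with a tiny software model. The Python example below is an illustration only: the class name `TinyCache`, the capacity of 4 entries, and the least-recently-used (LRU) replacement policy are assumptions made for this sketch, since the text above does not name a replacement policy. Real hardware caches work on fixed-size lines and are considerably more involved.

```python
from collections import OrderedDict


class TinyCache:
    """A toy cache in front of a slower 'main memory' dictionary."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address, main_memory):
        if address in self.lines:
            # Hit: serve from the cache and mark as most recently used.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Miss: fetch from main memory and keep a copy in the cache.
        self.misses += 1
        value = main_memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        return value


main_memory = {addr: addr * 10 for addr in range(100)}
cache = TinyCache(capacity=4)
for addr in [1, 2, 3, 1, 2, 3, 1, 2, 3, 50]:
    cache.read(addr, main_memory)
print(cache.hits, cache.misses)  # 6 4
```

Only the first touch of each address goes to main memory; the repeated accesses to addresses 1, 2, and 3 are served from the cache.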
Main Memory
• Main memory, also known as RAM (Random Access Memory),
is the primary memory of a computer system.
• It has a larger storage capacity than cache memory, but it is
slower.
• Main memory is used to store data and instructions that are
currently in use by the CPU.
Secondary Storage
• Secondary storage, such as hard disk drives (HDD) and
solid-state drives (SSD), is non-volatile memory with a larger
storage capacity than main memory.
• It is used to store data and instructions that are not currently
in use by the CPU. Secondary storage is much slower than main
memory and typically has the lowest cost per bit of the levels
described so far.
Magnetic Disk
• Magnetic disks are circular platters made of metal or plastic
and coated with a magnetizable material.
• The platters rotate at high speed inside the drive, and
magnetic disks are widely used as secondary storage.
Magnetic Tape
• Magnetic tape is a recording medium made of a thin plastic
film coated with a magnetic material.
• Magnetic tape is generally used for data backup. Because tape
is read sequentially, the drive must wind to the required
position on the strip, so its access time is slower than that
of a disk.
Characteristics of Memory Hierarchy
• Capacity: It is the total amount of information the memory
can store. As we move from top to bottom in the hierarchy,
the capacity increases.
• Access Time: It is the time interval between the read/write
request and the availability of the data. As we move from top
to bottom in the Hierarchy, the access time increases.
• Performance: The Memory Hierarchy design ensures that
frequently accessed data is stored in faster memory to
improve system performance.
• Cost Per Bit: As we move from bottom to top in the
hierarchy, the cost per bit increases; that is, internal memory
is costlier than external memory.
Level         Register            Cache             Primary memory    Secondary memory
Bandwidth     4k to 32k MB/sec    800 to 5k MB/sec  400 to 2k MB/sec  4 to 32 MB/sec
Size          Less than 1 KB      Less than 4 MB    Less than 2 GB    Greater than 2 GB
Access time   2 to 5 ns           3 to 10 ns        80 to 400 ns      5 ms
Managed by    Compiler            Hardware          Operating system  OS or user
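The access times in the table can be combined into an average memory access time, commonly computed as hit time plus miss rate times miss penalty. The Python sketch below is a worked example under stated assumptions: the 5 ns cache time and 100 ns main-memory time are illustrative picks from within the ranges in the table, and the 5% miss rate is a hypothetical value, not a measurement. The function name `average_access_time` is also chosen for this example only.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


CACHE_TIME_NS = 5.0     # assumed, within the 3 to 10 ns cache range
MEMORY_TIME_NS = 100.0  # assumed, within the 80 to 400 ns primary memory range
MISS_RATE = 0.05        # hypothetical: 5% of accesses miss the cache

with_cache = average_access_time(CACHE_TIME_NS, MISS_RATE, MEMORY_TIME_NS)
without_cache = MEMORY_TIME_NS  # every access goes to main memory

print(f"{with_cache:.1f} ns")     # 10.0 ns
print(f"{without_cache:.1f} ns")  # 100.0 ns
```

Under these assumed numbers, a cache that serves 95% of accesses brings the average access time from 100 ns down to about 10 ns, which is the performance benefit the hierarchy is designed to deliver.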
Advantages of Memory Hierarchy
• Performance: Frequently used data is stored in faster
memory (like cache), reducing access time and improving
overall system performance.
• Cost Efficiency: By combining small, fast memory (like
registers and cache) with larger, slower memory (like RAM
and HDD), the system achieves a balance between cost and
performance. This saves the user both money and time.
• Optimized Resource Utilization: Combines the benefits of
small, fast memory and large, cost-effective storage to
maximize system performance.
• Efficient Data Management: Frequently accessed data is
kept closer to the CPU, while less frequently used data is
stored in larger, slower memory, ensuring efficient data
handling.
Disadvantages of Memory Hierarchy
• Complex Design: Managing and coordinating data across
different levels of the hierarchy adds complexity to the
system's design and operation.
• Cost: Faster memory components like registers and cache
are expensive, limiting their size and increasing the overall
cost of the system.
• Latency: Accessing data stored in slower memory (like
secondary or tertiary storage) increases the latency and
reduces system performance.
• Maintenance Overhead: Managing and maintaining
different types of memory adds overhead in terms of
hardware and software.