Parallel and Distributed Computing

Question #2

A) Define Moore's law of speedup and also find the speedup if a code containing 60% parallel code is run on 4 cores.
The law referred to in the question as Moore's law of speedup is more commonly known as Amdahl's Law. It states that the speedup of a program using parallel processing is limited by the fraction of the program that cannot be parallelized.
The law is defined as:

S = 1 / ((1 - f) + f/n)

Where:
S = theoretical speedup
f = fraction of parallel code
n = number of processing cores

Given a code with 60% parallel code (f = 0.6) running on 4 cores (n = 4), we can calculate the speedup as:

S = 1 / ((1 - 0.6) + 0.6/4)
S = 1 / (0.4 + 0.15)
S = 1 / 0.55
S ≈ 1.82

So, the theoretical speedup would be approximately 1.82 times faster than running the program on a single core.
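This calculation can be checked with a small C program (a minimal sketch; the function name amdahl_speedup is my own, chosen for illustration):

#include <stdio.h>

/* Amdahl's Law: speedup = 1 / ((1 - f) + f / n),
   where f is the parallel fraction and n is the number of cores. */
static double amdahl_speedup(double f, int n) {
    return 1.0 / ((1.0 - f) + f / (double)n);
}

int main(void) {
    /* 60% parallel code on 4 cores, as in the question. */
    printf("Speedup = %.2f\n", amdahl_speedup(0.6, 4));   /* prints about 1.82 */
    return 0;
}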

B) Define Flynn's Taxonomy


Flynn's taxonomy is a classification of computer architectures, proposed by Michael J. Flynn in
1966 and extended in 1972. It classifies architectures by the number of concurrent instruction
(control) streams and data streams available in the hardware, and it is still used as a tool in the
design of modern processors and their functionalities. The four classifications are as follows:
- SISD (Single Instruction Stream, Single Data Stream): A sequential computer which exploits
no parallelism in either the instruction or data streams.

- SIMD (Single Instruction Stream, Multiple Data Streams): A single instruction is
simultaneously applied to multiple different data streams.

- MISD (Multiple Instruction Streams, Single Data Stream): Multiple instructions operate on one
data stream.

- MIMD (Multiple Instruction Streams, Multiple Data Streams): Multiple autonomous processors
simultaneously executing different instructions on different data.

Question #3
A) Define Cache Coherence, snarfing and snooping. Explain any one cache coherence protocol.
Cache Coherence:
In multiprocessor systems, each processor has its own cache. This means that different caches
could have different copies of the same memory block. Cache coherence ensures that all copies
of data stored in different caches are consistent and up to date. This is crucial for multiprocessor
systems to function correctly.

Snarfing:
Snarfing is a mechanism for maintaining cache coherence in which the cache controller watches
both the address and the data lines. When another bus master writes to a memory location that is
cached locally, the cache grabs ("snarfs") the new data off the bus and updates its own copy.

Snooping:
Snooping is another mechanism that ensures cache coherence. Each cache monitors (snoops on) the
address lines of the shared bus for transactions involving memory locations it has cached, and then
invalidates or updates its own copy as required by the protocol.

Cache Coherence Protocols:

Here are some examples of cache coherence protocols:
- Directory-based system: This protocol uses a common directory that upholds the coherence
between different caches. When a data copy is changed, the directory can either update or
invalidate the other caches with that change.
- MSI (Modified, Shared, Invalid): In this protocol each cache line is in one of three states –
Modified (M, the only valid copy, which has been written locally), Shared (S, a clean copy that
other caches may also hold) or Invalid (I). A simplified sketch of its state transitions is given
after this list.
- MESI (Modified, Exclusive, Shared, Invalid): This protocol is an extension of the MSI
protocol and adds an exclusive state.
- MOESI (Modified, Owned, Exclusive, Shared, Invalid): This protocol is another extension
of the MSI protocol and adds an owned state.
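To make the MSI description above concrete, here is a heavily simplified sketch in C of how one cache's copy of a single line could change state in response to local and remote (bus) events. The function msi_next and the event flags are my own illustrative names, and the model ignores bus arbitration, write-backs and the actual data transfer:

#include <stdio.h>

typedef enum { INVALID, SHARED, MODIFIED } msi_state;

/* Next state of this cache's copy of a line, given one event:
   a local read/write by its own processor, or a remote read/write
   observed on the bus from another cache. */
static msi_state msi_next(msi_state s, int local_read, int local_write,
                          int remote_read, int remote_write) {
    if (local_write)  return MODIFIED;                     /* obtain exclusive, dirty copy */
    if (local_read)   return (s == INVALID) ? SHARED : s;  /* fetch the line if missing */
    if (remote_write) return INVALID;                      /* another cache took ownership */
    if (remote_read && s == MODIFIED) return SHARED;       /* supply data, keep a clean copy */
    return s;
}

int main(void) {
    msi_state s = INVALID;
    s = msi_next(s, 1, 0, 0, 0);   /* local read miss  -> SHARED   */
    s = msi_next(s, 0, 1, 0, 0);   /* local write      -> MODIFIED */
    s = msi_next(s, 0, 0, 0, 1);   /* remote write     -> INVALID  */
    printf("final state = %d\n", s);
    return 0;
}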

B) Explain False Sharing with an example


False sharing occurs when multiple processors in a multi-core system access different variables
that reside on the same cache line, causing coherence traffic and performance degradation. This
happens even though the processors are not sharing the same variable, hence the term "false
sharing".

Example:

Suppose we have two processors, P1 and P2, and two variables, A and B, that are stored in the
same cache line.

P1 is repeatedly updating variable A, while P2 is repeatedly updating variable B.

Cache line layout:
| A | B |

Although P1 and P2 are accessing different variables, they are accessing the same cache line.
Every time P1 updates variable A, the cache coherence protocol invalidates the copy of the line
held in P2's cache, and vice versa.
As a result, P2's updates to variable B are slowed down, because the cache line must be re-fetched
and ownership regained after every invalidation. This is false sharing: the coherence traffic
caused by P1's updates to A degrades P2's updates to B, even though the two processors never
touch the same variable.

To avoid false sharing, variables that are accessed by different processors should be padded or
aligned so that they do not end up in the same cache line, as in the sketch below.
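A minimal C sketch of this fix (the 64-byte line size and the GCC/Clang aligned attribute are assumptions for illustration; the real cache-line size is machine dependent):

#include <stdio.h>
#include <pthread.h>

#define LINE_SIZE 64            /* assumed cache-line size in bytes */
#define ITERATIONS 10000000L

/* Each counter is padded and aligned so that A and B occupy
   different cache lines; without this, they could share one line
   and every update would invalidate the other core's copy. */
struct padded_counter {
    long value;
    char pad[LINE_SIZE - sizeof(long)];
} __attribute__((aligned(LINE_SIZE)));

static struct padded_counter A, B;

static void *update_a(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERATIONS; i++) A.value++;   /* touched only by P1's thread */
    return NULL;
}

static void *update_b(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERATIONS; i++) B.value++;   /* touched only by P2's thread */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, update_a, NULL);
    pthread_create(&t2, NULL, update_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("A = %ld, B = %ld\n", A.value, B.value);
    return 0;
}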

Question #4
A) Define Work Law and Span Law. Also express speedup in terms of the work law and span law.
The Work Law and the Span Law are two fundamental laws in parallel computing that bound the
running time, and hence the speedup, of a parallel algorithm on P processors. Let T_1 denote the
work (the total running time of all operations when executed on a single processor) and let T_∞
denote the span (the running time on an unlimited number of processors, i.e. the length of the
longest chain of operations that must run one after another, also called the critical path).

Work Law:

The Work Law states that the execution time T_P on P processors can never be smaller than the
work divided evenly among the processors, because each processor performs at most one unit of
work per time step:

T_P >= T_1 / P

Where:
T_P = execution time on P processors
T_1 = work (total execution time on a single processor)
P = number of processors

Span Law:

The Span Law states that the execution time on P processors can never be smaller than the span,
because the operations on the critical path must be executed sequentially no matter how many
processors are available:

T_P >= T_∞

Where:
T_∞ = span (length of the critical path)

Speedup in terms of Work Law and Span Law:

Speedup on P processors is defined as:

Speedup = T_1 / T_P

From the Work Law:

Speedup = T_1 / T_P <= P

so the speedup can never exceed the number of processors (linear speedup is the best case).

From the Span Law:

Speedup = T_1 / T_P <= T_1 / T_∞

so the speedup can never exceed the parallelism T_1 / T_∞ of the algorithm, no matter how many
processors are used.
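For example (hypothetical numbers chosen only for illustration): suppose an algorithm has work T_1 = 100 time units and span T_∞ = 10 time units. On P = 4 processors the Work Law gives T_P >= 100/4 = 25 and the Span Law gives T_P >= 10, so the speedup T_1/T_P is at most 4. On P = 32 processors the work bound drops to 100/32 ≈ 3.1, but the span bound still forces T_P >= 10, so the speedup can never exceed the parallelism T_1/T_∞ = 100/10 = 10, no matter how many processors are added.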

B) Define Briefly Pipelining, Superscalar


- Pipelining: This is an implementation technique in which multiple instructions are
overlapped in execution. Instruction execution is divided into stages (for example fetch,
decode, execute, memory access and write-back), and different stages of the hardware work
on different instructions at the same time, much like an assembly line. This technique is
responsible for large increases in program execution speed.
- Superscalar: A superscalar architecture is one in which several instructions can be
initiated simultaneously and executed independently. This is a CPU that
implements a form of parallelism called instruction-level parallelism within a
single processor. In contrast to a scalar processor, which can execute at most one
single instruction per clock cycle, a superscalar processor can execute more than
one instruction during a clock cycle by simultaneously dispatching multiple
instructions to different execution units on the processor.
Question #5
Write in detail on any one of the following parallel approaches with an
example of code of your own choice.
a) MPI
b) Pthread
c) OpenMP
d) CUDA

MPI
MPI (Message Passing Interface) is a standardized communication protocol for parallel computing. It
allows processes to communicate with each other by sending and receiving messages. MPI is widely
used in high-performance computing (HPC) applications, such as scientific simulations, data analytics,
and machine learning.

MPI provides a set of functions for:

1. Point-to-point communication: sending and receiving messages between two processes.

2. Collective communication: broadcasting, gathering, and scattering data among multiple processes.

3. Group management: creating and managing groups of processes.

4. Communication modes: synchronous, asynchronous, and buffered communication.
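As an illustrative MPI program in C (a minimal sketch of point-to-point communication; the message contents are my own choice), every non-root process sends its rank to process 0, which receives and prints them. Such a program is typically compiled with mpicc and launched with mpirun or mpiexec:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;

    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id (rank) */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */

    if (rank == 0) {
        /* Root process: receive one integer from every other process. */
        for (int src = 1; src < size; src++) {
            int value;
            MPI_Recv(&value, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Process 0 received %d from process %d\n", value, src);
        }
    } else {
        /* Every other process sends its own rank to process 0. */
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}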

Pthread
PThread (POSIX Threads) is a standard for creating threads in a POSIX-compliant operating system. It
provides a set of APIs for creating, managing, and synchronizing threads.

PThread provides various functions for:

- Creating threads (pthread_create)

- Joining threads (pthread_join)

- Detaching threads (pthread_detach)

- Synchronizing threads (pthread_mutex_lock, pthread_mutex_unlock, etc.)


- Communicating between threads (pthread_cond_signal, pthread_cond_wait, etc.)
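A minimal Pthread sketch in C using the functions listed above (the worker function and counter names are my own, chosen for illustration): two threads each increment a shared counter, with a mutex protecting the updates:

#include <stdio.h>
#include <pthread.h>

#define INCREMENTS 1000000

static long counter = 0;                               /* data shared by both threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&lock);                     /* enter the critical section */
        counter++;
        pthread_mutex_unlock(&lock);                   /* leave the critical section */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, NULL);           /* start two worker threads */
    pthread_create(&t2, NULL, worker, NULL);

    pthread_join(t1, NULL);                            /* wait for both to finish */
    pthread_join(t2, NULL);

    printf("Final counter = %ld\n", counter);          /* expected: 2000000 */
    return 0;
}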

OpenMP
OpenMP (Open Multi-Processing) is a parallel programming model for multi-platform shared memory
multiprocessing programming. It consists of a set of compiler directives, library routines, and
environment variables that influence run-time behavior.

OpenMP is used for parallelizing loops, parallelizing sections of code, and parallelizing tasks. It provides a
simple and portable way to write parallel applications.
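As a minimal OpenMP sketch in C (the loop bound and the summed values are arbitrary, chosen only for illustration), a single directive parallelizes a loop and a reduction clause combines the per-thread partial sums. It is typically built with an OpenMP-enabled compiler flag such as -fopenmp:

#include <stdio.h>
#include <omp.h>

int main(void) {
    const int n = 1000000;
    double sum = 0.0;

    /* The iterations are divided among the available threads;
       each thread accumulates a private partial sum that is
       combined into 'sum' when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += (double)i;
    }

    printf("sum = %.0f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}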

CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model
developed by NVIDIA that allows developers to harness the power of GPU acceleration for general-
purpose computing.

CUDA enables developers to write programs that execute on the GPU, using a programming model that
is similar to C++. The GPU is treated as a coprocessor, and the developer can offload computationally
intensive tasks to the GPU, while the CPU handles other tasks.

Key features of CUDA:

1. Parallel programming model: CUDA allows developers to write parallel code using threads, blocks, and
grids.

2. GPU acceleration: CUDA programs execute on the GPU, leveraging its massive parallel processing
capabilities.

3. Memory hierarchy: CUDA provides a hierarchical memory architecture, including registers, shared
memory, and global memory.
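A minimal CUDA C sketch of these ideas (the vector size, names and launch configuration are my own choices for illustration; error checking is omitted for brevity): each GPU thread adds one element of two vectors, and the kernel is launched over a grid of thread blocks:

#include <stdio.h>
#include <cuda_runtime.h>

/* Kernel: each thread computes one element of c = a + b. */
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);    /* unified memory visible to CPU and GPU */
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);

    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    /* Launch enough 256-thread blocks to cover all n elements. */
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();         /* wait for the GPU to finish */

    printf("c[0] = %.1f, c[n-1] = %.1f\n", c[0], c[n - 1]);

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}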
