Parallel Merge Sort With MPI
Submitted by
Irsa Waheed
Bachelor's degree, 2020-2024
Table of Contents
1. Introduction
   1.1 Merge Sort
   1.2 Message Passing Interface (MPI)
2. Overview of MPI
   2.1 Goals of MPI
   2.2 MPI Offers
3. MPI and Merge Sort Relation
4. Implementation of Parallel Merge Sort using MPI
   4.1 Code
   4.2 Implementation Details
5. Performance Analysis
   5.1 Execution Time
   5.2 Speedup
   5.3 Efficiency
   5.4 Scalability
   5.5 Load Balancing
   5.6 Communication Overhead
   5.7 Optimization Potential
   5.8 System Characteristics
6. Conclusion
1. Introduction
Parallel merge sort is a parallelised version of the traditional merge sort algorithm designed to
distribute the workload across multiple processors using the Message Passing Interface (MPI).
MPI enables communication and coordination among processes running on different nodes,
allowing them to work together to sort a large dataset efficiently.
1.1 Merge Sort
Merge sort is a divide-and-conquer algorithm that efficiently sorts arrays. It repeatedly
divides the array into halves until each piece contains a single element, and is therefore
trivially sorted, then merges the pieces back together in sorted order. This structure suits
parallel processing well: independent subarrays can be sorted by different processes at the
same time, and only the merge steps require coordination.
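For example, sorting [38, 27, 43, 10]: the array is split into [38, 27] and [43, 10]; each
half is sorted to give [27, 38] and [10, 43]; and the two sorted halves are merged into
[10, 27, 38, 43].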
2. Overview of MPI
MPI is a standardised interface for the "message passing" model of parallel computing. It
enables multiple processes or nodes to exchange data and coordinate their tasks in
distributed computing environments. MPI provides a set of functions for point-to-point
communication and collective operations, allowing efficient coordination and synchronisation
among parallel processes.
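To make the model concrete, the following minimal example (illustrative only, not part of
the report's sorting code) shows point-to-point communication: rank 0 sends one integer to
rank 1, which receives and prints it. It must be launched with at least two processes, e.g.
mpirun -np 2 ./a.out.

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);                 // start the MPI runtime
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // which process am I?
    if (rank == 0) {
        int value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   // to rank 1, tag 0
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::cout << "Rank 1 received " << value << std::endl;
    }
    MPI_Finalize();
    return 0;
}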
2.1 Goals of MPI
▪ Provide source-code portability: MPI programs should compile and run as-is on any
platform.
▪ Allow efficient implementations across a range of architectures.
4. Implementation of Parallel Merge Sort using MPI
4.1 Code
#include <iostream>
#include <mpi.h>
// merge and mergeSort operate on int arrays; both are sketched after this listing.
void merge(const int *a, int n1, const int *b, int n2, int *out);
void mergeSort(int *arr, int n);

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, num_procs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);
    const int size = 16;  // total element count; assumed divisible by num_procs
    int chunk_size = size / num_procs;
    int *arr = nullptr;
    if (rank == 0) {  // the root process builds the unsorted input
        arr = new int[size];
        for (int i = 0; i < size; ++i) arr[i] = size - i;
    }
    int *local_arr = new int[size];  // sized to hold merged results as well
    MPI_Scatter(arr, chunk_size, MPI_INT, local_arr, chunk_size, MPI_INT, 0, MPI_COMM_WORLD);
    mergeSort(local_arr, chunk_size);  // sort this process's chunk locally
    // Recursive doubling merge (num_procs assumed to be a power of two)
    int local_size = chunk_size;
    for (int step = 1; step < num_procs; step *= 2) {
        if (rank % (2 * step) == 0) {
            // Receive the partner's sorted subarray and merge it with our own
            int *received_arr = new int[local_size];
            MPI_Recv(received_arr, local_size, MPI_INT, rank + step, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            int *merged_arr = new int[2 * local_size];
            merge(local_arr, local_size, received_arr, local_size, merged_arr);
            for (int i = 0; i < 2 * local_size; ++i) local_arr[i] = merged_arr[i];
            local_size *= 2;
            delete[] received_arr;
            delete[] merged_arr;
        } else {
            int partner = rank - step;
            // Send the local sorted subarray to the partner
            MPI_Send(local_arr, local_size, MPI_INT, partner, 0, MPI_COMM_WORLD);
            break;
        }
    }
    if (rank == 0) {  // rank 0 now holds the fully sorted array
        for (int i = 0; i < size; ++i) std::cout << local_arr[i] << " ";
        std::cout << std::endl;
        delete[] arr;
    }
    delete[] local_arr;
    MPI_Finalize();
    return 0; }
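The listing calls two helper routines, merge and mergeSort, whose definitions are not shown
above. A minimal sketch of both, operating on plain int arrays (the exact signatures are an
assumption made here, not taken from the original program):

// Merge the sorted ranges a[0..n1) and b[0..n2) into out[0..n1+n2).
void merge(const int *a, int n1, const int *b, int n2, int *out) {
    int i = 0, j = 0, k = 0;
    while (i < n1 && j < n2) out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < n1) out[k++] = a[i++];   // drain whichever input remains
    while (j < n2) out[k++] = b[j++];
}

// Classic top-down merge sort on arr[0..n).
void mergeSort(int *arr, int n) {
    if (n < 2) return;                  // a single element is already sorted
    int mid = n / 2;
    mergeSort(arr, mid);                // sort the left half
    mergeSort(arr + mid, n - mid);      // sort the right half
    int *tmp = new int[n];
    merge(arr, mid, arr + mid, n - mid, tmp);
    for (int i = 0; i < n; ++i) arr[i] = tmp[i];
    delete[] tmp;
}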
4.2 Implementation Details
• Local Sorting:
Each process applies a local sorting algorithm, in this case merge sort, to its
allocated portion of the array: the mergeSort function sorts the subarray assigned
to each process.
• Merge Phase:
The merge phase involves combining sorted subarrays from different processes
to create larger sorted arrays. This is accomplished using a recursive doubling
approach. Processes exchange their sorted subarrays with their partners using
MPI_Send and MPI_Recv. The merge function is then used to merge the
received subarray with the local sorted subarray.
• Final Merge:
After the last recursive-doubling step, the root process (rank 0) holds the fully
sorted array and displays it. Alternatively, the sorted chunks can be gathered back
to the root with MPI_Gather, and rank 0 can then merge the collected chunks into the
fully sorted array; a sketch of this variant follows the function summary below.
• MPI_Send:
Transmits data from one processor to another in MPI. In merge sort, it's used to
send portions of the array to other processors for sorting.
• MPI_Recv:
Receives data sent by MPI_Send. In merge sort, it's employed to gather sorted
subarrays from different processors.
• MPI_Gather:
Gathers data from multiple processors to a single processor. In merge sort, it
can collect sorted subarrays from all processors to perform the final merging.
• MPI_Scatter:
Divides the data among different processors. In merge sort, it distributes
portions of the array to different processors for parallel sorting.
• MPI_Barrier:
Synchronizes processors, ensuring they reach a particular point before moving
forward. In merge sort, it can ensure that all processors complete their sorting
before merging.
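To show MPI_Scatter, MPI_Gather, and MPI_Barrier working together, here is a sketch of the
gather-based final merge mentioned above. It reuses the merge and mergeSort helpers and the
rank, num_procs, size, chunk_size, local_arr, and arr variables from Section 4, again assumes
size is divisible by num_procs, and is an illustration rather than the report's verbatim code.

MPI_Scatter(arr, chunk_size, MPI_INT, local_arr, chunk_size, MPI_INT, 0, MPI_COMM_WORLD);
mergeSort(local_arr, chunk_size);        // each process sorts its own chunk
MPI_Barrier(MPI_COMM_WORLD);             // wait until every chunk is sorted
int *gathered = (rank == 0) ? new int[size] : nullptr;
MPI_Gather(local_arr, chunk_size, MPI_INT, gathered, chunk_size, MPI_INT, 0, MPI_COMM_WORLD);
if (rank == 0) {
    int *result = new int[size], *tmp = new int[size];
    int sorted_len = chunk_size;
    for (int k = 0; k < chunk_size; ++k) result[k] = gathered[k];
    for (int i = 1; i < num_procs; ++i) {    // fold in one sorted chunk at a time
        merge(result, sorted_len, gathered + i * chunk_size, chunk_size, tmp);
        sorted_len += chunk_size;
        for (int k = 0; k < sorted_len; ++k) result[k] = tmp[k];
    }
    // result[0..size) is now the fully sorted array
    delete[] tmp; delete[] gathered; delete[] result;
}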
5. Performance Analysis
5.1 Execution Time:
Measure the total execution time of the parallel merge sort algorithm using MPI. This
is the time taken from the start of the program to its completion. Compare this with the
execution time of the sequential merge sort on the same dataset.
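In practice, MPI's own clock, MPI_Wtime, is a convenient way to take this measurement. A
minimal sketch follows; parallel_merge_sort is a placeholder name standing for the whole
scatter/sort/merge phase of Section 4, and rank is assumed to be set as there.

MPI_Barrier(MPI_COMM_WORLD);        // line all processes up before timing
double t0 = MPI_Wtime();
parallel_merge_sort();              // placeholder: the sorting phase under test
MPI_Barrier(MPI_COMM_WORLD);        // wait for the slowest process to finish
double t1 = MPI_Wtime();
if (rank == 0)
    std::cout << "Parallel execution time: " << (t1 - t0) << " s" << std::endl;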
5.2 Speedup:
Speedup is a measure of how much faster the parallel algorithm is compared to the
sequential algorithm.
Formula: Speedup = Sequential Execution Time / Parallel Execution Time
A speedup greater than 1 indicates improvement, and the higher the speedup, the better.
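For example, if the sequential merge sort takes 8 seconds on a dataset and the MPI version
sorts the same dataset in 2 seconds, the speedup is 8 / 2 = 4.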
5.3 Efficiency:
Efficiency provides a normalized measure of how well the parallel algorithm utilizes
the available resources.
Formula: Efficiency = Speedup / Number of Processes
Efficiency close to 1 indicates good utilization of resources; values less than 1 suggest
overhead.
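Continuing the example above: a speedup of 4 achieved with 4 processes gives an efficiency
of 4 / 4 = 1.0, whereas the same speedup with 8 processes gives 4 / 8 = 0.5, meaning half of
the added computing capacity is absorbed by overhead.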
5.4 Scalability:
Scalability assesses how the parallel algorithm performs as the problem size or the
number of processors grows.
Ideally, adding processors for a fixed problem size should reduce the execution time
(strong scaling), and growing the problem size in proportion to the processor count
should keep the execution time roughly constant (weak scaling).
5.5 Load Balancing:
Assess the load balancing among MPI processes. Uneven distribution of workloads can
lead to idle processors, reducing overall efficiency.
Evaluate if the workload is evenly distributed among processes during the parallel
sorting and merging phases.
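One concrete source of imbalance is an array whose length is not divisible by the number of
processes. The sketch below shows how MPI_Scatterv can spread the remainder one element at a
time; it reuses the variables from Section 4, and the count and displacement arrays
introduced here are illustrative additions.

int base = size / num_procs, rem = size % num_procs;
int *counts = new int[num_procs];   // elements each rank receives
int *displs = new int[num_procs];   // offset of each rank's chunk in arr
for (int r = 0, offset = 0; r < num_procs; ++r) {
    counts[r] = base + (r < rem ? 1 : 0);   // first 'rem' ranks get one extra element
    displs[r] = offset;
    offset += counts[r];
}
int *local = new int[counts[rank]];
MPI_Scatterv(arr, counts, displs, MPI_INT, local, counts[rank], MPI_INT, 0, MPI_COMM_WORLD);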
5.6 Communication Overhead:
MPI communication introduces overhead. Evaluate the impact of the communication
patterns used here (scatter, gather, send, and receive) on overall performance; in the
recursive-doubling scheme of Section 4, each process takes part in at most log2(p)
merge rounds, but the messages double in size from round to round.
Minimize unnecessary communication and explore optimization strategies.
5.7 Optimization Potential:
Identify potential areas for optimization. This could include tuning parameters such as
chunk sizes, refining communication strategies, or exploring alternative parallelization
approaches.
5.8 System Characteristics:
Consider the specific characteristics of the computing environment, such as the
network topology, memory architecture, and processor speed.
Adapt the parallel algorithm to leverage these characteristics for improved
performance.
6. Conclusion
The MPI implementation of merge sort showcases the potential for efficient parallel
sorting by leveraging distributed computing. Despite communication overhead, it
demonstrates improved scalability and reduced sorting time with the effective
utilization of multiple processors. Proper load balancing and synchronization
mechanisms are crucial for maximizing the performance benefits of parallel merge sort
in MPI.