Scientific Writing Parallel Computing V2
Georgiana NEAȚĂ, Ionuț NICOLAE, Horațiu ȚIBREA, Raul VIDIS, Georgiana UDREA
Bucharest University of Economic Studies
In this paper we parallelize the Quicksort, Merge Sort and Bubble Sort algorithms using the
OpenMP multithreading platform. Merge sort, one of the representative sorting algorithms, is
widely used in database systems that require a stable sort. The proposed methods were
examined on two standard datasets with different numbers of threads. The elements of the
input datasets are distributed into temporary sub-arrays according to the number of
characters in each word. The experimental results show that the parallel Quicksort, Merge
Sort and Bubble Sort algorithms improve on their sequential counterparts in execution time,
speedup and efficiency. We ran the OpenMP implementations on an Intel Core i5-4210U at
1.7 GHz with 8.00 GB RAM and 4 CPUs. Finally, we observe the effect of the data structure on
the performance of the algorithm, which is why we chose the second approach.
Keywords: Bubble Sort, Merge Sort, OpenMP, sorting algorithms, parallel computing
Facultatea de Cibernetică, Statistică și Informatică Economică 2
1 Introduction
Sorting is one of the most common operations performed with a computer. Basically, it is a
permutation function that operates on elements. In computer science, a sorting algorithm is
an algorithm that arranges the elements of a list in a certain order. Sorting algorithms are
taught in fields such as Computer Science and Mathematics. Many sorting algorithms are used
in computer science, such as Bubble, Insertion, Selection and Quick sort. They differ in
functionality, performance, applications and resource usage. In this paper we focus on the
bubble sort, merge sort and quick sort algorithms.
Bubble sort is one of the oldest, simplest and slowest sorting algorithms in use, with a
complexity of O(n^2). It works by comparing each item in the list with the item next to it
and swapping them if required. The algorithm repeats this process until it makes a pass all
the way through the list without swapping any items, which means that all the items are in
the correct order. In this way the larger values move to the end of the list while the
smaller values remain towards the beginning. The same procedure can also sort the array so
that the larger values come before the smaller values. The name bubble sort comes from a
natural water phenomenon: the larger items sink to the end of the list whereas the smaller
values “bubble” up to the top of the data set. Bubble sort is simple to program, but it is
worse than selection sort for a jumbled array: it requires many more element exchanges and is
only suitable for an array that is already nearly ordered. More importantly, bubble sort is
usually the easiest algorithm to write correctly.
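The pass-and-swap procedure described above can be sketched in C++ as follows. This is a minimal illustration rather than the code used in the experiments; the function name `bubble_sort` is an assumption introduced here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `items` in ascending order with bubble sort (illustrative sketch).
// The loop stops as soon as a full pass makes no swap, which is why the
// algorithm is acceptable on input that is already nearly ordered.
template <typename T>
void bubble_sort(std::vector<T>& items) {
    if (items.size() < 2) return;
    bool swapped = true;
    // After each pass the largest remaining value has sunk to the end,
    // so the unsorted prefix shrinks by one element.
    for (std::size_t end = items.size() - 1; swapped && end > 0; --end) {
        swapped = false;
        for (std::size_t i = 0; i < end; ++i) {
            if (items[i + 1] < items[i]) {
                std::swap(items[i], items[i + 1]);
                swapped = true;
            }
        }
    }
}
```

Reversing the comparison to `items[i] < items[i + 1]` gives the descending variant mentioned in the text.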
For a database system that processes large amounts of information, many fast and accurate
search algorithms are available. What these algorithms have in common is that the data must
first be arranged in the order the search requires, which means that fast and accurate
searching depends on fast and accurate sorting. When a huge amount of information is updated
every day, the data must be sorted more frequently, and the amount of data sorted by a single
process also tends to grow. As a result, the time and effort a database system spends on
sorting keep increasing, and a method is needed to reduce sorting time. A database system
that processes huge amounts of information needs to become more efficient, but because
information must be provided continuously, the method should raise efficiency without
altering the system greatly. A regular comparison-based algorithm, however, cannot deliver
more than a limited gain in efficiency. Parallel methods process data simultaneously in the
hope of raising efficiency, but they must arrange load balancing or speed up inter-processor
communication, which makes a parallel method difficult to build and causes great changes to
the system. To address these weaknesses, this paper uses the merge sort algorithm, whose
stability is one reason it is widely used in database systems, and OpenMP as the parallel
method. OpenMP does not alter the system greatly, because parallel directives are inserted
into the existing code, and it is supported by most compilers, so it can be used on most
systems. Parallel merge sort using OpenMP can therefore address the problems discussed above.
Compared with other methods it is easy to implement, so higher efficiency can be expected
without much effort.
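To illustrate how directives can be inserted into an existing merge sort without restructuring it, the sketch below adds OpenMP task directives to a plain recursive merge sort. It is one possible formulation under stated assumptions, not the implementation measured in this paper: the names `merge_sort_range` and `parallel_merge_sort` and the cutoff `kTaskCutoff` are hypothetical. Without OpenMP support enabled, the pragmas are ignored and the code runs sequentially.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical cutoff: smaller ranges are sorted without creating tasks.
constexpr std::size_t kTaskCutoff = 1 << 12;

// Recursively sorts data[lo, hi). The two halves are independent, so each
// may run as an OpenMP task; std::inplace_merge keeps the sort stable.
template <typename T>
void merge_sort_range(std::vector<T>& data, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return;
    const std::size_t mid = lo + (hi - lo) / 2;
    if (hi - lo < kTaskCutoff) {
        merge_sort_range(data, lo, mid);
        merge_sort_range(data, mid, hi);
    } else {
        // shared(data) avoids copying the vector into each task.
        #pragma omp task shared(data)
        merge_sort_range(data, lo, mid);
        #pragma omp task shared(data)
        merge_sort_range(data, mid, hi);
        #pragma omp taskwait
    }
    std::inplace_merge(data.begin() + lo, data.begin() + mid, data.begin() + hi);
}

// Entry point: one thread starts the recursion, the team executes the tasks.
template <typename T>
void parallel_merge_sort(std::vector<T>& data) {
    #pragma omp parallel
    #pragma omp single
    merge_sort_range(data, 0, data.size());
}
```

The sequential logic is unchanged; only the three task-related pragmas and the two-line parallel region were added, which is the property of OpenMP the paragraph above relies on.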
Hoare’s Quicksort algorithm is one of the most intensively studied algorithms in computer
science. It uses the “divide and conquer” strategy, reducing the sorting problem to several
easier sorting problems and solving each of them. Because of its good performance in
practice, Quicksort is considered one of the most popular sorting algorithms. One value,
normally called the pivot, is selected from the input data. The pivot is used to partition
the input dataset into two subsets: one contains the values smaller than the pivot and the
other the values larger than the pivot. At every step these subsets are subdivided again by
selecting a pivot from each of them. The recursion stops when no further subdivision is
possible.
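The pivot-and-partition recursion can be sketched with OpenMP tasks as shown below. This is a minimal illustration under assumptions, not the code evaluated in this paper: the middle element is chosen as the pivot, values equal to the pivot are grouped in the middle so that each recursive call works on a strictly smaller range, and the names `quicksort_range`, `parallel_quicksort` and `kQuickCutoff` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical cutoff below which recursion stays on the current thread.
constexpr std::ptrdiff_t kQuickCutoff = 1 << 12;

// Sorts the random-access range [first, last).
template <typename It>
void quicksort_range(It first, It last) {
    if (last - first < 2) return;
    using Value = typename std::iterator_traits<It>::value_type;
    const Value pivot = *(first + (last - first) / 2);  // assumed pivot choice
    // [first, mid1) < pivot, [mid1, mid2) == pivot, [mid2, last) > pivot
    It mid1 = std::partition(first, last,
                             [&pivot](const Value& x) { return x < pivot; });
    It mid2 = std::partition(mid1, last,
                             [&pivot](const Value& x) { return !(pivot < x); });
    if (last - first < kQuickCutoff) {
        quicksort_range(first, mid1);
        quicksort_range(mid2, last);
    } else {
        // The two outer subsets are independent and may be sorted concurrently.
        #pragma omp task
        quicksort_range(first, mid1);
        #pragma omp task
        quicksort_range(mid2, last);
        #pragma omp taskwait
    }
}

template <typename T>
void parallel_quicksort(std::vector<T>& data) {
    #pragma omp parallel
    #pragma omp single
    quicksort_range(data.begin(), data.end());
}
```

The cutoff keeps very small subsets from being turned into tasks, since creating a task costs more than sorting a handful of elements directly.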
Fig. 1. The mechanism of algorithms
configuration properties for the C/C++ language; after that, we set the “OpenMP Support”
field to “Yes”.
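Whether OpenMP support is actually enabled in a build can be checked with the standard `_OPENMP` macro, which a compiler defines only when OpenMP is switched on (for example with `/openmp` in Microsoft Visual C++ or `-fopenmp` in GCC and Clang). The helper name `openmp_status` below is an assumption made for this sketch.

```cpp
#include <string>

// Reports whether this translation unit was compiled with OpenMP enabled.
// _OPENMP expands to the date (yyyymm) of the supported OpenMP specification.
std::string openmp_status() {
#ifdef _OPENMP
    return "OpenMP enabled, specification date " + std::to_string(_OPENMP);
#else
    return "OpenMP disabled: pragmas are ignored and the code runs sequentially";
#endif
}
```

Printing this string at program start is a quick way to confirm that timing results really come from a parallel build.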
2 Methods
We sort the datasets using the bubble sort algorithm in three phases. In the first phase, we
remove (ignore) the special characters in the text file. In the second phase, we convert the
text file into an array of lists (vectors of strings) based on word length, so that all
shorter words come before longer words. In the third phase, we sort each vector of strings
into alphabetical order using the bubble sort algorithm. Table 1 shows the time of the
pre-processing phase.
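The three phases can be sketched as follows. This is an illustration under assumptions, not the code used for the measurements: words are taken to be separated by whitespace, every non-alphanumeric character counts as a special character, alphabetical order is approximated by the byte-wise comparison of `std::string`, and the names `bucket_by_length` and `sort_buckets` are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Phases 1 and 2: drop special characters, then group the words by length.
// std::map keeps the groups ordered from the shortest to the longest words.
std::vector<std::vector<std::string>> bucket_by_length(const std::string& text) {
    std::map<std::size_t, std::vector<std::string>> groups;
    std::string word;
    auto flush = [&] {
        if (!word.empty()) groups[word.size()].push_back(word);
        word.clear();
    };
    for (unsigned char c : text) {
        if (std::isalnum(c)) word.push_back(static_cast<char>(c));
        else if (std::isspace(c)) flush();
        // any other character is treated as special and ignored
    }
    flush();
    std::vector<std::vector<std::string>> buckets;
    for (auto& entry : groups) buckets.push_back(std::move(entry.second));
    return buckets;
}

// Phase 3: bubble sort every bucket. The buckets are independent, so
// OpenMP may hand different buckets to different threads.
void sort_buckets(std::vector<std::vector<std::string>>& buckets) {
    const int count = static_cast<int>(buckets.size());
    #pragma omp parallel for schedule(dynamic)
    for (int b = 0; b < count; ++b) {
        auto& words = buckets[static_cast<std::size_t>(b)];
        bool swapped = true;
        for (std::size_t end = words.size(); swapped && end > 1; --end) {
            swapped = false;
            for (std::size_t i = 0; i + 1 < end; ++i) {
                if (words[i + 1] < words[i]) {
                    std::swap(words[i], words[i + 1]);
                    swapped = true;
                }
            }
        }
    }
}
```

Because the buckets are already ordered by word length, concatenating the sorted buckets yields the final order. The dynamic schedule is a design choice for this sketch: bucket sizes can differ widely, so threads pick up the next bucket as they become free.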
parallel processing, in (2) the first two pieces of data are sorted and then the remaining
two are sorted, after which processing moves on to (3). Two cores are used in (2), and they
are used twice. Steps (3) and (4) use two cores and one core respectively to reduce the time.
Fig. 6 shows a 4-way merge sort when the data is divided into sixteen parts. The merge sort
proceeds in numerical order, and the dark boxes show how the units are grouped when four
cores are used. With four cores, units 1-4, 5-8, 9-12 and 13-16 are each processed together,
followed by units 17-20. The result is then obtained in the last step with one core. If only
two cores are used, the units are processed in numerical order two at a time.
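The sixteen-division, 4-way schedule described above can be sketched as follows. This is only an approximation of the scheme in the figure: the k-way merge routine itself is not shown in the text, so each 4-way merge is assumed here to be built from three binary merges, and the function name `four_way_merge_sort` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Units 1-16 sort one division each, units 17-20 each merge four
// neighbouring divisions, and a final unit merges the four remaining runs.
void four_way_merge_sort(std::vector<int>& data) {
    const int parts = 16;
    const std::size_t n = data.size();
    // bound[p] is the start of division p; bound[parts] == n.
    std::vector<std::size_t> bound(parts + 1);
    for (int p = 0; p <= parts; ++p) bound[p] = n * p / parts;

    // Units 1-16: independent sorts shared among the available cores.
    #pragma omp parallel for
    for (int p = 0; p < parts; ++p)
        std::stable_sort(data.begin() + bound[p], data.begin() + bound[p + 1]);

    // Merges the four consecutive runs starting at division p, each `width`
    // divisions wide, using binary merges internally (an assumption).
    auto merge4 = [&](int p, int width) {
        auto at = [&](int q) { return data.begin() + bound[q]; };
        std::inplace_merge(at(p), at(p + width), at(p + 2 * width));
        std::inplace_merge(at(p + 2 * width), at(p + 3 * width), at(p + 4 * width));
        std::inplace_merge(at(p), at(p + 2 * width), at(p + 4 * width));
    };

    // Units 17-20: four independent 4-way merges.
    #pragma omp parallel for
    for (int g = 0; g < 4; ++g) merge4(4 * g, 1);

    // Last step: one 4-way merge of the remaining runs on a single core.
    merge4(0, 4);
}
```

With four cores, each parallel loop keeps all cores busy because the number of independent units (sixteen, then four) is a multiple of the core count; only the last merge is sequential.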
3 Results
Table 1. Each dataset was tested 10 times to obtain an average, based on the first
approach.
Threads    Dataset 1    Dataset 2
1          6.695        188.18
2          5.103        132.66
4          4.572        84.271
6          3.751        65.846
8          3.167        51.046
10         3.826        51.046
16         4.858        52.991
Threads    Dataset 1    Dataset 2
1          0.041054     2.401157
2          0.041054     2.401157
4          0.015838     1.40058
Table 4. Quick Sort - OpenMP parallel time for different data sizes and different
numbers of threads
k-way    Data divisions    Single core    Dual core
2-way    1                 0.3362         0.3376
2-way    2                 0.3405         0.2076
2-way    4                 0.3345         0.1749
2-way    8                 0.3396         0.1753
4-way    1                 0.3857         0.3849
4-way    4                 0.3788         0.2042
4-way    16                0.3614         0.1971
8-way    1                 0.3936         0.3915
8-way    8                 0.3669         0.2035
4 Analysis
Bubble Sort
According to the previous results, OpenMP gave the best speedup with 8 threads. That is, the
best speedup occurs when the number of threads equals the actual number of cores (see
Table 5). In other words, increasing the number of threads beyond the actual number of cores
brings no advantage and instead hurts the speedup, because more threads mean more work to
divide the tasks and to create and destroy threads. This conclusion is confirmed by the
steady decline in efficiency, which results in a large amount of idle time.
Quick Sort
With Dataset 1, the parallel algorithm shows a higher speedup than the sequential algorithm
for the different numbers of threads. The efficiency obtained with 4 threads on 4 cores in
the parallel method is close to the optimal result for that number of cores. It can be seen
that the efficiency increases as the number of cores increases. With Dataset 2, the parallel
program likewise shows a higher speedup than the sequential program for the different
numbers of threads.
Merge Sort
The clearest result appears when the number of cores is increased so that more data is
processed simultaneously. With two cores, 2-way merge sort with the data divided into four
showed an improvement of more than 1.9x. With four cores it showed a 2.8x improvement. Twice
as many cores were used, but the improvement over two cores was only about 1.5x. This is
because the parallel region is limited, so all four cores cannot take part in the processing
at once. In 8-way merge sort, going from a single core to a dual core showed a 1.8x
improvement and from single to quad a 2.9x improvement, similar to 2-way. In 4-way merge
sort, with the data split into four and into sixteen, going from single to quad showed 2.8x
and 2.67x respectively. This is because, as the amount of data handled by a single core grew,
some of it was sorted early by the operating system, so the region available to OpenMP was
smaller. The specific numbers differ, but the amount of performance improvement is similar.
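The improvement factors quoted above follow the usual definitions of speedup (sequential time divided by parallel time) and efficiency (speedup divided by the number of cores). The helper names `speedup` and `efficiency` below are assumptions introduced for this sketch.

```cpp
#include <cassert>

// Speedup: time of the sequential run divided by time of the parallel run.
double speedup(double t_serial, double t_parallel) {
    assert(t_parallel > 0.0);
    return t_serial / t_parallel;
}

// Efficiency: speedup per core; 1.0 would mean perfect use of every core.
double efficiency(double t_serial, double t_parallel, int cores) {
    assert(cores > 0);
    return speedup(t_serial, t_parallel) / cores;
}
```

For example, taking the timings 0.3345 and 0.1749 reported for 2-way merge sort with four data divisions, and assuming they are the single-core and dual-core times, `speedup(0.3345, 0.1749)` is about 1.91 and `efficiency(0.3345, 0.1749, 2)` is about 0.96, which matches the improvement of more than 1.9x stated above.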
5 Conclusion
In this paper we implemented the bubble sort, quick sort and merge sort algorithms using
multithreading (OpenMP). The proposed work was tested on two standard datasets (text files)
of different sizes taken from http://www.booksshouldbefree.com/. We ran the OpenMP
implementations on an Intel Core i5-4210U at 1.7 GHz with 8.00 GB RAM and 4 CPUs.
Finally, we observe the effect of the data structure on performance, which is clear in
sequential code 1 and 2. In OpenMP, increasing the number of threads beyond the actual
number of cores brings no benefit and only affects the speedup. As future work, we will
implement bubble sort using the Message Passing Interface (MPI) and compare the results with
the OpenMP approach.
The proposed parallel Quicksort algorithm was examined on two standard datasets with
different numbers of threads. The experimental results of this study, explained in the
previous sections, show that the parallel Quicksort algorithm improves on the sequential
Quicksort algorithm in execution time, speedup and efficiency.
With both the File1 and File2 datasets, the parallel algorithm shows a higher speedup than
the sequential algorithm for the different numbers of threads. The efficiency obtained with
4 threads on 4 cores in the parallel method is close to the optimal result for that number
of cores. This leads to the conclusion that the efficiency increases as the number of cores
increases. In the future we plan to implement the Quicksort algorithm using the Message
Passing Interface (MPI) and compare its results with the OpenMP method.
To improve the performance of a database system, the performance of its sorting algorithms
must increase. Parallel processing is one of the techniques that improve performance.
However, it is difficult to implement, unexpected errors can occur, and it can alter the
system greatly. We therefore parallelized the merge sort algorithm using OpenMP, which is
effective and easy to use. Looking at the results for the same number of cores, there is no
clear relationship between k and merge sort performance. Increasing k reduces the number of
merge passes, but a computer cannot handle all k inputs at once and internally uses a binary
comparator, so the total number of comparisons is the same. The work should be distributed
across the parallel merge sort using OpenMP, and the number of parallel regions should be at
least the number of cores, so that the time during which a core is not used is reduced. If
the number of parallel regions is less than the number of cores, performance hardly improves
because the resources cannot be fully used. This can be shown by using a single core with
k-way merge sort and a dual or quad core with 2-way merge sort. For greater effectiveness,
the number of regions should be a multiple of the number of cores so that, when earlier
steps are done, fewer cores are left waiting, which results in greater use of all cores and
better performance in parallel processing. Running the merge sort algorithm implemented
according to these conclusions showed a 1.8x improvement on a dual core and a 2.8x
improvement on a quad core. Finally, without changing the system greatly, the data was
divided so that the region where OpenMP applies could be set and parallelized, and a great
improvement was obtained.
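The recommendation that the number of parallel regions be a multiple of the number of cores can be expressed as a small helper. This is a hypothetical illustration; the name `chunks_for` and the rounding-up policy are assumptions, not part of the measured implementation.

```cpp
#include <cstddef>

// Rounds the wanted number of data divisions up to the next multiple of
// `cores`, so that no core is left idle in any round of parallel work.
std::size_t chunks_for(std::size_t wanted, std::size_t cores) {
    if (cores == 0) return wanted;  // nothing sensible to round to
    return (wanted + cores - 1) / cores * cores;
}
```

For instance, asking for 5 divisions on 4 cores yields 8, while 16 divisions on 4 cores are left unchanged.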
6 References
[1] Grama, Ananth. Introduction to parallel computing. Pearson Education, (2003).
[2] Neininger, Ralph. "Refined quicksort asymptotics." Random Structures & Algorithms
46.2, 346-361, (2015)
[3] Aumüller, Martin, and Martin Dietzfelbinger. "Optimal Partitioning for Dual-Pivot
Quicksort." ACM Transactions on Algorithms (TALG) 12.2, 18 (2015)
[4] Jeon, Minsoo, and Dongseung Kim. "Parallel Merge Sort with Load Balancing."
International Journal of Parallel Programming 31.1, (2003)
[5] Chandra, Rohit, Leonardo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, and Ramesh
Menon. Parallel Programming in OpenMP, (2001)
[6] Chapman, Barbara, Gabriele Jost, and Ruud van der Pas. Using OpenMP, (2007)
[7] Altukhaim, S. Bubble Sort Algorithm. Florida Institute of Technology, (2003)
[8] Cătălin B. Parallel Processing - CPP and OMP lectures. Bucharest University of
Economic Studies, (2020)
[9] Cătălin B. Parallel Processing - Algorithms for sorting, searching and from other
fields that can be parallelized lectures. Bucharest University of Economic Studies, (2020)