Lower bound for comparison based sorting algorithms
Last Updated: 19 Jun, 2023
The sorting problem can be stated as follows.
Input: A sequence of n numbers <a1, a2, ..., an>.
Output: A permutation (reordering) <a'1, a'2, ..., a'n> of the input sequence such that a'1 <= a'2 <= ... <= a'n.
A sorting algorithm is comparison-based if the only way it learns the order of the input is by comparing pairs of elements. Comparison sorts can be viewed abstractly in terms of decision trees. A decision tree is a full binary tree that represents the comparisons performed by a particular sorting algorithm on an input of a given size.
Running the sorting algorithm corresponds to tracing a path from the root of the decision tree to a leaf. At each internal node, a comparison ai <= aj is made. The left subtree dictates the subsequent comparisons when ai <= aj, and the right subtree dictates the subsequent comparisons when ai > aj. When the path reaches a leaf, the algorithm has established the ordering of the input. We can therefore state two facts about the decision tree.
1) Each of the n! permutations of the n elements must appear as at least one leaf of the decision tree; otherwise the algorithm could not sort that input ordering correctly.
2) Let x be the maximum number of comparisons the algorithm makes on any input of size n. Then the height of the decision tree is x, and a binary tree of height x has at most 2^x leaves.
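The two facts above can be checked empirically for small n. The following is a minimal sketch, not part of the original argument: it instruments insertion sort (chosen only because it is short; the helper names `insertion_sort_trace` and `decision_tree_stats` are hypothetical), records the outcome of every comparison as a path through the decision tree, and runs it on all n! permutations of n distinct values.

```python
from itertools import permutations
from math import ceil, factorial, log2


def insertion_sort_trace(values):
    """Sort a copy of `values`; return the sorted list and the comparison outcomes."""
    a = list(values)
    outcomes = []  # one boolean per comparison: the path taken in the decision tree
    for i in range(1, len(a)):
        j = i
        while j > 0:
            in_order = a[j - 1] <= a[j]
            outcomes.append(in_order)
            if in_order:
                break
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a, tuple(outcomes)


def decision_tree_stats(n):
    """Return (number of distinct leaves, height) of the tree for inputs of size n."""
    paths = set()
    for perm in permutations(range(n)):
        result, path = insertion_sort_trace(perm)
        assert result == sorted(perm)
        paths.add(path)
    return len(paths), max(len(p) for p in paths)


# Columns: n, leaves reached, n!, tree height, ceil(log2(n!))
for n in range(2, 6):
    leaves, height = decision_tree_stats(n)
    print(n, leaves, factorial(n), height, ceil(log2(factorial(n))))
```

Under these assumptions, every permutation should reach its own leaf, so the leaf count equals n!, and the height (the maximum number of comparisons) should never fall below ceil(log2(n!)).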
Combining these two facts gives the following relation.
n! <= 2^x
Taking the base-2 logarithm of both sides:
log2(n!) <= x
Since log2(n!) = Θ(n log n), we can say
x = Ω(n log n)
Therefore, any comparison-based sorting algorithm must make Ω(n log n) comparisons in the worst case to sort an input of n elements. Heapsort and merge sort, whose worst-case running time is O(n log n), are therefore asymptotically optimal comparison sorts.
Example: merge sort, an asymptotically optimal comparison sort, implemented in several languages.
C++
#include <iostream>

// Sorts arr[left..right] (inclusive bounds) in ascending order.
void mergeSort(int* arr, int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;  // avoids overflow of left + right
        mergeSort(arr, left, mid);
        mergeSort(arr, mid + 1, right);

        // Merge the two sorted halves into a temporary buffer.
        int i = left, j = mid + 1, k = 0;
        int* temp = new int[right - left + 1];
        while (i <= mid && j <= right) {
            if (arr[i] < arr[j]) {
                temp[k] = arr[i];
                i++;
            } else {
                temp[k] = arr[j];
                j++;
            }
            k++;
        }
        // Copy whatever remains in the left half.
        while (i <= mid) {
            temp[k] = arr[i];
            i++;
            k++;
        }
        // Copy whatever remains in the right half.
        while (j <= right) {
            temp[k] = arr[j];
            j++;
            k++;
        }
        // Write the merged range back into the original array.
        for (i = left; i <= right; i++) {
            arr[i] = temp[i - left];
        }
        delete[] temp;
    }
}

int main() {
    int arr[] = {5, 2, 4, 6, 1, 3};
    mergeSort(arr, 0, 5);
    for (int i = 0; i < 6; i++) {
        std::cout << arr[i] << " ";
    }
    std::cout << std::endl;
    return 0;
}
Java
import java.util.Arrays;

public class MergeSort {
    // Sorts arr in place in ascending order.
    public static void mergeSort(int[] arr) {
        if (arr.length > 1) {
            int mid = arr.length / 2;
            int[] left = Arrays.copyOfRange(arr, 0, mid);
            int[] right = Arrays.copyOfRange(arr, mid, arr.length);
            mergeSort(left);
            mergeSort(right);

            // Merge the two sorted halves back into arr.
            int i = 0, j = 0, k = 0;
            while (i < left.length && j < right.length) {
                if (left[i] < right[j]) {
                    arr[k] = left[i];
                    i++;
                } else {
                    arr[k] = right[j];
                    j++;
                }
                k++;
            }
            // Copy whatever remains in the left half.
            while (i < left.length) {
                arr[k] = left[i];
                i++;
                k++;
            }
            // Copy whatever remains in the right half.
            while (j < right.length) {
                arr[k] = right[j];
                j++;
                k++;
            }
        }
    }

    public static void main(String[] args) {
        int[] arr = {5, 2, 4, 6, 1, 3};
        mergeSort(arr);
        System.out.println(Arrays.toString(arr));
    }
}
Python3
def merge_sort(arr):
    """Sort arr in place in ascending order."""
    if len(arr) > 1:
        mid = len(arr) // 2
        left = arr[:mid]
        right = arr[mid:]
        merge_sort(left)
        merge_sort(right)

        # Merge the two sorted halves back into arr.
        i = j = k = 0
        while i < len(left) and j < len(right):
            if left[i] < right[j]:
                arr[k] = left[i]
                i += 1
            else:
                arr[k] = right[j]
                j += 1
            k += 1
        # Copy whatever remains in the left half.
        while i < len(left):
            arr[k] = left[i]
            i += 1
            k += 1
        # Copy whatever remains in the right half.
        while j < len(right):
            arr[k] = right[j]
            j += 1
            k += 1


arr = [5, 2, 4, 6, 1, 3]
merge_sort(arr)
print(arr)
C#
using System;

public class MergeSort {
    // Sorts arr[left..right] (inclusive bounds) in ascending order.
    public static void mergeSort(int[] arr, int left, int right) {
        if (left < right) {
            int mid = left + (right - left) / 2;  // avoids overflow of left + right
            mergeSort(arr, left, mid);
            mergeSort(arr, mid + 1, right);

            // Merge the two sorted halves into a temporary buffer.
            int i = left, j = mid + 1, k = 0;
            int[] temp = new int[right - left + 1];
            while (i <= mid && j <= right) {
                if (arr[i] < arr[j]) {
                    temp[k] = arr[i];
                    i++;
                } else {
                    temp[k] = arr[j];
                    j++;
                }
                k++;
            }
            // Copy whatever remains in the left half.
            while (i <= mid) {
                temp[k] = arr[i];
                i++;
                k++;
            }
            // Copy whatever remains in the right half.
            while (j <= right) {
                temp[k] = arr[j];
                j++;
                k++;
            }
            // Write the merged range back; temp is reclaimed by the garbage collector.
            for (i = left; i <= right; i++) {
                arr[i] = temp[i - left];
            }
        }
    }

    public static void Main() {
        int[] arr = {5, 2, 4, 6, 1, 3};
        mergeSort(arr, 0, 5);
        Console.WriteLine(string.Join(" ", arr));
    }
}
Javascript
// Sorts arr in place in ascending order.
function mergeSort(arr) {
    if (arr.length > 1) {
        let mid = Math.floor(arr.length / 2);
        let left = arr.slice(0, mid);
        let right = arr.slice(mid, arr.length);
        mergeSort(left);
        mergeSort(right);

        // Merge the two sorted halves back into arr.
        let i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                arr[k] = left[i];
                i++;
            } else {
                arr[k] = right[j];
                j++;
            }
            k++;
        }
        // Copy whatever remains in the left half.
        while (i < left.length) {
            arr[k] = left[i];
            i++;
            k++;
        }
        // Copy whatever remains in the right half.
        while (j < right.length) {
            arr[k] = right[j];
            j++;
            k++;
        }
    }
}

let arr = [5, 2, 4, 6, 1, 3];
mergeSort(arr);
console.log(arr);
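Output (as printed by the Java and Python versions; the other versions print the same values with different delimiters)
[1, 2, 3, 4, 5, 6]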
Time Complexity: O(n log n) in the best, average, and worst cases.
Auxiliary Space: O(n) for the temporary arrays used while merging, plus O(log n) for the recursion stack.
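To connect the merge sort example back to the lower bound, the following sketch counts the element comparisons merge sort performs. It is an illustration under stated assumptions, not a benchmark: `merge_sort_count` and `worst_case_comparisons` are hypothetical helper names, the function returns a new list instead of sorting in place, and the worst case is found by brute force over all permutations, which is only feasible for small n.

```python
from itertools import permutations
from math import ceil, factorial, log2


def merge_sort_count(arr):
    """Return (sorted copy of arr, number of element comparisons performed)."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, c_left = merge_sort_count(arr[:mid])
    right, c_right = merge_sort_count(arr[mid:])
    merged, comparisons = [], c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1  # exactly one comparison per loop iteration
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # leftovers need no further comparisons
    merged.extend(right[j:])
    return merged, comparisons


def worst_case_comparisons(n):
    """Maximum comparison count over all n! orderings of n distinct values."""
    return max(merge_sort_count(p)[1] for p in permutations(range(n)))


# Columns: n, lower bound ceil(log2(n!)), merge sort's worst-case comparisons
for n in range(1, 8):
    print(n, ceil(log2(factorial(n))), worst_case_comparisons(n))
```

The worst-case count in the last column should never be smaller than the lower bound in the middle column, which is what the decision tree argument predicts for any comparison sort.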
References:
Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein
It is not possible to sort an array in better than O(n log n) time in the worst case when we restrict ourselves to sorting algorithms that are based on comparing array elements.
The lower bound on the time complexity can be proved by viewing sorting as a process in which each comparison of two elements gives more information about the order of the array's contents.
Consider a decision tree for this process. Each node asks a question of the form "x < y?", which is a comparison between the elements x and y.
If x < y, the process continues to the left child; otherwise it continues to the right child.
The leaves of the tree are the possible orderings of the array, n! in total.
A binary tree with n! leaves has height at least log2(n!), so the height of the tree must be at least
log2(n!) = log2(1) + log2(2) + ··· + log2(n)
We get a lower bound for this sum by keeping only the last n/2 terms and replacing each of them with log2(n/2), which is no larger than any of them:
log2(n!) ≥ (n/2)·log2(n/2),
so the height of the tree, and with it the minimum possible number of comparisons a sorting algorithm makes in the worst case, is Ω(n log n).
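The inequality can also be checked numerically. The sketch below is only a sanity check under the definitions used above (the helper name `bounds` is hypothetical): for several values of n it prints the lower estimate (n/2)·log2(n/2), the exact value log2(n!), and the simple upper estimate n·log2(n), which follows because each of the n terms in the sum is at most log2(n).

```python
from math import factorial, log2


def bounds(n):
    """Return ((n/2)*log2(n/2), log2(n!), n*log2(n)) for n >= 2."""
    return (n / 2) * log2(n / 2), log2(factorial(n)), n * log2(n)


for n in (2, 4, 8, 16, 64, 256, 1024):
    lower, exact, upper = bounds(n)
    print(f"n={n:5d}  (n/2)log2(n/2)={lower:9.1f}  "
          f"log2(n!)={exact:9.1f}  n*log2(n)={upper:9.1f}")
```

Since log2(n!) is sandwiched between two quantities that both grow proportionally to n log n, it is Θ(n log n), which matches the bound used in the decision tree argument.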