Understanding Algorithms and Analysis
1.1 Algorithm
An algorithm is a finite sequence of steps for solving a problem by performing calculation, data processing, or automated reasoning tasks. An algorithm must be expressible within a finite amount of time and space.
An algorithm is a way to represent the solution of a particular problem simply and efficiently. Once we have an algorithm for a problem, we can implement it in any programming language; in other words, the algorithm is independent of any programming language.
An algorithm is like a procedure that accomplishes something, such as preparing a dish or carrying out a physics experiment. Each procedure takes some input and produces output.
Algorithms can be expressed as natural languages, programming languages, pseudocode,
flowcharts and control tables. Natural language expressions are rare, as they are more
ambiguous. Programming languages are normally used for expressing algorithms executed by
a computer.
Algorithm vs. Program:
- An algorithm is related to designing the solution of a problem; a program is related to implementing the solution of a problem.
- An algorithm is written by someone with domain knowledge; a program is written by a programmer, who also needs some domain knowledge.
- An algorithm can be written in any language (an English-like language or mathematical notation) as long as it is understandable by a programmer; a program is written in a programming language such as C, C++, or Python.
- An algorithm is independent of hardware and software; a program depends on hardware and software (for example, we need to choose either Linux or Windows).
- An algorithm is analyzed once it is written; a program is tested once it is implemented.
1.3 Characteristics of Algorithms
The main characteristics of algorithms are as follows:
Input – an algorithm takes zero or more inputs.
Output – it must produce at least one output; otherwise it has no meaning.
Definiteness (clear/precise) – each statement must be unambiguous, with exactly one clear meaning. For example, a step such as "find the square root of a negative number" (over the reals) is not definite, because it cannot be carried out.
Finiteness – it must terminate at some point. It cannot run continuously until we stop it, the way a web server does.
Effectiveness – we cannot write statements unnecessarily; every statement must do something and have some effect. Each operation must be simple and feasible enough that one could trace it out using paper and pencil. While preparing a dish, we do not do things that are not part of the preparation (like cutting vegetables that are not used in the dish).
Language independence – an algorithm must be independent of any particular language, and it must be possible to implement it in any language.
Algorithm analysis is an important part of computational complexity theory, which provides a theoretical estimate of the resources an algorithm requires to solve a specific computational problem. Analysis of algorithms determines the amount of time and space resources required to execute it.
In software development, before implementing the software, we first need to prepare the design; without a design, it is not possible to build the software. Likewise, when we construct a house, we first prepare the drawing and then construct it. It is not done by trial and error.
The following points show the importance of analysis of algorithms.
1. To predict the behavior of an algorithm without implementing it on a specific
computer.
2. It is much more convenient to have simple measures for the efficiency of an
algorithm than to implement the algorithm and test the efficiency every time a
certain parameter in the underlying computer system changes.
3. It is impossible to predict the exact behavior of an algorithm. There are too many
influencing factors.
4. The analysis is thus only an approximation; it is not perfect.
5. More importantly, by analyzing different algorithms, we can compare them to
determine the best one for our purpose.
In frequency-count analysis, each simple statement is assumed to take one unit of time. For example:
X = a + 5 * b + c * 9
On the machine, this is carried out by many instructions, but we do not consider that; we treat it as one statement taking one unit of time. It is like going to a friend's house: no detailed planning is needed. But if you have to go to Mars, detailed planning is required.
Space analysis:
How much memory space is needed?
In the algorithm above, only three variables are used, so S(n) = 3, which is constant – O(1).
À la russe approach:
• Both the multiplicand and the multiplier must have the same number of digits, and this number must be a power of 2.
• If not, this can be arranged by padding with zeros on the left as necessary.
Each of the resulting two-digit multiplications is done the same way, with some shifts and additions.
Later, we will see how the divide and conquer approach reduces the number of sub-multiplications from four to three. With this improvement, the divide and conquer approach runs faster on a computer than any of the preceding methods.
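The à la russe (Russian peasant) idea can be sketched as a short program. This is an illustrative version, not taken from the original notes:

```python
def russe(multiplicand, multiplier):
    """Multiply a la russe: halve one operand, double the other,
    and add the doubled value whenever the halved value is odd."""
    total = 0
    while multiplier >= 1:
        if multiplier % 2 == 1:    # odd row: contributes to the product
            total += multiplicand
        multiplicand *= 2          # double one column
        multiplier //= 2           # halve the other, discarding the remainder
    return total
```

For example, russe(981, 1234) returns the same result as 981 * 1234, using only halving, doubling, and addition.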
CHAPTER – 2 Asymptotic
Notations and Analysis
2.1 Asymptotic Notations
The running time of an algorithm depends on how long it takes a computer to run the lines of
code of the algorithm—and that depends on the speed of the computer, the programming
language, and the compiler that translates the program from the programming language into
code that runs directly on the computer, among other factors.
Let's think about the running time of an algorithm more carefully. We can use a combination of two ideas. First, we need to determine how long the algorithm takes in terms of the size of its input. In the case of linear search, the number of comparisons increases as the number of elements increases. So we think about the running time of the algorithm as a function of the size of its input.
Second, we must focus on how fast a function grows with the input size. We call this the rate of growth of the running time. To keep things manageable, we simplify the function to extract the most important part and ignore the less important parts.
For example, suppose that an algorithm, running on an input of size n, takes 6n² + 100n + 300 machine instructions. The 6n² term becomes larger than the remaining terms 100n + 300 once n becomes large enough – n ≥ 20 in this case.
We would say that the running time of this algorithm grows as n², dropping the coefficient 6 and the remaining terms 100n + 300.
It doesn't really matter what coefficients we use; as long as the running time is an² + bn + c for some numbers a > 0, b, and c, there will always be a value of n beyond which an² is greater than bn + c, and this difference increases as n increases. For example, consider 0.6n² + 1000n + 3000, where we reduced the coefficient of n² by a factor of 10 and increased the other two constants by a factor of 10: the n² term still eventually dominates.
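A quick numerical check (a sketch, not part of the original notes) confirms that the 6n² term first dominates the remaining terms at n = 20:

```python
def f(n):
    # the total cost function from the text
    return 6 * n**2 + 100 * n + 300

# smallest n at which the 6n^2 term exceeds the remaining terms 100n + 300
crossover = next(n for n in range(1, 1000) if 6 * n**2 > 100 * n + 300)
```

Here crossover evaluates to 20, matching the text.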
By dropping the less significant terms and the constant coefficients, we can focus on the important part of an algorithm's running time: its rate of growth. When we drop the constant coefficients and the less significant terms, we use asymptotic notation.
Another example of growth rate:
The order of function growth is critical in evaluating the algorithm’s performance. Assume
the running times of two algorithms A and B are f(n) and g(n), respectively.
f(n) = 2n2 + 5
g(n) = 10n
Here, n represents the size of the problem, while polynomials f(n) and g(n) represent the
number of basic operations performed by algorithms A and B, respectively. Running time of
both functions for different input sizes is shown in the following table:

n        1    2    3    4    5    6    7
f(n) – A 7    13   23   37   55   77   103
g(n) – B 10   20   30   40   50   60   70
Algorithm A may outperform algorithm B for small input sizes; however, once the input size becomes sufficiently big (in this example n = 5), f(n) always runs slower (performs more steps) than g(n). As a result, understanding the growth rate of functions is critical.
Asymptotic notations describe the function’s limiting behaviour.
Machine-specific factors include the machine's hardware architecture, RAM, supported virtual memory, processor speed, available instruction set (RISC or CISC), and so on. Asymptotic notations let us analyze algorithms in a way that is not affected by any of these factors.
2.2 Big-O notation:
We write f(n) = O(g(n)) if there are positive constants n₀ and c such that, to the right of n₀, f(n) always lies on or below c·g(n).
O(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }
For example:
f(n) = 2n + 3
We need some c·g(n) with 2n + 3 ≤ c·g(n):
2n + 3 ≤ 10n for all n ≥ n₀, with n₀ = 1 and c = 10.
We could use any such bound, like 7n or 1000n. More simply, we can do the following:
2n + 3 ≤ 2n + 3n
2n + 3 ≤ 5n for n ≥ 1
Here f(n) = 2n + 3 and g(n) = n,
so f(n) = O(g(n)) = O(n).
Can we write the following?
2n + 3 ≤ 2n² + 3n²
Yes; so we can also write f(n) = O(n²).
Thus f(n) = O(n) and f(n) = O(n²) are both true.
Actually, f(n) belongs to the linear class, and all the classes on its right in the ordering below are upper bounds of it; all functions on its left are lower bounds. So for Big-O we can take any function on its right, but we should try to use the closest function to f(n).
1 < log n < √n < n < n log n < n² < n³ < … < 2ⁿ < 3ⁿ < nⁿ
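The witness constants c and n₀ can be spot-checked mechanically. The helper below is illustrative only, not part of the original notes:

```python
def big_o_witness_holds(f, g, c, n0, upto=10_000):
    # check f(n) <= c * g(n) for every n0 <= n < upto
    return all(f(n) <= c * g(n) for n in range(n0, upto))

# f(n) = 2n + 3 with witnesses c = 5, n0 = 1 against g(n) = n
ok = big_o_witness_holds(lambda n: 2 * n + 3, lambda n: n, c=5, n0=1)
```

This is only a finite spot check, of course; the inequality 2n + 3 ≤ 5n for n ≥ 1 is proved by the algebra above.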
2.3 Big-Omega – Ω
Big-Omega (Ω) notation gives a lower bound for a function f(n) to within a constant factor.
We write f(n) = Ω(g(n)) if there are positive constants n₀ and c such that, to the right of n₀, f(n) always lies on or above c·g(n).
Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }
f(n) = 2n + 3
2n + 3 ≥ 1·n for all n ≥ 1, with c = 1 and g(n) = n
So f(n) = Ω(n).
We can also say f(n) = Ω(log n), or use any lower bound further to the left in the ordering of classes.
2.4 Theta – Θ
Big-Theta (Θ) notation gives a tight bound for a function f(n) to within constant factors.
We write f(n) = Θ(g(n)) if there are positive constants n₀, c₁, and c₂ such that, to the right of n₀, f(n) always lies between c₁·g(n) and c₂·g(n) inclusive.
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }
f(n) = 2n + 3
1·n ≤ 2n + 3 ≤ 5n for n ≥ 1, with c₁ = 1, c₂ = 5, g(n) = n
Do not confuse the worst case and best case with these notations. Any notation can be used to represent the best case or the worst case of an algorithm.
Example 1:
f(n) = 2n² + 3n + 4
2n² + 3n + 4 ≤ 2n² + 3n² + 4n²
2n² + 3n + 4 ≤ 9n² for all n ≥ 1, with c = 9, n₀ = 1
So f(n) = O(n²).
Example 2:
f(n) = 2n² + 3n + 4
2n² + 3n + 4 ≥ 1·n²
So f(n) = Ω(n²).
Example 3:
f(n) = 2n² + 3n + 4
1·n² ≤ 2n² + 3n + 4 ≤ 9n², so f(n) = Θ(n²).
Example 4:
f(n) = n² log n + n
n² log n ≤ n² log n + n ≤ 2·n² log n for n ≥ 2, so f(n) = Θ(n² log n).
Example 5:
f(n) = n!
1 ≤ n! ≤ nⁿ
Upper bound O(nⁿ), lower bound Ω(1).
There is no tight (Θ) bound for n! here. When it is not possible to find Θ, then Ω and O are used to give the lower and upper bound respectively. We cannot simply put, say, n¹⁰ as a lower bound and n¹⁴ as an upper bound without proof.
Example 6:
f(n) = log(n!)
log(1·1·1·…·1) ≤ log(1·2·3·…·n) ≤ log(n·n·…·n) = log(nⁿ) = n log n
Upper bound O(n log n), lower bound Ω(1).
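The upper bound on log(n!) can be verified numerically. The helper below is a sketch, not from the notes:

```python
import math

def log2_factorial(n):
    # log2(n!) computed as a sum of logs, avoiding huge intermediate integers
    return sum(math.log2(i) for i in range(1, n + 1))
```

For n = 1000, log2_factorial(1000) is well below 1000·log₂ 1000, consistent with the bound log(n!) ≤ n log n.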
Reflexive property:
For any f(n), f(n) is O(f(n)).
e.g. if f(n) = n², then f(n) = O(n²).
Likewise, every function is a lower bound (Ω) of itself.
Transitive property:
If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).
Example: n is O(n²) and n² is O(n³), so n is O(n³).
Symmetric property:
This holds only for Θ notation.
If f(n) is Θ(g(n)), then g(n) is Θ(f(n)).
Example: f(n) = n² and g(n) = n².
Transpose symmetry:
If f(n) is O(g(n)), then g(n) is Ω(f(n)).
Example: f(n) = n and g(n) = n²; then f(n) = O(n²) and g(n) = Ω(n).
Another property:
If f(n) = O(g(n)) and f(n) = Ω(g(n)), then f(n) = Θ(g(n)).
That is, c₁·g(n) ≤ f(n) ≤ c₂·g(n).
Another property:
If f(n) = O(g(n)) and d(n) = O(h(n)), then f(n) + d(n) = O(max(g(n), h(n))).
Another property:
If f(n) = O(g(n)) and d(n) = O(h(n)), then f(n) · d(n) = O(g(n) · h(n)).
log n² < log n³, since 2 log n < 3 log n; so 3 log n is bigger.
What is a logarithmic function? (Watch this video for more details – Relation and Function 04 | All about Logarithmic Function | Class 11 | IIT JEE – YouTube)
The functions y = 2ˣ and x = log₂ y are inverses of each other.
If we give the first function the inputs 2, 4, 16, etc., and feed its corresponding outputs as inputs to the second function, the results are exactly the values we gave to the first function.
logb x is read as "log x with base b".
log₂ 4 = 2
log₂ 32 = 5
log₂ (1/16) = −4, since 2⁻⁴ = 1/16
log base 1/4 of 1/2 = 1/2, since (1/4)^(1/2) = 1/2
log₂ 1 = 0
The output of a logarithmic function can be positive, zero, or negative.
log₂ 0 = not defined
log₂ (−4) = not defined
The input of a logarithmic function must not be zero or negative.
Base of the logarithm:
log₂ 4 = 2
log₁ 4 = not defined
log₋₂ 4 = not defined
The base of a logarithm must be positive and must not be one.
Example:
f(n) = 2ⁿ and g(n) = 2²ⁿ
Apply log to both:
log₂ 2ⁿ = n log₂ 2 = n and log₂ 2²ⁿ = 2n log₂ 2 = 2n
Since n < 2n, f(n) grows strictly slower than g(n); the two are not of the same order.
1. Worst Case Analysis
In the worst-case analysis, we calculate the upper bound on the running time of an
algorithm. We must know the case that causes a maximum number of operations to be
executed. For Linear Search, the worst case happens when the element to be searched (x) is
not present in the array. When x is not present, the search() function compares it with all
the elements of arr[] one by one. Therefore, the worst-case time complexity of the linear
search would be O(n).
2. Best Case Analysis
In the best-case analysis, we calculate the lower bound on the running time of an algorithm.
We must know the case that causes a minimum number of operations to be executed. In the
linear search problem, the best case occurs when x is present at the first location. The
number of operations in the best case is constant (not dependent on n). So time complexity
in the best case would be Ω(1) or O(1).
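A counting version of linear search (an illustrative sketch, not the notes' own code) makes the best and worst cases concrete:

```python
def linear_search(arr, x):
    # returns (index, number of comparisons); index is -1 if x is absent
    comparisons = 0
    for i, v in enumerate(arr):
        comparisons += 1
        if v == x:
            return i, comparisons
    return -1, comparisons
```

Searching for the first element costs 1 comparison (best case); searching for an absent element costs n comparisons (worst case).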
3. Average Case Analysis (Rarely used)
In average case analysis, we take all possible inputs and calculate the computing time for
all of the inputs. Sum all the calculated values and divide the sum by the total number of
inputs. We must know (or predict) the distribution of cases. For the linear search problem,
let us assume that all cases are uniformly distributed (including the case of x not being
present in the array). So, we sum all the cases and divide the sum by n which is O(n).
Case analysis means identifying the inputs for which the algorithm takes the longest or the shortest time to complete (i.e., performs the greatest or smallest number of steps), and then formulating a growth function from this.
Let’s understand them using two examples to find their best, worst and average case.
Examples:
1. Linear Search
2. Binary Search
Example array: 3 90 56 43 67 12 35 9 78 52
The average time taken is A(n) = O(n).
Best case: B(n) = 1, so we can write B(n) = O(1), B(n) = Ω(1), and B(n) = Θ(1). For any constant function, we can write any of the notations.
Worst case: W(n) = n, so W(n) = O(n), W(n) = Ω(n), and W(n) = Θ(n).
Any notation can be used to show the best, average, and worst case.
Binary Search:
For any node, elements less than it are on its left and elements greater than or equal to it are on its right side.

        20
      /    \
    10      30
   /  \    /  \
  5    15 25   40

The number of comparisons required equals the height of the binary tree, which is log₂ n, where n is the number of elements in the tree.
Best case: the element to be searched for is at the root. The time taken is constant: B(n) = O(1).
Worst case: the element to be searched for is in a leaf node, so the time taken is proportional to the height of the binary tree, which is log₂ n: W(n) = O(log₂ n).
Average case: it is difficult to derive; mostly it is similar to the worst case.
Here we have a balanced tree, but if the tree for the same elements is left-skewed, its height is n, so W(n) = O(n).
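The same log₂ n behaviour can be seen with binary search over a sorted array, which is the array form of searching a balanced tree (an illustrative sketch, not the notes' own code):

```python
def binary_search(arr, x):
    # arr must be sorted; returns (index, comparisons), index -1 if absent
    lo, hi, comparisons = 0, len(arr) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if arr[mid] == x:
            return mid, comparisons
        if arr[mid] < x:
            lo = mid + 1        # discard the left half
        else:
            hi = mid - 1        # discard the right half
    return -1, comparisons
```

On the seven elements above (sorted: 5 10 15 20 25 30 40), the middle value 20 is found in a single comparison, while an absent key needs at most ⌊log₂ 7⌋ + 1 = 3 comparisons.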
23 | P a g e
In the worst case, the minimum time is W(n) = O(log₂ n) (balanced tree) and the maximum time is W(n) = O(n) (skewed tree).
This kind of minimum and maximum worst case does not occur for all types of algorithms.
2.8 Frequency count method:
A. Finding the sum of array elements

Algorithm sum(A[], n)
{
    s = 0;                      // 1
    for (i = 0; i < n; i++)     // n + 1 (the condition is tested n + 1 times)
        s = s + A[i];           // n
    return s;                   // 1
}

One unit of time is charged per statement execution.
Time complexity:
f(n) = 1 + (n + 1) + n + 1 = 2n + 3, which is O(n)
Space complexity:
A takes n units; s and n take 1 each.
So, S(n) = n + 2, which is O(n)
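The 2n + 3 count can be reproduced by instrumenting the algorithm. This is a sketch; it follows the convention above of charging the loop header once per condition test:

```python
def array_sum_count(A):
    # returns (sum, statement count) following the 2n + 3 counting convention
    n = len(A)
    count = 0
    s = 0; count += 1                  # s = 0 executes once
    for i in range(n + 1):             # the loop test i < n runs n + 1 times
        count += 1
        if i < n:
            s = s + A[i]; count += 1   # the body runs n times
    count += 1                         # return s
    return s, count
```

For a four-element array the count is 2·4 + 3 = 11 units.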
C. Finding the multiplication of two matrices.

Algorithm multiply(A, B, C, n)
{
    for (i = 0; i < n; i++)                         // n + 1
        for (j = 0; j < n; j++)                     // n * (n + 1)
        {
            C[i, j] = 0;                            // n * n
            for (k = 0; k < n; k++)                 // n * n * (n + 1)
                C[i, j] = C[i, j] + A[i, k] * B[k, j];  // n * n * n
        }
}

Time complexity:
f(n) = 2n³ + 3n² + 2n + 1, which is O(n³)
Space complexity:
A, B, C take n × n each; i, j, k, n take 1 each.
S(n) = 3n² + 4, which is O(n²)
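A runnable version of the matrix multiplication (an illustrative sketch using plain Python lists rather than the pseudocode's arrays):

```python
def matmul(A, B):
    # multiply two n x n matrices given as lists of lists
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):                 # innermost body runs n^3 times
                C[i][j] += A[i][k] * B[k][j]
    return C
```

The triple loop makes the O(n³) cost visible directly.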
Example 1
for (i = 0; i < n; i += 2)    // n/2
    statement;
f(n) = n/2, which is O(n)
Example 2
for (i = 0; i < n; i += 20)   // n/20
    statement;
f(n) = n/20, which is O(n)
Example 3
for (i = 0; i < n; i++)
    for (j = 0; j < i; j++)
        statement;
i      inner iterations
0      0
1      1
2      2
3      3
…
n−1    n−1

f(n) = 0 + 1 + 2 + … + (n−1)
     = n(n−1)/2
     = O(n²)
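Counting the inner-body executions directly (a sketch) confirms the triangular sum; with the j < i bound the exact count is n(n−1)/2, which is Θ(n²):

```python
def triangular_count(n):
    # count how many times the innermost statement runs
    count = 0
    for i in range(n):
        for j in range(i):   # the body executes i times for each i
            count += 1
    return count
```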
Example 4
p = 0;
for (i = 1; p <= n; i++)
    p = p + i;

i    p
1    0 + 1
2    1 + 2
3    1 + 2 + 3
4    1 + 2 + 3 + 4
5    1 + 2 + 3 + 4 + 5
…
k    1 + 2 + 3 + … + k

After k iterations, p = 1 + 2 + … + k = k(k+1)/2. The loop stops when p > n, i.e. when k(k+1)/2 > n, so roughly k² > n, which gives k = O(√n).
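Tracing the loop directly (a sketch) shows the √n behaviour: the iteration count k is the smallest value with k(k+1)/2 > n:

```python
def p_loop_iterations(n):
    # trace of: p = 0; for (i = 1; p <= n; i++) p = p + i;
    p, i, count = 0, 1, 0
    while p <= n:
        p += i
        i += 1
        count += 1
    return count
```

For n = 100 the loop runs 14 times (14·15/2 = 105 > 100), close to √(2·100) ≈ 14.1.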
Example 5
for (i = 1; i < n; i *= 2)
    statements;
i takes the values 2, 2², 2³, 2⁴, …, 2ᵏ.
The loop stops when
i ≥ n          (1)
i = 2ᵏ         (2)
so 2ᵏ ≥ n; taking 2ᵏ = n gives
k = log₂ n, which is O(log₂ n)
When the loop variable is multiplied by a constant each iteration, the loop takes O(log n) time.
The log may give a fractional value, so we take the ceiling of the result: for n = 8, log₂ 8 = 3; for n = 10, log₂ 10 ≈ 3.32 and ⌈3.32⌉ = 4, meaning we take ⌈log₂ n⌉.
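Counting the iterations of the doubling loop (a sketch) matches ⌈log₂ n⌉ for n ≥ 2:

```python
import math

def doubling_iterations(n):
    # iterations of: for (i = 1; i < n; i *= 2)
    count, i = 0, 1
    while i < n:
        count += 1
        i *= 2
    return count
```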
Example 6
for (i = n; i >= 1; i /= 2)
    statements;
i takes the values n/2, n/2², n/2³, …, n/2ᵏ.
The loop stops when i < 1. Setting i = n/2ᵏ and equating
n/2ᵏ = 1
gives n = 2ᵏ, so k = log₂ n, which is O(log₂ n).
Example 7
for (i = 0; i * i < n; i++)
    statements;
The loop terminates when i·i ≥ n, i.e. i² = n, so i = √n, which is O(√n).
Example 8
for (i = 0; i < n; i += 1)
    statement;            // n
for (i = 0; i < n; i += 1)
    statement;            // n
f(n) = 2n, which is O(n)
Example 9
p = 0;
for (i = 1; i < n; i = i * 2)   // log n
    p++;
for (j = 1; j < p; j = j * 2)   // log p
    statement;
The first loop iterates log n times, so p becomes log n. The second loop iterates log p times; substituting p = log n gives log p = log log n, so the second loop alone is O(log log n).
Example 10
for (i = 0; i < n; i++)             // n
    for (j = 1; j < n; j = j * 2)   // log₂ n per outer iteration
        statement;                  // n · log₂ n in total
f(n) = n log₂ n, which is O(n log₂ n)
i = 1;
k = 1;
while (k < n)
{
    statement;
    k = k + i;
    i++;
}

i    k
1    1
2    1 + 1
3    1 + 1 + 2
4    1 + 1 + 2 + 3
5    1 + 1 + 2 + 3 + 4
…
m    1 + 1 + 2 + 3 + … + (m−1)

After m iterations, k = 1 + (1 + 2 + … + (m−1)) ≈ m²/2. The loop stops when k ≥ n, so m²/2 ≈ n and m = √(2n), which is O(√n).
Algorithm Test(n)
{
    if (n < 5)
        statement;
    else
    {
        for (i = 0; i < n; i++)
            statements;
    }
}

Which statements execute depends on the condition, so the best case is O(1) and the worst case is O(n).
1 < log n < √n < n < n log n < n² < n³ < … < 2ⁿ < 3ⁿ < nⁿ
Let's see in the following example how exponential functions grow much faster than the other functions as n grows.

log₂ n   n    n²    2ⁿ
0        1    1     2
1        2    4     4
1.6      3    9     8
2        4    16    16
3        8    64    256
…        …    …     …
4.3      20   400   1,048,576

After some value of n, 2ⁿ is much greater than all the others; every fixed power of n is eventually smaller than 2ⁿ.
CHAPTER – 3 Recurrence
Relation
3.1 Recurrence Relation
Recursive/Recursion tree approach
Calling Test(3):

Test(3) prints 3, then calls Test(2)
Test(2) prints 2, then calls Test(1)
Test(1) prints 1, then calls Test(0)
Test(0) does nothing and returns

Each call prints the value of n, and the printf statement executes once per call. The function is called 4 times in total for Test(3), so for Test(n) it is called n + 1 times.
Time complexity = O(n)
How to find the recurrence relation for the above function:

void Test(int n)              // T(n)
{
    if (n > 0)                // 1 (we may count the condition or not; it makes no difference)
    {
        printf("%d", n);      // 1
        Test(n - 1);          // T(n-1)
    }
}

T(n) = 1            if n = 0
T(n) = T(n−1) + 1   if n > 0
Backward substitution method:
T(n) = T(n-1) + 1 - (1)
Now, we find T(n-1) as follows,
T(n-1) = T(n-2) + 1
Put T(n-1) in equation (1)
T(n) = [ T(n-2) + 1 ] + 1
T(n) = T(n-2) + 2 - (2)
Now, find T(n-2) as follows,
T(n-2) = T(n-2-1) + 1 = T(n-3) + 1
Put T(n-2) in equation (2)
T(n) = [ T(n-3) + 1 ] + 2
T(n) = T(n-3) + 3 - (3)
If we repeat this k times,
T(n) = T(n-k) + k - (4)
Assume n-k=0 so, n=k
T(n) = T(n-n) + n - (5)
T(n) = T(0) + n - (6)
T(n) = 1 + n - (7)
T(n) = O(n)
Another function:

void Test(int n)                 // T(n)
{
    if (n > 0)                   // 1
    {
        for (i = 0; i < n; i++)  // n + 1
        {
            printf("%d", n);     // n
        }
        Test(n - 1);             // T(n-1)
    }
}
Substitution method
T(n) = T(n−1) + (n + 1) + n + 1
T(n) = T(n−1) + 2n + 2
Taking the asymptotic order of 2n + 2, which is O(n), we work with:
T(n) = T(n−1) + n

T(n) = 1            if n = 0
T(n) = T(n−1) + n   if n > 0
T(n) = T(n−1) + n                          (1)
Now find T(n−1): T(n−1) = T(n−2) + (n−1). Substituting into (1):
T(n) = T(n−2) + (n−1) + n                  (2)
Now find T(n−2): T(n−2) = T(n−3) + (n−2). Substituting into (2):
T(n) = [T(n−3) + (n−2)] + (n−1) + n
T(n) = T(n−3) + (n−2) + (n−1) + n          (3)
Repeating this k times:
T(n) = T(n−k) + (n−(k−1)) + (n−(k−2)) + … + (n−1) + n   (4)
Assume n − k = 0, so n = k:
T(n) = T(0) + 1 + 2 + … + (n−1) + n
T(n) = 1 + n(n+1)/2
which is O(n²)
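Evaluating the recurrence directly (a sketch) matches the closed form 1 + n(n+1)/2:

```python
def t_dec(n):
    # T(n) = T(n-1) + n with T(0) = 1
    if n == 0:
        return 1
    return t_dec(n - 1) + n
```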
Another function:

void Test(int n)                      // T(n)
{
    if (n > 0)                        // 1
    {
        for (i = 1; i < n; i = i * 2) // i doubles each time
        {
            printf("%d", n);          // log₂ n
        }
        Test(n - 1);                  // T(n-1)
    }
}

T(n) = T(n−1) + log n

T(n) = 1                 if n = 0
T(n) = T(n−1) + log n    if n > 0
Recursion tree method:
T(n)     does log n work, then calls T(n−1)
T(n−1)   does log(n−1) work, then calls T(n−2)
T(n−2)   does log(n−2) work, then calls T(n−3)
…
T(1)     does log 1 work
T(0)

Substitution method:
T(n) = log n + log(n−1) + … + log 2 + log 1
     = log(n!)
     ≤ log(nⁿ) = n log n
     = O(n log n)
Important note:
In the above table, second term is multiplied with n to get the order.
Types of recurrences:
• Homogeneous
• Inhomogeneous
• Logarithmic

Homogeneous recurrence:
If the recurrence is equated to zero and it contains all terms in homogeneous form, then it is called a homogeneous recurrence.
For example:
a₀tₙ + a₁tₙ₋₁ + a₂tₙ₋₂ + … + aₖtₙ₋ₖ = 0

Inhomogeneous recurrence:
If the sum of the linear terms of the equation is not equal to zero, then it is called an inhomogeneous recurrence.
The general format is as follows:
a₀tₙ + a₁tₙ₋₁ + a₂tₙ₋₂ + … + aₖtₙ₋ₖ = bⁿ p(n)
Example:
tₙ₊₃ + 6tₙ₊₂ + 8tₙ₊₁ + 5tₙ = 2ⁿ

Logarithmic recurrence:
Divide and conquer techniques use logarithmic recurrences.
The general format is:
T(n) = a·T(n/b) + f(n)
where a and b are constants with a ≥ 1 and b > 1, and f(n) is some function.
Example:
T(n) = 2·T(n/2) + n²
Iteration method:
The recurrence is expanded as a summation of terms, and the summation provides the result. This method is sometimes described as "keep substituting back until you see what is going on."
Substitution method:
We start by guessing the solution and then prove the guess by induction.
Example 1:
Suppose a recurrence relation is given as follows:
T(n) = T(n/2) + 1                       (1)
Guess T(n) = O(log n), i.e. assume T(n/2) ≤ c·log(n/2). Then:
T(n) ≤ c·log(n/2) + 1
T(n) ≤ c·(log₂ n − 1) + 1
T(n) ≤ c·log₂ n − c + 1
For c = 1:
T(n) ≤ log₂ n − 1 + 1
T(n) ≤ log₂ n
So T(n) = O(log₂ n).
Example 2:
T(n) = 2·T(n/2) + n
Guess T(n) = Ω(n log n), i.e. assume T(n/2) ≥ c·(n/2)·log(n/2). Then:
T(n) ≥ 2c·(n/2)·log(n/2) + n
T(n) ≥ c·n·log(n/2) + n
T(n) ≥ c·n·(log₂ n − 1) + n
T(n) ≥ c·n·log₂ n − c·n + n
For c = 1:
T(n) ≥ n·log₂ n − n + n
T(n) ≥ n·log₂ n
So T(n) = Ω(n log₂ n).
Let us count how much time the following function takes.

void Test(int n)            // T(n)
{
    if (n > 0)
    {
        printf("%d", n);    // 1
        Test(n - 1);        // T(n-1)
        Test(n - 1);        // T(n-1)
    }
}

T(n) = 2·T(n−1) + 1
The recurrence relation is:
T(n) = 1                if n = 0
T(n) = 2·T(n−1) + 1     if n > 0

Recursion tree: the root T(n) does 1 unit of work and spawns two T(n−1) calls, each of which does 1 unit of work and spawns two T(n−2) calls, and so on; level i contains 2ⁱ nodes.
= 2ⁿ⁺¹ − 1
= O(2ⁿ)
For reference, the geometric (GP) series:
a + ar + ar² + ar³ + … + arᵏ = a(rᵏ⁺¹ − 1)/(r − 1)
In the series above, a = 1 and r = 2.
Substitution method:
T(n) = 2·T(n−1) + 1                              (1)
Now find T(n−1): T(n−1) = 2·T(n−2) + 1. Substituting into (1):
T(n) = 2·[2·T(n−2) + 1] + 1
     = 2²·T(n−2) + 2 + 1                         (2)
Now find T(n−2): T(n−2) = 2·T(n−3) + 1. Substituting into (2):
     = 2²·[2·T(n−3) + 1] + 2 + 1
     = 2³·T(n−3) + 2² + 2 + 1                    (3)
Repeating k times:
     = 2ᵏ·T(n−k) + 2ᵏ⁻¹ + 2ᵏ⁻² + … + 2² + 2 + 1  (4)
Assume n − k = 0, so n = k:
     = 2ⁿ·T(0) + 2ⁿ⁻¹ + 2ⁿ⁻² + … + 2 + 1         (5)
     = 2ⁿ + 2ⁿ⁻¹ + … + 2 + 1
     = 2ⁿ⁺¹ − 1
     = O(2ⁿ)
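Counting the calls of the two-recursive-call version (a sketch) reproduces 2ⁿ⁺¹ − 1:

```python
def double_calls(n):
    # total calls triggered by Test(n), which makes two calls to Test(n-1)
    if n <= 0:
        return 1
    return 1 + 2 * double_calls(n - 1)
```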
3.2 Master Theorem
The Master Theorem is a quick way to obtain an algorithm's time complexity from its recurrence relation. It can be applied to decreasing as well as dividing functions. Recursive functions call themselves in their body, and computing their time complexity by the other, simpler methods can get complex; the master method is the most convenient way to compute the time complexity of such recurrence relations. There are two forms:
1. Dividing functions
2. Decreasing functions
T(n) = T(n−1) + n²
In this problem, a = 1, b = 1, and f(n) = O(nᵏ) = n², giving k = 2.
Since a = 1, case 1 applies: T(n) = O(nᵏ⁺¹) = O(n²⁺¹) = O(n³).
Therefore T(n) = O(n³) is the tight bound for this equation.
More examples:
Example 1:
T(n) = T(n−1) + n(n−1)
a = 1, b = 1, k = 2
Therefore T(n) = O(nᵏ⁺¹) = O(n³) is the tight bound for this equation.
Example 2:
T(n) = 3·T(n−1)
a = 3, b = 1, k = 0
Using T(n) = O(nᵏ·aⁿ/ᵇ):
T(n) = O(n⁰·3ⁿ/¹) = O(3ⁿ)
Therefore T(n) = O(3ⁿ) is the tight bound for this equation.
Example 3:
T(n) = 2·T(n−1) − 1
This recurrence can't be solved using the above method, since it is not of the form T(n) = a·T(n−b) + f(n).
Example 4:
Fibonacci series:
T(n) = n                     if n = 0 or n = 1
T(n) = T(n−1) + T(n−2)       if n ≥ 2
Let T(n−1) ≈ T(n−2). Then
T(n) = 2·T(n−1) + c          // c is a constant cost for the addition operation
where f(n) = O(1), so k = 0, a = 2, b = 1:
T(n) = O(n⁰·2ⁿ/¹) = O(2ⁿ)
Example 5:
Factorial of a number:
factorial(n):
    if n is 0:                    // 1
        return 1
    return n * factorial(n-1)     // 1 + 1 + T(n-1)
3.3 Recurrence of dividing functions
Using the substitution method:
T(n) = T(n/2) + 1                    (1)
Now find T(n/2): T(n/2) = T(n/2²) + 1. Substituting into (1):
T(n) = T(n/2²) + 1 + 1
     = T(n/2²) + 2                   (2)
Now find T(n/2²): T(n/2²) = T(n/2³) + 1. Substituting into (2):
     = T(n/2³) + 3                   (3)
Similarly, the next step gives:
     = T(n/2⁴) + 4                   (4)
Repeating k times:
     = T(n/2ᵏ) + k                   (5)
Assume n/2ᵏ = 1, so k = log n:
     = T(1) + log n                  (6)
     = 1 + log n
     = O(log n)
T(n) = 1              if n = 1
T(n) = T(n/2) + n     if n > 1

Recursion tree:
T(n) costs n, then T(n/2) costs n/2, then T(n/2²) costs n/2², and so on down to T(n/2ᵏ).
Adding all the levels:
= n + n/2 + n/2² + n/2³ + … + n/2ᵏ
= n·(1 + 1/2 + 1/2² + … + 1/2ᵏ)
≤ 2n, since ∑ᵢ₌₁^∞ 1/2ⁱ = 1
= O(n)
T(n/2²) = T(n/2³) + n/2²
Replacing this value of T(n/2²) in (2):
= T(n/2³) + n/2² + n/2 + n
Repeating this k number of times:
= T(n/2ᵏ) + n/2ᵏ⁻¹ + … + n/2 + n
Assume n/2ᵏ = 1:
= T(1) + n·(1/2ᵏ⁻¹ + 1/2ᵏ⁻² + … + 1/2² + 1/2 + 1)
As before, ∑ᵢ₌₁^∞ 1/2ⁱ = 1, so the bracket is at most 2:
= 1 + n·(1 + 1)
= 1 + 2n
= O(n)

T(n) = 1                if n = 1
T(n) = 2·T(n/2) + n     if n > 1
Recursion tree method:
The root T(n) costs n and spawns two subproblems T(n/2); each level-1 node costs n/2 (2 nodes, total n); each level-2 node costs n/2² (4 nodes, total n); and so on, down to T(n/2ᵏ) after k levels.
So there are k levels, and each level contributes n in total (e.g. n/4 + n/4 + n/4 + n/4 = n):
= n·k                                (1)
Assume n/2ᵏ = 1, so k = log n. Replacing k in (1):
= O(n log n)
Substitution method:
Repeating the expansion k times:
= 2ᵏ·T(n/2ᵏ) + k·n                   (4)
Assume n/2ᵏ = 1, so k = log n:
= 2ᵏ·T(1) + k·n
= n·1 + n·log n
= n + n·log n
Keeping only the dominating term, as per asymptotic notation:
= O(n log n)
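Evaluating T(n) = 2·T(n/2) + n for powers of two (a sketch) matches the closed form n·log₂ n + n:

```python
def t_div(n):
    # T(n) = 2*T(n/2) + n with T(1) = 1, for n a power of two
    if n == 1:
        return 1
    return 2 * t_div(n // 2) + n
```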
log₂ 2 = 1
1 > 0, so T(n) = O(n^(log_b a)) = O(n)
T(n) = 4·T(n/2) + n
a = 4, b = 2, k = 1
log₂ 4 = 2
2 > 1, so T(n) = O(n^(log_b a)) = O(n²)
T(n) = 8·T(n/2) + n
a = 8, b = 2, k = 1
log₂ 8 = 3
3 > 1, so T(n) = O(n^(log_b a)) = O(n³)
T(n) = 9·T(n/2) + n
a = 9, b = 2, k = 1
log₂ 9 ≈ 3.17
3.17 > 1, so T(n) = O(n^(log_b a)) = O(n^(log₂ 9))
Case 2:
T(n) = 2·T(n/2) + n
a = 2, b = 2, k = 1, p = 0
log₂ 2 = 1
1 = 1, so T(n) = O(nᵏ·logᵖ⁺¹ n) = O(n log n)
T(n) = 8·T(n/2) + n³
a = 8, b = 2, k = 3
log₂ 8 = 3; 3 = 3, so T(n) = O(n³ log n)
Case 2.2:
T(n) = 2·T(n/2) + n/log n
a = 2, b = 2, k = 1, p = −1
log₂ 2 = 1 = k and p = −1, so T(n) = O(nᵏ·log log n) = O(n log log n)
Case 2.3:
T(n) = 2·T(n/2) + n/log² n
a = 2, b = 2, k = 1, p = −2
log₂ 2 = 1 = k and p = −2, so T(n) = O(n)
Case 3.2:
T(n) = T(n/2) + n²
a = 1, b = 2, k = 2
log₂ 1 = 0 < 2, so T(n) = O(n²)
Case 3.1:
T(n) = 2·T(n/2) + n²·log² n
a = 2, b = 2, k = 2, p = 2
log₂ 2 = 1 < 2, so T(n) = O(n²·log² n)
Recurrence relation for the root function:

void Test(int n)            // T(n)
{
    if (n > 2)              // 1
    {
        printf("%d", n);    // 1
        Test(√n);           // T(√n)
    }
}

T(n) = T(√n) + 1

T(n) = 1             if n = 2
T(n) = T(√n) + 1     if n > 2

Substitute n = 2ᵐ, so √n = 2^(m/2). After k substitutions we reach 2^(m/2ᵏ):
T(2ᵐ) = T(2^(m/2ᵏ)) + k          (6)
The process stops when m/2ᵏ = 1, i.e. m = 2ᵏ, so k = log m. We want the answer in terms of n; since n = 2ᵐ, m = log n, and therefore k = log log n. Replacing the value of k in (6):
= 1 + log log n
= O(log log n)
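Iterating n → √n directly (a sketch) shows the log log n step count:

```python
import math

def sqrt_iterations(n):
    # repeatedly replace n by sqrt(n) until n <= 2
    count = 0
    while n > 2:
        n = math.sqrt(n)
        count += 1
    return count
```

For n = 2¹⁶ the loop runs 4 times (log₂ log₂ 2¹⁶ = log₂ 16 = 4).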
CHAPTER – 4 Sorting
Algorithms
4.1 Category of sorting algorithms
Comparison based sorting –
In comparison-based sorting, elements of an array are compared with each other to find the
sorted array.
Examples: Bubble sort, insertion sort, selection sort, quick sort, heap sort and merge sort.
Non-comparison-based sorting –
In non-comparison-based sorting, elements of array are not compared with each other to
find the sorted array.
Examples: Radix sort, counting sort and bucket sort
In-place/Out-of-place technique –
A sorting technique is in-place if it does not use more than a constant amount of extra memory to sort the array.
Among the comparison-based techniques discussed, only merge sort is an out-of-place technique, as it requires an extra array to merge the sorted subarrays.
Among the non-comparison-based techniques discussed, all are out-of-place techniques: counting sort uses a counting array and bucket sort uses a set of buckets for sorting the array.
Online/Offline technique –
A sorting technique is considered online if it can accept new data while the procedure is
ongoing i.e. complete data is not required to start the sorting operation.
Among the comparison-based techniques discussed, only Insertion Sort qualifies for this
because of the underlying algorithm it uses i.e. it processes the array (not just elements)
from left to right and if new elements are added to the right, it doesn’t impact the ongoing
operation.
Stable/Unstable technique –
A sorting technique is stable if it does not change the order of elements with the same
value.
Out of comparison-based techniques, bubble sort, insertion sort and merge sort are stable
techniques. Selection sort is unstable, as it may change the order of elements with the same value. For example, consider the array 4, 4, 1, 3 and sort it using selection sort. In the first iteration, the minimum element found is 1 and it is swapped with the 4 at the 0th position; therefore, the order of that 4 with respect to the 4 at the 1st position changes. Similarly, quick sort and heap sort are also unstable.
Out of non-comparison-based techniques, Counting sort and Bucket sort are stable sorting
techniques whereas radix sort stability depends on the underlying algorithm used for
sorting.
4.2 Bubble Sort
It compares the first and second elements of the array; if the first element is greater than the
second element, it will swap both elements, and then compare the second and third elements,
and so on.
The idea is that neighboring elements are compared with each other and swapped.
Why is it called bubble sort? If we throw a stone into water, the stone, being heavier, goes down, while bubbles, being lighter, come up. Similarly, the largest element is placed in its proper position at the end of each iteration/pass.
8 5 7 3 2
1st Pass:
8 5 7 3 2  →  5 8 7 3 2   (compare 8, 5: swap)
5 8 7 3 2  →  5 7 8 3 2   (compare 8, 7: swap)
5 7 8 3 2  →  5 7 3 8 2   (compare 8, 3: swap)
5 7 3 8 2  →  5 7 3 2 8   (compare 8, 2: swap)
In each pass, the largest remaining element is placed in its proper position; here element 8 reaches the last position.
Number of comparisons – 4
2nd Pass:
5 7 3 2 8  →  5 7 3 2 8   (compare 5, 7: no swap)
5 7 3 2 8  →  5 3 7 2 8   (compare 7, 3: swap)
5 3 7 2 8  →  5 3 2 7 8   (compare 7, 2: swap)
Number of comparisons – 3
Best case:
If the list of elements is already sorted, then there are no swaps in the first pass, which shows that the elements are sorted. This can be detected by the following algorithm using a flag variable.
Algorithm bubble(A, n)
{
    for (i = 0; i < n - 1; i++)
    {
        flag = 0;
        for (j = 0; j < n - 1 - i; j++)
        {
            if (A[j] > A[j+1])
            {
                swap A[j] with A[j+1];
                flag = 1;
            }
        }
        if (flag == 0)
            break;
    }
}
The number of comparisons in the first pass is n − 1 = O(n), which is the minimum time taken by bubble sort. We can say that bubble sort is made adaptive by adding the flag variable.
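The flagged version translates directly into a runnable sketch:

```python
def bubble_sort(A):
    # adaptive bubble sort: stop early when a pass makes no swaps
    n = len(A)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if A[j] > A[j + 1]:
                A[j], A[j + 1] = A[j + 1], A[j]
                swapped = True
        if not swapped:       # already sorted: best case O(n)
            break
    return A
```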
Summary of time complexity:
              Best     Average   Worst
Without flag  O(n²)    O(n²)     O(n²)
With flag     O(n)     O(n²)     O(n²)
Space complexity:
It is constant. No additional memory space is required – O(1). It is an in-place sorting
algorithm.
Stable or not?
A stable sorting algorithm maintains the relative order of the items with equal sort keys.
We can check it by sorting the following elements:
8 8 3 5 4
In the above list, two elements have the same value, so their relative order in the sorted list
will remain the same, i.e. the second 8 will still be placed after the first 8. Hence, bubble
sort is stable.
4.3 Insertion Sort
Insertion sort works like sorting playing cards in your hands. The first card is assumed to be
already sorted, and then an unsorted card is selected. If the selected card is greater than the
first card, it is placed to the right; otherwise, it is placed to the left. Similarly, all
unsorted cards are taken one by one and inserted into their exact place.
A list with one element is already sorted, as follows. The dark background shows the sorted
portion of the array.
8 5 7 3 2
Insert 5: pass 1
8 5 7 3 2
5 8 7 3 2
Number of comparisons – 1
Insert 7: pass 2
5 8 7 3 2
5 7 8 3 2
Number of comparisons – 2
Insert 3: pass 3
5 7 8 3 2
3 5 7 8 2
Number of comparisons - 3
Insert 2: pass 4
3 5 7 8 2
Element 2 is compared with 8, 7, 5 and 3 in turn; each is greater than 2, so each is shifted
one position to the right, and 2 is inserted at the front:
2 3 5 7 8
Number of comparisons – 4
Number of passes required = 4, i.e., n – 1 passes.
Time complexity:
Best Case: - It occurs when no sorting is required, i.e. the array is already sorted. The outer
loop iterates n − 1 times, and the inner loop is never entered to shift elements. The best-case
time complexity of insertion sort is O(n).
Average Case: - It occurs when the array elements are in jumbled order, i.e. neither properly
ascending nor properly descending. The average-case time complexity of insertion sort is
O(n²), which is as bad as the worst case.
Worst Case: - It occurs when the array elements are required to be sorted in reverse order.
That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of insertion sort is O(n²).
Space complexity:
It is constant. No additional memory space is required, so O(1). It is an in-place sorting
algorithm.
Algorithm InsertionSort(A[],n)
{
    for(i=1;i<n;i++)
    {
        x=A[i];
        j=i-1;
        while( j >= 0 && A[j] > x)
        {
            A[j+1] = A[j];
            j--;
        }
        A[j+1]=x;
    }
}
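The same logic can be sketched in Python (the name `insertion_sort` is ours); note that the element A[i] is saved before the sorted prefix is shifted:

```python
def insertion_sort(a):
    """In-place insertion sort: stable, and O(n) on already-sorted input."""
    for i in range(1, len(a)):
        x = a[i]             # element to insert into the sorted prefix a[0..i-1]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]  # shift larger elements one position to the right
            j -= 1
        a[j + 1] = x
    return a
```

Because equal elements are never shifted past each other (`a[j] > x`, not `>=`), the sort is stable.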
Stable or not?
Insertion sort is also a stable sort. It maintains the relative position of elements having the
same value.
o Simple implementation
o Efficient for small data sets
o Adaptive, i.e., it is appropriate for data sets that are already substantially sorted.
The usual Θ(n²) implementation of Insertion Sort to sort an array uses linear search to
identify the position where an element is to be inserted into the already sorted part of the
array. If instead, we use binary search to identify the position, the worst-case running time
will __________.
a. Remain Θ(n²)
b. Become Θ(n(log n)²)
c. Become Θ(n log n)
d. Become Θ(n)
Answer: (a). Binary search reduces the number of comparisons per insertion to O(log n), but
shifting the elements to make room for the inserted element still takes Θ(n) time per
insertion in the worst case, so the overall worst-case running time remains Θ(n²).
4.4 Selection Sort
In each pass, search the unsorted part of the array for the minimum element and swap it with
the element at the current position, starting from position 0.
Pass 1: the minimum of the whole array is 1, which is already at position 0, so no swap is
needed.
0 1 2 3 4 5
1 4 10 8 3 7
Number of comparisons – 5
Pass 2: the minimum of positions 1–5 is 3; it is swapped with 4.
Number of comparisons – 4
0 1 2 3 4 5
1 3 10 8 4 7
Pass 3: the minimum of positions 2–5 is 4; it is swapped with 10.
Number of comparisons – 3
0 1 2 3 4 5
1 3 4 8 10 7
Pass 4: the minimum of positions 3–5 is 7; it is swapped with 8.
Number of comparisons – 2
0 1 2 3 4 5
1 3 4 7 10 8
Pass 5: the minimum of positions 4–5 is 8; it is swapped with 10.
Number of comparisons – 1
0 1 2 3 4 5
1 3 4 7 8 10
Time complexity:
Best Case Complexity - It occurs when no sorting is required, i.e. the array is already sorted.
Selection sort has no mechanism to detect that the array is sorted, so all the comparisons of
the worst case are still performed. The best-case time complexity of selection sort is O(n²).
Average Case Complexity - It occurs when the array elements are in jumbled order, i.e. neither
properly ascending nor properly descending. The average-case time complexity of selection sort
is O(n²).
Worst Case Complexity - It occurs when the array elements are required to be sorted in
reverse order. That means suppose you have to sort the array elements in ascending order, but
its elements are in descending order. The worst-case time complexity of selection sort
is O(n²).
Space complexity:
It is constant. No additional memory space is required, so O(1). It is an in-place sorting
algorithm.
Algorithm SelectionSort(A[],n)
{
    for(i=0;i<n-1;i++)
    {
        min = i;
        for(j=i+1;j<n;j++)
        {
            if(A[j] < A[min])
                min = j;
        }
        if(min != i)
            swap(A[i],A[min])
    }
}
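A Python sketch of the same algorithm (the name `selection_sort` is ours):

```python
def selection_sort(a):
    """In-place selection sort: each pass picks the minimum of the unsorted part."""
    n = len(a)
    for i in range(n - 1):
        m = i                    # index of the minimum seen so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        if m != i:
            a[i], a[m] = a[m], a[i]  # long-distance swap: this is what breaks stability
    return a
```

The swap can move an element past equal-valued elements, which is why the default implementation is not stable.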
Adaptive or not? Not adaptive, meaning it doesn’t take advantage of the fact that the list may
already be sorted or partially sorted.
Stable or not?
By default, the implementation is not stable, but it can be made stable.
4.5 Heap Sort
A heap is a complete binary tree. A binary tree is a tree in which each node can have at most
two children, and a complete binary tree is a binary tree in which all the levels except the
last are filled completely, and the last level is filled from left to right.
A
B C
D E F G
1 2 3 4 5 6 7
A B C D E F G
If an element is at index i and the index starts from 1, then:
Left child of the ith element is at 2 * i
Right child of the ith element is at 2 * i + 1
Parent of the ith element is at ⌊i / 2⌋
If the index starts from 0, then:
Left child of the ith element is at 2 * i + 1
Right child of the ith element is at 2 * i + 2
Parent of the ith element is at ⌊(i − 1) / 2⌋
For a complete binary tree (in particular a full tree of depth d with all leaves at level d),
the above array representation wastes no space; otherwise it leaves gaps in the array. The
elements are filled level by level in a complete binary tree.
The height of a complete binary tree with n elements is ⌊log₂ n⌋.
Max heap: every node is greater than or equal to all of its descendants. In the example below,
the root 50 is greater than all of its descendants.
50
30 20
15 10 8 16
Min heap: every node is less than or equal to all of its descendants. In the example below,
the root 10 is smaller than all of its descendants.
10
30 20
35 40 32 25
Let’s take the max heap and see the process of insertion and deletion.
Insertion process:
Insert 60 into the following max heap.
50
30 20
15 10 8 16
1 2 3 4 5 6 7 8
50 30 20 15 10 8 16 60
60 is placed at index 8. Its parent is at ⌊8/2⌋ = 4, element 15, which is smaller, so they are
swapped:
1 2 3 4 5 6 7 8
50 30 20 60 10 8 16 15
Now 60’s parent is at index 2, element 30, which is smaller, so they are swapped:
1 2 3 4 5 6 7 8
50 60 20 30 10 8 16 15
Finally 60’s parent is the root 50, which is smaller, so they are swapped:
1 2 3 4 5 6 7 8
60 50 20 30 10 8 16 15
The number of comparisons for insertion is equal to the height of the tree, which is log₂ n.
Deletion process:
Delete 50 from the max heap. Mostly, we want the maximum or minimum element from the
max heap and min heap respectively.
50
30 20
15 10 8 16
1 2 3 4 5 6 7
50 30 20 15 10 8 16
1 2 3 4 5 6 7
16 30 20 15 10 8 50
But it does not satisfy the property of a max heap. So, we check its children and swap it with
the larger one.
1 2 3 4 5 6 7
30 16 20 15 10 8 50
Now, again check its children which are 15 and 10 but both are smaller than it.
1 2 3 4 5 6 7
30 16 20 15 10 8 50
Next, delete 30. The deleted element’s (30) position at the root is taken by the last in-heap
element (8), but this violates the max-heap property, so element 8 must be moved down as
follows.
1 2 3 4 5 6 7
8 16 20 15 10 30 50
Now, check the children of element 8 and swap it with the larger of them. So, 8 is swapped
with 20.
1 2 3 4 5 6 7
20 16 8 15 10 30 50
Element 8 is now at index 3; its children would be at indices 6 and 7, which lie outside the
current heap of 5 elements, so 8 has no children and the max-heap property is already
restored:
1 2 3 4 5 6 7
20 16 8 15 10 30 50
In this way, we keep on deleting elements and appending them to the end of the array, and
finally we get a sorted list of elements.
The maximum time to delete an element is equal to the height of the binary tree, which is
log₂ n.
Heap Sorting:
In heap sort, basically, there are two phases involved in the sorting of elements, as
follows -
o The first step includes the creation of a heap by adjusting the elements of the array.
o After the creation of heap, remove the root element of the heap repeatedly by shifting
it to the end of the array.
To build the heap, take the elements one by one and compare each with its parent, found by the
formula ⌊i / 2⌋, where i is the index of the element.
Take 1st element – 10:
1 2 3 4 5
10 20 15 30 40
Check the parent of the newly inserted element, which should be greater than it; if not,
interchange them.
1 2 3 4 5
20 10 15 30 40
Now, 30’s parent – floor (4/2) = 2, which is 10 and less than 30 so, swap them.
1 2 3 4 5
20 30 15 10 40
In the above, element 30 is not its proper position so, it is swapped with its parent 20.
1 2 3 4 5
30 20 15 10 40
40’s parent is at ⌊5/2⌋ = 2, which is 20, less than 40, hence swap them.
1 2 3 4 5
30 40 15 10 20
Now 40’s parent is the root 30, which is also less than 40, so they are swapped again.
The following is the final representation of elements using max heap in array and tree.
1 2 3 4 5
40 30 15 10 20
40
30 15
10 20
Delete 40:
1 2 3 4 5
20 30 15 10 40
First, the root, which is the largest element, is swapped with the last element, as above.
But this does not satisfy the max-heap property, so 20 is sent down and swapped with the
larger of its children; here it is swapped with 30.
1 2 3 4 5
30 20 15 10 40
Delete 30:
The root 30 is swapped with the last in-heap element 10:
1 2 3 4 5
10 20 15 30 40
Then 10 is sifted down (swapped with its larger child 20), giving 20 10 15, with 30 and 40
already in their final places.
Delete 20:
1 2 3 4 5
15 10 20 30 40
There is no mechanism to detect whether the list is sorted or not. For any sequence of data,
heap sort does the same work, so its time complexity is the same in all three cases. Hence
heap sort is not adaptive.
Space complexity:
It is constant, so no additional memory is required – O(1). It is an in-place sorting
algorithm.
Stable or not?
It is not stable. It does not maintain the relative order of equal elements.
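The two phases can be sketched in Python (the names `heap_sort` and `sift_down` are ours); this version builds the max heap bottom-up rather than by repeated insertion as in the walkthrough, and uses the 0-based child formulas 2i+1 and 2i+2:

```python
def heap_sort(a):
    """Heap sort: build a max heap, then repeatedly move the root to the end."""
    n = len(a)

    def sift_down(i, size):
        # Restore the max-heap property below index i within a[0:size].
        while True:
            largest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < size and a[left] > a[largest]:
                largest = left
            if right < size and a[right] > a[largest]:
                largest = right
            if largest == i:
                break
            a[i], a[largest] = a[largest], a[i]
            i = largest

    # Phase 1: build the max heap, heapifying from the last internal node upward.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Phase 2: swap the root (maximum) with the last in-heap element,
    # shrink the heap, and restore the heap property.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```

Each of the n deletions costs at most the height of the tree, giving O(n log n) in every case.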
〈89,19,50,17,12,15,2,5,7,11,6,9,100〉
a. 4
b. 5
c. 2
d. 3
Answer (d)
Non-comparison-based sorting algorithms:
4.6 Radix Sort
Pass – 1:
Put each number in the specific bin based on the LSB (least significant, i.e. last) digit of
the number.
Bin 2: 62
Bin 3: 163
Bin 5: 235
Bin 6: 146, 36
Bin 7: 237
Bin 8: 348, 48
Bin 9: 259
(bins 0, 1 and 4 are empty)
Now, make each bin empty from 0 to 9 in order and list the numbers:
62 163 235 146 36 237 348 48 259
Pass – 2:
Now, put the numbers in the bins based on the 2nd LSB (tens digit) of each number. For
example, the tens digit of 62 is 6, so it is kept in bin number 6.
Bin 3: 235, 36, 237
Bin 4: 146, 348, 48
Bin 5: 259
Bin 6: 62, 163
(the remaining bins are empty)
Now, make all the bins empty and list the numbers:
235 36 237 146 348 48 259 62 163
Pass-3:
Put the numbers into bins based on the 3rd LSB (hundreds digit). The largest number has three
digits; the hundreds digit of 235 is 2, so it is kept in bin number 2. Two-digit numbers are
padded with a leading 0, so their hundreds digit is 0 and they are kept in bin number 0.
Bin 0: 36, 48, 62
Bin 1: 146, 163
Bin 2: 235, 237, 259
Bin 3: 348
Emptying the bins in order gives the sorted list:
36 48 62 146 163 235 237 259 348
Time complexity:
The number of passes equals the number of digits d in the largest number, and in each pass all
n elements are placed into bins. So the time complexity is O(d·n), where d is the number of
digits in the largest number. In general d is not constant, but when it is, the complexity is
linear – O(n).
Summary of time complexity:
Best case Average case Worst case
O(nd) O(nd) O(nd)
Space complexity:
Radix sort has a space complexity of O(n + b), where n is the number of elements and b is the
base of the number system (here b = 10 buckets). This space comes from the need to create a
bucket for each digit value and to copy the elements back to the original array after each
digit pass.
Stable or not?
It is stable.
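The bin-distribution passes above can be sketched in Python (the name `radix_sort` is ours); this version assumes non-negative integers in base 10:

```python
def radix_sort(a):
    """LSD radix sort for non-negative integers, base 10."""
    if not a:
        return a
    digits = len(str(max(a)))  # d = number of digits in the largest number
    base = 10
    for p in range(digits):    # one pass per digit, least significant first
        bins = [[] for _ in range(base)]
        for x in a:
            # p-th digit from the right; appending keeps the pass stable
            bins[(x // base ** p) % base].append(x)
        a = [x for b in bins for x in b]  # empty bins 0..9 in order
    return a
```

Each pass must be stable (elements are appended in arrival order), otherwise the work of earlier passes would be destroyed.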
4.7 Bucket Sort
Sort a large set of floating-point numbers which are in the range 0.0 to 1.0 and are
uniformly distributed across the range. How do we sort the numbers efficiently?
A simple way is to apply a comparison-based sorting algorithm. The lower bound for
Comparison based sorting algorithm (Merge Sort, Heap Sort, Quick-Sort .. etc) is Ω(n log
n), i.e., they cannot do better than n log n.
Bucket sort is a sorting algorithm that separates the elements into multiple groups called
buckets. The elements are first uniformly distributed into the buckets, each bucket is then
sorted by some other sorting algorithm, and finally the buckets are concatenated in order to
give the sorted output.
Bucket sort is commonly used -
o With floating-point values. It works on floating point number in range of 0.0 to 1.0.
o When input is distributed uniformly over a range.
The advantages of bucket sort are -
o Bucket sort reduces the no. of comparisons.
o It is asymptotically fast because of the uniform distribution of elements.
The limitations of bucket sort are -
o It may or may not be a stable sorting algorithm.
o It is not useful if we have a large array because it increases the cost.
o It is not an in-place sorting algorithm, because some extra space is required to sort the
buckets.
Put each element into the appropriate bucket by applying B[⌊n ∗ A[i]⌋]; for example,
B[⌊10 ∗ 0.79⌋] = B[7], so 0.79 is placed in bucket 7.
1. Bucket-Sort(A)
2. n=length[A]
3. for i=0 to n-1
4. make B[i] an empty list
5. for i=1 to n
6. do insert A[i] into list B[⌊n ∗ A[i]⌋] – (floor value of the index)
7. for i=0 to n-1
8. do sort list B[i] with insertion-sort
9. Concatenate lists B[0], B[1],........, B[n-1] together in order
End
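A Python sketch of this pseudocode (the name `bucket_sort` is ours), assuming the inputs lie in the range 0.0 to 1.0:

```python
def bucket_sort(a):
    """Bucket sort for floats uniformly distributed in [0.0, 1.0]."""
    n = len(a)
    if n == 0:
        return a
    buckets = [[] for _ in range(n)]
    for x in a:
        # B[floor(n * A[i])]; min() guards the edge value x == 1.0
        buckets[min(int(n * x), n - 1)].append(x)
    out = []
    for b in buckets:
        out.extend(sorted(b))  # the pseudocode uses insertion sort per bucket
    return out
```

With a uniform distribution each bucket holds O(1) elements on average, which is where the linear expected time comes from.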
Time complexity:
Best Case: - In Bucket sort, best case occurs when the elements are uniformly distributed in
the buckets. The complexity will be better if the elements are already sorted in the buckets.
Time complexity will be O (n + k), where O(n) is for making (insertion into the bucket) the
buckets, and O(k) is for concatenating the bucket elements.
Average Case: - It occurs when the array elements are in jumbled order, i.e. neither properly
ascending nor properly descending. Bucket sort runs in linear time when the elements are
uniformly distributed over the buckets. The average-case time complexity of bucket sort is
O(n + k).
Worst Case: - In bucket sort, the worst case occurs when the elements lie in a close range, so
they all have to be placed in the same bucket and some buckets hold far more elements than
others. The complexity gets worse still when those elements are in reverse order.
The worst-case time complexity of bucket sort is O(n²).
Insert – 0.74
0
1
2
3
4
5
6
7 0.74 0.79
8
9
Insert – 0.72
0
1
2
3
4
5
6
7 0.72 0.74 0.79
8
9
If all the elements land in the same bucket, as above, then that bucket is sorted using
insertion sort, which takes O(n²) time.
Worst case - O(n²).
Summary of time complexity:
Best case Average case Worst case
O(n+k)    O(n+k)       O(n²)
Space complexity:
If k is the number of buckets required, then O(k) extra space is needed to store k empty
buckets, and then we map each element to a bucket that requires O(n) extra space. So, the
overall space complexity is O (n + k).
4.8 Counting Sort
We are given the input size n and the range k; every input value must lie in that range. As
the name suggests, we count the frequency/occurrences of each input value.
For example:
2 1 2 1 3 4 1 2
Create a count array of 4 elements, one entry for each value 1 to 4. The table below shows
each count after every element of the input (2 1 2 1 3 4 1 2) is processed:
1: 0 1 1 2 2 2 3 3
2: 1 1 2 2 2 2 2 3
3: 0 0 0 0 1 1 1 1
4: 0 0 0 0 0 1 1 1
Now, traverse the list and note down its occurrence in the array. If it appears again then its
corresponding entry is incremented.
Order the list of elements based on their occurrence as follows. 1 appears thrice so, list it
three times. 3 appears once so, list it one time.
So, the sorted order is - 1 1 1 2 2 2 3 4
Time complexity:
We traverse the list n times and array (to store the frequency of each element) is k times so,
O(n+k).
Same time complexity in best, average and worst case.
But if we have the list of elements as follows:
2 23000 5 9 20
we need to take an array of size 23000, and the remaining space is wasted.
It does not work for floating points or negative values.
Summary of time complexity:
Best case Average case Worst case
O(n+k) O(n+k) O(n+k)
Space complexity:
O(n+k), where k is the range of the elements; that is, it depends on the largest number in the
list. The algorithm allocates two additional arrays: one for the counts and one for the
output.
Stable or not?
It is stable sort.
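A Python sketch of a stable counting sort (the name `counting_sort` is ours), using prefix sums over the count array:

```python
def counting_sort(a, k):
    """Stable counting sort for integers in the range 1..k."""
    count = [0] * (k + 1)
    for x in a:                  # count occurrences: O(n)
        count[x] += 1
    for v in range(1, k + 1):    # prefix sums: count[v] = number of elements <= v
        count[v] += count[v - 1]
    out = [0] * len(a)
    for x in reversed(a):        # right-to-left placement keeps equal keys stable
        count[x] -= 1
        out[count[x]] = x
    return out
```

Traversing the input from the right while decrementing the prefix sums is what makes the sort stable: equal values keep their original relative order.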
Counting sort is specifically useful when the range k of the input values is not significantly
larger than the number of elements n.
4.9 Amortized Analysis
This analysis is used when an occasional operation is very slow, but most of the operations,
which execute very frequently, are fast. We need amortized analysis for data structures such
as hash tables and disjoint sets.
In a hash table, most of the time the searching time complexity is O(1), but sometimes an
operation takes O(n). When we want to search or insert an element in a hash table, in most
cases it is a constant-time task, but when a collision occurs, O(n) operations are needed for
collision resolution.
Aggregate Method
The aggregate method is used to find the total cost. If a sequence of n operations takes total
time T(n), then the amortized cost of each operation is T(n) / n.
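As an illustration of the aggregate method (a hypothetical example, not from the text): appending n elements to a dynamic array that doubles its capacity when full occasionally costs O(n) for a single append, yet the total cost of all n appends stays below 3n, so the amortized cost per append is O(1). The name `total_append_cost` is ours:

```python
def total_append_cost(n):
    """Aggregate method: total cost of n appends to a doubling dynamic array.
    Each append costs 1, plus the number of elements copied whenever the
    array is full and its capacity must be doubled."""
    cost, size, capacity = 0, 0, 1
    for _ in range(n):
        if size == capacity:   # the occasional slow operation: copy everything
            cost += size
            capacity *= 2
        cost += 1              # the append itself
        size += 1
    return cost

# Amortized cost = T(n) / n, which stays below 3 for any n.
print(total_append_cost(1000) / 1000)
```

The copies form the series 1 + 2 + 4 + ... < 2n, so T(n) < 3n and the amortized cost is constant even though individual appends are occasionally linear.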