Understanding Algorithms and Analysis

An algorithm is a finite set of steps to solve a problem, independent of programming languages, and can be expressed in various forms. The document discusses the characteristics of algorithms, the difference between algorithms and programs, and the importance of analyzing algorithms for efficiency. It also introduces asymptotic notations for evaluating algorithm performance based on time and space complexity.


CHAPTER – 1 What is Algorithm? and Basics

1.1 Algorithm
An algorithm is a set of steps of operations to solve a problem by performing calculation, data processing, and automated reasoning tasks. An algorithm is an effective method that can be expressed within a finite amount of time and space.

An algorithm is the best way to represent the solution of a particular problem in a simple and efficient way. If we have an algorithm for a specific problem, then we can implement it in any programming language, meaning that the algorithm is independent of any particular programming language.
An algorithm is like a procedure that accomplishes some task. For example, preparing a dish of food, or an experiment performed by a student in physics: each procedure takes some input and produces output.
Algorithms can be expressed in natural languages, programming languages, pseudocode, flowcharts and control tables. Natural-language expressions are rare, as they tend to be ambiguous. Programming languages are normally used for expressing algorithms executed by a computer.

1.2 What is the difference between an Algorithm and a Program?

Algorithm | Program
It is related to designing the solution of a problem. | It is related to implementing the solution of a problem.
Whoever writes the algorithm needs domain knowledge. | A programmer writes it, and also needs some domain knowledge.
It can be written in any language – an English-like language or mathematical notation – as long as it is understandable by a programmer. | It is written in a programming language such as C, C++ or Python.
It is independent of hardware and software. | It is dependent on hardware and software; we need to select a platform such as Linux or Windows.
Analysis is performed on the algorithm once it is written. | Testing is performed once it is implemented.

1.3 Characteristics of Algorithms
The main characteristics of algorithms are as follows −
 Input – it may or may not take input (zero or more inputs).
 Output – it must produce at least one output; otherwise the algorithm is meaningless.
 Definiteness (clear/precise) – each statement must be unambiguous, i.e. it must have a single clear meaning. For example, a step such as "take the square root of a negative number" is not definite, because (over the reals) it is not possible.
 Finiteness – it must terminate at some point. It must not run continuously until we stop it, the way a web server does.
 Effectiveness – we cannot write statements unnecessarily. Each operation must do something, be simple, and be feasible, so that one can trace it out using paper and pencil. While preparing a dish, we do not do things that are not part of the preparation (like cutting vegetables that are not used in the dish).
 Language independence – an algorithm must be independent of any particular language, and it must be possible to implement it in any language.

1.4 How to write an algorithm?

An algorithm is written without committing to any particular programming language, but it is most often written in a C-like pseudocode:

Algorithm swap(a, b)
begin
  temp = a
  a = b
  b = temp
end
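As a sketch, the same swap procedure can be rendered in an actual programming language; here is a Python version (the function name and the tuple return are illustrative, since Python passes values rather than references):

```python
def swap(a, b):
    """Swap two values via a temporary variable, mirroring the pseudocode."""
    temp = a   # temp = a
    a = b      # a = b
    b = temp   # b = temp
    return a, b  # Python returns the swapped pair instead of mutating in place

print(swap(3, 7))  # -> (7, 3)
```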

1.5 What is analysis of algorithm?


In computer science, the analysis of algorithms is the process of finding the computational
complexity of algorithms—the amount of time, storage, or other resources needed to
execute them. Usually, this involves determining a function that relates the size of an
algorithm's input to the number of steps it takes (its time complexity) or the number of
storage locations it uses (its space complexity).
An algorithm is said to be efficient when this function's values are small or grow slowly
compared to a growth in the size of the input.
Different inputs of the same size may cause the algorithm to have different behavior, so best,
worst and average case descriptions might all be of practical interest.

Algorithm analysis is an important part of computational complexity theory, which provides
theoretical estimation for the required resources of an algorithm to solve a specific
computational problem. Analysis of algorithms is the determination of the amount of time and
space resources required to execute it.

1.6 Why is Analysis of Algorithms important?

 In software development, before implementing the software, we first need to prepare the design. Without a design, it is not possible to build the software.
 When we construct a house, we first prepare the drawing and then construct it; it is not based on trial and error.
The following points show the importance of analysis of algorithms.
1. To predict the behavior of an algorithm without implementing it on a specific
computer.
2. It is much more convenient to have simple measures for the efficiency of an
algorithm than to implement the algorithm and test the efficiency every time a
certain parameter in the underlying computer system changes.
3. It is impossible to predict the exact behavior of an algorithm. There are too many
influencing factors.
4. The analysis is thus only an approximation; it is not perfect.
5. More importantly, by analyzing different algorithms, we can compare them to
determine the best one for our purpose.

1.7 How to analyse an algorithm?

An algorithm is analysed based on the following parameters:
 Time: how much time does it take? (the time function)
 Space: how much memory space does it require?
 Network data transfer: how much data does it transfer?
 Power consumption: important now that devices vary so widely.
 CPU registers used: for example, when writing an algorithm for a device driver.

1.8 Time and Space analysis:

Algorithm swap(a, b)
begin
  temp = a    – 1
  a = b       – 1
  b = temp    – 1
end
Time analysis:
The above algorithm takes 3 units of time, so f(n) = 3, which is O(1).

Each statement takes one unit of time, treated as a constant.
X = a + 5*b + c*9
In the machine this is carried out using many instructions, but we do not count them; we treat the whole statement as one instruction taking one unit of time. It is like going to your friend's house: no detailed planning is needed. But if you have to go to Mars, detailed planning is needed.
Space analysis:
How much memory space is needed? In the above algorithm, S(n) = 3, which is constant: O(1).

1.9 Priori Analysis and Posteriori Analysis:

The time complexity of an algorithm can be calculated using two methods:
1. Posteriori Analysis
2. Priori Analysis

Priori Analysis | Posteriori Analysis
Works on the algorithm. | Works on the program.
Independent of language. | Dependent on language.
Hardware independent. | Hardware dependent.
Yields time and space functions. | Measures wall-clock time and bytes.
It uses asymptotic notation to calculate time and space complexity. | It does not use asymptotic notation to calculate time and space complexity.
The time complexity obtained by a priori analysis is the same for every system. | The time complexity obtained by a posteriori analysis differs from system to system.
It is cheaper. | It is costlier because it needs hardware and software.
It is done before execution of the algorithm. | It is done after execution of the algorithm.

1.10 Why is analysis of algorithms needed?

Algorithmics is defined as the study of algorithms. When we set out to solve a problem, there may be a choice of algorithms available. In that case, it is important to decide which one to use. Depending on our priorities and on the limits of the equipment available to us, we may want to choose the algorithm that takes the least time, uses the least storage, is the easiest to program, and so on.
Now, suppose we have to multiply two positive integers using only pencil and paper.
Classical multiplication algorithm:
In North America the American approach to multiplication is used, and in England the English approach is used.

À la russe approach:

Divide and conquer approach of multiplication:

• Both the multiplicand and the multiplier must have the same number of digits, and this number must be a power of 2.
• If this is not the case, it can be arranged by adding zeros on the left as necessary.

The multiplication of two four-digit numbers is then reduced to four multiplications of two-digit numbers. In turn, the multiplication of two two-digit numbers (09 × 12 = 108 in the worked example) proceeds as follows:

Multiply | Shift | Result
0 * 1 | 2 | 00
0 * 2 | 1 | 0
9 * 1 | 1 | 90
9 * 2 | 0 | 18
Total | | 108

Each of the above two-digit multiplications is done in the same way, with some shifts and additions.
Later, we will see how the divide and conquer approach can reduce the number of multiplications from four to three. With this improvement, the divide and conquer approach runs faster on a computer than any of the preceding methods.
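The à la russe rule itself is short: repeatedly halve one operand (discarding remainders) while doubling the other, and add up the doubled values on the rows where the halved operand is odd. A sketch in Python (the function name is illustrative):

```python
def russe(m, n):
    """Multiply two positive integers 'a la russe': halve m while doubling n;
    rows where m is odd contribute the current value of n to the total."""
    total = 0
    while m >= 1:
        if m % 2 == 1:   # odd row: add the doubled value
            total += n
        m //= 2          # halve, dropping the remainder
        n *= 2           # double
    return total

print(russe(45, 19))  # -> 855
```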

CHAPTER – 2 Asymptotic Notations and Analysis
2.1 Asymptotic Notations
The running time of an algorithm depends on how long it takes a computer to run the lines of
code of the algorithm—and that depends on the speed of the computer, the programming
language, and the compiler that translates the program from the programming language into
code that runs directly on the computer, among other factors.
Let's think about the running time of an algorithm more carefully. We can use a combination of two ideas. First, we need to determine how long the algorithm takes, in terms of the size of its input. In the case of linear search, the number of comparisons increases as the number of elements increases. So, we think about the running time of the algorithm as a function of the size of its input.
Second, we must focus on how fast the function grows with the input size. We call this the rate of growth of the running time. To keep things manageable, we simplify the function to extract its most important part and ignore the less important parts.
For example, suppose that an algorithm, running on an input of size n, takes 6n² + 100n + 300 machine instructions. The 6n² term becomes larger than the remaining terms 100n + 300 once n becomes large enough – n ≥ 20 in this case.

We would say that the running time of this algorithm grows as n², dropping the coefficient 6 and the remaining terms 100n + 300.
It doesn't really matter what coefficients we use; as long as the running time is an² + bn + c, for some numbers a > 0, b and c, there will always be a value of n beyond which an² is greater than bn + c, and this difference increases as n increases. For example, the same holds for 0.6n² + 1000n + 3000, where the coefficient of n² is reduced by a factor of 10 and the other two constants are increased by a factor of 10.

By dropping the less significant terms and the constant coefficients, we can focus on the important part of an algorithm's running time: its rate of growth. When we drop the constant coefficients and the less significant terms, we use asymptotic notation.
Another example of growth rate:
The order of function growth is critical in evaluating the algorithm’s performance. Assume
the running times of two algorithms A and B are f(n) and g(n), respectively.

f(n) = 2n² + 5
g(n) = 10n

Here, n represents the size of the problem, while polynomials f(n) and g(n) represent the
number of basic operations performed by algorithms A and B, respectively. Running time of
both the functions for different input size is shown in following table:

n 1 2 3 4 5 6 7
f(n) - A 7 13 23 37 55 77 103
g(n) - B 10 20 30 40 50 60 70

Algorithm A may outperform algorithm B for small input sizes; however, when input sizes become sufficiently big (in this example n = 5), f(n) always runs slower (performs more steps) than g(n). As a result, understanding the growth rate of functions is critical.
Asymptotic notations describe the function’s limiting behaviour.

What is asymptotic notation?


Asymptotic notations are a mathematical tool that can be used to determine the time or space
complexity of an algorithm without having to implement it in a programming language. This
measure is unaffected by machine-specific constants. It is a way of describing a significant
part of the cost of the algorithm.

Machine-specific constants include the machine’s hardware architecture, RAM, supported
virtual memory, processor speed, available instruction set (RISC or CISC), and so on. The
asymptotic notations examine algorithms that are not affected by any of these above factors.

When doing complexity analysis, the following assumptions are made.

Assumptions for finding the growth rate:

 The actual cost of each operation is not considered.
 A constant cost c is ignored: O(c·n²) reduces to O(n²).
 Only the leading term of a polynomial is considered: O(n³ + n) reduces to O(n³).
 Any multiplicative or divisive constant is dropped: O(2n²) and O(n²/2) both reduce to O(n²).

There are mainly three asymptotic notations:

 Big-O notation – upper bound of the function
 Omega notation – lower bound of the function
 Theta notation – tight (two-sided) bound of the function

2.2 Big-O notation:
We write f(n) = O(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or below c·g(n).
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n), for all n ≥ n0 }

For example:
f(n) = 2n + 3
2n + 3 ≤ ????
2n + 3 ≤ 10n for any n ≥ n0, with n0 = 1 and c = 10.
We could use anything of the form 7n, 1000n, and so on.
More simply, we can do the following:
2n + 3 ≤ 2n + 3n
2n + 3 ≤ 5n, for n ≥ 1
Here, f(n) = 2n + 3 and g(n) = n,
so f(n) = O(g(n)).
Can we write the following?
2n + 3 ≤ 2n² + 3n²
Yes, we can also write this, and then f(n) = O(n²):
f(n) = O(n), f(n) = O(n²)
Actually, f(n) belongs to the linear class, and all the classes to its right are upper bounds of it. All functions to its left are lower bounds. So, for an upper bound we can take any function to its right only.
We should try to write the closest function to f(n).

1 < log n < √n < n < n log n < n² < n³ < … < 2ⁿ < 3ⁿ < nⁿ
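The witnesses c = 5 and n0 = 1 found above can be sanity-checked numerically; a small sketch:

```python
def f(n):
    """f(n) = 2n + 3, the example function from the text."""
    return 2 * n + 3

# Check the Big-O witnesses c = 5, n0 = 1: 0 <= f(n) <= 5n for every tested n >= 1.
print(all(0 <= f(n) <= 5 * n for n in range(1, 10000)))  # -> True
```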

2.3 Big-Omega - Ω
Big-Omega (Ω) notation gives a lower bound for a function f(n) to within a constant factor.
We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or above c·g(n).
Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n), for all n ≥ n0 }

f(n) = 2n + 3
2n + 3 ≥ 1·n
c = 1, g(n) = n
For all n ≥ 1, f(n) = Ω(n).
We can also say f(n) = Ω(log n), or use any lower bound further to the left in the ordering of the classes.

2.4 Theta - Θ
Big-Theta (Θ) notation gives a tight bound for a function f(n) to within constant factors.
We write f(n) = Θ(g(n)) if there are positive constants n0, c1 and c2 such that, to the right of n0, f(n) always lies between c1·g(n) and c2·g(n) inclusive.
Θ(g(n)) = { f(n) : there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n), for all n ≥ n0 }

f(n) = 2n + 3
1·n ≤ 2n + 3 ≤ 5n
c1 = 1, g(n) = n, c2 = 5, for n ≥ 1

f(n) = Θ(n), but not Θ(n²) or Θ(log n).

Do not confuse worst case and best case with these notations. Any notation can be used to represent the best case or the worst case of an algorithm.

Example-1:
f(n) = 2n² + 3n + 4

2n² + 3n + 4 ≤ 2n² + 3n² + 4n²
2n² + 3n + 4 ≤ 9n² for any value of n ≥ 1, with c = 9, n0 = 1.
So, f(n) = O(n²).
Example-2:
f(n) = 2n² + 3n + 4
2n² + 3n + 4 ≥ 1·n²
f(n) = Ω(n²)

Example-3:
f(n) = 2n² + 3n + 4
1·n² ≤ 2n² + 3n + 4 ≤ 9n²

So f(n) is O(n²), Ω(n²) and Θ(n²).

Example-4:
f(n) = n² log n + n

1·n² log n ≤ n² log n + n ≤ 10·n² log n

So f(n) is O(n² log n), Ω(n² log n) and Θ(n² log n).

Example-5:
f(n) = n!
1 ≤ n! ≤ nⁿ
Upper bound: O(nⁿ); lower bound: Ω(1).
There is no tight (Θ) bound of this simple form for n!.
When it is not possible to find Θ, then Ω and O are used to give a lower and an upper bound respectively.
We cannot, for instance, claim n¹⁰ as a lower bound and n¹⁴ as an upper bound.

Example-6:
f(n) = log n!

log(1 · 1 · 1 · … · 1) ≤ log(1 · 2 · 3 · 4 · … · n) ≤ log(n · n · n · … · n) = log nⁿ = n log n
Upper bound: O(n log n); lower bound: Ω(1).

2.5 General properties of asymptotic notations

General property:
If f(n) = O(g(n)) then a·f(n) = O(g(n)), where a is some constant value.
f(n) = 2n² + 5 is O(n²)
7·(2n² + 5) = O(n²)

The same holds for Ω and Θ:

If f(n) = Θ(g(n)) then a·f(n) = Θ(g(n))

If f(n) = Ω(g(n)) then a·f(n) = Ω(g(n))

Reflexive property:
If f(n) is given, then f(n) is O(f(n)).
e.g. f(n) = n², then f(n) is O(n²).
Any function is both an upper bound and a lower bound of itself.

Transitive property:
If f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)).
Example: n is O(n²) and n² is O(n³), then n is O(n³).

Symmetric property:
This holds for Θ notation only.
If f(n) is Θ(g(n)) then g(n) is Θ(f(n)).
f(n) = n² and g(n) = n²

Transpose symmetric property:
If f(n) is O(g(n)) then g(n) is Ω(f(n)).
f(n) = n
g(n) = n²
f(n) = O(n²)
g(n) = Ω(n)

Another property:
If f(n) = O(g(n)) and f(n) = Ω(g(n)),
then f(n) = Θ(g(n)).
That is, c1·g(n) ≤ f(n) ≤ c2·g(n).

Another property:
If f(n) = O(g(n)) and d(n) = O(h(n)),
then f(n) + d(n) = O(max(g(n), h(n))).

Another property:
If f(n) = O(g(n)) and d(n) = O(h(n)),

then f(n) · d(n) = O(g(n) · h(n)).

2.6 Comparison of functions

Value of n | n² | n³
2 | 4 | 8
3 | 9 | 27

Apply log on both sides:
log n² < log n³
2 log n < 3 log n, so n³ is the bigger function.

What are logarithmic functions? (See "Relation and Function 04 | All about Logarithmic Function | Class 11 | IIT JEE" on YouTube for more details.)

The inverse functions of exponentiation functions are called logarithmic functions.


f(n) = 2³ = 8
Here 2 is called the base, 3 is called the exponent, and 8 is called the result.
The same relationship can be represented by a logarithm as follows:
log₂ 8 = 3 – the base remains the same, the result becomes the input to the log, and the exponent becomes the result.
The exponent and result swap roles between exponentiation and logarithm.

y = 2ˣ  ⇔  x = log₂ y

If we give the first function the inputs 2, 4, 16, etc., and feed each corresponding output as input to the second function, the results are exactly the inputs we gave to the first function.
logb x is read as "log x with base b".
log₂ 4 = 2
log₂ 32 = 5
log₂ (1/16) = -4, since 2⁻⁴ = 1/16
log base 1/4 of 1/2 = 1/2
log₂ 1 = 0
The output of a logarithmic function can be positive, zero or negative.
log₂ 0 = not defined
log₂ (-4) = not defined
The input of a logarithm function must not be zero or negative.
Base of a logarithm:
log₂ 4 = 2
log₁ 4 = not defined
log₋₂ 4 = not defined
The base of a log must not be one, zero or negative.

f(n) = logₐ x is defined for x > 0, with output in (-∞, +∞), where a > 0 and a ≠ 1.

e = 2.718… is Euler's number; ln x means logₑ x, where e = 1 + 1/1! + 1/2! + 1/3! + …

Important log formulas:

1. logₐ x = y, then aʸ = x
2. log (ab) = log a + log b
3. log (a/b) = log a - log b
4. log aᵇ = b·log a
5. a^(log_b c) = c^(log_b a)
6. aᵇ = n, then b = logₐ n
7. logₐ 1 = 0 for any base a

f(n) = n² log n and g(n) = n (log n)¹⁰

Apply log on both sides:
log(n² log n) vs log(n (log n)¹⁰)
log n² + log log n vs log n + log (log n)¹⁰
2 log n + log log n vs log n + 10 log log n
So, f(n) is the bigger function (2 log n dominates log n + 10 log log n).

Example:

f(n) = 3·n^(√n) and g(n) = 2^(√n · log n)

Apply formula (4) in the exponent of g(n): √n · log n = log n^(√n), so

g(n) = 2^(log₂ n^(√n)) = n^(√n)

Then f(n) = 3·n^(√n) > (n^(√n))¹ = g(n).

This is like 2n² > n²: the inequality is correct, but asymptotically the two functions are in the same class, since they differ only by a constant factor.

Example:

f(n) = n^(log n) and g(n) = 2^(√n)

Apply log on both sides:
log n^(log n) vs log 2^(√n)
log n · log n vs √n · log₂ 2, where log₂ 2 = 1
(log n)² vs √n
If you are unable to judge, apply log again:
log (log n)² vs log n^(1/2)
2 log log n < ½ log n
So g(n) = 2^(√n) is the bigger function.

Example:
f(n) = 2ⁿ and g(n) = 2²ⁿ
Apply log:
log 2ⁿ vs log 2²ⁿ
n log₂ 2 vs 2n log₂ 2
n < 2n, so g(n) is bigger.

2.7 Best case, Worst case and Average case analysis


There are three cases to analyze an algorithm: (Types of analysis of algorithms)
1. Worst Case Analysis (Mostly used)

In the worst-case analysis, we calculate the upper bound on the running time of an
algorithm. We must know the case that causes a maximum number of operations to be
executed. For Linear Search, the worst case happens when the element to be searched (x) is
not present in the array. When x is not present, the search() function compares it with all
the elements of arr[] one by one. Therefore, the worst-case time complexity of the linear
search would be O(n).

2. Best Case Analysis (Very Rarely used)

In the best-case analysis, we calculate the lower bound on the running time of an algorithm.
We must know the case that causes a minimum number of operations to be executed. In the
linear search problem, the best case occurs when x is present at the first location. The
number of operations in the best case is constant (not dependent on n). So time complexity
in the best case would be Ω(1) or O(1).

3. Average Case Analysis (Rarely used)

In average case analysis, we take all possible inputs and calculate the computing time for
all of the inputs. Sum all the calculated values and divide the sum by the total number of
inputs. We must know (or predict) the distribution of cases. For the linear search problem,
let us assume that all cases are uniformly distributed (including the case of x not being
present in the array). So, we sum all the cases and divide the sum by n which is O(n).

Case analysis is identifying instances for which the algorithm takes the longest or shortest
time to complete (i.e., takes the greatest number of steps), then formulating a growth function
using this.

Let’s understand them using two examples to find their best, worst and average case.
Examples:
1. Linear Search
2. Binary Search

Linear Search (Sequential Search):


Index: 0   1   2   3   4   5   6   7   8   9
Value: 3  90  56  43  67  12  35   9  78  52

How searching is done?


To search the element in the list, each element is inspected in the list one by one until we get
the required element.
Best case: the case when the algorithm takes minimum time. When the searched key is present at index 0 (the first position), it is the best case. The time taken is constant: B(n) = O(1).
Worst case: when the algorithm takes maximum time. That happens when the searched key is present at the last index (or is absent). The time taken is W(n) = O(n).
Average case: in the remaining cases, the key may be found at any index.
A(n) = (sum of times over all possible cases) / (number of cases)
= (1 + 2 + 3 + 4 + … + n) / n
= n(n + 1) / (2n)
= (n + 1)/2
= O(n)

Time taken is A(n) – O(n)
B(n) – 1
B(n) – O(1)
B(n) - Ω(1)
B(n) – Ɵ(1)
For any constant function, we can write any notation.
W(n) – n
W(n) – O(n)
W(n) - Ω(n)
W(n) – Ɵ(n)

Any notation can be used to show best, average and worst case.
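A sketch of the linear search just analysed, with its best and worst cases marked (names are illustrative):

```python
def linear_search(arr, key):
    """Return the index of key in arr, or -1 if absent.
    Best case: key at index 0 (one comparison).
    Worst case: key at the last index or absent (n comparisons)."""
    for i, x in enumerate(arr):
        if x == key:
            return i
    return -1

data = [3, 90, 56, 43, 67, 12, 35, 9, 78, 52]
print(linear_search(data, 3))   # -> 0   (best case)
print(linear_search(data, 52))  # -> 9   (worst case: last index)
print(linear_search(data, 99))  # -> -1  (worst case: not present)
```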
Binary Search:
For any node, elements less than it are in its left subtree, and elements greater than or equal to it are in its right subtree.

        20
       /  \
     10    30
    /  \  /  \
   5  15 25  40

The number of comparisons required is equal to the height of the binary tree, which is log₂ n, where n is the number of elements in the tree.
Best case: the element to be searched for is at the root. The time taken is constant: B(n) = O(1).
Worst case: the element to be searched for is in a leaf node, so the time taken is proportional to the height of the binary tree: W(n) = O(log₂ n).
Average case: it is difficult to find, so it is usually taken to be similar to the worst case.
Here we have a balanced tree, but if the tree for the same elements is left-skewed, its height is n, and then W(n) = O(n).
So in the worst case, the minimum time is O(log₂ n) (balanced tree) and the maximum time is O(n) (skewed tree).
This kind of range between minimum and maximum worst case is not possible for all types of algorithms.
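The balanced tree above corresponds to binary search over a sorted array, where each probe halves the remaining range; a sketch:

```python
def binary_search(arr, key):
    """Search a sorted list; each step halves the range, so at most
    about log2(n) + 1 probes are needed (the height of the tree)."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == key:
            return mid
        if key < arr[mid]:
            hi = mid - 1   # continue in the left half
        else:
            lo = mid + 1   # continue in the right half
    return -1

data = [5, 10, 15, 20, 25, 30, 40]
print(binary_search(data, 20))  # -> 3  (the root of the tree above: best case)
print(binary_search(data, 40))  # -> 6  (a leaf: worst case)
print(binary_search(data, 7))   # -> -1 (not present)
```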

2.8 Frequency count method:
A. Finding the sum of array elements
Algorithm sum(A[], n)
{
  s = 0;               – 1
  for(i=0; i<n; i++)   – n + 1 (counting the condition checks)
    s = s + A[i];      – n
  return s;            – 1
}
Each statement costs one unit of time.
Time complexity:
f(n) = 2n + 3, which is O(n)
Space complexity:
A: n, s: 1, n: 1
So, S(n) = n + 2, which is O(n)
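The algorithm above, as a runnable sketch:

```python
def array_sum(a):
    """Sum of array elements; the loop body runs n times, matching
    f(n) = 2n + 3 unit-time statements in the frequency-count model: O(n)."""
    s = 0
    for x in a:
        s = s + x
    return s

print(array_sum([1, 2, 3, 4, 5]))  # -> 15
```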

B. Finding the sum of two square matrices

Algorithm mat_add(A[], B[], n)
{
  for(i=0; i<n; i++)             – n + 1
    for(j=0; j<n; j++)           – n · (n + 1)
      C[i,j] = A[i,j] + B[i,j];  – n · n
}
Time complexity:
f(n) = 2n² + 2n + 1, which is O(n²)
Space complexity:
A, B and C are n × n matrices, and i, j and n are scalar variables.
S(n) = 3n² + 3, which is O(n²)

C. Finding the multiplication of two matrices
Algorithm multiply(A, B, C, n)
{
  for(i=0; i<n; i++)        – n + 1
    for(j=0; j<n; j++)      – n · (n + 1)
    {
      C[i,j] = 0;           – n · n
      for(k=0; k<n; k++)    – n · n · (n + 1)
        C[i,j] = C[i,j] + A[i,k] * B[k,j];  – n · n · n
    }
}
Time complexity:
f(n) = 2n³ + 3n² + 2n + 1, which is O(n³)
Space complexity:
A, B, C are n × n, and i, j, k, n are scalar variables.
S(n) = O(n²)
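A runnable sketch of the classical n × n multiplication analysed above, with the accumulation C[i,j] = C[i,j] + A[i,k]·B[k,j] written out:

```python
def mat_multiply(A, B):
    """Classical n x n matrix multiplication: three nested loops,
    so roughly n^3 scalar multiplications -> O(n^3) time."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_multiply(A, B))  # -> [[19, 22], [43, 50]]
```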

Time complexity examples:

 Example-1
for(i=0; i<n; i+=2)    – n/2
  statement;

f(n) = n/2, which is O(n)

 Example-2
for(i=0; i<n; i+=20)   – n/20
  statement;
f(n) = n/20, which is O(n)

 Example-3
for(i=0; i<n; i++)
  for(j=0; j<i; j++)
    statement;

i | j values | No. of times
0 | – | 0
1 | 0 | 1
2 | 0, 1 | 2
3 | 0, 1, 2 | 3
… | … | …
n | 0 … n-1 | n

f(n) = 1 + 2 + 3 + … + n
= n(n + 1)/2
= O(n²)

 Example-4
p = 0
for(i=1; p<=n; i++)
  p = p + i;

i | p
1 | 1
2 | 1 + 2
3 | 1 + 2 + 3
4 | 1 + 2 + 3 + 4
5 | 1 + 2 + 3 + 4 + 5
After k iterations: p = 1 + 2 + 3 + 4 + 5 + … + k

The loop stops when
p > n                 (1)
p is as follows:
p = 1 + 2 + 3 + … + k
p = k(k + 1)/2
p ≈ k²                (2)   // we estimate it as k²
Put the value of p into inequality (1):
k² > n

k > √n, which is O(√n)

 Example-5
for(i=1; i<n; i*=2)
  statements;
i takes the values 2, 2², 2³, 2⁴, …, 2ᵏ.
The loop stops when
i ≥ n          (1)
i = 2ᵏ          (2)
So 2ᵏ ≥ n; taking 2ᵏ = n gives
k = log₂ n, which is O(log₂ n)
When the loop variable gets multiplied each iteration, the loop takes log n time.
log may give a fractional value, so we take the ceiling of the result:
n = 8 gives log 8 = 3; n = 10 gives log 10 ≈ 3.32, and ceil(3.32) = 4, so we take ⌈log n⌉ iterations.

 Example-6
for(i=n; i>=1; i/=2)
  statements;
i takes the values n/2, n/2², n/2³, …, n/2ᵏ.
It stops when i < 1. Equating them,
i = n/2ᵏ
n/2ᵏ = 1
n = 2ᵏ
k = log₂ n, which is O(log₂ n)
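The iteration count of the halving loop in Example-6 can be verified directly; with integer halving it tracks ⌊log₂ n⌋ + 1 (the helper name is illustrative):

```python
import math

def halving_steps(n):
    """Count iterations of `for(i=n; i>=1; i/=2)` with integer halving."""
    count = 0
    i = n
    while i >= 1:
        count += 1
        i //= 2
    return count

for n in (8, 1024, 1_000_000):
    # the measured count equals floor(log2(n)) + 1
    print(n, halving_steps(n), math.floor(math.log2(n)) + 1)
```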

 Example-7
for(i=0; i*i < n; i++)
  statements;
It terminates when i·i ≥ n, i.e. i² = n,

so i = √n, which is O(√n)

 Example-8
for(i=0; i<n; i+=1)
  statement;           – n
for(i=0; i<n; i+=1)
  statement;           – n
f(n) = 2n, which is O(n)

 Example-9
p = 0;
for(i=1; i<n; i=i*2)   – log n
  p++;
for(j=1; j<p; j*=2)    – log p
  statement;
Here, the first loop iterates log n times, so the value of p becomes log n.
The second loop iterates log p times; substituting p = log n gives
log p = log log n
so the statement executes log log n times, which is O(log log n).

 Example-10
for(i=0; i<n; i++)       – n
  for(j=1; j<n; j*=2)    – n · log₂ n
    statement;           – n · log₂ n

f(n) = n log n, which is O(n log₂ n)

Summary of the time complexity of some common looping statements:

for(i=0; i<n; i++)       O(n)
for(i=0; i<n; i+=2)      n/2 → O(n)
for(i=n; i>1; i--)       O(n)
for(i=1; i<n; i=i*2)     O(log₂ n)
for(i=1; i<n; i=i*3)     O(log₃ n)
for(i=n; i>1; i=i/2)     O(log₂ n)

2.9 Analysis of if and while loops

i = 1;           – 1
while(i < n)     – n + 1
{
  statement;     – n
  i++;           – n
}
f(n) = 3n + 2, which is O(n)
while and for loops are analysed in almost the same way, but we usually prefer the for statement.

i = 1;
k = 1;
while(k < n)
{
  statement;
  k = k + i;
  i++;
}

i | k
1 | 1
2 | 1 + 1
3 | 1 + 1 + 2
4 | 1 + 1 + 2 + 3
5 | 1 + 1 + 2 + 3 + 4
After m iterations: k = 1 + (1 + 2 + 3 + … + m)

The loop stops when k ≥ n.

If the loop repeats m times, then k ≈ m(m + 1)/2.
m(m + 1)/2 = n
m² ≈ n

m = √n, which is O(√n)

An if statement inside a loop:

while(m != n)
{
  if(m > n)
    m = m - n;
  else
    n = n - m;
}
Take m = 16 and n = 2: the loop repeatedly subtracts 2, running (16 - 2)/2 = 7 times.
Take m = 4 and n = 2: the loop runs only once.
The number of iterations can grow linearly with the values involved, so the maximum time is O(n); the minimum time is O(1).
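The while/if pattern above is in fact the subtraction form of Euclid's gcd algorithm; a sketch that also counts the iterations (names illustrative):

```python
def gcd_by_subtraction(m, n):
    """Repeated subtraction, as in the loop above. For m = 16, n = 2 the
    loop runs 7 times; for m = n it runs 0 times, so the running time
    ranges between O(1) and a count linear in the values."""
    steps = 0
    while m != n:
        if m > n:
            m = m - n
        else:
            n = n - m
        steps += 1
    return m, steps  # gcd and the number of loop iterations

print(gcd_by_subtraction(16, 2))  # -> (2, 7)
print(gcd_by_subtraction(4, 2))   # -> (2, 1)
```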

Algorithm Test(n)
{
  if(n < 5)
    statement;
  else
  {
    for(i=0; i<n; i++)
      statements;
  }
}

Statements are executed based on the condition, so the best case is O(1) and the worst case is O(n).

2.10 Types/classes of time/space functions

Function | Class
O(1) | Constant – e.g. f(n) = 3, f(n) = 1000, or any fixed value
O(log n) | Logarithmic
O(n) | Linear – e.g. f(n) = n + 3, f(n) = 4000n + 2n + 4, f(n) = n/2000 + 3
O(n²) | Quadratic
O(n³) | Cubic
O(2ⁿ), O(3ⁿ) or O(nⁿ) | Exponential

Classes of functions in order of their growth:

1 < log n < √n < n < n log n < n² < n³ < … < 2ⁿ < 3ⁿ < nⁿ

Let's see in the following example how exponential functions grow much faster than all other functions as n grows.

log₂ n | n | n² | 2ⁿ
0 | 1 | 1 | 2
1 | 2 | 4 | 4
1.6 | 3 | 9 | 8
2 | 4 | 16 | 16
3 | 8 | 64 | 256
… | … | … | …
4.3 | 20 | 400 | 1,048,576

After some value of n, 2ⁿ is much greater than all the others, so any fixed power of n is eventually less than 2ⁿ.

CHAPTER – 3 Recurrence Relation
3.1 Recurrence Relation

A recurrence relation is an equation which defines a sequence based on some rule. It helps in finding the subsequent term (next term) from the preceding term (previous term). If we know the previous terms in a given series, then we can easily determine the next term.
There are different methods to solve the recurrence relations:
1. Substitution method
2. Iterative method
3. Recursion tree method
4. Master theorem
5. Change of variable
6. Characteristic equation method
Substitution Method: We make a guess for the solution and then we use mathematical
induction to prove the guess is correct or incorrect.
Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time
taken by every level of tree. Finally, we sum up the work done at all levels. To draw the
recurrence tree, we start from the given recurrence and keep drawing till we find a pattern
among levels. The pattern is typically an arithmetic or geometric series.
A simple recursive function:
void Test(int n)
{
  if(n > 0)
  {
    printf("%d", n);
    Test(n - 1);
  }
}
The major work done is the printing of the value of n.

Recursive/recursion tree approach
Calling Test(3):

Test(3) prints 3, then calls Test(2)
Test(2) prints 2, then calls Test(1)
Test(1) prints 1, then calls Test(0)
Test(0) makes no further call

Each call with n > 0 prints the value of n, and the printf statement executes once per call. The function is called 4 times in total for Test(3); in general, Test(n) is called n + 1 times.
Time complexity = O(n)
How to find the recurrence relation for the above function:
void Test(int n)      – T(n)
{
  if(n > 0)           – 1  // we may count this or not; it makes no difference
  {
    printf("%d", n);  – 1
    Test(n - 1);      – T(n - 1)
  }
}

T(n) = 1              if n = 0
T(n) = T(n - 1) + 1   if n > 0

Backward substitution method:
T(n) = T(n-1) + 1                  (1)
Now, we find T(n-1) as follows:
T(n-1) = T(n-2) + 1
Put T(n-1) into equation (1):
T(n) = [T(n-2) + 1] + 1
T(n) = T(n-2) + 2                  (2)
Now, find T(n-2) as follows:
T(n-2) = T(n-2-1) + 1 = T(n-3) + 1
Put T(n-2) into equation (2):
T(n) = [T(n-3) + 1] + 2
T(n) = T(n-3) + 3                  (3)
If we repeat this k times:
T(n) = T(n-k) + k                  (4)
Assume n - k = 0, so k = n:
T(n) = T(n-n) + n                  (5)
T(n) = T(0) + n                    (6)
T(n) = 1 + n                       (7)
T(n) = O(n)

Another function:
void Test(int n)        – T(n)
{
  if(n > 0)             – 1
  {
    for(i=0; i<n; i++)  – n + 1
    {
      printf("%d", n);  – n
    }
    Test(n - 1);        – T(n - 1)
  }
}

Recursion tree approach:

Test(n) does n units of work, then calls Test(n-1)
Test(n-1) does n-1 units, then calls Test(n-2)
Test(n-2) does n-2 units, …
…
Test(1) does 1 unit, then calls Test(0)
Test(0) does 0 units

0 + 1 + 2 + … + (n-1) + n = n(n + 1)/2 = O(n²)

Substitution method
T(n) = T(n-1) + 1 + (n+1) + n
T(n) = T(n-1) + 2n + 2
Take the asymptotic order of 2n + 2, which is O(n):
T(n) = T(n-1) + n

T(n) = 1             if n = 0
T(n) = T(n-1) + n    if n > 0

T(n) = T(n-1) + n                        (1)

Now we find T(n-1) as follows,
T(n-1) = T(n-2) + n - 1
Put T(n-1) in equation (1)
T(n) = [ T(n-2) + n - 1 ] + n
T(n) = T(n-2) + (n-1) + n                (2)
Now we find T(n-2) as follows,
T(n-2) = T(n-3) + n - 2

Put T(n-2) in equation (2)
T(n) = [ T(n-3) + n - 2 ] + (n-1) + n
T(n) = T(n-3) + (n-2) + (n-1) + n                            (3)
Repeat this k times.
T(n) = T(n-k) + (n-(k-1)) + (n-(k-2)) + ... + (n-1) + n      (4)
Assume that n - k = 0 so, n = k
T(n) = T(n-n) + (n-(n-1)) + (n-(n-2)) + ... + (n-1) + n      (5)
T(n) = T(0) + 1 + 2 + ... + (n-1) + n                        (6)
T(n) = 1 + n(n+1)/2
Which is O(n^2)

Another function:
void Test(int n)                       - T(n)
{
    if(n > 0)                          - 1
    {
        for(i = 1; i < n; i = i*2)     // i is repeated in powers of two
        {
            printf("%d", n);           - log2 n
        }
        Test(n - 1);                   - T(n-1)
    }
}
T(n) = T(n-1) + log n

T(n) = 1                 if n = 0
T(n) = T(n-1) + log n    if n > 0

Recursion tree method:

T(n) contributes log n, then calls T(n-1)
T(n-1) contributes log (n-1)
T(n-2) contributes log (n-2)
...
T(1) contributes log 1
T(0) is the base case

= log n + log (n-1) + log (n-2) + ... + log 2 + log 1
= log ( n * (n-1) * (n-2) * ... * 2 * 1 )
= log n!
log n! has no simple closed form, but n! <= n^n gives an upper bound:
<= log n^n
= O( n log n )

Substitution method:

T(n) = T(n-1) + log n                                        (1)
= [ T(n-2) + log (n-1) ] + log n
= T(n-2) + log (n-1) + log n                                 (2)
= [ T(n-3) + log (n-2) ] + log (n-1) + log n
= T(n-3) + log (n-2) + log (n-1) + log n                     (3)
Repeat this k times...
= T(n-k) + log (n-(k-1)) + log (n-(k-2)) + ... + log n
Assume n - k = 0 so, n = k
= T(0) + log 1 + log 2 + log 3 + ... + log (n-1) + log n
= 1 + log 1 + log 2 + log 3 + ... + log (n-1) + log n
= log n! (plus a constant)
<= log n^n
= O(n log n)

General observation for recurrence relations of the above type:

Recurrence                   Order
T(n) = T(n-1) + 1            O(n)
T(n) = T(n-1) + n            O(n^2)
T(n) = T(n-1) + log n        O(n log n)
T(n) = T(n-1) + n^2          O(n^3)
T(n) = T(n-2) + 1            O(n)
T(n) = T(n-100) + n          O(n^2)
(the same holds for T(n-200), T(n-1000), etc.)
T(n) = 2 T(n-1) + 1          ?? not the same as above

Important note:
In the above table, the order is obtained by multiplying the non-recursive term by n.

Types of recurrences:

1. Homogeneous
2. Inhomogeneous
3. Logarithmic

Homogeneous recurrence:
If the recurrence is equated to zero and contains only terms of the sequence itself, then it is
called a homogeneous recurrence.
For example:
t_n + a t_(n-1) + b t_(n-2) + ... + c t_(n-k) = 0

Inhomogeneous recurrence:
If the sum of the linear terms of the equation is not equal to zero, then it is called an
inhomogeneous recurrence.

The general format is as follows:
a_0 t_n + a_1 t_(n-1) + a_2 t_(n-2) + ... + a_k t_(n-k) = b^n p(n)

Example:
t_(n+3) + 6 t_(n+2) + 8 t_(n+1) + 5 t_n = 2^n

Logarithmic recurrence:
Divide and conquer techniques use logarithmic recurrences.
The general format is:
T(n) = a T(n/b) + f(n)
where a and b are constants, a >= 1 and b > 1, and f(n) is some function.
Example:
T(n) = 2 T(n/2) + n^2

Iteration method:
The recurrence is expanded as a summation of terms; evaluating the summation provides the
result. This method is sometimes described as "keep substituting back until you see what is
going on."

Substitution method:
We start the method by a guess of the solution and then prove it by induction.
Example 1:
Suppose a recurrence relation is given as follows,
T(n) = T(n/2) + 1
Solution:
Given recurrence is
T(n) = T(n/2) + 1                                                   (1)

We guess that T(n) = O(log n)                                       (2)

We have to prove that T(n) <= c * log n, where c is a constant.     (3)
Assume the bound holds for n/2, i.e.

T(n/2) <= c * log (n/2)

Now, substitute the above value of T(n/2) in equation (1)

T(n) <= ( c * log (n/2) ) + 1

T(n) <= c * ( log2 n - log2 2 ) + 1

T(n) <= c * ( log2 n - 1 ) + 1

T(n) <= c log2 n - c + 1

For c = 1
T(n) <= log2 n - 1 + 1

T(n) <= log2 n

Hence it is proved, T(n) = O(log n)

Example 2:

Suppose a recurrence relation is given as follows,

T(n) = 2 T(n/2) + n

Then show that it is asymptotically bounded by Ω(n log n).
Solution:
Given recurrence is
T(n) = 2 T(n/2) + n                                                 (1)

We assume that T(n) = Ω(n log n)                                    (2)

We have to prove that T(n) >= c * n log n, where c is a constant.   (3)
Note: We can also assume T(n) = O(n log n); in that case 0 <= T(n) <= c * n log n.
Assume the bound holds for n/2, i.e.

T(n/2) >= c * (n/2) log (n/2)

Now, substitute the above value of T(n/2) in equation (1)

T(n) >= 2 ( c * (n/2) log (n/2) ) + n

T(n) >= c * n * log (n/2) + n

T(n) >= c * n * ( log2 n - log2 2 ) + n

T(n) >= c * n * ( log2 n - 1 ) + n

T(n) >= c n log2 n - c n + n

For c = 1
T(n) >= n log2 n - n + n

T(n) >= n log2 n

Hence it is proved, T(n) = Ω(n log n)

Now consider a function that calls itself twice, and find the time it takes.
void Test(int n)            T(n)
{
    if(n > 0)
    {
        printf("%d", n);    1
        Test(n - 1);        T(n-1)
        Test(n - 1);        T(n-1)
    }
}

T(n) = 2 T(n-1) + 1
Recurrence relation is
T(n) = 1               if n = 0
T(n) = 2 T(n-1) + 1    if n > 0

Recursion tree approach:

Level 0: T(n)                   - 1 call
Level 1: two T(n-1) calls       - 2 calls
Level 2: four T(n-2) calls      - 4 calls
Level 3: eight T(n-3) calls     - 8 calls
...

1 + 2 + 2^2 + 2^3 + 2^4 + ... + 2^k = 2^(k+1) - 1   (GP series)

Assume n - k = 0 so, n = k
= 2^(n+1) - 1
= O(2^n)
For reference, the GP series:
a + ar + ar^2 + ar^3 + ... + ar^k = a(r^(k+1) - 1) / (r - 1)
In the above series, a = 1 and r = 2.

Substitution method:
T(n) = 2 T(n-1) + 1                                          (1)
Now find T(n-1) as follows,
T(n-1) = 2 T(n-2) + 1
Put T(n-1) in equation (1)
T(n) = 2 [ 2 T(n-2) + 1 ] + 1
     = 2^2 T(n-2) + 2 + 1                                    (2)
Now find T(n-2) as follows,
T(n-2) = 2 T(n-3) + 1
Put T(n-2) in equation (2)
= 2^2 [ 2 T(n-3) + 1 ] + 2 + 1
= 2^3 T(n-3) + 2^2 + 2 + 1                                   (3)
If we repeat k times,
= 2^k T(n-k) + 2^(k-1) + 2^(k-2) + ... + 2^2 + 2 + 1         (4)
Assume n - k = 0 so, n = k
= 2^n T(0) + 2^(n-1) + 2^(n-2) + ... + 2^2 + 2 + 1           (5)
= 2^n + 2^(n-1) + ... + 2 + 1
= 2^(n+1) - 1
= O(2^n)

3.2 Master Theorem

The Master Theorem is a quick way to obtain an algorithm's time complexity from its
recurrence relation, and it can be applied to decreasing as well as dividing functions.
Recursive functions call themselves in their body, and computing their time complexity by
other commonly used methods can get complex. The master method is the most convenient way
to solve such recurrence relations.

We can apply Master's Theorem only for:

1. Dividing Functions
2. Decreasing Functions

Master theorem for decreasing functions:


Recurrence                   Order
T(n) = T(n-1) + 1            O(n)
T(n) = T(n-1) + n            O(n^2)
T(n) = T(n-1) + log n        O(n log n)
T(n) = T(n-1) + n^2          O(n^3)
T(n) = T(n-2) + 1            O(n)
T(n) = T(n-100) + n          O(n^2)
T(n) = 2 T(n-1) + 1          O(2^n)
T(n) = 3 T(n-1) + 1          O(3^n)
T(n) = 3 T(n-1) + n          O(n 3^n)

T(n) = a T(n-b) + f(n)

where:
a is the number of subproblems in the recursion, a >= 1
n-b is the size of the subproblems (assuming all subproblems are of the same size), b > 0
f(n) is the cost of the work done outside the recursion, with f(n) = O(n^k) and k >= 0
If the recurrence relation is in the above form, there are three cases in the master
theorem to determine the asymptotic bound:
 Case 1: if a = 1, T(n) = O(n^(k+1))
 Case 2: if a > 1, T(n) = O(a^(n/b) * n^k)
 Case 3: if a < 1, T(n) = O(n^k)

Examples of different cases:

Case 1: if a = 1, T(n) = O(n^(k+1))
Example 1)
T(n) = T(n-1) + n^2
In this problem, a = 1, b = 1 and f(n) = O(n^k) = n^2, giving us k = 2.
Since a = 1, case 1 applies.
T(n) = O(n^(k+1)) = O(n^(2+1)) = O(n^3)
Therefore, T(n) = O(n^3) is the tight bound for this equation.

Case 2: if a > 1, T(n) = O(a^(n/b) * n^k)

Example 1)
T(n) = 2 T(n-1) + n
In this problem, a = 2, b = 1 and f(n) = O(n^k) = n, giving us k = 1.
Since a > 1, case 2 applies.
T(n) = O(a^(n/b) * n^k) = O(2^(n/1) * n^1) = O(n 2^n)
Therefore, T(n) = O(n 2^n) is the tight bound for this equation.

Case 3: if a < 1, T(n) = O(n^k)

Example 1)
T(n) = n^4
In this problem, there is no recursive term (a = 0) and f(n) = O(n^k) = n^4, giving us k = 4.
Since a < 1, case 3 applies.
T(n) = O(n^k) = O(n^4)
Therefore, T(n) = O(n^4) is the tight bound for this equation.

More examples:
Example 1:
T(n) = T(n-1) + n(n-1)
a = 1, b = 1, k = 2
Therefore, T(n) = O(n^(k+1)) = O(n^3) is the tight bound for this equation.

Example 2:
T(n) = 3 T(n-1)
a = 3, b = 1, k = 0
T(n) = O(a^(n/b) * n^k) = O(3^(n/1) * n^0) = O(3^n)
Therefore, T(n) = O(3^n) is the tight bound for this equation.
Example 3:
T(n) = 2 T(n-1) - 1
This recurrence can't be solved using the above method,
since f(n) = -1 is negative and therefore not of the form f(n) = O(n^k).
Example 4:

Fibonacci series:
T(n) = n                       if n = 0 or n = 1
T(n) = T(n-1) + T(n-2)         if n >= 2
Let T(n-1) ≈ T(n-2)
T(n) = 2 T(n-1) + c    // c is the constant cost of the addition operation
where f(n) = O(1)
∴ k = 0, a = 2, b = 1
T(n) = O(n^0 * 2^(n/1))
     = O(2^n)
= O(2n)
Example 5:
Factorial of a number:
factorial(n):
    if n is 0                      - 1
        return 1
    return n * factorial(n-1)      - 1 + 1 + T(n-1)

1 unit cost for the comparison, 1 for the multiplication and 1 for the subtraction.

T(n) = T(n-1) + 3
Solve this equation using the substitution, recursion tree and master theorem methods.
T(n) = O(n).

3.3 Recurrence of dividing functions

void Test(int n)            - T(n)
{
    if(n > 0)               - 1   // we may count this or not; it makes no difference
    {
        printf("%d", n);    - 1
        Test(n / 2);        - T(n/2)
    }
}

The argument can decrease in different ways: n-1, n-2, n/2, square root of n.

There is a different recurrence for each. For the above function:
T(n) = T(n/2) + 1
T(n) = 1             if n = 1
T(n) = T(n/2) + 1    if n > 1

Note: In a dividing function, we use n = 1 as the base case, not n = 0 as we took in a
decreasing function.
Recursion tree method:
T(n) does 1 unit of work, then calls T(n/2)
T(n/2) does 1 unit, then calls T(n/2^2)
T(n/2^2) does 1 unit, then calls T(n/2^3)
...
until T(n/2^k)
n/2^k = 1
n = 2^k
k = O(log2 n)      ( a^b = c  ->  b = log_a c )

Using substitution method
T(n) = T(n/2) + 1                                (1)
Now find T(n/2) as follows:
T(n/2) = T(n/2^2) + 1
Replace T(n/2) in (1)
= T(n/2^2) + 1 + 1
= T(n/2^2) + 2                                   (2)
Now find T(n/2^2) as follows:
T(n/2^2) = T(n/2^3) + 1
Replace T(n/2^2) in equation (2)
= T(n/2^3) + 3                                   (3)
Repeat this k number of times
= T(n/2^k) + k                                   (4)
Assume n/2^k = 1 so, k = log n
= T(1) + log n                                   (5)
= 1 + log n
= O(log n)

Another dividing function:

void Test(int n)                 - T(n)
{
    if(n > 0)                    - 1
    {
        for(i = 0; i < n; i++)
        {
            printf("%d", n);     - n
        }
        Test(n / 2);             - T(n/2)
    }
}

T(n) = 1             if n = 1
T(n) = T(n/2) + n    if n > 1

Recursion tree method:

T(n) does n units of work, then calls T(n/2)
T(n/2) does n/2 units, then calls T(n/2^2)
T(n/2^2) does n/2^2 units, then calls T(n/2^3)
...
T(n/2^(k-1)) does n/2^(k-1) units, then calls T(n/2^k)
Add all the above, so we get the following:
= n + n/2 + n/2^2 + n/2^3 + ... + n/2^(k-1)
= n ( 1 + 1/2 + 1/2^2 + 1/2^3 + ... + 1/2^(k-1) )
The geometric part 1/2 + 1/4 + 1/8 + ... sums to at most 1 (divide a circle into 1/2, 1/4,
1/8, 1/16, ...; together the pieces approximate 1).

= n * (1 + 1)
= O(n)
Substitution method:
T(n) = T(n/2) + n                                            (1)
Now find T(n/2) as follows,
T(n/2) = T(n/2^2) + n/2
Replace the above value of T(n/2) in (1)
= T(n/2^2) + n/2 + n                                         (2)
Now find T(n/2^2) as follows,
T(n/2^2) = T(n/2^3) + n/2^2
Replace the above value of T(n/2^2) in (2)
= T(n/2^3) + n/2^2 + n/2 + n
Repeat this k number of times
= T(n/2^k) + n/2^(k-1) + ... + n/2 + n
Assume n/2^k = 1
= T(1) + n ( 1/2^(k-1) + 1/2^(k-2) + ... + 1/2^2 + 1/2 + 1 )
As before, the geometric part sums to at most 1, so
= 1 + n (1 + 1)
= 1 + 2n
= O(n)

Another dividing function:

void Test(int n)                 T(n)
{
    if(n > 0)                    1
    {
        for(i = 0; i < n; i++)
        {
            printf("%d", n);     n
        }
        Test(n / 2);             T(n/2)
        Test(n / 2);             T(n/2)
    }
}

T(n) = 1               if n = 1
T(n) = 2 T(n/2) + n    if n > 1

Recursion tree method:
Level 0: T(n)                   - n units of work
Level 1: two T(n/2) calls       - n units in total
Level 2: four T(n/2^2) calls    - n units in total
Level 3: eight T(n/2^3) calls   - n units in total
...
Level k: T(n/2^k) calls
So, there are k levels and each level does n units of work (e.g. n/4 + n/4 + n/4 + n/4 = n):
= n k                                            (1)
Assume
n/2^k = 1
k = log n
so, replace k in (1)
= O(n log n)

Substitution method:

T(n) = 2 T(n/2) + n                              (1)

Now find T(n/2) as follows,
T(n/2) = 2 T(n/2^2) + n/2
Replace T(n/2) in equation (1)
= 2 [ 2 T(n/2^2) + n/2 ] + n
= 2^2 T(n/2^2) + n + n                           (2)
Now find T(n/2^2) as follows,
T(n/2^2) = 2 T(n/2^3) + n/2^2
Replace T(n/2^2) in equation (2)
= 2^2 [ 2 T(n/2^3) + n/2^2 ] + n + n
= 2^3 T(n/2^3) + 3n                              (3)
Repeat this k times
= 2^k T(n/2^k) + k n                             (4)
Assume that n/2^k = 1, so
k = log n and 2^k = n
= n T(1) + n log n
= n + n log n
We keep only the dominating term as per asymptotic notation:
= O(n log n)

3.4 Master Theorem for Dividing Functions

T(n) = a T(n/b) + f(n)

where a >= 1, b > 1 and f(n) = O(n^k log^p n).
Compare log_b a with k:
Case 1:
If log_b a > k then T(n) = O(n^(log_b a))
Case 2:
If log_b a = k
  p > -1 then O(n^k log^(p+1) n)
  p = -1 then O(n^k loglog n)
  p < -1 then O(n^k)
Case 3:
If log_b a < k
  p >= 0 then O(n^k log^p n)
  p < 0 then O(n^k)
Examples:
Case 1:
T(n) = 2 T(n/2) + 1
a = 2, b = 2, k = 0
log_2 2 = 1
1 > 0 so, O(n^(log_b a)) = O(n)

T(n) = 4 T(n/2) + n
a = 4, b = 2, k = 1
log_2 4 = 2
2 > 1 so, O(n^(log_b a)) = O(n^2)

T(n) = 8 T(n/2) + n
a = 8, b = 2, k = 1
log_2 8 = 3
3 > 1 so, O(n^(log_b a)) = O(n^3)

T(n) = 9 T(n/3) + n
a = 9, b = 3, k = 1
log_3 9 = 2
2 > 1 so, O(n^(log_b a)) = O(n^2)

Case 2:
T(n) = 2 T(n/2) + n
a = 2, b = 2, k = 1, p = 0
log_2 2 = 1
1 = 1 so, O(n^k log^(p+1) n) = O(n log n)

T(n) = 4 T(n/2) + n^2 log n

a = 4, b = 2, k = 2, p = 1
log_2 4 = 2 = k so, O(n^k log^(p+1) n) = O(n^2 log^2 n)

T(n) = 4 T(n/2) + n^2 log^3 n

Here p = 3, so simply multiply f(n) by log n: O(n^2 log^4 n)

T(n) = 8 T(n/2) + n^3
a = 8, b = 2, k = 3, p = 0
log_2 8 = 3 = 3 so, O(n^3 log n)

Case 2.2:
T(n) = 2 T(n/2) + n / log n
a = 2, b = 2, k = 1, p = -1
1 = 1 and p = -1 so, O(n^k loglog n) = O(n loglog n)

Case 2.3:
T(n) = 2 T(n/2) + n / log^2 n
a = 2, b = 2, k = 1, p = -2
1 = 1 and p = -2 so, O(n^k) = O(n)

Case 3.2:
T(n) = T(n/2) + n^2
a = 1, b = 2, k = 2, p = 0
log_2 1 = 0 < 2 and p >= 0 so, O(n^k log^p n) = O(n^2)

Case 3.1:
T(n) = 2 T(n/2) + n^2 log^2 n
a = 2, b = 2, k = 2 and p = 2
log_2 2 = 1 < 2 so, O(n^2 log^2 n)

Recurrence relation for a root function
void Test(int n)            T(n)
{
    if(n > 2)               // base case is n = 2, not n = 0
    {
        printf("%d", n);    1
        Test(sqrt(n));      T(sqrt(n))
    }
}

T(n) = T(sqrt(n)) + 1
T(n) = 1                 if n = 2
T(n) = T(sqrt(n)) + 1    if n > 2

For a root function, the value of n should be 2 or more.


Using substitution method

T(n) = T(sqrt(n)) + 1                            (1)

= T(n^(1/2)) + 1
= T(n^(1/2^2)) + 1 + 1                           (2)
= T(n^(1/2^3)) + 1 + 1 + 1                       (3)
Repeat k times...
= T(n^(1/2^k)) + k                               (4)
Assume that n is a power of 2, so n = 2^m
T(2^m) = T(2^(m/2^k)) + k                        (5)
Now, T(2^(m/2^k)) is reduced to the base case T(2) when 2^(m/2^k) = 2^1, i.e.
m/2^k = 1
m = 2^k
k = log m
We want the answer in terms of n, and n = 2^m
so, m = log n
k = loglog n
Replace the value of k in (5)
= 1 + loglog n
= O(loglog n)

CHAPTER – 4 Sorting
Algorithms

4.1 Category of sorting algorithms
Comparison based sorting –
In comparison-based sorting, elements of an array are compared with each other to find the
sorted array.
Examples: Bubble sort, insertion sort, selection sort, quick sort, heap sort and merge sort.
Non-comparison-based sorting –

In non-comparison-based sorting, elements of array are not compared with each other to
find the sorted array.
Examples: Radix sort, counting sort and bucket sort
In-place/Out-of-place technique –

A sorting technique is in-place if it does not use any extra memory to sort the array.
Among the comparison-based techniques discussed, only merge sort is out-of-place, as it
requires an extra array to merge the sorted subarrays.
Among the non-comparison-based techniques discussed, all are out-of-place.
Counting sort uses a counting array and bucket sort uses a hash table for sorting the array.

Online/Offline technique –

A sorting technique is considered online if it can accept new data while the procedure is
ongoing i.e. complete data is not required to start the sorting operation.
Among the comparison-based techniques discussed, only Insertion Sort qualifies for this
because of the underlying algorithm it uses i.e. it processes the array (not just elements)
from left to right and if new elements are added to the right, it doesn’t impact the ongoing
operation.

Stable/Unstable technique –

A sorting technique is stable if it does not change the order of elements with the same
value.
Out of comparison-based techniques, bubble sort, insertion sort and merge sort are stable
techniques. Selection sort is unstable as it may change the order of elements with the same
value. For example, consider the array 4, 4, 1, 3 and sort them using selection sort.
In the first iteration, the minimum element found is 1 and it is swapped with the 4 at the 0th
position. Therefore, the order of that 4 with respect to the 4 at the 1st position changes.
Similarly, quick sort and heap sort are also unstable.
Out of non-comparison-based techniques, Counting sort and Bucket sort are stable sorting
techniques whereas radix sort stability depends on the underlying algorithm used for
sorting.

4.2 Bubble Sort
It compares the first and second elements of the array; if the first element is greater than
the second, it swaps them, then compares the second and third elements, and so on.
The idea is that neighboring elements are compared with each other and swapped.
Why is it called bubble sort? If we throw a stone into water, the stone, being heavier, sinks,
while the lighter bubbles rise up. Similarly, the largest element is placed in its proper
position at the end of each iteration/pass.
8 5 7 3 2

1st Pass:
8 5 5 5 5
5 8 7 7 7
7 7 8 3 3
3 3 3 8 2
2 2 2 2 8

At each pass, the largest element is placed in its proper position. Element 8 is placed in the
last position.
Number of comparisons - 4
2nd Pass:
5 5 5 5
7 7 3 3
3 3 7 2
2 2 2 7
8 8 8 8

Element 7 from unsorted elements, is placed in its position.


Number of comparisons - 3
3rd Pass:
5 3 3
3 5 2
2 2 5
7 7 7

8 8 8

Element 5 is placed in its proper position.


Number of comparisons - 2
4th Pass:
3 2
2 3
5 5
7 7
8 8

Element 2 is placed in its proper position.


Number of comparisons – 1
Time complexity:
Worst case:
Number of comparisons are = 1 + 2 + 3 + 4 … + n-1 = n (n-1) /2 = O(n2) which is maximum
time taken.
Note: We mostly see the number of comparisons for finding time complexity.
Algorithm bubble(A, n)
{
    for(i = 0; i < n-1; i++)
        for(j = 0; j < n-1-i; j++)
        {
            if(A[j] > A[j+1])
                swap element A[j] with A[j+1]
        }
}

Best case:
If the list of elements is already sorted. Then there is not any swap in the first pass, which
shows that elements are sorted and that can be done by the following algorithm using flag
variable.
Algorithm bubble(A, n)
{
    for(i = 0; i < n-1; i++)
    {
        flag = 0;
        for(j = 0; j < n-1-i; j++)
        {
            if(A[j] > A[j+1])
            {
                swap element A[j] with A[j+1]
                flag = 1;
            }
        }
        if(flag == 0)
            break;
    }
}
The number of comparisons in the first pass is n-1 = O(n), which is the minimum time taken
by bubble sort.
With the flag variable, we can say that bubble sort is adaptive.
Summary of time complexity:
Best Average Worst
Without flag O(n2) O(n2) O(n2)
With flag O(n) O(n2) O(n2)

Space complexity:
It is constant. No additional memory space is required. O(1). It is in place sort algorithm.
Stable or not?
Whether it is stable or not? A stable sorting algorithm maintains the relative order of the
items with equal sort keys.
We can check it by sorting the following elements:
8 8 3 5 4
In the above list, two elements have the same value, so their relative order in the sorted
list remains the same: the second 8 stays after the first 8.

4.3 Insertion Sort

Insertion sort works similar to the sorting of playing cards in hands. It is assumed that the
first card is already sorted in the card game, and then we select an unsorted card. If the
selected unsorted card is greater than the first card, it will be placed at the right side;
otherwise, it will be placed at the left side. Similarly, all unsorted cards are taken and put in
their exact place.

We are given the list of sorted elements as follows:


2 6 10 15 20 25 30

Now, we have to insert element 12 in the above list.


Start comparing elements from the last index; if an element is larger than the newly inserted
one, shift that element to the right.
2 6 10 15 20 25 30
12
2 6 10 15 20 25 30
12
2 6 10 15 20 25 30
12
2 6 10 15 20 25 30
12
Now, element 10 is smaller than 12 so, put element 12 in the empty slot.
2 6 10 12 15 20 25 30

Consider the following list of elements which is unsorted.


8 5 7 3 2

List with one element is already sorted as follows. Dark background shows sorted portions of
the array.
8 5 7 3 2

Insert 5: pass 1
8 5 7 3 2

5 8 7 3 2

Number of comparisons – 1

Insert 7: pass 2
5 8 7 3 2

5 7 8 3 2

Number of comparisons – 2
Insert 3: pass 3
5 7 8 3 2

3 5 7 8 2

Number of comparisons - 3
Insert 2: pass 4
3 5 7 8 2

3 5 7 8
2

3 5 7 8
2

3 5 7 8
2

2 3 5 7 8

Number of comparisons – 4
Number of passes required = 4 means n – 1 pass

Time complexity:

Best Case: - It occurs when there is no sorting required, i.e. the array is already sorted and
outer loop is iterated only n times and it does not enter into inner loop for swapping of the
elements. The best-case time complexity of insertion sort is O(n).

Average Case: - It occurs when the array elements are in jumbled order that is not properly in
ascending and not properly in descending order. The average case time complexity of
insertion sort is O(n2) which as bad as worst case.

Worst Case: - It occurs when the array elements are required to be sorted in reverse order.
That means suppose you have to sort the array elements in ascending order, but its elements
are in descending order. The worst-case time complexity of insertion sort is O(n2).

Number of comparisons - 1 + 2 + 3 + 4 + … + n – 1 = n (n-1) / 2 = O(n2)


Number of swapping are same as comparisons.
Summary of time complexity:
Best case Average case Worst case
O(n) O(n2) O(n2)

Space complexity:
It is constant. No additional memory space required so, O(1). It is in place sort algorithm
Algorithm InsertionSort(A[], n)
{
    for(i = 1; i < n; i++)
    {
        x = A[i];
        j = i - 1;
        while(j >= 0 && A[j] > x)
        {
            A[j+1] = A[j];
            j--;
        }
        A[j+1] = x;
    }
}

Stable or not?
Insertion sort is also stable sort. It maintains the relative position of elements having same
value.

Insertion sort has various advantages such as -

o Simple implementation
o Efficient for small data sets
o Adaptive, i.e., it is appropriate for data sets that are already substantially sorted.

(GATE CSE 2003)

The usual Θ(n2) implementation of Insertion Sort to sort an array uses linear search to
identify the position where an element is to be inserted into the already sorted part of the
array. If instead, we use binary search to identify the position, the worst-case running time
will __________.

a. Remain Θ(n2)
b. Become Θ(n(logn)2)
c. Become Θ(nlogn)
d. Become Θ(n)

Answer (a)

4.4 Selection Sort


In selection sort, the smallest value among the unsorted elements of the array is selected in
every pass and inserted to its appropriate position into the array. It is also the simplest
algorithm. In this algorithm, the array is divided into two parts, first is sorted part, and
another one is the unsorted part. Initially, the sorted part of the array is empty, and unsorted
part is the given array. Sorted part is placed at the left, while the unsorted part is placed at the
right.
The array is divided into sorted and unsorted array as the process progresses. In each
pass/iteration, one minimum (ascending order)/maximum (descending order) element is
selected and placed in its proper position.
0 1 2 3 4 5
7 4 10 8 3 1

Pass 1: (start searching the element for position 0). Wherever we get minimum element that is
swapped with element with position 0.
0 1 2 3 4 5
1 4 10 8 3 7

Number 1 is swapped with 7.


Number of comparisons – 5
Pass 2:
0 1 2 3 4 5
1 4 10 8 3 7

Number 4 and 3 are swapped.


Number of comparisons – 4
0 1 2 3 4 5
1 3 10 8 4 7

Pass 3:
0 1 2 3 4 5
1 3 10 8 4 7

Number of comparisons – 3
0 1 2 3 4 5
1 3 10 8 4 7

Numbers 10 and 4 are swapped.


0 1 2 3 4 5
1 3 4 8 10 7

Pass 4:
0 1 2 3 4 5
1 3 4 8 10 7

Number of comparisons – 2
0 1 2 3 4 5
1 3 4 8 10 7

Number 8 and 7 are swapped.

0 1 2 3 4 5
1 3 4 7 10 8

Pass 5:
0 1 2 3 4 5
1 3 4 7 10 8

Number of comparisons – 1
0 1 2 3 4 5
1 3 4 7 10 8

Number 10 and 8 are swapped.

0 1 2 3 4 5
1 3 4 7 8 10

Number of passes are 5 which is n – 1.

Time complexity:
Best Case Complexity - It occurs when there is no sorting required, i.e. the array is already
sorted. As per the mechanism of this sorting, it is difficult to decide that array is sorted or not
and we need to do all the comparisons as we do in worst case scenario. The best-case time
complexity of selection sort is O(n2).

Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of
selection sort is O(n2).

Worst Case Complexity - It occurs when the array elements are required to be sorted in
reverse order. That means suppose you have to sort the array elements in ascending order, but

its elements are in descending order. The worst-case time complexity of selection sort
is O(n2).

Number of comparisons = 1 + 2 + 3 + 4 + 5 + … + (n-1) + n = n (n-1) /2 = O(n2)


Summary of time complexity:
Best case Average case Worst case
O(n2) O(n2) O(n2)

Space complexity:
It is constant. No additional memory space is required so, O(1). It is in place sort algorithm.

Algorithm SelectionSort(A[], n)
{
    for(i = 0; i < n-1; i++)
    {
        min = i;
        for(j = i+1; j < n; j++)
        {
            if(A[j] < A[min])
                min = j;
        }
        if(min != i)
            swap(A[i], A[min])
    }
}
Adaptive or not? Not adaptive, meaning it doesn’t take advantage of the fact that the list may
already be sorted or partially sorted.

Stable or not?
By default, implementation is not stable, but it can be made stable.

4.5 Heap Sort

A heap is a complete binary tree; a binary tree is a tree in which a node can have at most
two children. A complete binary tree is a binary tree in which all the levels except the
last are filled completely, and the last level is filled from the left.

First, we see the representation of a binary tree using an array. Consider a tree with root
A, children B and C, and grandchildren D, E, F, G:

Index:   1 2 3 4 5 6 7
Element: A B C D E F G

If the index starts from 1:

If any element is at index i, then its
left child is at 2*i
right child is at 2*i + 1
parent is at floor(i/2)

If the index starts from 0:
If any element is at index i, then its
left child is at 2*i + 1
right child is at 2*i + 2
parent is at floor((i-1)/2)
This array representation is most worthwhile for a complete binary tree (elements are filled
level by level, left to right); otherwise it leaves gaps in the array.
The height of a complete binary tree is log2 n.
Max heap: every node is greater than or equal to all of its descendants. In the example
below, the root 50 is the largest element.

Max heap example (level by level): 50; 30, 20; 15, 10, 8, 16

Min heap: every node is less than or equal to all of its descendants. In the example below,
the root 10 is the smallest element.

Min heap example (level by level): 10; 30, 20; 35, 40, 32, 25

Let's take a max heap and see the process of insertion and deletion.
Insertion process:

Start from the following max heap (level by level: 50; 30, 20; 15, 10, 8, 16):

Index:   1  2  3  4  5  6  7
Element: 50 30 20 15 10 8  16

Now, insert element 60 in the above heap.


Element 60 is appended at index 8, but as per the property of a max heap, it cannot stay at
the 8th position.
So, it moves upward: its parent is at floor(8/2) = 4, so 60 is compared with the element at
index 4, which is 15, and interchanged. It moves up again: floor(4/2) = 2, so 60 is compared
with 30 at index 2 and interchanged. Finally, it is compared with its parent 50 and
interchanged.
1 2 3 4 5 6 7 8
50 30 20 15 10 8 16 60

1 2 3 4 5 6 7 8

50 30 20 60 10 8 16 15

1 2 3 4 5 6 7 8
50 30 20 60 10 8 16 15

1 2 3 4 5 6 7 8
50 60 20 30 10 8 16 15

1 2 3 4 5 6 7 8
50 60 20 30 10 8 16 15

1 2 3 4 5 6 7 8
60 50 20 30 10 8 16 15

The number of comparisons for insertion is equal to the height of the tree, which is log2 n.

Deletion process:
Delete 50 from the max heap. Mostly, we want the maximum or minimum element from the
max heap and min heap respectively.

Max heap (level by level): 50; 30, 20; 15, 10, 8, 16

1 2 3 4 5 6 7
50 30 20 15 10 8 16

1 2 3 4 5 6 7
16 30 20 15 10 8 50

But it does not satisfy the property of a max heap. So, we check its children and swap it
with the larger one.
1 2 3 4 5 6 7
30 16 20 15 10 8 50

Now, again check its children which are 15 and 10 but both are smaller than it.
1 2 3 4 5 6 7
30 16 20 15 10 8 50

No need to swap now.


Now, delete element 30 so,
1 2 3 4 5 6 7
8 16 20 15 10 30 50

The deleted element's (30) position is taken by the last element (8), but this violates the
max-heap property, so element 8 must be sifted down as follows.
1 2 3 4 5 6 7
8 16 20 15 10 30 50

Now, check the children of element 8 and swap it with the larger one. So, 8 is
swapped with 20.
1 2 3 4 5 6 7
20 16 8 15 10 30 50

Element 8 is now at index 3, whose children (indices 6 and 7) lie outside the active heap,
so the max-heap property is restored and no further swap is needed.

In this way, we keep on deleting the elements and adding it in the list and finally we will get
sorted list of elements.

Maximum time to delete the element is equal to height of binary tree which is log2 n.
Heap Sorting:

In heap sort, basically, there are two phases involved in the sorting of elements. By using the
heap sort algorithm, they are as follows -

o The first step includes the creation of a heap by adjusting the elements of the array.

o After the creation of heap, remove the root element of the heap repeatedly by shifting
it to the end of the array.

Creating heap for heap sort:


1 2 3 4 5
10 20 15 30 40

Take element one by one and compare with its parent by formula floor (i / 2) where i is the
index of the element.
Take 1st element – 10:
1 2 3 4 5
10 20 15 30 40

So, it is max heap of one element, no need to do anything.


Take 2nd element – 20:
1 2 3 4 5
10 20 15 30 40

Check the parent of newly inserted element which should be greater than it. If not then
interchange.
1 2 3 4 5
20 10 15 30 40

Take 3rd element – 15:


No need to do anything because 15’s parent is 20 which is greater than it.
Take 4th element – 30:
1 2 3 4 5
20 10 15 30 40

Now, 30’s parent – floor (4/2) = 2, which is 10 and less than 30 so, swap them.

1 2 3 4 5
20 30 15 10 40

In the above, element 30 is not its proper position so, it is swapped with its parent 20.
1 2 3 4 5
30 20 15 10 40

Take 5th element – 40:


1 2 3 4 5
30 20 15 10 40

40's parent is at floor(5/2) = 2, which holds 30; 30 is less than 40, hence swap them.
1 2 3 4 5
30 40 15 10 20

Still, 40’s parent is smaller than it so, swap it.


1 2 3 4 5
40 30 15 10 20

The following is the final representation of elements using max heap in array and tree.
1 2 3 4 5
40 30 15 10 20

Max heap (level by level): 40; 30, 15; 10, 20

Deletion for heap sort:


Now, start deleting elements from the above max heap.
1 2 3 4 5
40 30 15 10 20

Delete 40:
1 2 3 4 5
20 30 15 10 40

The first element, which is the largest, is swapped with the last element as above.
But this does not satisfy the property of a max heap, so 20 is sent down and swapped with
the larger of its children, i.e. with 30.
1 2 3 4 5
30 20 15 10 40

Delete 30:
1 2 3 4 5
10 20 15 30 40

Now, element 10 is swapped with its larger child, 20.


1 2 3 4 5
20 10 15 30 40

Delete 20:
1 2 3 4 5
15 10 20 30 40

No need to swap, because no child is greater than it.


Delete 15:
1 2 3 4 5
10 15 20 30 40

Now, last element is already in its proper position.


Insertion process (creation of the max heap): one element can be inserted into a max heap
with log n comparisons, so n elements can be inserted with n log n comparisons.
Deletion process: deleting one element while maintaining the max-heap property takes log n
comparisons, so for n elements we need n log n comparisons.
So, total time is
max heap creation time + deletion time = n log n + n log n = 2 n log2 n = O(n log2 n)
Summary of time complexity:
Best case Average case Worst case
O(n log n) O(n log n) O(n log n)

There is no mechanism to detect whether the list is sorted or not. For any sequence of data,
heap sort does the same work. So, the time complexity of heap sort is the same in all three
cases. Hence heap sort is not adaptive.

Space complexity:
It is constant: no additional memory is required – O(1). Heap sort is an in-place sorting algorithm.

Stable or not?
It is not stable: it does not maintain the relative order of equal elements.

(GATE CSE 2015 Set 3)

 Consider the following array of elements.

〈89,19,50,17,12,15,2,5,7,11,6,9,100〉

The minimum number of interchanges needed to convert it into a max-heap is _______.

a. 4
b. 5
c. 2
d. 3

Answer (d)
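One way to sanity-check answer (d) is to count the swaps made by a bottom-up build-max-heap, which happens to achieve the minimum of 3 interchanges on this input (a sketch; in general bottom-up heapify only gives an upper bound on the minimum):

```python
def build_max_heap_swaps(a):
    """Build a max heap bottom-up (0-indexed) and count the swaps performed."""
    n, swaps = len(a), 0
    for i in range(n // 2 - 1, -1, -1):    # heapify internal nodes, bottom-up
        j = i
        while True:
            largest, left, right = j, 2 * j + 1, 2 * j + 2
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == j:
                break
            a[j], a[largest] = a[largest], a[j]
            swaps += 1
            j = largest
    return swaps

print(build_max_heap_swaps([89, 19, 50, 17, 12, 15, 2, 5, 7, 11, 6, 9, 100]))  # 3
```

The three interchanges are 15↔100, 50↔100, and 89↔100: the element 100 bubbles up from the last position to the root.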

Non-comparison-based sorting algorithms:

4.6 Radix Sort


It sorts the elements digit by digit, according to their radix. The radix of decimal numbers is 10 (digits 0 to 9); the radix of the English alphabet is 26 (letters a to z).
The process of radix sorting is similar to sorting students' names into alphabetical order. In this case there are 26 bins, one for each letter of the English alphabet. In the first pass, the names are grouped in ascending order of the first letter of their names. After that, in the second pass, they are grouped in ascending order of the second letter, and the process continues until the list is sorted.
Take the following list of elements:

237 146 259 348 163 235 48 36 62

Here we have decimal numbers, so we take ten bins (0 to 9).

Pass – 1:
Put each number in the bin matching its least significant digit (the last digit).

Bin 2: 62
Bin 3: 163
Bin 5: 235
Bin 6: 146, 36
Bin 7: 237
Bin 8: 348, 48
Bin 9: 259
(bins 0, 1 and 4 are empty)

Now, empty each bin in order from 0 to 9 and list the numbers:
62 163 235 146 36 237 348 48 259

Pass – 2:
Now, put the numbers in the bins based on their second least significant digit (the tens digit). For example, the tens digit of 62 is 6, so it is kept in bin number 6.
Bin 3: 235, 36, 237
Bin 4: 146, 348, 48
Bin 5: 259
Bin 6: 62, 163
(all other bins are empty)

Now, make all the bins empty and list the numbers.

235 36 237 146 348 48 259 62 163

Pass – 3:
Put the numbers into bins based on their third least significant digit (the hundreds digit). The largest number has three digits; the hundreds digit of 235 is 2, so it is kept in bin number 2. For two-digit numbers a leading 0 is assumed, so their hundreds digit is zero and they are kept in bin number 0.
Bin 0: 36, 48, 62
Bin 1: 146, 163
Bin 2: 235, 237, 259
Bin 3: 348
(all other bins are empty)

Now, make bins empty and list the numbers.


36 48 62 146 163 235 237 259 348
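The three passes above can be sketched as an LSD radix sort in Python (a sketch, using the nine elements traced through the passes):

```python
def radix_sort(nums):
    """LSD radix sort for non-negative base-10 integers."""
    if not nums:
        return nums
    d = len(str(max(nums)))                  # passes = digits in the largest number
    for p in range(d):
        bins = [[] for _ in range(10)]       # ten bins, one per digit 0-9
        for x in nums:
            bins[(x // 10 ** p) % 10].append(x)  # p-th digit from the right
        nums = [x for b in bins for x in b]  # empty bins 0..9 in order
    return nums

print(radix_sort([237, 146, 259, 348, 163, 235, 48, 36, 62]))
# [36, 48, 62, 146, 163, 235, 237, 259, 348]
```

Because each pass appends elements to bins in their current order, earlier passes are never undone, which is exactly the stability property radix sort relies on.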

Time complexity:
The number of passes equals the number of digits d in the largest number, and each pass distributes all n elements into the bins.
So, the time complexity is O(d·n), where d is the number of digits in the largest number. In general d is not constant, but when it is, the complexity becomes linear – O(n).
Summary of time complexity:
Best case Average case Worst case
O(nd) O(nd) O(nd)

Space complexity:
Radix sort has a space complexity of O(n + b), where n is the number of elements and b is the base (radix) of the number system. This space comes from the need to create a bucket for each digit value and to copy the elements back to the original array after each digit pass.
Stable or not?
It is stable.

4.7 Bucket Sort

Sort a large set of floating-point numbers which are in range from 0.0 to 1.0 and are
uniformly distributed across the range. How do we sort the numbers efficiently?
A simple way is to apply a comparison-based sorting algorithm. The lower bound for comparison-based sorting algorithms (merge sort, heap sort, quick sort, etc.) is Ω(n log n), i.e., they cannot do better than n log n.
Bucket sort is a sorting algorithm that separates the elements into multiple groups called buckets. The elements are first uniformly distributed into the buckets, and then each bucket is sorted by another sorting algorithm. After that, the elements are gathered back in sorted order.
Bucket sort is commonly used -
o With floating-point values: it works on floating-point numbers in the range 0.0 to 1.0.
o When input is distributed uniformly over a range.
The advantages of bucket sort are -
o Bucket sort reduces the no. of comparisons.
o It is asymptotically fast because of the uniform distribution of elements.
The limitations of bucket sort are -
o It may or may not be a stable sorting algorithm.
o It is not useful if we have a large array because it increases the cost.
o It is not an in-place sorting algorithm, because some extra space is required to sort the
buckets.

Let’s take the example:


0.79, 0.13, 0.64, 0.39, 0.20, 0.89, 0.53, 0.42, 0.06, 0.94
Bucket 0: 0.06
Bucket 1: 0.13
Bucket 2: 0.20
Bucket 3: 0.39
Bucket 4: 0.42
Bucket 5: 0.53
Bucket 6: 0.64
Bucket 7: 0.79
Bucket 8: 0.89
Bucket 9: 0.94

Put each element into the appropriate bucket using B[⌊n ∗ A[i]⌋]; for example, ⌊10 ∗ 0.79⌋ = 7, so 0.79 is placed in bucket 7.

Algorithm Bucket Sort(A[], n)

1. n = length[A]
2. Let B[0....n-1] be a new array
3. for i = 0 to n-1
4.     make B[i] an empty list
5. for i = 1 to n
6.     insert A[i] into list B[⌊n ∗ A[i]⌋] – (floor value of the index)
7. for i = 0 to n-1
8.     sort list B[i] with insertion sort
9. Concatenate lists B[0], B[1], ......, B[n-1] together in order
End
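The pseudocode above can be sketched in Python (a sketch; Python's built-in sort stands in for the per-bucket insertion sort):

```python
def bucket_sort(a):
    """Bucket sort for floats uniformly distributed in [0, 1)."""
    n = len(a)
    buckets = [[] for _ in range(n)]         # n empty buckets
    for x in a:
        # int() truncates toward zero, matching floor(n * x) for non-negative x.
        buckets[int(n * x)].append(x)
    result = []
    for b in buckets:
        result.extend(sorted(b))             # sort each bucket, then concatenate
    return result

values = [0.79, 0.13, 0.64, 0.39, 0.20, 0.89, 0.53, 0.42, 0.06, 0.94]
print(bucket_sort(values))
```

With these ten uniformly spread values, each element lands in its own bucket, so every per-bucket sort is trivial and the whole run is linear.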

Time complexity:
Best Case: - In bucket sort, the best case occurs when the elements are uniformly distributed among the buckets; it is even better if the elements inside each bucket are already sorted.
The time complexity is O(n + k), where O(n) is for distributing the elements into the buckets and O(k) is for concatenating the bucket contents.

The best-case time complexity of bucket sort is O(n + k).

Average Case: - It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. Bucket sort still runs in linear time when the elements are uniformly distributed. The average case time complexity of bucket sort is O(n + k).

Worst Case: - In bucket sort, the worst case occurs when the elements are close together in value, so they all have to be placed in the same bucket and some buckets hold far more elements than others.
The complexity gets even worse when the elements are in reverse order.
The worst-case time complexity of bucket sort is O(n²).

If the sequence is like – 0.79, 0.74, 0.72, 0.71 …


Insert – 0.79: bucket 7 holds 0.79
Insert – 0.74: bucket 7 holds 0.74, 0.79
Insert – 0.72: bucket 7 holds 0.72, 0.74, 0.79
(all other buckets, 0 to 6 and 8 to 9, stay empty)

If all the elements are stored in the same bucket, as above, then that bucket is sorted using insertion sort, which takes O(n²) time.
Worst case - O(n²).
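The pile-up described above can be checked directly with a small sketch:

```python
# Close-range values: floor(10 * x) is 7 for every one of them.
values = [0.79, 0.74, 0.72, 0.71]
buckets = [[] for _ in range(10)]
for x in values:
    buckets[int(10 * x)].append(x)

print(buckets[7])        # [0.79, 0.74, 0.72, 0.71]
```

All four elements land in bucket 7, so sorting degenerates to insertion sort on a single bucket, which is where the O(n²) worst case comes from.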
Summary of time complexity:
Best case Average case Worst case
O(n+k) O(n+k) O(n2)

Space complexity:
If k is the number of buckets required, then O(k) extra space is needed to store k empty
buckets, and then we map each element to a bucket that requires O(n) extra space. So, the
overall space complexity is O (n + k).

4.8 Counting Sort


Like bucket sort, it is a non-comparison-based algorithm. It sorts data based on the frequency of the elements, and it is an integer sorting algorithm.

We are given the input size n and the range k; every input value must lie in that range.
As the name suggests, we count the frequency (number of occurrences) of each input value.

For example:
2 1 2 1 3 4 1 2
Create a count array with 4 entries, one for each value in the range 1 to 4. The table below shows the running counts after each input element is processed:

Input:    2 1 2 1 3 4 1 2
count[1]: 0 1 1 2 2 2 3 3
count[2]: 1 1 2 2 2 2 2 3
count[3]: 0 0 0 0 1 1 1 1
count[4]: 0 0 0 0 0 1 1 1

Traverse the list and record each element's occurrence in the count array; whenever an element appears again, its corresponding entry is incremented.
Then list the elements in order of value, each repeated according to its count: 1 appears three times, so list it three times; 3 appears once, so list it one time.
So, the sorted order is - 1 1 1 2 2 2 3 4
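The counting procedure can be sketched as a stable counting sort in Python (a CLRS-style sketch; the prefix-sum pass and the reverse scan are what make it stable):

```python
def counting_sort(a, k):
    """Stable counting sort for integers in the range 1..k."""
    count = [0] * (k + 1)
    for x in a:
        count[x] += 1                  # frequency of each value
    for v in range(1, k + 1):
        count[v] += count[v - 1]       # prefix sums: final positions in the output
    out = [0] * len(a)
    for x in reversed(a):              # reverse scan keeps equal elements in order
        count[x] -= 1
        out[count[x]] = x
    return out

print(counting_sort([2, 1, 2, 1, 3, 4, 1, 2], 4))  # [1, 1, 1, 2, 2, 2, 3, 4]
```

The extra arrays here (counts plus output) are exactly the O(n + k) space discussed below.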
Time complexity:
We traverse the list of n elements once and the count array of k entries once, so the complexity is O(n + k).
The time complexity is the same in the best, average, and worst cases.
But if we have the following list of elements:
2 23000 5 9 20
we would need a count array of size 23000, and most of that space is wasted.
Counting sort also does not work for floating-point or negative values.
Summary of time complexity:
Best case Average case Worst case
O(n+k) O(n+k) O(n+k)

Space complexity:
O(n + k), where k is the range of the elements, so the space depends on the largest number in the list.
The algorithm allocates two additional arrays: one for the counts and one for the output.
Stable or not?
It is stable sort.
Counting sort is specifically useful in following scenarios:

 Linear complexity is needed


 Smaller integers that occur multiple times
 Can be used as a subroutine in Radix sort

4.9 Amortized Analysis
This analysis is used when an occasional operation is very slow, but most of the operations, which execute very frequently, are fast. Data structures that need amortized analysis include hash tables, disjoint sets, etc.

In a hash table, searching usually has time complexity O(1), but occasionally an operation costs O(n). Searching for or inserting an element in a hash table is a constant-time task in most cases, but when a collision occurs, collision resolution needs O(n) operations.

Aggregate Method

The aggregate method is used to find the total cost of a whole sequence of operations and then average it over the operations.

For a sequence of n operations with total cost T(n), the amortized cost per operation is –

amortized cost = T(n) / n
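As a hypothetical illustration of the aggregate method (not from the text), consider n append operations on a dynamic array that doubles its capacity when full. The total cost T(n), counting one unit per element written or copied, stays below 3n, so the amortized cost per append is T(n)/n = O(1) even though an individual resize costs O(n):

```python
def total_append_cost(n):
    """Total cost (element writes plus copies) of n appends to a doubling array."""
    cost, capacity, length = 0, 1, 0
    for _ in range(n):
        if length == capacity:   # array full: double it, copying every element
            cost += length
            capacity *= 2
        cost += 1                # write the new element
        length += 1
    return cost

for n in (10, 100, 1000):
    print(n, total_append_cost(n), total_append_cost(n) / n)
```

The copy costs form the geometric series 1 + 2 + 4 + ... < 2n, so T(n) < n + 2n = 3n, which is the aggregate bound.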
