Unit - 1 Material
Course Objectives:
• Introduces the notations for analysis of the performance of algorithms and the data structure of disjoint sets.
• Describes major algorithmic techniques (divide-and-conquer, backtracking, dynamic programming, greedy, and branch-and-bound methods) and mentions problems for which each technique is appropriate.
• Describes how to evaluate and compare different algorithms using worst-, average-, and best-case analysis.
• Explains the difference between tractable and intractable problems, and introduces the classes P, NP and NP-complete.
UNIT - I
Introduction: Algorithm, Performance Analysis-Space complexity, Time complexity, Asymptotic Notations- Big oh notation,
Omega notation, Theta notation and Little oh notation.
Divide and conquer: General method, applications-Binary search, Quick sort, Merge sort, Strassen’s matrix multiplication.
UNIT - II
Disjoint Sets: Disjoint set operations, union and find algorithms, Priority Queue- Heaps, Heapsort
Backtracking: General method, applications, n-queen's problem, sum of subsets problem, graph coloring, Hamiltonian
cycles.
UNIT – III
Dynamic Programming: General method, applications- Optimal binary search tree, 0/1 knapsack problem, All pairs shortest
path problem, Traveling salesperson problem, Reliability design.
UNIT - IV
Greedy method: General method, applications-Job sequencing with deadlines, knapsack problem, Minimum cost spanning
trees, Single source shortest path problem.
Basic Traversal and Search Techniques: Techniques for Binary Trees, Techniques for Graphs, Connected components,
Biconnected components.
UNIT - V
Branch and Bound: General method, applications - Traveling salesperson problem, 0/1 knapsack problem - LC Branch and
Bound solution, FIFO Branch and Bound solution. NP-Hard and
NP-Complete problems: Basic concepts, non-deterministic algorithms, NP-Hard and NP-Complete classes, Cook’s theorem.
TEXT BOOK:
1. Fundamentals of Computer Algorithms, Ellis Horowitz, Sartaj Sahni and S. Rajasekaran, Universities Press, 1998.
REFERENCE BOOKS:
1. The Design and Analysis of Computer Algorithms, A. V. Aho, J. E. Hopcroft and J. D. Ullman, Pearson Education.
2. Introduction to Algorithms, second edition, T. H. Cormen, C.E. Leiserson, R. L. Rivest, and C. Stein, PHI Pvt. Ltd./ Pearson
Education.
3. Algorithm Design: Foundations, Analysis and Internet Examples, M.T. Goodrich and R. Tamassia, John Wiley and sons.
UNIT I:
Introduction: Algorithm, Pseudocode for expressing algorithms, Performance Analysis-
Space complexity, Time complexity, Asymptotic Notation- Big oh notation, Omega notation,
Theta notation and Little oh notation, Probabilistic analysis, Amortized analysis.
Divide and conquer: General method, applications-Binary search, Quick sort, Merge sort,
Strassen’s matrix multiplication.
What is an Algorithm?
"A set of steps to accomplish or complete a task that is described precisely enough that a computer can run it."
Described precisely: informal instructions, such as a recipe for making tea, are very difficult for a machine to follow, since it cannot know how much water or milk to add, and so on.
These algorithms run on computers or computational devices. For example, the GPS in our smartphones and Google Hangouts run on algorithms.
GPS uses a shortest-path algorithm. Online shopping uses cryptography, which uses the RSA algorithm.
• Algorithm Definition 1: An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. Every algorithm must satisfy the criteria of input, output, definiteness, finiteness and effectiveness.
• Algorithms that are definite and effective are also called computational procedures.
• A program is the expression of an algorithm in a programming language.
Example of Pseudocode:
• To find the max element of an array
Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
Control flow
• if … then … [else …]
• while … do …
• repeat … until …
• for … do …
• Indentation replaces braces
Method declaration
• Algorithm method (arg [, arg…])
• Input …
• Output …
Method call
• var.method (arg [, arg…])
Return value
• return expression
Expressions
• ← Assignment (equivalent to = in C/Java)
• = Equality testing (equivalent to == in C/Java)
• n² Superscripts and other mathematical formatting allowed
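For concreteness, the arrayMax pseudocode above can be written in C as follows (a minimal sketch, not part of the prescribed textbook):

#include <stdio.h>

/* Returns the maximum element of the array A of n integers,
   mirroring the arrayMax pseudocode above. */
int arrayMax(int A[], int n)
{
    int currentMax = A[0];
    for (int i = 1; i <= n - 1; i++)
        if (A[i] > currentMax)
            currentMax = A[i];
    return currentMax;
}

int main(void)
{
    int A[] = {3, 7, 2, 9, 4};
    printf("%d\n", arrayMax(A, 5));   /* prints 9 */
    return 0;
}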
PERFORMANCE ANALYSIS:
What are the Criteria for judging algorithms that have a more direct
relationship to performance?
• computing time and storage requirements.
Space Complexity:
When we design an algorithm to solve a problem, it needs some computer memory to
complete its execution. For any algorithm, memory is required for the following purposes:
1. To store the program instructions.
2. To store constant values.
3. To store variable values.
4. And for a few other things like function calls, jumping statements, etc.
The space needed by each of these algorithms is seen to be the sum of the following components.
➢ A fixed part that is independent of the characteristics (e.g., number, size) of the inputs and outputs. This part typically includes the instruction space (i.e., space for the code), space for simple variables and fixed-size component variables (also called aggregates), space for constants, and so on.
➢ A variable part that consists of the space needed by component variables whose size depends on the particular problem instance being solved, the space needed by referenced variables (to the extent that it depends on instance characteristics), and the recursion stack space.
➢ The space requirement S(P) of any algorithm P may therefore be written as S(P) = c + S_P(instance characteristics), where c is a constant.
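As a quick illustration (a sketch; the word counts below follow the usual textbook convention rather than any particular compiler), consider the space needed by the following function:

/* Sum of the elements of a[0..n-1]. */
float Sum(float a[], int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; i++)
        s = s + a[i];
    return s;
}
/* Space: the array a needs n words, while n, s and i need one word each,
   so S_Sum(n) = n + 3; here the variable part S_P(n) grows linearly in n. */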
Time Complexity:
Every algorithm requires some amount of computer time to execute its instructions to perform the
task. This computer time required is called time complexity.
The time complexity of an algorithm can be defined as follows...
The time complexity of an algorithm is the total amount of time required by an algorithm to
complete its execution.
Generally, the running time of an algorithm depends upon the following...
1. Whether it is running on Single processor machine or Multi processor machine.
2. Whether it is a 32 bit machine or 64 bit machine.
3. Read and Write speed of the machine.
4. The amount of time required by an algorithm to
perform Arithmetic operations, logical operations, return value and assignment operations etc.,
5. Input data
Calculating the time complexity of an algorithm based on the system configuration is a very difficult
task because the configuration changes from one system to another. To solve this problem,
we assume a model machine with a specific configuration, so that we can calculate a
generalized time complexity according to that model machine.
To calculate the time complexity of an algorithm, we need to define a model machine. Let us
assume a machine with following configuration...
It is a Single processor machine
It is a 32 bit Operating System machine
It performs sequential execution
It requires 1 unit of time for Arithmetic and Logical operations
It requires 1 unit of time for Assignment and Return value
It requires 1 unit of time for Read and Write operations
Now, we calculate the time complexity of following example code by using the above-defined
model machine...
Time Complexity can be calculated by using Two types of methods. They are:
• Step Count Method
• Asymptotic Notation.
Here, we will discuss the Step Count Method.
What is Step Count Method?
The step count method is one of the methods to analyze the Time complexity of an algorithm. In
this method, we count the number of times each instruction is executed. Based on that we will
calculate the Time Complexity .
The step count method is also called the frequency count method. Let us discuss the step count for
different statements:
1. Comments:
• Comments are used for giving extra meaning to the program. They are not executed during
the execution. Comments are ignored during execution.
• Therefore the number of times that a comment executes is 0.
2. Conditional statements:
Conditional statements check a condition and, if the condition is true, the conditional
subpart is executed. Evaluating the condition itself counts as one step, because the
condition is checked exactly once each time control reaches the statement.
• In if-else statements the if statement is executed one time but the else statement will be
executed zero or one time because if the “if” statement is executed then the else statement
will not execute.
• In switch-case statements, the switch(condition) statement is executed one time, but an inner case body executes only when its case matches and none of the earlier cases has matched.
• In nested if and if-else ladder statements, the initial if condition is evaluated at least once, but the inner statements execute only depending on the outcome of the previous conditions.
3. Loop statements:
Loop statements are iterative statements. They are executed one or more times based on a given
condition.
• A typical for(i = 0; i < n; i++) statement has its condition checked "n+1" times: for the first n times the condition is satisfied and the loop body executes, and on the (n+1)th check the condition fails and the loop terminates.
• While: the body executes as long as the given condition remains true; the condition is checked once more than the body executes.
• Do-while: the body repeats while the given condition is true, and it executes at least once because the condition is not checked before the first iteration.
4. Functions:
Functions are executed based on the number of times they get called. If they get called n times
they will be executed n times. If they are not called at least once then they will not be executed.
Other statements like BEGIN, END and goto statements will be executed one time.
Example: Analysis of Linear Search algorithm
Let us consider a Linear Search Algorithm.
Linearsearch(arr, n, key)
{
    i = 0;
    for(i = 0; i < n; i++)
    {
        if(arr[i] == key)
        {
            printf("Found");
        }
    }
}
Where,
• i = 0 is an initialization statement and takes O(1) time.
• for(i = 0; i < n; i++) is a loop whose condition is evaluated n + 1 times.
• if(arr[i] == key) is a conditional statement and takes O(1) time per iteration.
• printf("Found") is a function call and takes O(0)/O(1) time.
Therefore the total number of steps executed is about n + 4. Ignoring constants and lower-order terms, the total time becomes O(n).
Time complexity: O(n).
Auxiliary Space: O(1)
Linear Search in Matrix
Searching for an element in a matrix
Algorithm Matrixsearch(mat[][], key)
{
// number of rows;
r := len(mat)
// number of columns;
c := len(mat[0])
for(i = 0; i < r; i++)
{
for(j = 0; j < c; j++)
{
if(mat[i][j] == key)
{
printf("Element found");
}
}
}
}
Where,
• r = len(mat) takes O(1) time.
• c = len(mat[0]) takes O(1) time.
• for(i = 0; i < r; i++) has its condition evaluated r + 1 times.
• for(j = 0; j < c; j++) has its condition evaluated c + 1 times for each iteration of the outer loop, and the outer loop runs r times in total.
• if(mat[i][j] == key) takes O(1) time per iteration.
• printf("Element found") takes O(0)/O(1) time.
Therefore the total number of steps executed is (1 + 1 + (r + 1) + r * (c + 1) + 1). Ignoring constants and lower-order terms, the total complexity becomes O(r * c), which is O(n²) for an n × n matrix.
For space, mat is an n × n array and so takes n * n words; key, c, r, i and j take one word each.
Time Complexity: O(n²).
Auxiliary Space: O(n²)
In this way, we calculate the time complexity by counting the number of times each line
executes.
Advantages of using this method over others:
• Easy to understand and implement.
• We will get an exact number of times each statement is executed.
Example
int sum(int a, int b)
{
return a+b;
}
In the above sample code, it requires 1 unit of time to calculate a+b and 1 unit of time to return the
value. That means, totally it takes 2 units of time to complete its execution. And it does not change
based on the input values of a and b. That means for all input values, it requires the same amount of
time i.e. 2 units.
If any program requires a fixed amount of time for all input values then its time complexity is
said to be Constant Time Complexity.
Consider the following piece of code...
Example 2
int sum(int A[], int n)
{
int sum = 0, i;
for(i = 0; i < n; i++)
sum = sum + A[i];
return sum;
}
For the above code, the time complexity can be calculated as follows: sum = 0 and i = 0 take 1 unit each, the loop condition i < n is checked n + 1 times, i++ executes n times, sum = sum + A[i] takes 2n units (one addition and one assignment per iteration), and return sum takes 1 unit.
In total it takes 4n + 4 units of time to complete its execution, and this is Linear Time Complexity.
If the amount of time required by an algorithm is increased with the increase of input value
then that time complexity is said to be Linear Time Complexity.
[This applies to algorithms that do not involve any calls to other algorithms.]
• For iterative statements such as for, while and repeat-until, we count the control part of the statement.
• We introduce a variable count into the program with initial value 0, and statements to increment count by the appropriate amount are introduced into the program (see the sketch below).
• This is done so that each time a statement in the original program executes, count is incremented by the step count of that statement.
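For example, the array-sum code from the earlier example can be instrumented with such a count variable as follows (a minimal C sketch; the counting convention used in the comments is one common choice, not the only one):

#include <stdio.h>

int count = 0;                       /* global step counter */

int sum(int A[], int n)
{
    int s = 0;  count++;             /* assignment s = 0            */
    for (int i = 0; i < n; i++)
    {
        count++;                     /* loop-condition test (true)  */
        s = s + A[i];  count++;      /* assignment inside the loop  */
    }
    count++;                         /* final loop-condition test   */
    count++;                         /* return statement            */
    return s;
}

int main(void)
{
    int A[] = {1, 2, 3, 4, 5};
    sum(A, 5);
    printf("step count = %d\n", count);   /* prints 13, i.e. 2n + 3 for n = 5 */
    return 0;
}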
Example: Recursive function to sum a list of numbers
For the recursive sum algorithm RSum(a, n), the total step count is 2 when n ≤ 0 (one step for the test and one for the return), and 2 + x when n > 0, where x = t_RSum(n − 1). That is,
t_RSum(n) = 2                      if n ≤ 0
t_RSum(n) = 2 + t_RSum(n − 1)      if n > 0
A similar step-count table can be set up for the matrix algorithm mult(a, b, c, m): each statement contributes its steps per execution multiplied by its frequency, and the totals are added up.
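The recursive sum just analysed looks like this in C (a sketch; the step counts in the comments follow the frequency-count convention above):

/* Recursive sum of a[0..n-1], mirroring the textbook RSum. */
float RSum(float a[], int n)
{
    if (n <= 0)                         /* 1 step for the test              */
        return 0.0f;                    /* + 1 for the return => 2 steps    */
    return RSum(a, n - 1) + a[n - 1];   /* 1 (test) + 1 (this line)         */
                                        /* + t_RSum(n-1) => 2 + t_RSum(n-1) */
}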
Asymptotic notations are mathematical tools to express the time complexity of algorithms for
asymptotic analysis.
Note: if there’s no input to the algorithm, Then it is considered to work in a constant time. Other
than the "input" all other factors are considered to be constant.
• Big Oh Notation(Ο)
• Omega Notation(Ω)
• Theta Notation(θ)
• Little oh notation(o)
Big Oh Notation (O): f(n) = O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0. Here g(n) is an asymptotic upper bound for f(n).
Example:
f(n) = 2n + 3
2n + 3 ≤ 10n ∀ n ≥ 1
Here, c = 10, n0 = 1, g(n) = n => f(n) = O(n)
Also, 2n + 3 ≤ 2n + 3n
2n + 3 ≤ 5n ∀ n ≥ 1
And, 2n + 3 ≤ 2n² + 3n²
2n + 3 ≤ 5n² ∀ n ≥ 1 => f(n) = O(n²)
Common growth rates in increasing order:
O(1) < O(log n) < O(√n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(3ⁿ) < O(nⁿ)
Omega Notation (Ω): The Omega notation Ω(n) represents the lower bound of an algorithm's running time.
It measures the best-case time complexity, or the minimum amount of time an algorithm can
possibly take to complete. For example, Bubble Sort has a running time of Ω(n) because in
the best-case scenario the list is already sorted, and the sort terminates after the first
pass.
f(n) = Ω(g(n)) if there exist positive constants c and n0 such that f(n) ≥ c·g(n) for all n ≥ n0. Here g(n) is an asymptotic lower bound for f(n).
Example:
f(n) = 2n + 3
2n + 3 ≥ n ∀ n ≥ 1
Here, c=1, n0=1, g(n)=n => f(n) = Ω(n)
Also, f(n) = Ω(log n)
f(n) = Ω(√n)
Theta Notation (Θ): The Theta notation Θ(n) represents both a lower bound and an upper bound on an
algorithm's running time. It is often associated with the average case, and when we use big-Θ
notation we are saying that we have an asymptotically tight bound on the running time.
f(n) = Θ(g(n)) if there exist positive constants c1, c2 and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
Example: f(n) = 2n + 3; 1·n ≤ 2n + 3 ≤ 5n ∀ n ≥ 1, so f(n) = Θ(n).
Example:
f(n) = 2n² + 3n + 4
2n² + 3n + 4 ≤ 9n² ∀ n ≥ 1, so f(n) = O(n²)
Also, 2n² + 3n + 4 ≥ 1·n² ∀ n ≥ 1, so f(n) = Ω(n²) and hence f(n) = Θ(n²)
Example:
f(n) = n² log n + n, which is Θ(n² log n)
Example:
f(n) = n! = 1 × 2 × 3 × 4 × … × n
1 × 1 × 1 × … × 1 ≤ 1 × 2 × 3 × 4 × … × n ≤ n × n × n × … × n, i.e., 1 ≤ n! ≤ nⁿ
so f(n) = O(nⁿ) and f(n) = Ω(1)
"Little-oh" notation (o):
f(n) is o(g(n)) if, for every real constant c > 0, there exists n0 > 0 such that f(n) < c·g(n) for every input size n > n0.
The definitions of O-notation and o-notation are similar. The main difference is that in f(n) =
O(g(n)), the bound f(n) ≤ c·g(n) need only hold for some constant c > 0, whereas in f(n) = o(g(n)), the bound f(n)
< c·g(n) must hold for all constants c > 0.
"Little-omega" notation( ω) :
Little-omega, commonly written as ω, is an Asymptotic Notation to denote the lower bound (that
is not asymptotically tight) on the growth rate of runtime of an algorithm.
f(n) is ω(g(n)) if, for every real constant c > 0, there exists n0 > 0 such that f(n) > c·g(n) for every input size n > n0.
The definitions of Ω-notation and ω-notation are similar. The main difference is that in f(n) =
Ω(g(n)), the bound f(n) ≥ c·g(n) need only hold for some constant c > 0, whereas in f(n) = ω(g(n)), the bound f(n)
> c·g(n) must hold for all constants c > 0.
1. General Properties:
If f(n) is O(g(n)) and k is constant then k*f(n) is also O(g(n)).
2. Transitive Properties:
If g(n) is O(h(n)) and f(n) is O(g(n)) then f(n) = O(h(n)).
3. Reflexive Properties:
If f(n) is given then f(n) is O(f(n)), since the maximum value of f(n) is f(n) itself.
Hence x = f(n) and y = O(f(n)) are always tied in a reflexive relation.
4. Symmetric Properties:
If f(n) is Θ(g(n)) then g(n) is Θ(f(n)).
5. Addition (sum) Property:
If f(n) = O(g(n)) and d(n) = O(e(n)) then f(n) + d(n) = O(max(g(n), e(n))).
Conclusion:
In the realm of algorithm analysis, the properties of asymptotic notation—Big O, Omega, and
Theta—serve as indispensable tools for quantifying the efficiency of algorithms as input sizes
become large. By abstracting away constant factors and lower-order terms, these notations allow us
to focus on the fundamental growth rates of algorithms. Big O notation helps us establish upper
bounds on an algorithm’s runtime, Omega notation provides lower bounds, and Theta notation
offers a tight range of growth rates. Armed with these notations, we can make informed decisions
about algorithm selection, optimization, and scalability, contributing to the development of more
efficient and effective software solutions.
Recurrence relation: A recurrence relation is a mathematical equation in which any term is defined in terms of its previous terms. Recurrence relations are used to analyze the time complexity of recursive algorithms in terms of the input size.
• Example: recursively reversing an array a[l..r]. Each call swaps the two end elements and then recurses on a problem that is smaller by two elements:
if (l < r)
{
    swap(a[l], a[r]);
    reverse(a, l + 1, r - 1); // input size decreased by 2
}
• Time complexity => time complexity of solving an (n−2)-size problem + time complexity of the swap operation
• Recurrence relation => T(n) = T(n−2) + c, where T(1) = c, i.e., T(n) = T(n−2) + O(1)
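A complete version of the reversal routine the fragment above comes from might look like this in C (a sketch for illustration; the function and variable names are our own):

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Recursively reverse a[l..r]; each call removes two elements from the
   problem, giving the recurrence T(n) = T(n-2) + c. */
void reverse(int a[], int l, int r)
{
    if (l >= r)
        return;                   /* zero or one element left: done */
    swap(&a[l], &a[r]);
    reverse(a, l + 1, r - 1);     /* input size decreased by 2      */
}

int main(void)
{
    int a[] = {1, 2, 3, 4, 5};
    reverse(a, 0, 4);
    for (int i = 0; i < 5; i++)
        printf("%d ", a[i]);      /* prints 5 4 3 2 1 */
    return 0;
}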
Merge sort
• recurrence relation: T(n) = 2T(n/2) + cn, T(1) = c
• cn = extra cost of merging the solution of two smaller sub-problems of size n/2
• The time complexity of a recursion depends on how many times the function calls itself and on how much the problem shrinks per call.
• If a function calls itself two times on a problem that shrinks only by a constant, its time complexity is O(2^n).
• If it calls itself three times in the same way, its time complexity is O(3^n), and so on; see the sketch below.
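The classic instance of this behaviour is the naive Fibonacci recursion (a sketch): two recursive calls, each on a problem smaller only by a constant, give an O(2^n) running time.

/* Naive Fibonacci: T(n) = T(n-1) + T(n-2) + c, which grows as O(2^n). */
long fib(int n)
{
    if (n <= 1)
        return n;
    return fib(n - 1) + fib(n - 2);
}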
Master Theorem
The master theorem is a formula for solving recurrences of the form T(n) = aT(n/b) +f(n), where a ≥ 1 and
b > 1 and f(n) is asymptotically positive. (Asymptotically positive means that the function is positive for all
sufficiently large n.)
All divide and conquer algorithms (also discussed in detail in the Divide and Conquer chapter)
divide the problem into sub-problems, each of which is part of the original problem, and then
perform some additional work to compute the final answer. As an example, a merge sort
algorithm [for details, refer to Sorting chapter] operates on two sub-problems, each of which is
half the size of the original, and then performs O(n) additional work for merging. This gives the
running time equation:
T(n) = 2T(n/2) + O(n)
The following theorem can be used to determine the running time of divide-and-conquer
algorithms. For a given program (algorithm), we first try to find the recurrence relation for the
problem. If the recurrence has the form T(n) = aT(n/b) + f(n), with a ≥ 1 and b > 1 (for example
f(n) = Θ(n^k log^p n) with k ≥ 0 and p a real number), then we can read off the answer without fully solving it:
1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
The master theorem compares the function n^(log_b a) to the function f(n). Intuitively, if n^(log_b a) is larger (by a
polynomial factor), then the solution is T(n) = Θ(n^(log_b a)). If f(n) is larger (by a polynomial factor), then the
solution is T(n) = Θ(f(n)). If they are the same size, then we multiply by a logarithmic factor.
Examples:
To use the master theorem, we simply plug the numbers into the formula.
Example 1: for instance, T(n) = 9T(n/3) + n. Here n^(log_b a) = n^(log_3 9) = n² and f(n) = n = O(n^(2−ε)), so case 1 applies and T(n) = Θ(n²).
Example 2: T(n) = T(2n/3) + 1. Here a = 1, b = 3/2, f(n) = 1, and n^(log_b a) = n⁰ = 1. Since f(n) = Θ(n^(log_b a)), case
2 of the master theorem applies, so the solution is T(n) = Θ(log n).
Example 3: T(n) = 3T(n/4) + n log n. Here n^(log_b a) = n^(log_4 3) = O(n^0.793). For ε = 0.2, we have
f(n) = Ω(n^(log_4 3 + ε)). So case 3 applies if we can show that a·f(n/b) ≤ c·f(n) for some c < 1 and all sufficiently
large n. This would mean 3(n/4) log(n/4) ≤ c·n log n. Setting c = 3/4 causes this condition to be
satisfied, so T(n) = Θ(n log n).
Example 4:
T(n) = 2T(n/2) + n log n. Here the master method does not apply: n^(log_b a) = n, and f(n) = n log n.
Case 3 does not apply because, even though n log n is asymptotically larger than n, it is not polynomially
larger. That is, the ratio f(n)/n^(log_b a) = log n is asymptotically less than n^ε for every positive constant ε.
Control abstraction for the divide-and-conquer technique: A control abstraction is a procedure whose
flow of control is clear but whose primary operations are specified by other procedures whose precise
meanings are left undefined.
Algorithm DandC(p)
{
if small (p) then return S(p)
else
{
Divide P into smaller instances P1, P2, P3, …, Pk, where k ≥ 1;
Apply DandC to each of these sub-problems;
return Combine(DandC(P1), DandC(P2), …, DandC(Pk));
}
}
Algorithm: Control abstraction for divide-and-conquer. DandC(P) is the divide-and-conquer
algorithm, where P is the problem to be solved.
Small(P) is a Boolean-valued function (i.e., either true or false) that determines whether the input size is
small enough that the answer can be computed without splitting. If this is so, the function S is invoked.
Otherwise the problem P is divided into smaller sub-problems.
These sub-problems P1, P2, P3, …, Pk are solved by recursive applications of DandC.
Combine is a function that combines the solutions of the k sub-problems to get the solution for the original
problem P.
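As a small concrete instance of this control abstraction (a sketch of our own, not from the textbook), the maximum of an array can be found by dividing the array into two halves and combining the two partial answers:

/* Divide-and-conquer maximum of a[low..high]:
     Small(P) : one element left, so S(P) is that element
     Divide   : split the range at its midpoint
     Combine  : take the larger of the two sub-results              */
int maxDC(int a[], int low, int high)
{
    if (low == high)                       /* Small(P)   */
        return a[low];
    int mid = (low + high) / 2;            /* divide     */
    int left  = maxDC(a, low, mid);        /* DandC(P1)  */
    int right = maxDC(a, mid + 1, high);   /* DandC(P2)  */
    return (left > right) ? left : right;  /* Combine    */
}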
Example: An application for which divide-and-conquer gives no advantage.
Solution: Let us consider the problem of computing the sum of n numbers a0, a1, …, an−1. If n > 1, we
divide the problem into two instances of the same problem: compute the sum of the first ⌊n/2⌋
numbers, and compute the sum of the remaining numbers. Once each of these two sums is
computed (by applying the same method recursively), we add their values to get the sum in question:
a0 + a1 + … + an−1 = (a0 + a1 + … + a⌊n/2⌋−1) + (a⌊n/2⌋ + … + an−1).
For example, the sum of 1 to 10 numbers is as follows-
(1+2+3+4+………………..+10)
= (1+2+3+4+5)+(6+7+8+9+10)
= [(1+2) + (3+4+5)] + [(6+7) + (8+9+10)]
= …..
= …..
= (1) + (2) +…………..+ (10).
This is not an efficient way to compute the sum of n numbers using divide-and-conquer technique. In
this type of problem, it is better to use brute-force method.
Binary Search:
Binary search is an efficient searching technique that works with only sorted lists. So the
list must be sorted before using the binary search method. Binary search is based on divide-and-
conquer technique.
The process of binary search is as follows:
The method starts with looking at the middle element of the list. If it matches with the key
element, then search is complete. Otherwise, the key element may be in the first half or second
half of the list. If the key element is less than the middle element, then the search continues with
the first half of the list. If the key element is greater than the middle element, then the search
continues with the second half of the list. This process continues until the key element is found or
the search fails indicating that the key is not there in the list.
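The process just described translates directly into the following iterative routine (a minimal C sketch; the textbook's pseudocode version is equivalent):

/* Iterative binary search: returns the index of key in the sorted
   array a[0..n-1], or -1 if key is not present. */
int binarySearch(int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high)
    {
        int mid = (low + high) / 2;
        if (a[mid] == key)
            return mid;            /* found                         */
        else if (key < a[mid])
            high = mid - 1;        /* continue in the first half    */
        else
            low = mid + 1;         /* continue in the second half   */
    }
    return -1;                     /* search fails: key not in list */
}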
Consider the list of elements: -4, -1, 0, 5, 10, 18, 27, 32, 33, 98, 147, 154, 198, 250, 500.
Trace the binary search algorithm searching for the element -1.
Sol: The given list has 15 elements at indices 0 to 14:
index: 0   1   2   3   4   5   6   7   8   9   10   11   12   13   14
value: -4  -1  0   5   10  18  27  32  33  98  147  154  198  250  500
Step 1: low = 0, high = 14, so mid = 7 and a[mid] = 32. The search key -1 is less than the middle element (32), so the search continues with the first half of the list.
Step 2: low = 0, high = 6, so mid = 3 and a[mid] = 5. The search key -1 is less than the middle element (5), so the search again continues with the first half of the list.
Step 3: low = 0, high = 2, so mid = 1 and a[mid] = -1, which equals the key. The search terminates successfully at index 1.
The main advantage of binary search is that it is faster than sequential (linear) search, because
it takes fewer comparisons to determine whether the given key is in the list than the linear
search method does.
Disadvantages of Binary Search: The disadvantage of binary search is that it can be applied
only to a sorted list of elements. The binary search is unsuccessful if the list is unsorted.
Efficiency of Binary Search: To evaluate binary search, count the number of comparisons in the
best case, average case, and worst case.
Best Case:
The best case occurs if the middle element happens to be the key element: then only one
comparison is needed to find it. Thus the best-case efficiency of binary search is O(1).
Ex: Let the given list be 1, 5, 10, 11, 12 and let key = 10. The middle element is 10, which is the key,
so it is found on the first comparison.
Worst Case:
Assume that in the worst case the key element is not in the list. Then the process of dividing the
list in half continues until there is only one item left to check.
Items left to search Comparisons so far
16 0
8 1
4 2
2 3
1 4
For a list of size 16, there are 4 comparisons to reach a list of size one, given that there is one
comparison for each division, and each division splits the list size in half.
In general, if n is the size of the list and C is the number of comparisons, then C = log₂ n.
Space complexity of Binary Search
• In the case of the iterative approach, no extra space is used; hence the space complexity
is O(1).
• In the worst case of the recursive approach, log n recursive calls are stacked in memory.
  o i comparisons require i recursive calls to be stacked in memory.
  o Since the average-case analysis makes about log n comparisons, the average memory
    used is O(log n).
Thus, in the recursive implementation the overall space complexity is O(log n).
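For comparison, a recursive version (a sketch) shows where the O(log n) stack space comes from: each halving of the search range stacks one more call.

/* Recursive binary search on a[low..high]; at most about log2(n) + 1
   calls are stacked, which is the O(log n) auxiliary space above. */
int binSearchRec(int a[], int low, int high, int key)
{
    if (low > high)
        return -1;                 /* not found */
    int mid = (low + high) / 2;
    if (a[mid] == key)
        return mid;
    if (key < a[mid])
        return binSearchRec(a, low, mid - 1, key);
    return binSearchRec(a, mid + 1, high, key);
}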
2)Quick Sort:
Sorting is a way of arranging items in a systematic manner. Quicksort is the widely used sorting
algorithm that makes n log n comparisons in average case for sorting an array of n elements. It is a
faster and highly efficient sorting algorithm. This algorithm follows the divide and conquer approach.
Divide and conquer is a technique of breaking down the algorithms into subproblems, then solving
the subproblems, and combining the results back together to solve the original problem.
Divide: In Divide, first pick a pivot element. After that, partition or rearrange the array into two sub-
arrays such that each element in the left sub-array is less than or equal to the pivot element and each
element in the right sub-array is larger than the pivot element.
Quicksort picks an element as pivot, and then it partitions the given array around the picked pivot
element. In quick sort, a large array is divided into two arrays in which one holds values that are
smaller than the specified value (Pivot), and another array holds the values that are greater than the
pivot.
After that, the left and right sub-arrays are also partitioned using the same approach. This continues
until each sub-array contains a single element.
Picking a good pivot is necessary for a fast implementation of quicksort; however, it can be difficult to
determine a good pivot in advance. Some of the ways of choosing a pivot are as follows -
o Pivot can be random, i.e. select a random element of the given array as the pivot.
o Pivot can be either the rightmost element or the leftmost element of the given array.
o Select median as the pivot element.
Example overview of quick sort technique
Algorithm:
44 33 11 55 77 90 40 60 99 22 88
Let 44 be the Pivot element and scanning done from right to left
Comparing 44 to the right-side elements, and if right-side elements are smaller than 44, then
swap it. As 22 is smaller than 44 so swap them.
22 33 11 55 77 90 40 60 99 44 88
Now comparing 44 to the left-side elements: an element must be greater than 44 to be swapped.
As 55 is greater than 44, swap them.
22 33 11 44 77 90 40 60 99 55 88
Recursively, repeating steps 1 & steps 2 until we get two lists one left from pivot element 44 & one
right from pivot element.
22 33 11 40 77 90 44 60 99 55 88
22 33 11 40 44 90 77 60 99 55 88
Now, the element on the right side and left side are greater than and smaller than 44 respectively.
And these sublists are sorted under the same process as above done.
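A complete C sketch of quicksort is given below. For simplicity it uses the Lomuto partition scheme with the last element as the pivot, rather than the two-sided scan around the first element shown in the trace above; any correct partition scheme yields the same overall divide-and-conquer behaviour.

#include <stdio.h>

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition a[low..high] around the last element and return the
   final position of the pivot. */
static int partition(int a[], int low, int high)
{
    int pivot = a[high];
    int i = low - 1;
    for (int j = low; j < high; j++)
        if (a[j] <= pivot)
            swap(&a[++i], &a[j]);
    swap(&a[i + 1], &a[high]);
    return i + 1;
}

void quickSort(int a[], int low, int high)
{
    if (low < high)
    {
        int p = partition(a, low, high);
        quickSort(a, low, p - 1);      /* left sub-array  */
        quickSort(a, p + 1, high);     /* right sub-array */
    }
}

int main(void)
{
    int a[] = {44, 33, 11, 55, 77, 90, 40, 60, 99, 22, 88};
    int n = sizeof a / sizeof a[0];
    quickSort(a, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);           /* prints the sorted list */
    return 0;
}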
Quicksort complexity
Now, let's see the time complexity of quicksort in best case, average case, and in worst case. We will
also see the space complexity of quicksort.
The best case occurs when we select the pivot as the middle value, so that the list splits into two halves. So here
T(N) = 2 · T(N/2) + N · constant
Now T(N/2) is itself 2 · T(N/4) + (N/2) · constant. So,
T(N) = 2 · (2 · T(N/4) + (N/2) · constant) + N · constant
     = 4 · T(N/4) + 2 · constant · N.
In general, after k levels of expansion,
T(N) = 2^k · T(N/2^k) + k · constant · N
Setting 2^k = N, i.e., k = log₂ N, gives
T(N) = N · T(1) + constant · N · log₂ N.
Therefore, the time complexity is O(N log N).
o Best Case Complexity analysis - In quicksort, the best case occurs when the pivot element is the middle
element (the median) or near the middle element, so each partition splits the list roughly in half. The best-case
time complexity of quicksort is O(n*logn). The work done per level of the recursion is:
Level 1: n
Level 2: 2 * (n/2)
Level 3: 4 * (n/4)
...
Level x: 2^(x−1) * (n / 2^(x−1))
Each level does about n total work, and there are about log₂ n levels.
o Average Case Complexity - It occurs when the array elements are in jumbled order that is not
properly ascending and not properly descending. The average case time complexity of quicksort
is O(n*logn).
Let C_A(n) be the average number of key comparisons made by quicksort on a list of size n.
Assuming that the partition split can happen at each position k (1 ≤ k ≤ n) with the same
probability 1/n, we get the recurrence relation
C_A(n) = (n + 1) + (1/n) Σ_{k=1..n} [C_A(k − 1) + C_A(n − k)],
which solves to C_A(n) = O(n log n).
Even a fairly unbalanced split gives the same order. Suppose the left child of each node represents a
sub-problem 1/4 as large and the right child a sub-problem 3/4 as large. Then there are log_{4/3} n
levels, and so the total partitioning time is O(n log_{4/3} n). Now, there is a mathematical fact that
log_a n = log_b n / log_b a
for all positive numbers a, b and n. Letting a = 4/3 and b = 2, we get that log_{4/3} n = log n / log(4/3),
which is still O(log n), so the total time remains O(n log n).
Quick Sort
Worst Case complexity: In the worst case, assume that the pivot partitions the list into two parts
so that one of the partitions has no elements while the other has all the remaining elements.
The total number of comparisons will be
(n − 1) + (n − 2) + (n − 3) + ……… + 2 + 1 = n(n − 1)/2,
so the worst-case time complexity is O(n²).
Though the worst-case complexity of quicksort is more than other sorting algorithms such as Merge
sort and Heap sort, still it is faster in practice. Worst case in quick sort rarely occurs because by
changing the choice of pivot, it can be implemented in different ways. Worst case in quicksort can
be avoided by choosing the right pivot element.
Space complexity, worst-case scenario: O(n), due to unbalanced partitioning leading to a skewed
recursion tree that requires a call stack of size O(n).
• It has a worst-case time complexity of O(N²), which occurs when the pivot is chosen
poorly.
• It is not a good choice for small data sets.
• It is not a stable sort, meaning that if two elements have the same key, their relative order
will not be preserved in the sorted output in case of quick sort, because here we are
swapping elements according to the pivot’s position (without considering their original
positions).
Quicksort Applications:
Quicksort's efficiency and adaptability make it suitable for many applications, including but not
limited to:
1. Sorting Algorithms: Quicksort is frequently used as a building block for hybrid sorting algorithms, such
as Timsort (used in Python's built-in sorting function).
2. Database Systems: Quicksort plays a vital role in database management systems for sorting records
efficiently.
3. Computer Graphics: Rendering and graphics applications often involve sorting operations, where
Quicksort can be employed to optimize rendering performance.
4. Network Routing: Quicksort can be utilized in various networking algorithms, particularly routing tables.
5. File Systems: File systems use Quicksort to manage and organize files efficiently.
3)Merge Sort:
Merge sort is based on divide-and-conquer technique. Merge sort method is a two phase
process-
1. Dividing
2. Merging
Dividing Phase: During the dividing phase, each time the given list of elements is divided into
two parts. This division process continues until the list is small enough to divide.
Merging Phase: Merging is the process of combining two sorted lists so that the resultant list is
also sorted. Suppose A is a sorted list with n₁ elements and B is a sorted list with n₂ elements.
The operation that combines the elements of A and B into a single sorted list C with n = n₁ + n₂
elements is called merging.
Ex: Let the list be: 500, 345, 13, 256, 98, 1, 12, 3, 34, 45, 78, 92.
The list is repeatedly divided into halves until each sublist contains a single element, and the
sublists are then merged pairwise into larger sorted lists.
Sorted List: 1 3 12 13 34 45 78 92 98 256 345 500
The merge sort algorithm works as follows-
Step 1: If the length of the list is 0 or 1, then it is already sorted, otherwise,
Step 2: Divide the unsorted list into two sub-lists of about half the size.
Step 3: Again sub-divide the sub-list into two parts. This process continues until each element in
the list becomes a single element.
Step 4: Apply merging to each sub-list and continue this process until we get one sorted list.
Algorithm (divide step)
Algorithm MergeSort(a, low, high)
{
// a is the array, low is the starting index and high is the end index of a
if (low < high) then
{
mid := (low + high)/2;        // divide the list at the midpoint
MergeSort(a, low, mid);       // sort the first half
MergeSort(a, mid + 1, high);  // sort the second half
Merge(a, low, mid, high);     // merge the two sorted halves
}
}
The tail of the Merge step copies the remaining elements of the second run into the auxiliary array b and then copies b back into a:
while (j ≤ high) do
{
b[k] := a[j];
k := k + 1;
j := j + 1;
}
// copy elements of b to a
for i := low to high do
{
a[i] := b[i];
}
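For reference, here is a complete C sketch of the two-phase process (dividing and merging) described above; it is one standard formulation rather than the textbook's exact pseudocode.

#include <stdio.h>

/* Merge the two sorted runs a[low..mid] and a[mid+1..high] using b[]
   as auxiliary storage, then copy the result back into a[]. */
static void merge(int a[], int b[], int low, int mid, int high)
{
    int i = low, j = mid + 1, k = low;
    while (i <= mid && j <= high)
        b[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)  b[k++] = a[i++];   /* copy leftovers of run 1 */
    while (j <= high) b[k++] = a[j++];   /* copy leftovers of run 2 */
    for (i = low; i <= high; i++)        /* copy b back to a        */
        a[i] = b[i];
}

/* Recursively divide a[low..high] in half, then merge the halves. */
void mergeSort(int a[], int b[], int low, int high)
{
    if (low < high)
    {
        int mid = (low + high) / 2;
        mergeSort(a, b, low, mid);
        mergeSort(a, b, mid + 1, high);
        merge(a, b, low, mid, high);
    }
}

int main(void)
{
    int a[] = {500, 345, 13, 256, 98, 1, 12, 3, 34, 45, 78, 92};
    int b[12];
    mergeSort(a, b, 0, 11);
    for (int i = 0; i < 12; i++)
        printf("%d ", a[i]);   /* prints 1 3 12 13 34 45 78 92 98 256 345 500 */
    return 0;
}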
Let T(n) be the total time taken by the Merge Sort algorithm. Splitting takes constant time, and
merging two sorted halves of total size n takes at most n − 1 comparisons; we ignore the '−1'
because each element still takes some time to be copied into the merged list.
So T(n) = 2T(n/2) + n ... equation 1
Note: the stopping condition is T(1) = 0 because, at the last level, there is only one element left
that needs to be copied, and there is no comparison.
Expanding equation 1 for i levels gives T(n) = 2^i · T(n/2^i) + i · n. Setting n = 2^i:
log n = log 2^i
log n = i log 2
log₂ n = i
Substituting i = log₂ n back into the expanded equation gives T(n) = n · T(1) + n · log₂ n = O(n log n).
Best Case Complexity: The merge sort algorithm has a best-case time complexity
of O(n*log n) for the already sorted array.
Average Case Complexity: The average-case time complexity for the merge sort algorithm
is O(n*log n), which happens when 2 or more elements are jumbled, i.e., neither in the
ascending order nor in the descending order.
Worst Case Complexity: The worst-case time complexity is also O(n*log n), which occurs
when we sort the descending order of an array into the ascending order.
Efficiency of Merge Sort: Let n be the size of the given list; the running time for merge
sort is given by the recurrence relation
T(n) = 2T(n/2) + Cn, with T(1) = a (a constant).
We can solve this equation by successive substitution. Replacing n by n/2 in the equation, we get
T(n) = 2T(n/2) + Cn
     = 2[2T(n/4) + Cn/2] + Cn = 4T(n/4) + 2Cn
     = 4[2T(n/8) + Cn/4] + 2Cn = 8T(n/8) + 3Cn
     ...
     = 2^k T(n/2^k) + kCn          (with n = 2^k, i.e., k = log₂ n)
     = a·n + Cn log₂ n
Therefore
T(n) = O(n log n)
Advantages of Merge Sort
• Guaranteed worst-case complexity: Merge Sort has a guaranteed O(n log n) time complexity, even in the
worst-case scenario. This means the sorting time increases logarithmically with the number of elements,
making it efficient for large datasets.
• Stable sorting: Merge Sort preserves the relative order of equal elements. This is useful in situations where
the order of identical elements matters, like sorting files by name and creation date.
• Efficient for external sorting: Merge Sort can be efficiently used for external sorting, where the data is too
large to fit in memory at once. It breaks down the data into smaller chunks and sorts them on disk before
merging them back together.
• Parallelizable: Merge Sort can be easily parallelized by dividing the sorting task across multiple processors
or cores, further improving its speed for large datasets.
• No in-place modification: Merge Sort requires additional memory to store the sorted sub-arrays during the
merging process. While this can be a disadvantage for small datasets with limited memory, it avoids
overwriting the original data, which can be useful for certain applications.
Disadvantages of Merge Sort
• Higher memory usage: As mentioned above, Merge Sort requires additional memory to store the sub-arrays
during merging. This can be a disadvantage for small datasets or systems with limited resources.
• Not as efficient for small datasets: For small datasets, Merge Sort's overhead in dividing and merging sub-
arrays can make it slower than simpler sorting algorithms like Bubble Sort or Insertion Sort.
• Cache inefficiencies: Merge Sort might suffer from cache inefficiencies due to its access patterns, especially
for small datasets. This can be mitigated by careful coding and data organization.
In conclusion, Merge Sort is like a wise, experienced librarian, adept at organizing vast collections
of data with precision and efficiency. By understanding and implementing this algorithm in C, you
unlock the potential to manage and sort through data with the grace of a well-organized library.
Algorithm
MatrixMultiply(n, k, m, A[1..n][1..k], B[1..k][1..m])
begin
    C[1..n][1..m];                     // define the result matrix
    for i = 1 to n do
        for j = 1 to m do
            c = 0;
            for s = 1 to k do
                c = c + A[i][s] * B[s][j];
            end for
            C[i][j] = c;
        end for
    end for
    return C;
end
Analysis
Correctness is straightforward: the algorithm faithfully implements the definition of matrix
multiplication.
Runtime: Let us assume that n = Θ(N), m = Θ(N) and k = Θ(N). Let us estimate
the runtime complexity of Algorithm MatrixMultiply by counting the most expensive
operations in the algorithm: the multiplications.
The c = c + A[i][s] * B[s][j] assignment statement is executed exactly n · m · k times. With the
assumptions above, we obtain our bound on the runtime of the algorithm: T(N) = Θ(N³).
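A direct C rendering of this algorithm (a sketch, using C99 variable-length array parameters) makes the n·m·k multiplication count explicit:

/* Multiply an n x k matrix A by a k x m matrix B into the n x m matrix C.
   The innermost statement runs exactly n*m*k times, which gives the
   Theta(N^3) bound when n, m and k are all Theta(N). */
void matrixMultiply(int n, int k, int m,
                    const double A[n][k], const double B[k][m],
                    double C[n][m])
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
        {
            double c = 0.0;
            for (int s = 0; s < k; s++)
                c = c + A[i][s] * B[s][j];
            C[i][j] = c;
        }
}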
Write each n × n matrix in block form with n/2 × n/2 blocks,
A = [X Y; Z W] and B = [P Q; R S], so that
A · B = [XP + YR, XQ + YS; ZP + WR, ZQ + WS].
Here XP, YR, XQ, YS, ZP, WR, ZQ and WS are products of the respective n/2 × n/2 matrices X, Y, Z,
W, P, Q, R, S, and the + operator is element-by-element matrix addition.
Using this observation, we can devise a divide-and-conquer algorithm for multiplying matrices.
Algorithm
Analysis
Consider the running time of the algorithm MatrixSum that adds two n × n matrices. The assignment operation in that
algorithm is performed n² times, so the running time of the algorithm is O(n²) (in fact, Θ(n²)).
Now we can devise the recurrence relation that represents the running time
of Algorithm MMDC. Algorithm MMDC reduces the problem of multiplying two n × n
matrices to eight problems of multiplying n/2 × n/2 matrices, plus computing
four O(n²) matrix sums. Therefore, the recurrence relation for Algorithm MMDC is:
T(n) = 8T(n/2) + O(n²)
To solve this recurrence relation, observe that in terms of the Master Theorem a = 8, b = 2,
log_b(a) = 3, and f(n) = O(n²) = o(n^(log_b a − ε)) = o(n^(3 − 0.2)) for ε = 0.2.
Therefore, by the Master Theorem,
T(n) = O(n³)
This does not improve upon the straightforward algorithm but, as we saw before with the problem of
finding the second largest number, it gives us a setup from which to devise a better algorithm that
would not be possible without divide-and-conquer.
Strassen’s Algorithm
In 1969, Volker Strassen, a German mathematician, observed that we can eliminate one matrix
multiplication operation from each round of the divide-and-conquer algorithm for matrix
multiplication.
Consider again two n × n matrices
We recall the block decomposition of the product described above.
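In the block notation used above (X, Y, Z, W for the blocks of the first matrix and P, Q, R, S for the blocks of the second), Strassen's seven products can be written, in one standard formulation (the exact labelling varies between texts), as:
M1 = (X + W)(P + S)
M2 = (Z + W)P
M3 = X(Q − S)
M4 = W(R − P)
M5 = (X + Y)S
M6 = (Z − X)(P + Q)
M7 = (Y − W)(R + S)
and the four blocks of the product are then recovered using only additions and subtractions:
XP + YR = M1 + M4 − M5 + M7        XQ + YS = M3 + M5
ZP + WR = M2 + M4                  ZQ + WS = M1 − M2 + M3 + M6
Each round therefore needs only 7 multiplications of n/2 × n/2 matrices instead of 8 (plus a constant number of matrix additions), giving the recurrence T(n) = 7T(n/2) + O(n²).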
By the Master Theorem, because n² = o(n^(log₂ 7 − ε)), the running time of
Strassen's Algorithm is
T(n) = O(n^(log₂ 7)) = O(n^2.81)
Note: this is not a tight upper bound on the algorithmic complexity of matrix multiplication.
The current best algorithmic bound is about O(n^2.3728). That algorithm, however, and other
algorithms similar to it, have such a large multiplicative constant associated with the
computation that they are not practical to use.