DATA STRUCTURES AND
ALGORITHMS
Lecture 6
ANALYSIS OF SORTING ALGORITHMS
WHY THE ANALYSIS OF ALGORITHMS?
• To go from city “A” to city “B”, there are many ways of getting there: by flight, by bus, by train, or by bicycle. Depending on availability and convenience, we choose the one that suits us.
• Similarly, in computer science, multiple algorithms are available for
solving the same problem (for example, a sorting problem has many
algorithms, like insertion sort, selection sort, quick sort and many more).
• Algorithm analysis helps us determine which algorithm is most efficient
in terms of the time and space consumed.
ANALYSIS OF ALGORITHMS
• An important step is to determine how much time or space the algorithm will
require.
• The amount of memory a program needs to run to completion is called space
complexity.
• The amount of computer time a program needs to run to completion is called
time complexity.
HOW TO COMPARE ALGORITHMS
Given two algorithms for a task, how do we find out which one is better?
• One naive way of doing this is to implement both algorithms and run the two programs on
your computer for different inputs, then see which one takes less time.
• There are several problems with this approach to analyzing algorithms:
• It might be possible that for some inputs the first algorithm performs better than the
second, and for other inputs the second performs better.
• It might also be possible that for some inputs the first algorithm performs better on one
machine, while the second works better on another machine for some other inputs.
HOW TO COMPARE ALGORITHMS
To compare algorithms, let us define a few objective measures:
• Execution times? Not a good measure as execution times are specific to a particular computer.
• Number of statements executed? Not a good measure, since the number of statements varies with the programming language and with the style of the individual programmer.
• Ideal solution? Let us assume that we express the running time of a given algorithm as a
function of the input size n (i.e., f(n)) and compare these different functions corresponding to
running times. This kind of comparison is independent of machine time, programming style,
etc.
ASYMPTOTIC ANALYSIS
• In Asymptotic Analysis, we evaluate the performance of an algorithm in terms of input size.
• We calculate how the time (or space) taken by an algorithm increases with the input
size.
• For example, let us consider the search problem (searching a given item in a sorted array).
• One way to search is Linear Search (order of growth is linear), and
• The other way is Binary Search (order of growth is logarithmic).
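The two search strategies above can be sketched as follows (a minimal sketch in C; the function names are illustrative, not from the lecture):

```c
#include <stddef.h>

/* Linear search: scans every element in order; worst case O(n). */
int linear_search(const int *a, size_t n, int key) {
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;
    return -1;  /* not found */
}

/* Binary search on a SORTED array: halves the range each step; O(log n). */
int binary_search(const int *a, size_t n, int key) {
    size_t lo = 0, hi = n;              /* search in the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == key)
            return (int)mid;
        else if (a[mid] < key)
            lo = mid + 1;               /* discard the left half */
        else
            hi = mid;                   /* discard the right half */
    }
    return -1;  /* not found */
}
```

Note that binary search only applies because the array is sorted; on unsorted data, linear search is the fallback.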
ASYMPTOTIC ANALYSIS
• Asymptotic Analysis is not perfect, but it is the best way available for analyzing
algorithms.
• For example, say there are two sorting algorithms that take 1000·n·log n and 2·n·log n time
respectively on a machine.
• Both of these algorithms are asymptotically the same (order of growth is n log n).
What is Rate of Growth?
• The rate at which the running time increases as a function of input is called rate of growth.
GROWTH RATES
[Figure: growth-rate curves — execution time vs. input size]
TYPES OF ANALYSIS
• To analyze the given algorithm, we need to know with which inputs the
algorithm takes less time (performing well) and with which inputs the algorithm
takes a long time.
• There are three types of analysis:
• Worst Case
• Average Case
• Best Case
TYPES OF ANALYSIS
• Worst Case Analysis
• Defines the input for which the algorithm takes the longest time to complete (the input for which it runs the slowest).
• In the worst case analysis, we calculate an upper bound on the running time of an algorithm.
• For Linear Search, the worst case happens when the element to be searched is not present
in the array or is the last element.
• Therefore, the worst case time complexity of linear search would be O(n).
• Most of the time, we do worst-case analysis to analyze algorithms.
TYPES OF ANALYSIS
• Average Case Analysis
• Provides a prediction about the running time of the algorithm. Run the algorithm many
times, using many different inputs that come from some distribution that generates these
inputs
• We take all possible inputs and calculate the computing time for each of them.
• Sum all the calculated values and divide the sum by the total number of inputs.
• For Linear Search, the average case requires about n/2 comparisons, so the average case
time complexity is still O(n).
• Average case analysis is not easy to do in most practical cases, so it is rarely
done.
TYPES OF ANALYSIS
• Best Case Analysis
• Defines the input for which the algorithm takes the least time to complete (the input for
which it runs the fastest).
• In the best case analysis, we calculate a lower bound on the running time of an algorithm.
• In the linear search problem, the best case occurs when ‘x’ is present at the first location.
• The number of operations in the best case is constant (not dependent on n).
• So the time complexity in the best case would be O(1).
• The best case analysis is generally not used.
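The best/worst-case contrast for linear search can be made concrete by counting comparisons (an illustrative sketch; the counter parameter is my addition, not part of the lecture):

```c
#include <stddef.h>

/* Linear search that also reports how many element comparisons it made,
   so the best and worst cases can be observed directly. */
int linear_search_count(const int *a, size_t n, int key, size_t *comparisons) {
    *comparisons = 0;
    for (size_t i = 0; i < n; i++) {
        (*comparisons)++;           /* one comparison per element visited */
        if (a[i] == key)
            return (int)i;
    }
    return -1;                      /* not found: all n elements compared */
}
```

Searching for the first element costs 1 comparison (best case, O(1)); searching for a missing key costs n comparisons (worst case, O(n)).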
ASYMPTOTIC NOTATIONS
• Asymptotic notations are mathematical tools to represent time complexity of
algorithms for asymptotic analysis.
• The following four asymptotic notations are most commonly used to represent the time
complexity of algorithms:
• Big O Notation
• Θ Notation
• Ω Notation
• Little o Notation
ASYMPTOTIC NOTATIONS
• Big O Notation
• The Big O notation defines an upper bound on an algorithm's running time; it bounds a
function only from above.
• For example, Insertion Sort takes linear time in best case and quadratic time in worst
case.
• We can safely say that the time complexity of Insertion sort is O(n^2).
• Note that O(n^2) also covers linear time.
ASYMPTOTIC NOTATIONS
• Θ Notation
• The Theta notation bounds a function from above and below, so it defines exact
asymptotic behavior.
• A simple way to get the Theta notation of an expression is to drop the low-order terms and
ignore the leading constants.
• For example, consider the expression 3n³ + 6n² + 6000 = Θ(n³).
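The claim 3n³ + 6n² + 6000 = Θ(n³) can be checked directly from the definition (a worked sketch; the constants chosen here are one valid choice among many):

```latex
% Theta definition: find c_1, c_2 > 0 and n_0 such that
\[
  c_1 n^3 \;\le\; 3n^3 + 6n^2 + 6000 \;\le\; c_2 n^3
  \quad \text{for all } n \ge n_0 .
\]
% Lower bound: 3n^3 + 6n^2 + 6000 \ge 3n^3 for all n \ge 1, so c_1 = 3 works.
% Upper bound: for n \ge 10 we have 6n^2 \le n^3 and 6000 \le 6n^3, hence
\[
  3n^3 + 6n^2 + 6000 \;\le\; 3n^3 + n^3 + 6n^3 \;=\; 10 n^3 ,
\]
% so c_2 = 10 and n_0 = 10 witness 3n^3 + 6n^2 + 6000 = \Theta(n^3).
```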
ASYMPTOTIC NOTATIONS
• Ω Notation
• Ω notation provides an asymptotic lower bound.
• The Omega notation is the least used of these notations.
WHY IS IT CALLED ASYMPTOTIC ANALYSIS?
• From the discussion (for all three cases: worst, average, and best), we can see that,
in every case, for a given function f(n) we try to find another function g(n) that
approximates f(n) at higher values of n. That is, g(n) is a curve that approximates
f(n) at higher values of n.
• In mathematics we call such a curve an asymptotic curve. In other terms, g(n)
is the asymptotic curve for f(n). For this reason, we call algorithm analysis
asymptotic analysis.
GUIDELINES FOR ASYMPTOTIC ANALYSIS
• There are some general rules to help us determine the running time of an algorithm.
• Loops: The running time of a loop is, at most, the running time of the statements inside the
loop (including tests) multiplied by the number of iterations.
for(i=1; i<=n; i++)   // executes n times
    m=m+2;            // constant time, c
Total time = a constant c × n = cn = O(n).
• Nested loops: Analyze from the inside out. Total running time is the product of the sizes of
all the loops.
for(i=1; i<=n; i++)       // outer loop executes n times
    for(j=1; j<=n; j++)   // inner loop executes n times
        k=k+1;            // constant time
Total time = c × n × n = cn² = O(n²).
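The loop-counting rules above can be verified by instrumenting the nested loops (a small sketch; the counter is added purely for illustration):

```c
/* Count how many times the innermost statement of the nested loops runs.
   The outer loop runs n times and the inner loop runs n times for each
   outer iteration, so the body executes n * n times in total. */
long nested_loop_count(int n) {
    long count = 0;
    for (int i = 1; i <= n; i++)        /* outer loop: n iterations */
        for (int j = 1; j <= n; j++)    /* inner loop: n iterations each */
            count++;                    /* constant-time body */
    return count;                       /* total: n * n */
}
```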
TIME COMPLEXITY
Algorithm Best Case Average Case Worst Case
Bubble Sort Ω(n) Θ(n^2) O(n^2)
Insertion Sort Ω(n) Θ(n^2) O(n^2)
Selection Sort Ω(n^2) Θ(n^2) O(n^2)
Merge Sort Ω(nlog(n)) Θ(nlog(n)) O(nlog(n))
Quick Sort Ω(nlog(n)) Θ(nlog(n)) O(n^2)
Heap Sort Ω(nlog(n)) Θ(nlog(n)) O(nlog(n))
Linear Search Ω(1) Θ(n) O(n)
Binary Search Ω(1) Θ(log(n)) O(log(n))
ANALYSIS OF INSERTION SORT
Statement                              Execution Cost
proc insertionSort(a as array)
  n ← |a|                              1
  for i ← 2 to n do                    n times
    key ← a[i]                         n−1 times
    j ← i − 1                          n−1 times
    while j > 0 and a[j] > key do      1 + 2 + 3 + … + n times
      a[j+1] ← a[j]                    1 + 2 + 3 + … + (n−1) times
      j ← j − 1                        1 + 2 + 3 + … + (n−1) times
    next
    a[j+1] ← key                       n−1 times
  next
end proc
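The pseudocode above translates into the following C sketch (arrays are 0-indexed here, unlike the 1-indexed pseudocode):

```c
#include <stddef.h>

/* Insertion sort: for each element, shift the larger already-sorted
   elements one slot to the right, then drop the key into its place.
   Worst case O(n^2) (reverse-sorted input), best case O(n) (sorted input). */
void insertion_sort(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {   /* shift elements greater than key */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;                         /* insert key in its sorted position */
    }
}
```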
ANALYSIS OF INSERTION SORT
Calculate the worst-case running time (execution cost) of Insertion Sort:
T(n) = 1 + n + (n−1) + (n−1) + (1+2+…+n) + (1+2+…+(n−1)) + (1+2+…+(n−1)) + (n−1)
     = 1 + n + (n−1) + (n−1) + n(n+1)/2 + n(n−1)/2 + n(n−1)/2 + (n−1)
     = (3/2)n² + (7/2)n − 2
     = O(n²)