DAA Mod2
DAA Mod2
Module II
DIVIDE AND
CONQUER
Contents
1. Introduction to Divide and Conquer
General method
.[Link]: Strassen’s
matrix multiplication
4. Advantages and
Disadvantages of Conquer
divide and conquer.
5. Decrease and Conquer
Approach Introduction
Algorithm: Topological Sort
Divide and Conquer is one of the best-known general algorithm design technique It
works according to the following general plan:
Given a function to compute on ‗n‘ inputs the divide-and-conquer strategy
suggests splitting the inputs into ‗k‘ distinct subsets, 1<k<=n, yielding ‗k‘ sub
problems.
These sub problems must be solved, and then a method must be found to
combine sub solutions into a solution of the whole.
If the sub problems are still relatively large, then the divide-and-conquer
strategy can possibly be reapplied.
Often the sub problems resulting from a divide-and-conquer design are of the
same type as the original problem. For those cases the reapplication of the divide-
and- conquer principle is naturally expressed by a recursive algorithm.
Now smaller and smaller subproblems of the same kind are generated until
eventually subproblems that are small enoughto be solved without spiltting are
produced.
A typical case with k=2 is diagrammatically shown below
Small (P) is a Boolean valued function that determines whether the input size
is small enough that the answer can be computed without splitting. If small (P)
is true then function ‘S’ is invoked.
Otherwise the problem ‗P‘ is divided into sub problems. These sub
problems are solved by recursive application of DAndC.
Finally the solution from k sub problems is combined to obtain the solution of the
given problem using Combine function.
If the size of problem ‗p‘ is n and the sizes of the ‗k‘ sub problems are n1, n2
….nk, respectively, then the computing time of divide and conquer is described by the
recurrence relation
Where,
T(n) is the time for divide and conquer method on any input of size n and
g(n) is the time to compute answer directly for small inputs.
The function f(n) is the time for dividing the problem ‗p‘ and combining the
solutions
to sub problems.
For divide and conquer based algorithms that produce sub problems of the
same type as the original problem, it is very natural to describe them by using
recursion.
It states that, in recurrence equation T(n) = aT(n/b) + f (n), If f (n)∈ Θ (nd ) where
d ≥ 0 then
Solution: For the above problem, the algorithm can be implemented as recursive
or non- recursive algorithm.
Algorithm BinSearch (A, low , high, x)
//Given an array A[low : high] of elements in non decreasing order , 1≤ low ≤ high
//Determine whether x is present, and if so return j such that x=A[j]; else return 0.
{
if (low=high) then
{
if(x=A[high]) then return h; //If Small
(P) else return 0;
}
else
{//reduce P into a smaller sub
problem. mid :=[(low +high)/2];
if(x=A[mid]) then return
mid; else if(x<A[mid]) then
return BinSearch (A, low, mid-1,
x); else return BinSearch (A, mid+1,
high, x);
}
}
The above algorithm is a recursive binary search algorithm.
Explanation:
The problem is subdivided into smaller problems until only one single element is left
out.
1. If low = high then it means there is only one single element. So compare it
with the search element ‗x‘. if both are equal then return the index, else
return 0.
2. If low high then divide the problem into smaller subproblems.
a. Calculate mid = (low+high)/2.
b. Compare x with element at mid if both are equal the return index mid.
c. Else if x is less than A[mid] then the element x would be in the first
partition A[low:mid-1]. So make a recursive call to BinSearch with low as
low and high as mid-1.
d. Else x is greater than A[mid] then the element x would be in the
second partition A[mid+1: high]. So make a recursive call to
BinSearch with low as mid+1 and high as high.
The below algorithm is an iterative algorithm for Binary search.
Algorithm BinSearch (a, n, x)
//Given an array a[1:n] of elements in non decreasing order, n≥0,
//Determine whether x is present, and if so return j such that x=a[j]; else return 0
{
low:=1;
high:=n;
while(low ≤ high) do
{
mid:=[(low+high)/2];
if(x<a[mid])then high:=mid-1;
else if(x>a[mid])then
low:=mid+1; else return mid;
}
return 0;
}
Example:-
Consider the elements :
-15,-6,0,7,9,23,25,54,82,101,112,125,131,142,151.
Place them in a[1:14], and simulate the steps BinSearch goes through
as it searches for different values of ‗x‘.
Only the variables, low, high & mid need to be traced as we simulate the algorithm.
We try the following values for x: 151, -14 and 9. For 2 successful searches &
1 for unsuccessful search.
Otherwise we observe that each time through the loop the possible elements to
be checked of or equality with x are a[low], a[low+1], ,a[mid],……a[high].
If x=a[mid], then the algorithm terminates successfully.
Otherwise, the range is narrowed by either increasing low to (mid+1) or
decreasing high to (mid-1).
Clearly, this narrowing of the range does not affect the outcome of
the search. If low becomes > than high, then ‗x‘ is not present & hence
the
loop is exited.
Each and every internal node represents a successful search and the values
denoted by internal nodes are the various values taken by the variable mid.
If the value of mid is at level zero then the no. of comparisons for an element
present at the position is one.
The no. of comparisons for a successful search if the mid value is at level one
is two and the maximum number of comparisons for a successful search in the
above decision tree is equal to four. i.e., the max. No. of comparisons is directly
proportional to the height of the tree. Therefore the time complexity for a
successful search is given by O (log2 n).
Each and every unsuccessful search terminates at an external node. The
no. of comparisons needed for an unsuccessful search of an element less
than -15 is 3. For the entire remaining unsuccessful search the no. of
comparisons made is 4. Therefore the average no. of elemental comparisons
of an unsuccessful search is
Except for one case the no. of comparisons for unsuccessful search is
constant and is equal to the height of the tree. Therefore the time for
unsuccessful search is given by O (log2 n).
Explanation:
StraightMaxMin requires 2(n-1) comparisons in the best, average & worst
cases.
By realizing that the comparison a[i]<min is necessary only when a[i]>max
is false, improvement in a algorithm can be done. Hence we can replace
the contents of the for loop by,
if(a[i]>max) then max
= a[i]; else if (a[i]< min)
min=a[i];
Now the best case occurs when the elements are in increasing order, then
only first if condition is satisfied(i,e true) therefore the number of
comparisions is (n
-1).
The worst case condition occurs when the elements are in decreasing
order, in this case the first condition is evaluated as false and the second
condition is evaluated to true therefore the number of comparisions is 2(n-
1).
Let P = (n, a [i],……,a[j]) denote an arbitrary instance of the problem. Here ‗n‘
is the no. of elements in the list (a[i],….,a[j]) and we are interested in finding
the maximum and minimum of the list.
Let Small(P) be true when <=2. In this case, the maximum and minimum
are a[i] if n=1 and if n=2, the problem can be solved by making one
comparison.
If the list has more than 2 elements, P has to be divided into smaller
instances. For example, we might divide ‗P‘ into the 2 instances, P1=(
[n/2],a[1], a[n/2])
and P2= ( n-[n/2], a[[n/2]+1], ... , a[n])
After having divided ‗P‘ into 2 smaller sub problems, we can solve them
by recursively invoking the same divide-and-conquer algorithm.
When the maximum and minimum of these sub problems are determined, the
two maxima are compared and the two minima are compared to achieve the
solution for the entire problem.
We see that the root node contains 1 and 9 as the values of i and j
corresponding to the initial call to MaxMin. This execution produces two new call
to MaxMin, where i and j have the values 1, 5 and 6, 9, and thus split the set
into two subsets of the same size. From the tree we can immediately see that
the maximum depth of recursion is four (including the first call). The circled
numbers in the upper left order of each node represent the the orders in which
max and min are assigned values.
Time Complexity:
Now what is the number of element comparisons needed for MaxMin? If
T(n) represents this number, then the resulting recurrence relation is
Note that 3n/2 – 2 is the best, average, worst case number of comparison when
n is a power of two.
Comparisons with Straight Forward Method:
Compared with the 2n – 2 comparisons for the Straight Forward method, this is
a saving of 25% in comparisons. It can be shown that no algorithm based on
comparisons uses less than 3n/2 – 2 comparisons.
Space Complexity
It sorts a given array A[1],….,A[n] by dividing it into two halves A[1],…..,A[ _n/2_ ] and
A[ _n/2_ +1], .. ,A[n], sorting each of them recursively, and then merging the two
smaller
sorted arrays into a single sorted one.
The following algorithm describes the process using recursion and the
algorithm Merge
merges two sorted sets.
The merging of two sorted arrays can be done as follows. Two pointers
(array
indices) are initialized to point to the first elements of the arrays being merged.
The elements pointed to are compared, and the smaller of them is added to a new
array being constructed; after that, the index of the smaller element is incremented
to point to its immediate successor in the array it was copied from.
This operation is repeated until one of the two given arrays is exhausted, and then
the remaining elements of the other array are copied to the end of the new array.
Analysis
Here the basic operation is key comparison. As merge sort execution does
not depend on the order of the data, best case and average case runtime are
the same as worst case runtime.
Assuming for simplicity that total number of elements n is a power of 2, the
recurrence relation for the number of key comparisons C(n) is
Where, Cmerge(n) is the number of key comparison made during the merging
stage. Let us analyze Cmerge(n), the number of key comparisons performed
during the merging stage. At each step, exactly one comparison is made, after
which the total number of elements in the two arrays still needing to be processed
is reduced by 1.
In the worst case, neither of the two arrays becomes empty before the other one
contains just one element (e.g., smaller elements may come from the alternating
arrays). Therefore, for the worst case, Cmerge(n) = n – 1. Now,
For large n, the number of comparisons made by this algorithm in the average
case turns out to be about 0.25n less and hence is also in Θ (n log n).
Mergesort is a stable
algorithm Limitations
overhead
Quick Sort
Quicksort is the other important sorting algorithm that is based on the divide-
andconquer approach. Unlike mergesort, which divides its input elements
according to their position in the array, quicksort divides them according to their
value.
A partition is an arrangement of the array‘s elements so that all the elements to the
left of some element A[s] are less than or equal to A[s], and all the elements to the
right of A[s] are greater than or equal to it
Obviously, after a partition is achieved, A[s] will be in its final position in the sorted
array, and we can continue sorting the two subarrays to the left and to the right of
A[s] independently (e.g., by the same method).
Note the difference with mergesort: there, the division of the problem into two
subproblems is immediate and the entire work happens in combining their solutions;
here, the entire work happens in the division stage, with no work required to
combine the solutions to the subproblems.
Partitioning
We will now scan the subarray from both ends, comparing the subarray‘s
elements to the pivot.
The left-to-right scan, denoted below by index pointer i, starts with the
second element. Since we want elements smaller than the pivot to be in the
left part of the subarray, this scan skips over elements that are smaller than
the pivot and stops upon encountering the first element greater than or equal
to the pivot.
The right-to-left scan, denoted below by index pointer j, starts with the last
element of the subarray. Since we want elements larger than the pivot to
be in the right part
of the subarray, this scan skips over elements that are larger than the
pivot and stops on encountering the first element smaller than or equal to
the pivot.
After both scans stop, three situations may arise, depending on whether or not the
scanning indices have crossed.
If scanning indices i and j have not crossed, i.e., i < j, we simply exchange A[i] and
A[j
] and resume the scans by incrementing i and decrementing j, respectively:
If the scanning indices have crossed over, i.e., i > j, we will have partitioned the
subarray after exchanging the pivot with A[j ]:
Finally, if the scanning indices stop while pointing to the same element, i.e., i = j, the
value they are pointing to must be equal to p (why?). Thus, we have the subarray
partitioned, with the split position s = i = j. We can combine the last case with the
case of crossed-over indices (i > j ) by exchanging the pivot with A[j ] whenever i ≥ j
.
0 1 2 Z 4 5 8 7
3 1 g Al 2 a
3 1 g 8 2 4 7
5 3 1 4 B 2 9 7
3 1 4 B 2 9 7
5 3 1 4 Z B 9 7
5 3 1 A Z 8 Q 7
2 3 1 4 5 8 9 7
3 1 4
4
1 2 3 4
1
3 4
/”
s 9 7
s 7 e
7 B B
7
Analysi
s Best
Case
If all the splits happen in the middle of corresponding subarrays, we will have
the best case.
The number of key comparisons in the best case satisfies the recurrence,
According to the Master Theorem, Cbest(n) ∈ Θ(n log2 n); solving it exactly for n =
2k yields
Cbest(n) = n log2 n.
Worst Case
In the worst case, all the splits will be skewed to the extreme: one of the two
subarrays will be empty, and the size of the other will be just 1 less than the size of
the subarray being partitioned. This unfortunate situation will happen, in particular,
for increasing arrays, i.e., for inputs for which the problem is already solved
Indeed, if A[0..n − 1] is a strictly increasing array and we use A[0] as the pivot, the
left-to- right scan will stop on A[1] while the right-to-left scan will go all the way to
reach A[0], indicating the split at position 0: So, after making n + 1 comparisons to
get to this partition and exchanging the pivot A[0] with itself, the algorithm will be
left with the strictly increasing array A[1..n − 1] to sort. This sorting of strictly
increasing arrays of diminishing sizes will continue until the last one A[n−2 .. n−1]
has been processed. The total number of key comparisons made will be equal to
Average Case
Let Cavg(n) be the average number of key comparisons made by quicksort
on a randomly ordered array of size n.
relation:
Its solution, which is much trickier than the worst- and best-case analyses, turns
out to be
Thus, on the average, quicksort makes only 39% more comparisons than in
the best case.
Moreover, its innermost loop is so efficient that it usually runs faster than
mergesort on randomly ordered arrays of nontrivial sizes. This certainly justifies
the name given to the algorithm by its inventor.
Variations
Because of quicksort‘s importance, there have been persistent efforts over the
years to refine the basic algorithm. Among several improvements discovered by
researchers are:
Better pivot selection methods such as randomized quicksort that uses a
random element or the median-of-three method that uses the median of the
leftmost, rightmost, and the middle element of the array
Switching to insertion sort on very small subarrays (between 5 and 15
elements for most computer systems) or not sorting small subarrays at
all and finishing the algorithm with insertion sort applied to the entire nearly
sorted array
Modifications of the partitioning algorithm such as the three-way partition
into segments smaller than, equal to, and larger than the pivot
Limitations
It is not stable.
It requires a stack to store parameters of subarrays that are yet to be sorted.
While performance on randomly ordered arrays is known to be sensitive not
only to implementation details of the algorithm but also to both computer
architecture and data type.
3. Strassen’s Matrix
Multiplication Direct
Method
Suppose we want to multiply two n x n matrices, A and B. Their
product
where
It is not difficult to verify that one can treat these submatrices as numbers to get
the correct product. For example, C00 can be computed either as A00 * B00 + A01 *
B10 or as M1 + M4 – M5 + M7 where M1, M4, M5, and M7 are found by Strassen‘s
formulas, with the numbers replaced by the corresponding submatrices. If the
seven products of n/2 × n/2 matrices are computed recursively by the same
Dept. of CSE,KNSIT Page 33
DAA 21CS42
method, we have
Analysis
Here the basic operation is multiplication.
If M(n) is the number of multiplications made by Strassen‘s algorithm in
multiplying two n× n matrices (where n is a power of 2), we get the following
recurrence relation for it
This implies M(n) = Θ(n2.807) which is smaller than n3 required by the brute-
force algorithm.
Advantages
reuses the cached data until the sub-problem has been completely solved.
It allows us to solve difficult and often impossible looking problems, such as the
Tower of Hanoi, which is a mathematical game or puzzle. Being given a difficult
problem can often be discouraging if there is no idea how to go about solving it.
However, with the divide and conquer method, it reduces the degree of difficulty
since it divides the problem into sub problems that are easily solvable, and
usually runs faster than other algorithms would.
Disadvantages
Another advantage to this paradigm is that it often plays a part in finding other
efficient algorithms, and in fact it was the central role in finding the quick
sort and merge sort algorithms.
One of the most common issues with this sort of algorithm is the fact that
the recursion is slow, which in some cases outweighs any advantages of
this divide and conquer process.
Sometimes it can become more complicated than a basic iterative approach,
especially in cases with a large n. In other words, if someone wanted to add a
large amount of numbers together, if they just create a simple loop to add them
together, it would turn out to be a much simpler approach than it would be to
divide the numbers up into two groups, add these groups recursively, and then
add the sums of the two groups together.
Another downfall is that sometimes once the problem is broken down into sub
problems, the same sub problem can occur many times. In cases like these,
it can often be easier to identify and save the solution to the repeated sub
problem, which is commonly referred to as memorization
1. Decrease by a constant
:(usually by 1): Examples:
a. Insertion sort
b. Graph traversal algorithms (DFS and BFS)
c. Topological sorting
d. Algorithms for generating permutations, subsets
2. Decrease by a constant factor (usually
by half) Example: Binary search
3. Variable size
decrease Example:
Euclid‘s algorithm
by 2):
The decrease-by-a-constant-factor technique suggests reducing a problem
instance by the same constant factor on each iteration of the algorithm. In most
applications, this constant factor is equal to two. The decrease-by-half idea is
illustrated in the following figure:
For an example, let us revisit the exponentiation problem. If the instance of size n is to
compute an, the instance of half its size is to compute a n/2, with the obvious
relationship between the two: an = (an/2) 2.
But since we consider here instances with integer exponents only, the former
does not work for odd n. If n is odd, we have to compute a n-1 by using the rule for
following formula:
Topological Sort
Background
A directed graph, or digraph for short, is a graph with directions specified for all its
edges. The adjacency matrix and adjacency lists are still two principal means of
representing a digraph.
There are only two notable differences between undirected and directed graphs
in representing them: (1) the adjacency matrix of a directed graph does not have
to be symmetric; (2) an edge in a directed graph has just one (not two)
corresponding nodes in the digraph‘s adjacency lists.
Depth-first search and breadth-first search are principal traversal algorithms for
traversing digraphs as well, but the structure of corresponding forests can be
more complex than for undirected graphs. Thus, even for the simple example
(Figure a) of the depth-first search forest (Figure b) exhibits all four types of edges
possible in a DFS forest of a directed graph:
tree edges (ab, bc, de)
backedges (ba) from vertices to their ancestors,
forward edges (ac) from vertices to their descendants in the tree
other than their children
cross edges (dc), which are none of the aforementioned types.
Note that a back edge in a DFS forest of a directed graph can connect a vertex to its
parent. Whether or not it is the case, the presence of a back edge indicates that the
digraph has a directed cycle. A directed cycle in a digraph is a sequence of three
or more of its vertices that starts and ends with the same vertex and in which every
vertex is connected to its immediate predecessor by an edge directed from the
predecessor to the successor. For example, a, b, a is
Let us consider a set of five required courses {C1, C2, C3, C4, C5} a
part-time student has to take in some degree program.
The courses can be taken in any order as long as the following course
prerequisites are met:
- C1 and C2 have no prerequisites,
- C3 requires C1 and C2,
- C4 requires C3,
- and C5 requires C3 and C4.
The student can take only one course per term. In which order should the
student take the courses?
Dept. of CSE,KNSIT Page 41
DAA 21CS42
In terms of this digraph, the question is whether we can list its vertices in
such an order that for every edge in the graph, the vertex where the edge
starts is listed before the vertex where the edge ends. (Can you find such an
ordering of this digraph‘s vertices?) This problem is called topological
sorting.
It can be posed for an arbitrary digraph, but it is easy to see that the problem
cannot have a solution if a digraph has a directed cycle.
Thus, for topological sorting to be possible, a digraph in question must be a
directed acyclic graph (dag).
Two algorithms for obtaining topological sort of a digraph is as follows:
The first algorithm is a simple application of depth-first search:
- perform a DFS traversal and note the order in which vertices become dead-
ends (i.e., popped off the traversal stack).
- Reversing this order yields a solution to the topological sorting problem,
provided, of course, no back edge has been encountered during the traversal.
If a back edge has been encountered, the digraph is not a dag, and
topological sorting of its vertices is impossible.
Figure: (a) Digraph for which the topological sorting problem needs to be solved.
(b) DFS traversal stack with the subscript numbers indicating the popping-off
order. (c) Solution to the problem.
Note: The solution obtained by the source-removal algorithm is different from the
one obtained by the DFS-based algorithm. Both of them are correct, because the
topological sorting problem may have several alternative solutions.