Merged Notes
Divide-and-Conquer Strategy
• Three steps:
1. Divide a problem into subproblems of smaller sizes
2. Recursively solve the smaller subproblems
3. Merge the solutions of the subproblems to arrive at a solution to the original problem.
• Tower of Hanoi problem – Given 3 pegs, move a stack of different-sized disks from one peg to another, making sure that a smaller disk is always on top of a larger disk.
• Integer multiplication
• Other examples: MergeSort, QuickSort, FFT
Tower of Hanoi
move(n,i,j):
Input : n disks and integers i and j s.t. 1 ≤ i, j ≤ 3
Output: Disk moves to migrate all n disks from i to j
if n = 1
move a disk from i to j
else
otherPeg <- 6-(i+j)
move(n-1,i, otherPeg)
move(1,i,j)
move(n-1,otherPeg,j)
• Time complexity: O(T(n)), where T(n) is the number of disk moves
T(n) = 2 T(n-1) + 1, n > 1 and T(1) = 1 => T(n) = 2ⁿ - 1
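A direct Python transcription of move() above (a sketch that prints the moves):

def move(n, i, j):
    """Print the moves that migrate n disks from peg i to peg j (pegs 1..3)."""
    if n == 1:
        print(f"move a disk from {i} to {j}")
    else:
        other = 6 - (i + j)          # the third peg
        move(n - 1, i, other)        # park n-1 disks on the spare peg
        move(1, i, j)                # move the largest disk
        move(n - 1, other, j)        # bring the n-1 disks back on top

move(3, 1, 3)                        # prints 2^3 - 1 = 7 moves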
Divide-and-Conquer
• In most problems, we divide a problem into subproblems that are a fraction of the size of the original problem.
• General time-complexity recurrence for divide-and-
conquer:
T(n) = a T(n/b) + f(n), n ≥ d
= c for n < d
• The second term covers the dividing and merging steps; the first term counts the subproblems to be solved.
• Based on f(n), general asymptotic solutions for T(n)
can be derived in some cases.
Divide-and-conquer recurrence (Master theorem)
• At recursion level 1, there are a subproblems of size n/b to solve and f(n) additional time for divide/merge overhead.
• At recursion level 2, there are a² subproblems of size n/b² to solve and a·f(n/b) additional time for divide/merge overhead.
• At the i-th level, there are a^i subproblems of size n/b^i to solve and a^(i-1)·f(n/b^(i-1)) additional time for divide/merge overhead.
• Setting i = log_b(n/d), we see that ultimately we need to spend T1(n) time solving a^(log_b(n/d)) = n^(log_b a) / d^(log_b a) problems of size d (each of which takes "c" time units), and the divide/merge steps take a total of T2(n) = Σ_{i=0}^{log_b(n/d)-1} a^i · f(n/b^i) time units.
Divide-and-conquer recurrence (Master theorem)
• T1(n) = c · n^(log_b a) / d^(log_b a);  T2(n) = Σ_{i=0}^{log_b(n/d)-1} a^i · f(n/b^i)
Case 1: If f(n) is O(n^(log_b a - ε)) for some small constant ε > 0, then T1(n) dominates T2(n) asymptotically and T(n) is Θ(n^(log_b a)).
For the integer multiplication example: steps (a), (c) and (d) take O(n) bit operations; step (b) requires 3 recursive calls on n/2-bit integers.
T(n) = 3 T(n/2) + k·n, for n > d
     = c for n = d
Applying Case 1 of the Master theorem, we get that T(n) is Θ(n^(log₂ 3)), which is better than Θ(n²).
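The 3-recursive-call scheme is Karatsuba multiplication; a Python sketch on machine integers (the slide's step labels (a)-(d) are not reproduced here, but the recurrence is the same):

def karatsuba(x, y):
    """Multiply x*y with 3 half-size recursive products instead of 4."""
    if x < 10 or y < 10:
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> half, x & ((1 << half) - 1)     # x = xh*2^half + xl
    yh, yl = y >> half, y & ((1 << half) - 1)
    a = karatsuba(xh, yh)                         # high parts
    b = karatsuba(xl, yl)                         # low parts
    c = karatsuba(xh + xl, yh + yl) - a - b       # cross terms with 1 mult
    return (a << (2 * half)) + (c << half) + b

assert karatsuba(123456, 789012) == 123456 * 789012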
Smooth functions and time complexity
extensibility (optional)
• In many divide-and-conquer algorithms, we derive T(n) assuming n to be a power of some integer b > 1.
• Can we extend this result for any n asymptotically ?
• We can do this as long as T(n) involves “smooth” functions.
• Definition 1: A function f(n) is eventually non-decreasing if there exists n0 ≥ 0 such that for all n2 ≥ n1 ≥ n0, f(n2) ≥ f(n1)
• Definition 2: A function f(n) is smooth iff f(n) is eventually non-decreasing and f(2n) is O(f(n))
• Can show that for a smooth function f, f(bn) is O(f(n)) for any fixed positive integer b.
• Examples of smooth functions: n, log n, n².
• Is 2ⁿ a smooth function? No, since 2^(2n) = (2ⁿ)² is not O(2ⁿ).
Smooth functions and time complexity
extensibility (optional)
• Extensibility Theorem:
Suppose T(n) is O(f(n)) when n is a power of b for some
constant integer b > 1 and T(n) is asymptotically non-
decreasing (usually the case with time-complexities)
Then we can say T(n) is O(f(n)) for any n provided f(n) is
a smooth function.
Dynamic Programming Paradigm
• Discovered by Richard Bellman for solving various
optimal decision problems.
• Applicable to problems whose objective functions satisfy the "optimal substructure property" (or principle of optimality). It allows an objective function to be broken down into a series of recursive functions, each with a smaller number of decision variables.
• At each stage, no final decision is made; instead we compute the best solution for each possible state at that stage. (Decision graph)
• We then work backward from the final stage, as there is only one possible state at that stage.
Dynamic Programming
• The efficiency of DP comes from avoiding repeated computation for a state at a decision stage; how we arrive at that state does not matter.
• Recursive formulation but no repeated computation!
• Related to "memoization", which is caching of a result during recursive computation.
e.g. Pascal's triangle (see the sketch below)
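A minimal Python sketch of memoization on Pascal's triangle (binomial coefficients):

from functools import lru_cache

@lru_cache(maxsize=None)                 # memoization: each (n, k) computed once
def binom(n, k):
    """C(n,k) via Pascal's rule; without the cache this recursion is exponential."""
    if k == 0 or k == n:
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)

print(binom(30, 15))                     # only O(n*k) distinct states are evaluated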
• A table underlies DP computations.
• Example problems: matrix chain product, multiple joins of relations in an RDB, optimizing string edits, Longest Common Subsequence (LCS), DNA sequence alignment, Hidden Markov Models (speech recognition)
Minimum edit distance
Problem:
Given two strings X = x_1 x_2 ….. x_m and Y = y_1 y_2 …. y_n, compute the minimum cost to transform X to Y using the following operations:
(a) Insert a new character into X – c_ins(·) units cost
(b) Delete a character from X – c_del(·) units cost
(c) Replace a character of X – c_rep(·) units cost
[From the matrix chain discussion that follows: the number of ways to fully parenthesize a chain of k matrices]
• N(1) = N(2) = 1
• N(k) = Σ_{l=1}^{k-1} N(l)·N(k-l), k > 2
• Time complexity of the matrix chain DP:
T(n) ≤ n + Σ_{k=1}^{n-1} 5k(n-k), which is O(n³) integer operations.
Matrix chain product DP example
• A0: 3x5, A1: 5x6, A2: 6x2, A3: 2x4 (so d0..d4 = 3, 5, 6, 2, 4)
Compute entries N(i,j), 0 ≤ i ≤ j ≤ 3; also store k(i,j), the value of i ≤ k < j that gives the minimum.
N(0,0) = N(1,1) = N(2,2) = N(3,3) = 0; k(i,i) = i, 0 ≤ i ≤ 3
N(0,1) = N(0,0) + N(1,1) + d0·d1·d2 = 0 + 0 + 3x5x6 = 90; k(0,1) = 0
N(1,2) = N(1,1) + N(2,2) + d1·d2·d3 = 0 + 0 + 5x6x2 = 60; k(1,2) = 1
N(2,3) = N(2,2) + N(3,3) + d2·d3·d4 = 0 + 0 + 6x2x4 = 48; k(2,3) = 2
N(0,2) = min( N(0,0) + N(1,2) + d0·d1·d3, N(0,1) + N(2,2) + d0·d2·d3 )
       = min( 0 + 60 + 3x5x2, 90 + 0 + 3x6x2 ) = 90; k(0,2) = 0
N(1,3) = min( N(1,1) + N(2,3) + d1·d2·d4, N(1,2) + N(3,3) + d1·d3·d4 )
       = min( 0 + 48 + 5x6x4, 60 + 0 + 5x2x4 ) = 100; k(1,3) = 2
N(0,3) = min( N(0,0) + N(1,3) + d0·d1·d4, N(0,1) + N(2,3) + d0·d2·d4, N(0,2) + N(3,3) + d0·d3·d4 )
       = min( 0 + 100 + 3x5x4, 90 + 48 + 3x6x4, 90 + 0 + 3x2x4 ) = 114
k(0,3) = 2
Matrix chain product DP example (contd.)
Final answer: N(0,3) = 114
Optimal order computed as: k(0,3) = 2, i.e. (A0 x A1 x A2) x A3 -> look at entry (0,2) to compute (A0 x A1 x A2)
k(0,2) = 0, i.e. (A0 x A1 x A2) computed as A0 x (A1 x A2) -> look at entry (1,2) to compute (A1 x A2)
k(1,2) = 1, i.e. A1 x A2 computed directly. Final order: (A0 x (A1 x A2)) x A3
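A Python sketch of the DP just traced (0-indexed; N[i][j] and K[i][j] as on the slides):

def matrix_chain(d):
    """d[i], d[i+1] are the dimensions of matrix A_i, 0 <= i <= n-1.
    Returns (N, K): N[i][j] = min scalar mults for A_i..A_j, K[i][j] = best split."""
    n = len(d) - 1
    N = [[0] * n for _ in range(n)]
    K = [[i] * n for i in range(n)]               # K[i][i] = i
    for length in range(1, n):                    # chain length j - i
        for i in range(n - length):
            j = i + length
            N[i][j] = float('inf')
            for k in range(i, j):                 # split (A_i..A_k)(A_{k+1}..A_j)
                cost = N[i][k] + N[k + 1][j] + d[i] * d[k + 1] * d[j + 1]
                if cost < N[i][j]:
                    N[i][j], K[i][j] = cost, k
    return N, K

N, K = matrix_chain([3, 5, 6, 2, 4])              # the slides' A0..A3
print(N[0][3], K[0][3])                           # 114, 2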
Greedy Algorithms
• An optimization problem involves finding a solution that
minimizes or maximizes an objective function of decision
variables with or without constraints
• A global optimal choice strategy finds best solution
among all possible solutions to the problem.
• A local partial solution strategy finds best partial solution
among a limited set of solutions.
• A problem satisfies greedy-choice property if a global
optimal solution can be reached by a sequence of local
partial solution choices starting from a well-defined
state.
• Otherwise this strategy provides only a heuristic and
optimal solution is not guaranteed.
• Well-defined starting state may require some pre-
processing.
Optimal substructure
• Recall a problem exhibits “optimal substructure”
property if optimal solution contains within itself
optimal solutions to sub problems.
• This is the basis of DP wherein we split the
optimization function as a sequence of recursive
functions ; they depend only on the “state” we arrive
at by choice of values for a subset (typically of size
1) of decision variables.
• Also called “principle of optimality” by Richard
Bellman.
• Optimal substructure is a necessary condition for
greedy algorithms.
Coin change problem
Given :
Unlimited supply of n coin types, with values {c_1, c_2, ….. c_n},
Required:
Minimum number of coins to make change for an amount V. Assume c_n = 1
• Satisfies the "optimal substructure" property – if an optimal solution makes change for value "v" out of "V", then it contains an optimal solution to the subproblem of making change for V-v, regardless of how we made change for "v"
• Can use a DP formulation for it
Coin change – DP formulation
• Define F(i, v) – min number of coins needed to make change for v from the set {c_i, c_{i+1}, …, c_n}, 1 ≤ i ≤ n, 0 ≤ v ≤ V
  N(i, v) – number of coins of type c_i used in this solution
• Boundary conditions: F(n+1, v) = N(n+1, v) = ∞ if v > 0, else F(n+1, v) = N(n+1, v) = 0
• F(i, v) = min over 0 ≤ j ≤ ⌊v/c_i⌋ of [ j + F(i+1, v - j·c_i) ], 0 ≤ v ≤ V, 1 ≤ i ≤ n
  N(i, v) = the j achieving this minimum (argmin)
• Need F(1, V) and N(1, V)
• Time complexity of the DP algorithm: O(nV)
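A Python sketch of this formulation (0-indexed; the inner j-loop follows the recurrence above directly – the equivalent two-way recurrence F(i,v) = min(F(i+1,v), 1 + F(i, v - c_i)) is what gives the O(nV) bound):

import math

def coin_change(c, V):
    """Min # of coins for amount V from denominations c[0..n-1], c[-1] == 1.
    F[i][v] = min coins for v using coin types i..n-1."""
    n, INF = len(c), math.inf
    F = [[INF] * (V + 1) for _ in range(n + 1)]
    F[n][0] = 0                                   # boundary: no coin types left
    for i in range(n - 1, -1, -1):
        for v in range(V + 1):
            for j in range(v // c[i] + 1):        # j coins of type c[i]
                F[i][v] = min(F[i][v], j + F[i + 1][v - j * c[i]])
    return F[0][V]

print(coin_change([25, 10, 1], 30))               # 3 (three dimes)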
Coin change - greedy solution
• Assume c_1 ≥ c_2 ≥ ….. ≥ c_n
• At each stage i, only one choice is considered for making change for v:
  F(i, v) = ⌊v/c_i⌋ + F(i+1, v - ⌊v/c_i⌋·c_i) and N(i, v) = ⌊v/c_i⌋
• Iterative greedy algorithm:
v ← V
F ← 0
for i ← 1 to n
  N(i) ← ⌊v/c_i⌋
  F ← F + N(i)
  v ← v mod c_i
• Time complexity: O(n)
Coin change – greedy choice
• Does it have “greedy choice” property ?
• Not always. Consider only quarters, dimes and pennies (i.e. c_1 = 25, c_2 = 10 and c_3 = 1) and V = 30
- Greedy choice will give N(1) = 1, N(2) = 0, N(3) = 5, F = 6
-- Is it optimal?
-- No. N(1) = 0, N(2) = 3, N(3) = 0, F = 3
• But for quarters, dimes, nickels and pennies (i.e. c_1 = 25, c_2 = 10, c_3 = 5 and c_4 = 1), it satisfies the "greedy choice property"
--- Ignore pennies, remaining value divisible by 5
--- Any state F(2, 𝑣) with 𝑣 ≥ 25 not part of optimal (as you
can reduce number of coins either by replacing 2 dimes+1 nickel
or 3 dimes by a quarter and nickel)
--- Any state F(3, 𝑣) with 𝑣 ≥ 10 not part of optimal as you can
reduce number of coins by replacing 2 nickels by a dime
Algorithm Design Strategies
(contd.)
Fractional knapsack problem
• Problem: Max Σ_{i=1}^n b_i·t_i
  s.t. Σ_{i=1}^n w_i·t_i ≤ W, 0 ≤ t_i ≤ 1, ∀ i
• Select items (possibly partially) with weights w_i and values b_i so as to fill a knapsack of capacity W.
• In the 0-1 knapsack problem, an item cannot partially fill a knapsack. It is a harder problem to solve.
• Define relative value v_i = b_i/w_i, ∀ i. An item with a larger value b_i and a smaller weight w_i is relatively more valuable. Let w_i·t_i = x_i, ∀ i.
• Problem: Max Σ_{i=1}^n v_i·x_i
  s.t. Σ_{i=1}^n x_i ≤ W, 0 ≤ x_i ≤ w_i, ∀ i
Fractional knapsack problem
• Satisfies “optimal substructure” property. Why?
• If in an optimal solution to this problem, after selecting a
few items capacity of 𝑣 remains then optimal solution to a
knapsack problem with capacity 𝑣 must be part of the
complete optimal solution.
• Greedy approach:
Let items be ordered such that v_1 ≥ v_2 ≥ ··· ≥ v_n and fill them in that order; for the last item, if its weight exceeds the remaining capacity, use a partial amount. This is a greedy approach.
• At most one item will be partially filled in this approach.
• Does this algorithm satisfy greedy choice property ?
Example of fractional knapsack
• b_1 = 7, b_2 = 5, b_3 = 4, b_4 = 3
• w_1 = 4, w_2 = 3, w_3 = 2, w_4 = 1, W = 6
Strategy 1: Fill by least weight to highest weight
  Choose w_4, w_3, w_2 – total value = 3 + 4 + 5 = 12
Strategy 2: Fill by largest value to smallest value
  Choose w_1, (2/3)·w_2 – total value = 7 + 5·(2/3) = 10.33
(Optimal) Strategy 3: Fill by largest relative value to smallest
  v_1 = 7/4 = 1.75, v_2 = 5/3 = 1.67, v_3 = 4/2 = 2, v_4 = 3/1 = 3
  Choose w_4, w_3, (3/4)·w_1
  Total value = 3 + 4 + (3/4)·7 = 12.25
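A Python sketch of Strategy 3 (sort by relative value, take a fraction only for the last item):

def fractional_knapsack(b, w, W):
    """Greedy by relative value b[i]/w[i]; at most one item is taken partially."""
    order = sorted(range(len(b)), key=lambda i: b[i] / w[i], reverse=True)
    total, remaining = 0.0, W
    for i in order:
        take = min(w[i], remaining)        # whole item if it fits, else a fraction
        total += b[i] * take / w[i]
        remaining -= take
        if remaining == 0:
            break
    return total

print(fractional_knapsack([7, 5, 4, 3], [4, 3, 2, 1], 6))   # 12.25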
Fractional knapsack greedy alg.
• Optimality of greedy choice:
We prove by contradiction.
Let there be an optimal solution in which, for two items with v_i > v_j, we do not fully fill item i but use some amount of item j, thereby not making the greedy choice,
i.e. x_i < w_i and x_j > 0.
We can then replace as much of item j as possible by an equal amount of item i. This amount = min(x_j, w_i - x_i).
Additional value obtained
= (v_i - v_j)·min(x_j, w_i - x_i) > 0, violating optimality of the solution.
• Time-complexity – O(n log n)
• Fast considering there are 2ⁿ possible subsets of n items.
Meeting scheduling problem
Problem: Given a set S of n meetings, each with start and finish
times 𝑠! and 𝑓! times (0 < 𝑠! < 𝑓! ) respectively for 1 ≤ i ≤ n, find a
mapping φ : S → {1,2,…M} (M conf. rooms) such that
(a) For two meetings 𝑚! and 𝑚& s.t. φ(𝑚! ) = φ(𝑚& ) (assigned to
same room), either 𝑓! ≤ 𝑠& or 𝑓& ≤ 𝑠! (they do not conflict in time)
(b) M should be as small as possible (minimum number of rooms).
• Greedy approach:
(a) Sort meetings according to start times 𝑠! ’s. Start with a single
room.
(b) For each meeting in sequence
(i) check if it does not conflict with any of the meetings scheduled
so far, schedule it at earliest opportunity in a room.
(ii) Else schedule it in a new meeting room.
Meeting scheduling problem
• Proof of correctness:
We prove by contradiction.
Suppose the optimal solution requires m ≤ 𝑘 − 1 meeting rooms while the
greedy algorithm requires k rooms.
Let m_i be the first meeting scheduled in the last room k by the greedy approach.
⇒ m_i conflicts with at least one meeting scheduled in each of the rooms 1..k-1
⇒ all these meetings have start times not later than s_i but finish times later than s_i, so they conflict with each other.
⇒ At least k meetings mutually conflict, a contradiction, since no valid schedule can place k mutually conflicting meetings in fewer than k rooms.
• Time complexity:
O(n log n) time for pre-processing.
In each scheduling step, how do we check efficiently if the new meeting does
not conflict with previous meetings ?
Keep track of earliest finish time among the tasks that start latest in each
room and the rooms associated with the tasks. If the next task’s start time is
later than this finish time, schedule it in that room. Else it cannot be
scheduled in any of the rooms used so far and has to be scheduled in a new
room. By keeping the earliest finish times of latest tasks in each room in a
heap, we can do this in O(log n) time in each step.
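A Python sketch of this scheme using heapq for the earliest-finish-time check (room numbering is an illustrative choice):

import heapq

def schedule_meetings(meetings):
    """meetings: list of (start, finish). Returns room assignment per meeting.
    Heap holds (finish time of latest task, room) per room, as described above."""
    order = sorted(range(len(meetings)), key=lambda i: meetings[i][0])
    rooms, assignment = [], {}
    for i in order:
        s, f = meetings[i]
        if rooms and rooms[0][0] <= s:       # earliest-finishing room is free
            _, r = heapq.heappop(rooms)
        else:
            r = len(rooms) + 1               # open a new room
        assignment[i] = r
        heapq.heappush(rooms, (f, r))
    return assignment

# the notes' example m1..m6 (0-indexed): rooms 1,2,3 reused; m5 opens room 4
print(schedule_meetings([(10, 14), (9, 11), (8, 10), (7, 12), (10, 15), (13, 15)]))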
Example of meeting scheduling
• m1: 10(s1)-14(f1), m2: 9-11, m3: 8-10, m4: 7-12, m5: 10-15, m6: 13-15
• Meetings sorted by start times: m4, m3, m2, m1, m5, m6
• Let f_k be the earliest finish time among the latest tasks scheduled in each room, and r_k the corresponding room, after step k (i.e. after k meetings have been scheduled).
R1: m4 (7-12)
R2: m3 (8-10), m1 (10-14)
R3: m2 (9-11), m6 (13-15)
R4: m5 (10-15)
f1 = 12, f2 = 10, f3 = 10, f4 = 11, f5 = 11, f6 = 12
r1 = 1, r2 = 2, r3 = 2, r4 = 3, r5 = 3, r6 = 1
AVL tree insertion examples
[Rotation diagrams not reproduced; they repeat the "Case a"–"Case d" pictures shown later, with balance factors (BF) at each node and subtrees T1..T4.]
Case d: Insertion into T2 or T3 changes BF of B from 0 to +1 and of A from -1 to -2; fixed by a double rotation of C (left, right).
• Insert 60: single node, BF = 0
• Insert 40: BF of 60 becomes -1, BF of 40 is 0
• Insert 30: BF of 60 becomes -2; Case c applies – a single right rotation of B (= 40) yields root 40 (BF 0) with children 30 and 60
[Diagram: after further insertions (20, 32, 35, 37, 38, 75), the rebalanced tree has root 40 with a left subtree rooted at 35 and right child 60 (child 75).]
AVL Insertion examples (contd.)
• Insert 39: the BF of 40 changes to -2; Case a applies and a single rotation restores balance.
[Diagram not reproduced: before the fix, 35 and 38 have BF +1, with 39 as the new leaf.]
AVL insertion examples (contd.)
[Diagram: after the fix, root 40 has BF -1, with children 35 (BF 0) and 60 (BF +1, child 75); 35's subtrees contain 30 (with 20, 32) and 38 (with 37, 39).]
• Insert 36: Case d applies – a double rotation of C (left, right) rebalances the tree.
[Final diagram: B = 35 and A = 40 become the children of the rotated subtree root, and all balance factors return to the range -1..+1.]
Time Complexity:
O(h) to find the node to remove, O(h) to find the in-order successor, O(1) to remove the successor node
Total time complexity: O(h)
Total additional space complexity: O(h)
Dictionary operations for BST - delete
(a) If the node to be removed is a leaf, we can just remove it from the tree
[Diagram: removing leaf 27 from the tree 30 / 25 / (20, 27) leaves 30 / 25 / 20.]
(b) If a node has only one child, then make the child the child of its parent
[Diagram: removing 25 (single child 20) from 30 / 25 / 20 links 20 directly under 30.]
Dictionary operations for BST - delete
(c) When the node to be removed has 2 children, replace it with its in-order successor, i.e. the minimum key in its right subtree (the left-most node there)
[Diagram: removing 30 from a tree containing 20, 40, 35, 32, 34 replaces 30 by its in-order successor 32, and 32's child 34 moves up.]
Build a BST
• To build a binary search tree with n keys
• If we randomly choose a sequence a_1, a_2, …, a_n from the given keys and build using insertElement() n times, the worst-case complexity is O(n²), as the tree may be skewed and the height can be O(j) for a j-element tree
• But the average time complexity can be shown to be O(n log n)
• Also, using DP we can build an optimal binary search tree for a set of n keys with p_i the probability of executing a find() with key equal to a_i, q_i the probability of executing a find() with a_i < key < a_{i+1}, and q_0 the probability of executing a find() with key < a_1
Balanced Binary Search Trees
• Satisfies some height balancing property at
every node of the tree
• Recursive structure
• Height balancing property typically guarantees
logarithmic bounds on tree height
• Insertions and removals restructure trees to
guarantee this property with minimum
overhead
• They involve rotation operations.
AVL tree
• Due to inventors Adelson-Velskii and Landis
• Balance factor at a node = Height of right subtree – height of left
subtree rooted at that node
• Binary search tree with height-balancing property : balance factor
at each node is 0,-1 or 1
• Keys stored only in internal nodes
• Note # of external nodes = # internal nodes + 1
• An AVL tree of height "h" has at least n(h) nodes, n(0) = 0, n(1) = 1 and n(h) = 1 + n(h-1) + n(h-2), h ≥ 2. Recognize n(h)? Like the Fibonacci sequence.
• It has at most m(h) nodes, m(h) = 1 + 2·m(h-1), h ≥ 2, with m(0) = 0 and m(1) = 1, so m(h) = 2^h - 1
• Can show n(h) > 2^(h/2 - 1) → h < 2 log n + 2 → h is O(log n)
• Also we see that m(h) = 2^h - 1 → n < 2^h → h > log n → h is Ω(log n)
• FindElement() takes O(log n) time.
O(log n) AVL tree insertion
• May cause balance factor at an ancestor node of inserted
node to change to -2 or +2.
• Fixing it requires only one “rotation” operation which takes
O(1) time as it requires only pointer changes
• 4 cases:
(a) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to +1
Left rotate B to move its right subtree up one level
(b) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to -1 as its left child C’s BF changes from 0 to
+1 or -1
Double rotate C (right followed by left) to make A and B its
children
(c) & (d) are “mirror” cases of (a) and (b)
Case a: Insertion into T3 changes BF of B from 0 to +1 and of A from +1 to +2
[Diagram: single left rotation of B. A (BF +2) with right child B (BF +1) becomes B (BF 0) with children A (BF 0) and T3; subtrees T1 and T2 stay under A.]
Case c: Insertion into T1 changes BF of B from 0 to -1 and of A from -1 to -2
[Diagram: single right rotation of B, the mirror image of Case a; B (BF 0) ends up with children T1 and A.]
Case d: Insertion into T2 or T3 changes BF of B from 0 to +1 and of A from -1 to -2
[Diagram: double rotation of C (left, right). C (BF -1 or +1) becomes the subtree root with BF 0 and children B and A; subtrees T1, T2, T3, T4 are reattached in order.]
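A minimal Python sketch of the single-rotation primitive (node fields and helper names are assumptions, not from the slides; a full AVL insert would also walk back up the path updating balance factors):

class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def height(v):
    return v.height if v else 0

def update(v):
    v.height = 1 + max(height(v.left), height(v.right))

def rotate_left(a):
    """Case a: A has BF +2 with right child B; returns the new subtree root B."""
    b = a.right
    a.right = b.left          # T2 moves under A
    b.left = a
    update(a); update(b)      # update A first, since it is now B's child
    return b

Case c is the mirror image (rotate_right), and Case d composes the two rotations.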
Search Trees (contd.)
O(log n) AVL tree deletion
• First do deletion as in ordinary binary search
tree (discussed before)
• This may cause BF of an ancestor node to
change to +2 or -2.
• Use “rotation” operations as in insertion to
balance the tree
• Can be shown to be O(log n) in the worst-case.
Multi-way search trees
• Stores more than one key in an internal node; if a node has at most "m" keys k_1 ≤ k_2 ≤ … ≤ k_m, then it has m+1 children
• Any key k in the subtree of the i-th child (1 ≤ i ≤ m+1) satisfies k_{i-1} < k < k_i, assuming k_0 = -∞ and k_{m+1} = ∞
• Examples : 2-4 tree (m=3), B-tree (m depends on disk
block size)
• Maintains height balance property by having every
external node at the same height.
• Height of tree is Θ(log n) where n is number of keys.
2-4 tree
• A search tree with each node having either 2,3 or 4
children.
• Every path from root to external node has same
length.
• A tree of height h must have at most 4^h external nodes and at least 2^h external nodes.
• Hence 2^h ≤ n+1 ≤ 4^h, where n is the number of keys
• → log(n+1)/2 ≤ h ≤ log(n+1) → h is Θ(log n)
• FindElement() takes Θ(log n) time
Example of 2-4 tree
Insertion in 2-4 tree
• Use FindElement() to reach bottom internal node.
• Case 1 : No overflow in internal node to be inserted
(i.e. # of keys < 3 before insertion)
• Case 2 : Overflow in a node which is not the root. Add the key, split the node (making an extra child of the parent) and push the middle key of the node up to be inserted into the parent. (recursive case)
• Case 3 : Overflow in the root. Create a new root node and make the overflowing node and its split node its children. Increases the height of the tree.
• Time complexity : Down and up phases take Θ(log n)
time (comparisons + pointer changes)
Deletion from 2-4 tree
• If it is not in a bottom internal node, replace key by the smallest
item in the subtree whose keys > key.
• Problem reduces to removing key from a bottom internal node
with only external children.
• Case 1: No underflow - # of keys in node > 1 before deletion.
• Case 2 : Underflow in node whose parent is not root. (recursive
case)
2a : Immediate sibling has at least 2 keys from which we can
transfer a key to this node or thru’ a transfer of a key from parent.
2b : All siblings have just one key. Merge with a sibling node
by having a new node and moving a key from parent to this new
node.
• Case 3 : Underflow in a node whose parent is the root. A new replacement root node causes the height to decrease.
• Time complexity : Down and up phases take Θ(log n) time.
B-tree
• m-way search tree used for external searching of records as in
a database (B-tree index)
• # of keys in internal node allowed to vary between “d” and
“2d”. Branching factor - # of keys + 1
• “d” depends on size of disk block
• Insertion/deletion work the same way as in 2-4 tree.
• Typically only when a node is accessed, it is brought into main
memory from disk; entire tree not kept in main memory.
• # of levels ≤ 1 + log_{d+1}((n+1)/2), where n is the number of keys
• For n ≈ 2 million and d ≈ 100, # of levels is at most 3 requiring
at most 3-4 disk accesses.
• In B+ tree, records with keys kept in leaf nodes
• In B* tree, non-root nodes have at least 2/3 full capacity.
Instead of splitting/merging, sometimes keys may be
transferred from/to sibling nodes.
B- tree (order 7) example
Basic data structures-2
Amortization
• Many dynamic data structures do well for a
sequence of operations though the worst-case
time complexity based on a single operation
may be high
• But we amortize the restructuring cost of a
data structure over the sequence of n
operations – restructure for future benefit
• Two methods for time complexity analysis :
(a) Accounting method
(b) Potential function method
Accounting method
• Assign an amortized cost to each operation which may be less than
or greater than the actual cost of the operation.
• Operations whose actual cost is less than amortized cost will help
pay for operations whose actual cost is more than amortized cost
using “credits” which are associated with data structure.
• We require that sum of amortized costs for n operations must be at
least the sum of actual costs so that total credit is always positive.
• For dynamic array example, set amortized cost to be 3 for each
operation: 1 unit for inserting element itself, one for copying it in
the future when array is doubled and 1 unit for an item already
copied during array doubling before the insertion of this item
• 2 credits will help pay for actual cost when array is doubled.
• Total actual cost ≤ Total amortized cost ≤ 3n
Potential Function
• Associate a “potential” with each state of the data structure
that represents prepaid work for future operations.
• Assume 𝐷! is initial state and let 𝐷" be the state after the i-th
operation.
• Define potential function φ : 𝐷" → R s.t.
amortized cost for i-th operation 𝑒" =
actual cost 𝑐" + φ(𝐷" ) - φ(𝐷"#$)
• Sum of amortized cost ∑&"%$ 𝑒" = sum of actual cost ∑&"%$ 𝑐" +
φ(𝐷& ) - φ(𝐷!) (due to telescopic sum)
• If we choose potential function s.t. φ(𝐷& ) ≥ φ(𝐷!), we get an
upper bound on total actual cost using total amortized cost.
Potential function (contd.)
• For dynamic arrays, we can set φ(T) = 2·T.num – T.size.
• Initial value of φ is 0.
• Immediately before an expansion, T.num = T.size => φ = T.num
• Immediately after an expansion, T.size = 2·T.num => φ = 0.
• Since the array is always at least half full, T.num ≥ T.size/2, hence φ(T) ≥ 0
• Let num_i be the number of elements after the i-th operation; note num_i – num_{i-1} = 1.
• Two cases:
a. i-th operation does not cause an expansion:
   e_i = c_i + φ_i – φ_{i-1} = 1 + (2·num_i – size) – (2·num_{i-1} – size) = 1 + 2 = 3
b. i-th operation triggers an expansion (1 insert and num_i – 1 copy operations):
   e_i = num_i + (2·num_i – 2·(num_i – 1)) – (2·(num_i – 1) – (num_i – 1)) = 3
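A small Python check of this analysis (a sketch; the doubling array is simulated rather than taken from any library):

def doubling_insert_costs(n):
    """Yield (actual cost c_i, amortized cost e_i = c_i + phi_i - phi_{i-1})
    for n appends into a doubling array, with phi = 2*num - size."""
    size, num, phi_prev = 0, 0, 0
    for _ in range(n):
        cost = 1                          # inserting the element itself
        if num == size:                   # full: copy num elements over
            cost += num
            size = max(1, 2 * size)
        num += 1
        phi = 2 * num - size
        yield cost, cost + phi - phi_prev
        phi_prev = phi

print(list(doubling_insert_costs(9)))     # amortized cost settles at 3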
Linked list implementation
• Each node has prev and next links (doubly linked)
• “count” for number of items in the list
insertAfter(p,e): // p can be null if need to insert as first item in the list
v ← newly allocated node
v.item ← e
v.prev ← p
if p = null
v.next ← head
else
v.next ← p.next
if v.next = null
tail ← v
else
v.next.prev ← v
if v.prev = null
head ← v
else
v.prev.next ← v
count ← count + 1
return v
Linked List (contd.)
remove(p):
elem ← p.item
if p.prev = null
head ← p.next
else
(p.prev).next ← p.next
if p.next = null
tail ← p.prev
else
(p.next).prev ← p.prev
p.prev ← null; p.next ← null;
count ← count - 1
return elem
Time complexity comparison
Array vs Linked List
Operations Array Linked List
size, isEmpty O(1) O(1)
atRank,rankOf,elemAtRank O(1) O(n)
first,last,before,after O(1) O(1)
insertAtRank,removeAtRank O(n) O(n)
insertFirst, insertLast O(1) O(1)
insertAfter, insertBefore O(n) O(1)
remove O(n) O(1)
Binary Tree
• A recursive data structure with a root node and left
and right children being roots of binary trees
themselves
• We use convention that each internal node has
exactly two children and an external node (leaf) has
no children
• Operations:
leftChild(v) – left child of node v; error for external
node v
rightChild(v) – right child of node v, error for
external node v
isInternal(v) – true iff v is an internal node
Binary tree properties
• Depth of a node is number of internal nodes in the path from
root to the node. Root has depth 0
• Height h of a tree is the maximum depth of an external node
in the tree.
• Height of tree = 1 + max (height of left sub tree, height of right
sub tree)
• # of external nodes = 1 + # of internal nodes (by induction on
tree height)
• h ≤ # of internal nodes ≤ 2^h - 1
• h+1 ≤ # of external nodes ≤ 2^h
• 2h+1 ≤ # of nodes ≤ 2^(h+1) - 1
• For a tree with n nodes, log(n+1) - 1 ≤ h ≤ (n-1)/2
Binary tree traversals
• Preorder – root, left subtree, right subtree
• Inorder – left subtree, root, right subtree
• Postorder – left subtree, right subtree, root
binaryPreorder(T,v):
Input : Binary tree T, a node v
Output: Perform action on each node of subtree rooted
at node v
performAction(v)
if T.isInternal(v)
binaryPreorder(T, T.leftChild(v))
binaryPreorder(T, T.rightChild(v))
Time complexity – O(n) as each node is visited only once
BT array implementation
• A tree with height h needs an array A[0..n] where n = 2^(h+1) - 1
• Rank(v) determines index of node v in A
• Rank(v) = 1 for root v
Rank(leftChild(v)) = 2 * Rank(v)
Rank(rightChild(v)) = 2 * Rank(v) + 1
• For a sparse tree, many cells in A will be unused; space complexity is O(2^h) in the worst case
• Simple and fast for many tree operations.
• Can use dynamic arrays for expanding trees.
• Revisit this implementation for heap ADTs.
BT Linked structure
• Similar to a linked list, each node has an item as well as links to its left and right child nodes (null for external nodes) and optionally a link to its parent (null for the root)
• Easy to extend to non-binary trees
• Space-efficient as complexity is O(n) where n is
number of nodes
• Time complexity for many operations is comparable
to array implementation but small overhead for
dereferencing the links.
• Revisit this implementation for binary search trees.
Basic data structures-3
Min-Priority Queue ADT
• Allows a totally ordered set of elements to be stored
in such a way that “minimum” element can be
extracted efficiently – self-reorganizing structure
• Useful for task scheduling, efficient sorting
• Operations:
insertElement(e) – insert element in queue
removeMin() – remove and return smallest
minElement() – return minimum element
• In a Max-heap, maximum elements are of interest.
• In Java PriorityQueue<E> is a class based on
unbounded heap
PQ – simple array implementation
• Two approaches:
(a) Keep array unordered
(b) Keep array sorted at all times
• Unordered array
(a) insertElement(E) – add element to end of array – O(1) time
(b) minElement() – O(n) time even in best-case
(c) removeMin() – O(n) time even in best-case
basis of Selection Sort – O(n²) even in the best case
For a maximization problem π, define R_A(I) = OPT(I) / A(I)
• The absolute performance ratio R_A is given by the smallest value r ≥ 1 such that R_A(I) ≤ r for all instances I of π.
Travelling Salesman Problem
• Triangular inequality constraint:
The graph satisfies the property that
w(v_i, v_j) ≤ w(v_i, v_k) + w(v_k, v_j) for any 3 distinct vertices v_i, v_j and v_k.
• There is an approximation alg. A for TSP with triangular inequality that achieves R_A < 2.
• It constructs an Euler tour of the minimum
cost spanning tree of the graph and applies
triangular inequality to find a TSP tour.
Approximation Schemes
• An approximation algorithm A which, given an accuracy requirement ε > 0 and an instance I of the problem, constructs a solution in polynomial time and achieves R_A(I) ≤ 1 + ε
(i.e. a range of approximation algorithms, one for each ε)
• A fully polynomial-time approximation scheme runs in time which is a polynomial function of the length of the input and 1/ε.
It has time-accuracy trade-offs.
Knapsack problem
• Problem: Max Σ_{i=1}^n b_i·t_i
  s.t. Σ_{i=1}^n w_i·t_i ≤ W, t_i = 0 or 1, ∀ i
• Decision version is NP-Complete (can reduce the Partition problem to Knapsack)
• Note the DP algorithm has time complexity O(nW). This is NOT a polynomial-time algorithm, as the length of the input includes log W and the time complexity is O(n·2^(log W)). It is called a "pseudo-polynomial time" algorithm.
• By scaling the problem depending on the accuracy ε, we can design an approximation scheme for the knapsack problem that works in O(n³/ε) time
[Plots not reproduced: running time vs input size. For O(g(n)), f(n) lies below c·g(n) for large inputs. For f(n) ∈ Θ(g(n)), f(n) and g(n) are asymptotically equal, up to a constant factor: f(n) lies between c·g(n) and c'·g(n). For o(g(n)), f(n) is less than in the asymptotic sense – it approaches but never touches c·g(n).]
Innovate:
Achieve the text compression goal using a variable-length encoding scheme
Core Idea:
Most-frequently used characters use the least number of bits and vice versa.
Input string X:
a fast runner need never be afraid of the dark
Example codewords: a: 010, f: 1100
Representation Convention:
• Circle: internal node; Square: external node
• v: a vertex/node
• c: a character
• C: collection of characters
Define the tree T (with root), with total path weight p(T).
Subtrees T1, T2, T3, …: each subset of characters is coded by a subtree that is itself optimal for the subproblem of coding that subset – the optimal tree structure T is built from optimal subtrees.
Array index:    1   2   3   4   5   6   7   8   9  10
Unsorted list:  2   3  13   5   9  15  16  11  17  18
Sorted list:    2   3   5   9  11  13  15  16  17  18   (insertion order: 2,3,13,5,9,15,16,11,17,18)
Min heap:       2   3  13   5   9  15  16  11  17  18
Max heap:      18  17  15  11  16   3  13   2   9   5
Complexity: O(n + d·log d), where d is the number of distinct characters c in X (the size of C):
O(n) to count the character frequencies, O(log d) per heap operation, and O(d·log d) over the d-1 greedy merges.
1-level tree containing 2 lowest frequency characters is part of an optimal tree
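A Python sketch of the greedy construction (the tree representation and function names are assumptions):

import heapq
from collections import Counter
from itertools import count

def huffman_codes(text):
    """Greedy Huffman: repeatedly merge the 2 lowest-frequency subtrees.
    O(n) counting + O(d log d) heap work, d = # of distinct characters."""
    freq = Counter(text)
    tiebreak = count()                    # keeps heap comparisons on (freq, int)
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two lowest frequencies
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))
    codes = {}
    def walk(tree, prefix=""):
        if isinstance(tree, tuple):       # internal node
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: a character
            codes[tree] = prefix or "0"
    walk(heap[0][2])
    return codes

print(huffman_codes("a fast runner need never be afraid of the dark"))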
The 0-1 knapsack decision problem is NP-complete. A greedy algorithm gives only a heuristic here (its solution is not necessarily optimal); we solve the problem exactly with dynamic programming.
Example:
c = 11. 5 objects with weights & values as follows [table not reproduced].
Each round's table entry at capacity c is the larger of:
(a) the previous round's result at the same capacity c (skip the new object), and
(b) the current (new object's) value + the previous round's result at capacity c - w_i (take it).
Round 1: only consider object 1 with w1 and v1 (total weight of objects so far = 1)
Round 2: only consider objects 1 & 2 (total weight of objects so far = 3)
Round 3: only consider objects 1, 2, 3:
  c = 7, i = 3, w3 = 5: max(F(7), V3 + F(7-5)) = max(7, 18 + 6) = 24
  c = 5, i = 3, w3 = 5: max(F(5), V3 + F(5-5)) = max(7, 18 + 0) = 18
  For capacities c < 5, keep the entries the same, since w3 = 5 > current capacity.
(Here F denotes the previous round's row, written F4 on the slides.)
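A Python sketch of the round-by-round table; the instance below is an assumption, chosen to be consistent with the walkthrough's numbers (w3 = 5, V3 = 18, capacity 11):

def knapsack01(w, v, C):
    """F[i][c] = best value using the first i objects with capacity c.
    F[i][c] = max(F[i-1][c], v_i + F[i-1][c - w_i])   (one DP round per object)"""
    n = len(w)
    F = [[0] * (C + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(C + 1):
            F[i][c] = F[i - 1][c]                      # keep previous round
            if w[i - 1] <= c:                          # or add object i
                F[i][c] = max(F[i][c], v[i - 1] + F[i - 1][c - w[i - 1]])
    return F[n][C]

print(knapsack01([1, 2, 5, 6, 7], [1, 6, 18, 22, 28], 11))   # 40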
[Example: two lists of orders (shares, price, age)]
Shares  Price  Age     |  Shares  Price  Age
1000    4.05   20s     |  2000    4.06   10s
100     4.05   6s      |  500     4.07   70s
2100    4.02   2s      |  1000    4.07   50s
2500    4.01   85s     |  2100    4.20   5s
                       |  100     4.21   1s
A[1..i-1] (sorted) | A[i..n] (unsorted)
In-place algorithm (selection sort):
for i = 1 to n-1                 // loop (n-1) times
  s = i                          // assign index i to a temp variable
  for j = i + 1 to n             // loop through i+1 to n
    if A[j] < A[s] then
      s = j
  if s ≠ i
    swap A[i] and A[s]
i = 1: runs n-1 comparisons
i = 2: runs n-2 comparisons
…
i = n-1: runs 1 comparison
(n-1)+(n-2)+…+3+2+1 comparisons = O(n²); O(n) swaps/exchanges
A[1..i-1] (sorted) | A[i..n] (unsorted)
In-place algorithm (insertion sort): assign index i to a temp variable along with its value
for i = 1 to n-1                 // loop (n-1) times
  s = i
  t = A[i]
  for j = i - 1 downto 0         // loop backward
    if A[j] > t then             // shift larger values one slot right
      A[s] = A[j]
      s = s - 1
    else break
  A[s] = t                       // found the place for the old A[i]
i = 1: runs 1 comparison
i = 2: runs 2 comparisons
…
i = n-1: runs n-1 comparisons
(n-1)+(n-2)+…+3+2+1 comparisons in the worst case = O(n²)
Legend: pivot element = A[0]; scan for values greater than / less than the pivot; swap/exchange where marked.
44 75 23 43 55 12 64 77 33
Step 1: search from the left end for the first value > pivot (Left)
Step 2: search from the right end for the first value < pivot (Right)
Step 3: exchange these values:
44 33 23 43 55 12 64 77 75
Repeatedly move Left and Right until the condition is met again:
44 33 23 43 12 55 64 77 75
Stop when Left has "passed" Right. Exchange the pivot with A[Right]:
12 33 23 43 44 55 64 77 75
Repeat on L (the elements < pivot):
12 33 23 43 | 44 55 64 77 75
From the left / from the right scans:
12 23 33 43 | 44 55 64 77 75
Repeat on G (the elements > pivot):
12 23 33 43 44 | 55 64 77 75
12 23 33 43 44 | 55 64 75 77
Done!
The recursion processes successively smaller subarrays:
44 75 23 43 55 12 64 77 33
12 33 23 43 / 55 64 77 75
33 23 43 / 64 77 75
23 43 / 77 75
75
Observation:
The height of the quick-sort tree is …………………………………….. in the worst case. When will this happen?
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9
3 4 5 6 7 …
4 5 6 7 …
………. etc.
(On already-sorted input, each partition strips off only the pivot, so the recursion depth reaches n-1.)
depth = 0: root node processes n values
depth = 1: root's 2 children process n-1 values
depth = 2: all nodes process n-(1+2) values
…
depth = i: all nodes process n-(1+2+…+2^(i-1)) = n-(2^i - 1) values
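A Python sketch of the partitioning scheme walked through above (pivot = A[lo], Left/Right scans, pivot swapped with A[Right] at the end):

def quicksort(A, lo=0, hi=None):
    """In-place quicksort with A[lo] as the pivot."""
    if hi is None:
        hi = len(A) - 1
    if lo >= hi:
        return A
    pivot, left, right = A[lo], lo + 1, hi
    while True:
        while left <= right and A[left] <= pivot:  # first value > pivot from left
            left += 1
        while A[right] > pivot:                    # first value <= pivot from right
            right -= 1
        if left > right:                           # scans have passed each other
            break
        A[left], A[right] = A[right], A[left]
    A[lo], A[right] = A[right], A[lo]              # pivot into its final place
    quicksort(A, lo, right - 1)
    quicksort(A, right + 1, hi)
    return A

print(quicksort([44, 75, 23, 43, 55, 12, 64, 77, 33]))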
[Diagram: example dependency graph with tasks "Master Data Structures", "Master A Language", "Get an A in CS435", "Graduate School".]
[Slide animation frames not reproduced: a step-by-step run on an example graph with vertices A–I and unit edge weights.]
Goal: To approximate the distance in G from v to every other vertex u ≠ v
Always store the length of the best path we have found so far from v to u:
D[v] = 0
D[u] = +∞ for u ≠ v
Each iteration,
-> we select the vertex u with smallest D[u] not yet in C and put it into C
-> update D[z] for each z adjacent to u and not yet in C
[Diagram: example run from BWI on the airport graph; not reproduced.]
From v:
G: adjacency list structure.
Q: heap keyed on D[u] for all u ≠ v; removeMin in O(log n) time.
Keep an array giving access to the keys of the vertices in the heap Q.
Update D[z] by first removing the object containing z, then inserting z with its new D[z] via heapification, in O(log n) time.
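A Python sketch of the algorithm. Note it uses the common lazy-deletion variant (push a fresh heap entry instead of updating a key, and skip stale entries), rather than the remove-and-reinsert update described above; the O((n+m) log n) bound is the same. The tiny airport instance is an illustrative assumption:

import heapq

def dijkstra(adj, v):
    """adj: {u: [(z, w), ...]} adjacency lists; returns distances D from v."""
    D = {u: float('inf') for u in adj}
    D[v] = 0
    heap, done = [(0, v)], set()          # 'done' plays the role of C
    while heap:
        d, u = heapq.heappop(heap)        # smallest tentative D[u]
        if u in done:
            continue                      # stale entry
        done.add(u)
        for z, w in adj[u]:
            if z not in done and d + w < D[z]:
                D[z] = d + w              # relax edge (u, z)
                heapq.heappush(heap, (D[z], z))
    return D

adj = {'BWI': [('ORD', 621), ('JFK', 184)],
       'JFK': [('ORD', 740), ('BWI', 184)],
       'ORD': [('JFK', 740), ('BWI', 621)]}
print(dijkstra(adj, 'BWI'))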
Consider (ORD,LAX):
D[ORD] + w((ORD,LAX)) = 2 + 3 = 5 < D[LAX] = 50
-> D[LAX] = 5
Consider (ORD,DFW):
D[ORD] + w((ORD,DFW)) = 2 + (-10) = -8 < D[DFW] = -5
-> D[DFW] = -8
Consider (DFW,LAX):
D[DFW] + w((DFW,LAX)) = -13 + 12 = -1 < D[LAX] = 4
-> D[LAX] = -1
(edge relaxations in the example starting from BWI)
1. Requirement:
   1. Run Internet cable through the neighborhood.
   2. Interconnect all the houses as cheaply as possible.
[Diagram: neighborhood graph with edge weights 5, 7, 10, 20; the cheap interconnection keeps the light edges.]
Definition: A tree that contains all the vertices of a weighted undirected graph G is a spanning tree.
Definition: A minimum spanning tree is a spanning tree T that minimizes the sum of the weights of its edges.
Proof by contradiction:
Suppose, for the sake of contradiction, that there is an edge f in T with
w(e) < w(f)
Then we can remove f from T and replace it with e, and this will result in a spanning tree T' with
w(T') < w(T)
But the existence of such a tree T' would contradict the fact that T is a minimum spanning tree.
Edge (Weights)
(JFK, PVD) 144
(BWI, JFK) 184
(BOS, JFK) 187
(LAX, SFO) 337
(BWI, ORD) 621
(JFK, ORD) 740
(DFW, ORD) 802
(ORD, PVD) 849
(BOS, ORD) 867
(BWI, MIA) 946
(JFK, MIA) 1090
(DFW, MIA) 1121
(DFW, LAX) 1235
(BOS, MIA) 1258
(DFW, SFO) 1464
(LAX, MIA) 2342
(BOS, SFO) 2704
{BOS}, {BWI} {SFO} {LAX} {DFW}, {ORD} {PVD, JFK} {MIA}
Simple graph: no self-loops, no multiple edges.
Clusters stored as unordered linked lists:
- find(u): O(1) (each vertex keeps a reference to its cluster)
- union: O(min{|C(u)|, |C(v)|}), by relabeling the smaller cluster into the larger.
In each merge the smaller cluster is relabeled into the larger one, so whenever a vertex's cluster changes, its size at least doubles; each vertex is relabeled O(log n) times.
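A Python sketch of Kruskal's algorithm with the unordered-list clusters just described (merge the smaller cluster into the larger):

def kruskal(n, edges):
    """edges: list of (w, u, v) with vertices 0..n-1; returns the MST edges."""
    edges = sorted(edges)                     # by weight
    cluster = list(range(n))                  # cluster[u] = representative
    members = {u: [u] for u in range(n)}
    mst = []
    for w, u, v in edges:
        cu, cv = cluster[u], cluster[v]
        if cu == cv:
            continue                          # would create a cycle
        if len(members[cu]) < len(members[cv]):
            cu, cv = cv, cu                   # relabel the smaller cluster
        for x in members[cv]:
            cluster[x] = cu
        members[cu] += members.pop(cv)
        mst.append((u, v, w))
        if len(mst) == n - 1:
            break
    return mst

print(kruskal(4, [(1, 0, 1), (4, 1, 2), (3, 0, 2), (2, 2, 3)]))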
1. Rings: Algebraic structures
Denoted by (R, +, ∘, 0, 1)
1. Definition: A set R equipped with 2 binary operations (+ and ∘) satisfying
   1. R is an abelian group under addition
      1. (a+b)+c = a+(b+c) for all a,b,c in R (associative)
      2. a+b = b+a for all a,b in R (commutative)
      3. a+0 = a for all a in R (0 is the additive identity)
      4. a+(-a) = 0 (-a is the additive inverse of a)
   2. R is a monoid under multiplication
      1. (a ∘ b) ∘ c = a ∘ (b ∘ c) for all a,b,c in R (associative)
      2. An element 1 in R s.t. a ∘ 1 = a and 1 ∘ a = a (multiplicative identity)
   3. Multiplication is distributive w.r.t addition
      1. a ∘ (b+c) = (a ∘ b) + (a ∘ c) (left distributivity)
      2. (b+c) ∘ a = (b ∘ a) + (c ∘ a) (right distributivity)
Note: commutativity of ∘ is not required (a ring whose multiplication commutes is a commutative ring).
• ({0,1,2,…,m-1}, +_m, *_m, 0, 1) where +_m and *_m are addition and multiplication modulo m (a commutative ring)
• (M_n, +, *, 0_n, I_n) where M_n is the set of all n x n matrices with elements from a ring, + and * are matrix addition and multiplication, 0_n is the zero matrix and I_n is the identity matrix
• The set M2(R) of all 2x2 matrices over R is a ring using matrix addition and multiplication.
Matrix ring laws (A, B, C in M_n):
1. A + B = B + A
2. (A+B) + C = A + (B+C)
3. A + 0_n = A
4. A + (-A) = 0_n
5. (AB)C = A(BC)
6. AI = A = IA
7. A(B+C) = AB + AC and (B+C)A = BA + CA
(i,j) entry of the product of two n x n matrices: go across the i-th row of A and down the j-th column of B, multiply & sum – n multiplications and n-1 additions per (i,j) pair. There are n² pairs (n columns in B for each of the n rows in A), giving n³ multiplications overall.
Instead of 8 multiplications and 4 additions as above, there are only 7 multiplications and 18 additions/subtractions:
m1 = (a12 - a22)(b21 + b22)
m2 = (a11 + a22)(b11 + b22)
m3 = (a11 - a21)(b11 + b12)
m4 = (a11 + a12) b22
m5 = a11 (b12 - b22)
m6 = a22 (b21 - b11)
m7 = (a21 + a22) b11
c11 = m1 + m2 - m4 + m6
c12 = m4 + m5
c21 = m6 + m7
c22 = m2 - m3 + m5 - m7
Main idea: divide-and-conquer with n/2 x n/2 submatrices.
T(n) = 7 T(n/2) + 18(n/2)², n ≥ 2
T(1) = 1
T(n) counts ring operations.
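The 7-product formulas checked numerically in Python (one level only; in the divide-and-conquer algorithm the a's and b's are n/2 x n/2 blocks):

def strassen_2x2(A, B):
    """One level of Strassen on 2x2 matrices: 7 mults, 18 adds/subs."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a12 - a22) * (b21 + b22)
    m2 = (a11 + a22) * (b11 + b22)
    m3 = (a11 - a21) * (b11 + b12)
    m4 = (a11 + a12) * b22
    m5 = a11 * (b12 - b22)
    m6 = a22 * (b21 - b11)
    m7 = (a21 + a22) * b11
    return [[m1 + m2 - m4 + m6, m4 + m5],
            [m6 + m7, m2 - m3 + m5 - m7]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]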
Homomorphisms, isomorphisms, decomposition, subrings, primes…. (Abstract Algebra)
Introduction to Homomorphisms:
R: a set with the structure of an additive abelian group & a multiplicative monoid, together with the distributive laws.
θ: R → S
Exercise: θ(r²su⁵ – 3su⁻²r + 2) = ?
Exercise: if θ: R → S is a ring homomorphism, show that … is also a ring homomorphism, where … [the maps were given in a figure that is not reproduced].
Find x = ? where
x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7)
x = Σ c_i·d_i·u_i mod p, where p = p_0·p_1·…·p_{k-1}, c_i = p/p_i, and d_i = c_i⁻¹ mod p_i
p = p0·p1·p2 = 3x5x7 = 105

i | p_i | c_i = p/p_i | d_i = c_i⁻¹ mod p_i                    | c_i·d_i·u_i
0 |  3  |     35      | 35·d0 ≡ 1 mod 3 → 2·d0 ≡ 1 mod 3 → d0 = 2 | 35x2x2 = 140
1 |  5  |     21      | 21·d1 ≡ 1 mod 5 → d1 = 1                   | 21x1x3 = 63
2 |  7  |     15      | 15·d2 ≡ 1 mod 7 → d2 = 1                   | 15x1x2 = 30

x = 140 + 63 + 30 = 233 ≡ 23 (mod 105)
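A Python sketch of the same computation (pow(c, -1, p) computes the modular inverse; Python 3.8+):

def crt(residues, moduli):
    """x ≡ u_i (mod p_i) for pairwise coprime p_i; x = Σ c_i·d_i·u_i mod p."""
    p = 1
    for pi in moduli:
        p *= pi
    x = 0
    for ui, pi in zip(residues, moduli):
        ci = p // pi
        di = pow(ci, -1, pi)          # d_i = c_i^{-1} mod p_i
        x += ci * di * ui
    return x % p

print(crt([2, 3, 2], [3, 5, 7]))      # 23: 140 + 63 + 30 = 233 ≡ 23 (mod 105)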
"Others assumed large servers were the fastest way to handle massive amounts of data.
Google found networked PCs to be faster,“ – Google Info Web page.
PR(A) can be interpreted as the probability of visiting web-page A during the random walk obtained through the application of the following 2 rules we learnt.
Understand how, with a simple iterative algorithm, PR(A) can be computed in a few iterations.
For every link (i,j) in E that points to document dj …
For the example graph with edges 0→1, 0→2, 1→2 (read off from the update equations), the per-node hub/authority updates are:
a(0) = 0
a(1) = h(0)
a(2) = h(0) + h(1)
h(0) = a(1) + a(2)
h(1) = a(2)
h(2) = 0
Initialize h: 1 1 1, a: 1 1 1
Step 1: a: 0 1 2, then h: 3 2 0
In matrix form, iteration i computes a(i) = Aᵀ·h(i-1) and h(i) = A·a(i).
Power method:
a(1) = Aᵀh(0)
a(2) = Aᵀh(1) -> a(2) = AᵀA·a(1)
a(i) = Aᵀh(i-1) -> a(i) = AᵀA·a(i-1) = (AᵀA)^(i-1)·a(1)
h(i) = A·a(i) = AAᵀ·h(i-1) = (AAᵀ)^(i-1)·h(1)
Avoid a, h growing large by scaling so that the entries always stay within [0,1].
What does convergence mean?
PageRank(G,V,E) //The code performs a SYNCHRONOUS Rank update
1. for all vertices u in V /* Initialization Step */
2. Src[u] = 1/n;
3. small = something-small;
4. while (convergence-distance > small) {
5. for all v in V
6. D[v]=0;
7. for(i=0;i<|V|;i++) { /* for each vertex u = vi */
8. Read-Adjacency-List(u,m,k1,k2,...,km); /* k1,k2,…,km: endpoints of outgoing edges */
9. for(j=1;j<=m;j++) /* m : out-degree of vertex u */
10. D[kj] = D[kj] + Src[u]/m
11. }
12. for all v in V
13. D[v] = d * D[v] + (1-d)/n
14. convergence-distance = ||Src-D|| /* Euclidean distance */
15. Src=D;
16. }
Intricacy: Google's Technology page explains how the process gets more complicated:
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual
page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But Google looks at
considerably more than the sheer volume of votes, or links the page receives. For example, it also analyzes the page that
casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages
"important." Using these and other factors, Google provides its views on the pages' relative importance. And that's still only part
of the protocol.
Performance: It's almost impossible to fathom, but PageRank considers more than 500 variables and 3 billion terms and still
manages to deliver results in fractions of a second. Yet there also is a certain simplicity to the search process.
Searching Problems
Dictionary ADT
• Stores associations between keys and items (also
called associative map)
• Operations :
(a) insertElement(k,e) – insert association between a
key and item; replace if necessary
(b) removeElement(k) – remove association between
key k and its element if it exists, else return
NO_SUCH_KEY error
(c) findElement(k) – find the element associated with
key k if it exists, else return EMPTY_ELEMENT
• Keys are from a set which may or may not be totally
ordered but keys checked for “equality”.
Dictionary implementations
• Store keys as sequence (unordered) with new elements
inserted at end.
-insertElement(k,e) – O(1) time worst-case
- removeElement(k) – O(n) time worst-case
- findElement(k) – O(n) time worst-case
• Hash tables
-- Define a function f : S → {0,1,…N-1} where S is the set of keys
and N is the table size.
--- Use this hash function to identify the index in the table
where the key is stored. (similar to direct access in an array)
--- Since f is not a 1-1 function, more than one key may map into the same index in the table, causing collisions.
-- Ideally f should distribute keys evenly in the table.
Hash tables
• Hash function f is composed of two functions
f1 : S → Z and f2 : Z → {0,1,…,N-1}, where Z is the set of integers.
f1 is called the "hashcode" function (defined in Java's Object) and f2 is called the "compression map"
• Hashcode function should be representative of all fields of an
object and for a primitive type (e.g. integer) representative of
all bits in integer. It is defined independent hash table size.
• Polynomial hash codes:
x_{k-1}·a^{k-1} + x_{k-2}·a^{k-2} + …… + x_1·a + x_0
where x_{k-1}, x_{k-2}, …, x_0 are integer representations of the components of the key object and a is a constant not equal to 1
-- 33, 37, 39, 41 are found to be good choices of a for character strings.
Example hashcode in Java
public class Key implements Comparable<Key> {
private final String firstName, lastName;
public Key(String fName, String lName) { ..}
@Override
public int hashCode() {
int hash = 17 + firstName.hashCode();
hash = hash * 31 + lastName.hashCode();
return hash;
}
@Override
public boolean equals(Object obj) {….}
• a.equals(b) => a.hashCode() == b.hashCode() but not vice
versa.
Hash tables (contd.)
Compression Maps
• Division method:
h(k) = |k| mod N , k is hash code and choose N as a
prime number so as to distribute hash values evenly
among table indices.
• MAD (Multiply, Add and Divide) method:
h(k) = |ak + b| mod N, where a and b are randomly chosen integers s.t. a, b ≥ 0 and a mod N ≠ 0
-- provides a close-to-"ideal" hash function, where
Prob(two keys hash into the same value) ≈ 1/N
Collision resolution
• Each location in hash table is called “bucket”.
• (a) Chaining, (b) Open addressing
• Chaining – keep colliding keys in a list or sequence
called “chain” in the same bucket
-- in findElement(k), after hashing need to search for
keys in the chain
--- in insertElement(k,e) and removeElement(k), need
to hash into bucket and then insert/remove element
in/from chain
• Load factor – n/N (n number of items in hash table)
preferably n/N < 1
• Expected time complexity – O(⌈n/N⌉) (O(1) if n is
O(N))
Open Addressing
• Open Addressing – no chaining of keys that have same
hash value
• Probes other locations for the key
Suppose i = h(k). Probe sequence of locations (i+f(𝑗))
mod N, 𝑗 = 0,1,2,…. until the key is found for search/remove
or until an empty slot is found for insert
-- if f(𝑗) = 𝑗 for all j, it is called “linear probing”
-- if f(𝑗) = 𝑗 ! , it is called “quadratic probing”
--- if f(𝑗) = 𝑗. g(k) where g(k) is another hashing function, it is
called “double hashing”
• Faster than chaining for search/insert but removal is
complicated as there should not be any “holes” in
sequence for a particular h(k)
• Tends to introduce clusters of keys in the table for linear
and quadratic probing.
Universal Hashing
• A family of hash functions that minimizes expected number of
collisions
• Stated formally, let H be a set of functions from [0,M-1] to [0,N-1] satisfying the property that for a randomly chosen function h from H and for any two integers j, k in [0,M-1],
Pr(h(j) = h(k)) ≤ 1/N
• Implies that E[# of collisions between j and "n" integers from [0,M-1]] ≤ n/N
• Set of hash functions of form (ak+ b mod p) mod N
where 0 < a < p, 0 ≤ b < p where p is a prime number with M ≤ p
< 2M ( M is number of hash codes)
can be shown to be “universal”.
• All Dictionary ADT operations can be done in expected time
O(⌈n/N⌉) using a randomly chosen hash function from this set
and chaining.
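A Python sketch of drawing one function from such a universal family (the prime search and parameter names are illustrative assumptions):

import random

def make_universal_hash(M, N):
    """Pick h(k) = ((a*k + b) mod p) mod N with a prime p in [M, 2M);
    Pr[h(j) = h(k)] <= 1/N over the random choice of (a, b)."""
    def is_prime(x):
        return x > 1 and all(x % i for i in range(2, int(x ** 0.5) + 1))
    p = next(x for x in range(M, 2 * M) if is_prime(x))   # one exists (Bertrand)
    a = random.randrange(1, p)        # 0 < a < p
    b = random.randrange(0, p)        # 0 <= b < p
    return lambda k: ((a * k + b) % p) % N

h = make_universal_hash(M=10**6, N=97)
print(h(123456), h(654321))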
FFT computations
CS 610 Spring 2019
Instructor : Ravi Varadarajan
In the third stage, i.e. when m = 2, computations are done by pairing elements in positions that differ in the 3rd bit, i.e. 0 with 1, 2 with 3, 4 with 5 and 6 with 7. For example, the new value for position 0 = 12 + ω⁰·(-1) = 11 and for position 1 = 12 + ω⁴·(-1) = 13. New values for position 6 = 8 + ω³·(-2) = -8 and for position 7 = 8 + ω⁷·(-2) = 7.
The DFT vector is given by [11, 8, 6, 9, 13, 10, 3, 7]. Note that elements in positions 1 and 4 were switched, as well as those in positions 3 and 6.
Let's compute the inverse DFT of the DFT vector to see if we get back the original vector. For the inverse FFT, ω1 = ω⁻¹ = 2⁻¹ ≡ 9 mod 17 ≡ -8 mod 17. The powers of this root are: ω1⁰ = 1, ω1¹ = 9, ω1² = -4, ω1³ = -2, ω1⁴ = -1, ω1⁵ = 8, ω1⁶ = 4, ω1⁷ = 2. We do the exact same type of computations as in the FFT, but at the end we need to multiply the values by n⁻¹ = 8⁻¹ = 15 ≡ -2 mod 17. In the diagram (not reproduced), at the end we get back the original vector for which we computed the DFT before.
Graph algorithms
Graphs
• A directed graph G = (V,E), V is set of vertices and E is
set of edges, i.e. subset of V x V
• A digraph defines a binary relation on V.
• Useful representation in many applications (e.g. web
page links, transportation routes, semantic, social
networks)
• InEdges(v) = { (u,v) | (u,v) ∈ E } and outEdges(v) = { (v,u) | (v,u) ∈ E }
• indeg(v) = |InEdges(v)|, outdeg(v) = |outEdges(v)|
• Σ_{v∈V} indeg(v) = Σ_{v∈V} outdeg(v) = |E|
• |E| ≤ n², where n is the number of vertices, and |E| ≤ n(n-1) if there are no self-cycles.
Undirected graphs
• In undirected graph, no edge orientation.
(u,v) ϵ E → (v,u) ϵ E
• Defines a symmetric relation on V. Not
reflexive i.e. (u,u) not in E.
• incidentEdges(v) = { (u,v) | (u,v) ∈ E or (v,u) ∈ E }
• deg(v) = |incidentEdges(v)|
• Σ_{v∈V} deg(v) = 2|E|
• |E| ≤ n(n-1)/2
Graph data Structures
• Adjacency matrix A – n x n boolean matrix where A(i,j) = 1 iff (v_i, v_j) ∈ E, where V = {v_1, v_2, …, v_n}
  - Useful for matrix operations to solve all-pair problems
• Incidence matrix B – n x m matrix where B(i,j) = 1 if e_j = (v_i, v_k) for some k, and B(i,j) = -1 if e_j = (v_k, v_i) for some k. We assume V = {v_1, v_2, …, v_n} and E = {e_1, e_2, …, e_m} – useful for analysis of some cycle problems
• Adjacency list – array L[0..n-1] of lists, where L[i] contains the list of indices of vertices v_j such that (v_i, v_j) ∈ E. Space is O(n+m) as opposed to O(n²) for the adjacency matrix. Useful for many efficient graph algorithms.
Graph representations (example)
[Graph: edges e1 = (v1,v2), e2 = (v2,v4), e3 = (v3,v1), e4 = (v4,v1), e5 = (v2,v3)]

      v1 v2 v3 v4               e1  e2  e3  e4  e5
A = [  0  1  0  0 ]      B = [   1   0  -1  -1   0 ]
    [  0  0  1  1 ]          [  -1   1   0   0   1 ]
    [  1  0  0  0 ]          [   0   0   1   0  -1 ]
    [  1  0  0  0 ]          [   0  -1   0   1   0 ]

Adjacency lists:
L(v1): v2
L(v2): v3, v4
L(v3): v1
L(v4): v1
Graph Search
• Two approaches:
(a) Breadth-first search - From start node s, get vertices immediately reachable from s and
fan-out from each of those nodes and so on
BFS(G):
Input : G=(V,E) as adjacency list
Output: List of nodes visited in BFS order
q ← empty queue; bfsList ← []
for i ← 0 to n-1
  visited[i] ← false
for i ← 0 to n-1
  if !visited[v_i]
    visited[v_i] ← true
    q.enqueue(v_i)
    while !q.empty()
      v ← q.dequeue()
      bfsList.append(v)
      for each w in G.adjList[v]
        if !visited[w]
          visited[w] ← true
          q.enqueue(w)
return bfsList
Graph Search (contd.)
• BFS takes O(n) additional space and O(n+m) time, counting primitive queue and list operations, assignments and boolean operations
• Depth-first search is a recursive traversal exploring a path before back-tracking by a step
and trying a different path from that vertex.
DFSMain(G):
Input : G=(V,E) as adjacency list
Output: List of nodes visited in DFS order
dfsList ← {}
for i ← 0 to n-1
    visited[i] ← false
for i ← 0 to n-1
    if !visited[v_i]
        DFS(G, v_i, dfsList)
return dfsList

DFS(G,v,dfsList):
visited[v] ← true
dfsList.append(v)
for each w in G.adjList[v]
    if !visited[w]
        DFS(G, w, dfsList)
• Takes O(L) additional stack space (L- longest path length from v) and O(n+m) time
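The same graph traversed by a runnable Python version of the DFS pseudocode above; recursion depth is O(L), so sys.setrecursionlimit may be needed for graphs with long paths.

def dfs_all(adj):
    n = len(adj)
    visited = [False] * n
    dfs_list = []

    def dfs(v):
        visited[v] = True
        dfs_list.append(v)
        for w in adj[v]:
            if not visited[w]:
                dfs(w)

    for v in range(n):                 # restart from every unvisited vertex
        if not visited[v]:
            dfs(v)
    return dfs_list

print(dfs_all([[1], [2, 3], [0], [0]]))   # -> [0, 1, 2, 3]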
Graph search example
[Figure: example graph on vertices v1..v6 used for the BFS and DFS traces below]
Graph search example - BFS
• Start vertex: v1; Q = {v1}; bfslist = []
• Remove v1 from Q, bfslist = [v1]; Q = {v2, v3}
• Remove v2, bfslist = [v1, v2]; Q = {v3, v4}
• Remove v3, bfslist = [v1, v2, v3]; Q = {v4, v5}
• Remove v4, bfslist = [v1, v2, v3, v4]; Q = {v5, v6}
• Remove v5, bfslist = [v1, v2, v3, v4, v5]; Q = {v6}
• Remove v6, bfslist = [v1, v2, v3, v4, v5, v6]; Q = {}
Graph search example - DFS
• Start vertex: v1; dfslist = {}
• DFS(v1) sets dfslist = {v1} and recursively explores neighbors before backtracking.
[Figure: DFS on a 7-vertex graph; the stack holds v1, v2, v3, v4, v5 after DFS(v1) is completed. DF and LOWLINK values are shown, LOWLINK in a square.]
• Since LOWLINK(v1) = 1 = DF(v1), vertices are ejected from the stack until v1, forming a strongly connected component.
We need to transform the string X = x1 x2 ....xm into the string Y = (y1 y2 ....yn ).
Let’s keep a pair of cursors one for X and another for Y which indicate the
transformation remaining to be done. For example when the cursor pair is
(i, j), it remains to transform xi ....xm into yj .....yn . This cursor pair defines a
state and we move from state to state as we make decisions to keep a character,
insert a character from Y, delete a character from X or replace a character in
X from Y in these cursor positions.
Let us define F (i, j) as the minimum number of operations required to transform x_i...x_m into y_j...y_n, i.e. it is the optimization function for the subproblem when we are in state (i, j). So what we require ultimately is F (1, 1). Let us
focus on the subproblem defined by the state (i, j).
(a) When xi = yj , then the optimum number of operations required to
transform xi ....xm into yj ....yn should be the same as the optimum number of
operations required to transform xi+1 ....xm into yj+1 ....yn ; in this case we move
to cursor pair state (i + 1, j + 1). So F (i, j) = F (i + 1, j + 1).
(b) When x_i ≠ y_j, we have 3 choices:
(i) insert y_j before x_i, in which case we move to state (i, j + 1), as what remains is to transform x_i...x_m into y_{j+1}...y_n; here the cost will be 1 unit for the insert + the minimum number of operations required to transform x_i...x_m into y_{j+1}...y_n, i.e. 1 + F (i, j + 1),
(ii) delete xi in which case we move to state (i + 1, j) as what remains is to
transform xi+1 ....xm into yj .....yn ; cost here is 1 + F (i + 1, j),
(iii) replace xi by yj in which case we move to state (i + 1, j + 1); here cost is
1 + F (i + 1, j + 1). Obviously we like to take the choice that gives the minimum.
This means that F (i, j) = min(1 + F (i + 1, j), 1 + F (i, j + 1), 1 + F (i + 1, j + 1)).
This is the recursive formulation of DP. But we do not compute it recursively but iteratively, in an order which guarantees that when we compute F (i, j), we have already computed F (i + 1, j), F (i, j + 1) and F (i + 1, j + 1). What order guarantees that?
We start from the following boundary conditions first :
(a) For 1 ≤ i ≤ m, F (i, n + 1), i.e. the problem of transforming x_i...x_m into the empty string, as the (n + 1)-th cursor position in Y moves past the end of the string. It is easy to see F (i, n + 1) = m − i + 1, as we just have to delete these m − i + 1 characters from X.
(b) Similarly, for 1 ≤ j ≤ n, F (m + 1, j), i.e. the problem of transforming the empty string into y_j...y_n, as the (m + 1)-th cursor position in X moves past the end of the string. It is easy to see F (m + 1, j) = n − j + 1, as we just have to insert these n − j + 1 characters of Y into X.
(c) F (m + 1, n + 1) = 0 as all transformation is done here.
Let’s have two (m+1)×(n+1) tables, one to store F (i, j), 1 ≤ i ≤ m+1, 1 ≤
j ≤ n + 1 and another to store A(i, j) for optimum action in that state which
has one of the values O, I, D, R corresponding to the four actions of keeping xi
when xi = yj , inserting yj , deleting xi and replacing xi with yj
We fill first the bottom row and right most column for boundary conditions,
then we fill the rows from bottom to top and from right to left in each row.
Final entry to be filled is F (1, 1). The algorithm steps are as follows:
for i ← 1 to m + 1 do
F [i, n + 1] ← m − i + 1
end
for j ← 1 to n + 1 do
F [m + 1, j] ← n − j + 1
end
for i ← m downto 1 do
for j ← n downto 1 do
if X.elementAtRank(i) = Y.elementAtRank(j) then
(F [i, j], A[i, j]) ← (F [i + 1, j + 1], 'O')
end
else
F [i, j] ← min(1 + F [i + 1, j], 1 + F [i, j + 1], 1 + F [i + 1, j + 1])
if F [i, j] = 1 + F [i + 1, j] then
A[i, j] ← 'D'
end
if F [i, j] = 1 + F [i, j + 1] then
A[i, j] ← 'I'
end
if F [i, j] = 1 + F [i + 1, j + 1] then
A[i, j] ← 'R'
end
end
end
end
// get optimum solution
print 'minimum edit distance = ', F [1, 1]
(i, j) ← (1, 1)
while i ≤ m and j ≤ n do
    if A[i, j] = 'O' then
        print 'Keep ', X.elementAtRank(i)
        (i, j) ← (i + 1, j + 1)
    end
    if A[i, j] = 'I' then
        print 'Insert ', Y.elementAtRank(j)
        (i, j) ← (i, j + 1)
    end
    if A[i, j] = 'D' then
        print 'Delete ', X.elementAtRank(i)
        (i, j) ← (i + 1, j)
    end
    if A[i, j] = 'R' then
        print 'Replace ', X.elementAtRank(i), ' by ', Y.elementAtRank(j)
        (i, j) ← (i + 1, j + 1)
    end
end
while i ≤ m do
    print 'Delete ', X.elementAtRank(i)
    i ← i + 1
end
while j ≤ n do
    print 'Insert ', Y.elementAtRank(j)
    j ← j + 1
end
Try to compute these tables for X=’AMAZING’ and Y=’HORIZONS’ and
get optimal solution.
Note optimal actions are indicated by O - keep symbol in X, I - insert symbol
from Y in X, D - delete symbol in X, R - replace symbol in X by symbol in Y.
Table after filling in boundary conditions:
        H     O     R     I     Z     O     N     S
 j→     1     2     3     4     5     6     7     8     9
i↓
A 1                                                    7 D
M 2                                                    6 D
A 3                                                    5 D
Z 4                                                    4 D
I 5                                                    3 D
N 6                                                    2 D
G 7                                                    1 D
  8    8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
Table after filling in the row i = 7 from right to left:
        H     O     R     I     Z     O     N     S
 j→     1     2     3     4     5     6     7     8     9
i↓
A 1                                                    7 D
M 2                                                    6 D
A 3                                                    5 D
Z 4                                                    4 D
I 5                                                    3 D
N 6                                                    2 D
G 7    8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
  8    8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
For example entry for the cell (7, 8), namely for the subproblem of transforming
’G’ to ’S’ is given by F (7, 8) = min(1 + F (8, 9), 1 + F (7, 9), 1 + F (8, 8)) =
1 + F (8, 9) = 1. The corresponding action here is replace ’G’ by ’S’ and we will
be done with transformation as we move to state (8, 9).
On the other hand for the cell (7, 7), for the subproblem of transforming ’G’ to
’NS’, F (7, 7) = min(1 + F (8, 8), 1 + F (7, 8), 1 + F (8, 7)) and both 1 + F (8, 8)
and 1 + F (7, 8) give the same min value of 2 and we arbitrarily pick 1 + F (7, 8)
with the corresponding action of inserting ’N’ before ’G’ and then move state
to (7, 8).
Table after filling in the row i = 6 from right to left:
        H     O     R     I     Z     O     N     S
 j→     1     2     3     4     5     6     7     8     9
i↓
A 1                                                    7 D
M 2                                                    6 D
A 3                                                    5 D
Z 4                                                    4 D
I 5                                                    3 D
N 6    7 I   6 I   5 I   4 I   3 I   2 I   1 O   2 R   2 D
G 7    8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
  8    8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
For example for the cell (6, 7), the subproblem is transforming ’NG’ to ’NS’ and
since the symbols are same at the cursor positions, F (6, 7) = F (7, 8) = 1 and
we move to state (7, 8).
Final table values are given below:
        H     O     R     I     Z     O     N     S
 j→     1     2     3     4     5     6     7     8     9
i↓
A 1    6 R   5 R   5 R   5 D   6 R   6 R   6 D   7 R   7 D
M 2    6 R   5 R   4 R   4 R   5 R   5 R   5 D   6 R   6 D
A 3    6 R   5 R   4 R   3 R   3 D   4 R   4 D   5 R   5 D
Z 4    6 R   5 R   4 R   3 I   2 O   3 R   3 D   4 R   4 D
I 5    6 I   5 I   4 I   3 O   3 R   2 R   2 D   3 R   3 D
N 6    7 I   6 I   5 I   4 I   3 I   2 I   1 O   2 R   2 D
G 7    8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
  8    8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
Optimal value is given by F (1, 1) = 6. We can get the optimal sequence of
actions by starting from (1, 1) and using optimal action in the cell to move to
next state. The optimal sequence of states : (1, 1) → (2, 2) → (3, 3) → (4, 4) →
(4, 5) → (5, 6) → (6, 7) → (7, 8) → (8, 9).
The corresponding sequence of actions: R (replace 'A' by 'H') → R (replace 'M' by 'O') → R (replace 'A' by 'R') → I (insert 'I') → O (keep 'Z') → R (replace 'I' by 'O') → O (keep 'N') → R (replace 'G' by 'S').
Numerical Problems
Ring Algebra
• (S, +, ◦, 0, 1) is a ring if
(a) (S, +, 0) is a monoid
(b) (S, ◦, 1) is a monoid
(c) + is commutative (a + b = b + a)
(d) ◦ distributes over + ( a ◦ (b+c) = (a ◦ b) + (a ◦ c))
(e) every element a in S has an additive inverse b, i.e. a + b = b + a = 0 (we call it −a)
• (M_n, +, *, 0_n, I_n) is a ring, where M_n is the set of all n x n matrices with elements from a ring, + and * are matrix addition and multiplication, 0_n is the zero matrix and I_n is the identity matrix
Fourier Transforms
• Fourier transform is used in signal processing to convert a
signal from time domain to frequency domain.
• Discrete Fourier Transform (DFT) is used in digital signal
processing to examine frequency spectrum of a sampled
signal (e.g. audio) over a period of time
• Let (S, +, ◦, 0, 1) be a commutative ring. An element w in S is called a principal n-th root of unity if
(a) w ≠ 1
(b) w^n = 1, and
(c) ∑_{j=0}^{n−1} w^{jp} = 0 for 1 ≤ p < n
[Figure: the n = 8 roots of unity arranged on a circle: ω^0 = 1, ω, ω^2, ω^3, ω^4 = −1, and ω^{4+i} = −ω^i]
In the complex number ring, ω = e^{2πi/n} = cos(2π/n) + i sin(2π/n) and ω^{−1} = cos(2π/n) − i sin(2π/n) (here n = 8).
In the modulo 17 ring {0,1,2,…,16}: ω = 2, ω^2 = 4, ω^3 = 8, ω^4 = 16 ≡ −1 (mod 17), ω^5 = (−1)·2 = −2 ≡ 15 (mod 17), ω^6 = −4, ω^7 = −8, ω^8 = −16 ≡ 1 (mod 17), and ω^{−1} = ω^7 = −8 ≡ 9 (mod 17)
DFT and polynomials
• Let p(x) be the polynomial a_0 + a_1 x + … + a_{n−1} x^{n−1}
• Then the i-th element of F(a) (the DFT) is ∑_{j=0}^{n−1} a_j ω^{ij} = p(ω^i), i.e. p evaluated at the root ω^i
Basis of FFT algorithm
• Suppose p(x) = a_0 + a_1 x + … + a_{n−1} x^{n−1} where n = 2k
• Then p(x) can be written as p_even(x^2) + x · p_odd(x^2) where
  p_even(y) = a_0 + a_2 y + a_4 y^2 + … + a_{2(k−1)} y^{k−1}
  p_odd(y) = a_1 + a_3 y + a_5 y^2 + … + a_{2(k−1)+1} y^{k−1}
• If b = FFT([a_0, …, a_{n−1}]), then for 0 ≤ i < k = n/2,
  b_i = p(ω^i) = p_even((ω^2)^i) + ω^i p_odd((ω^2)^i) = c_i + ω^i d_i
  where c = FFT([a_0, a_2, …, a_{2(k−1)}]) and d = FFT([a_1, a_3, …, a_{2(k−1)+1}]) are evaluated at the k = n/2-th root of unity ω^2
• Also note that b_{k+i} = c_i + ω^{k+i} d_i = c_i − ω^i d_i, as ω^k = ω^{n/2} = −1
Recursive FFT algorithm
FFT(a, ω):
Input : a = [a_0, a_1, …, a_{n−1}] from a commutative ring, where n = 2^k, k ≥ 0, and ω is an n-th root of unity in the ring
Output: b = F·a, the DFT of a
if n = 1
    return a
c ← FFT([a_0, a_2, …, a_{n−2}], ω^2)
d ← FFT([a_1, a_3, …, a_{n−1}], ω^2)
x ← 1        (x holds ω^i; it must start at ω^0 = 1)
for i ← 0 to n/2 − 1
    b_i ← c_i + x * d_i
    b_{n/2+i} ← c_i − x * d_i
    x ← x * ω
return b
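A runnable Python sketch of this recursive FFT, instantiated over the ring Z_17 with ω = 2 (the setting of the earlier worked example); the inverse transform reuses fft with ω^(-1) and scales by n^(-1), as described earlier. Python 3.8+ is assumed for pow(w, -1, p).

P = 17

def fft(a, w, p=P):
    n = len(a)                             # n must be a power of 2
    if n == 1:
        return a[:]
    c = fft(a[0::2], (w * w) % p, p)       # even-index coefficients
    d = fft(a[1::2], (w * w) % p, p)       # odd-index coefficients
    b = [0] * n
    x = 1                                  # x = w^i, starting at w^0
    for i in range(n // 2):
        b[i] = (c[i] + x * d[i]) % p
        b[i + n // 2] = (c[i] - x * d[i]) % p   # uses w^(n/2) = -1
        x = (x * w) % p
    return b

def inverse_fft(b, w, p=P):
    n = len(b)
    n_inv = pow(n, -1, p)
    return [(v * n_inv) % p for v in fft(b, pow(w, -1, p), p)]

a = [1, 2, 3, 4, 5, 6, 7, 8]               # arbitrary coefficient vector
assert inverse_fft(fft(a, 2), 2) == a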
More on FFT
• Time complexity (primitive ring operations):
T(n) ≤ 2 T(n/2) + k n, n ≥ 2, T(1) = d → T(n) is O(n log n)
• Another way to look at FFT is division by polynomials
• For a polynomial 𝑝(𝑥), value at 𝑥 = 𝑎
𝑝(𝑎) = remainder when 𝑝(𝑥) is divided by 𝑥 − 𝑎 (remainder
theorem)
• To compute the values of p(x) at ω^0, ω^1, …, ω^{n−1}, we need the remainders when p(x) is divided by x − ω^0, x − ω^1, …, x − ω^{n−1}
• To find the remainders of p(x) when we divide by q_1(x) and q_2(x), we can first find the remainder polynomial r(x) of p(x) when divided by q_1(x) · q_2(x), and then take the remainders of r(x) when divided by q_1(x) and q_2(x)
• If we pair x − ω^0, x − ω^1, …, x − ω^{n−1} in a particular order, successively multiply to get the final product polynomial, and then find remainders successively by the product polynomials, we can find the evaluations efficiently.
FFT Butterfly network
• Iterative bottom-up approach amenable to parallel processing
• For n = 2^k, there are k stages
• At each stage m, 0 ≤ m < k, we have 2^m smaller butterfly networks of size 2^{k−m}
• In a butterfly network at stage m, for each 0 ≤ i ≤ n − 1, the value at position i, namely v(i), is paired with the value at position j, v(j), where i and j differ in the (k − m)-th bit position
• Assuming i < j, the value for the next stage at i = v(i) + ω^{i_m} v(j), and the value for the next stage at j = v(i) + ω^{j_m} v(j), where i_m is the integer resulting from reversing the bits of integer i and shifting by (k − m − 1) bits to the left
Numerical Problems (contd.)
Wrapped convolution
• Convolution of two n-length vectors results in 2n-length
vector and hence FFT algorithm requires padding input
vectors with zeros for the remaining n elements
• Wrapped convolution results in n-length vector and does not
require padding with zeros.
• Positive wrapped convolution (useful in many integer
algorithms):
c_i = ∑_{j=0}^{i} a_j b_{i−j} + ∑_{j=i+1}^{n−1} a_j b_{n+i−j}, 0 ≤ i ≤ n−1
• Negative wrapped convolution:
d_i = ∑_{j=0}^{i} a_j b_{i−j} − ∑_{j=i+1}^{n−1} a_j b_{n+i−j}, 0 ≤ i ≤ n−1
Characteristic equation method for non-homogeneous recurrences
F(0) = k_2 = 1
F(1) = k_1 + k_2 = d + 1 ⇒ k_1 = d
Final solution: F(n) = d·n + 1
[Figure: decision tree comparing a_1, a_2, a_3 with T/F branches (a_2 ≤ a_3?, a_1 ≤ a_3?)]
[Figure: array 40 38 35 25 22 15 14 20 after heapify to create a max heap]
In the selection recursion the subproblem sizes shrink geometrically: m elements, then (3/4)m elements, then (3/4)^2 m elements, and so on. Setting (3/4)^k m = 1, we get k = log_{4/3} m.
Selection problem
• Finding k-th smallest element of a set S of n elements
• Using heap and extracting minimum k times gives time
complexity O(n + k log n). This finds all p-th smallest elements
where 1 ≤ 𝑝 ≤ 𝑘
• Can be done faster using divide-and-conquer approach.
• Similar to QuickSort, choose a pivot e and divide S into 3 sets
S1, S2 and S3.
(a) if k <= |S1|, recursively find k-th smallest element in S1
(b) if |S1| < k ≤ |S1| + |S2|, ‘e’ is k-th smallest element
(c) Otherwise recursively find k – (|S1|+|S2|)-th smallest
element in S3.
• If we choose the pivot to be the median of 5-element medians:
T(|S|) ≤ T(max(|S1|, |S3|)) + T(|S4|) + k|S|,
where S4 is the set of 5-element medians from S.
Selection example
• S has 12 elements; using pivot e: |S1| = 5, |S2| = 3, |S3| = 4
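A minimal Python sketch of this divide-and-conquer selection. For brevity the pivot here is simply the middle element rather than the median of 5-element medians, so the worst case degrades to O(n^2) while the expected time stays linear.

def quickselect(S, k):
    # Return the k-th smallest element of S (1-based k)
    e = S[len(S) // 2]                    # pivot (not median-of-medians)
    S1 = [x for x in S if x < e]
    S2 = [x for x in S if x == e]
    S3 = [x for x in S if x > e]
    if k <= len(S1):                      # case (a)
        return quickselect(S1, k)
    if k <= len(S1) + len(S2):            # case (b): pivot is the answer
        return e
    return quickselect(S3, k - len(S1) - len(S2))   # case (c)

print(quickselect([7, 1, 5, 3, 9, 2, 8], 4))   # -> 5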
KMPFailureFunction(P[1..m]):
f(1) ← 0
for j ← 2 to m
k ← f(j-1)
while P[j] ≠ P[k+1] and k > 0
k ← f(k)
if P[j] ≠ P[k+1] and k = 0
f(j) ← 0
else
f(j) ← k+1
return f
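The same computation as runnable Python with 0-based indexing; f[j] here corresponds to f(j+1) in the 1-based pseudocode above.

def kmp_failure(P):
    # f[j] = length of the longest proper prefix of P[0..j]
    #        that is also a suffix of P[0..j]
    m = len(P)
    f = [0] * m
    for j in range(1, m):
        k = f[j - 1]
        while k > 0 and P[j] != P[k]:
            k = f[k - 1]
        f[j] = k + 1 if P[j] == P[k] else 0
    return f

print(kmp_failure("abacab"))   # -> [0, 0, 1, 0, 1, 2]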
KMP Time complexity
• Excluding construction of failure function, in KMPMatch, O(1) time
is spent in each while loop iteration assuming character comparison
takes O(1) time.
• To determine number of iterations, consider i-j, (i-j ≤ n)
(i) when there is a match i-j remains same (i, j both increase by 1)
(ii) when there is no match and j = 0, i-j increases by 1 (only i
increases by 1)
(iii) when there is no match and j > 0, j is set to f(j-1) which is less than
j and hence i-j increases by at least 1
• Hence at end of each iteration either i increases (text position
advances) or i-j increases (pattern shift) by at least 1
→ # of iterations ≤ 2n → complexity is O(n) excl. failure function
construction
• Failure function takes O(m) time to compute→ total complexity is
O(n+m)
Tries
• Efficient data structure for processing a series of search
queries on the same set of text strings
• Given (Σ, S) where S ⊆ Σ*, a (compressed) trie T is an ordered tree where
(a) each node is labeled with a string over alphabet Σ
(b) children of a node are ordered according to some canonical (usually alphabetical) ordering of labels
(c) an external node is associated with a string of S formed by concatenation of all labels from root to that node; every string of S is associated with an external node
(d) every internal node must have at least 2 children
• In a standard trie, each label is just a single character of Σ and an internal node can have just one child (see the sketch below)
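A minimal Python sketch of a standard (uncompressed) trie; names are illustrative. A dict of children gives expected O(1) per character, versus the O(d) per-character bound quoted below for ordered children.

class TrieNode:
    def __init__(self):
        self.children = {}      # character -> TrieNode
        self.is_word = False    # marks an external (word-ending) node

def insert(root, word):
    node = root
    for ch in word:             # trace the prefix, adding nodes as needed
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def search(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            return False
        node = node.children[ch]
    return node.is_word

root = TrieNode()
for w in ["bear", "bell", "bid", "bull"]:
    insert(root, w)
print(search(root, "bell"), search(root, "be"))   # -> True False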
Trie example
Compressed Trie example
Compressed Trie (with positions) example
Complexity of tries
• Given (S, S) where |S| = n and |S|= d
• Number of nodes in trie is O(n),
• Height is length of longest string in S denoted by m
• Every internal node has at least 2 but at most d children
• Space complexity is O(n m)
• Time to search for a string of size k is O(dk)
find path in T by matching substrings of search string
• Time complexity to construct T
(a) inserting a string at a time --- find path in T by tracing prefix
of string and when we stop at an internal node (i.e. cannot
match any children), insert a new node there for suffix
Time to insert a string 𝑥 is O(d |𝑥|)→ complexity is O(dn)
where n = ∑' ( ) |𝑥| where | 𝑥 | is length of 𝑥.
Suffix Tries
• Also called a “position tree”; it has external nodes representing all possible suffixes of a given string s = s_0 s_1 … s_{m−1}
• Each node is labeled with an interval (i, j) which represents the substring s_i … s_j for 0 ≤ i < j < m
• Useful for many efficient string operations
(a) substring search O(m)
(b) longest common substring
(c) longest repeated substring
(d) useful in Bioinformatics to search for patterns in DNA or protein
sequences
• Space complexity is O(n)
• Time to construct a suffix trie – O(dn)
• Time to search for a substring of size m in the Suffix Trie – O(dm)
• Generalized version for a set of words
Suffix Trie example
ADT Set
• Stores distinct elements
• Operations supported :
makeSet(e) – make a set with a single element e
union(S1,S2) – returns S1 U S2
intersect(S1,S2) – returns S1 ∩ S2
subtract(S1,S2) – returns the set of elements in S1 but not in S2
Initially:
  element: S1 S2 S3 S4 S5 S6
  set:     S1 S2 S3 S4 S5 S6
S1 ← Union(S1,S3):
  set:     S1 S2 S1 S4 S5 S6
S2 ← Union(S2,S5):
  set:     S1 S2 S1 S4 S2 S6
S1 ← Union(S1,S2):
  set:     S1 S1 S1 S4 S2 S6
[Figure: linked-tree representation of the same operations (element values 10, 15, 5, 2, 11 in the trees); for S1 ← Union(S1,S2) and S4 ← Union(S1,S4), make the root of the smaller set a child of the root of the larger set]
Amortized cost analysis for union-find
• union(S1,S2) – make smaller sequence elements point to the
set node of larger sequence elements
• Complexity of sequence of n operations consisting of union
and find starting with singleton sets of n elements.
• Amortization : Accounting method
(i) For find operation charge unit cost to operation itself
(ii) For union operation, charge unit cost to each of the
elements whose links have changed (no cost to operation itself).
• Total amortization cost = # of cost units assigned to elements
+ # of cost units assigned to Find operation >= total actual cost
• # of cost units assigned to elements ≤ # of times an element can change sets
• Each time an element moves, its new set is at least double the size of its old one, so an element can change sets at most log n times → time complexity is O(n log n)
Amortized analysis for a sequence of n
(weighted) union-find operations
• Accounts : Find_1, Find_2, Find_3….
Element_1, Element_2,………..Element_n
Amortized cost using the accounting method is given as follows:
(a) Union → if Element_k (root of the smaller set) is made to point to Element_p, assign 1 unit cost to the Element_k account
(b) Find_k → assign unit cost to the Find_k account
• Note that the actual cost of a Find(e) is the path length from e to the root = the number of unions performed before this find that caused the element to change its set membership.
• The actual cost of a union operation is unit cost.
Total amortized cost of unions and finds ≥ actual cost of unions and finds.
• For an element e in S1, a find operation follows the path from e to the root of S1; let this path length be p. After S1 ∪ S2, find(e) follows the path from e to the root of S1 ∪ S2; call this path length p1. When can p1 = p + 1? Only when |S1| ≤ |S2| before the merge. After the merge, the combined set has |S1| + |S2| ≥ 2|S1| elements, so e has moved to a set at least double the size of the one it belonged to before.
Max # of times the path length can increase for e: 1 → 2 → 4 → 8 → … → n, i.e. log n times.
• Each element account is charged at most log n units.
Actual cost of unions and finds ≤ total amortized cost of unions and finds ≤ n + n log n
Efficient Union-Find DS
• Use a tree for a set where root node identifies the set and each
child node has a parent link. Root node’s parent link points to itself.
• Union – make the root of the tree for one set a child node of root of
another. – Takes O(1) time
• To make find op more efficient, make root of smaller height (rank)
set child of root of bigger height (rank) set – keep track of # of
nodes in set
Let S(h) = minimum size of a set whose tree has height h
S(h) ≥ 2 S(h−1), h ≥ 1, S(0) = 1 → S(h) ≥ 2^h → h ≤ log n, where n is the number of elements in the partition sets
Complexity of Find is O(log n)
• Total complexity of a sequence of n union-find operations is O(n
log n)
• Can we do better ?
Efficient Union-Find DS
• Due to Robert E. Tarjan
• Use Union by rank – Similar to union by weight by making root
of set of smaller rank a child of root of set of larger rank.
MakeSet(x):
x.parent ← x; x.rank ← 0
Union(x,y):
if x.rank > y.rank
y.parent ← x
else
x.parent ← y
if x.rank = y.rank
y.rank ← y.rank+1
Efficient Union-Find DS (contd.)
• Use path-compression – After a find(e), make all
nodes in path from e to root children of the root.
Should take same order of time as without
compression.
Find(e):
if e ≠ e.parent
e.parent ← Find(e.parent)
return e.parent
• Recursion down finds path from element to root and
unwinding recursion sets parent of all elements in
the path to the root.
• Run-time : shown to be almost linear in the worst-
case.
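Putting union by rank and path compression together, a runnable Python version of the three routines above (elements are integers 0..n-1):

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # MakeSet: each element is its own root
        self.rank = [0] * n

    def find(self, e):
        if self.parent[e] != e:
            self.parent[e] = self.find(self.parent[e])   # path compression
        return self.parent[e]

    def union(self, x, y):
        x, y = self.find(x), self.find(y)   # union works on the roots
        if x == y:
            return
        if self.rank[x] > self.rank[y]:
            self.parent[y] = x
        else:
            self.parent[x] = y
            if self.rank[x] == self.rank[y]:
                self.rank[y] += 1

uf = UnionFind(6)
uf.union(0, 2); uf.union(1, 4); uf.union(0, 1)
print(uf.find(2) == uf.find(4))   # -> True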
Efficient Union-Find (path-compression)
[Figure: a tree rooted at 9 before and after path compression; after a Find, every node on the find path becomes a direct child of the root 9]
[Figure: a cut (V_1, V − V_1) of the graph with cut set E_1; e_1 is the edge of the spanning tree T crossing the cut, and e_2 is a minimum-weight edge of the cut]
T_1 = (T − {e_1}) ∪ {e_2}
C(T_1) = C(T) − w(e_1) + w(e_2) ≤ C(T)
Kruskal’s MST algorithm
• Idea:
-- At each step we find a minimum cost edge that
connects a vertex in one spanning tree to another in the
forest.
-- The cut property above is what makes this greedy choice safe
--- Initially each vertex by itself is a spanning tree in the
forest
--- We choose in non-decreasing order of edge weights
(a) if the edge connects vertices in same tree, ignore it
(b) else choose the edge and reduce number of spanning
trees by 1
Kruskal’s MST algorithm example
[Figure: weighted graph on v1..v6 with w(v1,v2) = 2, w(v2,v5) = 7, w(v1,v5) = 12, w(v5,v6) = 9, w(v1,v4) = 8, w(v1,v3) = 5, w(v2,v3) = 6, w(v3,v4) = 4]
The chosen MST consists of the edges (v1,v2), (v3,v4), (v1,v3), (v2,v5) and (v5,v6), with cost = 27.
Kruskal’s algorithm time complexity
• Use a min-heap priority queue for the edges, keyed on w(e)
• At each step, O(log m) time to find the minimum cost edge, where m is the number of edges; total number of steps ≤ m
• Use a disjoint-set union-find DS for the spanning forest
• n makeSet() operations, where n is the number of vertices
• Use find to detect whether the vertices of an edge are in the same spanning tree
• Use union to merge spanning trees
• Time complexity of the union-finds – O(m log* m)
• Total time complexity is O(n + m log m) (see the sketch below)
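A Python sketch of Kruskal's algorithm: sorting the edge list plays the role of extracting from the min-heap, and a small union-find (with path halving) maintains the spanning forest. The edge list is the one reconstructed from the example above, with v1..v6 stored as 0..5.

def kruskal(n, edges):
    # edges: list of (weight, u, v); returns (cost, list of MST edges)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    cost, mst = 0, []
    for w, u, v in sorted(edges):           # non-decreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                        # skip edges inside one tree
            parent[ru] = rv
            cost += w
            mst.append((u, v, w))
    return cost, mst

edges = [(2, 0, 1), (7, 1, 4), (12, 0, 4), (9, 4, 5),
         (6, 1, 2), (5, 0, 2), (8, 0, 3), (4, 2, 3)]
print(kruskal(6, edges)[0])   # -> 27, as in the example above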
Prim-Jarnik’s MST algorithm
• Idea:
-- Start from a vertex and grow a spanning tree
-- Maintain D[v] , min cost of an edge from v to
a vertex in existing spanning tree; Initially D[v]
set to +∞
--- At each step choose the vertex u not yet in the tree with minimum D[u] to be added to the spanning tree
--- The cut property again makes this greedy choice safe
--- Then update D[w] for each vertex w not in the spanning tree that is adjacent to u (see the sketch below)
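A lazy-heap Python sketch of Prim-Jarnik (stale heap entries are simply skipped when popped); it is run on the graph reconstructed from the example that follows, starting from v5 as that example does.

import heapq

def prim(adj, start=0):
    # adj[u] = list of (w, v) pairs; returns total MST cost
    n = len(adj)
    in_tree = [False] * n
    D = [float('inf')] * n
    D[start] = 0
    heap = [(0, start)]
    cost = 0
    while heap:
        d, u = heapq.heappop(heap)
        if in_tree[u]:
            continue                       # stale entry
        in_tree[u] = True
        cost += d
        for w, v in adj[u]:                # update D for vertices outside tree
            if not in_tree[v] and w < D[v]:
                D[v] = w
                heapq.heappush(heap, (w, v))
    return cost

edges = [(2, 0, 1), (7, 1, 4), (12, 0, 4), (9, 4, 5),
         (6, 1, 2), (5, 0, 2), (8, 0, 3), (4, 2, 3)]
adj = [[] for _ in range(6)]
for w, u, v in edges:
    adj[u].append((w, v))
    adj[v].append((w, u))
print(prim(adj, 4))   # start from v5 -> 27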
Prim-Jarnik’s Algorithm example
[Figure: the same weighted graph on v1..v6 as in Kruskal’s example; the MST grown from v5 uses edges v5–v2 (7), v2–v1 (2), v1–v3 (5), v3–v4 (4) and v5–v6 (9), cost = 27]

D[v]                        v1     v2     v3     v4     v5     v6
Add v5; update v1,v2,v6:    12     7      ∞      ∞      0(*)   9
Add v2; update v1,v3:       2      7(*)   6      ∞      0(*)   9
Add v1; update v3,v4:       2(*)   7(*)   5      8      0(*)   9
Add v3; update v4:          2(*)   7(*)   5(*)   4      0(*)   9
Add v4; no updates:         2(*)   7(*)   5(*)   4(*)   0(*)   9
Finally add v6; total cost = 27
Dijkstra’s algorithm example
[Figure: weighted digraph on v1..v6 with source v1; recoverable edge weights include w(v1,v2) = 5, w(v1,v3) = 2, w(v3,v2) = 2, w(v2,v4) = 3, w(v3,v5) = 6, w(v4,v6) = 4]
Relaxation: D[v] ← min(D[v], D[u] + w((u,v)))

D[v]                      v1     v2     v3     v4     v5     v6
Add v1; update v2,v3:     0(*)   5      2      ∞      ∞      ∞
Add v3; update v2,v5:     0(*)   4      2(*)   ∞      8      ∞
Add v2; update v4:        0(*)   4(*)   2(*)   7      8      ∞
Add v4; update v6:        0(*)   4(*)   2(*)   7(*)   8      11
Add v5, then add v6
Shortest path alg. time complexity
• Min-heap priority queue for the vertices not yet added (V2), keyed on D[v]
• O(log n) to get the minimum D[v]
• Number of extract-min ops ≤ n, the number of nodes
• For a vertex v added to V1, we need to update D[w] only for each edge (v,w); each heap update takes O(log n)
• Total updates ≤ m, the number of edges
• Total time complexity – O(n log n + m log n)
• Since for a connected graph m ≥ n-1, time complexity
for a connected graph is O(m log n)
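A lazy-heap Python sketch consistent with this analysis (each edge pushes at most one heap entry, so O(m log n)); the example adjacency list is the digraph reconstructed from the worked example above, with v1..v6 stored as 0..5.

import heapq

def dijkstra(adj, s):
    # adj[u] = list of (v, w) with w >= 0; returns distances from s
    n = len(adj)
    D = [float('inf')] * n
    D[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > D[u]:
            continue                       # stale entry
        for v, w in adj[u]:
            if d + w < D[v]:               # relaxation
                D[v] = d + w
                heapq.heappush(heap, (D[v], v))
    return D

adj = [[(1, 5), (2, 2)], [(3, 3)], [(1, 2), (4, 6)], [(5, 4)], [], []]
print(dijkstra(adj, 0))   # -> [0, 4, 2, 7, 8, 11]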
Bellman Ford algorithm
• Finds single-source shortest paths even when
there are negative edges
• Can identify negative weight cycles
• Works by iterating at most n-1 times
• During each iteration look at all edges (u,v) and update D[v] to a better value as in Dijkstra's algorithm if possible, i.e. D[v] = min(D[v], D[u] + w((u,v))). This is called "relaxation". (See the sketch below.)
• Total time-complexity is O(nm)
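A minimal Python sketch of this scheme: up to n−1 relaxation rounds over the full edge list (with an early exit once a round changes nothing), plus one extra pass to flag a negative-weight cycle.

def bellman_ford(n, edges, s):
    # edges: list of (u, v, w); returns (D, has_negative_cycle)
    INF = float('inf')
    D = [INF] * n
    D[s] = 0
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if D[u] + w < D[v]:            # relaxation
                D[v] = D[u] + w
                changed = True
        if not changed:                    # already stable
            break
    has_cycle = any(D[u] + w < D[v] for u, v, w in edges if D[u] < INF)
    return D, has_cycle

print(bellman_ford(3, [(0, 1, 4), (0, 2, 5), (1, 2, -2)], 0))
# -> ([0, 4, 2], False)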
All pair path problems
Closed semi-rings
• (S, ◦, 1) is a monoid if
(a) S is closed under ◦ (a ◦ b ∈ S, ∀ a, b ∈ S)
(b) ◦ is associative (a ◦ (b ◦ c) = (a ◦ b) ◦ c)
(c) 1 is identity for ◦ (a ◦ 1 = 1 ◦ a = a)
• (S, +, ◦, 0, 1) is a closed semi-ring if
(a) (S, +, 0) is a monoid
(b) (S, ◦, 1) is a monoid and 0 is annihilator for ◦ (i.e. a ◦ 0 = 0 ◦ a = 0, ∀ a)
(c) + is commutative (a + b = b + a) and idempotent (a + a = a)
(d) ◦ distributes over +, i.e. a ◦ (b+c) = (a ◦ b) + (a ◦ c)
(e) associativity, distributivity, commutativity and idempotency extend to finite and infinite sums
∑_i a_i ◦ ∑_j b_j = ∑_{i,j} a_i ◦ b_j = ∑_i (∑_j a_i ◦ b_j)
Note an infinite sum ∑_i a_i exists and is unique (idempotency)
Examples of closed semi-rings
• Boolean algebra: ({0,1}, ∨, ∧, 0, 1) is a closed semi-ring
∧ distributes over ∨, ∨ is idempotent and 0 is annihilator for ∧
• (R≥0 ∪ {+∞}, MIN, +, +∞, 0) is a closed semi-ring (here + is arithmetic addition)
-- (R≥0 ∪ {+∞}, MIN, +∞) is a monoid
-- (R≥0 ∪ {+∞}, +, 0) is a monoid and +∞ is annihilator for + (a + ∞ = +∞)
-- + distributes over MIN: a + MIN(b,c) = MIN(a+b, a+c)
-- The infinite sum MIN(a1, MIN(a2, …)) = MIN(a1, a2, a3, …) exists and is unique
• (F_Σ, ∪, ◦, ∅, {ε}) is a closed semi-ring where
F_Σ is the family of sets of finite length strings from alphabet Σ, including the empty string ε (countably infinite sets)
∪ – set union, associative, identity the empty set ∅
◦ – concatenation of sets: S1 ◦ S2 = { xy | x ∈ S1 and y ∈ S2 }; ∅ is annihilator for ◦
◦ distributes over ∪: S1 ◦ (S2 ∪ S3) = (S1 ◦ S2) ∪ (S1 ◦ S3)
All pairs path problem
Given : A directed graph G = (V,E) with possible self-cycles and a label function l : E → S where (S, +, ◦, 0, 1) is a closed semi-ring
• For a directed path p = (e_1, e_2, …, e_k), the path product l(p) = l(e_1) ◦ l(e_2) ◦ … ◦ l(e_k)
• The sum of two path products p_1 and p_2 = l(p_1) + l(p_2)
• S(u,v) is the sum of the products of all paths from u to v in the graph
Required: Find S(u,v) for all pairs of vertices in G.
Closure in closed semi-rings
• Define a closure operation * for an element a of a closed semi-ring (S, +, ◦, 0, 1) as follows:
a* = 1 + a + a ◦ a + a ◦ a ◦ a + … = 1 + a + a^2 + a^3 + …
By infinite idempotency of +, this infinite sum exists and is unique.
• For ({0,1}, ∨, ∧, 0, 1):
a* = 1 ∨ a ∨ a^2 ∨ … = 1 for a = 0 or 1
• For (R≥0 ∪ {+∞}, MIN, +, +∞, 0):
a* = MIN(0, a, 2a, 3a, …) = 0 for a ∈ R≥0 ∪ {+∞}
• For (F_Σ, ∪, ◦, ∅, {ε}):
S* = {ε} ∪ S ∪ (S ◦ S) ∪ (S ◦ S ◦ S) ∪ … = ⋃_{i≥0} { x_1 x_2 … x_i | x_j ∈ S, 1 ≤ j ≤ i }
Closed semi-ring matrices
• Define M_n to be the set of n x n matrices whose elements are from a closed semi-ring (S, +, ◦, 0, 1)
• (M_n, +_n, ∗_n, 0_n, I_n) is a closed semi-ring
+_n – addition of n x n matrices (+ is the closed semi-ring idempotent operation)
∗_n – multiplication of n x n matrices (using the +, ◦ closed semi-ring operations); distributes over +_n
0_n – n x n matrix of all 0s (identity for +_n); 0_n is annihilator for ∗_n
I_n – n x n identity matrix with 0 (identity for +) off the diagonal and 1 (identity for ◦) on the diagonal
Infinite sums of matrices exist and are unique
Digraph Matrix Closure
• For a digraph with n vertices, define an n x n matrix L where L(i,j) = l((v_i, v_j))
• The L matrix is an element of the closed semi-ring (M_n, +_n, ∗_n, 0_n, I_n)
• Define L^k = L ∗_n L ∗_n … ∗_n L, multiplied k times; L^0 = I_n
• What does L^2(i,j) indicate?
L^2(i,j) = L(i,1) ◦ L(1,j) + L(i,2) ◦ L(2,j) + … + L(i,n) ◦ L(n,j)
= the sum of the products of paths of length 2 from v_i to v_j
• The L^k matrix gives, for all vertex pairs, the sum of the k-length path products between those vertices
• The closure matrix L* = ∑_{k≥0} L^k exists and is unique, where the sum is +_n and L^0 = I_n
• It gives exactly what we need in the all-pairs path problem.
DP algorithm for all-pair paths
• Due to Floyd-Warshall
• Let D_w(u,v) = sum of the path products from u to v that go through w
= S_1(u,w) ◦ S_2(w,w) ◦ S_3(w,v) where
S_1(u,w) = sum of paths from u to w that do not go through w (as an intermediate vertex)
S_2(w,w) = sum of paths from w to w
S_3(w,v) = sum of paths from w to v that do not go through w
(this uses distributivity of ◦ over + for finite and infinite sums)
• Define D_k(i,j) as the sum of paths from v_i to v_j that do not go through vertices other than v_1, v_2, …, v_k (as intermediate vertices)
Floyd-Warshall Algorithm
AllPair(G, l):
Input : Digraph G = (V,E) with vertices numbered 1,2,…,n arbitrarily and a labeling function l : E → S where (S, +, ◦, 0, 1) is a closed semi-ring
Output : the L* matrix
for i ← 1 to n
    for j ← 1 to n
        l ← (v_i, v_j) ∈ E ? l((v_i, v_j)) : 0
        D_0(i,j) ← (i = j) ? 1 + l : l                                        --- (1)
for k ← 1 to n
    for i ← 1 to n
        for j ← 1 to n
            D_k(i,j) ← D_{k−1}(i,j) + D_{k−1}(i,k) ◦ (D_{k−1}(k,k))* ◦ D_{k−1}(k,j)   --- (2)
return D_n
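As a concrete instance, a Python sketch of the algorithm in the closed semi-ring (R≥0 ∪ {+∞}, MIN, +, +∞, 0): the semi-ring + is MIN, ◦ is arithmetic addition, and the closure (D_{k−1}(k,k))* = 0 for non-negative weights, so step (2) reduces to the familiar shortest-path update.

def floyd_warshall(W):
    # W: n x n weight matrix, W[i][j] = +inf if there is no edge (v_i, v_j)
    n = len(W)
    D = [row[:] for row in W]
    for i in range(n):
        D[i][i] = min(D[i][i], 0)          # step (1): 1 + l on the diagonal
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # step (2) with (D[k][k])* = 0 elided:
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D

INF = float('inf')
W = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
print(floyd_warshall(W))   # -> [[0, 3, 4], [3, 0, 1], [2, 5, 0]]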
Floyd-Warshall alg. time complexity