Algorithm Design Strategies

Divide-and-Conquer Strategy
• Three steps:
1. Divide a problem into sub problems of smaller sizes
2. Recursively solve smaller problems
3. Merge the solutions of sub problems to arrive at a
solution to original problem.
• Tower of Hanoi problem – Given 3 pegs move a stack
of different sized disks from one peg to another
making sure that a smaller disk is always on top of a
larger disk.
• Integer multiplication
• Other examples : MergeSort , QuickSort, FFT
Tower of Hanoi
move(n,i,j):
Input : n disks and integers i and j s.t. 1 ≤ i, j ≤ 3
Output: Disk moves to migrate all n disks from i to j
if n = 1
move a disk from i to j
else
otherPeg <- 6-(i+j)
move(n-1,i, otherPeg)
move(1,i,j)
move(n-1,otherPeg,j)
• Time complexity: O(T(n)) – T(n) is number of disk moves
T(n) = 2 T(n-1) + 1, n > 1 and T(1) = 1 => T(n) = 2ⁿ - 1
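A direct transcription of the pseudocode above into Python (a sketch; the peg numbering 1..3 and the 6-(i+j) trick are taken from the slide):

def move(n, i, j):
    # migrate n disks from peg i to peg j (pegs are numbered 1..3)
    if n == 1:
        print("move a disk from", i, "to", j)
    else:
        other = 6 - (i + j)          # the peg that is neither i nor j
        move(n - 1, i, other)
        move(1, i, j)
        move(n - 1, other, j)

move(3, 1, 3)                        # prints 2^3 - 1 = 7 disk moves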
Divide-and-Conquer
• In most problems, we divide a problem into sub
problems which are a fraction of size of the original
problem.
• General time-complexity recurrence for divide-and-
conquer:
T(n) = a T(n/b) + f(n), n ≥ d
= c for n < d
• Second term includes dividing and merging steps;
first term includes number of basic problems to solve
• Based on f(n), general asymptotic solutions for T(n)
can be derived in some cases.
Divide-and-conquer recurrence (Master theorem)
• At recursion level 1, there are a sub problems of size n/b to
solve and f(n) additional time for divide/merge overhead
• At recursion level 2, there are a^2 sub problems of size n/b^2 to
solve and a*f(n/b) additional time for divide/merge overhead.
• At the i-th level there are a^i sub problems of size n/b^i to solve and
a^(i-1) * f(n/b^(i-1)) additional time for divide/merge overhead.
• Setting i = log_b(n/d), we see that ultimately we need to spend
time, say T1(n) = c * a^(log_b(n/d)) = c * n^(log_b a) / d^(log_b a), in solving the
base problems of size d (each of which takes "c" time units), and the divide/merge
steps take a total of T2(n) = Σ_{i=0}^{log_b(n/d) - 1} a^i * f(n/b^i) time units
Divide-and-conquer recurrence (Master theorem)
• T1(n) = c * n^(log_b a) / d^(log_b a),   T2(n) = Σ_{i=0}^{log_b(n/d) - 1} a^i * f(n/b^i)
Case 1 : If f(n) is O(n^(log_b a - ε)) for some small constant ε > 0, then T1(n)
dominates T2(n) asymptotically and T(n) is θ(n^(log_b a))
Case 2 : If f(n) is θ(n^(log_b a) log^k n) for k ≥ 0, then T2(n) dominates
T1(n) asymptotically and T(n) is θ(n^(log_b a) log^(k+1) n)
Case 3 : If f(n) is Ω(n^(log_b a + ε)) for some small constant ε > 0 and
a*f(n/b) ≤ δ f(n) for n ≥ d where δ < 1, then T2(n) dominates
T1(n) asymptotically and T(n) is θ(f(n))
Master theorem examples
• T(n) = 4 T(n/2) + cn for c > 0
a=4, b=2 and f(n) = cn. Case 1 applies here as f(n) is O(n^(log_2 4 - ε)) for
ε = 1. Hence T(n) is θ(n^(log_2 4)), which is θ(n^2)
• T(n) = 7 T(n/2) + c n^2 for c > 0
a=7, b=2 and f(n) = c n^2. Case 1 applies here as f(n) is O(n^(log_2 7 - ε)) for
ε = 0.1. Hence T(n) is θ(n^(log_2 7))
• T(n) = 2 T(n/2) + c n for c > 0
a=2, b=2 and f(n) = cn. Case 2 applies as f(n) is θ(n^(log_2 2) log^k n) where
k = 0, and also f(n) is not O(n^(log_2 2 - ε)) for any ε > 0, so case 1 does not
apply. Hence T(n) is θ(n log^(k+1) n), which is θ(n log n)
• T(n) = 2 T(n/2) + c n log n for c > 0
a=2, b=2 and f(n) = cn log n. Case 2 applies as f(n) is θ(n^(log_2 2) log^k n)
where k = 1. Hence T(n) is θ(n log^(k+1) n), which is θ(n log^2 n)
Master theorem examples
• T(n) = T(n/2) + c for c > 0
a=1, b=2 and f(n) = c. f(n) is not O(n^(log_2 1 - ε)) for any ε > 0, so
case 1 does not apply. Also f(n) is θ(n^(log_2 1) log^k n) where k=0.
Hence case 2 applies, so T(n) is θ(n^(log_2 1) log^(k+1) n), which is θ(log n)
• T(n) = 2 T(n/2) + c n^1.5 for c > 0
a=2, b=2 and f(n) = c n^1.5. f(n) is not O(n^(log_2 2 - ε)) for any ε > 0, so
case 1 does not apply.
f(n) is not θ(n^(log_2 2) log^k n) for any k ≥ 0 (remember
log^k n is o(n^r) for any r > 0). So case 2 does not apply.
But f(n) is Ω(n^(log_2 2 + ε)) for ε = 0.5 and
a * f(n/b) = 2c*(n/2)^1.5 = δ c n^1.5 where δ = 2/2^1.5 = 1/√2 < 1. Hence case 3
applies. Hence T(n) is θ(f(n)), which is θ(n^1.5)
Integer multiplication
• We are interested in finding efficient algorithm for multiplying two
n-bit integers where the time complexity is measured in number of
bit operations (e.g. single bit operations such as addition with carry,
shift etc.)
• Useful for software based integer multiplications as in Java
BigInteger data types.
• Simple algorithm uses same approach as hand multiplication of 2
numbers (add and shift). This takes (n-1) shifts of at most 2n bits
and addition of n integers each of which is at most 2n bits. This
approach takes θ(n^2) time.
• We can do better than that using Divide-and-conquer.
• Assume n is a power of 2 for simplicity. Let X and Y be two n-bit
integers. For n large enough, we divide X into two parts a and b
where b has n/2 least significant bits of X and a has n/2 most
significant bits of X. Same way we divide Y into two parts c and d.
• In practice, if n is not even, we divide them into approximately
equal halves.
Integer multiplication (contd.)
• Easy to see that X = 2^(n/2) * a + b and Y = 2^(n/2) * c + d
• X * Y = 2^n * (ac) + 2^(n/2) * (ad + bc) + (bd)
• Note ac, ad, bc and bd are products of n/2-bit integers.
Multiplying an integer by 2^i is the same as shifting "i" bits to the left.
• If we use divide-and-conquer straight away, we need to solve 4 sub
problems of size n/2 recursively and in addition need to
perform O(n) bit operations for shifting and adding
=> T(n) = 4 T(n/2) + kn => T(n) is θ(n^2), no improvement!!
• Instead we try to reduce multiplications at the expense of
some additions/subtractions.
• (a-b)(d-c) = ad – (bd+ac) + bc => (ad+bc) = (a-b)(d-c)+(ac+bd)
• X*Y = 2^n * (ac) + 2^(n/2) * [(a-b)(d-c)+(ac+bd)] + (bd)
Integer multiplication ( contd.)
• This requires :
(a) 2 subtractions (a-b,d-c) with n/2-bit integers
(b) 3 multiplications [ac,bd,(a-b)(d-c) ]of n/2-bit integers
(c) 4 additions of at most n-bit integers
(d) 1 n-bit shift and 1 n/2-bit shift of at most n-bit integers

(a), (c) and (d) take O(n) bit operations; (b) requires 3 recursive
calls on n/2-bit integers.
T(n) = 3 T(n/2) + k n, for n > d
     = c for n = d
Applying Case 1 of the Master theorem, we get T(n) is θ(n^(log_2 3)),
which is better than θ(n^2)
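A small Python sketch of this 3-multiplication (Karatsuba-style) scheme, using the identity X*Y = 2^n(ac) + 2^(n/2)[(a-b)(d-c)+(ac+bd)] + bd from the slide; the base case cutoff of 8 bits is an arbitrary illustrative choice:

def karatsuba(x, y, n):
    # multiply two n-bit non-negative integers (n a power of 2)
    if n <= 8:                                   # small enough: use builtin multiply
        return x * y
    half = n // 2
    a, b = x >> half, x & ((1 << half) - 1)      # high / low halves of x
    c, d = y >> half, y & ((1 << half) - 1)      # high / low halves of y
    ac = karatsuba(a, c, half)
    bd = karatsuba(b, d, half)
    mid = karatsuba(abs(a - b), abs(d - c), half)
    if (a - b) * (d - c) < 0:                    # restore the sign of (a-b)(d-c)
        mid = -mid
    return (ac << n) + ((mid + ac + bd) << half) + bd

print(karatsuba(12345, 67890, 32))               # 838102050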
Smooth functions and time complexity
extensibility (optional)
• In many divide-and-conquer algorithms , we derive T(n)
assuming n to be a power of some integer b > 1.
• Can we extend this result for any n asymptotically ?
• We can do this as long as T(n) involves “smooth” functions.
• Definition 1 : A function f(n) is eventually non-decreasing if
there exists n_0 ≥ 0 such that for all n_2 ≥ n_1 ≥ n_0, f(n_2) ≥ f(n_1)
• Definition 2 : A function f(n) is smooth iff f(n) is "eventually"
non-decreasing and f(2n) is O(f(n))
• Can show that for a smooth function f, f(bn) is O(f(n)) for any
fixed positive integer b.
• Examples of smooth functions : n, log n, n^2.
• Is 2^n a smooth function ? No
Smooth functions and time complexity
extensibility (optional)
• Extensibility Theorem:
Suppose T(n) is O(f(n)) when n is a power of b for some
constant integer b > 1 and T(n) is asymptotically non-
decreasing (usually the case with time-complexities)
Then we can say T(n) is O(f(n)) for any n provided f(n) is
a smooth function.
Dynamic Programming Paradigm
• Discovered by Richard Bellman for solving various
optimal decision problems.
• Applicable for problems with objective functions that
satisfy “optimal substructure property” (or principle
of optimality). It allows an objective function to be
broken down into a series of recursive functions each
with smaller number of decision variables.
• At each stage, no single decision is committed to; instead we
compute the best solution for each of all possible states at that
stage. (Decision graph)
• We then work backward from final stage as only one
possible state at that stage.
Dynamic Programming
• Efficiency of DP is due to avoiding repeated
computation for a state at a decision stage. How we
arrive at that state is not important.
• Recursive formulation but no repeated computation!!
• Related to “memoization” which is caching of a result
during recursive computation.
e.g. Pascal triangle
• Table underlying DP computations.
• Example problems: matrix chain product, multiple
joins of relations in RDB, Optimizing string edits,
Longest Common Subsequence(LCS), DNA sequence
alignment,
Hidden Markov Models (speech recognition)
Minimum edit distance
Problem:
Given two strings X = x_1 x_2 …..x_m and Y = y_1 y_2 …. y_n
compute minimum cost to transform X to Y using following
operations:
(a) Insert a new character into X – c_ins(.) units cost
(b) Delete a character from X – c_del(.) units cost
(c) Replace a character in X – c_rep(.) units cost

Example: X = mitten Y = smiley


One transformation : mitten -> smitten -> smiten -> smilen ->
smiley
Another : mitten -> sitten -> smitten -> smilten -> smiltey ->
smiley
Applications of Min Edit Distance
• Text editor
• Computational Biology – Align two DNA
sequences of bases to assess their similarity
measures ( align each letter to a letter or gap)

• Natural Language Processing and speech


recognition
Minimum Edit Distance
• Use DP approach
• Define F(i,j) – minimum cost of transforming x_i…x_m to
y_j…y_n, for 1 ≤ i ≤ m+1, 1 ≤ j ≤ n+1.
• We define F(m+1, n+1) = 0
F(i, n+1) = c_del(x_i) + c_del(x_{i+1}) + … + c_del(x_m), for 1 ≤ i ≤ m
F(m+1, j) = c_ins(y_j) + c_ins(y_{j+1}) + … + c_ins(y_n), for 1 ≤ j ≤ n
• By principle of optimality, for 1 ≤ i ≤ m, 1 ≤ j ≤ n
F(i,j) = F(i+1,j+1) if x_i = y_j
= min (c_del(x_i) + F(i+1,j), c_ins(y_j) + F(i, j+1),
c_rep(x_i,y_j) + F(i+1, j+1))
• Time Complexity is O(mn) and space complexity O(mn) if we
keep track of specific edit operations.
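A sketch of the recurrence in Python, with the slide's c_ins/c_del/c_rep specialized to unit costs for illustration:

def min_edit_distance(X, Y):
    m, n = len(X), len(Y)
    # F[i][j] = min cost of transforming X[i:] into Y[j:], a 0-indexed
    # version of the slide's F(i,j); F[m][n] = 0 is the boundary condition
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        F[i][n] = 1 + F[i + 1][n]                  # delete the rest of X
    for j in range(n - 1, -1, -1):
        F[m][j] = 1 + F[m][j + 1]                  # insert the rest of Y
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if X[i] == Y[j]:
                F[i][j] = F[i + 1][j + 1]
            else:
                F[i][j] = min(1 + F[i + 1][j],      # delete x_i
                              1 + F[i][j + 1],      # insert y_j
                              1 + F[i + 1][j + 1])  # replace x_i by y_j
    return F[0][0]

print(min_edit_distance("mitten", "smiley"))        # 4 with unit costs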
Algorithm Design Strategies
(contd.)
Matrix Chain Product
• Multiplying a p x q matrix with a q x r matrix takes p x q x r
scalar multiplications (standard matrix multiplication). The
result is a p x r matrix
• Matrix multiplication is associative => A x (B x C) = (A x B) x C
• Suppose A is a d_0 x d_1 matrix, B is a d_1 x d_2 matrix and C is a
d_2 x d_3 matrix.
• A x (B x C) takes d_1 d_2 d_3 + d_0 d_1 d_3 multiplications
• (A x B) x C takes d_0 d_1 d_2 + d_0 d_2 d_3 multiplications
• Problem:
Given a sequence of n matrices A_0, A_1, …, A_{n-1} where A_i is a
d_i x d_{i+1} matrix, find the order of multiplication of the matrices so as
to minimize the number of multiplications. The order can be specified by
parenthesizing the multiplication expression.
Matrix Chain Product
• Brute-force approach: Consider all possible orders of
multiplication
• Let N(k) – number of orderings of k matrices
• N(1) = N(2) = 1
• N(k) = Σ_{l=1}^{k-1} N(l) N(k-l), k > 2
• N(k) is called a Catalan number –
1, 1, 2, 5, 14, 42, 132, 429, 1430, …
• Grows exponentially fast and the brute-force time complexity is
Ω( 4^n / n^(3/2) )
• DP approach takes only O(n^3) time.
DP formulation
• Consider the subsequence A_i, A_{i+1}, …, A_j of matrices
where i ≤ j. In any optimal solution that requires (A_i
x A_{i+1} x … x A_j), the optimal way to do this should occur
(principle of optimality)
• Let N_{i,j} be the minimum number of multiplications
required for multiplying these matrices
• Let N_{i,i} = 0
• N_{i,j} = min { N_{i,k} + N_{k+1,j} + d_i d_{k+1} d_{j+1} }, where the min is
over all i ≤ k < j
• Though recursive, we store the N_{i,j}'s in a table to avoid
re-computing them.
• Computing order: increasing order of (j-i) from 0, …, n-1
(sketched in code below)
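A Python sketch of this formulation; dims is the list d_0..d_n, and the table N is filled in increasing order of j-i exactly as described (K records the split points k_{i,j}):

def matrix_chain(dims):
    # dims[i], dims[i+1] are the dimensions of matrix A_i, i = 0..n-1
    n = len(dims) - 1
    N = [[0] * n for _ in range(n)]       # N[i][j] = min multiplications for A_i..A_j
    K = [[i] * n for i in range(n)]       # K[i][j] = best split point k
    for diff in range(1, n):              # increasing order of j - i
        for i in range(n - diff):
            j = i + diff
            N[i][j] = float("inf")
            for k in range(i, j):
                cost = N[i][k] + N[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                if cost < N[i][j]:
                    N[i][j], K[i][j] = cost, k
    return N, K

N, K = matrix_chain([3, 5, 6, 2, 4])      # the A_0..A_3 example worked out below
print(N[0][3], K[0][3])                   # 114 2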
Matrix Chain Product DP complexity
• j-i = 0 : N_{i,i} = 0, i = 0, 1, 2, …, (n-1) – n computations
j-i = 1 : N_{i,i+1} = N_{i,i} + N_{i+1,i+1} + d_i d_{i+1} d_{i+2} – (n-1)
computations, each with 2 integer multiplications, 2 integer
additions and 0 min op
j-i = 2 : N_{i,i+2} = min { N_{i,i} + N_{i+1,i+2} + d_i d_{i+1} d_{i+3} ,
N_{i,i+1} + N_{i+2,i+2} + d_i d_{i+2} d_{i+3} } – (n-2) computations, each with 2
integer multiplications and 2 integer additions per candidate and 1 min op that
requires 1 comparison
j-i = k : N_{i,i+k} – (n-k) computations, each with 2k integer
multiplications and 2k integer additions and a min op that
requires (k-1) comparisons.
• Time complexity :
T(n) ≤ n + Σ_{k=1}^{n-1} 5k(n-k), which is O(n^3) integer operations.
Matrix chain product DP example
• A_0 3x5 , A_1 5x6 , A_2 6x2 , A_3 2x4
N_{i,j} entries, 0 ≤ i ≤ j ≤ 3. Also store k_{i,j} as the
value of i ≤ k < j that gives the minimum value.
N_{0,0} = N_{1,1} = N_{2,2} = N_{3,3} = 0, k_{i,i} = i, 0 ≤ i ≤ 3
N_{0,1} = N_{0,0} + N_{1,1} + d_0 d_1 d_2 = 0 + 0 + 3x5x6 = 90; k_{0,1} = 0
N_{1,2} = N_{1,1} + N_{2,2} + d_1 d_2 d_3 = 0 + 0 + 5x6x2 = 60; k_{1,2} = 1
N_{2,3} = N_{2,2} + N_{3,3} + d_2 d_3 d_4 = 0 + 0 + 6x2x4 = 48; k_{2,3} = 2
N_{0,2} = min( N_{0,0} + N_{1,2} + d_0 d_1 d_3 , N_{0,1} + N_{2,2} + d_0 d_2 d_3 )
  = min( 0 + 60 + 3x5x2 , 90 + 0 + 3x6x2 ) = 90; k_{0,2} = 0
N_{1,3} = min( N_{1,1} + N_{2,3} + d_1 d_2 d_4 , N_{1,2} + N_{3,3} + d_1 d_3 d_4 )
  = min( 0 + 48 + 5x6x4 , 60 + 0 + 5x2x4 ) = 100; k_{1,3} = 2
N_{0,3} = min( N_{0,0} + N_{1,3} + d_0 d_1 d_4 , N_{0,1} + N_{2,3} + d_0 d_2 d_4 ,
  N_{0,2} + N_{3,3} + d_0 d_3 d_4 )
  = min( 0 + 100 + 3x5x4 , 90 + 48 + 3x6x4 , 90 + 0 + 3x2x4 ) = 114
k_{0,3} = 2
Matrix chain product DP example (contd.)
Final answer : N_{0,3} = 114
Optimal order computed as : k_{0,3} = 2, i.e. (A_0 x A_1 x A_2) x A_3 → look at entry
(0,2) to compute (A_0 x A_1 x A_2)
k_{0,2} = 0, i.e. (A_0 x A_1 x A_2) computed as A_0 x (A_1 x A_2) → look at entry (1,2)
to compute (A_1 x A_2)
k_{1,2} = 1, i.e. A_1 x A_2 computed directly. Final order : (A_0 x (A_1 x A_2)) x A_3

      j=0              j=1              j=2              j=3
i=0   0 (k_{0,0}=0)    90 (k_{0,1}=0)   90 (k_{0,2}=0)   114 (k_{0,3}=2)
i=1   x                0 (k_{1,1}=1)    60 (k_{1,2}=1)   100 (k_{1,3}=2)
i=2   x                x                0 (k_{2,2}=2)    48 (k_{2,3}=2)
i=3   x                x                x                0 (k_{3,3}=3)
Greedy Algorithms
• An optimization problem involves finding a solution that
minimizes or maximizes an objective function of decision
variables with or without constraints
• A global optimal choice strategy finds best solution
among all possible solutions to the problem.
• A local partial solution strategy finds best partial solution
among a limited set of solutions.
• A problem satisfies greedy-choice property if a global
optimal solution can be reached by a sequence of local
partial solution choices starting from a well-defined
state.
• Otherwise this strategy provides only a heuristic and
optimal solution is not guaranteed.
• Well-defined starting state may require some pre-
processing.
Optimal substructure
• Recall a problem exhibits “optimal substructure”
property if optimal solution contains within itself
optimal solutions to sub problems.
• This is the basis of DP wherein we split the
optimization function as a sequence of recursive
functions ; they depend only on the “state” we arrive
at by choice of values for a subset (typically of size
1) of decision variables.
• Also called “principle of optimality” by Richard
Bellman.
• Optimal substructure is a necessary condition for
greedy algorithms.
Coin change problem
Given :
Unlimited supply of n coin types with values {c_1, c_2, …, c_n}
Required:
Minimum number of coins to make change for an
amount V. Assume c_n = 1
• Satisfies “optimal substructure” property – if in an
optimal solution we make change for value “v” out of
“V”, then it contains optimal solution to sub problem
of making change for V-v regardless of how we made
change for “v”
• Can use a DP formulation for it
Coin change – DP formulation
• Define F(i, v) – min number of coins needed to make
change for v from the set {c_i, c_{i+1}, …, c_n}, 1 ≤ i ≤ n, 0 ≤ v ≤ V
N(i, v) – number of coins of type c_i used in this
solution
• Boundary conditions : F(n+1, v) = N(n+1, v) = ∞ if
v > 0, else F(n+1, v) = N(n+1, v) = 0
• F(i, v) = min_{0 ≤ j ≤ ⌊v/c_i⌋} [ j + F(i+1, v − j*c_i) ], 0 ≤ v ≤ V,
1 ≤ i ≤ n
N(i, v) = the value of j achieving the min (argmin)
• Need F(1, V) and N(1, V)
• Time complexity of DP algorithm : O(nV)
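A memoized Python sketch of F(i, v); coins are 0-indexed here, so F(i, v) below corresponds to the slide's F(i+1, v):

from functools import lru_cache

def min_coins(coins, V):
    # coins are the denominations c_1..c_n (last one assumed to be 1, as on the slide)
    n = len(coins)

    @lru_cache(maxsize=None)
    def F(i, v):                               # min coins for value v using coins[i:]
        if i == n:
            return 0 if v == 0 else float("inf")
        best = float("inf")
        for j in range(v // coins[i] + 1):     # use j coins of denomination coins[i]
            best = min(best, j + F(i + 1, v - j * coins[i]))
        return best

    return F(0, V)

print(min_coins((25, 10, 1), 30))              # 3 (three dimes); greedy would use 6 coins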
Coin change - greedy solution
• Assume c_1 ≥ c_2 ≥ … ≥ c_n
• At each stage i, only one choice is considered for making change
for v
• F(i, v) = ⌊v/c_i⌋ + F(i+1, v − ⌊v/c_i⌋ * c_i) and N(i, v) = ⌊v/c_i⌋
• Iterative greedy algorithm:
v ← V
F ← 0
for i ← 1 to n
    N(i) ← ⌊v/c_i⌋
    F ← F + N(i)
    v ← v mod c_i
• Time complexity : O(n)
Coin change – greedy choice
• Does it have “greedy choice” property ?
• Not always. Consider only quarters, dimes and pennies (i.e.
c_1 = 25, c_2 = 10 and c_3 = 1) and V = 30
- Greedy choice will give N(1) = 1, N(2) = 0, N(3) = 5, F = 6
-- Is it optimal ?
-- No. N(1) = 0, N(2) = 3, N(3) = 0, F = 3
• But for quarters, dimes, nickels and pennies (i.e. c_1 = 25,
c_2 = 10, c_3 = 5 and c_4 = 1), it satisfies the "greedy choice property"
--- Ignore pennies, remaining value divisible by 5
--- Any state F(2, 𝑣) with 𝑣 ≥ 25 not part of optimal (as you
can reduce number of coins either by replacing 2 dimes+1 nickel
or 3 dimes by a quarter and nickel)
--- Any state F(3, 𝑣) with 𝑣 ≥ 10 not part of optimal as you can
reduce number of coins by replacing 2 nickels by a dime
Algorithm Design Strategies
(contd.)
Fractional knapsack problem
• Problem : Max Σ_{i=1}^{n} b_i t_i
s.t. Σ_{i=1}^{n} w_i t_i ≤ W, 0 ≤ t_i ≤ 1, ∀ i
• Select items (may be partially) with weights w_i's and values
b_i's so as to fill a knapsack of capacity W.
• In the 0-1 knapsack problem, an item cannot partially fill a
knapsack. It is a harder problem to solve.
• Define relative value v_i = b_i / w_i, ∀ i. An item with a larger
value of b_i and smaller value of w_i is relatively more valuable.
Let w_i t_i = x_i, ∀ i.
• Problem : Max Σ_{i=1}^{n} v_i x_i
s.t. Σ_{i=1}^{n} x_i ≤ W, 0 ≤ x_i ≤ w_i, ∀ i
Fractional knapsack problem
• Satisfies “optimal substructure” property. Why?
• If in an optimal solution to this problem, after selecting a
few items capacity of 𝑣 remains then optimal solution to a
knapsack problem with capacity 𝑣 must be part of the
complete optimal solution.

• Greedy approach:
Let items be ordered such that v_1 ≥ v_2 ≥ · · · ≥ v_n and they
are filled in that order; for last item if weight exceeds
remaining capacity, use partial amount. This is a greedy
approach.
• At most one item will be partially filled in this approach.
• Does this algorithm satisfy greedy choice property ?
Example of fractional knapsack
• b_1 = 7, b_2 = 5, b_3 = 4, b_4 = 3
• w_1 = 4, w_2 = 3, w_3 = 2, w_4 = 1, W = 6
Strategy 1: Fill by least weight to highest weight
Choose w_4, w_3, w_2 – total value = 3+4+5 = 12
Strategy 2: Fill by largest value to smallest value
Choose w_1, then 2/3 of item 2 – total value = 7 + 5*2/3 = 10.33
(Optimal) Strategy 3 : Fill by largest relative value to
smallest
v_1 = 7/4 = 1.75, v_2 = 5/3 = 1.67, v_3 = 4/2 = 2, v_4 = 3/1 = 3
Choose w_4, w_3, then 3/4 of w_1
Total value = 3 + 4 + (3/4)*7 = 12.25
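A Python sketch of the greedy rule (sort by relative value b_i/w_i, fill greedily, take a fraction of the last item); on the example above it returns 12.25:

def fractional_knapsack(values, weights, W):
    # sort items by relative value v_i = b_i / w_i, largest first
    items = sorted(zip(values, weights), key=lambda bw: bw[0] / bw[1], reverse=True)
    total, remaining = 0.0, W
    for b, w in items:
        if remaining <= 0:
            break
        take = min(w, remaining)          # whole item, or the fraction that still fits
        total += b * (take / w)
        remaining -= take
    return total

print(fractional_knapsack([7, 5, 4, 3], [4, 3, 2, 1], 6))    # 12.25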
Fractional knapsack greedy alg.
• Optimality of greedy choice:
We prove by contradiction.
Let there be an optimal solution such that for two items with
v_i > v_j, we do not fully fill item i but use some amount of
item j, thereby not using the greedy choice,
i.e. x_i < w_i and x_j > 0.
We can then replace an amount of item j as much as possible by an equal
amount of item i. This amount = min( x_j , w_i − x_i ).
Additional value obtained
= (v_i − v_j) min( x_j , w_i − x_i ) > 0, violating optimality of the solution.
• Time-complexity – O(n log n)
• Fast considering there are 2ⁿ possible subsets of n items.
Meeting scheduling problem
Problem: Given a set S of n meetings, each with start and finish
times s_i and f_i (0 < s_i < f_i) respectively for 1 ≤ i ≤ n, find a
mapping φ : S → {1,2,…,M} (M conf. rooms) such that
(a) For two meetings m_i and m_j s.t. φ(m_i) = φ(m_j) (assigned to
the same room), either f_i ≤ s_j or f_j ≤ s_i (they do not conflict in time)
(b) M should be as small as possible (minimum number of rooms).

• Greedy approach:
(a) Sort meetings according to start times 𝑠! ’s. Start with a single
room.
(b) For each meeting in sequence
(i) check if it does not conflict with any of the meetings scheduled
so far, schedule it at earliest opportunity in a room.
(ii) Else schedule it in a new meeting room.
Meeting scheduling problem
• Proof of correctness:
We prove by contradiction.
Suppose the optimal solution requires m ≤ k − 1 meeting rooms while the
greedy algorithm requires k rooms.
Let m_i be the first meeting scheduled in the last room k by the greedy approach.
⇒ m_i conflicts with at least one meeting scheduled in each of the rooms 1..k-1
⇒ all these meetings have start times not later than s_i but have finish times
later than s_i. These meetings conflict with each other.
⇒ At least k meetings conflict with each other, a contradiction: these k mutually
conflicting meetings cannot be scheduled in fewer than k rooms, so no schedule
with < k rooms exists
• Time complexity:
O(n log n) time for pre-processing.
In each scheduling step, how do we check efficiently if the new meeting does
not conflict with previous meetings ?
Keep track of earliest finish time among the tasks that start latest in each
room and the rooms associated with the tasks. If the next task’s start time is
later than this finish time, schedule it in that room. Else it cannot be
scheduled in any of the rooms used so far and has to be scheduled in a new
room. By keeping the earliest finish times of latest tasks in each room in a
heap, we can do this in O(log n) time in each step.
Example of meeting scheduling
• m_1 : 10(s_1)-14(f_1), m_2 : 9-11, m_3 : 8-10, m_4 : 7-12, m_5 :
10-15, m_6 : 13-15
• Meetings sorted by start times: m_4, m_3, m_2, m_1, m_5, m_6
• Let f_k be the earliest finish time among the latest tasks scheduled in the
rooms and r_k the corresponding room after step k (i.e. k
meetings have been scheduled).
R_1 : m_4 (7-12)
R_2 : m_3 (8-10), m_1 (10-14)
R_3 : m_2 (9-11), m_6 (13-15)
R_4 : m_5 (10-15)
f_1 = 12 , f_2 = 10, f_3 = 10, f_4 = 11, f_5 = 11, f_6 = 12
r_1 = 1 , r_2 = 2, r_3 = 2, r_4 = 3, r_5 = 3, r_6 = 1
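A Python sketch of the greedy room assignment using a heap keyed by the finish time of the latest meeting in each room, as described above; on this example it uses 4 rooms:

import heapq

def schedule_meetings(meetings):
    # meetings: list of (start, finish); returns number of rooms and the assignment
    order = sorted(range(len(meetings)), key=lambda i: meetings[i][0])  # sort by start
    heap = []                      # (finish time of latest meeting in room, room id)
    rooms_used = 0
    assignment = {}
    for i in order:
        s, f = meetings[i]
        if heap and heap[0][0] <= s:          # earliest-finishing room is free
            _, room = heapq.heappop(heap)
        else:                                 # conflicts with every room: open a new one
            rooms_used += 1
            room = rooms_used
        assignment[i] = room
        heapq.heappush(heap, (f, room))
    return rooms_used, assignment

mtgs = [(10, 14), (9, 11), (8, 10), (7, 12), (10, 15), (13, 15)]   # m_1..m_6 above
print(schedule_meetings(mtgs)[0])             # 4 rooms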
AVL tree insertion examples

Dr. Ravi Varadarajan


O(log n) AVL tree insertion
• May cause balance factor at an ancestor node of inserted
node to change to -2 or +2.
• Fixing it requires only one “rotation” operation which takes
O(1) time as it requires only pointer changes
• 4 cases:
(a) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to +1
Left rotate B to move its right subtree up one level
(b) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to -1 as its left child C’s BF changes from 0 to
+1 or -1
Double rotate C (right followed by left) to make A and B its
children
(c) & (d) are “mirror” cases of (a) and (b)
Case a: Insertion into T3 changes BF of B from 0 to +1 and of A from +1 to +2
[Diagram: single left rotation of B. Before: A (BF +2) with left subtree T1 and right
child B (BF +1), whose subtrees are T2 and T3. After: B (BF 0) is the subtree root with
left child A (BF 0, subtrees T1 and T2) and right subtree T3.]

Case b: Insertion into T2 or T3 changes BF of B from 0 to -1 and of A from +1 to +2
[Diagram: double rotation of C (right, then left). Before: A (BF +2) with left subtree T1
and right child B (BF -1), whose left child is C (BF -1 or +1, subtrees T2 and T3) and
whose right subtree is T4. After: C (BF 0) is the subtree root with children A (subtrees
T1, T2) and B (subtrees T3, T4).]

Case c: Insertion into T1 changes BF of B from 0 to -1 and of A from -1 to -2
[Diagram: single right rotation of B. Before: A (BF -2) with right subtree T3 and left
child B (BF -1), whose subtrees are T1 and T2. After: B (BF 0) is the subtree root with
left subtree T1 and right child A (BF 0, subtrees T2 and T3).]

Case d: Insertion into T2 or T3 changes BF of B from 0 to +1 and of A from -1 to -2
[Diagram: double rotation of C (left, then right). Before: A (BF -2) with right subtree T4
and left child B (BF +1), whose right child is C (BF -1 or +1, subtrees T2 and T3) and
whose left subtree is T1. After: C (BF 0) is the subtree root with children B (subtrees
T1, T2) and A (subtrees T3, T4).]


AVL Insertion examples
• Balance factor at a node = Height of right subtree –
height of left subtree rooted at that node
• Start with empty tree (single external node)
• Insert 60: 60 becomes the root, BF = 0
• Insert 40: 40 becomes the left child of 60; BF(60) = -1, BF(40) = 0
AVL Insertion examples (contd.)
• Insert 30: BF(40) becomes -1 and BF(60) becomes -2 (A = 60, B = 40), so case c
applies. A single right rotation of B = 40 gives 40 as the root with children 30
and 60, all with BF = 0.
AVL Insertion examples (contd.)
• After inserting 75, 37, 20, 35, 38: when 32 is inserted (as the left child of 35),
BF(40) becomes -2 and BF(30) becomes +2, with A = 30, B = 37 (BF -1) and C = 35.
Case b applies: double rotation of C = 35 (right, then left).
AVL Insertion examples (contd.)
• Resulting tree: root 40 (BF -1); left child 35 (BF 0) with children 30 (BF 0,
children 20 and 32) and 37 (BF +1, right child 38); right child 60 (BF +1, right
child 75).
AVL Insertion examples (contd.)
• Insert 39 (as the right child of 38): BF(38) becomes +1 and BF(37) becomes +2
(A = 37, B = 38). Case a applies: single left rotation of B = 38.
AVL insertion examples (contd.)
• Resulting tree: root 40 (BF -1); left child 35 (BF 0) with children 30 (BF 0,
children 20 and 32) and 38 (BF 0, children 37 and 39); right child 60 (BF +1,
right child 75).


AVL insertion examples (contd.)
• Insert 36 (as the left child of 37): BF(40) becomes -2 (A = 40), with B = 35
(BF +1) and C = 38 (BF -1). Case d applies: double rotation of C = 38 (left,
then right).
AVL insertion examples (contd.)
• Resulting tree: root 38 (BF 0); left child 35 (BF 0) with children 30 (BF 0,
children 20 and 32) and 37 (BF -1, left child 36); right child 40 (BF +1) with
children 39 (BF 0) and 60 (BF +1, right child 75).
Search Trees
Ordered Dictionary ADT
• Keys are assumed to be from a totally ordered set
• Additional operations:
closestKeyBefore(k) – return largest key smaller than k
closestKeyAfter(k) – return smallest key greater than k
• Implementation choices:
(a) Sorted array of keys
- insert and remove operations take O(n) time in
worst case
-- findElement(k) – use binary search that takes O(log
n) time
-- closestKeyBefore, closestKeyAfter take O(log n)
time.
Binary search tree
• Binary tree implementation with keys stored only in internal
nodes.
• Have ordering property:
For any subtree rooted at internal node v, all keys in the left
subtree of v are smaller than the key at v and all
keys in the right subtree of v are larger than the key at v
• Recursive structure
• Binary search tree time complexities:
-- findElement, insertElement, removeElement – O(h)
-- findClosestBefore, findClosestAfter – O(h) time
Where h is height of the tree which can be “n” in the worst case
n being number of keys in the tree.
• Balanced binary search trees keep height as O(log n) in the
worst-case while ensuring the same order of complexity for
the different operations.
BST example
Binary Search Tree
insertElement(v,k,e):
Input : v root node of binary search tree, k key and e element
Output: node where key is inserted or exists
if isExternal(v)
make v internal with key k and element e with 2 external children
nodes
return v
if k = key(v)
elem(v) ← e; return v
if k < key(v)
childNode ← leftChild(v)
else
childNode ← rightChild(v)
return insertElement(childNode,k,e)
insertElement(T,k,e):
return insertElement(T.root,k,e)
BST key insertion example
Binary Search Tree (contd.)
removeElement(v,k,e):
Input : v root node of binary search tree, k key and e element
Output: element with key k if it exists or error
if isExternal(v)
return “NO_SUCH_KEY” error
if k < key(v)
return removeElement(leftChild(v),k,e)
else if k > key(v)
return removeElement(rightChild(v),k,e)
else
elem ← elem(v)
if v has both external child nodes
make v external node removing key and elem
return elem
else if isExternal(leftChild(v))
key(v) ← key(rightChild(v))
elem(v) ← elem(rightChild(v))
leftChild(v) ← leftChild(rightChild(v))
rightChild(v) ← rightChild(rightChild(v))
return elem
else if isExternal(rightChild(v))
// do similar to previous case with rightChild(v) replaced by leftChild(v) in rhs of
assignment statements
Binary Search Tree (contd.)
else
successor ← inOrderSuccessor(T,k)
key(v) ← key(successor)
elem(v) ← elem(successor)
removeElement(successor, key(successor),
elem(successor))

Time Complexity:
O(h) to find node to remove, O(h) to find in order successor,
O(1) to remove successor node
Total time complexity : O(h)
Total additional space complexity : O(h)
Dictionary operations for BST - delete
(a) If the node to be removed is a leaf, we can just remove
it from the tree
[Diagram: removing leaf 27 from the tree 30(25(20, 27)) gives 30(25(20))]
(b) If a node has only one child, then make the child
the child of its parent
[Diagram: removing 25 (only child 20) from the tree 30(25(20)) gives 30(20)]
Dictionary operations for BST - delete
(c) When the node to be removed has 2 children, replace it
with its in-order successor, i.e. the minimum key in its right subtree
(the leftmost node there)
[Diagram: removing key 30, which has two children, replaces it with its in-order
successor 32, the leftmost node in its right subtree]
Build a BST
• To build a binary search tree with n keys
• If we randomly choose a sequence 𝑎! , 𝑎" , … 𝑎# from the
given keys and build using insertElement() n times,
worst-case complexity is O(𝑛" ) as tree may be skewed
and height can be O(j) for a j-element tree
• But average time complexity can be shown to be O(n log
n)
• Also using DP we can build a optimal binary search tree
for a set of n keys with 𝑝$ probability of executing a find()
with key equal to 𝑎$ and 𝑞$ probability of executing a
find() with 𝑎$ < key < 𝑎$%! and 𝑞& probability of
executing a find() with key < 𝑎!
Balanced Binary Search Trees
• Satisfies some height balancing property at
every node of the tree
• Recursive structure
• Height balancing property typically guarantees
logarithmic bounds on tree height
• Insertions and removals restructure trees to
guarantee this property with minimum
overhead
• They involve rotation operations.
AVL tree
• Due to inventors Adelson-Velskii and Landis
• Balance factor at a node = Height of right subtree – height of left
subtree rooted at that node
• Binary search tree with height-balancing property : balance factor
at each node is 0,-1 or 1
• Keys stored only in internal nodes
• Note # of external nodes = # internal nodes + 1
• An AVL tree of height “h” has at least n(h) nodes, n(0) = 0, n(1) = 1
and n(h) = 1 + n(h-1)+n(h-2), h ≥ 2. Recognize n(h) ?
Like Fibonacci sequence
• It has at most m(h) nodes, m(h) = 1 + 2*m(h-1), h ≥ 1 and m(0) = 0
• Can show n(h) > 2^(h/2 - 1) → h < 2 log n + 2 → h is O(log n)
• Also we see that m(h) = 2^h - 1 → n < 2^h → h > log n → h is Ω(log n)
• FindElement() takes O(log n) time.
O(log n) AVL tree insertion
• May cause balance factor at an ancestor node of inserted
node to change to -2 or +2.
• Fixing it requires only one “rotation” operation which takes
O(1) time as it requires only pointer changes
• 4 cases:
(a) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to +1
Left rotate B to move its right subtree up one level
(b) Node A’s BF changes from +1 to +2, its right child node B’s BF
changes from 0 to -1 as its left child C’s BF changes from 0 to
+1 or -1
Double rotate C (right followed by left) to make A and B its
children
(c) & (d) are “mirror” cases of (a) and (b)
[Diagrams: the four rotation cases (a)–(d), identical to the ones illustrated in the
AVL tree insertion examples section above.]
Search Trees (contd.)
O(log n) AVL tree deletion
• First do deletion as in ordinary binary search
tree (discussed before)
• This may cause BF of an ancestor node to
change to +2 or -2.
• Use “rotation” operations as in insertion to
balance the tree
• Can be shown to be O(log n) in the worst-case.
Multi-way search trees
• Stores more than one key in an internal node; if a
node has m keys k_1, k_2, …, k_m then it has m+1
children
• Any key k in the subtree of the i-th child (1 ≤ i ≤ m+1)
satisfies k_{i-1} < k < k_i, assuming k_0 = -∞ and k_{m+1} = ∞
• Examples : 2-4 tree (m=3), B-tree (m depends on disk
block size)
• Maintains height balance property by having every
external node at the same height.
• Height of tree is Θ(log n) where n is number of keys.
2-4 tree
• A search tree with each node having either 2,3 or 4
children.
• Every path from root to external node has same
length.
• A tree of height h has at most 4^h external
nodes and at least 2^h external nodes.
• Hence 2^h ≤ n+1 ≤ 4^h where n is the number of keys
• → log(n+1)/2 ≤ h ≤ log(n+1) → h is Θ(log n)
• findElement() takes Θ(log n) time
Example of 2-4 tree
Insertion in 2-4 tree
• Use FindElement() to reach bottom internal node.
• Case 1 : No overflow in internal node to be inserted
(i.e. # of keys < 3 before insertion)
• Case 2 : Overflow in node which is not root. Add
key, split the node (making extra child of parent ) and
push the middle key of node up to be inserted into
parent. (recursive case)
• Case 3 : Overflow in a node which is the root. Create a new
root node and make the node and its split node its
children. Increases the height of the tree.
• Time complexity : Down and up phases take Θ(log n)
time (comparisons + pointer changes)
Deletion from 2-4 tree
• If it is not in a bottom internal node, replace key by the smallest
item in the subtree whose keys > key.
• Problem reduces to removing key from a bottom internal node
with only external children.
• Case 1: No underflow - # of keys in node > 1 before deletion.
• Case 2 : Underflow in node whose parent is not root. (recursive
case)
2a : Immediate sibling has at least 2 keys from which we can
transfer a key to this node or thru’ a transfer of a key from parent.
2b : All siblings have just one key. Merge with a sibling node
by having a new node and moving a key from parent to this new
node.
• Case 3 : Underflow in node whose root is parent. New
replacement root node causes height to decrease.
• Time complexity : Down and up phases take Θ(log n) time.
B-tree
• m-way search tree used for external searching of records as in
a database (B-tree index)
• # of keys in internal node allowed to vary between “d” and
“2d”. Branching factor - # of keys + 1
• “d” depends on size of disk block
• Insertion/deletion work the same way as in 2-4 tree.
• Typically only when a node is accessed, it is brought into main
memory from disk; entire tree not kept in main memory.
• # of levels ≤ 1 + log_d((n+1)/2) where n is number of keys
• For n ≈ 2 million and d ≈ 100, # of levels is at most 3 requiring
at most 3-4 disk accesses.
• In B+ tree, records with keys kept in leaf nodes
• In B* tree, non-root nodes have at least 2/3 full capacity.
Instead of splitting/merging, sometimes keys may be
transferred from/to sibling nodes.
B- tree (order 7) example
Basic data structures-2
Amortization
• Many dynamic data structures do well for a
sequence of operations though the worst-case
time complexity based on a single operation
may be high
• But we amortize the restructuring cost of a
data structure over the sequence of n
operations – restructure for future benefit
• Two methods for time complexity analysis :
(a) Accounting method
(b) Potential function method
Accounting method
• Assign an amortized cost to each operation which may be less than
or greater than the actual cost of the operation.
• Operations whose actual cost is less than amortized cost will help
pay for operations whose actual cost is more than amortized cost
using “credits” which are associated with data structure.
• We require that sum of amortized costs for n operations must be at
least the sum of actual costs so that total credit is always positive.
• For dynamic array example, set amortized cost to be 3 for each
operation: 1 unit for inserting element itself, one for copying it in
the future when array is doubled and 1 unit for an item already
copied during array doubling before the insertion of this item
• 2 credits will help pay for actual cost when array is doubled.
• Total actual cost ≤ Total amortized cost ≤ 3n
Potential Function
• Associate a “potential” with each state of the data structure
that represents prepaid work for future operations.
• Assume D_0 is the initial state and let D_i be the state after the i-th
operation.
• Define potential function φ : D_i → R s.t.
amortized cost for i-th operation e_i =
actual cost c_i + φ(D_i) − φ(D_{i-1})
• Sum of amortized cost Σ_{i=1}^{n} e_i = sum of actual cost Σ_{i=1}^{n} c_i +
φ(D_n) − φ(D_0) (due to the telescoping sum)
• If we choose the potential function s.t. φ(D_n) ≥ φ(D_0), we get an
upper bound on total actual cost using total amortized cost.
Potential function (contd.)
• For dynamic arrays, we can set φ(T) = 2*T.num – T.size.
• Initial value of φ is 0 .
• Immediately before an expansion, T.num = T.size => φ = T.num
• Immediately after expansion, T.size = 2*T.num => φ = 0.
• Since array is at least half full always, T.num ≥ T.size/2, hence φ(T)
≥0
• Let num_i = number of elements after the i-th operation; note num_i −
num_{i-1} = 1.
• Two cases:
a. i-th operation does not cause expansion:
e_i = c_i + φ_i − φ_{i-1} = 1 + (2*num_i − size) − (2*num_{i-1} − size) = 1 + 2 = 3
b. i-th operation triggers expansion (1 insert and num_i − 1 copying
operations):
e_i = num_i + (2*num_i − 2*(num_i − 1)) − (2*(num_i − 1) − (num_i − 1)) = 3
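A small Python sketch that tracks φ(T) = 2*T.num − T.size while appending to a doubling array; every append comes out with amortized cost 3, whether or not it triggers an expansion (starting size of 1 is an illustrative choice):

class DynArray:
    def __init__(self):
        self.size, self.num = 1, 0
        self.data = [None]

    def append(self, x):
        phi_before = 2 * self.num - self.size
        cost = 1
        if self.num == self.size:             # full: double the array, copying num elements
            cost += self.num
            self.data = self.data + [None] * self.size
            self.size *= 2
        self.data[self.num] = x
        self.num += 1
        phi_after = 2 * self.num - self.size
        return cost + phi_after - phi_before  # amortized cost e_i

arr = DynArray()
print([arr.append(i) for i in range(8)])      # [3, 3, 3, 3, 3, 3, 3, 3]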
Linked list implementation
• Each node has prev and next links (doubly linked)
• “count” for number of items in the list
insertAfter(p,e): // p can be null if need to insert as first item in the list
v ← newly allocated node
v.item ← e
v.prev ← p
if p = null
v.next ← head
else
v.next ← p.next
if v.next = null
tail ← v
else
v.next.prev ← v
if v.prev = null
head ← v
else
v.prev.next ← v
count ← count + 1
return v
Linked List (contd.)
remove(p):
elem ← p.item
if p.prev = null
head ← p.next
else
(p.prev).next ← p.next
if p.next = null
tail ← p.prev
else
(p.next).prev ← p.prev
p.prev ← null; p.next ← null;
count ← count - 1
return elem
Time complexity comparison
Array vs Linked List
Operations Array Linked List
size, isEmpty O(1) O(1)
atRank,rankOf,elemAtRank O(1) O(n)
first,last,before,after O(1) O(1)
insertAtRank,removeAtRank O(n) O(n)
insertFirst, insertLast O(1) O(1)
insertAfter, insertBefore O(n) O(1)
remove O(n) O(1)
Binary Tree
• A recursive data structure with a root node and left
and right children being roots of binary trees
themselves
• We use convention that each internal node has
exactly two children and an external node (leaf) has
no children
• Operations:
leftChild(v) – left child of node v; error for external
node v
rightChild(v) – right child of node v, error for
external node v
isInternal(v) – true iff v is an internal node
Binary tree properties
• Depth of a node is number of internal nodes in the path from
root to the node. Root has depth 0
• Height h of a tree is the maximum depth of an external node
in the tree.
• Height of tree = 1 + max (height of left sub tree, height of right
sub tree)
• # of external nodes = 1 + # of internal nodes (by induction on
tree height)
• h ≤ # of internal nodes ≤ 2^h - 1
• h+1 ≤ # of external nodes ≤ 2^h
• 2h+1 ≤ # of nodes ≤ 2^(h+1) – 1
• For a tree with n nodes, log(n+1) - 1 ≤ h ≤ (n-1)/2
Binary tree traversals
• Preorder – root, left subtree, right subtree
• Inorder – left subtree, root, right subtree
• Postorder – left subtree, right subtree, root
binaryPreorder(T,v):
Input : Binary tree T, a node v
Output: Perform action on each node of subtree rooted
at node v
performAction(v)
if T.isInternal(v)
binaryPreorder(T, T.leftChild(v))
binaryPreorder(T, T.rightChild(v))
Time complexity – O(n) as each node is visited only once
BT array implementation
• A tree with height h needs an array A[0..n] where n =
2^(h+1) – 1
• Rank(v) determines index of node v in A
• Rank(v) = 1 for root v
Rank(leftChild(v)) = 2 * Rank(v)
Rank(rightChild(v)) = 2 * Rank(v) + 1
• For a sparse tree, many cells in A will be unused;
space complexity is O(2^h) in the worst-case
• Simple and fast for many tree operations.
• Can use dynamic arrays for expanding trees.
• Revisit this implementation for heap ADTs.
BT Linked structure
• Similar to a linked list, each node has an item as well as links
to its left and right child nodes (null for external
nodes) and optionally a link to its parent (null for
root)
• Easy to extend to non-binary trees
• Space-efficient as complexity is O(n) where n is
number of nodes
• Time complexity for many operations is comparable
to array implementation but small overhead for
dereferencing the links.
• Revisit this implementation for binary search trees.
Basic data structures-3
Min-Priority Queue ADT
• Allows a totally ordered set of elements to be stored
in such a way that “minimum” element can be
extracted efficiently – self-reorganizing structure
• Useful for task scheduling, efficient sorting
• Operations:
insertElement(e) – insert element in queue
removeMin() – remove and return smallest
minElement() – return minimum element
• In a Max-heap, maximum elements are of interest.
• In Java PriorityQueue<E> is a class based on
unbounded heap
PQ – simple array implementation
• Two approaches:
(a) Keep array unordered
(b) Keep array sorted at all times
• Unordered array
(a) insertElement(E) – add element to end of array – O(1) time
(b) minElement() – O(n) time even in best-case
(c) removeMin() – O(n) time even in best-case
basis of Selection Sort, O(n^2) even in best-case

• Sorted array (max element in position 0)


(a) insertElement(E) - find position to insert and shift elements
to right – O(1) time best-case and O(n) time in worst-case
(b) minElement() – O(1) time
(c) removeMin() – O(1) time (basis of Insertion Sort, O(n^2) in
worst-case and O(n) time in best-case)
Min PQ Binary heap
• It is a complete binary tree with array implementation. In
the tree we fill every level from left to right before
proceeding to next level
• All elements stored in internal nodes only. Ignore
external nodes in discussion.
• The element at a node (except root) is always ≥ element
stored in its parent node – heap property
• Theorem 1 : Root element of a subtree is the minimum
element in that subtree (use induction)
• Recursive structure
• Theorem 2 : The height of an n-element heap is at most
⌈log n⌉. Note # of nodes of a heap of height h is at least
2^h and at most 2^(h+1) - 1
• Easy to see that minElement() takes O(1) time.
Heap operations
insertElement(E):
size ← size + 1
A[size] ← E
elemPos ← size
while elemPos > 1
parentPos ← elemPos/2
if A[elemPos] < A[parentPos]
// swap parent with child
temp ← A[elemPos]
A[elemPos] ← A[parentPos]
A[parentPos] ← temp
elemPos ← parentPos // (1)
else // (2)
break // (3)
Heap operations (contd.)
• Correctness proof:
Code equivalent to removing (2), (3) and moving (1) outside IF
statement and complexity bounded by this modified code’s complexity.
Loop invariance : Subtree rooted at elemPos is a heap
We prove this using induction on loop index i.
Basis i=0: Before loop begins, subtree rooted at elemPos is just one-
element subtree and is a heap
Induction Step : Assume true for i=m-1, m ≥ 1. For m-th iteration,
after swap of child with parent, we see that new parent is still smaller
than the sibling (as old parent was smaller than the sibling). So must
be a heap for iteration m also.
• Time complexity: Initialization and each iteration takes O(1) time
and # of iterations bounded by height of the heap which is at most
⌈log n⌉ .
• Time complexity is O(log n).
Heap operations (contd.)
removeMin():
minVal ← A[1]
A[1] ← A[size] // move last element to root
size ← size - 1
elemPos ← 1
bubbleDown(A,elemPos)
return minVal

bubbleDown(A,elemPos): // Takes O(log n) time


while 2 * elemPos ≤ size
childPos ← 2 * elemPos
if childPos + 1 <= size && A[childPos+1] < A[childPos]
childPos ← childPos + 1
if A[childPos] < A[elemPos]
// swap parent with child
temp ← A[childPos]
A[childPos] ← A[elemPos]
A[elemPos] ← temp
elemPos ← childPos
else
break
Heap construction
• Given n elements, using insertElement() to construct a heap takes
O(n log n) time. Can be done faster.
Heapify(A[0..n]):
for k ← ⌊n/2⌋ to 1 by -1
bubbleDown(A,k)
• Start with heaps of height 0, then construct heaps of height 1, then
height 2 and so on.
• Correctness works by correctness of bubbleDown() and the
recursive nature of heap
• Time complexity:
Claim : There are at most ⌈n/2^(i+1)⌉ heaps of height i for 0 ≤ i ≤ h where
h is height of the heap. (Proof by induction on i).
bubbleDown() for a heap of height i takes at most c*i for some
constant c.
• Total time ≤ (c/2) * Σ_{i=1}^{⌈log n⌉} ⌈n/2^i⌉ * i ≤ c1 * n * Σ_{i=1}^{log n} i/2^i
• Total time is thus O(n) since Σ_{i=1}^{∞} i/2^i converges and hence is O(1)
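A quick Python check of this construction, assuming a 1-indexed array (A[0] unused) as in the pseudocode above:

def bubble_down(A, pos, size):
    # restore the heap property below position pos (1-indexed array)
    while 2 * pos <= size:
        child = 2 * pos
        if child + 1 <= size and A[child + 1] < A[child]:
            child += 1
        if A[child] < A[pos]:
            A[child], A[pos] = A[pos], A[child]
            pos = child
        else:
            break

def heapify(A, size):
    for k in range(size // 2, 0, -1):          # O(n) total, as argued above
        bubble_down(A, k, size)

A = [None, 9, 4, 7, 1, 3, 8, 2]                # 7 elements, 1-indexed
heapify(A, 7)
print(A[1])                                    # 1, the minimum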
Probabilistic data structures for
dictionaries and sets
Ravi Varadarajan
Bloom filters
• isMember(S,e) – check if e belongs to S
• Can use dictionary implementations (hash table, search trees)
• Alternatively we can use a bit-vector of size n where i-th bit
indicates if i-th element belongs to the set
• Advantage : Easy to perform union, intersection operations.
• Disadvantage : Space requirements very high for large
number of elements.
• If we relax the requirement that membership answers be error-free, we can
use probabilistic data structures based on hashing.
• Here we use a bit vector of m-bits (m smaller than n)
• We use “k” (<< m) hash functions to map an element to “k”
positions in the bit vector
False positives in Bloom Filters
• To add an element, need to set bits to 1 in all positions
indexed by “k” hash functions.
• To query an element, need to check all “k” positions indexed
by hash functions to see if they are all 1’s.
• If an element is removed from the set, we cannot set any of
the “k” bits to 0 as other elements may be hashed into these
positions. This causes “false” positives in set membership test.
• Assuming independence of bits being set to 1, we can show
that the probability of a "false" positive (assuming "n"
elements inserted) is given by :
[ 1 − (1 − 1/m)^(kn) ]^k ≈ (1 − e^(−kn/m))^k
• Bloom filters very fast and have less space requirements;
typically we require disk accesses to get items only after we
find the key exists.
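A minimal Bloom filter sketch in Python; the k hash functions are simulated here by salting SHA-256, an illustrative choice rather than any particular library's hash family:

import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        for i in range(self.k):                    # k salted hashes -> k bit positions
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def might_contain(self, item):
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter(m=1000, k=3)
bf.add("alice")
print(bf.might_contain("alice"), bf.might_contain("bob"))   # True, False (with high probability)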
Source : Wikipedia
Chinese Remainder Theorem for integers
• Two integers a and b are relatively prime if gcd(a,b) = 1
• Let p_0, p_1, …, p_{k-1} be k pairwise relatively prime numbers. Then any
integer r between 0 and p_0 * p_1 * … * p_{k-1} − 1 can be
represented as (r_0, r_1, …, r_{k-1}) (called "residues" or "moduli")
where r_i = r mod p_i. We write it as r ↔ (r_0, r_1, …, r_{k-1})
• This representation is unique due to the Chinese Remainder
Theorem (stated below without proof):
• Given : residues (u_0, u_1, …, u_{k-1}) w.r.t. p_0, p_1, …, p_{k-1},
then u = Σ_{i=0}^{k-1} c_i d_i u_i modulo p where
p = p_0 * p_1 * … * p_{k-1}, c_i = p / p_i, and d_i = c_i^{-1} mod p_i
• Example: p_0 = 2, p_1 = 3, p_2 = 5 and p_3 = 7 and (u_0, u_1, u_2,
u_3) = (1, 2, 4, 3). What is u such that u ↔ (1, 2, 4, 3) ?
Chinese Remainder Algorithm For Integers
Example : when k = 4
Σ_{i=0}^{3} c_i d_i u_i = Σ_{i=0}^{3} (p / p_i) d_i u_i =
d_0 u_0 p_1 p_2 p_3 + d_1 u_1 p_0 p_2 p_3 + d_2 u_2 p_0 p_1 p_3 + d_3 u_3 p_0 p_1 p_2
= p_0 p_1 (d_2 u_2 p_3 + d_3 u_3 p_2) + p_2 p_3 (d_0 u_0 p_1 + d_1 u_1 p_0)
= q_{1,0} (s_{0,2} q_{0,3} + s_{0,3} q_{0,2}) + q_{1,1} (s_{0,0} q_{0,1} + s_{0,1} q_{0,0})
where at level 0 we have q_{0,0} = p_0, q_{0,1} = p_1, q_{0,2} = p_2 and q_{0,3} = p_3;
s_{0,0} = d_0 u_0, s_{0,1} = d_1 u_1, s_{0,2} = d_2 u_2 and s_{0,3} = d_3 u_3
And at level 1, if we define q_{1,0} = q_{0,0} * q_{0,1} = p_0 p_1 and
q_{1,1} = q_{0,2} * q_{0,3} = p_2 p_3; s_{1,0} = s_{0,0} q_{0,1} + s_{0,1} q_{0,0} and
s_{1,1} = s_{0,2} q_{0,3} + s_{0,3} q_{0,2},
we get at level 2
Σ_{i=0}^{3} c_i d_i u_i = s_{2,0} = s_{1,0} q_{1,1} + s_{1,1} q_{1,0}
We can continue this pattern to more than 3 levels for larger
values of k.
Chinese Remainder Algorithm For Integers
Pre-conditioning step:
[Binary tree of products: leaves q_{0,0} = p_0 = 2, q_{0,1} = p_1 = 3, q_{0,2} = p_2 = 5,
q_{0,3} = p_3 = 7; level 1: q_{1,0} = q_{0,0} * q_{0,1} = 6, q_{1,1} = q_{0,2} * q_{0,3} = 35;
root: q_{2,0} = q_{1,0} * q_{1,1} = 210]
Complexity = O(k) integer multiplications
Done only once for the given k relatively prime integers
Chinese Remainder Algorithm For Integers
Computing step (use tree constructed in pre-conditioning for
multiplication factors):
[Tree of weighted sums: leaves s_{0,0} = d_0 u_0 = 1, s_{0,1} = d_1 u_1 = 2,
s_{0,2} = d_2 u_2 = 12, s_{0,3} = d_3 u_3 = 12;
level 1: s_{1,0} = s_{0,0} * q_{0,1} + s_{0,1} * q_{0,0} = 1*3 + 2*2 = 7,
s_{1,1} = s_{0,2} * q_{0,3} + s_{0,3} * q_{0,2} = 12*7 + 12*5 = 144;
root: s_{2,0} = s_{1,0} * q_{1,1} + s_{1,1} * q_{1,0} = 7*35 + 144*6 = 1109;
u = 1109 mod 210 = 59]
d_0 = (3*5*7)^(-1) mod 2 = 105^(-1) mod 2 = 1 (i.e. 1*105 mod 2 = 1) ; u_0 = 1
d_1 = (2*5*7)^(-1) mod 3 = 70^(-1) mod 3 = 1 (i.e. 1*70 mod 3 = 1) ; u_1 = 2
d_2 = (2*3*7)^(-1) mod 5 = 42^(-1) mod 5 = 3 (i.e. 3*42 mod 5 = 1) ; u_2 = 4
d_3 = (2*3*5)^(-1) mod 7 = 30^(-1) mod 7 = 4 (i.e. 4*30 mod 7 = 1) ; u_3 = 3
Complexity = O(k) integer multiplications + O(k) inverse operations
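A direct Python sketch of the reconstruction formula u = Σ c_i d_i u_i mod p (without the tree-based pre-conditioning); it reproduces u = 59 for the example above:

def crt(residues, moduli):
    # residues u_i with pairwise relatively prime moduli p_i
    p = 1
    for pi in moduli:
        p *= pi
    u = 0
    for ui, pi in zip(residues, moduli):
        ci = p // pi
        di = pow(ci, -1, pi)          # modular inverse of c_i mod p_i (Python 3.8+)
        u += ci * di * ui
    return u % p

print(crt([1, 2, 4, 3], [2, 3, 5, 7]))    # 59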
Complexity Theory

Ravi Varadarajan

[Slides on complexity theory: figures from Dr. Nakayama, not reproduced here]
RAM Model vs TM model
• Recall RAM model has infinite memory and unlimited
word size
• RAM model assumes unit times for primitive operations
such as arithmetic operations, comparisons etc. Time is
independent of word size (uniform cost model)
• TM uses logarithmic complexity model where input size
for numeric data uses logarithmic space under
“reasonable encoding” schemes
• In TM, arithmetic operations take time that are
dependent on the logarithmic size of the numerical
inputs.
• RAM model can be simulated by a TM with polynomial
time complexity overhead.
[Further slides: figures from Dr. Nakayama, not reproduced here]
e.g. Branch and bound search



Bounded approximation algorithms
• For a given instance 𝐼 of the minimization optimization
problem 𝜋, let OPT(𝐼) be the optimum value.
• Given an approximation algorithm 𝐴 for 𝜋, let 𝐴(𝐼) be
the solution given by 𝐴 for instance 𝐼.
• Define performance ratio
R_A(I) = A(I) / OPT(I)
For a maximization problem π, define R_A(I) = OPT(I) / A(I)
• Absolute performance ratio R_A is given by the smallest value
r ≥ 1 such that R_A(I) ≤ r for all instances I of π.
Travelling Salesman Problem
• Triangular inequality constraint:
The graph satisfies the property that
w(v_i, v_j) ≤ w(v_i, v_k) + w(v_k, v_j) for any 3
distinct vertices v_i, v_k and v_j.
• There is an approximation alg. A for TSP with
triangular inequality that achieves R_A < 2.
• It constructs an Euler tour of the minimum
cost spanning tree of the graph and applies
triangular inequality to find a TSP tour.
Approximation Schemes
• An approximation algorithm A which, given an
accuracy requirement ε > 0 and an instance I
of the problem, constructs a solution in
polynomial time and achieves R_{A_ε}(I) ≤ 1 + ε
(i.e. a range of app. algorithms, one for each ε)
• A fully polynomial-time approximation
scheme runs in time which is a polynomial
function of the length of the input and 1/ε.
It has time-accuracy trade-offs.
Knapsack problem
• Problem : Max Σ_{i=1}^{n} b_i t_i
s.t. Σ_{i=1}^{n} w_i t_i ≤ W, t_i = 0 or 1, ∀ i
• Decision version is NP-Complete (can reduce the
Partition problem to Knapsack)
• Note the DP algorithm has time complexity O(nW).
This is NOT a polynomial-time algorithm as the length of the
input includes log W and the time complexity is O(n
2^(log W)). It is called a "pseudo-polynomial time"
algorithm.
• By scaling the problem depending on accuracy ε, we
can design an approximation scheme for the knapsack
problem that works in time O(n^3/ε)
[Figures: running time vs. input size, comparing f(n) with c·g(n) and c·h(n)]
• f(n) ≤ c·g(n) for some c > 0 and all n ≥ n0 (∃c>0 & ∃n0 ≥ 1) → f(n) ∈ O(g(n))
• f(n) ≥ c·h(n) for some c > 0 and all n ≥ n0 (∃c>0 & ∃n0 ≥ 1) → f(n) ∈ Ω(h(n))
• f(n) ∈ Θ(g(n)): f(n) and g(n) are asymptotically equal, up to a constant factor
• f(n) ≤ c·g(n) for every c > 0 (∀c>0 & ∃n0 > 0) → f(n) ∈ o(g(n))
(less than in the asymptotic sense: f(n) approaches but never touches c·g(n))
• f(n) ≥ c·h(n) for every c > 0 (∀c>0 & ∃n0 > 0) → f(n) ∈ ω(h(n))
Example:
Prove that f(n) = 6n^2 + 3n is o(n^3) and ω(n)
Solution:
The ask: f(n) ≤ c·n^3 and f(n) ≥ c·n for ∀c > 0 and ∃ n0 > 0

Show f(n) is o(n^3): Let c > 0, pick n0 = (6+3)/c = 9/c → cn ≥ 9 for n ≥ n0
f(n) = 6n^2 + 3n ≤ 6n^2 + 3n^2 = 9n^2 ≤ cn·n^2 = cn^3 for n ≥ n0

Show f(n) is ω(n): Let c > 0, pick n0 = c/6 → 6n ≥ c for n ≥ n0
f(n) = 6n^2 + 3n ≥ 6n^2 = 6n·n ≥ cn for n ≥ n0
Text Compression Goal:
Store as many documents as possible
Fact:
ASCII & Unicode systems use fixed-length binary strings to encode characters
7 bits (e.g. 0101010) -> Doc of 100M characters = 100M x 7 = 700 Megabits
16 bits (e.g. 0101010101010101) -> Same doc = 1600 Megabits
Innovate:
Achieve the text compression goal using a variable-length encoding scheme
Core Idea:
Most-frequently used characters use the least number of bits and vice versa.
[Figure: Huffman coding tree (root at top) for the input string X below]
Input string X:
a fast runner need never be afraid of the dark
Example codes from the tree: a: 010, f: 1100

Representation Convention:
• Circle: internal node
• Square: external node
• v: a vertex/node
• c: a character
• C: collection of characters
Define the Tree T:
Total path weight: p(T)
• vc: external node associated with c
• dT(vc) or d(vc): depth of vc in T
• f(c): frequency of c
• f(v): sum of the f(c) over all external nodes vc
that are descendants of v
* Depth of a node = number of its proper ancestors.
c a b d e f h i k n o r s t u v
f(c) 9 5 1 3 7 3 1 1 1 4 1 5 1 2 1 1
d(vc) 2 3 5 4 3 4 5 6 6 4 6 3 6 5 6 6
Define the Tree T (contd.):
[Worked on the tree figure: 19 + 10 + 9 + 5 + 5 = 9x2 + 5x3 + 5x3 = 48;
definitions and frequency table as above]
Optimal Tree Structure T:
[Figure: tree T with subtrees T1, T2, T3, …]
p(T) is the expected code length; we want to minimize this.
Each subtree is optimal for the sub problem of coding its own subset of characters.

Optimal Tree Structure T starts with its subtree:
Depth of an external node of T1 in T = depth of the root node of T1 (as in T) + its
depth within T1
-> only worry about this subtree T1 (independent of T)
• Root node of T1
• f(c): weight of c (proportionate with its frequency)
• d(vc): depth of vc
Abstract Structure Definition
A container of elements, each assigned a key (at insertion time). This key determines
the "priority" used to pick elements to be removed (tends to be the min)

Priority Queue Implementations:
1. Heap (a binary tree whose elements are stored at internal nodes only)
2. Unsorted list (the key = value of elements)
3. Sorted list (the key = order in the list)
   - A linked list can be used to construct the sorted list
All of 1, 2, 3 can be array-based

Method                  Unsorted List   Sorted List   Heap
Insert                  O(1)            O(n)          O(log n)
Remove (min priority)   O(n)            O(1)          O(log n)
Access(key)             O(1)            O(1)          O(1)

Array Index    1   2   3   4   5   6   7   8   9   10  11
Unsorted List  2   3   13  5   9   15  16  11  17  18
Sorted list    2   3   5   9   11  13  15  16  17  18
Min Heap       2   3   13  5   9   15  16  11  17  18
Max Heap       18  17  15  11  16  3   13  2   9   5
Insertion order: 2,3,13,5,9,15,16,11,17,18
Complexity: O(n + d log d)
• d: number of distinct characters in X (size of C)
[Annotations on the algorithm: computing the frequencies takes O(n); each heap
operation takes O(log d); the d-1 merge steps take O(d log d) in total]
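A compact Python sketch of the Huffman construction with a heap (O(n + d log d) overall, as noted above):

import heapq
from collections import Counter

def huffman_codes(text):
    freq = Counter(text)                               # O(n) frequency pass
    heap = [(f, i, c) for i, (c, f) in enumerate(freq.items())]   # i breaks ties
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:                               # d-1 merges, each O(log d)
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):                    # internal node: recurse on children
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                          # external node: record the code
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("a fast runner need never be afraid of the dark")
print(codes["a"], codes["f"])    # 'a' is more frequent, so its code is no longer than f's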
1-level tree containing the 2 lowest-frequency characters is part of an optimal tree
Consider a set C = {c1, c2, c3, c4} of 4 characters with frequencies
f1 ≤ f2 ≤ f3 ≤ f4.
Need to consider only 2 complete tree structures.
Case 1 : 2 nodes (say x and y with f(x) ≤ f(y)) at level 3 and the other nodes (w and z)
at levels 1 and 2
expected code length e = f(w) + 2 f(z) + 3(f(x)+f(y))
If w = c2 and f2 ≤ f(y) -> we can get a new tree T' by switching w and y, with
expected code length e1 = f(y) + 2 f(z) + 3(f(x)+f(w))
e1 – e = 2(f(w)-f(y)) ≤ 0 as f(y) ≥ f(w) -> change to T' with no larger expected code length
Same is true for z. -> In an optimal tree, f(y) = f2. Can also show f(x) = f1
1-level tree containing the 2 lowest-frequency characters is part of an optimal tree
Case 2 : All leaves at the same level
[Figure: trees T and T' with leaves c1, c4, c2, c3]
We can easily switch leaves so that c1 and c2 are in the same subtree.
⇒ A 1-level subtree containing c1, c2 is part of an optimal tree.
NP-complete problem. Solved using Greedy Algorithm via
dynamic programming: Solution is not the optimal solution
Example:
c = 11. 5 object with weights & values as followed.
Each table cell at capacity c in round i is computed as:
max( previous round's result at the same capacity c,
     the current (new) item's value + previous round's result at capacity c - wi )

For the full example result, visit: https://www.javatpoint.com/0-1-knapsack-problem

Items are sorted by weight.
Round 1: only consider object 1 (with w1 and v1).   Round 2: only consider objects 1 & 2.
Total weight of all objects = 1                      Total weight of all objects = 3

Round 2, capacity 2: max(previous result at capacity 2, V2 + previous result at capacity 2 - w2) = max(1, 6 + 0) = 6
Round 2, capacity 3: max(previous result at capacity 3, V2 + previous result at capacity 3 - w2) = max(1, 6 + 1) = 7
For capacities above 3 (the total weight of objects 1 & 2), fill out the rest of the row with the same value as at capacity 3.
Items are sorted by weight.

Round 2: only consider objects 1, 2.   Round 3: only consider objects 1, 2, 3.

c = 7, i = 3, w3 = 5:
max(previous round's result at capacity 7, V3 + previous round's result at capacity 7 - 5) = max(7, 18 + 6) = 24

c = 5, i = 3, w3 = 5:
max(previous round's result at capacity 5, V3 + previous round's result at capacity 5 - 5) = max(7, 18 + 0) = 18

For capacities where w3 = 5 > current capacity, keep the previous round's values the same.
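A compact Python sketch of this tabular DP (knapsack_01 is an illustrative name; the weights/values in the usage line are consistent with the worked example referenced above):

    def knapsack_01(capacity, weights, values):
        n = len(weights)
        # best[i][c] = max value achievable using the first i items with capacity c
        best = [[0] * (capacity + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            w, v = weights[i - 1], values[i - 1]
            for c in range(capacity + 1):
                best[i][c] = best[i - 1][c]                 # previous round, same capacity
                if w <= c:                                   # new item's value + previous round at c - w
                    best[i][c] = max(best[i][c], v + best[i - 1][c - w])
        return best[n][capacity]

    # Example: knapsack_01(11, [1, 2, 5, 6, 7], [1, 6, 18, 22, 28])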
[Figures: heap/tree diagrams from the slides (node values such as 17, 20, 32, 44, 48, 50, 52, 54, 60, 62, 65, 68, 70, 78, 88); the tree structure is not recoverable from the extracted text.]
Shares  Price  Age          Shares  Price  Age
1000    4.05   20s          2000    4.06   10s
100     4.05   6s           500     4.07   70s
2100    4.02   2s           1000    4.07   50s
2500    4.01   85s          2100    4.20   5s
                            100     4.21   1s

Keys determine the order in which elements are removed.

PQ implemented in the form of an Unsorted List (selection sort).
A[1..i-1] is the sorted part, A[i..n] the unsorted part. In-place algorithm.
• Assign index i to a temp variable
    s = i
• Loop through i+1 to n (the outer loop runs n-1 times, for i = 1 to n-1)
    for j = i + 1 to n
        if A[j] < A[s] then
            s = j
• If s ≠ i
    swap A[i] and A[s]

i = 1: runs n-1 comparisons
i = 2: runs n-2 comparisons
…
i = n-1: runs 1 comparison

(n-1)+(n-2)+…+3+2+1 comparisons

O(n) swaps/exchanges

O(n2) sort : Quadratic sort
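The same procedure in runnable Python (0-based indices; a sketch, with selection_sort as an illustrative name):

    def selection_sort(a):
        n = len(a)
        for i in range(n - 1):               # n-1 outer iterations
            s = i                            # index of the minimum of a[i..n-1]
            for j in range(i + 1, n):
                if a[j] < a[s]:
                    s = j
            if s != i:
                a[i], a[s] = a[s], a[i]      # at most one swap per iteration
        return a

    # selection_sort([2, 3, 13, 5, 9, 15, 16, 11, 17, 18])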
PQ implemented in the form of a Sorted List (insertion sort).
A[1..i-1] is the sorted part, A[i..n] the unsorted part. In-place algorithm.
• Assign index i to a temp variable along with its value
    s = i
    t = A[i]
• Loop backward toward position 0 (the outer loop runs n-1 times, for i = 1 to n-1)
    for j = i - 1 down to 0
        if A[j] > t then
            A[s] = A[j]
            s = s - 1
        else stop the loop
• Found the place for the old value at index i
    A[s] = t

i = 1: runs 1 comparison
i = 2: runs 2 comparisons
…
i = n-1: runs n-1 comparisons (worst case)

(n-1)+(n-2)+…+3+2+1 comparisons

O(n2) sort : Quadratic sort
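And in runnable Python (0-based indices; insertion_sort is an illustrative name):

    def insertion_sort(a):
        for i in range(1, len(a)):           # a[0..i-1] is already sorted
            t = a[i]                         # value being inserted
            s = i
            while s > 0 and a[s - 1] > t:
                a[s] = a[s - 1]              # shift the larger element right
                s -= 1
            a[s] = t                         # place t into its slot
        return a

    # insertion_sort([2, 4, 6, 3]) -> [2, 3, 4, 6], matching the trace later in these notes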
Inventor: Sir Tony Hoare, computer scientist (also the inventor of the quickselect algorithm).
Wikipedia.org extract:

"… The quicksort algorithm was developed in 1959 by Tony Hoare while he was a visiting student at Moscow State University. At that time, Hoare was working on a machine translation project for the National Physical Laboratory. As a part of the translation process, he needed to sort the words in Russian sentences before looking them up in a Russian-English dictionary, which was in alphabetical order on magnetic tape.[5] After recognizing that his first idea, insertion sort, would be slow, he came up with a new idea. He wrote the partition part in Mercury Autocode but had trouble dealing with the list of unsorted segments. On return to England, he was asked to write code for Shellsort. Hoare mentioned to his boss that he knew of a faster algorithm and his boss bet sixpence that he did not. His boss ultimately accepted that he had lost the bet. Later, Hoare learned about ALGOL and its ability to do recursion that enabled him to publish the code in Communications of the Association for Computing Machinery, the premier computer science journal of the time. …"

Side note on ALGOL: ALGOL was used mostly by research computer scientists in the United States and in Europe. Its use in commercial applications was hindered by the absence of standard input/output facilities in its description and the lack of interest in the language by large computer vendors other than Burroughs Corporation. ALGOL 60 did however become the standard for the publication of algorithms and had a profound effect on future language development.
44 75 23 43 55 12 64 77 33
Legend:
Step 1: Search for the first value at the left end > pivot value Pivot Element A[0]

44 75 23 43 55 12 64 77 33 Greater than
Less than

Step 2: Search for the first value at the right end < pivot value
Left
44 75 23 43 55 12 64 77 33
Right
Step 3: Exchange these values
44 33 23 43 55 12 64 77 75
Repeatedly move Left and Right until the condition is met again
44 33 23 43 55 12 64 77 75 Legend:
Pivot Element A[0]
Swap/Exchange Greater than
Less than
44 33 23 43 12 55 64 77 75
Left
Stop when Left “passed” Right. Exchange Pivot with Right.

“Partitioning” completed! Right

12 33 23 43 44 55 64 77 75
Repeat on L

12 33 23 43 44 55 64 77 75 Legend:
Pivot Element A[0]
Greater than
12 33 23 43 44 55 64 77 75 Less than

From Left
12 23 33 43 44 55 64 77 75

From Right
Repeat on G
12 23 33 43 44 55 64 77 75 Legend:
Pivot Element A[0]
Greater than
12 23 33 43 44 55 64 77 75 Less than

From Left
12 23 33 43 44 55 64 77 75

12 23 33 43 44 55 64 75 77 From Right

Done!
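A Python sketch of the partition scheme traced above (pivot = first element, scan from both ends, swap, and finally exchange the pivot with the right scan position); quicksort here is an illustrative name, not the course's reference code:

    def quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return a
        pivot = a[lo]                               # pivot element A[0] of the subarray
        left, right = lo + 1, hi
        while True:
            while left <= right and a[left] <= pivot:
                left += 1                           # first value from the left > pivot
            while left <= right and a[right] >= pivot:
                right -= 1                          # first value from the right < pivot
            if left > right:                        # scans have "passed" each other
                break
            a[left], a[right] = a[right], a[left]   # exchange the two values
        a[lo], a[right] = a[right], a[lo]           # exchange pivot with Right: partitioning done
        quicksort(a, lo, right - 1)                 # repeat on L
        quicksort(a, right + 1, hi)                 # repeat on G
        return a

    # quicksort([44, 75, 23, 43, 55, 12, 64, 77, 33])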
44 75 23 43 55 12 64 77 33

12 33 23 43 55 64 77 75

33 23 43 64 77 75

23 43 77 75

75

Observation:
The height of the quick-sort tree is O(n) in the worst case. When will this happen? For example, when the input is already sorted:
1 2 3 4 5 6 7 8 9

2 3 4 5 6 7 8 9

3 4 5 6 7 …

4 5 6 7 …
………. Etc.
depth = 0: the root node processes n values
depth = 1: the root's 2 children process n-1 values
depth = 2: all nodes at this depth process n-(1+2) values
…
depth = i: all nodes process n-(1+2+…+2^(i-1)) = n-(2^i - 1) values

Height O(logn), and each level processes at most n values -> O(nlogn)

PQ implemented in the form of a sorted list: insertion sort example.
A = 2, 4, 6 | 3 (insert the new value 3 into the sorted prefix 2, 4, 6). In-place algorithm.

Initialization: s = i = 3 (index of the new value), t = A[i] = 3.

1st iteration: j = 2: A[2] = 6 > t, so A[s] = A[3] = 6 and s becomes 2 -> 2, 4, 6, 6
2nd iteration: j = 1: A[1] = 4 > t, so A[s] = A[2] = 4 and s becomes 1 -> 2, 4, 4, 6
3rd iteration: j = 0: A[0] = 2 ≤ t, so stop. Found the place for t: A[1] = 3 -> 2, 3, 4, 6
PQ implemented in the form of an Unsorted List: selection sort example.
A = 2, 4, 6, 7 | 3, …: A[1..i-1] is the sorted part, A[i..n] the unsorted part. In-place algorithm.
Find the index of the minimum of the unsorted part (which happens to be the value 3, at some index x).
• Assign index i to a temp variable
    s = i
• Loop through i+1 to n (the outer loop runs n-1 times)
    for j = i + 1 to n
        if A[j] < A[s] then
            s = j
• If s ≠ i
    swap A[i] and A[s]
[Figure: sample personal goals (Highschool, B.S., Graduate, Graduate School, Master Data Structures, Get an A in CS435, Master a Language, Intern, First Job, First Home, Family) drawn as the vertices of a graph.]
* Sample goals in graph form, in no particular order or priority.


How do map services calculate the shortest path (excluding highways/tolls, etc.)? What about the ticket routes offered to you?
Example on an unweighted graph.
[Figures: a sequence of slides exploring an unweighted graph (every edge has weight 1) with vertices labelled A through I and a start vertex u; one new vertex is reached on each slide as the search fans out level by level.]
Goal: To approximate the distance in G from v to every other vertex u ≠ v.

Always store the length of the best path we have found so far from v to u:
D[v] = 0
D[u] = +∞ for every other vertex u

Set C = ∅ (our "cloud").
Each iteration:
-> we select a vertex u outside C with the smallest D[u] and put it into C
-> we update D[z] for every z adjacent to u and not yet in C (edge relaxation):
   if D[u] + w((u,z)) < D[z], there is a better way to get to z from v by going through edge (u,z), so store the length of this better path: D[z] = D[u] + w((u,z)).
From BWI

D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]


0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
From BWI
C = {JFK}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
From BWI
C = {JFK}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
From BWI
C = {JFK, PVD}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
From BWI
C = {JFK, PVD, BOS}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
3075 +∞
From BWI
C = {JFK, PVD, BOS, ORD}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
2467 +∞ 1423
From BWI
C = {JFK, PVD, BOS, ORD, MIA}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
2467 3288 1423
From BWI
C = {JFK, PVD, BOS, ORD, MIA, DFW}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
2467 2658 1423
From BWI
C = {JFK, PVD, BOS, ORD, MIA, DFW, SFO}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
2467 2658 1423
From BWI
C = {JFK, PVD, BOS, ORD, MIA, DFW, SFO, LAX}
D[BWI] D[SFO] D[LAX] D[DFW] D[ORD] D[JFK] D[BOS] D[PVD] D[MIA]
0 +∞ +∞ +∞ +∞ +∞ +∞ +∞ +∞
+∞ +∞ +∞ 621 184 +∞ +∞ 946
+∞ +∞ 1575 371 328
2467 2658 1423

Single-source shortest path problem solved.


1. One way of implementation:

|V| = n, |E| = m. From v:
G: adjacency list structure.
Q: heap keyed on D[u] for all u; removeMin in O(logn) time.
Keep an array to access the keys of the vertices in the heap Q.
Update D[z] by first removing the object containing z and then inserting z with its new D[z] via heapification, in O(logn) time.

Running time:
1. Inserting all the vertices in Q with their initial D[]: O(nlogn)
2. Inside the while loop:
   1. Remove u and maintain the heap: O(logn)
   2. Relaxation procedure for all edges adjacent to u: O(deg(u)logn)
3. Total for the while loop: summing over all u gives O((n+m)logn)

If m >> n, the total run time is O(mlogn).

(This is just the running time of one way to do this.)
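A Python sketch of this kind of implementation (using heapq; instead of removing and re-inserting a vertex's heap entry when D[z] improves, this version pushes a new entry and skips stale ones, which keeps the same O(mlogn) bound; the commented sample adjacency map reuses a few of the flight distances from the edge table later in these notes and is only an assumed example):

    import heapq

    def dijkstra(adj, source):
        # adj: {u: [(z, w), ...]} adjacency lists; returns D with shortest distances.
        D = {u: float("inf") for u in adj}
        D[source] = 0
        heap = [(0, source)]
        cloud = set()                            # the set C of finished vertices
        while heap:
            d, u = heapq.heappop(heap)           # vertex outside C with smallest D[u]
            if u in cloud:
                continue                         # stale heap entry, skip it
            cloud.add(u)
            for z, w in adj[u]:
                if z not in cloud and d + w < D[z]:
                    D[z] = d + w                 # relaxation of edge (u, z)
                    heapq.heappush(heap, (D[z], z))
        return D

    # adj = {"BWI": [("JFK", 184), ("ORD", 621), ("MIA", 946)],
    #        "JFK": [("BWI", 184), ("PVD", 144), ("BOS", 187)],
    #        "PVD": [("JFK", 144)], "BOS": [("JFK", 187)],
    #        "ORD": [("BWI", 621)], "MIA": [("BWI", 946)]}
    # dijkstra(adj, "BWI")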


So far, we have avoided negative edge weights (and we assume there is no negative-weight cycle).
To consider negative edge weights, the graph must be directed;
otherwise a shortest path does not make sense (an undirected negative edge could be traversed back and forth indefinitely).
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ +∞ +∞
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20

Adjacency list: (BWI, JFK), (BWI,MIA)

Consider (BWI, JFK)


D[BWI] + w((BWI,JFK))= 0 + 10 = 10 < D[JFK]=+∞
-> D[JFK] = 10
Consider (BWI,MIA)
D[BWI] + w((BWI,MIA)) = 0 + 20 = 20 < D[MIA] = +∞
-> D[MIA] = 20
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 5

Adjacency list FOR MIA: (MIA,DFW),(MIA, JFK), (MIA,LAX)

Consider (MIA, DFW)
D[MIA] + w((MIA,DFW)) = 20 + (-25) = -5 < D[DFW] = +∞
-> D[DFW] = -5
Consider (MIA, JFK)
D[MIA] + w((MIA,JFK)) = 20 + (-15) = 5 < D[JFK] = 10
-> D[JFK] = 5
Consider (MIA, LAX)
D[MIA] + w((MIA,LAX)) = 20 + 30 = 50 < D[LAX] = +∞
-> D[LAX] = 50
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 2 5

Adjacency list FOR JFK: (JFK, ORD)

Consider (JFK, ORD)


D*[JFK] + w((JFK,ORD))= 10 + -8 = 2 < D[ORD]=+∞
-> D[ORD] = 2
* Use original previous round.
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 2 5
-3

Adjacency list FOR JFK: (JFK, ORD)

Consider (JFK, ORD)


D$[JFK] + w((JFK,ORD))= 5 + -8 = -3 < D[ORD]=2
-> D[ORD] = -3
$ Use relaxed
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 2 5
5 -8 -3

Adjacency list FOR ORD: (ORD, LAX), (ORD,DFW)

Consider (ORD,LAX)
D*[ORD] + w((ORD,LAX)) = 2 + 3 = 5 < D[LAX] = 50
-> D[LAX] = 5
Consider (ORD, DFW)
D*[ORD] + w((ORD,DFW)) = 2 + (-10) = -8 < D[DFW] = -5
-> D[DFW] = -8

* Use original previous round.


From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 2 5
5 -8 -3
4 -13
Adjacency list for ORD: (ORD, DFW)

Consider (ORD, DFW)


D[ORD] + w((ORD,DFW))= -3 + -10 = -13 < D[DFW]=-8
-> D[DFW] = -13

Adjacency list FOR DFW: (DFW, LAX)

Consider (DFW, LAX)


D*[DFW] + w((DFW,LAX))=-8 + 12 = 4 < D[LAX]=5
-> D[LAX] = 4
From BWI

D[BWI] D[LAX] D[DFW] D[ORD] D[JFK] D[MIA]


0 +∞ +∞ +∞ 10 20
50 -5 2 5
5 -8 -3
4 -13
-1

Adjacency list FOR DFW: (DFW, LAX)

Consider (DFW,LAX)
D$[DFW] + w((DFW,LAX)) = -13 + 12 = -1 < D[LAX] = 4
-> D[LAX] = -1
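These repeated relaxation rounds are what the Bellman-Ford algorithm does systematically: relax every edge, up to n-1 times. A Python sketch (bellman_ford is an illustrative name):

    def bellman_ford(vertices, edges, source):
        # edges: list of directed (u, z, w) triples; negative weights are allowed
        # as long as no negative-weight cycle is reachable from source.
        D = {u: float("inf") for u in vertices}
        D[source] = 0
        for _ in range(len(vertices) - 1):         # n-1 relaxation rounds
            changed = False
            for u, z, w in edges:
                if D[u] + w < D[z]:
                    D[z] = D[u] + w                # relax edge (u, z)
                    changed = True
            if not changed:
                break                              # early exit: no update in a full round
        for u, z, w in edges:                      # one extra round detects negative cycles
            if D[u] + w < D[z]:
                raise ValueError("negative-weight cycle reachable from source")
        return D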
1. Requirement:
   1. Run Internet cable through the neighborhood.
   2. Interconnect all the houses as cheaply as possible.

[Figures: a neighborhood graph with edge costs (5, 7, 10, 20) and three candidate edge selections, captioned "One Spanning Tree", "Not a Tree", and "Two Trees".]

Definition: A tree that contains all the vertices of a weighted undirected graph G is a spanning tree of G.
Definition: A minimum spanning tree is a spanning tree T that minimizes the sum of the weights of the edges of T.

How many edges does a spanning tree T have? (n - 1, where n is the number of vertices.)

Lemma: Let G be a weighted connected graph, and let T be a minimum spanning tree for G.
If e is an edge of G that is not in T, then the weight of e is at least as great as the weight of any edge in the cycle created by adding e to T.

Proof by contradiction:
Suppose, for the sake of contradiction, that there is an edge f in that cycle (so f is in T) with
w(e) < w(f).
Then we can remove f from T and replace it with e, and this will result in a spanning tree T' with
w(T') < w(T).
But the existence of such a tree T' would contradict the fact that T is a minimum spanning tree.
So no such edge f can exist.


Theorem: Let G be a weighted connected graph, and let V1 and V2 be a partition of the vertices of G into two disjoint nonempty sets. Furthermore, let e be an edge in G with minimum weight from among those with one endpoint in V1 and the other in V2. There exists a minimum spanning tree T that has e as one of its edges.

Proof:
Take any minimum spanning tree T. If e is in T, we are done. Otherwise, adding e to T creates a cycle; this cycle must contain another edge f with one endpoint in V1 and the other in V2, and by the choice of e we have w(e) ≤ w(f).
Then we can remove f from T and replace it with e, and this will result in a spanning tree T' with w(T') ≤ w(T).
Hence T' is also a minimum spanning tree, and it contains e.


{BOS}, {BWI} {SFO} {LAX} {DFW}, {ORD} {PVD} {JFK} {MIA}

Edge (Weights)
(JFK, PVD) 144
(BWI, JFK) 184
(BOS, JFK) 187
(LAX, SFO) 337
(BWI, ORD) 621
(JFK, ORD) 740
(DFW, ORD) 802
(ORD, PVD) 849
(BOS, ORD) 867
(BWI, MIA) 946
(JFK, MIA) 1090
(DFW, MIA) 1121
(DFW, LAX) 1235
(BOS, MIA) 1258
(DFW, SFO) 1464
(LAX, MIA) 2342
(BOS, SFO) 2704
Running the greedy (Kruskal) procedure, the edges are examined in increasing order of weight; an edge is accepted when its endpoints lie in different clusters (the two clusters are then merged), and is skipped otherwise:

T[0] = (JFK, PVD) 144  -> clusters: {BOS} {BWI} {SFO} {LAX} {DFW} {ORD} {PVD, JFK} {MIA}
T[1] = (BWI, JFK) 184  -> {BOS} {SFO} {LAX} {DFW} {ORD} {PVD, JFK, BWI} {MIA}
T[2] = (BOS, JFK) 187  -> {SFO} {LAX} {DFW} {ORD} {PVD, JFK, BWI, BOS} {MIA}
T[3] = (LAX, SFO) 337  -> {SFO, LAX} {DFW} {ORD} {PVD, JFK, BWI, BOS} {MIA}
T[4] = (BWI, ORD) 621  -> {SFO, LAX} {DFW} {ORD, PVD, JFK, BWI, BOS} {MIA}
       (JFK, ORD) 740 is skipped: both endpoints are already in the same cluster
T[5] = (DFW, ORD) 802  -> {SFO, LAX} {DFW, ORD, PVD, JFK, BWI, BOS} {MIA}
       (ORD, PVD) 849 and (BOS, ORD) 867 are skipped
T[6] = (BWI, MIA) 946  -> {SFO, LAX} {DFW, ORD, PVD, JFK, BWI, BOS, MIA}
       (JFK, MIA) 1090 and (DFW, MIA) 1121 are skipped
T[7] = (DFW, LAX) 1235 -> {SFO, LAX, DFW, ORD, PVD, JFK, BWI, BOS, MIA}

With n - 1 = 8 accepted edges all the vertices are in a single cluster, and the minimum spanning tree is complete.
Implementation:
Heap of edges: initialize in O(mlogm); each removeMin takes O(logm). Since G is simple (no self-loops or multiple edges), m < n², so O(logm) is O(logn).
Each cluster is kept as an unordered linked list, and every vertex stores a reference to its cluster:
- locating the clusters of u and v: O(1)
- merging two clusters: O(min{|C(u)|, |C(v)|}), by moving the smaller cluster's vertices into the larger one.
Each merge: a vertex is only moved into a cluster at least as large as its previous one, so the size of the cluster containing it at least doubles every time it moves.
t(v): number of times vertex v is moved to a new cluster; hence t(v) = O(logn).
Total time spent merging clusters: O(sum of t(v) over all v) = O(nlogn).

Overall: O((n+m)logn), which can be simplified to O(mlogn) since G is simple and connected.
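A Python sketch of this cluster-merging implementation (a heap of edges plus smaller-into-larger cluster merging; kruskal and the data layout are illustrative choices, not a prescribed interface):

    import heapq

    def kruskal(n, weighted_edges):
        # weighted_edges: list of (weight, u, v) with vertices numbered 0..n-1.
        heap = list(weighted_edges)
        heapq.heapify(heap)                          # O(m) build, O(log m) per removeMin
        leader = list(range(n))                      # leader[u] = id of u's cluster
        cluster = [[u] for u in range(n)]            # unordered list per cluster
        tree = []
        while heap and len(tree) < n - 1:
            w, u, v = heapq.heappop(heap)
            cu, cv = leader[u], leader[v]
            if cu == cv:
                continue                             # endpoints already connected: skip edge
            if len(cluster[cu]) < len(cluster[cv]):
                cu, cv = cv, cu                      # merge the smaller cluster into the larger
            for x in cluster[cv]:
                leader[x] = cu                       # each vertex moves O(log n) times in total
            cluster[cu].extend(cluster[cv])
            cluster[cv] = []
            tree.append((u, v, w))
        return tree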
Suppose you are a manager in the IT department for the government of a corrupt
dictator, who has a collection of computers that need to be connected together to
create a communication network for his spies. You are given a weighted graph,
G, such that each vertex in G is one of these computers and each edge in G is
a pair of computers that could be connected with a communication line. It is
your job to decide how to connect the computers. Suppose now that the CIA has
approached you and is willing to pay you various amounts of money for you to
choose some of these edges to belong to this network (presumably so that they
can spy on the dictator). Thus, for you, the weight of each edge in G is the
amount of money, in U.S. dollars, that the CIA will pay you for using that edge
in the communication network. Describe an efficient algorithm, therefore, for
finding a maximum spanning tree in G, which would maximize the money you
can get from the CIA for connecting the dictator’s computers in a spanning tree.
Rings (the name goes back to 19th-century German algebra, where "Ring" was used in the sense of "association").

In algebra, ring theory is the study of rings: algebraic structures in which addition and multiplication are defined and have properties similar to those operations defined for the integers.

A ring is denoted by (R, +, ∘, 0, 1).

1. Rings: Algebraic structures
Denoted by (R, +, ∘, 0, 1).
1. Definition: A set R equipped with 2 binary operations (+ and ∘) satisfying
   1. R is an abelian group under addition
      1. (a+b)+c = a+(b+c) for all a,b,c in R (associative)
      2. a+b = b+a for all a,b in R (commutative)
      3. a+0 = a for all a in R (0 is the additive identity)
      4. a+(-a) = 0 (-a is the additive inverse of a)
   2. R is a monoid under multiplication
      1. (a ∘ b) ∘ c = a ∘ (b ∘ c) for all a,b,c in R (associative)
      2. There is an element 1 in R s.t. a ∘ 1 = a and 1 ∘ a = a (multiplicative identity)
   3. Multiplication is distributive w.r.t. addition
      1. a ∘ (b+c) = (a ∘ b) + (a ∘ c) (left distributivity)
      2. (b+c) ∘ a = (b ∘ a) + (c ∘ a) (right distributivity)
2. If ab = ba for all a, b -> commutative ring.

Notes:
• The multiplicative identity 1 is called the "unity" and is unique.
• Multiplicative inverses are not required, and commutativity of ∘ is not required (when it holds, R is a commutative ring).
• The distributive laws are the only axioms that connect + and ∘.
• (C, +, x, 0, 1): C is the set of complex numbers, + and x are complex-number addition and multiplication (a commutative ring). The same holds for Z, R, Q.

• ({0,1,2,…,m-1}, +m, *m, 0, 1) where +m and *m are addition and multiplication modulo m (a commutative ring).

• (Mn, +, *, 0n, In) where Mn is the set of all n x n matrices with elements from a ring, + and * are matrix addition and multiplication, 0n is the zero matrix and In is the identity matrix.

• The set M2(R) of all 2x2 matrices over R is a ring using matrix addition and multiplication.
R denotes a commutative ring.
Mmn(R): the set of all m x n matrices with entries from R.

For A, B and C in Mmn(R):
A+B = B+A and A + (B+C) = (A+B) + C

Zero matrix of size m x n: 0 or 0mn.
For all A in Mmn(R):
A + (-A) = 0
A + 0 = A

-> The additive arithmetic in Mmn(R) is entirely analogous to numerical arithmetic.

Exercise: If A, B are in M23(R), find X in M23(R) such that X + A = B. (Adding -A to both sides gives X = B + (-A) = B - A.)
The dot product of a row matrix and a column matrix:
(i,j) entry of the product of these two matrices: go across the ith row of A and down jth column of B.
Mn(R) = Mnn(R) for any n >=2

Let A, B, C be matrices in Mn(R) Then

1. A + B = B + A
2. (A+B) +C = A+ (B+C)
3. A + On = A
4. A + (-A) = O
5. (AB)C = A(BC)
6. AI = A = IA
7. A(B+C) = AB + AC and (B+C)A = BA + CA

8. Set Mn(R) over R is a ring using matrix addition and multiplication.


Matrix multiplication takes O(n3) time (measured in ring operations + and x). Why?
For each (i,j) entry: go across the i-th row of A and down the j-th column of B, i.e. n multiplications and n-1 additions.
There are n entries per column of the product and n columns in B -> n2 entries in total -> O(n3) ring operations.
Strassen's idea for 2x2 matrices: instead of 8 multiplications and 4 additions as above, only 7 multiplications and 18 additions/subtractions are needed:

m1 = (a12 - a22) (b21 + b22)
m2 = (a11 + a22) (b11 + b22)
m3 = (a11 - a21) (b11 + b12)
m4 = (a11 + a12) b22
m5 = a11 (b12 - b22)
m6 = a22 (b21 - b11)
m7 = (a21 + a22) b11

c11 = m1 + m2 - m4 + m6
c12 = m4 + m5
c21 = m6 + m7
c22 = m2 - m3 + m5 - m7

Main idea: divide-and-conquer on n/2 x n/2 sub-matrices.
T(n) = 7 T(n/2) + 18 (n/2)2  for n >= 2,  T(1) = 1
(T(n) counts ring operations.) By the master theorem, T(n) is θ(n^(log2 7)) ≈ θ(n^2.81).
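A one-level Python sketch of these identities on 2x2 matrices (entries can come from any commutative ring that supports +, - and *; strassen_2x2 is an illustrative name):

    def strassen_2x2(A, B):
        (a11, a12), (a21, a22) = A
        (b11, b12), (b21, b22) = B
        m1 = (a12 - a22) * (b21 + b22)      # 7 multiplications in total
        m2 = (a11 + a22) * (b11 + b22)
        m3 = (a11 - a21) * (b11 + b12)
        m4 = (a11 + a12) * b22
        m5 = a11 * (b12 - b22)
        m6 = a22 * (b21 - b11)
        m7 = (a21 + a22) * b11
        c11 = m1 + m2 - m4 + m6
        c12 = m4 + m5
        c21 = m6 + m7
        c22 = m2 - m3 + m5 - m7
        return [[c11, c12], [c21, c22]]

    # Sanity check against the ordinary definition:
    # strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]

Applying the same identities recursively to n/2 x n/2 blocks gives the recurrence T(n) = 7T(n/2) + 18(n/2)2 above.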
Homomorphisms, isomorphisms, decomposition, subrings, primes, …. (Abstract Algebra)

Introduction to homomorphisms:
R: a set with the structure of an additive abelian group and a multiplicative monoid, together with the distributive laws.
Structure-preserving mappings: θ: R → S (S is another ring).

θ is a ring homomorphism if, for all r and r1 in R:
1. θ(r + r1) = θ(r) + θ(r1) (preserves addition)
2. θ(r r1) = θ(r) θ(r1) (preserves multiplication)
3. θ(1R) = 1S (preserves the unity)

Theorem. A ring homomorphism θ preserves zero, negatives, Z-multiplication and powers.
Example: r²su⁵ - 3su - 2r + 2 is an expression in elements r, s, u of R.

θ: R → S. If we write θ(x) = x̄ for every x ∈ R, then
θ(r²su⁵ - 3su - 2r + 2) = r̄²s̄ū⁵ - 3s̄ū - 2r̄ + 2,
since θ preserves sums, products, Z-multiples and powers.
If θ: R -> S is a ring homomorphism, show that the induced map θ2: M2(R) -> M2(S), obtained by applying θ to each entry of a 2x2 matrix, is also a ring homomorphism.

Your task: verify the preservation of addition and of the unity.
For the preservation of multiplication: we write out the entries of a product in M2(R) and apply θ term by term, using the fact that θ preserves sums and products in R.

[Figure: a weighted digraph on vertices 1-4 with edge weights 10, 20, 3, 5, 2.]

Its weight/adjacency matrix (inf = no edge):
Vertex   1     2     3     4
1        0     10    3     20
2        inf   0     inf   5
3        inf   2     0     inf
4        inf   inf   inf   0
function giveMeAName(a, b)
    while b ≠ 0
        t := b
        b := a mod b      (Euclidean division, i.e. division with remainder)
        a := t
    return a

Trace for a = 27, b = 4:
Step k   a    b    t      Equation           Quotient and remainder
0        27   4    4      27 = q0·4 + r0     q0 = 6 and r0 = 3
1        4    3    3      4 = q1·3 + r1      q1 = 1 and r1 = 1
2        3    1    1      3 = q2·1 + r2      q2 = 3 and r2 = 0
3        1    0    (end)                     returns 1

Trace for a = 27 = 3x3x3, b = 18 = 3x3x2:
Step k   a    b    t      Equation                            Quotient and remainder
0        27   18   18     27 mod 18 = 9 -> 27 = q0·18 + r0    q0 = 1 and r0 = 9
1        18   9    9      18 = q1·9 + r1                      q1 = 2 and r1 = 0
2        9    0    (end)                                      returns 9
function gcd(a, b)
    # Computes the greatest common divisor of 2 integers (numbers)
    # Known as the Euclidean algorithm (c. 300 BC)
    while b ≠ 0
        t := b
        b := a mod b
        a := t
    return a
gcd(a,b) = 1 -> a, b are said to be coprime (or relatively prime)
There are certain things whose number is unknown.
If we count them by threes, we have two left over;
by fives, we have three left over;
and by sevens, two are left over.
How many things are there?

Chinese mathematician Sun-tzu, 3rd century A.D.

x = 2 (mod 3)
x = 3 (mod 5)
x = 2 (mod 7)
x = ?

Suppose y is a solution (x = y).
Then y + 3x5x7 = y + 105 is also a solution.
=> So we only need to look for solutions mod 105.
By brute force, we find the solution x = 23 (mod 105).


Given: Let p0, p1, …, p(k-1) be k pairwise relatively prime numbers. Denote the residues of u w.r.t. p0, p1, …, p(k-1) by (u0, u1, …, u(k-1)).
Then
u = (c0 d0 u0 + c1 d1 u1 + … + c(k-1) d(k-1) u(k-1)) mod p
where
p = p0 * p1 * … * p(k-1), ci = p / pi, and di = ci^(-1) mod pi.

We write: u ↔ (u0, u1, …, u(k-1))


x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7)
p = p0·p1·p2 = 3x5x7 = 105

i   pi   ci = p/pi   di = ci^(-1) mod pi                           ci·di·ui
0   3    35          35·d0 ≡ 1 (mod 3) -> 2·d0 ≡ 1 (mod 3) -> d0 = 2    35x2x2 = 140
1   5    21          21·d1 ≡ 1 (mod 5) -> d1 = 1                         21x1x3 = 63
2   7    15          15·d2 ≡ 1 (mod 7) -> d2 = 1                         15x1x2 = 30

u = (140 + 63 + 30) mod 105 = 233 mod 105 = 23

Complexity: O(k) integer multiplications.
p0 = 2, p1 = 3, p2 = 5 and p3 = 7 and (u0, u1, u2, u3) = (1, 2, 4, 3). What is u such that u ↔ (1, 2, 4, 3)?
p = p0·p1·p2·p3 = 2x3x5x7 = 210

i   pi   ci = p/pi       di = ci^(-1) mod pi                               ci·di·ui
0   2    15x7 = 105      105·d0 ≡ 1 (mod 2) -> d0 = 1                      105x1x1 = 105
1   3    70              70·d1 ≡ 1 (mod 3) -> d1 = 1                       70x1x2 = 140
2   5    42              42·d2 ≡ 1 (mod 5) -> 2·d2 ≡ 1 (mod 5) -> d2 = 3   42x3x4 = 504
3   7    30              30·d3 ≡ 1 (mod 7) -> 2·d3 ≡ 1 (mod 7) -> d3 = 4   30x4x3 = 360

u = (105 + 140 + 504 + 360) mod 210 = 1109 mod 210 = 59
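The same table computation as a small Python sketch (crt is an illustrative name; pow(c, -1, p) computes a modular inverse and requires Python 3.8+):

    def crt(residues, moduli):
        # moduli must be pairwise relatively prime.
        p = 1
        for pi in moduli:
            p *= pi
        total = 0
        for ui, pi in zip(residues, moduli):
            ci = p // pi                  # ci = p / pi
            di = pow(ci, -1, pi)          # di = ci^-1 mod pi
            total += ci * di * ui
        return total % p

    # crt([2, 3, 2], [3, 5, 7])       -> 23
    # crt([1, 2, 4, 3], [2, 3, 5, 7]) -> 59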
1996: Search engine BackRub: Studying the importance of links with his own homepage on Stanford website.
-> Later renamed “Page-Rank” after Larry Page. Two big ideas:
1. (A rough measure of) a site authority: The # of times it was linked to others
2. Automate and sanctify the search process and cope with the ever-increasing number of sites.

Aug 96: google.Stanford.edu

Talked w/ David Filo -> Formed their own company.

"Others assumed large servers were the fastest way to handle massive amounts of data.
Google found networked PCs to be faster,“ – Google Info Web page.

Sep 99, beta label came off Google.com


n: total number of pages in the corpus (web)
Ti: a page pointing to page A
C(Ti): the number of outgoing links of Ti
d: damping factor, 0 < d < 1

PR(A) = (1 - d)/n + d · ( PR(T1)/C(T1) + … + PR(Tm)/C(Tm) )

PR(A) can be interpreted as the probability of visiting web-page A during the random walk obtained through the application of the following 2 rules we learnt.
Understand how, using a simple iterative algorithm, PR(A) can be computed in a few iterations.

Adjacency List      Adjacency Matrix A      Adjacency Matrix Transposed AT
0: 1 2              0 1 1                   0 0 0
1: 2                0 0 1                   1 0 0
2: -                0 0 0                   1 1 0
(three pages 0, 1, 2 with links 0->1, 0->2, 1->2)

For every link (i,j) in E that points to document dj, dj's authority score a(j) sums the hub scores of the pages pointing to it; for every link (j,k) in E to documents dk pointed to by dj, dj's hub score h(j) sums the authority scores of the pages it points to:

a(0) = 0               h(0) = a(1) + a(2)
a(1) = h(0)            h(1) = a(2)
a(2) = h(0) + h(1)     h(2) = 0

Initialize h: 1 1 1, a: 1 1 1
Step 1: a: 0 1 2
        h: 3 2 0

Let's rewrite the formula to track iterations (superscripts in parentheses now denote iteration numbers, and a, h are vectors):
a(i) = AT h(i-1)
h(i) = A a(i)

Power method:
a(1) = AT h(0)
a(2) = AT h(1) = AT A a(1)
a(i) = AT h(i-1) = AT A a(i-1) = (AT A)^(i-1) a(1)
Doing the same for h(i), we have: h(i) = A a(i) = A AT h(i-1) = (A AT)^(i-1) h(1)

Avoid a, h growing large by scaling to ensure their entries always stay within [0, 1].
What does convergence mean?
PageRank(G,V,E) //The code performs a SYNCHRONOUS rank update
1. for all vertices u in V /* Initialization step */
2.     Src[u] = 1/n;
3. small = something-small;
4. while (convergence-distance > small) {
5.     for all v in V
6.         D[v] = 0;
7.     for (i = 0; i < |V|; i++) {
8.         u = V[i]; Read-Adjacency-List(u, m, k1, k2, ..., km); /* k1,k2,…,km: endpoints of outgoing edges */
9.         for (j = 1; j <= m; j++) /* m: out-degree of vertex u */
10.            D[kj] = D[kj] + Src[u]/m
11.    }
12.    for all v in V
13.        D[v] = d * D[v] + (1-d)/n
14.    convergence-distance = ||Src - D|| /* Euclidean distance */
15.    Src = D;
16. }
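A direct Python transcription of this pseudocode (a sketch; like the pseudocode, it simply drops the rank mass of pages with no outgoing links):

    def pagerank(adj_list, d=0.85, small=1e-10):
        # adj_list: {u: [out-neighbours of u]}; synchronous rank update.
        n = len(adj_list)
        src = {u: 1.0 / n for u in adj_list}             # initialization step
        while True:
            dst = {u: 0.0 for u in adj_list}
            for u, out in adj_list.items():
                if out:
                    share = src[u] / len(out)             # Src[u]/m spread over out-links
                    for k in out:
                        dst[k] += share
            for u in adj_list:
                dst[u] = d * dst[u] + (1 - d) / n         # damping / random-jump term
            distance = sum((src[u] - dst[u]) ** 2 for u in adj_list) ** 0.5
            src = dst
            if distance <= small:
                return src

    # pagerank({0: [1, 2], 1: [2], 2: []})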
Intricacy: Google's Technology page explains how the process gets more complicated:

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual
page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But Google looks at
considerably more than the sheer volume of votes, or links the page receives. For example, it also analyzes the page that
casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages
"important." Using these and other factors, Google provides its views on the pages' relative importance. And that's still only part
of the protocol.

Performance: It's almost impossible to fathom, but PageRank considers more than 500 variables and 3 billion terms and still
manages to deliver results in fractions of a second. Yet there also is a certain simplicity to the search process.
Searching Problems
Dictionary ADT
• Stores associations between keys and items (also
called associative map)
• Operations :
(a) insertElement(k,e) – insert association between a
key and item; replace if necessary
(b) removeElement(k) – remove association between
key k and its element if it exists, else return
NO_SUCH_KEY error
(c) findElement(k) – find the element associated with
key k if it exists, else return EMPTY_ELEMENT
• Keys are from a set which may or may not be totally
ordered but keys checked for “equality”.
Dictionary implementations
• Store keys as sequence (unordered) with new elements
inserted at end.
-insertElement(k,e) – O(1) time worst-case
- removeElement(k) – O(n) time worst-case
- findElement(k) – O(n) time worst-case
• Hash tables
-- Define a function f : S → {0,1,…N-1} where S is the set of keys
and N is the table size.
--- Use this hash function to identify the index in the table
where the key is stored. (similar to direct access in an array)
--- Since it is not 1-1 function, more than one key may map into
the same index in the table causing collision.
-- Ideally f should distribute keys evenly in the table.
Hash tables
• The hash function f is composed of two functions
  f1 : S → Z and f2 : Z → {0,1,…,N-1}, where Z is the set of integers.
  f1 is called the "hashcode" function (defined in Java's Object) and f2 is called the "compression map".
• The hashcode function should be representative of all fields of an object and, for a primitive type (e.g. integer), representative of all bits in the integer. It is defined independently of the hash table size.
• Polynomial hash codes:
  x_{k-1} a^{k-1} + x_{k-2} a^{k-2} + …… + x_1 a + x_0
  where x_{k-1}, x_{k-2}, …. are integer representations of components of the key object and a is a constant not equal to 1
  -- 33, 37, 39, 41 are found to be good choices of a for character strings.
Example hashcode in Java
public class Key implements Comparable<Key> {
private final String firstName, lastName;
public Key(String fName, String lName) { ..}

@Override
public int hashCode() {
int hash = 17 + firstName.hashCode();
hash = hash * 31 + lastName.hashCode();
return hash;
}
@Override
public boolean equals(Object obj) {….}
• a.equals(b) => a.hashCode() == b.hashCode() but not vice
versa.
Hash tables (contd.)
Compression Maps
• Division method:
h(k) = |k| mod N , k is hash code and choose N as a
prime number so as to distribute hash values evenly
among table indices.
• MAD (Multiply, Add and Divide) method:
  h(k) = |ak + b| mod N where a and b are randomly chosen integers s.t. a, b ≥ 0 and a mod N ≠ 0
  -- provides a close-to-"ideal" hash function where Prob(two keys hash into the same value) ≈ 1/N
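A tiny Python illustration of the two compression maps (the table size N and the ranges for a, b are arbitrary choices made for this sketch):

    import random

    N = 109                           # table size; a prime spreads hash values more evenly

    def division_compress(k):
        return abs(k) % N             # division method: h(k) = |k| mod N

    # MAD method: h(k) = |a*k + b| mod N with random a, b >= 0 and a mod N != 0
    a = random.randrange(1, 10**9)
    while a % N == 0:
        a = random.randrange(1, 10**9)
    b = random.randrange(0, 10**9)

    def mad_compress(k):
        return abs(a * k + b) % N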
Collision resolution
• Each location in hash table is called “bucket”.
• (a) Chaining, (b) Open addressing
• Chaining – keep colliding keys in a list or sequence
called “chain” in the same bucket
-- in findElement(k), after hashing need to search for
keys in the chain
--- in insertElement(k,e) and removeElement(k), need
to hash into bucket and then insert/remove element
in/from chain
• Load factor – n/N (n number of items in hash table)
preferably n/N < 1
• Expected time complexity – O(⌈n/N⌉) (O(1) if n is
O(N))
Open Addressing
• Open Addressing – no chaining of keys that have same
hash value
• Probes other locations for the key
Suppose i = h(k). Probe the sequence of locations (i + f(j)) mod N, j = 0, 1, 2, …. until the key is found (for search/remove) or until an empty slot is found (for insert)
-- if f(j) = j for all j, it is called "linear probing"
-- if f(j) = j², it is called "quadratic probing"
--- if f(j) = j · g(k) where g(k) is another hashing function, it is called "double hashing"
• Faster than chaining for search/insert but removal is
complicated as there should not be any “holes” in
sequence for a particular h(k)
• Tends to introduce clusters of keys in the table for linear
and quadratic probing.
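A small Python sketch of open addressing with linear probing (class and method names are illustrative; removal uses a "tombstone" marker so that probe sequences for other keys are not broken, which is the complication mentioned above):

    class LinearProbingMap:
        _EMPTY, _DELETED = object(), object()

        def __init__(self, N=17):
            self.N = N
            self.slots = [self._EMPTY] * N

        def _probe(self, k):
            i = hash(k) % self.N
            for j in range(self.N):
                yield (i + j) % self.N            # linear probing: f(j) = j

        def insert(self, k, e):
            first_free = None
            for idx in self._probe(k):
                cell = self.slots[idx]
                if cell is self._EMPTY:           # key cannot appear beyond this point
                    if first_free is None:
                        first_free = idx
                    break
                if cell is self._DELETED:
                    if first_free is None:
                        first_free = idx          # remember a reusable tombstone slot
                elif cell[0] == k:
                    self.slots[idx] = (k, e)      # replace existing association
                    return
            if first_free is None:
                raise RuntimeError("table full")
            self.slots[first_free] = (k, e)

        def find(self, k):
            for idx in self._probe(k):
                cell = self.slots[idx]
                if cell is self._EMPTY:
                    return None                   # a truly empty slot ends the search
                if cell is not self._DELETED and cell[0] == k:
                    return cell[1]
            return None

        def remove(self, k):
            for idx in self._probe(k):
                cell = self.slots[idx]
                if cell is self._EMPTY:
                    return
                if cell is not self._DELETED and cell[0] == k:
                    self.slots[idx] = self._DELETED   # tombstone instead of a "hole"
                    return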
Universal Hashing
• A family of hash functions that minimizes expected number of
collisions
• Stated formally, let H be a subset of functions from [0,M-1] to
[0,N-1] satisfying the property that for any randomly chosen
function h from H and for any two integers j, k in [0,M-1],
Pr(h(j) = h(k)) ≤ 1/N
• Implies that E[# of collisions between j and "n" other integers from
[0,M-1]] ≤ n/N
• Set of hash functions of form (ak+ b mod p) mod N
where 0 < a < p, 0 ≤ b < p where p is a prime number with M ≤ p
< 2M ( M is number of hash codes)
can be shown to be “universal”.
• All Dictionary ADT operations can be done in expected time
O(⌈n/N⌉) using a randomly chosen hash function from this set
and chaining.
FFT computations
CS 610 Spring 2019
Instructor : Ravi Varadarajan

April 25, 2019

Remainder theorem: The value of a polynomial p(x) at a point, i.e. p(a)


is equal to the remainder when p(x) is divided by x − a.
Note DFT([a0 , a1 , a2 , ...an−1 ]) = [p(ω 0 ), p(ω 1 ), ...p(ω n−1 )] where ω is a principal
n-th root of unity in a commutative ring and p(x) = a0 +a1 x+a2 x2 +....an−1 xn−1 .

By using remainder theorem, to compute DFT, we just need to compute


remainders of p(x) when divided by (x − w0 ), (x − w1 ), (x − w2 ), ...(x − wn−1 )
respectively. We can do this cleverly by using the following fact :
Suppose r(x) is the remainder polynomial when p(x) is divided by the product
polynomial d1 (x)d2 (x). Then the remainder polynomial r1 (x) when p(x) is
divided by d1 (x) is the same as the remainder polynomial when r(x) is divided by
d1 (x).
As an example we use the following tree of product polynomials for successive
divisions to get remainder polynomials. In the tree, the parent polynomial is
the product of children polynomials. For example, (x − ω 0 )(x − ω 4 ) = (x2 − ω 0 )
and (x2 − ω 0 )(x2 − ω 4 ) = (x4 − ω 0 ).
Let us illustrate for the case of n = 8. Let r02 (x), r42 (x) be the remainder
polynomials when p(x) = a0 + a1 x + a2 x2 + ....a7 x7 is divided by (x4 − ω 0 ) and
(x4 − ω 4 ) respectively. It can be shown that
r02 (x) = (a0 + ω 0 a4 ) + (a1 + ω 0 a5 )x + (a2 + ω 0 a6 )x2 + (a3 + ω 0 a7 )x3
and
r42 (x) = (a0 + ω 4 a4 ) + (a1 + ω 4 a5 )x + (a2 + ω 4 a6 )x2 + (a3 + ω 4 a7 )x3 .
If we order coefficients of these polynomials, we get (a0 +ω 0 a4 ), (a1 +ω 0 a5 ), (a2 +
ω 0 a6 ), (a3 + ω0 a7 ), (a0 + ω 4 a4 ), (a1 + ω 4 a5 ), (a2 + ω 4 a6 ), (a3 + ω4 a7 ). This is ex-
actly what we get after the first stage of computation in the butterfly network
by pairing elements in positions which differ in the k = 3-th bit (i.e. most
significant position. The powers of ω to be used in each position is given by left
shifting by k − 1 = 2 bits after bit reversal of bits representing that position.
For example for 6 = (110)2 , we use (100)2 = 4. Let these values be denoted by
b0 , b1 , ....b7 .
Now if r01 (x), r21 (x) are the remainder polynomials when r02 (x) is divided by
(x2 − ω 0 ) and (x2 − ω 4 ) respectively, then we get
r01 (x) = (b0 + ω 0 b2 ) + (b1 + ω 0 b3 )x
and
r21 (x) = (b0 + ω 4 b2 ) + (b1 + ω 4 b3 )x
Also if r41 (x), r61 (x) are the remainder polynomials when r42 (x) is divided by
(x2 − ω 2 ) and (x2 − ω 6 ) respectively, then we get
r41 (x) = (b4 + ω 2 b6 ) + (b5 + ω 2 b7 )x
and
r61 (x) = (b4 + ω 6 b6 ) + (b5 + ω 6 b7 )x
If we order coefficients of r01 (x), r21 (x), r41 (x), r61 (x), we get (b0 + ω 0 b2 ), (b1 +
ω 0 b3 ), (b0 + ω 4 b2 ), (b1 + ω 4 b3 ), (b4 + ω 2 b6 ), (b5 + ω 2 b7 ), (b4 + ω 6 b6 ), (b5 + ω 6 b7 ).
This is precisely what we get after the second stage of computation in the but-
terfly network by pairing elements in positions which differ in the k − 1 = 2-th
bit. The powers of ω to be used in each position is given by left shifting by
k − 2 = 1 bit after bit reversal of bits representing that position. For example
for 7 = (111)2 , we use (110)2 = 6.
Finally we get remainders r00 , r40 when r01 is divided by (x − w0 ) and (x − w4 ),
remainders r20 , r60 when r21 is divided by (x − w2 ) and (x − w6 ),
remainders r10 , r50 when r41 is divided by (x − w1 ) and (x − w5 ), and
remainders r30 , r70 when r61 is divided by (x − w3 ) and (x − w7 ).
These remainders are exactly the values of DFT in positions 0, 4, 2, 6, 1, 5, 3, 7
permuted in bit reversal order.

Example : Compute DFT of [2, 4, 3, 4, 2, 1, 5, 7] in commutative ring


[0, 1, 2, ...16] where + and * operations are modulo 17. Note that 16 ≡ −1, 15 ≡
−2, ....1 ≡ −16
First let us determine ω, 8-th principal root of unity in this ring, i.e. ω 8 ≡ 1
mod 17. Hence ω = 2. Thus ω 0 = 1, ω 1 = 2, ω 2 = 4, ω 3 = 8, ω 4 = 16 ≡ −1
mod 17, ω 5 = −2, ω 6 = −4, ω 7 = −8
In the butterfly network, there are log n stages, where at each stage we pair
elements to do computations. For a pair with coefficients (bi , bj ), the new
coefficients to be used for next stage at positions i and j are computed as
ci = bi + ω pi bj and cj = bi + ω pj bj where the powers of ω, namely pi and pj are
obtained by reversing the bits of i and j and left shifting k − m − 1 bits for
stage 0 ≤ m < log n computation.

In the diagram below we show 3 stages, m = 0, 1, 2. The powers of ω to be


used in these positions are indicated below the elements.
In the first stage, i.e, when m = 0, computations are done by pairing elements
in positions that differ in the most significant bit, i.e. 0 with 4, 1 with 5, 2 with
6 and 3 with 7. For example, new values for position 0 = a0 + ω 0 a4 = 2 + 2 = 4
and for position 4 = a0 + ω 4 a4 = 2 − 2 = 0.
In the second stage, i.e, when m = 1, computations are done by pairing
elements in positions that differ in the 2nd bit, i.e. 0 with 2, 1 with 3, 4 with
6 and 5 with 7. For example, new values for position 0 = 4 + ω 0 ∗ 8 = 12
and for position 2 = 4 + ω 4 ∗ 8 = 4 − 8 = −4. New values for position 4 =
0 + ω 2 ∗ (−2) = −8 and for position 6 = 0 + ω 6 ∗ (−2) = 8.

In the third stage, i.e, when m = 2, computations are done by pairing ele-
ments in positions that differ in the 3rd bit, i.e. 0 with 1, 2 with 3, 4 with 5 and
6 with 7. For example, new values for position 0 = 12 + ω 0 ∗ (−1) = 11 and for
position 1 = 12+ω 4 ∗(−1) = 13 . New values for position 6 = 8+ω 3 ∗(−2) = −8
and for position 7 = 8 + ω 7 ∗ (−2) = 7.

The DFT vector is given by [11, 8, 6, 9, 13, 10, 3, 7]. Note that elements in
positions 1 and 4 were switched, as were those in positions 3 and 6.

Let's compute the inverse DFT of the DFT vector to see if we get back the
original vector. For the inverse FFT, ω1 = ω^(-1) = 2^(-1) ≡ 9 ≡ −8 (mod 17).
The powers of this root are: ω1^0 = 1, ω1^1 = 9, ω1^2 = −4, ω1^3 = −2, ω1^4 = −1,
ω1^5 = 8, ω1^6 = 4, ω1^7 = 2. We do the exact same type of computations as in the
FFT, but at the end we need to multiply the values by n^(-1) = 8^(-1) = 15 ≡ −2 (mod 17).
In the diagram below, at the end we get back the original vector for which we
computed the DFT before.
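A recursive Python sketch of the same computation (it assumes, as in the example, that omega**(n//2) ≡ −1 (mod m), which holds for ω = 2, m = 17, n = 8; pow(x, -1, m) needs Python 3.8+; the butterfly network above computes the same values iteratively, in bit-reversed order):

    def fft(a, omega, m):
        # DFT of a over the ring Z_m; omega is a principal len(a)-th root of unity.
        n = len(a)
        if n == 1:
            return a[:]
        even = fft(a[0::2], omega * omega % m, m)
        odd = fft(a[1::2], omega * omega % m, m)
        out = [0] * n
        w = 1
        for k in range(n // 2):
            t = w * odd[k] % m
            out[k] = (even[k] + t) % m
            out[k + n // 2] = (even[k] - t) % m     # uses omega^(n/2) = -1
            w = w * omega % m
        return out

    def inverse_fft(y, omega, m):
        n = len(y)
        n_inv = pow(n, -1, m)                       # e.g. 8^-1 = 15 = -2 mod 17
        return [v * n_inv % m for v in fft(y, pow(omega, -1, m), m)]

    # fft([2, 4, 3, 4, 2, 1, 5, 7], 2, 17)            -> [11, 8, 6, 9, 13, 10, 3, 7]
    # inverse_fft([11, 8, 6, 9, 13, 10, 3, 7], 2, 17) -> [2, 4, 3, 4, 2, 1, 5, 7]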
Graph algorithms
Graphs
• A directed graph G = (V,E), V is set of vertices and E is
set of edges, i.e. subset of V x V
• A digraph defines a binary relation on V.
• Useful representation in many applications (e.g. web
page links, transportation routes, semantic, social
networks)
• InEdges(v) = { (u,v)| (u,v) ϵ E } and outEdges(v) = {
(v,u) | (v,u) ϵ E }
• indeg(v) = |InEdges(v)| outDeg(v) = |outEdges(v)|
• Σ_{v∈V} indeg(v) = Σ_{v∈V} outdeg(v) = |E|
• |E| ≤ n² where n is the number of vertices, and |E| ≤ n(n-1) if there are no self-cycles.
Undirected graphs
• In undirected graph, no edge orientation.
(u,v) ϵ E → (v,u) ϵ E
• Defines a symmetric relation on V. Not
reflexive i.e. (u,u) not in E.
• incidentedges(v) = { (u,v)| (u,v) ϵ E or (v,u) ϵ E}
• deg(v) = |incidentedges(v)|
• ∑_{v∈V} deg(v) = 2 |E|
• |E| ≤ n(n-1)/2
Graph data Structures
• Adjacency matrix A – n x n boolean matrix where A(i,j) =
1 iff (v_i, v_j) ϵ E where V = {v_1, v_2, …, v_n}
- Useful for matrix operations to solve all-pair problems
• Incidence matrix B – n x m matrix where B(i,j) = 1 if e_j =
(v_i, v_k) for some k and B(i,j) = -1 if e_j = (v_k, v_i) for some
k. We assume V = {v_1, v_2, …, v_n} and E = {e_1, e_2, …, e_m} –
Useful for analysis of some cycle problems
• Adjacency List – Array L[0..n-1] of lists where L[i] contains
the list of indices of vertices v_j such that (v_i, v_j) ϵ E
Space is O(n+m) as opposed to O(n²) for the adjacency
matrix. Useful for many efficient graph algorithms.
Graph representations (example)
[Figure: digraph with vertices v_1..v_4 and edges e_1 = (v_1,v_2), e_2 = (v_2,v_4),
 e_3 = (v_3,v_1), e_4 = (v_4,v_1), e_5 = (v_2,v_3)]

Adjacency matrix A (rows/columns v_1..v_4):    Incidence matrix B (columns e_1..e_5):
    0 1 0 0                                      1  0 -1 -1  0
    0 0 1 1                                     -1  1  0  0  1
    1 0 0 0                                      0  0  1  0 -1
    1 0 0 0                                      0 -1  0  1  0

Adj lists:
L(v_1) : v_2
L(v_2) : v_3, v_4
L(v_3) : v_1
L(v_4) : v_1
Graph Search
• Two approaches:
(a) Breadth-first search - From start node s, get vertices immediately reachable from s and
fan-out from each of those nodes and so on

BFS(G):
Input : G=(V,E) as adjacency list
Output: List of nodes visited in BFS order
q ← empty queue; bfsList ← {}
for i ← 0 to n-1
    visited[i] ← false
for i ← 0 to n-1
    if !visited[v_i]
        q.enqueue(v_i)
        visited[v_i] ← true
        while !q.empty()
            v ← q.dequeue()
            bfsList.append(v)
            for each w in G.adjList[v]
                if !visited[w]
                    visited[w] ← true
                    q.enqueue(w)
Return bfsList
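A minimal Python rendering of the same idea, assuming the graph is given as a list adj of adjacency lists indexed 0..n-1 (variable names here are illustrative, not from the notes):

from collections import deque

def bfs_order(adj):
    """Breadth-first traversal of every component; adj[v] lists the neighbors of v."""
    n = len(adj)
    visited = [False] * n
    order = []
    for s in range(n):                 # restart from every unvisited vertex
        if visited[s]:
            continue
        visited[s] = True
        q = deque([s])
        while q:
            v = q.popleft()
            order.append(v)
            for w in adj[v]:
                if not visited[w]:
                    visited[w] = True
                    q.append(w)
    return order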
Graph Search (contd.)
• BFS takes O(n) additional space and O(n+m) time primitive queue, list operations,
assignment and boolean operations
• Depth-first search is a recursive traversal exploring a path before back-tracking by a step
and trying a different path from that vertex.

DFSMain(G):
Input : G=(V,E) as adjacency list
Output: List of nodes visited in DFS order
dfsList ← {}
for i ← 0 to n-1
    visited[i] ← false
for i ← 0 to n-1
    if !visited[v_i]
        DFS(G, v_i, dfsList)
Return dfsList

DFS(G,v,dfsList):
visited[v] ← true
dfsList.append(v)
for each w in G.adjList[v]
    if !visited[w]
        DFS(G,w,dfsList)
• Takes O(L) additional stack space (L- longest path length from v) and O(n+m) time
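The recursive version translates directly; a short sketch under the same adjacency-list assumption (an explicit stack can be substituted when the recursion depth is a concern):

import sys

def dfs_order(adj):
    """Depth-first traversal of every component of the graph adj."""
    sys.setrecursionlimit(10000)       # recursion depth can reach the longest path length
    n = len(adj)
    visited = [False] * n
    order = []

    def dfs(v):
        visited[v] = True
        order.append(v)
        for w in adj[v]:
            if not visited[w]:
                dfs(w)

    for s in range(n):
        if not visited[s]:
            dfs(s)
    return order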
Graph search example
[Figure: undirected example graph on six vertices v_1, …, v_6 used for the BFS and DFS
traces below; not reproduced here.]
Graph search example - BFS
• Start vertex : v_1 ; Q = {v_1}; bfslist = []
• Remove v_1 from Q, bfslist=[v_1]; Q = {v_2, v_3}
• Remove v_2 , bfslist=[v_1, v_2]; Q = {v_3, v_4}
• Remove v_3 , bfslist=[v_1, v_2, v_3]; Q = {v_4, v_5}
• Remove v_4 , bfslist=[v_1, v_2, v_3, v_4]; Q = {v_5, v_6}
• Remove v_5 , bfslist=[v_1, v_2, v_3, v_4, v_5]; Q = {v_6}
• Remove v_6 , bfslist=[v_1, v_2, v_3, v_4, v_5, v_6]; Q = {}
Graph search example - DFS
• Start vertex : v_1 ; dfslist = {}
DFS(v_1)   dfslist = {v_1}
 DFS(v_2)   dfslist = {v_1, v_2}
  DFS(v_4)   dfslist = {v_1, v_2, v_4}
   DFS(v_6)   dfslist = {v_1, v_2, v_4, v_6}
    DFS(v_5)   dfslist = {v_1, v_2, v_4, v_6, v_5}
     DFS(v_3)   dfslist = {v_1, v_2, v_4, v_6, v_5, v_3}

Undirected graph connectivity
• A graph is connected if there is a path between every
pair of vertices in G.
• A connected component is a maximal subgraph of G
which is connected.
• Can use BFS and DFS to find connected components
of a G in O(n+m) time.
• A spanning tree of a connected graph G is a subgraph
of G which is connected and has minimum number
of edges. It has n-1 edges where n = |V|
• BFS and DFS provide spanning trees.
• A spanning forest is a collection of spanning trees of
a graph that contains all the vertices.
Cycles in graphs
• A cycle is a path that starts and ends with the same
vertex. A simple cycle is a cycle which itself does not
contain any cycles.
• A tree does not have any cycle; a cycle would contradict the
minimal (fewest-edges) connected subgraph property.
• A directed graph which does not have any directed cycles
is called a directed acyclic graph (DAG)
• Examples of DAG : task precedence graph, PERT charts
used in project management, block-chains used in
crypto-currency technologies
• Topological sorting in DAG : Assign unique number to
each of the vertices such that there is no edge from
higher indexed vertex to lower indexed vertex.
Topological sorting of DAG
• There exists at least one vertex (source) whose indegree
= 0; Symmetrically, at least one vertex (sink) whose
outdegree = 0. Why ?
• Idea:
(i) We can first number all source vertices.
(ii) If we remove source vertices from graphs along with
their outgoing edges, resulting graph is still a DAG and
should have at least one source vertex. These edges will be
bound to be directed from a lower numbered to higher
numbered vertex.
(iii) Repeat steps (i) and (ii) until we have an empty graph.

This can be done in O(n+m) time with additional space


complexity of O(n)
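The repeated source-removal idea above is often called Kahn's algorithm; a hedged Python sketch (assuming the digraph is an adjacency list adj of out-neighbors) is:

from collections import deque

def topological_order(adj):
    """Repeatedly remove source vertices (indegree 0); O(n+m) time, O(n) extra space.
       Returns a topological ordering, or None if the digraph has a cycle."""
    n = len(adj)
    indeg = [0] * n
    for v in range(n):
        for w in adj[v]:
            indeg[w] += 1
    sources = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while sources:
        v = sources.popleft()
        order.append(v)
        for w in adj[v]:              # "remove" v's outgoing edges
            indeg[w] -= 1
            if indeg[w] == 0:
                sources.append(w)
    return order if len(order) == n else None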
K-connectedness
• An undirected graph is k-connected if it requires the
removal of at least k vertices to make the graph
disconnected.
• For example, a computer network is more resistant to
the failure of gateways if its graph has higher connectivity.
• A tree is a 1-connected graph
• A bi-connected graph is 2-connected. This means
that for every pair of vertices (u,v), there is a cycle
that includes u and v
• A bi-connected component of a graph is a maximal
bi-connected subgraph of a graph. Note that bi-
connected components create a partition of edge
set.
Graph algorithms
Bi-connectivity algorithm
• An articulation point (separation vertex) of a graph G is a
vertex whose removal splits G into at least 2 components. A
bi-connected graph has no articulation points
• In a DFS of an undirected graph, every edge is either a tree
edge (of the DFS spanning tree) or a back edge from a descendant to
an ancestor.
• If there is a cycle in graph involving vertex v then there must
be a back-edge (x,y) from the depth-first sub tree of v where x
is a descendant of v and y is an ancestor of v.
• Define DF(u) to be the depth-first number of vertex u.
• Define LOW(v) = min( {DF(v)} ∪ {DF(y) | ∃ (x,y) ϵ E s.t. x is a
descendant of v and y is an ancestor of v in the DFS tree} )
Bi-connectivity alg. proof
Claim :
v is an articulation point or a separation vertex iff either
(a) v is the root and v has more than one child
or
(b) v is not the root and has a child w such that LOW(w) ≥ DF(v).
Proof:
(i) (if-part) If root v has two tree edges (v,x) and (v,y) then there is no
edge between subtrees T1 and T2 rooted at x and y.
→ every path from a vertex in T1 to a vertex in T2 has to have vertex v
→ v is an articulation point
(only-if) We use proof by contradiction. Suppose v is the root and an
articulation point but has only one child x.
→ Removal of v along with edge (v,x) does not disconnect the graph of
vertices reachable from v.
→This is a contradiction as v is an articulation point.
Hence v must have more than one child.
Bi-connectivity alg. Proof (contd.)
(ii)
(if-part) v is not the root and v has a child w such that LOW[w] ≥
DF(v).
Let’s prove by contradiction. Suppose v is not articulation point.
→ there is a path from w to a proper ancestor of v, say y that
bypasses v.
→ ∃ a back edge (x,y) where DF(x) ≥ DF(w) and DF(y) < DF(v)
→ LOW(w) ≤ DF(y) < DF(v) which contradicts defn. of LOW
Hence v is an articulation point
(only-if) Left as exercise.
Example
[Figure: DFS tree with DF values and LOW values (in squares); not reproduced here.]
Since a vertex v with DF(v) = 2 has a child w with LOW(w) = 6 ≥ DF(v), v is an articulation point.
Since a vertex v with DF(v) = 6 has a child w with LOW(w) = 6 ≥ DF(v), v is an articulation point.
Bi-connectivity algorithm
DFSB(G,v,count,isRoot,S):
Output: Articulation points of G and max DF value in subtree of v
visited[v] ← true; DF[v] ← count
LOW[v] ← DF[v]
nChildren ← 0
for each vertex w in G.adjList[v]
if !visited[w]
parent[w] ← v;
nChildren ← nChildren + 1
count ← DFSB(G,w,count+1,false,S)
if (isRoot and nChildren > 1) or (!isRoot and LOW[w] ≥ DF[v])
S.add(v) // v is an articulation point
LOW[v] ← min(LOW[v], LOW[w])
else if w != parent[v]
LOW[v] ← min(LOW[v], DF[w])
Return count
Bi-connectivity algorithm (contd.)
Biconnected(G):
Input : Graph G=(V,E) with adjacency list data structure
Output: Set of articulation points
for v ← 0 to n-1
visited[v] ← false
count ← 0; S ← {}
for v ← 0 to n-1
if !visited[v]
count ← DFSB(G,v,count+1,true,S)
Return S

Time complexity : O(n+m) and additional space complexity O(L)
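A compact Python sketch of the same DF/LOW computation for finding articulation points (names are mine, not the notes'; the graph is assumed to be an adjacency list adj):

def articulation_points(adj):
    """Return the set of articulation (separation) vertices of an undirected graph."""
    n = len(adj)
    DF = [0] * n          # depth-first numbers, 0 = unvisited
    LOW = [0] * n
    points = set()
    counter = [0]

    def dfs(v, parent):
        counter[0] += 1
        DF[v] = LOW[v] = counter[0]
        children = 0
        for w in adj[v]:
            if DF[w] == 0:                     # tree edge
                children += 1
                dfs(w, v)
                LOW[v] = min(LOW[v], LOW[w])
                if parent is not None and LOW[w] >= DF[v]:
                    points.add(v)
            elif w != parent:                  # back edge
                LOW[v] = min(LOW[v], DF[w])
        if parent is None and children > 1:    # rule for the DFS root
            points.add(v)

    for v in range(n):
        if DF[v] == 0:
            dfs(v, None)
    return points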


Graph algorithms
Digraph strong connectivity
• A digraph G=(V,E) is strongly connected if there is a
directed path between every pair of vertices (u,v)
• A strongly connected component is a maximal subgraph
that is strongly connected.
• Unlike bi-connected components, strongly connected
components do not partition the edges but only vertices.
• In a DFS tree of a directed graph, there are 3 types of
non-tree edges (u,v)
(a) back edge from a descendant to an ancestor (DF(v) < DF(u))
(b) forward edge from an ancestor to a descendant (DF(v) > DF(u))
(c) cross edge (to left in tree) where neither of u and v is a
descendant of another (DF(v) < DF(u))
Strong connectivity algorithm basis
• We use DFS and LOW computation similar to bi-
connectivity algorithm.
• Define
LOWLINK(v) = min{ DF(v) U {DF(y) |∃ (x,y) ϵ E s.t.
x is a descendant of v and
(a) either (x,y) is a back-edge or
(b) (x,y) is a cross edge and root of the strongly
connected component containing y is an ancestor of
v} }
• Claim: v is root of a strongly connected component
in DFS tree, iff LOWLINK(v) = DF(v)
Strong connectivity algorithm basis
Proof:
• If part : proof by contradiction. Suppose LOWLINK(v) = DF(v)
but v is not the root of a strongly connected component in
DFS tree
→ let r be a proper ancestor of v in DFS tree and be the root of
component containing v
→ ∃ a directed path from v to r in the graph and let w be first
vertex in this path such that w is a descendant of v and (w,x) is a
non-tree edge with DF(x) < DF(v)
Then there is a path from r to v and then v to r that contains w
and w must be in the same component as v and r
→ LOWLINK(v) ≤ DF(x) < DF(v), contradiction
• Only-if part: left as an exercise.
O(n+m) Strong connectivity alg.
DFSB(G,v,count):
Output: strongly connected components detected during DFS of v
visited[v] ← true; DF[v] ← count
LOWLINK[v] ← DF[v]
st.push(v); inStack[v] ← true
for each vertex w in G.adjList[v]
if !visited[w]
count ← DFSB(G,w,count+1)
LOWLINK[v] ← min(LOWLINK[v], LOWLINK[w])
else if DF[w] < DF[v] and inStack[w]
LOWLINK[v] ← min(LOWLINK[v], DF[w])
if LOWLINK[v] = DF[v]
print “beginning of a strongly connected component”
while !st.isEmpty()
w ← st.pop(); print w ; inStack[w] ← false
if w = v
break
print “end of a strongly connected component”
Return count
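For reference, a minimal recursive Python sketch of this stack-based LOWLINK computation (Tarjan's algorithm), under the same adjacency-list assumption as the earlier sketches:

def strongly_connected_components(adj):
    """Return a list of strongly connected components (each a list of vertices)."""
    n = len(adj)
    DF = [0] * n                      # depth-first numbers, 0 = unvisited
    LOWLINK = [0] * n
    on_stack = [False] * n
    stack, comps, counter = [], [], [0]

    def dfs(v):
        counter[0] += 1
        DF[v] = LOWLINK[v] = counter[0]
        stack.append(v); on_stack[v] = True
        for w in adj[v]:
            if DF[w] == 0:
                dfs(w)
                LOWLINK[v] = min(LOWLINK[v], LOWLINK[w])
            elif on_stack[w]:                    # back edge or usable cross edge
                LOWLINK[v] = min(LOWLINK[v], DF[w])
        if LOWLINK[v] == DF[v]:                  # v is the root of a component
            comp = []
            while True:
                w = stack.pop(); on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            comps.append(comp)

    for v in range(n):
        if DF[v] == 0:
            dfs(v)
    return comps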
Example
[Figure: 8-vertex digraph and its DFS tree, with DF values and LOWLINK values (in squares); not reproduced here.]
• After DFS(v_2) is completed, the component stack (not the recursion stack) holds v_1, v_2, v_3, v_4 with v_4 on top.
• The cross edge (v_5, v_3) is used in the LOWLINK computation because v_3 is still on the stack.
• After DFS(v_6) is completed, the stack holds v_1, v_2, ..., v_8; since LOWLINK(v_6) = 6 = DF(v_6),
  vertices are ejected from the stack down to and including v_6, forming one strongly connected component.
• After DFS(v_1) is completed, the stack holds v_1, ..., v_5; since LOWLINK(v_1) = 1 = DF(v_1),
  vertices are ejected down to and including v_1, forming another strongly connected component.
Minimum Edit Distance - DP example
CS 435 Spring 2021
Instructor : Ravi Varadarajan

February 11, 2021

We need to transform the string X = x1 x2 ....xm into the string Y = (y1 y2 ....yn ).
Let’s keep a pair of cursors one for X and another for Y which indicate the
transformation remaining to be done. For example when the cursor pair is
(i, j), it remains to transform xi ....xm into yj .....yn . This cursor pair defines a
state and we move from state to state as we make decisions to keep a character,
insert a character from Y, delete a character from X or replace a character in
X from Y in these cursor positions.
Let us define F (i, j) as the minimum number of operations required to
transform xi ....xm into yj .....yn , i.e. it is the optimization function for a subproblem
when we are in state (i, j). So what we require ultimately is F (1, 1). Let us
focus on the subproblem defined by the state (i, j).
(a) When xi = yj , then the optimum number of operations required to
transform xi ....xm into yj ....yn should be the same as the optimum number of
operations required to transform xi+1 ....xm into yj+1 ....yn ; in this case we move
to cursor pair state (i + 1, j + 1). So F (i, j) = F (i + 1, j + 1).
(b) When xi ≠ yj , we have 3 choices :
(i) insert yj before xi in which case we move to state (i, j + 1) as what remains
is to transform xi ....xm into yj+1 .....yn ; here the cost will be 1 unit for insert +
the minimum number of operations required to transform xi ....xm into yj+1 .....yn ,
i.e. 1 + F (i, j + 1),
(ii) delete xi in which case we move to state (i + 1, j) as what remains is to
transform xi+1 ....xm into yj .....yn ; cost here is 1 + F (i + 1, j),
(iii) replace xi by yj in which case we move to state (i + 1, j + 1); here cost is
1 + F (i + 1, j + 1). Obviously we like to take the choice that gives the minimum.
This means that F (i, j) = min(1 + F (i + 1, j), 1 + F (i, j + 1), 1 + F (i + 1, j + 1)).
This is the recursive formulation of DP. But we do not compute it recursively
but iteratively in an order which guarantees that when we compute F (i, j), we
have already computed F (i + 1, j), F (i, j + 1), F (i + 1, j + 1). What order
guarantees that ?
We start from the following boundary conditions first :
(a) For 1 ≤ i ≤ m, F (i, n + 1), i.e. for the problem of transforming xi ...xm into the
empty string as the (n + 1)-th cursor position in Y moves past the end of the string.
It is easy to see F (i, n + 1) = m − i + 1 as we just have to delete these
m − i + 1 characters from X.
(b) similarly for 1 ≤ j ≤ n, F (m + 1, j), i.e. for the problem of transforming the
empty string into yj ...yn as the (m + 1)-th cursor position in X moves past the end
of the string. It is easy to see F (m + 1, j) = n − j + 1 as we just have to
insert these n − j + 1 characters of Y into X.
(c) F (m + 1, n + 1) = 0 as all transformation is done here.
Let’s have two (m+1)×(n+1) tables, one to store F (i, j), 1 ≤ i ≤ m+1, 1 ≤
j ≤ n + 1 and another to store A(i, j) for optimum action in that state which
has one of the values O, I, D, R corresponding to the four actions of keeping xi
when xi = yj , inserting yj , deleting xi and replacing xi with yj
We fill first the bottom row and right most column for boundary conditions,
then we fill the rows from bottom to top and from right to left in each row.
Final entry to be filled is F (1, 1). The algorithm steps are as follows:
for i ← 1 to m + 1 do
    F [i, n + 1] ← m − i + 1
end
for j ← 1 to n + 1 do
    F [m + 1, j] ← n − j + 1
end
for i ← m downto 1 do
    for j ← n downto 1 do
        if X.elementAtRank(i) = Y.elementAtRank(j) then
            (F [i, j], A[i, j]) ← (F [i + 1, j + 1], 'O')
        else
            F [i, j] ← min(1 + F [i + 1, j], 1 + F [i, j + 1], 1 + F [i + 1, j + 1])
            if F [i, j] = 1 + F [i + 1, j] then
                A[i, j] ← 'D'
            end
            if F [i, j] = 1 + F [i, j + 1] then
                A[i, j] ← 'I'
            end
            if F [i, j] = 1 + F [i + 1, j + 1] then
                A[i, j] ← 'R'
            end
        end
    end
end
// get optimum solution
print 'minimum edit distance = ', F [1, 1]
(i, j) ← (1, 1)
while i ≤ m and j ≤ n do
    if A[i, j] = 'O' then
        print 'Keep ', X.elementAtRank(i)
        (i, j) ← (i + 1, j + 1)
    else if A[i, j] = 'I' then
        print 'Insert ', Y.elementAtRank(j)
        (i, j) ← (i, j + 1)
    else if A[i, j] = 'D' then
        print 'Delete ', X.elementAtRank(i)
        (i, j) ← (i + 1, j)
    else if A[i, j] = 'R' then
        print 'Replace ', X.elementAtRank(i), ' by ', Y.elementAtRank(j)
        (i, j) ← (i + 1, j + 1)
    end
end
while i ≤ m do
    print 'Delete ', X.elementAtRank(i)
    i ← i + 1
end
while j ≤ n do
    print 'Insert ', Y.elementAtRank(j)
    j ← j + 1
end
Try to compute these tables for X=’AMAZING’ and Y=’HORIZONS’ and
get optimal solution.
Note optimal actions are indicated by O - keep symbol in X, I - insert symbol
from Y in X, D - delete symbol in X, R - replace symbol in X by symbol in Y.
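A compact Python sketch of the same recurrence, filled bottom-up exactly as described (0-based indices instead of the 1-based cursors used above; names are mine):

def edit_distance(X, Y):
    """Minimum number of insert/delete/replace operations to turn X into Y."""
    m, n = len(X), len(Y)
    # F[i][j] = cost of transforming X[i:] into Y[j:]
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        F[i][n] = m - i                 # delete the remaining characters of X
    for j in range(n + 1):
        F[m][j] = n - j                 # insert the remaining characters of Y
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if X[i] == Y[j]:
                F[i][j] = F[i + 1][j + 1]
            else:
                F[i][j] = 1 + min(F[i + 1][j],      # delete X[i]
                                  F[i][j + 1],      # insert Y[j]
                                  F[i + 1][j + 1])  # replace X[i] by Y[j]
    return F[0][0]

print(edit_distance("AMAZING", "HORIZONS"))   # 6, matching the tables below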
Table after filling in boundary conditions:

            H     O     R     I     Z     O     N     S
  j→        1     2     3     4     5     6     7     8     9
 i↓
 A  1                                                      7 D
 M  2                                                      6 D
 A  3                                                      5 D
 Z  4                                                      4 D
 I  5                                                      3 D
 N  6                                                      2 D
 G  7                                                      1 D
    8      8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
Table after filling in the row i = 7 from right to left:

            H     O     R     I     Z     O     N     S
  j→        1     2     3     4     5     6     7     8     9
 i↓
 A  1                                                      7 D
 M  2                                                      6 D
 A  3                                                      5 D
 Z  4                                                      4 D
 I  5                                                      3 D
 N  6                                                      2 D
 G  7      8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
    8      8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
For example entry for the cell (7, 8), namely for the subproblem of transforming
’G’ to ’S’ is given by F (7, 8) = min(1 + F (8, 9), 1 + F (7, 9), 1 + F (8, 8)) =
1 + F (8, 9) = 1. The corresponding action here is replace ’G’ by ’S’ and we will
be done with transformation as we move to state (8, 9).
On the other hand for the cell (7, 7), for the subproblem of transforming ’G’ to
’NS’, F (7, 7) = min(1 + F (8, 8), 1 + F (7, 8), 1 + F (8, 7)) and both 1 + F (8, 8)
and 1 + F (7, 8) give the same min value of 2 and we arbitrarily pick 1 + F (7, 8)
with the corresponding action of inserting ’N’ before ’G’ and then move state
to (7, 8).
Table after filling in the row i = 6 from right to left:

            H     O     R     I     Z     O     N     S
  j→        1     2     3     4     5     6     7     8     9
 i↓
 A  1                                                      7 D
 M  2                                                      6 D
 A  3                                                      5 D
 Z  4                                                      4 D
 I  5                                                      3 D
 N  6      7 I   6 I   5 I   4 I   3 I   2 I   1 O   2 R   2 D
 G  7      8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
    8      8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
For example for the cell (6, 7), the subproblem is transforming ’NG’ to ’NS’ and
since the symbols are same at the cursor positions, F (6, 7) = F (7, 8) = 1 and
we move to state (7, 8).
Final table values are given below:

            H     O     R     I     Z     O     N     S
  j→        1     2     3     4     5     6     7     8     9
 i↓
 A  1      6 R   5 R   5 R   5 D   6 R   6 R   6 D   7 R   7 D
 M  2      6 R   5 R   4 R   4 R   5 R   5 R   5 D   6 R   6 D
 A  3      6 R   5 R   4 R   3 R   3 D   4 R   4 D   5 R   5 D
 Z  4      6 R   5 R   4 R   3 I   2 O   3 R   3 D   4 R   4 D
 I  5      6 I   5 I   4 I   3 O   3 R   2 R   2 D   3 R   3 D
 N  6      7 I   6 I   5 I   4 I   3 I   2 I   1 O   2 R   2 D
 G  7      8 R   7 R   6 R   5 R   4 R   3 R   2 I   1 R   1 D
    8      8 I   7 I   6 I   5 I   4 I   3 I   2 I   1 I   0
Optimal value is given by F (1, 1) = 6. We can get the optimal sequence of
actions by starting from (1, 1) and using optimal action in the cell to move to
next state. The optimal sequence of states : (1, 1) → (2, 2) → (3, 3) → (4, 4) →
(4, 5) → (5, 6) → (6, 7) → (7, 8) → (8, 9).
The corresponding sequence of actions : R (replace ’A’ by ’H’) → R (replace
’M’ by ’O’) → R (replace ’A’ by ’R’) → I (insert ’I’) → O (keep ’Z’) → R
(replace ’I’ by ’O’) → O (keep ’N’) → R (replace ’G’ by ’S’)
Numerical Problems
Ring Algebra
• (S, +, ◦, 0, 1) is a ring if
(a) (S, +, 0) is a monoid
(b) (S, ◦, 1) is a monoid
(c) + is commutative (a + b = b + a)
(d) ◦ distributes over + ( a ◦ (b+c) = (a ◦ b) + (a ◦ c))
(e) every element a in S has additive inverse b, i.e. a +b = b + a
= 0 (we call it –a)

• If operation ◦ is also commutative, it is a commutative ring


Examples of Rings
• Set of real numbers with + and * operations (commutative
ring)

• (C, + , *, 0, 1) where C is set of complex numbers, + and * are


complex number addition and multiplication (commutative
ring)

• ({0,1,2,….m-1}, +_m , ∗_m , 0, 1) where +_m and ∗_m are addition
and multiplication modulo m (commutative ring)

• (M_n , +, *, 0_n , I_n ) where M_n is the set of all n x n matrices with
elements from a ring, + , * are matrix addition and
multiplication, 0_n is the zero matrix and I_n is the identity matrix
Fourier Transforms
• Fourier transform is used in signal processing to convert a
signal from time domain to frequency domain.
• Discrete Fourier Transform (DFT) is used in digital signal
processing to examine frequency spectrum of a sampled
signal (e.g. audio) over a period of time
• Let (S, +, ◦, 0, 1) be a commutative ring. An element ω in S is
called a principal n-th root of unity if
(a) ω ≠ 1
(b) ω^n = 1 and
(c) ∑_{j=0}^{n-1} ω^{jp} = 0, 1 ≤ p < n

ω is a root of the equation x^n - 1 = (x − 1)(1 + x + x² + ⋯ + x^{n−1}) = 0
which is not 1.
• ω^0, ω^1, …, ω^{n−1} are roots of unity; all except 1 are principal
roots.
DFT
• In the complex number ring (C, +, *, 0, 1), a principal n-th root of
unity is given by e^{2πi/n} where i = √−1 ; e^{2πi/n} = cos(2π/n) +
i sin(2π/n)
• In ({0,1,2,…,16} , +, *, 0, 1) where +, * are modulo 17, 2 is the 8-
th root of unity
• DFT definition:
Let a = [a_0 , a_1 , …, a_{n−1} ] be an n-element column vector
A is an n x n matrix with A(i,j) = ω^{ij} where ω is a principal n-th
root of unity
F(a) = A * a is the n-element transform vector
• The inverse A^{−1} exists, i.e. A^{−1}(F(a)) = a
• If n has a multiplicative inverse, A^{−1}(i,j) = n^{−1} ω^{−ij} . Note ω^{−1} is
also a principal n-th root of unity.
[Figure: the powers ω^0, ω^1, …, ω^7 of the 8-th principal root of unity ω plotted on the
unit circle; ω^2 is a 4-th root of unity, ω^4 = −1, ω^8 = ω^0 = 1, and ω^{−1} = ω^7 is also an
8-th principal root of unity.]

In the complex number ring, ω = e^{2πi/8} = cos(2π/8) + i sin(2π/8) and
ω^{−1} = cos(2π/8) − i sin(2π/8).
In the modulo-17 ring {0,1,2,…,16}: ω = 2, ω^2 = 4, ω^3 = 8, ω^4 = 16 ≡ −1 (mod 17),
ω^5 = (−1)·2 = −2 ≡ 15 (mod 17), ω^6 = −4, ω^7 = −8, ω^8 = −16 ≡ 1 (mod 17), and
ω^{−1} = ω^7 = −8 ≡ 9 (mod 17).
DFT and polynomials
• Let p(x) be the polynomial a_0 + a_1 x + … + a_{n−1} x^{n−1}
• Then the i-th element of F(a) (DFT)
= ∑_{j=0}^{n−1} a_j ω^{ij} = p(ω^i) – p evaluated at the root ω^i
• F(a) computes the evaluation of p(x) at the roots of unity
• Inverse-DFT does polynomial interpolation (determining coefficients) from
values at the roots of unity
• Multiplication of polynomials p(x) = a_0 + a_1 x + … + a_{n−1} x^{n−1} and
q(x) = b_0 + b_1 x + … + b_{n−1} x^{n−1} is given by
p(x)·q(x) = c_0 + c_1 x + … + c_{2n−2} x^{2n−2}
where c_k = ∑_{i=0}^{k} a_i b_{k−i}
• The vector [c_0 , c_1 , …, c_{2n−2} ] is called the “convolution” of the two vectors
[a_0 , a_1 , …, a_{n−1} ] and [b_0 , b_1 , …, b_{n−1} ], commonly used in signal processing.
One way to do convolution of two n-element coefficient vectors is to
(a) evaluate each of them at the 2n-th roots of unity after padding each vector
with 0’s to length 2n – DFT of each vector
(b) do pointwise multiplication of the evaluated (DFT) values
(c) Inverse-DFT of the result to get the convolution vector
Numerical Problems (contd.)
Fast Fourier Transform
• Evaluation of a degree n-1 polynomial at a point takes
O(n) time (ring operations) using Horner’s rule:
p(x) = ((…(a_{n−1} x + a_{n−2}) x + a_{n−3}) x + ⋯ + a_1) x + a_0
• If we evaluate p(x) at each point x = ω^i , 0 ≤ i < n separately,
the DFT computation takes O(n²) time.
• The FFT algorithm is much faster, takes O(n log n) time;
it requires n to be a power of 2. Relies on 2 results:
• Reduction property: If ω is the n-th root of unity and n
is even, ω² is the n/2-th root of unity in the same ring.
• Reflexive property : If ω is the n-th root of unity,
ω^{n/2} = -1 (additive inverse of 1 in the ring)
Basis of FFT algorithm
• Suppose p(x) = a_0 + a_1 x + … + a_{n−1} x^{n−1} where n = 2k
Then p(x) can be written as p_even(x²) + x · p_odd(x²)
where
p_even(y) = a_0 + a_2 y + a_4 y² + …… + a_{2(k−1)} y^{k−1}
p_odd(y) = a_1 + a_3 y + a_5 y² + …… + a_{2(k−1)+1} y^{k−1}
• If b = FFT([a_0 , ….. , a_{n−1} ]), then for 0 ≤ i < k = n/2,
b_i = p(ω^i) = p_even((ω²)^i) + ω^i p_odd((ω²)^i)
     = c_i + ω^i d_i
where c = FFT([a_0 , a_2 , ….. , a_{2(k−1)} ]), d = FFT([a_1 , a_3 , ….. ,
a_{2(k−1)+1} ]) evaluated at the k = n/2-th root of unity ω²
• Also note that b_{k+i} = c_i + ω^{k+i} d_i = c_i - ω^i d_i as ω^k = ω^{n/2} = -1
Recursive FFT algorithm
FFT(a, ω):
Input : a = [a_0 , a_1 , ….. , a_{n−1} ] from a commutative ring, where
n = 2^k , k ≥ 0 and ω is an n-th root of unity in the ring
Output: b = F · a is the FFT of a
if n = 1
    return a
c ← FFT([a_0 , a_2 , ….. , a_{n−2} ] , ω²)
d ← FFT([a_1 , a_3 , ….. , a_{n−1} ] , ω²)
x ← 1
for i ← 0 to n/2 - 1
    b_i ← c_i + x * d_i
    b_{n/2+i} ← c_i − x * d_i
    x ← x * ω
return b
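A direct Python transcription of this recursion, written as an illustrative sketch for the mod-17 ring used in the earlier example:

MOD = 17

def fft(a, w):
    """Recursive FFT over the integers mod 17; len(a) must be a power of 2
       and w must be a len(a)-th principal root of unity (e.g. 2 for n = 8)."""
    n = len(a)
    if n == 1:
        return a[:]
    c = fft(a[0::2], (w * w) % MOD)    # even-indexed coefficients
    d = fft(a[1::2], (w * w) % MOD)    # odd-indexed coefficients
    b = [0] * n
    x = 1                              # x = w^i
    for i in range(n // 2):
        b[i] = (c[i] + x * d[i]) % MOD
        b[i + n // 2] = (c[i] - x * d[i]) % MOD
        x = (x * w) % MOD
    return b

print(fft([2, 4, 3, 4, 2, 1, 5, 7], 2))   # [11, 8, 6, 9, 13, 10, 3, 7]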
More on FFT
• Time complexity (primitive ring operations):
T(n) ≤ 2 T(n/2) + k n, n ≥ 2, T(1) = d → T(n) is O(n log n)
• Another way to look at FFT is division by polynomials
• For a polynomial 𝑝(𝑥), value at 𝑥 = 𝑎
𝑝(𝑎) = remainder when 𝑝(𝑥) is divided by 𝑥 − 𝑎 (remainder
theorem)
• To compute values of p(x) at ω^0 , ω^1 , ….. , ω^{n−1} ,
we need the remainders when p(x) is divided by
x − ω^0 , x − ω^1 , … , x − ω^{n−1}
• To find the remainders of p(x) when we divide by q_1(x) and q_2(x),
we can first find the remainder polynomial r(x) of p(x) when divided
by q_1(x) * q_2(x) and then
take the remainders of r(x) when divided by q_1(x) and q_2(x)
• If we pair x - ω^0 , x - ω^1 , … , x - ω^{n−1} in a particular order and
successively multiply products to get the final product polynomial, and
then find remainders successively by the product polynomials, we
can find the evaluations efficiently.
FFT Butterfly network
• Iterative bottom-up approach amenable to parallel processing
• For n = 2^k , there are k stages
• At each stage m, 0 ≤ m < k, we have 2^m smaller butterfly
networks of size 2^{k−m}
• In a butterfly network at stage m, for each 0 ≤ i ≤ n − 1, the
value at position i, namely v(i), is paired with the value v(j) at position
j, where i and j differ in the (k − m)-th bit position
• Assuming i < j, the value for the next stage at i = v(i) +
ω^{i_m} v(j) and the value for the next stage at j = v(i) + ω^{j_m} v(j), where
i_m is the integer resulting from reversing the bits of integer i and
shifting by (k − m − 1) bits to the left
Numerical Problems (contd.)
Wrapped convolution
• Convolution of two n-length vectors results in a 2n-length
vector and hence the FFT algorithm requires padding the input
vectors with zeros for the remaining n elements
• Wrapped convolution results in an n-length vector and does not
require padding with zeros.
• Positive wrapped convolution (useful in many integer
algorithms):
c_i = ∑_{j=0}^{i} a_j b_{i−j} + ∑_{j=i+1}^{n−1} a_j b_{n+i−j} , 0 ≤ i ≤ n-1
• Negative wrapped convolution:
d_i = ∑_{j=0}^{i} a_j b_{i−j} - ∑_{j=i+1}^{n−1} a_j b_{n+i−j} , 0 ≤ i ≤ n-1

• Positive wrapped convolution c of a and b = F^{−1}(F(a) * F(b))

• Let g(a) = [a_0 , ψ a_1 , ….. , ψ^{n−1} a_{n−1} ] where ψ² = ω. Then
g(d) = F^{−1}(F(g(a)) * F(g(b))) where d is the negative wrapped
convolution of a and b
Polynomial vs Integer arithmetic
• An n-bit integer u = ∑_{i=0}^{n−1} u_i 2^i is like a polynomial
• The integer product u·v can be computed using convolution. In
fact we do a positive wrapped convolution by doing the
computation of uv modulo 2^n + 1 using FFT.
• The Schönhage-Strassen integer multiplication algorithm uses this
approach to achieve a time complexity of O(n log n log log n) bit
operations.
• Polynomial division and squaring take the same order of time as
polynomial multiplication → integer operations of the same kind
take the same order of time as integer multiplication
• Finding a mod b ↔ polynomial evaluation at a point
• Constructing an integer from residues (Chinese remaindering) ↔
polynomial interpolation
Chinese Remainder Theorem
• Two integers a and b are relatively prime if gcd(a,b) = 1
• Let p_0, p_1, …, p_{k−1} be k pairwise relatively prime numbers. Then any
integer r between 0 and p_0 ∗ p_1 ∗ … ∗ p_{k−1} - 1 can be
represented as (r_0, r_1, …, r_{k−1}) (called “residues”) where r_i = r
mod p_i . We write it as r ↔ (r_0, r_1, …, r_{k−1})
• This representation is unique due to the Chinese Remainder
Theorem
• Given : residues (u_0, u_1, …, u_{k−1}) w.r.t. p_0, p_1, …, p_{k−1} ,
then u = ∑_{i=0}^{k−1} c_i d_i u_i modulo p where
p = p_0 ∗ p_1 ∗ … ∗ p_{k−1} , c_i = p / p_i , and d_i = c_i^{−1} mod p_i
• Example: p_0 = 2, p_1 = 3, p_2 = 5 and p_3 = 7 and (u_0 , u_1 , u_2 ,
u_3) = (1, 2, 4, 3) . What is u such that u ↔ (1,2,4,3) ?
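A small Python sketch of this reconstruction formula; running it on the example (a worked answer of mine, not stated in the notes) gives u = 59, and indeed 59 mod 2, 3, 5, 7 = 1, 2, 4, 3.

from math import prod

def crt(residues, moduli):
    """Reconstruct u from its residues w.r.t. pairwise relatively prime moduli."""
    p = prod(moduli)
    u = 0
    for u_i, p_i in zip(residues, moduli):
        c_i = p // p_i                    # product of the other moduli
        d_i = pow(c_i, -1, p_i)           # inverse of c_i modulo p_i (Python 3.8+)
        u += c_i * d_i * u_i
    return u % p

print(crt([1, 2, 4, 3], [2, 3, 5, 7]))    # 59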
Polynomial vs Integer arithmetic complexity

Problem                                    Integer (bit operations)          Polynomial (ring ops)
Addition/Subtraction                       O(n)                              O(n)
Multiplication                             O(n log n log log n)              O(n log n)
Modulo m = w^p + 1 / modulo x^t - c        O(p log w)                        O(t)
Division                                   O(n log n log log n)              O(n log n)
Evaluation at n points                     –                                 O(n log² n)
Interpolation from n points                –                                 O(n log² n)
Residues for k relatively prime            O(nk log k log nk log log nk)     O(nk log k log nk)
  numbers (n-bit) / polynomials (degree n)
Chinese Remaindering                       O(nk log k log nk log log nk)     O(nk log k log nk)
  (with preconditioning)
GCD                                        O(n log² n log log n)             O(n log² n)
Linear Recurrence Relations

Dr. Ravi Varadarajan


Why need them ?
• Need for time and space complexity analysis
• Natural for recursive algorithms but useful for non-
recursive iterative algorithms also
• Example: BinarySearch
T(n) – worst case complexity of while loop when high-
low = n
T(0) = 1
T(n) = T(⌊n/2⌋) + 8 for n > 0
S(n) – Space complexity is O(n) and O(1) for additional
space



Complexity for recursion
factorial(n):
Input : An integer n >= 0
Output: factorial of n
if n = 0
return 1
else
return n * factorial(n-1)
• Time complexity: T(n) = 4 + T(n-1), n > 0 and T(0) = 2
• Space complexity S(n) = 1 + S(n-1), n >0 and S(0) = 1



Solving linear recurrences
• Expand and derive by inspection – use series
summation
• Use characteristic equations – solve
polynomial equation for general solution
• Use generating functions – T(n) -> f(z) and
convert recurrence to equation in f(z), solve
for f(z), convert f(z) solution back to T(n) – Will
not discuss it here.



Expansion & Inspection examples
1. T(n) = T(n/2) + 1, n > 1 and n a power of 2
        = 1, n = 1
T(n) = T(n/2) + 1
     = (T(n/4) + 1) + 1 = T(n/4) + 2
     = (T(n/8) + 1) + 2 = T(n/8) + 3
Generalizing it by inspection,
T(n) = T(n/2^i) + i, i ≥ 1
For what value of i can we reach the boundary condition, i.e. n = 1 ?
n/2^i = 1 ⇒ i = log₂ n
Hence T(n) = T(1) + log₂ n = 1 + log₂ n, i.e. T(n) is Θ(log₂ n)


Expansion & Inspection examples
2. T(n) = 3 T(n/2) + c n, n > 1, n a power of 2 and some c > 0
        = k , n = 1 for some k > 0
T(n) = 3 T(n/2) + cn
     = 3 (3 T(n/4) + c n/2) + cn = 3² T(n/4) + cn (1 + 3/2)
     = 3² (3 T(n/8) + c n/4) + cn (1 + 3/2)
     = 3³ T(n/8) + cn [1 + 3/2 + (3/2)²]
Generalizing it by inspection for i > 1,
T(n) = 3^i T(n/2^i) + cn ∑_{p=0}^{i−1} (3/2)^p
Again setting i = log₂ n,
T(n) = 3^{log₂ n} T(1) + cn ∑_{p=0}^{(log₂ n)−1} (3/2)^p


Expansion & Inspection examples
Again setting i = log₂ n,
T(n) = 3^{log₂ n} T(1) + cn ∑_{p=0}^{(log₂ n)−1} (3/2)^p
Note that the first term = 3^{log₂ n} k = n^{log₂ 3} k which is Θ(n^{log₂ 3})
Second term = cn ∑_{p=0}^{(log₂ n)−1} (3/2)^p
            = cn [ ( (3/2)^{log₂ n} − 1 ) / (3/2 − 1) ]   (using geometric series summation)
            ≤ 2cn (3/2)^{log₂ n} = 2c n^{log₂ 3} which is O(n^{log₂ 3}), in fact it is Θ(n^{log₂ 3})
Hence T(n) is Θ(n^{log₂ 3})


Using characteristic equations
• Solving higher order (i.e. having more than one recurrence
term on RHS) recurrences with expansion is difficult.
• Consider the homogeneous recurrence equation
T(n) = a_1 T(n − 1) + a_2 T(n − 2) + …. + a_m T(n − m)
We can show that T(n) = b^n satisfies the above equation for
some value of b:
b^n − a_1 b^{n−1} − a_2 b^{n−2} − …. − a_m b^{n−m} = 0
b^{n−m} (b^m − a_1 b^{m−1} − a_2 b^{m−2} − …. − a_m) = 0
=> (b^m − a_1 b^{m−1} − a_2 b^{m−2} − …. − a_m) = 0
There are m roots of the equation.
Simple case : all m roots b_1, b_2, … b_m are distinct
General solution : k_1 b_1^n + k_2 b_2^n + …. + k_m b_m^n
Use m boundary conditions to solve for the constants k_1 , … , k_m
Characteristic equations examples
1. Fibonacci sequence
F(n) = F(n − 1) + F(n − 2), n > 1
F(1) = 1, F(0) = 0
F(n) − F(n − 1) − F(n − 2) = 0
Characteristic equation :
b² − b − 1 = 0 ; quadratic equation in b
Roots are : b_1 = (1 + √5)/2 and b_2 = (1 − √5)/2
General solution : k_1 ((1 + √5)/2)^n + k_2 ((1 − √5)/2)^n
F(0) = k_1 + k_2 = 0
F(1) = k_1 (1 + √5)/2 + k_2 (1 − √5)/2 = 1
Solving for k_1 , k_2 we get k_1 = 1/√5 and k_2 = -1/√5
Final solution : F(n) = (1/√5) ((1 + √5)/2)^n − (1/√5) ((1 − √5)/2)^n
Characteristic equations examples
• Equal roots of characteristic equation:
Example:
F(n) − F(n − 1) = F(n − 1) − F(n − 2)
F(0) = 1 and F(1) = d + 1
⇒ F(n) − 2F(n − 1) + F(n − 2) = 0
⇒ Characteristic equation: b² − 2b + 1 = 0
(b − 1)² = 0 . Equal roots b = 1
Here the general solution : F(n) = (k_1 n + k_2) b^n = (k_1 n + k_2)

F(0) = k_2 = 1
F(1) = (k_1 + k_2) = d + 1 => k_1 = d

Final solution : F(n) = d n + 1
Characteristic equation method for non-homogeneous
recurrences

a_0 T(n) + a_1 T(n − 1) + a_2 T(n − 2) + …. + a_m T(n − m) = f(n)
• Two approaches
1. convert it into a higher order homogeneous recurrence:
T(n) = T(n-1) + d
T(n) - T(n-1) = d
T(n-1) – T(n-2) = d
Hence T(n) - T(n-1) = T(n-1) – T(n-2)
T(n) – 2 T(n-1) + T(n-2) = 0 (previous example)
2. Solution : T(n) = G(n) + H(n) (homogeneous + particular
solution)
Transforming into linear recurrences
• T(n) = 3 T(n/2) + c n, n > 1, n a power of 2
       = k , n = 1
Setting n = 2^l and letting S(l) = T(n) = T(2^l),
we get
S(l) = 3 S(l − 1) + c 2^l
S(0) = k


Probabilistic data structures for
dictionaries and sets
Ravi Varadarajan
Skip-lists
• A linked list of keys in non-decreasing order with
short-cut pointers to speed-up search.
• Every key node in list has level_0 link to next key
node (with keys in non-decreasing order)
• In addition a node may have level_1,
level_2,…level_m links with the link always pointing
to node with key no larger than this key.
• In effect there are m+1 lists
• Two categories of operations :
(a) before(p), after(p) – to navigate within level_i list
(b) below(p), above(p) – to navigate between levels.
Randomization in skip lists
• Each key has ½ probability of being in the next level list. If it is
not in the level_i list it will not be in any higher level list.
• There are n items in the level_0 list, n/2 items on the average
in the level_1 list, n/4 items on average in the level_2 list, …, n/2^i
on the average on level_i, etc.
• Bounding # of levels or height:
Prob(an item is in level i) = 1/2^i (“i” consecutive heads in
coin tosses)
→ Prob(level i is not empty) = P_i ≤ n/2^i
→ Prob(height > c log n for c > 1) ≤ n/2^{c log n} = 1/n^{c−1}
→ Prob(height ≤ c log n) ≥ 1 - 1/n^{c−1} (approaches 1 as n → ∞)
→ With high probability, height is O(log n).
• Space complexity : E(size) = ∑_{i≥0} n/2^i < 2n , hence O(n)
FindElement in skip list
findElement(k):
Input : key k
Output : item associated with k if it exists, else return “key error”
p ← first position of highest level
while below(p) ≠ null and key(p) != k
p ← below(p)
while key(after(p)) <= k
p ← after(p)
if key(p) = k return item(p) else return “key error”
• Time complexity :
(a) # of outer loop iterations is O(log n) with high probability
(b) Expected # of inner loop iterations at level i = Expected number
of items encountered in level i but not in level (i+1) = Expected number
of coin tosses until a tail
= ∑_{i≥1} i (1/2^i) = 2
With high probability findElement() takes O(log n) time.
Insertion/Deletion in skip list
• InsertElement() first does a search all the way
to level_0 list
• Flip a coin repeatedly to decide if it has to be
in the next higher level list. If it is not in the
next level we stop. Else insert it in the list
• Complexity is O(log n) expected
• DeleteElement() requires removing item from
multiple levels and since height is O(log n)
with high prob., complexity is O(log n)
expected.
Sorting and selection
Distribution based sorting
• Stable sorting – if for two elements a_i , a_j in the input
sequence, i < j and a_i = a_j , then in the output sequence a_i
should appear before a_j
• Allows sorting on multiple keys of elements easily.
• Bucket-Sort : Keys in range [0,N-1] (e.g. state
abbreviations)
N buckets; each key is added to the list of the
corresponding bucket, then the bucket lists are concatenated.
Can be made stable.
Time complexity is O(n+N) where n is the number of
items to be sorted.
Radix sort algorithm
RadixSort(A):
Input : Sequence of m-component items k_1 , k_2 , … , k_n where k_i is an
m-tuple (k_{i,1} , k_{i,2} , .. , k_{i,m} ), where k_{i,j} ϵ [0,N-1]
Output : Lexicographically sorted sequence of the items
l ← copy of input sequence k_1 , k_2 , … , k_n // output list
for t ← 0 to N-1
    Q[t] ← {} // bucket lists
for j ← m down to 1 // each component
    for i ← 1 to n
        remove k_i from list l and add it to the end of the list Q[k_{i,j}]
    for t ← 0 to N-1
        remove items from Q[t] and add them to the end of list “l”
Return l

Time complexity : O(m(n+N)) - For fixed values of m and N, the complexity
is O(n), the best we can do.
Radix sort example
158, 241, 136, 222, 357, 782, 438
Bucket sort on least significant digit :
Keep 10 buckets one for each digit:
0 [] 1 [241] 2 [222, 782] 3[] 4 [] 5 [] 6 [136] 7 [357] 8 [158,438] 9 []
241, 222, 782, 136, 357, 158, 438
Sort on 2nd digit
0[] 1 [] 2 [222] 3 [136,438] 4 [241] 5 [357,158] 6 [] 7 [] 8 [782] 9 []
222, 136, 438, 241, 357, 158, 782
Sort most significant digit
0[] 1 [136,158] 2 [222,241] 3 [357] 4 [438] 5 [] 6 [] 7 [782] 8 [] 9 []

136, 158, 222, 241, 357, 438, 782
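The example above can be reproduced with a short Python sketch (digits as components, least significant digit first; a rough illustration rather than the notes' exact pseudocode):

def radix_sort(nums, num_digits=3, base=10):
    """LSD radix sort of non-negative integers using stable bucket passes."""
    items = list(nums)
    for d in range(num_digits):                 # least significant digit first
        buckets = [[] for _ in range(base)]
        for x in items:
            digit = (x // base ** d) % base
            buckets[digit].append(x)            # appending keeps each pass stable
        items = [x for b in buckets for x in b] # concatenate bucket lists
    return items

print(radix_sort([158, 241, 136, 222, 357, 782, 438]))
# [136, 158, 222, 241, 357, 438, 782]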


Radix Sort
• Lexicographic ordering for a multi-component key. For two keys
(k_1 , l_1) and (k_2 , l_2), we say (k_1 , l_1) < (k_2 , l_2) if either k_1 < k_2 ∨
(k_1 = k_2 ∧ l_1 < l_2). We can extend this ordering to keys with more
than 2 components.
• Radix sorting can be used to sort items lexicographically in the most
efficient manner if all components are from the range [0,N-1] for some
N.
• For an item (k_{i,1} , k_{i,2} , .. , k_{i,m}), we refer to k_{i,m} as least significant and
k_{i,1} as most significant
• Example : Dictionary of words, m-digit numbers
• Idea:
(a) Sort by least significant component first using a stable bucket-
sort strategy
(b) Sort the result list by next significant component using same
strategy.
Stable sorting maintains least significant ordering of items which are
in the same bucket in this step.
Sorting lower bound
• How fast can we sort elements ?
• If we don’t know the distribution or range of input elements a
priori, we can derive a lower bound based on comparisons only.
• Suppose we have input elements a_1 , a_2 , a_3 ….. a_n
• Any comparison based algorithm can be reduced to a decision
tree where in each node we check if a_i ≤ a_j for two
elements a_i and a_j . Only 2 outcomes.
• Leaves of the decision tree correspond to sorting orders.
• The tree must have at least n! leaves (permutations of the n input
elements) and must have height ≥ ⌈log n!⌉
• Worst-case time complexity is O(h) where h is the height of the
decision tree
• By Stirling’s formula: n! ≈ (n/e)^n → any sorting algorithm must
have # of comparisons ≥ n log n – 1.44n
[Figure: decision tree for a 3-element sorting algorithm — the root compares a_1 ≤ a_2,
internal nodes compare further pairs a_i ≤ a_j, and the 3! = 6 leaves give the possible
sorted orders such as a_1 ≤ a_2 ≤ a_3, a_2 ≤ a_1 < a_3, …, a_3 < a_2 < a_1.]
Comparison based Sorting
• Criteria:
(a) # of comparisons
(b) # of data movements
(c) Additional space required
(d) in-memory vs external sorting
(e) stable sorting
• O(n²) sorting algorithms
1. Bubble Sort (lot of comparisons + data movements),
2. Selection Sort ( lot of comparisons + less data movements),
3. Insertion Sort (on average less comparisons + data
movements); takes advantage of input ordering
• O(n log n) sorting algorithms
Heap sort, Merge Sort, Quick Sort
Comparison of O(n²) sorting methods
Heap sort
HeapSort(A):
Input : Array A[1..n] of elements to be sorted
Output : Same array sorted
Heapify(A) // build a MAX_HEAP in A
for i ← 0 to n-2
    e ← removeMax(A[1..n-i]) // consider heap A[1..n-i]
    A[n-i] ← e
Time complexity : O(n) for heap construction and O(log
n) in each iteration of the for loop
• Total time complexity : O(n log n) worst-case
• In-place sorting
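A self-contained Python sketch of in-place heap sort with an explicit max-heap sift-down (illustrative; the array is 0-indexed here, unlike the 1-indexed pseudocode above):

def heap_sort(A):
    """In-place heap sort: build a max-heap, then repeatedly move the max to the end."""
    n = len(A)

    def sift_down(i, size):
        # Push A[i] down until the max-heap property holds in A[0..size-1].
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < size and A[left] > A[largest]:
                largest = left
            if right < size and A[right] > A[largest]:
                largest = right
            if largest == i:
                return
            A[i], A[largest] = A[largest], A[i]
            i = largest

    for i in range(n // 2 - 1, -1, -1):   # heapify: O(n)
        sift_down(i, n)
    for size in range(n - 1, 0, -1):      # n-1 removeMax steps, O(log n) each
        A[0], A[size] = A[size], A[0]
        sift_down(0, size)
    return A

print(heap_sort([35, 25, 15, 20, 22, 40, 14, 38]))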
Heap sort example
35 25 15 20 22 40 14 38

After heapify to 40 38 35 25 22 15 14 20
create max heap :

i=0; A[8]← removeMax(A[1..8]) 38 25 35 20 22 15 14 40

i=1; A[7]← removeMax(A[1..7]) 35 25 15 20 22 14 38 40

i=2; A[6]← removeMax(A[1..6]) 25 22 15 20 14 35 38 40

i=3; A[5]← removeMax(A[1..5]) 22 20 15 14 25 35 38 40


i=4; A[4]← removeMax(A[1..4]) 20 14 15 22 25 35 38 40
i=5; A[3]← removeMax(A[1..3]) 15 14 20 22 25 35 38 40

i=6; A[2]← removeMax(A[1..2]) 14 15 20 22 25 35 38 40


Recursive Sort
RecursiveSort(S):
Input : Sequence S of n elements to be sorted
Output : Sorted sequence of S
if |S| =1
return S
else
(S1,S2) <- Split(S) // split S into S1 and S2 subsequences of
roughly equal size
RecursiveSort(S1)
RecursiveSort(S2)
return Join(S1,S2) // join S1 and S2
• In MergeSort, join step takes O(n) time
• In QuickSort, split step takes O(n) time
• Recurrence : T(n) ≤ 2T(n/2) + k n => T(n) is O(n log n)
MergeSort
Merge(S1,S2):
Input : Two sorted sequences S1 and S2
Output : A single sorted sequence of elements from S1 and S2
i ← 0; j ← 0
S ← {}
while i < size(S1) and j < size(S2)
    if S1.elementAtRank(i) ≤ S2.elementAtRank(j)
        S.append(S1.elementAtRank(i)); i ← i+1
    else
        S.append(S2.elementAtRank(j)); j ← j+1
// append remaining elements
while i < size(S1)
    S.append(S1.elementAtRank(i)); i ← i+1
while j < size(S2)
    S.append(S2.elementAtRank(j)); j ← j+1
return S
Time and additional space complexity is O(m+n), m = |S1| and n = |S2|
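As a runnable illustration (my own sketch, not the notes' code), the merge step and the recursive sort in Python:

def merge(s1, s2):
    """Merge two sorted lists into one sorted list in O(len(s1) + len(s2)) time."""
    out, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:
            out.append(s1[i]); i += 1
        else:
            out.append(s2[j]); j += 1
    out.extend(s1[i:])          # append whatever remains of either list
    out.extend(s2[j:])
    return out

def merge_sort(s):
    if len(s) <= 1:
        return list(s)
    mid = len(s) // 2
    return merge(merge_sort(s[:mid]), merge_sort(s[mid:]))

print(merge_sort([44, 75, 23, 43, 55, 12, 64, 77, 33]))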
Merging example
Quick Sort
• Due to Hoare
• Fast in-memory sorting algorithm on the average
• Idea is to choose a “pivot” element “e” in S and
divide the original sequence S into
S1 = {a ϵ S | a < e}
S2 = {a ϵ S | a ≥ e}
Recursively sort S1 and S2 and the concatenated
sorted subsequences S1 and S2
• If we have an array for S, we can do all the above
steps in-place unlike merge sort.
• Recursive function : quicksort(𝐴, 𝑓𝑖𝑟𝑠𝑡, 𝑙𝑎𝑠𝑡) where
0 ≤ 𝑓𝑖𝑟𝑠𝑡 ≤ 𝑙𝑎𝑠𝑡 ≤ 𝑛 − 1
Quick sort algorithm
Quick Sort Partitioning
Partition(A,first,last):
Input: Array A[first..last] and A[first] contains pivot element e
Output : Return first ≤ pivotIndex ≤ last such that A is partitioned
into subarrays A1 = A[first..pivotIndex-1] and A2 = A[pivotIndex+1..last]
such that A1 = {a ϵ A | a < e} and A2 = {a ϵ A | a ≥ e } and A[pivotIndex] = e
e ← A[first]
up ← first+1; down ← last
while up ≤ down
    while up ≤ down and A[up] < e
        up ← up + 1
    while up ≤ down and A[down] ≥ e
        down ← down – 1
    if up < down
        swap A[up] and A[down]
        up ← up + 1; down ← down – 1
swap A[down] with A[first]
return down
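An equivalent in-place Python sketch (first element as pivot, as in the trace below; a teaching sketch rather than a production sort):

def partition(A, first, last):
    """Partition A[first..last] around pivot A[first]; return the pivot's final index."""
    e = A[first]
    up, down = first + 1, last
    while up <= down:
        while up <= down and A[up] < e:
            up += 1
        while up <= down and A[down] >= e:
            down -= 1
        if up < down:
            A[up], A[down] = A[down], A[up]
            up += 1; down -= 1
    A[first], A[down] = A[down], A[first]     # place the pivot between the two parts
    return down

def quicksort(A, first=0, last=None):
    if last is None:
        last = len(A) - 1
    if first < last:
        p = partition(A, first, last)
        quicksort(A, first, p - 1)
        quicksort(A, p + 1, last)
    return A

print(quicksort([44, 75, 23, 43, 55, 12, 64, 77, 33]))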
Quick sort example
Initial array:                          44 75 23 43 55 12 64 77 33
After partitioning on pivot 44:         12 33 23 43 | 44 | 55 64 77 75
Left part, pivot 12:                    12 | 33 23 43
  then pivot 33:                        23 | 33 | 43        → sorted sub array 12 23 33 43
Right part, pivot 55:                   55 | 64 77 75
  then pivot 64:                        64 | 77 75
    then pivot 77:                      75 | 77              → sorted sub array 55 64 75 77
Sorted array:                           12 23 33 43 44 55 64 75 77
Sorting and selection (contd.)
Choice of pivot element
• Quick sort can have worst-case O(n²) time complexity, e.g. if the choice
of pivot at each recursive step divides into partitions of sizes n-1 and 1
(recursion tree skewed)
• It has O(n log n) best case time complexity when the recursion tree is
balanced with equal size partitions
• Typically randomized quick sort is used where the pivot is chosen at
random. It is customary to pick the median of a random sample of a few
elements.
• For input size m, let a_1 ≤ a_2 ≤ … ≤ a_m. Consider the approximately m/2
elements a_{⌊m/4⌋+1} , …, a_{⌊3m/4⌋} . Each of these elements bounds the
sizes of the smaller and larger subsequences between m/4 and
3m/4; the input size for recursion is ≤ 3m/4
• Probability of choosing one of these as pivot is ½
• Expected height of the recursion tree = expected # of recursive
invocations until we choose log_{4/3} n such pivots = 2 log_{4/3} n
• Time spent at each level is O(n) → expected time complexity is O(n
log n)
Example for m=12
Let a_1 ≤ a_2 ≤ a_3 ….. ≤ a_12 . m/4 = 3 and 3m/4 = 9
Consider elements a_4 , a_5 , a_6 , a_7 , a_8 , a_9 .
Approximately m/2 = 6 elements
a_4 ≥ 4 elements and a_4 ≤ 9 elements
a_5 ≥ 5 elements and a_5 ≤ 8 elements
a_6 ≥ 6 elements and a_6 ≤ 7 elements
a_7 ≥ 7 elements and a_7 ≤ 6 elements
a_8 ≥ 8 elements and a_8 ≤ 5 elements
a_9 ≥ 9 elements and a_9 ≤ 4 elements
If you take any of these m/2 elements as pivots,
max(|S1|, |S3|) ≤ 9 = 3m/4
Probability of choosing any of these as pivots = (m/2)/m = 1/2
Quick sort (worst-case) recursion tree for m/2 pivot choices
[Figure: recursion tree in which each level shrinks the input by a factor of at most 3/4 —
level 0 has m elements, level 1 at most 3m/4 elements, level 2 at most (3/4)² m elements,
and level k at most (3/4)^k m elements.]
Setting (3/4)^k m = 1, we get k = log_{4/3} m
Selection problem
• Finding k-th smallest element of a set S of n elements
• Using heap and extracting minimum k times gives time
complexity O(n + k log n). This finds all p-th smallest elements
where 1 ≤ 𝑝 ≤ 𝑘
• Can be done faster using divide-and-conquer approach.
• Similar to QuickSort, choose a pivot e and divide S into 3 sets
S1, S2 and S3.
(a) if k <= |S1|, recursively find k-th smallest element in S1
(b) if |S1| < k ≤ |S1| + |S2|, ‘e’ is k-th smallest element
(c) Otherwise recursively find k – (|S1|+|S2|)-th smallest
element in S3.
• If we choose the pivot to be the median of the 5-element medians,
T(|S|) ≤ T(max(|S1|,|S3|)) + T(|S4|) + k |S|
where S4 is the set of 5-element medians from S.
Selection example
• S has 12 elements using pivot e
|S1| = 5, |S2| = 3 |S3| = 4

3rd smallest element must be in |S1|


6th smallest element = pivot element
10th smallest element -- 2nd smallest element in
S3
Choice of pivot
• The pivot element is ≥ at least ½ of the ⌈n/5⌉ medians, and each of
these medians is ≥ 3 elements of its group
→ the pivot element is ≥ (3/2) ⌈n/5⌉ ≥ 3n/10 elements
→ |S3| ≤ 7n/10. By a symmetric argument, |S1| ≤ 7n/10
• |S4| ≤ n/5
• T(n) ≤ T(7n/10) + T(n/5) + k n where |S| = n
• Note that 7n/10 + n/5 = 9n/10 < n
• T(n) is O(n) – linear time worst-case complexity but the
constant may be prohibitive
• A better choice is to use a random pivot and it gives
expected O(n) time complexity.
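A short Python sketch of the selection recursion with a random pivot (the expected-O(n) variant mentioned last; the median-of-medians pivot could be substituted for a worst-case O(n) bound):

import random

def quickselect(S, k):
    """Return the k-th smallest element (1-indexed) of the list S."""
    assert 1 <= k <= len(S)
    e = random.choice(S)                 # random pivot
    S1 = [a for a in S if a < e]
    S2 = [a for a in S if a == e]
    S3 = [a for a in S if a > e]
    if k <= len(S1):
        return quickselect(S1, k)
    if k <= len(S1) + len(S2):
        return e
    return quickselect(S3, k - len(S1) - len(S2))

data = [35, 25, 15, 20, 22, 40, 14, 38]
print(quickselect(data, 3))              # 3rd smallest = 20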
Text Processing Algorithms
Pattern (String) Matching
• Given : A text of characters T of length n and a pattern string P
of length m
Find if P occurs in T and if it does get starting position of first
occurrence in T
• Brute force approach :
Align pattern at each position of text and check if it matches
At each position it takes at most m comparisons and time
complexity is O(mn) assuming character comparison takes O(1)
time.
If m is very small compared to n this is ok
• Better approaches take advantage of
(i) failed comparison at a position to skip text characters
(ii) repetition of sub pattern within a pattern to avoid back
tracking in text
Boyer-Moore Algorithm
• Simple version of BM algorithm:
(a) At text position, compare pattern with text T starting from
end of pattern (say of length m)
(b) If a mismatch occurs for a text character “c” at a position “k”,
if it does not occur in the pattern at all, then we can slide the
pattern past position k so that we can start comparison from end
of pattern from position k+m
This allows some text characters to be skipped
• we define for each character “c” of text :
last(c) = highest index j in pattern P such that P(j) = c,
= -1 if c does not occur in P
• For a mismatch at text position k, restart pattern comparison at
text position k + m – min(j, 1 + last(T[k])) where j is the pattern
position where the mismatch occurred. In the worst case, the pattern
can slide by only one position.
Example of BM matching alg.
BM algorithm complexity
• Simple version has worst-case complexity O(nm + d) where d
is size of alphabet
• Simple version wastes comparisons already made when
pattern slides less than m
• Original BM algorithm avoids this using ideas of KMP
• More complex but takes O(n+m+d) time in the worst-case
Knuth-Morris-Pratt Algorithm
• Uses a DFA (deterministic finite state automaton) in recognizing
substring patterns of a text
• A DFA is (S, Σ, δ, s_0 , F)
- S is a finite set of states
- Σ is a finite alphabet
- s_0 ϵ S is the start state
- F ⊆ S is the set of final states
- δ : S x Σ → S is a (partial) transition function
δ(s,a) specifies for a current state s and reading an input character a
what the next state should be
• A string a_1 a_2 ….. a_n is accepted if we reach a final state after
processing the string using δ
• We are interested in constructing a DFA such that all strings with a
pattern (regular expression) are accepted by that DFA.
Failure function in string matching
• The KMP algorithm uses a DFA where
S = {0,1,2,…,m}, 0 is the start state and m is the final state
For a pattern string P = p_1 p_2 ….. p_m the transition function δ is
defined as follows for 1 ≤ j ≤ m-1:
δ(j-1, p_j) = j
δ(j-1, a) = f(j), for a ≠ p_j
• f is called the failure function which specifies the latest previous
position in the pattern from which we can resume comparison for
the current or next text position.
• If f(j) = k and k > 0, then k is the largest value < j such that
p_1 p_2 ….. p_k is a suffix of p_1 p_2 ….. p_j , if it exists, else f(j) = 0.
KMP Matching Algorithm
KMPMatch(T,P):
Input : text T[0..n-1] and pattern P[0..m-1] of n and m characters
Output : Start index of first match of P in T, -1 otherwise
f ← KMPFailureFunction(P)
i ← 0; j ← 0
while i < n
if T[i] = P[j]
if j= m-1
return i-m+1
i ← i+1; j ← j+1
else if j > 0
j ← f(j-1)
else
i ← i+1
return -1
Example of KMP matching alg.
KMP failure function
• Idea: Feed the pattern itself as the input. If we know f(1),…,f(j-1) and f(j-1) = k_1 ,
we can compute f(j) as:
(i) if p_j = p_{k_1+1} then f(j) = k_1 + 1
(ii) Else apply f repeatedly: if f(k_i) = k_{i+1} then we check (i) again (with k_{i+1})
as long as k_{i+1} > 0

KMPFailureFunction(P[1..m]):
f(1) ← 0
for j ← 2 to m
    k ← f(j-1)
    while P[j] ≠ P[k+1] and k > 0
        k ← f(k)
    if P[j] ≠ P[k+1] and k = 0
        f(j) ← 0
    else
        f(j) ← k+1
return f
KMP Time complexity
• Excluding construction of failure function, in KMPMatch, O(1) time
is spent in each while loop iteration assuming character comparison
takes O(1) time.
• To determine number of iterations, consider i-j, (i-j ≤ n)
(i) when there is a match i-j remains same (i, j both increase by 1)
(ii) when there is no match and j = 0, i-j increases by 1 (only i
increases by 1)
(iii) when there is no match and j > 0, j is set to f(j-1) which is less than
j and hence i-j increases by at least 1
• Hence at end of each iteration either i increases (text position
advances) or i-j increases (pattern shift) by at least 1
→ # of iterations ≤ 2n → complexity is O(n) excl. failure function
construction
• Failure function takes O(m) time to compute→ total complexity is
O(n+m)
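Putting the two pieces together, a compact Python sketch of KMP with 0-indexed strings (here f[j] is the length of the longest proper border of P[0..j], mirroring the 1-indexed f above):

def kmp_failure(P):
    """f[j] = length of the longest proper prefix of P[0..j] that is also its suffix."""
    m = len(P)
    f = [0] * m
    for j in range(1, m):
        k = f[j - 1]
        while k > 0 and P[j] != P[k]:
            k = f[k - 1]
        f[j] = k + 1 if P[j] == P[k] else 0
    return f

def kmp_match(T, P):
    """Return the start index of the first occurrence of P in T, or -1."""
    n, m, f = len(T), len(P), kmp_failure(P)
    i = j = 0
    while i < n:
        if T[i] == P[j]:
            if j == m - 1:
                return i - m + 1
            i += 1; j += 1
        elif j > 0:
            j = f[j - 1]          # shift the pattern using the failure function
        else:
            i += 1
    return -1

print(kmp_match("abacaabaccabacabaabb", "abacab"))   # 10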
Tries
• Efficient data structure for processing a series of search
queries on the same set of text strings
• Given (S, Σ) where S ⊆ Σ* , a (compressed) trie T is an
ordered tree where
(a) each node is labeled with a string over the alphabet Σ
(b) ordering of children nodes is according to some canonical
(usually alphabetical) ordering of labels
(c) an external node is associated with a string of S formed by
concatenation of all labels from the root to that node; every
string of S is associated with an external node.
(d) every internal node must have at least 2 children
• In a standard trie, the label is just a character in Σ and an
internal node can have just one child
Trie example
Compressed Trie example
Compressed Trie (with positions) example
Complexity of tries
• Given (S, Σ) where |S| = n and |Σ| = d
• Number of nodes in the trie is O(n)
• Height is the length of the longest string in S, denoted by m
• Every internal node has at least 2 but at most d children
• Space complexity is O(n m)
• Time to search for a string of size k is O(dk) –
find a path in T by matching substrings of the search string
• Time complexity to construct T
(a) inserting a string at a time --- find a path in T by tracing a prefix
of the string and, when we stop at an internal node (i.e. cannot
match any children), insert a new node there for the suffix
Time to insert a string x is O(d |x|) → complexity is O(dn)
where n = ∑_{x ϵ S} |x| and |x| is the length of x.
Suffix Tries
• Also called a “position tree”; it has external nodes representing all
possible suffixes of a given string s = s_0 s_1 .. s_{m−1}
• Each node is labeled with an interval (i,j) which represents the
substring s_i .. s_j for 0 ≤ i < j < m
• Useful for many efficient string operations
(a) substring search O(m)
(b) longest common substring
(c) longest repeated substring
(d) useful in Bioinformatics to search for patterns in DNA or protein
sequences
• Space complexity is O(n)
• Time to construct a suffix trie – O(dn)
• Time to search for a substring of size m in the Suffix Trie – O(dm)
• Generalized version for a set of words
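A naive Python sketch of the suffix-trie idea: insert every suffix of s into a
standard trie (here plain nested dicts) and answer substring queries by walking
the trie. This is for illustration only; it uses O(n^2) space, whereas the
compressed suffix trie described above uses O(n) space and can be built in
linear time with more involved algorithms.

def build_suffix_trie(s):
    root = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:]:                  # insert the suffix s[i:]
            node = node.setdefault(ch, {})
    return root

def is_substring(root, pattern):
    node = root
    for ch in pattern:                    # O(m) steps for a pattern of length m
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_suffix_trie("minimize")
print(is_substring(trie, "nimi"), is_substring(trie, "imz"))   # True False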
Suffix Trie example
ADT Set
• Stores distinct elements
• Operations supported :
makeSet(e) – make a set with a single element e
union(S1,S2) – returns S1 ∪ S2
intersect(S1,S2) – returns S1 ∩ S2
subtract(S1,S2) – returns the set of elements in S1 but not
in S2
• Implementation with an ordered sequence takes O(n) time
for union, intersect and subtract operations
(use a modified merging algorithm)
• Implementation with a hash table (no ordering needed)
takes O(n) time on the average
Disjoint set Union-Find
• Many algorithms require maintenance of partition
sets, i.e. element belonging to a set and only one set.
• They typically require a sequence of union and find
operations:
union(S1,S2) – modify S1 by S1 U S2 (merge)
find(e) – find the set containing element e.
• Two types of implementations:
(a) linear – let each element node have a direct link to
the set identifier node (changes during union)
-- makeSet(e) – takes O(1) time
-- find(e) – takes O(1) time
(b) tree – let each set node have a link to its parent
(new set node) during union operation
Example of disjoint set union-finds
S = {2, 5, 10, 11, 15, 35}
• S1 ← makeSet(2); S2 ← makeSet(5); S3 ← makeSet(10);…….
S1 = {2}, S2 = {5}, S3 = {10}, S4 = {11}, S5 = {15}, S6 = {35}
• Find(15) = S5
• S1 ← Union(S1,S3)
S1 = {2,10}, S2= {5}, S4 = {11}, S5 = {15}, S6 = {35 }
• S2 ← Union(S2,S5)
S1 = {2,10}, S2={5,15}, S4 = {11}, S6 = {35}
• Find(15) = S2
• S1 ← Union(S1,S2)
S1 = {2,10,5,15}, S4 = {11}, S6 = {35}
• Find(15) = S1
Array implementation (with parent links)
Element:                    2    5    10   11   15   35
Set node:                   S1   S2   S3   S4   S5   S6

Parent links:
Initially:                  S1   S2   S3   S4   S5   S6
After S1 ← Union(S1,S3):    S1   S2   S1   S4   S5   S6
After S2 ← Union(S2,S5):    S1   S2   S1   S4   S2   S6
After S1 ← Union(S1,S2):    S1   S1   S1   S4   S2   S6

Find(15) : S5 -> S2 -> S1
Weighted union
[Figure: after S1 ← Union(S1,S3) the tree rooted at 2 has child 10, and after
S2 ← Union(S2,S5) the tree rooted at 5 has child 15; S1 ← Union(S1,S2) then
merges these two trees, and Union(S1,S4) attaches the singleton 11 to the
larger tree.]
Weighted union rule: make the root of the smaller set a
child of the root of the larger set.
Amortized cost analysis for union-find
• union(S1,S2) – make smaller sequence elements point to the
set node of larger sequence elements
• Complexity of sequence of n operations consisting of union
and find starting with singleton sets of n elements.
• Amortization : Accounting method
(i) For find operation charge unit cost to operation itself
(ii) For union operation, charge unit cost to each of the
elements whose links have changed (no cost to operation itself).
• Total amortization cost = # of cost units assigned to elements
+ # of cost units assigned to Find operation >= total actual cost
• # of cost units assigned to elements ≤ # of times an element
can change set
• When an element's link changes, it moves into a set at least twice the size of
its old set, so an element can change sets at most log n times → time complexity is O(n log n)
Amortized analysis for a sequence of n
(weighted) union-find operations
• Accounts : Find_1, Find_2, Find_3….
Element_1, Element_2,………..Element_n
Amortized cost using accounting method is given as follows:
(a) Union → if Element_k (root of smaller set) points to Element_p, assign 1 unit cost to
Element_k's account
(b) Find_k → assign unit cost to Find_k's account
• Note that actual cost of a Find(e) is the path length from e to the root = number of
unions performed before this find that causes the element to change its set
membership.
• Actual cost of a union operation is unit cost
Total amortized cost of union and finds >= actual cost of union and finds
• Consider element e in S1: a find(e) follows the path from e to the root of S1; let the path length be p.
After S1 ∪ S2, find(e) follows the path from e to the root of S1 ∪ S2; call this path length p1.
When can p1 = p+1 ? Only if |S1| ≤ |S2| before the merge. After the merge the combined set has size
|S1| + |S2| ≥ 2 |S1|, i.e. e has moved into a set at least double the size of the set it belonged to before.
Max # of times the path length can increase for e : 1 -> 2 -> 4 -> 8 ….. -> n, i.e. log n times
• Each element account charged at most log n units
Actual cost of union and finds <= Total amortized cost of union and finds <= n + n log n
Efficient Union-Find DS
• Use a tree for a set where root node identifies the set and each
child node has a parent link. Root node’s parent link points to itself.
• Union – make the root of the tree for one set a child node of root of
another. – Takes O(1) time
• To make the find op more efficient, make the root of the smaller height (rank)
tree a child of the root of the bigger height (rank) tree – keep track of the
height of each set's tree
Let S(h) = minimum size of a set whose tree has height h
S(h) ≥ 2 S(h-1), h ≥ 1, S(0) = 1 → S(h) ≥ 2^h → h ≤ log n, where n is the number of
elements in the partition sets
Complexity of Find is O(log n)
• Total complexity of a sequence of n union-find operations is O(n
log n)
• Can we do better ?
Efficient Union-Find DS
• Due to Robert E. Tarjan
• Use Union by rank – Similar to union by weight by making root
of set of smaller rank a child of root of set of larger rank.
MakeSet(x):
x.parent ← x; x.rank ← 0
Union(x,y):
    if x.rank > y.rank
        y.parent ← x
    else
        x.parent ← y
        if x.rank = y.rank
            y.rank ← y.rank+1
Efficient Union-Find DS (contd.)
• Use path-compression – After a find(e), make all
nodes in path from e to root children of the root.
Should take same order of time as without
compression.
Find(e):
    if e ≠ e.parent
        e.parent ← Find(e.parent)
    return e.parent
• Recursion down finds path from element to root and
unwinding recursion sets parent of all elements in
the path to the root.
• Run-time : shown to be almost linear in the worst-
case.
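A compact Python sketch combining the union-by-rank and path-compression ideas
above; the class and method names are illustrative.

class DisjointSets:
    def __init__(self, elements):
        self.parent = {e: e for e in elements}   # MakeSet: each element is its own root
        self.rank = {e: 0 for e in elements}

    def find(self, e):
        if self.parent[e] != e:
            self.parent[e] = self.find(self.parent[e])   # path compression
        return self.parent[e]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx                      # ensure rank[rx] <= rank[ry]
        self.parent[rx] = ry                     # union by rank
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1
        return ry

ds = DisjointSets([2, 5, 10, 11, 15, 35])
ds.union(2, 10); ds.union(5, 15); ds.union(2, 5)
print(ds.find(15) == ds.find(10))   # True: 15 and 10 are now in the same set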
Efficient Union-Find (path-compression)
[Figure: example trees before and after Find(15). Before the find, 15 lies at the
end of a path below the root 9; after the find, every node on the path from 15 to
the root has become a direct child of 9.]
Efficient Union-Find DS Complexity
• Ackermann functions (fast-growing) :
A_0(n) = 2n, n ≥ 0
A_i(n) = A_(i-1)(A_i(n-1)), n > 0 and A_i(0) = 1 for i > 0
A_0 sequence : 0, 2, 4, 6, …
A_1 sequence : 2^0, 2^1, 2^2, 2^3, …
A_2 sequence (modified, call it F) : 1, 2, 2^2, 2^(2^2), 2^(2^(2^2)), … (towers of 2s)
• Inverse of F (very slowly growing):
log*(n) = min{ i > 0 : F(i) ≥ n }
log*(k) = 1 for k = 1, 2 ; log*(k) = 2 for k = 3, 4 ;
log*(k) = 3 for 5 ≤ k ≤ 16 ; log*(k) = 4 for 17 ≤ k ≤ 65536, e.g. log*(65536) = 4
• We will use amortized analysis to show that a sequence of
n union-finds will take O(n log*n)
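A tiny Python sketch of the F sequence (towers of 2) and of log* as defined
above; purely illustrative.

def log_star(n):
    i, F = 1, 2            # F(1) = 2
    while F < n:
        F = 2 ** F         # F(i+1) = 2^F(i)
        i += 1
    return i               # min i > 0 with F(i) >= n

print([log_star(k) for k in (2, 4, 16, 65536)])   # [1, 2, 3, 4]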
Offline MIN problem
Given: A sequence of two types of instructions:
(a) insert(k) – insert an integer k, 1 ≤ k ≤n, in set T
(b) extract_min – get and remove min value from T
Required: Sequence of values output by extract_min instructions
• Simple solution – Treat it as online problem.
Keep values in a heap, extract min when required. May take O(n log
n) where n is number of insert instructions.
• Better solution – use the union-find data structure. Takes O(n log*n) time.
Let I_j be the set of insert operations before the j-th extract_min in the
sequence, 1 ≤ j ≤ m, where m is the number of extract_min instructions.
There is a total of n insert operations.
Sequence S : I_1 E_1 I_2 E_2 …. I_m E_m I_(m+1) (an I_j may be empty)
Approach: Create key sets K_j corresponding to the I_j's.
For each i from 1 to n, find the K_j that i belongs to. If j ≤ m, then the output of
E_j must be “i”.
We then merge K_j into the next remaining (adjacent) K_l, since its elements are
candidates for E_l
Offline min example
Insert 3, Insert 5, ExtractMin_1, Insert 4, Insert 1, ExtractMin_2, Insert 6, Insert 2,
Insert 7, ExtractMin_3
Initially form the insert sets, i.e. the set of inserts before each extract; the final insert
set follows the last extractMin and may be empty.
I_1 = {Ins. 3, Ins. 5}, I_2 = {Ins. 4, Ins. 1}, I_3 = {Ins. 6, Ins. 2, Ins. 7}, I_4 = ∅
K_1 = {3,5}, K_2 = {4,1}, K_3 = {6,2,7}, K_4 = ∅
For each element from 1 to n we check whether it is the result of an extract_min op.
Element 1 : Which key set does 1 belong to ? K_2, i.e. Find(1) = 2. Hence ExtractMin_2() = 1
Then we merge K_2 into its adjacent set: K_3 ← Union(K_3, K_2) = {4,1,6,2,7}
Which key set does 2 belong to ? Find(2) = K_3 -> ExtractMin_3() = 2
K_4 ← Union(K_4, K_3) = {4,1,6,2,7}
Which key set does 3 belong to ? Find(3) = K_1 -> ExtractMin_1() = 3
K_4 ← Union(K_4, K_1) = {3,5,4,1,6,2,7}
Which key set does 4 belong to ? Find(4) = K_4 -> 4 is not the result of any ExtractMin
We conclude the same for elements 5, 6 and 7.
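The Python sketch below follows the offline-MIN approach above, using a simple
union-find over key-set indices; the instruction encoding ("ins k" / "ext") and
the function names are illustrative, not from the notes.

def offline_min(ops, n):
    # ops: sequence of "ins k" and "ext" instructions; values k are distinct, 1..n
    blocks = [[]]                       # blocks[j] = values inserted before the (j+1)-th extract_min
    for op in ops:
        if op == "ext":
            blocks.append([])
        else:
            blocks[-1].append(int(op.split()[1]))
    m = len(blocks) - 1                 # number of extract_min instructions

    set_of = {v: j for j, block in enumerate(blocks) for v in block}
    parent = list(range(len(blocks)))   # union-find over key-set indices K_0 .. K_m

    def find(j):
        while parent[j] != j:
            parent[j] = parent[parent[j]]   # path halving
            j = parent[j]
        return j

    answers = [None] * m
    for i in range(1, n + 1):           # consider values in increasing order
        j = find(set_of[i])
        if j < m:                       # i is the output of the (j+1)-th extract_min
            answers[j] = i
            parent[j] = find(j + 1)     # merge K_j into the next surviving key set
        # if j == m, the value i is never extracted
    return answers

ops = ["ins 3", "ins 5", "ext", "ins 4", "ins 1", "ext",
       "ins 6", "ins 2", "ins 7", "ext"]
print(offline_min(ops, 7))              # [3, 1, 2] as in the example above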
Weighted graph problems
Minimum cost spanning tree
• Given : Undirected connected graph G = (V,E) with
weights (costs) on edges (w : E -> R)
• Required: Minimum cost spanning tree (tree with min.
sum of costs of edges)
• Useful in network designs
• Greedy approach basis (cut property):
If (V1, V-V1) is a cut with cut-set E1, and w(e1) = min_{e ϵ E1} w(e),
then there is a min cost spanning tree that
includes e1.
• Two popular greedy algorithms:
(a) Kruskal’s algorithm (edge addition to tree)
(b) Prim-Jarnik algorithm (vertex addition to tree)
T is a spanning tree
(shown as thick edges)
[Figure: cut (V_1, V-V_1) with cut-set E_1; e_1 is a minimum weight edge of the
cut-set, i.e. w(e_1) = min_{e ϵ E_1} w(e), and e_2 is an edge of T that crosses the cut.]
Exchange argument: T_1 = (T - {e_2}) ∪ {e_1} is also a spanning tree, and
C(T_1) = C(T) - w(e_2) + w(e_1) ≤ C(T)
Kruskal’s MST algorithm
• Idea:
-- At each step we find a minimum cost edge that
connects a vertex in one spanning tree to another in the
forest.
-- The cut property above justifies this greedy choice
--- Initially each vertex by itself is a spanning tree in the
forest
--- We choose in non-decreasing order of edge weights
(a) if the edge connects vertices in same tree, ignore it
(b) else choose the edge and reduce number of spanning
trees by 1
Kruskal’s MST algorithm example
[Figure: undirected weighted graph on v1..v6 with edge weights
2, 4, 5, 6, 7, 8, 9 and 12; the selected spanning tree edges have weights
2, 4, 5, 7 and 9, giving Cost = 27.]
Kruskal’s algorithm time complexity
• Use a min-heap priority queue for edges using w(e)
• At each step O(log m) time to find min cost edge, m
number of edges
Total number of steps ≤ m
• Use a disjoint set union-find DS for spanning forest
• n makeSet() operations, n is number of vertices
• use find to detect whether the endpoints of an edge are in the same
spanning tree
• use union to merge spanning trees
Time complexity of the union-finds – O(m log*m)
• Total time complexity is O(n + m log m)
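A Python sketch of Kruskal's algorithm as described above, using sorted edges
in place of an explicit heap and a simple union-find forest; the vertex names
and the example graph below are illustrative.

def kruskal(vertices, edges):
    # edges: list of (weight, u, v)
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):          # non-decreasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge connects two different trees in the forest
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

V = ["v1", "v2", "v3", "v4", "v5", "v6"]
E = [(2, "v1", "v2"), (7, "v2", "v5"), (12, "v1", "v5"), (8, "v1", "v4"),
     (9, "v5", "v6"), (5, "v1", "v3"), (6, "v2", "v3"), (4, "v3", "v4")]
tree, cost = kruskal(V, E)
print(cost)    # 27 for this illustrative graph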
Prim-Jarnik’s MST algorithm
• Idea:
-- Start from a vertex and grow a spanning tree
-- Maintain D[v] , min cost of an edge from v to
a vertex in existing spanning tree; Initially D[v]
set to +∞
--- At each step choose the vertex u not in the tree with minimum
D[u] to be added to the spanning tree
--- The cut property justifies this greedy choice
--- Then update D[w] for each vertex not in
spanning tree that is adjacent to u
Prim-Jarnik’s Algorithm example
[Figure: the same weighted graph on v1..v6; the spanning tree is grown starting
from v5. (*) marks a vertex already added to the tree.]

D[v]                        v1     v2     v3     v4     v5     v6
Add v5; update v1,v2,v6     12     7      ∞      ∞      0(*)   9
Add v2; update v1,v3        2      7(*)   6      ∞      0(*)   9
Add v1; update v3,v4        2(*)   7(*)   5      8      0(*)   9
Add v3; update v4           2(*)   7(*)   5(*)   4      0(*)   9
Add v4; no updates          2(*)   7(*)   5(*)   4(*)   0(*)   9
Add v6                      2(*)   7(*)   5(*)   4(*)   0(*)   9(*)

Cost = 27
Prim’s MST algorithm time complexity
• Use min-heap priority queue for vertices : D[v] is
heap key
• At each step extract-min to find one with min D[v]
This takes O(log n) time
• Total time for extract-min – O(n log n)
• Updating D[v] takes O(log n) time, n number of
vertices
Total number of updates ≤ m, number of edges
• Total time for updates – O(m log n)
• Total time complexity – O((m+n) log n) which is O(m log n) time.
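A Python sketch of the Prim-Jarnik algorithm as described above, using a
min-heap of (D[v], v) entries with lazy deletion instead of a decrease-key
operation; the adjacency lists below are illustrative.

import heapq

def prim(adj, start):
    # adj: dict vertex -> list of (neighbor, weight); graph assumed connected
    in_tree, total = set(), 0
    D = {v: float("inf") for v in adj}
    D[start] = 0
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in in_tree:
            continue                       # stale entry, D[u] was improved later
        in_tree.add(u)
        total += d
        for v, w in adj[u]:
            if v not in in_tree and w < D[v]:
                D[v] = w                   # min cost edge from v into the tree
                heapq.heappush(heap, (w, v))
    return total

adj = {
    "v1": [("v2", 2), ("v3", 5), ("v4", 8), ("v5", 12)],
    "v2": [("v1", 2), ("v3", 6), ("v5", 7)],
    "v3": [("v1", 5), ("v2", 6), ("v4", 4)],
    "v4": [("v1", 8), ("v3", 4)],
    "v5": [("v1", 12), ("v2", 7), ("v6", 9)],
    "v6": [("v5", 9)],
}
print(prim(adj, "v5"))   # 27 for this illustrative graph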
Single source shortest path problem
• Given : A directed graph G = (V,E) with weights on
edges (w : E -> R) and a source vertex s ϵ V
• Required: Find the shortest path length D[v] from s
to v for all v ϵ V
• Example : shortest route between cities
Length of path = sum of weights of edges in the path
• Negative weight cycles can cause shortest paths
between vertices to be undefined (path lengths can be made arbitrarily
small by repeating the cycle, so the computation does not converge)
• Focus on non-negative weights on edges
Dijkstra’s shortest path alg.
• Cut – Partition of V into disjoint subsets V1 and V2 (V2 = V-V1)
Cut-set – {(u,v) ϵ E | u ϵ V1 and v ϵ V2}
• Greedy approach idea :
(a) Have a cut where V1 is set of vertices we know shortest path length
from s; initially V1 = {s}
(b) For all v ϵ V2, we know shortest path length from s to v with
vertices in path (except v) restricted to V1 (at most one edge in cut-set).
Let it be D[v]
(c) At each step we can move one vertex from V2 to V1 if V2 is not
empty. Why ?
Let D[u] = min_{v ϵ V2} D[v].
There cannot be a better path from s to u that includes a vertex from V2
→ D[u] is the shortest path length from s to u
Move u to V1 and also update D[v] for all other v in V2. How ?
Similar to Prim’s MST alg. Only need to consider paths that go through u
Dijkstra’s algorithm example
[Figure: directed graph on v1..v6 with Source = v1; the shortest path tree and
final distances are shown on the right.]
Relaxation : D[v] ← min(D[v], D[u] + w((u,v)))

D[v]                        v1     v2     v3     v4     v5     v6
Add v1; update v2,v3        0(*)   5      2      ∞      ∞      ∞
Add v3; update v2,v5        0(*)   4      2(*)   ∞      8      ∞
Add v2; update v4           0(*)   4(*)   2(*)   7      8      ∞
Add v4; update v6           0(*)   4(*)   2(*)   7(*)   8      11
Add v5; no updates          0(*)   4(*)   2(*)   7(*)   8(*)   11
Add v6                      0(*)   4(*)   2(*)   7(*)   8(*)   11(*)

((*) marks a vertex moved to V1, i.e. its D value is final.)
Shortest path alg. time complexity
• Min-Heap priority queue for vertices in V2 based on
D[v]
• O(log n) to get min D[v]
• number of extract-min ops ≤ n, number of nodes
• For a vertex v added to V1, need to update D[w] only
for each edge (v,w); each heap update O(log n)
• Total updates ≤ m, number of edges
• Total time complexity – O(n log n + m log n)
• Since for a connected graph m ≥ n-1, time complexity
for a connected graph is O(m log n)
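A Python sketch of Dijkstra's algorithm as described above, again using a
min-heap with lazy deletion in place of decrease-key; the digraph below is
illustrative.

import heapq

def dijkstra(adj, source):
    # adj: dict vertex -> list of (neighbor, weight), weights non-negative
    D = {v: float("inf") for v in adj}
    D[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)                               # D[u] is now final (u moves to V1)
        for v, w in adj[u]:
            if d + w < D[v]:                      # relaxation
                D[v] = d + w
                heapq.heappush(heap, (D[v], v))
    return D

adj = {
    "v1": [("v2", 5), ("v3", 2)],
    "v2": [("v4", 3)],
    "v3": [("v2", 2), ("v5", 6)],
    "v4": [("v6", 4)],
    "v5": [],
    "v6": [],
}
print(dijkstra(adj, "v1"))   # {'v1': 0, 'v2': 4, 'v3': 2, 'v4': 7, 'v5': 8, 'v6': 11}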
Bellman Ford algorithm
• Finds single-source shortest paths even when
there are negative edges
• Can identify negative weight cycles
• Works by iterating at most n-1 times
• During each iteration, look at every edge (u,v) and
update D[v] to a better value if possible, as in Dijkstra's
algorithm, i.e. D[v] ← min(D[v],
D[u] + w((u,v))). This is called “relaxation”.
• Total time-complexity is O(nm)
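A Python sketch of Bellman-Ford as described above: up to n-1 rounds of
relaxing every edge, plus one extra pass to detect a negative weight cycle;
the edge list below is illustrative.

def bellman_ford(n, edges, source):
    # vertices are 0..n-1; edges: list of (u, v, w)
    D = [float("inf")] * n
    D[source] = 0
    for _ in range(n - 1):                 # at most n-1 relaxation rounds
        changed = False
        for u, v, w in edges:
            if D[u] + w < D[v]:
                D[v] = D[u] + w            # relaxation
                changed = True
        if not changed:
            break                          # early exit once distances stabilize
    for u, v, w in edges:                  # one more pass: any improvement means a negative cycle
        if D[u] + w < D[v]:
            raise ValueError("negative weight cycle reachable from source")
    return D

edges = [(0, 1, 4), (0, 2, 5), (1, 3, -2), (2, 3, 2), (3, 4, 3)]
print(bellman_ford(5, edges, 0))           # [0, 4, 5, 2, 5]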
All pair path problems
Closed semi-rings
• (S, ◦ , 1) is a monoid if
(a) S is closed under ◦ (a ◦ b ϵ S, ∀ a, b ϵ S)
(b) ◦ is associative (a ◦ (b ◦ c) = (a ◦ b) ◦ c)
(c) 1 is the identity for ◦ (a ◦ 1 = 1 ◦ a = a)
• (S, +, ◦, 0, 1) is a closed semi-ring if
(a) (S, +, 0) is a monoid
(b) (S, ◦, 1) is a monoid and 0 is annihilator for ◦ (i.e. a ◦ 0 = 0 ◦ a
= 0, ∀ a)
(c) + is commutative (a + b = b + a) and idempotent (a + a = a)
(d) ◦ distributes over + i.e. a ◦ (b+c) = (a ◦ b) + (a ◦ c)
(e) Associative, distributive, commutative and idempotency
properties extend to finite and infinite sums
(∑_i a_i) ◦ (∑_j b_j) = ∑_{i,j} (a_i ◦ b_j) = ∑_i (∑_j a_i ◦ b_j)
Note the infinite sum ∑_i a_i exists and is unique (by idempotency)
Examples of closed semi-rings
• Boolean algebra : ({0,1}, ∨, ∧, 0, 1) is a closed semi-ring
∧ distributes over ∨, ∨ is idempotent and 0 is annihilator for ∧
• (R≥0 ∪ {+∞}, MIN, +, +∞, 0) is a closed semi-ring (here + is
arithmetic addition)
-- (R≥0 ∪ {+∞}, MIN, +∞) is a monoid
-- (R≥0 ∪ {+∞}, +, 0) is a monoid and +∞ is annihilator for + (a +
∞ = +∞)
-- + distributes over MIN : a + MIN(b,c) = MIN(a+b, a+c)
-- The infinite sum MIN(a1, MIN(a2, …)) = MIN(a1, a2, a3, …) exists and is
unique
• (F_Σ, ∪, ◦, ∅, {ε}) is a closed semi-ring where
F_Σ is the family of sets of finite length strings over alphabet Σ,
including the empty string ε (countably infinite sets)
∪ - set union, associative, identity is the empty set ∅
◦ - concatenation of sets, S1 ◦ S2 = { xy | x ϵ S1 and y ϵ S2}; ∅ is
annihilator for ◦
◦ distributes over ∪ : S1 ◦ (S2 ∪ S3) = (S1 ◦ S2) ∪ (S1 ◦ S3)
All pairs path problem
Given : A directed graph G = (V,E) with possible
self-cycles and a label function l : E -> S where
(S,+, ◦, 0, 1) is a closed semi-ring
• For a directed path p = (e_1, e_2, … e_k), the path
product l(p) = l(e_1) ◦ l(e_2) ◦ … ◦ l(e_k)
• Sum of two path products p_1 and p_2 = l(p_1) +
l(p_2)
• S(u,v) is the sum of the products of all paths from u to
v in the graph
Required: Find S(u,v) for all pairs of vertices in
G.
Closure in closed semi-rings
• Define a closure operation * for a closed semi-ring
(S,+, ◦, 0, 1) element a as follows:
a* = 1 + a + a ◦ a + a ◦ a ◦ a + … = 1 + a + a^2 + a^3 + …
By infinite idempotency of +, this infinite sum exists
and is unique.
• For ({0,1}, ∨, ∧, 0, 1) :
a* = 1 ∨ a ∨ a^2 ∨ … = 1 for a = 0 or 1
• For (R≥0 ∪ {+∞}, MIN, +, +∞, 0) :
a* = MIN(0, a, 2a, 3a, …) = 0 for a ϵ R≥0 ∪ {+∞}
• For (F_Σ, ∪, ◦, ∅, {ε}) :
S* = {ε} ∪ S ∪ (S ◦ S) ∪ (S ◦ S ◦ S) ∪ … = ⋃_{i≥0} { x_1 x_2 … x_i |
x_j ϵ S, 1 ≤ j ≤ i }
Closed semi-ring matrices
• Define M_n to be the set of n x n matrices whose elements
are from a closed semi-ring (S, +, ◦, 0, 1)
• (M_n, +_n, ×_n, 0_n, I_n) is a closed semi-ring
+_n - addition of n x n matrices (+ is the closed semi-ring
idempotent operation)
×_n - multiplication of n x n matrices (using the +, ◦ closed semi-
ring operations); distributes over +_n
0_n - n x n matrix with all 0 entries (identity for +_n)
I_n - n x n identity matrix with 0 (identity for +) off the diagonal and 1
(identity for ◦) on the diagonal - 0_n is annihilator for ×_n
Infinite sums of matrices exist and are unique
Digraph Matrix Closure
• For a digraph with n vertices, define an n x n matrix L where
L(i,j) = l((v_i, v_j)) if (v_i, v_j) ϵ E, 0 otherwise
• The L matrix is an element of the closed semi-ring (M_n, +_n, ×_n, 0_n,
I_n)
• Define L^k = L ×_n L ×_n … ×_n L, multiplied k times, with L^0 = I_n
• What does L^2(i,j) indicate ?
L^2(i,j) = L(i,1) ◦ L(1,j) + L(i,2) ◦ L(2,j) + … + L(i,n) ◦ L(n,j)
= sum of products of paths of length 2 from v_i to v_j
• The L^k matrix gives, for all vertex pairs, the sum of the k-length path
products between these vertices
• The closure matrix L* = ∑_{k≥0} L^k exists and is unique, where the sum is
+_n and L^0 = I_n
• It gives exactly what we need in all pairs-path problem.
DP algorithm for all-pair paths
• Due to Floyd-Warshall
• Let D_w(u,v) = sum of path products from u to v that
go through w
= S_1(u,w) ◦ S_2(w,w) ◦ S_3(w,v) where
S_1(u,w) = sum of products of paths from u to w that do not otherwise go
through w
S_2(w,w) = sum of products of paths from w to w
S_3(w,v) = sum of products of paths from w to v that do not go
through w
(uses distributivity of ◦ over + for finite and infinite sums)
• Define D_k(i,j) as the sum of products of paths from v_i to v_j that do
not go through vertices other than v_1, v_2, … v_k
Floyd-Warshall Algorithm
AllPair(G, l):
Input : Digraph G = (V,E) with vertices numbered 1, 2, .., n arbitrarily
and a labeling function l : E → S where (S, +, ◦, 0, 1) is a closed semi-
ring
Output : the closure matrix L*
for i ← 1 to n
    for j ← 1 to n
        lij ← (v_i, v_j) ϵ E ? l((v_i, v_j)) : 0
        D_0(i,j) ← i = j ? 1 + lij : lij                      --- (1)
for k ← 1 to n
    for i ← 1 to n
        for j ← 1 to n
            D_k(i,j) ← D_(k-1)(i,j) + D_(k-1)(i,k) ◦ (D_(k-1)(k,k))* ◦ D_(k-1)(k,j)   --- (2)
return D_n
Floyd-Warshall alg. time complexity
• Initialization takes O(n^2) operations (assignments)
• There are n iterations of matrix computations
• Each iteration computes n^2 values
• Each computation of a matrix value requires a constant
number of closed semi-ring ops +, ◦ and *
• Total complexity is O(n^3) closed semi-ring ops +, ◦
and *
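A Python sketch of the AllPair algorithm above, parameterized by the closed
semi-ring operations; it keeps the D_(k-1) matrix explicitly, as in steps (1)
and (2). Here edge_label(i,j) should return l((v_i, v_j)) or the semi-ring 0
when there is no edge; the function and parameter names are illustrative.

def all_pairs(n, edge_label, plus, times, one, star):
    # step (1): D_0(i,j) = 1 + l for i = j, l otherwise
    D = [[plus(one, edge_label(i, j)) if i == j else edge_label(i, j)
          for j in range(n)] for i in range(n)]
    for k in range(n):
        prev, skk = D, star(D[k][k])
        # step (2): D_k(i,j) = D_(k-1)(i,j) + D_(k-1)(i,k) . (D_(k-1)(k,k))* . D_(k-1)(k,j)
        D = [[plus(prev[i][j], times(times(prev[i][k], skk), prev[k][j]))
              for j in range(n)] for i in range(n)]
    return D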
Transitive Closure Algorithm
• Problem : Given a directed graph G = (V,E), return an n x n
matrix T where T[i,j] = 1 if there exists a directed path from v_i
to v_j, 0 otherwise
• Solve the all-pairs problem where the closed semi-ring is ({0,1}, ∨, ∧,
0, 1) with adjacency matrix A. D_n gives the value of T
• Step 1 becomes :
D_0(i,j) ← i = j ? 1 : A[i,j]
• 0* = 1* = 1, so we can remove the closure element in the
algorithm.
• Hence step 2 becomes :
D_k(i,j) = D_(k-1)(i,j) ∨ (D_(k-1)(i,k) ∧ D_(k-1)(k,j))
• Using the Floyd-Warshall alg. we can compute it in O(n^3) semi-
ring operations ∨, ∧.
• Another way to compute the A* matrix is to compute A^n, which
requires O(n^3 log n) time. Think about path doubling (repeated squaring).
All-pair shortest path alg.
• Problem : Given a directed graph G = (V,E) and a length
function w : E → R≥0, return an n x n matrix D where
D(i,j) = length of the shortest path from v_i to v_j
• Solve the all-pairs problem where the closed semi-ring is
(R≥0 ∪ {+∞}, MIN, +, +∞, 0) and matrix L where L[i,j] is the length
of edge (v_i, v_j) if it exists, else +∞. D_n gives the value of D
• Step 1 becomes (note MIN(0, L[i,i]) = 0):
D_0(i,j) ← i = j ? 0 : L[i,j]
• Note that a* = MIN(0, a, 2a, …) = 0, so we can omit it in
step 2
• Step 2 then becomes:
D_k(i,j) = MIN(D_(k-1)(i,j), D_(k-1)(i,k) + D_(k-1)(k,j))
• Using the Floyd-Warshall alg. we can compute it in O(n^3)
semi-ring operations MIN, +.
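Instantiating the all_pairs sketch above with the (MIN, +) semi-ring gives
all-pairs shortest path lengths (the boolean semi-ring would similarly give the
transitive closure); since a* = 0 here, star can be a constant function. The
3-vertex edge lengths below are illustrative.

INF = float("inf")
L = {(0, 1): 3, (1, 2): 4, (0, 2): 10}    # L[i,j] = +INF when the edge is absent

dist = all_pairs(3, lambda i, j: L.get((i, j), INF),
                 plus=min, times=lambda a, b: a + b,
                 one=0, star=lambda a: 0)
print(dist[0][2])   # 7: the path 0 -> 1 -> 2 is shorter than the direct edge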
Closed semi-ring matrix closure
• Given an n x n matrix A of elements from a closed semi-ring
(S, +, ◦, 0, 1), the closure matrix A* = ∑_{k≥0} A^k can be
computed by a divide-and-conquer strategy
• Let A = [ B C ; D E ] where B, C, D and E are n/2 x n/2 submatrices;
let V_1 and V_2 be the corresponding partition of the vertices
• Then A* = [ W X ; Y Z ] where
W = closure of paths that either (i) stay in V_1 or (ii) stay in V_1 for
some time, jump to V_2, stay in V_2 for some time and then jump
back to V_1, this pattern repeated 1 or more times
= (B + C . E* . D)*    -- + and . are n/2 x n/2 matrix semi-ring operations
X = closure of paths that start in V_1 and end in V_2
= W . C . E*
Closed semi-ring matrix closure
Y = closure of paths that start in V_2 and end in V_1
= E* . D . W
Z = E* + Y . C . E*
If we have T = C . E*, then
W = (B + T . D)*
X = W . T
Y = E* . D . W
Z = E* + Y . T
Requires 2 recursive n/2 x n/2 matrix closures, 6 n/2 x n/2 matrix
multiplications and 2 n/2 x n/2 matrix additions
Time complexity T(n) – number of closed semi-ring operations:
T(n) ≤ 2 T(n/2) + c g(n), n > 1 and T(1) = b for constants b and c,
where g(n) is the time to multiply two n/2 x n/2 matrices
T(n) is O(g(n)) provided g(n) satisfies g(2n) ≥ 4 g(n) (normally the case)
→ Matrix closure can be computed in the same order of time as matrix
multiplication over the closed semi-ring.