Algorithms For Data Science: CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Thursday, September 10, 2015
Outline
1 Asymptotic notation
Today
1 Asymptotic notation
Asymptotic analysis
[Figure: $T(n) = O(f(n))$. Curves $c \cdot f(n)$ and $T(n)$ plotted against $n$, with the threshold $n_0$ marked.]
Definition 2 ($O$).
We say that $T(n) = O(f(n))$ if there exist constants $c > 0$ and
$n_0 \ge 0$ s.t. for all $n \ge n_0$, we have $T(n) \le c \cdot f(n)$.
Examples:
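As one illustration of the definition of $O$, the sketch below checks numerically that a claimed pair of witnesses $(c, n_0)$ satisfies $T(n) \le c \cdot f(n)$ over a finite range. The helper `holds_upper_bound` and the sample function $T(n) = 3n^2 + 10n$ are hypothetical choices made for this illustration, not taken from the lecture. A finite check is only a sanity test and does not replace a proof.

```python
def holds_upper_bound(T, f, c, n0, n_max=10_000):
    """Return True if T(n) <= c * f(n) for every integer n in [n0, n_max]."""
    return all(T(n) <= c * f(n) for n in range(n0, n_max + 1))


def T(n):
    # Hypothetical running time used for illustration.
    return 3 * n**2 + 10 * n


def f(n):
    return n**2


# 3n^2 + 10n <= 4n^2 holds exactly when n >= 10, so (c, n0) = (4, 10) works.
ok = holds_upper_bound(T, f, c=4, n0=10)

# With c = 3 the inequality fails for every n > 0, so no n0 can rescue it.
too_small_c = holds_upper_bound(T, f, c=3, n0=10)
```

Here `ok` is true and `too_small_c` is false: the constant $c$ has to absorb the lower-order term, and $n_0$ marks where that starts to hold.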
[Figure: $T(n) = \Omega(f(n))$. Curves $T(n)$ and $c \cdot f(n)$, with the threshold $n_0$ marked.]
Definition 4 ($\Omega$).
We say that $T(n) = \Omega(f(n))$ if there exist constants $c > 0$ and
$n_0 \ge 0$ s.t. for all $n \ge n_0$, we have $T(n) \ge c \cdot f(n)$.
Examples:
[Figure: curves $T(n)$ and $c_1 \cdot f(n)$ plotted against $n$, with the threshold $n_0$ marked.]
Definition 6 ($\Theta$).
We say that $T(n) = \Theta(f(n))$ if there exist constants $c_1, c_2 > 0$
and $n_0 \ge 0$ s.t. for all $n \ge n_0$, we have
$c_1 \cdot f(n) \le T(n) \le c_2 \cdot f(n)$.
Equivalent definition
$T(n) = \Theta(f(n))$ if $T(n) = O(f(n))$ and $T(n) = \Omega(f(n))$
Examples:
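A short worked example of a tight bound, using the hypothetical function $T(n) = 3n^2 + 10n$ (chosen here for illustration, not taken from the lecture). Sandwiching $T(n)$ between two multiples of $n^2$ gives both the $O$ and the $\Omega$ bound at once:

```latex
\begin{align*}
T(n) &= 3n^2 + 10n, \qquad f(n) = n^2, \\
3n^2 \;&\le\; 3n^2 + 10n \;\le\; 3n^2 + 10n^2 = 13n^2 \qquad \text{for all } n \ge 1, \\
\text{so } c_1 &= 3,\quad c_2 = 13,\quad n_0 = 1 \quad\text{witness}\quad T(n) = \Theta(n^2).
\end{align*}
```

The upper bound uses $10n \le 10n^2$ for $n \ge 1$; the lower bound simply drops the nonnegative term $10n$.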
$\lim_{n \to \infty} \frac{T(n)}{f(n)}$
Examples:
Definition 8 ($\omega$).
We say that $T(n) = \omega(f(n))$ if for any constant $c > 0$ there
exists $n_0 \ge 0$ s.t. for all $n \ge n_0$, we have $T(n) > c \cdot f(n)$.
$\lim_{n \to \infty} \frac{T(n)}{f(n)} = \infty$, if the limit exists
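The limit criterion can be explored numerically by tabulating $T(n)/f(n)$ for growing $n$. The sketch below uses the hypothetical pair $T(n) = n^2$ and $f(n) = n \log_2 n$, chosen for illustration only. A growing ratio is merely suggestive of $T(n) = \omega(f(n))$; it is not a proof.

```python
import math


def ratios(T, f, ns):
    """Return the list of T(n) / f(n) for each n in ns."""
    return [T(n) / f(n) for n in ns]


# Sample points n = 2, 64, 2048, 65536 (powers of two keep log2 exact).
ns = [2**i for i in range(1, 21, 5)]

# Hypothetical example: T(n) = n^2 versus f(n) = n * log2(n).
r = ratios(lambda n: n**2, lambda n: n * math.log2(n), ns)
```

For this pair the ratio simplifies to $n / \log_2 n$, which grows without bound, consistent with the limit being $\infty$.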
Mergesort: pseudocode
Mergesort (A, left, right)
  if right == left then return
  end if
  middle = left + ⌊(right - left)/2⌋
  Mergesort (A, left, middle)
  Mergesort (A, middle + 1, right)
  Merge (A, left, right, middle)
Remarks
Merge: intuition
update the front of the list from which the item was
extracted.
Merge: pseudocode
Merge (A, left, right, mid)
  L = A[left, mid]
  R = A[mid + 1, right]
  Maintain two pointers CurrentL, CurrentR initialized to point to
  the first element of L, R
  while both lists are nonempty do
    Let x, y be the elements pointed to by CurrentL, CurrentR
    Compare x, y and append the smaller to the output
    Advance the pointer in the list with the smaller of x, y
  end while
  Append the remainder of the non-empty list to the output.
Remark: the output is stored directly in A[left, right], thus the
subarray A[left, right] is sorted after Merge(A, left, right, mid).
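The Merge pseudocode can be sketched in Python as follows. This is a minimal sketch, assuming 0-based indexing, inclusive bounds, and that both `A[left..mid]` and `A[mid+1..right]` are already sorted; the variable names `i`, `j`, and `out` are illustrative stand-ins for CurrentL, CurrentR, and the output position.

```python
def merge(A, left, right, mid):
    """Merge sorted A[left..mid] and A[mid+1..right] (inclusive) in place."""
    L = A[left:mid + 1]        # copy of the left run
    R = A[mid + 1:right + 1]   # copy of the right run
    i = j = 0                  # play the roles of CurrentL and CurrentR
    out = left                 # next position to write in A

    # While both lists are nonempty, move the smaller front element to A.
    while i < len(L) and j < len(R):
        if L[i] <= R[j]:
            A[out] = L[i]
            i += 1
        else:
            A[out] = R[j]
            j += 1
        out += 1

    # Append the remainder of the non-empty list (one of these is empty).
    A[out:right + 1] = L[i:] + R[j:]


A = [1, 3, 4, 7, 2, 5, 6, 8]
merge(A, left=0, right=7, mid=3)
```

After the call, `A` holds the eight values in increasing order. Taking from `L` on ties (`<=`) keeps equal elements in their original relative order.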
Analysis of Merge
1. Correctness
2. Running time
3. Space
Merge: pseudocode
Merge (A, left, right, mid)
  L = A[left, mid]          (not a primitive computational step!)
  R = A[mid + 1, right]     (not a primitive computational step!)
  Maintain two pointers CurrentL, CurrentR initialized to point to
  the first element of L, R
  while both lists are nonempty do
    Let x, y be the elements pointed to by CurrentL, CurrentR
    Compare x, y and append the smaller to the output
    Advance the pointer in the list with the smaller of x, y
  end while
  Append the remainder of the non-empty list to the output.
Remark: the output is stored directly in A[left, right], thus the
subarray A[left, right] is sorted after Merge(A, left, right, mid).
Example of Mergesort
Input: 1, 7, 4, 3, 5, 8, 6, 2
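To make the example concrete, the sketch below runs Mergesort on this input and records every merged subarray in the order the merges happen. It assumes 0-based inclusive indices and a non-empty array; the `trace` parameter is a hypothetical addition for illustration, and the merge step is inlined in a compact form rather than copied from the slides.

```python
def mergesort(A, left, right, trace):
    """Sort A[left..right] (inclusive) in place, logging each merged subarray."""
    if right == left:
        return
    middle = left + (right - left) // 2
    mergesort(A, left, middle, trace)
    mergesort(A, middle + 1, right, trace)

    # Merge step: combine the two sorted halves back into A[left..right].
    L, R = A[left:middle + 1], A[middle + 1:right + 1]
    i = j = 0
    for out in range(left, right + 1):
        # Take from L if R is exhausted, or if L's front element is no larger.
        if j == len(R) or (i < len(L) and L[i] <= R[j]):
            A[out] = L[i]
            i += 1
        else:
            A[out] = R[j]
            j += 1
    trace.append(A[left:right + 1])


A = [1, 7, 4, 3, 5, 8, 6, 2]
trace = []
mergesort(A, 0, len(A) - 1, trace)
```

The trace shows the recursion finishing the left half (`[1, 7]`, `[3, 4]`, `[1, 3, 4, 7]`) before touching the right half, and the final merge producing the fully sorted array. For $n = 8$ there are $n - 1 = 7$ merges.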
Analysis of Mergesort
1. Correctness
2. Running time
3. Space
Mergesort: correctness
For simplicity, assume $n = 2^k$, integer $k \ge 0$. We will use
induction on $k$.
Remarks
Total work:
\[
\sum_{i=0}^{\log_b n} a^i \, c \left(\frac{n}{b^i}\right)^k
= c\, n^k \sum_{i=0}^{\log_b n} \left(\frac{a}{b^k}\right)^i
\]
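\[
T(n) =
\begin{cases}
O(n^{\log_b a}), & \text{if } a > b^k \\
O(n^k \log n), & \text{if } a = b^k \\
O(n^k), & \text{if } a < b^k
\end{cases}
\]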
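The three cases come down to comparing $a$ with $b^k$, i.e. whether the geometric series in the total work grows, stays flat, or shrinks. A minimal sketch, assuming the recurrence has the form $T(n) = aT(n/b) + cn^k$ with integers $a \ge 1$, $b > 1$, $k \ge 0$ (the form implied by the total-work sum); the helper name `master_bound` is hypothetical:

```python
import math


def master_bound(a, b, k):
    """Asymptotic bound for T(n) = a*T(n/b) + c*n**k, returned as a string."""
    if a > b**k:
        # Work grows geometrically with depth: the leaves dominate.
        return f"O(n^{math.log(a, b):.3g})"
    if a == b**k:
        # Every level does the same work: log n levels of n^k each.
        return f"O(n^{k} log n)"
    # Work shrinks geometrically with depth: the root dominates.
    return f"O(n^{k})"


mergesort_bound = master_bound(a=2, b=2, k=1)
```

For Mergesort, two subproblems of half the size plus linear merge work give $a = b^k = 2$, so `mergesort_bound` falls in the middle case.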
What about...
1. $T(n) = 2T(n - 1) + 1$, $T(1) = 2$