CS3000: Algorithms & Data
Jonathan Ullman
Unit 2: Divide & Conquer and Asymptotic Analysis
a. Introduction to sorting
Sorting
Given a list of 𝑛 numbers 𝐴[1], …, 𝐴[𝑛], put them in ascending order.
Input:  11 3 42 28 17 8 2 15
Output: 2 3 8 11 15 17 28 42
Applications of Sorting
• Obvious applications
    • List files in a directory
    • Organize a video library
    • Use in spreadsheets
• Problems made easier
    • Search
    • Selection
    • Finding closest pair
    • Deduplication
    • …
• Non-obvious applications
    • Data compression
    • Computer graphics
    • Job scheduling
    • Bioinformatics
    • Minimum spanning trees
    • Recommendation systems
    • 𝑛-Body simulations
    • Supply chain management
    • …
Applications of Sorting
UPS Worldport: 90 football fields, 416k packages/hr
A Simple Algorithm: Selection Sort
• Find the minimum:
    11 3 42 28 17 8 2 15
• Swap it into place, repeat on the rest:
    2 3 42 28 17 8 11 15
    2 3 42 28 17 8 11 15
• Repeat 𝑛 − 1 times:
    2 3 8 11 15 17 28 42
A Simple Algorithm: Selection Sort
2 3 8 28 42 17 11 15
Iteration 𝑗: Find the minimum of 𝐴[𝑗:𝑛] and swap it with 𝐴[𝑗]
Pseudocode:
SelectionSort(A[1:n]):
    For j = 1,…,n-1:
        min_pos = j
        For k = j+1,…,n:
            If (A[k] < A[min_pos]):
                min_pos = k
        Swap A[min_pos] and A[j]
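The pseudocode can be translated almost line for line into Python. This is a minimal sketch, not a reference implementation; the function name selection_sort is an assumption, and indices are shifted to Python's 0-based convention.

```python
def selection_sort(a):
    """Sort the list `a` in place in ascending order and return it."""
    n = len(a)
    # The pseudocode runs j = 1..n-1 (1-indexed); here j = 0..n-2.
    for j in range(n - 1):
        min_pos = j
        # Scan a[j+1:] for an element smaller than the current minimum.
        for k in range(j + 1, n):
            if a[k] < a[min_pos]:
                min_pos = k
        # Move the minimum of a[j:] into position j.
        a[j], a[min_pos] = a[min_pos], a[j]
    return a


print(selection_sort([11, 3, 42, 28, 17, 8, 2, 15]))
# prints [2, 3, 8, 11, 15, 17, 28, 42]
```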
Analysis:
• Proof of correctness
• Analysis of running time
Selection Sort: Proof of Correctness
SelectionSort(A[1:n]):
    For j = 1,…,n-1:
        min_pos = j
        For k = j+1,…,n:
            If (A[k] < A[min_pos]):
                min_pos = k
        Swap A[min_pos] and A[j]
Correctness: At the start of iteration 𝑗,
• 𝐴[1:𝑗−1] contains the 𝑗 − 1 smallest elements of 𝐴 in order
• 𝐴[𝑗:𝑛] contains the remaining elements of 𝐴
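One way to make the invariant concrete is to assert it at the start of every iteration. The sketch below is illustrative only: the name selection_sort_checked is an assumption, the built-in sorted() serves purely as a reference answer for the checks, and indices are 0-based, so the prefix a[:j] plays the role of 𝐴[1:𝑗−1].

```python
def selection_sort_checked(a):
    """Selection sort that asserts the loop invariant before each iteration."""
    n = len(a)
    expected = sorted(a)  # reference answer, used only for checking
    for j in range(n - 1):
        # Invariant, part 1: a[:j] holds the j smallest elements, in order.
        assert a[:j] == expected[:j]
        # Invariant, part 2: a[j:] holds the remaining elements, in any order.
        assert sorted(a[j:]) == expected[j:]
        # Position of the first minimum of a[j:], as in the inner loop.
        min_pos = min(range(j, n), key=lambda k: a[k])
        a[j], a[min_pos] = a[min_pos], a[j]
    return a
```

If the invariant still holds after the last iteration, the first 𝑛 − 1 positions hold the 𝑛 − 1 smallest elements in order, so the one remaining element is the largest and the whole list is sorted.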
Selection Sort: Running Time
SelectionSort(A[1:n]):
    For j = 1,…,n-1:
        min_pos = j
        For k = j+1,…,n:
            If (A[k] < A[min_pos]):
                min_pos = k
        Swap A[min_pos] and A[j]
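A rough way to see the running time is to count how often the comparison A[k] < A[min_pos] executes. In iteration 𝑗 the inner loop runs 𝑛 − 𝑗 times regardless of the input values, so the total is (𝑛 − 1) + (𝑛 − 2) + … + 1 = 𝑛(𝑛 − 1)/2, which grows quadratically in 𝑛. The sketch below (count_comparisons is a hypothetical helper name) replays the two loops and checks that formula.

```python
def count_comparisons(n):
    """Count executions of the comparison A[k] < A[min_pos] for an input of length n."""
    count = 0
    for j in range(1, n):                # j = 1,…,n-1
        for k in range(j + 1, n + 1):    # k = j+1,…,n
            count += 1
    return count


# The count matches n(n-1)/2 for every length tried.
for n in (1, 2, 8, 100):
    assert count_comparisons(n) == n * (n - 1) // 2
```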
b. Divide & conquer: mergesort
Divide and Conquer Algorithms
"διαίρει και βασίλευε!" ("Divide and rule!")
- Philip II of Macedon
• Split your problem into smaller subproblems
• Recursively solve each subproblem
• Combine the solutions to the subproblems
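The three steps can be sketched on a deliberately simple problem, finding the largest element of a list. This example is not from the slides; dc_max is a hypothetical name chosen for illustration, and a plain loop would solve the same problem just as well.

```python
def dc_max(a):
    """Return the largest element of the non-empty list `a` by divide and conquer."""
    if len(a) == 1:                  # base case: nothing left to split
        return a[0]
    mid = len(a) // 2                # split into two smaller subproblems
    left_max = dc_max(a[:mid])       # recursively solve each subproblem
    right_max = dc_max(a[mid:])
    # Combine the two sub-solutions into a solution for the whole list.
    return left_max if left_max >= right_max else right_max


print(dc_max([11, 3, 42, 28, 17, 8, 2, 15]))
# prints 42
```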
Divide and Conquer Algorithms
• Examples:
    • Mergesort: sorting a list
    • Binary Search: searching in a sorted list
    • Finding the median of a list
    • Fast Fourier Transform
    • Integer Multiplication
    • Finding Closest Pair
• Key Tools:
    • Algorithm Design: recursion
    • Correctness: proof by induction
    • Running Time Analysis: recurrences
    • Asymptotic Analysis
Mergesort
Split:              11 3 42 28 17 8 2 15
                    11 3 42 28 | 17 8 2 15
Recursively sort:   3 11 28 42 | 2 8 15 17
Merge:              2 3 8 11 15 17 28 42
Divide and Conquer: Mergesort
• Key Idea: If 𝑳, 𝑹 are sorted lists of length 𝑛, then we can merge them into a sorted list 𝑨 of length 2𝑛 in time 𝐶𝑛, for some constant 𝐶
• Merging two sorted lists is faster than sorting from scratch
3 11 28 42 𝑳
2 8 15 17 𝑹
𝑨
Pseudocode for Merge
Merge(L,R):
    Let n ← len(L) + len(R)
    Let A be array of length n
    j ← 1, k ← 1
    For i = 1,…,n:
        If (j > len(L)):
            A[i] ← R[k], k ← k+1
        ElseIf (k > len(R)):
            A[i] ← L[j], j ← j+1
        ElseIf (L[j] <= R[k]):
            A[i] ← L[j], j ← j+1
        Else:
            A[i] ← R[k], k ← k+1
    Return A
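As a sketch, the Merge pseudocode carries over to Python nearly unchanged once the indices are made 0-based. The names merge, left and right are assumptions; both inputs are assumed to be sorted already.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    n = len(left) + len(right)
    a = [None] * n
    j = k = 0                       # next unread positions in left and right
    for i in range(n):
        if j >= len(left):          # left is exhausted: take from right
            a[i] = right[k]
            k += 1
        elif k >= len(right):       # right is exhausted: take from left
            a[i] = left[j]
            j += 1
        elif left[j] <= right[k]:   # on ties, take from left first
            a[i] = left[j]
            j += 1
        else:
            a[i] = right[k]
            k += 1
    return a


print(merge([3, 11, 28, 42], [2, 8, 15, 17]))
# prints [2, 3, 8, 11, 15, 17, 28, 42]
```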
Correctness of Merge
Why is Merge correct? How can we make this argument convincing and rigorous?
Merge(L,R):
    Let n ← len(L) + len(R)
    Let A be array of length n
    j ← 1, k ← 1
    For i = 1,…,n:
        If (j > len(L)):
            A[i] ← R[k], k ← k+1
        ElseIf (k > len(R)):
            A[i] ← L[j], j ← j+1
        ElseIf (L[j] <= R[k]):
            A[i] ← L[j], j ← j+1
        Else:
            A[i] ← R[k], k ← k+1
    Return A
At the start of iteration 𝑖:
1. We have 𝑖 = 𝑗 + 𝑘 − 1, 𝑗 ≤ len(𝐿) + 1, 𝑘 ≤ len(𝑅) + 1
2. Subarray 𝐴[1:𝑖−1] contains 𝐿[1:𝑗−1] and 𝑅[1:𝑘−1] in sorted order
3. Every element in 𝐿[𝑗:] and 𝑅[𝑘:] is at least as large as every element in 𝐴[1:𝑖−1]
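The three invariants can be checked mechanically by asserting them at the top of each loop iteration. The following is an illustrative sketch rather than a proof: merge_checked is a hypothetical name, indices are 0-based (so invariant 1 becomes i = j + k), and both inputs are assumed to be sorted.

```python
def merge_checked(left, right):
    """Merge two sorted lists, asserting the loop invariants at each iteration."""
    n = len(left) + len(right)
    a = [None] * n
    j = k = 0
    for i in range(n):
        # 1. Index bookkeeping (0-based form of i = j + k - 1).
        assert i == j + k and j <= len(left) and k <= len(right)
        # 2. The output so far is exactly the consumed prefixes, in sorted order.
        assert a[:i] == sorted(left[:j] + right[:k])
        # 3. Nothing still waiting is smaller than anything already placed.
        assert all(x >= y for x in left[j:] + right[k:] for y in a[:i])
        if j >= len(left):
            a[i] = right[k]
            k += 1
        elif k >= len(right):
            a[i] = left[j]
            j += 1
        elif left[j] <= right[k]:
            a[i] = left[j]
            j += 1
        else:
            a[i] = right[k]
            k += 1
    return a
```

The checks themselves are slow (they re-sort and compare all pairs), so this version is only meant for experimenting with small inputs.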
Running Time of Merge
Merge(L,R):
    Let n ← len(L) + len(R)
    Let A be array of length n
    j ← 1, k ← 1
    For i = 1,…,n:
        If (j > len(L)):
            A[i] ← R[k], k ← k+1
        ElseIf (k > len(R)):
            A[i] ← L[j], j ← j+1
        ElseIf (L[j] <= R[k]):
            A[i] ← L[j], j ← j+1
        Else:
            A[i] ← R[k], k ← k+1
    Return A
Key Facts about Merge
• Key Idea: If 𝑳, 𝑹 are sorted lists of length 𝑛, then we can merge them into a sorted list 𝑨 of length 2𝑛 in time 𝐶𝑛, for some constant 𝐶
• Merging two sorted lists is faster than sorting from scratch
3 11 28 42 𝑳
2 8 15 17 𝑹
2 3 8 11 15 17 28 42 𝑨
Divide and Conquer: Mergesort
MergeSort(A):
    Let n ← len(A)
    If (n ≤ 1): Return A              // Base Case
    Let m ← ⌈n/2⌉                     // Split
    Let L ← A[1:m], R ← A[m+1:n]
    Let L ← MergeSort(L)              // Recurse
    Let R ← MergeSort(R)
    Let A ← Merge(L,R)                // Merge
    Return A
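Put together, MergeSort might look like the following in Python. This is a sketch under a few assumptions: the names merge_sort and merge are illustrative, the base case also covers the empty list, and the merge helper is a compact while-loop variant rather than a literal copy of the Merge pseudocode.

```python
def merge(left, right):
    """Merge two sorted lists (compact while-loop variant)."""
    a, j, k = [], 0, 0
    while j < len(left) and k < len(right):
        if left[j] <= right[k]:
            a.append(left[j])
            j += 1
        else:
            a.append(right[k])
            k += 1
    # At most one of these two slices is non-empty.
    a.extend(left[j:])
    a.extend(right[k:])
    return a


def merge_sort(a):
    """Return a new sorted list with the elements of `a`."""
    if len(a) <= 1:                  # base case
        return a
    m = (len(a) + 1) // 2            # split at ceil(len(a) / 2)
    left = merge_sort(a[:m])         # recurse on each half
    right = merge_sort(a[m:])
    return merge(left, right)        # merge the sorted halves


print(merge_sort([11, 3, 42, 28, 17, 8, 2, 15]))
# prints [2, 3, 8, 11, 15, 17, 28, 42]
```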
Running Time of Mergesort
MergeSort(A):
    Let n ← len(A)
    If (n ≤ 1): Return A
    Let m ← ⌈n/2⌉
    Let L ← A[1:m], R ← A[m+1:n]
    Let L ← MergeSort(L)
    Let R ← MergeSort(R)
    Let A ← Merge(L,R)
    Return A
Recursion Trees
𝑇(𝑛) = 2 ⋅ 𝑇(𝑛⁄2) + 𝐶𝑛
𝑇(1) = 𝐶
Proof by Induction
𝑇(𝑛) = 2 ⋅ 𝑇(𝑛⁄2) + 𝐶𝑛
𝑇(1) = 𝐶
• Claim: 𝑇(𝑛) = 𝐶𝑛 log₂(2𝑛)
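Reading the claim as 𝑇(𝑛) = 𝐶𝑛 log₂(2𝑛), which is the form consistent with 𝑇(1) = 𝐶, the induction step works out as follows when 𝑛 is a power of 2: substituting the claim for 𝑛⁄2 gives 2 ⋅ 𝐶(𝑛⁄2) log₂(𝑛) + 𝐶𝑛 = 𝐶𝑛(log₂ 𝑛 + 1) = 𝐶𝑛 log₂(2𝑛). The sketch below checks this numerically by evaluating the recurrence directly; the function name T mirrors the slide, while the default 𝐶 = 1 and the restriction to powers of 2 are assumptions made to keep the arithmetic exact.

```python
def T(n, C=1):
    """Evaluate T(n) = 2*T(n/2) + C*n with T(1) = C, for n a power of 2."""
    if n == 1:
        return C
    return 2 * T(n // 2, C) + C * n


for e in range(11):
    n = 2 ** e
    # For n = 2**e we have log2(2n) = e + 1, so the claim reads T(n) = C*n*(e + 1).
    assert T(n) == n * (e + 1)
    assert T(n, C=3) == 3 * n * (e + 1)
```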