Notes 2
[Figure: binary tree with Level 1, Level 2, and Level 3 marked.]
The depth of a binary tree is the maximum of the levels of all its
leaves.
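The definition above can be sketched in a few lines. This is only an illustration: the `Node` class and the `depth` function are hypothetical names, and the root is assumed to sit at level 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary-tree node, used only for this illustration."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def depth(node: Optional[Node], level: int = 1) -> int:
    """Return the maximum level of any leaf below `node` (root assumed at level 1)."""
    if node is None:
        return 0  # an empty subtree contributes no leaves
    if node.left is None and node.right is None:
        return level  # a leaf contributes its own level
    return max(depth(node.left, level + 1), depth(node.right, level + 1))


# A small tree: A at level 1, B and C at level 2, D at level 3.
tree = Node("A", Node("B", Node("D")), Node("C"))
print(depth(tree))
```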
[Figure: binary tree with node rows B C / D E F / G H.]
[Figure: binary search tree with node rows 36 65 / 25 52 79 / 9 32.]
[Figure: positions of first, last, and index; case 2: last = first + 1; case 3: last = first.]
[Figure: tree of indices with rows 2 9 / 0 4 7 11 / 1 3 5 8 10 12.]
Sorting
Comparison-based sorting
- Basic operation: compare two items.
- Abstract model.
- Advantage: doesn’t use specific properties of the data items, so the same algorithm can be used for sorting integers, strings, etc.
- Disadvantage: under certain circumstances, specific properties of the data items can speed up the sorting process.
- Measure of time: number of comparisons.
  - Consistent with the philosophy of counting basic operations, discussed earlier.
  - Could be misleading if other operations dominate, for example when there are more assignment statements than comparisons (a lot of data movement).
- Comparison-based sorting requires Ω(n log n) comparisons in the worst case. (We will prove this.)
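To make "number of comparisons" concrete, here is a minimal sketch that counts comparisons while sorting. It uses Python's built-in `sorted` (not one of the algorithms in these notes) purely as a comparison-based sort; `sort_counting_comparisons` is a hypothetical helper name.

```python
from functools import cmp_to_key


def sort_counting_comparisons(items):
    """Sort a copy of `items`; return (sorted_list, number_of_comparisons)."""
    count = 0

    def compare(a, b):
        nonlocal count
        count += 1  # every call is one comparison of two items
        return (a > b) - (a < b)  # -1, 0, or 1

    return sorted(items, key=cmp_to_key(compare)), count


# The same code sorts integers and strings: it only ever compares two items.
numbers, number_comparisons = sort_counting_comparisons([36, 65, 25, 52, 79, 9, 32])
words, word_comparisons = sort_counting_comparisons(["pear", "fig", "apple"])
print(numbers, number_comparisons)
print(words, word_comparisons)
```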
[Figure: plot of two growth curves against n (200 to 1000), vertical axis up to 600000; one curve is labeled y = 10 n lg n.]
Some terminology
The sequence 18 29 12 15 32 10 has 9 inversions:
(18,12), (18,15), (18,10), (29,12), (29,15), (29,10), (12,10), (15,10), (32,10)
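The count can be checked by brute force. The sketch below assumes the usual definition of an inversion: a pair of positions i < j whose values are out of order (a[i] > a[j]). The function name `inversions` is hypothetical.

```python
def inversions(a):
    """Return all pairs (a[i], a[j]) with i < j and a[i] > a[j]."""
    return [(a[i], a[j])
            for i in range(len(a))
            for j in range(i + 1, len(a))
            if a[i] > a[j]]


pairs = inversions([18, 29, 12, 15, 32, 10])
print(len(pairs), pairs)
```

This check examines every pair, so it takes Θ(n²) time.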
Insertion sort
- Work from left to right across the array.
- Insert each item in the correct position with respect to the (sorted) elements to its left.
[Figure: array states: at the start, (Unsorted); in the middle, (Sorted) x (Unsorted); at the end, (Sorted).]
23 19 42 17 85 38
19 23 42 17 85 38
19 23 42 17 85 38
17 19 23 42 85 38
17 19 23 42 85 38
17 19 23 38 42 85
- Worst-case Time:
\[
\sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2} = \Theta(n^2).
\]
- Average-case Time: Approximately $n^2/4 = \Theta(n^2)$
- Insertion Sort is efficient if the input is “almost sorted”: the number of comparisons is at most n − 1 plus the number of inversions.
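A minimal sketch of Insertion Sort that also counts key comparisons, so the worst case and the almost-sorted case can be compared directly. The function name and the choice to return the comparison count are assumptions made for illustration.

```python
def insertion_sort(a):
    """Sort list `a` in place; return the number of key comparisons made."""
    comparisons = 0
    for k in range(1, len(a)):
        x = a[k]                     # item to insert into the sorted prefix a[0..k-1]
        j = k - 1
        while j >= 0:
            comparisons += 1         # compare x with a[j]
            if a[j] <= x:
                break                # x belongs just after a[j]
            a[j + 1] = a[j]          # shift the larger element one place right
            j -= 1
        a[j + 1] = x
    return comparisons


data = [23, 19, 42, 17, 85, 38]
data_comparisons = insertion_sort(data)
print(data, data_comparisons)
```

On a reversed list of n items the count is n(n − 1)/2; on an already sorted list it is n − 1.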
Selection Sort
- Two variants:
  1. Repeatedly (for i from 0 to n − 1) find the minimum value, output it, delete it.
     - Values are output in sorted order.
  2. Repeatedly (for i from n − 1 down to 1):
     - Find the maximum of A[0], A[1], ..., A[i].
     - Swap this value with A[i] (if necessary).
- Both variants run in O(n²) time and are in place.
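The second variant can be sketched as follows. The function name `selection_sort` is hypothetical; the loop bounds follow the description above.

```python
def selection_sort(a):
    """Variant 2: for i from n-1 down to 1, move the maximum of a[0..i] to a[i]."""
    for i in range(len(a) - 1, 0, -1):
        max_index = 0
        for j in range(1, i + 1):    # find the maximum of a[0..i]
            if a[j] > a[max_index]:
                max_index = j
        if max_index != i:           # swap only if necessary
            a[i], a[max_index] = a[max_index], a[i]
    return a


print(selection_sort([23, 19, 42, 17, 85, 38]))
```

The inner loop always scans all of a[0..i], so the number of comparisons is n(n − 1)/2 regardless of the input order.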
Quicksort
Basic idea
- Classify keys as “small” keys or “large” keys. All small keys are less than all large keys.
- Rearrange keys so small keys precede all large keys.
- Recursively sort “small” keys, recursively sort “large” keys.
[Figure: keys A[first..last], with split value x first and the remaining keys unknown (?); after rearranging: keys < x, then x, then keys ≥ x.]
procedure quickSort(A,first,last);
    int splitpoint;
begin {quickSort}
    if first < last then
        splitpoint = split(A,first,last);
        quickSort(A,first,splitpoint-1);
        quickSort(A,splitpoint+1,last);
end {quickSort};
Loop invariants:
- A[first + 1..splitpoint] contains keys < x.
- A[splitpoint + 1..k − 1] contains keys ≥ x.
- A[k..last] contains unprocessed keys.
CompSci 161, FQ 2017, © M. B. Dillencourt, University of California, Irvine
Array layout during split:
  At start:   [ x | ? ]              marker: splitpoint
  In middle:  [ x | <x | ≥x | ? ]    markers: first, splitpoint, k, last
  At end:     [ x | <x | ≥x ]        markers: first, splitpoint, last
27 83 23 36 15 79 22 18
s k
27 23 83 36 15 79 22 18
s k
27 23 83 36 15 79 22 18
s k
27 23 15 36 83 79 22 18
s k
27 23 15 36 83 79 22 18
s k
27 23 15 22 83 79 36 18
s k
27 23 15 22 18 79 36 83
s
18 23 15 22 27 79 36 83
s
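The notes give quickSort but not the body of split. The sketch below is one way to write split that is consistent with the loop invariants and with the trace above. It is an assumption, not necessarily the version used in the course. The names `split` and `quick_sort` mirror the pseudocode.

```python
def split(a, first, last):
    """Partition a[first..last] around x = a[first]; return x's final index."""
    x = a[first]
    splitpoint = first
    for k in range(first + 1, last + 1):
        # Invariant: a[first+1..splitpoint] < x and a[splitpoint+1..k-1] >= x.
        if a[k] < x:
            splitpoint += 1
            a[splitpoint], a[k] = a[k], a[splitpoint]
    # Put x between the small keys and the large keys.
    a[first], a[splitpoint] = a[splitpoint], a[first]
    return splitpoint


def quick_sort(a, first=0, last=None):
    """Sort a[first..last] in place (the whole list by default)."""
    if last is None:
        last = len(a) - 1
    if first < last:
        splitpoint = split(a, first, last)
        quick_sort(a, first, splitpoint - 1)
        quick_sort(a, splitpoint + 1, last)
    return a


keys = [27, 83, 23, 36, 15, 79, 22, 18]
position = split(keys, 0, len(keys) - 1)
print(position, keys)
print(quick_sort(keys))
```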
Analysis of Quicksort
We can visualize the lists sorted by quicksort as a binary tree.
- The root is the top-level list (of all elements to be sorted).
- Identify each list with its split value.
- The children of a node are the two sublists to be sorted.
Lists at each level of the tree:
[27 83 23 36 15 79 22 18]
[18 23 15 22]  [79 36 83]
[15]  [23 22]  [36]  [83]
[22]
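Worst case, for example an already sorted list. The lists to be sorted are:
[1 2 3 ... n − 1 n]
[2 3 ... n − 1 n]
[3 ... n − 1 n]
...
[n − 1 n]
[n]
n(n − 1)/2 comparisons are required. So the worst-case time for Quicksort is Θ(n²). But what about the average case . . . ?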
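The worst-case count can be checked empirically. The sketch below assumes the split value is always the first key of the list and that each key in the list is compared with it once. `count_comparisons` is a hypothetical helper that uses an explicit stack, so long sorted inputs do not hit Python's recursion limit.

```python
def count_comparisons(keys):
    """Count key comparisons made by first-key-pivot quicksort on a copy of `keys`."""
    a = list(keys)
    comparisons = 0
    stack = [(0, len(a) - 1)]        # sublists still to be sorted
    while stack:
        first, last = stack.pop()
        if first >= last:
            continue
        x, splitpoint = a[first], first
        for k in range(first + 1, last + 1):
            comparisons += 1         # a[k] is compared with the split value x
            if a[k] < x:
                splitpoint += 1
                a[splitpoint], a[k] = a[k], a[splitpoint]
        a[first], a[splitpoint] = a[splitpoint], a[first]
        stack.append((first, splitpoint - 1))
        stack.append((splitpoint + 1, last))
    return comparisons


n = 100
print(count_comparisons(range(1, n + 1)), n * (n - 1) // 2)
```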
Our approach:
1. Use the binary tree of sorted lists
2. Number the elements in sorted order
3. Calculate the probability that two elements get compared
4. Use this to compute the expected number of comparisons
performed by Quicksort.
Lists at each level of the tree:
[27 83 23 36 15 79 22 18]
[18 23 15 22]  [79 36 83]
[15]  [23 22]  [36]  [83]
[22]
Sorted order: 15 18 22 23 27 36 79 83
Key Fact
- Examples where both statements are false: (15, 23), (18, 83)
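The example pairs can be checked by recording which keys actually get compared on the example list. This sketch assumes the first key of each list is the split value and that keys are only ever compared with the current split value. `compared_pairs` is a hypothetical helper.

```python
def compared_pairs(keys):
    """Run first-key-pivot quicksort on a copy of `keys`;
    return the set of (smaller, larger) key pairs that were compared."""
    a = list(keys)
    pairs = set()

    def sort(first, last):
        if first >= last:
            return
        x, splitpoint = a[first], first
        for k in range(first + 1, last + 1):
            pairs.add((min(x, a[k]), max(x, a[k])))  # x is compared with a[k]
            if a[k] < x:
                splitpoint += 1
                a[splitpoint], a[k] = a[k], a[splitpoint]
        a[first], a[splitpoint] = a[splitpoint], a[first]
        sort(first, splitpoint - 1)
        sort(splitpoint + 1, last)

    sort(0, len(a) - 1)
    return pairs


pairs = compared_pairs([27, 83, 23, 36, 15, 79, 22, 18])
print((15, 23) in pairs, (18, 83) in pairs)  # neither pair is ever compared
```

In the tree, 15 and 23 land in different sublists once 18 is used as a split value, and 18 and 83 are separated by 27, so neither pair meets again.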
Expected number of comparisons:
\begin{align*}
\sum_{i=1}^{n} \sum_{j=i+1}^{n} \frac{2}{j-i+1}
  &= \sum_{i=1}^{n} \sum_{k=2}^{n-i+1} \frac{2}{k} && (k = j - i + 1) \\
  &< \sum_{i=1}^{n} \sum_{k=1}^{n} \frac{2}{k} \\
  &= 2 \sum_{i=1}^{n} \sum_{k=1}^{n} \frac{1}{k} \\
  &= 2 \sum_{i=1}^{n} H_n = 2 n H_n \in O(n \lg n).
\end{align*}
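As a numerical sanity check on the derivation, the sketch below evaluates the double sum of 2/(j − i + 1) over all pairs i < j and compares it with the bound 2nH_n. The function names are hypothetical.

```python
def pair_sum(n):
    """Sum of 2/(j - i + 1) over all pairs 1 <= i < j <= n."""
    return sum(2 / (j - i + 1)
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1 / k for k in range(1, n + 1))


for n in (2, 8, 100):
    print(n, pair_sum(n), 2 * n * harmonic(n))
```

For every n the first value stays below the second, as the strict inequality in the derivation requires.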
Mergesort
- Split the array into two subarrays of (nearly) equal size.
- Sort both subarrays (recursively).
- Merge the two sorted subarrays.
[Figure: array with positions first, mid, and last marked.]
procedure Mergesort(A,first,last);
begin {Mergesort}
    if first < last then
        mid = ⌊(first + last)/2⌋;
        Mergesort(A,first,mid);
        Mergesort(A,mid+1,last);
        merge(A,first,mid,mid + 1,last);
end {Mergesort};
Merging the two sorted subarrays into temp:
19 26 42 71 | 14 24 31 39
temp: 14 19 24 26 31 39 42 71
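The pseudocode above calls merge without showing it. A minimal sketch of both routines follows. The body of `merge` is an assumption (a standard two-pointer merge through a temporary list), with variable names borrowed from the figures in these notes.

```python
def merge(a, first1, last1, first2, last2):
    """Merge adjacent sorted runs a[first1..last1] and a[first2..last2] via temp."""
    temp = []
    index1, index2 = first1, first2
    while index1 <= last1 and index2 <= last2:
        if a[index1] <= a[index2]:
            temp.append(a[index1])
            index1 += 1
        else:
            temp.append(a[index2])
            index2 += 1
    temp.extend(a[index1:last1 + 1])   # leftovers from the first run, if any
    temp.extend(a[index2:last2 + 1])   # leftovers from the second run, if any
    a[first1:last2 + 1] = temp         # copy the merged run back


def merge_sort(a, first=0, last=None):
    """Sort a[first..last] in place (the whole list by default)."""
    if last is None:
        last = len(a) - 1
    if first < last:
        mid = (first + last) // 2
        merge_sort(a, first, mid)
        merge_sort(a, mid + 1, last)
        merge(a, first, mid, mid + 1, last)
    return a


halves = [19, 26, 42, 71, 14, 24, 31, 39]
merge(halves, 0, 3, 4, 7)
print(halves)
```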
Analysis of Mergesort
Inversion Counting
[Figure: merge step with temp and tempindex; moving a key from the second subarray into temp removes last1 − index1 + 1 inversions.]
Example
19 26 42 71 14 24 31 39
14 19 24 26 31
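A sketch of inversion counting layered on Mergesort. It assumes the usual counting rule: whenever a key from the second run is moved to temp ahead of the keys still waiting in the first run, it removes one inversion per waiting key (last1 − index1 + 1 of them). The function works on list copies rather than index ranges to keep the sketch short, and `count_inversions` is a hypothetical name.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of inversions in a) using Mergesort."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, left_count = count_inversions(a[:mid])
    right, right_count = count_inversions(a[mid:])
    count = left_count + right_count
    temp = []
    index1 = index2 = 0
    while index1 < len(left) and index2 < len(right):
        if left[index1] <= right[index2]:
            temp.append(left[index1])
            index1 += 1
        else:
            # right[index2] jumps ahead of every key still waiting in the
            # first run, removing that many inversions at once.
            count += len(left) - index1
            temp.append(right[index2])
            index2 += 1
    temp.extend(left[index1:])
    temp.extend(right[index2:])
    return temp, count


print(count_inversions([19, 26, 42, 71, 14, 24, 31, 39]))
print(count_inversions([18, 29, 12, 15, 32, 10]))
```

In the example, 14 removes 4 inversions, 24 removes 3, and 31 and 39 remove 2 each, for 11 in total.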
Listing inversions
We have just seen that we can count inversions without increasing
the asymptotic running time of Mergesort. Suppose we want to list
inversions. When we remove inversions, we list all inversions
removed:
[Figure: merge step showing first1, index1, last1, first2, index2, last2 in the array, and tempindex in temp.]
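Listing follows the same pattern as counting: each time a key from the second run is moved to temp, every key still waiting in the first run forms an inversion with it. The sketch below is an illustration under that assumption. `list_inversions` is a hypothetical name, and it works on list copies rather than index ranges.

```python
def list_inversions(a):
    """Return (sorted copy of a, list of inversions as (larger, smaller) pairs)."""
    if len(a) <= 1:
        return list(a), []
    mid = len(a) // 2
    left, found_left = list_inversions(a[:mid])
    right, found_right = list_inversions(a[mid:])
    found = found_left + found_right
    temp = []
    index1 = index2 = 0
    while index1 < len(left) and index2 < len(right):
        if left[index1] <= right[index2]:
            temp.append(left[index1])
            index1 += 1
        else:
            # Every key still waiting in the first run forms an inversion
            # with right[index2]; list them all as they are removed.
            found.extend((big, right[index2]) for big in left[index1:])
            temp.append(right[index2])
            index2 += 1
    temp.extend(left[index1:])
    temp.extend(right[index2:])
    return temp, found


ordered, found = list_inversions([18, 29, 12, 15, 32, 10])
print(len(found), sorted(found))
```

Because each inversion is written out individually, the running time grows to O(n lg n) plus the number of inversions listed, which can be as large as n(n − 1)/2.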