
Notes 2

The document provides an overview of data structures and binary trees. It discusses basic data structures like arrays, linked lists, hash tables and trees. It also covers abstract data types like vectors, lists, sequences and dictionaries. The document then reviews binary trees and different traversal methods for binary trees like preorder, inorder and postorder traversal. It concludes with facts about binary trees like the maximum number of nodes at each level and relationships between the number of nodes/leaves and depth.

Uploaded by

Bilal Hameed


Start with a quick review of data structures

Basic data structures (ICS/CSE 23; [GT] Chapter 2)

- Arrays
- Linked lists
- Hash tables
- Trees

Abstract data types

- Vectors: support access by rank
- Lists: support access by current position
- Sequences: support access by both rank and current position

CompSci 161, FQ 2017. © M. B. Dillencourt, University of California, Irvine

Binary Trees: a quick review

We will use binary trees both as a data structure and as a tool for analyzing algorithms.

[Figure: a binary tree with its levels labeled Level 0 (root), Level 1, Level 2, Level 3.]

The depth of a binary tree is the maximum of the levels of all its leaves.

Traversing binary trees

        A
       / \
      B   C
     /   / \
    D   E   F
   / \
  G   H

- Preorder: root, left subtree (in preorder), right subtree (in preorder): ABDGHCEF
- Inorder: left subtree (in inorder), root, right subtree (in inorder): GDHBAECF
- Postorder: left subtree (in postorder), right subtree (in postorder), root: GHDBEFCA
- Breadth-first order (level order): level 0 left-to-right, then level 1 left-to-right, ...: ABCDEFGH

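The four traversal orders can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the `Node` class and the function names are hypothetical, and the tree built at the end is the one whose traversals are listed on the slide (root A, where B has only a left child D).

```python
from collections import deque

class Node:
    """Binary tree node (hypothetical helper for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def preorder(t):
    # root, then left subtree, then right subtree
    return [] if t is None else [t.key] + preorder(t.left) + preorder(t.right)

def inorder(t):
    # left subtree, then root, then right subtree
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

def postorder(t):
    # left subtree, then right subtree, then root
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.key]

def level_order(t):
    # breadth-first: visit nodes level by level, left to right
    out, queue = [], deque([t] if t is not None else [])
    while queue:
        node = queue.popleft()
        out.append(node.key)
        queue.extend(c for c in (node.left, node.right) if c is not None)
    return out

tree = Node("A",
            Node("B", Node("D", Node("G"), Node("H"))),
            Node("C", Node("E"), Node("F")))
print("".join(preorder(tree)))     # ABDGHCEF
print("".join(inorder(tree)))      # GDHBAECF
print("".join(postorder(tree)))    # GHDBEFCA
print("".join(level_order(tree)))  # ABCDEFGH
```
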
Facts about binary trees

1. There are at most $2^k$ nodes at level $k$.
2. A binary tree with depth $d$ has:
   - At most $2^d$ leaves.
   - At most $2^{d+1} - 1$ nodes.
3. A binary tree with $n$ leaves has depth $\geq \lceil \lg n \rceil$.
4. A binary tree with $n$ nodes has depth $\geq \lfloor \lg n \rfloor$.

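Facts 3 and 4 follow from Fact 2. The short derivation below is added as a sketch; it uses only the bounds stated on the slide and the fact that the depth $d$ is an integer.

```latex
% Fact 3: n leaves, depth d
n \le 2^{d} \;\Longrightarrow\; \lg n \le d \;\Longrightarrow\; \lceil \lg n \rceil \le d

% Fact 4: n nodes, depth d
n \le 2^{d+1} - 1 < 2^{d+1} \;\Longrightarrow\; \lg n < d + 1 \;\Longrightarrow\; \lfloor \lg n \rfloor \le d
```
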
Binary search trees

[Figure: example binary search tree. Node labels recovered from the figure, by level below the root: 36, 65; then 25, 52, 79; then 9, 32.]

- Function as ordered dictionaries
- find, insert, and remove can all be done in O(h) time (h = tree height)
- AVL trees and Red-Black trees: h = O(log n), so find, insert, and remove can all be done in O(log n) time.
- listAllItems in O(n) time, where n = number of items in tree.

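A minimal sketch of an unbalanced binary search tree in Python may help make the O(h) and O(n) claims concrete. The names `BSTNode`, `insert`, `find`, and `list_all_items` are hypothetical, `remove` is omitted for brevity, and the root key 44 is an assumption (the root label of the figure is not available); the other keys are the ones from the figure.

```python
class BSTNode:
    """Node of an unbalanced binary search tree (hypothetical helper)."""
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Insert key and return the (possibly new) root; follows one path, O(h)."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # an equal key is already present and is ignored

def find(root, key):
    """Return True if key is stored in the tree; follows one path, O(h)."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

def list_all_items(root, out=None):
    """Inorder traversal: collects all keys in sorted order in O(n) time."""
    out = [] if out is None else out
    if root is not None:
        list_all_items(root.left, out)
        out.append(root.key)
        list_all_items(root.right, out)
    return out

root = None
for key in [44, 36, 65, 25, 52, 79, 9, 32]:  # 44 is an assumed root key
    root = insert(root, key)

print(list_all_items(root))            # [9, 25, 32, 36, 44, 52, 65, 79]
print(find(root, 52), find(root, 50))  # True False
```
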
Binary Search: Searching in a sorted array

- Input is a sorted array A and an item x.
- Problem is to locate x in the array.
- Several variants of the problem, for example:
  1. Determine whether x is stored in the array.
  2. Find the largest i such that A[i] ≤ x (with a reasonable convention if x < A[0]).
  We will focus on the first variant.
- We will show that binary search is an optimal algorithm for solving this problem.

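The notes focus on the first variant. For the second variant, one possible sketch in Python uses the standard library's `bisect_right`; the function name `largest_index_le` and the convention of returning -1 when x < A[0] are assumptions made for this example.

```python
from bisect import bisect_right

def largest_index_le(A, x):
    """Largest i with A[i] <= x in sorted list A, or -1 if x < A[0]."""
    # bisect_right returns the number of entries <= x, so subtract 1.
    return bisect_right(A, x) - 1

A = [9, 25, 32, 36, 52, 65, 79]
print(largest_index_le(A, 40))  # 3, since A[3] = 36 <= 40 < A[4] = 52
print(largest_index_le(A, 5))   # -1, since 5 < A[0]
```
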
Binary Search: Searching in a sorted array

Input:  A: Sorted array with n entries [0..n − 1]
        x: Item we are seeking
Output: Location of x, if x found
        -1, if x not found

Top-level call is binarySearch(A,x,0,n − 1)

int binarySearch(A,x,first,last);
    if first > last then
        return (-1);
    else
        index = ⌊(first+last)/2⌋;
        if x == A[index] then
            return index;
        else if x < A[index] then
            return binarySearch(A,x,first,index-1);
        else
            return binarySearch(A,x,index+1,last);

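A direct Python transcription of the pseudocode is sketched below. The name `binary_search` and the default arguments, which make the top-level call `binary_search(A, x)`, are conveniences added for this example.

```python
def binary_search(A, x, first=0, last=None):
    """Return an index of x in sorted list A[first..last], or -1 if absent."""
    if last is None:            # top-level call: search the whole array
        last = len(A) - 1
    if first > last:            # empty range: x is not present
        return -1
    index = (first + last) // 2
    if x == A[index]:
        return index
    if x < A[index]:
        return binary_search(A, x, first, index - 1)
    return binary_search(A, x, index + 1, last)

A = [9, 25, 32, 36, 52, 65, 79]
print(binary_search(A, 52))  # 4
print(binary_search(A, 40))  # -1
```
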
Correctness of Binary Search

We need to prove two things:

1. If x is in the array, its location (index) is between first and last, inclusive.
2. On each recursive call, the difference last − first gets strictly smaller.

[Figure: the array, with the positions first and last marked.]

Correctness of Binary Search

To prove that the invariant continues to hold, we need to consider three cases.

1. last ≥ first + 2 (index lies strictly between first and last)
2. last = first + 1 (index = first)
3. last = first (index = first = last)

[Figure: for each case, the array segment from first to last with the position of index marked.]

Binary Search: Analysis of Running Time

- We will count the number of 3-way comparisons of x against elements of A (also known as decisions).
- This is essentially the same as the number of recursive calls. Every recursive call, except possibly the very last one, results in a 3-way comparison.

Binary Search: Analysis of Running Time (continued)

- Binary search in an array of size 1: 1 decision
- Binary search in an array of size n > 1: after 1 comparison, either we are done, or the problem is reduced to binary search in a subarray of size $\leq \lfloor n/2 \rfloor$ (with equality possible).
- So the worst-case time to do binary search on an array of size n satisfies the recurrence

$$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ 1 + T(\lfloor n/2 \rfloor) & \text{otherwise} \end{cases}$$

- The solution to this equation is:

$$T(n) = \lfloor \lg n \rfloor + 1$$

  This can be proved by induction.
- So binary search does $\lfloor \lg n \rfloor + 1$ 3-way comparisons on an array of size n, in the worst case.

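The closed form can be checked numerically against the recurrence. The sketch below is only a sanity check, not a proof; it relies on the fact that for a positive Python integer n, `n.bit_length()` equals ⌊lg n⌋ + 1.

```python
def T(n):
    """Worst-case number of 3-way comparisons, computed from the recurrence."""
    return 1 if n == 1 else 1 + T(n // 2)

# n.bit_length() == floor(lg n) + 1 for every positive integer n.
assert all(T(n) == n.bit_length() for n in range(1, 10_000))
print(T(13))  # 4
```
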
Optimality of binary search

- We will establish a lower bound on the number of decisions required to find an item in an array, using only 3-way comparisons of the item against array entries.
- The lower bound we will establish is $\lfloor \lg n \rfloor + 1$ comparisons.
- Since Binary Search performs within this bound, it is optimal.
- Our lower bound is established using a Decision Tree model.

The decision tree model for searching in an array

Consider any algorithm that searches for an item x in an array A of size n by comparing entries in A against x. Any such algorithm can be modeled as a decision tree:

- Each node is labeled with an integer in {0, ..., n − 1}.
- A node labeled i represents a comparison between x and A[i].
- The left subtree of a node labeled i describes the decision tree for what happens if x < A[i].
- The right subtree of a node labeled i describes the decision tree for what happens if x > A[i].

Example: Decision tree for binary search with n = 13:

                 6
          /             \
        2                 9
      /   \             /   \
    0       4         7       11
     \     / \         \     /  \
      1   3   5         8  10    12

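The example tree can be regenerated from the binary search pseudocode by following both recursive calls at every step. The helper `decision_tree_levels` below is a hypothetical name for this sketch; it returns the probed indices grouped by depth.

```python
def decision_tree_levels(n):
    """Indices probed by binary search on n entries, grouped by tree depth."""
    levels, frontier = [], [(0, n - 1)]
    while frontier:
        level, next_frontier = [], []
        for first, last in frontier:
            if first > last:        # empty range: no comparison, no node
                continue
            index = (first + last) // 2
            level.append(index)
            # left child handles x < A[index], right child handles x > A[index]
            next_frontier += [(first, index - 1), (index + 1, last)]
        if level:
            levels.append(level)
        frontier = next_frontier
    return levels

for depth, labels in enumerate(decision_tree_levels(13)):
    print(depth, labels)
# 0 [6]
# 1 [2, 9]
# 2 [0, 4, 7, 11]
# 3 [1, 3, 5, 8, 10, 12]
```

The deepest level is 3 = ⌊lg 13⌋, so the worst case makes 4 comparisons.
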
Lower bound on locating an item in an array of size n

1. Any algorithm for searching an array of size n can be modeled by a decision tree with at least n nodes.
2. Since the decision tree is a binary tree with at least n nodes, the depth is at least $\lfloor \lg n \rfloor$.
3. The worst-case number of comparisons for the algorithm is the depth of the decision tree + 1. (Remember, the root has depth 0.)

Hence any algorithm for locating an item in an array of size n using only comparisons must perform at least $\lfloor \lg n \rfloor + 1$ comparisons.

So binary search is optimal.

Sorting

- Rearranging a list of items in nondescending order.
- Useful preprocessing step (e.g., for binary search)
- Important step in other algorithms
- Illustrates more general algorithmic techniques

We will discuss

- Comparison-based sorting algorithms (Insertion sort, Selection Sort, Quicksort, Mergesort, Heapsort)
- Bucket-based sorting methods
- Disk-based sorting

Comparison-based sorting

- Basic operation: compare two items.
- Abstract model.
  - Advantage: doesn't use specific properties of the data items. So the same algorithm can be used for sorting integers, strings, etc.
  - Disadvantage: under certain circumstances, specific properties of the data items can speed up the sorting process.
- Measure of time: number of comparisons
  - Consistent with philosophy of counting basic operations, discussed earlier.
  - Could be misleading if other operations dominate. For example, what if there is a lot of data movement (more assignment statements than comparisons)?
- Comparison-based sorting requires Ω(n log n) comparisons. (We will prove this.)

$\Theta(n \log n)$ work vs. quadratic ($\Theta(n^2)$) work

[Figure: plot comparing a quadratic curve with the curve y = 10 n lg n, for n from 0 to 1000 on the horizontal axis and y up to 700000 on the vertical axis.]

Some terminology

- A permutation of a sequence of items is a reordering of the sequence. A sequence of n items has n! distinct permutations.
- Note: Sorting is the problem of finding a particular distinguished permutation of a list.
- An inversion in a sequence or list is a pair of items such that the larger one precedes the smaller one.

Example: The list

18 29 12 15 32 10

has 9 inversions:

{(18,12), (18,15), (18,10), (29,12), (29,15), (29,10), (12,10), (15,10), (32,10)}

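The inversions in the example can be enumerated with a short brute-force sketch. The function name `inversions` is hypothetical, and the quadratic running time is fine for illustration only.

```python
from itertools import combinations

def inversions(seq):
    """All pairs (larger, smaller) in which the larger item comes first."""
    # combinations() yields pairs in their original left-to-right order.
    return [(a, b) for a, b in combinations(seq, 2) if a > b]

pairs = inversions([18, 29, 12, 15, 32, 10])
print(len(pairs))  # 9
print(pairs)
```
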
Insertion sort

- Work from left to right across array
- Insert each item in correct position with respect to (sorted) elements to its left

[Figure: three snapshots of the array: entirely unsorted at the start; a sorted prefix, then the current item x, then an unsorted suffix in the middle of the run; entirely sorted at the end.]

[Figure: the array while x is being inserted: keys ≤ x, followed by the keys > x, followed by x.]

procedure insertionSort(n, A);
    item x;
    int j,k;
begin {insertionSort}
    for k = 1 to n − 1 do
        x = A[k];
        j = k − 1;
        while (j ≥ 0) and (A[j] > x) do
            A[j + 1] = A[j];
            j = j − 1;
        A[j + 1] = x;
end {insertionSort};

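The pseudocode translates almost line for line into Python. The sketch below uses the hypothetical name `insertion_sort` and takes the length from the list instead of a separate parameter n.

```python
def insertion_sort(A):
    """Sort list A in place, following the insertionSort pseudocode."""
    for k in range(1, len(A)):
        x = A[k]                       # item to insert into sorted A[0..k-1]
        j = k - 1
        while j >= 0 and A[j] > x:
            A[j + 1] = A[j]            # shift larger keys one slot right
            j -= 1
        A[j + 1] = x                   # drop x into the gap
    return A

print(insertion_sort([23, 19, 42, 17, 85, 38]))  # [17, 19, 23, 38, 42, 85]
```
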
Insertion sort example

23 19 42 17 85 38
19 23 42 17 85 38
17 19 23 42 85 38
17 19 23 38 42 85

Analysis of Insertion Sort

- Storage: in place: O(1) extra storage
- Worst-case running time:
  - On the kth iteration of the outer loop, element A[k] is compared with at most k − 1 elements: A[k − 1], A[k − 2], ..., A[1]. (This count indexes the array from 1. With the 0-based insertionSort pseudocode, iteration k makes at most k comparisons for k = 1, ..., n − 1, and the total is the same.)
  - Total number of comparisons over all iterations is at most:

$$\sum_{k=1}^{n} (k-1) = \frac{n(n-1)}{2} = \Theta(n^2).$$

- Average-case time: approximately $\frac{n^2}{4} = \Theta(n^2)$
- Insertion Sort is efficient if the input is "almost sorted":

  Time ≤ n − 1 + (# inversions) (Why?)

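One way to see the bound on the almost-sorted case: every comparison that finds A[j] > x is followed by a shift that removes exactly one inversion, and each of the n − 1 outer iterations makes at most one comparison that fails. The sketch below checks the bound exhaustively for n = 6; the helper names are hypothetical, and "Time" is taken to mean the number of key comparisons.

```python
from itertools import permutations

def insertion_sort_comparisons(A):
    """Number of key comparisons (A[j] > x) made by insertion sort on A."""
    A, count = list(A), 0
    for k in range(1, len(A)):
        x, j = A[k], k - 1
        while j >= 0:
            count += 1                 # one comparison of A[j] against x
            if A[j] <= x:
                break                  # failed comparison ends this pass
            A[j + 1] = A[j]            # successful one removes an inversion
            j -= 1
        A[j + 1] = x
    return count

def count_inversions(A):
    """Brute-force inversion count."""
    return sum(A[i] > A[j] for i in range(len(A)) for j in range(i + 1, len(A)))

for p in permutations(range(6)):
    assert insertion_sort_comparisons(p) <= len(p) - 1 + count_inversions(p)

print(insertion_sort_comparisons([1, 2, 3, 4, 5, 6]))  # 5  (already sorted)
print(insertion_sort_comparisons([6, 5, 4, 3, 2, 1]))  # 15 (reverse sorted)
```
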
Selection Sort

- Two variants:
  1. Repeatedly (for i from 0 to n − 1) find the minimum value, output it, delete it.
     - Values are output in sorted order
  2. Repeatedly (for i from n − 1 down to 1)
     - Find the maximum of A[0], A[1], ..., A[i].
     - Swap this value with A[i] (if necessary).

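The second (in-place) variant can be sketched in Python as follows; the function name `selection_sort` is hypothetical.

```python
def selection_sort(A):
    """Sort list A in place: for i = n-1 down to 1, move max(A[0..i]) to A[i]."""
    for i in range(len(A) - 1, 0, -1):
        m = max(range(i + 1), key=A.__getitem__)  # index of the max of A[0..i]
        if m != i:                                # swap only if necessary
            A[i], A[m] = A[m], A[i]
    return A

print(selection_sort([23, 19, 42, 17, 85, 38]))  # [17, 19, 23, 38, 42, 85]
```
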
Sorting algorithms based on Divide and Conquer

Divide and conquer paradigm

1. Split problem into subproblem(s)
2. Solve each subproblem (usually via recursive call)
3. Combine solution of subproblem(s) into solution of original problem

We will discuss two sorting algorithms based on this paradigm:

- Quicksort
- Mergesort

Quicksort

Basic idea

- Classify keys as "small" keys or "large" keys. All small keys are less than all large keys.
- Rearrange keys so that all small keys precede all large keys.
- Recursively sort the "small" keys, recursively sort the "large" keys.

[Figure: the array of keys, rearranged into a block of small keys followed by a block of large keys.]

Quicksort: One specific implementation

- Let the first element in the array be the pivot value x.
- Small keys are the keys < x.
- Large keys are the keys ≥ x.

[Figure: before the split, A[first] = x and the rest of A[first..last] is unexamined (?). After the split, A[first..splitpoint − 1] holds the keys < x, A[splitpoint] = x, and A[splitpoint + 1..last] holds the keys ≥ x.]

Top level pseudocode

procedure quickSort(A,first,last);
    int splitpoint;
begin {quickSort}
    if first < last then
        splitpoint = split(A,first,last);
        quickSort(A,first,splitpoint-1);
        quickSort(A,splitpoint+1,last);
end {quickSort};

[Figure: after the call to split, A[first..splitpoint − 1] holds keys < x, A[splitpoint] = x, and A[splitpoint + 1..last] holds keys ≥ x.]

The split step

int split(A,first,last);
    item x;
begin {split}
    splitpoint = first;
    x = A[first];
    for k = first + 1 to last do
        if A[k] < x then
            A[splitpoint + 1] ↔ A[k];
            splitpoint = splitpoint + 1;
    A[first] ↔ A[splitpoint];
    return splitpoint;
end {split};

Loop invariants:

- A[first + 1..splitpoint] contains keys < x.
- A[splitpoint + 1..k − 1] contains keys ≥ x.
- A[k..last] contains unprocessed keys.

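The split step

- At start: A[first] = x and splitpoint = first; A[k..last] with k = first + 1 is unprocessed (?).
- In middle: A[first] = x; A[first + 1..splitpoint] holds keys < x; A[splitpoint + 1..k − 1] holds keys ≥ x; A[k..last] is unprocessed (?).
- At end: A[first] = x; A[first + 1..splitpoint] holds keys < x; A[splitpoint + 1..last] holds keys ≥ x.
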
Example of split step

(s = splitpoint and k = position of the next key found to be < x, both as 0-based indices)

27 83 23 36 15 79 22 18    s = 0, k = 2
27 23 83 36 15 79 22 18    s = 1, k = 4
27 23 15 36 83 79 22 18    s = 2, k = 6
27 23 15 22 83 79 36 18    s = 3, k = 7
27 23 15 22 18 79 36 83    s = 4
18 23 15 22 27 79 36 83    s = 4

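The split step and the top-level procedure can be sketched together in Python. The names `split` and `quick_sort` follow the pseudocode; the default arguments for the top-level call are a convenience added here. Running `split` on the example array reproduces the final row of the trace.

```python
def split(A, first, last):
    """Partition A[first..last] around x = A[first]; return x's final index."""
    x = A[first]
    splitpoint = first
    for k in range(first + 1, last + 1):
        if A[k] < x:
            # grow the "< x" block by swapping A[k] to its right end
            A[splitpoint + 1], A[k] = A[k], A[splitpoint + 1]
            splitpoint += 1
    # put the pivot between the "< x" block and the ">= x" block
    A[first], A[splitpoint] = A[splitpoint], A[first]
    return splitpoint

def quick_sort(A, first=0, last=None):
    """Sort A[first..last] in place using first-element pivots."""
    if last is None:
        last = len(A) - 1
    if first < last:
        s = split(A, first, last)
        quick_sort(A, first, s - 1)
        quick_sort(A, s + 1, last)
    return A

A = [27, 83, 23, 36, 15, 79, 22, 18]
print(split(A, 0, len(A) - 1), A)  # 4 [18, 23, 15, 22, 27, 79, 36, 83]
print(quick_sort(A))               # [15, 18, 22, 23, 27, 36, 79, 83]
```
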
Analysis of Quicksort

We can visualize the lists sorted by quicksort as a binary tree.

- The root is the top-level list (of all elements to be sorted)
- Identify each list with its split value.
- The children of a node are the two sublists to be sorted.

              27 83 23 36 15 79 22 18
              /                     \
       18 23 15 22               79 36 83
        /      \                  /    \
      15      23 22             36      83

Worst-case Analysis of Quicksort

- Any pair of values x and y gets compared at most once during the entire run of Quicksort.
- The number of possible comparisons is $\binom{n}{2} = \Theta(n^2)$.
- Hence the worst-case number of comparisons performed by Quicksort when sorting n numbers is $O(n^2)$.
- Can we be more precise? Is it $o(n^2)$? Is it $\Theta(n^2)$?
- It is $\Theta(n^2)$ ...

A bad case for Quicksort: 1, 2, 3, ..., n − 1, n

[Figure: the tree of sublists degenerates into a chain:
 1 2 3 ... n−1 n
 2 3 ... n−1 n
 3 ... n−1 n
 ...
 n−1 n
 n]

The top-level split makes n − 1 comparisons, the next one n − 2, and so on, so $(n-1) + (n-2) + \cdots + 1 = \binom{n}{2}$ comparisons are required. So the worst-case time for Quicksort is $\Theta(n^2)$. But what about the average case ...?

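The count for the sorted input can be confirmed by instrumenting the split step. The sketch below is a self-contained comparison counter with the hypothetical name `quicksort_comparisons`; it uses an explicit stack instead of recursion, an implementation choice made here so that the chain of n nested sublists does not exceed Python's recursion limit.

```python
def quicksort_comparisons(A):
    """Count key comparisons (A[k] < x) made by first-element-pivot quicksort."""
    A, count = list(A), 0
    stack = [(0, len(A) - 1)]              # sublists still to be sorted
    while stack:
        first, last = stack.pop()
        if first >= last:
            continue
        x, s = A[first], first
        for k in range(first + 1, last + 1):
            count += 1                     # one comparison of A[k] against x
            if A[k] < x:
                A[s + 1], A[k] = A[k], A[s + 1]
                s += 1
        A[first], A[s] = A[s], A[first]
        stack += [(first, s - 1), (s + 1, last)]
    return count

n = 1000
print(quicksort_comparisons(range(1, n + 1)), n * (n - 1) // 2)  # 499500 499500
```
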
Average-case analysis of Quicksort

Our approach:

1. Use the binary tree of sorted lists
2. Number the elements in sorted order
3. Calculate the probability that two elements get compared
4. Use this to compute the expected number of comparisons performed by Quicksort.

Average-case analysis of Quicksort

              27 83 23 36 15 79 22 18
              /                     \
       18 23 15 22               79 36 83
        /      \                  /    \
      15      23 22             36      83

Sorted order: 15 18 22 23 27 36 79 83

Key Fact

During the run of Quicksort, two values x and y get compared if and only if the first key in the range [x..y] to be chosen as a pivot is either x or y.

- Examples where both statements are true: (18, 23), (27, 83)
- Examples where both statements are false: (15, 23), (18, 83)

Average-case analysis of Quicksort

Assume:

- All permutations are equally likely
- All n values are distinct
- The values in sorted order are $S_1 < S_2 < \cdots < S_n$.

Let $P_{i,j}$ = the probability that keys $S_i$ and $S_j$ are compared with each other during the invocation of quicksort. By the Key Fact,

$P_{i,j}$ = the probability that the first key from $\{S_i, S_{i+1}, \ldots, S_j\}$ to be chosen as a pivot value is either $S_i$ or $S_j$

$$P_{i,j} = \frac{2}{j - i + 1}$$

(each of the $j - i + 1$ keys in this range is equally likely to be the first one chosen as a pivot).

Average-case analysis of Quicksort

Define indicator random variables $\{X_{i,j} : 1 \le i < j \le n\}$

$$X_{i,j} = \begin{cases} 1 & \text{if keys } S_i \text{ and } S_j \text{ get compared} \\ 0 & \text{if keys } S_i \text{ and } S_j \text{ do not get compared} \end{cases}$$

1. The total number of comparisons is:

$$\sum_{i=1}^{n} \sum_{j=i+1}^{n} X_{i,j}$$

2. The expected (average) total number of comparisons is:

$$E\left( \sum_{i=1}^{n} \sum_{j=i+1}^{n} X_{i,j} \right) = \sum_{i=1}^{n} \sum_{j=i+1}^{n} E(X_{i,j})$$

3. The expected value of $X_{i,j}$ is:

$$E(X_{i,j}) = P_{i,j} = \frac{2}{j - i + 1}$$

Average-case analysis of Quicksort

Hence the expected number of comparisons is

$$\sum_{i=1}^{n} \sum_{j=i+1}^{n} E(X_{i,j})$$
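
As a sanity check on this sum, the sketch below substitutes E(X_{i,j}) = 2/(j − i + 1) and compares the result with the exact average number of comparisons over all n! input orders for a small n. The helper names are hypothetical, the counter re-implements the first-element-pivot split so that the block is self-contained, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations

def comparisons(A):
    """Key comparisons made by first-element-pivot quicksort on a copy of A."""
    A, count = list(A), 0
    stack = [(0, len(A) - 1)]
    while stack:
        first, last = stack.pop()
        if first >= last:
            continue
        x, s = A[first], first
        for k in range(first + 1, last + 1):
            count += 1
            if A[k] < x:
                A[s + 1], A[k] = A[k], A[s + 1]
                s += 1
        A[first], A[s] = A[s], A[first]
        stack += [(first, s - 1), (s + 1, last)]
    return count

def expected_by_formula(n):
    """Double sum of 2 / (j - i + 1) over 1 <= i < j <= n."""
    return sum(Fraction(2, j - i + 1)
               for i in range(1, n + 1) for j in range(i + 1, n + 1))

n = 5
perms = list(permutations(range(n)))       # all n! equally likely inputs
average = Fraction(sum(comparisons(p) for p in perms), len(perms))
print(average, expected_by_formula(n))     # 37/5 37/5
```
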