
Sorting and Searching

-1-
Sorting and Searching Algorithms
 Selection Sort
 Bubble Sort
 Insertion Sort
 Merge Sort
 Heap Sort
 Quick Sort

-2-
Sorting Algorithms and Memory
 Some algorithms sort by swapping elements within the
input array
 Such algorithms are said to sort in place, and require
only O(1) additional memory.
 Other algorithms require allocation of an output array into
which values are copied.
 These algorithms do not sort in place, and require O(n)
additional memory.

[Figure: an array in which two elements are swapped in place]

-3-
Selection Sort
 Selection Sort operates by first finding the smallest
element in the input list, and moving it to the output list.
 It then finds the next smallest value and does the
same.
 It continues in this way until all the input elements have
been selected and placed in the output list in the correct
order.
 Note that every selection requires a search through the
input list.
 Thus the algorithm has a nested loop structure
 Selection Sort Example
-4-
Selection Sort
for i = n-1 downto 0
    jmin = 0
    for j = 1 to i                 // O(i) search
        if A[j] < A[jmin]
            jmin = j
    add A[jmin] to output
    remove A[jmin] from input

T(n) = Σ_{i=0}^{n-1} O(i) = O(n²)
-5-
Bubble Sort
 Bubble Sort operates by successively comparing
adjacent elements, swapping them if they are out of
order.
 At the end of the first pass, the largest element is in the
correct position.
 A total of n passes are required to sort the entire array.
 Thus bubble sort also has a nested loop structure
 Bubble Sort Example

-6-
Bubble Sort
for i = n-2 downto 0
    for j = 0 to i                 // O(i)
        if A[j] > A[j+1]
            swap A[j] and A[j+1]

T(n) = Σ_{i=0}^{n-2} O(i) = O(n²)
-7-
Comparison
 Thus both Selection Sort and Bubble Sort have O(n²)
running time.
 However, both can also easily be designed to
 Sort in place
 Sort stably

-8-
Example: Insertion Sort

-9-
Example: Insertion Sort

- 10 -
Example: Insertion Sort

- 11 -
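The worked example on these slides is an image in the original deck. As a stand-in, here is a minimal C sketch of the standard insertion sort (naming is my own assumption, not taken from the slides):

/* Insertion sort: maintain a sorted prefix A[0..i-1] and insert
   A[i] into its correct place by shifting larger elements right. */
void insertion_sort(int A[], int n) {
    for (int i = 1; i < n; i++) {
        int key = A[i];
        int j = i - 1;
        while (j >= 0 && A[j] > key) { /* shift elements > key right */
            A[j + 1] = A[j];
            j--;
        }
        A[j + 1] = key;                /* drop key into the gap */
    }
}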
Comparison
 Selection Sort
 Bubble Sort
 Insertion Sort
 Sort in place
 Stable sort
 But O(n²) running time.

 Can we do better?

- 12 -
Merge Sort

[Figure: an unsorted set of keys: 88, 52, 14, 31, 30, 25, 98, 23, 62, 79]

Divide and Conquer

- 13 -
Merge Sort
 Merge-sort is a sorting algorithm based on the divide-
and-conquer paradigm
 It was invented by John von Neumann, one of the
pioneers of computing, in 1945

- 14 -
Divide-and-Conquer
 Divide-and conquer is a general algorithm design paradigm:
 Divide: divide the input data S in two disjoint subsets S1 and S2
 Recur: solve the subproblems associated with S1 and S2
 Conquer: combine the solutions for S1 and S2 into a solution for
S
 The base cases for the recursion are subproblems of size 0
or 1

- 15 -
Merge Sort

Split the set into two halves (no real work).

Get one friend to sort the first half: 25,31,52,88,98
Get another friend to sort the second half: 14,23,30,62,79

- 16 -
Merge Sort

Merge two sorted lists into one:

25,31,52,88,98  merged with  14,23,30,62,79
→  14,23,25,30,31,52,62,79,88,98

- 17 -
Merge-Sort
 Merge-sort on an input sequence S with n elements
consists of three steps:
 Divide: partition S into two sequences S1 and S2 of
about n/2 elements each
 Recur: recursively sort S1 and S2
 Conquer: merge S1 and S2 into a unique sorted sequence

Algorithm mergeSort(S)
    Input: sequence S with n elements
    Output: sequence S sorted
    if S.size() > 1
        (S1, S2) ← split(S, n/2)
        mergeSort(S1)
        mergeSort(S2)
        merge(S1, S2, S)

- 18 -
Merging Two Sorted Sequences
 The conquer step of merge-sort consists of merging two sorted
sequences A and B into a sorted sequence S containing the union of
the elements of A and B
 Merging two sorted sequences, each with n/2 elements, takes O(n)
time
 Normally, merging is not in-place: new memory must be
allocated to hold S.
 It is possible to do in-place merging using linked lists.
 Code is more complicated
 Only changes memory usage by a constant factor

- 19 -
Merging Two Sorted Sequences (As Arrays)
Algorithm merge(S1, S2, S):
    Input: sorted sequences S1 and S2 and an empty sequence S, implemented as arrays
    Output: sorted sequence S containing the elements from S1 and S2
    i ← j ← 0
    while i < S1.size() and j < S2.size() do
        if S1.get(i) ≤ S2.get(j) then
            S.addLast(S1.get(i))
            i ← i + 1
        else
            S.addLast(S2.get(j))
            j ← j + 1
    while i < S1.size() do
        S.addLast(S1.get(i))
        i ← i + 1
    while j < S2.size() do
        S.addLast(S2.get(j))
        j ← j + 1

- 20 -
Merging Two Sorted Sequences (As Linked Lists)

Algorithm merge(S1, S2, S):
    Input: sorted sequences S1 and S2 and an empty sequence S, implemented as linked lists
    Output: sorted sequence S containing the elements from S1 and S2
    while S1 ≠ ∅ and S2 ≠ ∅ do
        if S1.first().element() ≤ S2.first().element() then
            S.addLast(S1.remove(S1.first()))
        else
            S.addLast(S2.remove(S2.first()))
    while S1 ≠ ∅ do
        S.addLast(S1.remove(S1.first()))
    while S2 ≠ ∅ do
        S.addLast(S2.remove(S2.first()))

- 21 -
Merge-Sort Tree
 An execution of merge-sort is depicted by a binary tree
 each node represents a recursive call of merge-sort and stores
 unsorted sequence before the execution and its partition
 sorted sequence at the end of the execution
 the root is the initial call
 the leaves are calls on subsequences of size 0 or 1

7 29 4 € 2 4 7
9

72 € 2 94 € 4
7 9

7 2 9 4
€  €  €  € 
7 2 - 22 - 9 4
Execution Example

 Partition

- 23 -
Execution Example (cont.)

 Recursive call, partition

- 24 -
Execution Example (cont.)

 Recursive call, partition

- 25 -
Execution Example (cont.)

 Recursive call, base case

- 26 -
Execution Example (cont.)

 Recursive call, base case

- 27 -
Execution Example (cont.)

 Merge

- 28 -
Execution Example (cont.)

 Recursive call, …, base case, merge

- 29 -
Execution Example (cont.)

 Merge

- 30 -
Execution Example (cont.)

 Recursive call, …, merge, merge

- 31 -
Execution Example (cont.)

 Merge

- 32 -
Analysis of Merge-Sort
 The height h of the merge-sort tree is O(log n)
 at each recursive call we divide the sequence in half
 The overall amount of work done at the nodes of depth i is O(n)
 we partition and merge 2^i sequences of size n/2^i
 we make 2^(i+1) recursive calls
 Thus, the total running time of merge-sort is O(n log n)

T(n) = 2T(n/2) + O(n)

depth   #seqs   size
0       1       n
1       2       n/2
i       2^i     n/2^i
…       …       …

- 33 -
Heapsort
 Invented by Williams & Floyd in 1964
 O(n log n) worst case – like merge sort
 Sorts in place – like insertion sort
 Combines the best of both algorithms

- 34 -
Selection Sort

Largest i values are sorted on the right.
Remaining values are off to the left.

[Figure: unsorted values 3, 5, 1, 4, 2 on the left; sorted values 6, 7, 8, 9 on the right]

The max is easier to find if the remaining values form a heap.

- 35 -
Heap-Sort Algorithm
 Build an array-based (max) heap
 Iteratively call removeMax() to extract the keys in
descending order
 Store the keys as they are extracted in the unused tail
portion of the array

- 36 -
Heap-Sort Algorithm
Algorithm HeapSort(S)
    Input: S, an unsorted array of comparable elements
    Output: S, a sorted array of comparable elements
    T = MakeMaxHeap(S)
    for i = n-1 downto 1
        S[i] = T.removeMax()

- 37 -
Heap-Sort Running Time
 The heap can be built bottom-up in O(n) time
 Extraction of the ith element takes O(log(n - i+1)) time
(for downheaping)
 Thus the total run time is

T(n) = O(n) + Σ_{i=1}^{n} O(log(n − i + 1))
     = O(n) + Σ_{i=1}^{n} O(log i)
     ≤ O(n) + Σ_{i=1}^{n} O(log n)
     = O(n log n)
- 38 -
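A minimal C sketch of the whole procedure (standard textbook heapsort, not code from the slides): sift_down is the downheap step, the first loop is the O(n) bottom-up build, and the second loop stores each extracted max in the unused tail of the array:

/* Restore the max-heap property in A[0..n-1] below index i. */
static void sift_down(int A[], int n, int i) {
    for (;;) {
        int largest = i, l = 2 * i + 1, r = 2 * i + 2;
        if (l < n && A[l] > A[largest]) largest = l;
        if (r < n && A[r] > A[largest]) largest = r;
        if (largest == i) return;          /* heap property holds */
        int t = A[i]; A[i] = A[largest]; A[largest] = t;
        i = largest;                       /* continue down the tree */
    }
}

void heap_sort(int A[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--)   /* bottom-up build: O(n) */
        sift_down(A, n, i);
    for (int i = n - 1; i >= 1; i--) {     /* extract max n-1 times */
        int t = A[0]; A[0] = A[i]; A[i] = t;  /* max into sorted tail */
        sift_down(A, i, 0);                /* downheap: O(log i) */
    }
}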
Quick-Sort

[Figure: an unsorted set of keys: 88, 52, 14, 31, 30, 25, 98, 23, 62, 79]

Divide and Conquer

- 39 -
QuickSort
 Invented by C.A.R. Hoare in 1960
 “There are two ways of constructing a software design:
One way is to make it so simple that there are obviously
no deficiencies, and the other way is to make it so
complicated that there are no obvious deficiencies. The
first method is far more difficult.”

- 40 -
Quick-Sort

 Quick-sort is a divide-and-conquer algorithm:
 Divide: pick a random element x (called a pivot)
and partition S into
 L: elements less than x
 E: elements equal to x
 G: elements greater than x
 Recur: Quick-sort L and G
 Conquer: join L, E and G
- 41 -
The Quick-Sort Algorithm
Algorithm QuickSort(S)
    if S.size() > 1
        (L, E, G) = Partition(S)
        QuickSort(L)
        QuickSort(G)
        S = (L, E, G)

- 42 -
Partition
 Remove, in turn, each element y from S and insert y into
sequence L, E or G, depending on the result of its
comparison with the pivot x (e.g., the last element in S)
 Each insertion and removal is at the beginning or at the
end of a sequence, and hence takes O(1) time
 Thus, partitioning takes O(n) time

Algorithm Partition(S)
    Input: sequence S
    Output: subsequences L, E, G of the elements of S less
            than, equal to, or greater than the pivot, resp.
    L, E, G ← empty sequences
    x ← S.getLast().element()
    while ¬S.isEmpty()
        y ← S.removeFirst()
        if y < x
            L.addLast(y)
        else if y = x
            E.addLast(y)
        else { y > x }
            G.addLast(y)
    return L, E, G

- 43 -
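For concreteness, a C sketch of this three-way partition using arrays for L, E and G (the slides use linked sequences; this array version, with names of my choosing, shows the same O(n) one-pass structure and the O(n) extra memory):

/* Split S[0..n-1] around pivot x = S[n-1] into L (< x), E (= x),
   G (> x); appending in scan order keeps the partition stable. */
void partition_leg(const int S[], int n,
                   int L[], int *nl, int E[], int *ne, int G[], int *ng) {
    int x = S[n - 1];                  /* pivot: last element */
    *nl = *ne = *ng = 0;
    for (int i = 0; i < n; i++) {
        if (S[i] < x)       L[(*nl)++] = S[i];
        else if (S[i] == x) E[(*ne)++] = S[i];
        else                G[(*ng)++] = S[i];
    }
}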
Partition
 Since elements are removed at the beginning of S and added
at the end of L, E and G, this partition algorithm is stable.

- 44 -
Quick-Sort Tree
 An execution of quick-sort is depicted by a binary tree
 Each node represents a recursive call of quick-sort and stores
 Unsorted sequence before the execution and its pivot
 Sorted sequence at the end of the execution
 The root is the initial call
 The leaves are calls on subsequences of size 0 or 1

- 45 -
Execution Example

 Pivot selection

- 46 -
Execution Example (cont.)

 Partition, recursive call, pivot selection

- 47 -
Execution Example (cont.)

 Partition, recursive call, base case

- 48 -
Execution Example (cont.)

 Recursive call, …, base case, join

- 49 -
Execution Example (cont.)

 Recursive call, pivot selection

- 50 -
Execution Example (cont.)

 Partition, …, recursive call, base case

- 51 -
Execution Example (cont.)

 Join, join

- 52 -
Quick-Sort Properties
 The algorithm just described is stable, since elements
are removed from the beginning of the input sequence
and placed on the end of the output sequences (L,E, G).
 However it does not sort in place: O(n) new
memory is allocated for L, E and G
 Is there an in-place quick-sort?

- 53 -
In-Place Quick-Sort
 Note: Use the lecture slides here instead of the
textbook implementation (Section 11.2.2)

Partition the set into two using a randomly chosen pivot:

88, 52, 14, 31, 30, 25, 98, 23, 62, 79

→  14, 30, 31, 25, 23  ≤  52  ≤  88, 98, 62, 79
- 54 -
In-Place Quick-Sort

14, 30, 31, 25, 23  ≤  52  ≤  88, 98, 62, 79

Get one friend to sort the first half: 14,23,25,30,31
Get another friend to sort the second half: 62,79,88,98

- 55 -
In-Place Quick-Sort

14,23,25,30,31   52   62,79,88,98

Glue the pieces together (no real work):

14,23,25,30,31,52,62,79,88,98

- 56 -
The In-Place Partitioning Problem

Input:  88, 52, 14, 31, 30, 25, 98, 23, 62, 79; pivot x = 52
Output: 14, 30, 31, 25, 23  ≤  52  <  88, 98, 62, 79

Problem: Partition a list into a set of small values and a set of large values.

- 57 -
Precise Specification

[Figure: precondition — an arbitrary array A[p..r]; postcondition — A is rearranged so that A[p..q−1] ≤ A[q] < A[q+1..r], where q is the returned pivot index]

- 58 -
Loop Invariant

 Three subsets are maintained
 One containing values less than or equal to the pivot
 One containing values greater than the pivot
 One containing values yet to be processed

- 59 -
Maintaining Loop Invariant

• Consider the element at location j

– If it is greater than the pivot, incorporate it into
the '> pivot' set by incrementing j.

– If it is less than or equal to the pivot, incorporate
it into the '≤ pivot' set by swapping it with the
element at location i+1 and incrementing both i and j.

– Measure of progress: the size of the unprocessed set.
- 60 -
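This is the classic Lomuto partition scheme. A C sketch that maintains the invariant just described (the index names i and j match the slides; the function itself is my reconstruction, not the slides' code):

/* Partition A[p..r] around pivot x = A[r]. Invariant during the
   loop: A[p..i] <= x, A[i+1..j-1] > x, A[j..r-1] unprocessed.
   Returns the pivot's final index q. */
int partition(int A[], int p, int r) {
    int x = A[r];
    int i = p - 1;                     /* end of the "<= pivot" region */
    for (int j = p; j < r; j++) {
        if (A[j] <= x) {               /* grow "<=" region by a swap */
            i++;
            int t = A[i]; A[i] = A[j]; A[j] = t;
        }                              /* else "> pivot" region grows */
    }
    int t = A[i + 1]; A[i + 1] = A[r]; A[r] = t;  /* place pivot */
    return i + 1;
}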
Maintaining Loop Invariant

- 61 -
Establishing Loop Invariant

- 62 -
Establishing Postcondition

Exhaustive on exit

- 63 -
Establishing Postcondition

- 64 -
An Example

- 65 -
In-Place Partitioning: Running Time

Each iteration takes O(1) time  ⇒  total time is O(n)
- 66 -
In-Place Partitioning is NOT Stable


- 67 -
The In-Place Quick-Sort Algorithm
Algorithm QuickSort(A, p, r)
    if p < r
        q = Partition(A, p, r)
        QuickSort(A, p, q - 1)
        QuickSort(A, q + 1, r)

- 68 -
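In C, reusing the partition() sketch from the loop-invariant discussion above, the whole in-place sort is just:

int partition(int A[], int p, int r);  /* from the earlier sketch */

/* In-place quick-sort of A[p..r] (inclusive bounds). */
void quick_sort(int A[], int p, int r) {
    if (p < r) {
        int q = partition(A, p, r);    /* pivot ends at index q */
        quick_sort(A, p, q - 1);       /* sort the "<= pivot" part */
        quick_sort(A, q + 1, r);       /* sort the "> pivot" part */
    }
}

quick_sort(A, 0, n - 1) sorts an n-element array.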
Running Time of Quick-Sort

- 69 -
Quick-Sort Running Time
 We can analyze the running time of Quick-Sort using a recursion
tree.
 At depth i of the tree, the problem is partitioned into 2^i
sub-problems.
 The running time will be determined by how balanced these
partitions are.

- 70 -
Quick Sort
Let the pivot be the first element in the list?

[Figure: the set {88, 52, 14, 31, 98, 30, 25, 23, 62, 79} is
partitioned around the first element, 31, into
14, 30, 25, 23  ≤  31  ≤  88, 98, 62, 52, 79]

- 71 -
Quick Sort

14,23,25,30,31,52,62,79,88,98

∅  ≤  14  ≤  23,25,30,31,52,62,79,88,98

If the list is already sorted, then every partition is
completely unbalanced: the worst case.

- 72 -
QuickSort: Choosing the Pivot
 Common choices are:
 random element
 middle element
 median of first, middle and last element

- 73 -
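As a sketch of the third option, here is one way to pick the median of the first, middle and last elements in C (a hypothetical helper of my own naming, not from the slides; it is a common guard against the sorted-input worst case shown earlier):

/* Return the index of the median of A[lo], A[mid], A[hi],
   where mid is the middle index of the segment. */
int median_of_three(const int A[], int lo, int hi) {
    int mid = lo + (hi - lo) / 2;
    if (A[mid] < A[lo])
        return (A[hi] < A[mid]) ? mid : ((A[hi] < A[lo]) ? hi : lo);
    else
        return (A[hi] < A[lo]) ? lo : ((A[hi] < A[mid]) ? hi : mid);
}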
Best-Case Running Time
 The best case for quick-sort occurs when each pivot partitions the
array in half.
 Then there are O(log n) levels
 There is O(n) work at each level
 Thus total running time is O(n log n)

depth   time
0       n
1       n
…       …
i       n

- 74 -

Quick Sort

Best Time:      T(n) = 2T(n/2) + Θ(n) = Θ(n log n)

Worst Time:     T(n) = T(n − 1) + Θ(n) = Θ(n²)

Expected Time:  Θ(n log n)

- 75 -
Worst-case Running Time
 The worst case for quick-sort occurs when the pivot is the unique
minimum or maximum element
 One of L and G has size n – 1 and the other has size 0
 The running time is proportional to the sum
n  (n – 1)  …  2  
 Thus, the worst-case running time of quick-sort is O(n2)
depth time
0 n

1 n– 1

… …

n– 1 1
- 76 -
Average-Case Running Time
 If the pivot is selected randomly, the average-case running time
for Quick Sort is O(n log n).
 Proving this requires a probabilistic analysis.
 We will simply provide an intuition for why average-case O(n log n)
is reasonable.

- 77 -
Expected Time Complexity for Quick Sort

Q: Why is it reasonable to expect O(n log n) time complexity?

- 78 -
Expected Time Complexity for Quick Sort

A: Suppose each partition splits the remaining n − 1 elements
into fractions p and q = 1 − p.

Then T(n) ≤ T(p(n − 1)) + T(q(n − 1)) + O(n)

WLOG, suppose that q ≥ p.
Let k be the depth of the recursion tree.
Then q^k (n − 1) ≈ 1  ⇒  k ≈ log n / log(1/q)
Thus k ∈ O(log n):
O(n) work done per level  ⇒  T(n) = O(n log n).

- 79 -
Properties of QuickSort
 In-place? Yes
 Stable? No
 Fast?
 Depends.
 Worst Case: Θ(n²)
 Expected Case: Θ(n log n)

- 80 -
Summary of Comparison Sorts

Algorithm   Best Case   Worst Case   Average Case   In Place   Stable   Comments
Selection   n²          n²                          Yes        Yes
Bubble      n           n²                          Yes        Yes
Insertion   n           n²                          Yes        Yes      Good if often almost sorted
Merge       n log n     n log n                     No         Yes      Good for very large datasets that require swapping to disk
Heap        n log n     n log n                     Yes        No       Best if guaranteed n log n required
Quick       n log n     n²           n log n        Yes        No       Usually fastest in practice
- 81 -
Comparison Sort: Lower Bound

MergeSort and HeapSort are both Θ(n log n) (worst case).

Can we do better?

- 82 -
Comparison Sort: Decision Trees

 Example: Sorting a 3-element array A[1..3]

- 83 -
Comparison Sort: Decision Trees
 For a 3-element array, there are 6 external nodes.

 For an n-element array, there are n! external nodes.

- 84 -
Comparison Sort

 To store n! external nodes, a decision tree must have a
height of at least log n!

 Worst-case time is equal to the height of the binary
decision tree.

Thus T(n) ∈ Ω(log n!),

where log n! = Σ_{i=1}^{n} log i ≥ (n/2) log(n/2) ∈ Ω(n log n)

Thus T(n) ∈ Ω(n log n)

Thus MergeSort & HeapSort are asymptotically optimal.

- 85 -
Searching and Sorting

Linear Search
Binary Search

86
Linear Search

Searching is the process of determining whether or not a given


value exists in a data structure or a storage media.
We discuss two searching methods on one-dimensional
arrays: linear search and binary search.
The linear (or sequential) search algorithm on an array is:
Sequentially scan the array, comparing each array item with the searched value.
If a match is found, return the index of the matched element; otherwise return –1.
Note: linear search can be applied to both sorted and unsorted
arrays.

87
Linear Search
The algorithm translates to the following C method:
// Linear Search in C

#include <stdio.h>

int search(int array[], int n, int x) {

// Going through the array sequentially


for (int i = 0; i < n; i++)
if (array[i] == x)
return i;
return -1;
}
88
Linear Search

int main() {
int array[] = {2, 4, 0, 1, 9};
int x = 1;
int n = sizeof(array) / sizeof(array[0]);

int result = search(array, n, x);

(result == -1) ? printf("Element not found") : printf("Element


found at index: %d", result);
}

89
Binary Search

Binary search uses a recursive method to


search an array to find a specified value
The array must be a sorted array:
a[0] ≤ a[1] ≤ a[2] ≤ … ≤ a[finalIndex]
If the value is found, its index is returned
If the value is not found, -1 is returned
Note: Each execution of the recursive method
reduces the search space by about a half

90
BinarySearch

An algorithm to solve this task looks at the


middle of the array or array segment first
If the value looked for is smaller than the
value in the middle of the array
Then the second half of the array or array segment
can be ignored
This strategy is then applied to the first half of the
array or array segment

91
Binary Search

If the value looked for is larger than the value in the


middle of the array or array segment
Then the first half of the array or array segment can be
ignored
This strategy is then applied to the second half of the array
or array segment
If the value looked for is at the middle of the array or
array segment, then it has been found
If the entire array (or array segment) has been
searched in this way without finding the value, then it
is not in the array

92
Pseudocode for Binary Search

93
Recursive Method for Binary Search
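The pseudocode and method listing are images in the original slides. The following C sketch matches the behavior described in the surrounding text (the name search and the first/last/mid names follow that prose; the code itself is a reconstruction, not the book's listing):

/* Recursive binary search for key in sorted array a[first..last].
   Returns the index of key, or -1 if key is not present. */
int search(const int a[], int first, int last, int key) {
    if (first > last)
        return -1;                         /* empty segment: not found */
    int mid = first + (last - first) / 2;  /* overflow-safe midpoint */
    if (key == a[mid])
        return mid;
    else if (key < a[mid])
        return search(a, first, mid - 1, key);  /* ignore second half */
    else
        return search(a, mid + 1, last, key);   /* ignore first half */
}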

94
Execution of
the Method
search
(Part 1 of 2)

95
Execution of the Method search
(Part 2 of 2)

96
Checking the search Method

1. There is no infinite recursion


• On each recursive call, the value of first is increased, or the
value of last is decreased
• If the chain of recursive calls does not end in some other way, then
eventually the method will be called with first larger than last
2. Each stopping case performs the correct
action for that case
• If first > last, there are no array elements between
a[first] and a[last], so key is not in this segment of the
array, and result is correctly set to -1
• If key == a[mid], result is correctly set to mid

97
Checking the search Method

3. For each of the cases that involve


recursion, if all recursive calls perform
their actions correctly, then the entire case
performs correctly
• If key < a[mid], then key must be one
of the elements a[first] through
a[mid-1], or it is not in the array
• The method should then search only those
elements, which it does
• The recursive call is correct, therefore the
entire action is correct

98
Checking the search Method

• If key > a[mid], then key must be one


of the elements a[mid+1] through
a[last], or it is not in the array
• The method should then search only those
elements, which it does
• The recursive call is correct, therefore the
entire action is correct
The method search passes all three tests:
Therefore, it is a good recursive method
definition

99
Efficiency of Binary Search

The binary search algorithm is extremely


fast compared to an algorithm that tries all
array elements in order
About half the array is eliminated from consideration right at the
start
Then a quarter of the array, then an eighth of the array, and so
forth

100
Efficiency of Binary Search

Given an array with 1,000 elements, the binary search


will only need to compare about 10 array elements to
the key value, as compared to an average of 500 for a
serial search algorithm
The binary search algorithm has a worst-case running
time that is logarithmic: O(log n)
A serial search algorithm is linear: O(n)
If desired, the recursive version of the method
search can be converted to an iterative version that
will run more efficiently

101
Iterative Version of Binary Search
(Part 1 of 2)

102
Iterative Version of Binary Search
(Part 2 of 2)
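The iterative listing is likewise an image in the original. A C sketch of the usual loop version, consistent with the recursive reconstruction above (again my own naming, not the book's code):

/* Iterative binary search: same halving logic as the recursive
   version, but first/last are updated in a loop. */
int search_iterative(const int a[], int n, int key) {
    int first = 0, last = n - 1;
    while (first <= last) {
        int mid = first + (last - first) / 2;
        if (key == a[mid])
            return mid;
        else if (key < a[mid])
            last = mid - 1;                /* discard second half */
        else
            first = mid + 1;               /* discard first half */
    }
    return -1;                             /* key is not in the array */
}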

103
