Data Structures and Algorithms
Lecture 7. Basic sorting algorithms
Sorting problem
Input
A sequence of n numbers $a_1, a_2, \ldots, a_n$.
Output
A permutation (reordering) $a'_1, a'_2, \ldots, a'_n$ of the
input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$.
The numbers are referred to as keys.
The sequences are typically stored in arrays.
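The output condition can be checked mechanically. A minimal sketch in Python, where the helper name `is_sorted_permutation` is hypothetical and introduced only for illustration:

```python
from collections import Counter


def is_sorted_permutation(original, result):
    """Return True if `result` is a non-decreasing reordering of `original`."""
    # Same keys with the same multiplicities: `result` is a permutation.
    same_keys = Counter(original) == Counter(result)
    # Every adjacent pair is in order: a'_1 <= a'_2 <= ... <= a'_n.
    in_order = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    return same_keys and in_order


print(is_sorted_permutation([5, 2, 4, 6, 1, 3], [1, 2, 3, 4, 5, 6]))  # True
print(is_sorted_permutation([5, 2, 4], [2, 4, 4]))                    # False
```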
Insertion sort
An efficient algorithm for sorting a small number of
elements.
The procedure takes as a parameter an array A[1..n]
containing a sequence of length n that is to be sorted.
Insertion sort works the way many people sort a
bridge or gin rummy hand.
We start with an empty left hand and the cards face
down on the table.
We then remove one card at a time from the table
and insert it into the correct position in the left hand.
To find the correct position for a card, we compare it
with each of the cards already in the hand, from
right to left.
Insertion-sort procedure
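As a runnable illustration of the card-sorting idea, here is a minimal Python sketch of the standard insertion sort. It assumes the usual textbook formulation; Python lists are 0-indexed, so `j` starts at 1 rather than 2, and the function name `insertion_sort` is chosen for illustration.

```python
def insertion_sort(a):
    """Sort the list `a` in place in non-decreasing order and return it."""
    for j in range(1, len(a)):
        key = a[j]                    # the card picked up from the table
        i = j - 1
        # Scan the sorted prefix a[0..j-1] from right to left,
        # shifting every element larger than key one position right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                # insert the key into the gap
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```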
Example
The operation of INSERTION-SORT on the array A = [5, 2, 4, 6, 1, 3].
Array indices appear above the rectangles.
Values stored in the array positions appear within the rectangles.
(a)-(e) The iterations of the for loop of lines 1-8. In each iteration, the
black rectangle holds the key taken from A[j], which is compared with
the values in the shaded rectangles to its left in the test of line 5. Shaded
arrows show array values moved one position to the right in line 6, and
black arrows indicate where the key is moved to in line 8.
(f) The final sorted array.
Running time
For each statement, we record its time cost and the number of
times it is executed.
The running time of the algorithm is the sum of running times
for each statement executed.
To compute T(n), the running time of INSERTION-SORT, we
sum the products of the cost and times columns.
Running time
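Summing cost times count over all statements gives the general form below. This is a sketch that assumes the standard textbook numbering of INSERTION-SORT: line 3 is a comment with no cost, line 5 is the while test, lines 6 and 7 are the shift and the decrement of i, and line 8 places the key. Here $t_j$ is the number of times the while test runs for a given j.

```latex
T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \sum_{j=2}^{n} t_j
     + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n-1)
```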
Insertion Sort Analysis: Best case
Best case: The array is already sorted.
Always find that $A[i] \le key$ upon the first time the
while loop test is run (when $i = j - 1$).
All $t_j$ are 1.
Running time is
$T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_5(n-1) + c_8(n-1)$
$= (c_1 + c_2 + c_4 + c_5 + c_8)n - (c_2 + c_4 + c_5 + c_8)$.
Can express T(n) as an + b for constants a and b
(that depend on the statement costs $c_i$).
T(n) is a linear function of n.
Insertion Sort Analysis: Worst case
Worst case: The array is in reverse sorted order.
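In this case every key must be compared with each element of the sorted prefix before it, so $t_j = j$ for j = 2, ..., n. A sketch of the resulting derivation, assuming the standard textbook numbering of the statement costs $c_1, \ldots, c_8$ (with $c_6$ and $c_7$ the costs of the two statements in the while loop body):

```latex
\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1, \qquad
\sum_{j=2}^{n} (j-1) = \frac{n(n-1)}{2}

T(n) = c_1 n + c_2 (n-1) + c_4 (n-1) + c_5 \left( \frac{n(n+1)}{2} - 1 \right)
     + c_6 \frac{n(n-1)}{2} + c_7 \frac{n(n-1)}{2} + c_8 (n-1)
     = \left( \frac{c_5}{2} + \frac{c_6}{2} + \frac{c_7}{2} \right) n^2
     + \left( c_1 + c_2 + c_4 + \frac{c_5}{2} - \frac{c_6}{2} - \frac{c_7}{2} + c_8 \right) n
     - (c_2 + c_4 + c_5 + c_8)
```

T(n) can therefore be expressed as an^2 + bn + c for constants a, b and c: a quadratic function of n.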
Insertion Sort Analysis
Best case: $O(n)$ (array is already sorted)
Average case: $O(n^2)$
Worst case: $O(n^2)$ (array is in reverse order)
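The gap between the best and worst cases can be observed directly by counting key comparisons. A small Python sketch, assuming the standard insertion sort; the function name `insertion_sort_comparisons` is hypothetical:

```python
def insertion_sort_comparisons(values):
    """Insertion-sort a copy of `values`; return the number of key comparisons."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1          # one comparison of a[i] against key
            if a[i] <= key:
                break                 # key already belongs right after a[i]
            a[i + 1] = a[i]           # shift the larger element right
            i -= 1
        a[i + 1] = key
    return comparisons


n = 100
print(insertion_sort_comparisons(range(n)))         # sorted input: n - 1 = 99
print(insertion_sort_comparisons(range(n, 0, -1)))  # reversed input: n(n-1)/2 = 4950
```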
Selection Sort
Selection sort orders a list of values by repeatedly
putting a particular value into its final position.
More specifically:
find the smallest value in the list
switch it with the value in the first position
find the next smallest value in the list
switch it with the value in the second position
repeat until all values are in their proper places
Example
Selection Sort in Pseudocode
SelectionSort(A)
    // GOAL: place the elements of A in ascending order
    n := length[A]
    for i := 1 to n - 1
        // GOAL: place the correct number in A[i]
        j := FindIndexOfSmallest(A, i, n)
        swap A[i] with A[j]
        // Loop invariant: A[1..i] holds the i smallest numbers, sorted
    end-for
    // A[n] is now the largest number, so A[1..n] is sorted
end-procedure

FindIndexOfSmallest(A, i, n)
    // GOAL: return j in the range [i,n] such
    // that A[j] <= A[k] for all k in range [i,n]
    smallestAt := i
    for j := (i+1) to n
        if A[j] < A[smallestAt] then smallestAt := j
        // Loop invariant: A[smallestAt] is the smallest among A[i..j]
    end-for
    return smallestAt
end-procedure
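The pseudocode above translates almost line for line into Python. A minimal sketch, assuming 0-based indexing (so the pseudocode's A[1..n] becomes `a[0..n-1]`); the snake_case function names are adaptations for illustration, and the outer loop stops one position early because the last element is already in place by then.

```python
def find_index_of_smallest(a, i):
    """Return the index of the smallest element in a[i:]."""
    smallest_at = i
    for j in range(i + 1, len(a)):
        if a[j] < a[smallest_at]:
            smallest_at = j
    return smallest_at


def selection_sort(a):
    """Sort the list `a` in place in ascending order and return it."""
    for i in range(len(a) - 1):
        j = find_index_of_smallest(a, i)
        a[i], a[j] = a[j], a[i]
        # Invariant: a[0..i] holds the i + 1 smallest values, in sorted order.
    return a


print(selection_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```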
Running time
Selecting the lowest element requires scanning all
n elements (this takes n - 1 comparisons) and
then swapping it into the first position.
Finding the next lowest element requires scanning
the remaining n - 1 elements, and so on, for a total of
$(n-1) + (n-2) + \cdots + 2 + 1 = n(n-1)/2 = \Theta(n^2)$
comparisons.
Each of these scans requires one swap, for a total of n - 1
swaps (the final element is already in place).
Thus, the comparisons dominate the running time,
which is $\Theta(n^2)$.
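These counts can be confirmed with an instrumented version of the algorithm. A short Python sketch (the function name `selection_sort_counts` is hypothetical); note that the counts depend only on n, not on the order of the input:

```python
def selection_sort_counts(values):
    """Selection-sort a copy of `values`; return (comparisons, swaps)."""
    a = list(values)
    n = len(a)
    comparisons = swaps = 0
    for i in range(n - 1):
        smallest_at = i
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[smallest_at]:
                smallest_at = j
        a[i], a[smallest_at] = a[smallest_at], a[i]
        swaps += 1                    # exactly one swap per scan
    return comparisons, swaps


print(selection_sort_counts([5, 2, 4, 6, 1, 3]))  # (15, 5)
print(selection_sort_counts(range(100)))          # (4950, 99)
```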
Bubble Sort
Suppose we have an array of data which is
unsorted:
starting at the front, traverse the array, find the
largest item, and move (or bubble) it to the top
with each subsequent iteration, find the next largest
item and bubble it up towards the top of the array
Bubble Sort
Starting with the first item, assume that it is the
largest
Compare it with the second item:
if the first is larger, swap the two,
otherwise, assume that the second item is the
largest
Continue up the array, either swapping or
redefining the largest item
Bubble Sort
After one pass, the largest item must be the last in
the list
Start at the front again:
the second pass will bring the second largest
element into the second-to-last position
Repeat $n - 1$ times, after which all entries will be
in place
Bubble Sort: Example
Consider the unsorted
array to the right
We start with the element
in the first location, and
move forward:
if the current and next
items are in order,
continue with the next
item, otherwise
swap the two entries
Bubble Sort: Example
After one loop, the
largest element is in the
last location
Repeat the procedure
Bubble Sort: Example
Now the two largest
elements are at the end
Repeat again
Bubble Sort: Example
With this loop, 12 is
brought to its appropriate
location
Bubble Sort: Example
Finally, we swap the last
two entries to order them
At this point, we have a
sorted array
Pseudocode
procedure bubbleSort( A : list of sortable items ) defined as:
    n := length( A )
    do
        swapped := false
        n := n - 1
        // A is 0-indexed; A[ i ] is compared with A[ i + 1 ], so i must stop at n - 1
        for each i in 0 to n - 1 inclusive do:
            if A[ i ] > A[ i + 1 ] then
                swap( A[ i ], A[ i + 1 ] )
                swapped := true
            end if
        end for
    while swapped
end procedure
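A runnable counterpart to the pseudocode, as a minimal Python sketch. It keeps the same early exit when a pass makes no swap; the function name `bubble_sort` and the sample array are chosen for illustration.

```python
def bubble_sort(a):
    """Sort the list `a` in place in ascending order and return it."""
    n = len(a)
    swapped = True
    while swapped and n > 1:
        swapped = False
        n -= 1                        # a[n] will be the last slot compared
        for i in range(n):            # i = 0 .. n - 1, so a[i + 1] stays in range
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        # After this pass, the largest item of a[0..n] sits at a[n].
    return a


print(bubble_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```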
Worst-case Run-time Analysis
With the first pass, we must make $n - 1$
comparisons
This equals the maximum number of swaps in that pass, so
we need only count comparisons
The total number of comparisons is:
$\sum_{k=1}^{n-1} (n - k) = n(n-1) - \frac{n(n-1)}{2} = \frac{n(n-1)}{2}$
Worst-case Run-time Analysis
Consequently, the worst-case run-time of bubble
sort is $O(n^2)$
The average case is no different: the form of the
algorithm requires $O(n^2)$ comparisons
Observation
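Any algorithm which sorts only by swapping adjacent
entries must have an average run time of $\Omega(n^2)$:
a random permutation has $n(n-1)/4$ inversions on average,
and each adjacent swap removes at most one of them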
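One way to see where the quadratic bound comes from: swapping two adjacent out-of-order entries removes exactly one inversion (a pair that appears in the wrong order), and a random permutation of n distinct keys has n(n-1)/4 inversions on average. The Python sketch below checks both facts exhaustively for a small n; the function names are hypothetical.

```python
from itertools import permutations


def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j]."""
    return sum(1 for i in range(len(a))
               for j in range(i + 1, len(a)) if a[i] > a[j])


def bubble_sort_swaps(values):
    """Bubble-sort a copy of `values`; return the number of adjacent swaps."""
    a = list(values)
    swaps = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
    return swaps


n = 5
perms = list(permutations(range(n)))
# Bubble sort performs exactly one adjacent swap per inversion.
assert all(bubble_sort_swaps(p) == count_inversions(p) for p in perms)
average = sum(count_inversions(p) for p in perms) / len(perms)
print(average, n * (n - 1) / 4)  # 5.0 5.0
```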