
Analysis of Algorithms - Medians and Order Statistics

The document discusses algorithms for finding the ith order statistic of a set of n elements. Specifically:
- The ith order statistic is the ith smallest element in the set. The minimum is the 1st order statistic and the maximum is the nth order statistic.
- A naive algorithm takes O(n log n) time by first sorting the elements. A more efficient randomized divide-and-conquer algorithm, randomized selection, takes O(n) expected time.
- Randomized selection partitions the array as quicksort does, around a randomly chosen pivot, but recurses into only one side of the partition, which gives expected linear running time.

Uploaded by

felix kagota
Copyright
© Attribution Non-Commercial (BY-NC)

6/15/2011

ith Order Statistic


- The ith order statistic of a set of n elements is the ith smallest element.
- The minimum of a set of elements is the first order statistic, while the maximum is the nth order statistic.
- Order statistics are used, for instance, to build filters in image processing.
- The median is the item at the halfway point of the set when the set is sorted.
- If n is odd, the median is the item at position (n+1)/2.
- If n is even, there are two medians, at positions n/2 and n/2 + 1; we take the one at position ⌊(n+1)/2⌋ = n/2, also called the lower median.
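As a small illustration of the position rule above, the following Python sketch computes the 1-based position of the lower median and looks it up in a sorted copy of the data. The function names `lower_median_position` and `lower_median` are chosen here for illustration and do not come from the slides.

```python
def lower_median_position(n: int) -> int:
    """1-based position of the lower median in a sorted set of n items."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n + 1) // 2  # integer division gives floor((n + 1) / 2)


def lower_median(values):
    """Return the lower median by sorting a copy of the input."""
    ordered = sorted(values)
    # Convert the 1-based position to a 0-based list index.
    return ordered[lower_median_position(len(ordered)) - 1]


print(lower_median([7, 1, 5]))     # n = 3, position 2 -> 5
print(lower_median([7, 1, 5, 3]))  # n = 4, position 2 -> 3
```

For odd n the position is the exact middle. For even n the integer division rounds down, which selects the lower of the two medians.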

MSc Computer Science ICS 801 Design and Analysis of Algorithms

medians and order statistics

ith Order Statistic


We need to be able to select the ith order statistic from a set of n numbers. The problem is stated formally as follows:

Input: A set A of n numbers and an integer i, with 1 ≤ i ≤ n.
Output: The ith smallest value in A.

1st and nth Order Statistics

The 1st and nth order statistics can each be found using n-1 comparisons with the following algorithm and its variants:

minimum(A)
    min = A[1]
    for i = 2 to n
        if min > A[i]
            min = A[i]
    return min

Exercise:
1. Analyze the algorithm to get its running time.
2. Modify the algorithm to get the 2nd order statistic.
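The minimum procedure from the 1st and nth Order Statistics slide can be sketched in Python as follows. The comparison counter is an addition made here, not part of the slide's pseudocode, so that the n-1 comparison count can be observed directly. The list is 0-based, unlike the 1-based array A in the pseudocode.

```python
def minimum(values):
    """Return (smallest value, number of element comparisons made)."""
    if not values:
        raise ValueError("values must be non-empty")
    smallest = values[0]
    comparisons = 0  # illustrative counter, not in the original pseudocode
    for candidate in values[1:]:
        comparisons += 1  # exactly one comparison per remaining element
        if smallest > candidate:
            smallest = candidate
    return smallest, comparisons


smallest, comparisons = minimum([9, 4, 7, 1, 8])
print(smallest, comparisons)  # 1 4
```

Reversing the comparison (`smallest < candidate`) gives the maximum, that is, the nth order statistic, with the same number of comparisons.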


The selection problem can be solved in O(n log n) time by sorting the numbers and returning the ith element of the sorted order. Practical algorithms exist that solve it in O(n) time.
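The sorting-based approach can be sketched in a few lines of Python. The function name `select_by_sorting` is hypothetical, and the sketch relies on Python's built-in `sorted`, so it serves only as a baseline against which a linear-time method can be compared.

```python
def select_by_sorting(values, i):
    """Return the ith smallest value (1-based i) using a full sort."""
    n = len(values)
    if not 1 <= i <= n:
        raise ValueError("i must satisfy 1 <= i <= n")
    # Sorting dominates the cost: O(n log n) comparisons.
    return sorted(values)[i - 1]


print(select_by_sorting([9, 4, 7, 1, 8], 2))  # 4
```

The sort does more work than the problem requires, since it fixes the rank of every element when only one rank is asked for.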

Selection in Expected Linear Time

- There is a divide-and-conquer algorithm for the selection problem.
- It partitions the array of numbers in the same way as quicksort, but recurses into only one side of the partition.
- Whereas quicksort has an expected running time of Θ(n log n), RANDOMIZED-SELECT has an expected running time of Θ(n).
- It uses RANDOMIZED-PARTITION, which picks the pivot at random, to make balanced partitions likely whatever the order of the input.
- The algorithm is given in the next slide.

Selection in Expected Linear Time

RANDOMIZED-SELECT(A, p, r, i)
    if p = r
        return A[p]
    q ← RANDOMIZED-PARTITION(A, p, r)
    k ← q - p + 1
    if i = k    // the pivot value is the answer
        return A[q]
    elseif i < k
        return RANDOMIZED-SELECT(A, p, q - 1, i)
    else return RANDOMIZED-SELECT(A, q + 1, r, i - k)
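A runnable Python sketch of RANDOMIZED-SELECT is given below, together with the two partition helpers it relies on (RANDOMIZED-PARTITION and PARTITION, as defined in this deck). It assumes 0-based list indices for p and r while keeping i as a 1-based rank, and it reorders the list it is given, so the usage line passes a copy.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_partition(a, p, r):
    """Swap a randomly chosen element into position r, then partition."""
    i = random.randint(p, r)  # randint is inclusive on both ends
    a[r], a[i] = a[i], a[r]
    return partition(a, p, r)


def randomized_select(a, p, r, i):
    """Return the ith smallest (1-based i) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]  # the pivot value is the answer
    elif i < k:
        return randomized_select(a, p, q - 1, i)
    else:
        return randomized_select(a, q + 1, r, i - k)


data = [9, 4, 7, 1, 8, 3]
print(randomized_select(data[:], 0, len(data) - 1, 3))  # 3rd smallest: 4
```

Unlike quicksort, each call continues into only one side of the partition, which is where the saving over sorting comes from.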


Selection in Expected Linear Time


RANDOMIZED-PARTITION(A, p, r)
    i ← RANDOM(p, r)
    exchange A[r] ↔ A[i]
    return PARTITION(A, p, r)

Selection in Expected Linear Time


If we assume that the random choice of pivot always splits the array into two partitions of equal size, then the algorithm continues on only one of the halves, and the following summation describes its running time. Note that we sum the time used to partition the array at each level, which is proportional to the size of the subarray being partitioned.

PARTITION(A, p, r)
    x ← A[r]
    i ← p - 1
    for j ← p to r - 1
        if A[j] ≤ x
            i ← i + 1
            exchange A[i] ↔ A[j]
    exchange A[i + 1] ↔ A[r]
    return i + 1

$$n + \frac{n}{2} + \frac{n}{4} + \cdots + 1 = n\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{n}\right) = n\sum_{i=0}^{\log n}\left(\frac{1}{2}\right)^{i}$$

Using the following geometric series,

$$\sum_{i=0}^{\infty} x^{i} = \frac{1}{1-x}, \quad \text{if } |x| < 1$$


Selection in Expected Linear Time


Then we have

$$n\sum_{i=0}^{\log n}\left(\frac{1}{2}\right)^{i} < n\left(\frac{1}{1-0.5}\right) = 2n = \Theta(n)$$
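The bound can be checked numerically. The sketch below assumes, as the analysis does, that every partition halves the problem exactly, and takes n to be a power of two so that the halving is exact. The function name `halving_cost` is chosen here for illustration.

```python
def halving_cost(n):
    """Sum n + n/2 + n/4 + ... + 1 for n a power of two."""
    total = 0
    while n >= 1:
        total += n  # partitioning cost at this level
        n //= 2     # continue on one half only
    return total


for n in (8, 1024, 2**20):
    # Each line shows n, the summed cost, and the bound 2n.
    print(n, halving_cost(n), 2 * n)
```

For n = 8 the sum is 8 + 4 + 2 + 1 = 15, just under 2n = 16. In general the sum equals 2n - 1 when n is a power of two, consistent with the strict inequality above.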
