Journal of Statistical Computation and Simulation
ISSN: 0094-9655 (Print) 1563-5163 (Online) Journal homepage: http://www.tandfonline.com/loi/gscs20
Ranked set sampling with unequal samples for
skew distributions
Dinesh S. Bhoj & Debashis Kushary
To cite this article: Dinesh S. Bhoj & Debashis Kushary (2015): Ranked set sampling with
unequal samples for skew distributions, Journal of Statistical Computation and Simulation,
DOI: 10.1080/00949655.2015.1028405
To link to this article: http://dx.doi.org/10.1080/00949655.2015.1028405
Published online: 01 Apr 2015.
Dinesh S. Bhoj and Debashis Kushary
Department of Mathematical Sciences, Rutgers University-Camden, Camden, NJ 08102, USA
(Received 20 November 2014; accepted 8 March 2015)
A ranked set sampling procedure with unequal samples for positively skew distributions (RSSUS) is
proposed and used to estimate the population mean. The estimators based on RSSUS are compared with
the estimators based on ranked set sampling (RSS) and median ranked set sampling (MRSS) procedures.
It is observed that the relative precisions of the estimators based on RSSUS are higher than those of the
estimators based on RSS and MRSS procedures.
Keywords: heavy right tail distributions; log-normal distribution; mean square error; median ranked set
sampling; Pareto distribution; positively skew distribution; relative precision; unbiased estimator; Weibull
distribution
1. Introduction
The ranked set sampling (RSS) procedure has been used advantageously in agricultural, environmental, ecological, and recently in human studies where the exact measurement of a unit is
either difficult or expensive. For such situations, McIntyre [1] introduced RSS to estimate the
population mean. RSS is a cost-efficient alternative to simple random sampling (SRS) if the
observations can be ranked according to the characteristic under investigation by means of visual
inspection or other methods not requiring actual measurements. McIntyre indicated that RSS is
superior to the SRS procedure for estimating the population mean. However, Dell and Clutter [2] and Takahasi and Wakimoto [3] were the first to provide mathematical
foundations for RSS. Dell and Clutter [2] also showed that the estimator for the population mean
based on RSS is at least as efficient as the estimator based on SRS with the same number of measurements even when there are ranking errors. Recently, RSS has been used in the parametric
setting.[4-8] Most of the distributions considered by these investigators belong to the family of
random variables with cumulative distribution function of the form $F((x - \mu)/\sigma)$, where $\mu$ and
$\sigma$ are the location and scale parameters, respectively. Bhoj [9] proposed a parametric new ranked
set sampling (NRSS) procedure to estimate the population mean. In NRSS, only half as many
random samples are drawn, but each sample is twice as large, and two observations are selected from each sample. The selection of these two observations depends on the distribution under consideration.
The estimator for the population mean based on NRSS has smaller variance than the
one based on RSS.
*Corresponding author. Email: [email protected]
© 2015 Taylor & Francis
The selection of ranked set sample of size n involves drawing n random samples with n units
in each sample. The n units in each sample are ranked by using judgement or other methods not
requiring actual measurements. The unit with the lowest rank is measured from the first sample,
the unit with the second lowest rank is observed from the second sample, and this procedure is
continued until the unit with the highest rank is measured from the last sample. The $n^2$ ordered
observations in the $n$ samples can be displayed in matrix form as
\[
\begin{matrix}
x_{(11)} & x_{(12)} & \cdots & x_{(1n)}, \\
x_{(21)} & x_{(22)} & \cdots & x_{(2n)}, \\
\vdots & \vdots & \ddots & \vdots \\
x_{(n1)} & x_{(n2)} & \cdots & x_{(nn)}.
\end{matrix}
\]
We measure only the $n$ observations $x_{(ii)}$, $i = 1, 2, \ldots, n$, and they constitute the RSS. We note that
these $n$ observations are independently but not identically distributed. In RSS, $n$ is usually small,
and therefore, in order to increase the sample size, the whole procedure above is repeated $r$ times.
For convenience, without loss of generality, we assume that $r = 1$.
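The selection procedure described above can be sketched in Python as follows. This is only a minimal illustration, not code from the paper: the names `rss_sample`, `rss_mean` and the `draw` callback are hypothetical, and ranking is done by sorting the true values (perfect ranking), whereas in practice the ranking would be made by judgement without measuring the units.

```python
import random


def rss_sample(n, draw):
    """Return one ranked set sample of size n (a single cycle, r = 1).

    `draw` is a hypothetical zero-argument callable returning one unit.
    Assumption: perfect ranking, simulated by sorting the true values.
    """
    measured = []
    for i in range(n):
        # Draw a fresh random sample of n units and rank it.
        units = sorted(draw() for _ in range(n))
        # Measure only the unit with rank i + 1 from the (i + 1)th sample.
        measured.append(units[i])
    return measured


def rss_mean(measured):
    """Average of the n measured observations x_(ii)."""
    return sum(measured) / len(measured)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = rss_sample(3, lambda: rng.lognormvariate(0.0, 1.0))
    print(sample, rss_mean(sample))
```

Only $n$ of the $n^2$ drawn units are measured, which is where the cost saving of RSS over SRS comes from.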
2. Variations of RSS
In practice, the distribution of the variable under consideration is unknown. Therefore, the estimation of the population mean, $\mu$, under the nonparametric framework is very important. There
are various modifications of RSS to get a better estimator for $\mu$. One of the popular schemes is
median ranked set sampling (MRSS).[4,10,11] In the MRSS procedure, we use the
$n^2$ ranked observations as in RSS. However, we measure the observation with rank $(n + 1)/2$
from each sample if $n$ is odd. If $n = 2m$ is even, we measure the $m$th order statistic from each of the
first $m$ samples and the $(m + 1)$th order statistic from each of the last $m$ samples. In recent years,
investigators have considered varied set size RSS and RSS with random subsamples.[12,13]
In this paper, we concentrate on ranked set sampling with unequal samples (RSSU) proposed
by Bhoj.[14] In RSSU, we draw $n$ samples, where the size of the $i$th sample is $n_i = 2i - 1$, for $i =
1, 2, \ldots, n$. The steps in RSSU are the same as in RSS. In both sampling procedures, we measure
accurately only $n$ observations. In RSSU, we rank only $n^2 - 1$ observations. When $n$ is even,
half the sample sizes are less than $n$ and the other half are greater than $n$. In the case of odd $n$,
one sample is of size $n$, $(n - 1)/2$ samples are smaller than $n$, and the other $(n - 1)/2$ samples are greater
than $n$. Bhoj [14] proposed estimators for $\mu$ using RSSU and showed that the estimators are
superior to the estimators based on RSS and MRSS when the distributions under consideration
are symmetrical or moderately skew. However, the proposed estimators based on RSSU do not
work well if the distributions are highly skewed. In practice, the data obtained are often nonnegative
and skewed with a heavy right tail. Hence, we concentrate on the estimators for $\mu$ by using RSSU
for highly positively skewed distributions.
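The RSSU design described above can be illustrated with a short Python sketch. It is a hedged illustration under stated assumptions rather than the authors' implementation: the names `rssu_sizes`, `rssu_sample` and the `draw` callback are hypothetical, and perfect ranking is simulated by sorting the true values.

```python
def rssu_sizes(n):
    """Sample sizes n_i = 2i - 1 for i = 1, ..., n."""
    return [2 * i - 1 for i in range(1, n + 1)]


def rssu_sample(n, draw):
    """Return the n measured observations of one RSSU cycle.

    `draw` is a hypothetical zero-argument callable returning one unit.
    Assumption: perfect ranking, simulated by sorting the true values.
    """
    measured = []
    for i, n_i in enumerate(rssu_sizes(n), start=1):
        # The ith sample has n_i = 2i - 1 units.
        units = sorted(draw() for _ in range(n_i))
        # Measure the ith order statistic of the ith sample.
        measured.append(units[i - 1])
    return measured


if __name__ == "__main__":
    print(rssu_sizes(4))       # sizes of the four samples
    print(sum(rssu_sizes(4)))  # total number of units drawn
```

The sizes sum to $n^2$, the same number of units as in RSS, and since the first sample contains a single unit only $n^2 - 1$ units need ranking. Note also that the $i$th order statistic of a sample of size $2i - 1$ is that sample's median.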
3. Estimation of the population mean
McIntyre [1] proposed the estimator for the population mean, $\mu$, based on RSS as
\[
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_{(ii)}.
\]
This is an unbiased estimator of $\mu$ with the property that $\mathrm{Var}(\hat{\mu}) < \mathrm{Var}(\bar{x})$, where $\bar{x}$ is the
sample mean based on the simple random sample of size $n$.
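The estimator, $\tilde{\mu}$, based on MRSS, defined in Section 2, is
\[
\tilde{\mu} =
\begin{cases}
\dfrac{1}{n} \left( \sum_{i=1}^{m} x_{(im)} + \sum_{i=m+1}^{n} x_{(i\,m+1)} \right) & \text{for even } n, \\[2ex]
\dfrac{1}{n} \sum_{i=1}^{n} x_{(ik)} & \text{for odd } n,
\end{cases}
\]
where $m = n/2$ and $k = (n + 1)/2$.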
The MRSS estimator $\tilde{\mu}$ is an unbiased estimator for $\mu$ when the distribution is symmetric around $\mu$. In this paper, we
consider highly positively skew distributions. For these distributions, $\tilde{\mu}$ is a biased
estimator for $\mu$ except for $n = 2$. In this case, for comparison with other estimators, we
use the mean square error (MSE) of $\tilde{\mu}$, where MSE = variance + (bias)$^2$.
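The MRSS selection rule and the MSE identity can be sketched in Python as follows. This is a minimal illustration, not the paper's code: `mrss_sample`, `mse` and the `draw` callback are hypothetical names, and perfect ranking is assumed by sorting the true values.

```python
def mrss_sample(n, draw):
    """Return the n measured observations of one MRSS cycle.

    `draw` is a hypothetical zero-argument callable returning one unit.
    Assumption: perfect ranking, simulated by sorting the true values.
    """
    m = n // 2
    measured = []
    for i in range(1, n + 1):
        units = sorted(draw() for _ in range(n))
        if n % 2 == 1:
            rank = (n + 1) // 2          # the median rank for odd n
        else:
            rank = m if i <= m else m + 1  # mth, then (m + 1)th order statistic
        measured.append(units[rank - 1])
    return measured


def mse(variance, bias):
    """Mean square error: variance + (bias)^2."""
    return variance + bias ** 2


if __name__ == "__main__":
    values = iter([4, 1, 2, 7])
    # For n = 2 the rule picks rank 1 from sample 1 and rank 2 from sample 2.
    print(mrss_sample(2, lambda: next(values)))
```

For $n = 2$ the rule selects the first order statistic from the first sample and the second from the second sample, which is exactly the RSS selection; this is consistent with the MRSS estimator being unbiased for $n = 2$.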
Now we propose a set of estimators for $\mu$ based on RSSU. Bhoj [14] proposed estimators
for $\mu$ which are weighted averages of $x_{(ii)n_i}$, where $x_{(ii)n_i}$ is the $i$th order statistic from a sample of
size $n_i$. The weights used were proportional to $n_i + h$, where $0 \le h \le 1$. These weights worked
quite well when the distributions under consideration were symmetric around $\mu$ or moderately
skew. However, these weights are not appropriate for highly positively skew distributions with a
heavy right tail. In this paper, we propose estimators for $\mu$ which are weighted linear combinations of $x_{(ii)n_i}$ for heavy right tail distributions. We have chosen weights appropriate
for estimating $\mu$ for extremely positively skewed distributions. These distributions may have a
high coefficient of skewness or extreme values of the coefficient of variation or both. In this paper,
we have chosen the well-known log-normal and Weibull(0.5) distributions, which have heavy right tails.
In addition, we selected two distributions from the Pareto family, which is heavily used in
studying income distributions. For these four distributions, the means and variances of the order
statistics are readily available in the literature for computations; see Harter and Balakrishnan.[15]
We now propose the following set of nonparametric estimators for $\mu$ based on RSSUS:
\[
\mu^*_k = \sum_{i=1}^{n} w_i \, x_{(ii)n_i}, \qquad k = 1, 2, \ldots, 6.
\]
We considered various weights that were based on the ratios $w_i/w_1$ and are given by
\[
\frac{w_i}{w_1} = n_i + (n_i - n_{i-1} + d_i h_i) h,
\]
where
\[
d_i = \frac{i}{i^2 - 6}, \quad h_2 = 1, \quad h_3 = \frac{1}{h}, \quad h_4 = h, \quad \text{and} \quad 0 < h \le 1.
\]
The values of $w_1$ for different values of $n$ are determined so that the new set of estimators for
$\mu$ based on RSSUS would perform better than the estimators for $\mu$ based on RSS and MRSS
procedures for the chosen four heavy right tail distributions. We use
\[
w_1 = \frac{n_1 + h_1}{D_i},
\]
$i = 2$, 3, and 4 for $n = 2$, 3 and 4, where
\[
D_i = n^2 + (2n - 3)h + (i - 2) \quad \text{for } i = 2 \text{ and } 3,
\]
\[
D_4 = n^2 + 1 + (2n - 3)h + 0.4h^2 \quad \text{for } n = 4,
\]
and
\[
h_1 = \frac{n \left( n - 2 + \left| (i + ch)/(i^2 - 6) \right| \right)}{100},
\]
where $c = 0$ for $i = 2$ and 3, and $c = 4$ for $i = 4$.
In order to keep the number of weights within reasonable limits, in this article we used only five
values of $w_k$, with $h = 0.75$, 0.80, 0.85, 0.90, and 0.95. The main reason for this choice of the values
of $h$ is that, for some distributions, near optimal ratios of the weights correspond to some of these values of $h$.
For example, $h = 0.75$ gives near optimal values of the ratios of weights for the Weibull distribution
for $n \ge 3$ and $h = 0.95$ gives near optimal values of the ratios of weights for the Pareto(5) and
log-normal distributions for $n = 2$.
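The weight construction can be checked numerically with the Python sketch below. It rests on an assumed reading of the formulas in this section, namely $w_i/w_1 = n_i + (n_i - n_{i-1} + d_i h_i)h$ with $d_i = i/(i^2 - 6)$, $h_2 = 1$, $h_3 = 1/h$, $h_4 = h$, $w_1 = (n_1 + h_1)/D_n$, and $h_1 = n(n - 2 + |(n + ch)/(n^2 - 6)|)/100$ with $c = 4$ for $n = 4$ and $c = 0$ otherwise. The function name `rssus_weights` is hypothetical.

```python
def rssus_weights(n, h):
    """Weights for n in {2, 3, 4} and 0 < h <= 1 (assumed reading of the text).

    Returns (weights, ratios, D, h1), where ratios[i - 1] = w_i / w_1.
    """
    if n not in (2, 3, 4) or not 0 < h <= 1:
        raise ValueError("sketch covers n = 2, 3, 4 and 0 < h <= 1 only")
    sizes = [2 * i - 1 for i in range(1, n + 1)]  # n_i = 2i - 1
    h_i = {2: 1.0, 3: 1.0 / h, 4: h}
    ratios = [1.0]
    for i in range(2, n + 1):
        d_i = i / (i * i - 6)
        n_i, n_prev = sizes[i - 1], sizes[i - 2]
        ratios.append(n_i + (n_i - n_prev + d_i * h_i[i]) * h)
    if n == 4:
        D = n * n + 1 + (2 * n - 3) * h + 0.4 * h * h
        c = 4.0
    else:
        D = n * n + (2 * n - 3) * h + (n - 2)
        c = 0.0
    h1 = n * (n - 2 + abs((n + c * h) / (n * n - 6))) / 100
    w1 = (sizes[0] + h1) / D
    return [w1 * r for r in ratios], ratios, D, h1


if __name__ == "__main__":
    for h in (0.75, 0.80, 0.85, 0.90, 0.95):
        weights, ratios, D, h1 = rssus_weights(4, h)
        print(h, [round(w, 4) for w in weights], round(sum(ratios) - D, 12))
```

Under this reading the ratios $w_i/w_1$ sum exactly to $D_n$ for $n = 2$, 3 and 4, which supports the reconstruction, and the weights therefore sum to $n_1 + h_1 = 1 + h_1$ rather than to 1.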
4. Comparison of estimators
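In this section, we compare the various estimators for $\mu$ based on RSS, MRSS and RSSUS. For
this purpose, we define the following nonparametric relative precisions (RPNs):
\[
\mathrm{RPN}_k = \frac{\mathrm{Var}(\hat{\mu})}{\mathrm{MSE}(\mu^*_k)} \quad \text{for } k = 1, 2, \ldots, 5,
\]
\[
\mathrm{RPN}_6 =
\begin{cases}
\dfrac{\mathrm{Var}(\hat{\mu})}{\mathrm{MSE}(\tilde{\mu})} & \text{if } \tilde{\mu} \text{ is a biased estimator}, \\[2ex]
\dfrac{\mathrm{Var}(\hat{\mu})}{\mathrm{Var}(\tilde{\mu})} & \text{if } \tilde{\mu} \text{ is an unbiased estimator},
\end{cases}
\]
where $\hat{\mu}$, $\tilde{\mu}$ and $\mu^*_k$ denote the estimators based on RSS, MRSS and RSSUS, respectively.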
We note that the RSS estimator $\hat{\mu}$ is always an unbiased estimator for $\mu$. However, the RSSUS estimator $\mu^*_k$ is a biased estimator for
skew distributions.
In order to minimize the number of columns of RPNs, we did not give the columns of RPNs for
comparing the MRSS estimator $\tilde{\mu}$ with $\mu^*_k$. One can easily use RPN$_k$/RPN$_6$ for comparison of the estimators based
on MRSS and RSSUS: $\mu^*_k$ is better than $\tilde{\mu}$ if RPN$_k$ > RPN$_6$. The values of RPN$_j$, $j = 1, 2, \ldots, 6$,
are presented in Table 1 for the four distributions and the three small sample sizes. The biases and
variances of the estimators based on RSSUS and MRSS are given in Tables 2 and 3, respectively.
We note that $\mu^*_k$, $k = 1, 2, \ldots, 5$, based on RSSUS are all superior to the estimators of $\mu$ based on
RSS and MRSS for all distributions and sample sizes. The gains in precision of the estimators
of $\mu$ based on RSSUS over the estimator based on RSS are substantial. However, the gains
in precision of $\mu^*_k$ over the estimators based on MRSS range from very good to marginal depending
on the value of $n$ and the distribution. We note that the values of RPN$_k$, for $k = 1, 2, \ldots, 5$,
increase with $n$ for Pareto(2.5) and log-normal distributions. However, this property does not
hold for Pareto(5) and Weibull(0.5), which are extremely heavy right tail distributions. For these
distributions, when $n$ increases from three to four, the relative precisions decrease. In these cases,
Table 1. Nonparametric relative precisions for the estimators of $\mu$.

Distribution    n    RPN1     RPN2     RPN3     RPN4     RPN5     RPN6
Pareto(5)       2    1.913    1.930    1.962    1.913    1.886    1.000
                3    2.734    2.732    2.606    2.732    2.737    2.057
                4    1.962    1.937    1.913    1.889    1.866    1.859
Pareto(2.5)     2    4.086    4.129    4.155    4.164    4.193    1.000
                3    6.575    6.598    6.872    6.624    6.621    5.637
                4    9.115    9.163    9.212    9.260    9.308    5.039
Log-normal      2    2.184    2.263    2.244    2.272    2.283    1.000
                3    2.631    2.632    2.651    2.630    2.627    2.546
                4    2.661    2.664    2.666    2.669    2.671    2.277
Weibull(0.5)    2    2.654    2.769    2.743    2.789    2.810    1.000
                3    2.984    2.984    2.987    2.983    2.981    2.958
                4    2.472    2.472    2.472    2.472    2.472    2.470
Table 2. Bias for the estimators of $\mu$.

Distribution    n    Bias1     Bias2     Bias3     Bias4     Bias5     Bias6
Pareto(5)       2    0.0072    0.0375    0.0250    0.0419    0.0476    0.0000
                3    0.0000    0.0000    0.0002    0.0000    0.0000    0.0595
                4    0.0555    0.0564    0.0573    0.0581    0.0590    0.0596
Pareto(2.5)     2    0.1101    0.1657    0.1506    0.1719    0.1795    0.0000
                3    0.1594    0.1590    0.1410    0.1593    0.1614    0.2244
                4    0.1083    0.1074    0.1065    0.1056    0.1046    0.2244
Log-normal      2    0.2532    0.3035    0.2904    0.3099    0.3175    0.0000
                3    0.3587    0.3585    0.3434    0.3593    0.3616    0.3969
                4    0.3442    0.3437    0.3431    0.3425    0.3420    0.3969
Weibull(0.5)    2    0.6892    0.7377    0.7266    0.7461    0.7556    0.0000
                3    0.9540    0.9547    0.9432    0.9568    0.9600    0.9444
                4    1.0222    1.0224    1.0226    1.0227    1.0229    0.9444
Table 3. Variance for the estimators of $\mu$.

Distribution    n    Var1      Var2      Var3      Var4      Var5      Var6
Pareto(5)       2    0.0221    0.0206    0.0210    0.0204    0.0202    0.0424
                3    0.0090    0.0090    0.0092    0.0090    0.0089    0.0084
                4    0.0053    0.0053    0.0053    0.0053    0.0053    0.0053
Pareto(2.5)     2    0.2386    0.2206    0.2238    0.2164    0.2121    1.0243
                3    0.0728    0.0726    0.0741    0.0721    0.0714    0.0642
                4    0.0391    0.0391    0.0390    0.0389    0.0389    0.0417
Log-normal      2    0.8367    0.7773    0.7924    0.7697    0.7610    1.9672
                3    0.3131    0.3132    0.3206    0.3128    0.3118    0.2990
                4    0.1798    0.1799    0.1800    0.1801    0.1802    0.1912
Weibull(0.5)    2    2.8691    2.6607    2.7078    2.6259    2.5877    8.8750
                3    0.8972    0.8961    0.9160    0.8927    0.8875    0.9311
                4    0.4708    0.4704    0.4700    0.4696    0.4693    0.6250
we note from Tables 2 and 3 that although the variances of the RSSUS estimators $\mu^*_k$, $k = 1, 2, \ldots, 5$, decrease with $n$,
the biases increase as $n$ increases. Therefore, the MSE of the estimators increases for extremely
heavy right tail distributions. Moreover, the estimators based on RSSUS are adversely affected
by the extreme values of means and variances of the extremely heavy tail distributions since
RSSUS uses $n_1 = 1$. We note that the estimators based on MRSS are not directly affected by the
extreme values of means and variances of the probability distributions. We observe from Table 1
that the relative precision of the estimator based on MRSS decreases as $n$ increases from three to
four for all distributions considered in this paper.
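As a consistency check on the tabulated values, the Python sketch below recomputes RPN$_1$ for $n = 2$ from the first bias and variance columns of Tables 2 and 3. It relies on one inference that the paper does not state explicitly: for $n = 2$ the MRSS estimator has zero bias and RPN$_6$ = 1.000, so Var$_6$ at $n = 2$ is taken as the variance of the RSS estimator. The function names are hypothetical.

```python
def mse(variance, bias):
    """Mean square error: variance + (bias)^2."""
    return variance + bias ** 2


def relative_precision(var_rss, variance, bias=0.0):
    """Var(RSS estimator) divided by the MSE of the competing estimator."""
    return var_rss / mse(variance, bias)


# n = 2 rows: (Bias1, Var1, Var6, RPN1 as reported in Table 1).
# Assumption: Var6 at n = 2 equals the variance of the RSS estimator.
rows_n2 = {
    "Pareto(5)": (0.0072, 0.0221, 0.0424, 1.913),
    "Pareto(2.5)": (0.1101, 0.2386, 1.0243, 4.086),
    "Log-normal": (0.2532, 0.8367, 1.9672, 2.184),
    "Weibull(0.5)": (0.6892, 2.8691, 8.8750, 2.654),
}

recomputed = {
    name: relative_precision(var6, var1, bias1)
    for name, (bias1, var1, var6, _reported) in rows_n2.items()
}

if __name__ == "__main__":
    for name, value in recomputed.items():
        print(f"{name}: recomputed {value:.3f}, reported {rows_n2[name][3]:.3f}")
```

The recomputed values agree with Table 1 to within the rounding of the tabulated inputs. The same two functions give the MRSS comparison directly, since RPN$_k$/RPN$_6$ is the ratio of the MSE of the MRSS estimator to that of the RSSUS estimator.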
5. Conclusions
In this paper, we proposed an RSS procedure with unequal samples for skew distributions (RSSUS).
The set of estimators for the population mean based on RSSUS is derived under nonparametric
settings. The proposed estimators are weighted linear combinations of RSSU observations, where
the weights are functions of sample sizes. These estimators are compared with the estimators
based on RSS and median ranked set sampling (MRSS) schemes. We computed the relative
precisions of the proposed estimators for four heavy right tail distributions and sample sizes 2, 3
and 4. The numerical computations show that all estimators based on RSSUS are better than the
estimator based on RSS for all distributions and sample sizes considered in this paper. The gains
in precision are substantial. The relative precisions of the estimators based on RSSUS over the
estimator based on MRSS for n = 2 and 3 are very good. However, the gains in precisions are
marginal for $n = 4$. We recommend the RSSUS procedure for heavy right tail distributions and
sample sizes $n \le 4$. In order to increase the sample size, the cycle may be repeated $r \ge 2$ times.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
[1] McIntyre GA. A method of unbiased selective sampling, using ranked sets. Aust J Agric Res. 1952;3:385-390.
[2] Dell TR, Clutter JL. Ranked set sampling theory with order statistics background. Biometrics. 1972;28:545-555.
[3] Takahasi K, Wakimoto K. On unbiased estimates of the population mean based on the sample stratified by means
of ordering. Ann Inst Statist Math. 1968;20:1-31.
[4] Bhoj DS. Estimation of parameters using modified ranked set sampling. In: Ahsanullah M, editor. Applied statistical
science. Vol. II. New York: Nova Science; 1997. p. 145-163.
[5] Bhoj DS, Ahsanullah M. Estimation of parameters of the generalized geometric distribution using ranked set
sampling. Biometrics. 1996;52:685-694.
[6] Lam K, Sinha BK, Zhong W. Estimation of parameters in the two-parameter exponential distribution using ranked
set sample. Ann Inst Statist Math. 1994;46:723-736.
[7] Lam K, Sinha BK, Zhong W. Estimation of location and scale parameters of a logistic distribution using a ranked set
sample. In: David Hubert A, Nagaraja HN, Sen PK, Morrison DF, editors. Collected essays in honor of professor.
New York: Springer; 1995. p. 189-197.
[8] Stokes SL. Parametric ranked set sampling. Ann Inst Statist Math. 1995;47:465-482.
[9] Bhoj DS. New parametric ranked set sampling. J Appl Statist Sci. 1997;6:275-289.
[10] Muttlak HA. Median ranked set sampling. J Appl Statist Sci. 1997;6:245-255.
[11] Öztürk O, Wolfe DA. Alternative ranked set sampling protocols for the sign test. Statist Probab Lett. 2000;47:15-23.
[12] Amiri S, Modarres R, Bhoj DS. Ranked set sampling with random subsamples. J Stat Comput Simul. 2015;85(5):
935-946.
[13] Samawi HM. Varied set size ranked set sampling with applications to mean and ratio estimation. Int J Model Simul.
2011;31:6-13.
[14] Bhoj DS. Ranked set sampling with unequal samples. Biometrics. 2001;57:957-962.
[15] Harter HL, Balakrishnan N. CRC handbook of tables for the use of order statistics in estimation. Boca Raton: CRC
Press; 1996.