tests of significance

Statistics

Uploaded by

kpaul4202
Copyright
© All Rights Reserved
6. TESTS OF SIGNIFICANCE (Small Samples)

6.0 Introduction:

In the previous chapter we discussed problems relating to large samples. Large sampling theory is based upon two important assumptions:
(a) The random sampling distribution of a statistic is approximately normal, and
(b) The values given by the sample data are sufficiently close to the population values and can be used in their place for the calculation of the standard error of the estimate.

These assumptions do not hold good in the theory of small samples, so a new technique is needed to deal with small samples. A sample is called small when it consists of fewer than 30 items ($n < 30$). Since in many problems it becomes necessary to take a small sample, considerable attention has been paid to developing suitable tests for small samples. The greatest contributions to the theory of small samples are those of W. S. Gosset and Prof. R. A. Fisher. Gosset published his discovery in 1908 under the pen name 'Student', and it was later developed and extended by Prof. R. A. Fisher. Gosset gave the test popularly known as the 't-test'.

6.1 t - statistic definition:

If $x_1, x_2, \ldots, x_n$ is a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$, then Student's t-statistic is defined as

$$t = \frac{\bar{x} - \mu}{S/\sqrt{n}}$$

where $\bar{x} = \frac{\sum x}{n}$ is the sample mean and $S^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$ is an unbiased estimate of the population variance $\sigma^2$. It follows Student's t-distribution with $\nu = n - 1$ d.f.

6.1.1 Assumptions for Student's t-test:
1. The parent population from which the sample is drawn is normal.
2. The sample observations are random and independent.
3. The population standard deviation $\sigma$ is not known.

6.1.2 Properties of t-distribution:
1. The t-distribution ranges from $-\infty$ to $\infty$, just as a normal distribution does.
2. Like the normal distribution, the t-distribution is symmetrical and has mean zero.
3. The t-distribution has a greater dispersion than the standard normal distribution.
4. As the sample size increases, the t-distribution approaches the normal distribution; by about $n = 30$ the two are close enough that the normal approximation is commonly used.

[Figure: comparison between the normal curve and the corresponding t-curve]

6.1.3 Degrees of freedom (d.f.):

Suppose one is asked to write any four numbers; then one is free to choose all four. Now impose the restriction that the sum of these numbers should be 50. We are then free to select only three numbers, say 10, 15 and 20; the fourth number must be 5, i.e. [50 - (10 + 15 + 20)]. Thus our freedom of choice is reduced by one by the condition that the total be 50. The restriction placed on the freedom is one, and the degrees of freedom are three. As the restrictions increase, the freedom is reduced.

The number of independent variates which make up the statistic is known as the degrees of freedom and is usually denoted by $\nu$ (nu). The number of degrees of freedom for $n$ observations is $n - k$, where $k$ is the number of independent linear constraints imposed upon them.

For Student's t-distribution, the number of degrees of freedom is the sample size minus one, $\nu = n - 1$.

The degrees of freedom play a very important role in the $\chi^2$ test of a hypothesis. When we fit a distribution, the number of degrees of freedom is $(n - k - 1)$, where $n$ is the number of observations and $k$ is the number of parameters estimated from the data. For example, when we fit a Poisson distribution the degrees of freedom are $\nu = n - 1 - 1$. In a contingency table the degrees of freedom are $(r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns.
Thus in a 3 x 4 table the d.f. are (3 - 1)(4 - 1) = 6, and in a 2 x 2 contingency table the d.f. are (2 - 1)(2 - 1) = 1. In the case of data given in the form of a series of variables in a row or column, the d.f. will be the number of observations in the series less one, i.e. $\nu = n - 1$.

Critical value of t:

The column figures in the main body of the table come under the headings $t_{0.100}$, $t_{0.050}$, $t_{0.025}$, $t_{0.010}$ and $t_{0.005}$. The subscripts give the proportion of the distribution in the 'tail' area. Thus for a two-tailed test at the 5% level of significance there will be two rejection areas, each containing 2.5% of the total area, and the required column is headed $t_{0.025}$. In other words, for $\nu$ d.f.:
the critical value for a two-tailed test at the 5% level is the tabulated $t_{0.025}$, which is also the critical value for a single-tailed test at the 2.5% level;
the critical value for a two-tailed test at the 1% level is the tabulated $t_{0.005}$, which is also the critical value for a single-tailed test at the 0.5% level.
For a one-tailed test at the 5% level, the rejection area lies in one tail of the distribution and the required column is headed $t_{0.05}$.

[Figure: critical value of the t-distribution]

The t-distribution has a number of applications in statistics, of which we shall discuss the following in the coming sections:
(i) t-test for significance of a single mean, the population variance being unknown.
(ii) t-test for significance of the difference between two sample means, the population variances being equal but unknown.
(a) Independent samples
(b) Related samples: paired t-test

6.2 Test of significance for Mean:

We set up the corresponding null and alternative hypotheses as follows:

Null hypothesis: $H_0: \mu = \mu_0$, i.e. there is no significant difference between the sample mean and the population mean.
Alternative hypothesis: $H_1: \mu \neq \mu_0$ (or $\mu < \mu_0$, or $\mu > \mu_0$)
Level of significance: 5% or 1%

Calculation of statistic: Under $H_0$ the test statistic is

$$t_0 = \left|\frac{\bar{x} - \mu}{S/\sqrt{n}}\right| \quad \text{(or)} \quad t_0 = \left|\frac{\bar{x} - \mu}{s/\sqrt{n-1}}\right|$$

where $\bar{x} = \frac{\sum x}{n}$ is the sample mean, $S^2 = \frac{1}{n-1}\sum (x - \bar{x})^2$ and (or) $s^2 = \frac{1}{n}\sum (x - \bar{x})^2$.

Expected value: $t_e$ is the table value of Student's t-distribution with $(n - 1)$ d.f. at the chosen level of significance.

Inference: If $t_0 \le t_e$, it falls in the acceptance region and the null hypothesis is accepted; if $t_0 > t_e$, the null hypothesis $H_0$ may be rejected at the given level of significance.

Example 1:
A certain pesticide is packed into bags by a machine. A random sample of 10 bags is drawn and their contents are found to weigh (in kg) as follows:
50 49 52 44 45 48 46 45 49 45
Test if the average packing can be taken to be 50 kg.

Solution:
Null hypothesis: $H_0: \mu = 50$ kg, i.e. the average packing is 50 kg.
Alternative hypothesis: $H_1: \mu \neq 50$ kg (two-tailed)
Level of significance: Let $\alpha = 0.05$

Calculation of sample mean and S.D. (taking the assumed mean $A = 48$):

x | d = x - 48 | d^2
50 | 2 | 4
49 | 1 | 1
52 | 4 | 16
44 | -4 | 16
45 | -3 | 9
48 | 0 | 0
46 | -2 | 4
45 | -3 | 9
49 | 1 | 1
45 | -3 | 9
Total | -7 | 69

$$\bar{x} = A + \frac{\sum d}{n} = 48 + \frac{-7}{10} = 48 - 0.7 = 47.3$$

$$S^2 = \frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right] = \frac{1}{9}\left[69 - \frac{(-7)^2}{10}\right] = \frac{64.1}{9} = 7.12$$

Calculation of statistic: Under $H_0$ the test statistic is

$$t_0 = \left|\frac{\bar{x} - \mu}{\sqrt{S^2/n}}\right| = \frac{|47.3 - 50|}{\sqrt{7.12/10}} = \frac{2.7}{\sqrt{0.712}} = 3.2$$

Expected value: $t_e$ follows the t-distribution with (10 - 1) = 9 d.f.; the two-tailed 5% table value is $t_e = 2.262$.

Inference: Since $t_0 > t_e$, $H_0$ is rejected at the 5% level of significance and we conclude that the average packing cannot be taken to be 50 kg.

Example 2:
A soap manufacturing company was distributing a particular brand of soap through a large number of retail shops. Before a heavy advertisement campaign, the mean sales per week per shop was 140 dozens. After the campaign, a sample of 26 shops was taken and the mean sales was found to be 147 dozens with standard deviation 16. Can you consider the advertisement effective?

Solution:
We are given $n = 26$; $\bar{x} = 147$ dozens; $s = 16$.
Null hypothesis: $H_0: \mu = 140$ dozens, i.e. the advertisement is not effective.
Alternative hypothesis: $H_1: \mu > 140$ dozens (right-tailed)

Calculation of statistic: Under the null hypothesis $H_0$, the test statistic is

$$t_0 = \left|\frac{\bar{x} - \mu}{s/\sqrt{n-1}}\right| = \frac{147 - 140}{16/\sqrt{25}} = \frac{7 \times 5}{16} = 2.19$$

Expected value: $t_e$ follows the t-distribution with (26 - 1) = 25 d.f.; the one-tailed 5% table value is $t_e = 1.708$.

Inference: Since $t_0 > t_e$, $H_0$ is rejected at the 5% level of significance.
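Hence we conclude that the advertisement is effective in increasing the sales.

6.3 Test of significance for difference between two means:

6.3.1 Independent samples:

Suppose we want to test if two independent samples have been drawn from two normal populations having the same means, the population variances being equal. Let $x_1, x_2, \ldots, x_{n_1}$ and $y_1, y_2, \ldots, y_{n_2}$ be two independent random samples from the given normal populations.

Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the samples have been drawn from normal populations with the same means.
Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$ (or $\mu_1 < \mu_2$, or $\mu_1 > \mu_2$)

Test statistic: Under $H_0$, the test statistic is

$$t_0 = \frac{|\bar{x} - \bar{y}|}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $\bar{x} = \frac{\sum x}{n_1}$, $\bar{y} = \frac{\sum y}{n_2}$ and

$$S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right] = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$

Expected value: $t_e$ is the table value of the t-distribution with $n_1 + n_2 - 2$ d.f.

Inference: If $t_0 < t_e$, we accept the null hypothesis. If $t_0 > t_e$, we reject the null hypothesis.

Example 3:
A group of 5 patients treated with medicine 'A' weigh 42, 39, 48, 60 and 41 kg. A second group of 7 patients from the same hospital treated with medicine 'B' weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do you agree with the claim that medicine 'B' increases the weight significantly?

Solution:
Let the weights (in kg) of the patients treated with medicines A and B be denoted by variables X and Y respectively.
Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. there is no significant difference between the medicines A and B as regards their effect on increase in weight.
Alternative hypothesis: $H_1: \mu_1 < \mu_2$ (left-tailed), i.e. medicine B increases the weight significantly.
Level of significance: Let $\alpha = 0.05$

Computation of sample means and S.D.s:

Medicine A
x | x - x̄ (x̄ = 46) | (x - x̄)^2
42 | -4 | 16
39 | -7 | 49
48 | 2 | 4
60 | 14 | 196
41 | -5 | 25
Total 230 | 0 | 290

$$\bar{x} = \frac{\sum x}{n_1} = \frac{230}{5} = 46$$

Medicine B
y | y - ȳ (ȳ = 57) | (y - ȳ)^2
38 | -19 | 361
42 | -15 | 225
56 | -1 | 1
64 | 7 | 49
68 | 11 | 121
69 | 12 | 144
62 | 5 | 25
Total 399 | 0 | 926

$$\bar{y} = \frac{\sum y}{n_2} = \frac{399}{7} = 57$$

$$S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right] = \frac{1}{10}\left[290 + 926\right] = 121.6$$

Under $H_0$ the test statistic is

$$t_0 = \frac{|\bar{x} - \bar{y}|}{\sqrt{S^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{|46 - 57|}{\sqrt{121.6\left(\dfrac{1}{5} + \dfrac{1}{7}\right)}} = \frac{11}{6.457} = 1.70$$

Expected value: $t_e$ follows the t-distribution with (5 + 7 - 2) = 10 d.f.; the one-tailed 5% table value is $t_e = 1.812$.

Inference: Since $t_0 < t_e$, it is not significant. Hence $H_0$ is accepted and we conclude that the medicines A and B do not differ significantly as regards their effect on increase in weight.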
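The arithmetic of Example 3 can be checked with a short script. The following is only a minimal sketch in Python using the standard library: the function name `pooled_t` and the variable names are illustrative choices, not part of the text, and the critical value 1.812 is simply the table value quoted in the example rather than something the code computes.

```python
import math

def pooled_t(x, y):
    """Equal-variance two-sample t statistic (hypothetical helper).

    Returns (t0, degrees of freedom, pooled variance S^2).
    """
    n1, n2 = len(x), len(y)
    mean_x = sum(x) / n1
    mean_y = sum(y) / n2
    # Sums of squared deviations about each sample mean
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    df = n1 + n2 - 2
    s2 = (ss_x + ss_y) / df
    t0 = abs(mean_x - mean_y) / math.sqrt(s2 * (1 / n1 + 1 / n2))
    return t0, df, s2

# Data of Example 3 (weights in kg)
medicine_a = [42, 39, 48, 60, 41]
medicine_b = [38, 42, 56, 64, 68, 69, 62]

t0, df, s2 = pooled_t(medicine_a, medicine_b)
T_TABLE = 1.812  # one-tailed 5% table value for 10 d.f., as quoted in the example

print(f"S^2 = {s2:.1f}, d.f. = {df}, t0 = {t0:.2f}")
print("reject H0" if t0 > T_TABLE else "accept H0")
```

Taking the table value as an input keeps the sketch free of third-party libraries; a statistics package could supply the critical value instead if one is available.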
Example 4:
Two types of batteries are tested for their length of life and the following data are obtained:

 | Sample size | Mean life (in hrs) | Variance
Type A | 9 | 600 | 121
Type B | 8 | 640 | 144

Is there a significant difference in the two means?

Solution:
We are given $n_1 = 9$; $\bar{x}_1 = 600$ hrs; $s_1^2 = 121$; $n_2 = 8$; $\bar{x}_2 = 640$ hrs; $s_2^2 = 144$.
Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the two types of batteries A and B are identical; there is no significant difference between the two types of batteries.
Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$ (two-tailed)
Level of significance: Let $\alpha = 5\%$

Calculation of statistic: Under $H_0$, the test statistic is

$$t_0 = \frac{|\bar{x}_1 - \bar{x}_2|}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where

$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{9 \times 121 + 8 \times 144}{9 + 8 - 2} = \frac{2241}{15} = 149.4$$

so that

$$t_0 = \frac{|600 - 640|}{\sqrt{149.4\left(\dfrac{1}{9} + \dfrac{1}{8}\right)}} = \frac{40}{5.939} = 6.735$$

Expected value: $t_e$ follows the t-distribution with 9 + 8 - 2 = 15 d.f.; the two-tailed 5% table value is $t_e = 2.131$.

Inference: Since $t_0 > t_e$, it is highly significant. Hence $H_0$ is rejected and we conclude that the two types of batteries differ significantly as regards their length of life.

6.3.2 Related samples - Paired t-test:

In the t-test for difference of means, the two samples were independent of each other. Let us now take a particular situation where
(i) the sample sizes are equal, i.e. $n_1 = n_2 = n$ (say), and
(ii) the sample observations $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ are not completely independent but are dependent in pairs.
That is, we are making two observations on the same individual, one before the treatment and another after it. For example, a business concern wants to find whether a particular medium of promoting sales of a product, say door-to-door canvassing, advertisement in papers or advertisement through T.V., is really effective. Similarly, a pharmaceutical company wants to test the effectiveness of a particular drug, say one for inducing sleep, after the drug is given. Testing such claims gives rise to situations (i) and (ii) above, for which we apply the paired t-test.

Paired t-test:
Let $d_i = x_i - y_i$ $(i = 1, 2, \ldots, n)$ denote the difference in the observations for the $i$-th unit.
Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the increments are just by chance.
Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$ (or $\mu_1 > \mu_2$, or $\mu_1 < \mu_2$)

Calculation of test statistic:

$$t_0 = \frac{|\bar{d}|}{S/\sqrt{n}}$$

where $\bar{d} = \frac{\sum d}{n}$ and

$$S^2 = \frac{1}{n-1}\sum (d - \bar{d})^2 = \frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]$$

Expected value: $t_e$ is the table value of the t-distribution with $n - 1$ d.f.

Inference: By comparing $t_0$ and $t_e$ at the desired level of significance, usually 5% or 1%, we reject or accept the null hypothesis.

Example 5:
To test the desirability of a certain modification in typists' desks, 9 typists were given two tests of as nearly as possible the same nature, one on the desk in use and the other on the new type. The following differences in the number of words typed per minute were recorded:

Typist | A | B | C | D | E | F | G | H | I
Increase in number of words | 2 | 4 | 0 | 3 | -1 | 4 | -3 | 2 | 5

Do the data indicate that the modification in desk promotes speed in typing?

Solution:
Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the modification in desk does not promote speed in typing.
Alternative hypothesis: $H_1: \mu_1 < \mu_2$ (left-tailed test)
Level of significance: Let $\alpha = 0.05$

Typist | d | d^2
A | 2 | 4
B | 4 | 16
C | 0 | 0
D | 3 | 9
E | -1 | 1
F | 4 | 16
G | -3 | 9
H | 2 | 4
I | 5 | 25
Total | 16 | 84

$$\bar{d} = \frac{\sum d}{n} = \frac{16}{9} = 1.778$$

$$S = \sqrt{\frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right]} = \sqrt{\frac{1}{8}\left[84 - \frac{(16)^2}{9}\right]} = \sqrt{6.944} = 2.635$$

Calculation of statistic: Under $H_0$ the test statistic is

$$t_0 = \frac{|\bar{d}|}{S/\sqrt{n}} = \frac{1.778 \times 3}{2.635} = 2.024$$

Expected value: $t_e$ follows the t-distribution with 9 - 1 = 8 d.f.; the one-tailed 5% table value is $t_e = 1.860$.

Inference: Since $t_0 = 2.024 > t_e = 1.860$, the null hypothesis is rejected at the 5% level of significance. The data indicate that the modification in desk promotes speed in typing.

Example 6:
An IQ test was administered to 5 persons before and after they were trained. The results are given below:

Candidates | I | II | III | IV | V
IQ before training | 110 | 120 | 123 | 132 | 125
IQ after training | 120 | 118 | 125 | 136 | 121

Test whether there is any change in IQ after the training programme (test at 1% level of significance).

Solution:
Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. there is no significant change in IQ after the training programme.
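Alternative hypothesis: $H_1: \mu_1 \neq \mu_2$ (two-tailed test)
Level of significance: $\alpha = 0.01$

x | 110 | 120 | 123 | 132 | 125 | Total
y | 120 | 118 | 125 | 136 | 121 |
d = x - y | -10 | 2 | -2 | -4 | 4 | -10
d^2 | 100 | 4 | 4 | 16 | 16 | 140

$$\bar{d} = \frac{\sum d}{n} = \frac{-10}{5} = -2$$

$$S^2 = \frac{1}{n-1}\left[\sum d^2 - \frac{(\sum d)^2}{n}\right] = \frac{1}{4}\left[140 - \frac{100}{5}\right] = 30$$

Calculation of statistic: Under $H_0$ the test statistic is

$$t_0 = \frac{|\bar{d}|}{S/\sqrt{n}} = \frac{|-2|}{\sqrt{30/5}} = \frac{2}{2.449} = 0.816$$

Expected value: $t_e$ follows the t-distribution with 5 - 1 = 4 d.f.; the two-tailed 1% table value is $t_e = 4.604$.

Inference: Since $t_0 < t_e$ at the 1% level of significance, we accept the null hypothesis. We therefore conclude that there is no change in IQ after the training programme.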
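The paired calculation of Example 6 can be reproduced with a few lines of code. This is a minimal sketch in Python (standard library only), under the assumption that the differences are taken as before minus after, as in the example; the function name `paired_t` is hypothetical, and 4.604 is the standard two-tailed 1% table value for 4 d.f., entered by hand rather than computed.

```python
import math

def paired_t(x, y):
    """Paired t statistic on the differences d = x - y (hypothetical helper).

    Returns (t0, degrees of freedom, mean difference, S^2).
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = sum(d) / n
    # S^2 = [sum(d^2) - (sum d)^2 / n] / (n - 1)
    s2 = (sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1)
    t0 = abs(d_bar) / math.sqrt(s2 / n)
    return t0, n - 1, d_bar, s2

# Data of Example 6
iq_before = [110, 120, 123, 132, 125]
iq_after = [120, 118, 125, 136, 121]

t0, df, d_bar, s2 = paired_t(iq_before, iq_after)
T_TABLE = 4.604  # two-tailed 1% table value for 4 d.f. (entered by hand)

print(f"d_bar = {d_bar}, S^2 = {s2}, d.f. = {df}, t0 = {t0:.3f}")
print("reject H0" if t0 > T_TABLE else "accept H0")
```

Because the paired test is a single-mean test applied to the differences, the same helper can be pointed at the differences of Example 5 by passing the recorded increases as `x` and a list of zeros as `y`.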
