
Assignment No 03

Name Sadiqua Iqbal

Roll no 19011513-029

Course Title Non-parametric

Submitted to Dr. Muqaddas Javeid

Department of Statistics

Hafiz Hayat Campus

Tests to detect normality:


The Kolmogorov–Smirnov test and the Shapiro–Wilk test are the most widely
used methods for testing the normality of data.

Shapiro–Wilk test:

The Shapiro-Wilk test is commonly used for small samples to determine whether
or not a sample fits a normal distribution.

The Kolmogorov–Smirnov test, in contrast, is a well-known test used for larger samples
(n > 1000).

First, we discuss the Kolmogorov–Smirnov test.

Definition:
The Kolmogorov–Smirnov test is a non-parametric test that
compares two probability distributions to determine if they are different. It is
used to test whether a sample comes from a specific distribution.

OR

In statistics,

the Kolmogorov–Smirnov test (K–S test or KS test) is a nonparametric


test of the equality of continuous (or discontinuous), one-dimensional probability
distributions that can be used to compare a sample with a reference probability
distribution (one-sample K–S test), or to compare two samples (two-sample K–S
test). 
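For illustration, here is a minimal sketch of the one-sample and two-sample forms in Python (assuming numpy and scipy are available; the data and variable names are only illustrative, not from this assignment):

# One-sample and two-sample Kolmogorov-Smirnov tests (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=5, size=100)    # illustrative sample
other = rng.uniform(low=40, high=60, size=100)    # a second, different sample

# One-sample K-S test: compare the sample with a reference N(50, 5) distribution.
d1, p1 = stats.kstest(sample, "norm", args=(50, 5))

# Two-sample K-S test: compare the two samples with each other.
d2, p2 = stats.ks_2samp(sample, other)

print(d1, p1)
print(d2, p2)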

ADVANTAGES OF KOLMOGOROV-SMIRNOV TEST:


Kolmogorov-Smirnov tests have the advantages that

(a) the distribution of the statistic does not depend on the cumulative distribution function being tested, and

(b) the test is exact.

DISADVANTAGES OF KOLMOGOROV-SMIRNOV TEST:


They have the disadvantage that they are more sensitive to deviations near the centre of the distribution
than at the tails.
Numerical No. 01
Given data:

Achievement Motivation
49
49
49
50
53
53
53
54
54
54
55
56
56
56
57
58
58
58
59
60
61
61
61
61
61
63
64
64
64
65

Solution:

Achievement Motivation    Sample size (N)    Mean    SD
49 30 57.2 4.787915823
49 30 57.2 4.787915823
49 30 57.2 4.787915823
50 30 57.2 4.787915823
53 30 57.2 4.787915823
53 30 57.2 4.787915823
53 30 57.2 4.787915823
54 30 57.2 4.787915823
54 30 57.2 4.787915823
54 30 57.2 4.787915823
55 30 57.2 4.787915823
56 30 57.2 4.787915823
56 30 57.2 4.787915823
56 30 57.2 4.787915823
57 30 57.2 4.787915823
58 30 57.2 4.787915823
58 30 57.2 4.787915823
58 30 57.2 4.787915823
59 30 57.2 4.787915823
60 30 57.2 4.787915823
61 30 57.2 4.787915823
61 30 57.2 4.787915823
61 30 57.2 4.787915823
61 30 57.2 4.787915823
61 30 57.2 4.787915823
63 30 57.2 4.787915823
64 30 57.2 4.787915823
64 30 57.2 4.787915823
64 30 57.2 4.787915823
65 30 57.2 4.787915823

Next part,

Rank    G = (Rank - 1)/n    Normal distribution F(x)    Difference (F - G)

1       0               0.043388936     0.043388936
2       0.033333333     0.043388936     0.010055603
3       0.066666667     0.043388936     -0.02327773
4       0.1             0.06631826      -0.03368174
5       0.133333333     0.190186726     0.056853393
6       0.166666667     0.190186726     0.023520059
7       0.2             0.190186726     -0.009813274
8       0.233333333     0.251955338     0.018622005
9       0.266666667     0.251955338     -0.014711329
10      0.3             0.251955338     -0.048044662
11      0.333333333     0.322941124     -0.010392209
12      0.366666667     0.401049717     0.03438305
13      0.4             0.401049717     0.001049717
14      0.433333333     0.401049717     -0.032283617
15      0.466666667     0.483340296     0.01667363
16      0.5             0.566349327     0.066349327
17      0.533333333     0.566349327     0.033015993
18      0.566666667     0.566349327     -0.00031734
19      0.6             0.64652165      0.04652165
20      0.633333333     0.720660782     0.087327448
21      0.666666667     0.786304686     0.119638019
22      0.7             0.786304686     0.086304686
23      0.733333333     0.786304686     0.052971352
24      0.766666667     0.786304686     0.019638019
25      0.8             0.786304686     -0.013695314
26      0.833333333     0.887125681     0.053792348
27      0.866666667     0.922231406     0.055564739
28      0.9             0.922231406     0.022231406
29      0.933333333     0.922231406     -0.011101927
30      0.966666667     0.948354214     -0.018312452

Here n = 30 and F(x) is the normal CDF evaluated at each ordered value using the sample mean and SD.
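The worksheet above can be reproduced with a short script. This is a minimal sketch (assuming Python with numpy and scipy; the variable names are only illustrative): it sorts the data, builds G = (Rank - 1)/n, evaluates the fitted normal CDF F, and takes the largest difference as the K-S statistic.

# Reproducing the K-S worksheet for the achievement motivation data (sketch).
import numpy as np
from scipy import stats

x = np.sort(np.array([49, 49, 49, 50, 53, 53, 53, 54, 54, 54,
                      55, 56, 56, 56, 57, 58, 58, 58, 59, 60,
                      61, 61, 61, 61, 61, 63, 64, 64, 64, 65], dtype=float))
n = len(x)
mean, sd = x.mean(), x.std(ddof=1)            # 57.2 and about 4.7879

rank = np.arange(1, n + 1)
G = (rank - 1) / n                            # empirical CDF step just below each point
F = stats.norm.cdf(x, loc=mean, scale=sd)     # fitted normal CDF at each value

D = (F - G).max()                             # largest difference, about 0.1196

# Decision at alpha = 0.05: the tabulated critical value used below is 0.242 for n = 30
# (the common large-sample approximation 1.36 / sqrt(n) gives about 0.248).
print(D, D < 0.242)                           # True -> do not reject normality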

Next,
Achievement motivation

Mean 57.2
Standard error 0.87414983
Median 57.5
Mode 61
Standard deviation 4.78791582
Sample Variance 22.9241379
Kurtosis -0.924434
Skewness -0.1505046
Range 16
Minimum 49
Maximum 65
Sum 1716
Count 30

Note:

Values of skewness and kurtosis between -2 and +2 are considered
acceptable for assuming a normal distribution.
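The descriptive statistics above, including the skewness and kurtosis check, can be reproduced in a few lines. The sketch below assumes Python with numpy and scipy; the +/-2 cut-off is the rule of thumb from the note.

# Descriptive statistics and the +/-2 skewness/kurtosis rule of thumb (sketch).
import numpy as np
from scipy import stats

x = np.array([49, 49, 49, 50, 53, 53, 53, 54, 54, 54,
              55, 56, 56, 56, 57, 58, 58, 58, 59, 60,
              61, 61, 61, 61, 61, 63, 64, 64, 64, 65], dtype=float)

print("mean:", x.mean())                            # 57.2
print("standard error:", stats.sem(x))              # about 0.874
print("median:", np.median(x))                      # 57.5
print("sample variance:", x.var(ddof=1))            # about 22.92
skew = stats.skew(x, bias=False)                    # about -0.15
kurt = stats.kurtosis(x, bias=False)                # excess kurtosis, about -0.92
print("skewness:", skew, "kurtosis:", kurt)

# Rule of thumb from the note: values between -2 and +2 are acceptable for assuming normality.
print("within +/-2:", abs(skew) < 2 and abs(kurt) < 2)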

Value of KS:
KS= 0.119638019

Null & Alternative Hypothesis :

Ho: Data is normal.

H1: Data is not normal

Now,

Check on SPSS:

Method:
Analyze

Explore
Then check normality.

Tests of Normality

                     Kolmogorov-Smirnov(a)              Shapiro-Wilk
                     Statistic    df    Sig.            Statistic    df    Sig.
Achievemotivation    .123         30    .200*           .948         30    .146

Normality :
The degree to which the sample data distribution corresponds to a
normal distribution (In graphical form, the normal distribution appears as
symmetrical and bell-shaped).

Descriptives
Statistic Std. Error
Achievemotivation    Mean                                  57.1000    .88454
                     95% Confidence Interval for Mean
                         Lower Bound                       55.2909
                         Upper Bound                       58.9091
5% Trimmed Mean 57.1296
Median 57.5000
Variance 23.472
Std. Deviation 4.84483
Minimum 49.00
Maximum 65.00
Range 16.00
Interquartile Range 8.00
Skewness -.102 .427
Kurtosis -1.025 .833

Histogram:
Graphical display of the distribution of a variable. By forming
frequency counts in categories, the shape of the variable’s distribution can be
shown. Used to make a visual comparison to the normal distribution…
QQ-plot:
A visual method for identifying whether two sets of data are drawn from
the same distribution. The QQ-plot shows a reference line at a 45-degree angle; if
the two data sets are drawn from the same distribution, the points will fall on that
line.
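Both plots can be produced directly from the data; the sketch below assumes Python with matplotlib and scipy (scipy.stats.probplot draws the Q-Q points together with the reference line).

# Histogram and normal Q-Q plot for the achievement motivation data (sketch).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.array([49, 49, 49, 50, 53, 53, 53, 54, 54, 54,
              55, 56, 56, 56, 57, 58, 58, 58, 59, 60,
              61, 61, 61, 61, 61, 63, 64, 64, 64, 65], dtype=float)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: frequency counts in categories, for a visual comparison with a bell shape.
ax1.hist(x, bins=6, edgecolor="black")
ax1.set_title("Histogram of Achievement Motivation")

# Q-Q plot: sample quantiles against normal quantiles, with a 45-degree reference line.
stats.probplot(x, dist="norm", plot=ax2)
ax2.set_title("Normal Q-Q plot")

plt.tight_layout()
plt.show()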
Conclusion:
Our sample size is 30 and we test at the 0.05 level of significance.

So our critical value is 0.242.

Our KS test statistic is 0.11964,

which is less than the critical value, so we do not reject the null hypothesis:
there is no significant difference between the sample distribution and the normal distribution.

__________________________
(SHAPIRO-WILK TEST)

Data scientists usually have to check if data is normally distributed. An


example is the normality check on the residuals of  linear regression in order to
correctly use the F-test. One way to do that is through the Shapiro-Wilk test,
which is a hypothesis test applied to a sample with a null hypothesis that the
sample stems from a normal distribution .

Definition:
Shapiro-Wilk test is a hypothesis test that evaluates whether a data
set is normally distributed. It evaluates data from a sample with the null
hypothesis that the data set is normally distributed. A large p-value indicates that the
data set is normally distributed, while a low p-value indicates that it isn't normally
distributed.

Another definition:
The Shapiro-Wilk test is a hypothesis test that is applied to a
sample with a null hypothesis that the sample has been generated from a normal
distribution. If the p-value is low, we can reject such a null hypothesis and say
that the sample has not been generated from a normal distribution .

It’s an easy-to-use statistical tool that can help us find an answer to the
normality check we need, but it has one flaw: It doesn’t work well with large data
sets. The maximum allowed size for a data set depends on the implementation, but
in  Python , we see that a sample size larger than 5,000 will give us an approximate
calculation for the p-value
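In Python this check is essentially a one-liner; the sketch below (assuming scipy and numpy are installed, with illustrative data) shows the call and the p-value interpretation described above.

# Shapiro-Wilk test in Python (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=0, scale=1, size=200)   # illustrative, normally distributed data

w, p = stats.shapiro(sample)
print("W =", w, "p-value =", p)

# Large p-value (> 0.05): no evidence against normality; small p-value: reject normality.
# For samples larger than about 5,000, scipy's p-value is only approximate.
if p > 0.05:
    print("Fail to reject H0: the sample may come from a normal distribution")
else:
    print("Reject H0: the sample is not normally distributed")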

Advantages of the Shapiro-Wilk Test:


1: The Shapiro-Wilk test for normality is a very simple-to-use statistical tool
for assessing the normality of a data set.

2: It is typically applied after visualizing the data set via a
histogram and/or a Q-Q plot.

3: It’s a very useful tool to ensure that a normality requirement is satisfied every
time we need it, and it must be present in every data scientist’s toolbox.

Assumption of Shapiro-Wilk Test:


1: If the Sig. value of the Shapiro-Wilk test is greater than 0.05, the data is normal.

Note:
If the significance value of the Shapiro-Wilk test is greater than 0.05,
the data is normal…
Null & Alternative Hypothesis:
Ho: The sample comes from a normal distribution.

H1: The sample does not come from a normal distribution.

Level of Significance:
α = 0.05

Test Statistic:
W = (Σ ai x(i))² / Σ (xi − x̄)², where the x(i) are the ordered sample values and the
coefficients ai are taken from the Shapiro-Wilk tables.

Given data:
Achievement Motivation-(xi)
49
49
49
50
53
53
53
54
54
54
55
56
56
56
57
58
58
58
59
60
61
61
61
61
61
63
64
64
64
65

Next,

Achievement Motivation (xi)    x̄       (xi − x̄)²    ai        (ai)(xi)

49 57.2 67.24 0.4254 20.8446
49 2401 0.2944 14.4256
49 2401 0.2487 12.1863
50 2500 0.2148 10.74
53 2809 0.187 9.911
53 2809 0.163 8.639
53 2809 0.1415 7.4995
54 2916 0.1219 6.5826
54 2916 0.1036 5.5944
54 2916 0.0862 4.6548
55 3025 0.0697 3.8335
56 3136 0.0537 3.0072
56 3136 0.0381 2.1336
56 3136 0.0227 1.2712
57 3249 0.0076 0.4332
58 3364 -0.4254 -24.6732
58 3364 -0.2944 -17.0752
58 3364 -0.2487 -14.4246
59 3481 -0.2148 -12.6732
60 3600 -0.187 -11.22
61 3721 -0.163 -9.943
61 3721 -0.1415 -8.6315
61 3721 -0.1219 -7.4359
61 3721 -0.1036 -6.3196
61 3721 -0.0862 -5.2582
63 3969 -0.0697 -4.3911
64 4096 -0.0537 -3.4368
64 4096 -0.0381 -2.4384
64 4096 -0.0227 -1.4528
65 4225 -0.0076 -0.494
96486.24 -18.111

Next,

W (numerator) = (−18.111)^2

W (denominator) = 96486.24

W = 18.112 / 96486.24 = 0.18771589

p-value= 0.927

Conclusion:

Since the p-value > 0.05, on the basis of the provided sample the researcher accepts the
null hypothesis that the sample comes from a normal distribution.
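As a cross-check on this example, the sketch below (assuming Python with numpy and scipy; the ai coefficients are the n = 30 table values listed above) computes W with the commonly used pairwise form of the numerator, b = Σ ai (x(n+1−i) − x(i)), and also calls scipy.stats.shapiro on the same data. Both results can be compared with the SPSS output reported earlier.

# Cross-checking the Shapiro-Wilk statistic for the achievement motivation data (sketch).
import numpy as np
from scipy import stats

x = np.sort(np.array([49, 49, 49, 50, 53, 53, 53, 54, 54, 54,
                      55, 56, 56, 56, 57, 58, 58, 58, 59, 60,
                      61, 61, 61, 61, 61, 63, 64, 64, 64, 65], dtype=float))
n = len(x)

# Tabulated coefficients a1..a15 for n = 30, as listed in the worksheet above.
a = np.array([0.4254, 0.2944, 0.2487, 0.2148, 0.1870, 0.1630, 0.1415, 0.1219,
              0.1036, 0.0862, 0.0697, 0.0537, 0.0381, 0.0227, 0.0076])

# Pairwise form of the numerator: b = sum of a_i * (x_(n+1-i) - x_(i)) over the first n/2 pairs.
b = np.sum(a * (x[::-1][:n // 2] - x[:n // 2]))
W_manual = b**2 / np.sum((x - x.mean())**2)

# Library cross-check.
W_scipy, p_scipy = stats.shapiro(x)

print("manual W:", W_manual)                 # close to the SPSS value reported earlier (.948)
print("scipy W:", W_scipy, "p-value:", p_scipy)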

________________________________________________________
