CHI-SQUARE TEST AND
SPEARMAN RANK CORRELATION
Chi-Square Test
The chi-square test can be used to:
- Test goodness of fit: whether the observed frequency of each category of a
  variable equals its expected frequency.
- Test independence between two categorical variables.
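GOODNESS-OF-FIT TEST
Test steps:
1. Hypotheses: H0: P1 = P10, P2 = P20, ..., Pk = Pk0, against H1: at least one Pi differs from Pi0
2. Significance level α
3. Test statistic:
   $$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
   where f_o is the observed frequency and f_e the expected frequency of each category
4. Acceptance region for H0
5. Decision: reject H0 if the computed χ² exceeds the table value of χ² (df = k − 1)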
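The goodness-of-fit steps can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `goodness_of_fit` and the example counts and proportions are hypothetical, and the table value for df = k − 1 still has to be looked up separately.

```python
def goodness_of_fit(observed, hypothesized_probs):
    """Return (chi-square statistic, degrees of freedom) for H0: P_i = P_i0."""
    if len(observed) != len(hypothesized_probs):
        raise ValueError("need one hypothesized probability per category")
    if abs(sum(hypothesized_probs) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    n = sum(observed)
    # Expected frequency of category i under H0 is n * P_i0.
    expected = [n * p for p in hypothesized_probs]
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    return chi2, len(observed) - 1


# Hypothetical data: 100 observations in k = 3 categories,
# with H0: P1 = 0.5, P2 = 0.3, P3 = 0.2.
chi2, df = goodness_of_fit([45, 35, 20], [0.5, 0.3, 0.2])
print(f"chi-square = {chi2:.4f}, df = {df}")
```

With these hypothetical counts the expected frequencies are 50, 30 and 20, so only the first two categories contribute to the statistic.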
Chi-Square Goodness-of-Fit Test
Does sample data conform to a hypothesized distribution?
Examples:
- Do sample results conform to specified expected probabilities?
- Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
- Do measurements from a production process follow a normal distribution?
Chi-Square Goodness-of-Fit Test
(continued)
Sample data on technical support calls for 10 days per day of the week (sum of calls for each day):
Day of week   Sum of calls
Monday        290
Tuesday       250
Wednesday     238
Thursday      257
Friday        265
Saturday      230
Sunday        192
Total         1722
Logic of Goodness-of-Fit Test
If calls are uniformly distributed, the 1722 calls
would be expected to be equally divided across
the 7 days:
$\frac{1722}{7} = 246$ expected calls per day if uniform
Chi-Square Goodness-of-Fit Test: test to see if
the sample results are consistent with the
expected results
Observed vs. Expected
Frequencies
             Observed (Oi)   Expected (Ei)
Monday       290             246
Tuesday      250             246
Wednesday    238             246
Thursday     257             246
Friday       265             246
Saturday     230             246
Sunday       192             246
TOTAL        1722            1722
Chi-Square Test Statistic
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform
The test statistic is
$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i} \quad (\text{where d.f.} = K - 1)$$
where:
K = number of categories
Oi = observed frequency for category i
Ei = expected frequency for category i
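As a rough check of the formula, the statistic for the technical support call data can be computed directly. This is a minimal Python sketch; the variable names are illustrative rather than taken from the slides, and the expected frequency assumes the uniform distribution stated in H0.

```python
# Observed total calls per day of the week (from the sample data).
observed = {
    "Monday": 290, "Tuesday": 250, "Wednesday": 238, "Thursday": 257,
    "Friday": 265, "Saturday": 230, "Sunday": 192,
}

K = len(observed)                 # number of categories
total = sum(observed.values())    # all calls in the sample
expected = total / K              # same E_i for every day under H0 (uniform)

# Contribution of each category: (O_i - E_i)^2 / E_i
terms = {day: (o - expected) ** 2 / expected for day, o in observed.items()}
chi2 = sum(terms.values())
df = K - 1

for day, term in terms.items():
    print(f"{day:<10} O={observed[day]:>4}  E={expected:.0f}  term={term:.3f}")
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
```

Printing the individual terms shows which categories drive the result: Sunday alone contributes 2916/246, roughly half of the total.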
The Rejection Region
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform
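$$\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$$
Reject H0 if $\chi^2 > \chi^2_{\alpha}$ (with K − 1 degrees of freedom)
[Figure: chi-square density starting at 0; "Do not reject H0" region to the left of $\chi^2_{\alpha}$, "Reject H0" region of area α in the upper tail]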
Chi-Square Test Statistic
H0: The distribution of calls is uniform
over days of the week
H1: The distribution of calls is not uniform
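$$\chi^2 = \frac{(290 - 246)^2}{246} + \frac{(250 - 246)^2}{246} + \dots + \frac{(192 - 246)^2}{246} = 23.05$$
K − 1 = 6 (7 days of the week), so use 6 degrees of freedom; with α = .05:
$\chi^2_{.05} = 12.5916$
Conclusion:
$\chi^2 = 23.05 > \chi^2_{\alpha} = 12.5916$, so reject H0 and conclude that the distribution is not uniform
[Figure: chi-square density with 6 d.f.; "Do not reject H0" region from 0 to $\chi^2_{.05} = 12.5916$, "Reject H0" region of area α = .05 in the upper tail]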
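The table value can be cross-checked without a chi-square table. The sketch below rests on a stated assumption: for an even number of degrees of freedom df = 2m, the upper-tail probability of the chi-square distribution has the closed form P(χ² > x) = e^(−x/2) Σ_{j=0}^{m−1} (x/2)^j / j!. The helper name `chi2_upper_tail_even_df` is hypothetical, and it deliberately refuses odd degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with `df` d.f. > x); valid only for even, positive df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires even, positive df")
    half = x / 2.0
    m = df // 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(m))


alpha = 0.05
critical = 12.5916          # table value for 6 d.f. at alpha = 0.05
# 5670 is the sum of (O_i - 246)^2 over the seven days of the week.
chi2 = 5670 / 246

tail_at_critical = chi2_upper_tail_even_df(critical, 6)
tail_at_statistic = chi2_upper_tail_even_df(chi2, 6)

print(f"P(chi2 > {critical}) = {tail_at_critical:.4f}")   # close to alpha
print(f"P(chi2 > {chi2:.2f}) = {tail_at_statistic:.5f}")
print("reject H0" if chi2 > critical else "do not reject H0")
```

The tail area beyond the table value comes out very close to α, and the tail area beyond the computed statistic is far smaller, which agrees with the decision to reject H0.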
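INDEPENDENCE TEST
Test steps:
1. State the hypotheses:
   H0: Pij = Pi × Pj vs H1: Pij ≠ Pi × Pj
2. Determine the expected frequency of each cell:
   $$f_e = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
3. Compute the chi-square statistic:
   $$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
4. Determine the rejection region: reject H0 if the computed χ² exceeds the table value of χ² (df = (a − 1)(b − 1) for a table with a rows and b columns)
5. Make the decision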
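The expected-frequency step can be sketched in Python. This is a minimal illustration, not from the original slides: the function name `expected_frequencies` and the 2 x 3 table of counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts.
observed = [
    [20, 30, 50],
    [30, 20, 50],
]
for row in expected_frequencies(observed):
    print([round(fe, 1) for fe in row])
```

The expected table keeps the same row and column totals as the observed table; only the way counts are spread across the cells changes.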
Contingency Tables
Used to classify sample observations according
to a pair of attributes
Also called a cross-classification or cross-tabulation table
Assume r categories for attribute A and c
categories for attribute B
Then there are (r x c) possible cross-classifications
r x c Contingency Table
                        Attribute B
Attribute A    1      2      …      c      Totals
1              O11    O12    …      O1c    R1
2              O21    O22    …      O2c    R2
.              .      .      …      .      .
.              .      .      …      .      .
.              .      .      …      .      .
r              Or1    Or2    …      Orc    Rr
Totals         C1     C2     …      Cc     n
Test for Association
Consider n observations tabulated in an r x c
contingency table
Denote by Oij the number of observations in
the cell that is in the ith row and the jth column
The null hypothesis is
H0 : No association exists
between the two attributes in the population
The appropriate test is a chi-square test with
(r-1)(c-1) degrees of freedom
Test for Association
(continued)
Let Ri and Cj be the row and column totals
The expected number of observations in cell row i and
column j, given that H0 is true, is
$$E_{ij} = \frac{R_i C_j}{n}$$
A test of association at significance level α is based on the
chi-square distribution and the following decision rule:
Reject H0 if $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} > \chi^2_{(r-1)(c-1),\alpha}$
Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female
H0: There is no association between
hand preference and gender
H1: Hand preference is not independent of gender
Contingency Table Example
(continued)
Sample results organized in a contingency table:
Sample size = n = 300:
- 120 females, 12 were left-handed
- 180 males, 24 were left-handed

          Hand Preference
Gender    Left    Right   Total
Female    12      108     120
Male      24      156     180
Total     36      264     300
Logic of the Test
H0: There is no association between
hand preference and gender
H1: Hand preference is not independent of gender
If H0 is true, then the proportion of left-handed females
should be the same as the proportion of left-handed males
The two proportions above should be the same as the
proportion of left-handed people overall
Finding Expected
Frequencies
120 females, 12 were left-handed
180 males, 24 were left-handed
Overall: P(Left Handed) = 36/300 = .12
If no association, then
P(Left Handed | Female) = P(Left Handed | Male) = .12
So we would expect 12% of the 120 females and 12% of the 180 males to be
left-handed, i.e., we would expect
(120)(.12) = 14.4 females to be left-handed
(180)(.12) = 21.6 males to be left-handed
Expected Cell Frequencies
(continued)
Expected cell frequencies:
$$E_{ij} = \frac{R_i C_j}{n} = \frac{(i\text{th row total})(j\text{th column total})}{\text{total sample size}}$$
Example:
$$E_{11} = \frac{(120)(36)}{300} = 14.4$$
Observed vs. Expected
Frequencies
Observed frequencies vs. expected frequencies (expected in parentheses):
          Hand Preference
Gender    Left          Right          Total
Female    12 (14.4)     108 (105.6)    120
Male      24 (21.6)     156 (158.4)    180
Total     36            264            300
The Chi-Square Test
Statistic
The Chi-square test statistic is:
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \quad \text{with d.f.} = (r - 1)(c - 1)$$
where:
Oij = observed frequency in cell (i, j)
Eij = expected frequency in cell (i, j)
r = number of rows
c = number of columns
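The double sum can be sketched in Python for any r x c table of counts. This is a minimal illustration: the function name `chi_square_independence` is hypothetical, and the counts are those of the hand-preference example, for which the four terms add up to about 0.7576.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]          # R_i
    col_totals = [sum(col) for col in zip(*observed)]    # C_j
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n     # E_ij = R_i * C_j / n
            chi2 += (o_ij - e_ij) ** 2 / e_ij
    df = (len(observed) - 1) * (len(col_totals) - 1)     # (r - 1)(c - 1)
    return chi2, df


# Hand-preference example: rows = Female, Male; columns = Left, Right.
hand_table = [
    [12, 108],
    [24, 156],
]
chi2, df = chi_square_independence(hand_table)
print(f"chi-square = {chi2:.4f}, d.f. = {df}")
```

Computing the expected frequencies inside the loop keeps the sketch short; for large tables they could be computed once and stored.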
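For the hand-preference table:
$$\chi^2 = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$$
Contingency Analysis
$\chi^2 = 0.7576$ with d.f. = (r − 1)(c − 1) = (1)(1) = 1
Decision Rule:
If $\chi^2 > 3.841$, reject H0; otherwise, do not reject H0
α = 0.05, $\chi^2_{.05} = 3.841$
Here, $\chi^2 = 0.7576 < 3.841$, so we do not reject H0: the sample does not provide
sufficient evidence that gender and hand preference are associated
[Figure: chi-square density with 1 d.f.; "Do not reject H0" region to the left of $\chi^2_{.05} = 3.841$, "Reject H0" region of area α = 0.05 in the upper tail]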
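The table value 3.841 can be cross-checked numerically. The sketch below rests on a stated assumption: a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so P(χ² > x) = erfc(√(x/2)). The helper name `chi2_upper_tail_1df` is hypothetical, and the statistic is recomputed from the observed and expected frequencies of the hand-preference table.

```python
import math


def chi2_upper_tail_1df(x):
    """P(chi-square with 1 d.f. > x), via the standard normal relationship."""
    return math.erfc(math.sqrt(x / 2.0))


observed = [12, 108, 24, 156]
expected = [14.4, 105.6, 21.6, 158.4]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = 3.841            # table value for 1 d.f. at alpha = 0.05
print(f"chi-square = {chi2:.4f}")
print(f"P(chi2 > {critical}) = {chi2_upper_tail_1df(critical):.4f}")  # close to 0.05
print(f"P(chi2 > {chi2:.4f}) = {chi2_upper_tail_1df(chi2):.3f}")
print("reject H0" if chi2 > critical else "do not reject H0")
```

The tail area beyond the computed statistic is well above 0.05, which matches the decision not to reject H0.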
Spearman Rank Correlation
Used to measure the strength of the relationship between variables whose
measurement scale is at least ordinal
Steps for determining the Spearman correlation coefficient:
1. Rank the data for each variable
2. Find the difference between the ranks of the two variables (Di)
3. Compute the Spearman correlation coefficient with the formula:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}$$
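The formula can be sketched in Python once the rank differences are known. This is a minimal illustration: the function name `spearman_from_differences` and the rank differences are hypothetical, and this simplified formula is exact only when there are no tied ranks.

```python
def spearman_from_differences(d):
    """Spearman's r_s = 1 - 6 * sum(D_i^2) / (n * (n^2 - 1)) from rank differences."""
    n = len(d)
    if n < 2:
        raise ValueError("need at least two pairs of ranks")
    return 1 - 6 * sum(di ** 2 for di in d) / (n * (n ** 2 - 1))


# Hypothetical rank differences for n = 5 pairs
# (for example RX = 1, 2, 3, 4, 5 and RY = 1, 3, 2, 5, 4).
d = [0, -1, 1, -1, 1]
r_s = spearman_from_differences(d)
print(f"r_s = {r_s:.2f}")
```

Here the squared differences sum to 4, so r_s = 1 − 24/120 = 0.8, a strong positive rank association.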
Illustration
No    X     Rank of X (RX)   Y     Rank of Y (RY)   Rank difference (Di=RXi-RYi)
1     X1    RX1              Y1    RY1              D1=RX1-RY1
2     X2    RX2              Y2    RY2              D2=RX2-RY2
3     X3    RX3              Y3    RY3              D3=RX3-RY3
4     X4    RX4              Y4    RY4              D4=RX4-RY4
5     X5    RX5              Y5    RY5              D5=RX5-RY5
6     X6    RX6              Y6    RY6              D6=RX6-RY6
7     X7    RX7              Y7    RY7              D7=RX7-RY7
....  ....  ....             ....  ....             ....
n     Xn    RXn              Yn    RYn              Dn=RXn-RYn
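The layout of this table can be reproduced in Python for a small data set. This is a sketch under stated assumptions: the paired observations are hypothetical, the helper `ranks` assigns rank 1 to the smallest value, and tied values are not handled.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no tied values."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle ties")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


# Hypothetical paired observations.
x = [86, 72, 95, 60, 78]
y = [80, 75, 90, 65, 70]

rx, ry = ranks(x), ranks(y)
d = [a - b for a, b in zip(rx, ry)]          # D_i = RX_i - RY_i
n = len(x)
r_s = 1 - 6 * sum(di ** 2 for di in d) / (n * (n ** 2 - 1))

print("No   X  RX   Y  RY  Di")
for i in range(n):
    print(f"{i + 1:>2} {x[i]:>3} {rx[i]:>3} {y[i]:>3} {ry[i]:>3} {d[i]:>3}")
print(f"r_s = {r_s:.2f}")
```

Ranking from largest to smallest instead would give the same coefficient, as long as both variables are ranked in the same direction.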
THAT IS ALL, THANK YOU