Discriminant Analysis For Risk Classification and Prediction
Accuracy of Classification:
      RISKL   AGE   INCOME   YRSMARID
  1       1    35    40000          8
  2       1    33    45000          6
  3       1    29    36000          5
  4       2    22    32000          0
  5       2    26    30000          1
  6       1    28    35000          6
  7       2    30    31000          7
  8       2    23    27000          2
  9       1    32    48000          6
 10       2    24    12000          4
 11       2    26    15000          3
 12       1    38    25000          7
 13       1    40    20000          5
 14       2    32    18000          4
 15       1    36    24000          3
 16       2    31    17000          5
 17       2    28    14000          3
 18       1    33    18000          6
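As a rough illustration of what a two-group discriminant analysis does with
cases like these, the sketch below computes a Fisher linear discriminant in
plain Python. It is not the STATISTICA procedure behind the slides: the
weights are raw (unstandardized) and scaled differently from canonical
coefficients, and the names CASES, score and hit_rate are hypothetical. The
INCOME figures assume that the wrapped trailing zero in the listing belongs
to the income column.

```python
# Cases as read from the listing: (RISKL, AGE, INCOME, YRSMARID).
# Assumption: the trailing zero that wrapped onto the next line is part of INCOME.
CASES = [
    (1, 35, 40000, 8), (1, 33, 45000, 6), (1, 29, 36000, 5),
    (2, 22, 32000, 0), (2, 26, 30000, 1), (1, 28, 35000, 6),
    (2, 30, 31000, 7), (2, 23, 27000, 2), (1, 32, 48000, 6),
    (2, 24, 12000, 4), (2, 26, 15000, 3), (1, 38, 25000, 7),
    (1, 40, 20000, 5), (2, 32, 18000, 4), (1, 36, 24000, 3),
    (2, 31, 17000, 5), (2, 28, 14000, 3), (1, 33, 18000, 6),
]


def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_covariance(groups):
    """Pooled within-group covariance matrix."""
    p = len(groups[0][0])
    total = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups:
        m = mean_vector(rows)
        for r in rows:
            d = [r[j] - m[j] for j in range(p)]
            for a in range(p):
                for b in range(p):
                    total[a][b] += d[a] * d[b]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in total]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


low = [c[1:] for c in CASES if c[0] == 1]    # group 1, low risk
high = [c[1:] for c in CASES if c[0] == 2]   # group 2, high risk
m_low, m_high = mean_vector(low), mean_vector(high)

# Fisher direction, oriented so that high-risk cases score higher.
weights = solve(pooled_covariance([low, high]),
                [h - l for h, l in zip(m_high, m_low)])


def score(x):
    """Discriminant score of one case (AGE, INCOME, YRSMARID)."""
    return sum(w * v for w, v in zip(weights, x))


# Midpoint cutoff between the two group mean scores (equal group sizes).
cutoff = (score(m_low) + score(m_high)) / 2
predicted = [2 if score(c[1:]) > cutoff else 1 for c in CASES]
hit_rate = sum(p == c[0] for p, c in zip(predicted, CASES)) / len(CASES)

print("raw weights (AGE, INCOME, YRSMARID):", weights)
print("cutoff:", cutoff)
print("share of cases classified correctly:", hit_rate)
```

The sign of the discriminant direction is arbitrary. It is chosen here so
that the high-risk group has the larger mean score, which matches the
orientation of the group means reported later in the slides.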
We will perform a DA and advise SBB on
how to set up its system to screen potential
good customers (low risk) from bad customers
(high risk). In particular, we will build a
discriminant function (model) and find out
STAT. DISCRIM. ANALYSIS
Standardized Coefficients for Canonical
Variables (discrbkl.sta)

Variable     Root 1
AGE          -0.9239
INCOME       -0.7747
YRSMARID     -0.1512
Eigenval      2.1360
Cum. Prop     1.0000

This output shows that Age is the best
predictor, with a coefficient of -0.92,
followed by Income, with a coefficient of
-0.77. Years of Marriage is last, with a
coefficient of -0.15. Please recall that the
absolute value of the standardised coefficient
indicates the relative importance of that
variable as a predictor.
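The ranking of predictors can be reproduced by sorting the standardized
coefficients on their absolute values. The snippet below is a minimal
sketch. The coefficient values are the ones quoted in the output, rounded
to four decimals, and the variable names are illustrative.

```python
# Standardized canonical coefficients for Root 1, as quoted in the output.
coefficients = {"AGE": -0.9239, "INCOME": -0.7747, "YRSMARID": -0.1512}

# A larger absolute value means a stronger contribution to the discriminant
# function. The sign only gives the direction of the effect.
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]),
                reverse=True)

for rank, name in enumerate(ranked, start=1):
    print(f"{rank}. {name}: coefficient {coefficients[name]:+.4f}, "
          f"absolute value {abs(coefficients[name]):.4f}")
```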
Q4. How do we classify a new credit
card applicant into either a ‘high risk’
or ‘low risk’ category, and make a
decision on accepting or refusing him a
credit card?
STAT. DISCRIM. ANALYSIS
Means of Canonical Variables (discrbkl.sta)

Group     Root 1
G_1:1     -1.37793
G_2:2      1.37792
Thus, the new mean for group
1 (low risk) is -1.37793, and
the new mean for group 2 (high
risk) is +1.37792. The midpoint
of these two means is therefore
effectively 0. This is clear when
we plot the two means on a
straight line and locate their
midpoint, as shown below:

-1.37 ---------- 0 ---------- +1.37
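The midpoint rule can be written out as a short sketch. It assumes equal
group sizes and equal misclassification costs, so the cutoff is simply the
midpoint of the two group means. The function name classify and the
example scores are hypothetical, and a score exactly at the cutoff is
assigned to the high-risk group as a cautious, arbitrary choice.

```python
# Group means on the discriminant function, as reported above.
MEAN_LOW_RISK = -1.37793    # group 1
MEAN_HIGH_RISK = 1.37792    # group 2

# Midpoint of the two means: effectively 0.
cutoff = (MEAN_LOW_RISK + MEAN_HIGH_RISK) / 2


def classify(score, cutoff=cutoff):
    """Assign an applicant to a risk group from a discriminant score."""
    # Scores below the cutoff lie nearer the low-risk mean.
    return "low risk" if score < cutoff else "high risk"


# Hypothetical discriminant scores for two new applicants.
for applicant_score in (-0.8, 0.5):
    print(applicant_score, "->", classify(applicant_score))
```

An applicant whose discriminant score falls on the negative side of the
cutoff would be treated as low risk, and one on the positive side as high
risk.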