Predicting Loan Default
Submitted To:
Dr. Imad Bou hamad
Business and Data Analytics (EMBA 562)
Class 2021
Prepared By: Jad Rizk
March 12th, 2020
a. As senior manager of the credit department, my business objectives would be
the following:
- Granting credit and processing transactions
- Monitoring credit balances
- Measuring performance of the credit function
- Making credit policy decisions
My ultimate objective would therefore be to develop a credit analysis tool that
helps me distinguish between clients who will repay and clients who will not.
A statistical model that predicts which clients will default would be a valuable
tool for decision making.
Data analytics will help me evaluate customers' creditworthiness using the
available data on existing clients. From that data I can develop a model that
predicts which client profiles carry a high risk of default and which clients
are unlikely to default.
b. A predictor is statistically significant at the 5% level when the p-value of its
coefficient is smaller than 0.05. As per the Appendix, the following predictors have
coefficients with a p-value < 0.05:
1- Years with current employer
2- Debt to income ratio (*100)
3- Credit Card debt in thousands
c. The coefficient for Years with current employer is estimated from the data to be
−0.233. We interpret this coefficient as follows: e^(−0.233) = 0.792 is the odds
ratio. Because this ratio is below 1, each additional year with the current employer
multiplies the odds of default by 0.792, that is, it lowers the odds of default by
about 20.8%, holding all other factors constant. Equivalently, 1/OR = 1/0.792 = 1.26,
which means that a person with one fewer year with the current employer has odds of
default about 26% higher than an otherwise identical person with one more year.
Note that these percentages describe odds, not probabilities.
The coefficient for Debt to income ratio is estimated from the data to be 0.076.
We interpret this coefficient as follows: e^(0.076) = 1.079 is the odds ratio, which
means that each additional point of debt to income ratio raises the odds of default
by about 7.9%, holding all other factors constant.
The coefficient for Credit Card debt in thousands is estimated from the data
to be 0.533. We interpret this coefficient as follows: e^(0.533) = 1.704 is the odds
ratio, which means that each additional $1,000 of credit card debt raises the odds
of default by about 70.4%, holding all other factors constant.
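The odds ratio arithmetic in this section can be checked with a few lines of Python. This is only a minimal sketch: the coefficient values are the rounded ones quoted in the text, the function names are illustrative, and the last digit may differ from the appendix if the appendix works from unrounded coefficients.

```python
import math

# Coefficients as quoted in the text (rounded values taken from the report).
COEFFICIENTS = {
    "Years with current employer": -0.233,
    "Debt to income ratio (x100)": 0.076,
    "Credit card debt in thousands": 0.533,
}


def odds_ratio(coefficient: float) -> float:
    """Odds ratio for a one-unit increase in the predictor: exp(beta)."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient: float) -> float:
    """Percent change in the odds of default for a one-unit increase."""
    return (odds_ratio(coefficient) - 1.0) * 100.0


if __name__ == "__main__":
    for name, beta in COEFFICIENTS.items():
        ratio = odds_ratio(beta)
        change = percent_change_in_odds(beta)
        print(f"{name}: OR = {ratio:.3f}, change in odds = {change:+.1f}%")
        if ratio < 1.0:
            # Inverse ratio: odds multiplier for a one-unit DECREASE.
            print(f"  1/OR = {1.0 / ratio:.3f}")
```

A negative coefficient gives an odds ratio below 1 (lower odds of default per extra unit), and its inverse gives the odds multiplier for one unit less. Both figures refer to odds, not to the probability of default.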
d. The overall accuracy of this model is 74.9%.
The sensitivity, or true positive rate, is 67.1%.
The specificity, or true negative rate, is 79%.
The model is better at correctly classifying clients who did not default than at
classifying clients who defaulted.
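The three measures above are all derived from the classification table. The sketch below shows the formulas, taking "default" as the positive class. The counts in the usage example are hypothetical placeholders chosen for easy arithmetic; they are not the counts from the Appendix, and the function name is illustrative.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Accuracy measures from a 2x2 classification table.

    tp: defaulters predicted to default      (true positives)
    fn: defaulters predicted not to default  (false negatives)
    tn: non-defaulters predicted not to default (true negatives)
    fp: non-defaulters predicted to default  (false positives)
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


if __name__ == "__main__":
    # HYPOTHETICAL counts for illustration only, not the report's data.
    metrics = classification_metrics(tp=40, fn=10, tn=80, fp=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

Plugging in the actual counts from the Appendix classification table should reproduce the reported figures.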
e. The accuracy measures (overall accuracy, sensitivity, and specificity) are
relatively good, and the model uses only 8 predictors, so I recommend implementing
the model.
f. I think the following variables can be good predictors of loan default:
- The interest rate of the loan
- The credit score of the borrower
- Loan period in years
- Loan purpose: divide the loan purpose into several categories and convert them
to dummy variables, =1 if the loan falls in a certain category and =0 otherwise
(Business loan 0-1, Car loan 0-1, Personal loan 0-1, Travel loan 0-1…)
- Marital status (Married 0-1, Divorced 0-1, Single 0-1, Widowed 0-1, Engaged 0-1)
- Occupation: divide the occupation into several categories and convert them to
dummy variables, =1 if the client works in a certain category and =0 otherwise
(Medical doctor 0-1, Engineer 0-1, Lawyer 0-1, University professor 0-1…)
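The dummy-variable conversion described in the list above can be sketched as follows. This is an illustrative helper, not part of the original analysis: the function name and the `drop_first` parameter are assumptions. Dropping one category as the baseline is the usual practice when the regression includes an intercept, because a full set of dummies for one variable would be perfectly collinear with it.

```python
from typing import Dict, List, Sequence


def dummy_encode(
    values: Sequence[str],
    categories: Sequence[str],
    drop_first: bool = True,
) -> List[Dict[str, int]]:
    """Convert a categorical variable into 0/1 dummy variables.

    With drop_first=True the first category becomes the baseline and gets
    no column of its own (all of its dummies are 0).
    """
    kept = list(categories[1:]) if drop_first else list(categories)
    rows = []
    for value in values:
        if value not in categories:
            raise ValueError(f"Unknown category: {value!r}")
        rows.append({category: int(value == category) for category in kept})
    return rows


if __name__ == "__main__":
    # Category names taken from the loan purpose example in the text.
    purposes = ["Business loan", "Car loan", "Personal loan", "Travel loan"]
    sample = ["Car loan", "Business loan", "Travel loan"]
    for purpose, row in zip(sample, dummy_encode(sample, purposes)):
        print(purpose, row)
```

With the baseline dropped, each remaining coefficient is interpreted relative to the baseline category, for example the odds of default on a car loan compared with a business loan.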
g. Based on the data analysis, I would recommend that we add more predictors
(those mentioned in section f), remove the insignificant predictors, re-run the
logistic regression, and compare the new model to the available model in terms of
the accuracy measures (overall accuracy, sensitivity, and specificity), then choose
the better one.
If it is not possible to gather data on the suggested variables, and the
available data covers only the variables reported in the PDF document, then my
recommendation would be the following:
- When choosing among several loan applicants, people who have been working for
the same employer for a longer period of time should be given priority.
- When choosing among several loan applicants, people who have a lower debt to
income ratio should be given priority.
- When choosing among several loan applicants, people who have lower credit card
debt should be given priority.