Loan Risk Analysis in Banking Sector

Loan Risk Analysis Project

Description:
This repository contains a comprehensive analysis of loan risk factors in the banking
sector using two datasets: application_data.csv and previous_application.csv.
The project aims to provide insights into customer demographics, credit types, risk
assessment, and business strategies.

Project Structure:

Part 1: Understanding the Bank

• Total Records: Determine the total number of records in the 'application_data' table.
• Credit Types: Analyse the different types of credits offered by the bank.
• Gender Distribution: Explore the gender distribution of loan applicants.
• Gender-wise Credit Distribution: Analyse the distribution of credits based on gender.
• Ownership of Assets: Investigate the volume of applicants who own cars and realty in relation to credit type.
• Income Distribution: Analyse income distribution and descriptive statistics concerning credit type.
• Income & Credit Distribution: Explore the relationship between income and credit amounts based on credit type.
• Goods Amount Analysis: Analyse the goods amount for which loans are given in the case of cash loans.
• Basic Income Type Distribution: Investigate the distribution of income types among applicants.
• Basic Housing Type Distribution: Explore the distribution of housing types among applicants.
• Basic Occupation Distribution: Analyse the distribution of occupations among applicants.
• Region & City Rating Distribution: Investigate the distribution of region and city ratings among applicants.

Part 2: Understanding the Client Base & Business Operations

• Family Status: Analyse the family status of the bank's clients.
• Housing Distribution: Explore the distribution of housing types among clients.
• Age Brackets: Investigate the age brackets of the clients.
• Contacts Availability: Analyse the availability of contact information for clients.
• Bank's Contact Reach: Explore the reach of the bank's contacts.
• Documents Submission Analysis: Analyse the submission of required documents by clients.
• Loan Application Day Analysis: Investigate the distribution of loan applications over days.

Part 3: Target Variable & Risk Analysis

• Credit Enquiries Analysis: Analyse credit enquiries on clients before the loan application.
• Risk Classification: Classify clients based on risk factors such as default percentages.
• Deeper Risk Analysis: Conduct a deeper analysis of clients with payment difficulties and low-risk surroundings.
• Integration of Previous Application Data: Integrate insights from previous loan application data.

Part 4: Insights & Recommendations

Part 5: Challenges on the Analysis

Part 6: Challenges on the Bank

SQL Queries & Insights:


After every SQL query, insights and interpretations are provided to facilitate a better
understanding of the data and its implications.

Data:
• Source: [Link] risk/data?select=application_test.csv
• application_data.csv: Contains client information at the time of loan application.
  o Total columns: 122
• previous_application.csv: Provides data on clients' previous loan applications.
  o Total columns: 37

This project aims to provide valuable insights into risk factors influencing loan default
and recommendations for mitigating such risks in the banking sector.

Snapshot of the work


Credit Types
select
name_contract_type,
cast(count(1)*100.0/(select count(1) from application_data) as decimal(4,2)) as
percentage
from application_data
group by NAME_CONTRACT_TYPE;

About 90% of the loans are Cash Loans, while around 10% are Revolving Loans. These are the two kinds of credit the bank offers. Cash loans are credits given upfront with periodic repayments (for example, a car loan), while revolving loans are usage-based credits with a credit limit, like credit cards. The company seems to pitch more cash loans, which are usually structured and secured. One can infer that the company is conservative in giving loans, since the earnings are usually higher on Revolving Loans. This, however, depends on the bank's risk appetite, competition from other banks, sales strategy, employee training, legal regulations, the economy and the creditworthiness of the customer base.
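
The percentage share computed by the query above can be checked on a small sample. The following is a minimal Python sketch rather than the project's own code; the helper name `category_share` and the toy rows are assumptions made purely for illustration.

```python
from collections import Counter

def category_share(values):
    """Return {category: percentage of all rows}, rounded to 2 decimals."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: round(n * 100.0 / total, 2) for cat, n in counts.items()}

# Toy rows, not the real data: 9 cash loans and 1 revolving loan.
sample = ["Cash loans"] * 9 + ["Revolving loans"]
shares = category_share(sample)
print(shares)
```

The same helper applies to any categorical column, such as gender or housing type, because it only counts labels and divides by the total row count.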

select
CODE_GENDER,
cast(count(1)*100.0/(select count(1) from application_data) as decimal(4,2)) as
percentage
from application_data
group by CODE_GENDER;

65% of the customers are female, 34% are male and the rest are recorded as other. This bank has a larger female customer base! A few reasons why this could be the case:

• Demographic conditions in the region - more working women, higher financial literacy and education, greater risk-taking appetite.
• Marketing Strategy - The bank might be targeting more women. One reason could be that the bank has better and more loyal female customers.
• Fraud rate may be lower in this gender.
• Social Image & Initiatives - The bank could be promoting women's empowerment.
• Government Benefits - The bank might be receiving government benefits for having a higher female customer base.
• Geographical Conditions - The region where the bank operates might have more women.

Income Distribution & Descriptive Statistics with respect to Credit Type

SELECT
distinct name_contract_type AS name_contract_type
,cast(count(1)over(partition by name_contract_type) *100.0/(select count(1) from application_data) as decimal(4,2)) as percentage
,cast(avg(amt_income_total)over(partition by name_contract_type) as int) as average_income
,min(amt_income_total) over(partition by name_contract_type) as min_income
,max(amt_income_total) over(partition by name_contract_type) as max_income
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amt_income_total) OVER (PARTITION BY name_contract_type) AS Median_Income
FROM application_data;

The average income of clients is similar in both loan segments. One reason could be that cash loans require security and a higher income eligibility criterion. The minimum income in both loan types is around 26,000. The maximum income is much higher for cash loans; with higher credit, banks require higher security. The median and maximum income for Cash Loans show a huge gap. This gap can be analysed further by categorizing customers into income_level_flags.
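
One way such an income_level_flags categorisation could look is sketched below in Python. The source does not define the flag, so the function name, the labels and both cutoffs are hypothetical placeholders that would have to be chosen from the actual income distribution.

```python
def income_level_flag(income, low_cutoff=100_000, high_cutoff=300_000):
    """Bucket an income value; both cutoffs are illustrative assumptions."""
    if income is None:
        return "Unknown"  # assumption: missing incomes get their own label
    if income < low_cutoff:
        return "Low"
    if income < high_cutoff:
        return "Medium"
    return "High"

# Toy incomes, not taken from application_data.
flags = [income_level_flag(x) for x in (26_000, 150_000, 1_000_000, None)]
print(flags)
```

Grouping the descriptive statistics by this flag as well as by credit type would show whether the gap between median and maximum income is driven by a handful of very high earners.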
SELECT
distinct name_contract_type AS name_contract_type
,cast(count(1)over(partition by name_contract_type) *100.0/(select count(1) from
application_data) as decimal(4,2)) as percentage
,cast(avg(amt_income_total)over(partition by name_contract_type) as int) as
average_income
,cast(avg(AMT_CREDIT)over(partition by name_contract_type) as int) as average_credit
,min(amt_income_total) over(partition by name_contract_type) as min_income
,min(AMT_CREDIT) over(partition by name_contract_type) as min_credit
,max(amt_income_total) over(partition by name_contract_type) as max_income
,max(AMT_CREDIT) over(partition by name_contract_type) as max_credit
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY amt_income_total) OVER (PARTITION BY
name_contract_type) AS Median_Income
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY AMT_CREDIT) OVER (PARTITION BY
name_contract_type) AS Median_Credit
FROM application_data;

The average credit for Cash Loans is twice that of Revolving Loans, while the average and minimum incomes are similar. This supports the bank's conservative approach to extending credit. The minimum credit, however, is much higher for Revolving Loans, although their median credit is half that of Cash Loans. Also, the bank gives the lowest-income client a revolving loan worth 5 times their income. This also fits the bank's low-risk approach, since clients with fewer assets can still avail of these loans. The bank pushes for secured loans.

Analysis of Goods Amount for which loan is given in case of Cash Loans
SELECT
distinct name_contract_type AS name_contract_type
,cast(count(1)over(partition by name_contract_type) *100.0/(select count(1) from
application_data) as decimal(4,2)) as percentage
,cast(avg(AMT_GOODS_PRICE)over(partition by name_contract_type) as int) as
average_goods_amt
,cast(avg(AMT_CREDIT)over(partition by name_contract_type) as int) as average_credit
,min(AMT_GOODS_PRICE) over(partition by name_contract_type) as min_goods_amt
,min(AMT_CREDIT) over(partition by name_contract_type) as min_credit
,max(AMT_GOODS_PRICE) over(partition by name_contract_type) as max_goods_amt
,max(AMT_CREDIT) over(partition by name_contract_type) as max_credit
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY AMT_GOODS_PRICE) OVER (PARTITION BY
name_contract_type) AS Median_goods_amt
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY AMT_CREDIT) OVER (PARTITION BY
name_contract_type) AS Median_Credit
FROM application_data
where NAME_CONTRACT_TYPE = 'Cash Loans';

Usually, the credit is higher than the price of the goods for which the loan is taken. Possible reasons:

• The loan might cover additional charges.
• The borrower might have discretion to use the money according to their needs.
• The borrower might be paying off previous dues with a new loan.

Overall, the bank does not allow a significant gap between the price of the goods being purchased and the loan amount.

Age Brackets of the Clients
with age_application as (
select
case when datediff(year,DATEADd(dd,DAYS_BIRTH,getdate()),GETDATE()) <=25 then '18-25'
when datediff(year,DATEADd(dd,DAYS_BIRTH,getdate()),GETDATE()) between 26 and
40 then '26-40'
when datediff(year,DATEADd(dd,DAYS_BIRTH,getdate()),GETDATE()) between 41 and
55 then '41-55'
when datediff(year,DATEADd(dd,DAYS_BIRTH,getdate()),GETDATE()) between 56 and
65 then '56-65' else '65above' end as age_bracket
from application_data)
select
age_bracket
,count(1) as Frequency
,cast(count(1)*100.0/(select count(1) from application_data)as decimal(4,2)) as
Percentage
from age_application
group by age_bracket
order by Percentage desc;

• 37% of the clients are between the ages of 26 and 55, and 20% of the clients are above 55.
• Only 4% of the clients are 25 or younger. As noted earlier, the need for credit comes with more responsibilities and interests.
• A few people who become very successful early in their careers tend to use credit options to accelerate their growth.
• Also, very few clients are students.
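
The age bracketing can also be sketched outside SQL. The Python below assumes that DAYS_BIRTH is a negative day offset relative to the application date (which is how the query above treats it) and approximates completed years with 365.25 days per year; the function name is hypothetical. T-SQL's DATEDIFF(year, ...) counts calendar-year boundaries crossed, so the two approaches can differ by one year for some clients.

```python
def age_bracket(days_birth):
    """Map DAYS_BIRTH (a negative day offset) to the age brackets used in the query."""
    age = int(-days_birth / 365.25)  # completed years, approximating leap years
    if age <= 25:
        return "18-25"
    if age <= 40:
        return "26-40"
    if age <= 55:
        return "41-55"
    if age <= 65:
        return "56-65"
    return "65above"

# Toy offsets, not real client records.
brackets = [age_bracket(d) for d in (-9000, -12000, -20000, -22000, -25000)]
print(brackets)
```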
Contacts Availability

with contact_data as
(select
case when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =3 then 'All Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =2 then 'Two Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =1 then '1 Contact Available'
else 'No Contact Available' end as contacts_provided
from application_data)
select
contacts_provided,
count(1) as Frequency,
cast(count(1)*100.0/(select count(1) from contact_data) as decimal(4,2)) as
percentage
from contact_data
group by contacts_provided;

Around 62% of the clients have provided 2 contacts, and 19% have given either 1 or all 3 contacts. No client is without any contact. The documentation seems to have been carried out cleanly.
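
The contact-availability buckets above come from adding up three 0/1 flags. A minimal Python sketch of the same idea follows; the labels are copied from the query, while the function name and the toy rows are assumptions for illustration only.

```python
from collections import Counter

# Labels copied from the query, keyed by how many contact flags are set.
CONTACT_LABELS = {
    3: "All Contacts Available",
    2: "Two Contacts Available",
    1: "1 Contact Available",
    0: "No Contact Available",
}

def contact_reach(rows):
    """rows: iterable of (FLAG_MOBIL, FLAG_EMP_PHONE, FLAG_WORK_PHONE) as 0/1 ints."""
    counts = Counter(CONTACT_LABELS[sum(flags)] for flags in rows)
    total = sum(counts.values())
    return {label: round(n * 100.0 / total, 2) for label, n in counts.items()}

# Toy rows only; the real table has one row per application.
toy_rows = [(1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 0, 0)]
reach = contact_reach(toy_rows)
print(reach)
```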

Documents Submission Analysis

with Documents_count as
(select
-- Number of documents submitted per application (sum of the 20 document flags)
FLAG_DOCUMENT_2+FLAG_DOCUMENT_3+FLAG_DOCUMENT_4+FLAG_DOCUMENT_5+FLAG_DOCUMENT_6
+FLAG_DOCUMENT_7+FLAG_DOCUMENT_8+FLAG_DOCUMENT_9+FLAG_DOCUMENT_10+FLAG_DOCUMENT_11
+FLAG_DOCUMENT_12+FLAG_DOCUMENT_13+FLAG_DOCUMENT_14+FLAG_DOCUMENT_15+FLAG_DOCUMENT_16
+FLAG_DOCUMENT_17+FLAG_DOCUMENT_18+FLAG_DOCUMENT_19+FLAG_DOCUMENT_20+FLAG_DOCUMENT_21
as docs_submitted
from application_data)
,Documents_data as
(select
case when docs_submitted between 15 and 20 then '15-20 Documents Available'
when docs_submitted between 10 and 14 then '10-14 Documents Available'
when docs_submitted between 5 and 9 then '5-9 Documents Available'
else 'Less than 5 Documents Available' end as Documents_provided
from Documents_count)
select
Documents_provided,
count(1) as Frequency,
cast(count(1)*100.0/(select count(1) from Documents_data) as decimal(5,2)) as percentage
from Documents_data
group by Documents_provided;

In terms of documents, every application (100%) falls in the 'Less than 5 Documents Available' bucket, so at most 4 documents were collected per application. These documents vary from loan to loan. This could be a good sign, in the sense that the bank requires little documentation before providing credit. A point to check would be that all the necessary information is still collected: while less paperwork and online documentation are a plus, the bank should ensure that no information is missed. Occupation details are clearly not part of this check (again, it depends on the loan type). It would be a plus if most of it is digitised.

Overall Analysis of Credit enquiries on the Clients

select
AMT_REQ_CREDIT_BUREAU_YEAR
,count(1) as Frequency
,cast(count(1)*100.0/(select count(1) from application_data) as decimal(4,2)) as
Percentage
from application_data
group by AMT_REQ_CREDIT_BUREAU_YEAR
order by percentage desc;

• 43% of loan applications come from clients with 0 or 1 CIBIL checks, and 16% from clients with 2 CIBIL checks. This is a decent sign, suggesting that these clients are not risky. This could be analysed further by looking at their CIBIL reports for 2 years.
• 20% of clients have more than 2 enquiries in 1 year. This is analysed further below by looking at their quarterly and monthly enquiries.
• 13.5% of the values are null, which I assume are clients with no credit history or taking credit for the first time. This depends on multiple factors like the bank's strategy, legal implications, client relationship (the client might be a customer with deposits), etc.
• Past behaviour of clients in that geographical location needs to be checked to know whether this is a risky sign or not. Macro changes in the economy (fall in interest rates, increase in taxes, etc.) could also affect this factor.

Analysis of individual applications based on the credit enquiries

with enquiry_table as
(select
case when AMT_REQ_CREDIT_BUREAU_YEAR is null then 'No Credit History'
when AMT_REQ_CREDIT_BUREAU_YEAR = 0 then 'No Enquiry in the past year'
when AMT_REQ_CREDIT_BUREAU_QRT = 0 then 'Had Enquiries within the year'
when AMT_REQ_CREDIT_BUREAU_MON = 0 then 'Had Enquiries within the quarter'
when AMT_REQ_CREDIT_BUREAU_WEEK = 0 then 'Had Enquiries within the month'
when AMT_REQ_CREDIT_BUREAU_DAY = 0 then 'Had Enquiries within the week'
when AMT_REQ_CREDIT_BUREAU_HOUR = 0 then 'Had Enquiries within the day' end as
Enquiry_Status
from application_data)
select
Enquiry_Status
,count(Enquiry_Status) as Frequency
,cast(count(Enquiry_Status)*100.0/(select count(1) from enquiry_table)as
decimal(4,2)) as Percentage
from enquiry_table
group by Enquiry_Status
order by Percentage desc;

Risk Classification

with default_scope as
(select isnull(cast(DEF_60_CNT_SOCIAL_CIRCLE*100.0/NULLIF(OBS_60_CNT_SOCIAL_CIRCLE,0) as decimal(5,2)),0) as Percentage
from application_data)
,risk_scope as
(select
-- Lower-bound comparisons keep fractional percentages (e.g. 74.47) inside a bucket
case when Percentage >= 100 then 'Very High Risk'
when Percentage >= 75 then 'High Risk'
when Percentage >= 50 then 'Moderate Risk'
when Percentage >= 25 then 'Low Risk'
else 'Very Low Risk' end as Risk_category_60_Days
from default_scope)
select
Risk_category_60_Days,
count(1) as Frequency,
cast(count(1)*100.0/(select count(1) from risk_scope) as decimal(5,2)) as Percentage
from risk_scope
group by Risk_category_60_Days
order by Percentage desc;

with default_scope as
(select isnull(cast(DEF_30_CNT_SOCIAL_CIRCLE*100.0/NULLIF(OBS_30_CNT_SOCIAL_CIRCLE,0)
as decimal(5,2)),0) as Percentage
from application_data)
,risk_scope as
(select
case when Percentage >= 100 then 'Very High Risk'
when Percentage >= 75 then 'High Risk'
when Percentage >= 50 then 'Moderate Risk'
when Percentage >= 25 then 'Low Risk'
else 'Very Low Risk' end as Risk_category_30_Days
from default_scope)
select
Risk_category_30_Days,
count(1) as Frequency,
cast(count(1)*100.0/(select count(1) from risk_scope) as decimal(5,2)) as Percentage
from risk_scope
group by Risk_category_30_Days
order by Percentage desc;

• 92% of the applications look to be low risk based on the 60-day default history of their social surroundings.
• This means that the geographical region is good for business. People from that region have made timely payments, with defaults not exceeding 60 DPD (days past due).
• Around 3% of clients tend to be highly risky: 9,421 customers, to be precise. 6,760 clients have moderate risk.
• Overall, individual behaviour needs to be given more weight when approving applications, even though banks do have specific insights about regions.
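
The risk classification above boils down to the share of a client's observed social-circle contacts that defaulted. The Python sketch below restates that rule; the function name is hypothetical, and it uses lower-bound comparisons so that a fractional percentage such as 74.47 still lands in a bucket.

```python
def social_circle_risk(defaulted, observed):
    """Classify by the percentage of observed social-circle contacts that defaulted."""
    # Same convention as ISNULL(... / NULLIF(observed, 0), 0) in the queries:
    # missing or zero observations count as 0%.
    if not observed or defaulted is None:
        pct = 0.0
    else:
        pct = defaulted * 100.0 / observed
    if pct >= 100:
        return "Very High Risk"
    if pct >= 75:
        return "High Risk"
    if pct >= 50:
        return "Moderate Risk"
    if pct >= 25:
        return "Low Risk"
    return "Very Low Risk"

# Toy (defaulted, observed) pairs, not real client records.
categories = [social_circle_risk(d, o) for d, o in ((0, 5), (1, 4), (35, 47), (3, 4), (4, 4))]
print(categories)
```

Treating clients with no observed social circle as 0% places them in the Very Low Risk bucket, which is worth keeping in mind when reading the 92% figure.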

with default_scope as
(select
target,
isnull(cast(DEF_30_CNT_SOCIAL_CIRCLE*100.0/NULLIF(OBS_30_CNT_SOCIAL_CIRCLE,0) as
decimal(5,2)),0) as Percentage
from application_data)
,risk_scope as
(select
target,
case when Percentage >= 100 then 'Very High Risk'
when Percentage >= 75 then 'High Risk'
when Percentage >= 50 then 'Moderate Risk'
when Percentage >= 25 then 'Low Risk'
else 'Very Low Risk' end as Risk_category_30_Days
from default_scope)
select
case when target = 0 then 'Never had Payment Difficulties' else 'Had Payment Difficulties' end as Target
,Risk_category_30_Days
,count(1) as Frequency
,cast(count(1)*100.0/(select count(1) from risk_scope) as decimal(5,2)) as
Percentage
from risk_scope
group by case when target = 0 then 'Never had Payment Difficulties'
else 'Had Payment Difficulties' end, Risk_category_30_Days
order by Target;

• Around 7% of customers who are Very Low Risk based on their social surroundings' 30-day payment default history have had payment difficulties.
• In my view, this is the most important bracket. These are the clients who need to be studied more. A deeper dive into the client demographics is crucial to understand this.
• Proper meetings with the debt managers and other heads of the collection team will reveal why the clients defaulted. Maybe they had an emergency, or maybe the collection method was not appropriate.
• It could also be that they changed their address or could not be contacted via email or phone.
• For the clients who never had any payment difficulties, proper customer service, cross-product selling and long-term relationship building are the key.

Deeper analysis on the Contact reach for clients who had payment difficulties but
were from the Very Low Risk social surroundings

with default_scope as
(select
target
,case when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =3 then 'All Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =2 then 'Two Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =1 then '1 Contact Available'
else 'No Contact Available' end as contacts_provided
,isnull(cast(DEF_30_CNT_SOCIAL_CIRCLE*100.0/NULLIF(OBS_30_CNT_SOCIAL_CIRCLE,0) as
decimal(5,2)),0) as Percentage
from application_data)
,risk_scope as
(select
target,
contacts_provided,
case when Percentage >= 100 then 'Very High Risk'
when Percentage >= 75 then 'High Risk'
when Percentage >= 50 then 'Moderate Risk'
when Percentage >= 25 then 'Low Risk'
else 'Very Low Risk' end as Risk_category_30_Days
from default_scope)
,risk_based_on_contact_reach as
(select
case when target = 0 then 'Never had Payment Difficulties' else 'Had Payment Difficulties' end as Target
,contacts_provided
,Risk_category_30_Days
,count(1) as Frequency
,cast(count(1)*100.0/(select count(1) from risk_scope) as decimal(5,2)) as
Percentage
from risk_scope
group by case when target = 0 then 'Never had Payment Difficulties'
else 'Had Payment Difficulties' end, Risk_category_30_Days,contacts_provided)
select
Target,
contacts_provided,
Risk_category_30_Days,
Frequency,
cast(Frequency*100.0/sum(frequency)over() as decimal(5,2)) as Percentage
from risk_based_on_contact_reach
where Target = 'Had Payment Difficulties' and Risk_category_30_Days = 'Very Low Risk'
order by Percentage desc;

• Out of the clients who have had payment difficulties and were from Very Low Risk regions, all contacts were available for around 24% of clients.
• 64% of clients have provided 2 contacts and 12% have provided only 1 contact. The team needs to get access to more contact details for these two classes of clients.
• There could be relatives of these clients whom the bank can contact. Of course, this is done only in extreme cases, usually for clients at more than 90-120 DPD or in Bucket 3-4.
• Further analysis is needed on whether the client lives in the given city or not. Also, an assessment of the credit collection team needs to be done.
• All changes made to the collection strategy should be analysed. Redundant changes should be reversed.

with default_scope as
(select
target
,case when REG_REGION_NOT_LIVE_REGION = 1 then 'Address Mismatch' else 'Address Match' end as Address_city_match
,case when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =3 then 'All Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =2 then 'Two Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =1 then '1 Contact Available'
else 'No Contact Available' end as contacts_provided
,isnull(cast(DEF_30_CNT_SOCIAL_CIRCLE*100.0/NULLIF(OBS_30_CNT_SOCIAL_CIRCLE,0) as
decimal(5,2)),0) as Percentage
from application_data)
,risk_scope as
(select
target
,contacts_provided,Address_city_match,
case when Percentage >= 100 then 'Very High Risk'
when Percentage >= 75 then 'High Risk'
when Percentage >= 50 then 'Moderate Risk'
when Percentage >= 25 then 'Low Risk'
else 'Very Low Risk' end as Risk_category_30_Days
from default_scope)
,risk_based_on_contact_reach as
(select
case when target = 0 then 'Never had Payment Difficulties' else 'Had Payment Difficulties' end as Target
,Address_city_match
,contacts_provided
,Risk_category_30_Days
,count(1) as Frequency
,cast(count(1)*100.0/(select count(1) from risk_scope) as decimal(5,2)) as
Percentage
from risk_scope
group by case when target = 0 then 'Never had Payment Difficulties'
else 'Had Payment Difficulties' end,
Risk_category_30_Days,contacts_provided,Address_city_match)
select
Target,
contacts_provided,
Address_city_match,
Risk_category_30_Days,
Frequency,
cast(Frequency*100.0/sum(frequency)over() as decimal(5,2)) as Percentage
from risk_based_on_contact_reach
where Target = 'Had Payment Difficulties' and Risk_category_30_Days = 'Very Low Risk'
order by Percentage desc;

• Around 2% of cases had an address mismatch despite having contact details. Although it is a tiny fraction of the whole, it should still be assessed by the debt managers.
• The underlying reasons for their payment difficulties could be unavailability of funds, lack of a contingency fund, or a typical "pay in the beginning and then default" scenario.

Integration of previous application data

with credit_data as
(select
-- Upper-bound comparisons keep fractional amounts (e.g. 500000.50) in the intended bucket
case when AMT_APPLICATION <= 500000 then 'Very Low Amount'
when AMT_APPLICATION <= 1000000 then 'Low Amount'
when AMT_APPLICATION <= 1500000 then 'Moderate Amount'
when AMT_APPLICATION <= 2000000 then 'High Amount' else 'Very High Amount' end as prev_credits
from application_data a
join previous_application p on a.SK_ID_CURR = p.SK_ID_CURR)
select
prev_credits,
count(1) as frequency,
cast(count(1)*100.0/(select count(1) from credit_data) as decimal(5,2)) as
Percentage
from credit_data
group by prev_credits;
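
The bucketing of previous application amounts can be expressed as a chain of upper bounds, which leaves no gap for fractional amounts. The Python below is a sketch under that assumption; the function name is hypothetical and the inputs are expected to be non-negative numbers.

```python
def previous_credit_bucket(amount):
    """Bucket AMT_APPLICATION using contiguous upper bounds (amount assumed numeric, >= 0)."""
    if amount <= 500_000:
        return "Very Low Amount"
    if amount <= 1_000_000:
        return "Low Amount"
    if amount <= 1_500_000:
        return "Moderate Amount"
    if amount <= 2_000_000:
        return "High Amount"
    return "Very High Amount"

# Toy amounts, including one that sits between two whole-number boundaries.
buckets = [previous_credit_bucket(a) for a in (0, 500_000.5, 1_250_000, 2_000_000, 3_000_000)]
print(buckets)
```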

Top 15 customers and contact reach


with prev_app_data as
(select
SK_ID_CURR
,count(sk_id_prev) as previous_applications
,cast(sum(case when NAME_CONTRACT_STATUS = 'Approved' then 1 else 0 end)*100.0/count(SK_ID_PREV) as decimal(5,2)) as application_approval_rate
from previous_application
group by SK_ID_CURR
-- Keep only customers whose every previous application was approved
having sum(case when NAME_CONTRACT_STATUS = 'Approved' then 1 else 0 end) = count(SK_ID_PREV))
select top 15
p.*
,a.NAME_INCOME_TYPE,a.NAME_EDUCATION_TYPE,OCCUPATION_TYPE
,case when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =3 then 'All Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =2 then 'Two Contacts Available'
when FLAG_MOBIL+FLAG_EMP_PHONE+FLAG_WORK_PHONE =1 then '1 Contact Available'
else 'No Contact Available' end as contacts_provided
from prev_app_data p
join application_data a on p.SK_ID_CURR = a.SK_ID_CURR
order by previous_applications desc;
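
The selection logic of the query above (customers whose previous applications were all approved, ranked by how many they made) can be sketched in Python as follows. The function name, the tuple layout and the toy rows are assumptions for illustration; the join to the contact and occupation columns is left out.

```python
from collections import defaultdict

def fully_approved_customers(applications, top_n=15):
    """applications: iterable of (customer_id, status) pairs.

    Returns [(customer_id, n_applications)] for customers whose every
    previous application was approved, most applications first.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for customer_id, status in applications:
        totals[customer_id] += 1
        if status == "Approved":
            approved[customer_id] += 1
    winners = [(c, n) for c, n in totals.items() if approved[c] == n]
    winners.sort(key=lambda item: item[1], reverse=True)
    return winners[:top_n]

# Toy history: customer 2 has one refused application and is excluded.
toy_history = [(1, "Approved"), (1, "Approved"), (2, "Approved"), (2, "Refused"), (3, "Approved")]
top_customers = fully_approved_customers(toy_history)
print(top_customers)
```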
INSIGHTS & RECOMMENDATIONS

• The bank should try to source more Revolving Loans.
• Provide more loans to businessmen.
• Targeting more single clients could give the bank more income. These are customers with whom the bank can build a long-term relationship and provide products at every stage of life. Of course, this comes with a higher risk, but an evaluation of current single clients could reveal the credit behaviour of this class.
• Reach out at the registered addresses of clients who cannot be reached through their contact information.
• Occupation details are missing for more than 31% of the clients (31.35%). The bank should reach out and collect more information about it. This not only ensures more security; it also gives the bank a chance to pitch more products according to the client's occupation.
• Reach out to more occupations like HR Staff, IT Staff and Realty Agents.
• Train employees/agents to reach out to Tier 1 regions. The bank needs to investigate why its reach is so low in Tier 1 & 3 regions. One of the most effective ways is to hold periodic meetings with the executives managing the sales channels. They work at ground level and can give the real reasons. Doing so also empowers and motivates them, since it shows that upper management takes their ideas seriously and that they are important and needed.
• Reach out to students and the young age group by tying up with universities, colleges and other online/offline education institutes.
• Maintain the current volume of sales programs/strategies for regions, occupations and classes where the application rate is high.
• More analysis is required on the 4 stages - Pre-Transaction, During Transaction, Post-Transaction & Renewal.
• Target Low Risk customers as well. Tailor-made solutions for these buckets could prove fruitful for the business. Cross-product targeting of the Low & Very Low Risk classes, tie-ups with their organisations (if any) and building long-term relationships are the key to a stable & profitable business.
• Deeper analysis of High Risk & Moderate Risk clients needs to be done. The quantum of profit from these customers needs to be taken into consideration.
• A Very Low Risk client giving less revenue might be less preferable than a Moderate Risk client giving more revenue.
• A lot of the bank's revenue depends on how the credit collection team functions. Proper methodology and action at ground level ensure timely payment collection.
• Periodic training of debt managers, collection agents and third-party vendors needs to be done to deal with cases where the contact details are available and the social surroundings are Very Low Risk in terms of payment, but the client has defaulted. Also, harsh customer service or debt collection methods can hurt the brand image in the mind of the client and in their surroundings (long-term). Proper checks are needed to ensure that the methods are strict but not overly harsh.
• The bank needs to give clients proper information about the effects a default can have on their credit score and the future difficulties they could face. There could be instances where debt managers are too rigid with collection when they should be educating customers about the consequences of such behaviour.
• The bank could enquire about the persons accompanying the client during the application. Bank employees should be well trained to build knowledge about that person. This increases the reliability of the client applying for credit and gives an opportunity to pitch products to the companion.
• There is a need to sit down with the people working at ground level and share the findings of the analysis with them. Integrating these minute details could be really fruitful for any organisation. Banking as a sector is highly personalised, so it becomes essential to take these intricate details into account and apply them in day-to-day operations. For example, finding that a person has incomplete education could mean that they started a venture. Although the bank may have details about the person's organisation, a 5-minute conversation between the relationship manager and the client about their journey from dropping out to starting their own venture could leave a really positive impression.
• Although it is cheaper for a bank to maintain current customers than to acquire new ones, it should try to target more clients who have completed higher education. A large share of clients have completed only secondary education.
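
The trade-off between risk and revenue raised in the recommendations above (a Very Low Risk client with little revenue versus a Moderate Risk client with more) can be made concrete with a simple expected-profit calculation. The Python below is only a sketch: the formula is a textbook simplification, the function name is hypothetical, and every figure is invented for illustration, since the dataset contains no revenue information.

```python
def expected_profit(revenue, default_probability, loss_given_default):
    """Expected profit per client: earn `revenue` if repaid, lose `loss_given_default` otherwise."""
    return (1 - default_probability) * revenue - default_probability * loss_given_default

# Entirely hypothetical figures for two client profiles.
very_low_risk = expected_profit(revenue=100, default_probability=0.02, loss_given_default=1000)
moderate_risk = expected_profit(revenue=300, default_probability=0.10, loss_given_default=1000)
print(very_low_risk, moderate_risk)
```

With these made-up inputs the moderate-risk profile comes out ahead, but the ranking flips easily as the default probability or the loss grows, which is why the quantum of profit has to be analysed alongside the risk class.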
CHALLENGES ON RESEARCH

• The organization type descriptions were not clear. Terms like 'Business Entity Type 1' and 'Industry Type 1' were vague.
• The application date is absent, so no analysis could be performed on that aspect. We could not ascertain the rise and fall in the number of applications or in revenue over various periods.
• It is important to know the peak seasons. Usually, the need for credit arises when there is a shortage of money. Within a month, this is the 3rd week, when the need for credit arises due to unplanned expenditure or increased spending. Within a year, people tend to need more credit during the 3rd & 4th quarters, the peak time for retail shopping.
• Rural & urban segments could not be analysed since they were not clear from the data.
• The same customer might have multiple applications.
• The enquiries made to the credit bureau about the client's credit report do not show which banks enquired about the client. An analysis of that data could reveal whether it was this bank or multiple banks involved.
• The quantum of revenue is missing in these applications. It is a crucial aspect of the analysis.

CHALLENGES ON THE BANK

• In general, there is a fall in NPAs (non-performing assets) in India, which is a good sign. The challenge for the bank is to take advantage of this while facing competition from other players.
• There is minimal control over interest rates. It is a question of marketing.
• To increase profitability on the revenue side, the bank needs to increase either its number of clients or its charges (annual fees, transaction fees, etc.; these are normally regulated).
• On the cost side, the bank needs to decrease its fixed/variable costs. Fixed costs like rent, maintenance and employee salaries need to be checked. Variable costs include interest on deposits, customer handling costs, etc.
