Telecom Customer Churn Analysis in R

Last Updated : 04 Jul, 2024

Customer churn is a central concern in the telecom industry, because retaining existing customers is as important as acquiring new ones. Telecom customer churn analysis in the R programming language involves examining a churn dataset to understand why customers leave and what can be done to retain them.

The objective of Telecom Customer Churn Analysis

Customer churn analysis helps telecom companies identify the factors that influence customer departure. By understanding these factors, companies can implement targeted interventions to retain customers. This has implications not only for the telecom sector but also for broader economic and social ecosystems. Effective churn management can lead to improved customer satisfaction, better resource allocation, and enhanced profitability. Additionally, communities benefit from stable and reliable telecom services.

Dataset Link: Telecom Customer Churn

The dataset contains columns such as customerID, gender, SeniorCitizen, Partner, Dependents, tenure, PhoneService, InternetService, and Churn, along with other customer-related information. Insights from this analysis can help telecom companies improve their customer retention strategies. We will now go through Telecom Customer Churn Analysis in R step by step.

Step 1 : Load Packages and Data

First, load the required packages (install them beforehand if needed), read the dataset, and check the first few rows.

R
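# Load necessary libraries (install them first with install.packages() if needed)
library(dplyr)
library(tidyverse)
library(caret)
library(ggplot2)

# Load the "Telecom Customer Churn" dataset.
# stringsAsFactors = TRUE reads text columns as factors, which the
# summary() output shown later relies on (R >= 4.0 defaults to FALSE).
churn_data <- read.csv("Your//path", stringsAsFactors = TRUE)
head(churn_data)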

Output:

  customerID gender SeniorCitizen Partner Dependents tenure PhoneService    MultipleLines
1 7590-VHVEG Female 0 Yes No 1 No No phone service
2 5575-GNVDE Male 0 No No 34 Yes No
3 3668-QPYBK Male 0 No No 2 Yes No
4 7795-CFOCW Male 0 No No 45 No No phone service
5 9237-HQITU Female 0 No No 2 Yes No
6 9305-CDSKC Female 0 No No 8 Yes Yes
InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV
1 DSL No Yes No No No
2 DSL Yes No Yes No No
3 DSL Yes Yes No No No
4 DSL Yes No Yes Yes No
5 Fiber optic No No No No No
6 Fiber optic No No Yes No Yes
StreamingMovies Contract PaperlessBilling PaymentMethod MonthlyCharges
1 No Month-to-month Yes Electronic check 29.85
2 No One year No Mailed check 56.95
3 No Month-to-month Yes Mailed check 53.85
4 No One year No Bank transfer (automatic) 42.30
5 No Month-to-month Yes Electronic check 70.70
6 Yes Month-to-month Yes Electronic check 99.65
TotalCharges Churn
1 29.85 No
2 1889.50 No
3 108.15 Yes
4 1840.75 No
5 151.65 Yes
6 820.50 Yes

The head(churn_data) function in R displays the first six rows of the "churn_data" dataframe. This function is useful for quickly inspecting the structure and contents of the dataframe to understand what kind of data it contains.
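Alongside head(), it can help to check the type of each column, since later steps treat text columns as categories and numeric columns as measurements. The following is a minimal sketch on a small stand-in data frame; toy_churn and its customer IDs are made up for illustration and are not part of the dataset.

```r
# Toy stand-in for churn_data; the customer IDs are made up for illustration
toy_churn <- data.frame(
  customerID = c("A-001", "A-002", "A-003"),
  tenure = c(1L, 34L, 2L),
  MonthlyCharges = c(29.85, 56.95, 53.85),
  Churn = c("No", "No", "Yes"),
  stringsAsFactors = TRUE
)

# Compact view of column types and the first few values
str(toy_churn)

# Column classes as a named character vector
col_classes <- sapply(toy_churn, class)
print(col_classes)
```

Running the same two calls on churn_data shows which columns were read as factors and which as numbers.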

Step 2 : Exploratory Data Analysis (EDA)

EDA is a process of describing and summarizing data to bring important aspects into focus for further analysis.

R
# Check missing values in each column
colSums(is.na(churn_data))

# Remove rows with missing values
churn_data <- na.omit(churn_data)

# Check the dimensions of the cleaned data
dim(churn_data)

# Check total missing values after cleaning
sum(is.na(churn_data))

Output:

      customerID           gender    SeniorCitizen          Partner       Dependents 
0 0 0 0 0
tenure PhoneService MultipleLines InternetService OnlineSecurity
0 0 0 0 0
OnlineBackup DeviceProtection TechSupport StreamingTV StreamingMovies
0 0 0 0 0
Contract PaperlessBilling PaymentMethod MonthlyCharges TotalCharges
0 0 0 0 11
Churn
0

[1] 7032 21

[1] 0
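Because na.omit() drops entire rows, it is often worth looking at the affected rows before removing them. The sketch below shows one way to do that with complete.cases(); toy_charges and its values are hypothetical and only mimic the situation where TotalCharges is the one column with gaps.

```r
# Toy data with one missing TotalCharges value (made-up values)
toy_charges <- data.frame(
  tenure = c(0, 12, 24),
  MonthlyCharges = c(50, 60, 70),
  TotalCharges = c(NA, 720, 1680)
)

# Rows that contain at least one missing value
incomplete_rows <- toy_charges[!complete.cases(toy_charges), ]
print(incomplete_rows)

# Drop them and confirm how many rows were removed
toy_clean <- na.omit(toy_charges)
rows_dropped <- nrow(toy_charges) - nrow(toy_clean)
print(rows_dropped)
```

If the incomplete rows turn out to share a pattern, imputing a value may be preferable to dropping them; that choice depends on the data and is not covered here.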

Check the summary of the data

The `summary(churn_data)` function in R provides a concise statistical summary of each column in the `churn_data` dataframe. For numeric columns, it shows the minimum, 1st quartile, median, mean, 3rd quartile, and maximum values. For categorical columns, it displays the frequency of each category. This helps you quickly understand the distribution and key statistics of your data.

R
summary(churn_data)

Output:

      customerID      gender     SeniorCitizen    Partner    Dependents     tenure     
0002-ORFBO: 1 Female:3483 Min. :0.0000 No :3639 No :4933 Min. : 1.00
0003-MKNFE: 1 Male :3549 1st Qu.:0.0000 Yes:3393 Yes:2099 1st Qu.: 9.00
0004-TLHLJ: 1 Median :0.0000 Median :29.00
0011-IGKFF: 1 Mean :0.1624 Mean :32.42
0013-EXCHZ: 1 3rd Qu.:0.0000 3rd Qu.:55.00
0013-MHZWF: 1 Max. :1.0000 Max. :72.00
(Other) :7026
PhoneService MultipleLines InternetService OnlineSecurity
No : 680 No :3385 DSL :2416 No :3497
Yes:6352 No phone service: 680 Fiber optic:3096 No internet service:1520
Yes :2967 No :1520 Yes :2015




OnlineBackup DeviceProtection TechSupport
No :3087 No :3094 No :3472
No internet service:1520 No internet service:1520 No internet service:1520
Yes :2425 Yes :2418 Yes :2040




StreamingTV StreamingMovies Contract
No :2809 No :2781 Month-to-month:3875
No internet service:1520 No internet service:1520 One year :1472
Yes :2703 Yes :2731 Two year :1685




PaperlessBilling PaymentMethod MonthlyCharges TotalCharges
No :2864 Bank transfer (automatic):1542 Min. : 18.25 Min. : 18.8
Yes:4168 Credit card (automatic) :1521 1st Qu.: 35.59 1st Qu.: 401.4
Electronic check :2365 Median : 70.35 Median :1397.5
Mailed check :1604 Mean : 64.80 Mean :2283.3
3rd Qu.: 89.86 3rd Qu.:3794.7
Max. :118.75 Max. :8684.8

Churn
No :5163
Yes:1869
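The Churn counts in the summary can be turned into shares to see how unbalanced the two classes are. This small sketch uses the counts printed above (5163 and 1869); the variable names churn_totals and churn_share are illustrative.

```r
# Churn counts as reported by summary(churn_data)
churn_totals <- c(No = 5163, Yes = 1869)

# Share of each class in percent, rounded to one decimal place
churn_share <- round(100 * prop.table(churn_totals), 1)
print(churn_share)
```

About 26.6% of the customers in the cleaned data churned, so any later comparison between groups should account for the fact that non-churners are the majority.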

Step 3 : Data Visualization

Perform data visualization to find some important information from the data.

R
# Count the occurrences of each churn value
churn_counts <- table(churn_data$Churn)

# Convert churn_counts to a dataframe
churn_df <- as.data.frame(churn_counts)
names(churn_df) <- c("Churn", "Count")

# Create the pie chart
ggplot(churn_df, aes(x = "", y = Count, fill = Churn)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = scales::percent(Count / sum(Count))), 
            position = position_stack(vjust = 0.5)) +
  ggtitle("Churn Distribution") +
  theme_void()

Output:

Churn Distribution (pie chart)

The above code snippet creates a pie chart in R to show the distribution of churn (customer attrition) in the churn_data dataset. It counts how many entries belong to each Churn category ('Yes' or 'No'), converts these counts into a dataframe, and then uses ggplot2 to plot the data as a pie chart with percentage labels.

Churn Distribution of Contract Status

Here we will visualize the Distribution of Contract Status.

R
# Create the count plot
ggplot(churn_data, aes(x = Churn, fill = Contract)) +
  geom_bar(position = "dodge") +
  labs(title = "Churn Distribution w.r.t Contract Status", x = "Churn") +
  theme_minimal()

Output:

Churn Distribution w.r.t Contract Status

The above code snippet creates a bar plot in R using ggplot2 to show the distribution of churn (customer attrition) with respect to Contract Status in the churn_data dataframe.
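Raw counts can be hard to compare when the contract groups differ in size, so a within-group churn rate is a useful complement to the bar plot. The sketch below computes row-wise proportions with prop.table(); toy_contracts is a made-up data frame used only to keep the example self-contained.

```r
# Made-up contract and churn labels for illustration
toy_contracts <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "Month-to-month",
               "Month-to-month", "One year", "One year",
               "Two year", "Two year"),
  Churn = c("Yes", "Yes", "No", "No", "No", "Yes", "No", "No")
)

# Cross-tabulate, then convert each row to proportions (margin = 1)
contract_tab <- table(toy_contracts$Contract, toy_contracts$Churn)
contract_rates <- prop.table(contract_tab, margin = 1)
print(round(contract_rates, 2))
```

In ggplot2, mapping x = Contract and fill = Churn with geom_bar(position = "fill") draws the same proportions as stacked bars.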

Churn Distribution of Tenure

Now we will visualize the Churn Distribution of Tenure.

R
# Create the count plot
ggplot(churn_data, aes(x = tenure, fill = Churn)) +
  geom_bar(position = "dodge",width = 2,colour="black") +
  labs(title = "Churn Distribution w.r.t Tenure", x = "Months", y = "Count") +
  theme_minimal()

Output:

Churn Distribution w.r.t Tenure

The above code snippet creates a bar plot in R using ggplot2 to show the distribution of churn (customer attrition) with respect to Tenure in the churn_data dataframe.
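With one bar per month, the tenure plot can be dense. Grouping tenure into bands with cut() is one way to make the pattern easier to read. The break points and labels below are assumptions chosen for illustration, and toy_tenure is a made-up vector.

```r
# Made-up tenure values in months
toy_tenure <- c(1, 5, 12, 13, 30, 48, 49, 72)

# Bin into bands; intervals are right-closed, e.g. (0, 12], (12, 24]
tenure_band <- cut(
  toy_tenure,
  breaks = c(0, 12, 24, 48, 72),
  labels = c("1-12", "13-24", "25-48", "49-72")
)

# Number of customers per band
print(table(tenure_band))
```

The resulting factor can be added to churn_data as a new column and used on the x axis in place of tenure.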

Churn Distribution of Internet Services

Now we will visualize the Churn Distribution of Internet Services.

R
# Create the count plot
ggplot(churn_data, aes(x = InternetService, fill = Churn)) +
  geom_bar(position = "dodge") +
  labs(title = "Churn Distribution w.r.t Internet Services", x = "Internet Service") +
  theme_minimal()

Output:

Churn Distribution w.r.t Internet Services

The above code snippet creates a bar plot in R using ggplot2 to show the distribution of churn (customer attrition) with respect to Internet Services in the churn_data dataframe.

Senior Citizen Status

Identifying the number of senior citizens helps in tailoring services and promotions specifically for this segment. A bar plot can show the distribution of senior citizens versus non-senior citizens.

R
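# Count senior and non-senior customers from the data
# (SeniorCitizen is coded as 1 for senior citizens and 0 otherwise)
senior_data <- as.data.frame(table(
  SeniorCitizen = ifelse(churn_data$SeniorCitizen == 1, "Yes", "No")
))
names(senior_data) <- c("SeniorCitizen", "Count")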

# Create bar plot
ggplot(senior_data, aes(x = SeniorCitizen, y = Count, fill = SeniorCitizen)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  labs(title = "Senior Citizen Status", x = "Senior Citizen", y = "Count") +
  scale_fill_manual(values = c("No" = "#66B3FF", "Yes" = "#FF9999"))

Output:

Senior Citizen Status (bar plot)

This bar plot displays two bars: one for non-senior citizens and one for senior citizens. The height of the bars indicates the count of customers in each category. The plot uses different colors to distinguish between senior citizens and non-senior citizens, making the comparison straightforward.

Payment Method

Understanding how customers prefer to pay for services can inform billing and payment strategy. A bar plot can visualize the distribution of different payment methods.

R
# Counts per payment method, as reported by summary(churn_data)
payment_data <- data.frame(
  PaymentMethod = c("Bank transfer (automatic)", "Credit card (automatic)",
                    "Electronic check", "Mailed check"),
  Count = c(1542, 1521, 2365, 1604)
)

# Create bar plot
ggplot(payment_data, aes(x = PaymentMethod, y = Count, fill = PaymentMethod)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  labs(title = "Payment Method Distribution", x = "Payment Method", y = "Count") +
  scale_fill_brewer(palette = "Set3")

Output:

Payment Method Distribution (bar plot)

The bar plot represents the number of customers using each payment method. The plot uses different colors for each payment method, enhancing the visual distinction and making it easy to identify the most and least popular payment methods among customers.
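By default, the bars appear in alphabetical order of the payment method. To make the most and least popular methods stand out, the categories can be ordered by count before plotting, since ggplot2 draws discrete categories in factor-level order. The sketch below is one way to do this; payment_ordered is an illustrative name, and the counts are the ones used above.

```r
# Counts per payment method, as reported by summary(churn_data)
payment_ordered <- data.frame(
  PaymentMethod = c("Bank transfer (automatic)", "Credit card (automatic)",
                    "Electronic check", "Mailed check"),
  Count = c(1542, 1521, 2365, 1604),
  stringsAsFactors = FALSE
)

# Set factor levels from the highest count to the lowest
payment_ordered$PaymentMethod <- factor(
  payment_ordered$PaymentMethod,
  levels = payment_ordered$PaymentMethod[order(-payment_ordered$Count)]
)

print(levels(payment_ordered$PaymentMethod))
```

Passing payment_ordered to the same ggplot() call then draws the bars from most to least used.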

Conclusion

By leveraging the insights from the churn analysis, telecom companies can develop targeted strategies to reduce churn, enhance customer satisfaction, and ultimately drive growth. Continuous monitoring and analysis of customer data are essential to adapting to market trends and evolving customer needs, ensuring long-term success in the competitive telecom industry.

