0% found this document useful (0 votes)

32 views9 pages

1 s2.0 S2666827024000434 Main

Word embedding and classification methods and their effects on fake news detection

Uploaded by

traviss9010

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

32 views9 pages

1 s2.0 S2666827024000434 Main

Word embedding and classification methods and their effects on fake news detection

Uploaded by

traviss9010

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 9

Machine Learning with Applications 17 (2024) 100567

Contents lists available at ScienceDirect

Machine Learning with Applications

journal homepage: www.elsevier.com/locate/mlwa

Explaining customer churn prediction in telecom industry using tabular

machine learning models
Sumana Sharma Poudel a , Suresh Pokharel b , Mohan Timilsina c ,∗
a Pokhara University, Nepal college of Information Technology, Nepal
b
Westcliff University, Presidential Graduate School, Nepal
c
Data Science Institute, University of Galway, Ireland

ARTICLE INFO ABSTRACT

Keywords: The study addresses customer churn, a major issue in service-oriented sectors like telecommunications, where it
Customer churn refers to the discontinuation of subscriptions. The research emphasizes the importance of recognizing customer
Explainable model satisfaction for retaining clients, focusing specifically on early churn prediction as a key strategy. Previous
Global explainable
approaches mainly used generalized classification techniques for churn prediction but often neglected the
Local explainable
aspect of interpretability, vital for decision-making. This study introduces explainer models to address this
Telecommunication
gap, providing both local and global explanations of churn predictions. Various classification models, including
the standout Gradient Boosting Machine (GBM), were used alongside visualization techniques like Shapley
Additive Explanations plots and scatter plots for enhanced interpretability. The GBM model demonstrated
superior performance with an 81% accuracy rate. A Wilcoxon signed rank test confirmed GBM’s effectiveness
over other models, with the 𝑝-value indicating significant performance differences. The study concludes that
GBM is notably better for churn prediction, and the employed visualization techniques effectively elucidate
key churn factors in the telecommunications sector.

1. Introduction & Rehman, 2013; Wei & Chiu, 2002) and can positively influence
the company’s reputation, reducing marketing costs for new customer
The service-oriented industries, such as telecommunications, face acquisition (Bolton & Bronkhorst, 1995; Reichheld & Sasser, 1990).
considerable challenges due to customer churn, where valuable cus- So, it is desirable to have thorough research on customer churn and
tomers are lost to competitors. As the world rapidly embraces digi- taking proactive measures in response by decision maker can provide
tization, the telecommunications sector serves as a crucial backbone. a competitive edge to stay ahead in this competition.
Notably, it represents a significant contributor to national income, The primary goal of the churn prediction is to support the cre-
particularly in developing countries, where it plays a substantial role ation of client retention plans in a market that is highly competitive.
in generating revenue (Liao & Lien, 2012). With its substantial business Churn models are made to predict which customers are likely to quit
volume, telecommunications is recognized as a key industry, evident in on their own will and to spot early signs of churn (Wei & Chiu,
ongoing technical advancements and a growing number of operators. 2002). For this, companies must leverage their databases as valuable
Consequently, fierce competition among service providers persists (Ger- assets to comprehend customer churn behavior (Coussement & Van
pott, Rams, & Schindler, 2001), leading to the introduction of new
den Poel, 2008). Fundamentally, these databases contain information
technologies, services and strategies aimed at attracting new customer
on customer service usage, billing details, and satisfaction levels. In
and retaining existing customers. The churn rate in this sector is
addition to predicting customers likely to switch, companies seek to
approximately 2.6% monthly (Hawley, 2003). Comparing the return on
understand churn causes, which aids in profiling prone customers
investment between acquiring a new customer and retaining an existing
and devising effective retention campaigns (Leung, Pazdor, & Souza,
one reveals that the latter is less expensive (Reinartz & Kumar, 2003;
2021). Effective churn modeling has two important components: (i)
Yang & Peterson, 2004) and generally easier than upselling (Ascarza,
Iyengar, & Schleicher, 2016). Therefore, customer retention is recog- predicting whether a specific customer will churn, (ii) discovering the
nized as the most profitable strategy (Qureshi, Rehman, Qamar, Kamal, reasons behind their churn, either at a local or global level. While

∗ Corresponding author.
E-mail addresses: [email protected] (S.S. Poudel), [email protected] (S. Pokharel), [email protected]
(M. Timilsina).

https://doi.org/10.1016/j.mlwa.2024.100567
Received 9 February 2024; Received in revised form 28 March 2024; Accepted 19 June 2024
Available online 24 June 2024
2666-8270/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

much of the existing research predominantly focuses on the first aspect. et al., 2007). However, the information for land-line services providers
They are treating churn prediction as a binary classification task and are different than mobile services (Bin et al., 2007). Some of this
employing various machine learning techniques around it such as data is missing, less reliable or incomplete in land-line communication
feature extraction (Zhao, Gao, Dong, Dong, & Dong, 2017), feature service providers. For instances, customer ages and complaint data,
selection (Umayaparvathi & Iyakutti, 2017), treatment of imbalanced fault reports are unavailable and only the call details of a few months
datasets (Fujo et al., 2022), and utilizing classifiers like SVC (Cortes & are available. Due to business confidentiality and privacy, there are
Vapnik, 1995), Logistic Regression (Hosmer, Lemeshow, & Sturdivant, no public datasets for churn prediction (Huang, Kechadi, & Buckley,
2013), Random Forest (Breiman, 2001), XGBoost (Friedman, 2001), 2012).
and Neural networks (Goodfellow, Bengio, & Courville, 2016). How- Customer churn prediction models have demonstrated significant
ever, this alone may not suffice to fully grasp customer behavior and value beyond telecommunications, notably within industries like digital
these ignore the second important component. These model cannot marketing, e-commerce, and banking, where understanding and miti-
explain the reason behind the churning. gating churn is equally critical. In digital marketing, the application
This study aims to close the research gap in the field of churning pre- of churn models facilitates the optimization of customer engagement
diction by focusing not only on forecasting whether a certain customer and retention strategies. For instance, Ascarza (2018) in their work
would churn or not, but also on the reason why. For the reasoning delve into how digital marketing efforts can be tailored to retain cus-
we adapt SHapley Additive exPlanations (SHAP) to explain machine tomers showing signs of churn, offering insights into the effectiveness
learning predictions by identifying influential customers from the train- of targeted interventions. In the banking sector, Miguéis, Van den Poel,
ing set (Lundberg & Lee, 2017a). The specific research questions (RQs) Camanho, and e Cunha (2012) apply churn prediction to understand
investigated are: and predict customer churn concerning specific banking products and
services. These references collectively highlight the broad applicability
• What are the best available off-the-shelf machine learning of churn prediction models across various industries, emphasizing their
algorithms for predicting customer churn? potential to inform and refine customer retention strategies in diverse
1. Which classification algorithm performs best for churn business contexts.
prediction in terms of different evaluation metrics? Churn analysis and prediction task is also tackled from statistical
2. Is there a significant difference in the predictions made by modeling perspective. A very popular approach to model churn is time
these classifiers? to event prediction (Bhattacharya, 1998; Van den Poel & Lariviere,
2004). In the context of customer attrition, the time to failure links to
• How can we explain the factors responsible for customer the churn behavior. The potential churner behavior has also been con-
churn? sidered using structural equation modeling (Nguyen & LeBlanc, 1998;
Varki & Colgate, 2001). Such technique can be of great interest for
1. What are the most important predictors, and how do they
managerial decisions, as it evaluates the effect of suspected influential
influence prediction performance?
features on a specific customer decision, such as churn (Geiler, Affeldt,
2. Is there any interaction between the churn predictors?
& Nadif, 2022). The variance analysis was also widely used in market-
ing and business areas to uncover customer behavior (Maxham, 2001;
Contributions: Our contributions are summarized as follows: Mittal & Kamakura, 2001; Zeithaml, Berry, & Parasuraman, 1996).
With this, this research’s contributions are summarized as follows: Financial and retail services also rely on classical T-test and Chi square
statistics to forecast customer behavior and perceptions (Hitt & Frei,
• We rigorously compared state-of-the-art supervised machine learn-
2002; Mittal & Lassar, 1998). The churn prediction problem has one
ing
algorithms for churn prediction. important issue of class imbalance (Kong, Kowalczyk, Menzel, & Bäck,
2020) that might cause biased towards the negative samples which
• We performed statistical tests to find the most significant model
might hinder training the machine learning models (Zhu, Baesens, &
for churn prediction.
vanden Broucke, 2017). Typically, this problem occurs when the classes
• We provide explanations for each predictor corresponding to
in a given dataset are unequally distributed between the minority
customer churn, highlighting both positive and negative contri-
and majority classes that is low number of ‘‘churners’’ than ‘‘non
butions to churn prediction.
churners’’. Without considering this problem, effective learning process
To the best of our knowledge, our approach is the first to generate by classification algorithms will be a challenge, since the main goal is
global and/or local explanations for churn prediction. We conducted the detection of minority classes (Dwiyanti et al., 2016; Sun, Wong, &
rigorous experiments to evaluate tabular machine learning algorithms Kamel, 2009). The popular algorithms like k-nearest neighbors (k-NN)
using different evaluation metrics and to choose the most significant is also applied in the churn-like data however studies (Dubey & Pudi,
model. 2013; Tan, 2005) have shown several significant drawbacks. In the con-
The remainder of the paper is organized as follows: related work, text of class imbalance issues in churn prediction problem, Naive Bayes
problem definition, method description, experiments, and conclusion. classifier also appeared to be sensitive due to the strong bias in the prior
estimation (Bermejo, Gámez, & Puerta, 2011). However, Huang et al.
2. Related work (2012) demonstrated reasonable results using Naive Bayes method.
Earlier studies have provided for various customer churn models
Recently, data mining techniques have emerged to tackle the chal- they have analyzed the model based on customer behavior data and
lenging problems of customer churn in telecommunication service used different data mining techniques (Moayer & Gardner, 2012; Naz,
field (Au, Chan, & Yao, 2003; Hadden, Tiwari, Roy, & Ruta, 2007). As Shoaib, & Shahzad Sarfraz, 2018; Pushpa, 2012). In these studies,
one of the important measures to retain customers, churn prediction has all churn prediction models were analyzed and models with the best
been a concern in the telecommunication industry and research (Bin, results were presented. There are various approaches for that for ex-
Peiji, & Juan, 2007). Majority of the research focused on churn pre- ample Lazarov and Capota (Lazarov & Capota, 2007) showed that a
diction were dedicated in voice services available over mobile and model based on the customer’s lifetime value analysis is the best way
fixed-line networks. In most of the cases, the features used for churn to predict customer churn. Similarly Naz et al. (2018) and Bandara,
prediction in mobile telecommunication industry includes customer Perera, and Alahakoon (2013) analyzed model based on a dataset
demographics, contractual data, customer service logs, call details, they used and showed that a big dataset with more features causes
complaint data, bill and payment information (Bin et al., 2007; Hadden model training and evaluation difficult. Hence, this research suggested

2
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

focusing on feature selection to reduce the number of features. In to stay. Thus in this work, we exploit the power of XAI to uncover
terms of machine learning models the study showed that for true churn local and global explanation of churn prediction. In particular, these
rate and false churn rate, SVM should be used and in case of churn explanations will enable the understanding of machine learning reason-
probability, logistic regression should have been used. Similarly Ahmed ing for the domain expert for customer churn prediction. From global
and Linen (2017) proposed that using hybrid models are useful and explanation, one can learn about the most important pattern learned by
accurate for churn prediction. the machine learning model for churn prediction about training popu-
The user churn prediction is also studied from the network science lation. It helps to understand the interaction between the confounding
perspective. Recently the studies (Ahmad, Jafar, & Aljoumaa, 2019; predictors. From local explanation, it enables the reasoning that the
Huang et al., 2015; Mitrović & De Weerdt, 2020; Xu et al., 2021; Zhang, model applied to a particular case to answer the very specific questions
Zeng, Zhao, Jin, & Li, 2022) showed the effect of social influence on such as ‘‘Why customer Alex churned?’’ and ‘‘Why has Jane continued
user churn. The techniques to approach this problem is categorized to subscribe the plan’’.
from two perspective. The first one is to model the network structure as
a surrogate of social influence. For instance, Ahmad et al. (2019) used 3. Solution approach
social network analysis to extract network-based features for machine
learning model. Similarly, Yang, Shi, Jie, and Han (2018) extracted The overall solution of our approach is illustrated in Fig. 1. The
network features to cluster users in different communities and predict main aim of this study is to assess machine learning classifiers to predict
customer churn with a deep learning model. The second one is to model the customer churn and provide local and global explainability for
the sequential order of churn as a diffusion process and use propagation those predictions. In the next section, we explained our methodology
models such as inflection and stopping rule (Ji et al., 2021) and of our approach.
spreading propagation activation (Dasgupta et al., 2008) to simulate
the diffusion process and give predictions. However, the main caveat of 4. Methods
this method is that these approaches failed to capture the causal nature
The figure above depicts the methodology of the proposed model
of social influence. There is also a graph-based semi-supervised effort to
approach for the churn prediction. The step wise working of method-
predict the customer-churn in telecommunication (Benczúr, Csalogány,
ology is described as below:
Lukács, & Siklósi, 2007). Liu et al. (2018) propose a novel graph-
based inductive semi-supervised embedding model that jointly learns • Dataset: The input to the model is the Telecommunication dataset
the prediction function and the embedding function for user-game in any tabular format. The dataset used in the paper is from
interaction to predict the user churn from the games. Kaggle. The dataset consists of missing data which requires clean-
Recent studies begin to investigate how to use causal informa- ing. For this, the dataset is passed to data preparation and
tion to build better deep learning models (Bonner & Vasile, 2018; preprocessing steps.
Yoon, Jordon, & Van Der Schaar, 2018). It includes the applications • Selection Criteria: The Telecommunication dataset consist of
to eliminate the bias between the observed data and the application data of both churners and non-churners. Some of field might
scenarios and learning the causal effects to give more accurate churn consist of missing values as well. Such data should be handled
predictions (Johansson, Shalit, & Sontag, 2016). The studies by Umaya- before the data are fed into the model. Thus, in this steps missing
parvathi and Iyakutti (2017) demonstrated that deep learning models values are drop.
have similar performance to conventional classifiers such as support • Feature Engineering and feature selection: The raw datasets
vector machine and random forest. The transfer learning which is very need to be handled before fetching to the classifiers. The input
popular in image classification has also been employed in the customer datasets consists of duplicate columns and unique value columns
churn prediction (Ahmed et al., 2019). Similarly, Seymen, Dogan, as well. Such data does not provide any significance in the churn
and Hiziroglu (2020) proposed a novel deep learning model which is prediction and thus these columns are drop.
compared to logistic regression and artificial neural network models. • Encoding: The Telecommunication dataset consist of both nu-
In a similar note, Momin, Bohra, and Raut (2020) demonstrated that meric as well as categorical data. However, all of the machine
deep Learning enables multi-stage models to represent the data at level models do not work with categorical data. Thus, numeric
multiple abstraction levels which reduces the time and effort of feature conversion of data need to be done before application of ML
selection considerably as it automatically creates useful features for models. For handling of such categorical data one-hot encoding
accurate customer churn prediction. In spite of the popularity, the technique is implemented in the model. This led to the increment
deep learning models can still be considered as a black box because in the column of the dataset.
of the complicated architecture and there is a little visibility into its • Hyper parameter selection: the optimization of hyperparame-
decision rationale (Colbrook, Antun, & Hansen, 2022). Furthermore, it ters across diverse machine learning models deployed for predict-
is also ambitious to recognize problems in a machine learning model or ing customer churn within the telecommunications sector. These
otherwise find improvements for it if the model’s behavior cannot be models are characterized by a multitude of hyperparameters, each
understood (Adadi & Berrada, 2018). EXplainable Artificial Intelligence necessitating precise calibration to enhance model efficacy.
(XAI) (Emmert-Streib, Yli-Harja, & Dehmer, 2020) is a research area • Training Models: In our methodology, we have incorporated
that studies how to make models transparent and explainable. In terms a suite of state-of-the-art classification algorithms to ensure ro-
of black box models such as random forest and artificial neural network bust and accurate modeling. This includes the utilization of the
they require the application of XAI techniques to explain the model SVM (Cortes & Vapnik, 1995), known for its effectiveness in high-
recommendation (Leung et al., 2021). dimensional spaces, and LR (Hosmer et al., 2013), a staple for
From the above listed studies, we observed that customer churn has binary classification problems. Additionally, we have leveraged
investigated a wide range of algorithms from white box to black box the Random Forest Classifier (Breiman, 2001), which excels in
models. They have good abilities to differentiate between ‘‘churn’’ and handling large datasets with numerous features. The GBM (Fried-
‘‘no churn’’ customers. However, previous studies have not primarily man, 2001) has been selected for its prowess in predictive accu-
focus on explaining churn prediction model. Therefore, successfully racy by combining multiple weak prediction models into a strong
discriminating between these two categories is not only the aspect that one. Lastly, Neural Networks (Goodfellow et al., 2016) have
is utmost importance. For customer churn prediction, understanding been implemented for their unparalleled capacity to learn from
of the model and its outputs is important as well to target incentives complex data patterns through layers of interconnected nodes,
to customers who have a high risk of churning and inducing them making our approach comprehensive and powerful.

3
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

Fig. 1. An illustration of the data processing, model training, evaluation and explainer models on the customer churn data.

• Statistical Test: For a fair assessment of whether any model’s Table 1

Summary of Attributes used in the Dataset.
predictive ability is statistically better or not in comparison to
others Wilcoxon signed-rank test is implemented in the model. S.N Attributes Data type

This provide a rigorous validation for model selection. 1 CustomerID object

2 Count int64
• Evaluation Strategy: To rigorously evaluate the performance of
3 Country object
the classification models, we used 10-fold cross-validation strat- 4 State object
egy. This technique involves partitioning the original dataset into 5 City object
10 equal-sized subsets. In each fold of the validation process, nine 6 Zip_Code int64
subsets are used to train the model, and the remaining one subset 7 Lat_Long object
8 Latitude float64
is used to test the model. This cycle is repeated 10 times, with 9 Longitude float64
each of the 10 subsets serving as the test set exactly once. By 10 Gender object
employing 10-fold cross-validation, we aim to achieve a more 11 Senior_Citizen object
accurate and generalized understanding of the model’s predictive 12 Partner object
13 Dependents object
power and ensure that classifier’s performance is not dependent
14 Tenure_Months int64
on a particular random split of the data. This is considered a 15 Phone_Service object
robust method for assessing the generalizability of the model to 16 Multiple_Lines object
an independent dataset. 17 Internet_Service object
18 Online_Security object
• Explainer models: To elucidate the complexities of the telecom-
19 Online_Backup object
munication dataset within our study, we have integrated ex- 20 Device_Protection object
plainer models that substantially improve data visualization. Our 21 Tech_Support object
approach incorporates SHAP (Lundberg & Lee, 2017b) plots for 22 Streaming TV object
a macro-level analysis, providing global explanations of fea- 23 Streaming Movies object
24 Contract object
ture influences on the predictive model, alongside scatter plots
25 Paperless_Billing object
for micro-level insights into individual customer behaviors. This 26 Payment Method object
bifurcated visualization strategy enables a comprehensive un- 27 Monthly Charges float64
derstanding of the dataset, facilitating the identification of sys- 28 Total_Charges object
29 CLTV int64
temic and case-specific factors influencing customer churn. Con-
30 Churn_Label object
sequently, these methods enhance the interpretability of our
model and strengthen the predictive accuracy regarding the key
determinants of churn in the telecommunications domain.
In our churn prediction model, we placed significant emphasis on
5. Feature engineering and data processing feature engineering to enhance the model’s ability to predict customer
churn accurately. Among the additional features we created, the en-
Table 1, demonstrates the feature we used in our study. We prepared gagement score. This composite score is derived from Tenure_Months,
the dataset for analysis, ensuring a robust foundation for our predictive Monthly Charges, and Total Charges, offering customer engagement
models. The dataset comprises an array of features spanning customer and loyalty over time. By integrating these elements, the engagement
demographic information, account details, and service subscriptions, score provides a multifaceted understanding of how deeply and sat-
each playing a crucial role in understanding customer behavior and isfactorily customers are connected to the services offered. Another
predicting churn. critical feature we introduced is service utilization, which quantifies the

4
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

Table 2 Table 3
Model hyperparameters. Summary of the Dataset.
Model Hyperparameter tuning range Hyperparameter Description Dataset
SVC 0.001641949 – 464.0812108 C Number of samples 7043
Number of features 30
Logistic Regression 5.15E−05 - 4534347.358 C
% of positive samples (Churn) 26.54%
Random Forest 9 – 20 max-depth %of negative samples (Non-Churn) 73.46%
14–20 n-estimators Data source Kaggle
GBM 5–29 max-depth
5–10 min-samples-leaf
auto max-features
3–7 max-leaf-nodes model’s performance. The models are ranked by their Accuracy, with
Neural networks 5–9 hidden-layer-sizes GBM showing the highest Accuracy of 0.81 ± 0.02 and Neural Networks
relu activation the lowest at 0.74 ± 0.06. The ROC-score follows a similar trend, with
adam solver
GBM having the highest score. The PR-score is also highest for GBM,
AdaBoost 50 – 500 n-estimators suggesting its superior performance across various aspects of churn
0.01 – 1.0 learning-rate
prediction tasks in this evaluation. The data presented in the table
XGBoost 100 - 1000 n-estimators reveals that the GBM model exhibits superior performance compared
0.01 – 0.3 learning-rate
to other models.
3 – 10 max-depth
We have used the Wilcoxon signed-rank test is used to determine
if there is a significant difference in the predictive power of GBM
compared to each of the other models when applied to the same churn
total number of services a customer utilizes, including Phone_Service, prediction task. This allows for a fair assessment of whether GBM’s
Multiple_Lines, Internet_Service, among others. This feature reflects the predictive ability is statistically better or not, providing a rigorous
depth of product penetration and serves as an indicator of potential validation for model selection.
customer satisfaction. A higher service utilization often suggests that The test results showcased in the Table 5 demonstrate that GBM
customers find value in a wider range of services, potentially increasing significantly outperform several other supervised machine learning
their loyalty and decreasing their likelihood of churn. models in the context of churn prediction for this specific dataset.
AdaBoost, with a 𝑝-value of 0.05, indicates that its difference in perfor-
6. Results mance compared to GBM is on the threshold of statistical significance,
suggesting a competitive but slightly less effective model than GBM
6.1. Model hyperparameter tuning in this context. XGBoost’s 𝑝-value of 0.07, slightly above the conven-
tional threshold for statistical significance, suggests that while it may
Table 2 illustrates the optimization of hyperparameters across di- offer strong predictive capabilities, it does not statistically outperform
verse machine learning models deployed for predicting customer churn GBM to a significant degree in this dataset. Both Neural Networks
within the telecommunications sector. These models are characterized and Logistic Regression, with p-values well below the 0.05 thresh-
by a multitude of hyperparameters, each necessitating precise calibra- old, demonstrate a statistically significant difference in performance
tion to enhance model efficacy. Detailed in the table are the ranges compared to GBM, indicating GBM’s superior capabilities in churn
of hyperparameter tuning, alongside the specific hyperparameters se- prediction. The SVC’s performance, with a 𝑝-value marginally above
lected for each model, underscoring their pivotal role in refining model the threshold, and Random Forest, with a higher 𝑝-value, suggest a less
performance. significant difference compared to GBM, underscoring GBM’s robust-
ness and effectiveness as a churn prediction tool. This comprehensive
6.2. Experiments comparison underscores the importance of selecting the right model
based on the dataset’s specific characteristics and the predictive task at
Table 3 presents the specifications of the dataset employed in hand. While GBM shows strong performance, the nuanced differences
our studies. The use of detailed telecommunication data poses sub- between models highlight the potential benefits of model ensemble
stantial challenges, primarily due to rigorous privacy regulations and approaches or further hyperparameter tuning to optimize predictive
proprietary limitations, which significantly hinder external analytical accuracy.
endeavors and innovative developments. Kaggle1 enhances these con- To further understand the effectiveness of the GBM, we utilized a
straints by providing anonymized datasets, thereby ensuring adherence confusion matrix to examine its predictive accuracy and identify the
to privacy standards while simultaneously facilitating the extraction of areas where the model may be making errors.
valuable analytical insights. The platform’s dynamic community further Table 6 presents the confusion matrix for the GBM model, a key
promotes a culture of collaboration and knowledge exchange, catalyz- tool in our churn prediction analysis. The matrix indicates that the
ing the development of novel solutions for intricate sector-specific model is highly effective at identifying customers who will remain with
issues such as churn prediction. Consequently, our study leverages the service, as evidenced by the 466 true negatives. However, it also
this publicly accessible data to train our models and derive predictive points to a notable challenge in the form of 84 false negatives, which
insights from this dataset. represent customers who were predicted to stay but actually churned.
To assess the performance of the state-of-the-art classifier model, While the model successfully identified 103 actual churners (true pos-
we utilized a comprehensive set of evaluation metrics, including Ac- itives), it incorrectly flagged 51 loyal customers as likely to churn
curacy, Precision, Recall, F1-score, Receiver Operating Characteristic (false positives), suggesting a need for refinement. The GBM model’s
(ROC) curve, and Precision–Recall (PR) score. Table 4 summarizes the strong suit is its ability to recognize stable customers, a vital aspect
performance metrics of various machine learning models used for churn of preserving a customer base and avoiding the costs associated with
prediction tasks. Each evaluation metric is accompanied by a mean unwarranted retention incentives. Yet, its tendency to overlook some
value and a standard deviation ±, indicating the variability of the churners could lead to substantial customer loss if not addressed. Im-
proving the model’s sensitivity, to capture more true churn cases, and
its precision, to reduce the mistaken identification of loyal customers as
1
https://www.kaggle.com/ churners, emerges as a critical focus for advancing its utility in practical

5
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

Table 4
Results of the 10 Fold cross validation of supervised machine learning classification model for churn prediction. The figure behind ± is the
standard deviation.
Models Accuracy Precision Recall F1-score ROC-score PR-score
Neural networks 0.74 ± 0.06 0.58 ± 0.26 0.43 ± 0.31 0.41 ± 0.21 0.83 ± 0.02 0.64 ± 0.03
SVC 0.78 ± 0.01 0.68 ± 0.03 0.34 ± 0.02 0.45 ± 0.02 0.77 ± 0.02 0.57 ± 0.04
Logistic Regression 0.79 ± 0.02 0.64 ± 0.04 0.47 ± 0.06 0.54 ± 0.05 0.81 ± 0.03 0.61 ± 0.04
AdaBoost 0.79 ± 0.01 0.65 ± 0.02 0.50 ± 0.06 0.57 ± 0.03 0.82 ± 0.01 0.63 ± 0.02
XGBoost 0.80 ± 0.03 0.68 ± 0.01 0.55 ± 0.02 0.61 ± 0.03 0.85 ± 0.02 0.67 ± 0.02
Random Forest 0.80 ± 0.02 0.71 ± 0.04 0.43 ± 0.08 0.53 ± 0.07 0.84 ± 0.01 0.64 ± 0.03
GBM 0.81 ± 0.02 0.67 ± 0.04 0.55 ± 0.03 0.60 ± 0.02 0.86 ± 0.01 0.68 ± 0.03

Table 5 values, with red signifying higher values. The horizontal spread of the
Wilcoxon signed rank test.
dots reflects the magnitude of each feature’s SHAP value; points to
Models Statistics pvalue
the right of the central vertical line indicate a feature’s propensity to
Neural Networks 0.0 0.03125 increase the likelihood of churn, while points to the left suggest a de-
SVC 1.0 0.0625
Logistic Regression 0.0 0.03125
crease. Notably, features such as ‘Internet_Service_Fiber optic’ and ‘Pay-
AdaBoost 2.0 0.05 ment_Method_Electronic check’ predominantly contribute positively to
XGBoost 1.5 0.07 churn predictions, whereas features like ‘Online_Security_No’, ‘Depen-
Random Forest 3.0 0.15625
dents_Yes’, and ‘Tech_Support_No’ display a mixture of positive and
negative effects on the model’s predictions. In the next section, we have
Table 6 demonstrated the top two ranked features ‘Contract_Month-to-month’,
Confusion Matrix analysis for GBM. ‘Tenure_Months‘ by the GBM and its interaction with the other features
Prediction outcome in the data.
Non-churners Churners
Non-churners 466 51
Actual value 6.2.2. Interaction between the churn predictors
Churners 84 103
Fig. 3 visualizes the relationship between month-to-month contracts
and the provision of fiber optic internet service in the context of
customer churn. The red dots represent customers who have churned
business scenarios. These enhancements are imperative for tailoring (discontinued their service), and the blue dots represent those who
customer retention strategies more effectively and securing a healthier have not churned (continued their service). The 𝑥-axis differentiates
churn rate, thereby improving the business’s financial performance and customers based on their contract type, with a particular focus on
customer satisfaction. month-to-month contracts. The 𝑦-axis measures some standardized met-
Qualitative Benchmark with Other State-Of-The-Art Models: In ric related to churn, possibly a probability or a churn score. From the
Table 7, we introduce an innovative approach to customer churn pre-
plot, we can observe a higher density of red dots at the higher end
diction, leveraging Gradient Boosting Machines (GBM) to analyze the
of the month-to-month contract axis, indicating that customers with
Kaggle customer churn prediction dataset. Our methodology achieved
month-to-month contracts and fiber optic internet service are more
a ROC-Score of 0.86, positioning it competitively among state-of-the-
likely to churn. Conversely, there are more blue dots concentrated
art methods in churn prediction for the telecommunications industry.
towards the lower end of the axis, suggesting that customers without
Notably, Ebrah et al.’s use of SVM on both the IBM Watson dataset
and the cell2cell dataset resulted in ROC-Scores of 0.83 and 0.99, fiber optic service or with longer contract terms are less likely to
respectively, indicating a high benchmark for model performance in churn. This implies an interaction where the likelihood of churn is
varied contexts (Ebrah & Elnasir, 2019). Similarly, Shrestha et al. amplified for customers who have fiber optic service on a month-to-
demonstrated the efficacy of XGBoost in achieving a ROC-Score of month basis compared to those without such service or with more
0.98 with data from a Telecom service provider in Nepal (Shrestha & extended contracts.
Shakya, 2022), while Saha et al. utilized CNN and ANN models to reach Fig. 4 illustrates the relationship between customer tenure, mea-
a ROC-Score of 0.99 across datasets from both Southeast Asian and sured in months on the 𝑥-axis, and the amount they are charged
American telecom markets (Saha et al., 2023). These findings under- monthly, represented by the color intensity of the dots, with magenta
score the significant advancements in churn prediction methodologies, indicating higher charges and blue indicating lower charges. The 𝑦-axis
with SVM, XGBoost, CNN, and ANN models setting high standards for shows a standardized value metric, which might represent customer
accuracy and reliability. Our GBM-based approach contributes to this satisfaction or likelihood of churn. The pattern suggests that customers
evolving landscape by not only achieving a commendable ROC-Score with shorter tenure and higher monthly charges (magenta dots) expe-
but also by emphasizing the adaptability and effectiveness of GBM rience a more substantial negative impact on the standardized value
models in handling the complexities of customer churn prediction. This
metric, which could indicate lower satisfaction or higher churn risk. As
comparative analysis highlights our model’s potential in bridging the
tenure increases, the density of magenta dots diminishes, particularly
gap between traditional machine learning techniques and the demands
beyond the 20-month mark, suggesting that customers with higher
of modern-day churn prediction challenges.
monthly charges either improve in their standardized value metric or
6.2.1. Selection of most important predictors possibly churn out of the service, leaving behind those more satisfied
Fig. 2 presents a beeswarm plot generated using SHAP values, or less sensitive to the charge amount. The convergence of magenta
which delineates the influence of various features on the GBM model’s and blue dots as tenure increases indicates that the impact of monthly
churn predictions. The plot reveals that the ‘Contract_Month-to-month’, charges on the standardized metric decreases over time. Customers
‘Tenure_Months’, and ‘Monthly Charges’ features exert the most sub- with longer tenure, irrespective of their monthly charges, show similar
stantial impact on the model’s output, with the ‘Contract_Month-to- values of the standardized metric, which could imply that the initial
month’ feature, in particular, strongly pushing predictions towards sensitivity to pricing diminishes, or that the remaining customer base
churn. A gradation from blue to red denotes the range of feature has adapted to or accepted the monthly charges.

6
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

Table 7
Performance comparison of various models on telecom customer churn prediction, highlighting our GBM approach.
Reference Dataset Evaluation metric Model
Yabas, Cankaya, and Ince (2012) Orange Telecom ROC-Score (0.653) Random Forest
Ebrah and Elnasir (2019) IBM Watson dataset ROC-Score (0.83) SVM
Ebrah and Elnasir (2019) cell2cell ROC-Score (0.99) SVM
Shrestha and Shakya (2022) Telecom service providerof Nepal ROC-Score (0.98) XGBoost
Saha et al. (2023) Southeast Asian telecom industry, and American telecom market. ROC-Score (0.99) in bothdataset CNN and ANN
Our approach Kaggle customer churn prediction ROC-Score (0.86) GBM

Fig. 2. Beeswarm plot for GBM model.

7. Discussion

The telecommunications industry is at the forefront of customer-

centric strategies, where understanding and mitigating churn is not just
beneficial but essential for sustaining growth and profitability. In our
study, we sought to identify a machine learning model that not only
excels in churn prediction but also offers clear insights into the reasons
behind customer turnover. GBM model emerged as the front-runner in
our analyses, substantiated by a rigorous statistical comparison using
the Wilcoxon signed-rank test. The test revealed that GBM significantly
outperforms Neural Networks and Logistic Regression in predicting
churn, with p-values indicating the improbability of such results being
due to chance.
What sets our approach apart is the incorporation of SHAP, which
provided both global and local interpretability of the GBM model’s
predictions. Globally, SHAP values allowed us to rank features by their
Fig. 3. Feature Interaction of Contract-Month-to-month. importance and to understand the overall direction and strength of
each feature’s impact on churn prediction. For instance, features like
month-to-month contracts, tenure, and monthly charges were identi-
fied as key drivers of churn. Customers with short-term contracts or
higher monthly charges were predisposed to churn, implying that long-
term contracts and competitive pricing could be effective retention
strategies.
Locally, SHAP offered insights into individual predictions, explain-
ing why specific customers were likely to churn according to the model.
This level of detail is crucial for customer relationship management,
as it allows for personalized intervention strategies. For example, a
customer predicted to churn due to high monthly charges could be
offered a discount or a bundle package as an incentive to stay. The
GBM model’s ability to reveal complex interactions between features
was another advantage. Through SHAP interaction values, we observed
how the impact of one feature on churn could change in the presence
of another feature. For example, the negative effect of a month-to-
month contract on customer retention was exacerbated when combined
with fiber optic internet service, suggesting that customers with this
Fig. 4. Feature Interaction of Tenure-Months.
combination of services were particularly churn-prone.

7
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

The combination of GBM and SHAP explanations thus provided CRediT authorship contribution statement
a powerful tool for telecom operators. Not only could they accu-
rately predict which customers were at risk of churning, but they Sumana Sharma Poudel: Conducted experiments, Analysed the
could also understand the underlying factors contributing to these results, Prepared the original draft. Suresh Pokharel: Revised the
predictions. This understanding facilitates the development of targeted original draft. Mohan Timilsina: Provided the guidance, Revised the
strategies to retain specific customer segments, enhancing the efficiency manuscript.
of marketing efforts and potentially improving customer satisfaction.
Incorporating these insights into business operations could lead to more Declaration of competing interest
nuanced customer segmentation and more effective churn prevention
initiatives. For instance, identifying at-risk customers based on their us- We wish to confirm that there are no known conflicts of interest
age patterns and service preferences enables the deployment of tailored associated with this publication and there has been no significant
communication strategies and personalized offers, thereby fostering financial support for this work that could have influenced its outcome.
customer engagement and loyalty.
Our work’s core contribution lies in enhancing the interpretabil- Data availability
ity of machine learning (ML) models for customer churn prediction,
particularly through the use of SHapley Additive exPlanations (SHAP) Data will be made available on request.
values. The creation of unique features before data classification indeed
presents a valuable avenue for research; however, it poses substantial Acknowledgments
challenges, including the need for deep domain expertise, limitations
posed by data availability and quality, the balance between model We would like to thank Data Science Institute, Insight Center for
complexity and interpretability, and the risk of overfitting. Our study Data Analytics, at University of Galway Ireland for providing us con-
focuses on leveraging existing, well-understood features and enrich- structive feedback and improvement of the manuscript.
ing the analysis with detailed interpretability to provide actionable
insights. This approach not only aids telecom providers in identifying References
and addressing churn risks but also maintains the model’s generaliz-
ability and robustness, carefully navigating the complexities inherent Adadi, A., & Berrada, M. (2018). Peeking inside the black-box: a survey on explainable
artificial intelligence (XAI). IEEE Access, 6, 52138–52160.
in feature engineering.
Ahmad, A. K., Jafar, A., & Aljoumaa, K. (2019). Customer churn prediction in telecom
using machine learning in big data platform. Journal of Big Data, 6(1), 1–24.
8. Conclusion Ahmed, U., Khan, A., Khan, S. H., Basit, A., Haq, I. U., & Lee, Y. S. (2019). Transfer
learning and meta classification based deep churn prediction system for telecom
In the telecom sector, accurately predicting which customers are industry. arXiv preprint arXiv:1901.06091.
Ahmed, A., & Linen, D. M. (2017). A review and analysis of churn prediction methods
likely to leave the service is crucial. The ability to identify at-risk for customer retention in telecom industries. In 2017 4th international conference on
customers early on allows companies to intervene with targeted re- advanced computing and communication systems (pp. 1–7). IEEE.
tention strategies. Machine learning models, particularly those that Ascarza, E. (2018). Retention futility: Targeting high-risk customers might be
handle tabular data, are key to making these predictions. These models ineffective. Journal of Marketing Research, 55(1), 80–98.
Ascarza, E., Iyengar, R., & Schleicher, M. (2016). The perils of proactive churn
analyze customer data and can effectively forecast who might churn.
prevention using plan recommendations: Evidence from a field experiment. Journal
This predictive power is essential for reducing churn rates, which is of Marketing Research, 53(1), 46–60.
a persistent problem for telecom providers. Our research found that Au, W.-H., Chan, K. C., & Yao, X. (2003). A novel evolutionary data mining al-
the GBM model was especially effective in this data. To confirm GBM’s gorithm with applications to churn prediction. IEEE Transactions on Evolutionary
performance, we compared it with other advanced models using the Computation, 7(6), 532–545.
Bandara, W., Perera, A., & Alahakoon, D. (2013). Churn prediction methodologies in the
Wilcoxon signed-rank test. The test results showed that GBM was
telecommunications sector: A survey. In 2013 international conference on advances
significantly better at predicting churn. The 𝑝-value from the test helped in ICT for emerging regions (pp. 172–176). IEEE.
us understand the strength of this evidence. A lower 𝑝-value indicates a Benczúr, A. A., Csalogány, K., Lukács, L., & Siklósi, D. (2007). Semi-supervised learning:
more definitive difference between the models, and in our case, GBM’s A comparative study for web spam and telephone user churn. In In graph labeling
workshop in conjunction with ECML/pKDD. Citeseer.
lower 𝑝-value confirmed its superior predictive ability. similarly, we
Bermejo, P., Gámez, J. A., & Puerta, J. M. (2011). Improving the performance of Naive
leveraged the SHAP (SHapley Additive exPlanations) values to gain Bayes multinomial in e-mail foldering by introducing distribution-based balance of
insights into the importance of different features in our predictive datasets. Expert Systems with Applications, 38(3), 2072–2080.
model. This information is invaluable for telecom companies looking Bhattacharya, C. (1998). When customers are members: Customer retention in paid
to pinpoint the factors that most influence customer churn. By utiliz- membership contexts. Journal of the Academy of Marketing Science, 26(1), 31–44.
Bin, L., Peiji, S., & Juan, L. (2007). Customer churn prediction based on the decision
ing SHAP values, we were able to identify which specific customer
tree in personal handyphone system service. In 2007 international conference on
attributes, such as call duration, plan type, or contract length, had the service systems and service management (pp. 1–5). IEEE.
most significant impact on the churn prediction. These insights helped Bolton, R. N., & Bronkhorst, T. M. (1995). The relationship between customer
telecom providers tailor their retention efforts towards addressing the complaints to the firm and subsequent exit behavior. ACR North American Advances.
Bonner, S., & Vasile, F. (2018). Causal embeddings for recommendation. In Proceedings
key factors driving customer attrition. SHAP values provided a trans-
of the 12th ACM conference on recommender systems (pp. 104–112).
parent and interpretable way to analyze the model’s decision-making Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
process, making it a valuable tool for optimizing customer retention Colbrook, M. J., Antun, V., & Hansen, A. C. (2022). The difficulty of computing
strategies in the telecommunications sector. stable and accurate neural networks: On the barriers of deep learning and Smale’s
18th problem. Proceedings of the National Academy of Sciences, 119(12), Article
e2107151119.
Funding
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20,
273–297.
This work receive no funding. Coussement, K., & Van den Poel, D. (2008). Churn prediction in subscription
services: An application of support vector machines while comparing two
Ethical approval parameter-selection techniques. Expert Systems with Applications, 34(1), 313–327.
Dasgupta, K., Singh, R., Viswanathan, B., Chakraborty, D., Mukherjea, S., Nanavati, A.
A., et al. (2008). Social ties and their relevance to churn in mobile telecom
All data used in this work is freely available online. No other aspect networks. In Proceedings of the 11th international conference on extending database
of this work cause ethical issues. technology: advances in database technology (pp. 668–677).

8
S.S. Poudel et al. Machine Learning with Applications 17 (2024) 100567

Dubey, H., & Pudi, V. (2013). Class based weighted k-nearest neighbor over imbalance Mittal, B., & Lassar, W. M. (1998). Why do customers switch? The dynamics of
dataset. In Pacific-Asia conference on knowledge discovery and data mining (pp. satisfaction versus loyalty. Journal of Services Marketing, 12(3), 177–194.
305–316). Springer. Moayer, S., & Gardner, S. (2012). Integration of data mining within a strategic
Dwiyanti, E., Ardiyanti, A., et al. (2016). Handling imbalanced data in churn prediction knowledge management framework. International Journal of Advanced Computer
using rusboost and feature selection (case study: Pt. telekomunikasi Indonesia Science and Applications, 3(8).
regional 7). In International conference on soft computing and data mining (pp. Momin, S., Bohra, T., & Raut, P. (2020). Prediction of customer churn using machine
376–385). Springer. learning. In EAI international conference on big data innovation for sustainable cognitive
Ebrah, K., & Elnasir, S. (2019). Churn prediction using machine learning and recom- computing (pp. 203–212). Springer.
mendations plans for telecoms. Journal of Computer and Communications, 7(11), 3. Naz, N. A., Shoaib, U., & Shahzad Sarfraz, M. (2018). A review on customer
http://dx.doi.org/10.4236/jcc.2019.711003. churn prediction data mining modeling techniques. Indian Journal of Science and
Emmert-Streib, F., Yli-Harja, O., & Dehmer, M. (2020). Explainable artificial intelligence Technology, 11(27), 1–27.
and machine learning: A reality rooted perspective. Wiley Interdisciplinary Reviews: Nguyen, N., & LeBlanc, G. (1998). The mediating role of corporate image on customers’
Data Mining and Knowledge Discovery, 10(6), Article e1368. retention decisions: an investigation in financial services. International Journal of
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Bank Marketing.
Annals of Statistics, 1189–1232. Pushpa, S. (2012). An efficient method of building the telecom social network for churn
Fujo, S. W., Subramanian, S., Khder, M. A., et al. (2022). Customer churn prediction in prediction. International Journal of Data Mining & Knowled Management Process, 2(3),
telecommunication industry using deep learning. Information Sciences Letters, 11(1), 31–39.
24. Qureshi, S. A., Rehman, A. S., Qamar, A. M., Kamal, A., & Rehman, A. (2013).
Geiler, L., Affeldt, S., & Nadif, M. (2022). A survey on machine learning methods for Telecommunication subscribers’ churn prediction model using machine learning.
churn prediction. International Journal of Data Science and Analytics, 1–26. In Eighth international conference on digital information management (pp. 131–136).
Gerpott, T. J., Rams, W., & Schindler, A. (2001). Customer retention, loyalty, IEEE.
and satisfaction in the German mobile cellular telecommunications market. Reichheld, F. F., & Sasser, W. E. (1990). Zero defeofions: Quoliiy comes to services.
Telecommunications Policy, 25(4), 249–269. Harvard Business Review, 68(5), 105–111.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. Reinartz, W. J., & Kumar, V. (2003). The impact of customer relationship characteristics
Hadden, J., Tiwari, A., Roy, R., & Ruta, D. (2007). Computer assisted customer churn on profitable lifetime duration. Journal of Marketing, 67(1), 77–99.
management: State-of-the-art and future trends. Computers & Operations Research, Saha, L., et al. (2023). Deep churn prediction method for telecommunication industry.
34(10), 2902–2917. Sustainability, 15(5), 4543.
Hawley, D. (2003). International wireless churn management: research and recom- Seymen, O. F., Dogan, O., & Hiziroglu, A. (2020). Customer churn prediction using
mendations. Yankee Group report, (June), URL http://www.ams.com/cme/pdfs/ deep learning. In International conference on soft computing and pattern recognition
yankeechurnstudy.pdf. (Accessed January 2006). (pp. 520–529). Springer.
Hitt, L. M., & Frei, F. X. (2002). Do better customers utilize electronic distribution Shrestha, S. M., & Shakya, A. (2022). A customer churn prediction model using
channels? The case of PC banking. Management Science, 48(6), 732–748. XGBoost for the telecommunication industry in Nepal. Procedia Computer Science,
Hosmer, D. W., Jr., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression: 215, 652–661.
vol. 398, John Wiley & Sons. Sun, Y., Wong, A. K., & Kamel, M. S. (2009). Classification of imbalanced data: A
Huang, B., Kechadi, M. T., & Buckley, B. (2012). Customer churn prediction in review. International Journal of Pattern Recognition and Artificial Intelligence, 23(04),
telecommunications. Expert Systems with Applications, 39(1), 1414–1425. 687–719.
Huang, Y., Zhu, F., Yuan, M., Deng, K., Li, Y., Ni, B., et al. (2015). Telco churn Tan, S. (2005). Neighbor-weighted k-nearest neighbor for unbalanced text corpus. Expert
prediction with big data. In Proceedings of the 2015 ACM SIGMOD international Systems with Applications, 28(4), 667–671.
conference on management of data (pp. 607–618). Umayaparvathi, V., & Iyakutti, K. (2017). Automated feature selection and churn
Ji, H., Zhu, J., Wang, X., Shi, C., Wang, B., Tan, X., et al. (2021). Who you would like to prediction using deep learning models. International Research Journal of Engineering
share with? a study of share recommendation in social e-commerce. In Proceedings and Technology (IRJET), 4(3), 1846–1854.
of the AAAI conference on artificial intelligence, vol. 35, no. 1 (pp. 232–239). Van den Poel, D., & Lariviere, B. (2004). Customer attrition analysis for financial
Johansson, F., Shalit, U., & Sontag, D. (2016). Learning representations for counter- services using proportional hazard models. European Journal of Operational Research,
factual inference. In International conference on machine learning (pp. 3020–3029). 157(1), 196–217.
PMLR. Varki, S., & Colgate, M. (2001). The role of price perceptions in an integrated model
Kong, J., Kowalczyk, W., Menzel, S., & Bäck, T. (2020). Improving imbalanced of behavioral intentions. Journal of Service Research, 3(3), 232–240.
classification by anomaly detection. In International conference on parallel problem Wei, C.-P., & Chiu, I.-T. (2002). Turning telecommunications call details to churn
solving from nature (pp. 512–523). Springer. prediction: a data mining approach. Expert Systems with Applications, 23(2),
Lazarov, V., & Capota, M. (2007). Churn prediction. Business Analysis Course. TUM 103–112.
Computer Science, 33, 34. Xu, F., Zhang, G., Yuan, Y., Huang, H., Yang, D., Jin, D., et al. (2021). Understanding
Leung, C. K., Pazdor, A. G., & Souza, J. (2021). Explainable artificial intelligence for the invitation acceptance in agent-initiated social e-commerce. In Proceedings of the
data science on customer churn. In 2021 IEEE 8th international conference on data international AAAI conference on web and social media, vol. 15 (pp. 820–829).
science and advanced analytics (pp. 1–10). IEEE. Yabas, U., Cankaya, H. C., & Ince, T. (2012). Customer churn prediction for telecom
Liao, C.-H., & Lien, C.-Y. (2012). Measuring the technology gap of APEC integrated services. In 2012 IEEE 36th annual computer software and applications conference (pp.
telecommunications operators. Telecommunications Policy, 36(10–11), 989–996. 358–359). http://dx.doi.org/10.1109/COMPSAC.2012.54.
Liu, X., Xie, M., Wen, X., Chen, R., Ge, Y., Duffield, N., et al. (2018). A semi-supervised Yang, Z., & Peterson, R. T. (2004). Customer perceived value, satisfaction, and loyalty:
and inductive embedding model for churn prediction of large-scale mobile games. The role of switching costs. Psychology & Marketing, 21(10), 799–822.
In 2018 ieee international conference on data mining (pp. 277–286). IEEE. Yang, C., Shi, X., Jie, L., & Han, J. (2018). I know you’ll be back: Interpretable new user
Lundberg, S. M., & Lee, S.-I. (2017a). A unified approach to interpreting model clustering and churn prediction on a mobile social application. In Proceedings of
predictions. Advances in Neural Information Processing Systems, 30. the 24th ACM SIGKDD international conference on knowledge discovery & data mining
Lundberg, S. M., & Lee, S.-I. (2017b). A unified approach to interpreting model (pp. 914–922).
predictions. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, Yoon, J., Jordon, J., & Van Der Schaar, M. (2018). GANITE: Estimation of individualized
S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing treatment effects using generative adversarial nets. In International conference on
systems 30 (pp. 4765–4774). Curran Associates, Inc.. learning representations.
Maxham, J. G., III (2001). Service recovery’s influence on consumer satisfaction, Zeithaml, V. A., Berry, L. L., & Parasuraman, A. (1996). The behavioral consequences
positive word-of-mouth, and purchase intentions. Journal of Business Research, of service quality. Journal of Marketing, 60(2), 31–46.
54(1), 11–24. Zhang, G., Zeng, J., Zhao, Z., Jin, D., & Li, Y. (2022). A counterfactual modeling
Miguéis, V. L., Van den Poel, D., Camanho, A. S., & e Cunha, J. F. (2012). Modeling framework for churn prediction. In Proceedings of the fifteenth ACM international
partial customer churn: On the value of first product-category purchase sequences. conference on web search and data mining (pp. 1424–1432).
Expert Systems with Applications, 39(12), 11250–11256. Zhao, L., Gao, Q., Dong, X., Dong, A., & Dong, X. (2017). K-local maximum margin
Mitrović, S., & De Weerdt, J. (2020). Churn modeling with probabilistic meta paths- feature extraction algorithm for churn prediction in telecom. Cluster Computing, 20,
based representation learning. Information Processing & Management, 57(2), Article 1401–1409.
102052. Zhu, B., Baesens, B., & vanden Broucke, S. K. (2017). An empirical comparison
Mittal, V., & Kamakura, W. A. (2001). Satisfaction, repurchase intent, and repurchase of techniques for the class imbalance problem in churn prediction. Information
behavior: Investigating the moderating effect of customer characteristics. Journal Sciences, 408, 84–99.
of Marketing Research, 38(1), 131–142.

Virtual Engagement: Managing Seamless Marketing with Technology
From Everand
Virtual Engagement: Managing Seamless Marketing with Technology
Rajagopal
No ratings yet
Non-Equilibrium Statistical Mechanics
100% (2)
Non-Equilibrium Statistical Mechanics
337 pages
Optimization in Planning and Operation of Electric Power System
100% (1)
Optimization in Planning and Operation of Electric Power System
362 pages
Classification of Customer Churn Prediction Model For Telecommunication Industry Using Analysis of Variance
No ratings yet
Classification of Customer Churn Prediction Model For Telecommunication Industry Using Analysis of Variance
7 pages
Customer Churn Prediction in Telcom Industry Using Data Mining Techniques
No ratings yet
Customer Churn Prediction in Telcom Industry Using Data Mining Techniques
14 pages
Data Mining 101: Core Concepts and Algorithms
From Everand
Data Mining 101: Core Concepts and Algorithms
Swarnalata Verma
No ratings yet
Customer Churn Prediction
No ratings yet
Customer Churn Prediction
6 pages
Journal Pone 0278095
No ratings yet
Journal Pone 0278095
21 pages
Customerchurnprediction Systema Machinelearning
No ratings yet
Customerchurnprediction Systema Machinelearning
24 pages
Customer Churn Telecom
No ratings yet
Customer Churn Telecom
35 pages
Time Series Analysis (Stat 2042) Lecture Note.
No ratings yet
Time Series Analysis (Stat 2042) Lecture Note.
134 pages
Edge Computing Applications in Supply Chain Management
From Everand
Edge Computing Applications in Supply Chain Management
Bo Li
No ratings yet
Predicting Customer Churn A Systematic Literature Review
No ratings yet
Predicting Customer Churn A Systematic Literature Review
22 pages
Ahmad2019 Article CustomerChurnPredictionInTelec PDF
No ratings yet
Ahmad2019 Article CustomerChurnPredictionInTelec PDF
24 pages
Churn Rate DPV
No ratings yet
Churn Rate DPV
15 pages
Paper3 On Chrun Prediction
No ratings yet
Paper3 On Chrun Prediction
16 pages
A Review On Machine Learning Methods For Customer Churn Prediction and Recommendations For Business Practitioners
No ratings yet
A Review On Machine Learning Methods For Customer Churn Prediction and Recommendations For Business Practitioners
30 pages
CSITRJ1170 Final Paper V1
No ratings yet
CSITRJ1170 Final Paper V1
16 pages
Customer Churn Prediction Using Machine Learning Approaches
No ratings yet
Customer Churn Prediction Using Machine Learning Approaches
7 pages
Reading 3
No ratings yet
Reading 3
19 pages
Speech F
No ratings yet
Speech F
16 pages
Computing Efficient Features Using Rough Set Theory Combined With Ensemble Classification Techniques To Improve The Customer Churn Prediction in Telecommunication Sector
No ratings yet
Computing Efficient Features Using Rough Set Theory Combined With Ensemble Classification Techniques To Improve The Customer Churn Prediction in Telecommunication Sector
22 pages
2024 Article 63750
No ratings yet
2024 Article 63750
13 pages
Customer Churn Prediction in The Telecommunication Sector Using A Rough Set Approach
No ratings yet
Customer Churn Prediction in The Telecommunication Sector Using A Rough Set Approach
13 pages
Applying Data Mining To Telecom Churn Ma
No ratings yet
Applying Data Mining To Telecom Churn Ma
10 pages
Second Law of Thermodynamics
100% (1)
Second Law of Thermodynamics
18 pages
Customer Churn Prediction in Telecom Sector Using Machine Learning Techniques
No ratings yet
Customer Churn Prediction in Telecom Sector Using Machine Learning Techniques
16 pages
Enhancing Customer Retention Strategies Predicting Churn Rate in Telecom Sectors Using Machine Learning Ensemble Techniques
No ratings yet
Enhancing Customer Retention Strategies Predicting Churn Rate in Telecom Sectors Using Machine Learning Ensemble Techniques
6 pages
Comparative Analysis of Predictive Models For Customer Churn Prediction in The Telecommunication Industry
No ratings yet
Comparative Analysis of Predictive Models For Customer Churn Prediction in The Telecommunication Industry
6 pages
Algorithms 17 00231
No ratings yet
Algorithms 17 00231
21 pages
Enhancing Customer Churn Prediction in Telecommunication With CNN-Gradient Boosting Machine
No ratings yet
Enhancing Customer Churn Prediction in Telecommunication With CNN-Gradient Boosting Machine
6 pages
1 s2.0 S2590123024014208 Main
No ratings yet
1 s2.0 S2590123024014208 Main
12 pages
Token ID Ain20250117003-1
No ratings yet
Token ID Ain20250117003-1
14 pages
2 Customer Churning Analysis Using Machine Learning Algorithms
No ratings yet
2 Customer Churning Analysis Using Machine Learning Algorithms
10 pages
Customer Churn Prediction Using Machine Learning: D. Deepika, Nihal Chandra
100% (1)
Customer Churn Prediction Using Machine Learning: D. Deepika, Nihal Chandra
14 pages
Customer Churn in Subscription Business Model-Pred
No ratings yet
Customer Churn in Subscription Business Model-Pred
7 pages
IJIKMv18p087 105tran8783
No ratings yet
IJIKMv18p087 105tran8783
20 pages
5-LP Simplex (CJ-ZJ Tableau)
No ratings yet
5-LP Simplex (CJ-ZJ Tableau)
7 pages
Customer Churn Prediction System: A Machine Learning Approach
No ratings yet
Customer Churn Prediction System: A Machine Learning Approach
24 pages
Churn Prediction in Telecom Using Machine Learning in R
No ratings yet
Churn Prediction in Telecom Using Machine Learning in R
9 pages
Customer Churn Prediction Using Machine Learning
No ratings yet
Customer Churn Prediction Using Machine Learning
7 pages
Clustering Comparison of Customer Attrition Dataset Using Machine Learning Algorithms
No ratings yet
Clustering Comparison of Customer Attrition Dataset Using Machine Learning Algorithms
5 pages
AI Unit 2 - Constarint Sattisfaction, Means End Analysis, Adversial Search
No ratings yet
AI Unit 2 - Constarint Sattisfaction, Means End Analysis, Adversial Search
42 pages
1 7 Machine Learning Approaches For Customer Churn Prediction in Telecommunications
No ratings yet
1 7 Machine Learning Approaches For Customer Churn Prediction in Telecommunications
7 pages
Output 4
No ratings yet
Output 4
5 pages
Duplichecker Plagiarism Report
No ratings yet
Duplichecker Plagiarism Report
2 pages
Behavioral Attributes and Financial Churn Prediction: Regulararticle Open Access
No ratings yet
Behavioral Attributes and Financial Churn Prediction: Regulararticle Open Access
18 pages
Lec Note - Robust Control - 1501
No ratings yet
Lec Note - Robust Control - 1501
83 pages
Customer Churn Prediction Using Machine Learning Algorithms
No ratings yet
Customer Churn Prediction Using Machine Learning Algorithms
6 pages
Bharad Waj 2018
No ratings yet
Bharad Waj 2018
3 pages
Efficacy of Customer Churn Prediction System
No ratings yet
Efficacy of Customer Churn Prediction System
8 pages
Décortication Article 1
No ratings yet
Décortication Article 1
4 pages
Research Churn
No ratings yet
Research Churn
4 pages
Abhishekj Uvatkar
No ratings yet
Abhishekj Uvatkar
4 pages
Ali Tamaddoni Jahromi, Mehrad Moeini, Issar Akbari, Aram Akbarzadeh
No ratings yet
Ali Tamaddoni Jahromi, Mehrad Moeini, Issar Akbari, Aram Akbarzadeh
11 pages
Anticipating Customer Churn in Telecommunication Using Machine Learning Algorithms For Customer Retention
No ratings yet
Anticipating Customer Churn in Telecommunication Using Machine Learning Algorithms For Customer Retention
7 pages
Paper Published
No ratings yet
Paper Published
5 pages
A Review On Churn Prediction and Customer Segmentation Using Machine Learning
No ratings yet
A Review On Churn Prediction and Customer Segmentation Using Machine Learning
5 pages
Customer Churn Prediction in Telecommunication
No ratings yet
Customer Churn Prediction in Telecommunication
13 pages
Document 2 1
No ratings yet
Document 2 1
1 page
CHURNFORGE Research Paper Kajal
No ratings yet
CHURNFORGE Research Paper Kajal
6 pages
Customer Churn Prediction
No ratings yet
Customer Churn Prediction
5 pages
(IJCST-V11I1P5) :jitendra Maan, Harsh Maan
No ratings yet
(IJCST-V11I1P5) :jitendra Maan, Harsh Maan
6 pages
Analysis of Customer Churn Prediction in Telecom Industry Using Decision Trees and Logistic Regression
No ratings yet
Analysis of Customer Churn Prediction in Telecom Industry Using Decision Trees and Logistic Regression
4 pages
2 Uninformed Search
No ratings yet
2 Uninformed Search
41 pages
A Proposed Churn Prediction Model: Essam Shaaban, Yehia Helmy, Ayman Khedr, Mona Nasr
No ratings yet
A Proposed Churn Prediction Model: Essam Shaaban, Yehia Helmy, Ayman Khedr, Mona Nasr
5 pages
Customer Churn Analysis in Telecom Industry
No ratings yet
Customer Churn Analysis in Telecom Industry
6 pages
Mathematical Modeling
No ratings yet
Mathematical Modeling
17 pages
Control Engineering (10me82)
No ratings yet
Control Engineering (10me82)
122 pages
Customer Churn Analysis and Prediction
No ratings yet
Customer Churn Analysis and Prediction
4 pages
LAB # 01 Digital Sequences (Unit Step, Unit Impulse) : Background Review
No ratings yet
LAB # 01 Digital Sequences (Unit Step, Unit Impulse) : Background Review
6 pages
CD
No ratings yet
CD
5 pages
Ones Complement
No ratings yet
Ones Complement
7 pages
KMeansPP Soda
No ratings yet
KMeansPP Soda
9 pages
Pid Control
No ratings yet
Pid Control
19 pages
Kernel Support and Resistance
No ratings yet
Kernel Support and Resistance
2 pages
Bottleneck Model: Terminology and Symbols
No ratings yet
Bottleneck Model: Terminology and Symbols
5 pages
Graph Theory
No ratings yet
Graph Theory
18 pages
ASME VIII 2 Permissible Cycle Life
No ratings yet
ASME VIII 2 Permissible Cycle Life
5 pages
M269-Final-By ISA-5th Edition
No ratings yet
M269-Final-By ISA-5th Edition
110 pages
Assymetric Key Cryptography
No ratings yet
Assymetric Key Cryptography
4 pages
Branch Net
No ratings yet
Branch Net
13 pages
Controlability & Reachability
No ratings yet
Controlability & Reachability
3 pages
Bspline
No ratings yet
Bspline
12 pages
Quantum Computing A Tool For Zero Trust Wireless Networks
No ratings yet
Quantum Computing A Tool For Zero Trust Wireless Networks
9 pages
Review On Online Feature Selection
No ratings yet
Review On Online Feature Selection
4 pages
Week1 Exercises
No ratings yet
Week1 Exercises
3 pages
A Very Brief Introduction To Machine Learning With Applications To Communication Systems
No ratings yet
A Very Brief Introduction To Machine Learning With Applications To Communication Systems
20 pages
Practice Set 2
No ratings yet
Practice Set 2
5 pages
Hw5 C2a Vergara
No ratings yet
Hw5 C2a Vergara
5 pages

1 s2.0 S2666827024000434 Main

Uploaded by

1 s2.0 S2666827024000434 Main

Uploaded by

Machine Learning with Applications 17 (2024) 100567

Contents lists available at ScienceDirect

Machine Learning with Applications

Explaining customer churn prediction in telecom industry using tabular

ARTICLE INFO ABSTRACT

• Statistical Test: For a fair assessment of whether any model’s Table 1

This provide a rigorous validation for model selection. 1 CustomerID object

Fig. 2. Beeswarm plot for GBM model.

The telecommunications industry is at the forefront of customer-

You might also like