
Hybrid ML for Online Insurance CX


Decision Analytics Journal 11 (2024) 100452

Contents lists available at ScienceDirect

Decision Analytics Journal


journal homepage: www.elsevier.com/locate/dajour

A hybrid machine learning with process analytics for predicting customer experience in online insurance services industry
Fatemeh Akhavan, Erfan Hassannayebi ∗
Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran

ARTICLE INFO

Keywords:
Process intelligence
Business process mining
Predictive analytics
Customer experience
Online insurance

ABSTRACT

With the growth of online service providers, it is essential to innovate and improve service levels by predicting possible process outcomes. This study estimates the customer satisfaction level based on customer experience analysis. In doing so, we answer recent calls for research about a more thorough exploration of customer behavior using predictive process monitoring techniques. In particular, a hybrid framework of supervised/unsupervised machine learning methods is proposed to predict the outcomes of customers' experiences while dealing with the problem of high intra-class variance. This problem occurs due to the large dispersion of traces identified in the customer journeys. In this regard, customer journeys are first matched with the event log format so that a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering technique can be applied based on the similarity between the customer journeys. After summarizing the journeys by removing low-value activities, the multi-class decision tree classification method is applied, and the level of customer satisfaction is predicted. Due to the imbalanced nature of the data, oversampling for imbalanced classification is applied to achieve good results in accuracy indicators such as recall, precision, and F1-score. Finally, the proposed approach has been evaluated on a real-life event log, BPI Challenge 2016, to investigate unsatisfied customers. The results of the machine learning models on the test data show a high degree of accuracy in predicting customer dissatisfaction.

1. Introduction

Process intelligence is a powerful technique in the sub-disciplines of both artificial intelligence (AI) and business process management (BPM) [1]. It allows gaining insights into business process performance and behavior using historical records from event logs. Process mining is an essential category of process intelligence that targets existing processes in an organization and provides solutions to subject matter experts for modeling, documenting, and collaborating to re-engineer an organization's operational processes [2,3].

In customer services, process intelligence can be used to monitor, analyze, and optimize all customer interactions including phone calls, email, chatbots, and social media. Process intelligence can provide insights into customer behavior, preferences, and pain points. The benefits of using process intelligence in customer services include the ability to improve customer satisfaction, optimize business processes, better allocate resources, and reduce churn. In this context, customer journey mapping is a traditional technique used to understand the end-to-end customer experience and identify any pain points or areas for improvement. On the other hand, customer journey analytics refers to the science of analyzing customer behavior to measure its impact on business outcomes based on real data. The entire sequence of activities that customers go through when interacting with a business is known as the customer journey [4].

Nowadays, customer journeys are applied in many fields and types of analysis. One of the applications of customer journey analytics is the extraction of journey maps using data mining approaches [5]. With the increase of online service providers, customer behavior has become more complex, and thus the extracted journey map can become incomprehensible. To solve this problem, abstracting customer journey maps has been considered one of the main aspects of customer journey analysis [6]. In addition, recommender systems have recently been widely used to improve business key performance indicators (KPIs). Recommender systems [9] use customer behavior as input and provide personalized recommendations, which can be related to various decision-making processes [7,8], such as what to buy. In this regard, some recent articles have focused on improving recommender systems using customer journey analysis methods [4,10,11]. Marketing professionals can better understand what customers want and how to engage with them by mapping the consumer journey [12]. Also, because customers are attracted through different communication channels, customer journey analysis methods are used to improve the attraction and effectiveness of advertising strategies [13].

∗ Corresponding author.
E-mail addresses: [email protected] (F. Akhavan), [email protected] (E. Hassannayebi).

https://doi.org/10.1016/j.dajour.2024.100452
Received 8 June 2023; Received in revised form 24 March 2024; Accepted 25 March 2024
Available online 27 March 2024
2772-6622/© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
F. Akhavan and E. Hassannayebi Decision Analytics Journal 11 (2024) 100452

Customer satisfaction analysis plays a pivotal role in enhancing the quality of services provided by industries. In today's digital age, where billions of individuals are using online platforms, each piece of information carries emotions and sentiments, whether it reflects a positive or a negative experience during a customer journey. In this situation, companies face fierce competition and must continually strive to attract and retain customers. To achieve this, understanding customer needs and levels of satisfaction during their journeys is paramount. Traditional methods of collecting feedback, such as paper forms or online surveys, have limitations, including low response rates and potential biases. In contrast, customer journeys do not have these limitations [14].

The insurance service industry has always been one of the most important B2C (business-to-consumer) industries in the world. Businesses in this field are in constant communication with customers. The complexity of customer behavior has made the question of how to interact with customers a matter of debate in this industry. Businesses are trying to build a competitive advantage by creating a good experience for their customers. In this regard, analyzing customers and predicting their behavior has become a popular research topic [15].

The insurance industry faces various challenges that can be addressed through innovative methods. The following issues in the insurance sector, along with suggested methods, are based on prior research [16]:

• Fraud Detection: Detecting fraudulent claims is a significant challenge for insurance companies.
• Claim Analysis: Distinguishing between genuine and fraudulent claims is crucial for efficient operations.
• Data Management: Handling and analyzing vast amounts of data efficiently is essential for decision-making.
• Customer Profiling: Understanding client behavior and preferences is key to offering tailored insurance solutions.
• Data Utilization: Leveraging the full potential of available data for improved decision-making and operational efficiency.

Suggested Methods:

• Machine Learning Algorithms: Utilize ML algorithms for fraud detection and claim analysis to enhance accuracy and efficiency.
• Data Mining Techniques: Employ data mining for fraud detection and customer profiling to extract valuable insights.
• Exploratory Data Analysis (EDA): Conduct EDA to identify meaningful factors for claim filing and acceptance, aiding in decision-making.
• Feature Selection: Implement feature selection techniques to reduce data dimensionality and improve analysis results.
• Predictive Analytics: Use predictive analytics for claims processing, underwriting analysis, and customer behavior insights.
• Enhanced Data Utilization: Increase data utilization through ML to automate processes, reduce claim handling costs, and improve customer satisfaction.

By addressing these issues with the suggested methods, insurance companies can enhance their operations, improve decision-making, and better serve their clients in a rapidly evolving industry.

Process mining can be a powerful approach for exploring customer journeys, as it allows for the visualization and analysis of the actual customer behavior and interactions with the system. The value-creating applications of process mining for organizations are process visualization, identifying and monitoring key performance indicators of processes, and making decisions and taking corrective actions based on them. Value here means efficiency and improvement of current processes, as well as financial and non-financial benefits such as customer satisfaction [17]. Here are some ways that process mining can play a role in exploring customer journeys:

Visualizing the customer journey: Process mining can provide a visual representation of the customer journey, highlighting the different touchpoints and how customers navigate through the system. This can help to identify bottlenecks and areas for improvement.

Identifying customer behavior patterns: With process mining, it is possible to analyze the behavior of individual customers or groups of customers, such as identifying which pages they visit most often or how they interact with the website. This can help to understand the customer journey more deeply and identify opportunities for improvement.

Measuring the efficiency of the customer journey: Process mining can help quantify the efficiency of a given customer journey, such as the time it takes for customers to complete certain tasks or the number of steps they must go through to complete a transaction. This can help to streamline the customer journey and improve its overall effectiveness.

Analyzing the impact of changes: By analyzing the customer journey before and after changes are made, process mining can help to determine the effectiveness of those changes and identify further areas for improvement.

Besides the above-mentioned use cases of customer journey analytics, novel applications of process mining techniques are helping decision-makers to go beyond traditional process discovery and to involve machine learning algorithms to enhance the user experience further. For example, customer journey prediction is the process of analyzing customer behavior and data to predict the potential path that a customer is likely to take in their interaction with a brand or company. This is especially relevant in the context of e-commerce and online businesses such as insurance services, where customers interact with companies through multiple channels such as social media, email, and websites. In this context, predictive modeling refers to building predictive models based on data analysis to identify potential customer journeys and make predictions about future behavior. This paper, therefore, focuses on the following research questions:

− RQ1: According to previous studies, how can supervised and unsupervised machine learning algorithms be used to analyze customer journeys and their patterns to predict the behavior of website users?
− RQ2: According to the history of previous users' behavior, how can process mining and predictive process monitoring algorithms be used to predict whether or not users will submit a complaint?

It is crucial to highlight the challenges that necessitated the solutions presented. The challenges faced in predictive business process monitoring, particularly in the realm of customer journey analysis, are multifaceted. One significant challenge addressed by this study is the high intra-class variance, stemming from the substantial dispersion of traces identified in customer journeys. This variance poses a fundamental obstacle to accurate outcome prediction and customer satisfaction assessment. Additionally, the complexity of customer behavior in online service environments, characterized by diverse communication channels and intricate interaction patterns, presents a challenge in extracting meaningful insights from customer journey data.

Moreover, the imbalanced nature of the data, where certain outcomes are underrepresented, further complicates the predictive modeling process. These challenges underscore the necessity for a comprehensive approach that combines supervised and unsupervised machine learning methods to overcome the limitations of traditional predictive models and enhance the accuracy and effectiveness of outcome predictions in customer experience analysis. By addressing these challenges, the research contributions outlined in the paper aim to bridge critical gaps in existing methodologies and offer a proactive and comprehensive framework for improving customer satisfaction analysis and predictive process monitoring in the online service industry.

In summary, the research contributions are fourfold. First, a hybrid machine learning method extends existing studies by proposing a novel predictive model to fill the existing research gaps and overcome the problem of high intra-class variance. This framework considers
the similarity between journeys and simultaneously employs decision tree classification as the supervised learning method and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering as the unsupervised learning method. Second, the analysis contributes to moving from reactive actions to proactive behavior. By predicting the outcome of a running case and identifying journeys that cause failure, organizations can plan corrective actions and improve the success rate. Third, this study extends the scope of traditional outcome prediction with binary labels by considering a range of outcomes, e.g., a customer satisfaction level ranging from 0 to 5. Thus, a multi-class decision tree is employed as the classification method, and the customer satisfaction level is predicted. Fourth, combining statistical feature selection methods reduces the dimensions of the created clusters and increases the prediction accuracy. The research contributions, in terms of both application and modeling framework, are framed to convey to the reader the distinguishing aspects of the proposed process mining method.

The remainder of this paper is organized as follows: Section 2 discusses the literature review on process mining and predictive process monitoring, positioning the study within the scope of the application of predictive process monitoring methods in predicting the measures of processes. Some important concepts used in the definition of the problem are described in Section 3. The problem is described in Section 4 and the solution method is presented in Section 5, followed by a summary of the outcomes and research findings. The case study is presented in Section 6. Finally, Section 7 draws findings and directions for future work.

2. Literature review

This section provides background on process mining and predictive process monitoring. Research on predictive process monitoring and machine learning methods has recently received considerable attention from process analysts and business owners. According to a systematic review conducted in 2020 [18], the number of published articles on predictive process monitoring has grown since the first decade of the 21st century. The rest of this section reviews previous research on process mining and predictive process monitoring.

The concept of predictive process monitoring was introduced in 2011 by van der Aalst et al. [19]. They constructed an annotated transition system (ATS) and used it for prediction purposes, focused on completion time prediction, and implemented the proposed approach on the event log of a municipality in the Netherlands. Studies on the use of predictive process monitoring are categorized by the type of prediction task. The following subsections review the relevant studies based on these categories.

2.1. Next-event prediction models

Nowadays, predicting the next step of a customer journey by increasing the understanding of customer behavior is one of the factors influencing the success of organizations [11]. Organizations can use this information to develop a recommender system and improve customer experience. Researchers have looked into the use of neural networks in business process monitoring due to the popularity of deep learning. Most state-of-the-art next activity/event(s) prediction approaches use Long Short-Term Memory (LSTM) cells in the prediction phase. Deep learning methods often use a one-hot encoding feature vector as input. Pasquadibisceglie et al. [20] proposed a novel prediction approach based on one-hot encoding with LSTM as the learning algorithm. Jalayer et al. [21] similarly predicted the next activity, but used the SoftMax function to calculate the importance of the activities and selected the most important features to enter the learning algorithm. One of the challenges of predicting the next activity/event is recurring prediction. That is, the following steps can be predicted sequentially by using one predicted step. In this regard, Pauwels et al. [22] introduced a basic neural network method that can incrementally predict the next activity of sequences.

2.2. Outcome-oriented predictive process monitoring

Outcome-oriented prediction tasks, such as predicting the failure of a process [23] or whether a purchase will be made, can give insight into the business and help business owners in the decision-making process. In recent years, different approaches have been proposed to predict the outcome of a business process. Some studies, e.g., Kim et al. [24], Elkhawaga et al. [25], Gusmao et al. [26], and Lee et al. [27], have focused on selecting appropriate features to predict the outcome with higher accuracy. In this regard, a resource-aware feature selection method that identifies features related to resources was presented by Kim et al. [24]. Also, Elkhawaga et al. [25] extracted the appropriate features by assigning importance to the activities using the Shapley Additive Explanations (SHAP) and Partial Dependence Plot (PDP) methods. Gusmao et al. [26] addressed the application of this method and analyzed customer journeys by detecting fraud in the energy industry. They used Pareto analysis in the feature selection phase. Lee et al. [27] used Repeated Incremental Pruning to Produce Error Reduction (RIPPER) in the training phase and improved prediction performance. Francescomarino et al. [28] used Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Model-Based Clustering (MBASED) after frequency-based encoding to predict the outcome of the tumor detection process. Since data privacy against potential attacks is vital in the healthcare industry, Rafiei et al. [29] also investigated the challenges of patient privacy protection techniques, focusing on group-based techniques in this industry. Francescomarino et al. [30] also employed the Canopy algorithm as an incremental-based clustering method after using index-based encoding. They implemented Hoeffding Tree (HT) and Adaptive Hoeffding Tree (AT) techniques to improve the prediction results. Moreover, Wang et al. [31] proposed LSTM-based approaches to overcome the time challenge in training an incremental model. Pasquadibisceglie et al. [32] focused on the encoding phase of a prediction process and trained a Convolutional Neural Network (CNN) model to predict the outcome of a running case. In addition, Weinzierl et al. [33] identified temporary solutions (workarounds) according to the registered incidents, using Boolean encoding and CNN classification methods.

2.3. Remaining and completion time prediction

One of the indicators for measuring the quality of service providers such as banks and information technology companies is the waiting time in processes such as loan application and incident and problem troubleshooting [34]. So, in addition to outcome-oriented approaches, time-related prediction tasks have many applications in identifying deviations, such as preventing breaches of service level agreements, which can increase customer satisfaction, and identifying delays and the bottlenecks that cause them. The most widely used methods of predicting process execution time are summarized in a systematic review published by Márquez-Chamorro et al. [12]. These methods fall into two groups: statistical techniques and machine learning methods. Polato et al. [35] used regression models to predict the remaining time of a running case by capturing the stability of the primary process. On the other hand, Gunnarsson et al. [36] implemented LSTM as a machine learning method for predicting the completion time of a luggage handling process in an airport. Some studies used window-based encoding for predicting the time-related indicators [36,37]. In this method, the size of the window is specified, and the number of events to be encoded and entered into the algorithm is reduced. One of the applications of predicting the completion time of processes is to prevent the breach
of service level agreements (SLAs) in the information technology (IT) industry. In this context, Mehdiyev et al. [38] predicted an SLA breach in the Volvo Group using a K-means clustering method. They extracted the effective features by implementing a Deep Neural Network (DNN) and Surrogate Decision Trees (SDT). Also, Costache et al. [39] used the STEP method for the encoding phase and predicted the workload by focusing on the intra-case features.

In the systematic review conducted by Maita et al. [40], prediction has been considered one of the main topics in the field of process mining, in which traditional techniques based on process graphs have been of more interest than machine learning methods.

Table 1 summarizes the studies in the predictive process monitoring literature. The prediction tasks have been performed by using classification methods such as decision trees, random forests, and neural networks. As can be seen, many of these studies answer the questions: What is the outcome of a process? What is the next activity/event? When is the completion time of a running case?

Most of the recent studies have used only supervised learning methods to make predictions. Since real-world processes are not fully structured, mining a process model that covers all possible traces will result in a complex, spaghetti-like process model. A prediction model built on it will then be error-prone and of little use. One of the proposed solutions to this problem is the use of clustering methods on different process paths [41].

Unsupervised learning methods such as clustering are important because different processes are extracted due to the various traces in an event log. Some of these execution paths are more similar and are located in the same cluster, and classification models are fitted to each cluster [4].

Due to the significant diversity in the journeys traveled by customers, hybrid supervised/unsupervised machine learning methods are used to deal with the problem of high intra-class variance [42]. The variation between multiple traces of a label is the intra-class variance that defines the performance of a model. A low intra-class variance indicates the repeatability of the test, which means closeness between the results of successive tests [43]. Furthermore, almost none of the previous studies have paid attention to the usefulness of the attributes used to develop such models, which determines the accuracy levels of prediction models [44]. Data mining procedures remove extraneous attributes since some provide little (or no) information and may even overshadow significant ones. In addition to the limited consideration of outcome prediction, existing studies focus almost exclusively on binary classification and neglect other possible and relevant outcomes. This is a relevant problem when the outcome of a process, i.e., customer satisfaction, is defined as a numerical interval.

To address the mentioned research gaps, the current study proposes a prediction approach that uses both supervised and unsupervised learning methods. In this regard, DBSCAN clustering is implemented to group similar traces, then the attribute selection technique is conducted and activities with no added value are removed. DBSCAN has several advantages over other clustering algorithms. It can handle clusters of arbitrary shape and can identify clusters of varying densities. Additionally, it does not require the number of clusters to be specified beforehand, as the algorithm can discover the number of clusters automatically. Also, a zero-to-five numerical interval is considered as the outcome, and a multi-class classification is implemented to predict the outcomes of customer journeys. The next section explains the preliminaries of the proposed solution method for customer journey analytics.

3. Preliminaries

This section discusses the preliminaries and fundamental concepts of the research problem. The customer journeys and customer experience are basic components of any e-commerce business environment. Process mining techniques can add significant value for analyzing the existing and future interaction patterns of customers. In this study, the exploration of the journeys is handled by user objectives to predict a specific outcome of the process at different levels of granularity of the event log.

3.1. Customer journey

A customer journey is a sequence of interactions, called “touch points”, between a customer and a service provider. This journey tells the story of the customer's experience, from the first encounter until a long-term relationship is formed. Customer journeys include the following elements [57] (cf. Table 2).

3.2. Event log (𝜀)

An event log is a set of data stored in the information systems based on the observed behaviors of different cases [58]. An event log captures data related to each event or activity, including the time of occurrence, the type of event, and any associated data or details [59].

Event logs are inputs of process mining algorithms. Event logs include the following components [20,24,60] (cf. Table 3). In customer services, an event log is a record of all customer interactions and related activities that occur during the execution of customer service processes. It captures data related to each interaction, including the time of occurrence, customer details, type of interaction, and any associated data or details. An event log of customer services can help organizations analyze customer interactions, identify customer needs, and optimize customer service processes.

3.3. Predictive business process monitoring

When executing a redesigned business process, the new process may not meet expectations. For example, unforeseen exceptions may occur that cause the processing time of some activities to be much longer than expected. These cases increase user dissatisfaction. The first step to addressing these issues, preventing and resolving them, is to understand what happens in reality. Process monitoring means using the data obtained from the execution of a business process to extract insights about the actual performance of the process and its compliance with norms, policies, or regulations. The data resulting from the implementation of business processes is generally in the form of a set of event logs. Business process monitoring methods use incoming event logs and generate artifacts to help process actors, analysts, process owners, and other managers gain a picture of process performance at various levels. Predictive business process monitoring means predicting the continuation of the running cases based on the models extracted from the historical event logs. This prediction includes various tasks such as predicting the next activity, the future path, the remaining cycle time, and the outcomes of processes [61]. In the following, some functions that are commonly used in predictive process monitoring are explained.

Encoding function (f: 𝜀 → X): This function receives an event log (𝜀) as input and provides a vector (X) that can be entered into the classification function. The following methods are used in the literature for the encoding part:

∙ One-hot: one-hot encoding is one of the popular encoding techniques, applied to categorical features without any kind of order or relationship. It is a useful technique that enables machines to learn from datasets with categorical variables, helping to develop predictive models. It involves the conversion of categorical values into binary vectors, where each vector indicates the presence or absence of a particular category. In this method, each of the unique values of the mentioned variable is converted into a binary variable. According to the value of the variable, the number 0 or 1 is assigned to each of the binary variables [62].
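As an illustration of one-hot encoding applied to an event log, the following minimal Python sketch maps a toy log to binary vectors. It is not the authors' implementation: the log contents, the activity names, and the helper `one_hot_encode` are hypothetical and chosen only to show the mechanics of an encoding function f: 𝜀 → X.

```python
from collections import defaultdict

# Hypothetical toy event log: (case id, activity, timestamp) triples.
# The activity names are invented for illustration only.
EVENT_LOG = [
    ("c1", "visit_page", "2016-01-04T09:00"),
    ("c1", "ask_question", "2016-01-04T09:05"),
    ("c2", "visit_page", "2016-01-04T10:00"),
    ("c2", "submit_complaint", "2016-01-04T10:20"),
    ("c2", "visit_page", "2016-01-04T10:30"),
]


def one_hot_encode(event_log):
    """Map each case to a list of one-hot vectors, one vector per event."""
    # One binary variable per unique activity, in a fixed (sorted) order.
    activities = sorted({activity for _, activity, _ in event_log})
    index = {activity: i for i, activity in enumerate(activities)}

    traces = defaultdict(list)
    # ISO-8601 timestamps sort chronologically when compared as strings.
    for case_id, activity, _ in sorted(event_log, key=lambda e: (e[0], e[2])):
        vector = [0] * len(activities)
        vector[index[activity]] = 1  # 1 marks the category that is present
        traces[case_id].append(vector)
    return activities, dict(traces)


activities, encoded = one_hot_encode(EVENT_LOG)
for case_id, vectors in encoded.items():
    print(case_id, vectors)
```

Each event becomes a vector with exactly one 1, so a journey is represented as a sequence of such vectors. Sequence models such as LSTMs can consume this form directly, whereas a frequency-based encoding would instead sum the vectors of a journey into a single count vector.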


Table 1
A taxonomy of the existing literature on business process monitoring.
Article Method Data usage Application/sector
Encoding Clustering Feature selection Classification
Deng et al. [45] LSTMED – FDR LSTM Tennessee Eastman Time monitoring
(TE) process dataset
Galanti et al. Not mentioned – – Catboost Italian utility Next activity
[46] provider company prediction
Delias et al. [47] Not mentioned – – LR Claims management Outcome prediction
process dataset
Mehdiyev et al. Not mentioned – SHAP QRF Manufacturing Run-time prediction
[48] Execution Systems
Bozorgi et al. One-hot – – Not BPIC16, 17, 19, 20 cycle time and
[49] mentioned outcome prediction
Kim et al. [24] index-based – Resource-aware GB, RF BPIC11, 12, 15 Outcome prediction
Amponsah et al. Not mentioned – – DT NHIS claims process Fraud detection
[50]
Lee et al. [27] Index-based, prefix-length RIPPER, BPIC11,12,15,17 Outcome prediction
One-hot bucketing XGB, RF
Weinzierl et al. Boolean – – CNN BPIC12,13,19,20 Workaround
[33] detection
Jalayer et al. One-hot SoftMax LSTM Helpdesk, Next activity
[21] BPIC12,15,17 prediction
Elkhawaga et al. aggregation- – SHAP, PDP XGB, LR Sepsis1, 2, 3, Outcome prediction
[25] based, BPIC17, Traffic
index-based fines, Hos_billing1,
2
Pasquadibisceglie One-hot – – LSTM BPIC12, 13, 17, 20, Next activity
et al. [20] CoSeLoG prediction
Mehdiyev et al. autoencoder K-means DNN SDT Volvo IT Belgium’s SLA breach
[38] incident Prediction
management
Costache et al. STEP – Intra-case MLR, 3-layer BPIC17 Workload prediction
[39] CNN
Wang et al. [31] Frequency-based – – LSTM BPIC12, 17, Sepsis Outcome prediction
cases, Production,
Road Traffic Fines
and Hospital Billing
Pauwels et al. One-hot – – SDL, DBN, Helpdesk, BPIC11, Next activity
[22] LSTM, CNN 12, 15 prediction
Pasquadibisceglie Image encoding Autoencoder, CNN SEPSIS, BPIC11, 12, Outcome prediction
[32] Mutual info Production
approach
Mello et al. [23] Boolean, – – DT, RF, GB IT department of a Predicting the
Frequency-based Brazilian failures
organization
Taymouri et al. One-hot GAN’s GNA Helpdesk, BPIC12, Next activity
[51] discriminator 17 prediction
Xu et al. [52] Frequency-based – – DT, RF Electronic Medical Outcome prediction
Record
Gusmao et al. Not mentioned – Pareto analysis LR CPFL Energia Fraud detection
[26]
Gunnarsson Window-based – – LSTM An airport’s baggage Completion time
et al. [36] system prediction
Rizzi et al. [53] Frequency-based, – – RF BPIC11, Claim Outcome prediction
Simple-index, Management log
Complex-index
Pauwels et al. One-hot – – DBN BPIC12, 15, 18, Next events
[54] prediction
Polato et al. One-hot – – LR, CR, DATS BPIC12, Help desk Remaining time
[35] log, Road traffic log prediction
Francescomarino Index-based Canopy HT, AT BPIC11, 15, drift1, Outcome prediction
et al. [30] 2
Marquez et al. Window-based DT, RT, ANN, BPIC13, Incident Run-time prediction
[37] SVM, EA management log
Tax et al. [55] One-hot – – LSTM Helpdesk, BPIC12 The remaining time,
the next event, and
its timestamp
(continued on next page)


Table 1 (continued).
Article Method Data usage Application/sector
Encoding Clustering Feature selection Classification
Francescomarino Frequency-based DBSCAN, – DT, RF BPIC11 Tumor diagnostic
et al. [28] MBASED
Dadashnia et al. Boolean – – LSTM BPIC16 Next activity
[56] prediction
This study Frequency-based DBSCAN Hybrid of DT BPIC16 Outcome prediction
variance and
covariance
analysis

Table 2
Customer journey elements.
Title | Definition
Customer | A client receiving a service.
Journey | A common route taken by a customer.
Touchpoint | An interaction between a customer and a service provider.
Timeline | The duration between the first and last touchpoints in a journey.

When one-hot encoding is to be used, the desired feature must be selected first. For example, when an activity is chosen for encoding, all distinct activities are used as columns. Table 4 is an example of an event log that contains three main columns: Case ID, Activity, and Timestamp. After applying the one-hot encoding method, Table 5 is obtained.
∙ Frequency/Aggregation-based: In this method, the output matrix includes unique Case IDs as rows and unique activities as columns. Each cell holds the frequency of the corresponding activity in the corresponding case journey [58,63]. An example of frequency-based encoding is given in Tables 6 and 7.
∙ Boolean: This method is similar to the previous one, except that each cell records whether the activity occurred in the journey of the case (1) or not (0), instead of how often it occurred [63] (cf. Tables 8 and 9).
∙ Index-based: In the index-based encoding method, the columns represent activities and the features related to them [63]. In the example of Tables 10 and 11, Act denotes the activity and TS the timestamp.
Classification function (y: X → Y): This function receives the output of the encoding function (X) as input and returns its class label (Y). In the available studies, different supervised learning methods have been used for the prediction task, including Long Short-Term Memory (LSTM), Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), Dynamic Bayesian Networks (DBN), Single Dense Layer (SDL), Support Vector Machine (SVM), Random Forests (RF), Decision Trees (DT), Logistic Regression (LR), Multiple Linear Regression (MLR), Contextual Regression (CR), Data-aware Transition System (DATS), GBoost, and XGBoost.

4. Problem scope and definition

Nowadays, extracting knowledge from historical data has attracted considerable attention. Process mining is a relatively new approach for discovering process models from event logs and is a suitable method for extracting knowledge from customer journeys. By understanding the behavior of customers, business owners can put themselves in the position of their customers and, as a result, predict their needs and desires more accurately.
Customer interactions with service providers can be considered as sequences of events. Customers follow certain paths to accomplish a particular outcome [11]. With the expansion of businesses that provide online services, the journey of customers takes place on websites. In this research, the application of process mining in analyzing the journeys of a website's customers is investigated. Predictive business process monitoring is used to predict the output of the process and the journeys taken by customers, and to answer the research questions.
One of the limitations that affect customer journey analysis is that users behave differently when visiting a website. Common reasons include personal preferences, demographics, experience, the layout and navigation of the website, and accessibility. Because of this, the journeys within each identified class vary considerably, and this dispersion within a class reduces the robustness and repeatability of the learning model.
Websites consist of several pages that users visit and on which they perform actions according to their needs. Therefore, it is first necessary to match the customer journeys, the website visit data, and the event log format (cf. Tables 12 to 14).
Customer journey heterogeneity refers to variations in the path that customers take when interacting with a business or brand. The problem with customer journey heterogeneity is that it can make it difficult for businesses to create a consistent and cohesive experience for their customers. When customers interact with a business, they may do so through a variety of touchpoints, such as a website, social media, email, phone, or in-person interactions. Each touchpoint can influence the customer journey, and customers can have vastly different experiences based on their preferred touchpoints, their needs, and their behaviors.
To deal with the large variability in process instances, this research attempts to predict the degree of dissatisfaction by analyzing the customer's journey on online sites and to improve the user experience by using the obtained results. In this regard, based on the complaints registered by customers, the level of dissatisfaction is measured on a scale of 0 to 5 (cf. Fig. 1).

Fig. 1. An overview of the process outcome prediction.


Table 3
Event logs: definitions, notations, and structures.
Event log part | Title | Symbol | Definition
Columns | Activity | A | A step in a process. An event log typically has one activity column.
Columns | Case ID | ID | A unique phrase or number for each instance of the process.
Columns | Timestamp | TS | The time of occurrence of a particular event.
Columns | Case/event attributes | D | Attributes that change along the path of an event, such as the resource, are event attributes; otherwise (e.g., gender and age) they are case attributes. An event log typically has several case and event attributes.
Rows | Event | e | A change in the state of a process, such as the start or completion of an activity, an input or output message, a timeout violation, etc. e = (A, ID, TS, D_1, ..., D_m), m ≥ 0
– | Trace | σ | The sequence of events. σ = [e_1, ..., e_i], ∀ i ∈ [1, n]
– | Variant | S | A set of all unique traces in the event log.
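As an illustration of the frequency-based, Boolean, and index-based encodings described in Section 3, the following is a minimal sketch in plain Python. The toy event log mirrors the rows of Table 6; the function names (`group_by_case`, `frequency_encode`, `boolean_encode`, `index_encode`) are hypothetical and are not taken from the authors' repository.

```python
from collections import Counter, defaultdict

# Toy event log as (case_id, activity, timestamp) rows, mirroring Table 6.
EVENT_LOG = [
    (1, "A", "09:20"),
    (2, "B", "09:19"),
    (2, "A", "09:20"),
    (2, "B", "09:26"),
    (1, "B", "09:25"),
]


def group_by_case(log):
    """Collect the (activity, timestamp) pairs of each case in time order."""
    traces = defaultdict(list)
    for case_id, activity, ts in sorted(log, key=lambda e: (e[0], e[2])):
        traces[case_id].append((activity, ts))
    return dict(traces)


def frequency_encode(log):
    """One row per case, one column per activity; cells hold counts."""
    activities = sorted({act for _, act, _ in log})
    rows = {}
    for case_id, events in group_by_case(log).items():
        counts = Counter(act for act, _ in events)
        rows[case_id] = [counts.get(act, 0) for act in activities]
    return activities, rows


def boolean_encode(log):
    """Same layout as the frequency encoding, but cells are 1/0 (occurred or not)."""
    activities, rows = frequency_encode(log)
    return activities, {cid: [int(c > 0) for c in row] for cid, row in rows.items()}


def index_encode(log):
    """One row per case: Act1, TS1, Act2, TS2, ... in order of occurrence."""
    return {
        cid: [value for pair in events for value in pair]
        for cid, events in group_by_case(log).items()
    }


if __name__ == "__main__":
    print(frequency_encode(EVENT_LOG))
    print(boolean_encode(EVENT_LOG))
    print(index_encode(EVENT_LOG))
```

On this log the frequency encoding yields the rows of Table 7 (case 1: A=1, B=1; case 2: A=1, B=2). Timestamps are compared as zero-padded strings here, which is an assumption that only holds for same-day `HH:MM` values.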

Table 4
An event log.
ID | Act | TS
1 | A | 09:20
2 | B | 09:19
2 | C | 09:20
3 | C | 09:25

Table 5
The event log encoded by the one-hot encoding method.
ID | A | B | C
1 | 1 | 0 | 0
2 | 0 | 1 | 1
3 | 0 | 0 | 1

Table 6
An event log.
ID | Act | TS
1 | A | 09:20
2 | B | 09:19
2 | A | 09:20
2 | B | 09:26
1 | B | 09:25

Table 7
The event log encoded by the frequency-based encoding method.
ID | A | B
1 | 1 | 1
2 | 1 | 2

Table 8
An event log.
ID | Act | TS
1 | A | 09:20
2 | B | 09:19
2 | A | 09:20
2 | B | 09:26

Table 9
The event log encoded by the boolean encoding method.
ID | A | B
1 | 1 | 0
2 | 1 | 1

Table 10
An event log.
ID | Act | TS
1 | A | 09:20
2 | B | 09:19
2 | A | 09:20
1 | B | 09:25

Table 11
The event log encoded by the index-based encoding method.
ID | Act1 | TS1 | Act2 | TS2
1 | A | 09:20 | B | 09:25
2 | B | 09:19 | A | 09:20

Table 12
A website visit dataset example [4].
User ID | URL | Time
1 | .com/search | 11-09-2022:20.08
2 | .com/product | 12-09-2022:16.11
2 | .com/details | 12-09-2022:21.46

Table 13
An example of an event log [4].
Case ID | Activity | Time
1 | a | 11-09-2022:20.08
2 | b | 12-09-2022:16.11
2 | c | 12-09-2022:21.46

Table 14
Matching event log and website visit dataset formats.
Event log | Website visit data | Customer Journey
Case ID | User ID | Customer
Activity | URL | Touchpoint
Timestamp | Time | Timeline

5. Process mining methodology

The foundation of the prediction approach and the outline of the tasks performed to accomplish the objectives of this research are explained in this section. Fig. 2 provides an overview of the hybrid supervised/unsupervised process mining approach, which includes several computing modules, i.e., event log preprocessing, data transformation, trace clustering, feature engineering, and finally classification.
This approach can combine the strengths of both techniques to achieve more accurate and efficient results. Supervised process mining involves using labeled data to train a machine-learning model to recognize patterns and make predictions about future events. In contrast, unsupervised process mining involves analyzing data without prior knowledge or labeling, to discover hidden patterns and insights.


Fig. 2. An overview of proposed process mining approach.

In the first step, customer journeys are created by users' behavior and form event logs. The event logs extracted from information systems are encoded using the frequency-based method and transformed into the appropriate format for the learning algorithm. Next, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is used as the clustering method, based on the similarities between the journeys. Silhouette index analysis is used to find appropriate values for the input parameters of the clustering algorithm and to create appropriate clusters. Analysis based on variance and covariance is performed to select the most effective attributes of each cluster. Finally, using supervised learning methods (multi-class decision trees), the output of customer journeys is predicted.
In the present setting, the input consists of an event log representing the traces and the attributes of the matching event sets. The output is a set of learned decision trees, one for each customer journey cluster. The pseudo-code of the proposed process mining approach is given in Table 15 (see also Fig. 3).

Fig. 3. Flowchart of the prediction model.

The following notation is required to describe the approach for building the prediction model:
Encoding function (ENCODING (𝜀)): the function takes the event log as an input and returns an encoded matrix. In this study, the frequency-based encoding method is applied.
Clustering function (EXTRACT CLUSTER (trace encoded, clustering parameters)): this function takes an encoded matrix and clustering parameters as inputs and returns clusters as outputs. In this study, the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering method has been implemented. DBSCAN is a popular unsupervised clustering algorithm that groups data points that are close enough to each other and have enough other data points in their vicinity. The algorithm defines two parameters, epsilon and minimum points, to determine what counts as a cluster. Epsilon is the distance between two points below which they are considered neighbors, while the minimum points parameter determines the minimum number of points needed to form a cluster. Points that do not belong to any cluster are considered outliers or noise. Given epsilon (the neighborhood radius) and the minimum number of cases in a cluster, and based on the similarities between journeys, DBSCAN returns clusters whose instances have similar journeys. The basic steps of the DBSCAN algorithm are:

1. Choose a random unvisited data point.
2. Retrieve all its neighboring points within a distance of epsilon.
3. If the number of neighboring points is greater than or equal to the minimum points parameter, then a new cluster is formed.
4. If the number of neighboring points is less than the minimum points parameter, the point is marked as noise.
5. Repeat the process until all data points have been visited.

Each parameter of the function can take various values that change the clustering result. In this study, the silhouette score is analyzed to find the most suitable clustering. The silhouette score takes different values for different values of epsilon and the minimum number of samples; the closer this score is to one, the more favorable the clustering. Following the pseudo-code in Table 16, this index is calculated for different values of epsilon and the minimum number of samples to determine the best values of the parameters (see also Fig. 4).
Feature selection functions (drop_event_with_zero_frequency(cluster) & drop_event_with_low_variance(cluster without event with frequency 0) & drop_correlated_event(cluster without event with low variance)): these functions take the clusters from the clustering function as inputs, remove inefficient, redundant, and correlated attributes (activities), and return lower-dimensional matrices. In this study, statistical analyses (variance- and covariance-based feature selection) have been implemented according to the pseudo-code in Table 17 (see also Fig. 5).
Classification function (BUILD TREE (Feature selected Cluster, Event)): Finally, the label is predicted by applying the selected learning algorithm to the clusters. In this study, the decision tree method is used. In this regard, 70% of the event log was separated for training and 30% for testing. Then, the model was evaluated, and the following indicators based on the confusion matrix were calculated [23] (cf. Fig. 6):

\[ \mathit{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \tag{1} \]
\[ \mathit{Precision} = \frac{TP}{TP + FP} \tag{2} \]
\[ \mathit{Recall} = \frac{TP}{TP + FN} \tag{3} \]
\[ \mathit{F1\text{-}Score} = 2 \cdot \frac{\mathit{Precision} \cdot \mathit{Recall}}{\mathit{Precision} + \mathit{Recall}} \tag{4} \]

6. Case study: online insurance services

In this section, the proposed predictive process monitoring approach is applied to a real event log. For this purpose, an event log that was released in the sixth International Business Process Intelligence Challenge (BPIC'16) is used [64].
The BPI Challenge event logs provide valuable advantages for research in process mining and business process management. These logs offer real-world data, covering diverse business processes from different industries [65]. Since the RQs focus on customer journeys and complaint registration on an online site, a dataset extracted from an online site was required. The published event log in the BPIC'16 constitutes genuine and standardized data. This data represents the authentic behavior of users, rendering the results applicable in real-world scenarios [49].
The event log covers the behavior of users over eight months at an online provider of employment and insurance services (UWV) in the Netherlands. The discovered process map of the activity event log for logged-in clicks is presented in Fig. 7.
Several data sources are used to collect the required information.
(1) Customer click data from the site www.werk.nl
(2) Message data, showing when applicants contacted the agency through a digital channel,
(3) Call data from the call center, presenting when applicants contacted the call center by cell phone,
(4) Complaint data showing when applicants complained.


Table 15
Pseudocode for the prediction model.
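As a hedged illustration of the prediction model outlined in Section 5 (encode, cluster, select features, build one tree per cluster), the following sketch uses scikit-learn. It is not the authors' implementation: the function name `build_cluster_trees`, the simplified feature selection (only constant columns are dropped), and the default arguments are assumptions made for this example. The 70/30 split follows the text.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def build_cluster_trees(X, y, eps, min_samples, test_size=0.3, seed=0):
    """Cluster encoded traces with DBSCAN, then fit one decision tree per cluster.

    X: (n_cases, n_activities) frequency-encoded matrix; y: class label per case.
    Returns {cluster_label: (fitted_tree, kept_column_indices, test_accuracy)}.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    models = {}
    # Label -1 is DBSCAN's marker for noise points; they get no model here.
    for cluster in np.unique(labels[labels != -1]):
        mask = labels == cluster
        Xc, yc = X[mask], y[mask]
        # Simplified feature selection: keep only columns that vary in the cluster.
        kept = np.flatnonzero(Xc.var(axis=0) > 0)
        if kept.size == 0 or len(yc) < 2:
            continue
        X_tr, X_te, y_tr, y_te = train_test_split(
            Xc[:, kept], yc, test_size=test_size, random_state=seed
        )
        tree = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr)
        models[int(cluster)] = (tree, kept, tree.score(X_te, y_te))
    return models


if __name__ == "__main__":
    # Two well-separated synthetic groups of "journeys"; the label depends on column 0.
    A = np.array([[i % 3, (i // 3) % 3, 0, 1] for i in range(40)], dtype=float)
    X = np.vstack([A, A + 50])
    y = (X[:, 0] % 50 > 0).astype(int)
    for cluster, (_, kept, acc) in build_cluster_trees(X, y, eps=5, min_samples=3).items():
        print(cluster, kept.tolist(), acc)
```

Fitting a separate tree per cluster is what lets each model see a more homogeneous set of traces, which is the stated motivation for clustering before classification.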

Table 16
Pseudocode for parameters tuning of the clustering phase.
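The silhouette-based search for DBSCAN parameters described in Section 5 can be sketched as a plain grid search. This is an illustrative example only: the function name `tune_dbscan` and the candidate grids are hypothetical, and treating noise points (label -1) as one extra group when computing the silhouette score is an assumption, since the text does not say how noise is handled.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score


def tune_dbscan(X, eps_values, min_samples_values):
    """Return (best_score, eps, min_samples, n_clusters) over a parameter grid.

    Returns None if no parameter combination yields a valid clustering.
    """
    best = None
    for eps in eps_values:
        for min_samples in min_samples_values:
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
            n_labels = len(np.unique(labels))
            # silhouette_score needs at least 2 labels and fewer labels than samples.
            if not 2 <= n_labels < len(X):
                continue
            score = silhouette_score(X, labels)  # Euclidean distance by default
            if best is None or score > best[0]:
                n_clusters = len(np.unique(labels[labels != -1]))
                best = (score, eps, min_samples, n_clusters)
    return best


if __name__ == "__main__":
    A = np.array([[i % 3, (i // 3) % 3, 0, 1] for i in range(40)], dtype=float)
    X = np.vstack([A, A + 50])
    print(tune_dbscan(X, eps_values=[5, 200], min_samples_values=[3]))
```

In the case study the search range for the radius is bounded by the minimum and maximum pairwise Euclidean distance between journeys, so `eps_values` would be drawn from that interval.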

In this context, we intend to gain knowledge about the different behavioral patterns of applicants and to determine how different channels are being used.
The data set covers the following main objects and their corresponding attributes and events:
– Customer - client of a Dutch public agency for handling unemployment benefits, with several attributes such as customerID, age category, and gender
– Session - browser-session identifier of a user browsing the website of the agency
– IP - IP address of an applicant browsing the website of the agency
– Office_U - user involved in an activity handling an applicant interaction
– Office_W - worker involved in an activity handling an applicant interaction
– Complaint - a complaint document handed in by an applicant
– ComplaintDossier - a collection of complaints by the same applicant [66]
The event log, as mentioned above, includes the behavior of 27,412 users in CSV format. In the eight months of data collection, 815 activities (site page visits) were executed by them. In total, 26,477 process variants (S) are detected in this event log (cf. Table 18).
The goal is to predict user satisfaction using the customer journeys extracted from the event log. The event log contains the activity of complaints registered by users. Thus, an attempt has been made to predict the level of user satisfaction by using predictive process monitoring and multi-class classification (cf. Fig. 1).
Encoding: The event log (𝜀) contains 7,364,684 events (rows). After it is passed to the encoding function (ENCODING (𝜀)), a matrix (trace encoded) with dimensions 27,412 × 815 is formed. To evaluate the proposed model, the usual approach of predictive process monitoring is first applied to this event log; that is, the decision tree algorithm is applied without clustering. Performing the classification directly after encoding gives the results in Table 19. As shown, the false positive error (FP) for classes 1 to 5 is significant, which indicates the problem of intra-class variance. To solve this problem, a hybrid of supervised and unsupervised machine learning methods is implemented.
Clustering:
After the encoding step, clustering is performed based on the similarity between the customer journeys. In this regard, the DBSCAN clustering method is applied. It is necessary to set the appropriate input clustering parameters, including the radius and the


Table 17
Pseudocode of the feature selection phase.
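As a hedged illustration of the variance- and correlation-based feature selection described in Section 5, the sketch below drops never-observed activities, then low-variance activities, then activities that are strongly correlated with one already kept. The function name `select_features` and both thresholds (`var_threshold`, `corr_threshold`) are assumptions for this example; the text does not state the cut-off values used in the study.

```python
import numpy as np


def select_features(X, names, var_threshold=0.0, corr_threshold=0.95):
    """Return the names of the columns kept after the three filtering steps.

    X: (n_cases, n_activities) frequency-encoded matrix of one cluster.
    names: column names (activities), one per column of X.
    """
    X = np.asarray(X, dtype=float)
    # Step 1: drop activities that never occur in this cluster.
    keep = [j for j in range(X.shape[1]) if X[:, j].sum() > 0]
    # Step 2: drop activities whose frequency hardly varies across cases.
    keep = [j for j in keep if X[:, j].var() > var_threshold]
    # Step 3: drop an activity if it is strongly (positively or negatively)
    # correlated with an activity that has already been kept.
    selected = []
    for j in keep:
        correlated = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) >= corr_threshold
            for k in selected
        )
        if not correlated:
            selected.append(j)
    return [names[j] for j in selected]


if __name__ == "__main__":
    X = [
        [0, 1, 2, 4, 1],
        [0, 1, 4, 8, 0],
        [0, 1, 6, 12, 1],
        [0, 1, 8, 16, 0],
    ]
    print(select_features(X, ["a", "b", "c", "d", "e"]))
```

In the toy matrix, column `a` is never observed, `b` is constant, and `d` is an exact multiple of `c`, so only `c` and `e` survive. Keeping the first column of each correlated pair is a design choice of this sketch.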

minimum number of samples in the clusters. To adjust these parameters, the Silhouette-score analysis was used and the following results were obtained.
In the first step, the minimum and maximum possible radius are determined by calculating the Euclidean distance between different journeys. The chart below shows the distribution of computed distances between different customer journeys. As can be seen, the radius can be checked from 0 to 6,339.06, and the minimum number of samples from 0 to 27,412.

Table 18
Descriptive statistics of the BPI Challenge 2016 dataset.
Indicator | Trace length (|𝜎|) | Activity frequency | Euclidean distance
Min. | 1 | 1 | 0
Max. | 9,719 | 1,748,353 | 6,339.06
Avg. | 268.6 | 903.6 | 122.62

Table 19
Confusion matrix before applying the clustering technique.
Class | TP | FP | FN | TN
0 | 8085 | 0 | 100 | 3
1 | 1 | 126 | 0 | 8071
2 | 0 | 41 | 0 | 8198
3 | 0 | 4 | 0 | 8218
4 | 0 | 3 | 0 | 8222
5 | 0 | 0 | 0 | 8223

This analysis is done to find the combination of the radius and the minimum number of samples in the clusters that maximizes the Silhouette-score value. As reported in Table 20, the silhouette score and the number of clusters are obtained for each combination of radius and minimum number of points. Finally, ten combinations are found, which are reported in Table 20. The best available combination is a radius of 247 and a minimum sample of 4, which results in a Silhouette score of 0.9045 and 3 clusters (cf. Table 20).
Feature selection: In this step, the effective attributes in each cluster have been identified and selected by calculating the variance of each attribute and the covariance between them. In the first step,


attributes with little variance are removed, and attributes with a very high positive or negative correlation are removed among the remaining attributes (cf. Table 21).
It should be noted that a feature of the encoded matrix is one of its columns, each of which corresponds to the URL of a site page.
In Table 21, the "#total attr." column lists the number of activities observed in each cluster. After the variance of each activity is calculated (see Table 17), the activities identified as having little variance are counted in the "#attr. with low var." column and removed. In the next step, the number of activities that are correlated with each other is given in the "#correlated attr." column; after these are also removed, the number of remaining activities is given in the last column.

Fig. 4. Flowchart of DBSCAN parameters tuning.

Table 20
Silhouette-score analysis.
Radius | #Min-points | #Clusters | #Noise | Silhouette-score
247 | 4 | 3 | 97 | 0.9045
229 | 3 | 3 | 106 | 0.9029
156 | 6 | 2 | 252 | 0.8585
249 | 5 | 2 | 99 | 0.8575
249 | 4 | 4 | 90 | 0.8572
249 | 3 | 4 | 89 | 0.8516
130 | 5 | 2 | 413 | 0.8259
130 | 4 | 3 | 401 | 0.8151
200 | 2 | 7 | 123 | 0.7847
201 | 2 | 7 | 123 | 0.7847
249 | 1 | 88 | 0 | 0.6719

Classification: In the final step, the level of user dissatisfaction is predicted on a scale of 0 to 5, using the multi-class decision tree algorithm in each cluster. For this purpose, the 'Complaint' event is taken as the label in this dataset, and the frequency of this activity indicates the degree of dissatisfaction.
To evaluate the proposed learning model, the data in each cluster has been split into training and testing data. Finally, the accuracy, recall, and F1-score were measured by creating confusion matrices, and the following results were obtained (cf. Table 22). In clusters 2 and 3, only classes 0 and 1 were present in the training data; therefore, only these classes were predicted in the test data.
In Table 22, after the classification algorithm is applied in each cluster, the values of the confusion matrix are calculated for each class. From these values, the performance indices (accuracy, recall, precision, and F1-score) are obtained.
Due to the unbalanced data in cluster C1 (distribution of classes: {0:27081},{1:180},{2:35},{3:6},{4:4},{5:1}), the model is biased toward class 0. After classification, the accuracy index for this class is very high, while the indices for the other classes are poor. For this reason, an over-sampling technique for imbalanced classification has been implemented [67]. It improves the model, and the results in Table 23 have been obtained.
As in Table 22, in Table 23 the values of the confusion matrix are calculated for each class after the classification algorithm is applied with the over-sampling technique in each cluster, and the performance indices (accuracy, recall, precision, and F1-score) are obtained from them.
In the research conducted by Dadashnia et al. [56], a model based on the LSTM method was implemented on the BPIC16 dataset. Their model reaches an accuracy of 64%, whereas the method proposed in this paper achieves a significantly higher accuracy of 98%, outperforming the current standard by a substantial margin. This difference underscores the predictive capability and effectiveness of the new method in customer journey analytics. By leveraging a hybrid of supervised and unsupervised learning techniques, the proposed model not only achieves a high accuracy rate but also addresses the challenge of high intra-class variance, a common issue in process monitoring. The results indicate that the new approach makes accurate predictions and also provides valuable insights into customer satisfaction levels. This improvement in accuracy highlights the potential of the proposed method to advance predictive process monitoring in online services, paving the way for enhanced customer experiences and improved decision-making for service providers.

7. Conclusion and policy implications

The benefits of customer journey prediction are numerous, including higher customer loyalty, increased customer satisfaction, and greater revenue growth. By predicting customer behavior, companies can more effectively engage with customers, anticipate their needs, and provide personalized experiences that are likely to drive customer retention and business growth. Process mining can provide valuable insights for exploring and predicting customer journeys by visualizing customer behavior, identifying areas for improvement, and measuring the effectiveness of the customer journey. One of the important applications of process mining is improving process performance by using predictive business process monitoring, which helps to gain insightful knowledge about the features of processes. This study employs a novel hybrid computing method of supervised/unsupervised machine learning techniques for predicting the outcomes of process instances under a high degree of internal heterogeneity in process variants.
This paper underscores the importance of customer satisfaction analysis in online services. By harnessing the power of machine learning and customer journey analysis, it empowers service providers to gain deep insights into user experience, identify areas for improvement, and ultimately enhance the overall experience [14]. This research holds the potential to guide online service providers towards more informed decision-making and, consequently, greater customer satisfaction and loyalty.
The study aims to fill the existing research gap by answering recent calls for research about a more thorough exploration of customer


Fig. 5. Flowchart of feature selection phase.

Table 21
Feature selection in the clusters.
Clusters #total attr. #attr. with low var. #correlated attr. #remaining attr.
C1 807 215 114 478
C2 69 21 10 38
C3 90 29 30 31

behavior by using predictive process monitoring techniques. The authors use process mining to explore customer journeys, identify customer behavior patterns, and measure the efficiency of the customer journey. The study focuses on two research questions: (1) how can supervised and unsupervised machine learning algorithms be used to analyze customer journeys and their patterns to predict the behavior of website users? and (2) how can process mining and predictive process monitoring algorithms be used to predict whether or not users will submit a complaint? The study's contributions include proposing a novel predictive model to fill the existing research gaps, moving from reactive actions to proactive behavior, improving the classification method by considering a range of outcomes, reducing the dimensions of the created clusters, and increasing the prediction accuracy.
The purpose of the proposed approach is to predict the outcome of journeys taken by customers. In this regard, by considering the journey traveled by customers as a sequence of events, the proposed predictive process monitoring approach has been used to predict the outcome of the customer journey. First, the journeys taken by customers were clustered. Then, low-value activities in each cluster were removed. To label the outcomes of journeys, a multi-class classification method was used. Using multi-class classification on the BPI Challenge 2016 dataset, the level of customer dissatisfaction and the registration of complaints have been predicted.
According to earlier research in the field of predictive business process monitoring, machine learning methods have been widely used for various prediction tasks. The researchers aim to enhance service


Table 22
Classification results.
Cluster | Class | TP | FP | FN | TN | Accuracy | Recall | Precision | F1-score
C1 | 0 | 8023 | 62 | 108 | 0 | 0.9792 | 0.9867 | 0.9923 | 0.9895
C1 | 1 | 0 | 92 | 50 | 8051 | 0.9826 | 0 | 0 | –
C1 | 2 | 0 | 14 | 9 | 8170 | 0.9971 | 0 | 0 | –
C1 | 3 | 0 | 1 | 1 | 8191 | 0.9997 | 0 | 0 | –
C1 | 4 | 0 | 1 | 1 | 8191 | 0.9997 | 0 | 0 | –
C1 | 5 | 0 | 0 | 1 | 8192 | 0.9998 | 0 | – | –
C2 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1
C3 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1

Table 23
Classification results after applying the over-sampling technique.
Cluster | Class | TP | FP | FN | TN | Accuracy | Recall | Precision | F1-score
C1 | 0 | 7835 | 0 | 174 | 40,737 | 0.9964 | 0.9782 | 1 | 0.9890
C1 | 1 | 8084 | 126 | 0 | 40,536 | 0.9974 | 1 | 0.9846 | 0.9922
C1 | 2 | 8080 | 41 | 0 | 40,625 | 0.9991 | 1 | 0.9949 | 0.9974
C1 | 3 | 8189 | 4 | 0 | 40,553 | 0.9999 | 1 | 0.9995 | 0.9997
C1 | 4 | 8171 | 3 | 0 | 40,572 | 0.9999 | 1 | 0.9996 | 0.9998
C1 | 5 | 8213 | 0 | 0 | 40,533 | 1 | 1 | 1 | 1
C2 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1
C3 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1
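The per-class indicators in the classification tables follow Eqs. (1) to (4). A minimal sketch of that computation from one class's one-vs-rest counts is given below; the function name `one_vs_rest_metrics` is hypothetical, and returning `None` for an undefined ratio (shown as "–" in the tables) is a choice made for this example.

```python
def one_vs_rest_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from one class's counts.

    A ratio with a zero denominator is returned as None.
    """
    accuracy = (tn + tp) / (tn + tp + fn + fp)          # Eq. (1)
    precision = tp / (tp + fp) if tp + fp else None      # Eq. (2)
    recall = tp / (tp + fn) if tp + fn else None         # Eq. (3)
    if precision is None or recall is None or precision + recall == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)  # Eq. (4)
    return accuracy, precision, recall, f1


if __name__ == "__main__":
    # Cluster C1, class 0, before over-sampling: TP=8023, FP=62, FN=108, TN=0.
    print(one_vs_rest_metrics(8023, 62, 108, 0))
    # Cluster C1, class 0, after over-sampling: TP=7835, FP=0, FN=174, TN=40737.
    print(one_vs_rest_metrics(7835, 0, 174, 40737))
```

For the first set of counts this gives an accuracy of about 0.9792, a precision of about 0.9923, a recall of about 0.9867, and an F1-score of about 0.9895.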

quality and innovation by anticipating potential outcomes in processes.
In this study, a real-life event log is used to evaluate the proposed
approach. The results show an accuracy of 0.99 in predicting
customer satisfaction. In predicting process outcomes, researchers
have generally favored supervised learning methods. However, as noted in
the literature review section, one challenge in analyzing customer
journeys and predicting customer behavior is the high variability in the
journeys taken by customers (referred to as the high intra-class
variance problem). Because the traces identified in the customer journey
analysis are widely dispersed, this article proposes combining
supervised and unsupervised learning methods to address the issue
of high intra-class variance. The experiment demonstrates that our
strategy performs well. The model is useful for making predictions,
but it also has value on its own because it can show the level of
customer satisfaction.
In future work, we intend to address the class imbalance in the
event log of customer journeys, for example by using methods such as
reinforcement learning for classification. Additional directions for
further work include extending the proposed approach to predict the
continuation of the customer journey sequence using recursive
prediction methods. Finally, we want to investigate how the proposed
approach may be used to predict the subsequent activities, their
timestamps, and the cycle time needed to complete the process.

Reproducibility

The algorithm described in Section 5 is implemented in Python.
The source code and supplementary material are publicly available
at https://github.com/FtemehAkhavan/Predictive-Process-Monitoring-for-Predicting-Customer-Experience/tree/main#readme.
This repository contains the code, input data, and configuration files
necessary to reproduce the results.

Declaration of competing interest

We wish to confirm that there are no known conflicts of interest
associated with this publication. There has been no significant financial
support for this work that could have influenced its outcome.

Data availability

Data will be made available on request.

Fig. 6. Confusion Matrix for multi-class classification.
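
As a reading aid for the multi-class confusion matrix in Fig. 6 and the accuracy figure reported in the conclusion, the following is a minimal sketch of how such a matrix and its derived metrics can be computed. It is not taken from the authors' repository. The class names (`low`, `medium`, `high`), the toy label lists, and the helper names `confusion_matrix`, `accuracy`, and `recall_per_class` are all assumptions made for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where entry [i][j] counts cases whose true class is
    labels[i] and whose predicted class is labels[j]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of cases on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def recall_per_class(matrix, labels):
    """Recall for each true class: diagonal count divided by the row sum."""
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        result[label] = matrix[i][i] / row_total if row_total else 0.0
    return result


# Hypothetical satisfaction levels and toy predictions (not the paper's data).
LABELS = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "high", "high", "high"]
y_pred = ["low", "medium", "medium", "high", "high", "low"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
for label, row in zip(LABELS, matrix):
    print(f"{label:>6}: {row}")
print("accuracy:", round(accuracy(matrix), 3))
print("recall:", recall_per_class(matrix, LABELS))
```

Per-class recall is reported next to overall accuracy because, when the event log is imbalanced (the condition named as future work), a high accuracy can hide poor performance on the smaller classes.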


Fig. 7. The discovered process map of the activity event log for clicks logged in.

References [10] A. Terragni, M. Hassani, Optimizing customer journey using process mining
and sequence-aware recommendation, in: Proceedings of the ACM Symposium
[1] B. Ramos Gutiérrez, A.M. Reina Quintero, L. Parody, M.T. Gómez López, When on Applied Computing, Vol. Part F147772, 2019, http://dx.doi.org/10.1145/
business processes meet complex events in logistics: A systematic mapping study, 3297280.3297288.
Comput. Ind. 144 (2023) http://dx.doi.org/10.1016/j.compind.2022.103788. [11] J. Goossens, T. Demewez, M. Hassani, Effective steering of customer jour-
[2] W. Van Der Aalst, Process mining: Overview and opportunities, ACM Trans. ney via order-aware recommendation, in: IEEE Int. Conf. Data Min. Work.
Manag. Inf. Syst. 3 (2) (2012) http://dx.doi.org/10.1145/2229156.2229157. ICDMW, 2018-Novem, 2019, pp. 828–837, http://dx.doi.org/10.1109/ICDMW.
[3] S.J.J. Leemans, S.J. van Zelst, X. Lu, Partial-order-based process mining: a survey 2018.00123.
and outlook, Knowl. Inf. Syst. 65 (1) (2023) http://dx.doi.org/10.1007/s10115- [12] M.D. Vollrath, S.G. Villegas, Avoiding digital marketing analytics myopia: revis-
022-01777-3. iting the customer decision journey as a strategic marketing framework, J. Mark.
[4] A. Terragni, M. Hassani, Analyzing Customer Journey with Process Mining: From Anal. 10 (2) (2022) http://dx.doi.org/10.1057/s41270-020-00098-0.
Discovery to Recommendations, 2018, http://dx.doi.org/10.1109/FiCloud.2018. [13] M. Cordewener, Customer journey identification through temporal patterns and
00040. Markov clustering, 2016.
[5] G. Bernard, P. Andritsos, Discovering customer journeys from evidence: A genetic [14] S. Kumar, M. Zymbler, A machine learning approach to analyze customer
approach inspired by process mining, in: Lecture Notes in Business Information satisfaction from airline tweets, J. Big Data 6 (1) (2019) http://dx.doi.org/10.
Processing, vol. 350, 2019, http://dx.doi.org/10.1007/978-3-030-21297-1_4. 1186/s40537-019-0224-1.
[6] G. Bernard, P. Andritsos, CJM-ab : Abstracting customer journey maps using [15] M.R. Islam others, Discovering dynamic adverse behavior of policyholders in
process mining, in: International Conference on Advanced Information Systems the life insurance industry, Technol. Forecast. Soc. Change 163 (2021) http:
Engineering, Vol. 1, 2018, pp. 49–56, http://dx.doi.org/10.1007/978-3-319- //dx.doi.org/10.1016/j.techfore.2020.120486.
92901-9. [16] S. Rawat, A. Rawat, D. Kumar, A.S. Sabitha, Application of machine learning
[7] M. Yari Eili, J. Rezaeenour, An approach based on process mining to assess the and data visualization techniques for decision support in the insurance sector,
quarantine strategies’ effect in reducing the COVID-19 spread, Libr. Hi Tech. 41 Int. J. Inf. Manag. Data Insights 1 (2) (2021) http://dx.doi.org/10.1016/j.jjimei.
(1) (2023) http://dx.doi.org/10.1108/LHT-01-2022-0062. 2021.100012.
[8] S.H. Hosseinizadeh Mazloumi, A. Moini, M. Agha Mohammad Ali Kermani, De- [17] P. Badakhshan, B. Wurm, T. Grisold, J. Geyer-Klingeberg, J. Mendling, J. vom
signing synchronizer module in CMMS software based on lean smart maintenance Brocke, Creating business value with process mining, 2022.
and process mining, J. Qual. Maint. Eng. 29 (2) (2023) http://dx.doi.org/10. [18] F. Spree, Predictive process monitoring : A use-case-driven literature review, in:
1108/JQME-10-2021-0077. EMISA Forum, 2020.
[9] N. Verma, J. Singh, A comprehensive review from sequential association com- [19] W.M.P. Van Der Aalst, M.H. Schonenberg, M. Song, Time prediction based on
puting to Hadoop-MapReduce parallel computing in a retail scenario, J. Manag. process mining, Inf. Syst. 36 (2) (2011) http://dx.doi.org/10.1016/j.is.2010.09.
Anal. 4 (4) (2017) http://dx.doi.org/10.1080/23270012.2017.1373261. 001.

15
F. Akhavan and E. Hassannayebi Decision Analytics Journal 11 (2024) 100452

[20] V. Pasquadibisceglie, A. Appice, G. Castellano, D. Malerba, A multi-view deep [45] W. Deng, Y. Li, K. Huang, D. Wu, C. Yang, W. Gui, LSTMED: An uneven dynamic
learning approach for predictive business process monitoring, IEEE Trans. Serv. process monitoring method based on LSTM and Autoencoder neural network,
Comput. (2021) http://dx.doi.org/10.1109/TSC.2021.3051771. Neural Netw. 158 (2023) http://dx.doi.org/10.1016/j.neunet.2022.11.001.
[21] A. Jalayer, M. Kahani, A. Pourmasoumi, A. Beheshti, HAM-Net: Predictive busi- [46] R. Galanti, M. de Leoni, N. Navarin, A. Marazzi, Object-centric process predictive
ness process monitoring with a hierarchical attention mechanism, Knowl.-Based analytics, Expert Syst. Appl. 213 (2023) http://dx.doi.org/10.1016/j.eswa.2022.
Syst. 236 (2022) 107722, http://dx.doi.org/10.1016/j.knosys.2021.107722. 119173.
[22] S. Pauwels, T. Calders, Incremental predictive process monitoring: The next [47] P. Delias, N. Mittas, G. Florou, A doubly robust approach for impact evaluation
activity case, in: Lecture Notes in Computer Science (Including Subseries Lecture of interventions for business process improvement based on event logs, Decis.
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12875, Anal. J. 8 (2023) 100291, http://dx.doi.org/10.1016/j.dajour.2023.100291.
2021, http://dx.doi.org/10.1007/978-3-030-85469-0_10, LNCS. [48] N. Mehdiyev, M. Majlatow, P. Fettke, Quantifying and explaining machine
[23] P. Mello, K. Revoredo, F. Santoro, IT incident solving domain experiment learning uncertainty in predictive process monitoring: an operations research
on business process failure prediction, J. Inf. Data Manag. 11 (1) (2020) perspective, 2023, pp. 1–43, [Online]. Available: http://arxiv.org/abs/2304.
34–49. 06412.
[24] J. Kim, M. Comuzzi, M. Dumas, F.M. Maggi, I. Teinemaa, Encoding resource [49] Z. Dasht Bozorgi, I. Teinemaa, M. Dumas, M. La Rosa, A. Polyvyanyy, Prescriptive
experience for predictive process monitoring, Decis. Support Syst. 153 (2022) process monitoring based on causal effect estimation, Inf. Syst. 116 (2023)
http://dx.doi.org/10.1016/j.dss.2021.113669. http://dx.doi.org/10.1016/j.is.2023.102198.
[25] G. Elkhawaga, M. Abuelkheir, M. Reichert, Explainability of predictive process [50] A.A. Amponsah, A.F. Adekoya, B.A. Weyori, A novel fraud detection and
monitoring results: Can you see my data issues? 2022, pp. 1–32, [Online]. prevention method for healthcare claim processing using machine learning and
Available: http://arxiv.org/abs/2202.08041. blockchain technology, Decis. Anal. J. 4 (2022) http://dx.doi.org/10.1016/j.
[26] L. Gusmao, H. Helito, T. Anarelli, J.R. Conceicao, T. Ji, G. Barros, A customer dajour.2022.100122.
journey mapping approach to improve CPFL energia fraud detection predictive [51] F. Taymouri, M. La Rosa, Sarah Erfani, Zahra Dasht Bozorgi, Ilya Verenich,
models, 2020, http://dx.doi.org/10.1109/TDLA47668.2020.9326214. Predictive business process monitoring via generative adversarial nets: The
[27] S. Lee, M. Comuzzi, N. Kwon, Exploring the suitability of rule-based classification case of next event prediction, in: International Conference on Business Process
to provide interpretability in outcome-based process predictive monitoring, Management, in: LNCS, Vol. 12168, 2020, pp. 237–256, http://dx.doi.org/10.
Algorithms 15 (6) (2022) 187, http://dx.doi.org/10.3390/a15060187. 1007/978-3-030-58666-9_24.
[28] C. Di Francescomarino, M. Dumas, F.M. Maggi, I. Teinemaa, Clustering-based [52] H. Xu, J. Pang, X. Yang, M. Li, D. Zhao, Using predictive process monitoring
predictive process monitoring, IEEE Trans. Serv. Comput. 12 (6) (2016) http: to assist thrombolytic therapy decision-making for ischemic stroke patients,
//dx.doi.org/10.1109/TSC.2016.2645153. BMC Med. Inform. Decis. Mak. 20 (2020) http://dx.doi.org/10.1186/s12911-
[29] M. Rafiei, W.M.P. van der Aalst, Group-based privacy preservation techniques 020-1111-6.
for process mining, Data Knowl. Eng. 134 (2021) http://dx.doi.org/10.1016/j. [53] W. Rizzi, C. Di Francescomarino, F.M. Maggi, Explainability in predictive process
datak.2021.101908. monitoring: When understanding helps improving, in: Lecture Notes in Business
[30] C. Di Francescomarino, C. Ghidini, A.I. Apr, Incremental predictive process Information Processing, vol. 392, 2020, http://dx.doi.org/10.1007/978-3-030-
monitoring : How to deal with the variability, 2018, arXiv Prepr. arXiv. 58638-6_9, LNBIP.
[31] J. Wang, Dongjin Yu*, Chengfei Liu, Xiaoxiao Sun, Predicting Outcomes of [54] S. Pauwels, T. Calders, Bayesian network based predictions of business processes,
Business Process Executions Based on LSTM Neural Networks and Attention 2020.
Mechanism, 2021. [55] N. Tax, I. Verenich, M. La Rosa, M. Dumas, Predictive business process mon-
[32] V. Pasquadibisceglie, A. Appice, G. Castellano, D. Malerba, G. Modugno, Or- itoring with LSTM neural networks, in: International Conference on Advanced
ange: Outcome-oriented predictive process monitoring based on image encoding Information Systems Engineering, Vol. 3, 2017, pp. 477–492, http://dx.doi.org/
and CNNs, IEEE Access 8 (2020) http://dx.doi.org/10.1109/ACCESS.2020. 10.1007/978-3-319-59536-8.
3029323. [56] S. Dadashnia, J. Evermann, P. Fettke, P. Hake, N. Mehdiyev, T. Niesen,
[33] S. Weinzierl, V. Wolf, T. Pauli, D. Beverungen, M. Matzner, Detecting temporal Identification of Distinct Usage Patterns and Prediction of Customer Behavior,
workarounds in business processes–A deep-learning-based method for analysing 2016.
event log data, J. Bus. Anal. 5 (1) (2022) http://dx.doi.org/10.1080/2573234X. [57] G. Bernard, P. Andritsos, A process mining based model for customer journey
2021.1978337. mapping, in: CEUR Workshop Proceedings, Vol. 1848, 2017, pp. 49–56.
[34] R. Šperka, M. Halaška, The performance assessment framework (PPAFR) for RPA [58] I. Teinemaa, M. Dumas, M. La Rosa, F.M. Maggi, Outcome-oriented predictive
implementation in a loan application process using process mining, Inf. Syst. process monitoring: Review and benchmark, ACM Trans. Knowl. Discov. Data
e-Bus. Manag. (2022) http://dx.doi.org/10.1007/s10257-022-00602-2. 13 (2) (2019) http://dx.doi.org/10.1145/3301300.
[35] M. Polato, A. Sperduti, A. Burattin, M. de Leoni, Time and activity sequence [59] U. Singh, A. Muzaffar, R. Vyas, O.P. Vyas, Improving event log quality using
prediction of business process instances, Computing 100 (9) (2018) http://dx. autoencoders and performing quantitative analysis with conformance checking,
doi.org/10.1007/s00607-018-0593-x. 2023, http://dx.doi.org/10.1109/Confluence56041.2023.10048805.
[36] B.R. Gunnarsson, S.K.L.M. vanden Broucke, J. De Weerdt, Predictive process [60] A. Senderovich, C. Di Francescomarino, F.M. Maggi, From knowledge-driven to
monitoring in operational logistics: A case study in aviation, in: Lecture Notes in data-driven inter-case feature encoding in predictive process monitoring, Inf.
Business Information Processing, vol. 362, 2019, http://dx.doi.org/10.1007/978- Syst. 84 (2019) http://dx.doi.org/10.1016/j.is.2019.01.007.
3-030-37453-2_21, LNBIP. [61] N. Tax, I. Verenich, M. La Rosa, M. Dumas, Predictive business process mon-
[37] A.E. Márquez-Chamorro, M. Resinas, A. Ruiz-Cortés, M. Toro, Run-time predic- itoring with LSTM neural networks, in: Lecture Notes in Computer Science
tion of business process indicators using evolutionary decision rules, Expert Syst. (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in
Appl. 87 (2017) http://dx.doi.org/10.1016/j.eswa.2017.05.069. Bioinformatics), vol. 10253, 2017, http://dx.doi.org/10.1007/978-3-319-59536-
[38] N. Mehdiyev, P. Fettke, Explainable artificial intelligence for process mining: 8_30, LNCS.
A general overview and application of a novel local explanation approach for [62] C. Seger, An investigation of categorical variable encoding techniques in machine
predictive process monitoring, Stud. Comput. Intell. 937 (2021). learning: binary versus one-hot and feature hashing, Degree Proj. Technol.
[39] I. Costache, Dirk. Fahland, A Process-Aware Perspective on the Use of the (2018).
Performance Spectrum in Predictive Process Monitoring of Business Processes, [63] A. Leontjeva, R. Conforti, C. Di Francescomarino, M. Dumas, F.M. Maggi,
2021. Complex symbolic sequence encodings for predictive monitoring of business
[40] A.R.C. Maita others, A systematic mapping study of process mining, Enterprise processes, in: Lecture Notes in Computer Science (Including Subseries Lecture
Inf. Syst. 12 (5) (2018) http://dx.doi.org/10.1080/17517575.2017.1402371. Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9253,
[41] F. Folino, G. Greco, A. Guzzo, L. Pontieri, Mining usage scenarios in business 2015, http://dx.doi.org/10.1007/978-3-319-23063-4_21.
processes: Outlier-aware discovery and run-time prediction, Data Knowl. Eng. 70 [64] M. Dees, B. van Dongen, BPI challenge 2016, in: 4TU, Centre for Research Data,
(12) (2011) http://dx.doi.org/10.1016/j.datak.2011.07.002. 2016, Dataset, https://www.win.tue.nl/bpi/doku.php?id=2016:challenge.
[42] M. Hashemzadeh, B. Adlpour Azar, Retinal blood vessel extraction employing [65] L. Blevi, L. Delporte, J. Robbrecht, Process mining on the loan application
effective image features and combination of supervised and unsupervised ma- process of a Dutch Financial Institute BPI Challenge 2017, 2017, pp. 1–
chine learning methods, Artif. Intell. Med. 95 (2019) http://dx.doi.org/10.1016/ 33, [Online]. Available: https://home.kpmg.com/be/en/home/insights/2017/09/
j.artmed.2019.03.001. process-mining.html.
[43] D.A. Reid, S. Samangooei, C. Chen, M.S. Nixon, A. Ross, Soft biometrics for [66] Y. Wang, G. Zacharewicz, M.K. Traore, D. Chen, A tool for mining discrete event
surveillance: An overview, in: Handbook of Statistics, Vol. 31, 2013. simulation model, 2017, http://dx.doi.org/10.1109/WSC.2017.8248027.
[44] C.A.L. Amaral, M. Fantinato, H.A. Reijers, S.M. Peres, Enhancing completion time [67] B. Krawczyk, Learning from imbalanced data: open challenges and future
prediction through attribute selection, in: Lecture Notes in Business Information directions, Progr. Artif. Intell. 5 (4) (2016) http://dx.doi.org/10.1007/s13748-
Processing, vol. 346, 2019, http://dx.doi.org/10.1007/978-3-030-15154-6_1. 016-0094-0.
