A Multiperspective Fraud Detection Method For Multi-Participant E-Commerce Transactions
Abstract—The detection and prevention of fraudulent transactions on e-commerce platforms have always been a focus of transaction security systems. However, because of the concealed nature of e-commerce, it is not easy to capture attackers based solely on historical order information. Many studies have tried to develop fraud prevention technologies, but they have not considered the dynamic behaviors of users from multiple perspectives, which leads to inefficient detection of fraudulent behaviors. To this end, this paper proposes a novel fraud detection method that integrates machine learning and process mining models to monitor user behaviors in real time. First, we establish a process model of the B2C e-commerce platform that incorporates the detection of user behaviors. Second, we present a method for analyzing abnormalities that extracts important features from event logs. Then, we feed the extracted features to a Support Vector Machine (SVM) based classification model that detects fraudulent behaviors. Experiments demonstrate the effectiveness of our method in capturing dynamic fraudulent behaviors in e-commerce systems.

Keywords—Fraud detection; Electronic transaction; Petri net; Machine learning

This work is supported in part by the Natural Science Foundation of Shaanxi Province under Grants 2021JM-205, National Natural Science Foundation of China under Grant 52172325, and in part by the fundamental research funds for the central universities under Grant 300102242902. (Corresponding Authors: YaDi Wang and Lu Liu).
W. Yu and Y. Wang are with the Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi’an 710062, China, and also with the School of Computer Science, Shaanxi Normal University, Xi’an 710119, China (E-mail: [email protected] and [email protected]).
L. Liu, B. Yuan and J. Panneerselvam are with the School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, U.K. (E-mail: [email protected], [email protected] and [email protected]).
Y. An is with the School of Information Engineering, Chang’an University, Xi’an, China (E-mail: [email protected]).
Manuscript received ***, 2022; revised ***, 2022.

I. INTRODUCTION

WITH the increasing popularity of e-commerce platforms, more and more commercial transactions now rely on web-based systems rather than the traditional cash-based approach [1]. Although the real economy has been greatly impacted by the COVID-19 epidemic in recent years, e-commerce has remained largely unaffected by the pandemic, which has supported steady market growth [2]. The sales volume of B2C (Business to Customer) e-commerce is expected to reach 6.5 trillion dollars by 2023 [3].

Though the growth of e-commerce and the expansion of modern technologies offer better opportunities for online businesses, new security threats have emerged over the past few years. Reportedly, the significant increase in the number of online fraud cases costs billions of dollars worldwide every year [4]. The dynamic and distributed nature of the Internet has made anti-fraud systems indispensable for ensuring the security of online transactions. Existing fraud detection systems that focus on detecting abnormal user behaviors still exhibit vulnerabilities when mitigating emerging security threats. An important issue in existing fraud detection systems is their lack of efficient process management during the trading process. The imperfect process monitoring function is one of the key issues that need attention [5]. Because existing work does not capture the process, its detection perspective is usually insufficient. To this end, we propose a process-based method in which user behaviors are recorded and analyzed in real time and historical data is transformed into controllable data. In addition, we incorporate multi-perspective detection of abnormal behaviors.

This paper combines the advantages of process mining and machine learning models by introducing a hybrid method to solve anomaly detection in data flows, which provides information about each action embedded in a control flow model. By modeling and analyzing the business process of the e-commerce system, this method can dynamically detect changes in user behaviors, transaction processes, and non-compliance situations, and can comprehensively analyze and identify fraudulent transactions from multiple perspectives. The main contributions of this paper are as follows:
1) A conformance checking method based on process mining is applied in the field of e-commerce transactions to capture abnormalities.
2) A user behavior detection method is proposed to perform comprehensive anomaly detection based on Petri nets.
3) An SVM model is developed by embedding multi-perspective process mining into machine learning methods to automatically classify fraudulent behaviors.

The rest of this paper is organized as follows: Section 2 introduces the related work. Section 3 presents a model analysis and a background study. Section 4 forms the theoretical basis and describes our proposed fraud detection method. Section 5 presents and discusses the results of our experiments and Section 6 validates our proposed fraud detection method. Section 7 concludes the paper and outlines our future research directions.

II. RELATED WORK

Existing fraud detection methods are categorized into non-formal approaches such as machine learning, and formal approaches such as process mining.
The machine-learning-based methods learn from previously obtained historical data to perform classifications
> IEEE TRANSACTION ON COMPUTATIONAL SOCIAL SYSTEMS, VOL., NO., 2
or predictions of future observations to identify potentially risky offline or online transactions [6]. Xuetong Niu et al. conducted a comparative study on credit card fraud detection methods that rely on machine-learning algorithms. Most of the machine-learning models perform well on the dataset of credit card transactions. Moreover, supervised models perform slightly better than unsupervised models after additional pre-processing, such as removing outliers [7]. Credit card fraud detection is widely deployed at the application layer, where it detects fraud by discovering specific abnormal user behaviors. Supervised learning is the most commonly used learning method for monitoring fraud in online transactions, since it has higher accuracy and coverage. Recent research in [8, 9] has shown that machine learning can efficiently capture fraudulent transactions in credit card applications. Fraudsters often change their behavioral patterns dynamically to evade existing fraud detection methods. In online credit card fraud detection, SVM can classify user behaviors under complex scenarios and deliver reliable results [10]. Many researchers combine multiple detection methods for comprehensive fraud detection. For example, focusing on payment fraud applications, Dahee Choi et al. proposed a method that combines supervised and unsupervised learning [11]. Most of the machine-learning-based methods use historical data to analyze fraudulent transactions. They have not given enough emphasis to the transactional process flow and dynamic user behaviors.

The second type of fraud detection method uses process mining, which focuses on extracting knowledge from existing event logs in information systems for the purpose of monitoring and improving the operational processes in business IT infrastructure [12]. Process mining specializes in comparing the event log with an established model to detect, locate, and interpret the deviations between the established model and the actual event log [13].

Process mining can detect a large number of abnormal transactions that traditional methods are not known to identify. M. Jans et al. proposed the emerging process mining approach as an appropriate solution for mitigating internal transaction fraud [14]. For example, C. Rinner et al. applied conformance checks to monitor the process of melanoma patients [15]. Asare et al. applied alignment and replay to check the conformance of the electronic medical record log and the hospital workflow model [16]. Research has focused on monitoring and evaluating the sequence of processes occurring in the historical medical event log by establishing corresponding training and testing models for conformance checking [17]. Tools such as ProM, Disco and Heuristics Miner are widely used for conformance checking. Process mining can be an efficient approach for fraud detection.

In particular, it is important to be dynamic and multi-perspective when detecting fraudulent user behaviors [18]. Process mining helps to compare the actual data against the
perspective of control flow including time and resources [20]. Febriyanti et al. [21] treated any noticeable change in business processes as suspected fraudulent behavior and proposed a method to detect suspicious abnormal behaviors using a hybrid of association rules and process mining. Previous research on using process mining to detect fraudulent transactions showed that process mining is capable of detecting fraudulent transactions, and it can effectively prevent audit fraud at a much earlier stage due to the continuous monitoring nature of event logs [22].

In conclusion, many of the existing machine learning methods only consider static user behaviors based on their occurrence rate. Only a very few studies have investigated real-time, dynamic, and multi-perspective factors of user behaviors in the e-commerce transaction process, which offer greater control of the entire transaction process. A detection system based on process mining can record and analyze the changes in user behaviors and their preferences in a timely manner. However, the analysis of complex details increases the number of variables or factors that should be considered, which makes the detection model more complex.

III. MODEL ANALYSIS

An e-commerce platform is an information interaction platform that provides online transactions for enterprises and/or individuals. The coverage rate of B2C (Business to Customer) e-commerce platforms is higher than that of other e-commerce platforms, and B2C has become the mainstream model of e-commerce in China [23]. The recent market trend of e-commerce has given rise to various types of electronic payment systems. Third-party payment platforms supervise and restrict both the buyers and merchants within the terms of the transaction, thereby ensuring the legitimate rights and interests of both buyers and sellers. The process of e-commerce transactions is abstracted and the process flow is established as follows.

A. Process analysis

In a typical B2C process, buyers, e-commerce platforms, and sellers interact with each other. As shown in Fig. 1, the electronic transaction process encompasses five different participants including Seller, Buyer, the third-party cashier TP, the B2C trading platform BCS (Buyer and Seller Server), and the cashier server CS. This paper summarizes the transaction payment process as follows:

[Fig. 1: message sequence among TP-SDK, Buyer, BCS, Seller, and CS. Messages: 1. Binfo, OPT, OrderC; 2. OrderC; 3. Minfo; 4. OrderF; 5-8. TN; 9. Pay request]
1) After the Buyer logs in, the Buyer performs a series of operations on the user client device to purchase goods or services. The BCS generates a commodity order orderC according to the products or services that are purchased. The commodity order orderC is then passed to the Seller through the e-commerce platform.
2) The Seller makes a decision based on the information in orderC and passes its willingness on to the platform BCS. If the order is rejected, the Buyer returns to the user operation process; if the order is accepted, the system establishes a pre-payment formal order orderF, which contains the detailed purchase information of the user.
3) After the payment is completed, CS signs the formal order orderF. Then CS sends two payment completion notifications NTF, one to notify TP and the other to notify the Seller. The Buyer’s click triggers the UpdateOrderStatus function; in other words, the order status is updated. Afterward, the paid order information orderF’ is generated.
4) The Seller checks the order invoice against the payment status. After that, the BCS checks the order and the current transaction details.
5) To notify the CS of the upcoming payment, CS generates a unique transaction number (TN) for the payment information, and then CS passes this transaction code to the platform server BCS.
6) After the Seller receives the TN, it signs the TN and passes it to the buyer client. At this time, the Buyer can confirm the order payment information and enter the password; if it is normal, the TP makes the payment and sends the payment request command to the CS.

B. Fraud mode analysis

To capture fraudulent behaviors effectively, we define some common fraud modes [23][24] and abstract them as follows.
1) Fraud mode one - an order is tampered with by a malicious actor:
The malicious actor may deceive the victim merchant by sending a fake formal payment order orderFA to the cashier server. By tampering with the order information, such as the total amount, the malicious actor obtains order items that do not match the payment value.
2) Fraud mode two - subcontract the order:
The victim pays the malicious actor’s order instead of their own order. To achieve their goals, the malicious actors assume the roles of both sellers and buyers. The order information changes before and after the payment.
3) Fraud mode three - send fake notifications:
During such attacks, the malicious actor submits an order and, instead of paying for it, sends a fake payment result notification to notify the seller that the order has been successfully paid.
4) Fraud mode four - paying a cheap order to get expensive goods:
First, the malicious actor submits a cheap order as an ordinary buyer and then submits an expensive order without paying for it, so the system marks the expensive order as “pending”. At this point, the malicious actor replaces the paid order with the current order.

IV. ANALYTICAL METHODS

Fig. 2 depicts the framework of our detection method. Firstly, the transaction event log is filtered and cleaned, and a database of user behavior modes is constructed in the data preparation stage. Secondly, we analyze the control flow, resources, throughput time, data flow, and user behavior in the event logs, and extract the abnormalities of each transition from different perspectives as the training features for fraud detection. Then machine-learning algorithms are implemented, and finally, an SVM model is built to classify fraudulent transactions. Our proposed fraud detection method is introduced in detail as follows.

[Figure: pipeline from the transaction data set through control flow analysis and resource analysis to feature extraction and SVM training]
Fig. 2. The proposed framework.

A. Theoretical basis

An event log is made of multiple traces. Each trace represents the life cycle of one case [25] and is composed of case, event, timestamp, action, and resource. This section establishes the link between the actual actions recorded in the event log and the actions of the model. When a real-life event log is replayed on the process model, some transitions are introduced for routing purposes rather than for representing actual work [26]. We only consider actions of practical significance, using a labeled Petri net defined as follows.

Definition 1. System Net [26]
SN is a system net SN=(lPN, Minit, Mfinal), where lPN=(P, T, F, l) is a Petri net with a labeling function. With UA defined as the universe of action labels, the labeling function is formally defined as l∈T↛UA. Minit is the initial marking and Mfinal is the final marking, which describe the tokens contained in the Petri net.

Definition 2. Data Petri net DPN=(SN, V, U, R, W, G) in which:
● SN is a system net SN=(lPN, Minit, Mfinal) based on lPN=(P, T, F, l);
● V is a set of data variables that are used in the transitions;
● U is a function that defines the range of each variable, i.e., Dv is the domain of values of variable v, and the value of every variable must be within the defined range. For each variable v∈V, U(v)=Dv;
● R is a read function R∈T → ρ(V), which indicates the set of variables that should be read for each transition;
● W is a write function W∈T → ρ(V), which indicates the set of variables that should be written for each transition;
● G is a guard function G∈T→(VW∪VR), which is represented by combination rules over read variables and write variables such that for any transition t∈T and any variable v∈V, if vr appears in G(t), then v∈R(t), and if vw appears in G(t), then v∈W(t);

(DPN, (M, s))[b> (M′, s′) denotes that the binding b enabled in (M, s) may occur, resulting in (M′, s′) after the occurrence. It represents the transition of a net system from one state to another. In a DPN, firing a transition updates the variable assignment with the newly written values, i.e., $s_{new} = s \oplus w$, where $s_{new}(v) = w(v)$ for all $v \in write(t)$, and $s_{new}(v) = s(v)$ for all $v \in V \setminus write(t)$.

Definition 3. Trace and event logs [26]
UVN is the universe of variable names, UVV is the universe of variable values, and UVM is the set of partial mappings from variable names to values, i.e., $U_{VM} = U_{VN} \nrightarrow U_{VV}$;
A trace, which is defined as a sequence of actions with input and output data, can be represented as δ ∈ (UA×UVM×UVM)*. In the same way, an event log is a multiset of traces, which can be expressed as L∈ Ɓ((UA×UVM×UVM)*).

Definition 4. Cost function with optimal alignment
sM is the Data Petri net model, and sL is the event log; γ is defined as the alignment result of sM and sL. To quantify the degree of deviation, a cost function $\kappa \in \Sigma \to \mathbb{R}^{+}_{0}$ assigns a cost to the moves in the alignment result. For all $(s_L, s_M) \in \Sigma$, if $s_L = \text{>>}$ or $s_M = \text{>>}$, then $\kappa(s_L, s_M) = 1$; otherwise, $\kappa(s_L, s_M) = 0$. The sequence cost is the sum of the costs of the individual moves in the sequence, i.e., $K(\gamma) = \sum_{(s_L, s_M) \in \gamma} \kappa(s_L, s_M)$. Among all alignment results $\gamma'$ of the event log and the Data Petri net model, an optimal alignment $\gamma$ satisfies $K(\gamma) \le K(\gamma')$.

B. Multi-perspective conformance checking

After the rules are formally defined, conformance checking is used to detect abnormalities. Conformance checking requires an alignment of the event log L and the process model DPN, that is, an alignment of each single trace δ∈L with the process model DPN.

The event log of the system records detailed information such as the occurrence time, executor, and interaction data of each action. Through conformance checking, special trajectories that do not match the trajectories of commonly occurring actions are identified, so that anomalies can be initially detected from a single perspective. For some special cases, such as malicious actors fraudulently using legal accounts to conduct illegal operations or even fraudulent actions, comprehensive analysis and judgment should be carried out by combining the inspection results from multiple perspectives. In this paper, any trace in the event log that does not conform to the model is suspected of being anomalous, and the following definition is adopted from [26].

Definition 5. Deviations between the event log and process model
A tuple (act, r, w, res, time) is defined, where r is the read variable of the action act, w is its write variable, res is its resource attribute, and time is its throughput time. The traces in event logs are represented as SL=UA×UVM×UVM, and the traces of the Data Petri net can be represented as SDPN=T×UVM×UVM. According to the definition in [26], “>>” means that there is no corresponding move. We use this symbol to indicate occurrences of deviations. To replay the event log in the model, different types of deviations are defined as follows:
● Deviation only in log: {sL=(l(t), r, w, res, time)∈SL} ∩ {sM=>>};
● Deviation only in DPN model: {sM∈SDPN} ∩ {sL=>>};
● Deviation in both model and logs with correct data attributes: {sM=(t, r, w)∈SDPN} ∩ {sL=(l(t), r, w)};
● Deviation in both model and logs with incorrect data attributes: {sM=(t, r, w)∈SDPN} ∩ {sL=(l(t), rl, wl)} ∩ {sL=(r≠rl | w≠wl)};
● Deviation in resource: sL(res) ≠ sM(res);
● Deviation in time: sL(time) = unqualified;
● All other deviations are considered abnormal.

The identification of unqualified traces is valuable [25]. The focus of our analysis is to obtain the specific meaning of the points that do not conform to the guards, and the information that is hidden in the abnormal points. When multiple control-flow alignments exist, the optimal alignment γ is selected. Fig. 3 shows a Petri net model mined from a set of event logs. Four deviations exist between traces in the event log and traces in the model, which are represented as grey areas in Table Ⅰ. Following the path of the Petri net model from the control flow perspective, t0 occurs twice in the event log; this log-only deviation indicates a redundant action. After the occurrence of t3, t1 follows rather than t0; this model-only deviation indicates that an action has been skipped. The throughput time of action t1 in the 5th line does not meet the threshold requirement. Action t0 has a deviation in the resource, presented in the 6th line of Table I.

[Figure: Petri net with places p0, p1, p2, p3 and transitions t0, t1, t2, t3. Example of event logs: t0,t2; t0,t1,t2; t0,t1,t3,t0,t2; t0,t3,t0,t1,t2; t0,t3,t0,t1,t3,t0,t2; t0,t1,t3,t0,t3,t0,t2; ……]
Fig. 3. The deviations example
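The cost function of Definition 4 and the move-based deviation types of Definition 5 can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the names `move_cost`, `alignment_cost`, `optimal_alignment`, and `classify_move` are hypothetical, the example trace and the two candidate alignments are written by hand rather than computed by replay, and the resource and time perspectives are left out.

```python
from typing import Dict, List, Tuple, Union

NO_MOVE = ">>"  # "no corresponding move", as in Definition 5

# A step is either NO_MOVE or (action label, read values, write values).
Step = Union[str, Tuple[str, Dict[str, object], Dict[str, object]]]
Move = Tuple[Step, Step]  # (log step sL, model step sM)


def move_cost(log_step: Step, model_step: Step) -> int:
    """kappa(sL, sM): 1 if either side has no corresponding move, else 0."""
    return 1 if log_step == NO_MOVE or model_step == NO_MOVE else 0


def alignment_cost(alignment: List[Move]) -> int:
    """K(gamma): sum of the costs of the individual moves."""
    return sum(move_cost(log_step, model_step) for log_step, model_step in alignment)


def optimal_alignment(candidates: List[List[Move]]) -> List[Move]:
    """Pick a candidate alignment with minimal cost."""
    return min(candidates, key=alignment_cost)


def classify_move(log_step: Step, model_step: Step) -> str:
    """Name the deviation type of one move (control flow and data only)."""
    if model_step == NO_MOVE:
        return "only in log"
    if log_step == NO_MOVE:
        return "only in model"
    _, read_log, write_log = log_step
    _, read_model, write_model = model_step
    if read_log != read_model or write_log != write_model:
        return "both, incorrect data"
    return "both, correct data"


# Hypothetical trace <t0, t0, t2> replayed on an assumed model path <t0, t2>.
t0: Step = ("t0", {}, {})
t2: Step = ("t2", {}, {})
good = [(t0, t0), (t0, NO_MOVE), (t2, t2)]
bad = [(t0, NO_MOVE), (t0, NO_MOVE), (t2, NO_MOVE), (NO_MOVE, t0), (NO_MOVE, t2)]

best = optimal_alignment([bad, good])
print(alignment_cost(best))                    # cost of the cheaper alignment
print([classify_move(l, m) for l, m in best])  # deviation type per move
```

In this toy case the repeated t0 appears as a single log-only move, which corresponds to the redundant action discussed for Fig. 3. A real conformance checker would search the space of alignments instead of choosing between two hand-written candidates.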
“process data1” processes the order data. If the arc function of “risk4” is satisfied, a new token is generated in the place “risk4”; in the same way, if the arc function pointing to “risk5” is satisfied, a new token is generated in the place “risk5”. Abnormal order information can now be extracted from the tokens. In summary, the CPN model realizes the functions of processing and detecting order information.

D. User operation behavior detection

Next, we add the data flow perspective to the above detection method, which integrates the function of user behavior detection. Buyer behavior analysis can be divided into two parts: static attributes and dynamic behavior [28]. The user’s static attribute data used in this paper mainly includes IP address, login time, and operation duration. We use the Apriori algorithm [29] to obtain the normal patterns
represented as (D, OPT_type, Operation, recursive_correlation_function). The similarity of action D is obtained after processing using the recursive correlation function [31]. A predefined threshold is used to determine whether the current user behavior is abnormal or not. Similarly, a set of valid bindings for static attributes can be represented as (U, Static_User_Pattern, Static_att, Full_sequence_comparison). It indicates that the full sequence comparison method [31] is used to compare the current static attribute with the static attribute pattern. Combined with the process, the matching similarity Static_att is used as the w of the action, and the judgment can be made according to the threshold value.
    end if
    if {{sM=(tx, r, w)∈SDPN} ∩ {sL=(l(tx), rl, wl)∈SL} ∩ {r≠rl | w≠wl}} {sL(Guards(tx))=False}, then
        Bd[tx]=1
    else
        Bd[tx]=0
    end if
    // Step 3: For the detection result of order information, the function of the CPN model corresponds to transition t28 in the DPN model; the event log SL is input into the model to obtain the result.
    if there exists a token in the end places riski, i∈[1,5], then
        the abnormal conditions corresponding to riski of the end places with tokens are recorded in the sequence Bd[txi]: riski→Bd[t28i]=1
    else
        riski→Bd[t28i]=0
    end if
    if the static attribute of the user does not meet the threshold, then
        Bd[t29]=1
    else
        Bd[t29]=0
    end if
    if the user dynamic behavior does not meet the threshold, then
        Bd[t4]=1
    else
        Bd[t4]=0
    end if
    return B[tx]

F. Fraud detection based on SVM

The evaluation of a single perspective is relatively one-sided and cannot accurately determine whether the current transaction is fraudulent or not. Therefore, it is very important to integrate the detection results of each perspective to evaluate the transaction’s status as a whole. Next, the multi-perspective detection results are used as the features, and SVM is used to learn from these features and to integrate them for evaluating, as a whole, whether the current transaction is fraudulent.

1) Classification problem and SVM [32]
The problem of fraud detection is essentially a binary classification problem, which can be solved by a classification model. The binary classification problem is a process in which a classification function judges whether the input data belongs to the positive class or the negative class. The mathematical definition is as follows:

$h(x) = p(y = 1 \mid x), \quad y = 0 \text{ or } 1$    (1)

where x represents the input data, y represents the class of the input data, and h(x) is the classification function.

$loss = \sum_{i=1}^{N} \max\left(0,\ 1 - y_i(\omega^{T} x_i + b)\right) + \lambda \lVert \omega \rVert^{2}$    (2)

where $x_i$ is the feature vector of the i-th sample; $y_i$ is the label of the i-th sample; $\omega$ is the weight parameter; b is the bias parameter; $\lambda$ is the regularization coefficient.

Through learning from the dataset and updating the weights, the loss function of the SVM model gradually decreases and finally converges. After the above process, the SVM model is successfully constructed and used for prediction. By taking the features of the current user as input, the SVM model classifies whether the current transaction behaviors are fraudulent or not.

2) Feature selection for anomaly detection
In the process of multi-perspective detection, each perspective gives an inference about whether the current transaction is fraudulent. The SVM model takes the detection inferences of these perspectives as features. We obtain 82 anomaly detection features from the Data Petri net and the data mining process. These features indicate, from multiple perspectives, whether the current transaction is abnormal, and they are used as the feature vectors of the SVM model. Some of the features and their meanings are shown in Table Ⅳ. The features are the control flow analysis results of 20 actions, the time flow analysis results of 20 actions, the resource flow analysis results of 20 actions, and the data flow analysis results of 22 actions.

TABLE Ⅳ
EXAMPLES OF FEATURES
Feature | Feature name | Meaning
X1 | Control flow analysis result of transition A | Whether the control flow of transition A is abnormal
X2 | Control flow analysis result of transition B | Whether the control flow of transition B is abnormal
X27 | Time flow analysis result of transition H | Whether the time flow of transition H is abnormal
X45 | Resource analysis result of transition F | Whether the resource flow of transition F is abnormal
X62 | Data flow analysis result of transition D | Whether the user’s operation behavior is abnormal
X82 | Data flow analysis result of transition U | Whether the user’s static attributes are abnormal

V. PROCESS MINING EXPERIMENT RESULTS

This paper uses the process-mining tool ProM Lite 1.2 as the experimental platform [33]. The data flow experiments and the multi-view fusion experiments use Python 3.7 and the machine learning framework scikit-learn 0.22.
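As an illustration of the loss in (2), the following standard-library Python sketch evaluates the regularized hinge loss on a toy set of binary anomaly flags. It is a sketch under stated assumptions rather than the experimental code: the function name `hinge_loss`, the three-feature samples, the weights, and the regularization value are hypothetical, and the labels are assumed to be encoded as -1 (normal) and +1 (fraudulent), which is the encoding the hinge term requires.

```python
from typing import Sequence


def hinge_loss(
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    w: Sequence[float],
    b: float,
    lam: float,
) -> float:
    """Sum of max(0, 1 - y_i (w . x_i + b)) over all samples, plus lam * ||w||^2."""
    data_term = 0.0
    for x, y in zip(xs, ys):
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        data_term += max(0.0, 1.0 - margin)
    reg_term = lam * sum(wj * wj for wj in w)
    return data_term + reg_term


# Toy data: three anomaly flags per transaction (hypothetical values).
features = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
labels = [1, -1, 1]          # assumed encoding: +1 fraudulent, -1 normal
weights = [0.5, 0.5, 0.5]
bias = -0.5

loss = hinge_loss(features, labels, weights, bias, lam=0.1)
print(loss)
```

The first two samples lie inside the margin and each contributes 0.5 to the data term, while the third is classified with a margin of 1 and contributes nothing. Training lowers this value by adjusting the weights and the bias; in practice a library such as scikit-learn performs that optimization.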
K-fold cross-validation [37] is an effective way of verifying model performance. In this experiment, the value of k is 10, that is, the experiment is carried out through 10-fold cross-validation.

Fig. 12 (a) and (b) present the F1-score and AUC statistics of our proposed SVM-based fraud detection model, obtained from the 10-fold cross-validation. The blue curve is the score when only control flow is considered, the green curve is the score when only data flow is considered, and the red curve is the score under the fusion of multi-perspective features. As seen from Fig. 12, the F1-score under the fusion of control flow and data flow is higher than that obtained when only one type of data is considered; that is, when control flow and data flow are considered together, better user anomaly detection is obtained.

(b) AUC statistics
Fig. 12. F1-score and AUC statistics for three situations.

To further validate the fraud detection performance of our model under the three aforementioned cases, we consider various performance indicators over 50 rounds of tests and calculate their average values. The results are shown in Table Ⅶ. As seen from Table VII, when data flow and control flow features are integrated, the F1-score and AUC are both greater than those obtained when only one of the perspectives is considered. Each of these two kinds of feature data considers only one aspect of the user’s anomaly. After the machine-learning model learns both types of data, the information from the two aspects is fully utilized to detect user anomalies comprehensively and with better effect.

In summary, compared with considering only one perspective, our proposed model achieves higher F1-score and AUC values and detects abnormal e-commerce users more effectively. Therefore, our proposed method can detect abnormal e-commerce users more comprehensively. In addition, compared with related deep learning methods for fraud detection in e-commerce, our methodology can depict the transaction process and its structures, and it is interpretable.

REFERENCES
[1] R. A. Kuscu, Y. Cicekcisoy, and U. Bozoklu, Electronic Payment Systems in Electronic Commerce. Turkey: IGI Global, 2020, pp. 114–139.
[2] M. Abdelrhim and A. Elsayed, “The Effect of COVID-19 Spread on the e-commerce market: The case of the 5 largest e-commerce companies in the world,” Available at SSRN 3621166, 2020, doi: 10.2139/ssrn.3621166.
[3] P. Rao et al., “The e-commerce supply chain and environmental sustainability: An empirical investigation on the online retail sector,” Cogent Bus. Manag., vol. 8, no. 1, pp. 1938377, 2021.
[4] S. D. Dhobe, K. K. Tighare, and S. S. Dake, “A review on prevention of fraud in electronic payment gateway using secret code,” Int. J. Res. Eng. Sci. Manag., vol. 3, no. 1, pp. 602-606, Jun. 2020.
[5] A. Abdallah, M. A. Maarof, and A. Zainal, “Fraud detection system: A survey,” J. Netw. Comput. Appl., vol. 68, pp. 90-113, Apr. 2016.
[6] E. A. Minastireanu and G. Mesnita, “An Analysis of the Most Used Machine Learning Algorithms for Online Fraud Detection,” Info. Econ., vol. 23, no. 1, 2019.
[7] X. Niu, L. Wang, and X. Yang, “A comparison study of credit card fraud detection: Supervised versus unsupervised,” arXiv preprint arXiv:1904.10604, 2019, doi: 10.48550/arXiv.1904.10604.
[8] L. Zheng et al., “Transaction Fraud Detection Based on Total Order Relation and Behavior Diversity,” IEEE Trans. Computat. Social Syst., vol. 5, no. 3, pp. 796-806, 2018.
[9] Z. Li, G. Liu, and C. Jiang, “Deep Representation Learning With Full Center Loss for Credit Card Fraud Detection,” IEEE Trans. Computat. Social Syst., vol. 7, no. 2, pp. 569-579, 2020.
[10] I. M. Mary and M. Priyadharsini, “Online Transaction Fraud Detection System,” in 2021 Int. Conf. Adv. C. Inno. Tech. Engr. (ICACITE), 2021, pp. 14-16.
[11] D. Choi and K. Lee, “Machine learning based approach to financial fraud detection process in mobile payment system,” IT Conv. P. (INPRA), vol. 5, no. 4, pp. 12-24, 2017.
[12] R. Sarno et al., “Hybrid Association Rule Learning and Process Mining for Fraud Detection,” IAENG Int. J. C. Sci., vol. 42, no. 2, 2015.
[13] J. J. Stoop, “Process mining and fraud detection-A case study on the theoretical and practical value of using process mining for the detection of fraudulent behavior in the procurement process,” M.S. thesis, Netherlands, ENS: University of Twente, 2012.
[14] M. Jans et al., “A business process mining application for internal transaction fraud mitigation,” Expert Syst. Appl., vol. 38, no. 10, pp. 13351-13359, 2011.
[15] C. Rinner et al., “Process mining and conformance checking of long running processes in the context of melanoma surveillance,” Int. J. Env. Res. Pub. He., vol. 15, no. 12, pp. 2809, 2018.
[16] E. Asare, L. Wang, and X. Fang, “Conformance Checking: Workflow of Hospitals and Workflow of Open-Source EMRs,” IEEE Access, vol. 8, pp. 139546-139566, 2020.
[17] W. Chomyat and W. Premchaiswadi, “Process mining on medical treatment history using conformance checking,” in 2016 14th Int. Conf. ICT K. Eng. (ICT&KE), 2016, pp. 77-83.
[18] M. D. Leoni, W. M. Van Der Aalst, and B. F. V. Dongen, “Data-and resource-aware conformance checking of business processes,” in Int. Conf. Bus. Info. Sys., Springer, Berlin, Heidelberg, 2012, pp. 48-59.
[19] S. M. Najem and S. M. Kadeem, “A survey on fraud detection techniques in ecommerce,” Tech-Knowledge, vol. 1, no. 1, pp. 33-47, 2021.
[20] K. Böhmer and S. Rinderle-Ma, “Anomaly detection in business process runtime behavior--challenges and limitations,” arXiv preprint arXiv:1705.06659, 2017, doi: 10.48550/arXiv.1705.06659.
[21] K. D. Febriyanti, R. Sarno and Y. Effendi, “Fraud detection on event logs using fuzzy association rule learning,” in 2017 11th Int. Conf. Info. Comm. Tech. Sys., Surabaya, Indonesia, 2017, pp. 149-154.
[22] T. Chiu, Y. Wang and M. Vasarhelyi, “A framework of applying process mining for fraud scheme detection,” SSRN Electronic Journal, 2017, doi: 10.2139/ssrn.2995286.
[23] W. Yang et al., “Show Me the Money! Finding Flawed Implementations of Third-party In-app Payment in Android Apps,” in Proc. NDSS, Shanghai, China, 2017.
[24] W. Rui, S. Chen, X. Wang and S. Qadeer, “How to Shop for Free Online--Security Analysis of Cashier-as-a-Service Based Web Stores,” in Proc. SSP, Oakland, CA, USA, 2011, pp. 465-480.
[25] E. Ramezani, D. Fahland and W. Aalst, “Where did I misbehave? Diagnostic information in compliance checking,” in BPM, Berlin, Germany, Springer, 2012, pp. 262-278.
[26] M. Leoni, J. Munoz-Gama, J. Carmona and W. Aalst, “Decomposing alignment-based conformance checking of data-aware process models,” in Proc. OTM, Amantea, Italy, 2014, pp. 3-20.
[27] K. Jensen, “Coloured Petri Nets: A High Level Language for System Design and Analysis,” DAIMI Report Series, vol. 19, no. 338, pp. 342-416, Mar. 1993.
[28] B. Ji, H. Li, W. Han and Y. Jia, “Research on e-commerce-oriented user abnormal behaviour detection,” Netinfo Security, Sep. 2014.
[29] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,” in Proc. VLDB, S. F., USA, 1994, pp. 487-499.
[30] R. Srikant and R. Agrawal, “Mining sequential patterns: Generalizations and performance improvements,” in Proc. EDBT, Avignon, France, 1996, pp. 1-17.
[31] Y. Lian, Y. Dai and H. Wang, “Anomaly detection of user behaviors based on profile mining,” Chinese J. Computat-Ch., vol. 25, no. 3, pp. 325-330, Mar. 2002.
[32] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273-297, Sep. 1995.
[33] F. Yasmin, R. Bemthuis, M. Elhagaly, F. Wijnhoven and F. Bukhsh, “A Process Mining Starting Guideline for Process Analysts and Process Owners: A Practical Process Analytics Guide using ProM,” DSI technical report series, Jul. 2020.
[34] A. Adriansyah, “Replay a log on petri net for performance/conformance plug-in,” Technische Universiteit Eindhoven, 2012.
[35] F. Mannhardt, M. Leoni and H. Reijers, “The Multi-perspective Process Explorer,” BPM (Demos), vol. 1418, pp. 130-134, Aug. 2015.
[36] D. Chen, X. Liu, Y. Zhou, X. Yang, L. Lu and W. Xin, “Grid search as applied to the determination of Mark-Houwink parameters,” J. Appl. Polym. Sci., vol. 76, no. 4, pp. 481-487, 2015.
[37] J. Myerson, L. Green and M. Warusawitharana, “Area under the curve as a measure of discounting,” J. Exp. Anal. Behav., vol. 76, no. 2, pp. 235-243, Oct. 2001.
[38] L. Zheng, G. Liu, C. Yan, C. Jiang and M. Li, “Improved TrAdaBoost and its Application to Transaction Fraud Detection,” IEEE Trans. Computat. Social Syst., vol. 7, no. 5, pp. 1304-1316, Jul. 2020.
[39] J. Cui, C. Yan and C. Wang, “ReMEMBeR: Ranking Metric Embedding-Based Multicontextual Behavior Profiling for Online Banking Fraud Detection,” IEEE Trans. Computat. Social Syst., vol. 8, no. 3, pp. 643-654, Aug. 2021.
[40] Y. Xie, G. Liu, C. Yan, C. Jiang and M. Zhou, “Time-aware attention-based gated network for credit card fraud detection by extracting transactional behaviors,” IEEE Trans. Computat. Social Syst., early access, doi: 10.1109/TCSS.2022.3158318.
[41] Q. Yang, C. Wang, C. Wang, H. Teng and C. Jiang, “Fundamental Limits of Data Utility: A Case Study for Data-Driven Identity Authentication,” IEEE Trans. Computat. Social Syst., vol. 8, no. 2, pp. 398-409, Aug. 2021.
[42] J. Liang, Y. Tang, R. Hare, B. Wu and F. Wang, “A Learning-Embedded Attributed Petri Net to Optimize Student Learning in a Serious Game,” IEEE Trans. Computat. Social Syst., early access, doi: 10.1109/TCSS.2021.3132355.
[43] L. He, G. Liu and M. Zhou, “Petri-Net-Based Model Checking for Privacy-Critical Multiagent Systems,” IEEE Trans. Computat. Social Syst., early access, doi: 10.1109/TCSS.2022.3164052.
[44] F. Zhao, D. Xiang, G. Liu and C. Jiang, “A New Method for Measuring the Behavioral Consistency Degree of WF-Net Systems,” IEEE Trans. Computat. Social Syst., vol. 9, no. 2, pp. 480-493, Sep. 2022.
[45] G. J. Liu, Petri Nets: Theoretical Models and Analysis Methods for Concurrent Systems. Singapore, Singapore, Springer, Nov. 2022, pp. 123-165.

WangYang Yu received the Ph.D. degree from Tongji University, Shanghai, China, in 2014. He is currently an Associate Professor with the School of Computer Science, Shaanxi Normal University, Xi'an, China. His research interests include the theory of Petri nets, formal methods in software engineering, and artificial intelligence.

YaDi Wang is a postgraduate student with the School of Computer Science, Shaanxi Normal University, Xi'an, China. Her research interests include the theory of Petri nets, process mining, online transaction systems, formal methods in software engineering, and artificial intelligence.