Association Rule Approach for Evaluation of Business Intelligence for Enterprise
Systems
Article · February 2012
4 authors: Saeed Rouhani (University of Tehran), Mehdi Ghazanfari (Iran University of Science and Technology), Mostafa Jafari (University of Zanjan), Peyman Akhavan (Malek Ashtar University of Technology)
All content following this page was uploaded by Saeed Rouhani on 20 May 2014.
Association Rule Approach for Evaluation
of Business Intelligence for Enterprise
Systems
Mehdi Ghazanfari 1 , Saeed Rouhani 2 , Mostafa Jafari 3
and Peyman Akhavan 4
All business processes leave their ‘imprints’ in enterprise systems. For example, business process
events are recorded in logs for Enterprise Resource Planning (ERP), Customer Relationship
Management (CRM), and Supply Chain Management (SCM) systems. This paper focuses on the
potential use of association rule algorithms as data mining techniques in the field of process
mining. A process map is extracted using the proposed approach. Managers can understand the
real process steps that are hidden in data logs with the help of this process map. This paper uses
an approach that combines previously established (‘apriori’) procedures with a process model
discovered by process mining (analysis of business processes based on relevant records). This
paper also provides an original approach that can help organizations with Business Intelligence
(BI) implementation and with creating models extracted from enterprise systems logs.
Keywords: Decision support, Business intelligence, Association rules, Enterprise systems, Apriori
algorithm
Introduction
Over the past decade, many algorithms, tools and techniques have been developed for
process mining to advance Business Intelligence (BI). Nowadays, enterprise systems
accumulate related information in a structured form. Enterprise Resource Planning
(ERP) systems log all transactions, for example, users filling out forms and revising
documents (Agrawal et al., 1998; Grigori et al., 2001; Sayal et al., 2002; and van der Aalst
et al., 2003). One of the important non-functional capabilities of business systems is
logging and registering all events which happen in all computerized business areas.
Logging customer information in the Customer Relationship Management (CRM)
module can be considered an example. These examples show that all business systems
take advantage of an event log feature, which is referred to as a ‘history’, an ‘audit trail’,
or a ‘transaction log’ and so on. The event log typically contains information about
events referring to an activity and a case. The case (also named the ‘process instance’)
is the ‘request’ which is being handled, for example, a customer’s order, a job
application, an insurance claim or a building permit. Typically, events have a timestamp
indicating the time of occurrence. Moreover, when people are involved, event logs
contain information about the person who executes or initiates the event, that is,
the performer (Zur Muehlen, 2001a and 2001b; and van der Aalst and van Hee, 2002).
1 Full Professor, Department of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran. E-mail: xxxxxxxxxxxxxx
2 Ph.D. Candidate, Department of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran; and is the corresponding author. E-mail: [email protected]
3 Assistant Professor, Department of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran. E-mail: xxxxxxxxxxxxxx
4 Assistant Professor, Department of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran. E-mail: xxxxxxxxxxxxxx
© 2011 IUP. All Rights Reserved.
Further, given the availability of event logs, there is increasing interest in monitoring
business processes (to assess BI); meanwhile, there is constant pressure to improve the
performance and efficiency of business processes (Rouhani and Ghazanfari, 2007). This
requires more fine-grained monitoring facilities like those illustrated by today’s
buzzwords such as Business Activity Monitoring (BAM), Business Operations Management
(BOM) and Business Process Intelligence (BPI). Business process mining, or process
mining for short, aims to automatically construct models that explain the
behavior observed in the event log. For example, based on some event logs, one can
construct a process model expressed in terms of a Petri net (a modeling language used in
process modeling) (Agrawal et al., 1998; Sayal et al., 2002; van der Aalst et al., 2002b;
and van der Aalst et al., 2003).
Therefore, it is important to consider the new tools, techniques and algorithms
being developed in this field, because enterprise systems like ERP, SCM, and CRM are
increasingly being implemented in businesses (Ghazanfari et al., 2009). Thus, process
mining, as a key to knowledge discovery, will be useful for management decision making.
Nowadays, the individual-system approach to decision-support, such as Decision
Support Systems (DSS), has been replaced by a new environmental approach. In the
past, DSS were independent, separate systems in an organization and had a weak
dynamic relationship with other systems (island systems). However, enterprise
systems are now the foundation of an organization, and practitioners like ERP vendors
design and implement BI as a decision-support environment for the management in
enterprise systems. The increasing trend to use intelligent tools in business systems
has increased the need for process mining applications; therefore, new approaches
that support the desired quantitative measurements in business areas can be useful.
The rest of this paper is organized as follows: Section 2 presents related work on
process mining. Section 3 briefly introduces the concept of BI and its definitions. The
association rule technique and its algorithms are discussed in Section 4. Section 5
describes the proposed approach and the adopted algorithm. The resulting mined
process model, which helps in the evaluation of BI in enterprise systems, is
presented in Section 6.
2 The IUP Journal of Computer Sciences, Vol. V, No. 2, 2011
2. Literature Review of Process Mining
The process mining concept was first developed in 1998. Agrawal et al. (1998) and Cook
and Wolf (1998a and 1998b), famous researchers in data mining, introduced the new
idea that if systems log (record) transactional events, then tools based on scientific
algorithms can discover the process model. This idea was later continued by others
(Herbst and Karagiannis, 1998, 1999 and 2000; Herbst, 2000a, 2000b and 2001; Schimm,
2000, 2001a, 2001b and 2002; Maruster et al., 2001 and 2002; Maxeiner et al., 2001;
Weijters and van der Aalst, 2001a, 2001b and 2002; van der Aalst et al., 2002b; Herbst
and Karagiannis, 2004; and van der Aalst et al., 2005). Cook and Wolf (1998a and 1998b)
described three methods for process discovery: neural networks, a purely algorithmic
approach and a Markovian approach. The purely algorithmic approach builds a Finite
State Machine (FSM) where states (event situations such as document versioning) are
fused if their futures (in terms of possible behavior in the next steps) are identical. The
Markovian approach uses a mixture of algorithmic and statistical methods. It is able to
deal with the noise pattern in multiple models (caused by errors in logging). Cook and
Wolf (1998b) extended their work to include concurrent processes. They proposed specific
indices (entropy, event type counts, periodicity and causality) and used these indices to
discover models in event streams. However, they did not provide an approach that can
generate explicit process models. They provided a way to measure and quantify
discrepancies between a process model and the actual behavior as registered using
event-based data.
The idea of applying process mining in the context of workflow management
systems, which had been preferred by managers in the 1990s, was first introduced by
Agrawal et al. (1998). This work is based on workflow graphs, which are motivated by
workflow products such as IBM MQSeries workflow and InConcert. In this paper, two
issues are defined. The first issue is to find a workflow graph produced by analyzing
events appearing in a given workflow log. The second issue is to define the edge
conditions at which the process model (workflow graph) can be seen from multiple
perspectives.
In the past two decades, approaches and algorithms were introduced in the process
mining field that tried to discover a process model (called a workflow graph), based on
the event logs of computerized systems.
Workflow graphs utilize true and false signs instead of OR and AND, which cannot
be applied to cyclic graphs. Maxeiner et al. (2001) presented a tool based on these
algorithms. Schimm (2000 and 2002) extended a mining approach suitable for
discovering hierarchically structured workflow processes. This requires all splits and
joins to be balanced by checking all input events to a node and output events from a
node. Herbst and Karagiannis also concentrated on the issue of process mining in the
context of workflow management using an inductive approach (Herbst and Karagiannis,
1998, 1999 and 2000; and Herbst, 2000a, 2000b and 2001). The work presented by Herbst
and Karagiannis (1998 and 2000) is limited to sequential models, but the approach
described by Herbst (2000a, 2000b and 2001) allows concurrency.
van der Aalst (2004) likens the extraction of process models from data
to distillation. In terms of business process mining, van der Aalst (2004) affirms
that almost any transactional information system can provide suitable data.
Some workflow meta-models may be used to define workflow models, and can also
act as a language for the display of process-mining activities. An example of this is
provided by Herbst and Karagiannis (2004) and their InWoLvE workflow mining
system.
In the field of soft computing techniques, authors such as Alves de Medeiros et al.
(2005) and van der Aalst and Alves de Medeiros (2005) have explored the use of
evolutionary computing techniques like genetic algorithms.
Recent years have seen the development of extensions of process modeling towards
the incorporation of organizational objectives (Soffer and Wand, 2005), risks (Rosemann
and zur Muehlen, 2005) and other contextual factors related to process design
(Rosemann et al., 2008).
Work on identifying common process patterns encountered in organizations may be
required along with standardization of the way enterprise information systems record
process data (Tiwari et al., 2008). Nowadays process reengineering projects continue to
be important business investment decisions for high-level managers. Therefore, new
approaches for organizations to use enterprise systems data as knowledge for decision-
making are an important part of the agenda of new projects (Gartner Group, 2009).
There is a very extensive body of literature in this scientific area, but this summary
has included only literature that is directly related to the background and the new
approaches in process mining that are intended to extract a process model from users’
log data in enterprise systems.
3. Business Intelligence
Business Intelligence (BI) is a grand, umbrella term introduced by Howard Dresner of
the Gartner Group in 1989 to describe a set of concepts and methods to improve
business decision making by using fact-based computerized support systems (Nylund,
1999). As the first scientific definition, Ghoshal and Kim (1986) referred to BI as a
management philosophy and a tool that helps organizations to manage and refine
business information with the purpose of making effective decisions.
Continuing this effort to describe BI, it was considered an instrument of
analysis providing automated decision making about business conditions, sales,
customer demand, product preference, etc. It uses huge database (data-warehouse)
analysis as well as mathematical, statistical, artificial intelligence, data mining and
Online Analytical Processing (OLAP) techniques (Berson and Smith, 1997). Eckerson (2005)
understood that BI must be able to provide the following tools: production reporting
tools, end-user query and reporting tools, OLAP, dashboard/screen tools, data mining
tools and planning and modeling tools.
Beyond this, BI is a set of concepts, methods and processes to improve business
decisions, which use information from multiple sources and apply past experience to
develop an exact understanding of business dynamics (Maria, 2005). It integrates the
analysis of data with decision analysis tools to provide the right information to the right
persons throughout the organization with the purpose of improving strategic and
tactical decisions. A BI system is a data-driven DSS that primarily supports querying
of a historical database and production of periodic summary reports (Power, 2008).
Lönnqvist and Pirttimäki (2006) state that the term ‘BI’ can be used when referring
to the following concepts:
• Related information and knowledge of the organization which describe the
business environment, the organization itself, the conditions of market,
customers and competitors, and economic issues;
• A systemic and systematic process by which organizations obtain, analyze
and distribute the information for making decisions about business
operations.
A literature review of the theme BI shows a ‘division’ between technical and
managerial view points, tracing two broad patterns. The managerial approach sees BI
as a process in which data gathered from inside and outside the enterprise are
integrated in order to generate information relevant to the decision-making process.
The role of BI here is to create an informational environment in which operational data
gathered from Transactional Processing Systems (TPS) and external sources can be
analyzed in order to extract ‘strategic’ business knowledge and support managerial
unstructured decisions.
The technical approach considers BI as a set of tools that support the process
described above. The focus is not on the process itself, but on the technologies,
algorithms and tools that allow the storage, retrieval, manipulation and analysis of data
and information (Petrini and Pozzebon, 2008).
Overall, there are two important issues: first, the core of BI is the
gathering, analysis and distribution of information; and second, the objective of BI
is to support the strategic decision-making process.
By strategic decisions we mean decisions related to implementation and
evaluation of organizational vision, mission, goals and objectives which are supposed
to have medium- to long-term impacts on the organization, as opposed to operational
decisions which are day-to-day in nature and more related to execution (Petrini and
Pozzebon, 2008).
Bose (2009) also describes the managerial view of BI as a process to get the right
information to the right people at the right time so that they can make decisions that
ultimately improve enterprise performance.
The technical view of BI usually centers on the process or applications and
technologies for gathering, storing, analyzing and providing access to data to help make
better business decisions. Another important observation in the BI evolution is that
industry leaders are currently moving from an operational BI of the past to an analytical
BI of the future that focuses on customers, resources and abilities to drive new decisions
everyday. They have implemented one or more forms of advanced analytics for meeting
these business needs. Ranjan (2008) considers BI as the conscious, methodical
transformation of data from any and all data sources into new forms to provide
information that is business-driven and result-oriented. It will often encompass a
mixture of tools, databases, and vendors in order to deliver an infrastructure that not
only will deliver the initial solution, but also will incorporate the ability to change with
the business and current marketplace.
Wu et al. (2007) define BI as a business management term used to describe
applications and technologies, which are used to gather, provide access to and analyze
data and information about the organization to help make better business decisions. In
other words, the purpose of BI is to provide actionable insight. BI technologies include
traditional data warehousing technologies such as reporting, ad hoc querying and OLAP.
Elbashir et al. (2008) refer to BI systems as an important group of systems for data
analysis and reporting that support managers at different levels of the organization with
timely, relevant, and easy-to-use information and enable them to make better
decisions. They explain that BI systems are often implemented as enhancements to widely
adopted enterprise systems like ERP systems. The scale of investment in BI systems is
reflective of their growing strategic importance and highlights the need for more
attention in research studies (Elbashir et al., 2008).
In some studies, BI is concerned with the integration and consolidation of raw
data into Key Performance Indicators (KPIs). KPIs represent an essential basis for
business decisions that are to be made in the context of process execution. Therefore,
operational processes provide the context for data analysis, information interpretation,
and taking appropriate action (Bucher et al., 2009).
Recently, Jalonen and Lönnqvist (2009) stated that BI generates, analyzes and
reports on trends in the business environment and on internal organizational matters.
The explained analyses may be produced systematically and regularly or they may be
ad hoc ones related to a specific decision-making context. This knowledge is employed
by decision makers at different organizational levels. The process results in the
generation of both numerical and textual information.
Two important propositions about the definitions of BI are as follows:
1. Often, approaches to BI are bounded by means of supported functions, systems,
or system types.
2. BI is mainly aimed at providing an organization’s management levels with
decision-relevant analytic information in support of their management
activities.
In Table 1, BI definitions are divided based on three approaches—managerial
approach to BI, technical approach to BI, enterprise system enabler approach to BI.
Table 1: Classification of BI Definitions Based on Approach
BI Definition: Managerial Approach
  Focuses on: Excellence of management decision-making process
  References: Ghoshal and Kim (1986), Maria (2005), Power (2008), Petrini and Pozzebon (2008), Bose (2009), and Jalonen and Lönnqvist (2009)
BI Definition: Technical Approach
  Focuses on: Tools that support the process of BI Managerial Approach
  References: Berson and Smith (1997), Wu et al. (2007), Petrini and Pozzebon (2008), and Bucher et al. (2009)
BI Definition: System Enabler Approach
  Focuses on: Value-added features on supporting information
  References: Eckerson (2005), Lönnqvist and Pirttimäki (2006), Ranjan (2008), and Elbashir et al. (2008)
4. Association Rules
Association rule mining was introduced and applied by Agrawal et al. (1993). The goal of
association rules is to detect relationships or associations in large data sets between
the specific values of nominal attributes.
Association rule mining discovers the relationships between items from the set of
transactions. These relationships can be expressed by association rules such as
[$i_1 \Rightarrow i_2, i_3$; support = 3.5%, confidence = 45%]. This association rule implies that 3.5% of all
the transactions under analysis show that items $i_1$, $i_2$ and $i_3$ appear jointly. A confidence
rate of 45% indicates that 45% of the transactions containing $i_1$ also contain $i_2$ and $i_3$.
Associations may include any number of items on either side of the rule.
The problem of mining association rules is formally stated as follows (Agrawal et al.,
1993; and Srikant and Agrawal, 1997):
Let $I = \{i_1, i_2, \ldots, i_m\}$ denote a set of literals, namely, items. Moreover, let $D$ represent
a set of transactions, where each transaction $T$ is a set of items such that $T \subseteq I$. A unique
identifier, namely TID, is associated with each transaction. A transaction $T$ is said to
contain $X$, a set of some items in $I$, if $X \subseteq T$. An association rule is an implication of the form
$X \Rightarrow Y$, where $X \subset I$, $Y \subset I$ and $X \cap Y = \emptyset$. The rule $X \Rightarrow Y$ holds in the transaction set
$D$ with confidence $c$ if $c$% of transactions in $D$ that contain $X$ also contain $Y$. The
rule has support $s$ in the transaction set $D$ if $s$% of transactions in $D$ contain $X \cup Y$. An
efficient algorithm is required that restricts the search space and checks only a subset
of all association rules, yet does not miss important rules. The apriori algorithm
developed by Agrawal et al. (1993) and Srikant and Agrawal (1997) is such an algorithm.
However, the interestingness (validity) of the rule is only based on support and
confidence. The apriori algorithm is described as follows:
Step 1: $L_1$ = Find large 1-itemsets;
Step 2: for ($k = 2$; $L_{k-1} \neq \emptyset$; $k$++) do begin
Step 3: $C_k$ = apriori_gen($L_{k-1}$); // new candidates
Step 4: for all TID $T \in D$ do begin
Step 5: $C_T$ = subset($C_k$, $T$); // candidates contained in $T$
Step 6: for all candidates $C \in C_T$ do
Step 7: $C$.count++
Step 8: end
Step 9: $L_k = \{C \in C_k \mid C.\text{count} / \text{no. of data} \geq \text{minimum support threshold}\}$
Step 10: end
Step 11: Return $L = \bigcup_k L_k$
In the above apriori algorithm, the apriori_gen procedure generates candidate
itemsets, and the minimum support criterion is then used to eliminate infrequent
itemsets. The apriori_gen procedure performs two actions, namely, join and prune.
In the join step, $L_{k-1}$ is joined with $L_{k-1}$ to generate potential candidate itemsets. The
prune step uses the minimum support criterion to remove candidate itemsets that
are not frequent. In fact, expanding an itemset can never increase its support. A K-itemset can
only be frequent if all of its (K–1)-item subsets are also frequent; consequently, apriori_gen
only generates candidates with this property, a situation easily achievable given the
set $L_{k-1}$.
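The level-wise search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the function names `apriori_gen` and `apriori`, the representation of transactions as Python sets, and the toy data are all assumptions made for this example. Here the prune step checks the subset property, and support counting then removes the remaining infrequent candidates.

```python
from itertools import combinations

def apriori_gen(prev_frequent, k):
    """Join frequent (k-1)-itemsets into k-item candidates, then prune any
    candidate that has a (k-1)-item subset which is not frequent."""
    prev = set(prev_frequent)
    candidates = set()
    for a in prev:
        for b in prev:
            union = a | b
            if len(union) == k:  # join step
                # prune step: every (k-1)-subset must itself be frequent
                if all(frozenset(s) in prev for s in combinations(union, k - 1)):
                    candidates.add(union)
    return candidates

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset whose support
    (fraction of transactions containing it) is >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # L1: frequent 1-itemsets
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        support = sum(1 for t in transactions if single <= t) / n
        if support >= min_support:
            current[single] = support
    frequent = dict(current)
    k = 2
    while current:  # stop when L(k-1) is empty
        candidates = apriori_gen(current.keys(), k)
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate contained in transaction
                    counts[c] += 1
        current = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Toy data (hypothetical): five transactions over items a, b and c.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(toy, min_support=0.6)
```

With the toy data, each single item and each pair meets the 0.6 threshold, while the triple {a, b, c} appears in only two of five transactions and is dropped.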
Association rule mining is a popular technique for market basket analysis, which
typically aims at discovering buying patterns of customers in supermarkets, mail order,
and other types of stores.
By mining association rules, marketing analysts try to find sets of products that
are frequently bought together, so that certain other items can be inferred from a
shopping cart containing particular items. Association rules can often be used to
design marketing promotions, for example, by appropriately arranging products on
supermarket shelves and by directly suggesting items to customers that may be of
interest to them.
With the constant collection and storage of considerable quantities of business data,
association rules are discovered from the domain databases and applied in many areas,
such as marketing, logistics and manufacturing (Chen et al., 2005). In the areas of
marketing, advertising and sales, corporations have found that they can benefit
enormously if implicit and previously unknown buying and calling patterns of customers
can be discovered from large volumes of business data.
Generally, support and confidence are taken as two measurable factors to evaluate
the ‘interestingness’ of association rules (Agrawal et al., 1993; and Srikant and
Agrawal, 1997).
Association rules are regarded as interesting if their support and confidence are
greater than the user-specified minimum support and minimum confidence, respectively.
In data mining, it is important but difficult to determine these two thresholds of
‘interestingness’ (rule validity) appropriately. Data miners usually specify these
thresholds in an arbitrary manner.
Numerous algorithms for finding association rules have been developed in previous
studies (Hipp et al., 2000). However, relatively little literature has attempted to employ
the application-specific criteria for setting the threshold of association rules.
5. Proposed Approach
To apply the techniques of association rules in process mining, it is necessary to
convert log data in enterprise systems to the transaction set format (as described in Section 4).
Each transaction T corresponds to a case in the apriori algorithm definitions. The apriori algorithm is used to
find rules or relations of activities for cases. Since this paper intends to find out the
relationship between some business activities, thresholds of support and confidence
can be set relatively lower. This way, the process model can be mined from rules with
appropriate thresholds.
The proposed approach for finding the relevant process models is schematically
illustrated in the flowchart in Figure 1. The proposed approach is described as
follows:
Figure 1: Flowchart of the Proposed Approach (Convert Event Log → Input Data → Mine Association Rules for Each Two-Sequence Activity → Select Rules for Criteria → Form Activity Chains and Show Process)
Step 1: Convert event log to transaction data.
Step 2: Input data for association rule mining.
Step 3: Mine association rules for each and every two-sequence activity by using
the apriori algorithm with minimum support and minimum confidence.
Step 4: Select rules for criteria.
Step 5: Form activity chains and show process.
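Step 1 can be illustrated with a short Python sketch. The paper does not specify the layout of the event log, so the `(case_id, activity, timestamp)` tuple schema and the function name `log_to_transactions` are assumptions made only for this example. Each case becomes one transaction whose activities are ordered by timestamp, so that position k in the list plays the role of task k.

```python
from collections import defaultdict

def log_to_transactions(events):
    """Group events by case and order each case's activities by timestamp.

    `events` is an iterable of (case_id, activity, timestamp) tuples
    (an assumed schema, not the paper's log format).
    Returns {case_id: [activity at task 1, activity at task 2, ...]}.
    """
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))
    # sorted() orders the (timestamp, activity) pairs by timestamp first
    return {case: [activity for _, activity in sorted(rows)]
            for case, rows in by_case.items()}

# Hypothetical log: the events of case2 arrive out of order.
events = [
    ("case1", "a", 1), ("case1", "b", 2), ("case1", "c", 3),
    ("case2", "a", 1), ("case2", "c", 3), ("case2", "b", 2),
]
transactions = log_to_transactions(events)
```

Real enterprise logs would need additional cleaning, for instance when several events of one case share a timestamp; that is outside the scope of this sketch.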
For association rules like $X \Rightarrow Y$, three criteria are jointly used for rule evaluation as
follows:
Support: The support, s, is the percentage of transactions that contain $X \cup Y$ (Agrawal
et al., 1993). It takes the form:
$$S = P(X \cup Y)$$
Confidence: The confidence, c, is the ratio of the percentage of transactions that
contain $X \cup Y$ to the percentage of transactions that contain $X$ (Agrawal et al., 1993).
It takes the form:
$$C = \frac{P(X \cup Y)}{P(X)}$$
Lift: It is a simple correlation measure that is given as follows. The occurrence of item
set X is independent of the occurrence of item set Y if $P(X \cup Y) = P(X)P(Y)$; otherwise,
item sets X and Y are dependent and correlated as events. This definition can be easily
extended to more than two item sets. The lift between the occurrence of X and Y can
be measured by computing (Han and Kamber, 2006):
$$\mathrm{lift}(X, Y) = \frac{P(X \cup Y)}{P(X)\,P(Y)}$$
If the resulting value of lift is less than 1, then the occurrence of X is negatively
correlated with the occurrence of Y. If the resulting value is greater than 1, then X and Y
are positively correlated, meaning the occurrence of one implies the occurrence of the
other. If the resulting value is equal to 1, then X and Y are independent and there is
no correlation between them.
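As a small worked example of the three criteria, the following Python sketch counts transactions directly. The function name `rule_metrics` and the four toy transactions are hypothetical; exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction

def rule_metrics(transactions, x, y):
    """Return (support, confidence, lift) of the rule X => Y as exact fractions.

    support    = P(X u Y)
    confidence = P(X u Y) / P(X)
    lift       = P(X u Y) / (P(X) * P(Y))
    """
    x, y = frozenset(x), frozenset(y)
    n = len(transactions)
    n_x = sum(1 for t in transactions if x <= set(t))
    n_y = sum(1 for t in transactions if y <= set(t))
    n_xy = sum(1 for t in transactions if (x | y) <= set(t))
    if n_x == 0 or n_y == 0:
        raise ValueError("X and Y must each occur in at least one transaction")
    support = Fraction(n_xy, n)
    confidence = Fraction(n_xy, n_x)
    lift = Fraction(n_xy * n, n_x * n_y)
    return support, confidence, lift

# Hypothetical transactions: a and b occur together in 2 of 4 cases.
toy = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
support, confidence, lift = rule_metrics(toy, {"a"}, {"b"})
```

For this data, support is 2/4 = 1/2, confidence is 2/3, and lift is (2/4) / ((3/4)(3/4)) = 8/9, which is below 1 and therefore indicates a slight negative correlation between a and b.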
6. Illustrative Example
An example of event log data in a process of industrial ERP is used to illustrate the
proposed approach that was presented in Section 5. First, the event log is entered in
transactional format in Clementine (a powerful data mining tool that can execute the
apriori algorithm on huge data). This example log is illustrated in Figure 2. For
association rule mining, a Clementine stream in Figure 3 was designed to derive, filter,
and set data from the source in order to apply the apriori algorithm to it. This algorithm
needs criteria, in which minimum support and minimum confidence are set at 40% and
90.0%, respectively, and rules based on these criteria will be discovered.
Figure 2: Example Log
Figure 3: Design of the Data Mining Stream in Clementine (recoverable node labels: Log bd, Filter, Type, Table; Task 4, Task 5 and Task 6)
In the proposed approach, the algorithm is run six times because there were seven
tasks. This means that in order to find chains in this approach, we need the relationships
between consecutive tasks. By using the apriori algorithm, the association rules between
task 1 and task 2, task 2 and task 3, task 3 and task 4,
task 4 and task 5, task 5 and task 6, and task 6 and task 7 are mined with specified
thresholds. After finding rules with respect to criteria (better support, confidence, and
lift) and forming chains, relevant rules that have high criteria values are selected. The
selected rules for the sample cases are presented in Figure 4.
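The pairwise mining and chain-forming steps can be sketched as follows. This is a simplified stand-in for the Clementine streams used in the example, not a reproduction of them: instead of calling a full apriori implementation for each pair of consecutive tasks, it counts activity pairs directly, which gives the same support and confidence for two-item rules. The function names, the list-per-case input format, and the toy cases and thresholds are assumptions; the sketch only follows chains that start at task 1.

```python
from collections import Counter

def mine_step_rules(cases, min_support, min_confidence):
    """For each pair of consecutive tasks (k, k+1), keep rules a -> b whose
    support and confidence meet the thresholds.

    `cases` is a list of activity lists, one per case, ordered by task.
    Returns one dict per task pair: {(a, b): (support, confidence)}.
    """
    n = len(cases)
    n_steps = max(len(c) for c in cases)
    rules_per_step = []
    for k in range(n_steps - 1):
        pairs = Counter((c[k], c[k + 1]) for c in cases if len(c) > k + 1)
        firsts = Counter(c[k] for c in cases if len(c) > k)
        kept = {}
        for (a, b), cnt in pairs.items():
            support, confidence = cnt / n, cnt / firsts[a]
            if support >= min_support and confidence >= min_confidence:
                kept[(a, b)] = (support, confidence)
        rules_per_step.append(kept)
    return rules_per_step

def form_chains(rules_per_step):
    """Link rules of consecutive task pairs where one rule ends with the
    activity that the next rule starts with."""
    if not rules_per_step:
        return []
    chains = [[a, b] for (a, b) in rules_per_step[0]]
    for k, step_rules in enumerate(rules_per_step[1:], start=1):
        extended = []
        for chain in chains:
            # only chains that reach task k+1 (length k + 1) can be extended
            nexts = ([b for (a, b) in step_rules if a == chain[-1]]
                     if len(chain) == k + 1 else [])
            if nexts:
                extended.extend(chain + [b] for b in nexts)
            else:
                extended.append(chain)
        chains = extended
    return chains

# Hypothetical cases with three tasks each.
cases = [list("abc"), list("abc"), list("abd"), list("aec")]
chains = form_chains(mine_step_rules(cases, min_support=0.4, min_confidence=0.6))
```

With these toy thresholds only the rules a -> b (tasks 1 to 2) and b -> c (tasks 2 to 3) survive, so a single chain a, b, c is formed. The thresholds here are deliberately lower than the 40% and 90% used in the example so that the small data set yields a chain.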
Figure 4: Explored Association Rules
In the final step of the approach, which is formed by streams, the association rules
are sorted. In Figure 4, rules that form one chain are colored alike. Finally, by drawing
these chains, a process model is produced. The mined process model, which is the
result of the event log mining example, is illustrated in Figure 5. By extracting the
process model that is the result of the proposed approach, we can depict a process
map, which is a key tool for the evaluation of the BI of enterprise systems. Managers
can understand the real process steps that are hidden in data logs with the help of
this process map.
Real, current enterprise processes are different from what managers think they are,
and by using this process model on their enterprise systems, managers can realize this
fact. Some criteria such as support and confidence can also help them with process
scenarios that happen in different, exceptional cases.
By applying the proposed approach, a data mining technique like an association rule
algorithm can be used to develop a process model to evaluate the decision-making
support of enterprise systems. The proposed approach of this research was to use the
data log of the activities in enterprise systems to discover if the use of a data mining
technique called association rules is a suitable procedure to evaluate the power of BI
(decision-making support).
Figure 5: Mined Process Model (process map with activity nodes b, c, d, e, f, g, h, i, j and k)
Conclusion
Enterprise systems, like ERP, SCM and CRM, are increasingly being implemented in organizations
worldwide; hence, it is important to consider new techniques, tools and algorithms in
the field of decision support (Ghazanfari et al., 2009). Association rule discovery is one
of the popular techniques developed in the area of data mining.
This research focused on the potential use of association rule algorithms as data
mining techniques in the field of process mining. By extracting the process model that
is the result of the proposed approach, we can depict a process map which is a key tool
for the evaluation of the BI of enterprise systems. With the help of this process map,
managers can understand the real process steps that are hidden in data logs.
This paper proposed an approach that is a combination of the association rule method
(apriori algorithm) and a process mining concept that can be used in the evaluation of
the BI of enterprise systems. The contribution of this research is a combination of some
criteria from the apriori algorithm with process mining. The major goal of this approach was
to look for chains that are formed by association rules. To illustrate the applicability of
this approach, a sample of log data was converted to a process map.
This research has two special ‘value-added’ features. First, we created a technique
of using process mining as a concept for the evaluation of the BI of enterprise systems
according to definitions of business intelligence. This means that if we can extract the
hidden process model from the users’ system data using this approach, we can achieve
a kind of intelligence in enterprise systems. Second, the use of a data mining tool
(the Apriori algorithm) within process mining was presented as an approach that had not
previously been considered.
An important limitation of this approach is the complexity of converting log data to
a format usable by the Apriori algorithm. Different enterprise systems save their data
logs in different formats, whereas the suggested approach requires sequential data in a
specific format, so the logs of different kinds of systems must first be converted to a
standard format.
Applying other association rules algorithms in process mining research, comparing
these algorithms, and designing and implementing tools based on this approach are
recommended for future research.
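The log-conversion step described as a limitation above can be pictured with a small sketch. It assumes, purely for illustration, that each log row carries a case identifier, an activity label and an ISO 8601 timestamp; real ERP, CRM or SCM logs will differ, and the function name `log_to_traces` and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def log_to_traces(rows):
    """Group (case_id, activity, timestamp) rows into one time-ordered
    activity sequence per case."""
    events = defaultdict(list)
    for case_id, activity, timestamp in rows:
        events[case_id].append((datetime.fromisoformat(timestamp), activity))
    return {
        case_id: [activity for _, activity in sorted(case_events)]
        for case_id, case_events in events.items()
    }

# Hypothetical rows, deliberately out of order as they might appear in a raw log.
rows = [
    ("order-1", "b", "2011-01-03T10:05:00"),
    ("order-2", "a", "2011-01-03T09:30:00"),
    ("order-1", "a", "2011-01-03T09:00:00"),
    ("order-2", "c", "2011-01-03T11:00:00"),
    ("order-1", "c", "2011-01-03T10:40:00"),
]
traces = log_to_traces(rows)
print(traces)
```

Each resulting sequence can then serve as one transaction for rule mining; a system-specific adapter that produces rows in this shape would be needed for every log format.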
References
1. Agrawal R, Gunopulos D and Leymann F (1998), “Mining Process Models from
Workflow Logs”, in 6th International Conference on Extending Database Technology,
pp. 469-483.
2. Agrawal R, Imielinski T and Swami A (1993), “Mining Association Rules Between
Sets of Items in Large Databases”, Proceedings of the ACM SIGMOD Conference on
Management of Data, pp. 254-259.
3. Alves de Medeiros A K, Weijters A J M M and van der Aalst W M P (2005), “Genetic
Process Mining: A Basic Approach and its Challenges”, in Bussler C and Haller A
(Eds.), Business Process Management Workshops: BPM 2005, Springer Verlag,
Heidelberg.
4. Azoff M and Charlesworth I (2004), “The New Business Intelligence: A European
Perspective”, Butler Group White Paper.
5. Berson A and Smith S J (1997), Data Warehousing, Data Mining, and OLAP, McGraw-
Hill Ltd.
6. Bose R (2009), “Advanced Analytics: Opportunities and Challenges”, Industrial
Management & Data Systems, Vol. 109, No. 2, pp. 155-172.
7. Bucher T, Gericke A and Sigg S (2009), “Process-Centric Business Intelligence”,
Business Process Management Journal, Vol. 15, No. 3, pp. 408-429.
8. Chen M C, Huang C L, Chen K Y and Wu H P (2005), “Aggregation of Orders in
Distribution Centers Using Data Mining”, Expert Systems with Applications, Vol. 28,
No. 3, pp. 453-460.
9. Cook J E and Wolf A L (1998a), “Discovering Models of Software Processes from
Event-Based Data”, ACM Transactions on Software Engineering and Methodology,
Vol. 7, No. 3, pp. 215-249.
10. Cook J E and Wolf A L (1998b), “Event-Based Detection of Concurrency”, in
Proceedings of the 6th International Symposium on the Foundations of Software
Engineering (FSE-6), pp. 35-45.
11. Cook J E and Wolf A L (1999), “Software Process Validation: Quantitatively Measuring
the Correspondence of a Process to a Model”, ACM Transactions on Software
Engineering and Methodology, Vol. 8, No. 2, pp. 147-176.
12. Eckerson W W (2005), Performance Dashboards: Measuring, Monitoring, and
Managing Your Business, Wiley.
13. Elbashir M, Collier P and Davern M (2008), “Measuring the Effects of Business
Intelligence Systems: The Relationship Between Business Process and
Organizational Performance”, International Journal of Accounting Information
Systems, Vol. 9, No. 3, pp. 135-153.
14. Gartner Group (2009), “Meeting the Challenge: The 2009 CIO Agenda”, January, EXP
Premier Report, Gartner, Stamford, CT.
15. Ghazanfari M, Rouhani S, Jafari M and Taghavifard M (2009), “ERP Requirements
for Supporting Management Decisions & Business Intelligence”, The IUP Journal of
Information Technology, Vol. V, No. 3, pp.xxx.
16. Ghoshal S and Kim S K (1986), “Building Effective Intelligence Systems for
Competitive Advantage”, Sloan Management Review, Vol. 28, No. 1, pp. 49-58.
17. Grigori D, Casati F, Dayal U and Shan M C (2001), “Improving Business Process
Quality Through Exception Understanding, Prediction, and Prevention”, in P Apers,
P Atzeni, S Ceri, S Paraboschi, K Ramamohanarao and R Snodgrass (Eds.),
Proceedings of the 27th International Conference on Very Large Data Bases
(VLDB’01), pp. 159-168, Morgan Kaufmann, Los Alamitos, CA.
18. Han J and Kamber M (2006), Data Mining: Concepts and Techniques, 2nd Edition, The
Morgan Kaufmann Series in Data Management Systems, Morgan Kaufmann
Publishers.
19. Herbst J (2000a), “A Machine Learning Approach to Workflow Management”, in
Proceedings 11th European Conference on Machine Learning, Lecture Notes in
Computer Science, Vol. 1810, pp. 183-194, Springer-Verlag, Berlin.
20. Herbst J (2000b), “Dealing with Concurrency in Workflow Induction”, in U Baake, R
Zobel and M Al-Akaidi (Eds.), European Concurrent Engineering Conference, SCS
Europe.
21. Herbst J (2001), “Ein induktiver Ansatz zur Akquisition und Adaption von Workflow-
Modellen”, Ph.D. Thesis, University of Ulm.
22. Herbst J and Karagiannis D (1998), “Integrating Machine Learning and Workflow
Management to Support Acquisition and Adaptation of Workflow Models”, in
Proceedings of the 9th International Workshop on Database and Expert Systems
Applications, IEEE, pp. 745-752.
23. Herbst J and Karagiannis D (1999), “An Inductive Approach to the Acquisition and
Adaptation of Workflow Models”, in M Ibrahim and B Drabble (Eds.), Proceedings
of the IJCAI99 Workshop on Intelligent Workflow and Process Management: The
New Frontier for AI in Business, pp. 52-57, Stockholm, Sweden.
24. Herbst J and Karagiannis D (2000), “Integrating Machine Learning and Workflow
Management to Support Acquisition and Adaptation of Workflow Models”,
International Journal of Intelligent Systems in Accounting, Finance and Management,
Vol. 9, No.xxx, pp. 67-92.
25. Herbst J and Karagiannis D (2004), “Workflow Mining with InWoLvE”, Computers in
Industry, Vol. 53, No.xxx, pp. 245-264.
26. Hipp J, Güntzer U and Nakhaeizadeh G (2000), “Algorithms for Association Rule
Mining - A General Survey and Comparison”, ACM SIGKDD Explorations Newsletter,
Vol. 2, No. 1, pp. 58-64.
27. Jalonen H and Lönnqvist A (2009), “Predictive Business - Fresh Initiative or Old Wine
in a New Bottle”, Management Decision, Vol. 47, No. 10, pp. 1595-1609.
28. Lönnqvist A and Pirttimäki V (2006), “The Measurement of Business Intelligence”,
Information Systems Management, Vol. 23, No. 1, pp. 32-40.
29. Maria F (2005), “Improving the Utilization of External Strategic Information”,
Tampere University of Technology, Master of Science Thesis.
30. Maruster L, van der Aalst W M P, Weijters A J M M et al. (2001), “Automated
Discovery of Workflow Models from Hospital Data”, in B Kröse, M de Rijke, G
Schreiber and M Van Someren (Eds.), Proceedings of the 13th Belgium-
Netherlands Conference on Artificial Intelligence (BNAIC 2001), pp. 183-190.
31. Maruster L, Weijters A J M M, van der Aalst W M P and van den Bosch A (2002),
“Process Mining: Discovering Direct Successors in Process Logs”, in Proceedings of
the 5th International Conference on Discovery Science (Discovery Science 2002),
Lecture Notes in Artificial Intelligence, Vol. 2534, No.xxx, pp. 364-373, Springer-
Verlag, Berlin.
32. Maxeiner M K, Küspert K and Leymann F (2001), “Data Mining von Workflow-
Protokollen zur Teilautomatisierten Konstruktion von Prozeßmodellen”, in Proceedings
of Datenbanksysteme in Büro, Technik und Wissenschaft, Informatik Aktuell,
Springer, pp. 75-84, Berlin, Germany.
33. Nylund A (1999), Tracing the BI Family Tree, Knowledge Management.
34. Petrini M and Pozzebon M (2008), “What Role is ‘Business Intelligence’ Playing in
Developing Countries? A Picture of Brazilian Companies”, in H Rahman (Ed.),
Data Mining Applications for Empowering Knowledge Societies, pp. 237-257,
IGI Global.
35. Power D J (2008), “Understanding Data-Driven Decision Support Systems”,
Information Systems Management, Vol. 25, No. 2, pp. 149-154.
36. Ranjan J (2008), “Business Justification with Business Intelligence”, VINE: The
Journal of Information and Knowledge Management Systems, Vol. 38, No. 4,
pp. 461-475.
37. Rosemann M and zur Muehlen M (2005), “Integrating Risks in Business Process
Models”, in B Campbell, J Underwood and D Bunker (Eds.), Paper Presented at 16th
Australasian Conference on Information Systems, Australasian Chapter of the
Association for Information Systems, Sydney.
38. Rosemann M, Recker J and Flender C (2008), “Contextualization of Business
Processes”, International Journal of Business Process Integration and Management,
Vol. 3, No. 1, pp. 47-60.
39. Rouhani S and Ghazanfari M (2007), “Ranking Explored Association Rules with
ANP”, The 1st National Data Mining Conference (IDMC), Amirkabir University of
Technology, Tehran.
40. Sayal M, Casati F, Dayal U and Shan M C (2002), “Business Process Cockpit”, in
Proceedings of 28th International Conference on Very Large Data Bases (VLDB’02),
pp. 880-883, Morgan Kaufmann, Los Alamitos, CA.
41. Schimm G (2000), “Generic Linear Business Process Modeling”, in S W Liddle, H C
Mayr and B Thalheim (Eds.), Proceedings of the ER 2000 Workshop on Conceptual
Approaches for E-Business and the World Wide Web and Conceptual Modeling,
Vol. 1921, pp. 31-39, Springer-Verlag, Berlin.
42. Schimm G (2001a), “Process Mining elektronischer Geschäftsprozesse”, in
Proceedings Elektronische Geschäftsprozesse.
43. Schimm G (2001b), “Process Mining linearer Prozessmodelle - Ein Ansatz zur
automatisierten Akquisition von Prozesswissen”, in Proceedings Konferenz
Professionelles Wissensmanagement.
44. Schimm G (2002), “Process Miner—A Tool for Mining Process Schemes from Event-
Based Data”, in S Flesca and G Ianni (Eds.), Proceedings of the 8th European
Conference on Artificial Intelligence (JELIA), Lecture Notes in Computer Science,
Vol. 2424, pp. 525-528, Springer-Verlag, Berlin.
45. Soffer P and Wand Y (2005), “On the Notion of Soft-Goals in Business Process
Modeling”, Business Process Management Journal, Vol. 11, No. 6, pp. 663-679.
46. Srikant R and Agrawal R (1997), “Mining Generalized Association Rules”, Future
Generation Computer Systems, Vol. 13, pp. 161-180.
47. Staffware (2002), Staffware Process Monitor (SPM), available at
http://www.staffware.com.
48. Tiwari A, Turner C J and Majeed B (2008), “A Review of Business Process Mining:
State-of-the-Art and Future Trends”, Business Process Management Journal, Vol. 14,
No. 1, pp. 5-22.
49. van der Aalst W M P (2004), “Process Mining: A Research Agenda”, Computers in
Industry, Vol. 53, pp. 231-244.
50. van der Aalst W M P and Alves de Medeiros A K (2005), “Process Mining and Security:
Detecting Process Executions and Checking Process Conformance”, Electronic Notes
in Theoretical Computer Science, Vol. 121, pp. 3-21.
51. van der Aalst W M P and Van Hee K M (2002), Workflow Management: Models,
Methods, and Systems, MIT Press, Cambridge, MA.
52. van der Aalst W M P, Ter Hofstede A H M, Kiepuszewski B and Barros A P (2002a),
“Workflow Patterns”, QUT Technical Report, FIT-TR-2002-02, Queensland University
of Technology, Brisbane.
53. van der Aalst W M P, Weijters A J M M and Maruster L (2002b), “Workflow Mining:
Which Processes Can be Rediscovered?”, BETA Working Paper Series, WP 74,
Eindhoven University of Technology, Eindhoven.
54. van der Aalst W M P, Van Dongen B F, Herbst J et al. (2003), “Workflow Mining:
A Survey of Issues and Approaches”, Data & Knowledge Engineering, Vol. 47, No. 2,
pp. 237-267.
55. Weijters A J M M and van der Aalst W M P (2001a), “Process Mining: Discovering
Workflow Models from Event-Based Data”, in B Kröse, M de Rijke, G Schreiber and
M Van Someren (Eds.), Proceedings of the 13th Belgium-Netherlands Conference on
Artificial Intelligence (BNAIC 2001), pp. 283-290.
56. Weijters A J M M and van der Aalst W M P (2001b), “Rediscovering Workflow Models
from Event-Based Data”, in V Hoste and G de Pauw (Eds.), Proceedings of the 11th
Dutch-Belgian Conference on Machine Learning (Benelearn 2001), pp. 93-100.
57. Weijters A J M M and van der Aalst W M P (2002), “Workflow Mining: Discovering
Workflow Models from Event-Based Data”, in C Dousson, F Höppner and R
Quiniou (Eds.), Proceedings of the ECAI Workshop on Knowledge Discovery and
Spatial Data, pp. 78-84.
58. Wu L, Barash G and Bartolini C (2007), “A Service-Oriented Architecture for Business
Intelligence”, IEEE International Conference on Service-Oriented Computing and
Applications, SOCA’07, pp. 279-285.
59. Zur Muehlen M (2001a), “Process-Driven Management Information Systems-
Combining Data Warehouses and Workflow Technology”, in B Gavish (Ed.),
Proceedings of the International Conference on Electronic Commerce Research
(ICECR-4), pp. 550-566, IEEE Computer Society Press, Los Alamitos, California.
60. Zur Muehlen M (2001b), “Workflow-Based Process Controlling-Or: What You Can
Measure You Can Control”, in L Fischer (Ed.), Workflow Handbook 2001: Workflow
Management Coalition, pp. 61-77, Future Strategies, Lighthouse Point, Florida.