Intelligent Information
and Database Systems
4th Asian Conference, ACIIDS 2012
Kaohsiung, Taiwan, March 19-21, 2012
Proceedings, Part I
Volume Editors
Jeng-Shyang Pan
National Kaohsiung University of Applied Sciences
Department of Electronic Engineering
No. 415, Chien Kung Road, Kaohsiung 80778, Taiwan
E-mail: [email protected]
Shyi-Ming Chen
National Taichung University of Education
Graduate Institute of Educational Measurement and Statistics
No. 140, Min-Shen Road, Taichung 40306, Taiwan
E-mail: [email protected]
ACIIDS 2012 was the fourth event of the series of international scientific
conferences for research and applications in the field of intelligent information
and database systems. The aim of ACIIDS 2012 was to provide an international
forum for scientific research in the technologies and applications of intelligent
information and database systems. ACIIDS 2012 took place
March 19–21, 2012, in Kaohsiung, Taiwan. It was co-organized by the National
Kaohsiung University of Applied Sciences (Taiwan), National Taichung Univer-
sity of Education (Taiwan), Taiwanese Association for Consumer Electronics
(TACE) and Wroclaw University of Technology (Poland), in cooperation with
the University of Information Technology (Vietnam), International Society of
Applied Intelligence (ISAI), and Gdynia Maritime University (Poland). ACIIDS
2009 and ACIIDS 2010 took place in Dong Hoi and Hue in Vietnam, respectively,
and ACIIDS 2011 in Daegu, Korea.
We received more than 472 papers from 15 countries around the world. Each
paper was peer reviewed by at least two members of the International Program
Committee and International Reviewer Board. Only 161 papers with the highest
quality were selected for oral presentation and publication in the three volumes
of ACIIDS 2012 proceedings.
The papers included in the proceedings cover the following topics: intelligent
database systems, data warehouses and data mining, natural language processing
and computational linguistics, Semantic Web, social networks and recommenda-
tion systems, collaborative systems and applications, e-business and e-commerce
systems, e-learning systems, information modeling and requirements engineering,
information retrieval systems, intelligent agents and multi-agent systems, intel-
ligent information systems, intelligent Internet systems, intelligent optimization
techniques, object-relational DBMS, ontologies and knowledge sharing, semi-
structured and XML database systems, unified modeling language and unified
processes, Web services and Semantic Web, computer networks and communi-
cation systems.
Accepted and presented papers highlight new trends and challenges of in-
telligent information and database systems. The presenters showed how new
research could lead to novel and innovative applications. We hope you will find
these results useful and inspiring for your future research.
We would like to express our sincere thanks to the Honorary Chairs: Cheng-Qi
Zhang (University of Technology Sydney, Australia), Szu-Wei Yang (President
of National Taichung University of Education, Taiwan) and Tadeusz Wieckowski
(Rector of Wroclaw University of Technology, Poland) for their support.
Our special thanks go to the Program Chairs, all Program and Reviewer
Committee members and all the additional reviewers for their valuable efforts
in the review process, which helped us to guarantee the highest quality of the
selected papers for the conference. We cordially thank the organizers and chairs
of the special sessions, whose work contributed essentially to the success of the conference.
We would also like to express our thanks to the keynote speakers Jerzy
Swiatek from Poland, Shi-Kuo Chang from the USA, Jun Wang, and Rong-
Sheng Xu from China for their interesting and informative talks of world-class
standard.
We cordially thank our main sponsors, National Kaohsiung University of Ap-
plied Sciences (Taiwan), National Taichung University of Education (Taiwan),
Taiwanese Association for Consumer Electronics (TACE) and Wroclaw Univer-
sity of Technology (Poland). Our special thanks are due also to Springer for
publishing the proceedings, and other sponsors for their kind support.
We wish to thank the members of the Organizing Committee for their very
substantial work, especially those who played essential roles: Thou-Ho Chen,
Chin-Shiuh Shieh, Mong-Fong Horng and the members of the Local Organizing
Committee for their excellent work.
We cordially thank all the authors for their valuable contributions and the
other participants of this conference. The conference would not have been pos-
sible without their support.
Thanks are also due to the many experts who contributed to making the
event a success.
Jeng-Shyang Pan
Shyi-Ming Chen
Ngoc Thanh Nguyen
Conference Organization
Honorary Chairs
Cheng-Qi Zhang University of Technology Sydney, Australia
Szu-Wei Yang National Taichung University of Education,
Taiwan
Tadeusz Wieckowski Wroclaw University of Technology, Poland
General Chair
Ngoc Thanh Nguyen Wroclaw University of Technology, Poland
Publication Chairs
Chin-Shiuh Shieh National Kaohsiung University of Applied
Sciences, Taiwan
Li-Hsing Yen National University of Kaohsiung, Taiwan
Organizing Chair
Thou-Ho Chen National Kaohsiung University of Applied
Sciences, Taiwan
Steering Committee
Ngoc Thanh Nguyen - Chair Wroclaw University of Technology, Poland
Bin-Yih Liao National Kaohsiung University of Applied
Sciences, Taiwan
Longbing Cao University of Technology Sydney, Australia
Adam Grzech Wroclaw University of Technology, Poland
Tu Bao Ho Japan Advanced Institute of Science and
Technology, Japan
Tzung-Pei Hong National University of Kaohsiung, Taiwan
Lakhmi C. Jain University of South Australia, Australia
Geun-Sik Jo Inha University, Korea
Jason J. Jung Yeungnam University, Korea
Hoai An Le-Thi University Paul Verlaine - Metz, France
Antoni Ligeza
AGH University of Science and Technology,
Poland
Toyoaki Nishida Kyoto University, Japan
Leszek Rutkowski Technical University of Czestochowa, Poland
Keynote Speakers
– Jerzy Swiatek
– Shi-Kuo Chang
– Jun Wang
Computational Intelligence Laboratory in the Department of Mechanical and
Automation Engineering at the Chinese University of Hong Kong, China
– Rong-Sheng Xu
Computing Center at Institute of High Energy Physics, Chinese Academy of
Sciences, China
Agent System
A Multi-agent Strategy for Integration of Imprecise Descriptions . . . . . . . 1
Grzegorz Skorupa, Wojciech Lorkiewicz, and Radoslaw Katarzyniak
Intelligent Systems(1)
System Analysis Techniques in eHealth Systems: A Case Study . . . . . . . . 74
Krzysztof Brzostowski, Jaroslaw Drapala, and Jerzy Świa̧tek
Intelligent Systems(2)
Local Neighbor Enrichment for Ontology Integration . . . . . . . . . . . . . . . . . 156
Trong Hai Duong, Hai Bang Truong, and Ngoc Thanh Nguyen
Intelligent Systems(3)
Facial Feature Extraction and Applications: A Review . . . . . . . . . . . . . . . . 228
Ying-Ming Wu, Hsueh-Wu Wang, Yen-Ling Lu, Shin Yen, and
Ying-Tung Hsiao
Clustering Technology
Approach to Image Segmentation Based on Interval Type-2 Fuzzy
Subtractive Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Long Thanh Ngo and Binh Huy Pham
Adaptive Graphical User Interface Solution for Modern User Devices . . . 411
Miroslav Behan and Ondrej Krejcar
Computational Intelligence
Semi-parametric Smoothing Regression Model Based on GA for
Financial Time Series Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Lingzhi Wang
Intelligent Service
Intelligence Decision Trading Systems for Stock Index . . . . . . . . . . . . . . . . 366
Monruthai Radeerom, Hataitep Wongsuwarn, and
M.L. Kulthon Kasemsan
1 Introduction
Distributed systems have recently been gaining interest and their usage
is becoming increasingly widespread. They usually provide an effective and
low-cost solution to problems that are often unmanageable for a monolithic
and centralised approach. Unfortunately, allowing separate autonomous compo-
nents to coordinate, exchange knowledge and maintain relations is an exquisitely
demanding task, especially in dynamic environments. One such fundamental
task relates to knowledge integration, where a collective stance of the distributed
system needs to be determined and justified. In a setting where the agents are
additionally highly autonomous, i.e. can represent inconsistent or even opposing
views, resolving the collective knowledge state is a notoriously hard problem [9].
Here we follow the cognitive linguistics approach to communication1 and adopt
the phenomenological stance, where all individuals maintain a conscious and in-
tentional relationship to the external world through their bodily experiences. In-
corporating the Grounding Theory model [6–8], the autonomous agent through
¹ As opposed to classical definitions of categories, the existence of a mind-independent reality (objectivist realism) and absolute truths.
the interaction with the external environment experiences surface mode qualia
(understood as "raw feels") that trigger a particular cognitive schema within
the individual. In short, an observation is an individual perception that is intro-
duced into the agent body and represented as embodied structures. Namely, it is
assumed that the cognitive agent can store internally reflections of perceptually
available states of properties P in perceptually available objects O. It means that
each aspect of the external world is recognisable for the agent and can become
a part of the body of its empirically originated knowledge.
The internal organisation of the agent is strictly private and individual, con-
sequently the embodied structures are not shared among the interacting agents
and cannot be directly communicated. As such, for an agent to share its cur-
rent viewpoint, it is necessary to utilise a language that is established within
the population. In particular, registering a particular language symbol triggers
within the hearer a reaction consistent with the knowledge stance of the speaker.
Additionally, following the view of Lakoff and Johnson², an agent's internal system
of accessible concepts is strictly related to the structure of the external world,
i.e. the language symbol is grounded in the embodied experience [4].
According to Dennett, multiple ‘exposure to x – that is, sensory confrontation
with x over suitable period of time – is the normally sufficient condition for
knowing (or having true beliefs) about x’[1]. Based on this idea, the Grounding
Theory [6–8] defines a mechanism for an agent to fill in the unobserved parts of
the environment with cognitive schema extracted from the past empirical experi-
ences (corresponding with a particular state of an unobserved property in a given
object). Moreover, an observation in which an agent observed an object exhibiting
a property (not exhibiting a property) makes a corresponding cognitive scheme
stronger (weaker) in relation to the complementary schemes. These strengths,
called relative grounding strengths (λ), along with a system of modality thresh-
olds (for modal levels of possibility, belief and knowledge), serve as means for
grounding appropriate modal statements.
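As a rough illustration of how these strengths and thresholds interact, the sketch below maps a single grounding strength to the strongest modal operator it can ground. The threshold names and values (min_pos, min_bel, min_know) are assumptions made only for this example; the exact grounding conditions are those defined in [6–8].

```python
# Illustrative sketch only: maps a relative grounding strength to a modal
# operator using a simple threshold system.  The threshold values and the
# exact grounding conditions are assumptions; the formal definitions are
# given in the Grounding Theory papers [6-8].

def groundable_modality(strength, min_pos=0.1, min_bel=0.6, min_know=1.0):
    """Return the strongest modal operator groundable for one formula.

    strength  -- relative grounding strength (lambda) for the formula
    min_pos   -- lower threshold for grounding Pos(...)    (assumed)
    min_bel   -- lower threshold for grounding Bel(...)    (assumed)
    min_know  -- strength required for grounding Know(...) (assumed)
    """
    if strength >= min_know:
        return "Know"
    if strength >= min_bel:
        return "Bel"
    if strength >= min_pos:
        return "Pos"
    return None  # nothing groundable for this formula


# Example: an agent whose experience overwhelmingly supports p AND q
lam = {"p&q": 0.95, "p&~q": 0.05, "~p&q": 0.0, "~p&~q": 0.0}
for formula, s in lam.items():
    print(formula, "->", groundable_modality(s))
```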
Obviously, the agents do not share the same experiences, i.e., their occasional
observations of the environment do not have to be synchronised nor have to focus
on the same aspects of the external world. As such, grounded modal statements,
which are further uttered by the agents, do not have to be identical – indeed,
they do not even have to be consistent. Consequently, determining the collective stance
of the population is not a trivial task and a system capable of determining such
a global view requires a specialised integration mechanism.
It should be noted that there exists a vast literature on the topic of knowledge
integration, in particular in belief integration tasks [2, 3]. The most prominent and
well-known research concerns belief revision, i.e. the problem of assessing new pieces
of information and incorporating them into already established knowledge bases;
belief merging, i.e. the problem of integrating already established knowledge bases;
and voting, i.e. the means for reaching a collective compromise between multiple
preferences (knowledge pieces). However,
² ‘(...) Human language and thought are structured by, and bound to, an embodied experience (...)’ [5].
applying the mechanisms from the aforementioned studies leads to several problems
and questions, largely because they neglect the fact that modal statements should be
grounded in an individual agent. Each such grounded modal statement represents a
particular cognitive stance of the uttering agent, and as such the integration task
should take this cognitive stance into account. Encouraged
by this gap in the literature, in the following sections, we formulate an alternative
approach to the integration task – incorporating the mechanisms of Grounding
Theory.
³ For the sake of simplicity, we further omit the object from the notation, i.e. we use p instead of p(o), and assume that at a certain point of time $t_I \in T$ the integration task is strictly limited to a given pair of properties q, p ∈ P.
⁴ For precise definitions of the notions of grounding set, modality thresholds, grounding strength and their internal relations, please see [6–8].
Definition 2. For a given observer agent a and its complete set of grounded statements $\Omega^{(a)}$, let $R : \Omega \to [0,1]^4$ denote the internal reflection function that determines the internal reflection $R(\Omega^{(a)}) = \lambda^{(a)} = (\lambda_{p\wedge q}, \lambda_{p\wedge\neg q}, \lambda_{\neg p\wedge q}, \lambda_{\neg p\wedge\neg q})$ of agent a's cognitive stance. In general, let the set $\Lambda = \{\lambda^{(1)}, \lambda^{(2)}, \ldots, \lambda^{(A)}\}$ denote the set of all reflections of cognitive states.
Next we enter the integration stage, where the internal reflections Λ of gathered
stances Ω are further integrated to form a single consistent knowledge stance
(see Sec. 3). In particular, based on the integrated reflection the agent determines
(see Sec. 2.1) a collective reflection, i.e. creates an integrated cognitive state
$\lambda^* = (\lambda^*_{p\wedge q}, \lambda^*_{p\wedge\neg q}, \lambda^*_{\neg p\wedge q}, \lambda^*_{\neg p\wedge\neg q})$ of the community of agents. The method of constructing
the vector $\lambda^*$ ensures that the required properties (see Sec. 2.2) are met. Finally, the
integrated cognitive state $\lambda^*$ serves as the source for the integrated set of
modal statements $\Omega^*$, describing the agents' community knowledge.
3 Integration Strategy
We present a formal description of the integration process – the reflection stage
(the method of collecting and determining the individual reflections of knowledge
states $\lambda^{(a)}$) and the integration stage (the method of determining the collective
knowledge state $\lambda^*$ and translating it into the integrated set of modal statements $\Omega^*$).
⁵ Constraints for modal messages on q, ¬p, ¬q, p ∧ ¬q, ¬p ∧ q and ¬p ∧ ¬q are defined analogously. The parameters λkB, λkP, λsP and λsB are set so that they meet the requirements from [6–8].
line segment (Know(p) is said) or a tetrahedron (no ‘Know’ statements are said)
– every interior point meets the constraints Ξ. However, due to the restrictions
imposed by strict inequalities, this property is not met by all of the figure's
boundary points. It should be noted that when the figure degenerates to
a single point, it must satisfy all of the constraints Ξ. Additionally, each non-
boundary point of a line-segment must satisfy all of the constraints Ξ.
In order to find a point satisfying the constraints Ξ, we find all of the vertices of the figure defined by Ξ. Let

$$V = \{v^{(1)}, v^{(2)}, \ldots, v^{(C)}\}, \quad \text{where } v^{(c)} = (v_1^{(c)}, v_2^{(c)}, v_3^{(c)}, v_4^{(c)}) \qquad (4)$$

denote the set of vertices of the figure defined by the constraints Ξ.⁶
Found vertices are used to calculate a single point lying within the figure's interior. Point λ is called a reflection of the agent's cognitive state and is calculated as an average of vertices:

$$\lambda = (\lambda_{p\wedge q}, \lambda_{p\wedge\neg q}, \lambda_{\neg p\wedge q}, \lambda_{\neg p\wedge\neg q}) = \frac{1}{C}\sum_{c=1}^{C} v^{(c)} \qquad (5)$$
The figure defined by Ξ is in each case convex, but not strongly convex. This
ensures that the point λ (see Eq. 5) lies within the figure's bounds. If the figure
is reduced to a single point, then λ is exactly that point. If the figure is reduced
to a line segment, then λ lies at the middle of that segment. In case the figure
has a three-dimensional shape, then λ lies within its interior. Consequently, the
point λ meets all of the criteria given by Ξ.
The vertices V can be obtained using well-known algebraic properties of 4-dimensional
Euclidean space and simple linear equations. For the sake of simplicity,
we modify the set Ξ to define two auxiliary sets Ξ≤ , Ξ= of changed constraints.
Consequently we define a set Ξ≤ , which contains only linear equations and soft
inequalities, and a set Ξ= , which contains only linear equations.
To obtain the set Ξ≤ we change every equality constraint from the set Ξ so that it has the form $a_1\lambda_{p\wedge q} + a_2\lambda_{p\wedge\neg q} + a_3\lambda_{\neg p\wedge q} + a_4\lambda_{\neg p\wedge\neg q} + a_5 = 0$, and change every inequality constraint so that it has the form $a_1\lambda_{p\wedge q} + a_2\lambda_{p\wedge\neg q} + a_3\lambda_{\neg p\wedge q} + a_4\lambda_{\neg p\wedge\neg q} + a_5 \le 0$. For example, a constraint for Bel(p ∧ q) can be transformed into two constraints of the form $\lambda_{kB} - \lambda_{p\wedge q} \le 0$ and $\lambda_{p\wedge q} - 1 \le 0$.
To create the set Ξ= we change every inequality constraint from Ξ≤ into an equality. For example, a constraint for the statement Pos(p) can be transformed into $\lambda_{p\wedge q} + \lambda_{p\wedge\neg q} - \lambda_{sP} = 0$.
Let $A = [a_{ij}]$ be a $5 \times \mathrm{card}(\Xi_=)$ matrix holding the parameters $a_{ij}$ of the Ξ= constraints. The conditions for the constraints from the set Ξ= can be written as a system of linear equations in the matrix form:

$$[\lambda\;\, 1]\,A = [\lambda_{p\wedge q}\;\, \lambda_{p\wedge\neg q}\;\, \lambda_{\neg p\wedge q}\;\, \lambda_{\neg p\wedge\neg q}\;\, 1]\,A = 0 \qquad (6)$$

Equation (6), taken as a whole, has no solution – however, every 5 × 5 submatrix of A is a good candidate for determining a figure vertex.
⁶ The number (C) of vertices depends on the figure's shape.
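A minimal sketch of this vertex-enumeration procedure is given below: constraints are stored in the normalised form used for Ξ≤, every subset of four equality-converted constraints is solved as a linear system, candidate points violating Ξ≤ are discarded, and the surviving vertices are averaged as in Eq. (5). The concrete constraint list is a made-up toy example, not one derived from the thresholds of [6–8].

```python
# Hedged sketch of the vertex enumeration described above.  Constraints are in
# the normalised form a1*l1 + a2*l2 + a3*l3 + a4*l4 + a5 (= or <=) 0 over
# lambda = (l_{p^q}, l_{p^~q}, l_{~p^q}, l_{~p^~q}).  The constraint list is a
# toy example only (the lower bound 0.6 stands in for a Bel(p&q) threshold).
from itertools import combinations
import numpy as np

# (coefficients a1..a4, constant a5, kind) with kind in {"eq", "le"}
constraints = [
    ((1.0, 1.0, 1.0, 1.0), -1.0, "eq"),   # toy assumption: strengths sum to one
    ((-1.0, 0.0, 0.0, 0.0), 0.6, "le"),   # l_{p^q} >= 0.6  (stand-in for Bel(p&q))
    ((1.0, 0.0, 0.0, 0.0), -1.0, "le"),   # l_{p^q} <= 1
    ((0.0, -1.0, 0.0, 0.0), 0.0, "le"),   # remaining strengths >= 0
    ((0.0, 0.0, -1.0, 0.0), 0.0, "le"),
    ((0.0, 0.0, 0.0, -1.0), 0.0, "le"),
]

def satisfied(lam, tol=1e-9):
    """Check the soft-inequality system Xi_<= at a candidate point."""
    for coef, a5, kind in constraints:
        v = np.dot(coef, lam) + a5
        if kind == "eq" and abs(v) > tol:
            return False
        if kind == "le" and v > tol:
            return False
    return True

# Xi_= : every constraint turned into an equality; each subset of four
# independent equalities pins down one candidate vertex.
vertices = []
for subset in combinations(constraints, 4):
    A = np.array([c[0] for c in subset])
    b = np.array([-c[1] for c in subset])
    try:
        cand = np.linalg.solve(A, b)
    except np.linalg.LinAlgError:
        continue  # dependent subset, no unique intersection point
    if satisfied(cand) and not any(np.allclose(cand, v) for v in vertices):
        vertices.append(cand)

# Eq. (5): the reflection lambda is the average of the admissible vertices.
lam_reflection = np.mean(vertices, axis=0)
print("vertices:", [np.round(v, 3) for v in vertices])
print("reflection lambda:", np.round(lam_reflection, 3))
```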
Example 1
agent 1 Know(p), Bel(q), Bel(p ∧ q), P os(¬q), P os(p ∧ ¬q)
agent 2 Know(q), Bel(p), Bel(p ∧ q), P os(¬p), P os(¬p ∧ q)
agent 3 Bel(p), P os(¬p), P os(q), P os(¬q), P os(p ∧ q), P os(p ∧ ¬q)
result Know(p), Know(q), Know(p ∧ q)
Example 2
agent 1 Know(¬p), Know(q), Know(¬p ∧ q)
agent 2 Know(p), Bel(q), Bel(p ∧ q), P os(¬q), P os(p ∧ ¬q)
agent 3 Bel(q), P os(p), P os(¬p), P os(¬q), P os(p ∧ q), P os(¬p ∧ q)
result Know(q), P os(p), P os(¬p), P os(p ∧ q), P os(¬p ∧ q)
Example 3
agent 1 Bel(p), Bel(q), Bel(p ∧ q), P os(¬q), P os(p ∧ ¬q)
agent 2 Bel(p), P os(¬p), P os(q), P os(¬q), P os(p ∧ q), P os(p ∧ ¬q), P os(¬p ∧ ¬q)
agent 3 Bel(p), P os(q), P os(¬q), P os(p ∧ q), P os(p ∧ ¬q)
result Bel(p), P os(¬p), P os(q), P os(¬q), P os(p ∧ q), P os(p ∧ ¬q)
4 Computational Example
Table 2 contains three examples of the integration process, all in a setting of 3
observing agents. Example 1 shows how superiority of knowledge is sustained.
One agent knows that p and a second knows that q. After integration we know both
p and q. In Example 2 there are the contradictory statements Know(¬p ∧ q) and
Know(¬p) in agent 1 and Know(p) in agent 2. The integrating agent resigns from
the knowledge of p but keeps the knowledge of q. Example 3 contains no knowledge
statements, so the result is a balanced response. The sentences Bel(p), Pos(¬q) and
Pos(p ∧ ¬q), uttered by every agent, are also present in the result.
5 Conclusions
References
1. Dennett, D.C.: True believers: The intentional strategy and why it works. In:
Stich, S.P., Warfield, T.A. (eds.) Mental Representation: A Reader. Blackwell (1994)
2. Gardenfors, P.: Belief Revision. Cambridge University Press, New York (1999)
3. Hansson, S.O.: A Survey of Non-Prioritized Belief Revision. Erkenntnis 50(2/3),
413–427 (1999)
4. Harnad, S.: The Symbol Grounding Problem. Physica D 42, 335–346 (1990)
5. Lakoff, G., Johnson, M.: Philosophy In The Flesh: the Embodied Mind and its
Challenge to Western Thought. Basic Books (1999)
6. Katarzyniak, R.: Grounding Atom Formulas and Simple Modalities in Communica-
tive Agents. Applied Informatics, 388–392 (2003)
7. Katarzyniak, R.: The Language Grounding Problem and its Relation to the Internal
Structure of Cognitive Agents. J. UCS 11(2), 357–374 (2005)
8. Katarzyniak, R.: On some properties of grounding uniform sets of modal conjunc-
tions. Journal of Intelligent and Fuzzy Systems 17(3), 209–218 (2006)
9. Nguyen, N.T.: Consensus systems for conflict solving in distributed systems.
Information Sciences 147(1-4), 91–122 (2002)
Performance Evaluation
of Multiagent-System Oriented Models
for Efficient Power System Topology Verification
Abstract. In the paper, power system topology verification with the use of
multiagent systems is considered. Two multiagent systems are taken into
account, one of which is a modification of the other. In the modified system,
there are additionally so-called substation agents. The stages of analysis,
design and investigation of the performance characteristics of the presented
multiagent systems are described in the paper. The goal of the paper is to
present the performance characteristics of the mentioned multiagent systems
in terms of probabilistic characteristics of agent activity and created
messages. The investigations carried out show that the modified multiagent
system has much better features than the earlier system.
1 Introduction
Knowledge of the correct connectivity model of a Power System (PS), i.e. a correct
PS topology model, is essential from the point of view of monitoring a PS. In
this context, possessing not only a proper procedure for building a PS topology
model but also an effective procedure for verification of this model is
very important, especially when building a PS topology model and its verification
should be realized automatically. In this paper, the verification of the PS topology
model is considered. It is assumed that Topology Verification (TV) is performed
with the use of the method described in [1]. The original method from [1]
decomposes TV for the whole PS into many Local TVs (LTVs). The paper [2]
presents the idea of utilizing Agent Technology (AT) for PS TV realized with the
use of the method from [1]. Three different solutions of Multi-Agent Systems
(MASs) for PS TV are considered in [3]. It was found that such a MAS, which
initializes PS TV when an introductory analysis of measurement data points out
the possible occurrence of an error in the PS topology model (i.e. a Topology Error – TE), is the
most advantageous. In [4], a new solution of MAS for PS TV is introduced, taking into
account not only the existence of one electrical node in a substation but also the existence of
several electrical nodes in one substation. Such a situation occurs
in many substations of actual PSs. In [4], the maximal numbers of messages created
by the considered MASs and the mean times of TV processes are analyzed. This paper
continues the investigation of the features of the MASs considered in [4].
The goal of the paper is to present the performance characteristics of these MASs
in terms of probabilistic characteristics of agent activity and created messages. It
should be noticed that performance requirements have to be considered in each
phase of the life cycle when a new system (i.e. a MAS in this paper) is designed. Such
a statement is one of the essential statements of performance engineering of
software systems [5], [6]. For example, the problem of performance evaluation of
MASs is considered in such papers as [7] for the systems ZEUS, JADE and
Skeleton Agents, and [8] for IBM Aglets, Concordia and Voyager.
In the paper, to enhance the development of MASs the MaSE (MultiAgent Systems
Engineering) methodology [9] is utilized.
The considered MASs perform PS TV utilizing the method described in [1]. The
method from [1] assumes calculation of so-called unbalance indices on the basis of
measurement data of active and reactive power flows at ends of branches (power
lines, transformers) and voltage magnitudes at (electrical) nodes of PS. The
mentioned unbalance indices are inputs for Artificial Neural Networks (ANNs) with
radial basis functions. Analyzing the outputs of ANNs, decisions on existence of
topology errors are taken.
For each node of PS one ANN is created. Let us assume that the node i is taken into
account. Unbalance indices being inputs for the ANN associated with the node i are
calculated using measurement data of: (i) active and reactive power flows at ends of
branches which are incident to the node i, (ii) the voltage magnitude at the node i,
(iii) active and reactive power flows at ends of branches which are incident to nodes
neighbouring to the node i. The outputs of ANN for the node i are a base for taking
decisions on the correctness of modelling of the branches incident to this node. One can note
that there are two ANNs which take decisions about the correctness of modelling the
branch between the nodes with which these ANNs are associated. The final
decision is produced on the basis of these two decisions.
The earlier-described idea allows performing TV of a whole PS as many
verifications of the modelling of particular branches. It ought to be underlined that the results of
these verifications should be coordinated with each other. For this situation, a MAS
for PS TV is proposed in [2]. In [2], agents of two types are defined. The agents of
the first type take decisions on the basis of the outputs of the ANNs. The agents of the second type
take decisions on the basis of the decisions produced by the agents of the first type for the
particular branches.
In the paper, as in [4], two MASs are considered. They are called MAS-1 and MAS-2.
Their definitions are the same as in [4].
In MAS-1, nodal agents and the agent called Dispatcher are distinguished. In
MAS-2, apart from the same agents as in MAS-1, there are also so-called substation
agents.
The nodal agent is associated with one electrical node of PS. This agent gathers
measurement data of: (i) active and reactive power flows at ends of branches, which
are incident to its node, (ii) the voltage magnitude at its node. The nodal agent
investigates the possibility of the existence of symptoms of TE occurrence (TE symptoms).
At least one of the following events is considered as a TE symptom:

$$W_{Pi} \notin [-\delta_{WPi}, \delta_{WPi}], \qquad W_{Qi} \notin [-\delta_{WQi}, \delta_{WQi}] \qquad (1)$$

where $W_{Pi}$, $W_{Qi}$ are the unbalance indices for the i-th node for active and reactive power, respectively [1], and $\delta_{WPi}$, $\delta_{WQi}$ are positive constants.
If there are such symptoms, the nodal agent initiates LTV and in effect it takes
decisions regarding the correctness of modelling particular branches of the PS in the area
controlled by it. The nodal agent prepares a message for the Dispatcher. The
message contains the decisions taken in the LTV process.
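A small sketch of the symptom test (1), as it might be run inside the nodal agent's detection logic, is shown below; the computation of the unbalance indices themselves follows the method of [1] and is treated here simply as an input.

```python
# Sketch of the TE-symptom test (1) run by a nodal agent.  The unbalance
# indices W_Pi, W_Qi are computed elsewhere by the method of [1] from the
# measured power flows and voltage magnitudes; here they are plain inputs.

def has_te_symptom(w_p, w_q, delta_wp, delta_wq):
    """True if at least one unbalance index falls outside its band, i.e.
    W_Pi not in [-delta_WPi, delta_WPi] or W_Qi not in [-delta_WQi, delta_WQi]."""
    return abs(w_p) > delta_wp or abs(w_q) > delta_wq

# A nodal agent would initiate LTV only when a symptom is present:
if has_te_symptom(w_p=0.8, w_q=0.1, delta_wp=0.5, delta_wq=0.5):
    print("TE symptom detected - start LTV and report decisions to the Dispatcher")
```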
The substation agent is associated not with one electrical node of PS but with one
substation in which there are many nodes. It performs the same functions as the nodal
agent but for every node of its substation. It also takes final decisions about
correctness of modelling the branches inside its substation, i.e. between electrical
nodes in its substation. The substation agent prepares a message for the agent
Dispatcher. The message contains: (i) decisions taken in the LTV processes regarding
the correctness of modelling the branches between the considered substation and its
neighbouring ones, and (ii) final decisions regarding the correctness of modelling the
branches inside the substation.
The agent Dispatcher gathers messages from the nodal and substation agents. For
branches for which the agent Dispatcher does not receive final decisions about the
correctness of their modelling (from the substation agents), it takes such decisions on the
basis of the decisions from the LTVs.
The analysis model of the considered MASs is built by using the AgentTool_1.8.3
tool of the MaSE technology [9]. In this model, one distinguishes goals, roles and
tasks.
The main goal of the analyzed MASs is PS TV (Fig. 1). The subgoals of MASs
are: management of TV and agents associated with nodes and substations of PS (the
goal 1.1), executing the process of TV (the goal 1.2) and executing the process of
measuring distinguished quantities in PS (the goal 1.3).
The considered systems include one agent which plays the Dispatcher role
(a rectangle). Each of the other agents plays one instance of the Node role or of both the Node
and Substation roles (Fig. 2b, Fig. 3b). The Node role (Fig. 2b, Fig. 3b) fulfils the following
subgoals: 1.2.3, 1.3, 1.3.1, 1.3.2 (Fig. 1). For the SubStation role there is only one
subgoal, 1.2.2 (Fig. 3b). The other goals (Fig. 1) are secured by the Dispatcher role. Each
role performs a few tasks (ellipses). Each task is modelled as a statechart diagram.
Using tasks, the agents playing the roles exchange messages with each other according to the
suitable external protocols (solid lines). Internal protocols (dashed lines) are used when
tasks of the same role exchange messages. MAS-1 includes two types of agents: the first
one playing the Dispatcher role and the second one, the nodal agent, playing the
Node role for each node of PS (Fig. 4). Additionally, MAS-2 contains the substation
agents, playing the Substation and Node roles (Fig. 5). In the last case, the Node
roles represent particular nodes of the multi-node substations of PS.
Fig. 2. The diagrams of the experimental MAS-1 for realization of the TV process: a) sequence diagram, b) role diagram
Fig. 3. The diagrams of the experimental MAS-2 for realization of the TV process: a) sequence diagram, b) role diagram
The sequences of messages exchanged by tasks are used to fulfill goals (Fig. 1) of
roles (Fig. 2b, Fig. 3b). Fig. 2a presents such sequences of one instance of the
Dispatcher role and two instances of the Node role for MAS-1. Additionally, Fig. 3a
includes one instance of the Substation role and one instance of the Node role, both
related to the power substation, modelled in MAS-2. The labels of arrows represent
the messages (Fig. 2a, Fig. 3a) exchanged between the tasks of the roles with the use
of the external protocols (Fig. 2b, Fig. 3b).
Further, it is assumed that the considered PS has n nodes and m branches, and that the
number of branches connected with a considered node is k. Additionally,
from the viewpoint of MAS-2, branches of two kinds are distinguished. The
branches of the first kind connect nodes belonging to different agents playing the
Substation or Node roles. Let us assume that the number of such branches is equal to p.
The branches of the second kind connect nodes belonging to the agents of the
Substation role, i.e. they connect nodes inside a multi-node substation. The number
of such branches is equal to q. Thus m = p + q. The number of agents is equal to d.
In particular, in MAS-1, where there are only agents of the Node role, d = n.
In MAS-1, when any TE symptom is detected (Section 3), the agents of the Node
role take independent decisions regarding the correctness of modelling of branches in the
LTV task. In MAS-2, there is additionally the substation agent, which fulfils the Node and
Substation roles. The Substation role executes complete TV for the branches which are
inside one substation.
The Modeling task builds a PS topology model (Fig. 2b, Fig. 3b). This task
internally sends its data to the State change task using the internal NewMData
protocol. The State change task performs the detection of TE symptoms. This
detection is based on testing the nodal unbalance indices, which are earlier
calculated using measurement data of active and reactive power flows at the ends of
branches. The State change task internally sends its data to the LTV task for LTV by
the internal NewData protocol.
The LTV task uses the external LTV protocol (Fig. 2b, Fig. 3b). During LTV, the
total number of messages exchanged among each LTV task and the Node for LTV
tasks of neighbouring nodes belonging to the different agents (the nodal agents or also
the substation agents) is equal to 4p or 4(m - q) (each message of the r-units size). In
other words, this number is equal to the sum of the numbers of: (i) all the
inform(LTV_NiNj) and inform(LTV_NjNi) messages (Fig. 2a) exchanged among
different agents of the Node role in the case of MAS-1 (Fig. 4); (ii) all the
inform(LTV_NiNj), inform(LTV_NjNi), inform(LTV_NiNsb) and inform(LTV_NsbNi)
messages (Fig. 3a) exchanged among agents of the Node roles and agents of the Node
and Substation roles in the case of MAS-2 (Fig. 5). It is assumed that instances of
the roles Ni and Nj represent the nodal agents, whereas instances of the role Nsb are fulfilled
by substation agents. The total number of sent messages is 4m in the case of MAS-1
and 4p or 4(m-q) in the case of MAS-2.
The TV process is carried out after the appropriate LTV processes finish. In
MAS-1, the LTV tasks (Fig. 2b) send at most k·n messages (each of size r units); the
number k·n is equal to 2m. The considered messages are the inform(TV_Ni) and
inform(TV_Nj) messages (Fig. 2a) for the TV1 task of the agent of the Dispatcher role
(Fig. 2b). If only some of the agents of the Node role detect TE symptoms, they send the
inform(TV_Ni) messages to the agent of the Dispatcher role. The size of each such
message is equal to k·r. The maximal number of the inform(TV_Ni) and
inform(TV_Nj) messages is equal to the number of branches connected with nodes
with TE symptoms.
In the case of MAS-2, as in the case of MAS-1, the nodal agents send messages
with LTV decisions to the agent of the Dispatcher role. The situation is more complex
with respect to the substation agents. The complete TV process for the set of q
branches, which are inside substations, is carried out by the TV2 task of the Substation
role of the substation agents. The final TV decisions for the mentioned branches are
taken by these agents instead of the agent of the Dispatcher role without exchanging
any messages between nodes outside the substation. Then, the TV2 task sends the
inform(results_TV_Nsb) message (Fig. 3a) with a final TV decision to the Data TV2
task of the Dispatcher agent (Fig. 3b). The substation agent does not take final
decisions for the branches which go out from its substation. For such branches only
LTV decisions are taken and messages with these decisions are sent to the TV1 task of
the Dispatcher agent (Fig. 3b). The total number of messages sent by the nodal and
substation agents to the Dispatcher agent is 2p+q or 2m-q.
Summarizing, when all nodes identify TE symptoms, the total number of messages
in the complete TV process is 6m for MAS-1 and 6p+q or 6m-5q for MAS-2. The
mentioned numbers of messages are maximal ones.
The result of the presented analysis is that for MAS-2 the maximal number of
messages exchanged among agents is smaller than for MAS-1. The measure of the
decrease in the mentioned number of messages is the number 5q, where q is the
number of internal branches of multi-node substations.
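The worst-case counts derived above can be reproduced directly; the short sketch below does so, with p expressed as m − q. The values m = 25 and q = 9 are chosen so that the maxima match those quoted for the test system in Table 3 (150 for MAS-1 and 105 for MAS-2).

```python
# Sketch reproducing the worst-case message counts derived above (all nodes
# detect TE symptoms): 6m for MAS-1 and 6p + q = 6m - 5q for MAS-2.

def max_messages_mas1(m):
    """m: number of branches. 4m LTV messages + 2m messages to the Dispatcher."""
    return 4 * m + 2 * m

def max_messages_mas2(m, q):
    """q: branches internal to multi-node substations; p = m - q external branches."""
    p = m - q
    return 4 * p + (2 * p + q)   # LTV messages + messages to the Dispatcher

m, q = 25, 9                     # consistent with the maxima quoted in Table 3
print(max_messages_mas1(m))      # 150
print(max_messages_mas2(m, q))   # 105, i.e. 6m - 5q
```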
After the analysis models are worked out, the design models of MAS-1 (Fig. 4) and
MAS-2 (Fig. 5) are created [9] as Agent Template Diagrams. Agent Template
Diagrams are mapped from their analysis models. These diagrams show the Agent
Communication Language (ACL) messages [10] exchanged between agents. In Fig. 4,
each instance of the Node_Agent1 and Node_Agent2 agents represents the nodes of PS
which are connected with each other. In Fig. 5 the instances of the SubS_Agent1 and
SubS_Agent2 agents represent substations of PS. In both MASs, the Dispatcher agent
manages the whole PS TV.
they send the results of LTVs to the Dispatcher agent which executes TV for appropriate
branches. Additionally, in MAS-2 the SubS_Agent1 and SubS_Agent2 agents execute TV
(Section 4) for branches, which are internal for substations represented by the considered
agents, and the final TV results are sent to the Dispatcher agent.
$\delta_{WQi} = b\sum_{j\in I_i}\sigma_{Qij}^2$, where $\sigma_{Pij}$, $\sigma_{Qij}$ are the standard deviations of small errors burdening the measurement data of active and reactive power flows, respectively, on the branch connecting the nodes i and j at the node i; $I_i$ is the set of nodes connected with the node i; b is a constant equal to 1, 2 or 3.
In carrying out the investigations, it was also taken into account that errors in the model
of a PS topology can exist. The assumption was made that the probability of improper
modelling of a branch is pM, a constant equal to 0.01, 0.001 or 0.0001.
The results of the investigations are given in Tables 1-3. The second subscript of the
quantities presented in Tables 1-3 indicates the value of b for which these quantities
are determined.
In Table 1, the probability pe, i.e. the probability of detecting at least one TE
symptom, is shown for different pM. The probability of detecting at least one TE
symptom is defined as $p_e = s_g / s_t$, where $s_g$ is the number of TV cycles for which at
least one TE symptom is detected and $s_t$ is the number of all the considered TV cycles (for
which the possibility of occurrence of one of the events (1) is tested). When b = 1, the
probability of detecting at least one TE symptom in one TV cycle is very close to 1.
This probability is much smaller for b = 3.
In Table 1, the probability of occurrence of at least one TE (pTE) is also presented
for different pM. The differences between the values of pe and pTE mean that there can be cases
when a TE symptom is detected but no TE occurs. From the viewpoint of "a
false alarm", the most favourable case is when b = 3.
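For reference, the sketch below shows how $p_e = s_g / s_t$ would be estimated from a series of simulated TV cycles; the simulation of a single cycle (drawing topology errors with probability pM, adding measurement errors and testing condition (1)) is only stubbed out here.

```python
# Sketch of how the probability p_e = s_g / s_t in Table 1 is obtained from a
# series of TV cycles.  Each cycle's outcome (True if at least one TE symptom
# was detected) is assumed to come from a simulation that is not shown here.

def estimate_pe(cycle_outcomes):
    """cycle_outcomes: iterable of booleans, True if a cycle detected >= 1 TE symptom."""
    outcomes = list(cycle_outcomes)
    s_t = len(outcomes)          # all considered TV cycles
    s_g = sum(outcomes)          # cycles with at least one detected symptom
    return s_g / s_t

print(estimate_pe([True, True, False, True]))   # -> 0.75
```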
Table 2 contains the mean numbers of nodes which detect TE symptoms ($\tilde{e}_s$) and the mean numbers of their neighbours ($\tilde{e}_n$) which do not detect TE symptoms but perform the LTV processes. Table 2 shows that there exists a strong dependence of $\tilde{e}_s$ and $\tilde{e}_n$ on the parameter b. One can also note that the relation between $\tilde{e}_s$ and $\tilde{e}_n$ is different for b = 1 and for b > 1. For b = 1, $\tilde{e}_s > \tilde{e}_n$, but for b > 1, $\tilde{e}_s < \tilde{e}_n$.
The mean numbers of messages sent during the TV process are shown in Table 3.
The number of messages sent during the TV process can be expressed as
$M_e = M_{en} + M_{el} + M_{ef}$. For MAS-1, $M_{en}$ is the number of the messages
inform(LTV_M_NiNj) and inform(LTV_M_NjNi). For MAS-2, $M_{en}$ is the number of such
messages as in the case of MAS-1 and also of the messages inform(LTV_M_NiNsb)
and inform(LTV_M_NsbNi) (Section 4). $M_{en}$ is determined as $M_{en} = \mathrm{card}\left(\bigcup_{i=1}^{e} Z_i\right)$.
For pM < 0.0001, changes of values of all considered quantities are very small.
Practically, these changes can be neglected.
Table 1. The probability of occurrence of at least one TE (pTE) and the probability of detecting
at least one TE symptom (pei) in the considered test system for different values of pM.
Table 2. The mean number of nodes detecting a TE symptom and their activated neighbours in
one TV cycle for different values of pM.
pM       ẽs1     ẽs2    ẽs3    ẽn1    ẽn2    ẽn3
0.01     10.23   2.17   1.14   7.44   4.71   2.93
0.001    10.15   2.05   1.06   7.49   4.52   2.75
0.0001   10.14   2.04   1.05   7.49   4.50   2.74
Table 3. The mean numbers of messages sent during one TV cycle for different values of pM, as a percentage of the maximal numbers of these messages (Me max). For MAS-1, Me max = 150; for MAS-2, Me max = 105.
0.01 96.62 38.89 7.19 95.54 35.27 6.34 96.62 45.26 28.26 95.54 41.04 24.93
0.001 96.49 36.21 3.05 95.38 32.75 2.69 96.50 43.49 26.62 95.38 39.33 23.42
0.0001 96.48 35.93 2.63 95.36 32.50 2.30 96.48 43.31 26.45 95.36 39.16 23.27
7 Conclusion
In the paper, two MASs, i.e. MAS-1 and MAS-2, are considered. Apart from the same
agents as in MAS-1, MAS-2 assumes the utilization of so-called substation agents. Such
a solution results in improved features of the MAS for PS TV. Such parameters as the
maximal number of messages sent during one TV cycle $M_{e\,max}$, the mean number of
these messages over all considered TV cycles ($M_{eb}$, $b \in \{1, 2, 3\}$), as well as the mean
number of the mentioned messages over those TV cycles in which at least one TE
symptom is detected ($\tilde{M}_{eb}$, $b \in \{1, 2, 3\}$), are lower for MAS-2. The pointed-out
differences are relatively large. The listed parameters for MAS-2 are not larger than
70% of the appropriate parameters for MAS-1. The considered differences are
especially large when the values of $M_{e3}$ or $\tilde{M}_{e3}$ are taken into account. Additionally, it
should be stressed that when the parameter b increases, the here-considered parameters
decrease faster for MAS-2 than for MAS-1.
For MAS-2, i.e. for the more advantageous MAS from among the considered ones,
the mean number $\tilde{M}_{eb}$, $b \in \{1, 2, 3\}$, is not larger than 25% of $M_{e\,max}$, and the number
$M_{eb}$, $b \in \{1, 2, 3\}$, is not larger than 6.5% of $M_{e\,max}$. The last mean number depends
strongly on the probability of occurrence of a TE ($p_M$). It can also be stated that this last
mean number is relatively low.
Generally, the results of the investigations carried out show that MAS-2 accounts for
the specific properties of a PS better than MAS-1 and, in effect, it enables a considerable
reduction of unnecessary transfer of messages in the computer network. It should
also be stressed that in this paper the impact of the operation of a computer network on
the features of MASs for power system topology verification is not considered.
References
1. Lukomski, R., Wilkosz, K.: Method for Power System Topology Verification with Use of
Radial Basis Function Networks. In: Sandoval, F., Prieto, A.G., Cabestany, J., Graña, M.
(eds.) IWANN 2007. LNCS, vol. 4507, pp. 862–869. Springer, Heidelberg (2007)
2. Wilkosz, K.: A Multi-Agent System Approach to Power System Topology Verification.
In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds.) IDEAL 2007. LNCS,
vol. 4881, pp. 970–979. Springer, Heidelberg (2007)
3. Wilkosz, K., Kruczkiewicz, Z., Rojek, T.: Multiagent Systems for Power System Topology
Verification. In: Corchado, E., Yin, H. (eds.) IDEAL 2009. LNCS, vol. 5788, pp. 815–822.
Springer, Heidelberg (2009)
4. Wilkosz, K., Kruczkiewicz, Z.: Multiagent-System Oriented Models for Efficient Power
System Topology Verification. In: Nguyen, N.T., Kim, C.-G., Janiak, A. (eds.) ACIIDS 2011,
Part I. LNCS (LNAI), vol. 6591, pp. 486–495. Springer, Heidelberg (2011)
5. Smith, C.U., Lloyd, G.W.: Performance Solutions, A Practical Guide to Creating
Responsive, Scalable Software. Addison - Wesley, Canada (2002)
6. Babczyński, T., Kruczkiewicz, Z., Magott, J.: Performance Analysis of Multiagent
Industrial System. In: Klusch, M., Ossowski, S., Kashyap, V., Unland, R. (eds.) CIA 2004.
LNCS (LNAI), vol. 3191, pp. 242–256. Springer, Heidelberg (2004)
7. Camacho, D., Aler, R., Castro, C., Molina, J.M.: Performance Evaluation of ZEUS, JADE,
and Skeleton Agent Frameworks. In: IEEE International Conference on Systems, Man, and
Cybernetics, vol. 4, p. 6 (2002)
8. Dikaiakos, M.D., Kyriakou, M., Samaras, G.: Performance Evaluation of Mobile-Agent
Middleware: A Hierarchical Approach. In: Picco, G.P. (ed.) MA 2001. LNCS, vol. 2240,
pp. 244–259. Springer, Heidelberg (2001)
9. Deloach, S.A.: The MaSE Methodology. In: Bergenti, F., Gleizes, M.-P., Zambonelli, F.
(eds.) Methodologies and Software Engineering for Agent Systems. The Agent-Oriented
Software Engineering Handbook Series: Multiagent Systems, Artificial Societies, and
Simulated Organizations, vol. 11. Kluwer Academic Publishing, Dordrecht (2004)
10. Specification of FIPA, http://www.fipa.org/specs/
11. Dopazo, J.F., Klitin, O.A., Stagg, G.W., Van Slyck, L.S.: State Calculation of Power
Systems From Line Flow Measurements. IEEE Trans. on PAS PAS-89(7), 1698–1708
(1970)
12. Dopazo, J.F., Klitin, O.A., Van Slyck, L.S.: State Calculation of Power Systems from Line
Flow Measurements, Part II. IEEE Trans. on PAS PAS-91(1), 145–151 (1972)
Building a Model of an Intelligent Multi-Agent System
Based on Distributed Knowledge Bases
for Solving Problems Automatically
1 Introduction
Up to now, Multi-Agent models and systems have not been applied to solve
problems relating to several fields of mathematics, because the knowledge bases in
these models and systems are not complex enough to solve them. Besides,
the agent architectures and activity models of these systems are not appropriate for
dealing with issues of querying knowledge or solving problems. For example, some
Multi-Agent Systems (MAS), such as those in [1-4] and [8-14], focus on methods to enhance
the effectiveness of E-Learning. In [1], the authors present a set of Technology-enhanced Learning
environments integrating the needed aspects in a synergetic way to fit PBL
(Project Based Learning) well, especially in its context of use. In [2], the intelligent
tutoring module is addressed, and more specifically its BDI (Beliefs, Desires, and
Intentions) based agents. These agents recognize each student and obtain information
about her/his progress. The module then suggests to each student specific tasks to
achieve her/his particular learning objectives. A performance-oriented approach is
presented in [3]. In this approach, a set of key performance indicators (KPIs) has been
set up to represent a set of measures focusing on different aspects of organizational
and individual performance that are critical for the success of the organization. In [4],
the authors propose an approach for integrating a distance learning system and knowledge
management processes using knowledge creation techniques. Based on the proposed
approach, an integrative framework is presented for building knowledge-based
distance learning systems using Intelligent Agent technology. In [8], a model of E-Learning
was proposed based on a process of coupling ontologies and Multi-Agent
Systems for a synergy of their strengths. This model allows human agents to
cooperate with software agents to automatically build courses guided by relevant
learning objectives. The roles of intelligent agents within an E-Learning system,
called E-University, are presented in [9]. The agents perform specific tasks on
behalf of students, professors, administrators, and other members of the university.
A group of intelligent agents for learning activities, such as user interface agents,
task agents, knowledge agents and mobile agents, is also developed. Using multi-agent
technology in an E-Learning system gives user interaction facilities to both users and
designers, and adds the ability to exchange information between different objects in a
flexible way.
In [10], a Multi-Agent model is built to improve the effectiveness of a teaching method
based on WBT (Web Based Tutoring). However, the set of rules in its Knowledge
Library is not suitable for many fields. The agent-based Collaborative Affective
e-Learning Framework in [11] only supports ordinary communication in a virtual
environment; it does not support solving problems with knowledge. The aim of this framework is
to understand e-learners' affective states. In [12], the purpose of the Multi-Agent
e-learning system is to analyze students' errors. Based on the results of comparing
the student's ontology and the teacher's ontology, the system generates useful advice for
students. An ontology (O) consists of the set <T, R, F>, where T is a set of terms, R a set of relations and F a set of
functions. This ontology model is fairly simple; it cannot deal with complex real-world
problems. To build the Mobile-Agent model in [13], the authors use distributed databases.
Therefore, this model can only deal with issues relating to titles of courses and keywords
of subjects; it cannot deal with the contents of knowledge. In [14], the ontology model is
based on a hierarchy tree, so this system also cannot support solving problems
automatically.
In our previous research, such as [6], we concentrated on building a Multi-Agent
model for E-Learning, and in [7] we built a language to query knowledge. In this
paper, we focus on building a Multi-Agent model for solving problems
automatically. The knowledge needed to find the answers to questions is related to three fields and is
distributed in three different places. This model simulates a complex, natural, and
reasonable human organization. In the model, we use the Computational Objects
Knowledge Base (COKB), as in [5], because it was developed to support solving
problems. A COKB contains six components (C, H, R, Ops, Funs, Rules): C is a set of
concepts, H is a set of hierarchies of concepts, R is a set of relations between
concepts, Ops is a set of operations, Funs is a set of functions, and Rules is a set of rules.
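As a hedged illustration of what such a knowledge base might look like in code, the snippet below lays out the six COKB components as plain data; the concrete entries are invented for illustration and are not taken from [5].

```python
# Minimal sketch of a COKB-style knowledge base (C, H, R, Ops, Funs, Rules) as
# plain Python data, loosely following the component list above.  All concrete
# entries are illustrative assumptions, not the actual content of [5].

cokb = {
    "C": {"Point", "Segment", "Triangle"},             # concepts
    "H": [("Triangle", "Polygon")],                     # hierarchy: (sub-concept, super-concept)
    "R": [("belongs_to", "Point", "Segment")],          # relations between concepts
    "Ops": {"+", "-", "*", "/"},                         # operations
    "Funs": {"distance", "midpoint", "perimeter"},       # functions
    "Rules": [
        # each rule: (set of hypothesis facts, derived fact)
        ({"D on Ox"}, "yD = 0"),
    ],
}
```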
The main issues of Multi-Agent systems include interaction between agents,
decomposing (disintegrating) the work, separating the tasks of agents, and synthesizing results.
These agents act autonomously in the environment to reach their goals. They are able
to recognize the state of the environment, act properly and have an impact on the environment. They
coordinate with each other during the process of solving problems.
Some authors have described clustering algorithms which give distributed systems
higher performance than previous clustering algorithms. In [15], an improved
algorithm based on the weighted clustering algorithm is proposed, with additional
constraints for the selection of cluster heads in mobile wireless sensor networks. In [16],
the authors applied the optimization technique of the genetic algorithm (GA) to a new
adaptive cluster validity index, which is called Gene Index (GI). These algorithms can
be applied in our system to cluster agents based on their functions and fields of
knowledge. However, in this paper, we focus on the model of the system, the
architectures of agents, and a method to test the effectiveness of the system.
Similarly, let $S_B$ be the set of facts about the knowledge in field B of the question, $S_B = \bigcup_{j=1}^{l} S_{jB}$, and let $S_C$ be the set of facts about the knowledge in field C of the question, $S_C = \bigcup_{m=1}^{n} S_{mC}$. Then

$$AS = \bigcup_{(C \Rightarrow a)\, \wedge\, (C \subseteq S)} \{a\},$$

i.e. AS collects the conclusions a of the rules C ⇒ a whose premise sets C are contained in S.
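One plausible reading of this formula is a forward-chaining step: every rule whose premise set is already contained in the collected facts contributes its conclusion. The sketch below implements that reading with toy facts and rules; it is an assumption about how the COKB rules would be applied, not the authors' exact procedure.

```python
# Sketch of the answer-set construction suggested by the formula above:
# starting from the collected fact set S, every rule C => a whose hypothesis
# set C is contained in S contributes its conclusion a.  Repeating this until
# nothing new is derived (forward chaining) is an assumed usage of the rules;
# the facts and rules here are toy data.

def answer_set(facts, rules):
    """facts: set of facts S; rules: list of (frozenset C, conclusion a)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for hypothesis, conclusion in rules:
            if hypothesis <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived - set(facts)   # AS: only the newly derived answers

rules = [
    (frozenset({"D on Ox"}), "yD = 0"),
    (frozenset({"yD = 0", "DA = DB"}), "xD = 5/3"),
]
print(answer_set({"D on Ox", "DA = DB"}, rules))   # -> {'yD = 0', 'xD = 5/3'}
```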
4 Architectures of Agents
The architecture of a User Agent is shown in Fig. 3. A user inputs the question/problem into
the system according to given structures through the Received Problem module. This module
sends the problem to the Disintegrate Problem. It disintegrates the problem into objects,
relations and facts. They are sent to the Classify Knowledge and divided into three
groups: the knowledge in field A, the knowledge in field B and the knowledge in field C.
The Knowledge of Agent module memorizes them, sends them to the Know the state of local
environment module, and then they are updated into the Storage. The Knowledge of Agent
sends the knowledge of the problem to Intermediary Agents. After these Agents migrate
to other Places and come back with answers, it receives the answers and sends them to the
Synthesize Answers. This module synthesizes them into final answers. Then it sends
them to the Show Answers. The Show Answers presents final answers to the user.
Then, it receives the result in order to know whether announcing the notice was successful or not. After
that, it activates the Retract Agent to bring the Agent back to its Place. After the Agent migrates, it
reports the result to the Intermediary of its Place. The Knowledge of Agent can activate the Dispose
Agent to kill the Agent if the Agent is lost for a long time while migrating.
5 An Application
We build an application for solving problems relating to three fields: plane geometry
(field A), 2D analytic geometry (field B), and algebra (field C). Users input questions
according to given structures. Agents in the system coordinate to solve the problems
and show answers to users. We used JADE 4.0, JDK 1.6, Maple 13, MathML, XML,
dom4j, Suim, MozSwing, … in our application. Example question 1: In the coordinate
plane Oxy, given A(1, 3), B(4, 2), D ∈ Ox and DA = DB, find the coordinates of point D and
the perimeter of triangle OAB. The process of solving the problem is as follows. The first goal of the
system is to find the coordinates of point D. The system searches for a concept about the
coordinates of a point in the 2D analytic geometry Knowledge Base. Then, the system
finds xD and yD. The system uses knowledge about 2D analytic geometry to calculate
yD. We have: D ∈ Ox ⟹ yD = 0.
In the following step, the system uses knowledge about 2D analytic geometry to
calculate the moduli of DA and DB. With D = (xD, 0):

DA = DB ⟹ $\sqrt{(x_D - 1)^2 + 3^2} = \sqrt{(x_D - 4)^2 + 2^2}$ ⟹ $(x_D - 1)^2 + 9 = (x_D - 4)^2 + 4$

Then, the system uses knowledge about algebra to find the value of xD from this equation:
⟹ xD = 5/3 (Maple is used to find the solutions).
The system uses knowledge about plane geometry to calculate the perimeter of triangle
OAB: Perimeter(OAB) = OA + AB + OB. After calculating the moduli of OA, AB and
OB, the system obtains the result: Perimeter(OAB) = $\sqrt{10} + \sqrt{10} + 2\sqrt{5} = 2\sqrt{10} + 2\sqrt{5} \approx 10.80$.
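The same computation can be cross-checked with a few lines of SymPy, playing the role that Maple plays in the application; the point names and helper function below are ours, not part of the system.

```python
# Cross-check of Example question 1 with SymPy: D = (x, 0) on Ox with DA = DB,
# then the perimeter of triangle OAB.  A small verification sketch only.

from sympy import symbols, Eq, solve, sqrt

x = symbols("x")
A, B, O = (1, 3), (4, 2), (0, 0)

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

# DA = DB  <=>  DA^2 = DB^2
sol = solve(Eq(dist2((x, 0), A), dist2((x, 0), B)), x)
print(sol)                                    # [5/3]

perimeter = sqrt(dist2(O, A)) + sqrt(dist2(A, B)) + sqrt(dist2(O, B))
print(perimeter)                              # 2*sqrt(5) + 2*sqrt(10)
```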
The model in the left part of Fig. 7 is an outline of the model in Fig. 1. The model in
Fig. 1 is detailed to show the kinds of agents, the relationships of agents, and the activity model
of the proposed system. We can compare the model in the left part of Fig. 7 (Multi-Agent
Paradigm) with the model in the right part of Fig. 7 (Client-Server Paradigm).
In the Multi-Agent Paradigm, the network load is reduced. After an agent migrates to
After testing, we see that in 93% of cases the system in the Multi-Agent Paradigm shows
the final results to the user when it is run in a network whose connection is not
continuous. In the other cases, it stops working when the time for which the connection is
interrupted is longer than the life-cycle of the agent. In all cases, the system in the
Client-Server Paradigm cannot show the final results to the user in a network
whose connection is not continuous.
7 Conclusion
References
1. El Kamoun, N., Bousmah, M., Aqqal, A., Morocco, E.J.: Virtual Environment Online for the
Project based Learning session. Cyber Journals: Multidisciplinary Journals in Science and
Technology, Journal of Selected Areas in Software Engineering (JSSE) January Edition (2011)
2. Mikic-Fonte, F.A., Burguillo-Rial, J.C., Llamas-Nistal, M.: A BDI-based Intelligent
Tutoring Module for the e-Learning Platform INES. In: 40th ASEE/IEEE Frontiers in
Education Conference, Washington, DC (2010)
3. Wang, M., Ran, W., Jia, H., Liao, J., Sugumaran, V.: Ontology-Based Intelligent Agents in
Workplace eLearning. In: Americas Conference on Information Systems, AMCIS (2009)
4. Thuc, H.M., Hai, N.T., Thuy, N.T.: Knowledge Management Based Distance Learning
System using Intelligent Agent. Posts, Telecommunications & Information Technology
Journal, Special Issue: Research and Development on Information and Communications
Technology (16), 59–69 (2006)
5. Van Nhon, D.: Construction and development models of knowledge representation for
solving problems automatically, Ph.D Thesis, University of National Science, Ho Chi
Minh City (2001)
6. Van Nhon, D., Khue, N.T.M.: Building a model of Multi-Agent systems and its
application in E-Learning. Posts, Telecommunications & Information Technology Journal,
Special Issue: Research and Development on Information and Communications
Technology (18), 100–107 (2007)
7. Khue, N.T.M., Hai, N.T.: Developing a language based on FIPA-ACL to query knowledge
in a multiagent system. In: Proceedings of the Sixth International Conference on
Information Technology for Education and Research, Vietnam, pp. 176–183 (2010)
8. El Bouhdidi, J., Ghailani, M., Abdoun, O., Fennan, A.: A New Approach based on a
Multi-ontologies and Multiagents System to Generate Customized Learning Paths in an E-
Learning Platform. International Journal of Computer Applications (0975 – 8887) 12(1)
(December 2010)
9. El-Bakry, H.M., Mastorakis, N.: Realization of E-University for Distance Learning.
WSEAS TRANSACTIONS on Computers 8(1) (2009)
10. Yoshida, T.: Cooperation learning in Multi-Agent Systems with annotation and reward.
International Journal of Knowledge-based and Intelligent Engineering System 11, 19–34
(2007)
11. Neiji, M., Ben Ammar, M.: Agent based Collaborative Affective e-Learning Framework.
The Electronic Journal of E-Learning 5(2), 123–134 (2007)
12. Gladun, A., Rogustshina, J.: An ontology based approach to student skills in Multi-Agent
e-Learning systems. International Journal Information Technologies and Knowledge 1
(2007)
13. Quah, J.T.S., Chen, Y.M., Leow, W.C.H.: E-Learning System, Nanyang Technological
University, Team LiB (2004)
14. Tecuci, G., Boicu, M., Marcu, D., Stanescu, B., Boicu, C., Comello, J.: Training and Using
Disciple Agents, US Army War College (2004)
15. Hong, T.-P., Wu, C.-H.: An Improved Weighted Clustering Algorithm for Determination
of Application Nodes in Heterogeneous Sensor Networks. Journal of Information Hiding
and Multimedia Signal Processing 2(2), 173–184 (2011)
16. Lin, T.C., Huang, H.C., Liao, B.Y., Pan, J.S.: An Optimized Approach on Applying
Genetic Algorithm to Adaptive Cluster Validity Index. International Journal of Computer
Sciences and Engineering Systems 1(4), 253–257 (2007)
Temporal Reasoning in Multi-agent Workflow Systems
Based on Formal Models
1 Introduction
Patient planning in hospitals is a highly complex task due to distributed organizational
structure, dynamic medical processes, uncertainty in operation time and stochastic
arrival of urgent patients. In existing literature, there are several studies on planning
and scheduling in hospitals. For example, Decker and Li propose a MAS
solution using the Generalized Partial Global Planning approach that preserves the
existing organization structures while providing better performance [1]. Kutanoglu
and Wu investigate a new method based on combinatorial auction [2]. Oddi and Cesta
explore constraint-based scheduling techniques and implement a mixed-initiative
problem solving approach to managing medical resources in a hospital [3]. Daknou,
Zgaya, Hammadi and Hubert focus on treatment scheduling for patients at the emergency
department of hospitals based on MAS [4]. In hospitals, an urgent patient often needs
to be handled properly within a time constraint. A critical issue is to determine whether
the medical processes of a patient can be completed within a time constraint in the presence
of uncertainty, based on the available resources in hospitals. The problem is a
Temporal Constraint Satisfaction Problem (TCSP). The objective of this paper is to
propose a viable approach to develop a problem solver for TCSP.
A typical medical process may consist of a number of operations such as registration, diagnosis,
radiology test, blood examination, anesthesia, surgery, intensive care and discharge. These
operations are performed by different hospital workers such as doctors, staff, specialists
and nurses. We propose a problem solving environment based on MAS to determine whether a
patient can be handled in a timely manner based on the available resources in a hospital. There are
three types of agents: workflow agents, resource agents (e.g.
doctor agents, staff agents, specialist agents and nurse agents) and patient agents. The
problem is to determine whether there exists a set of workflow agents and a set of
resource agents with available time slots that can coherently handle a patient within a
time constraint. MAS provides a flexible architecture for agents to discover each other
and form a dynamic team to handle patients. In this paper, we adopt MAS to study the
problem and develop our design methodology. To state this problem, we need the
following definition. Let ψ denote the time constraint for completing a patient's
medical processes. Let RA denote the set of resource agents and WA denote the set
of workflow agents. In hospitals, the processing time for operations is usually highly
uncertain. The uncertainty in the operation time for an operation performed by an agent
in RA is specified by an interval [α, β], where α and β are the lower bound and the
upper bound on the operation time. A dynamic organization is denoted
by H(WA, RA), where WA denotes the set of workflow agents in H and RA denotes
the set of resource agents that take part in the activities in H. The Temporal
Constraint Satisfaction Problem (TCSP) is stated as follows. Given a time
constraint ψ, a set of resource agents RA, a set of workflow agents WA, and the lower
bound and the upper bound on the processing time for each operation performed by
the resource agents, the problem is to determine whether there
exist RA′ ⊆ RA and WA′ ⊆ WA such that the shortest completion time and the longest
completion time of H(WA′, RA′) satisfy ψ.
The Java Agent Development Environment (JADE) platform provides a built-in
directory service through the Directory Facilitator (DF) agent to simplify service
publication and discovery in the development of MAS. We develop a problem solver
based on JADE as shown in Fig. 1. To study TCSP, a mathematical model for each
workflow agent and resource agent is proposed in the next section.
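As a purely illustrative sketch (not code from the paper), the following JADE fragment shows how a workflow agent could register its service with the DF agent, search the yellow pages for peers, and issue a call for proposals; the class name, service type and message content are our own assumptions.

// Illustrative JADE sketch of DF-based service publication, discovery and CFP.
import jade.core.Agent;
import jade.domain.DFService;
import jade.domain.FIPAException;
import jade.domain.FIPAAgentManagement.DFAgentDescription;
import jade.domain.FIPAAgentManagement.ServiceDescription;
import jade.lang.acl.ACLMessage;

public class WorkflowAgent extends Agent {
    protected void setup() {
        try {
            // Publish this agent's service (e.g. its service input place) in the yellow pages.
            DFAgentDescription dfd = new DFAgentDescription();
            dfd.setName(getAID());
            ServiceDescription sd = new ServiceDescription();
            sd.setType("workflow");
            sd.setName("service-input-place-p3");   // illustrative service name
            dfd.addServices(sd);
            DFService.register(this, dfd);

            // Discover other workflow agents offering the required service type.
            DFAgentDescription template = new DFAgentDescription();
            ServiceDescription wanted = new ServiceDescription();
            wanted.setType("workflow");
            template.addServices(wanted);
            DFAgentDescription[] results = DFService.search(this, template);

            // Send a call-for-proposals message to each discovered agent.
            ACLMessage cfp = new ACLMessage(ACLMessage.CFP);
            for (DFAgentDescription result : results) {
                cfp.addReceiver(result.getName());
            }
            cfp.setContent("requested operation and time constraint go here"); // assumed content
            send(cfp);
        } catch (FIPAException e) {
            e.printStackTrace();
        }
    }
}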
initial marking of the PN, with Z as the set of nonnegative integers. A Petri net with
initial marking m0 is denoted by G(m0). A marking of G is a vector m ∈ Z^|P| that
indicates the number of tokens in each place under a state. •t denotes the set of input
places of transition t. A transition t is enabled and can be fired under m iff
m(p) ≥ F(p, t) for all p ∈ •t. In a TPN, each transition is associated with a time interval
[α, β], where α is called the static earliest firing time, β is called the static latest
firing time, and α ≤ β; α (α ≥ 0) is the minimal time that must elapse, from the moment
the transition is enabled, before the transition may fire, and β (0 ≤ β ≤ ∞) is the maximum
time during which the transition can be enabled without being fired. Firing a transition
removes one token from each of its input places and adds one token to each of its output
places. A marking m′ is reachable from m if there exists a firing
sequence s bringing m to m′. A TPN G = (P, T, F, C, m0) is live if, no matter what
marking has been reached from m0, it is possible to ultimately fire any transition
of G by progressing through some further firing sequence.
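To make the enabling and firing rules above concrete, here is a minimal sketch under our own simplifying assumptions (unit arc weights, places indexed by integers); the class and field names are illustrative and not part of the paper's formal model.

// Sketch of the enabling and firing rules for an ordinary Petri net transition.
import java.util.List;

class Transition {
    List<Integer> inputPlaces;   // indices of the places in the pre-set of t
    List<Integer> outputPlaces;  // indices of the places in the post-set of t
    double alpha, beta;          // static earliest and latest firing times of the TPN

    boolean isEnabled(int[] marking) {
        for (int p : inputPlaces)
            if (marking[p] < 1) return false;  // assumes unit arc weights F(p, t) = 1
        return true;
    }

    void fire(int[] marking) {
        if (!isEnabled(marking)) throw new IllegalStateException("transition not enabled");
        for (int p : inputPlaces)  marking[p]--;   // remove one token from each input place
        for (int p : outputPlaces) marking[p]++;   // add one token to each output place
    }
}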
[Fig. 2. Time Petri net models of the workflow agents w1–w9, covering operations such as Registration, Diagnosis, Radiology Test 1, Radiology Test 2, Blood Examination, Anesthesia, Intensive Care, Return to Ward and Discharge; each transition ti is annotated with its static firing interval C(ti) = [αi, βi].]
[Fig. 3. Time Petri net models of the resource activities H_r^k associated with the workflows in Fig. 2.]
between H_r^k and H_r^k′ for k ≠ k′. The initial marking satisfies m_r^k(p_r) = 1 and m_r^k(p) = 0 for
each p ∈ P_r^k \ {p_r}, where p_r is the idle state place of resource r, and
C_r^k(t) ∩ C_r^k(t′) = Φ for all t, t′ ∈ T_r^k with t ≠ t′.
Fig. 3 illustrates the resource activities associated with the workflows in Fig. 2. To
solve TCSP based on the proposed TPN models we study the temporal property in the
next section.
To compute the earliest completion time for a workflow agent to execute transitions
for given available time intervals from the resource agents, we study the temporal
property of healthcare workflows by exploiting the workflow structure. We define a
token flow path as follows.
Definition 4.1: A token flow path π = p1 t1 p2 t2 p3 t3 ... pn tn p is a directed path
of w_n. Π_n denotes the set of all token flow paths starting with the service input place
and ending with the service output place in w_n.
Property 4.1: Given w_n under marking m_n, the shortest time for a token to arrive at
the sink place θ_n from the source place ε_n is τ_n = max_{π ∈ Π_n} f_α(π), where f_α(π) = Σ_{t_n ∈ π} α_n.
The longest time for a token to arrive at the sink place θ_n from the source
place ε_n is τ̄_n = max_{π ∈ Π_n} f_β(π), where f_β(π) = Σ_{t_n ∈ π} β_n.
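The property can be illustrated with a brute-force sketch that enumerates the token flow paths of an acyclic workflow net and returns the maxima of the summed α and β values over all paths. This is only a didactic illustration under our own representation assumptions; it is not the polynomial algorithm developed in the paper.

// Sketch of Property 4.1: path-sum bounds over an acyclic workflow net.
import java.util.List;

class CompletionTime {
    // outTrans.get(p): transitions leaving place p; outPlaces.get(t): output places
    // of transition t; alpha[t], beta[t]: static firing bounds of transition t.
    static double[] bounds(int source, int sink,
                           List<List<Integer>> outTrans, List<List<Integer>> outPlaces,
                           double[] alpha, double[] beta) {
        double[] best = {0.0, 0.0};   // {max sum of alpha, max sum of beta} over all paths
        walk(source, sink, outTrans, outPlaces, alpha, beta, 0.0, 0.0, best);
        return best;
    }

    private static void walk(int place, int sink,
                             List<List<Integer>> outTrans, List<List<Integer>> outPlaces,
                             double[] alpha, double[] beta,
                             double sumA, double sumB, double[] best) {
        if (place == sink) {                         // a complete token flow path
            best[0] = Math.max(best[0], sumA);
            best[1] = Math.max(best[1], sumB);
            return;
        }
        for (int t : outTrans.get(place))            // follow every outgoing transition
            for (int q : outPlaces.get(t))           // and every output place of it
                walk(q, sink, outTrans, outPlaces, alpha, beta,
                     sumA + alpha[t], sumB + beta[t], best);
    }
}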
[Fig. 4. Sequence diagram of service discovery and call for proposals among the DF agent and the workflow agents w1–w9, showing the queries Q1–Q8, the replies R1–R8 and the call-for-proposals messages CFP1–CFP9.]
agent wn with ε n = θ m and issue “Call for proposals” message (CFP) to them as
needed. A workflow agent wm will also query the DF Agent to discover resource
agents required to perform the activities in workflow agent wm . Fig. 4 shows how
workflow agent w1 searches the yellow page for the workflow agent with service input
place p 3 by sending Q1. In this example, w1 queries the DF Agent to
discover w2 and w3 that can provide the requested services. Once w3 and w2 are
discovered, w1 will send a “Call for proposals” message (CFP1) to w2 and a “Call for
proposals” message (CFP2) to w3 as shown in Fig.4. On receiving the “Call for
proposals” message (CFP1), w2 searches the yellow page by sending message Q2 as
shown in Fig. 4 to discover the potential workflow agents with service input place p5 .
A similar process follows. Fig. 4 illustrates the sequence diagram of service discovery
and call for proposals. Based on the service publication and discovery mechanism,
CNP can be applied to facilitate negotiation and temporal reasoning of workflow
agents. Fig. 5 shows the screenshot for defining a workflow and an activity, and the state
diagram for handling a CFP by a workflow agent. The following polynomial-complexity
algorithm is applied by workflow agent w_n to find its earliest completion time σ_n and its
latest completion time σ̄_n based on the earliest completion time σ and the latest
completion time σ̄ of the manager.
β13 + β14 + β15 + β16 + β17 + β18 + β19 + β20,
β1 + β2 + β3 + β4 + β5 + β8 + β9 + β10 + β11 + β12 + β13 + β14 + β15 + β16 + β17 + β18 + β19 + β20,
β1 + β2 + β3 + β4 + β21 + β22 + β13 + β14 + β15 + β16 + β17 + β18 + β19 + β20.
[Fig. 5. (a)–(c) Screenshot for defining a workflow and an activity, and the state diagram for handling a CFP by a workflow agent; visible states and transitions include Idle, Temporal Reasoning, issuing a CFP to the downstream workflow agent, and Time Constraint violated.]
6 Conclusion
algorithm is polynomial. Therefore, our algorithm is much more efficient than SCG
for the subclass of TPN. It is also scalable to large-scale problems. Our results
indicate that reasoning about temporal constraints in the subclass of TPN proposed in
this paper can be achieved in polynomial time by exploiting its structure.
References
1. Decker, K., Li, J.: Coordinated hospital patient scheduling. In: International Conference on
Multi Agent Systems, Paris, July 3-7, pp. 104–111 (1998)
2. Kutanoglu, E., David Wu, S.: On combinatorial auction and Lagrangean relaxation for
distributed resource scheduling. IIE Transactions 31, 813–826 (1999)
3. Oddi, A., Cesta, A.: Toward interactive scheduling systems for managing medical
resources. Artificial Intelligence in Medicine 20, 113–138 (2000)
4. Daknou, A., Zgaya, H., Hammadi, S., Hubert, H.: A dynamic patient scheduling at the
emergency department in hospitals. In: IEEE Workshop on Health Care Management,
Venice, February 18-20, pp. 1–6 (2010)
5. Spyropoulos, C.D.: AI planning and scheduling in the medical hospital environment.
Artificial Intelligence in Medicine 20, 101–111 (2000)
6. Nilsson, N.J.: Artificial Intelligence: A New Synthesis. Morgan Kaufmann Publishers,
Inc., San Francisco (1998)
7. Ferber, J.: Multi-Agent Systems, An Introduction to Distributed Artificial Intelligence.
Addison-Wesley, Reading (1999)
8. Durfee, E.H., Lesser, V.R., Corkill, D.D.: Trends in cooperative distributed problem
solving. IEEE Transactions on Knowledge and Data Engineering 1(1), 63–83 (1989)
9. Russell, S.J., Norvig, P.: Artificial Intelligence—A Modern Approach, 2nd edn. Pearson
Education Asia Limited (2006)
10. Workflow Management Coalition, XPDL support and resources (2009),
http://www.wfmc.org/xpdl.html
11. Object Management Group, Business process modeling notation (2009),
http://www.bpmn.org
12. OASIS, Web services business process execution language version 2.0 (2009),
http://docs.oasis-open.org/wsbpel/2.0/OS/wsbpel-v2.0-OS.html
13. Merlin, P., Farber, D.: Recoverability of communication protocols. IEEE Trans. on
Communications 24(9), 1036–1043 (1976)
14. Smith, R.G.: The Contract net protocol: high-level communication and control in a
distributed problem solver. IEEE Transactions on Computers 29, 1104–1113 (1980)
15. Murata, T.: Petri Nets: Properties, Analysis and Applications. Proceedings of the
IEEE 77(4), 541–580 (1989)
16. Berthomieu, B., Menasche, M.: An Enumerative Approach for Analyzing Time Petri Nets.
In: Proc. Ninth International Federation of Information Processing (IFIP) World Computer
Congress, vol. 9, pp. 41–46 (September 1983)
17. Conry, S.E., Kuwabara, K., Lesser, V.R., Meyer, R.A.: Multistage negotiation for distributed
constraint satisfaction. IEEE Transactions on Systems, Man and Cybernetics 21(6),
1462–1477 (1991)
Assessing Rice Area Infested by Brown Plant Hopper
Using Agent-Based and Dynamically Upscaling Approach
Vinh Gia Nhi Nguyen1, Hiep Xuan Huynh2, and Alexis Drogoul1
1
UMI 209 UMMISCO, IRD/UPMC, Institut de recherche pour le développement
Bondy, France
{nngvinh,alexis.drogoul}@gmail.com
2
DREAM Team, School of Information and Communication Technology,
Cantho University, Vietnam
[email protected]
1 Introduction
The brown plant hopper (BPH), Nilaparvata lugens, is a major pest of rice [2],[5]. This
insect damages rice crop systems in Asia and carries rice virus diseases such as grassy
stunt, ragged stunt and wilted stunt, which lower rice production. Local and national
governments have tried to control BPH invasions and outbreaks in the Mekong Delta
region of Vietnam through the activities of researchers and farmers who collect BPH data
by observation [8]. Questions raised when exploring brown plant hopper problems include:
"How to predict the invasion of brown plant hoppers?" or "How to know the BPH density
at a certain location in time and space in the Mekong Delta?". Monitoring the flight activity
and population dynamics of brown plant hoppers is conducted by direct sampling surveys
and by trapping BPH adults during rice seasons. Researchers and staff from the plant
protection department at the commune level collect the daily BPH density (individuals/m2)
and the BPH infestation rice area (ha) and send weekly reports to the plant protection
department at the district level; a synthesis of these reports is made by staff and distributed
to decision-makers at the province scale so that suitable policies can be set during the rice
crop season [10]. At the field scale, most farmers rely on individual experience and
recommendation messages from the local government at the district and province levels to choose
the kind of rice crop and fertilisation. However, BPH infestation rice area data are
available, and quite detailed, at the field scale, but not at the district and province scales,
where the required BPH infestation rice area information is more general in meaning.
Fig. 1 illustrates the information flows between administrative scales:
[Fig. 1. Information flows between administrative scales: lower scales aggregate and report upwards, while decision-makers and researchers at the province scale provide planning feedback and recommend insecticide and rice crop protection plans.]
Scaling is the transmission of information between scales [1], and scaling can help fill the
gap between the scale at which most information is available and the scale at which
most decisions concerning plans are made [3]. The aim of upscaling is to answer questions at
scales at which they cannot be solved directly; e.g. decision-makers at the province scale do not need
micro information at the field/commune scale, but they do need macro information from the district
scale to make long-term provincial strategies and predictions for at least one
week/month/season/year. Most macro information at the district and province scales is
estimated from the experience and knowledge of experts. Thus, an agent-based model [4],[6] is
needed to capture real-world objects (decision-maker, researcher, farmer, BPH, etc.) as
agents and to make linkages through reports, experiences and feedback between agents
for assessing the BPH infestation rice area. This paper attempts to describe a model using an
agent-based and upscaling approach to assess the BPH infestation rice area in the hierarchy
of Fig. 1, from the field/commune scale to the province scale. The paper is structured as follows:
Section 2 proposes the MSMEKO model using the agent-based and upscaling approach, Section
3 presents results and discussion, and Section 4 draws conclusions.
The MSMEKO model is built on the GAMA platform [7]. GAMA supports GIS tools for
designing the spatial environment of agents, and the status of agents can be explored during
simulation. Active agents such as decision-maker, BPH and rice have attributes and
actions (Fig. 2); other, passive agents could be the weather and spatial objects from GIS
data (province, district, commune). Relationship lines in Fig. 2 determine possible
inheritance and interactions among agents. A land-use GIS map is loaded into the
simulation environment, and agents such as province, commune and district are created
automatically with the attributes code, name, area and coordinates. Because simulating
each individual BPH on the land-use GIS map would consume too much time and memory,
a group of BPHs is considered as one agent, which automatically updates its
number of eggs, nymphs and adults and the number of adults invading other group agents
at each simulation step. Real-world objects in the Mekong Delta region are
represented as agents in the computer model in the following table:
Attributes and processes in each agent are adapted during the implementation phase.
The CdecisionMaker agent uses the action report() to collect the daily BPH density and BPH
infestation rice area from sample rice fields and sends weekly reports to the Ddecision-
Maker agent at the district scale; these reports are synthesized by the actions aggregate() and
report() and sent weekly to the PdecisionMaker agent for decisions concerning agri-
cultural systems. This paper focuses on the aggregating/upscaling function
for conveying BPH infestation rice area information from the CdecisionMaker agent to the
PdecisionMaker agent.
[Fig. 2. UML class diagram of the model's agents (Province, District, Crop, Rice, LightTrap, Weather, PdecisionMaker, ...) with their attributes (e.g. id, pName, coordinates, area, bphDensity, windDir, windSpeed, humidity, rainfall, temperature, kindOfRice, timeToHarvest, startTime, endTime) and actions (e.g. plan(), report(), aggregate(), updateLightTrap(), updateWeather(), updateStage()).]
To assess the status of BPH infestation at the commune scale, BPH infestation in the study
[9] is classified into three levels: light, medium and heavy infestation (Table 2). An
example of the BPH infestation levels at the commune scale on 01-July-2009 in Dong
Thap province is shown below:
Fig. 3. Three BPH infestation levels on rice fields in Dong Thap province, on 01-July-2009
d_i = w_1·c_i^1 + w_2·c_i^2 + ... + w_m·c_i^m = Σ_{j=1}^{m} w_j·c_i^j    (3)
The same procedure applies when aggregating BPH infestation rice area information from
the district to the province level.
[Fig. 4. Upscaling BPH infestation rice area information from field to province scale: data and model at the field/commune scale are aggregated into results at the district scale, which are aggregated again into results at the province scale.]
Procedure 1 (question Q1): communes have the same influence on each other, so all weight
values are equal to 1:

w_j = 1, j = 1, 2, ..., m  ==>  d_i = Σ_{j=1}^{m} c_i^j    (4)

For the second procedure, separate weights are used for the summer–autumn (SA) and
autumn–winter (AW) crops:

d_i = Σ_{j=1}^{m} (w_{j,SA}·c_i^{j,SA} + w_{j,AW}·c_i^{j,AW})    (5)

The function d_i in Eq. 6 is a non-linear (piecewise) function used to answer questions like (Q3):

d_i = Σ_{j=1}^{m} w_{j,SA}·c_i^{j,SA} if the rice season in question (Q3) is summer–autumn,
d_i = Σ_{j=1}^{m} w_{j,AW}·c_i^{j,AW} if the rice season is autumn–winter.    (6)
The model can be extended from the province scale to the region scale. For example, what is the
total BPH infestation rice area in a month/season in a certain region? The agents Pdeci-
sionMaker, DdecisionMaker and CdecisionMaker can assess the BPH infestation rice
area information at each scale by using the above aggregating functions as actions.
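As an illustration of how such aggregating actions might look, the following sketch implements the weighted sum of Eq. 3 and the variants of Eqs. 4 and 6 in plain Java; the class and method names are our own and do not come from the MSMEKO/GAMA implementation.

// Illustrative upscaling helpers for the aggregation equations above.
class Upscaling {
    // Procedure 1 (Eq. 4): all lower-scale units have equal influence (w_j = 1).
    static double aggregateEqual(double[] infestedArea) {
        double d = 0.0;
        for (double c : infestedArea) d += c;
        return d;
    }

    // Eq. 3: weighted aggregation d_i = sum_j w_j * c_ij.
    static double aggregateWeighted(double[] w, double[] c) {
        double d = 0.0;
        for (int j = 0; j < c.length; j++) d += w[j] * c[j];
        return d;
    }

    // Procedure 3 (Eq. 6): only the rice season named in the question contributes.
    static double aggregateBySeason(boolean summerAutumn,
                                    double[] wSA, double[] cSA,
                                    double[] wAW, double[] cAW) {
        return summerAutumn ? aggregateWeighted(wSA, cSA)
                            : aggregateWeighted(wAW, cAW);
    }
}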
(large BPH infestation rice area) in communes or districts; e.g. the heavy BPH infestation
rice area in Sa Dec and Tan Hong districts is very high in comparison with other dis-
tricts. In the 1st week (Fig. 5), the BPH infestation rice areas in Lap Vo, Lai Vung and Chau
Thanh districts have the same circle perimeter for light and medium BPH infestation
although these districts have different rice crop areas; this helps decision-makers find
district groups or clusters having similar BPH infestation rice areas.
Fig. 5. District BPH infestation rice area upscaled from communes in Dong Thap province in the 1st week of July 2009 using procedure 1
Fig. 6. District BPH infestation rice area upscaled from communes in Dong Thap province in the 2nd week of July 2009 using procedure 1
Fig. 7. Province BPH infestation rice area upscaled from districts in the 1st week of July 2009 with procedure 1
Fig. 8. Total BPH infestation rice area (ha) for each level at province scale, weekly, July 2009. L1, M1, H1 are the total BPH infestation rice areas of the light, medium and heavy levels at province scale
In Fig. 5, Huyen Cao Lanh district has only light infestation rice area; after one week
(Fig. 6), small medium and heavy infestation rice areas appear in this district.
In the 2nd week of July (Fig. 6), because most districts finished harvesting sum-
mer–autumn rice crops and started the autumn–winter rice season, Sa Dec district has little
remaining rice area in the ripening stage, and the BPHs migrate to other rice fields. Similarly,
Fig. 7 shows the BPH infestation rice area at the province scale aggregated from the BPH in-
festation rice areas at the district scale. The total BPH infestation rice area (ha) at the province
scale using procedure 1 (Fig. 8) shows that the results from simulation and from the estima-
tion reports have some differences; LR1, MR1, HR1 are the estimated totals of
light, medium and heavy infestation (ha) from the estimation reports [10].
Fig. 9. Average BPH infestation rice area for each level at province scale, weekly, July 2009. L2, M2, H2 are the average BPH infestation rice areas of light, medium and heavy levels per district at province scale
Fig. 10. Total BPH infestation rice area for each level on summer-autumn rice crops at province scale, weekly, July 2009. L3, M3, H3 are the total BPH infestation rice areas of light, medium and heavy levels at province scale
Fig. 9 shows the weekly average BPH infestation rice area per district resulting from
upscaling procedure 2 (Eq. 5); the results fit rather well with the infestation rice area indicators
in the estimation reports (LR2, MR2, HR2). In Fig. 10, procedure 3 (Eq. 6) shows that at
the end of July the summer–autumn rice season is mostly finished, and the BPH infestation rice
area for this season decreased as rapidly as that in the estimation reports at the province level
(LR3, MR3, HR3). The upscaling could be extended from the province scale to the
Mekong Delta region scale, which would include province agents when assessing the BPH infestation
rice area across all provinces. The regional plant protection center relies on the estima-
tion reports from the provinces to obtain a general understanding of the BPH infestation rice
areas of the provinces and to set more suitable policies across provinces.
4 Conclusion
References
1. Bierkens, F.P., Finke, P.A., de Willigen, P.: Upscaling and downscaling methods for envi-
ronmental research. Kluwer Academic Publishers (2000)
2. Cheng, J.: Rice planthopper problems and relevant causes in China. In: Heong, K.L.,
Hardy, B. (eds.) Planthoppers: New Threats to the Sustainability of Intensive Rice Produc-
tion Systems in Asia, pp. 157–178. The International Rice Research Institute (IRRI),
Los Baños (2009)
3. Dalgaard, T., Hutchings, N.J., Porter, J.R.: Agroecology, scaling and interdisciplinarity.
Agriculture, Ecosystems & Environment 100, 39–51 (2003)
4. Damgaard, M., Kjeldsen, C., Sahrbacher, A., Happe, K., Dalgaard, T.: Validation of an
agent-based, spatio-temporal model for farming in the RIver Gudenå landscape. Results
from the MEA-Scope case study in Denmark. In: Piorr, A., Müller, K. (eds.) Rural Land-
scapes and Agricultural Policies in Europe, pp. 241–258. Springer, Berlin (2009)
5. Dyck, V.A., Thomas, B.: The brown planthopper problem. In: Brown Planthopper: Threat
to Rice Production in Asia, pp. 3–17. The International Rice Research Institute, Los Baños
(Philippines) (1997)
6. Drogoul, A., Vanbergue, D., Meurisse, T.: Multi-agent Based Simulation: Where Are the
Agents? In: Sichman, J.S., Bousquet, F., Davidsson, P. (eds.) MABS 2002. LNCS (LNAI),
vol. 2581, pp. 1–15. Springer, Heidelberg (2003)
7. Drogoul, A., et al.: Gama simulation platform (2011),
http://code.google.com/p/gama-platform/
8. Heong, K.L., Escalada, M.M., Huan, N.H., Chien, H.V., Choi, I.R., Chen, Y., Cabunagan, R.:
Research and implementation issues related to management of the brown planthopper/virus
problem in rice in Vietnam. In: Australian Centre for International Agricultural Research-
ACIAR, vol. 08 (2008)
9. Phan, C.H., Huynh, H.X., Drogoul, A.: An agent-based approach to the simulation of brown
plant hopper invasions (BPH) in the Mekong Delta. In: IEEE-RIVF, pp. 227–232 (2010)
10. Provincial plant protection department (2009),
http://www.dongthap.gov.vn/wps/wcm/connect/Internet/
sitbaodientu/sitathongtincanbiet/sitasinhvatgayhai/
11. Plant Protection Department: National technical regulation on surveillance methods of
plant pests. Ministry of Agriculture and Rural Development, Vietnam, QCVN 01-38 (2010)
A Service Administration Design Pattern
for Dynamically Configuring Communication Services
in Autonomic Computing Systems
1 Introduction
The Service administration design pattern decouples the implementation services
from the time at which the services are configured into an application or a system.
This decoupling improves the modularity of services and allows the services to evolve
over time independently of configuration issues (such as whether two services must
be co-located or what concurrency model will be used to execute the services) [1]. In
addition, the Service administration pattern centralizes the administration of the
services it configures. This facilitates automatic initialization and termination of
services and can improve performance by factoring common service initialization and
termination patterns into efficient reusable components.
As distributed computing applications grow in size and complexity in response to
increasing computational needs, it is increasingly difficult to build a system that
satisfies all requirements and design constraints that it will encounter during its
lifetime. Many of these systems must operate continuously, disallowing periods of
downtime while humans modify code and fine-tune the system [2]. For instance,
several studies document the severe financial penalties incurred by companies when
facing problems such as data loss and data inaccessibility. As a result, it is important
for applications to be able to self-reconfigure in response to changing requirements
and environmental conditions.
This approach enables a decision-making process to dynamically evolve
reconfiguration plans at run time. Autonomic systems sense the environment in
which they are operating and take action to change their own behavior or the
environment with minimum effort. Every autonomic system has four properties:
monitoring, decision making, reconfiguration and execution; Figure 1 shows the
autonomic system [5].
Although the use of patterns like Reactor, Acceptor, and Connector improves the
modularity and portability of the distributed time server, configuring communication
services using a static approach has the following drawbacks [1]:
Service Configuration Decisions Must Be Made Too Early in the Development
Cycle: This is undesirable since developers may not know a priori the best way to co-
locate or distribute service components. For example, the lack of memory resources in
wireless computing environments may force the split of Client and Clerk into two
independent processes running on separate hosts. In contrast, in a real-time avionics
environment it might be necessary to co-locate the Clerk and Server into one process
to reduce communication latency. Forcing developers to commit prematurely to a
particular service configuration impedes flexibility and can reduce performance and
functionality [1].
Modifying a Service May Adversely Affect Other Services: The implementation of
each service component is tightly coupled with its initial configuration. This makes it
hard to modify one service without affecting other services. For example, in the real-
time avionics environment mentioned above, a Clerk and a Time Server might be
statically configured to execute in one process to reduce latency. If the distributed
time algorithm implemented by the Clerk is changed, however, the existing Clerk
code would require modification, recompilation, and static relinking. However,
terminating the process to change the Clerk code would terminate the Time Server as
well. This disruption in service may not be acceptable for highly available systems.
System Performance May Not Scale Up Efficiently: Associating a process with
each service ties up OS resources (such as I/O descriptors, virtual memory, and
process table slots). This design can be wasteful if services are frequently idle.
Moreover, processes are often the wrong abstraction for many short-lived
communication tasks (such as asking a Time Server for the current time or resolving a
host address request in the Domain Name Service). In these cases, multithreaded
Active Objects or single-threaded Reactive event loops may be more efficient.
3.2 Classification
Structural - Monitoring
3.3 Intent
The system deals with service invocation and management. When a client invokes a service
from the existing services, the observer observes the invocation and reports it to the service
class, which initiates a time stamp and assigns a clerk to the service. The service is stored in
the service repository, which creates a separate thread for each process and manages the
service. If the service is not completed within its time stamp, the service is suspended for
some time based on the decision of the service class; when the service class is available
again, the service is resumed.
3.4 Motivation
The Service Administration design pattern decouples the implementation of services
from the time at which the services are configured into an application or a system.
This decoupling improves the modularity of services and allows the services to evolve
over time independently of configuration issues (such as whether two services must
be co-located or what concurrency model will be used to execute the services). In
addition, the Service Administration design pattern centralizes the administration of
the services it configures. This facilitates automatic initialization and termination of
services and can improve performance by factoring common service initialization and
termination patterns into efficient reusable components.
A UML class diagram for the Service Administration Pattern can be found in
Figure 2.
Three design patterns are used in the Service Administration design pattern: Reflective
Monitoring, Strategy and Thread-per-Connection. The client invokes a service in the service
class; the observer observes the invocation in the service class and reports it, and based on
this report the service class chooses an appropriate time stamp and clerk for the service.
After the time stamp is assigned, the service is stored in the service repository. The service
repository handles services based on the Thread-per-Connection pattern, which creates a
separate thread for each service. Each service is executed in a separate location, and the
clerk observes whether the service finishes within its time stamp. If the service finishes
within the time stamp, the clerk reports the results to the client; otherwise it reports its
observation to the service class. The service class takes a decision based on the availability
of the time stamp: if time is available the service is refused, otherwise the service is
suspended [5]. Together, the three design patterns satisfy all the autonomic properties:
Reflective Monitoring is used for monitoring, Strategy for decision making and Thread-per-
Connection for execution, and finally the service class is reconfigured based on the service
result (either refuse or suspend). Figure 3 shows the sequence diagram for service
administration.
3.6 Participants
(a) Client
The Client class invokes a service in the service class and provides the input to the service
administration system. The client tries to invoke a service that exists in the service class; if
the service exists it is invoked, otherwise the client gets an error message. If the service
exists in the service class, the client receives the result from the service class after the
service has been executed [13].
(b) Service
The Service class consists of the services. It is observed by the Observer class, which
reports each invocation; based on the observer's report, the service class allocates a time
stamp and a clerk to the service. After the time stamp and clerk are assigned, the service is
stored in the service repository. Based on the service result, the service class is reconfigured.
3.7 Consequences
uniform interface by which they are configured, thereby encouraging reuse and
simplifying development of subsequent services.
Increased Configuration Dynamism: The pattern enables a service to be dynamically
reconfigured without modifying, recompiling, or statically relinking existing code. In
addition, reconfiguration of a service can often be performed without restarting the
service or other active services with which it is co-located.
Increased Opportunity for Tuning and Optimization: The pattern increases the
range of service configuration alternatives available to developers by decoupling service
functionality from the concurrency strategy used to execute the service. Developers can
adaptively tune daemon concurrency levels to match client demands and available OS
processing resources by choosing from a range of concurrency strategies. Some
alternatives include spawning a thread or process upon the arrival of a client request or
pre-spawning a thread or process at service creation time.
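For illustration only, the two alternatives can be contrasted with the standard Java concurrency utilities; this sketch is not taken from the paper and simply shows a thread spawned per request versus a pre-spawned fixed pool.

// Illustrative contrast of the two concurrency strategies mentioned above.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ConcurrencyStrategies {
    // Alternative 1: a new thread is created for every incoming request.
    static void threadPerRequest(Runnable request) {
        new Thread(request).start();
    }

    // Alternative 2: a pool of workers is pre-spawned when the service is created
    // and reused for all subsequent requests.
    static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    static void pooled(Runnable request) {
        POOL.submit(request);
    }
}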
3.9 Applicability
// (a) Client: Client.java
public class Client {
    public void invoke(int serviceID) { }
}

// (b) Service: Service.java
public class Service {
    public void init(int serviceID) { }
    public void observe(Object obj) { }
    public void suspend(int serviceID, int clerkID) { }
    public void refuse(int serviceID, int clerkID) { }
    public void fini(int serviceID, int clerkID) { }
}

// (c) Service repository: ServiceRepository.java
public class ServiceRepository {
    public void invokeService(int serviceID) { }
    public void run(int serviceID, int clerkID) { }
    public void store(int serviceID, int timeStamp) { }
}

// (d) Time stamp: Timestamp.java
public class Timestamp {
    public void init(int serviceID) { }
    public void fini(int serviceID) { }
}

// (e) Clerk: Clerk.java
public class Clerk {
    public void init(int serviceID) { }
    public void fini(int serviceID) { }
}

// (f) Observer: Observer.java
public class Observer {
    public void learn() { }
    public void update() { }
    public void report() { }
}

// (g) Service thread: ServiceThread.java (a thread-per-connection worker)
public class ServiceThread extends Thread {
    public void run() { }
}
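A hypothetical wiring of the classes above could look as follows; the call order and argument values are assumptions made only to illustrate the flow described in the Motivation, not code from the paper.

// Hypothetical usage of the participant classes; arguments are illustrative.
public class ServiceAdministrationDemo {
    public static void main(String[] args) {
        Client client = new Client();
        Service service = new Service();
        Observer observer = new Observer();
        ServiceRepository repository = new ServiceRepository();

        client.invoke(42);        // client requests service 42
        observer.report();        // observer reports the invocation to the service class
        service.init(42);         // service class assigns a time stamp and a clerk
        repository.store(42, 30); // service stored with an (assumed) 30-unit time stamp
        repository.run(42, 1);    // repository runs the service on its own thread, clerk 1
    }
}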
5 Profiling Results
To demonstrate the efficiency of the pattern, we took profiling values using the
NetBeans IDE and plotted a graph that shows the profiling statistics when the
pattern is applied and when it is not applied. This is shown in Figure 4, where the
X-axis represents the runs and the Y-axis represents the time intervals in milliseconds.
The simulation graphs show that the performance of the system is higher when the
pattern is applied than when it is not applied.
Fig. 4. Profiling statistics before applying pattern and after applying pattern
6 Conclusion
This paper describes the Service Administration design pattern and illustrates how it
decouples the implementation of services from their configuration. This decoupling
increases the flexibility and extensibility of services. In particular, service
implementations can be developed and evolved over time independently of many issues
related to service administration. In addition, the Service Administration design pattern
provides the ability to reconfigure a service without modifying, recompiling, or statically
linking existing code. Based on the proposed system, existing systems can upgrade their
resources and their services. The proposed system also satisfies the properties of an autonomic
system, namely monitoring, decision making, and reconfiguration. Our future aim is to
implement this work using aspect-oriented programming so that it satisfies all the
characteristics of an autonomic system.
References
1. Jain, P., Schmidt, D.C.: Service Configuration: A Pattern for Dynamic Configuration of
Services. In: 3rd USENIX Conference on Object-Oriented Technologies and Systems
Portland (1997)
2. Ramirez, A.J., Betty, H.C.: Design patterns for developing dynamically adaptive Systems.
In: 5th International Workshop on Software Engineering for Adaptive and Self-Managing
Systems, Cape Town, South Africa, pp. 29–67, 50, 68 (2010)
3. Mannava, V., Ramesh, T.: A Novel Event Based Autonomic Design Pattern for Management
of Webservices. In: Wyld, D.C., Wozniak, M., Chaki, N., Meghanathan, N., Nagamalai, D.
(eds.) ACITY 2011. CCIS, vol. 198, pp. 142–151. Springer, Heidelberg (2011)
4. Prasad Vasireddy, V.S., Mannava, V., Ramesh, T.: A Novel Autonomic Design Pattern for
Invocation of Services. In: Wyld, D.C., Wozniak, M., Chaki, N., Meghanathan, N.,
Nagamalai, D. (eds.) CNSA 2011. CCIS, vol. 196, pp. 545–551. Springer, Heidelberg (2011)
5. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable
Object-Oriented Software, Hawthorne, New York (1997)
6. Pree, W.: Design Patterns for Object-Oriented Software Development. Addison-Wesley,
MA (1994)
7. Schmidt, D.C., Suda, T.: An Object-Oriented Framework for Dynamically Configuring
Extensible Distributed Communication Systems. In: IEE/BCS Distributed Systems
Engineering Journal (Special Issue on Configurable Distributed Systems), pp. 280–293
(1994)
8. Crane, S., Magee, J., Pryce, N.: Design Patterns for Binding in Distributed Systems. In:
The OOPSLA 1995 Workshop on Design Patterns for Concurrent, Parallel, and
Distributed Object-Oriented Systems, Austin, TX. ACM (1995)
9. Chawla, A., Orso, A.: A generic instrumentation framework for collecting dynamic
information. SIGSOFT Softw. Eng. Notes (2004)
10. Cheng, S.-W., Garlan, D., Schmerl, B.: Architecture-based self-adaptation in the presence of
multiple objectives. In: International Workshop on Self-Adaptation and Self-Managing
Systems. ACM, New York (2006)
Chemical Process Fault Diagnosis
Based on Sensor Validation Approach
Jialin Liu
Abstract. Investigating the root causes of abnormal events is a crucial task for
an industrial chemical process. When process faults are detected, isolating the
faulty variables provides additional information for investigating the root causes
of the faults. Numerous data-driven approaches require the datasets of known
faults, which may not exist for some industrial processes, to isolate the faulty
variables. The contribution plot is a popular tool to isolate faulty variables
without a priori knowledge. However, it is well known that this approach
suffers from the smearing effect, which may mislead the faulty variables of the
detected faults. In the presented work, a contribution plot without the smearing
effect to non-faulty variables was derived. An industrial example, correctly
isolating faulty variables and diagnosing the root causes of the faults for the
compression process, was provided to demonstrate the effectiveness of the
proposed approach for industrial processes.
1 Introduction
In modern chemical processes, distributed control systems are equipped for regulating
the processes, and the operating data are collected and stored in a historical database.
However, information about process operations is hidden under the historical data.
Therefore, it is more practical to develop methods that detect and investigate the root
causes of process faults based on data-driven approaches, rather than to use other
methods based on rigorous process models or knowledge-based approaches. Since the
measured variables are correlated for a chemical process, principal component
analysis (PCA) is a popular tool to extract the features of the process data that are
applied to monitor the process variations. After a fault is detected, the faulty variables
need to be isolated in order to diagnose the root causes of the fault. Contribution plots
are the most popular tool for identifying which variables are pushing the statistics out
of their control limits. Kourti and MacGregor [1] applied the contribution plots of
quality variables and process variables to find faulty variables of a high-pressure low-
density polyethylene reactor. They remarked that the contribution plots may not
reveal the assignable causes of abnormal events; however, the group of variables
event datasets and the time-consuming task of continuously optimizing the mixed-
integer programming problem for every sampling data until reaching a stable solution
is also not required.
The remainder of this paper is organized as follows. Section 2 gives an overview of
PCA and the contribution plots of statistics Q and T2. The proposed approach of the
contribution plots without smearing effect to non-faulty variables is detailed in section
3. In section 4, an industrial application is provided to demonstrate the effectiveness
of the proposed approach for industrial processes. Finally, conclusions are given.
2 Basic Theory
2.1 Principal Component Analysis
where Λ is a diagonal matrix with the first K significant eigenvalues and P contains the
respective eigenvectors; Λ̃ and P̃ are the residual eigenvalues and eigenvectors, respectively.
The data matrix X can be decomposed as X = XPP^T + XP̃P̃^T = X̂ + E, with X̂ being the
projection of the data matrix X onto the subspace formed by the first K eigenvectors, named
the principal component (PC) subspace, and E being the remainder of X that is orthogonal to
the subspace. The statistic Q is defined as a measure of the variations of the residual parts of
the data: Q = (x − x̂)(x − x̂)^T = x P̃ P̃^T x^T = x C x^T, where C ≡ P̃ P̃^T. In addition, another
measure for the variations of the systematic part in the PC subspace is the statistic T²:
T² = x P Λ^{-1} P^T x^T = x D x^T = t Λ^{-1} t^T, where D ≡ P Λ^{-1} P^T and t contains the first K
scores. This is the Mahalanobis distance from the origin of the subspace to the projection of
the data. The confidence limits of Q and T² can be found in reference [11]. When a fault is
detected by any one of the above-mentioned statistics, the contribution plots provide a
preliminary tool to isolate faulty variables without any prior knowledge of the fault. The
contribution of Q for the ith variable can be written as c_i^Q = (x C ξ_i)², where ξ_i is a column
vector in which the ith element is one and the others are zero. The confidence limit for each
contribution of Q has been derived in reference [2]. Qin et al. [12] derived the variable
contribution to T² for the ith variable as (x D^{0.5} ξ_i)², and also provided the confidence limits
of the contributions.
However, since the contributions of the statistics are transformed from the process
variables through a matrix multiplication, the faulty variables may smear out over the
other variables, which will mislead a diagnosis of the correct root causes of the faults.
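For readers who want to reproduce the two statistics, the following self-contained sketch computes Q and T² for a single mean-centred, scaled sample, given the first K loading vectors and their eigenvalues; variable names are ours and no particular library is assumed.

// Illustrative computation of the Q and T^2 statistics for one sample.
class PcaMonitoring {
    static double[] statistics(double[] x, double[][] p, double[] lambda) {
        int n = x.length, k = lambda.length;
        double[] t = new double[k];                 // scores t = x P
        for (int a = 0; a < k; a++)
            for (int i = 0; i < n; i++)
                t[a] += x[i] * p[i][a];
        double[] xhat = new double[n];              // projection onto the PC subspace
        for (int i = 0; i < n; i++)
            for (int a = 0; a < k; a++)
                xhat[i] += t[a] * p[i][a];
        double q = 0.0, t2 = 0.0;
        for (int i = 0; i < n; i++) {               // Q = squared residual norm
            double e = x[i] - xhat[i];
            q += e * e;
        }
        for (int a = 0; a < k; a++)                 // T^2 = t Lambda^{-1} t^T
            t2 += t[a] * t[a] / lambda[a];
        return new double[] { q, t2 };
    }
}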
Minimizing Q with respect to the kth measured variable (holding the others fixed) gives:

(∂Q/∂x_k)|_{x_i, i ≠ k} = 2 Σ_{i=1}^{n} x_i c_{k,i} = 0,  k = 1...n    (2)

where c_{i,j} is an element of C. Rearranging the above equation, the reconstructed input can
be obtained as x*_k = −(1/c_{k,k}) Σ_{i≠k} x_i c_{k,i}, where x*_k is the kth reconstructed input.
Substituting the reconstructed input into the definition of Q, the statistic Q with the kth
reconstructed variable can be written as:

Q*_k = Σ_{i≠k} Σ_{j≠k} x_i x_j c_{i,j} + 2 x*_k Σ_{j≠k} x_j c_{j,k} + x*_k x*_k c_{k,k}.

The faulty variable is located by selecting the variable with the minimum sensor validity
index (SVI), defined as η_k = Q*_k / Q for the kth variable [13].
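The single-sensor reconstruction and the SVI can be written down directly from the expressions above; the following sketch assumes the residual projection matrix C and the sample x are available, and uses our own naming.

// Illustrative single-sensor reconstruction and sensor validity index (SVI).
class SensorValidation {
    // Q statistic: Q = x C x^T.
    static double q(double[] x, double[][] c) {
        double q = 0.0;
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x.length; j++)
                q += x[i] * c[i][j] * x[j];
        return q;
    }

    // Reconstructed input x*_k = -(1/c_kk) * sum_{i != k} x_i c_{k,i}.
    static double reconstruct(double[] x, double[][] c, int k) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++)
            if (i != k) s += x[i] * c[k][i];
        return -s / c[k][k];
    }

    // SVI eta_k = Q*_k / Q, where Q*_k uses the reconstructed kth input.
    static double svi(double[] x, double[][] c, int k) {
        double[] xr = x.clone();
        xr[k] = reconstruct(x, c, k);
        return q(xr, c) / q(x, c);
    }
}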
Yue and Qin [6] suggested that T² may exceed its control limit when Q is minimized to
reconstruct the faulty sensor. Therefore, they proposed a combined index:

φ ≡ Q/Q_α + T²/T²_α = x Φ x^T,   Φ ≡ C/Q_α + D/T²_α,

where Q_α and T²_α are the (1−α) confidence limits of the statistics Q and T², respectively.
The combined index is minimized instead of the Q statistic, and the reconstructed variable is
obtained from (∂φ/∂x_k)|_{x_i, i ≠ k} = 0, k = 1...n. The combined index with the kth
reconstructed variable can be written as:

φ*_k = Σ_{i≠k} Σ_{j≠k} x_i x_j φ_{i,j} + 2 x*_k Σ_{j≠k} x_j φ_{j,k} + x*_k x*_k φ_{k,k}.
For a multiple sensor fault, the reconstructed variables satisfy ξ^T Φ x^T = 0, where
ξ ≡ [ξ_1 ξ_2 … ξ_nf], in which nf is the number of faulty variables and ξ_i is a column vector
in which the ith element is one and the others are zero. The monitored variables can be
decomposed as x = xη + x(I − η), where η is a diagonal matrix in which the diagonal
elements are one for the faulty variables and zero for the non-faulty ones. Therefore,
minimizing the combined index can be rewritten in the following form:
ξ^T Φ η x^T = − ξ^T Φ (I − η) x^T. The left-hand side of the equation contains the data to be
reconstructed from the normal data in the right-hand term. The reconstruction of the faulty
variables, collected in x*_nf, can be obtained by the following equation:

x*_nf^T = −(ξ^T Φ ξ)^{−1} ξ^T Φ (I − η) x^T    (3)
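Eq. 3 amounts to solving a small nf × nf linear system for the reconstructed faulty values; the sketch below illustrates this with a plain Gaussian solver, under the assumption that Φ and the set of faulty indices are given (all names are illustrative).

// Illustrative multiple-sensor reconstruction following Eq. 3.
class FaultReconstruction {
    static double[] reconstruct(double[] x, double[][] phi, int[] faulty) {
        int nf = faulty.length, n = x.length;
        boolean[] isFaulty = new boolean[n];
        for (int f : faulty) isFaulty[f] = true;

        double[][] a = new double[nf][nf];          // xi^T Phi xi (submatrix of Phi)
        double[] b = new double[nf];                // -xi^T Phi (I - eta) x^T
        for (int r = 0; r < nf; r++) {
            for (int c = 0; c < nf; c++) a[r][c] = phi[faulty[r]][faulty[c]];
            for (int j = 0; j < n; j++)
                if (!isFaulty[j]) b[r] -= phi[faulty[r]][j] * x[j];
        }
        return solve(a, b);                         // reconstructed faulty values
    }

    // Plain Gaussian elimination with partial pivoting for the small system.
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            double[] rowTmp = a[col]; a[col] = a[piv]; a[piv] = rowTmp;
            double bTmp = b[col]; b[col] = b[piv]; b[piv] = bTmp;
            for (int r = col + 1; r < n; r++) {
                double factor = a[r][col] / a[col][col];
                for (int c = col; c < n; c++) a[r][c] -= factor * a[col][c];
                b[r] -= factor * b[col];
            }
        }
        double[] z = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r][c] * z[c];
            z[r] = s / a[r][r];
        }
        return z;
    }
}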
index (RCI) after reconstructing the faulty data can be written as follows:
Therefore, the fault isolation task is to find the subset x_nf of x that maximizes the RCI
until the statistics Q and T2, computed without the information from the faulty variables,
are under the corresponding control limits. The contribution of the RCI for the ith faulty
variable can be defined as:
The proposed approach first evaluates the RCI for each single reconstructed variable and,
in the first step, inserts the variable with the maximum RCI into x_nf. Next, the RCIs are
evaluated using the reconstructed data of one non-faulty variable together with the faulty
variables already selected in x_nf, and the non-faulty variable with the maximum RCI is
inserted into x_nf. The step of adding a new faulty variable to x_nf is repeated until both
statistics are under the corresponding control limits. The algorithm is summarized as follows:
1. Set nf = 0 and x_nf = ∅.
2. For i = 1…n−nf, reconstruct the data of x_nf ∪ {x_i}, x_i ∉ x_nf, using Eq. 3 and evaluate
   the RCI using Eq. 4.
3. Add the variable with the maximum RCI to x_nf and set nf = nf + 1.
4. If the statistics Q and T2, computed without the information of the selected faulty
   variables, are still over their control limits, go back to step 2.
5. Sort the selected faulty variables in decreasing order of their RCI contributions (Eq. 5)
   and retain in x_nf only the variables needed to bring the statistics Q and T2 under the
   corresponding control limits.
Steps 2–4 of the algorithm guarantee that Eq. 4 is a monotonically increasing function
of the number of selected faulty variables; therefore, the statistics monotonically
decrease during iterations. The non-faulty variables, which may be selected in the
early stage of the iterations under insufficient information about the faulty variables,
are removed from xnf in step 5. When diagnosing the root causes of process faults, the
selected faulty variables do not equally contribute to the faults. The contribution plots
for the reduction of statistics can be used to find the faulty variables with the most
contributions, as the contributions have been confined within the selected faulty
variables and the fault magnitude will not smear over to the non-faulty variables.
4 Industrial Application
The statistics after removing the faulty variables (FVs) are also shown in Fig. 2. It
demonstrates that the proposed approach keeps the statistics Q and T2 under
their control limits after removing the FVs. The normalized contribution plot of the RCI
is shown in Fig. 3(a); it indicates that the major faulty variable after the sixth day was
x15, which is the outlet temperature of the second compression stage. The fault
isolation result using the contribution plot of Q is displayed in Fig. 3(b), in which
each contribution was normalized with the corresponding 99% confidence limit.
Although Fig. 3(b) indicates that x15 was one of the most significant faulty
variables, the fault magnitude was smeared over to the non-faulty
variables. Comparing the results of Fig. 3(a) and 3(b), the smearing effect of the
traditional contribution plots is effectively eliminated using the proposed approach.
Figure 3(c) shows the RBC of Q normalized with the corresponding 99%
confidence limits, which is exactly the same as Fig. 3(b). Since the RBC of Q and the
traditional contribution plot of Q differ only by a scaling coefficient, which
appears in the leading part of the control limits of the RBC, the normalized RBC is
identical to the normalized contribution plot.
[Fig. 2. The Q and T2 statistics over the ten-day period of operating data.]
For diagnosing the root causes of the fault after the sixth day, the measured and
reconstructed data of x15 are shown in Fig. 4(a), in which the measurements were lower
than the reconstructed data during the period in which the fault was detected. Comparing these
lower temperature data with the normal operating data, it can be found that the variations
of the faulty data were still within the range of normal operation. Therefore, the statistic Q
detected the fault due to changes in the variable correlations. For each stage of a centrifugal
compressor, the compression efficiency, which can be evaluated from the inlet and outlet
temperatures and pressures, is an important index to evaluate the operating performance
of the compression stage. Since the compression efficiency is inversely proportional to
the discharge temperature of the stage, it can be expected that the faulty data of x15, which
is the outlet temperature of the second stage, would mislead the compression efficiency
of the second stage to appear too high. Figure 4(b) compares the compression efficiencies of
all stages. The figure shows that the second stage's efficiency fluctuated strongly after
the fault had been detected; therefore, it can be concluded that the root cause of the
detected fault was the sensor unreliability of the second stage's outlet temperature.
Fig. 3. Comparing the fault isolation results, (a) the proposed approach, (b) contribution plots of Q
normalized with the corresponding 99% confidence limits, (c) the RBC of Q normalized with the
corresponding 99% confidence limits
Fig. 4. Diagnosing root causes of the fault after the sixth day, (a) comparison of the measured
and reconstructed data of x15, (b) comparison of compression efficiencies for all stages
5 Conclusions
The presented work developed a contribution plot without the smearing effect on non-
faulty variables. The proposed approach was shown to have the capability of isolating
multiple sensor faults without predefined faulty datasets. Since the resolution of
predefined faulty datasets deteriorates due to the time-varying nature of
industrial processes, it is not practical to isolate faulty variables based on the historical
event lists of an industrial process. In the industrial application, the fault isolation results
using the contribution plots of the RCI were more precise than the solutions found using the
traditional contribution plots. In addition, it was demonstrated that the normalized RBC
of Q is equivalent to the traditional contribution of Q normalized with the corresponding
control limits; therefore, the RBC approach still suffers from the smearing effect when
encountering a new fault. The results show that predefined faulty datasets are not
necessary for the proposed approach; in addition, the smearing effect of the traditional
contribution plots is also eliminated.
References
1. Kourti, T., MacGregor, J.F.: Multivariate SPC Methods for Process and Product Monitoring. J.
Qual. Technol. 28, 409–428 (1996)
2. Westerhuis, J.A., Gurden, S.P., Smilde, A.K.: Generalized Contribution Plots in
Multivariate Statistical Process Monitoring. Chemom. Intell. Lab. Syst. 51, 95–114 (2000)
3. Yoon, S., MacGregor, J.F.: Statistical and Causal Model-Based Approaches to Fault
Detection and Isolation. AIChE J. 46, 1813–1824 (2000)
4. Raich, A., Çinar, A.: Statistical Process Monitoring and Disturbance Diagnosis in
Multivariable Continuous Processes. AIChE J. 42, 995–1009 (1996)
5. Dunia, R., Qin, S.J.: Subspace Approach to Multidimensional Fault Identification and
Reconstruction. AIChE J. 44, 1813–1831 (1998)
6. Yue, H.H., Qin, S.J.: Reconstruction-Based Fault Identification Using a Combined Index.
Ind. Eng. Chem. Res. 40, 4403–4414 (2001)
7. Alcala, C.F., Qin, S.J.: Reconstruction-based Contribution for Process Monitoring.
Automatica 45, 1593–1600 (2009)
8. He, Q.P., Qin, S.J., Wang, J.: A New Fault Diagnosis Method Using Fault Directions in
Fisher Discriminant Analysis. AIChE J. 51, 555–571 (2005)
9. Liu, J., Chen, D.S.: Fault Detection and Identification Using Modified Bayesian
Classification on PCA Subspace. Ind. Eng. Chem. Res. 48, 3059–3077 (2009)
10. Kariwala, V., Odiowei, P.E., Cao, Y., Chen, T.: A Branch and Bound Method for
Isolation of Faulty Variables through Missing Variable Analysis. J. Proc. Cont. 20, 1198–
1206 (2010)
11. Jackson, J.E.: A User’s Guide to Principal Components. Wiley, New York (1991)
12. Qin, J.S., Valle, S., Piovoso, M.J.: On Unifying Multiblock Analysis with Application to
Decentralized Process Monitoring. J. Chemom. 15, 715–742 (2001)
13. Qin, S.J., Yue, H., Dunia, R.: Self-Validating Inferential Sensors with Application to Air
Emission Monitoring. Ind. Eng. Chem. Res. 36, 1675–1685 (1997)
14. Wang, X., Kruger, U., Irwin, G.W.: Process Monitoring Approach Using Fast Moving
Window PCA. Ind. Eng. Chem. Res. 44, 5691–5702 (2005)
System Analysis Techniques in eHealth Systems:
A Case Study
1 Introduction
Putting together wireless sensor network technologies and system analysis
techniques in a distributed computing environment yields a useful and powerful
tool that can be put into practice in various areas such as industry, healthcare,
sport, emergency management and entertainment. Such systems operating in a
distributed environment can be composed of a huge number of sensing nodes (e.g.
with Bluetooth or ZigBee interfaces) with wireless transceivers, computational units,
i.e. servers, and remote nodes to present the results of computation. Moreover, these
systems must be capable of transferring and processing the huge volume of data generated
by any number of user access nodes. It must be stressed that in many cases
(depending on the application area) the collected data must be processed in
real time. Additionally, in order to improve usability from the user's point of
view, it is necessary to assure access to the system's functionalities anytime and any-
where. This means that, among other things, a seamless handover
mechanism must be implemented in the provided systems. This mechanism keeps Internet
connectivity continuous even when a sudden change of network occurs, and it is implemented in
the Mobile IP protocol. There are two versions of Mobile IP, i.e. for IPv4 and for IPv6.
Mobile IP for IPv6 is based on its version for IPv4, but many improvements have
been made. Therefore, Mobile IPv6 is strongly recommended for modern systems
based on wireless technologies [8].
events such as alarms or reminders, e.g. about taking medicine. The described web-based systems are offered, among others, by BodyTel, Entra Health Systems and LifeScan [12],[13],[14]. They also provide glucometers with a Bluetooth interface or, like LifeScan, special adapters which allow adding a Bluetooth interface to old glucometers without one. This is an interesting and well-suited solution, especially for elderly people.
On the other hand, we can point to prototypes in which many innovative solutions are proposed. In [6] the authors proposed a system which adds some new functionalities to contemporary commercial systems. One of them is connected with supporting diet management, which is very important in diabetes therapy. The other functionality is the ability to predict the glucose level after an insulin injection. To this end the proposed system takes into account the blood glucose level just before the insulin injection as well as historical data. In [3], [4] and [5] models of glucose-insulin dynamics are proposed; this is another step forward in enhancing knowledge on diabetes.
Another example of an eHealth system for diabetes treatment is presented in [10]. This system includes controlling an insulin pump. Utilisation of the insulin pump is very interesting and can lead to very innovative and useful solutions, but so far there are many technical problems related to its employment in medical applications.
[Figure: system architecture – sensors (#1, #2), a supervisor, a gateway, the Internet, a remote user, a server and a database]
Sensed data can be pre-processed on a personal server, such as a smart phone or cellular phone, and then transferred to the server through the Internet. It is worth stressing that smart phones and/or cellular phones play two different roles in the proposed system. The first is related to pre-processing of the data collected from the BAN and PAN. The second is to provide the connection between the user's wireless sensor networks and the remote server through the Internet. The personal server that runs on a cellular phone or smart phone can be applied only to tasks which are simple and do not demand complex calculations. This unit can also be used to manage the sensor units that make up the BAN or PAN.
The central element of the system is the second tier with computational units such as servers. On the server side runs an application that provides a set of functionalities. The architecture of the application allows it to be configured and problem-oriented scenarios to be built. This means that it is possible to choose data processing, decision support and presentation components. These components are used to design scenarios well suited to a certain healthcare or wellness problem. Because of the different nature of the sensed signals and problems, different data processing methods may be used. For example, in a simple application only signal filtering algorithms are applied. Sophisticated problems may require an applied mathematical model; thus it might be necessary to use estimation algorithms, which are usually time consuming. More complex problems to be solved in many healthcare or wellness applications are connected with pattern recognition and decision support. In such tasks feature extraction and selection must be solved first; then the classification problem can be
[Figure: application architecture – presentation tier (charts, reports), signal filtering and estimation algorithms, database]
for supporting decision making. This means that we have a tool which helps to improve the quality of decision support by taking advantage of the user's state, the recognition of daily activities, his/her behaviour and the environment.
The second advantage considered in this work, which comes out of the proposed architecture for processing data from a large number of sensing units placed on the user's body and in his/her environment, is personalisation. In the previous section the problem of estimating the parameters of a model was discussed. A mathematical model can help us to extract knowledge about the considered process or object from the acquired data. It can then be used to predict the future behaviour, activities or actions of a specific user, and it helps to improve the management of network and computational resources.
where

φ(x1) = a4 x1 / (1 + exp(a5 − x1)),   (3)

where x1 is the heart rate change from rest (the resting heart rate), u denotes the speed of the exerciser, and x2 may be considered as fatigue, caused by such factors as vasodilation in the active muscles leading to low arterial blood pressure, accumulation of metabolic byproducts (e.g. lactic acid), sweating and hyperventilation. Parameters a1, …, a5 take nonnegative values.
Values of these parameters are obtained by an estimation procedure, using data measured from a few experiments. A typical training protocol involves step-like functions u(t) that determine the length of the resting (zero speed), exercise (high speed) and recovery (walking or resting) periods (see Fig. 3). For different training protocols the heart rate profiles are registered and an identification algorithm is applied to obtain the values of parameters a1, …, a5 (see Fig. 4). Note the presence of the unmeasurable variable x2(t) within the subsystem modeling fatigue (the block at the bottom of Fig. 4). In order to handle identification of such a closed-loop nonlinear system, a numerical optimization algorithm must be employed (in the work [2], where a treadmill is used to control speed, the authors employ the Levenberg-Marquardt procedure). It is reasonable to make use of stochastic search methods, such as simulated annealing.
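To make the identification step concrete, the following sketch fits a1, …, a5 by minimising the squared error between measured and simulated heart rate with SciPy's dual_annealing. Since the state equations (1)–(2) are not reproduced above, the right-hand sides used in simulate_hr() are illustrative assumptions; only φ(x1) follows equation (3).

```python
# A minimal sketch, assuming illustrative state equations for x1 (heart-rate
# change) and x2 (fatigue); only phi follows equation (3). dual_annealing is
# used as the stochastic search suggested in the text.
import numpy as np
from scipy.optimize import dual_annealing

def simulate_hr(a, u, dt=1.0):
    """Euler simulation of an assumed two-state model driven by speed u(t)."""
    a1, a2, a3, a4, a5 = a
    x1 = x2 = 0.0
    hr = []
    for uk in u:
        phi = a4 * x1 / (1.0 + np.exp(a5 - x1))        # equation (3)
        dx1 = -a1 * x1 + a2 * x2 + uk ** 2             # assumed form, not eq. (1)
        dx2 = -a3 * x2 + phi                           # assumed form, not eq. (2)
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        hr.append(x1)
    return np.array(hr)

def identify_parameters(u, hr_measured, bounds=((0.0, 5.0),) * 5):
    """Find a1..a5 minimising the squared output error (stochastic search)."""
    cost = lambda a: float(np.sum((simulate_hr(a, u) - hr_measured) ** 2))
    return dual_annealing(cost, bounds, seed=0).x
```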
Denoting x = [x1 x2]^T and a = [a1 … a5]^T, we may write the model equations (1)–(3) in a compact form:

x(t) = Φ(x(t), a; u(t)).   (4)
with respect to T, for given D and u(t). The fatigue constraint has the form:

max_{0≤t≤T} x2(t) ≤ x2^max,   (7)
where x2 is related to u(t) by the model equation (4). If we want to force the exerciser to do his best, we may require that the highest fatigue occurs at the end of the race:

x2(T) = x2^max.   (8)
Moreover, the exerciser cannot run faster than u^max, due to his fitness level. The rate of change of the heart rate is also limited:

max_{0≤t≤T} (d/dt) x1(t) ≤ Δx1^max,   (11)
and for a diabetic:

max_{0≤t≤T} G(t) ≤ G^min,   (12)

where G(t) is the predicted track of the blood glucose level, worked out from the model described in [5].
To sum up, for the basic setting (involving constraints (7)–(9)), the optimal training protocol u*(t) is the solution of the following optimization task, where U is the space of all possible functions u(t), and the shortest time for the training protocol u*(t) is:

T* = Q(u*(t), D).   (14)
The inequality constraints ψ(u(t), Φ) ≤ 0 in vector notation are:

max_{0≤t≤T} [x2(t), u(t)]^T ≤ [x2^max, u^max]^T,   (15)

where x2(t) is derived from the model Φ (equation (4)), and there is one equality constraint g(u(t), Φ) = 0:

x2(T) = x2^max.   (16)
As stated before, the space U of all possible functions u(t) is limited to compositions of step-like functions, as shown in Fig. 4. The process of training protocol design may be simplified by parametrizing the function u(t). The parameters should define the lengths of the periods (resting, exercise and recovery) and the associated speeds (the recovery period has zero speed by default). The order of the periods determines their start and stop time instants. It is also possible to assign a predefined speed to each period, which stems from the fact that people walk and run at their characteristic speeds. Parametrization makes optimization easier, but introduces a problem concerning the total number of parameters describing the
solution u(t). Since the footrace is completed after the distance D is covered, the number of parameters may vary, depending on the parameter values. Thus, only optimization methods that can search spaces with a varying number of dimensions, such as simulated annealing or evolutionary algorithms, may be applied.
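A minimal sketch of such a variable-dimension search is given below, assuming a hypothetical simulate(segments) routine that returns the predicted fatigue trajectory x2(t) for a list of (duration, speed) segments; the penalty weights and neighbourhood moves are illustrative choices, not the authors' implementation.

```python
# Simulated-annealing search over step-wise protocols; the number of segments
# (and hence the number of parameters) may change between moves.
import math
import random

def cost(segments, D, simulate, x2_max, u_max, w=1e3):
    """Completion time plus penalties for constraints (7)-(8), speed and distance."""
    T = sum(dur for dur, _ in segments)
    dist = sum(dur * speed for dur, speed in segments)
    x2 = simulate(segments)                                       # fatigue trajectory
    return (T
            + w * max(0.0, D - dist)                              # race distance covered
            + w * max(0.0, max(x2) - x2_max)                      # fatigue bound (7)
            + w * abs(x2[-1] - x2_max)                            # end-of-race fatigue (8)
            + w * max(0.0, max(s for _, s in segments) - u_max))  # speed bound

def neighbour(segments, u_max):
    """Perturb, add or remove a (duration, speed) segment: dimension may change."""
    segs = [list(s) for s in segments]
    move = random.choice(("perturb", "add", "remove"))
    if move == "perturb":
        i = random.randrange(len(segs))
        segs[i][0] = max(1.0, segs[i][0] + random.uniform(-30.0, 30.0))
        segs[i][1] = min(u_max, max(0.0, segs[i][1] + random.uniform(-0.5, 0.5)))
    elif move == "add":
        segs.append([random.uniform(30.0, 120.0), random.uniform(0.0, u_max)])
    elif len(segs) > 1:                                           # remove a segment
        segs.pop(random.randrange(len(segs)))
    return [tuple(s) for s in segs]

def anneal(init, D, simulate, x2_max, u_max, iters=5000, temp0=10.0):
    cur, cur_cost = init, cost(init, D, simulate, x2_max, u_max)
    for k in range(iters):
        cand = neighbour(cur, u_max)
        cand_cost = cost(cand, D, simulate, x2_max, u_max)
        t = max(temp0 * (1.0 - k / iters), 1e-9)
        if cand_cost < cur_cost or random.random() < math.exp((cur_cost - cand_cost) / t):
            cur, cur_cost = cand, cand_cost
    return cur                                                    # step-wise protocol u*(t)
```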
5 Summary
In this work an eHealth system to support planning of training protocols for an exerciser is presented. In order to solve the introduced problem, a system for a distributed computing environment based on a PAN, a BAN and a remote server is proposed. The benefits of the proposed architecture, which are discussed in detail, are context-awareness and personalisation. As was stressed, they affect the usability of the system and improve the management of network and computational resources.
The introduced eHealth system is employed to acquire and deliver data and signals for, e.g., identification, optimisation and control/management. In this work the problem of supporting the management of dynamic exercise intensity with the proposed system is formulated. To this end, the relationship between heart rate and exercise intensity is applied. Finally, a possible solution using a PID controller is discussed.
References
1. Cheng, T.M., Savkin, A.V., Celler, B.G., Su, S.W., Wang, L.: Nonlinear Modeling
and Control of Human Heart Rate Response During Exercise With Various Work
Load Intensities. IEEE Trans. On Biomedical Engineering 55(11), 2499–2508 (2008)
2. Cheng, T.M., Savkin, A.V., Celler, B.G., Su, S.W., Wang, L.: Heart Rate Regula-
tion During Exercise with Various Loads: Identification and Nonlinear Hinf Control.
In: Proc. of the 17th World Congress of The International Federation of Automatic
Control, pp. 11618–11623. IFAC, Seoul (2008)
3. Dalla, M.C., et al.: GIM, Simulation Software of Meal Glucose-Insulin Model. Jour-
nal of Diabetes Science and Technology 1, 323–330 (2007)
4. Dalla, M.C., et al.: Meal Simulation Model of the Glucose-Insulin System. IEEE
Transactions On Biomedical Engineering 54, 1740–1749 (2007)
5. Dalla, M.C., et al.: Physical Activity into the Meal Glucose-Insulin Model of Type
1 Diabetes: In Silico Studies. Journal of Diabetes Science and Technology 3, 56–67
(2009)
6. Grandinetti, L., Pisacane, O.: Web based prediction for diabetes treatment. Future
Generation Computer Systems 27, 139–147 (2011)
7. Greene, B.R., McGrath, D., O’Neill, R., O’Donovan, K.J., Burns, A., Caulfield,
B.: An adaptive gyroscope-based algorithm for temporal gait analysis. Journal of
Medical And Biological Engineering And Computing 48, 1251–1260 (2010)
8. Johnson, D., Perkins, C., Arkko, J.: RFC: 3775. Mobility Support in IPv6. Tech-
nical report, Network Working Group (2004)
9. Lornicz, K., Chen, B., Challen, G.W.: Mercury: A Wearable Sensor Network
Platform for High-Fidelity Motion Analysis (2009)
Detection of Facial Features on Color Face Images
Abstract. This paper proposes a solution algorithm to locate the facial features in human face images. First, the proposed algorithm determines the face region based on skin-tone segmentation and morphological operations. Then, we locate the facial features (i.e. brows, eyes, and mouth) by their color information. Finally, the algorithm sets the shape control points, based on the Facial Animation Parameters in the MPEG-4 standard, on the located facial features. Results of experiments on face images show that the proposed approach is not only robust but also quite efficient.
1 Introduction
to detect facial features based on their photometric appearance [7-9]. Recently, more
efficient methods have been introduced, e.g., the voting method by Lowe [10] and the
statistical method by Weber [11]. Lowe has his own method for selecting and
representing keypoints [10], but the applications of Weber’s approach in [12] utilize
unsupervised descriptors by Kadir [13].
This study utilizes color human face images for locating the facial features. Unlike methods that use grey-level face images, color images contain more information about the facial features, which improves feature localization. Color images are also good for various applications with improved results [13-17]. Therefore, to remove the effect of the background or other objects in the non-face area, the first step is to divide the image into skin-color and non-skin-color areas so that the interesting areas of the face image can be cut from the image. This step effectively reduces the computational burden and the number of pixels requiring further processing. Also, the search range for locating features can be limited to the known skin-color regions, improving the solution accuracy.
This paper presents a locating algorithm to determine the positions of the facial features in human face images. First, the proposed algorithm determines the face region based on skin-tone segmentation and morphological operations. Then, we locate the positions of the brows, eyes, and mouth by their color and geometry information. Finally, the algorithm places the control points defined by the Facial Animation Parameters of the MPEG-4 standard on the located facial features.
The remainder of the paper is organized as follows. Section II briefly reviews the method of feature-based object detection and localization. Section III presents the proposed solution algorithm for locating the facial features. Section IV is devoted to the experiments on the face database. The last section gives the conclusion.
2 Proposed Method
This study utilizes the CbCr color space to build the criteria for segmenting skin colors. YCbCr is a family of color spaces used as part of the color image pipeline in video and digital photography systems. Y is the luminance component, and Cb and Cr are the blue-difference and red-difference chroma components. YCbCr is commonly used in television and film production, video, image compression, and image signal processing between devices. Like the related YPbPr, it is a non-absolute color space: the three components alone cannot accurately define a color, so the RGB color components must be converted to the corresponding YCbCr values. According to the literature [18] and [16], there is no direct relationship between the skin-color distribution and the brightness in the YCbCr color space. If the brightness were used as an index for color segmentation, a non-uniform brightness distribution could affect the results. Therefore, the luminance component can be removed and excluded from the calculation. Skin colors of different ethnicities also fall within similar regions of the CbCr space [17], so the detection is considerably robust.
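As an illustration, the following sketch segments skin pixels by thresholding only the Cb and Cr components. The numeric ranges are common rule-of-thumb values, since the paper does not state its thresholds here, and OpenCV is assumed as the image library.

```python
# Skin segmentation in the CbCr plane; the luminance Y is ignored, as described
# in the text. The Cb/Cr ranges below are illustrative, not the authors' values.
import cv2
import numpy as np

def skin_mask(bgr_image):
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)   # OpenCV order: Y, Cr, Cb
    _, cr, cb = cv2.split(ycrcb)
    mask = ((cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173))
    return mask.astype(np.uint8) * 255                     # skin pixels are white

# Usage: mask = skin_mask(cv2.imread("face.jpg"))
```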
After the color image segmentation, small discontinuous and scattered blocks of varying size remain on the images, caused by the background, other objects or lighting effects. If these noises are not dealt with, they may cause errors in processing the images and will also increase the computational complexity. Therefore, this step uses morphological image processing operations, including dilation, erosion, closing, area fill, regional clean, bridge, spur and majority operations [15], to remove irrelevant, redundant blocks and gaps in connections, and to fill slender gaps in outlines and missing fine pixels. Morphology is built from the basic operations of dilation and erosion, which are used interchangeably in the follow-up processes. The implementation of these operations needs a rectangular structuring element for processing the images. The operations are divided into binary and gray-scale image computations. As the operations have different definitions for the two kinds of images, the following definitions are dedicated to binary images, where Z is the two-dimensional integer space, A is the object image being processed, and B is the structuring element.
The dilation operation is defined as follows: A and B are two sets in the space Z. The reflection of B is shifted so that its center visits every point z along the boundary of A; whenever the reflection of B and A overlap in at least one element, the point z belongs to the result, so the displacement of B expands the boundary of A. The shape and size of the structuring element B can be defined as a rectangle, a disk and so on. Usually it is set as a 3 × 3 array, the smallest unit size, with the value 1 filled in the directions of operation and 0 elsewhere. The dilation of A by B is expressed as in equations (1) and (2). Assuming the structuring element B has length and width of 2d units and A has width and length of 2x units, after the dilation is executed at every point along the circumference of A, A's side becomes 2(x + d) units, as illustrated in Fig. 3-1.
A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ }   (1)

A ⊕ B = { z | [ (B̂)z ∩ A ] ⊆ A }   (2)
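A minimal sketch of the morphological clean-up with a 3 × 3 structuring element follows; the particular sequence of operations is an assumption, since the paper lists several operations (dilation, erosion, closing, area fill, clean, bridge, spur, majority) without giving their order.

```python
# Morphological clean-up of the binary skin mask with a 3x3 structuring element,
# as in equations (1)-(2). The exact sequence of operations is an assumption.
import cv2
import numpy as np

def clean_mask(binary_mask):
    kernel = np.ones((3, 3), np.uint8)                                # structuring element B
    closed = cv2.morphologyEx(binary_mask, cv2.MORPH_CLOSE, kernel)   # fill small gaps
    opened = cv2.morphologyEx(closed, cv2.MORPH_OPEN, kernel)         # drop small specks
    dilated = cv2.dilate(opened, kernel, iterations=1)                # dilation, eq. (1)-(2)
    return cv2.erode(dilated, kernel, iterations=1)                   # erosion
```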
The experiments in this project use images with neutral expressions rather than expression images, because the positioning accuracy of the located features would be affected by wrinkles on the faces, and samples of expression images are not readily available. After segmenting the face region image, the easily recognizable facial features, namely the eyes, eyebrows and mouth, are set as the target features. These features are cut out and marked at the four boundary points of the right, left, upper and lower extremes of the target. The choices of these boundary points follow the MPEG-4 FAP feature points [19] and Song et al. [20], locating the action points that mainly affect the expression features.
According to the literature [15, 21, 22], in gray-scale images of human faces the gray value of the pupil and iris is usually lower than that of the surrounding skin, i.e. these regions appear darker in the gray-scale image. Using this feature, the RGB color image (of the previously cut region) is converted to a gray-scale image. By setting a threshold value, the gray-scale image is mapped into a binary image to separate the eye area from the image block. This study uses a single threshold at a time to binarize the image. We set an initial threshold value T0, a maximum Tmax, and a fixed increment Tstep. For thresholds between T0 and Tmax the image is binarized: each pixel whose value is larger than the threshold is set to white, and the remaining pixels are set to black. This step produces (Tmax - T0) / Tstep binary images. Rules are then established to analyze the black area (i.e. below the threshold) of each image. With these rules, the process eliminates unnecessary blocks and determines which two regions may be the positions of the eyes. The rules are built from two geometric position relationships as follows:
(1) The horizontal distance between the centers of the two eyes will be limited in a
fixed-pixel range.
(2) The vertical distance between the centers of two eyes will be limited within a pixel
range.
(3) The block sizes of the two eyes, in length and width, will be limited to a certain pixel range.
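The sketch below combines the progressive thresholding and the geometric rules; the pixel limits are taken from the values reported in the experimental section (45–75 px horizontal distance, less than 20 px vertical difference, blocks within 40 × 20), while the connected-component handling is an implementation assumption.

```python
# Progressive thresholding of the gray face image and geometric filtering of
# candidate eye pairs; component extraction details are assumptions.
import cv2
import numpy as np
from itertools import combinations

def eye_candidates(gray_face, t0=0.1, t_max=0.3, t_step=0.05):
    pairs = []
    for t in np.arange(t0, t_max + 1e-9, t_step):
        dark = (gray_face < t * 255).astype(np.uint8)       # below-threshold (dark) pixels
        n, _, stats, centroids = cv2.connectedComponentsWithStats(dark)
        blocks = [(centroids[i], stats[i]) for i in range(1, n)
                  if stats[i][cv2.CC_STAT_WIDTH] <= 40
                  and stats[i][cv2.CC_STAT_HEIGHT] <= 20]    # rule (3): block size
        for (c1, _), (c2, _) in combinations(blocks, 2):
            dx, dy = abs(c1[0] - c2[0]), abs(c1[1] - c2[1])
            if 45 <= dx <= 75 and dy < 20:                   # rules (1)-(2)
                pairs.append((tuple(c1), tuple(c2)))
    return pairs
```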
After filtering the candidate eye blocks, the algorithm uses the 2-D correlation coefficient to calculate the degree of similarity between blocks. Before calculating the correlation coefficient, because the two eye shapes are mirror-symmetric, one of the blocks is flipped to obtain a relatively high similarity. The 2-D correlation coefficient represents the similarity of two blocks; it has no specific unit, and its value lies in the range of +1 to -1. The closer the value is to 1, the more similar the two variables are; the closer it is to 0, the less similar they are. A positive 2-D correlation coefficient implies that when one variable increases the other also increases, i.e. a positive relationship between the two variables. A negative value means that an increase in one variable comes with a decrease in the other, in other words an inverse relationship. Values close to +1 therefore indicate the strongest similarity.
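A small sketch of this similarity measure is shown below; flipping one block horizontally follows the description above, whereas resizing both blocks to a common size is an added assumption so that the coefficient is defined.

```python
# 2-D correlation coefficient r between two candidate eye blocks.
import cv2
import numpy as np

def block_similarity(block_a, block_b):
    block_b = cv2.flip(block_b, 1)                            # horizontal flip (mirror eye)
    h = min(block_a.shape[0], block_b.shape[0])
    w = min(block_a.shape[1], block_b.shape[1])
    a = cv2.resize(block_a, (w, h)).astype(np.float64)
    b = cv2.resize(block_b, (w, h)).astype(np.float64)
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
# r close to +1: very similar blocks (likely the eye pair); r <= 0: dissimilar.
```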
The following rules are utilized to locate the eye area:
(1) If r is greater than or equal to 0.5, the corresponding areas are identified as the eyes.
(2) If r is less than 0.5 but greater than 0, the pair of areas with the maximum value of r is identified as the eyes.
(3) If r is less than or equal to 0, the corresponding areas are not identified as the eyes.
Given the eye location coordinates (Eye_C_XL, Eye_C_YL; Eye_C_XR, Eye_C_YR) determined by the above rules, and the position and color relationships between the eyes and eyebrows, the rules to find the position of the eyebrows are as follows.
(1) The location of the eyebrows must be above the eyes.
(2) The vertical gap between the eyebrows and eyes is within a certain pixel range.
(3) The width of the eyebrows is usually greater than the width of the eyes, within a certain number of pixels.
According to the above rules, we can create a mask that limits the search to the non-skin-color region above the eyes. The identified area is taken as the location of the eyebrows, the boundary points are marked on the image, and the positioning step is completed. This study does not directly use color characteristics to search for the eyebrows, in order to avoid the influence of lighting [15], which makes the search hard to control. The eyebrow coordinates obtained are (Brow_XL(i, j, k), Brow_YL(i, j, k); Brow_XR(i, j, k), Brow_YR(i, j, k)), where Brow_X denotes the X coordinates of the eyebrows, Brow_Y the Y coordinates, the subscripts L and R represent left and right, respectively, and i, j, k index the coordinates of the boundary points.
During the skin segmentation experiments, it was observed that the mouth region has a specific color distribution in the YCbCr color space, with obvious differences from the surrounding skin, and occupies a pixel area of a certain size. We apply this feature to segment the mouth region and mark its boundary points. From the observations, the mouth boundary points were found to drift, and some offset was expected. To reinforce the positioning accuracy, the lip line of the mouth, the crossing points of the lips and the midpoint of this line are utilized to correct the boundary points. The lip is particularly evident in the HSV color space, and the literature [21] also supports this view. After segmenting the lip line according to the color differences, the original boundary line is recalibrated with respect to the midpoint. The corrected boundary points are then (Mou_X(i, j, k, l), Mou_Y(i, j, k, l)), where Mou_X is the X coordinate of the mouth, Mou_Y the Y coordinate, and i, j, k, l index the coordinates of the boundary points.
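The following sketch illustrates one way the lip line might be extracted in HSV space to correct the mouth boundary; the hue and saturation thresholds are placeholders, since the paper only states that the lip is particularly evident in HSV.

```python
# Illustrative lip-line extraction in HSV space for correcting the mouth
# boundary points; the thresholds below are placeholders, not the paper's.
import cv2
import numpy as np

def lip_midline_y(bgr_mouth_region):
    hsv = cv2.cvtColor(bgr_mouth_region, cv2.COLOR_BGR2HSV)
    h, s, _ = cv2.split(hsv)
    lip = ((h < 15) | (h > 165)) & (s > 60)                   # reddish, saturated pixels
    ys, _ = np.nonzero(lip)
    return int(ys.mean()) if ys.size else None                # y of the lip-line midpoint
```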
3 Experimental Results
This study utilizes the Georgia Tech Face Database [23] for testing. The database contains images of fifty different people, each with fifteen images of different angles, lighting and expressions. The images are 640*480 JPEG color images. We select frontal and expressionless images to test the proposed algorithm.
Following the process presented in the previous section, after loading an image for color segmentation, the test images are studied with histograms in different color spaces to obtain the color space with the best distribution. For color images in different color spaces, histograms of the gray levels of the different layers are computed over the different pixel ranges; this information reveals the concentration and distribution trends. Figures 2, 3 and 4 illustrate typical test examples from the Georgia Tech Face Database.
Fig. 5 shows the HSV color space histogram statistics for Fig. 2. From Fig. 5, the distribution of the saturation component S is concentrated in the range 0.2 to 0.4 with a certain trend. The gray value distribution of the hue component H is more extreme. Hence, a search would obviously have to cover the whole HSV range
[0, 1], selecting many unnecessary pixels. Figure 6 displays the original image converted to HSV space, and Figure 7 shows the segmentation in the HSV color space.
From Fig. 7, the color segmentation of the image is affected by light and shadow, resulting in many sub-blocks belonging to the face area not being identified as skin color and in the presence of noise. In order to maintain the integrity of the face for locating the features, the noise must be filtered from the images by the morphology operations. The morphology operations are logical operations; to facilitate the processing, the color image is transferred into a binary image, shown in Fig. 8. Fig. 9 exhibits the filtering results.
The pupil and its surrounding area have different gray values. Therefore, we segment the image by setting a suitable threshold value. Following the suggestion in [15], the threshold can be set in the range 0.1 to 0.6. From the experiments, when the threshold value is greater than 0.3, the usable information in the segmented image gradually becomes less and the subsequent processing becomes more difficult. The optimal threshold for eye segmentation lies in the range 0.13 to 0.22; as a conservative estimate, the threshold range is set to 0.1 to 0.3. The next step after the image segmentation is the establishment of rules to determine which blocks should be removed as useless. Because the eyes are located inside the face, we can keep only the black blocks covered by a white block, filtering out many of the extra blocks. After determining which regions are inside the face, the rules for the geometric relationship between the two eyes are applied to determine which blocks are the eyes, as follows.
(1) The horizontal distance between the centers of the two blocks is within 45 to 75 pixels.
(2) The difference in the vertical distance between the centers of the two blocks is less than 20 pixels.
(3) Each block fits within a 40 × 20 size.
Then the correlation coefficient r of the two candidate blocks is calculated to judge whether they are eyes. If r is greater than 0.5, the two blocks are identified as the eyes. If r is less than 0, the two blocks are not identified as the eyes. Moreover, if the value of r is between 0 and 0.5, the pair with the maximum value of r is identified as the eyes. Finally, the centers are marked on the two blocks, as shown in Fig. 10.
In fact, the locations of the eye points detected by the above rules at the different thresholds are quite close; they differ only after the decimal point. Therefore, all coordinates are rounded to integers, and the coordinates with the highest frequency are set as the eye position, as shown in Figure 11.
Fig. 10. The centers of the located targets marked on the blocks
The following rules are utilized to locate the eyebrows' position.
(1) The eyebrows should be above the eyes.
(2) The vertical gap between the eyebrows and eyes is about 20 pixels.
(3) The width of the eyebrows is usually greater than the width of the eyes by approximately 30 pixels.
Rule (1) above limits the search to the area above the eyes only, rule (2) searches only the region with non-skin-color pixel values, and rule (3) stores the result in binary image format, forming a mapping diagram. Figure 12 shows the position of the eyebrows represented as the white non-skin-color region. Figure 13 shows the located position marked in white.
The final feature to locate is the mouth. Similarly, the color differences between the mouth region and the surrounding skin in the YCbCr space are used to segment the mouth. However, because the Y component is used, it is easy to select non-mouth areas together with the mouth region, as shown in Fig. 14. Therefore, morphological operations are required to remove the extra small noises and obtain the full outline of the mouth area, as shown in Fig. 15. The following rules are used to locate the mouth on the noise-treated binary image of white connected sets.
(1) The position of the mouth is at 1/2 to 3/4 of the height of the face image.
(2) The area of the mouth is usually above 900 pixels in the face image.
(3) The ratio of the width to the length of the mouth is in the range 2:1 to 3:1.
After further screening by these rules, the final determination of the mouth area is shown in Fig. 16. Although its shape is not the complete outline of the mouth, and somewhat prominent and distorted edges are generated, the outline is within the acceptable range. Finally, the located position of the mouth is mapped back to the RGB space; the images showing the located mouth region are given in Fig. 17.
Fig. 12. The position of the eyebrows represented as white non-color region
Fig. 16. The located mouth on the binary image of white connected set with noise treatment
4 Conclusions
This paper proposed a face detection algorithm for color face images using a skin-tone color model and facial features. First, the proposed algorithm determines the face region based on skin-tone segmentation and morphological operations. Second, we locate the facial features by their color and geometry information. Finally, the algorithm sets the shape control points according to the Facial Animation Parameters in the MPEG-4 standard on the located facial features. The proposed method has been tested on images from the Georgia Tech Face Database. Experimental results show that the method works well on the facial images.
References
1. Ahn, S., Ozawa, S.: Generating Facial Expressions Based on Estimation of Muscular
Contraction Parameters From Facial Feature Points. In: IEEE International Conf. on Systems,
Man, and Cybernetics, The Hague, Netherlands, October 10-13, vol. 1, pp. 660–665 (2004)
2. Zhang, Q., Liu, Z., Guo, B., Terzopoulos, D., Shum, H.Y.: Geometry-Driven Photorealistic
Facial Expression Synthesis. IEEE Transactions on Visualization and Computer
Graphics 12(1), 48–60 (2006)
3. Song, M., Tao, D., Liu, Z., Li, X., Zhou, M.: Image Ratio Features for Facial Expression
Recognition Application. IEEE Transactions on Systems, Man, and Cybernetics – Part B:
Cybernetics 40(3), 779–788 (2010)
4. Pentland, A., Moghaddam, B., Starner, T.: View-based and modular eigenspaces for face
recognition. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 1994),
Seattle, WA (1994)
5. Min Huang, W., Mariani, R.: Face detection and precise eyes location. In: Proc. Int. Conf.
on Pattern Recognition, ICPR 2000 (2000)
6. Huang, J., Wechsler, H.: Eye detection using optimal wavelet packets and radial basis
functions (rbfs). Int. J. Pattern Recognit. Artif. Intell. 13(7), 1009–1025 (1999)
7. Chellappa, R., Wilson, C.L., Sirohey, S.: Human and Machine Recognition of Faces: A
Survey. Proc. IEEE. 83, 705–740 (1995)
8. Lam, K.L., Yan, H.: Locating and Extracting the Eye in Human Face Images. Pattern
Recognition 29(5), 771–779 (1996)
9. Smeraldi, F., Carmona, O., Bigun, J.: Saccadic Search with Gabor Features Applied to Eye
Detection and Real-time Head Tracking. Image and Vision Computing 18(4), 323–329
(2000)
10. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput.
Vis. 60(2), 91–110 (2004)
11. Weber, M.: Unsupervised learning of models for object recognition. Ph.D. dissertation,
California Inst. Technol., Pasadena (2000)
12. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-
invariant learning. In: Proc. IEEE Computer Society Conf. Computer Vision and Pattern
Recognition, pp. 264–271 (2003)
13. Kadir, T.: Scale, saliency and scene description. Ph.D. dissertation. Oxford Univ., Oxford,
U.K (2002)
14. Phung, S.L., Bouzerdoum, A., Chai, D.: Skin Segmentation Using Color Pixel
Classification: Analysis and Comparison. IEEE Transactions on Pattern Analysis and
Machine Intelligence 27(1), 148–154 (2005)
15. Tao, L., Wang, H.B.: Detecting and Locating Human Eyes in Face Images Based on
Progressive Thresholding. In: Proc. of the 2007 IEEE International Conf. on Robotics and
Biomimetics, Sanya, China, December 15-18, pp. 445–449 (2007)
16. Berbar, M.A., Kelash, H.M., Kandeel, A.A.: Faces and Facial Features Detection in Color
Images. In: Proc. of the Geometric Modeling and Imaging –New Trends, July 05-06,
pp. 209–214 (2006)
17. Guan, Y.: Robust Eye Detection from Facial Image based on Multi-cue Facial Information.
In: Proc. of the 2007 IEEE International Conf. on Control and Automation, Guangzhou,
China, May 30- June 1, pp. 1775–1778 (2007)
18. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. Prentice Hall, New
Jersey (2002)
19. ISO/IEC Standard 14496-2, Coding of Audio-Visual Objects: Visual (October 1998)
20. Song, M., Tao, D., Liu, Z., Li, X., Zhou, M.: Image Ratio Features for Facial Expression
Recognition Application. IEEE Transactions on Systems, Man, and Cybernetics – Part B:
Cybernetics 40(3), 779–788 (2010)
21. Ding, L., Martinez, A.M.: Features versus Context: An Approach for Precise and Detailed
Detection and Delineation of Faces and Facial Features. IEEE Transactions on Pattern
Analysis and Machine Intelligence 32(11), 2022–2038 (2010)
22. Vezhnevets, V., Sazonov, V., Andreeva, A.: A Survey on Pixel-Based Skin Color
Detection Techniques. In: Proc. of Graphicon 2003, Moscow, Russia, pp. 85–92
(September 2003)
23. Georgia Tech Face Database, http://www.anefian.com/face_reco.htm
Adaptive Learning Diagnosis Mechanisms for E-Learning
YuLung Wu
Abstract. In classes with a large number of students, teachers lack sufficient time to understand each student's learning situation. The framework of the learning activity in this study is based on the Learning Diagnosis Diagram. Before conducting learning activities, teachers must prepare Learning Diagnosis Diagrams. This work proposes an adaptive Learning Diagnosis Diagram that considers differences among students. The proposed system provides a personalized Learning Diagnosis Diagram for each student and automatically adjusts the learning phases to fit student achievement. The learning evaluation demonstrates the effectiveness of the proposed method and shows that the Learning Diagnosis Diagram can provide an adaptive learning environment for students.
1 Introduction
of each student. A good student needs to learn only the less extensive but more important learning content. Conversely, a poor student must study the content in detail and evaluate his or her weaknesses. Teachers need only develop the full, detailed course content; the proposed algorithm automatically handles the different learning styles of the students in the course.
2 Related Works
database server. The learning portfolios of each student include learning duration, login frequency, Q&A, discussion content, homework and statistical data. These learning portfolios enable finding students' learning achievement and automatically building adaptive learning materials.
In this research, the learning activities of all students are guided by the proposed system. This research adopts the Learning Diagnosis Diagram (LDD) as the basis of the proposed algorithm. First of all, the main definition is given.
The LDD comprises teaching materials and tests; the LDD is defined by experts or teachers, and each learning node in the LDD has a related teaching material and test designed by teachers. In Fig. 1, a link exists between learning nodes 1 and 4, and the direction of the arrow represents the learning sequence. In the definition presented here, learning node 1 is prior knowledge for node 4. Accordingly, students should learn learning node 1 before node 4 in the learning activities.
The LDD is created by the teacher before the start of the course. The architecture
of the LDD follows the course plan, including course contents and course sequence.
This research adopts the course “Electronic Circuits Laboratory” as the evaluation
course. The course is taught at the sophomore level. The "Electronic Circuits
Laboratory" course introduced electronic devices and instruments used in circuits, and
provided students with opportunities to implement circuit theories. The course
familiarized students with basic theories and virtual circuits. The detailed information
of the evaluation is described in section 4. Fig. 2 shows the LDD in this evaluation.
The LDD was created by the course teacher.
In our recent research [5][6][7], the LDD was developed and obtained positive results in three learning evaluations with a total of 197 participants. From the analysis of teacher responses, one important conclusion is that the design of an LDD considers only most students, not all students; in other words, it is difficult for one LDD to fit all students. Before any learning activities, the teacher must prepare an LDD. During learning activities, students are guided by the LDD. The diagram includes all course content and the course calendar. But in a large class there are many students with different characteristics. Students with high-level ability learn material quickly and systematically; students in this category do not need to learn detailed information, and a more effective method for them is to learn more about the important topics. Conversely, poor students learn step by step, acquire the necessary information about a topic and then check their progress.
Using the collected data and learning portfolios, the LDD records the performance of each student and is used to analyze the characteristics of their learning status. The main notion of the diagram is to identify similar learning content and combine it into one learning node. The LDD in Fig. 2 is the initial diagram designed by the instructors. The initial diagram contains the maximum number of learning nodes, implying the maximum number of learning steps. Students with poor performance must learn diligently; these students learn by following the LDD sequentially. However, detailed learning steps are not always advantageous for high-performing students. For students with varying learning performance, learning plans with appropriate styles must be developed.
Prior to the learning activities, students are divided into groups based on their performance in a prior test. Good students are guided by an LDD with fewer learning nodes; poor students are guided by an LDD with more learning nodes. Students then begin their learning activities. To adapt to the individual needs of each student, the LDD adjusts automatically with the following algorithm. A student failing on a learning node implies that the learning node is difficult for that student. Further information must be obtained to identify the cause of the failure. The failed learning node, which was formed by combining similar learning nodes, is expanded, and the student must complete the expanded nodes to pinpoint the exact learning problem.
The following definitions are used in the LDD algorithm.
Definition 2. The Item Success Index (ISI) represents how successfully one learning node is learned. To measure the degree of successful learning, this research considers the ratio of right answers to total answers for one learning node. ISIA,i represents the item success index of learning node A for student i, and ISIA is the average success index of learning node A over all students:

ISIA,i = (the number of right answers of student i / the number of answers of student i) × 100%, and 0 ≤ ISIA,i ≤ 1,

ISIA = (1/n) Σ_{i=1..n} ISIA,i when there are n students in a class.
An easy learning node is one that many students can learn successfully. This research collects all answers for each learning node and calculates the correct ratio, ISI. A small ISIA value denotes that learning node A is hard to learn and often failed, meaning the learning node is necessary; a large ISIA value denotes that the learning node is easy, meaning it can be ignored: i.e., if the learning node is easy, it is not critical for students who perform well. This research adopts ISI as a factor for combining easy learning nodes into one learning node.
Definition 3. The correlation coefficient (CC) represents the correlation between two learning nodes. CCAB represents the CC between learning nodes A and B:

CCAB = Σ_{i=1..n} (ISIA,i − ISIA)(ISIB,i − ISIB) / √( Σ_{i=1..n} (ISIA,i − ISIA)² · Σ_{i=1..n} (ISIB,i − ISIB)² ).

Since negative values are not meaningful in this study, 0 ≤ CCAB ≤ 1.
The correlation value represents the degree of correlation, ranging from 0 to 1. A value of 1 denotes that the data of the two groups are the same; a value of 0 shows that the groups have no relationship with each other. From the answers, CCAB shows the degree of similarity of learning nodes A and B. A larger CCAB value denotes that students have the same learning performance on them, i.e. the attributes of the two learning nodes are similar and the nodes contain similar or related content. This research assumes that two learning nodes with a large CCAB value are similar learning nodes and therefore combines similar learning nodes into one learning node.
Definition 4. Learning Node Correlation Factor – The learning node correlation factor (LCF) between learning nodes A and B is LCFAB = ((ISIA + ISIB) / 2) × CCAB, and 0 ≤ LCFAB ≤ 1. LCFAB represents the degree of correlation of learning nodes A and B, evaluating both success and similarity.
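A minimal sketch of Definitions 2–4 follows; the per-student answer counts are an assumed data layout, and the rule that nodes are merged when LCFAB reaches the threshold CT is inferred from the later statement that a large CT keeps more learning nodes.

```python
# Sketch of ISI (Definition 2), CC (Definition 3), LCF (Definition 4) and the
# merge decision against the threshold CT.
import numpy as np

def isi_per_student(correct, total):
    """ISI_{A,i} for every student i of one learning node (arrays of counts)."""
    return np.asarray(correct, dtype=float) / np.asarray(total, dtype=float)

def cc(isi_a, isi_b):
    """Correlation coefficient between two nodes; negatives are clipped to 0."""
    da, db = isi_a - isi_a.mean(), isi_b - isi_b.mean()
    denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
    return max(0.0, float((da * db).sum() / denom)) if denom > 0 else 0.0

def lcf(isi_a, isi_b):
    """LCF_AB = ((ISI_A + ISI_B) / 2) * CC_AB."""
    return ((isi_a.mean() + isi_b.mean()) / 2.0) * cc(isi_a, isi_b)

def should_combine(isi_a, isi_b, ct=0.5):
    """Two nodes are merged into one when their LCF reaches the threshold CT."""
    return lcf(isi_a, isi_b) >= ct
```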
This research utilizes CT to determine which learning nodes are combined. The CT value can be adjusted by teachers. A large CT retains more learning nodes, and a small CT retains fewer learning nodes. Teachers can preset several CT values suited to students with different abilities. The proposed strategy automatically generates the related LDDs. The proposed algorithm is shown in Fig. 3.
The learning process designed in this study is shown in Fig. 4. Each learning node contains two steps. The first step is the learning step, in which students study the on-line learning materials. The second step is testing, during which students take exams related to the learning materials. If students pass the test, they continue to study the next learning node according to their LDD. If students fail the test, the learning node is difficult for them, and thus they require easier and more detailed learning materials. The learning system expands the failed node that was combined under CT. Students who fail the test learn these expanded nodes to determine the real problems in their learning activities.
With regard to the learning activities, this study focuses on the testing step. The adaptive testing in the LDD is designed for students with different learning styles. The main idea of this research is detailed information for poor students and reduced information for good students. The detailed information corresponds to the expanded learning nodes, while the reduced information corresponds to the combined learning nodes. An important issue follows: “Does a test result for a combined learning node equal the test results for the corresponding expanded learning nodes?” The following section presents a learning evaluation to discuss this issue.
4 Learning Evaluation
This section presents the learning evaluation of this research. The participants were eighty students, all of whom had experience of using computers and the Internet. Each learning node has ten test questions, so there are 150 questions in the test item bank. All questions are multiple-choice questions.
The LDD in this research was constructed by two professors who had taught the
course for many years, and is shown in Fig. 2. All learning nodes were taught in this
evaluation in a semester. When designing the course learning activities, team
assignments, conversation time and team examinations were arranged on a weekly basis. In the evaluation activities, all participants took part in the instruction provided by the instructor. Meanwhile, in extracurricular activities, all participants logged in to the system and completed some review tests before the next class.
Figure 2 shows the LDD with CT=1, which is the full LDD. It contains all learning nodes and teaching materials and is suitable for low-ability students. In Fig. 5a to 5c, the number of learning nodes is reduced progressively as many similar learning nodes are combined. The learning nodes that were not combined are the more important and difficult ones; good students can learn using the smaller LDD. The results in Fig. 5a to 5c also show that the basic learning nodes (upper part) and the advanced theories and learning nodes (lower part) of Electronic Circuits were retained by the combining process, while the basic theories at the center of the diagram were combined together. In Fig. 5c, all learning nodes were grouped into three learning nodes: basic concepts, basic theories and advanced theories. The LDD generates different styles of learning content according to the ability of each student.
This study discusses the following issue: “Do test results for one combined learning node represent all nodes within the combined node?” Moreover, to evaluate the relation between learning effect and student ability, all participants are separated into three groups. The evaluation includes 115 participants, separated into a high-ability group, a medium-ability group and a low-ability group based on the pre-test score. The high-ability and low-ability groups each contain 31 participants, and the medium-ability group contains 53 participants. After all students finish their learning activities with the LDDs, their LDDs and learning results are collected.
5 Conclusions
In classes with a large number of students, teachers are unable to spend much time understanding each student's learning status. As a result, the fast-learning students thoroughly grasp the contents taught in class, while the slow-learning students fall further and further behind, and eventually the education system gives up on them. In order to assist teachers in managing large classes, this research proposed an LDD system. The system provides a personalized learning environment for individual students and automatically adjusts the learning phases to fit student achievement.
This research proposes a framework for remedial learning in which LDDs with different learning styles are derived. In the learning evaluation, the adaptive LDD is suitable for high-ability students. Furthermore, the number of combined learning nodes should not exceed four, and a suitable value for CT is between 1 and 0.5. Too small a CT leads to too many learning nodes being combined, indicating decreased accuracy of the LDD test.
In the learning evaluation, the accuracy with CT=0.5 and 0.45 is not good enough for medium- and low-ability students. There may be some combining factors that are not considered in the LDD, such as the relationship between learning nodes and the learning order. In the LDD, merging and expansion do not consider the relationship between learning nodes; merging two learning nodes may cause the combined node to mix too much content or very different content. The other factor is that the combined learning node does not consider the learning order. When the evaluation course is taught, two learning nodes may be separated by several weeks, but combining them into one learning node implies that they are learned simultaneously. In the future, research will be conducted on these two factors to increase the efficacy of the LDD.
References
1. Chang, F.C.I., Hung, L.P., Shih, T.K.: A New Courseware Diagram for Quantitative
Measurement of Distance Learning Courses. Journal of Information Science and
Engineering 19(6), 989–1014 (2003)
2. Goldsmith, T.E., Davenport, D.M.: Assessing structural similarity of graphs Pathfinder
associative network: studies in knowledge organization, Norwood, pp. 75–87 (1990)
3. Goldsmith, T.E., Johnson, P.J., Acton, W.H.: Assessing Structural Knowledge. Journal of
Educational Psychology 83(1), 88–96 (1991)
4. Hwang, G.J.: A conceptual map model for developing intelligent tutoring systems.
Computers & Education 40(3), 217–235 (2002)
5. Jong, B.S., Chan, T.Y., Wu, Y.L.: Learning Log Explorer in E-learning Diagnosis. IEEE
Transactions on Education 50(3), 216–228 (2007)
6. Jong, B.S., Lin, T.W., Wu, Y.L., Chan, T.Y.: An Efficient and Effective Progressive
Cooperative Learning on the WEB. Journal of Information Science and Engineering 22(2),
425–446 (2006)
7. Jong, B.S., Wu, Y.L., Chan, T.Y.: Dynamic Grouping Strategies Based on a Conceptual
Graph for Cooperative Learning. IEEE Transactions on Knowledge and Data
Engineering 18(6), 738–747 (2006)
8. Novak, J.D., Goin, D.B.: Learning how to learn. Cambridge University Press, New York
(1984)
9. Ruiz-Primo, M.A., Schultz, S.E., Li, M., Shavelson, R.J.: Comparison of the Reliability
and Validity of Scores from Two concept-Mapping Techniques. Journal of Research In
Science Teaching 38(2), 260–278 (2001)
10. Sowa, J.F.: Conceptual graphs for a data base interface. IBM Journal of Research and
Development 20(4), 257–336 (1976)
New Integration Technology for Video Virtual Reality
Wei-Ming Yeh
Department of Radio & Television, National Taiwan University of Arts, Taipei, Taiwan 220
[email protected]
Keywords: Augmented Reality, Cyber Codes, Fiduciary marker, HDR, Smart AR.
1 Introduction
Since early 2009, digital video technology has had great success in many fields, such as video games, mobile phones, and digital cameras (DSC). Many small photo companies in Japan realized that digital video technology could be the only way to compete in the market with the two giant photo tycoons, Canon and Nikon. In fact, Canon and Nikon have dominated the DSLR and SLR market for decades and are famous for high-quality lenses, CMOS and CCD sensors, processors, and accessories; surprisingly, they have not been willing to follow this trend and develop such fancy digital scene effects for their high-level DSLR cameras. Currently, Sony, the M 4/3 family, and Samsung have realized that using digital video technology may achieve a niche between high-end DSLRs and consumer-level
DSCs, offering shining stars such as the Sony HX-1 and HX-100V and the Panasonic GF1, GF2 and GF3, among other products recognized as prominent DSCs in the TIPA 2009, 2010 and 2011 Awards. It is a simple fact that DSLR-like cameras and Mirrorless Interchangeable Lens Cameras (MILC or EVIL), with plenty of digital video technology (digital scene effects or customized scene effects), can attract many amateur consumers and white-collar female consumers, offering a major marketing breakthrough and survival in this margin-profit camera market. Since 2007 the IT industry has co-operated with camera manufacturers to develop many built-in IC chips with brilliant customized scene effects for cameras (mainly DSCs), such as Face Detection Technology, Smile Shutter Mode, full-frame HD, CCD anti-shake systems, Live View, and many others. Within a few years, by early 2009, new features appeared, such as back-illuminated CMOS image sensors [3], Sweep Panorama, joining multiple exposed patterns, Night Scene Portrait and Motion Remover [7], which challenge the multimillion-dollar business not in the DC segment but in the highly profitable DSLR market, and may create a new attraction for photo fans to replace their old cameras.
In addition, since 2010, after the introduction of the 3D movie Avatar, people all over the world seem to enjoy this 3D mania in their lives, whether 3D monitors in hardware or 3D Panorama video effects for photo images, and Smart AR is just another new technological wonder, which may create a huge marketing benefit ahead. In fact, Smart AR is one of Sony's new VR technologies. It is related to Augmented Reality (AR), which is widely adopted in sports telecasting, video games, automobiles, and even jet-fighter weapon systems (F-35), and is ready for advanced video effects someday. AR offers a live direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sensory input, such as sound or graphics. It is related to a more general concept called mediated reality, in which a view of reality is modified (possibly even diminished rather than augmented) by a computer. As a result, the technology functions by enhancing one's current perception of reality. By contrast, virtual reality replaces the real world with a simulated one [5].
2 Technical Innovation
Generally, all hot-selling video scene effects depend on the requirements of the market. In the early days of the DSC, only expensive DSLRs with high image quality and cheap compacts could survive, as had been the case for decades. As technology improved, new categories of DSC appeared, such as EVIL and DSLR-like cameras, both equipped with many interesting video scene effects to attract more consumers who enjoy private shooting with a neat size and quality images, especially white-collar females and low-key specialists. Therefore, a camera with plenty of built-in video scene effects, designed by many experienced photographers, can help people to take good pictures effortlessly.
Currently, most amateur consumers prefer a DSC with an amazing number of video scene effects, no matter how often or how rarely they use them while shooting photos. People simply love them, and they do create great profit. We noticed some hot-selling items, widely accepted and copied by many reasonably priced cameras from 2009 to 2011: blur-free twilight scene shooting, lightning-fast continuous shooting, Full HD video, and Sweep Panorama. Other technical wonders, such as 30× optical
zoom and super-high-speed ISO, might not be as popular (accepted) as people thought, or are simply limited to a few models. For example, in 2008 only a few DSLRs were equipped with HD video; currently, ironically, it is hard to find a DSC without HD video (compared with before 2009). In addition, the Fujifilm HS10 (fall 2009) offers two special scene effects, Motion Remover and Multi Motion Capture, which won strong attention in this market, and the HS20 (February 2011) keeps the good merit of Multi Motion Capture but drops Motion Remover (too complicated to handle). In addition, the new Sony HX-100V (May 2011) offers GPS, High Dynamic Range (HDR), and a highly sophisticated 3D Panorama, which may lead a new direction for future camera development. With the new era of 3D technology, any specialist can take better 3D pictures easily and manipulate 3D video images with special software, without bulky and expensive equipment. In addition, there are many brilliant 3D products available, such as 3D TVs, projectors, polarized 3D glasses, 3D games, and cell phones.
In our experiment, we collected more than 300 cases from a telephone survey conducted from March 2011 to August 2011. A total of 212 cases were effective; the telephone surveys were recorded by two specialists, and the non-typical cases were settled after further discussion among three specialists.
Fig. 3. Marginal and conditional probabilities of the preference levels for the SP, AP and CS groups (x-axis: DSLR confidence level, i.e. level 0, level 1 and above 2; y-axis: probability)
We draw the profiles of the different groups using the conditional and marginal probabilities of their preference levels, for SP, AP and CS respectively, in Fig. 3, where the solid line denotes the marginal probability of the levels and the dotted lines the conditional probabilities. For the levels "low" (level 0) and "slight" (level 1) there is a maximum disparity between the solid line and the others of nearly 0.2 for AP; we can therefore infer that the preference level is relatively lower for AP and that there is a greater probability of a slight confidence level for SP. This shows that the groups are influenced to varying degrees by video-effect preference, even though the fan groups differ; hence the preference level is associated with the video effect. We use a chi-squared test for independence to confirm the association with the preference level; the χ2 value is 33.7058 and P ≈ 0.
We performed a test of independence to confirm that the preference levels of the three groups in Section 3 are related, but this test does not use the ordinal nature of the response levels. Camera research regularly collects classification data with an order; for example, the article mentions the response variables 'low', 'slight', 'moderate' and 'high' preference. We want to probe the factors influencing the preference level, so we analyze the data with a proportional odds model. With only one covariate, the model presents a straight-line relation between the logarithm of the cumulative odds of level p and the covariate x; because the model assumes the same slope β for all levels p, these c−1 straight lines are parallel to each other. The model is based on McCullagh and Nelder (1989) and Diggle, Liang and Zeger (1994). One predictor variable was included in the study, representing known or potential risk factors: the kind of fan group, which is a categorical variable with three levels. It is represented by two indicator variables (x1 and x2), as follows.
kinds x1 x2
SP 1 0
AP 0 1
CS 0 0
Note the use of the indicator variables, as just explained, for the categorical variable. The primary purpose of the study was to assess the strength of the association between the predictor variables and the preference level. Let
Lp(x1, x2) = θp + β1 x1 + β2 x2,   p = 1, 2    (1)
We use likelihood ratio tests for the hypothesis β1 = β2 = 0. The likelihood ratio test is explained as follows: let L be the log-likelihood of a model; then G2 = −2L. Under the hypothesis β1 = β2 = 0, the likelihood ratio test uses the difference between the deviances of model (1) and model (2) as the reference value to test H0: β1 = β2 = 0. If the hypothesis is rejected, we further test whether β1 = 0 or β2 = 0. In fact, for the preference level "under moderate" versus the preference level "high", the odds ratio of SP compared with CS is also exp(β1). According to the formula, β1 is the logarithm of the estimated odds of the preference level "under high" versus the preference level "high" for SP compared with CS; β1 > 0 means the preference level presented by SP is lower than that of CS, whereas β1 < 0 means the preference level presented by SP is higher than that of CS.
Therefore, β1 − β2 is the logarithm of the estimated odds of the preference level "under moderate" versus the preference level "high" for SP compared with AP; β1 − β2 > 0 means the preference level presented by SP is lower than that of AP, whereas β1 − β2 < 0 means the preference level presented by SP is higher than that of AP. In the later part of this article, we utilize odds ratios to probe the association between fan groups and video-effect preference levels.
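A minimal sketch (not the authors' code) of this analysis is given below: it fits the proportional odds model of Eq. (1) with the two indicator variables using statsmodels' OrderedModel and performs the likelihood-ratio test of H0: β1 = β2 = 0. The data frame, its column names and the random responses are hypothetical placeholders, not the survey data.

import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Hypothetical survey records: fan group and ordinal preference level (0, 1, 2).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": ["SP", "AP", "CS"] * 70,
    "preference": rng.integers(0, 3, size=210),
})

# Indicator coding as in the table above: x1 = 1 for SP, x2 = 1 for AP, CS is the reference.
X = pd.DataFrame({"x1": (df["group"] == "SP").astype(int),
                  "x2": (df["group"] == "AP").astype(int)})
y = df["preference"].astype(pd.CategoricalDtype(categories=[0, 1, 2], ordered=True))

full = OrderedModel(y, X, distr="logit").fit(method="bfgs", disp=False)

# Null (thresholds-only) log-likelihood equals the multinomial likelihood of the
# observed category proportions, so the LR statistic is the deviance difference.
counts = y.value_counts()
counts = counts[counts > 0]
llf_null = float((counts * np.log(counts / counts.sum())).sum())
lr = 2 * (full.llf - llf_null)
print("LR statistic:", lr, "p-value:", stats.chi2.sf(lr, df=2))
print(full.summary())   # beta1, beta2 and their individual tests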
Combining confidence levels "2" and "3" into "2" (above moderate), we utilize the odds ratio to calculate: the deviance equals 416.96, and if β1 = β2 = 0 the deviance is 444.04, so the difference between the two deviances is 27.087. Using this statistic to test H0: β1 = β2 = 0 against the χ2 distribution, we find χ2 = 27.087 and P = 0.001. The null hypothesis is rejected and we conclude that β1 and β2 are not simultaneously zero. Furthermore, we analyze whether β1 = 0, β2 = 0, or both are non-zero. Table 3 gives the results of the maximum likelihood estimates. From Table 3 we find that the hypothesis β̂1 = 0 is rejected while β̂2 = 0 is accepted (the P-values are 0.017 and 0.072, respectively). β̂2 represents the logarithm of the odds ratio for AP, so the odds ratio would be estimated as exp(β̂2). β̂1 < 0 means that the odds of preference ≤ 0 rather than preference > 0 for SP are 0.423 times those of CS; in other words, the odds of preference > 0 rather than preference ≤ 0 of the video-effect confidence level are about 2.36 times higher for SP than for CS customers. β̂2 = 0 means the confidence level presented by AP is the same as that of CS, which is consistent with the earlier result.
Based on these data, Amateur Photographers (AP) may enjoy most video effects; we assume that a chip of video effects can save a great amount of time and money and let them pay more attention to shooting. Senior Photographers (SP) have usually owned professional software and better cameras for years; they are known for taking quality pictures with high-level DSLRs under manual operation and doing post-production with professional software, without any artificial-intelligence effects. College Students (CS) may have less budget and time and frequently post small-size photos on Facebook or websites.
As of 2011, 3D Sweep Panorama (3DP), GPS and High Dynamic Range (HDR) have become hot-selling scene effects, and many new DSCs already include them as standard equipment. We find that Senior Photographers (SP) are willing to accept all possible new technologies and devices, followed by Amateur Photographers (AP), whereas College Students (CS) show a conservative attitude all the time.
4 Conclusion
Over the past decade, people have enjoyed the pleasure of taking pictures or recording anything with various digital cameras (DSC/DV) in place of old film cameras, and good pictures have become easy to take. The future DSC will be full of surprises: handy to use, energy saving, affordably priced and with more customized scene effects to please photo fans and video-game users. This has already set a new record for successors and pushes camera and video manufacturers toward more user-friendly innovation, especially in entry-level models (the Sony α series). As for the latest wonders, Sony Smart AR combines 3D, game and VR technology, which may challenge the currently hot-selling 3D panorama and VR effects and may soon create a new 3D hardware market (DSCs, monitors/TVs, mobile phones, notebooks, etc.). In the future, this new technology can enhance one's current perception of reality and simulate new images with fewer limitations than ever.
References
1. TIPA Awards 2009: The best imaging products of 2009, TIPA (May 2010),
http://www.tipa.com/english/XIX_tipa_awards_2009.php
2. TIPA Awards 2010: The best imaging products of 2010, TIPA (May 2011),
http://www.tipa.com/english/XX_tipa_awards_2010.php
3. Rick User: New Sony Bionz processor, DC View (May 2, 2010),
http://forums.dpreview.com/forums/
read.asp?forum=1035&message=35217115
4. Alpha Sony: Sony Alpha 55 and 33 Translucent Mirror, Electronista (October 2010)
5. Graham-Rowe, D.: Sony Sets Its Sights on Augmented Reality, Technology Review (MIT)
(May 31, 2011),
http://www.technologyreview.com/
printer_friendly_article.aspx?id=37637
6. Behrman, M.: This Augmented Reality Needs No Markers to Interfere With Your (Virtual)
World, Gizmodo (May 19, 2011),
http://gizmodo.com/5803701/this-augmented-reality
-needs-no-markers-to-interfere-with-your-virtual-world
7. Moynihan, T.: Fujifilm’s Motion Remover, PC World (March 3, 2010),
http://www.macworld.com/article/146896/2010/03/
sonys_intelligent_sweep_panorama_mode.html
Innovative Semantic Web Services for Next Generation
Academic Electronic Library via Web 3.0
via Distributed Artificial Intelligence
1 Department of International Business
2 Department of Education
National Taichung University of Education, 140 Min-Shen Road,
Taichung 40306, Taiwan ROC
{hcchu,swyang}@mail.ntcu.edu.tw
1 Introduction
Contemporary web technology has focused on the integration, virtualization and socialization aspects (Cho, et al., 2008). Undoubtedly, Web 2.0 technology, which emerged in the 2000s, gradually dominates the current mainstream of Internet technology. Web 2.0 introduced the extraordinary phenomenon of user-generated content and community-oriented gatherings, based on the fact that it offers enormous numbers of users more than just retrieving information from existing web sites. It is constructed upon an architecture of participation that reduces the barrier to online collaboration and encourages users to create and distribute content among communities. Wikipedia, PLURK,
* Corresponding author.
YouTube, Flickr and Twitter are good examples of Web 2.0 applications. Web 2.0 is the platform on which people socialize or interact with each other, and those people are brought together through a variety of Communities of Interest (COI) to form cohesion. Any community member can share his or her impressions with the rest of the community participants either on-line or off-line. Some literature suggests that Web 2.0 is also called the Social Web, which strongly encourages community members to provide data or metadata in simple ways such as tagging, blogging, rating or commenting. Nevertheless, in current Web 2.0 community web sites a substantial amount of data is scattered across poorly structured databases and the contents are prone to subjectivity. Hence, the repository of data dispersed around the networks is expanding exponentially. Unsurprisingly, finding suitable data on a certain subject becomes much more challenging, and locating appropriate and relevant information can be a tedious task for community members. The above arguments stimulate the emergence of Web 3.0, which will be the third wave of the World Wide Web (WWW). The Intelligent Agent (IA) and the Semantic Web play an essential role in the forthcoming Web 3.0 architectures. We have already witnessed library 2.0 serving the academic electronic library, which focuses on SOA (Service-Oriented Architecture) from the users' perspective [Yang, et al., 2009]. To date, no academic electronic library provides the Web 3.0 characteristics in the core functionality of its information systems. In this article, we propose a conceptual model of semantic web services for the next-generation academic electronic library via Web 3.0 architectures. Web 3.0 will have substantial effects on electronic library services in many respects and will dramatically change the way digital library users search for information.
2 Preliminaries
2.1 Intelligent Agent (IA)
The concept of the IA originated from the emergence of Distributed Artificial Intelligence (DAI), a major branch of AI that took shape around twenty years ago. Presently, it takes advantage of ubiquitous networks and the power of distributed processing over the networks. An IA is regarded as a computer system, situated in some environment, with the capability of flexible and autonomous action in order to meet its design objective [Jennings, et al., 1998]. Hence, the Multi-Agent System (MAS) has been proposed as a new paradigm of mobile computing based on cooperation and autonomy among agents. Basically, one of the most challenging problems in a MAS is to ensure that the autonomous behavior of every single IA remains coherent. Based on related research, social autonomy is believed to be one of the most important behaviors concerning the interactions between the agents in a MAS [Carabelea, et al., 2004]. Social autonomy means the adoption of goals. From an application point of view, an adopted goal might exceed the processing capability of a certain IA. Under such circumstances, the IA can seek additional assistance from other IAs via the ubiquitous networks. In other words, an IA is able to delegate a goal to another IA, which might adopt it, through cooperation in order to solve problems via the MAS structure. On some occasions, IAs might disagree concerning a specific goal. Hence, they might have to carry on negotiation in dynamic and
Web 3.0 is a phrase coined by John Markoff of the New York Times in 2006. It refers to hypothetical Internet-based services that collectively integrate the related information randomly scattered across the networks and present it to end users without requiring them to launch distinct application programs. Therefore, Web 3.0 is also called the Intelligent Web. Without loss of generality, Web 3.0 might be defined as the third generation of the WWW, enabled by the convergence of several key emerging technology trends encompassing ubiquitous connectivity, network computing and the intelligent web [Nova, 2006]. Nowadays, Web 3.0 is already hotly debated among global Internet technology researchers. Although the definition of Web 3.0 is still indefinite, the reality is that many technologies and paradigms have been migrating into the prototype of this inevitable future. Some ICT (Information and Communication Technology) industry researchers and practitioners from different fields also indicate that Web 3.0 will ultimately be seen as applications that are pieced together, and those applications will be very small and executable on any computing device [Wikipedia, 2009].
Web 3.0 also refers to the provision of more productive, personalized and perceptive surroundings for end users by means of the integration of Semantic Web and Artificial Intelligence (AI) technologies, which act as the core mechanism in charge of interpreting linguistic expressions from end users. Under Web 3.0 structures, all information needs to be well organized in order to be interpreted by machines without any ambiguity and, as far as possible, in the way humans can. The Semantic Web will be bundled into the next generation of Web 2.0 and will create web sites that can extract, share, re-use and aggregate information in one place, presenting it to users as collective knowledge. Consequently, the user interface is a crucial ingredient for the embrace of Web 3.0.
Notably, Web 3.0 will satisfy users' needs in many respects, and it should be as natural as possible. The adoption of AI will provide collective intelligence and more collaborative searching power to global academic digital library users. Under such circumstances, Web 3.0 will emphasize the advantages of the semantic social web with respect to Web 2.0. Additionally, this will create opportunities and possibilities for the use of semantic tagging and annotation in the social web. An IA embedded with a semantic web reasoning mechanism will search web sites with the capability of processing complicated queries in order to retrieve the most optimal information for the users. Consequently, heterogeneous IAs are able to communicate, cooperate or even proceed to conflict resolution. Eventually, pervasive and ubiquitous computing will be the fundamental components of Web 3.0, which will be omnipresent, with multimodal intelligent user interfaces to access all kinds of multimedia objects in an unobtrusive manner. There is no doubt that for the next generation of web services of digital libraries, the systems should be capable of providing end users with more specific information based on semantics-oriented applications.
Finally, we would like to conclude that Web 3.0 is the combination of the existing Web 2.0 and the Semantic Web. Ontology, IA and semantic knowledge management will be integrated into the Semantic Web. Ambient Intelligence (AmI) will play an essential role concerning the embedded semantic reasoning mechanism in the third generation of the WWW through heterogeneous networks. Furthermore, human-machine interfaces will be much more unobtrusive to naive end users. Bringing high-quality services to users is a salient challenge for future academic electronic libraries. Although Web 3.0 is still in its infancy, several pioneering applications have touched down in the web communities. Those web sites emphasize the application of AI; in other words, the more participants use them, the more convenient they become. They are powered by semantic interpretation, robotically learn the user preferences, and make connections and suitable recommendations tailor-made for end users. Those web sites will categorize distinct interests for users, eventually bringing and bundling them together and pushing them to the users' desktop in one place, and the users are able to share them with anyone.
The theme of this research paper is to provide a system prototype of next-generation academic library services under Web 3.0 structures, which none of the existing academic libraries has done. We also design and propose a conceptual model to illustrate the essence of the above statements from the point of view of offering practical advice. The paper incorporates the key elements of Web 3.0 into the core functionality of the web services of future academic electronic libraries.
As Figure 1 illustrates, the dashed rectangle indicates the integration of the Semantic Web into the existing academic library search modules. As we know, many academic libraries subscribe to electronic journals from publishers. Under the Web 3.0 library structure, the user can provide linguistic variables, which will be interpreted by the IA of a specific publisher. For example, the user has found the desired book in the current library information system and would like to screen related comments, reviews or Q&A before checking out the material. In our proposed scenario, the user might provide 'The most popular' in the dialog box as the linguistic expression and check Twitter, PLURK and YouTube as the reference resources, as the figure illustrates. After the user submits the desired goal, the IA of the specific publisher will search the Q&A database within its own domain and provide the most optimal responses. Figure 2 depicts the Web 3.0 library system prototype that we propose. For example, academic library A subscribes to Springer (electronic journals, eBooks and databases) and a certain user provides a goal through the dialog box designed in Figure 1. After the user submits the request, academic library A will notify Springer, and the IA of the corporation will search the Q&A database within the publisher to provide the related responses. Concurrently, the IA will traverse the Web 2.0 community to compile and retrieve all the correlated information from Twitter, PLURK, blogs, Wikipedia or YouTube. All the information will be delivered to the screen of the user who set the goal. The user will be able to visualize all the information without specifically triggering all those applications, and this is the beauty of the Web 3.0 library. Figure 2 also demonstrates the scenario in which the IA of Springer might contact the IA of Wiley,
Fig. 1. The search results are automatically integrated and presented to end users
Fig. 2. Web 3.0 electronic libraries integrate semantic web, IA, and Web 2.0 community web
sites
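The workflow just described, forwarding the user's goal to a publisher IA while concurrently gathering related Web 2.0 content and merging everything into one view, can be illustrated with the following minimal sketch. It is not the authors' implementation; the functions, delays and returned strings are hypothetical placeholders standing in for the publisher Q&A search and the community-site retrieval.

import asyncio

async def query_publisher_qa(publisher, goal):
    # Placeholder for the publisher IA searching its own Q&A database.
    await asyncio.sleep(0.1)                       # simulated network/reasoning latency
    return ["%s Q&A entry matching '%s'" % (publisher, goal)]

async def query_web20_source(source, goal):
    # Placeholder for compiling related posts from a Web 2.0 community site.
    await asyncio.sleep(0.1)
    return ["%s post about '%s'" % (source, goal)]

async def fulfil_goal(goal, publisher, sources):
    tasks = [query_publisher_qa(publisher, goal)]
    tasks += [query_web20_source(s, goal) for s in sources]
    results = await asyncio.gather(*tasks)         # concurrent retrieval
    return [item for group in results for item in group]   # one merged result set

if __name__ == "__main__":
    merged = asyncio.run(fulfil_goal("The most popular", "Springer",
                                     ["Twitter", "PLURK", "YouTube"]))
    print("\n".join(merged))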
3 Conclusions
In this research, we propose the conceptual model of Web 3.0 digital library, which
incorporates the Semantics Web mechanism into the core function of an IA. A web 3.0
library will be primarily based on pervasive and ubiquitous computing infrastructures.
Currently, there is no academic electronic library fulfilling web 3.0 criteria. However,
this is definitely the next generation of digital library. For any publisher, there will be
an IA representing the publisher on the web and ready to cooperate or negotiate with
other IAs in a MAS to fulfill the goal set by the users. The users would retrieve the
related information without intentionally invoke the distinct applications and all those
information would be collectively delivered to users in one place. Furthermore,
automatic customization would be another feature of Web 3.0 due to the integration of
AI. As technologies continue to advance, devices would become more context-aware
and intelligent.
References
1. Carabelea, C., Boissier, O., Florea, A.: Autonomy in Multi-agent Systems: A Classification
Attempt. In: Nickles, M., Rovatsos, M., Weiss, G. (eds.) AUTONOMY 2003. LNCS
(LNAI), vol. 2969, pp. 103–113. Springer, Heidelberg (2004)
2. Cho, E.A., Moon, C.J., Park, D.H., Bait, D.K.: An Approach to Privacy Enhancement for
Access Control Model in Web 3.0. In: Third International Conference on Convergence and
Hybrid Information Technology, vol. 2, pp. 1046–1051 (2008)
3. Jennings, N.R., Sycara, K., Wooldridge, M.: A Roadmap of Agent Research and Development.
International Journal of Autonomous Agents and Multi-Agent Systems 1(1), 7–38 (1998)
4. Silva, J.M., Rahman, A.S.M.M., Saddik, A.E.: Web 3.0: A vision for bridging the gap between
real and virtual. In: Communicability MS 2008, Vancouver, BC, Canada, October 31, pp. 9–14
(2008)
5. Spivack, N.: The third-generation web is coming (2006),
http://www.kurzweilai.net/articles/
art0689.html?printable=1 (accessed on September 26, 2009)
6. Web 3.0 Definition, http://en.wikipedia.org/wiki/Semantic_Web
(accessed on July 9, 2008)
7. Yang, X., Wei, Q., Peng, X.: System architecture of library 2.0. The Electronic
Library 27(2), 283–291 (2009)
8. Santofimia, M.J., Fahlman, S.E., Moya, F., López, J.C.: A Common-Sense Planning
Strategy for Ambient Intelligence. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds.)
KES 2010, Part II. LNCS (LNAI), vol. 6277, pp. 193–202. Springer, Heidelberg (2010)
9. Mamady, D., Tan, G., Toure, M.L., Alfawaer, Z.M.: An Artificial Immune System Based
Multi-Agent Robotic Cooperation. In: Novel Algorithms and Techniques in
Telecommunications, Automation and Industrial Electronics, pp. 60–67 (2008),
doi:10.1007/978-1-4020-8737-0_12
10. Ford, A.J., Mulvehill, A.M.: Collaborating with Multiple Distributed Perspectives and
Memories, Social Computing and Behavioral Modeling (2009),
doi:10.1007/978-1-4419-0056-2_12
11. Mateo, R.M.A., Yoon, I., Lee, J.: Data-Mining Model Based on Multi-agent for the Intelligent
Distributed Framework. In: Nguyen, N.T., Jo, G.-S., Howlett, R.J., Jain, L.C. (eds.)
KES-AMSTA 2008. LNCS (LNAI), vol. 4953, pp. 753–762. Springer, Heidelberg (2008)
12. Chohra, A., Madani, K., Kanzari, D.: Fuzzy Cognitive and Social Negotiation Agent
Strategy for Computational Collective Intelligence. In: Nguyen, N.T., Kowalczyk, R. (eds.)
CCI I 2009. LNCS, vol. 6220, pp. 143–159. Springer, Heidelberg (2010)
13. Madani, K., Chohra, A., Bahrammirzaee, A., Kanzari, D.: SISINE: A Negotiation Training
Dedicated Multi-Player Role-Playing Platform Using Artificial Intelligence Skills. In:
Xhafa, F., Caballé, S., Abraham, A., Daradoumis, T., Juan Perez, A.A. (eds.) Computational
Intelligence for Technology Enhanced Learning. SCI, vol. 273, pp. 169–194. Springer,
Heidelberg (2010)
14. Daconta, M.C., Obrst, L.J., Smith, K.T.: The Semantic Web: A Guide to the Future of XML,
Web Services, and Knowledge Management, 1st edn., pp. 3–26. Wiley (2003)
15. Scarlat, E., Maries, I.: Towards an Increase of Collective Intelligence within Organizations
Using Trust and Reputation Models. In: Nguyen, N.T., Kowalczyk, R., Chen, S.-M. (eds.)
ICCCI 2009. LNCS (LNAI), vol. 5796, pp. 140–151. Springer, Heidelberg (2009)
16. Breslin, J.C., Passant, A., Decker, S.: Towards the Social Semantic Web. In: The Social
Semantic Web, pp. 271–283 (2009), doi:10.1007/978-3-642-01172-6_13
17. Boley, H., Osmun, T.M., Craig, B.L.: WellnessRules: A Web 3.0 Case Study in
RuleML-Based Prolog-N3 Profile Interoperation. In: Governatori, G., Hall, J., Paschke, A.
(eds.) RuleML 2009. LNCS, vol. 5858, pp. 43–52. Springer, Heidelberg (2009)
Using Fuzzy Reasoning Techniques
and the Domain Ontology
for Anti-Diabetic Drugs Recommendation
Abstract. In this paper, we use fuzzy reasoning techniques and the domain
ontology for anti-diabetic drugs selection. We present an anti-diabetic drugs
recommendation system based on fuzzy rules and the anti-diabetic drugs ontology
to recommend the medicine and the medicine information. The experimental
results show that the proposed anti-diabetic drugs recommendation system has a
good performance for anti-diabetic drugs selection.
1 Introduction
Clinical medicine expert systems have been presented for the past 40 years. The first
generation of clinical medicine expert systems is the MYCIN system [11] which can
diagnose infectious blood diseases. In recent years, some researchers [4], [5], [6], [8] have combined the domain ontology with expert systems. Ontology techniques [1], [5]-[6], [8]-[9] are a combination of artificial intelligence and machine-readable language that helps to share and reuse knowledge. They also involve natural language processing techniques and knowledge representation techniques. Ontology techniques can be used as channels of communication between human beings and systems, and can further be used for information retrieval and knowledge management. The more complete the framework of the domain ontology, the more complete the information that can be provided.
In this paper, we use fuzzy reasoning techniques and the domain ontology to build
an anti-diabetic drugs recommendation system. The experimental results show that
the proposed anti-diabetic drugs recommendation system has a good performance for
anti-diabetic drugs selection.
In 1965, Zadeh proposed the theory of fuzzy sets [12]. Let X be a universe of discourse, where X = {x1, x2, …, xn}. A fuzzy set A in the universe of discourse X can be represented as follows:

A = Σ_{i=1}^{n} μA(xi)/xi = μA(x1)/x1 + μA(x2)/x2 + … + μA(xn)/xn,    (1)

where μA is the membership function of the fuzzy set A, μA(xi) indicates the degree of membership of xi in the fuzzy set A, μA(xi) ∈ [0, 1], the symbol ″+″ is the union operator, the symbol ″/″ is the separator, and 1 ≤ i ≤ n.
Let A and B be two fuzzy sets in the universe of discourse U and let the membership functions of the fuzzy sets A and B be μA and μB, respectively. Then, the union of the fuzzy sets A and B, denoted as A ∪ B, is defined as follows [12]:

μ_{A∪B}(u) = max{μA(u), μB(u)},  ∀u ∈ U.    (2)
The intersection of the fuzzy sets A and B, denoted as A ∩ B, is defined as follows [2]:

μ_{A∩B}(u) = min{μA(u), μB(u)},  ∀u ∈ U.    (3)
Let us consider the following two fuzzy rules in the knowledge base of a fuzzy rule-based system:

IF X1 is A1 AND X2 is B1 THEN Z is C1,
IF X1 is A2 AND X2 is B2 THEN Z is C2,

where the observation is "X1 is x and X2 is y" and x and y are crisp values. According to [3], Mamdani's Max-Min operations for fuzzy reasoning are shown in Fig. 1. Then, the fuzzy rule-based system performs the defuzzification operations to get a crisp value z of the fuzzy reasoning result based on the center of gravity (COG) defuzzification operation, shown as follows [3]:

z = Σ_{i=1}^{k} μC(xi) xi / Σ_{i=1}^{k} μC(xi),    (4)

where μC is the membership function of the fuzzy set C, μC(xi) denotes the degree of membership of xi belonging to the fuzzy set C, xi ∈ w, and 1 ≤ i ≤ k.
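As a minimal illustration (not the authors' system), the following sketch applies Mamdani max-min inference with two rules and the COG defuzzification of Eq. (4); the triangular membership functions and the consequent fuzzy sets are illustrative assumptions only.

import numpy as np

def tri(x, a, b, c):
    # Triangular membership function with feet a, c and peak b.
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Output universe of discourse and two consequent fuzzy sets C1, C2.
w = np.linspace(0.0, 1.0, 201)
C1 = tri(w, 0.5, 0.8, 1.0)    # e.g. "Recommend"
C2 = tri(w, 0.0, 0.2, 0.5)    # e.g. "Not Recommend"

def mamdani_cog(x1, x2):
    # Rule firing strengths: min over the antecedent memberships (Max-Min inference).
    alpha1 = min(tri(x1, 4.0, 6.0, 8.0), tri(x2, 0.0, 0.5, 1.5))   # IF X1 is A1 AND X2 is B1
    alpha2 = min(tri(x1, 6.0, 9.0, 12.0), tri(x2, 1.0, 2.5, 4.0))  # IF X1 is A2 AND X2 is B2
    # Clip each consequent by its firing strength and aggregate with max (Eq. (2)).
    mu_C = np.maximum(np.minimum(alpha1, C1), np.minimum(alpha2, C2))
    # Center-of-gravity defuzzification (Eq. (4)).
    return float(np.sum(mu_C * w) / np.sum(mu_C)) if np.sum(mu_C) > 0 else 0.0

print(mamdani_cog(6.8, 0.6))   # crisp reasoning result for hypothetical crisp inputs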
4 Ontology Knowledge
Ontology is a knowledge representation method in the Semantic Web [9] and it includes three parts, i.e., concepts, relationships and instances [5], [15]. In recent years, some researchers have used the Web Ontology Language (OWL) to describe ontologies. OWL is based on XML, and the RDF syntax is used in OWL. OWL can be divided into three sublanguages [14], i.e., OWL Full, OWL DL and OWL Lite.
Fig. 2. Structure of the proposed recommendation system: the user (doctor) enters the diabetes test values, which are fuzzified and passed to the inference engine together with the fuzzy rule base; after defuzzification, the recommended medicine and its information are returned through the medicine ontology
In this paper, we adopt the clinical practice data of the American Association of
Clinical Endocrinologists Medical Guidelines [10]. Table 1 shows a fuzzy rule matrix to
infer the usability of the Metformin (MET) class anti-diabetic drugs, which contains 64 fuzzy rules with six attributes (HbA1c Test, Hypoglycemia Test, Renal Test, Heart Test, BMI Test and Liver Test) and three kinds of usability, i.e., Recommend (R), Not Recommend (NR) and Danger (D). HbA1c is a test that measures the amount of glycated hemoglobin in the blood; the renal test is based on creatinine (Cr), which is a breakdown product of creatine phosphate in muscle and is usually produced at a fairly constant rate by the body; the heart test is based on the functional classification of the New York Heart Association (NYHA), which provides a simple way of classifying the degree of heart failure; the weight test is based on the body mass index (BMI), or Quetelet index, which is a heuristic proxy for human body fat based on an individual's weight and height; the liver test is based on the liver's abnormal releases (GPT) [10]. Table 2 shows a fuzzy rule matrix to infer the usability of the Dipeptidyl peptidase-4 (DPP4) class anti-diabetic drugs. Table 3 shows a fuzzy rule matrix to infer the usability of the Thiazolidinedione (TZD) class anti-diabetic drugs. Table 4 shows a fuzzy rule matrix to infer the usability of the Glinide class anti-diabetic drugs. Table 5 shows a fuzzy rule matrix to infer the usability of the Sulfonylureas (SU) class anti-diabetic drugs. Table 6 shows a fuzzy rule matrix to infer the usability of the α-glucosidase (AGL) class anti-diabetic drugs.
Table 1. Fuzzy rule matrix to infer the usability of the Metformin class anti-diabetic drugs
Table 2. Fuzzy rule matrix to infer the usability of the DPP4 class anti-diabetic drugs
Table 3. Fuzzy rule matrix to infer the usability of the TZD class anti-diabetic drugs
HbA1c                              Normal                               Abnormal
Hypoglycemia                 No              Yes                   No              Yes
Renal                    Normal Abnormal Normal Abnormal    Normal Abnormal Normal Abnormal
Heart     BMI   Liver
Normal    Low   Normal      R      R       R      R          R      R       R      R
Normal    Low   Abnormal    NR     NR      NR     NR         NR     NR      NR     NR
Normal    High  Normal      NR     NR      NR     NR         NR     NR      NR     NR
Normal    High  Abnormal    NR     NR      NR     NR         NR     NR      NR     NR
Abnormal  Low   Normal      D      D       D      D          D      D       D      D
Abnormal  Low   Abnormal    D      D       D      D          D      D       D      D
Abnormal  High  Normal      D      D       D      D          D      D       D      D
Abnormal  High  Abnormal    D      D       D      D          D      D       D      D
Table 4. Fuzzy rule matrix to infer the usability of the Glinide class anti-diabetic drugs
Table 5. Fuzzy rule matrix to infer the usability of the SU class anti-diabetic drugs
Table 6. Fuzzy rule matrix to infer the usability of the AGL class anti-diabetic drugs
Fig. 3 shows the membership function curves for the HbA1c Test, Hypoglycemia Test, Renal Test, Heart Test, BMI Test and Liver Test, respectively.
Fig. 4 shows the membership function curves for the fuzzy sets “Danger (D)”,
“Not Recommended (NR)”, “Recommended (R)”.
Fig. 4. Membership function curves for the fuzzy sets “Danger”, “Not Recommended”, and
“Recommended”
6 Experimental Results
In this paper, we use the center of gravity method shown in formula (4) to deal with the
defuzzification process. Table 7 shows 20 patients’ data and their fuzzy reasoning results.
Table 8 shows the 20 patients’ data and their recommended levels. The recommended
levels are 3, 2 or 1, which stand for different levels of recommendation. If the
recommendation level is over 2, then the anti-diabetes drugs can be used. If the
recommendation level is below 2, then the anti-diabetes drugs should be used carefully.
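As a minimal illustration (not the authors' code), the defuzzified usability score can be mapped to these recommendation levels with the ranges given in the note below Table 7 (Recommended: 0.65-1, Not Recommended: 0.35-0.65, Danger: 0-0.35); the boundary handling here is an assumption.

def recommendation_level(score):
    if score >= 0.65:
        return 3    # Recommended
    if score >= 0.35:
        return 2    # Not Recommended
    return 1        # Danger

print(recommendation_level(0.80837))   # patient No. 1, MET column of Table 7 -> level 3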
Table 7. The 20 patients' data and their fuzzy reasoning results
HbA1c Hypoglycemia Renal Heart BMI Liver AGL DPP4 Glinide MET SU TZD
No.1 6.8 0 0.6 0 22 0 0.80837 0.80837 0.80837 0.80837 0.80837 0.80837
No.2 7.2 48 0.8 1 23 33 0.799297 0.799297 0.799297 0.799297 0.799297 0.799297
No.3 8.3 55 1.8 0 24.5 18 0.725865 0.799297 0.71293 0.658352 0.71293 0.735512
No.4 9.8 66 2.1 3 27 80 0.666831 0.791961 0.559477 0.483749 0.559477 0.208039
No.5 10 70 0.6 0 30 140 0.814551 0.814551 0.5 0.5 0.5 0.5
No.6 7.85 65 0.8 4 27 78 0.788143 0.788143 0.576484 0.605195 0.576484 0.211857
No.7 9 40 3.8 4 21 100 0.5 0.814551 0.5 0.235069 0.5 0.185449
No.8 11 65 3.9 2 25 78 0.5 0.794041 0.570937 0.242327 0.5 0.205959
No.9 6.5 60 1.5 3 26 80 0.785547 0.785547 0.601117 0.601117 0.601117 0.214453
No.10 11.5 68 2.7 2 24.8 130 0.554593 0.800086 0.5 0.272684 0.5 0.199914
No.11 7.9 0 2.1 4 23 100 0.660531 0.789427 0.5 0.389229 0.490959 0.210573
No.12 9.8 48 0.6 4 24.5 78 0.794041 0.794041 0.598304 0.598304 0.605367 0.205959
No.13 10 55 0.8 2 27 80 0.796089 0.796089 0.589085 0.589085 0.592071 0.203911
No.14 7.85 65 3.8 3 30 130 0.5 0.788143 0.5 0.235069 0.5 0.211857
No.15 9 60 3.9 2 27 98 0.5 0.785547 0.511751 0.242327 0.5 0.214453
No.16 6.75 40 3.9 3 30 18 0.5 0.80944 0.80944 0.242327 0.5 0.19056
No.17 6.9 65 1.5 0 27 80 0.796089 0.796089 0.569414 0.589085 0.569414 0.5
No.18 7.78 60 2.7 4 21 140 0.5654 0.785547 0.5 0.303666 0.5 0.214453
No.19 6.8 68 2.1 4 25 78 0.666831 0.791961 0.530927 0.491988 0.530927 0.208039
No.20 6.65 0 0.6 2 26 100 0.796089 0.796089 0.5 0.5 0.391938 0.203911
Note: The range of “Recommended” is between 0.65 and 1, the “Not Recommended” is
between 0.35 and 0.65, and the range of “Danger” is between 0 and 0.35.
Table 8. The 20 patients' data and their recommended levels
HbA1c Hypoglycemia Renal Heart BMI Liver AGL DPP4 Glinide MET SU TZD
No.1 6.8 0 0.6 0 22 0 3 3 3 3 3 3
No.2 7.2 48 0.8 1 23 33 3 3 3 3 3 3
No.3 8.3 55 1.8 0 24.5 18 3 3 3 3 3 3
No.4 9.8 66 2.1 3 27 80 3 3 2 2 2 1
No.5 10 70 0.6 0 30 140 3 3 2 2 2 2
No.6 7.85 65 0.8 4 27 78 3 3 2 2 2 1
No.7 9 40 3.8 4 21 100 2 3 2 1 2 1
No.8 11 65 3.9 2 25 78 2 3 2 1 2 1
No.9 6.5 60 1.5 3 26 80 3 3 2 2 2 1
No.10 11.5 68 2.7 2 24.8 130 2 3 2 1 2 1
No.11 7.9 0 2.1 4 23 100 3 3 2 2 2 1
No.12 9.8 48 0.6 4 24.5 78 3 3 2 2 2 1
No.13 10 55 0.8 2 27 80 3 3 2 2 2 1
No.14 7.85 65 3.8 3 30 130 2 3 2 1 2 1
No.15 9 60 3.9 2 27 98 2 3 2 1 2 1
No.16 6.75 40 3.9 3 30 18 2 3 3 1 2 1
No.17 6.9 65 1.5 0 27 80 3 3 2 2 2 2
No.18 7.78 60 2.7 4 21 140 2 3 2 1 2 1
No.19 6.8 68 2.1 4 25 78 3 3 2 2 2 1
No.20 6.65 0 0.6 2 26 100 3 3 2 2 2 1
Note: “3” denotes “Recommended”, “2” denotes “not recommended”, and “1”
denotes “Danger”.
Table 9 [10] shows the medicine names and their compositions for reducing HbA1c. We construct the ontology to represent the knowledge of the oral hypoglycemic agents. Protégé was used to construct the medicine ontology in the preliminary experiment because Protégé supports RDF (Resource Description Framework). The OWL DL (Description Logic) format [14] was adopted in this paper because OWL DL is based on the XML and RDF syntax. OWL DL supports those users who want maximum expressiveness while retaining computational completeness (i.e., all conclusions are guaranteed to be computable) and decidability (i.e., all computations will finish in finite time) [14].
Table 9. Medicine names and their compositions for reducing HbA1c

Medicine Class        Medicine Name (Generic Name / Brand Name)       Composition                                      HbA1c (%)
Sulfonylureas (SU)    Glyburide / Euglucon                            Glyburide (2.5mg, 5mg)                           0.9 - 2.5
                      Glipizide / Minidiab                            Glidiab (5mg)
                      Gliclazide / Diamicron                          Gliclazide (30mg, 80mg)
                      Glimepiride / Amaryl                            Glimepiride (1mg, 2mg)
Metformin (MET)       Metformin / Glucophage, Bentomin, Glucomine     Metformin HCl (500 mg, 850 mg)                   1.1 - 3.0
α-glucosidase (AGL)   Acarbose / Glucobay                             Acarbose (50mg)                                  0.6 - 1.3
Thiazolidinedione     Rosiglitazone / Avandia                         Rosiglitazone maleate (2mg, 4mg, 8mg)            1.5 - 1.6
(TZD)                 Pioglitazone / Actos                            Pioglitazone HCl (30mg)
Glinide               Repaglinide / NovoNorm                          NovoNorm (0.5mg, 1mg, 2mg)                       0.8
                      Nateglinide / Starlix                           Nateglinide (60mg)
DPP4 inhibitor        JANUMET                                         Sitagliptin phosphate (0.059mg, 0.05mg, 0.1mg)   0.7
PREFIX default: <http://www.owl-ontologies.com/Ontology1290657366.owl#>
SELECT ?藥物類型 ?藥物成分
WHERE { ?藥物類型 default:names ?藥物成分 }
ORDER BY ASC(?藥物成分) ASC(?藥物類型)

Here ?藥物類型 denotes the drug class and ?藥物成分 the drug composition.
Fig. 6 shows the SPARQL query results regarding the ingredients and medicine doses of the diabetic drugs. For example, when the user queries the medicine compositions, such as Acarbose, Gliclazide, …, and Sitagliptin phosphate, the system returns the medicine doses, i.e., Glucobay: 50mg, Diamicron: 30mg, …, and JANUMET: 0.1mg.
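As a minimal sketch (not the authors' system), the same kind of query can be executed locally with rdflib against the ontology exported from Protégé; the file name "medicine.owl" and the English variable names are hypothetical placeholders.

from rdflib import Graph

g = Graph()
g.parse("medicine.owl", format="xml")   # OWL DL / RDF-XML ontology built in Protégé

query = """
PREFIX default: <http://www.owl-ontologies.com/Ontology1290657366.owl#>
SELECT ?drug ?composition
WHERE { ?drug default:names ?composition }
ORDER BY ASC(?composition) ASC(?drug)
"""

for drug, composition in g.query(query):
    print(drug, "->", composition)       # e.g. Glucobay -> 50mg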
7 Conclusions
In this paper, we have used fuzzy reasoning techniques and the domain ontology for
anti-diabetic drugs selection. We have presented an anti-diabetic drugs recommendation
system based on fuzzy rules and the anti-diabetic drugs ontology to recommend the
medicine and the medicine information. The experimental results show that the proposed
anti-diabetic drugs recommendation system has a good performance for anti-diabetic
drugs selection.
Acknowledgment. The authors would like to thank Dr. Cho-Tsan Bau, Taichung
Hospital, Taichung, Taiwan, for his help during this work. This work was supported in
part by the National Science Council, Republic of China, under Grant
NSC100-2221-E-011-118-MY2.
References
1. Bobillo, F., Delgado, M., Gómez-Romero, J., López, E.: A Semantic Fuzzy Expert System
for a Fuzzy Balanced Scorecard. Expert Systems with Applications 36(1), 423–433 (2009)
2. Chen, S.M., Lee, S.H., Lee, C.H.: A New Method for Generating Fuzzy Rules from
Numerical Data for Handling Classification Problems. Applied Artificial Intelligence 15(7),
645–664 (2001)
3. Lee, C.C.: Fuzzy Logic in Control Systems: Fuzzy Logic Controller, Part II. IEEE
Transactions on Systems, Man, and Cybernetics 20(2), 419–435 (1990)
4. Lee, C.S., Wang, M.H.: A Fuzzy Expert System for Diabetes Decision Support Application.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 41(1), 139–153
(2011)
5. Lee, C.S., Wang, M.H., Hagras, H.: A Type-2 Fuzzy Ontology and Its Application to
Personal Diabetic-Diet Recommendation. IEEE Transactions on Fuzzy Systems 18(2),
374–395 (2010)
6. Lee, C.S., Jian, Z.W., Huang, L.K.: A Fuzzy Ontology and Its Application to News
Summarization. IEEE Transactions on Systems, Man, and Cybernetics-Part B:
Cybernetics 35(5), 859–880 (2005)
7. Misra, S., Roy, S., Obaidat, M.S., Mohanta, D.: A Fuzzy Logic-Based Energy Efficient
Packet Loss Preventive Routing Protocol. In: Proceedings of the 12th International
Conference on Symposium on Performance Evaluation of Computer & Telecommunication
Systems, SPECTS 2009, pp. 185–192 (2009)
8. Mao, Y., Wu, Z., Tian, W., Jiang, X., Cheung, W.K.: Dynamic Sub-Ontology Evolution for
Traditional Chinese Medicine Web Ontology. Journal of Biomedical Informatics 41(5),
790–805 (2008)
9. Quan, T.T., Hui, S.C., Fong, A.C.M.: Automatic Fuzzy Ontology Generation for Semantic
Help-Desk Support. IEEE Transactions on Industrial Informatics 2(3), 1551–3203 (2006)
10. Rodbard, H.W., Blonde, L., Braithwaite, S.S., Brett, E.M., Cobin, R.H., Handelsman, Y.,
Hellman, R., Jellinger, P.S., Jovanovic, L.G., Levy, P., Mechanick, J.I., Zangeneh, F.: American
Association of Clinical Endocrinologists Medical Guidelines for Clinical Practice for the
Management of Diabetes Mellitus. American Association of Clinical Endocrinologists 13, 1–68
(2007)
11. Shortliffe, E.H.: MYCIN: A Rule-Based Computer Program for Advising Physicians
Regarding Antimicrobial Therapy Selection. Technical Report, Department of Computer
Sciences, Stanford University, California (1974)
12. Zadeh, L.A.: Fuzzy Sets. Information and Control 8, 338–353 (1965)
13. Joseki - A SPARQL Server for Jena, http://joseki.sourceforge.net/
14. OWL Web Ontology Language Overview,
http://www.w3.org/TR/owl-features
15. Ontology, http://en.wikipedia.org/wiki/Ontology
16. SPARQL, http://www.w3.org/TR/rdf-sparql-protocol/
Content-Aware Image Resizing Based on Aesthetic
Abstract. This paper presents an image resizing system whose purpose is to resize images based on aesthetic composition, including the rule of thirds and the subject position. For the global operations, traditional scaling and non-traditional content-aware image resizing are both used, supplemented by photo rating for adaptive adjustment, to reduce user operation with clear quantitative criteria as the basis for adjustment. For non-traditional content-aware image resizing, two algorithms are used, Seam Carving for Content-Aware Image Resizing and Adaptive Content-Aware Image Resizing, whose path detection is adjusted with Otsu's method.
1 Introduction
Nowadays, the popularity of digital cameras allows people take picture everywhere,
but not every photo can be satisfactory, so more and more image modification
software has be develop, but general public are not professional photographers nor
photo retouching experts, the public want to modify the composition of their favorite
photos to the photographer used even meet the aesthetic point of view, so effective
and simple to use software to adjust the image has become a trend by the attention.
The focus of this paper is to reduce user operations, and bring out the traditional
image adjustment and the two non-traditional method of image adjustment, in
addition the subjective factors in the observed images and the background are the
main things twisted and deformed, and factors in the objective for using a different
platform assessment scores before and after image adjustment, used to justify what
was better, and integration of different scoring system in order to achieve adaptive
image adjustment purposes.
2 Literature Review
Each person has a different definition of aesthetics, and so-called aesthetic composition can also be said to be very subjective, but rules of thumb can be integrated into a publicly recognized standard. Hong-Cheng Kao proposed [1] several criteria under which aesthetic image composition is well defined and measurable, and Che-Hua Yeh and others [2] extended the original composition standards and developed an evaluation and ranking system that users can adopt for their shots. The photos selected in this study were analyzed by that system, too.
However, in terms of photo resizing, manual resizing is still deeply relied upon because of different individual subjective aesthetics. In 2007, Shai Avidan presented [3] a method that perceives the photo contents when resizing, so as to avoid the main object and reduce the damage to the overall composition. The next year, in 2008, Michael Rubinstein presented [4] a very good extension and application, but the photo resizing process still has some defects, such as insufficient protection of the contents or an unsatisfactory adjustment direction. For this problem, Yu-Shuen Wang used visual saliency to improve the image energy map [5]. For the path choice and efficiency of [3], Wei-Ming Dong [6] also made some improvements.
In this paper, an image resizing system is proposed, using content-aware image resizing [3] [6]. In addition, we give a threshold to the energy map, because these algorithms choose their paths based on the energy map [7]. The goal is to smoothen the smooth parts and reduce distortions of the main object. In aesthetic composition adaptation, we try traditional scaling in combination with non-traditional content-aware resizing, which makes the resized photos more acceptable to the general public.
First, the Photo Ranking System [2] gives a score to the original image according to the three composition elements. Then, through our system, an algorithm is chosen for resizing. After the adjustment is completed, [1][2] give the different subjective and objective scores, so that the image adjustment has a consistent aesthetic basis, as shown in Fig. 1 in Sec. 3.2. In this system, the aesthetic analysis background is described in [1][2], and content-aware image resizing [3][6] is coupled with the threshold selection method [7] to adjust the weights of the energy map for comparison and analysis.
The chosen images detect the horizon with the algorithm mentioned in [1], where r is the distance from the horizon to the upper edge, DH is the horizontal line of the rule of thirds in the composition range, σHr is the standard deviation of the rule of thirds, sHr is the original score of the horizontal line, and the final score for the rule of thirds is sHb.
If DH1.6 > r > DH2.0, then the score sHb is 1.0. If r is not in this region, then sHb is the value with the maximum absolute value among the three equations for sHr below. The closer sHr is to 1.0, the closer the horizon is to the rule of thirds; when the score is 1.0, the horizon matches the rule of thirds.

σHr = (DH1.6 − DH2.0) / 2    (5)

s_{Hr=1.0} = −exp(−(D − DH1.0)² / (2 σHr²))    (6)

s_{Hr=1.6} = −exp(−(D − DH1.6)² / (2 σHr²))    (7)

s_{Hr=2.0} = −exp(−(D − DH2.0)² / (2 σHr²))    (8)
These equations calculate the score for the image. The magnitude of each adjustment of the system is one-tenth of the image height, defined here as DR = [image height] / 10, i.e., the number of image pixels adjusted each time by the [3] [6] algorithms. After each adjustment the score is recalculated, and when the score is 1.0, the adjustments are finished.
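A minimal sketch (not the authors' code) of this horizon score is given below: the measured horizon position is tested against the rule-of-thirds band and otherwise scored with Eqs. (5)-(8), while the image is adjusted in steps of DR = image height / 10. The example positions passed to the function are placeholder assumptions; the actual DH values are defined in [1].

import math

def horizon_score(d, d_h10, d_h16, d_h20):
    # d is the measured horizon position (the D / r of the text); d_h10, d_h16, d_h20
    # are the DH1.0, DH1.6 and DH2.0 reference lines.
    if d_h16 > d > d_h20:                       # horizon already inside the target band
        return 1.0
    sigma = (d_h16 - d_h20) / 2.0               # Eq. (5)
    scores = [-math.exp(-(d - ref) ** 2 / (2 * sigma ** 2))   # Eqs. (6)-(8)
              for ref in (d_h10, d_h16, d_h20)]
    return max(scores, key=abs)                 # value with maximum absolute value

image_height = 600
dr = image_height // 10                         # pixels removed or added per step
print(horizon_score(d=300, d_h10=200, d_h16=230, d_h20=180))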
For an image whose main subject deviates from the Power Weight Points, the [3] [6] algorithms are used for resizing, adjusting by DR each time. We use fROT from [2] to score the subject against the four Power Weight Points, where Ai is the subject size, Si is the saliency value of the subject, and Di is the distance between the subject and the Power Weight Point (standard deviation σ = 0.17). The closer the subject is to a Power Weight Point, the higher the score:

fROT = (1 / Σ_i Ai Si) Σ_i Ai Si exp(−Di² / (2σ²))    (9)
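A minimal sketch (not the authors' code) of Eq. (9) as reconstructed above: each subject i contributes its area Ai, saliency Si and distance Di to the nearest Power Weight Point; the numbers in the usage line are hypothetical.

import math

def f_rot(areas, saliencies, distances, sigma=0.17):
    weights = [a * s for a, s in zip(areas, saliencies)]
    total = sum(weights)
    return sum(w * math.exp(-d ** 2 / (2 * sigma ** 2))
               for w, d in zip(weights, distances)) / total

# One dominant subject fairly close to a Power Weight Point (normalized coordinates).
print(f_rot(areas=[0.12], saliencies=[0.9], distances=[0.1]))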
After the image is resized based on the rule of thirds and the subject relocation, we need to observe whether the background and the subject have been destroyed or distorted.
Pi = fi / N    (10)

μ1 = Σ_{i=1}^{t} Pi / ω1(t),   μ2 = Σ_{i=t+1}^{L} Pi / ω2(t)    (12)

ω1 μ1 + ω2 μ2 = μT    (13)

σB² = ω1 (μ1 − μT)² + ω2 (μ2 − μT)²    (14)
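A minimal sketch (not the authors' code) of Otsu's threshold selection [7] in the standard formulation behind Eqs. (10)-(14): the threshold maximizes the between-class variance σB² computed from the normalized histogram Pi = fi / N.

import numpy as np

def otsu_threshold(values, bins=256):
    hist, edges = np.histogram(values, bins=bins)
    p = hist.astype(float) / hist.sum()                  # Eq. (10)
    mu_T = np.sum(np.arange(bins) * p)                   # global mean
    best_t, best_var = 0, -1.0
    for t in range(1, bins):
        w1, w2 = p[:t].sum(), p[t:].sum()                # class probabilities
        if w1 == 0 or w2 == 0:
            continue
        mu1 = np.sum(np.arange(t) * p[:t]) / w1          # class means
        mu2 = np.sum(np.arange(t, bins) * p[t:]) / w2
        var_b = w1 * (mu1 - mu_T) ** 2 + w2 * (mu2 - mu_T) ** 2   # Eq. (14)
        if var_b > best_var:
            best_var, best_t = var_b, t
    return edges[best_t]                                 # threshold T in the value range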
After the aesthetic analysis, the resizing of the different parts of the content uses the non-traditional content-aware image resizing [3] [6] and the traditional scaling; for [3] [6] in particular, we additionally apply Otsu's method to the path selection, i.e., the energy map is adjusted.
The main idea is to use the Sobel edge detection of the original content-aware resizing to compute the energy map e(I):

e(I) = |∂I/∂x| + |∂I/∂y|    (16)

However, instead of applying a binary threshold to the energy map e(I), the optimal threshold value T for the energy map is calculated and substituted into [3][4]: the part below T is zeroed and the values higher than T are kept:

e(i, j) = 0         if e(i, j) ≤ T,
e(i, j) = e(i, j)   if e(i, j) > T.    (17)
This makes the path choice of content-aware image resizing avoid the subject more easily and select the smooth parts as the path. The overall energy sum of the selected seam will differ (because part of the energy along the path is zero), so the optimal seam will be different. Making the smooth parts easier to identify is our main purpose.
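A minimal sketch (not the authors' code) of Eqs. (16)-(17) combined with a basic vertical-seam selection in the spirit of seam carving [3]: the gradient-magnitude energy is thresholded with the otsu_threshold function sketched above, so that smooth regions become zero-energy and attract the seam.

import numpy as np

def energy_map(gray):
    dy, dx = np.gradient(gray.astype(float))
    return np.abs(dx) + np.abs(dy)                       # Eq. (16)

def thresholded_energy(gray):
    e = energy_map(gray)
    T = otsu_threshold(e.ravel())
    e[e <= T] = 0.0                                      # Eq. (17)
    return e

def vertical_seam(e):
    # Dynamic programming: minimal-cost 8-connected path from top to bottom.
    h, w = e.shape
    cost = e.copy()
    for i in range(1, h):
        left = np.r_[np.inf, cost[i - 1, :-1]]
        right = np.r_[cost[i - 1, 1:], np.inf]
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    seam = [int(np.argmin(cost[-1]))]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(0, j - 1), min(w, j + 2)
        seam.append(lo + int(np.argmin(cost[i, lo:hi])))
    return seam[::-1]                                    # one column index per row

# Usage: seam = vertical_seam(thresholded_energy(gray_image))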
After we resize the image, we score it by [1] and [2] respectively and observe the distortion and destruction of the background and the main object, which is used to judge which resized image is better.
4 Experiments
We score and analyze images based on the rule of thirds and the subject location with different coping strategies. First, we choose an image whose horizon line is in the middle, and then use the non-traditional content-aware resizing algorithms to adjust it. In the experiments, we adjust the energy map both with and without a given threshold to compare the different results.
Fig. 3. The horizon adjusted to the rule of thirds with the threshold value T not set
(a) Original image: the horizon is in the middle of the image.
(b) The original image's energy distribution.
(c) Reduce pixels by Seam Carving resizing to adjust the horizon.
(d) Reduce pixels by Adaptive resizing to adjust the horizon.
(e) Increase pixels by Seam Carving resizing to adjust the horizon.
(f) Increase pixels by Adaptive resizing to adjust the horizon.
From Fig. 3 we can observe that, when the [3] [6] algorithms make adjustments with no threshold set on the energy map, the horizontal line cannot be successfully adjusted to the rule of thirds during the adjustment process. Because the energy of the silhouette at the lower right is too close to the background energy, the adjustment process chooses to adjust the lower-right silhouette instead, and the purpose we want, adjusting the horizontal line, cannot be achieved.
The original energy map is then given the threshold T, and the adjustment is observed again. In Fig. 4 we can see that the clouds at the top of the picture become smoother after using the threshold. The original [3][6] algorithms would adjust the silhouette as in Fig. 3; after the threshold value is given, the algorithms adjust the clouds instead, and the horizontal line can then be adjusted to the rule of thirds.
Fig. 4. The horizon adjusted to the rule of thirds with the threshold value T set
(a) Original image: the horizon is in the middle of the image.
(b) The original image's energy distribution with the threshold value T given.
(c) Reduce pixels by Seam Carving resizing to adjust the horizon.
(d) Reduce pixels by Adaptive resizing to adjust the horizon.
(e) Increase pixels by Seam Carving resizing to adjust the horizon.
(f) Increase pixels by Adaptive resizing to adjust the horizon.
Fig. 5. The subject adjusted to a Power Weight Point with the threshold value not set
(a) Original image: the subject is in the center.
(b) The original image's energy distribution.
(c) Reduce pixels by Seam Carving resizing to adjust the subject location.
(d) Reduce pixels by Adaptive resizing to adjust the subject location.
(e) Increase pixels by Seam Carving resizing to adjust the subject location.
(f) Increase pixels by Adaptive resizing to adjust the subject location.
Fig. 6. The subject adjusted to a Power Weight Point with the threshold value set
(a) Original image: the subject is in the center.
(b) The original image's energy distribution with the threshold value T given.
(c) Reduce pixels by Seam Carving resizing to adjust the subject location.
(d) Reduce pixels by Adaptive resizing to adjust the subject location.
(e) Increase pixels by Seam Carving resizing to adjust the subject location.
(f) Increase pixels by Adaptive resizing to adjust the subject location.
From Fig. 5 and Fig. 6 we can see that, because the energy of the subject is higher than that of the background, whether or not the threshold is given has little effect on the adjustment process of relocating the subject. In the process of increasing or decreasing pixels, decreasing pixels is better than increasing pixels: decreasing pixels requires fewer operations, whereas increasing pixels causes the amplified background to become too blurred, or the background is amplified over the main subject so that the proportion of the subject in the overall image declines.
When the horizontal line of Fig. 3 is adjusted and remains near the middle, the resulting sHb is close to the worst case −1, whereas in the adjusted Fig. 4 the score of the horizontal line is close to 1.0.
Fig. 5 and Fig. 6 are not very different; the table below presents and compares them.
fROT is the ranking system score. In both Fig. 5 and Fig. 6, (b) is the energy map, for which the image is not graded. From the table above we can clearly see that the scores of the images whose subject location is adjusted by reducing the image pixels, (c) and (d), improve significantly over the unadjusted image (a). When pixels are increased to adjust the subject location, (e) and (f), the increase in fROT is smaller. Moreover, comparing Fig. 6 (threshold T given) with Fig. 5 (threshold T not given), the score of Fig. 6 (c) is higher than that of Fig. 5 (c); that is, the [3] algorithm is improved by the threshold given by [7]. In (e) and (f), on the contrary, the scores decline. Likewise, we find that the scores of subject relocation by reducing pixels, (c) and (d), are higher than those by increasing pixels, (e) and (f); adjusting the subject location by increasing pixels causes background over-amplification and blur. Thus we conclude that, to adjust the position of the subject toward the four Power Weight Points, the methods of [3] and [6] with pixel reduction are the better way, and if [3] is chosen for adjustment, the threshold T [7] can be given to increase the performance.
5 Conclusion
According to the experiments, in horizon adjustment, when the subject energy and the background energy are too close (Fig. 5, 6 (a), original images), using content-aware resizing to adjust the horizon often damages and distorts the subject image and the horizon line. After the threshold is given to reduce the non-subject energy, content-aware resizing can more easily select the background as the path.
Seam Carving for Content-Aware Image Resizing with the threshold can improve the results over the original Seam Carving for Content-Aware Image Resizing; for Adaptive Content-Aware Image Resizing the effect is less obvious. Finally, reducing pixels is better than increasing pixels, because the increase-pixel process causes the amplified background to become too blurred, or the background is amplified over the main subject so that the proportion of the subject in the overall image declines.
The above study found that, after a given threshold, content-aware image resizing algorithms can protect the subject. Different images need different adjustment methods to meet the aesthetic composition. The system, which handles different types of images with traditional and non-traditional methods, makes the adjustment more convenient and complete, and we hope that in the future the system can be developed to be more efficient and complete, so that each image can easily be adjusted to meet the aesthetic composition and each individual can have images as beautiful as if a photographer had shot them.
References
1. Kao, H.-C.: Esthetics-based Quantitative Analysis of Photo Composition. Master’s Thesis,
Department Computer Science & information Engineering, National Taiwan University,
R.O.C (2008)
2. Yeh, C.-H., Ho, Y.-C., Barsky, B.A., Ouhyoung, M.: Personalized Photograph Ranking and
Selection System. In: Proceedings of the International Conference on Multimedia (2010)
3. Avidan, S., Shamir, A.: Seam Carving for Content-Aware Image Resizing. ACM
Transactions on Graphics (New York, NY, USA) 26(3) (July 2007)
4. Rubinstein, M., Shamir, A., Avidan, S.: Improved Seam Carving for Video Retargeting.
ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH (New York,
NY, USA) 27(3) (August 2008)
5. Wang, Y.-S., Tai, C.-L., Sorkine, O., Lee, T.-Y.: Optimized Scale-and-Stretch for Image
Resizing. ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH Asia
27(5) (December 2008)
6. Dong, W., Paul, J.-C.: Adaptive Content-Aware Image Resizing, vol. 28(2). The Eurographics
Association and Blackwell Publishing Ltd. (2009)
7. Otsu, N.: A Threshold Selection Method. IEEE Transactions on Systems, Man, and Cybernetics,
62–66 (1979)
A Functional Knowledge Model and Application
1 Introduction
Knowledge representation is one of the important branches of artificial intelligence. To build expert systems or knowledge-based problem solving systems, we need to design a knowledge base and an inference engine. The quality of intelligent systems depends greatly on their knowledge. Therefore, researching and developing methods for representing knowledge and reasoning mechanisms has great significance in the theory and applications of artificial intelligence.
Nowadays, many different knowledge models have been proposed and are widely applied in many fields. The knowledge models mentioned in [1], [2], [5], [10], such as semantic networks, rule-based systems, conceptual graphs, computational networks, COKB, etc., give humans many different methods for designing knowledge bases. The components of knowledge represented by those models are rather varied, such as concepts, operators, laws, etc. However, human knowledge contains many other components that have not been studied fully for representation. One of those components is functions and their computing relations. Therefore, in this paper we present a knowledge model for representing a knowledge domain related to functions and their relations. This model is a method for designing a functional knowledge base for automatically solving problems of inference and calculation on functions. Simplification is a basic and important problem in this knowledge domain. This paper also proposes a
technique for automatically solving this problem by means of defining the length of expressions.
One of the most practical applications of this model is the knowledge domain of trigonometric expressions. Automated and readable simplification of trigonometric expressions is an important issue that has not been completely solved by current computer algebra systems. In [11], the authors presented a technique for solving this problem by means of combination rules. However, that technique can simplify trigonometric polynomials only. Therefore, based on the proposed model and algorithm, this paper also presents the design of a program that simplifies trigonometric expressions automatically. The program is implemented in Maple and C#. It can simplify not only trigonometric polynomials but also other forms of trigonometric expressions, and it is able to give the result and a step-by-step solution explaining the reasoning process, which is very useful for teachers and students in teaching and studying.
The functional knowledge model is a contribution to the research and development of methods for representing knowledge on computers.
Example 2: f = sin(x) / (sin(x) − cot(x)² − cos(x)²) − cos(x)² / (2 sin(x)² − 1) ⇒ length(f) = 10.
3.1 Definitions
Here are some definitions that are necessary for the design of the automatic simplification algorithm. Given FKM (Cf, Ops, Rules, Vals):
Definition 1. Given two expressions exprf, exprg ∈ <Cf>, exprf is "simpler than" exprg if length(exprf) < length(exprg).
Definition 3.
− A rule r in Rules is called a "simplifying rule" if length(lhs(r)) > length(rhs(r)), where lhs(r) and rhs(r) are the left-hand side and right-hand side of r.
The set of simplifying rules is denoted Rrg:
Rrg = {r ∈ Rules | length(lhs(r)) > length(rhs(r))}
Based on Definition 3, the set Rules in FKM consists of Rrg and Rkt:
Rules = Rrg ∪ Rkt
3.2 Algorithms
Here are the simple simplification algorithm and the simple expansion algorithm. The idea of both algorithms is to perform the inference process combined with heuristic rules, to increase the speed of problem solving and to achieve a better solution (a sketch follows the lists below).
Some heuristic rules used for selecting a simplifying rule are listed below:
− Choose a rule whose right-hand side is a constant.
− Choose a rule that introduces no new function.
− Choose a rule that involves functions occurring in the expression.
− Choose the rule whose left-hand side is longest and whose right-hand side is shortest.
Some heuristic rules used for selecting an expansion rule are listed below:
− Choose a rule that introduces no new function.
− Choose a rule that involves functions occurring in the expression.
− Choose the rule that makes the expression shortest after it is applied.
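To make the procedure concrete, here is a minimal, hypothetical Python sketch of the greedy simplification loop described above. The rule set, the token-based `length` function and the string-based `apply_rule` helper are illustrative stand-ins for the paper's FKM components, not the authors' implementation.

```python
# Illustrative sketch of the simplification loop over simplifying rules (Rrg).
# length(), applicable() and apply_rule() are hypothetical placeholders.

def length(expr):
    """Length of an expression: here simply the number of tokens."""
    return len(expr.split())

def applicable(rule, expr):
    """A rule is applicable if its left-hand side occurs in the expression."""
    lhs, _ = rule
    return lhs in expr

def apply_rule(rule, expr):
    """Rewrite every occurrence of the rule's left-hand side."""
    lhs, rhs = rule
    return expr.replace(lhs, rhs)

def simplify(expr, simplifying_rules):
    """Greedily apply the rule with the longest lhs and shortest rhs
    (one of the heuristics above) while it shortens the expression."""
    changed = True
    while changed:
        changed = False
        candidates = [r for r in simplifying_rules if applicable(r, expr)]
        candidates.sort(key=lambda r: (-length(r[0]), length(r[1])))
        for rule in candidates:
            new_expr = apply_rule(rule, expr)
            if length(new_expr) < length(expr):
                expr, changed = new_expr, True
                break
    return expr

rules = [("sin(x)^2 + cos(x)^2", "1"),          # simplifying: lhs longer than rhs
         ("1 + cot(x)^2", "1/sin(x)^2")]
print(simplify("sin(x)^2 + cos(x)^2 + 1", rules))
```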
4 Application
Example 3: Simplify f = sin(x)² (1 + cot(x)) + cos(x)² (1 + tan(x)) with 0 < x < π/2.
Solution:
Step 1: Apply tan(x) = sin(x)/cos(x).
Then: f = sin(x)² (1 + cot(x)) + cos(x)² (1 + sin(x)/cos(x))
Step 2: Apply cot(x) = cos(x)/sin(x).
Then: f = sin(x)² (1 + cos(x)/sin(x)) + cos(x)² (1 + sin(x)/cos(x))
Step 3: Product to sum.
Then: f = cos(x)² + 2 cos(x) sin(x) + sin(x)²
solution, or give a bad result, or cannot simplify at all in some cases. This shows that the ability of our program is better than Maple's: its solutions are given step by step and are close to human thought.
Result of Maple: 4
Result of program: 4
Solution:
Step 1: Apply (tan(x) + cot(x))² = cot(x)² + 2 tan(x) cot(x) + tan(x)²
Example 5: Simplify f = sin(x)² (1 + cot(x)) + cos(x)² (1 + tan(x)) with 0 < x < π/2.
Result of Maple: 2 cos( x) sin( x ) + 1
Result of program: f = cos( x) + sin( x)
Solution: (see example 3)
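As a quick, independent sanity check of Examples 3 and 5 (not part of the authors' system), the same expression can be fed to an open-source CAS; whether the output matches the program's step-by-step result depends on the simplifier used.

```python
# Checking the expression from Examples 3/5 with SymPy (illustrative only).
from sympy import symbols, sin, cos, tan, cot, trigsimp

x = symbols('x', positive=True)            # 0 < x < pi/2 in the examples
f = sin(x)**2 * (1 + cot(x)) + cos(x)**2 * (1 + tan(x))
print(trigsimp(f))                         # compare with the program's cos(x) + sin(x)
```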
Example 6: Simplify f = (cos(x)² − sin(y)²) / (sin(x)² sin(y)²) − cot(x)² cot(y)²
Result of Maple: (−cos(x)² + 1 − cos(y)² + cos(x)² cos(y)²) / (sin(x)² sin(y)²)
Result of program: −1
Solution:
Step 1: Apply cot(x) = cos(x)/sin(x).
Then: f = (cos(x)² − sin(y)²) / (sin(x)² sin(y)²) − cos(x)² cot(y)² / sin(x)²
Step 2: Apply cot(y) = cos(y)/sin(y).
Then: f = (cos(x)² − sin(y)²) / (sin(x)² sin(y)²) − cos(x)² cos(y)² / (sin(x)² sin(y)²)
Step 3: Product to sum.
Then: f = cos(x)² / (sin(x)² sin(y)²) − 1/sin(x)² − cos(x)² cos(y)² / (sin(x)² sin(y)²)
Step 4: Common factor.
Then: f = −cos(x)² (−1 + cos(y)²) / (sin(x)² sin(y)²) − 1/sin(x)²
Step 5: Apply −1 + cos(y)² = −sin(y)².
Then: f = cos(x)²/sin(x)² − 1/sin(x)²
Step 6: Common factor.
Then: f = (−1 + cos(x)²) / sin(x)²
Step 7: Apply −1 + cos(x)² = −sin(x)².
Then: f = −1
5 Conclusion
The functional knowledge model is a new method for representing knowledge domains related to functions and their computing relations. This model is suitable for designing a knowledge base of functions in many different applications. With its explicit structure, the model can be used to build problem-solving modules. It contributes to the development of methods for representing human knowledge on computers.
The simplification technique above is based on the length of the expression. It helps to solve the problem of automatically simplifying expressions in many practical knowledge domains with many forms of expressions.
The program for automatic trigonometric simplification has been tested on many trigonometry exercises from mathematics textbooks in higher education. This program can simplify many forms of expressions and provide solutions that are natural, accurate, and similar to human reasoning.
References
1. Sowa, J.F.: Knowledge Representation - Logical, Philosophical, and Computational
Foundations. Brooks/Cole, California (2000)
2. Tim Jones, M.: Artificial Intelligence - A Systems Approach. Infinity Science Press LLC
(2008)
3. Tyugu, E.: Algorithms and Architectures of Artificial Intelligence. IOS Press (2007)
4. Lakemeyer, G., Nebel, B.: Foundations of Knowledge Representation and Reasoning.
Springer, Heidelberg (1994)
5. van Harmelen, F., Lifschitz, V., Porter, B. (eds.): Handbook of Knowledge Representation.
Elsevier (2008)
6. Calmet, J., Tjandra, I.A.: Representation of Mathematical Knowledge. In: Raś, Z.W.,
Zemankova, M. (eds.) ISMIS 1991. LNCS, vol. 542, pp. 469–478. Springer, Heidelberg
(1991)
7. Brewster, C., O’Hara, K.: Knowledge Representation with Ontologies: The Present and
Future. IEEE Intelligent Systems 19(1), 72–73 (2004)
8. Sabine, B., Gills, K., Gills, M.: An Ontological approach to the construction of problem-
solving models, Laria Research Report: LRR 2005-03 (2005)
9. Van Nhon, D.: A system that supports studying knowledge and solving of analytic geometry
problems. In: 16th World Computer Congress, Proceedings of Conference on Education Uses
of Information and Communication Technologies, Beijing, China, pp. 236–239 (2000)
10. Do, N.: An Ontology for Knowledge Representation and Applications. Proceeding of
World Academy of Science, Engineering and Technology 32 (2008)
11. Fu, H., Zhong, X., Zeng, Z.: Automated and readable simplification of trigonometric
expressions. Mathematical and Computer Modeling 44, 1169–1177 (2006)
Local Neighbor Enrichment
for Ontology Integration
Trong Hai Duong1 , Hai Bang Truong2 , and Ngoc Thanh Nguyen3
1
Faculty of Mathematics and Informatics,
Quangbinh University, Vietnam
[email protected]
2
University of Information Technology,
VNU-HCM, Vietnam
[email protected]
3
Institute of Informatics,
Wroclaw University of Technology, Poland
[email protected]
Abstract. The main aim of this research is to enrich conceptual semantics by expanding local conceptual neighbors. The approach consists of two phases: a neighbor enrichment phase and a matching phase. The enrichment phase is based on an analysis of the extensional semantics of the ontologies. The extension we make use of in this work is a contextually expanded neighbor generated for each concept from external knowledge sources such as WordNet, ODP, and Wikimedia. Outputs of the enrichment phase are two sets of contextually expanded neighbors belonging to the two corresponding ontologies, respectively. The matching phase calculates similarities between these contextually expanded neighbors, which yields decisions on which concepts are to be matched.
1 Introduction
In the literature, there are many definitions of ontology integration, but the definition given by [16] is the one referred to in this research: ontology integration is defined as the process of finding commonalities between two different ontologies O and O′ and deriving a new ontology O* that facilitates interoperability between computer systems that are based on the O and O′ ontologies. The new ontology O* may replace O or O′, or it may be used only as an intermediary between a system based on O and a system based on O′. Finding ontological commonality is a very complex task, since ontologies have various characteristics, e.g., the same concept but different names, the same name but different concepts, overlapping but different concepts, multiple forms of the same concept, and multiple concepts of the same form [6]. Although several efforts in ontology integration have already been contributed, they have different focuses, assumptions, and limitations. According
to the literature, the ontological similarity techniques that have been explored for finding commonalities can be classified into the following four methods: instance-based similarity, in which the similarity between concepts is measured on the basis of the concepts' common instances [4]; lexical-based similarity, in which the similarity between two concepts is determined by analyzing the linguistic meanings of their associated names [8,9]; schema-based similarity, in which the similarity between two concepts is determined by analyzing the similarity between their associated properties [1]; and taxonomy-based similarity, which is found by analyzing the structural relationships between concepts, such as subsumption [7].
The main aim of this research is to enrich conceptual semantics by expanding local conceptual neighbors. The approach consists of two phases: a neighbor enrichment phase and a matching phase. The enrichment phase is based on an analysis of the extensional semantics of the ontologies. The extension we make use of in this work is a contextually expanded neighbor generated for each concept from external knowledge sources such as WordNet1, ODP2, and Wikimedia3. The intuition is that, given two ontologies which should be matched, we construct a contextually expanded neighbor for each concept. The neighbor is first derived from external knowledge sources, as it reflects the domain-independent meaning of the corresponding concept, and is called the domain-independent concept neighbor or global neighbor. Then, this independent neighbor is matched against the local neighbor (extracted from the corresponding ontology) to generate the corresponding contextually expanded neighbor. Outputs of the enrichment phase are two sets of contextually expanded neighbors, which belong to the two corresponding ontologies, respectively. The matching phase calculates similarities between these contextually expanded neighbors, which yields decisions on which concepts are to be matched.
2 Related Works
To enrich the semantics of an ontology, most previous works focus on using external knowledge to generate an enrichment structure, such as a feature vector [17] or a forest [15], for representing each concept.
The main task of the work in [17] for the semantic enrichment structure is to generate a feature vector for each concept, which comes out as the result of an extensional analysis of the relevant ontologies.
In particular, each concept is considered as a query to collect relevant documents, which are then assigned to the query concept. However, if documents have already been assigned to specific concepts as their instances, this step can be skipped and the feature vectors constructed for the concepts directly. There are two steps to construct the feature vector for each concept. First, the tf/idf vector space model is used to construct a generalization of the documents. Second, for each leaf concept the feature vector is calculated as the feature vector of the set of documents associated with the concept; for each non-leaf concept, the feature vector is calculated by taking into account contributions from the documents that have been assigned to the concept and to its direct sub-concepts. Similarities between concepts across different ontologies are determined by the similarities of their feature vectors. The semantic enrichment process of this approach is shown in Fig. 1. This approach makes the strong assumption that the documents retrieved by querying a concept name are relevant to the concept. However, even if this assumption is satisfied, the representative feature of a concept captures only the common, domain-independent understanding of the corresponding concept. When concepts are considered in a specific context within a specific domain ontology, this approach will lead to mismatching, as mentioned above.
1 http://wordnet.princeton.edu/
2 http://www.dmoz.org/
3 http://en.wikipedia.org
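A minimal sketch of this feature-vector idea, assuming each concept already has a small set of associated documents; scikit-learn's TfidfVectorizer and cosine similarity stand in for the original implementation of [17], and the concept names and documents are invented for illustration.

```python
# Sketch of concept feature vectors built with tf/idf and compared by
# cosine similarity (illustrative reconstruction of the idea in [17]).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# hypothetical documents assigned to concepts of two ontologies
concept_docs = {
    ("O1", "Author"):      "person who writes books and articles",
    ("O1", "Publication"): "book article journal proceedings",
    ("O2", "Writer"):      "a person writing novels articles books",
    ("O2", "Paper"):       "article published in a journal or proceedings",
}

keys = list(concept_docs)
vectors = TfidfVectorizer().fit_transform(concept_docs[k] for k in keys)
sim = cosine_similarity(vectors)

# report cross-ontology concept pairs whose similarity exceeds a threshold
for i, ki in enumerate(keys):
    for j, kj in enumerate(keys):
        if ki[0] == "O1" and kj[0] == "O2" and sim[i, j] > 0.3:
            print(ki[1], "<->", kj[1], round(float(sim[i, j]), 2))
```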
Another system [15], called BLOOMS, is based on the idea of bootstrapping information from the LOD cloud. This approach utilizes the Wikipedia category hierarchy. Two ontologies, which are assumed to contain schema information, are to be mapped by BLOOMS. It proceeds with the following steps:
– Pre-processing: The extraction of concepts from the input ontologies is performed by removing property restrictions, individuals, and properties. Each concept name is tokenized to obtain a list of all simple words contained within it, with stop words removed.
– Construction: Each word belonging to a concept is the root of a tree constructed using information from Wikipedia (a sketch of this step follows the list). The depth of the tree is limited to four, based on the empirical observation that depths beyond four typically include very general categories, which are not useful for alignment. Therefore, each concept is represented by a forest containing the trees rooted at its tokenized words.
– Matching: The comparison of the constructed BLOOMS forests is executed to determine which conceptual names are to be aligned.
– Post-processing: The results are enhanced using the Alignment API and other reasoners.
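The fragment below sketches only the Construction step, under the assumption of a hypothetical `parent_categories(term)` lookup (the real system queries the Wikipedia category hierarchy); it is not the BLOOMS code.

```python
# Sketch of a BLOOMS-style tree construction, depth limited to four.
# parent_categories() is a hypothetical stand-in for a Wikipedia category lookup.

def parent_categories(term):
    toy_hierarchy = {
        "journal": ["periodical"], "periodical": ["publication"],
        "publication": ["media"], "media": ["culture"],
    }
    return toy_hierarchy.get(term, [])

def build_tree(word, depth=0, max_depth=4):
    """Return a (word, children) tree of category ancestors, depth <= 4."""
    if depth == max_depth:
        return (word, [])
    children = [build_tree(c, depth + 1, max_depth) for c in parent_categories(word)]
    return (word, children)

def concept_forest(concept_name, stop_words=("of", "the", "a")):
    """Tokenize a concept name and build one tree per remaining word."""
    words = [w for w in concept_name.lower().split() if w not in stop_words]
    return [build_tree(w) for w in words]

print(concept_forest("Journal of the Media"))
```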
Both aforementioned approaches use additional knowledge, such as Wikipedia or a text corpus, to enrich concept semantics, which then determines which concepts between the two ontologies to align. However, these methods do not consider the context of a concept, which leads to many mismatched concepts.
3 Methodologies
Fig. 3. Pre-Processing
4 Experiments
4.1 Data Sets
We performed a comprehensive evaluation of the proposed system using third-party datasets and other state-of-the-art systems in ontology matching. More specifically, we evaluated our proposed method in two different ways. Firstly, we examined the ability of the proposed method to serve as a general-purpose ontology matching system by comparing it with other systems on the Ontology Alignment Evaluation Initiative (OAEI)4 benchmarks5. The domain of the benchmark tests is bibliographic references. The tests are based on one particular ontology dedicated to the very narrow domain of bibliography and on alternative ontologies of the same domain. They are organized in three groups: Data test 1 consists of simple tests (101-104), Data test 2 contains systematic tests (201-266), and Data test 3 includes four real-life ontologies of bibliographic references (301-304) found on the web.
Secondly, we evaluated the proposed method for the purpose of LOD schema
integration and compared it with other systems for ontology matching on LOD
schema alignment. This data set contains schema-level mappings from two LOD
ontologies to Proton (an upper level ontology) created manually by human
experts for a real world application called FactForge, with over 300 classes and
100 properties. These two LOD ontologies include:
– DBpedia6 : The RDF version of Wikipedia, created manually from Wikipedia
article infoboxes. DBpedia consists of 259 classes ranging from general classes
(e.g. Event) to domain specific ones (e.g. Protein).
– Geonames7 : A geographic data set with over 6 million locations of interest,
which are classified into 11 different classes.
Fig. 4. LOD Schema Evaluation Comparison between our method and the previous
works BLOOMS, AROMA, and S-Match
Fig. 5. Evaluation comparison between our method Aprior and BLOOMS on the Benchmark data set
5 Conclusions
The main aim of this research is to enrich conceptual semantics by expanding local conceptual neighbors. The extension we make use of in this work is a contextually expanded neighbor generated for each concept from external knowledge sources such as WordNet, ODP, and Wikimedia. Experimental results show that our algorithm performs significantly well in terms of accuracy compared with some known methods on the OAEI Benchmark and Linked Open Data sets, and gives considerably better results than the existing methods.
This research was partially supported by Polish Ministry of Science and Higher
Education under grant no. N N519 407437 (2009-2012).
References
1. Castano, S., Ferrara, A., Montanelli, S.: Matching Ontologies in Open Networked
Systems: Techniques and Applications. In: Spaccapietra, S., Atzeni, P., Chu, W.W.,
Catarci, T., Sycara, K. (eds.) Journal on Data Semantics V. LNCS, vol. 3870,
pp. 25–63. Springer, Heidelberg (2006)
2. Danilowicz, C., Nguyen, N.T.: Consensus-based methods for restoring consistency
of replicated data. In: Kopotek, M., et al. (eds.) Advances in Soft Computing,
Proceedings of 9th International Conference on Intelligent Information Systems
2000, pp. 325–336. Physica-Verlag (2000)
3. David, J., Guillet, F., Briand, H.: Matching directories and OWL ontologies with
AROMA. In: CIKM 2006: The 15th ACM International Conference on Information
and Knowledge Management, pp. 830–831. ACM, New York (2006)
4. Doan, A.H., Madhavan, J., Domingos, P., Halevy, A.: Ontology matching: a ma-
chine learning approach. In: Handbook on Ontologies in Information Systems,
pp. 397–416. Springer, Heidelberg (2003)
5. Duong, T.H., Nguyen, N.T., Jo, G.S.: Effective Backbone Techniques for Ontol-
ogy Integration. In: Nguyen, N.T., Szczerbicki, E. (eds.) Intelligent Systems for
Knowledge Management. SCI, vol. 252, pp. 197–227. Springer, Heidelberg (2009)
6. Duong, T.H., Jo, G.S., Jung, J.J., Nguyen, N.T.: Complexity Analysis of Ontology
Integration Methodologies: A Comparative Study. Journal of Universal Computer
Science 15(4), 877–897 (2009)
7. Duong, T.H., Nguyen, N.T., Jo, G.S.: A Hybrid Method for Integrating Multiple
Ontologies. Cybernetics and Systems 40(2), 123–145 (2009)
8. Duong, T.H., Nguyen, N.T., Jo, G.S.: A Method for Integration across Text Corpus
and WordNet-based Ontologies. In: IEEE/ACM/WI/IAT 2008 Workshops Pro-
ceedings, pp. 1–4. IEEE Computer Society (2008)
9. Duong, T.H., Nguyen, N.T., Jo, G.S.: A Method for Integration of WordNet-based
Ontologies Using Distance Measures. In: Lovrek, I., Howlett, R.J., Jain, L.C. (eds.)
KES 2008, Part I. LNCS (LNAI), vol. 5177, pp. 210–219. Springer, Heidelberg
(2008)
10. Duong, T.H., Jo, G.S.: Anchor-Prior: An Effective Algorithm for Ontology In-
tegration. In: IEEE International Conference on Systems, Man, and Cybernetics
(IEEE SMC 2011), pp. 942–947. IEEE Computer Society, Anchorage (2011)
11. Giunchiglia, F., Shvaiko, P., Yatskevich, M.: S-Match: An Algorithm and an Imple-
mentation of Semantic Matching. In: Bussler, C.J., Davies, J., Fensel, D., Studer,
R. (eds.) ESWS 2004. LNCS, vol. 3053, pp. 61–75. Springer, Heidelberg (2004)
12. Maedche, A., Motik, B., Silva, N., Volz, R.: MAFRA – A MApping FRAmework
for Distributed Ontologies. In: Gómez-Pérez, A., Benjamins, V.R. (eds.) EKAW
2002. LNCS (LNAI), vol. 2473, pp. 235–250. Springer, Heidelberg (2002)
13. Nguyen, N.T.: Conflicts of Ontologies – Classification and Consensus-Based Methods
for Resolving. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds.) KES 2006, Part II.
LNCS (LNAI), vol. 4252, pp. 267–274. Springer, Heidelberg (2006)
14. Pedersen, T., Patwardhan, S., Michelizzi, J.: WordNet::Similarity-measuring the
relatedness of concepts. In: Proceedings of NAACL (2004)
15. Jain, P., Hitzler, P., Sheth, A.P., Verma, K., Yeh, P.Z.: Ontology Alignment for
Linked Open Data. In: Patel-Schneider, P.F., Pan, Y., Hitzler, P., Mika, P., Zhang,
L., Pan, J.Z., Horrocks, I., Glimm, B. (eds.) ISWC 2010, Part I. LNCS, vol. 6496,
pp. 402–417. Springer, Heidelberg (2010)
16. Sowa, J.F.: Knowledge Representation: Logical, Philosophical and Computational
Foundations. Brooks/Cole (2000)
17. Su, X., Gulla, J.A.: Semantic Enrichment for Ontology Mapping. In: Meziane, F.,
Métais, E. (eds.) NLDB 2004. LNCS, vol. 3136, pp. 217–228. Springer, Heidelberg
(2004)
A Novel Choquet Integral Composition
Forecasting Model Based on M-Density
1 Introduction
The composition forecasting model was first considered in the work of Bates and Granger (1969) [1]; such models are now in widespread use in many fields, especially in economics. Zhang, Wang and Gao (2008) [2] applied a linear composition
forecasting model, which combined the time series model, the second-order exponential smoothing model, and the GM(1,1) forecasting model, in agricultural economy research. In our previous work [6], we extended the work of Zhang, Wang, and Gao and proposed nonlinear composition forecasting models that also combined the time series model, the second-order exponential smoothing model, and the GM(1,1) forecasting model, by using the ridge regression model [3] and the theory of the Choquet integral with respect to several fuzzy measures, including the extensional L-measure, L-measure, λ-measure and P-measure [4-10]; we found that the extensional L-measure Choquet integral based composition forecasting model is the best one. However, all of the above-mentioned Choquet integral composition forecasting models with different fuzzy measures are based on N-density. The performance of any Choquet integral is largely determined by its fuzzy measure, and the performance of any fuzzy measure is largely determined by its fuzzy density function; in other words, the performance of any Choquet integral is largely determined by its fuzzy density function.
In this paper, a novel fuzzy density function, called M-density, is considered. Based on this new fuzzy density function, a novel composition forecasting model is also considered. To compare the forecasting efficiency of this new fuzzy density function with the well-known fuzzy density function, N-density, an experiment with real data is also considered.
SMSE(θ̂_t^(h)) = (1/h) Σ_{j=1}^{h} (θ̂_{t+j|t+j−1} − θ̂_{t+j})²   (1)
is called the sequential mean square error (SMSE) of θ̂_t^(h).
Let
ŷ_{t+j|t} = f(x_{t+j,1}, x_{t+j,2}, ..., x_{t+j,m})   (3)
(iv) Let
SMSE(ŷ_t^(h)) = (1/h) Σ_{j=1}^{h} (ŷ_{t+j|t+j−1} − y_{t+j})²   (4)
SMSE(x_{t,k}^(h)) = (1/h) Σ_{j=1}^{h} (x_{t+j,k} − y_{t+j})²   (5)
ŷ_t = Σ_{k=1}^{m} β_k x_{t,k}   (7)
(i) Let y_t = (y_1, y_2, ..., y_t)^T be the realized data vector of the target variable from time 1 to t.
f(X_t) = f(x_{t,1}, x_{t,2}, ..., x_{t,m})   (9)
(iv) Let β_t^(r) = (β_{t,1}^(r), β_{t,2}^(r), ..., β_{t,m}^(r))^T = (X_t^T X_t + r I_m)^{−1} X_t^T y_t   (10)
ŷ_t = f(X_t) = X_t β_t^(r)   (11)
Then ŷ_{t+j|t} = f(X_{t+j}) = X_{t+j} β_t^(r)   (12)
ŷ_{t+j|t} = f(x_{t+j,1}, x_{t+j,2}, ..., x_{t+j,m}) = [x_{t+j,1}, x_{t+j,2}, ..., x_{t+j,m}] β_t^(r) = Σ_{k=1}^{m} β_{t,k}^(r) x_{t+j,k}   (13)
r = m σ̂² / (β_t^T β_t),   σ̂² = (1/t) Σ_{i=1}^{t} (y_i − ŷ_t)²   (15)
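A compact numpy sketch of the ridge estimate in (10) with the ridge parameter of (15), run on hypothetical data; it illustrates the formulas only, not the authors' experiment.

```python
# Ridge estimate beta = (X^T X + r I)^(-1) X^T y with r = m*sigma^2 / (b^T b),
# where b is a preliminary least-squares estimate (illustrative; see (10), (15)).
import numpy as np

rng = np.random.default_rng(0)
t, m = 50, 3                              # t observations, m component forecasts
X = rng.normal(size=(t, m))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=t)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]      # preliminary estimate
sigma2 = np.mean((y - X @ b_ols) ** 2)            # residual variance
r = m * sigma2 / (b_ols @ b_ols)                  # ridge parameter, eq. (15)

beta_ridge = np.linalg.solve(X.T @ X + r * np.eye(m), X.T @ y)   # eq. (10)
y_hat = X @ beta_ridge                                            # eq. (11)
print(r, beta_ridge)
```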
A ⊆ B ⇒ μ(A) ≤ μ(B)   (monotonicity)   (17)
d(x) = μ({x}), x ∈ X   (18)
Σ_{x∈X} d(x) = 1   (19)
Definition 8. N-density
Let μ be a fuzzy measure on a finite set X = {x_1, x_2, ..., x_n}, y_i be the global response of subject i and f_i(x_j) be the evaluation of subject i for singleton x_j, satisfying:
0 < f_i(x_j) < 1, i = 1, 2, ..., N, j = 1, 2, ..., n   (20)
If
d_N(x_j) = r(f(x_j)) / Σ_{j=1}^{n} r(f(x_j)), j = 1, 2, ..., n   (21)
3.3 M-density
In this paper, a novel normalized fuzzy density function based on the mean square error, denoted M-density, is proposed; its formal definition is introduced as follows:
Definition 9. M-density
Let μ be a fuzzy measure on a finite set X = {x_1, x_2, ..., x_n}, y_i be the global response of subject i and f_i(x_j) be the evaluation of subject i for singleton x_j, satisfying:
0 < f_i(x_j) < 1, i = 1, 2, ..., N, j = 1, 2, ..., n   (22)
If
d_M(x_j) = [MSE(x_j)]^{−1} / Σ_{j=1}^{n} [MSE(x_j)]^{−1}, j = 1, 2, ..., n   (23)
where MSE(x_j) = (1/N) Σ_{i=1}^{N} (y_i − f_i(x_j))²   (24)
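A small numpy sketch of M-density as defined in (22)-(24), using hypothetical responses y_i and evaluations f_i(x_j) invented for illustration.

```python
# M-density: d_M(x_j) = MSE(x_j)^(-1) / sum_j MSE(x_j)^(-1),
# with MSE(x_j) = (1/N) * sum_i (y_i - f_i(x_j))^2  (illustrative data).
import numpy as np

y = np.array([0.62, 0.55, 0.70, 0.48])        # global responses y_i, N = 4
F = np.array([[0.60, 0.58, 0.65],             # f_i(x_j): rows = subjects i,
              [0.50, 0.57, 0.52],             # columns = singletons x_1..x_3
              [0.72, 0.66, 0.69],
              [0.45, 0.52, 0.50]])

mse = np.mean((y[:, None] - F) ** 2, axis=0)  # MSE(x_j), eq. (24)
d_M = (1.0 / mse) / np.sum(1.0 / mse)         # eq. (23), normalized to sum to 1
print(d_M, d_M.sum())
```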
3.4 λ-measure
For A, B ∈ 2^X with A ∩ B = ∅ and A ∪ B ≠ X:
g_λ(A ∪ B) = g_λ(A) + g_λ(B) + λ g_λ(A) g_λ(B)   (25)
λ + 1 = Π_{i=1}^{n} [1 + λ d(x_i)]   (26)
Theorem 1. For any given normalized fuzzy density function, a λ-measure is just an
additive measure.
3.5 P-measure
∀A ∈ 2^X ⇒ g_P(A) = max_{x∈A} d(x) = max_{x∈A} g_P({x})   (27)
3.6 L-measure
For L ∈ [−1, ∞) and A ⊂ X:
g_L^E(A) = (1 + L) Σ_{x∈A} d(x) − L max_{x∈A} d(x),   if L ∈ [−1, 0]
g_L^E(A) = Σ_{x∈A} d(x) + [(|A| − 1) L Σ_{x∈A} d(x) (1 − Σ_{x∈A} d(x))] / {[n − |A| + L(|A| − 1)] Σ_{x∈X} d(x)},   if L ∈ (0, ∞)   (29)
∫^C f_i dμ = Σ_{j=1}^{m} [f_i(x_(j)) − f_i(x_(j−1))] μ(A_(j))   (30)
where f_i(x_(0)) = 0, f_i(x_(j)) indicates that the indices have been permuted so that 0 ≤ f_i(x_(1)) ≤ f_i(x_(2)) ≤ ... ≤ f_i(x_(m)), and A_(j) = {x_(j), x_(j+1), ..., x_(m)}   (31)
∫^C f_i dλ = Σ_{j=1}^{m} d(x_j) f_i(x_j), i = 1, 2, ..., N   (32)
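The Choquet integral of (30)-(31) can be written in a few lines; the sketch below takes an arbitrary fuzzy measure mu as a Python function over frozensets and is only an illustration of the definition, demonstrated here with the P-measure of (27).

```python
# Choquet integral of f over a finite set X with respect to a fuzzy measure mu:
# sort the values, then sum (f(x_(j)) - f(x_(j-1))) * mu(A_(j)),
# where A_(j) = {x_(j), ..., x_(m)} is the upper level set.

def choquet(f, X, mu):
    xs = sorted(X, key=lambda x: f[x])        # permute so f is nondecreasing
    total, prev = 0.0, 0.0
    for j, x in enumerate(xs):
        A_j = frozenset(xs[j:])               # upper level set A_(j)
        total += (f[x] - prev) * mu(A_j)
        prev = f[x]
    return total

# example with the P-measure g_P(A) = max_{x in A} d(x), eq. (27)
d = {"x1": 0.2, "x2": 0.5, "x3": 0.3}
g_P = lambda A: max(d[x] for x in A)
f = {"x1": 0.4, "x2": 0.9, "x3": 0.6}
print(choquet(f, d.keys(), g_P))
```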
(α̂, β̂) = argmin_{α,β} Σ_{t=1}^{N} [y_t − α − β ∫^C f_t dg_μ]²   (33)
α̂ = (1/N) Σ_{t=1}^{N} y_t − β̂ (1/N) Σ_{t=1}^{N} ∫^C f_t dg_μ,   β̂ = S_yf / S_ff   (34)
Real data of grain production in Jilin from 1952 to 2007, together with three kinds of forecasted values from the time series model, the exponential smoothing model and the GM(1,1) forecasting model, respectively, were obtained from Table 1 in the paper of Zhang, Wang and Gao [2]. To evaluate the proposed new density-based composition forecasting model, an experiment with the above-mentioned data using the sequential mean square error was conducted.
We arranged the first 50 years of grain production and their three kinds of forecasted values as the training set and the remaining data as the forecasting set. The following N-density and M-density were used for all fuzzy measures:
N-density: {0.3331, 0.3343, 0.3326} (35)
Composition forecasting model                    SMSE (N-density)   SMSE (M-density)
Choquet integral regression, LE-measure          13939.84           13398.29
Choquet integral regression, L-measure           14147.83           13751.60
Choquet integral regression, λ-measure           21576.38           19831.86
Choquet integral regression, P-measure           16734.88           16465.98
Ridge regression                                 18041.92
Multiple linear regression                       24438.29
Table 1 shows that the M-density based Choquet integral composition forecasting model with respect to the LE-measure outperforms the other composition forecasting models. Furthermore, for each fuzzy measure, including the LE-measure, L-measure, λ-measure and P-measure, the M-density based Choquet integral composition forecasting model is better than the N-density based one.
5 Conclusion
In this paper, a new density, M-density, was proposed. Based on M-density, a novel composition forecasting model was also proposed. To compare the forecasting efficiency of this new density with the well-known density, N-density, a real-data experiment was conducted. We compared the performances of the Choquet integral composition forecasting models with the extensional L-measure, λ-measure and P-measure, using M-density and N-density respectively, as well as a ridge regression composition forecasting model, a multiple linear regression composition forecasting model, and the traditional linear weighted composition forecasting model. The experimental results showed that, for each fuzzy measure, including the LE-measure, L-measure, λ-measure and P-measure, the M-density based Choquet integral composition forecasting model is better than the N-density based one, and that the M-density based Choquet integral composition forecasting model outperforms all the other composition forecasting models.
Acknowledgment. This paper is partially supported by the grant of National Science
Council of Taiwan Government (NSC 100-2511-S-468-001).
References
1. Bates, J.M., Granger, C.W.J.: The Combination of Forecasts. Operations Research
Quarterly 4, 451–468 (1969)
2. Zhang, H.-Q., Wang, B., Gao, L.-B.: Application of Composition Forecasting Model in the
Agricultural Economy Research. Journal of Anhui Agri. Sci. 36(22), 9779–9782 (2008)
3. Hoerl, A.E., Kenard, R.W., Baldwin, K.F.: Ridge regression: Some simulation.
Communications in Statistics 4(2), 105–123 (1975)
4. Liu, H.-C., Tu, Y.-C., Lin, W.-C., Chen, C.C.: Choquet integral regression model based on
L-Measure and γ-Support. In: Proceedings of 2008 International Conference on Wavelet
Analysis and Pattern Recognition (2008)
5. Liu, H.-C.: Extensional L-Measure Based on any Given Fuzzy Measure and its Application.
In: Proceedings of 2009 CACS International Automatic Control Conference, National Taipei
University of Technology, Taipei Taiwan, November 27-29, pp. 224–229 (2009)
6. Liu, H.-C., Ou, S.-L., Cheng, Y.-T., Ou, Y.-C., Yu, Y.-K.: A Novel Composition Forecasting
Model Based on Choquet Integral with Respect to Extensional L-Measure. In: Proceedings of
The 19th National Conference on Fuzzy Theory and Its Applications (2011)
7. Choquet, G.: Theory of capacities. Annales de l’Institut Fourier 5, 131–295 (1953)
8. Wang, Z., Klir, G.J.: Fuzzy Measure Theory. Plenum Press, New York (1992)
9. Sugeno, M.: Theory of fuzzy integrals and its applications. unpublished doctoral
dissertation, Tokyo Institute of Technology, Tokyo, Japan (1974)
10. Zadeh, L.A.: Fuzzy Sets as a Basis for Theory of Possibility. Fuzzy Sets and Systems 1,
3–28 (1978)
Aggregating Multiple Robots with Serialization
Abstract. This paper presents the design of an intelligent cart system to be used
in a typical airport. The intelligent cart system consists of a set of mobile software agents that control the carts, and it provides a novel method for alignment.
If the carts gather and align themselves automatically after being used, it is
beneficial for human workers who have to collect them manually. To avoid
excessive energy consumption through the collection of the carts, in the
previous study, we have used ant colony optimization (ACO) and a clustering
method based on the algorithm. In the current study, we have extended the
ACO algorithm to use the vector values of the scattered carts in the field instead
of mere location. We constructed a simulator that performs ant colony
clustering using vector similarity. The waiting time and the route to the destination of each cart are determined based on the clusters created this way. These routes and waiting times are conveyed by mobile agents to each cart, which arranges the carts into rough lines. Because the carts are clustered by the similarity of vectors, we have
observed that several groups have appeared to be aligned. The effectiveness of
the system is demonstrated by constructing a simulator and evaluating the
results.
1 Introduction
When we pass through terminals of an airport, we often see carts scattered in the
walkway and laborers manually collecting them one by one. It is a laborious task and
not a fascinating job. It would be much easier if carts were roughly gathered in any
way before the laborers begin to collect them. Multi-robot systems have made rapid
progress in various fields, and the core technologies of multi-robots systems are now
easily available [1]. Therefore, it is possible to make each cart have minimum
intelligence, making each cart an autonomous robot. We realize that for such a system
cost is a significant issue and we address one of those costs, the power source. A big
powerful battery is heavy and expensive; therefore such intelligent cart systems with
small batteries are desirable. Thus energy saving is an important issue in such a
system [2].
We use mobile agents to drive the carts, which are placed in various locations, to quasi-optimal destinations. We determine the destination of each cart through Ant Colony Optimization (ACO) with a simulation, because clustering positions can be determined as a quasi-optimal set using ant agents that exist only in the simulator [3]. Earlier studies have succeeded in making some well-formed groups [4][5]. However, carts collected in this way are only loosely gathered, and a good alignment is hard to obtain. We therefore decided to achieve a better alignment of the carts at the same time as their collection. In the improved method, Ant Colony Clustering (ACC) is used to calculate the best assembly positions from the pheromone; by introducing a degree of similarity between vectors, we should be able to align the orientations of the carts within a cluster.
The structure of the rest of this paper is as follows. The second section describes the background. The third section describes the agent system that performs the arrangement of the intelligent carts. The agent system consists of several static and mobile agents. The static agents interact with the users and compute the ACC algorithm, while the mobile agents gather the initial positions of the carts and distribute the assembly positions. The fourth section describes the simulation field and the initial coordinates and directions of the carts scattered in the field. The fifth section describes the ant colony clustering (ACC) that uses the vector value each cart has. The algorithm draws together carts that have similar vector values, so that carts roughly facing the same direction gather. The sixth section describes the simulator and demonstrates the effectiveness of our algorithm through numerical experiments. Finally, in the seventh section we summarize the work and discuss future research directions.
2 Background
contributes to energy saving of multiple robots [2]. They have achieved significant saving of energy.
Deneubourg formulated a biologically inspired behavioral algorithm that simulates the ant corpse-gathering and brood-sorting behaviors [6]. Lumer improved Deneubourg's model and proposed a new simulation model called Ant Colony Clustering [7]. His method could cluster similar objects into a few groups. Kambayashi et al. improved Lumer's model and proposed a multi-robot control method using ACC [4]. This method does not cluster similar objects but clusters nearby objects by using pheromone. Kambayashi et al. designed an intelligent cart system based on the proposed method [8]. As mentioned above, previous studies could collect mobile robots roughly using ACC. In this paper, we try to align the mobile robots while collecting them.
3 System Model
Our system model consists of carts and a few kinds of static and mobile software
agents. All the controls for the mobile carts as well as ACC computation performed in
the host computer are achieved through the static and mobile agents. They are: 1) user
interface agent (UIA), 2) operation agents (OA), 3) position collecting agent (PCA),
4) clustering simulation agent (CSA), and 5) driving agents (DA). All the software
agents except UIA and CSA are mobile agents. A mobile agent traverses carts
scattered in the field one by one to collect their coordinates. After receiving
the assembly positions computed by a static agent, many mobile agents migrate to the
carts and drive them to the assembly positions. Fig. 1 shows the interactions of the
cooperative agents to control an intelligent cart. The followings are details of each
agent:
1) User Interface Agent (UIA): The user interface agent (UIA) is a static agent that
resides on the host computer and interacts with the user. It is expected to
coordinate the entire agent system. When the user creates this agent with a list of
IP addresses of the intelligent carts, UIA creates PCA and passes the list to it.
2) Operation Agent (OA): Each cart has at least one operation agent (OA). It has the
task that the cart on which it resides is supposed to perform. Each intelligent cart
has its own OA. Currently all operation agents (OA) have a function for collision
avoidance and a function to sense RFID tags embedded in the floor carpet to
detect its precise coordinates in the field.
3) Position Collecting Agent (PCA): A distinct agent called position collecting
agent (PCA) traverses carts scattered in the field one by one and collects their
coordinates. PCA is created and dispatched by UIA. Upon returning to the host
computer, it hands the collected coordinates to the clustering simulation agent
(CSA) for ACC.
4) Clustering Simulation Agent (CSA): The host computer houses the static
clustering simulation agent (CSA). This agent actually performs the ACC
algorithm by using the coordinates collected by PCA as the initial positions, and
produces the quasi-optimal assembly positions of the carts, and then performs yet
180 S. Sugiyama et al.
another simulation to produce instructions each cart follows to reach its assigned
goal position. Upon terminating the simulation and producing the procedures for
all the intelligent carts, CSA creates a number of driving agents (DA).
5) Driving Agent (DA): The quasi-optimal arrangement coordinates, as well as
procedures to reach them, produced by the CSA are delivered by driving agents
(DA). One driving agent is created for each intelligent cart, and it contains the set
of procedures for the cart. The DA drives its intelligent cart to the designated
assembly position. DA is the intelligent part of the intelligent cart.
OA detects the current coordinate of the cart on which it resides. Each cart has its own
IP address and UIA hands in the list of the IP addresses to PCA. First, PCA migrates
to an arbitrary cart and starts hopping between them one by one. It communicates
locally with OA, and writes the coordinates of the cart into its own local data area.
When PCA gets all the coordinates of the carts, it returns to host computer. Upon
returning to the host computer, PCA creates CSA and hands in the coordinate data to
CSA which computes the ACC algorithm.
The current implementation employs RFID (Radio Frequency Identification) to get
precise coordinates [9]. We set RFID tags in a regular grid shape under the floor
carpet tiles. The tags we chose have a small range so that the position-collecting agent
can obtain fairly precise coordinates from the tags. Also, the cart has a basic collision
avoidance mechanism using infrared sensors.
CSA is the other static agent and its sole role is ACC computation. When CSA
receives the coordinate data of all the carts, it translates the coordinates into
coordinates for simulation, and then performs the clustering. When CSA finishes the
computation and produces a set of assembly positions, it then creates the set of
procedures for autonomous cart movements.
Then CSA creates as many DAs as there are intelligent carts; each DA conveys its set of procedures to its cart. Each DA receives its destination IP address from PCA and the set of procedures for the destination cart, and then migrates to the destination cart. Each DA has a set of driving procedures that drives its assigned cart to the destination while avoiding collisions. OA has the basic collision detection and avoidance procedures, and DA has task-specific collision avoidance procedures.
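The overall flow (PCA collects coordinates, CSA clusters them, DAs deliver the routes) can be summarized in plain Python; the class and function names below are hypothetical, and the mobile-agent migration is simulated by ordinary function calls rather than actual agent hops.

```python
# Hypothetical sketch of the agent workflow; migration is simulated by calls.

class OperationAgent:
    def __init__(self, ip, position):
        self.ip, self.position = ip, position          # RFID-derived coordinates

def collect_positions(carts):
    """Role of PCA: visit every cart and gather its coordinates."""
    return {cart.ip: cart.position for cart in carts}

def cluster_positions(positions):
    """Role of CSA: compute assembly positions (placeholder: every cart is
    simply sent to the centroid instead of running the full ACC)."""
    xs = [p[0] for p in positions.values()]
    ys = [p[1] for p in positions.values()]
    goal = (sum(xs) / len(xs), sum(ys) / len(ys))
    return {ip: goal for ip in positions}

def dispatch_driving_agents(assignments):
    """Role of DA: one agent per cart, carrying that cart's goal position."""
    for ip, goal in assignments.items():
        print(f"DA -> {ip}: drive to {goal}")

carts = [OperationAgent("10.0.0.1", (2, 3)), OperationAgent("10.0.0.2", (8, 5))]
dispatch_driving_agents(cluster_positions(collect_positions(carts)))
```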
The purpose of this study is to assemble robot carts scattered in a field into several short lines. In order to achieve this task, we have developed an extended ACO algorithm to collect robot carts that have similar vector values, i.e. that face similar directions. In order to develop and validate our algorithm, we have developed a simulator. This and the following sections concentrate on the extended ACO algorithm we have developed through the simulator. Since the idea is general, we henceforth use "object" instead of "cart."
The simulation field is an n × n two-dimensional grid (Fig. 2). The edge of the field is surrounded by walls so that ants and artificial objects do not go out of the field. The initial position vectors of the objects in the simulator are determined randomly each time the simulator starts. The number of objects and the field size of the simulator can be configured by the user.
In traditional ACO systems, artificial ants leave a pheromone signal so that other artificial ants can follow the same path. Our extended ACO system uses pheromone to express the similarity of vector values. Pheromone is generated from objects and diffused into their surroundings; the farther away from an object, the weaker it becomes (Fig. 3).
Artificial ants move in one of eight random directions at every discrete time step, and they pick up objects with weak vectors. In addition, when an artificial ant carrying an object finds a nearby object whose vector has high similarity with the carried object's vector, the ant tends to put the object down there. When an artificial ant has placed an object, the object's vector is synthesized with the vectors of the surrounding objects into one vector.
Artificial ant behavior in the ACC of this study is determined by equations (1), (2), and (3).
Equation (1) determines, when an artificial ant discovers an object i, whether the ant picks up the object or not. Kp is the norm threshold above which a cluster is considered formed, and f(i) is the magnitude of the norm of the discovered object i; the norm is the magnitude of the vector. When an artificial ant has discovered an object, it compares the object's norm with Kp, and if the norm of the object is less than Kp, the ant picks the object up.
Equation (2) determines, when an artificial ant carrying an object i newly discovers an object j, whether the ant places object i there. Vs is a constant threshold for the similarity determination, and cos(i, j) represents the similarity of objects i and j. If an artificial ant carrying an object finds a new object, it compares the similarity of the vectors of the carried and the found objects; if the similarity obtained exceeds the predetermined threshold Vs, the ant puts the object down.
Equation (3) is used, when the determination of Equation (2) is made, to compute the similarity between the vectors of objects i and j. The value ranges between 0 and 1: vanishingly low affinity corresponds to 0 and high affinity to 1. ⟨x, y⟩ denotes the dot product of x and y, and ‖x‖ is the norm of x. The decision whether to place the object in Equation (2) is based on this similarity.
When creating groups of objects with similar vectors, our goal is to move each cart the shortest distance needed to generate a cluster. In order to achieve this goal, the artificial ants follow the regulatory actions below.
1. When an artificial ant finds a cluster of objects whose norm exceeds a certain value, the ant avoids picking up objects from that cluster. This value can be updated.
2. When an artificial ant carrying an object cannot find a cluster with a sufficiently strong norm, it moves randomly. The direction of movement at that time is influenced by the vector of the object it carries.
3. When an artificial ant carrying an object cannot find any cluster within a certain walking distance, it puts the object back down at its current position and starts the search again.
Ppick(i) = 1 if f(i) < Kp, and 0 otherwise   (1)
Pdrop(i) = 1 if Vs < cos(i, j), and 0 otherwise   (2)
cos θ = ⟨x, y⟩ / (‖x‖ · ‖y‖)   (3)
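Equations (1)-(3) translate directly into code; the sketch below uses numpy vectors for the objects, with hypothetical values for the thresholds Kp and Vs.

```python
# Pick/drop decisions of the extended ACC, following eqs. (1)-(3).
import numpy as np

def norm(v):
    return float(np.linalg.norm(v))

def cos_sim(vi, vj):
    """Eq. (3): cosine similarity <x, y> / (||x|| * ||y||)."""
    return float(np.dot(vi, vj) / (norm(vi) * norm(vj)))

def p_pick(f_i, Kp):
    """Eq. (1): pick up object i if its norm f(i) is below Kp."""
    return 1 if f_i < Kp else 0

def p_drop(vi, vj, Vs):
    """Eq. (2): drop the carried object i next to object j if similarity exceeds Vs."""
    return 1 if Vs < cos_sim(vi, vj) else 0

vi, vj = np.array([1.0, 0.1]), np.array([0.9, 0.2])   # roughly the same direction
print(p_pick(norm(vi), Kp=2.0), p_drop(vi, vj, Vs=0.8))
```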
Based on the above rules, artificial ants form clusters of objects with similar vector values, as in feature 1, so that the ants need not carry objects over long distances. To implement feature 1, the system locks objects whose norm is above a certain value so that no ant picks them up. Once clustering emerges, the norm threshold is updated so that artificial ants can again carry objects that were previously locked. This is the rule that forms clusters with a larger norm.
For feature 2, if an artificial ant cannot sense pheromone around it, it attempts to discover pheromone by moving at random to one of the eight squares around it. Considering the cost incurred when changing the orientation of a cart, the ant then moves forward toward the object along its vector as far as possible (Fig. 4).
Feature 3 is the ability to reset the movement of an artificial ant when moving has become too expensive. The travel distance of each object is limited. This is intended to eliminate the energy loss caused by unnecessary movement; if an object is moved back and forth in search of a cluster with a similar vector, energy is lost. In order to prevent such a situation, this feature is essential.
By repeating the above rules, several large clusters with similar vector values should eventually emerge.
In this section, we report the results of the experiments we have conducted through the simulator. The simulator is started with objects randomly placed in the field. The number of artificial ants is determined by the user. Each artificial ant checks the pheromone around it at every discrete time step to receive an order to commence. Based on the value of this pheromone, the artificial ants find an object, pick it up, carry it and place it. Clusters of objects are formed based on the similarity of the vector values. A newly formed cluster synthesizes the pheromones of its objects to form a larger cluster. These clusters update the value of the norm so that no artificial ant picks up the objects that form a cluster. The simulator repeats this procedure until the target number of clusters is formed.
When a certain number of clusters have been formed, the simulator starts another simulation. This simulation calculates the assembly position and movement path as well as the waiting timing of each object. Each object moves one square at each step along the shortest route to its meeting place. If an object tries to move to a position where another object is, it waits to avoid a collision. Each object on the display leaves a trajectory when it moves in the simulator, so that the user of the simulator can see at a glance how the objects have moved.
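A toy sketch of this movement phase, under simplifying assumptions (a 4-connected grid, Manhattan shortest routes, and waiting whenever the next cell is occupied); the real simulator additionally produces per-object waiting timings and works on the 8-connected field.

```python
# Toy movement phase: each object moves one cell per step toward its goal
# and waits if the target cell is occupied (simplified, 4-connected grid).

def step_toward(pos, goal):
    x, y = pos
    gx, gy = goal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return pos

def simulate(objects, goals, max_steps=100):
    positions = dict(objects)
    for _ in range(max_steps):
        occupied = set(positions.values())
        for name, pos in positions.items():
            nxt = step_toward(pos, goals[name])
            if nxt == pos or nxt in occupied:      # wait to avoid a collision
                continue
            occupied.discard(pos)
            occupied.add(nxt)
            positions[name] = nxt
        if all(positions[n] == goals[n] for n in positions):
            break
    return positions

objs = {"cart1": (0, 0), "cart2": (5, 0)}
goals = {"cart1": (3, 0), "cart2": (3, 1)}
print(simulate(objs, goals))
```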
The simulator evaluates the simulation results using the sum of the moving distances of the objects, the number of clusters, and the similarity of the vectors within each cluster. The smaller the number of clusters and the sum of the distances, the higher the rating; the closer the vector similarity is to 1, the higher the rating. The objects belonging to each cluster should therefore have a low sum of distances while the cluster similarity is high. The vector values formed by similarity-based clustering affect the behavioral characteristics of the ants carrying objects.
In this experiment, the field size was 100 × 100, and there were 100 objects and 100 artificial ants. The experiment was set up this way (Fig. 5).
Approximately 70% of the objects within each cluster face the same direction. The resulting clusters, generated by the similarity of vector values, show that the experiments succeeded. Also, the cost of forming clusters of objects with similar vector values is not different from that of clustering without vector values, so we can say that the moving cost is successfully suppressed (Table 1). The experiments suggest that our method is useful.
            Old                              New
No    C_No   Cost    AveCost      No    C_No   Cost     AveCost
1     10     934     8.50         1     9      1029     9.22
2     11     929     8.18         2     8      1077     9.88
3     10     905     8.60         3     12     881      8.17
4     8      1046    9.63         4     7      1099     11.00
5     11     907     7.73         5     11     916      7.73
Ave   10     944.2   8.53         Ave   9.4    1000.4   9.20
7 Summary
This paper presents an algorithm for making clusters of objects based on the similarity of the vector values of the objects. The algorithm is an extension of ant colony optimization. This algorithm is to be used in a framework for controlling the robot carts used in an airport. Since the similarity is computed on the vector values, the robots in the formed clusters tend to have similar directions, i.e. they face the same direction.
This feature should greatly reduce the manual labor when it is implemented in a real environment.
We have constructed a simulator to demonstrate the effectiveness of our algorithm. Approximately 70% of the objects within each cluster face the same direction. The resulting clusters generated using the vectors show that the method succeeded. We are re-implementing the algorithm to adjust the detection range of the pheromone and the pheromone concentration ratio so that more precise alignment can be achieved.
As the next step, we plan not only to align the direction of the robots but also to serialize the formed robots so that we can provide more benefits for cart collection.
References
1. Kambayashi, Y., Takimoto, M.: Higher-Order Mobile Agents for Controlling Intelligent
Robots. International Journal of Intelligent Information Technologies 1(2), 28–42 (2005)
2. Takimoto, M., Mizuno, M., Kurio, M., Kambayashi, Y.: Saving Energy Consumption of
Multi-robots Using Higher-Order Mobile Agents. In: Nguyen, N.T., Grzech, A., Howlett,
R.J., Jain, L.C. (eds.) KES-AMSTA 2007. LNCS (LNAI), vol. 4496, pp. 549–558. Springer,
Heidelberg (2007)
3. Kambayashi, Y., Yamachi, H., Takimoto, M.: A Search for Practical Implementation of the
Intelligent Cart System. In: Congress on Computer Applications and Computational
Science, pp. 895–898 (2010)
4. Kambayashi, Y., Ugajin, M., Sato, O., Tsujimura, Y., Yamachi, H., Takimoto, M.,
Yamamoto, H.: Integrating Ant Colony Clustering Method to a Multi-Robot System Using
Mobile Agents. Industrial Engineering and Management System 8(3), 181–193 (2009)
5. Kambayashi, Y., Yamachi, H., Tsujimura, Y.: A Search for Efficient Ant Colony
Clustering. In: Asia Pacific Industrial Engineering and Management Systems Conference,
pp. 591–602 (2009)
6. Deneubourg, J., Goss, S., Franks, N., Sendova-Franks, A., Detrain, C., Chretien, L.: The
Dynamics of Collective Sorting: Robot-Like Ant and Ant-Like Robot. In: First Conference
on Simulation of Adaptive Behavior: From Animals to Animats, pp. 356–363. MIT Press
(1991)
7. Lumer, E.D., Faieta, B.: Diversity and adaptation in populations of clustering ants, from
animals to animats 3. In: 3rd International Conference on the Simulation of Adaptive
Behavior, pp. 501–508. MIT Press, Cambridge (1994)
8. Kambayashi, Y., Harada, Y., Sato, O., Takimoto, M.: Design of an Intelligent Cart System
for Common Airports. In: 13th IEEE International Symposium Consumer Electronics,
CD-ROM (2009)
A Multi-attribute and Multi-valued Model
for Fuzzy Ontology Integration on Instance Level*
Abstract. Fuzzy ontologies are often more useful than non-fuzzy ontologies in knowledge modeling owing to the possibility of representing incompleteness and uncertainty. In this paper we present an approach to fuzzification on the instance level of an ontology using a multi-valued and multi-attribute structure. A consensus-based method for fuzzy ontology integration is proposed.
1 Introduction
In an integration process most often one of the following aspects is realized:
• Several objects are merged to give a new object best representing them (merging
aspect)
• Several objects create a “union” acting as a whole (alignment aspect)
• Several objects are corresponded with each other (mapping aspect).
These aspects are most important and most popular in integration processes of infor-
mation systems.
In the knowledge management field, an integration task most often refers to a set of elements of knowledge with the same kind of semantic structures, the aim of which is to determine an element best representing the given ones. The kinds of structures are, for example, relational, hierarchical, logical, etc. The words "best representing" mentioned above refer to the merging aspect and mean the following criteria for integration:
• All data included in the objects to be integrated should be included in the result
of integration. Owing to this criterion all pieces of information included in the
component elements will appear in the integration result.
• All inconsistencies among the elements to be integrated should be resolved. It often happens that, referring to the same subject, different elements contain inconsistent data. Such a situation is called an inconsistency. The integration result should not contain inconsistencies; that is, integrity constraints should be fulfilled.
*
This work was partially supported by Vietnam National Foundation for Science and Technology
Development (NAFOSTED) under grant number 102.01-2011.10 (2011-2013) and by Polish
Ministry of Science and Higher Education under grant no. N N519 407437 (2009-2012).
Integration tasks are very often realized in database integration or knowledge integration processes. Ontology integration is a special case of the latter. Ontologies have a well-defined structure and it is assumed that the result of ontology integration is also an ontology. Therefore, the first and second criteria are the most popular.
It seems that satisfying the first criterion is simple, since one can create a new ontology as the sum of all the sets of concepts, relations and axioms from the component ontologies. However, this is not always possible, because of the following reasons:
• Including all elements in the integration result may introduce inconsistency, in the sense that some of the component ontologies may be in conflict, and this conflict will be carried over to the integration result.
• Including all elements in the integration result may cause conflicts between the relations of the ontology concepts.
Satisfying the second criterion is based on solving conflicts, for example, by using
consensus methods [15].
As for non-fuzzy ontologies [14], conflicts between fuzzy ontologies
may also be considered on the following levels:
• Conflicts on concept level: The same concept has different structures in different
ontologies.
• Conflicts on relation level: The relations between the same concepts are different
in different ontologies.
• Conflicts on instance level: The same instance has different descriptions in dif-
ferent concepts or ontologies.
The subject of this paper is working out algorithms for ontology integration on the
instance level. In the next section we present the structure of an ontology on the
instance level and the definition of integration. In Section 3 a model for fuzzy
instance integration using the multi-valued and multi-attribute approach is included.
Section 4 includes a set of postulates for integration. In Section 5 an algorithm for
fuzzy ontology integration on the instance level is presented, and finally in Section 6
a brief overview of related works is given.
The basis for determining an ontology is a real world (A, V), where A is a finite set of
attributes describing the domain and V is the domain of A, that is, V is a set of attribute
values, and V = ∪a∈A Va (where Va is the domain of attribute a). In the previous work
[16] we have presented the following definition of a fuzzy (A,V)-based ontology:

Fuzzy ontology = (C, R, Z)

where:
- C is the finite set of concepts. A concept of a fuzzy ontology is defined as a quadruple:

concept = (c, Ac, Vc, fc)
where c is the unique name of the concept, Ac ⊆ A is a set of attributes describing the
concept and Vc ⊆ V is the attributes' domain: Vc = ∪a∈Ac Va, and fc is a fuzzy function:

fc: Ac → [0,1]

representing the degrees to which concept c is described by the attributes from set Ac.
The triple (Ac, Vc, fc) is called the fuzzy structure of concept c.
- R is a set of fuzzy relations between concepts, R = {R1, R2, …, Rm}, where
Ri ⊆ C × C × (0, 1]
for i = 1, 2, …, m. A relation is then a set of pairs of concepts with a weight representing
the degree to which the relationship holds. We assume that within a relation Ri in an
ontology a relationship can appear between two concepts with only one value of
weight, that is, if (c, c′, v) ∈ Ri and (c, c′, v′) ∈ Ri then v = v′.
- Z is a set of constraints representing conditions on concepts and their relation-
ships. In this paper we do not deal with them.
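Purely to fix notation, the definition above might be mirrored in code roughly as follows; the class and field names are illustrative, not taken from the paper, and the constraint set Z is omitted since it is not dealt with here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

@dataclass
class FuzzyConcept:
    """A concept (c, Ac, Vc, fc): name, attributes, their domains, fuzzy degrees."""
    name: str
    attributes: Set[str]              # Ac ⊆ A
    domains: Dict[str, Set[str]]      # Va for each a in Ac (Vc is their union)
    degrees: Dict[str, float]         # fc: Ac -> [0, 1]

# A fuzzy relation Ri ⊆ C × C × (0, 1]; each concept pair carries a single weight.
FuzzyRelation = Dict[Tuple[str, str], float]

@dataclass
class FuzzyOntology:
    concepts: Dict[str, FuzzyConcept] = field(default_factory=dict)
    relations: Dict[str, FuzzyRelation] = field(default_factory=dict)  # R = {R1, ..., Rm}

    def add_relationship(self, relation: str, c1: str, c2: str, weight: float) -> None:
        """Insert (c, c', v) into Ri, enforcing the single-weight assumption."""
        assert 0.0 < weight <= 1.0
        pairs = self.relations.setdefault(relation, {})
        existing = pairs.get((c1, c2))
        if existing is not None and existing != weight:
            raise ValueError("a relationship may appear with only one weight value")
        pairs[(c1, c2)] = weight
```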
Note that in the above definition there is no reference to the fuzzy aspect of instances.
It turned out that even in a non-fuzzy ontology, where the knowledge about concepts
and their relations is complete and certain, there may appear some problems with
classifying instances to concepts because, for example, some attribute values for an
instance may be incomplete or unknown. For solving this problem, the multi-valued
structure seems very useful.
Here we will present a concept for representing the uncertainty and incompleteness
of instance description by means of a multi-value and multi-attribute structure.
Each attribute a ∈ A has a domain as a set Va of elementary values. A value of
attribute a may be a subset of Va as well as some element of Va. The set 2^Va is called
the super domain of attribute a. For B ⊆ A let us denote

VB = ∪b∈B Vb  and  2^B = ∪b∈B 2^Vb.
Value v is also called a description of the instance within a concept. We can note that
an attribute value in an instance is not a single value, but a set of values. This is be-
cause it is not certain which value is proper for the attribute. Owing to this, the fuzzi-
ness can be represented. Note that in this case the fuzziness of attribute values is not
represented by a number, but by a set of values. The interpretation is that the proper
value of an attribute in an instance is an element of the set, it is not known which one.
For fuzzy instances the Instance Integration Condition (IIC) should be satisfied, that
is, the descriptions of the same instance in different concepts should be consistent.
However, the same instance may belong to different concepts and may have
different descriptions. The following condition should be satisfied:
Let instance i belong simultaneously to concept c with description (i, v) and to concept
c′ with description (i, v′). If Ac ∩ Ac′ ≠ ∅ then there should be v(a) ∩ v′(a) ≠ ∅ for each
a ∈ Ac ∩ Ac′.
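The Instance Integration Condition can be checked attribute by attribute: for every attribute shared by the two concepts, the two value sets must overlap. A minimal sketch, assuming an instance description is stored as a dict from attribute names to sets of elementary values (the helper name is ours):

```python
from typing import Dict, Set

Description = Dict[str, Set[str]]   # v: attribute -> subset of Va

def satisfies_iic(v: Description, v_prime: Description) -> bool:
    """IIC: for every attribute a in Ac ∩ Ac', v(a) ∩ v'(a) must be non-empty."""
    shared = set(v) & set(v_prime)
    return all(v[a] & v_prime[a] for a in shared)

# Example: the same instance described in two different concepts.
v1 = {"colour": {"red", "amber"}, "size": {"small"}}
v2 = {"colour": {"amber"}, "weight": {"light"}}
assert satisfies_iic(v1, v2)   # the shared attribute "colour" overlaps on "amber"
```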
For a real world (A, V) we define the following notions [12]. Let B ⊆ A.
• A complex tuple (or tuple for short) of type B is a function
r: B → 2^VB
such that r(a) ⊆ Va for all a ∈ B. Instead of r(a) we will write ra and a tuple of type
B will be written as rB.
A tuple r of type B may be written as a set:
r = {(a, ra): a ∈ B}.
• An elementary tuple of type B is a function
r: B → VB
such that r(a) ∈ Va for all a ∈ B. If Va = ∅ then r(a) = ε, where symbol ε represents
a special value used for case when the domain is empty.
The set of all elementary tuples of type B is denoted by E-TU(B).
An elementary tuple r of type B may also be written as a set:
r = {(a, ra): a ∈ B}.
• By symbol φ we denote the set of all empty tuples, i.e. tuples in which all values are
empty. By symbol φE we denote the set of all empty elementary tuples, i.e. those in
which all values are equal to ε.
• By symbol θ we denote the set of all partly empty tuples, i.e. those in which at least
one value is empty. The expression r∈θ means that in tuple r at least one attribute
value is empty, and r∉θ means that in tuple r all attribute values are non-empty. Of
course we have φ ⊂ θ. By symbol θE we denote the set of all partly empty elementary
tuples.
• The sum of two tuples r and r′ of type B is a tuple r″ of type B such that r″a = ra ∪ r′a
for each a ∈ B. This operation is written as

r″ = r ∪ r′.

More generally, the sum of two tuples r and r′ of types B and B′, respectively, is a
tuple r″ of type B″ = B ∪ B′ such that
δa(X,Y) = T(X,Y) / ∑x∈Va EAR(x),

where T(X,Y) represents the minimal cost needed for transforming set X into set Y, and

ρa(X,Y) = (N / (2N − 1)) · ∑z∈Va Sa(X,Y,z).
Owing to the above defined functions one can define a function d between tuples as a
combination of them.
As shown in [12], it has turned out that in this case the best criterion for determin-
ing value v is the following:
∑_{i=1}^{n} d(v, v_i) = min_{v′∈TYPE(A)} ∑_{i=1}^{n} d(v′, v_i),

where A = ∪_{i=1}^{n} A_i and d is a distance function between tuples.
Algorithms for determining consensus for the above problem can be found in [12].
These algorithms are dependent on the structure of attribute values. The value v which
is determined by one of these algorithms can be assumed to be the value of instance i
in the final ontology.
In many cases the integration task is not based on consensus determination. As shown
above, consensus calculation is most often based on solving an optimization problem.
However, note that optimization is only one of the possible conditions for the
integration result. As presented in the next section, we define several criteria for
instance description integration. These criteria represent some intuitive and rational
requirements for determining a description proper for an instance which has different
descriptions in different ontologies, or in the same ontology but in different concepts.
We define the problem of instance descriptions integration as follows:
Given a set of instance descriptions
∩_{i=1}^{n} t_i ≺ t*

t* = [∪_{i=1}^{n} t_i]_{T*}

where [∪_{i=1}^{n} t_i]_{T*} is the sum ∪_{i=1}^{n} t_i restricted to the attributes from set T*.
C4. Maximal similarity
Let da be a distance function between values of attribute a ∈ A; then the difference
between the integration t* and the profile elements should be minimal in the sense
that for each a ∈ T* the sum

∑_{r∈Z_a} d_a(t*_a, r),

where Z_a = {r_{ia}: r_{ia} is definite, i = 1, 2, …, n},

should be minimal.
∑_{r_{ia}∈X_a} d_a(v_a, r_{ia}) = min_{v′_a⊆V_a} ∑_{r_{ia}∈X_a} d_a(v′_a, r_{ia})
The most important step in the above algorithm is step 3, which for each attribute
determines its integration satisfying criterion C4. The integration problem of this type
has been formulated and analyzed in [12], where a set of postulates for integration has
been defined and algorithms for determining the integration have been worked out.
We can prove that the integration determined by this algorithm satisfies all criteria
C1, C2, C3 and C4. The computational complexity of this algorithm is O(n²).
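As a rough illustration of this step, the sketch below determines, for a single attribute, a value set minimizing the total distance to the given definite values. A normalized symmetric-difference distance stands in for the paper's da, and an exhaustive search over subsets of the attribute domain replaces the dedicated algorithms of [12]; both are simplifying assumptions for illustration only.

```python
from itertools import chain, combinations
from typing import FrozenSet, Iterable, List, Set

def symmetric_difference_distance(x: Set[str], y: Set[str]) -> float:
    """A simple stand-in for da: normalized size of the symmetric difference."""
    if not x and not y:
        return 0.0
    return len(x ^ y) / len(x | y)

def all_subsets(domain: Set[str]) -> Iterable[FrozenSet[str]]:
    items = sorted(domain)
    return (frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)))

def integrate_attribute(domain: Set[str], given: List[Set[str]]) -> Set[str]:
    """Return a value set v*a ⊆ Va minimizing the sum of distances to the
    definite values r1a, ..., rna (criterion C4), by exhaustive search."""
    best, best_cost = set(), float("inf")
    for candidate in all_subsets(domain):
        cost = sum(symmetric_difference_distance(set(candidate), r) for r in given)
        if cost < best_cost:
            best, best_cost = set(candidate), cost
    return best

# Example: three descriptions of the same instance disagree on one attribute;
# the consensus value set here is {"red", "amber"}.
print(integrate_attribute({"red", "amber", "green"},
                          [{"red"}, {"red", "amber"}, {"amber"}]))
```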
6 Related Works
The main problem of ontology integration is often formulated as follows: for given
ontologies O1, …, On one should determine one ontology which best represents them.
In this way we have to deal with ontology merging. For integrating traditional (non-
fuzzy) ontologies much work has been done, among others in [4]-[7], [17], [18].
The main contributions of these works are methods for matching concepts and their
relations. The number of works concerned with integrating ontologies on the instance
level is not large; many authors do not treat instances as a part of an ontology.
The notion of a fuzzy ontology is younger and there are not many works on this
subject; the same holds for fuzzy ontology integration. For this kind of ontologies
one can distinguish two groups of works. In the first one the authors propose
logic-based approaches, in which they try to couple fuzzy and distributed features
using description and fuzzy logics [3], [8]. They work out several discrete tableau
algorithms to achieve reasoning within this new logical system. In the second group
the authors propose a fuzzy ontology generation framework in which a concept
descriptor is represented by means of a fuzzy relation which encodes the degree of a
property value using a fuzzy membership function. The fuzzy ontology integration is
based on identifying the most likely location of particular concepts in the ontologies
to be merged. In [1], [2] the authors propose an approach to manage imprecise and
classic information in databases represented by fuzzy ontologies.
Generally, clear criteria for ontology integration are missing in the literature. Most of
the proposed algorithms refer to concrete ontologies and their justification is intuitive
rather than formal. The reason is that the semantics of concepts and their relations are
not clearly defined, but rather based on default values.
In this paper we propose to use consensus theory for fuzzy ontology integration on the
instance level. The advantage of this approach is that consensus methods are very
useful in processing many kinds of inconsistencies which very often appear in
integration tasks. Besides, consensus methods possess well-defined criteria.
As it is well known, ontologies even describing the same real world often contain
many inconsistencies because they have been created for different aims. Using con-
sensus methods for integrating fuzzy ontologies is novel since, to the best knowledge
of the authors, this approach is missing in the literature. For non-fuzzy ontology inte-
gration consensus methods have been used successfully [11], [19].
7 Conclusions
In this paper a method for integrating fuzzy instances has been proposed. The structure
for fuzzy instances is not based on traditional real-number degrees of instance
membership to classes, but on representing the uncertain and incomplete aspects by
means of the multi-valued structure. Future work will concern implementing the
proposed algorithm and verifying the method on real ontologies.
References
1. Abulaish, M., Dey, A.: A Fuzzy Ontology Generation Framework for Handling Uncertain-
ties and Non-uniformity in Domain Knowledge Description. In: Proceedings of the Inter-
national Conference on Computing: Theory and Applications, pp. 287–293. IEEE (2007)
2. Blanco, I.J., Vila, M.A., Martinez-Cruz, C.: The Use of Ontologies for Representing Data-
base Schemas of Fuzzy Information. International Journal of Intelligent Systems 23(4),
419–445 (2008)
3. Calegari, S., Ciucci, D.: Fuzzy Ontology, Fuzzy Description Logics and Fuzzy-OWL. In:
Masulli, F., Mitra, S., Pasi, G. (eds.) WILF 2007. LNCS (LNAI), vol. 4578, pp. 118–126.
Springer, Heidelberg (2007)
4. Duong, T.H., Nguyen, N.T., Jo, G.S.: A Method for Integration across Text Corpus and
WordNet-based Ontologies. In: IEEE/ACM/WI/IAT 2008 Workshops Proceedings, pp. 1–4.
IEEE CS (2008)
5. Duong, T.H., Jo, G.S., Jung, J.J., Nguyen, N.T.: Complexity Analysis of Ontology Integra-
tion Methodologies: A Comparative Study. Journal of Universal Computer Science 15(4),
877–897 (2009)
6. Duong, T.H., Nguyen, N.T., Jo, G.S.: A Method for Integrating Multiple Ontologies.
Cybernetics and Systems 40(2), 123–145 (2009)
7. Fernadez-Breis, J.T., Martinez-Bejar, R.: A Cooperative Framework for Integrating Ontol-
ogies. Int. J. Human-Computer Studies 56, 665–720 (2002)
8. Lu, J., Li, Y., Zhou, B., Kang, D., Zhang, Y.: Distributed Reasoning with Fuzzy Descrip-
tion Logics. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) ICCS 2007.
LNCS, vol. 4487, pp. 196–203. Springer, Heidelberg (2007)
9. Kemeny, J.G.: Mathematics without Numbers. Daedalus 88, 577–591 (1959)
10. Nguyen, N.T.: Using Distance Functions to Solve Representation Choice Problems. Fun-
damenta Informaticae 48, 295–314 (2001)
11. Nguyen, N.T.: A Method for Ontology Conflict Resolution and Integration on Relation
Level. Cybernetics and Systems 38(8), 781–797 (2007)
12. Nguyen, N.T.: Advanced methods for inconsistent knowledge management. Springer,
London (2008)
13. Nguyen, N.T.: Consensus system for solving conflicts in distributed systems. Journal of
Information Sciences 147, 91–122 (2002)
14. Nguyen, N.T.: Conflicts of Ontologies – Classification and Consensus-Based Methods for
Resolving. In: Gabrys, B., Howlett, R.J., Jain, L.C. (eds.) KES 2006, Part II. LNCS
(LNAI), vol. 4252, pp. 267–274. Springer, Heidelberg (2006)
15. Nguyen, N.T.: Inconsistency of Knowledge and Collective Intelligence. Cybernetics and
Systems 39(6), 542–562 (2008)
16. Nguyen, N.T., Truong, H.B.: A Consensus-Based Method for Fuzzy Ontology Integration.
In: Pan, J.-S., Chen, S.-M., Nguyen, N.T. (eds.) ICCCI 2010, Part II. LNCS (LNAI),
vol. 6422, pp. 480–489. Springer, Heidelberg (2010)
17. Noy, N.F., Musen, M.A.: SMART: Automated Support for Ontology Merging and Align-
ment. In: Proc. of the 12th Workshop on Knowledge Acquisition, Modelling and Manage-
ment (KAW 1999), Banff, Canada, pp. 1–20 (1999)
18. Pinto, H.S., Martins, J.P.: A Methodology for Ontology Integration. In: Proceedings of the
First International Conference on Knowledge Capture, pp. 131–138. ACM Press (2001)
19. Stephens, L.M., Huhns, M.N.: Consensus Ontologies: Reconciling the Semantics of Web
Pages and Agents. IEEE Internet Computing 5(5), 92–95 (2001)
Making Autonomous Robots Form Lines
1 Introduction
When we pass through terminals of an airport, we often see carts scattered in the
walkway and laborers manually collecting them one by one. It is a laborious task and
not a fascinating job. It would be much easier if carts were roughly gathered in any
way before the laborers begin to collect them. Multi-robot systems have made rapid
progress in various fields, and the core technologies of multi-robot systems are now
easily available [5]. Therefore, it is possible to give each cart a minimum of
intelligence, making each cart an autonomous robot. We realize that for such a system
cost is a significant issue and we address one of those costs, the power source. A big
powerful battery is heavy and expensive; therefore such intelligent cart systems with
small batteries are desirable. Thus energy saving is an important issue in such a
system [4]. Travelers pick up carts at designated points and leave them arbitrary
ACC algorithm, and other mobile agents gather the initial positions of the carts and
distribute the assembly positions. The fourth section describes how the collected cart
robots arrange themselves into lines. Through experiments, both with real robots and in
a simulator, we demonstrate that the proposed method is useful for making autonomous
robots form lines. The fifth section describes the simulator and demonstrates the
feasibility of our intelligent cart gathering system. In the section, we report the results
we have obtained from the experiments on the simulator. Finally, in the sixth section
we summarize the work and discuss future research directions.
2 Background
Kambayashi and Takimoto have proposed a framework for controlling intelligent
multiple robots using higher-order mobile agents [4][5]. The framework helps users to
construct intelligent robot control software by migration of mobile agents. Since the
migrating agents are higher-order, the control software can be hierarchically
assembled while they are running. Dynamically extending control software by the
migration of mobile agents enables them to make base control software relatively
simple, and to add functionalities one by one as they learn about the working environment.
Thus they do not have to make the intelligent robot smart from the beginning or make
the robot learn by itself. They can send intelligence later as new agents. Even though
they demonstrate the usefulness of the dynamic extension of the robot control
software by using the higher-order mobile agents, such higher-order property is not
necessary in our setting. They have implemented a team of cooperative search robots
to show the effectiveness of their framework, and demonstrated that their framework
contributes to energy saving of multiple robots [4]. They have achieved significant
saving of energy.
Deneubourg has formulated a biology-inspired behavioral algorithm that
simulates the ant corpse-gathering and brood-sorting behaviors [1]. Lumer has
improved Deneubourg's model and proposed a new simulation model called
Ant Colony Clustering [6]. His method can cluster similar objects into a few
groups. Kambayashi et al. have improved Lumer's model and proposed a multi-robot
control method using ACC [2]. This method does not cluster similar objects but
clusters nearby objects by using pheromones. Kambayashi et al. have designed an intelligent
cart system based on the proposed method [3]. As mentioned above, previous studies
could collect mobile robots roughly using ACC. In this paper, we present a technique
to form a series of short lines after collecting mobile robots.
3 System Model
Our system model consists of carts and a few kinds of static and mobile software
agents. All the controls for the mobile carts as well as ACC computation performed in
the host computer are achieved through the static and mobile agents. They are: 1) user
interface agent (UIA), 2) operation agents (OA), 3) position collecting agent (PCA),
4) clustering simulation agent (CSA), and 5) driving agents (DA). All the software
agents except UIA and CSA are mobile agents. A mobile agent traverses carts scattered
in the field one by one to collect their coordinates. After receiving the assembly positions
computed by a static agent, many mobile agents migrate to the carts and drive them to the
assembly positions. Fig. 1 shows the interactions of the cooperative agents to control an
intelligent cart. The following are the details of each agent:
1) User Interface Agent (UIA): The user interface agent (UIA) is a static agent that
resides on the host computer and interacts with the user. It is expected to
coordinate the entire agent system. When the user creates this agent with a list
of IP addresses of the intelligent carts, UIA creates PCA and passes the list to it.
2) Operation Agent (OA): Each cart has at least one operation agent (OA). It has
the task that the cart on which it resides is supposed to perform. Each intelligent
cart has its own OA. Currently all operation agents (OA) have a function for
collision avoidance and a function to sense RFID tags embedded in the floor
carpet to detect its precise coordinates in the field.
3) Position Collecting Agent (PCA): A distinct agent called position collecting agent
(PCA) traverses carts scattered in the field one by one and collects their
coordinates. PCA is created and dispatched by UIA. Upon returning to the host
computer, it hands the collected coordinates to the clustering simulation agent
(CSA) for ACC.
4) Clustering Simulation Agent (CSA): The host computer houses the static clustering
simulation agent (CSA). This agent actually performs the ACC algorithm by using
the coordinates collected by PCA as the initial positions, and produces the quasi-
optimal assembly positions of the carts, and then performs yet another simulation
to produce instructions each cart follows to reach its assigned goal position. Upon
terminating the simulation and producing the procedures for all the intelligent
carts, CSA creates a number of driving agents (DA).
5) Driving Agent (DA): The quasi-optimal arrangement coordinates, as well as
procedures to reach them, produced by the CSA are delivered by driving agents
(DA). One driving agent is created for each intelligent cart, and it contains the
set of procedures for the cart. The DA drives its intelligent cart to the designated
assembly position. DA is the intelligent part of the intelligent cart.
OA detects the current coordinate of the cart on which it resides. Each cart has its own
IP address and UIA hands in the list of the IP addresses to PCA. First, PCA migrates
to an arbitrary cart and starts hopping between them one by one. It communicates
locally with OA, and writes the coordinates of the cart into its own local data area.
When PCA gets all the coordinates of the carts, it returns to host computer. Upon
returning to the host computer, PCA creates CSA and hands in the coordinate data to
CSA which computes the ACC algorithm.
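The coordinate-collection and planning flow described above might be summarized, ignoring the actual agent migration machinery, as the following sketch; the function and parameter names are hypothetical stand-ins for the mobile-agent framework, which the paper does not show.

```python
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]

def collect_positions(cart_addresses: List[str],
                      read_position) -> Dict[str, Coordinate]:
    """PCA role: hop over the carts one by one and gather their coordinates.
    `read_position` stands in for the local communication with each cart's OA."""
    return {ip: read_position(ip) for ip in cart_addresses}

def plan_assembly(positions: Dict[str, Coordinate],
                  run_acc) -> Dict[str, List[str]]:
    """CSA role: run the ACC simulation on the collected coordinates and
    produce, per cart, the list of driving instructions handed to its DA."""
    return run_acc(positions)

def gather_carts(cart_addresses, read_position, run_acc, dispatch_da) -> None:
    """End-to-end flow started by UIA: coordinates in, per-cart procedures out."""
    positions = collect_positions(cart_addresses, read_position)      # PCA
    procedures = plan_assembly(positions, run_acc)                    # CSA
    for ip, steps in procedures.items():                              # one DA per cart
        dispatch_da(ip, steps)
```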
The current implementation employs RFID (Radio Frequency Identification) to get
precise coordinates [9]. We set RFID tags in a regular grid shape under the floor
carpet tiles. The tags we chose have a small range so that the position-collecting agent
can obtain fairly precise coordinates from the tags. Also, the cart has a basic collision
avoidance mechanism using infrared sensors.
CSA is the other static agent and its sole role is ACC computation. When CSA
receives the coordinate data of all the carts, it translates the coordinates into
coordinates for simulation, and then performs the clustering. When CSA finishes the
computation and produces a set of assembly positions, it then creates the set of
procedures for autonomous cart movements.
Then CSA creates DA that conveys the set of procedures to the intelligent carts as
many. Each DA receives its destination IP address from PCA, and the set of
procedures for the destination cart, and then migrates to the destination cart. Each DA
has a set of driving procedures that drives its assigned cart to the destination, while it
avoids collision. OA has the basic collision detection and avoidance procedures, and
DA has task-specific collision avoidance procedures.
4 Forming Lines
As shown in Fig. 2, we have successfully collected robots at quasi-optimal positions
from scattered robots in the field using ACC. Simple clusters of robots are not very
helpful for practical purposes; we need to arrange them in order in some way.
In this section, we discuss how to arrange the collected cart robots into chunks of
short lines. The alignment method we propose consists of two major phases. In order
to achieve line-forming, we extend the DAs so that they can interact with the DAs on
nearby neighbor robots. In the first phase, nearby robots with similar directions
arrange themselves in short lines. In the second phase, isolated robots attempt to
adhere to already formed lines of robots. This phase should make all the robots form
into groups. We describe the two phases in detail below.
(2) When the cloned agent arrives at robot B, it overrides the DA on robot B and
starts to drive it. This time, a combination of location information and information
from the web camera is used to achieve the task, as in the first phase. Since the cloned
agent comes from robot A, it knows the color of the board of robot A, so driving
robot B forward and making it adhere to robot A is straightforward.
(3) Since robot B is the head of a line of robots (e.g. robots B, C, and D form a line
as a group), we need to make all of them go forward. This is done through the
dispatched agent's migration, as shown in Fig. 4. For example, in Fig. 4, after the
agent drives robot B to the back of robot A, it migrates to robot C, and then to
robot D. When it finds there are no more robots to drive, it simply kills itself.
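The migration pattern of step (3) amounts to a simple walk down the line: drive the robot the agent currently occupies, migrate to the next one, and terminate when none is left. The robot interface below (drive_forward, next_in_line) is hypothetical and introduced only for this illustration.

```python
from typing import Optional

class LineRobot:
    """Minimal stand-in for a cart robot in an already formed line."""
    def __init__(self, name: str, next_in_line: Optional["LineRobot"] = None):
        self.name = name
        self.next_in_line = next_in_line

    def drive_forward(self) -> None:
        print(f"{self.name}: moving up behind the robot ahead")

def drive_line(head: LineRobot) -> None:
    """Emulate the dispatched agent: drive the current robot, migrate to the
    next one in the line, and terminate when no robot is left to drive."""
    robot: Optional[LineRobot] = head
    while robot is not None:
        robot.drive_forward()       # drive the robot the agent currently resides on
        robot = robot.next_in_line  # migrate to the next robot in the line

# Example line B -> C -> D, as in the scenario of Fig. 4.
d = LineRobot("robot D")
c = LineRobot("robot C", d)
b = LineRobot("robot B", c)
drive_line(b)
```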
Fig. 5 shows the experiment for the first phase by using two real robots.
5 Experiments
We have conducted two kinds of experiments to demonstrate that our framework for
the intelligent robot carts is feasible. The first one checks whether two relatively close
robots can form a line. As shown in Fig. 5, this is achieved successfully. Since the
task mainly depends on the web camera, the robots need to be quite close to each
other, but this should be acceptable: considering the cart application, the system
should require only cheap web cameras. Isolated remaining robots can be assembled
in the second phase.
For the second phase, we have built a simulator. As shown in Fig. 6, the field is a
50 × 50 square grid. The number of placed robots is 50. The robots' initial positions
are randomly determined. As shown in Fig. 6, quite a few of the robots get together to
form short lines. Even though they are far from perfect, forming several short lines is
one big leap from the randomly assembled clusters in the previous work.
Fig. 7. Scatter plots of the robot directions (0–360 degrees) before and after alignment
6 Summary
This paper presents a framework for controlling robots, which are connected to a
communication network, so that they move into groups. The framework first applies
ACC to the mobile robots scattered in the field so that they gather at quasi-optimal
positions, and fine alignment is then performed. Through experiments we have
demonstrated that our algorithm assembles the robots into several short lines.
Considering the cart application, making scattered robots form several short lines is a
big leap toward a practical system. Shintani et al. conduct a similar research project
[10]. They mainly use a high-quality web camera to obtain the coordinates of other
robots, and two kinds of mobile agents, namely pheromone agents and driving agents.
They achieve a more elaborate system for serializing robots, but use more expensive
equipment. We believe our approach, i.e. using cheap equipment, is more practical.
The experiments we have conducted for a large number of robots are through a
simulator, not actual robots. The simulator, however, shows good results, and we have
actually conducted another project that builds real intelligent shopping carts that
follow the users [11]. We can combine those achievements into one big project so that
we can build a practical intelligent cart robot for assisting both the users and the
collecting laborers.
References
1. Deneubourg, J., Goss, S., Franks, N., Sendova-Franks, A., Detrain, C., Chretien, L.: The
Dynamics of Collective Sorting: Robot-Like Ant and Ant-Like Robot. In: First Conference
on Simulation of Adaptive Behavior: From Animals to Animats, pp. 356–363. MIT Press
(1991)
2. Kambayashi, Y., Ugajin, M., Sato, O., Tsujimura, Y., Yamachi, H., Takimoto, M.,
Yamamoto, H.: Integrating Ant Colony Clustering Method to a Multi-Robot System Using
Mobile Agents. Industrial Engineering and Management System 8(3), 181–193 (2009)
3. Kambayashi, Y., Ugajin, M., Sato, O., Takimoto, M.: Design of an Intelligent Cart System
for Common Airports. In: 13th IEEE International Symposium Consumer Electronics,
CD-ROM (2009)
4. Takimoto, M., Mizuno, M., Kurio, M., Kambayashi, Y.: Saving Energy Consumption of
Multi-robots using Higher-Order Mobile Agents. In: Nguyen, N.T., Grzech, A., Howlett,
R.J., Jain, L.C. (eds.) KES-AMSTA 2007. LNCS (LNAI), vol. 4496, pp. 549–558.
Springer, Heidelberg (2007)
5. Kambayashi, Y., Takimoto, M.: Higher-order mobile agents for controlling intelligent
robots. International Journal of Intelligent Information Technologies 1(2), 28–42 (2005)
6. Lumer, E.D., Faieta, B.: Diversity and adaptation in populations of clustering ants, from
animals to animats 3. In: 3rd International Conference on the Simulation of Adaptive
Behavior, pp. 501–508. MIT Press, Cambridge (1994)
7. Kambayashi, Y., Yamachi, H., Takimoto, M.: A Feasibility Study of the Intelligent Cart
System. In: SICE Annual Conference, pp. 1159–1163 (2010)
8. Kambayashi, Y., Yamachi, H., Takimoto, M.: A Search for Practical Implementation of
the Intelligent Cart System. In: International Congress on Computer Applications and
Computational Science, pp. 895–898 (2010)
9. Kambayashi, Y., Takimoto, M.: Location of Intelligent Carts Using RFID. In: Turcu, C.
(ed.) Deploying RFID: Challenges, Solutions, and Open Issues, pp. 249–264. InTech,
Rijeka (2011)
10. Shintani, M., Lee, S., Takimoto, M., Kambayashi, Y.: A Serialization Algorithm for Mobile
Robots Using Mobile Agents with Distributed Ant Colony Clustering. In: König, A.,
Dengel, A., Hinkelmann, K., Kise, K., Howlett, R.J., Jain, L.C. (eds.) KES 2011, Part I. LNCS
(LNAI), vol. 6881, pp. 260–270. Springer, Heidelberg (2011)
11. Kohtsuka, T., Onozato, T., Tamura, H., Katayama, S., Kambayashi, Y.: Design of a Control
System for Robot Shopping Carts. In: König, A., Dengel, A., Hinkelmann, K., Kise, K.,
Howlett, R.J., Jain, L.C. (eds.) KES 2011, Part I. LNCS (LNAI), vol. 6881, pp. 280–288.
Springer, Heidelberg (2011)
Genetic Algorithm-Based Charging Task
Scheduler for Electric Vehicles
in Smart Transportation
1 Introduction
energy exchanges. Even though much research aims at improving the driving range
while cutting down the charging time, EVs still need to be charged frequently, and a
charge takes at least tens of minutes [3]. Accordingly, it is necessary to build a
nationwide charging infrastructure which embraces fast charging stations, battery
swapping stations, and individual charging points for slower charging [2].
For the large-scale deployment of EVs, the smart grid can have a centralized
management architecture and also local control entities working in administrative
units such as a building or a charging station. EV penetration will put increased
pressure on peaking units if no charging control strategy is in place [4]. Moreover,
some grids are likely to build additional generation capacity to catch up with the
increased power demand resulting from concentrated EV charging.
To cope with this problem, a grid unit can run demand management programs,
which are becoming more important to meet the customer requirement as well as
to achieve system goals like peak load reduction, power cost saving, and energy
efficiency. Demand response schemes can shed or shift peak load according to
the load type, mainly by scheduling the given charging tasks [1]. Moreover, via
the global network connection, they can further consider price signal change and
current load condition.
The scheduling problem for charging tasks is quite similar to process schedul-
ing in the real-time operating system, as each charging operation can be modeled
as a task having an execution time, a start time, and a deadline [5]. The difference lies
in that charging tasks can run in parallel as long as the total power does not exceed
the provisioned capacity. Task scheduling is in most cases a complex, time-consuming
problem sensitive to the number of tasks. It is difficult to solve by conventional
optimization schemes, and their excessive execution time makes them impractical in
a real system. For such a problem, genetic algorithms can provide efficient search
techniques based on principles of natural selection and genetics. However, their
convergence and performance are inherently affected by the initial generation,
especially when the solution space is large and
complex [6]. If the initial population has better quality, the genetic algorithm is
more likely to yield a better result.
Moreover, heuristic-based approaches also gain computational performance,
possibly at the cost of accuracy, mainly by taking advantage of empirical intelligence.
Provided that the heuristic is fast and efficient enough, it is possible to include its
solutions in the initial population and run a genetic algorithm to further improve the
quality of feasible solutions. This strategy can combine the time-efficiency of
heuristic-based approaches and the iterative evolution of genetic algorithms. Based on
this motivation, this paper designs an efficient charging task scheduler for EVs,
aiming at reducing the peak load in a charging station and thus remedying the problem
of power demand concentration in a specific time interval. Here, a fast heuristic is
available from our previous work [7], which finds reasonable solutions with
polynomial time complexity in the number of charging tasks.
facility. Receiving the request, the scheduler prepares the power load profile of
the vehicle type from the well-known vehicle information database. The load
profile, or interchangeably consumption profile, contains the power consumption
dynamics along the time axis for specific EV charging. Then, the station checks
whether it can meet the requirement of the new request without violating the
constraints of already admitted requests. The result is delivered back to the
vehicle, and the driver may accept the schedule, attempt a renegotiation, or
choose another station. The accuracy of the load profile is critical to correct schedule
generation, but we assume that a sufficiently accurate profile is available, so as to
focus on the scheduling scheme [4].
From the vehicle-side viewpoint, on entering the station, the vehicle is assigned
and plugged into a charger as shown in Figure 1. The controller connects or
disconnects power to each vehicle according to the schedule generated by either a
scheduler within a charging station or a remote charging server running in the
Internet. Nowadays, a high-capacity server in the cloud can provide computing
power to time-intensive applications. The SAE J1772 series define the standard
for electric connectors and their charging system architecture, including physi-
cal, electrical, communication protocol, and performance requirement [10]. Our
scheduler can work with this standard as a core service for electric vehicles. It
must be mentioned that the interaction between the scheduler and EVs can be
performed through the specific vehicle network, such as cellular networks, ve-
hicle ad hoc networks, or a combination of them under the control of vehicle
telematics system.
[Fig. 1: scheduler and controller connected via the power line, with on-arrival requests received over the global network]
it can wait with its charging interface connected to the jack. Here, we assume
that available jacks and waiting space are enough to accommodate all arriving
vehicles. Even if an EV doesn’t arrive within Ai , the scheduler can generate a
new allocation promptly if the computation time is not excessive.
For a task, the power consumption behavior can vary according to the charging
stage, remaining amount, vehicle type, and the like. The power consumption
profile is practical for characterizing the consumption dynamics along the battery
charging stage. This profile is the basic information in generating a charging
schedule. In the profile, the power demand is aligned to the fixed-size time slot,
during which the power consumption is constant considering the availability
of automatic voltage regulation [11]. The slot length can be also affected by
power charging dynamics. The length of a time slot can be tuned according to
the system requirement on the schedule granularity and the computing time,
and likely coincides with the time unit generally used in the real-time price
signal. The charging operation can start only at the slot boundary for scheduling
efficiency.
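Under this slot-aligned model, the quantity the scheduler ultimately tries to minimize is the peak of the summed per-slot demand. A minimal sketch, assuming each task is given as a per-slot consumption profile plus a chosen start slot:

```python
from typing import List, Tuple

Task = Tuple[List[float], int]   # (per-slot power profile, chosen start slot)

def load_curve(tasks: List[Task], num_slots: int) -> List[float]:
    """Sum the per-slot power demand of all charging tasks over the window."""
    load = [0.0] * num_slots
    for profile, start in tasks:
        for offset, power in enumerate(profile):
            load[start + offset] += power
    return load

def peak_load(tasks: List[Task], num_slots: int) -> float:
    return max(load_curve(tasks, num_slots))

# Two tasks on a 6-slot window; their overlap at slot 3 produces the peak.
tasks = [([2.0, 3.0, 1.0], 0), ([1.0, 4.0], 2)]
print(load_curve(tasks, 6))   # [2.0, 3.0, 2.0, 4.0, 0.0, 0.0]
print(peak_load(tasks, 6))    # 4.0
```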
First, the slack is defined as the difference between the deadline and the last start
time that meets the time constraint. The larger the slack, the more options a task has
in its schedule generation. As for ordering by slack, tasks having fewer options are
placed first. Then, those tasks having relatively more options are more likely to find
slots having less power consumption. Next, the task having a longer operation length
fills more table entries. If smaller-length tasks are allocated first, the longer tasks will
smooth the peak. Last, we can order the tasks according to their weight, or per-slot
power demand. The weight of a task is the average power demand during its operation
time. If tasks demanding more power meet at the same time slot, the peak will get too
large. To avoid this situation, our scheme allocates those tasks first, as the next
allocation can distribute the power-intensive slots. While [7] selects the best one out
of those allocations, this paper includes all of them in the initial population of a
genetic algorithm. The rest of the population is filled randomly as usual.
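The seeding idea might be sketched as follows: each ordering criterion (slack, operation length, per-slot weight) yields one greedy allocation, these allocations become chromosomes of the initial population, and the remainder is filled with random feasible ones. The task fields and the greedy placement below follow our reading of the description and of [7], so they are an approximation rather than the exact heuristic.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ChargingTask:
    arrival: int            # Ai: earliest start slot
    deadline: int           # Di: slot by which charging must finish
    profile: List[float]    # per-slot power demand (length = operation length)

    @property
    def slack(self) -> int:
        return (self.deadline - self.arrival) - len(self.profile)

    @property
    def weight(self) -> float:
        return sum(self.profile) / len(self.profile)

def greedy_allocation(tasks: List[ChargingTask], order: Callable,
                      num_slots: int) -> List[int]:
    """Place tasks one by one (in the given order), each at the feasible start
    slot that keeps the local peak load lowest."""
    load = [0.0] * num_slots
    starts = [0] * len(tasks)
    for idx in sorted(range(len(tasks)), key=lambda i: order(tasks[i])):
        t = tasks[idx]
        best_start, best_peak = t.arrival, float("inf")
        for s in range(t.arrival, t.deadline - len(t.profile) + 1):
            peak = max(load[s + k] + p for k, p in enumerate(t.profile))
            if peak < best_peak:
                best_start, best_peak = s, peak
        starts[idx] = best_start
        for k, p in enumerate(t.profile):
            load[best_start + k] += p
    return starts

def random_allocation(tasks: List[ChargingTask]) -> List[int]:
    return [random.randint(t.arrival, t.deadline - len(t.profile)) for t in tasks]

def initial_population(tasks: List[ChargingTask], num_slots: int,
                       size: int) -> List[List[int]]:
    seeds = [greedy_allocation(tasks, key, num_slots)
             for key in (lambda t: t.slack,           # fewer options first
                         lambda t: -len(t.profile),   # longer operations first
                         lambda t: -t.weight)]        # power-hungry tasks first
    return seeds + [random_allocation(tasks) for _ in range(size - len(seeds))]
```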
The crossover operation randomly selects a pair of crossover points and swaps the
substrings from each parent. Reproduction may generate the same chromosome as
existing ones in the population. It is meaningless to have multiple instances of a single
schedule, so such duplicates are replaced by new random ones. Additionally, mutation
would exchange two elements of a chromosome; however, each element has a
different permissible range, so this mutation is prohibited. The charging scheduler is
subject to a time constraint, but this constraint can always be met, as the scheduler
selects the allocation vector only within the valid range.
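The genetic operations described above (two-point crossover, and replacement of duplicate chromosomes by fresh random ones, with per-gene validity preserved because every gene is inherited from a feasible parent) might look like the sketch below; the fitness function and population management are our own minimal choices, not the authors' implementation.

```python
import random
from typing import Callable, List

Chromosome = List[int]   # one start slot per charging task

def two_point_crossover(p1: Chromosome, p2: Chromosome) -> Chromosome:
    """Swap the substring between two random cut points of the parents."""
    a, b = sorted(random.sample(range(len(p1) + 1), 2))
    return p1[:a] + p2[a:b] + p1[b:]

def evolve(population: List[Chromosome],
           fitness: Callable[[Chromosome], float],   # e.g. negative peak load
           fresh_random: Callable[[], Chromosome],   # new random feasible schedule
           iterations: int = 1000) -> Chromosome:
    for _ in range(iterations):
        population.sort(key=fitness, reverse=True)
        parents = population[: len(population) // 2]
        children = [two_point_crossover(*random.sample(parents, 2))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
        # Duplicates of an existing schedule carry no information: replace them.
        seen, unique = set(), []
        for ch in population:
            key = tuple(ch)
            if key in seen:
                unique.append(fresh_random())
            else:
                seen.add(key)
                unique.append(ch)
        population = unique
    return max(population, key=fitness)
```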
4 Performance Measurement
This section implements the proposed allocation method using Visual C++ 6.0,
making it run on the platform equipped with Intel Core2 Duo CPU, 3.0 GB
memory, and Windows Vista operating system. M is set to 20, so if a single
time unit is 10 min, the total scheduling window will be 200 min. For a task,
the start time is selected randomly between 0 and M − 1, while the operation length is
exponentially distributed. A task is discarded and replaced if its finish time, namely
the sum of the start time and the operation length, exceeds M.
In addition, the power level for each time slot ranges from 1 through 5. The
power scale is not explicitly specified in this experiment, as it is a relative-value
term. For each parameter setting, 50 tasks are generated, and the results are
averaged.
For performance comparison, we select three policies. First, the Smallest allocation
is our heuristic [7], which is actually the best of the initial population in the proposed
scheme. Second, the Genetic allocation runs general genetic operations iteratively
with a randomly set initial population. Third, the Random allocation generates
feasible schedules using random numbers during approximately the same time interval
needed to execute the proposed scheme. It randomly selects Ui out of Di − Ai slots.
The Random selection works quite well, as the candidate allocation is selected only
within the valid range for each set. Particularly, if the difference between Ui and
Di − Ai is small for some tasks and the number of tasks is small, this scheme can
sometimes be efficient. Finally, for fair comparison, the same task set is given to the
four schemes in each parameter setting.
The first experiment measures the peak load reduction according to the num-
ber of tasks. In Genetic and proposed schemes, the population size is set to 60
and the number of iterations is set to 1,000. The operation length and the slack of a
task are exponentially distributed with averages of 5.0 and 2.0, respectively, while the
number of tasks ranges from 5 to 20. Figure 2 plots the peak load for each allocation
scheme. The Random, Smallest, and Genetic schemes show almost the same
performance, while the proposed scheme reduces the peak load by up to 4.9 % for the
20-task case. When there are fewer than 10 tasks, the Random scheme yields a slightly
better result than the Genetic and Smallest schemes, as it can try a lot of allocations
benefiting from its simplicity. However, beyond 10 tasks, the others are better.
[Fig. 2: peak load vs. number of tasks. Fig. 3: peak load vs. iterations. Curves: Random, Genetic, Smallest, Proposed]
The next experiment measures the effect of the number of iterations, and the result
is exhibited in Figure 3. Here, the other parameters are the same as in the previous
experiment, but the number of tasks is fixed to 15. The Smallest scheme is not affected
by the number of iterations. This scheme is incomparably fast, but it shows almost the
same performance as the Genetic and Random schemes. The proposed scheme,
basically outperforming the others by about 3.5 %, can further reduce the peak load by
1.5 % through the extended iterations, namely from 500 to 3,000, while the Genetic
scheme gains 0.5 % and the Random scheme 0.3 %. In our observation, most task sets
reach stable peak loads before 200 iterations, and are scarcely improved further.
[Fig. 4: peak load vs. population size. Fig. 5: peak load vs. average slack. Curves: Random, Genetic, Smallest, Proposed]
In addition, Figure 4 plots the effect of the population size on the peak load. The
Smallest scheme shows the poorest performance, while our scheme improves the peak
load by about 3.4 % over the whole range. The peak load largely decreases as the
population size increases, except when it is 100, which may result from an
extraordinary peak load in some task sets. In any case, Figure 4 shows that the effect
of the population size is not very significant in vehicle charging scheduling.
Moreover, Figure 5 shows the peak load when the slack changes from 2.0 to 5.0
slots. The larger the slack, the more flexible a schedule we can get. For the given slack
range, our scheme gets a 2.6 % reduction in the peak load, while the others get 3.0 %,
2.9 %, and 2.5 %. All four schemes show almost the same peak-load change pattern.
Finally, the number of ordered allocations inserted into the initial population,
namely the combination ratio, is an additional performance parameter. Figure 6 plots
the result. When the combination ratio is 0, the proposed scheme is identical to the
Genetic scheme, as the initial population is selected purely at random. Additionally,
the Smallest scheme would then be the same as the uncoordinated allocation, which
allocates an operation as soon as its task is ready, without permitting preemption.
Except when the combination ratio is 0, the proposed scheme does not change much,
just slightly reducing the peak load. Over the whole range, the proposed scheme
outperforms the others by about 2.6 %. Chromosomes that are not the best in the
heuristic can still contribute to improving the quality of the next generations.
[Fig. 6: peak load vs. combination ratio. Curves: Random, Genetic, Smallest, Proposed]
5 Conclusions
This paper has designed an efficient EV charging scheduler which combines the
time-efficiency of heuristic-based approaches and the evolutionary iterations of
genetic algorithms. It selects the initial population of a genetic algorithm from a
time-efficient heuristic, and then regular genetic operations are applied. The scheduler
aims at reducing the peak power consumption in a charging station while meeting the
time constraints of all charging tasks. With this scheduler, the concentrated charging
problem can be relieved for the large-scale deployment of EVs, while a charging
station can provide a reservation service to EVs, which can make a routing plan
according to their interaction with charging stations. The performance measurement
results obtained from a prototype implementation show that our scheme outperforms
the Random, Genetic, and Smallest schemes, reducing the peak load for the given
charging task sets by up to 4.9 %. As future work, we are going to design an EV
telematics framework which includes an efficient path planning scheme capable of
integrating a charging schedule.
References
1. Gellings, C.: The Smart Grid: Enabling Energy Efficiency and Demand Response.
The Fairmont Press (2009)
2. Lopes, J., Soares, F., Almeida, P.: Integration of Electric Vehicles in the Electric
Power System. Proceedings of the IEEE, 168–183 (2011)
3. Markel, T., Simpson, A.: Plug-in Hybrid Electric Vehicle Energy Storage System
Design. In: Advanced Automotive Battery Conference (2006)
4. Shao, S., Zhang, T., Pipattanasomporn, M., Rahman, S.: Impact of TOU Rates on
Distribution Load Shapes in a Smart Grid with PHEV Penetration. In: Transmis-
sion and Distribution Conference and Exposition, pp. 1-6 (2010)
5. Facchinetti, T., Bibi, E., Bertogna, M.: Reducing the Peak Power through Real-
Time Scheduling Techniques in Cyber-Physical Energy Systems. In: First Interna-
tional Workshop on Energy Aware Design and Analysis of Cyber Physical Systems
(2010)
6. Togan, V., Dalgoglu, A.: An Improved Genetic Algorithm with Initial Population
Strategy and Self-Adaptive Member Grouping. Computer & Structures, 1204–1218
(2008)
7. Lee, J., Kim, H., Park, G., Jeon, H.: Fast Scheduling Policy for Electric Vehicle
Charging Stations in Smart Transportation. In: ACM Research in Applied Com-
putation Symposium, pp. 110–112 (2011)
8. Morrow, K., Karner, D., Francfort, J.: Plug-in Hybrid Electric Vehicle Charging
Infrastructure Review. In: Battelle Energy Alliance (2008)
9. Diaz-Gomez, P., Hougan, D.: Initial Population for Genetic Algorithms: A Metric
Approach. In: International Conference on Genetic and Evolutionary Methods,
pp. 43–49 (2007)
10. Toepfer, C.: SAE Electric Vehicle Conductive Charge Coupler, SAE J1772. Society
of Automotive Engineers (2009)
11. Sortomme, E., Hindi, M., MacPherson, S., Venkata, S.: Coordinated Charging of
Plug-in Hybrid Electric Vehicles to Minimize Distribution System Losses. IEEE
Transactions on Smart Grid, 198–205 (2011)
12. Lee, J., Park, G.-L., Kang, M.-J., Kwak, H.-Y., Lee, S.J.: Design of a Power
Scheduler Based on the Heuristic for Preemptive Appliances. In: Nguyen, N.T.,
Kim, C.-G., Janiak, A. (eds.) ACIIDS 2011, Part I. LNCS (LNAI), vol. 6591,
pp. 396–405. Springer, Heidelberg (2011)
13. Lee, J., Park, G.-L., Kwak, H.-Y., Jeon, H.: Design of an Energy Consumption
Scheduler Based on Genetic Algorithms in the Smart Grid. In: Jędrzejowicz, P.,
Nguyen, N.T., Hoang, K. (eds.) ICCCI 2011, Part I. LNCS, vol. 6922, pp. 438–447.
Springer, Heidelberg (2011)
Self-Organizing Reinforcement Learning Model
1 Introduction
Machine learning can be generally classified into supervised, unsupervised and
reinforcement learning (RL). Supervised learning requires a clear input and output
form for comparison, and the goal is to construct a mapping from one to the other.
Unsupervised learning has no concept of mapping and only processes input data to
find potential classifications. In contrast, RL uses a scalar reward signal to evaluate
input-output pairs and, through trial and error, optimizes the selected action for each
input. RL can be considered an intermediary between supervised and unsupervised
learning. This learning method is most suitable for finding an optimal input-output
mapping without prior knowledge. These three learning paradigms are often
combined to solve problems, as in the auto-associative multi-layer perceptron [1] and
SRV units [2].
Q-learning is a reinforcement learning technique that relies on learning an action-value
function giving the expected utility of taking a given action in a given state and
following a fixed policy thereafter. One of the strengths of Q-learning is the ability to
compare the expected utilities of the available actions without requiring a model of
the environment. Q-learning is an iterative, incremental and interactive algorithm with
a simple update rule for the Q-value, easy implementation and a clear representation.
These computational features of Q-learning seem to make neural networks a natural
implementation choice for RL applications [3]. To combine RL theory with the
2 Reinforcement Learning
RL theory has provided an exhaustive framework for solving Markov decision
problems. The goal is to maximize a scalar reward signal over the state transitions
when the underlying model dynamics are unknown. These methods are classified as
Temporal Difference (TD) learning, and Q-learning is one of the most powerful
among them.
As explained by the two pioneers of reinforcement learning, Richard S. Sutton and
Andrew G. Barto, in their Reinforcement Learning book (1998): Reinforcement
Learning is best understood by stating the problem that we want to solve [5]. The
problem is learning to achieve a goal solely from interaction with the environment.
The decision maker or learning element of RL is called an agent. The interactions
between the agent and the environment are depicted in Fig. 1. The agent selects
actions and the environment reacts, changes the state and provides rewards to the
agent indicating the degree of goodness of the action made by the agent. The mission
of the agent is to maximize the rewards received, through trial and error.
To evaluate the goodness of a selected action within a definite state, a value
function Q(s, a) is defined, which maps state-action pairs to a numerical value. While
interacting with the environment, the agent must update this function to reflect how
well the chosen actions progress, based on the rewards received in the environment.
How to update the values of the state-action pairs is a pivotal issue of RL [6].
Fig. 1. Agent-Environment Interaction: the agent selects actions at each time step. The environment
updates the state and defines a numerical reward.
2.1 Q-Learning
Q-learning behaves as follows: an agent tries an action at a particular state, and then
evaluates its consequences in terms of the received reward immediately after taking
an action [7].
Q-learning requires a value table as in Fig. 2, where each entry represents the
maximum expected reward for the corresponding state-action pair. The entries of the
table are called Q-values, and optimization is then a simple matter of selecting, at each
time step, the action with the highest Q-value for the current state. The only
requirement for convergence is that all state-action pairs are updated constantly. The
Q-value is updated according to the following rule:

Q(s_t, a_t) ← Q(s_t, a_t) + α [ r_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t) ]   (1)
Fig. 2. Q-table: each table entry contains the predicted-Q value for each state-action pair
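For concreteness, the update rule (1) and the greedy selection over such a Q-table can be sketched as follows; the ε-greedy exploration and the values of the learning rate α and discount factor γ are illustrative assumptions, not parameters taken from the paper.

```python
import random
from collections import defaultdict

class QTable:
    """Tabular Q-learning: Q(s, a) is moved towards r + γ · max_a' Q(s', a')."""
    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> Q-value, zero by default
        self.actions, self.alpha, self.gamma, self.epsilon = actions, alpha, gamma, epsilon

    def select(self, state):
        if random.random() < self.epsilon:                                 # explore
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])         # exploit

    def update(self, state, action, reward, next_state):
        target = reward + self.gamma * max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```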
3 Self-Organizing Map

The Self-Organising Map (SOM) was first conceived by Kohonen in 1982. Kohonen
describes the SOM as an "ordered, nonlinear, smooth mapping of high-dimensional
input data manifolds onto the elements of a regular, low-dimensional array" [8]. It has
a recursive nature, and during each recursive step only a subset of the models is
re-organized. In Fig. 3, we present a schematic for a 2-dimensional SOM.
Fig. 3. A 2-dimensional array of neurons conforming to an SOM. The shaded circle denotes the
neighborhood of the winning neuron.
The unit with the shortest Euclidean distance from the input vector x is considered
the winner, because it characterizes the current input most closely. The weights wc of
the winning unit c are immediately updated towards the input:

wc ← wc + α (x − wc),   (2)

where α is the learning rate. Neighbors of the winner unit are also updated using a
similar formula, except that the modification term is further multiplied by a decay
parameter that depends on the distance of those neighbors from the winner.
The weights of the map are initialized to random values. The above operation is
iterated for each input vector in the dataset, and effectively results in a competition
between different regions of the input space for units on the map. Compact regions of
the input space will attract more units than sparse ones. Neighborhood learning also
promotes topology preservation, such that units close to each other in the topology of
the input space will end up close to each other in the weight space.
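A compact sketch of one SOM training step as described above: find the unit with the smallest Euclidean distance to the input, then move the winner and, with a decayed rate, its topological neighbors towards the input. The 1-D map shape and the linear neighborhood are our illustrative choices.

```python
import numpy as np

def som_step(weights: np.ndarray, x: np.ndarray, alpha: float, sigma: float) -> int:
    """One update of a 1-D SOM; `weights` has shape (num_units, dim)."""
    distances = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(distances))                  # unit closest to the input
    for j in range(weights.shape[0]):
        # Linear neighborhood decay with topological distance from the winner.
        h = max(0.0, 1.0 - abs(j - winner) / (sigma + 1.0))
        weights[j] += alpha * h * (x - weights[j])      # move towards the input
    return winner

# Example: a 50 x 1 map over 2-D inputs, as in the experiments later in the paper.
rng = np.random.default_rng(0)
w = rng.random((50, 2))
winner = som_step(w, np.array([0.3, 0.7]), alpha=0.5, sigma=10)
```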
4 The Model
The proposed model combines two SOMs with a Q-table. The first SOM maps input
data in response to the real-valued state information, and the index of each unit is
interpreted as a discrete state in the Q-table. A second SOM of the same kind is used
to represent the action space, with each unit of this second map corresponding to a
discrete action in the Q-table.
The first SOM is called the input map; it inhabits the state space and attempts to
represent the input space at the highest resolution in the most active regions. The
second SOM is called the action map, which inhabits the action space. The action
map must be explored by trial and error to discover the highest reward for the
whole range of observed inputs. The following algorithm is used to achieve this
exploration: for any real-valued state vector, the unit of the input map with smallest
Euclidean distance from that state vector is identified as the winner; next, one of the
units of the action map is selected according to the Q-learning criterion—i.e. the
one with the highest Q-value for the current state if exploiting, and a random action
if exploring. This winning action unit is used as the basis for the real-valued action.
We can get the action weight vector as the proposed action. The proposed action is
then perturbed by random noise and becomes perturbed action for the actual output.
If the reward received with the perturbed action is better than the estimated
expected return associated with the winning state–action pair, then the exploration
in the action map appears to be successful, so the action map is updated towards the
perturbed action. Otherwise, no learning takes place in the action map. In any case,
the Q-value of the winning state–action pair is updated towards the actual one-step
corrected return.
The algorithm can be interpreted as standard Q-learning with the discrete states
and the discrete actions. Besides the topology preserving nature of the SOM, a
simple amendment to expedite the algorithm is not only to update the Q-value of
the winning state-action pair, but to update every state–action pair towards this
value proportional to the product of the two neighborhood functions (of the input
and action maps). We call this neighborhood Q-learning [9]. The complete
algorithm is summarized as below for the case of a 2-dimensional input space and a
2-dimensional action space:
1. Present an input vector x to the system, and identify the winner in the input map, s.
2. Identify a unit a in the action map: the one with the best Q-value for state s, with
   probability 1 − ε; a random action unit, with probability ε.
3. Identify the proposed action as the weights of unit a, (w_{a1}, w_{a2}).
4. Independently perturb each element of the proposed action by a small random noise
   to yield a new perturbed action:
   â_k = w_{ak} + λ · n(−1, 1),  k = 1, 2.
5. Apply the perturbed action and observe the reward r and the resulting state s′. If r
   exceeds the current estimate Q(s, a), update the weights of every action unit j
   towards the perturbed action:
   w_{jk} ← w_{jk} + α_A · h_A(j, a, σ_A) · (â_k − w_{jk});
   otherwise, leave the action map unchanged.
6. Update every entry of the Q-table towards the one-step corrected return, weighted
   by the product of the two neighborhood functions:
   Q(i, j) ← Q(i, j) + α_Q · h_I(i, s, σ_I) · h_A(j, a, σ_A) · (r + γ max_{j′} Q(s′, j′) − Q(i, j)).
7. Update the weights of every input unit i towards the input vector:
   v_{ik} ← v_{ik} + α_I · h_I(i, s, σ_I) · (x_k − v_{ik}),
where s′ is the state immediately after s, α_A is the learning rate of the action map,
α_I is the learning rate of the input map, α_Q is the Q-learning rate, w_{jk} is the k-th
weight of the j-th unit of the action map (v_{ik} are the weights of the input map),
n(−1, 1) yields a random number selected uniformly from the range [−1, 1], λ controls
the amount of exploration in the action space, and h_I(i, s, σ_I) is the value of the
neighbourhood function of the input map at unit i given winning unit s and a
neighborhood size σ_I (similarly h_A for the action map). A simple linear neighborhood
is used, so that h(i, j, σ) = max(0, 1 − d(i, j)/(σ + 1)), where d(i, j) is the distance
between units i and j in the topology of the map. All these parameters must be set
empirically and annealed together throughout learning.
illustrated in Fig. 4.
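For concreteness, a minimal Python sketch of one cycle of this scheme is given below; it assumes numpy, and the variable names, parameter values, and toy reward function are illustrative choices rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_S, N_A = 50, 50                          # units in the input (state) and action maps (1-D topology)
V = rng.random((N_S, 2))                   # input-map weights, one 2-D state prototype per unit
W = rng.uniform(-np.pi, np.pi, (N_A, 2))   # action-map weights, one 2-D action per unit
Q = np.zeros((N_S, N_A))                   # Q-table indexed by (input unit, action unit)

alpha_i, alpha_a, alpha_q = 0.1, 0.1, 0.2  # learning rates (illustrative values)
eps, sigma, n_i, n_a = 0.1, 0.1, 10, 10    # exploration rate, noise scale, neighborhood sizes


def neighborhood(n_units, winner, size):
    """Simple linear neighborhood: max(0, 1 - d/(size + 1)) on a 1-D map."""
    d = np.abs(np.arange(n_units) - winner)
    return np.maximum(0.0, 1.0 - d / (size + 1.0))


def step(x, reward_fn, gamma=0.0):
    """One learning cycle: winner lookup, epsilon-greedy action unit,
    perturbed action, conditional action-map update, SOM and Q updates."""
    i = int(np.argmin(np.linalg.norm(V - x, axis=1)))         # winning input unit
    a = rng.integers(N_A) if rng.random() < eps else int(np.argmax(Q[i]))
    action = W[a] + sigma * rng.uniform(-1.0, 1.0, size=2)    # perturbed proposed action
    r, x_next = reward_fn(x, action)

    h_i = neighborhood(N_S, i, n_i)
    h_a = neighborhood(N_A, a, n_a)
    if r > Q[i, a]:                                           # exploration was successful
        W[:] += alpha_a * h_a[:, None] * (action - W)         # pull action map toward the action
    V[:] += alpha_i * h_i[:, None] * (x - V)                  # standard SOM update of the input map
    i_next = int(np.argmin(np.linalg.norm(V - x_next, axis=1)))
    target = r + gamma * Q[i_next].max()                      # one-step corrected return
    Q[:] += alpha_q * np.outer(h_i, h_a) * (target - Q)       # neighborhood Q-learning
    return r


def toy_reward(x, action):
    """Hypothetical task: reward is the negative distance of the action to the goal x."""
    return -float(np.linalg.norm(action - x)), x


for _ in range(1000):
    step(rng.random(2), toy_reward)
```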
5 Experiments
In this experiment, the control problem requires a mapping learned from a continuous
2-dimensional state space to a continuous 2-dimensional action space. In Fig. 5, a goal
is generated at random on the circle shown, and the coordinates of the goal are
provided as input to the learning agent. The agent must output an adequate set of joint
angles so that the tip of the arm touches the goal. After an action is taken, the reward is
immediate and is simply the negative of the distance between the tip of the arm and
the goal. As shown in Fig. 6, the main task of the agent is to learn a mapping from goal
space to arm space. Note that this is not supervised learning, because the reward does
not provide direction information to the agent.
Fig. 4. The proposed learning model applying Q-learning and a SOM for continuous inputs and
outputs. Inputs feed the input SOM, the action SOM produces the outputs, and the Q-table
connects the two maps and gates whether learning takes place.
Table 1 shows the set of empirical parameters used to achieve the results in Fig. 7.
Performance depends crucially on the annealing speed for exploration and plasticity: there is a
trade-off between fast convergence of the exploratory noise and sufficient exploration for an
adequate search of the action space. The graph shown is for the fastest annealing schedule,
which can significantly affect the final mean performance. Fig. 8 shows the input and action
maps after learning on a typical trial. The input map has responded to the input distribution of
goal positions, and the action map, through a process of trial and error, has also learned to
represent those actions which offer optimal solutions to the states represented in the input map.
The strategy mapping input units to action units is indicated by the shading. It is obvious that
topology is preserved not only in the two maps, but also in the Q-table. The performance graph
in Fig. 7 suggests that the action units indeed occupy highly rewarded regions of the action
space for the whole range of inputs.
Fig. 5. A simulation with a two-joint arm residing inside the unit square. The base of the arm is
fixed at (0.5, 0), and the position of the end point of the arm is defined by the lengths of the two
arm segments (L1 and L2) and two relative angles, $\theta_1$ and $\theta_2$, measured in radians and
restricted to the range $[-\pi, \pi]$. Note that $\theta_1$ is given relative to the 'x-axis'.
Fig. 6. The task is to learn a mapping from goal space to joint-angle space using an immediate
reward signal (the negative distance to the goal). The outputs of the agent are $\theta_1$ and $\theta_2$. Note
that these are not angles to move through, but angles to move to; $\theta_1$ and $\theta_2$ uniquely describe
the arm configuration.
Table 1. Empirical parameters for the proposed model's application to the multi-joint arm
problem. Initially, all Q-values are set to zero, all action SOM weights are generated
uniformly within the range $[-\pi, \pi]$, and input SOM weights in the range [0,1]. $t$ is the time-
step in which a single cycle of the algorithm is performed.

Parameter / Value
Input map size: 50 x 1 units
Action map size: 50 x 1 units
Input map neighborhood size, $n_i$: 10
Action map neighborhood size, $n_a$: 10
Q-learning rate, $\alpha_Q$
Discount factor, $\gamma$: 0
Learning rate of input map, $\alpha_i$
Fig. 7. Average reward against time using the parameters of Table 1. Each data point is
averaged over the last 1000 time-steps for each trial, and the plot shows the mean of these data
points over 20 independent trials.
Fig. 8. Input map plotted in input space and action map plotted in action space after learning on
a typical trial. Input-map units are shaded according to the action unit with the highest Q-value
for that input unit. Each angle is normalized to the range [0,1] (from the range $[-\pi, \pi]$).
Action-map units are shaded according to their topological index in the action map.
6 Conclusion
A model has been presented for representation and generalization in model-less RL.
The proposed model is based on the SOM and one-step Q-learning, which provide
several desirable properties: real-valued states and actions are adapted dynamically, and a
real-valued and potentially delayed reward signal is accommodated. The core of the
model adheres to RL theory with an explicit notion of estimated expected return. The
model has efficient learning and action-selection phases, and it remains applicable when
the nature of the desired state–action mapping is unknown beforehand. The topology-
preserving property of the SOM has also been shown to add useful constraints under
constrained tasks, and to expedite learning through the application of neighborhood
Q-learning.
The drawbacks of this model are the theoretical possibility that arbitrarily poor
actions could be observed by the system given a sufficiently malicious reward function,
and the inherent scalability issues resulting from the fact that the representation uses
local parameters. Distributed models may offer a solution to the scalability issue, but
such models have been shown to introduce their own problems pertaining to flexibility,
stability, and robustness in the face of unpredictably distributed training data. Smith
(2001, Chapter 9) [10] suggests that combining local and distributed models may be of future
interest for the kinds of applications considered here.
References
1. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning internal representations by error
propagation. In: Parallel Distributed Processing, vol. 1. MIT Press, Cambridge (1986)
2. Gullapalli, V.: A stochastic reinforcement learning algorithm for learning real-valued
functions. Neural Networks 3, 671–692 (1990)
3. Smith, J.A.: Applications of the self-organizing map to reinforcement learning. Neural
Networks 15, 8–9 (2002)
4. Tesauro, G.J.: Practical issues in temporal difference learning. Machine Learning 8, 257–
277 (1992)
5. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)
6. Luis, R.S.: The Hierarchical Map Forming Model. Master’s thesis, Department of
Computer Science and Information Engineering, College of Electrical Engineering and
Computer Science, National Taiwan University (2006)
7. Watkins, C.J., Dayan, P.: Technical Note: Q-Learning. Machine Learning 8, 279–292 (1992)
8. Kohonen, T.: Self organization and associative memory, 2nd edn. Springer, Berlin (1987)
9. Smith, J.A.: Applications of the self-organizing map to reinforcement learning. Neural
Networks 15, 8 (2002)
10. Smith, A.J.: Dynamic generalization of continuous action spaces in reinforcement learning:
A neurally inspired approach. PhD dissertation, Division of Informatics, Edinburgh
University, UK (2001)
Facial Feature Extraction and Applications: A Review
1 Introduction
Analyzing and identifying facial feature events is laborious when the methods
adopted are based primarily on visual inspection. Accordingly, a method that can
automatically classify facial features in an image must be developed.
Several novel and particularly successful object and object-category detection and
recognition methods based on image features, i.e. local descriptions of object appearance,
have recently been proposed. Most research and development activities in face
detection and identification are known as biometric authentication. In recent
research on these topics, the focus of interest has shifted from image
recognition techniques toward feature discrimination.
Many researchers have developed various methods for the extraction and recognition of
facial features in gray-scale and color images. Studies on facial feature extraction
continue to pursue approaches with high accuracy, reduced complexity, high efficiency,
and less computational time.
2 Performance-Driven Facial Animation
Performance-driven facial animation (PDFA) derives facial expressions from a performer.
PDFA uses sensing devices to reconstruct the performer's facial expressions on a virtual
character. Williams [1] proposed a prototype that can be divided into two main steps:
feature tracking and 3D model construction. This method captures the performer's
facial feature points and maps them onto the corresponding points of the face model.
Each movement of the performer is then tracked, and the corresponding model is changed
accordingly. Because it is based on this cyclic process, the method is also known as feature
retargeting. In the process of capturing the action, a noise filter, such as a band-pass filter,
needs to be applied between frames.
The prototype PDFA is somewhat restricted during the process of capturing features
in some periods of performance in the film. It cannot capture motion vectors from one
role and map them directly onto a new virtual character to replicate the same action; a
new mapping must be built for each duplication. In addition, there is a lack of intelligent
systems to analyze the coherence of movement between frames, so continuity must be
modified and filled in manually. Also, the mapping of corresponding points cannot
directly use the context before and after the action.
Khanam and Mufti [2] utilized a fuzzy system and MPEG-4 feature points to
estimate the locations of facial features, and employed the open-source software Xface to
produce facial animation. The fuzzy system designs a membership function based on the
displacement of each feature point between frames, and uses fuzzy sets of feature points to
design fuzzy rules for determining the respective expression. Finally, this approach applies
the corresponding results to the animation and enhances coherence between frames to make
the animation more natural; the process is shown in Fig. 1 and the results in Fig. 2.
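As a rough illustration of this kind of fuzzy inference (not the authors' actual rules), the sketch below evaluates a single hypothetical rule over feature-point displacements using triangular membership functions; the function names and thresholds are invented for the example.

```python
def tri_membership(x, a, b, c):
    """Triangular membership function peaking at b on the interval [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def smile_degree(mouth_corner_gap, eyebrow_gap):
    """Toy fuzzy rule: IF mouth-corner displacement is LARGE AND eyebrow
    displacement is SMALL THEN expression is 'smile' (min used for AND)."""
    large_mouth = tri_membership(mouth_corner_gap, 2.0, 6.0, 10.0)  # pixels, hypothetical
    small_brow = tri_membership(eyebrow_gap, -1.0, 0.0, 2.0)
    return min(large_mouth, small_brow)


print(smile_degree(5.0, 0.5))   # degree of membership in the 'smile' expression
```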
Cosker et al. proposed combining expressions from multiple samples to generate a new
expression image, for example selecting the mouth expression from face image A and the
forehead expression from face image B to generate the new expression C, and then applying it
to the target model [3]. Fragments of information about these features can be stored in a
database for future use; a diagram is shown in Fig. 3. This method applies principal component
analysis to create feature vectors and uses an active appearance model as a fragment of the
contour features to fit shapes with different characteristics. Figure 4 shows images of the
selection of characteristics by the active appearance model.
Fig. 2. The results of estimating action by using the fuzzy system [2]
3 Color Segmentation
Color segmentation separates images based on the unique color and background color
information. Jiang, Yao, and Jiang proposed a color detection method using a skin probability
map (SPM) to set a probability threshold that separates skin-color from non-skin-color areas
with a very high true acceptance rate, but the false acceptance rate is also high [4].
Therefore, in order to reduce the false acceptance rate, color segmentation is combined with a
texture filter based on the Gabor wavelet transform to acquire texture features. Although the
texture filter reduces the false acceptance rate, it also reduces the identification rate. In order to
improve the identification rate, marker-controlled watershed segmentation analyzes the texture
image to calculate the mean deviation and standard deviation against a preset threshold; if the
mean deviation and standard deviation are smaller than the threshold value, the region is
determined to be skin. The process is roughly divided into three phases: SPM color filter,
texture filter, and marker-controlled watershed segmentation, as displayed in Fig. 5.
Phung, Bouzerdoum and Chai [5] utilized four classifiers to assess the accuracy of
color segmentation based on four common image-processing color spaces, i.e. RGB,
HSV, YCbCr, and CIE-Lab. 1) Piecewise linear decision boundary classifiers observe the
distribution of image pixel color values and set a fixed color range; in this category of
classifiers, skin and non-skin colors are separated using a piecewise linear decision
boundary. 2) The Bayesian classifier with the histogram technique applies a decision
rule to consider a color pixel as a skin pixel. The decision rule is based on the a priori
probabilities of skin and non-skin and various classification costs, and is associated
with a threshold determined empirically. The class-conditional pdfs can be estimated
using histogram or parametric density estimation techniques. 3) Gaussian classifiers
approximate the class-conditional pdf of skin colors by a parametric functional
form, which is usually chosen to be a unimodal Gaussian [6,7] or a mixture of
Gaussians [8,9]. In the Gaussian classifiers, a color pixel x is considered a skin
pixel when its squared Mahalanobis distance from the skin-color mean is smaller than a
threshold. 4) The multilayer perceptron (MLP) is a feed-forward neural network that has been
used extensively in classification and regression; a comprehensive introduction to the
MLP can be found in [10]. Compared to the piecewise linear or the unimodal
Gaussian classifiers, the MLP is capable of producing more complex decision
boundaries.
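A minimal sketch of the unimodal Gaussian approach is given below; it is not the exact procedure of [5], and the threshold value, synthetic training data, and function names are purely illustrative.

```python
import numpy as np

def fit_gaussian(skin_pixels):
    """Estimate the mean and inverse covariance of skin colors from an (N, 3)
    array of color values (e.g. RGB or YCbCr samples)."""
    mu = skin_pixels.mean(axis=0)
    cov = np.cov(skin_pixels, rowvar=False)
    return mu, np.linalg.inv(cov)

def is_skin(pixels, mu, inv_cov, threshold=9.0):
    """Label a pixel as skin when its squared Mahalanobis distance to the
    skin-color mean falls below the threshold (threshold is illustrative)."""
    diff = pixels - mu
    d2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)   # squared Mahalanobis distances
    return d2 < threshold

# toy usage with synthetic 'skin' samples
rng = np.random.default_rng(1)
train = rng.normal([150, 120, 155], 10, size=(500, 3))
mu, inv_cov = fit_gaussian(train)
print(is_skin(np.array([[150, 118, 152], [30, 200, 40]]), mu, inv_cov))
```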
The classification rates (CR) of the tested classifiers are shown in Table 1. The
Bayesian and MLP classifiers were found to have very similar performance. The
Bayesian classifier had a maximum CR of 89.79 percent, whereas the MLP classifier
had a maximum CR of 89.49 percent. Both classifiers performed consistently better
than the Gaussian classifiers and the piecewise linear classifiers. The classification
rates (CRs) at selected points on the ROC curves are shown in Table 2. We observe
that, at the histogram size of 256 bins per channel, the classification performance was
almost the same for the four color spaces tested, RGB, HSV, YCbCr, and CIE-Lab.
As our results show, such an expansion leads to more false detection of skin colors
and reduces the effectiveness of skin segmentation as an attention-focus step in object
detection tasks. Using lighting compensation techniques such as the one proposed by
Hsu et al. [11] is probably a better approach to coping with extreme or biased
lightings in skin detection.
Table 1. Classification rates (CRs, in %) of the tested classifiers [5]

Classifier ID   CRmax   99% conf. int. of CRmax   CR (FDR=10%)   CR (FDR=15%)   CR (FDR=20%)
CbCr-fixed      75.64   [75.58, 75.70]            fixed operating point: FDR = 29.09, CR = 75.64
HS-fixed        78.38   [78.32, 78.44]            fixed operating point: FDR = 19.48, CR = 78.38
GT plane-set    82.00   [81.94, 82.06]            fixed operating point: FDR = 18.77, CR = 82.00
Bayesian        89.79   [89.74, 89.84]            88.75          86.17          82.97
2DG-pos         82.67   [82.61, 82.73]            82.37          81.07          78.85
3DG-pos         85.57   [85.52, 85.62]            85.27          83.45          80.84
3DG-pos/neg     88.92   [88.87, 88.97]            88.01          85.57          82.47
3D-GM           85.76   [85.71, 85.81]            85.23          83.63          81.29
MLP             89.49   [89.44, 89.54]            88.46          85.97          82.84
Table 2. Classification Rates (CRs) of Eight Color Representations (Histogram Size = 256 Bins
per Channel) [5]
4 Feature Detection
Feature detection employs a particular description to distinguish or retrieve the
information of interest within images or blocks, such as edges, corners, and colors, and
establishes the corresponding characteristic value (e.g. an eigenvalue) of the information
in order to facilitate general search. The next step is to adopt an appropriate image
classifier to identify the existence of the same characteristics among images, or feature
values within an approximate range of information. These steps are known as feature
extraction. Feature detection can be carried out in the spatial domain or the frequency
domain with different methods; the methods used in the relevant literature for detecting
and capturing facial features are introduced in the following.
The spatial domain is the image coordinate space, where the pixel value at each coordinate is
the intensity of that point. In practice, applications in the spatial domain operate almost entirely
on image pixels [12]. Song et al. [13] proposed distinguishing expressions from the intensity
changes caused by facial wrinkles, using skin deformation parameters (SDP) as the face
recognition features. Figure 6 shows facial expression recognition on the eight regions (patches)
defined by the MPEG-4 feature points, where the cross marks represent the feature points of the
FAP. For each region, image ratio features are used to calculate the eigenvalues of the region
and to estimate the skin deformation parameters. Image ratio features can affect detection
accuracy when handling images with changes in lighting characteristics; the effect of lighting
conditions has always been an important issue in feature detection applications [14].
Figure 7 is an example of using image ratio features, where (a) and (c) are the
expression and neutral face images, respectively, and (b) characterizes the blocks of the same
region. Images can be identified by comparing different images with the image ratio features.
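As a rough sketch of the image-ratio idea (not the implementation of [13]), the code below computes a pixel-wise ratio between corresponding expression and neutral patches and summarizes it with simple statistics; the patch coordinates and epsilon are hypothetical.

```python
import numpy as np

def image_ratio_feature(expr_img, neutral_img, box, eps=1e-3):
    """Pixel-wise ratio of an expression patch to the corresponding neutral
    patch; values far from 1 indicate wrinkles or other skin deformation."""
    y0, y1, x0, x1 = box
    expr = expr_img[y0:y1, x0:x1].astype(float)
    neutral = neutral_img[y0:y1, x0:x1].astype(float)
    ratio = expr / (neutral + eps)
    # summarize the patch with simple statistics used as features
    return np.array([ratio.mean(), ratio.std()])

# toy usage on random images (a forehead patch at rows 10-30, cols 20-60)
rng = np.random.default_rng(2)
neutral = rng.integers(60, 200, size=(100, 100))
expression = neutral + rng.integers(0, 30, size=(100, 100))
print(image_ratio_feature(expression, neutral, (10, 30, 20, 60)))
```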
Figure 8 shows the recognition rates of the SDP, FAP, and mixed approaches. Figure 9 exhibits
the recognition rate gained by using the FAP and mixed features; the experimental results show
a noticeable improvement in the overall recognition rate over the SDP approach.
Fig. 8. Comparison of the recognition rates of the SDP, FAP and comprehensive approaches [13]
Fig. 9. The recognition rate gained by the FAP and comprehensive approaches in contrast
with the SDP approach [13]
Gabor features have also been applied to faces and license plates to extract the characteristics of
the target images [15]. The Gabor-filter feature is captured by a Gabor filter designed as an
image processing operator, called the simple Gabor feature. In order to handle more complex
problems, such as finding features in low-contrast images, the Gabor filter is integrated with a
multi-resolution space to form a more efficient operator. The Gabor filter translates the pixel
information of the image into different frequency bands. The low-frequency part of the image
represents slowly changing information; in general, these are gray image areas, such as tones
similar to the background. The high-frequency part of the image, in contrast, represents rapidly
changing information; usually these are areas with high variation, such as the details of objects,
edges, or image noise [10,12]. For two-dimensional image processing, the Gabor filter is a
Gaussian envelope modulated by a complex sinusoid [10,15,16]. Figure 10 illustrates the
extraction of features and positions on a face and a license plate using the Gabor filter. The
optimal parameter adjustment and the mathematical analysis of its effectiveness can be found
in [15,16].
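A minimal sketch of constructing and applying a 2-D Gabor kernel is shown below, using numpy and scipy; the kernel parameters are illustrative and not the optimized values discussed in [15,16].

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size=21, wavelength=6.0, theta=0.0, sigma=3.0, gamma=0.5):
    """Complex 2-D Gabor kernel: a Gaussian envelope modulated by a complex
    sinusoid oriented at angle theta (parameters are illustrative)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    carrier = np.exp(1j * 2 * np.pi * xr / wavelength)
    return envelope * carrier

# toy usage: filter responses at one orientation for a random image
rng = np.random.default_rng(3)
image = rng.random((64, 64))
kern = gabor_kernel(theta=np.pi / 4)
real = convolve2d(image, kern.real, mode='same')
imag = convolve2d(image, kern.imag, mode='same')
magnitude = np.hypot(real, imag)   # high values mark oriented, rapidly changing regions
print(magnitude.shape)
```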
Fig. 10. The extracting features and position on the face and license plate by using the Gabor
filter
5 Image Registration
Image registration aligns images of the same scene taken from different viewpoints, at different
times, and/or by different sensors, by overlapping the multiple images according to their
geometric relationship or a calibration model, for image fusion [7]. Image registration is usually
applied in image analysis, such as comparing characteristic information to observe changes
between different images and to estimate the possible direction of change. For further
processing, image registration distinguishes between a base image and a reference image (the
input image and the sensed image).
6 Conclusions
This paper presents different methods for facial feature point extraction and highlights their
performance. Various applications of feature point extraction are also summarized in this study
to provide a reference guide for researchers involved in facial feature extraction and its
applications.
References
1. Williams, L.: Performance-Driven Facial Animation. ACM SIGGRAPH Computer
Graphics 24(4), 235–242 (1990)
2. Khanam, A., Mufti, M.: Intelligent Expression Blending for Performance Driven Facial
Animation. IEEE Transactions on Consumer Electronics 53(2), 578–583 (2007)
3. Cosker, D., Borkett, R., Marshall, D., Rosin, P.L.: Towards Automatic Performance-
Driven Animation between Multiple Types of Facial Model. IET Computer Vision 2(3),
129–141 (2008)
4. Jiang, Z., Yao, M., Jiang, W.: Skin Detection Using Color, Texture and Space Information.
In: Proc. of the Fourth International Conf. on Fuzzy Systems and Knowledge Discovery
(FSKD 2007), August 24-27, vol. 3, pp. 366–370 (2007)
5. Phung, S.L., Bouzerdoum, A., Chai, D.: Skin Segmentation Using Color Pixel
Classification: Analysis and Comparison. IEEE Transactions on Pattern Analysis and
Machine Intelligence 27(1), 148–154 (2005)
6. Yang, J., Waibel, A.: A Real-Time Face Tracker. In: Proc. IEEE Workshop Applications
of Computer Vision, pp. 142–147 (December 1996)
7. Menser, B., Wien, M.: Segmentation and Tracking of Facial Regions in Color Image
Sequences. In: SPIE Visual Comm. and Image Processing, vol. 4067, pp. 731–740 (June 2000)
8. Greenspan, H., Goldberger, J., Eshet, I.: Mixture Model for Face Color Modeling and
Segmentation. Pattern Recognition Letters 22, 1525–1536 (2001)
9. Yang, M.-H., Ahuja, N.: Gaussian Mixture Model for Human Skin Color and Its
Applications in Image and Video Databases. In: SPIE Storage and Retrieval for Image and
Video Databases, vol. 3656, pp. 45–466 (January 1999)
10. Theodoridis, S., Koutroumbas, K.: Pattern Recognition, 4th edn. Academic Press,
Burlington (2009)
11. Hsu, R.L., Abdel-Mottaleb, M., Jain, A.K.: Face Detection in Color Images. IEEE
Transactions on Pattern Analysis and Machine Intelligence 24(5), 696–706 (2002)
12. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. Prentice Hall, New
Jersey (2002)
13. Song, M., Tao, D., Liu, Z., Li, X., Zhou, M.: Image Ratio Features for Facial Expression
Recognition Application. IEEE Transactions on Systems, Man, and Cybernetics – Part B:
Cybernetics 40(3), 779–788 (2010)
14. Mitra, S., Acharya, T.: Gesture Recognition: A Survey. IEEE Transactions on Systems,
Man, and Cybernetics – Part C: Applications and Reviews 37(3), 311–324 (2007)
15. Ilonen, J., Kamarainen, J.K., Paalanen, P., Hamouz, M., Kittler, J., Kälviäinen, H.: Image
Feature Localization by Multiple Hypothesis Testing of Gabor Features. IEEE
Transactions on Image Processing 17(3), 311–325 (2008)
16. Ilonen, J., Kamarainen, J.K., Kälviäinen, H.: Efficient Computation of Gabor Features,
Research Rep. 100, Dept. Inf. Technol., Lappeenranta Univ. Technol., Finland (2005)
An Intelligent Infant Location System Based on RFID
Shou-Hsiung Cheng
1 Introduction
Because the characteristics of newborn babies are similar and they lack the ability to express
themselves, neonatal injuries are often caused by human negligence or misidentification.
Infancy is the most vulnerable stage, and a variety of injuries can cause permanent sequelae or
death. How to enhance the identification of newborns and to prevent mistaken holding are
important projects in neonatal care. Infant healthcare with RFID, which discards traditional
paper-based work, is becoming more and more common in hospitals in today's competitive
environment. Wristband active RFID tags can be attached to infants shortly after birth.
In recent years, Radio Frequency Identification (RFID) technology has been
widely accepted in hospitals to track and locate newborn babies. Hightower [1] was the first to
use RFID technology to build an indoor positioning system. He used RSS measurements and
proposed the SpotON system to estimate the distance between a target tag and at least three
readers, and then applied trilateration on the estimated distances. Lionel [2] proposed the
LANDMARC method based on RFID to build an indoor positioning system; the LANDMARC
system has a greater read range and response capability of the active sensor tags compared to
SpotON. Wang et al. [3] propose a 3-D positioning scheme, namely the passive scheme, which
relies on a deployment of tags and readers with different power levels on the floor and the
ceiling of an indoor space and uses the Simplex optimization algorithm for estimating the
locations of multiple tags. Stelzer et al. [4] use reference tags to synchronize the readers; then,
TDOA principles and TOA measurements relative to the reference tags and the target tag are
used to estimate the location of the target tag. Bekkali et al. [5] collected RSS measurements
from reference tags to build a probabilistic radio map of the area; the Kalman filtering
technique is then iteratively applied to estimate the target's location.
The goal of this paper is to propose a straightforward and efficient infant location system to
reduce the potential risks of theft and mistaken holding. The system can easily recognize the
different locations of newborn babies wearing wristband active RFID tags.
Feature
Support 316 RF channels
Support anti-collision
Support buzzer alarm
LED Multi-LED visual indication
Baud Rate 2,400 bps ~ 115,200 bps
UID: Tag’s identification number
Support RSSI values 0-255 and RSSI values are inverse proportion
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Interface Support USB / RS232 / RS422 / RS485 / TCPIP
Dimension 107W x 138H x 30D (mm)
Provide WinXP / VISTA / Win7 / WinCE / Linux SDK library
Software
for software development.
Feature
Wristband design.
Call button: Emergency reporting / Signal transmission.
Remote ON/OFF Tag.
Wireless tag programming.
Two-color LED visual indication: normally the emitting signal blinks green; when the battery is low or the light sensor is triggered, it blinks red.
Built-in light sensor for tamper proof.
Buzzer: remote active beep or click active beep.
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Battery life 3 years (when the transmission number is 10 per day); 5 years in standby mode.
The remaining information packets include much information, such as the index of the RFID
tag, the index of the RFID reader, the date and time, the signal intensity, etc. However, only
the signal intensities are considered in this study to recognize the locations of active
RFID tags. The signal intensities are transformed to RSSI values and the RSSI values
are saved in the database.
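As a small illustration of how the stored packets could be turned into per-tag RSSI feature vectors for the classifier (the packet fields and reader identifiers here are hypothetical, not the actual database schema):

```python
from collections import defaultdict

READERS = ["R1", "R2", "R3"]   # hypothetical identifiers for the three readers

def rssi_features(packets):
    """Group filtered packets by tag and build one feature vector per tag,
    holding the latest RSSI value seen from each reader (0 if unseen)."""
    latest = defaultdict(dict)
    for p in packets:                       # p: dict with tag_id, reader_id, rssi
        latest[p["tag_id"]][p["reader_id"]] = p["rssi"]
    return {tag: [readings.get(r, 0) for r in READERS]
            for tag, readings in latest.items()}

packets = [
    {"tag_id": "baby-01", "reader_id": "R1", "rssi": 182},
    {"tag_id": "baby-01", "reader_id": "R2", "rssi": 140},
    {"tag_id": "baby-01", "reader_id": "R3", "rssi": 95},
]
print(rssi_features(packets))   # {'baby-01': [182, 140, 95]}
```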
3 Location Algorithm
Because the RFID signal is susceptible to environmental factors such as signal refraction,
scattering, and multi-path effects in indoor environments, the signal-strength RSSI values
received by the RFID readers fluctuate up and down. Moreover, medical equipment can
sometimes have a strong blocking effect on radio signals. These uncertain environmental
factors can degrade the accuracy of the neonatal positioning system.
To recognize the locations of active RFID tags, a machine learning technique is used. In this
study, a decision-tree-based classifier is developed to determine the correct RFID tag positions.
The ID3 decision tree algorithm is one of the earliest in use; its core is to partition the training
data recursively. Each time a node is generated, subsets of the training inputs are examined to
measure the information gained by each candidate test; the test that yields the greatest
information gain becomes the branch node, and the next branch node is selected recursively in
the same way until each partition of the training data falls into one category or a stopping
condition is satisfied. C4.5 is an extension of ID3 that alleviates ID3's tendency to produce
subsets containing only a small number of records, handles continuous-valued attributes and
noise, and has tree-pruning ability. At each node, the C4.5 decision tree uses the information
gain to select the test attribute, choosing the attribute with the highest information gain (or
maximum entropy reduction) as the current test attribute.
Let A be an attribute with k outcomes that partition the training set S into k subsets $S_j$
(j = 1, ..., k). Suppose there are m classes, denoted $C = \{c_1, \ldots, c_m\}$, and
$p_i = n_i / n$ represents the proportion of instances in S belonging to class $c_i$, where
$n = |S|$ and $n_i$ is the number of instances in S belonging to $c_i$. The selection
measure relative to data set S is defined by:

$$\mathrm{Info}(S) = -\sum_{i=1}^{m} p_i \log_2 p_i \qquad (1)$$

The information measure after considering the partition of S obtained by taking into
account the k outcomes of an attribute A is given by:

$$\mathrm{Info}(S, A) = \sum_{j=1}^{k} \frac{|S_j|}{|S|}\,\mathrm{Info}(S_j) \qquad (2)$$

The information gain for an attribute A relative to the training set S is defined as follows:

$$\mathrm{Gain}(S, A) = \mathrm{Info}(S) - \mathrm{Info}(S, A) \qquad (3)$$

Consequently, the largest information gain corresponds to the best attribute. Information gain
has the drawback of favouring attributes with a large number of attribute values over those
with a small number. To avoid this drawback, the information gain is replaced by a ratio called
the gain ratio:

$$\mathrm{GR}(S, A) = \frac{\mathrm{Gain}(S, A)}{-\sum_{j=1}^{k} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}} \qquad (4)$$
Consequently, the largest gain ratio corresponds to the best attribute.
In this study, C5.0 is adopted to generate the decision tree and produce the result. C5.0 is a
continuation of the C4.5 framework. Unlike C4.5, C5.0 provides rule sets, which are popular in
many applications; the classification conditions are expressed as individual rules, which
increases the readability of the classification. C5.0 can deal with numeric or nominal values of
various kinds of information, and its results can easily be understood. The difference between
C5.0 and C4.5 is that C5.0 can handle many more data types, such as dates, times, time stamps,
and sequences of discrete data.
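The quantities in Eqs. (1)-(4) can be computed directly; the following sketch does so with numpy on illustrative, made-up data rather than the RSSI records of this study.

```python
import numpy as np

def entropy(labels):
    """Info(S) = -sum_i p_i log2 p_i over the class proportions of S."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(attr_values, labels):
    """Information gain (Eq. 3) divided by the split information (Eq. 4)."""
    n = len(labels)
    info_s = entropy(labels)
    info_sa, split_info = 0.0, 0.0
    for v in np.unique(attr_values):
        mask = attr_values == v
        w = mask.sum() / n
        info_sa += w * entropy(labels[mask])
        split_info -= w * np.log2(w)
    gain = info_s - info_sa
    return gain / split_info if split_info > 0 else 0.0

# toy usage: a discretized RSSI attribute against location labels (illustrative)
rssi_bin = np.array(["high", "high", "low", "low", "mid", "mid"])
cell = np.array(["A", "A", "B", "B", "A", "B"])
print(gain_ratio(rssi_bin, cell))
```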
4 Experimental Environment
The equipment used in this study included three RFID readers and six wristband active RFID
tags attached to newborn babies, each connected through network devices to the signal-packet
filter server. The experimental environment is 3.0 meters x 3.0 meters, as shown in Figure 2,
which also shows that the experimental region is divided into nine regional grids.
5 Numerical Results
Every active RFID tag sends an electromagnetic wave each second. The intensity of the
electromagnetic waves transmitted by the active RFID tags is transformed to an RSSI value.
Each reader supports RSSI values 0-255; the reading range and RSSI are inversely proportional.
The number of experimental records is 300, with 75% of the records used as training data and
25% as test data in this study. The experimental accuracies of the RFID tag locations are shown
in Table 3.
6 Conclusion
This paper presents an intelligent infant location system based on active RFID in
conjunction with a decision-tree-based classifier. From the experimental results
obtained in this study, some conclusions can be summarized as follows:
(1.) The experimental results show that the proposed infant location system can
accurately recognize the locations of newborn babies.
(2.) The method presented in the study is straightforward, simple, and valuable for
practical applications.
References
1. Hightower, J., Vakili, C., Borriello, C., Want, R.: Design and Calibration of the SpotON
AD-Hoc Location Sensing System, UWCSE 00-02-02 University of Washington,
Department of Computer Science and Engineering, Seattle,
http://www.cs.washington.edu/homes/jeffro/pubs/
hightower2001design/hightower2001design.pdf
2. Ni, L.M., Liu, Y., Lau, Y.C., Patil, A.P.: LANDMARC:Indoor Location Sensing Using
Active RFID (2003)
3. Wang, C., Wu, H., Tzeng, N.F.: RFID-based 3-D positioning schemes. In: IEEE
INFOCOM, pp. 1235–1243 (2007)
4. Stelzer, A., Pourvoyeur, K., Fischer, A.: Concept and application of LPM—a novel 3-D
local position measurement system. IEEE Trans. Microwave Theory Techniques 52(12),
2664–2669 (2004), http://www.ubisense.net/default.aspS
5. Bekkali, A., Sanson, H., Matsumoto, M.: RFID indoor positioning based on probabilistic
RFID map and kalman filtering. In: 3rd IEEE International Conference on Wireless and
Mobile Computing, Networking and Communications, IEEE WiMob (2007)
An Intelligently Remote Infant Monitoring System
Based on RFID
Shou-Hsiung Cheng
1 Introduction
Because the characteristics of newborn babies are similar and they lack the ability to express
themselves, neonatal injuries often result from human negligence or misidentification. Infancy
is the most vulnerable stage, and a variety of injuries can cause permanent sequelae or death.
How to enhance the identification of newborns and to prevent mistaken holding are important
projects in neonatal care. Infant healthcare with RFID, which discards traditional paper-based
work, is becoming an increasingly urgent need in hospitals in today's competitive environment.
A healthcare application of great concern is the real-time tracking and location of medical
assets. Swedberg [2] described a patient- and employee-tracking system based on active
433 MHz RFID technology that is currently being tested at Massachusetts General Hospital.
The pilot gathers information regarding patient flow and bottlenecks, with the expected
outcome of gaining a better understanding of how the clinical system behaves. It will
potentially reveal aspects such as how long a patient sat alone in an examining room or whether
the medical personnel spent the proper time with the patient. In Chowdhury et al. [1], patients
are assigned an RFID wristband for identification purposes, but their approach does not take
advantage of the storage capabilities of RFID devices. The unique identification code provided
by the wristband is only used as a "license plate" and all related data are stored and recovered
in a backend server, as in traditional non-RFID patient management systems.
In recent years, Radio Frequency Identification (RFID) technology has been widely
accepted in hospitals to track and locate newborn babies. RFID positioning systems
can be broadly divided into two classes: tag and reader localization, depending on the
RFID component type of the target. In tag localization schemes, readers and possibly
tags are deployed as reference points within the area of interest and a positioning
technique is applied for estimating the location of a tag. Hightower [3] was the first to
use RFID technology to build the indoor positioning system. He uses RSS
measurements and proposed SpotON system to estimate the distance between a target
tag and at least three readers and then applies trilateration on the estimated distances.
Lionel [4] proposed the LANDMARC method based on RFID to build an indoor
positioning system; the LANDMARC system has a greater read range and response
capability of the active sensor tags compared to SpotON. Wang et al. [5] propose a 3-D
positioning scheme, namely passive scheme, which relies on a deployment of tags and
readers with different power levels on the floor and the ceiling of an indoor space and
uses the Simplex optimization algorithm for estimating the location of multiple tags.
On the other hand, in reader localization schemes, usually passive or active tags with
known coordinates and possibly readers are deployed as reference points and their IDs
are associated with their location information. In the scheme of Lee et al. [6], passive tags are
arranged on the floor at known locations in a square pattern. The reader acquires all readable tag
readable tag locations and estimates its location and orientation by using weighted
average method and Hough transform, respectively. Yamano et al. [7] utilize the
received signal strength to determine the reader’s position by using machine learning
technique. In the training phase, the reader acquires the RSS from every tag in various
locations in order to build a support vector machine. Xu et al. [8] proposed a Bayesian
approach to predict the position of a moving object.
This study proposes a straightforward and efficient remote infant monitoring system to reduce
the potential risks of theft, mistaken holding, and abnormal body temperature. The proposed
monitoring and location system can not only recognize different babies but also track the
locations of newborn babies by using active RFID tags. Further, the proposed infant monitoring
system can send out warning signals when theft, mistaken holding, or abnormal body
temperature of a baby occurs. The remote infant monitoring system enables fast communication
with the clinical staff and families through mobile devices such as notebooks, tablet PCs, and
smart phones.
In order to build the network systems combining RFID and wireless Wi-Fi systems in the
whole neonatal hospital, we chose RFID readers and tags with a frequency of 2.45 GHz. Every
RFID tag transmits electromagnetic waves of varying intensity. The RFID reader sends the
information packets of every RFID tag through the network to the back-end server for filtering.
However, the information received from the RFID reader is not all usable, as it mingles lots of
incomplete signals, noise, etc. Therefore, the information packets should be filtered to remove
unnecessary records. According to the collected information packets, the filter server will filter
out signals that are too low, lose some data, or are invalid. The remaining information packets
are saved in the database and are used by the location classifier as experimental data in order to
recognize the locations of the active RFID tags. The feature and specification of the active
RFID reader are summarized in Table 1, and those of the active RFID tag in Table 2. The
wristband active RFID tag has two built-in thermal sensors for continuously monitoring the
temperature of infants: one detects the skin temperature and the other the ambient temperature.
Feature
Support 316 RF channels
Support anti-collision
Support buzzer alarm
LED Multi-LED visual indication
Baud Rate 2,400 bps ~ 115,200 bps
UID: Tag’s identification number
T1: Ambient temperature sensor
T2: Skin temperature sensor
Note: T1 / T2 / SENSOR are used for anti-tamper capability.
Support RSSI values 0-255 and RSSI values are inverse proportion
RSSI: Received Signal Strength Indication (0-255). Reading range and RSSI are inverse proportion.
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Interface Support USB / RS232 / RS422 / RS485 / TCPIP
Dimension 107W x 138H x 30D (mm)
Provide WinXP / VISTA / Win7 / WinCE / Linux SDK library
Software
for software development.
Feature
Wristband design.
Call button: Emergency reporting / Signal transmission.
Remote ON/OFF Tag.
Wireless tag programming.
Two-color LED visual indication: normally the emitting signal blinks green; when the battery is low or the light sensor is triggered, it blinks red.
Built-in two thermal sensors for continuously monitoring the temperature of infants. One thermal sensor
detects the skin temperature. The other detects the ambient temperature.
Buzzer: remote active beep or click active beep.
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Battery life 3 years (when the transmission number is 10 per day); 5 years in standby mode.
The remaining information packets include much information, such as the index of the RFID
tag, the index of the RFID reader, the date and time, the signal intensity, the ambient
temperature, and the skin temperature. However, only the signal intensities are considered in
this study to recognize the locations of active RFID tags. The signal intensities are transformed
to RSSI values and the RSSI values are saved in the database. The ambient temperature and
skin temperature are also collected and saved in the database. The proposed remote infant
monitoring system can send out warning signals when theft, mistaken holding, or abnormal
body temperature of a baby occurs. The remote infant monitoring system enables fast
communication with the clinical staff and families through mobile devices such as notebooks,
tablet PCs, and smart phones.
3 Location Algorithm
Because the RFID signal is susceptible to environmental factors such as signal refraction,
scattering, and multi-path effects in indoor environments, the signal-strength RSSI values
received by the RFID readers fluctuate up and down. Moreover, medical equipment can
sometimes have a strong blocking effect on radio signals. These uncertain environmental
factors can degrade the accuracy of the neonatal positioning system.
To recognize the locations of active RFID tags, a machine learning technique is used. In this
study, a neural network classifier is developed to determine the correct RFID tag positions. The
basic element of an artificial neural network is a neuron. This
is a simple virtual device that accepts many inputs, sums them, applies a nonlinear
transfer function, and generates the result, either as a model prediction or as input to
other neurons. A neural network is a structure of many such neurons connected in a
systematic way. The neurons in such networks are arranged in layers. Normally, there
is one layer for input neurons, one or more layers of the hidden layers, and one layer
for output neurons. Each layer is fully interconnected to the preceding layer and the
following layer. The neurons are connected by links, and each link has a numerical weight
associated with it; these weights are the basic means of long-term memory in an artificial
neural network. A neural network learns through repeated adjustments of these weights.
In this study, a radial basis function network is selected. It consists of three layers:
an input layer, a receptor layer, and an output layer. The input and output layers are
similar to those of a multilayer perceptron. However, the hidden or receptor layer
consists of neurons that represent clusters of input patterns, similar to the clusters in a
k-means model. These clusters are based on radial basis functions. The connections
between the input neurons and the receptor weights are trained in essentially the same
manner as a k-means model. Particularly, the receptor weights are trained with only
the input fields; the output fields are ignored for the first phase of training. Only after
the receptor weights are optimized to find clusters in the input data are the
connections between the receptors and the output neurons trained to generate
predictions. Each receptor neuron has a radial basis function associated with it. The
basis function used in the study is a multidimensional Gaussian function,
$$\exp\!\left(-\frac{d_i^2}{2\sigma_i^2}\right), \qquad d_i = \lVert r - c_i \rVert \qquad (1)$$
where r is the vector of record inputs and c is the cluster center vector. The output
neurons are fully interconnected with the receptor or hidden neurons. The receptor
neurons pass on their activation values, which are weighted and summed by the
output neuron,
$$O_k = \sum_{j} W_{jk}\, a_j \qquad (2)$$
The output weights $W_{jk}$ are trained in a manner similar to the training of a two-layer
back-propagation network. The weights are initialized to small random values in the
range $-0.001 \le w_{jk} \le 0.001$, and then they are updated at each cycle p by the
formula

$$w_{jk}(p) = w_{jk}(p-1) + \Delta w_{jk}(p) \qquad (3)$$
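A minimal sketch of a radial basis function network of this form is given below; for brevity it places the receptor centers with a small k-means pass and fits the output weights by least squares instead of the incremental rule of Eq. (3), and all data, names, and parameter values are synthetic illustrations.

```python
import numpy as np

rng = np.random.default_rng(4)

def kmeans(X, k, iters=20):
    """Very small k-means pass to place the receptor (hidden) centers."""
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers

def rbf_activations(X, centers, sigma=20.0):
    """a_i = exp(-d_i^2 / (2 sigma^2)) with a shared, illustrative sigma."""
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

# toy usage: map RSSI triples to one of two location cells (synthetic data)
X = np.vstack([rng.normal([180, 120, 90], 5, (50, 3)),
               rng.normal([90, 150, 180], 5, (50, 3))])
Y = np.repeat(np.eye(2), 50, axis=0)               # one-hot cell labels
centers = kmeans(X, k=6)
A = rbf_activations(X, centers)
W, *_ = np.linalg.lstsq(A, Y, rcond=None)          # output weights (O_k = sum_j W_jk a_j)
pred = np.argmax(rbf_activations(X, centers) @ W, axis=1)
print((pred == np.argmax(Y, axis=1)).mean())       # training accuracy
```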
4 Experimental Environment
The equipment used in this study included three RFID readers and nine active RFID tags with
the same specification and features, each connected through network devices to the signal-packet
filter server. The experimental environment is 3.0 meters x 3.0 meters, as shown in Figure 2,
which also shows that the experimental region is divided into four regional grids.
5 Numerical Results
Every active RFID tag sends an electromagnetic wave each second. The intensity of the
electromagnetic waves transmitted by the active RFID tags is transformed to an RSSI value.
Each reader supports RSSI values 0-255; the reading range and RSSI are inversely proportional.
The number of experimental records is 500, with 75% of the records used as training data and
25% as test data in this study. The experimental accuracies of the RFID tag locations are shown
in Table 3. The screen of the intelligently remote infant monitoring system is shown in
Figure 4.
References
1. Chowdhury, B., Khosla, R.: RFID-based hospital real-time patient management system. In:
6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007,
pp. 363–368 (2007)
2. Swedberg, C.: Massachusetts general uses RFID to better understand its clinics (October
2009c), http://www.rfidjournal.com/article/view/5324/S
3. Hightower, J., Vakili, C., Borriello, C., Want, R.: Design and Calibration of the SpotON
AD-Hoc Location Sensing System, UWCSE 00-02-02 University of Washington,
Department of Computer Science and Engineering, Seattle,
http://www.cs.washington.edu/homes/jeffro/pubs/
hightower2001design/hightower2001design.pdf
4. Ni, L.M., Liu, Y., Lau, Y.C., Patil, A.P.: LANDMARC:Indoor Location Sensing Using
Active RFID (2003)
5. Wang, C., Wu, H., Tzeng, N.: RFID-based 3-D positioning schemes. In: IEEE INFOCOM,
pp. 1235–1243 (2007)
6. Lee, H.J., Lee, M.: Localization of mobile robot based on radio frequency identification
devices. In: SICE-ICASE, International Joint Conference, pp. 5934–5939 (October 2006)
7. Yamano, K., et al.: Self-localization of mobile robots with RFID system by using support
vector machine. In: Proceedings of 2004 IEEWRSI International Conference on Intelligent
Robots and Systems, Sendai, Japan (2004)
8. Xu, B., Gang, W.: Random sampling algorithm in RFID indoor location system. In: IEEE
International Workshop on Electronic Design, Test and Applications, DELTA 2006 (2006)
An Intelligent Infant Monitoring System
Using Active RFID
Shou-Hsiung Cheng
1 Introduction
Because newborn babies are often difficult to distinguish and lack the ability to express
themselves, neonatal injuries are often caused by human negligence or misidentification.
Infancy is the most vulnerable stage, and a variety of injuries can cause permanent sequelae or
death. The staff composition is complex and the flow of people is very large in a neonatal
hospital. How to enhance the identification of newborns, to prevent mistaken holding, and to
monitor the physiological status of newborn babies are important projects in neonatal care. It is
becoming more and more important for hospitals to establish an infant monitoring system using
RFID in today's competitive environment.
In recent years, high expectations for the integration of RFID in healthcare scenarios have
emerged. However, in spite of recent research interest in the healthcare environment, RFID
adoption is still in its infancy and a larger number of experiences need to be collected. As a
consequence of Severe Acute Respiratory Syndrome (SARS), in which 37 patients died and
part of the medical personnel was also infected, the Ministry of Economic Affairs in Taiwan
granted research funds to support the implementation of RFID in healthcare. Tzeng et al. [1]
presented the experience of five early-adopter hospitals. The authors conclude that future
empirical research will be helpful in validating their propositions, requiring a bigger number of
experiences to be collected and studied. However, they consider RFID useful in enhancing
patient care and analyzing the workload of medical staff. Holzinger et al. [2] reported an
experience of tracking elderly patients suffering from dementia, tested in the Albert
Schweitzer II Hospital with the purpose of providing real-time location and an alert system if a
patient goes beyond his expected location. Medical personnel provided positive feedback, but
patients themselves reacted negatively to the idea of surveillance.
At present, Radio Frequency Identification (RFID) technology has been widely
accepted in hospitals to track and locate newborn babies. RFID positioning systems
can be broadly divided into two classes: tag and reader localization, depending on the
RFID component type of the target. In tag localization schemes, readers and possibly
tags are deployed as reference points within the area of interest and a positioning
technique is applied for estimating the location of a tag. Hightower [3] was the first to
use RFID technology to build the indoor positioning system. He uses RSS
measurements and proposed SpotON system to estimate the distance between a target
tag and at least three readers and then applies trilateration on the estimated distances.
Lionel [4] proposed the LANDMARC method based on RFID to build an indoor positioning
system; the LANDMARC system has a greater read range and response capability of the active
sensor tags compared to SpotON. On the other hand, in reader localization schemes, usually
passive or active tags with known coordinates and possibly readers are deployed as reference
points and their IDs are associated with their location information. In the scheme of Lee et
al. [5], passive tags are arranged on the floor at known locations in a square pattern. The reader
acquires all readable tag locations and estimates its location and orientation by using the
weighted average method and the Hough transform, respectively. Han et al. [6] arrange tags in
a triangular pattern so that the distance in the x-direction is reduced; they show that the
maximum estimation error is reduced by about 18% from the error in the square pattern.
This study proposes a straightforward and efficient infant monitoring system to reduce the
potential risks of theft, mistaken holding, and abnormal body temperature. The proposed
monitoring and location system can not only recognize different babies but also track the
locations of newborn babies by using active RFID tags. Further, the proposed infant monitoring
system can send out warning signals when theft, mistaken holding, or abnormal body
temperature of a baby occurs.
RFID tags can be read through bed linens while newborn babies are sleeping, without
disturbing them. RFID technology provides a method to transmit and receive data from a
newborn baby to health service providers/medical professionals without human intervention
(i.e., wireless communication). It is an automated data-capture technology that can be used to
identify, track, and store patient information electronically on an RFID wristband smart tag.
Moreover, medical professionals/consultants can access and update a patient's record remotely
via a Wi-Fi connection using mobile devices such as Personal Digital Assistants (PDAs),
laptops, and other mobile devices. Wi-Fi, the wireless local area network (WLAN) technology,
allows healthcare providers (e.g., hospitals) to deploy a network more quickly, at lower cost,
and with greater flexibility than a wired system.
Every RFID tag transmits electromagnetic waves of varying intensity. The RFID reader sends
the information packets of every RFID tag through the network to the back-end server for
filtering. However, the information received from the RFID reader is not all usable, as it
mingles lots of incomplete signals, noise, etc. Therefore, the information packets should be
filtered to remove unnecessary records. According to the collected information packets, the
filter server will filter out signals that are too low, lose some data, or are invalid. The remaining
information packets are saved in the database and are used by the location classifier as
experimental data in order to recognize the locations of the active RFID tags. When abnormal
conditions of newborns occur, the nursing staff can control the abnormal situation immediately
and handle it responsibly after receipt of the warning.
Feature
Support 316 RF channels
Support anti-collision
Support buzzer alarm
LED Multi-LED visual indication
Baud Rate 2,400 bps ~ 115,200 bps
UID: Tag’s identification number
T1: Ambient temperature sensor
T2: Skin temperature sensor
Note: T1 / T2 / SENSOR are used for anti-tamper capability.
Support RSSI values 0-255 and RSSI values are inverse proportion
RSSI: Received Signal Strength Indication (0-255). Reading range and RSSI are inverse proportion.
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Interface Support USB / RS232 / RS422 / RS485 / TCPIP
Dimension 107W x 138H x 30D (mm)
Provide WinXP / VISTA / Win7 / WinCE / Linux SDK library
Software
for software development.
Feature
Wristband design.
Call button: Emergency reporting / Signal transmission.
Remote ON/OFF Tag.
Wireless tag programming.
Two-color LED visual indication: normally the emitting signal blinks green; when the battery is low or the light sensor is triggered, it blinks red.
Built-in two thermal sensors for continuously monitoring the temperature of infants. One thermal sensor
detects the skin temperature. The other detects the ambient temperature.
Buzzer: remote active beep or click active beep.
Specification
Frequency 2.45GHz
Modulation FSK
Distance 100 m
Power Consumption 3.0 mm
Transmission power 10dBm
Receiver sensitivity -103dBm
Battery life 3 years (when the transmission number is 10 per day); 5 years in standby mode.
The remaining information packets include much information, such as the index of the RFID
tag, the index of the RFID reader, the date and time, the signal intensity, the ambient
temperature, and the skin temperature. However, only the signal intensities are considered in
this study to recognize the locations of active RFID tags. The signal intensities are transformed
to RSSI values and the RSSI values are saved in the database. The ambient temperature and
skin temperature are also collected and saved in the database. The proposed infant monitoring
system can send out warning signals to the nursing station when theft, mistaken holding, or
abnormal body temperature of a baby occurs. The infant monitoring system can also detect
temperature anomalies of newborn babies in real time via the body temperature sensors.
3 Location Algorithm
A Bayesian network classifier assigns a location class $Y_i$ to a sample
$d_j = (x_1^j, x_2^j, \ldots, x_n^j)$ by determining the probability. The probabilities are calculated as

$$\Pr\!\left(Y_i \mid X_1 = x_1^j, X_2 = x_2^j, \ldots, X_n = x_n^j\right) \qquad (1)$$

Equation (1) can also be rewritten as

$$\Pr = \frac{\Pr(Y_i)\,\Pr\!\left(X_1 = x_1^j, X_2 = x_2^j, \ldots, X_n = x_n^j \mid Y_i\right)}{\Pr\!\left(X_1 = x_1^j, X_2 = x_2^j, \ldots, X_n = x_n^j\right)}$$

Under the conditional independence assumption, this is proportional to

$$\Pr \propto \Pr(Y_i)\prod_{k=1}^{n}\Pr\!\left(X_k = x_k^j \mid \pi_k^j, Y_i\right)$$
$$\hat{\theta}_{ijk} = \frac{N_{ijk} + N_{ijk}^{0}}{N_{ij} + N_{ij}^{0}}$$

The posterior estimation is always used for model updating.
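A minimal sketch of a discrete naive Bayes location classifier consistent with the expressions above is given below, using add-one smoothing in place of the prior counts; the binned RSSI values, labels, and names are illustrative.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Estimate class priors Pr(Y_i) and smoothed conditionals
    Pr(X_k = x | Y_i) from discretized RSSI readings."""
    classes = sorted(set(labels))
    priors = {c: labels.count(c) / len(labels) for c in classes}
    cond = defaultdict(Counter)            # (class, feature index) -> value counts
    for x, y in zip(samples, labels):
        for k, v in enumerate(x):
            cond[(y, k)][v] += 1
    return classes, priors, cond

def predict(x, classes, priors, cond, n_values=3):
    """Pick the class maximizing Pr(Y_i) * prod_k Pr(X_k = x_k | Y_i)."""
    scores = {}
    for c in classes:
        p = priors[c]
        for k, v in enumerate(x):
            counts = cond[(c, k)]
            p *= (counts[v] + 1) / (sum(counts.values()) + n_values)   # add-one smoothing
        scores[c] = p
    return max(scores, key=scores.get)

# toy usage: features are binned RSSI levels from three readers
samples = [("hi", "lo", "lo"), ("hi", "mid", "lo"), ("lo", "hi", "mid"), ("lo", "hi", "hi")]
labels = ["cell-1", "cell-1", "cell-2", "cell-2"]
print(predict(("hi", "lo", "mid"), *train_nb(samples, labels)))
```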
4 Experimental Environment
The equipment used in this study included three RFID readers and nine active RFID tags with
the same specification and features, each connected through network devices to the signal-packet
filter server. The experimental environment is 3.0 meters x 3.0 meters, as shown in Figure 2,
which also shows that the experimental region is divided into three regional grids.
5 Numerical Results
Every active RFID tag sends an electromagnetic wave each second. The intensity of the
electromagnetic waves transmitted by the active RFID tags is transformed to an RSSI value.
Each reader supports RSSI values 0-255; the reading range and RSSI are inversely proportional.
The number of experimental records is 400, with 75% of the records used as training data and
25% as test data in this study. The experimental accuracies of the RFID tag locations are shown
in Table 3. The screen of the intelligent infant monitoring system is shown in Figure 4.
6 Conclusion
This paper presents an intelligent infant monitoring system based on active RFID in
conjunction with a Bayesian network classifier. From the experimental results obtained
in this study, some conclusions can be summarized as follows:
(1.) The experimental results show that the proposed infant monitoring system can
accurately recognize the locations of newborn babies.
(2.) The infant monitoring system can also detect temperature anomalies of newborn
babies in real time via the body temperature sensors.
(3.) The infant monitoring system presented in the study is straightforward, simple,
and valuable for practical applications.
Fuzzy Decision Making for Diagnosing Machine Fault
1 Introduction
Although manufacturing technology for machines is mature, handling machine fault states in real time is still very important. If a machine fails, manufacturers have to spend a lot of time finding and eliminating the breakdown. With an accurate machine fault diagnosis system, decision makers can improve production capability and reduce the defective rate.
Machine fault diagnosis is not only a traditional maintenance problem but also an important research issue, and many papers have investigated these topics. Tse et al. [1] designed an exact wavelet analysis for use in vibration-based machine fault diagnosis. Fong and Hui [2] described an intelligent data mining technique that combines neural networks and rule-based reasoning with case-based reasoning to mine information from a customer service database for online machine fault diagnosis. Liu and Liu [3] presented an efficient expert system for machine fault diagnosis to improve the efficiency of the diagnostic process. Zeng and Wang [4] investigated the feasibility of employing fuzzy set theory in an integrated machine-fault diagnostic system. Son et al. [5] presented novel research using smart sensor systems for machine fault diagnosis. Kurek and Osowski [6] presented an automatic computerized system for diagnosing the rotor bars of an induction electrical motor by applying the support vector machine. Based on [10-12], in this study we propose two propositions to treat the machine fault diagnosis problem.
Section 2 presents preliminaries. The proposed machine fault diagnosis method is described in Section 3. An example implementation is given in Section 4. Finally, we draw conclusions in Section 5.
2 Preliminaries
For the proposed algorithm, all pertinent definitions of fuzzy sets are given below
[7-12].
Definition 2.1. Triangular Fuzzy Numbers: Let $\tilde{A} = (p, q, r)$, $p < q < r$, be a fuzzy set on $R$. It is called a triangular fuzzy number if its membership function is

$$\mu_{\tilde{A}}(x) = \begin{cases} \dfrac{x-p}{q-p}, & \text{if } p \le x \le q, \\ \dfrac{r-x}{r-q}, & \text{if } q \le x \le r, \\ 0, & \text{otherwise.} \end{cases} \qquad (1)$$

If $r = q$ and $p = q$, then $\tilde{A}$ is $(q, q, q)$. We call it the fuzzy point $\tilde{q}$ at $q$.
Proposition 2.1. Let $\tilde{A}_1 = (p_1, q_1, r_1)$ and $\tilde{A}_2 = (p_2, q_2, r_2)$ be two triangular fuzzy numbers, and let $k > 0$. Then we have

(1) $\tilde{A}_1 \oplus \tilde{A}_2 = (p_1 + p_2,\; q_1 + q_2,\; r_1 + r_2)$  (2)

(2) $k \otimes \tilde{A}_1 = (k p_1,\; k q_1,\; k r_1)$  (3)
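As an illustration only (not code from the paper), Definition 2.1 and Proposition 2.1 can be written as a short sketch; the numbers in the example at the end are arbitrary.

```python
from dataclasses import dataclass

# Minimal sketch of a triangular fuzzy number (p, q, r): membership function of
# Eq. (1), fuzzy addition of Eq. (2), and scalar multiplication of Eq. (3).
@dataclass
class TriangularFuzzyNumber:
    p: float
    q: float
    r: float

    def membership(self, x: float) -> float:
        if self.p <= x <= self.q and self.q > self.p:
            return (x - self.p) / (self.q - self.p)      # rising branch
        if self.q <= x <= self.r and self.r > self.q:
            return (self.r - x) / (self.r - self.q)      # falling branch
        return 1.0 if x == self.q else 0.0               # fuzzy point / outside support

    def __add__(self, other):
        return TriangularFuzzyNumber(self.p + other.p, self.q + other.q, self.r + other.r)

    def scale(self, k: float):
        assert k > 0
        return TriangularFuzzyNumber(k * self.p, k * self.q, k * self.r)

a1, a2 = TriangularFuzzyNumber(1, 2, 3), TriangularFuzzyNumber(2, 3, 5)
print(a1 + a2, a1.scale(2), a1.membership(1.5))
```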
Step 1: Let $\tilde{H}$ be the fuzzy relation on the set M of machines and the set S of symptoms as in Eq. (5).
Step 3: By Eq. (5) and Eq. (6), the fuzzy relation $\tilde{B}$ on the set M of machines and the set F of fault causes can be inferred by means of the compositional rule of inference as in Eq. (7).
$$b_{tv} = \min\!\left(\sum_{k=1}^{p} a_{tk} \cdot r_{kv},\ 1\right) \qquad (8)$$

$$B_{tv} = \frac{b_{tv}}{\sum_{k=1}^{q} b_{tk}} \qquad (9)$$
Then, we have

$$0 < B_{tv} \le 1 \quad\text{and}\quad \sum_{v=1}^{q} B_{tv} = 1 \qquad (10)$$

for t = 1, 2, …, r.
Step 4: From Eq. (9), we have the fuzzy inferred diagnosis of the machine $M_t$ as follows:

$$\tilde{B}_t = \frac{B_{t1}}{F_1} + \frac{B_{t2}}{F_2} + \cdots + \frac{B_{tq}}{F_q} \qquad (11)$$

Eq. (11) shows that $B_{tv}$ is the membership grade of the fault cause $F_v$ for the machine $M_t$, i.e., $B_{tv}$ is the fault grade of $F_v$ for the machine $M_t$.
Then, we have the following Proposition 3.1:

Proposition 3.1. By the fuzzy compositional rule of inference, we have
(1) The optimal diagnosis based on the maximal membership grade:
We let
where

$$C_k = \min\!\left(\sum_{t=1}^{p} a_t\, r_{tk},\ 1\right) \qquad (14)$$
$$D_k = \frac{C_k}{\sum_{t=1}^{q} C_t} \in [0, 1] \qquad (15)$$
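For illustration, a minimal sketch of Eqs. (14)-(15) is given below. It is not the authors' code; the symptom weights and relation matrix used in the example are hypothetical values, not data from the paper.

```python
import numpy as np

# Sketch of the diagnosis scores: C_k = min(sum_t a_t * r_tk, 1) per Eq. (14)
# and the normalized scores D_k = C_k / sum_j C_j per Eq. (15).
def fuzzy_fault_scores(a, R):
    """a: weights of the p symptoms; R: p x q fuzzy relation (symptom -> fault)."""
    a, R = np.asarray(a, float), np.asarray(R, float)
    C = np.minimum(a @ R, 1.0)     # Eq. (14): clipped weighted sum per fault cause
    D = C / C.sum()                # Eq. (15): normalized scores in [0, 1]
    return C, D

if __name__ == "__main__":
    a = [0.3, 0.4, 0.1, 0.2]                     # illustrative symptom weights
    R = [[0.40, 0.35, 0.25],                     # hypothetical symptom-to-fault relation
         [0.30, 0.45, 0.25],
         [0.50, 0.20, 0.30],
         [0.35, 0.30, 0.35]]
    C, D = fuzzy_fault_scores(a, R)
    print(C, D)                                  # argmax of D gives the diagnosed fault cause
```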
4 Numeric Example
Let S = {S1, S2, S3, S4} be the set of fault symptoms of one machine and F = {F1, F2, F3} be the set of fault causes, with the data given in Table 1:

      F1    F2    F3
S1    0.4   0.35  0.25

If $\tilde{A}$ = (0.3, 0.4, 0.1, 0.2) is given by the decision maker, then by Proposition 3.2 we have D1 = 0.391, D2 = 0.311, D3 = 0.298.
(1) By the maximal membership grade, the machine M* is diagnosed with fault cause F1.
(2) By the probability distribution principle, the probability of fault cause F1 is 0.391, that of F2 is 0.311, and that of F3 is 0.298.
5 Conclusion
In this study, we present a fuzzy model for diagnosing machine faults. We use the fuzzy compositional rule of inference to derive two propositions for treating the machine fault diagnosis problem.
References
1. Tse, P.W., Yang, W.-X., Tam, H.Y.: Machine fault diagnosis through an effective exact
wavelet analysis. Journal of Sound and Vibration 277(4-5), 1005–1024 (2004)
2. Fong, A.C.M., Hui, S.C.: An intelligent online machine fault diagnosis system. Journal of
Computing & Control Engineering 12(5), 217–223 (2001)
3. Liu, S.C., Liu, S.Y.: An Efficient Expert System for Machine Fault Diagnosis. The
International Journal of Advanced Manufacturing Technology 21(9), 691–698 (2003)
4. Zeng, L., Wang, H.P.: Machine-fault classification: A fuzzy-set approach. The
International Journal of Advanced Manufacturing Technology 6(1), 83–93 (1991)
5. Son, J.-D., Niu, G., Yang, B.-S., Hwang, D.-H., Kang, D.-S.: Development of smart
sensors system for machine fault diagnosis. Expert Systems with Applications 36(9),
11981–11991 (2009)
6. Kurek, J., Osowski, S.: Support vector machine for fault diagnosis of the broken rotor bars
of squirrel-cage induction motor. Neural Comput. & Applic. 19, 557–564 (2010)
7. Kaufmann, A., Gupta, M.M.: Introduction to Fuzzy Arithmetic: Theory and Applications. Van Nostrand Reinhold, New York (1991)
8. Zimmermann, H.-J.: Fuzzy Set Theory and Its Applications. Kluwer Academic Publishers, Boston (1991)
9. Zadeh, L.A.: Fuzzy Sets. Information and Control 8, 338–353 (1965)
10. Lee, H.-M.: Applying Fuzzy Set Theory to Evaluate the Rate of Aggregative Risk in
Software Development. Fuzzy Sets and Systems 79, 323–336 (1996)
11. Lin, L., Lee, H.-M.: Fuzzy Assessment Method on Sampling Survey Analysis. Expert
Systems with Applications 36, 5955–5961 (2009)
12. Lin, L., Lee, H.-M.: Group Assessment Based on the Linear Fuzzy Linguistics.
International Journal of Innovative Computing Information and Control 6(1), 263–274
(2010)
Evaluation of the Improved Penalty Avoiding
Rational Policy Making Algorithm
in Real World Environment
1 Introduction
Reinforcement learning (RL) [14] is a kind of machine learning [6,4]. It aims to adapt an agent to a given environment through a reward and a penalty. Traditional RL systems are mainly based on Dynamic Programming (DP). They can obtain an optimum policy that maximizes the expected discounted reward in Markov Decision Processes (MDPs). Temporal Difference learning [14] and Q-learning [14] are examples of DP-based RL systems. They are very attractive since they are able to guarantee optimality in MDPs. The class of Partially Observable Markov Decision Processes (POMDPs) is wider than that of MDPs. If we apply DP-based RL systems to POMDPs, we face some limitations. Hence, a heuristic eligibility trace is often used to treat a POMDP. TD(λ) [14], Sarsa(λ) [14] and Actor-Critic [3] are such kinds of RL systems.
The DP-based RL system aims to optimize its behavior under given reward and penalty values. However, it is difficult to design these values appropriately for our purposes. If we set inappropriate values, the agent may learn unexpected behavior [9]. Inverse Reinforcement Learning (IRL) [12,1] is a method related to the design problem of reward and penalty values.
On the other hand, we are interested in the approach where a reward and a penalty are treated independently. As examples of RL systems for environments with a single type of reward, we know the rationality theorem of Profit Sharing (PS) [7], the Rational Policy Making algorithm (RPM) [8] and so on. Furthermore, we know the Penalty Avoiding Rational Policy Making algorithm (PARP) [9] and Improved PARP [15] as examples of RL systems that are able to treat a penalty, too. We call these systems Exploitation-oriented Learning (XoL) [11].
XoL has several features: (1) Though traditional RL systems require appropriate reward and penalty values, XoL only requires the degree of importance among them. In general, this is easier than designing their values. (2) It can learn more quickly since it traces successful experiences very strongly. (3) It is not suitable for pursuing an optimum policy. The optimum policy can be acquired with the multi-start method [8], but this needs to reset all memories to get a better policy. (4) It is effective on classes beyond MDPs since it is a Bellman-free method [14] that does not depend on DP.
We are interested in XoL since we require quick learning and/or learning in classes wider than MDPs. Especially, we focus on Improved PARP, whose effectiveness has been confirmed in computer simulations [15,5]. However, there are no results in a real-world environment. In this paper, we show the effectiveness of Improved PARP in a real-world environment using the Keepaway task [13], a testbed of a multiagent soccer environment.
2 The Domain
2.1 Notations
Consider an agent in some unknown environment. For each discrete time step,
after the agent senses the environment as a pair of a discrete attribute and its
value, it selects an action from some discrete actions and executes it. In usual
DP-based RL systems and PS, a scalar weight, that indicates the importance of
a rule, is assigned to each rule. The environment provides a reward or a penalty
to the agent as a result of some sequence of actions.
We term the sequence of rules selected between the rewards as an episode.
Consider a part of an episode where the sensory input of the first selection rule
and the sensory output of the last selection rule are the same although both rules
are different. We term it a detour. The rules on a detour may not contribute
to obtain a reward. We term a rule irrational if and only if it always exists on detours in any episode. Otherwise, the rule is termed rational. We term a rule a penalty rule if and only if it gets a penalty or the state transits to a penalty state in which there are only penalty or irrational rules.
The function that maps sensory inputs to actions is termed a policy. The policy that maximizes the expected reward per action is termed an optimum policy. We term a policy rational if and only if the expected reward per action is positive. We term a rational policy a penalty avoiding rational policy if
and only if it has no penalty rule. Furthermore, the policy that outputs just one
action for each sensory input is termed a deterministic policy.
Fig. 1. The Penalty Rule Judgment algorithm (PRJ); We can regard the marked rule as
a penalty rule. We can find all penalty rules in the current rule set through continuing
PRJ.
$$PL(xa) = \frac{N_p(xa)}{N(xa)}, \qquad (1)$$

where $N_p(xa)$ is the number of times the rule was judged as a penalty rule by PRJ, and $N(xa)$ is the number of times it was chosen until then. If $PL(xa)$ is close to 1, the rule "xa" has a high possibility of receiving a penalty. Conversely, if $PL(xa)$ is close to 0, the selected action will seldom receive a penalty. Now, let γ be the threshold of the penalty rule rate; then rules are classified as follows:

$$\text{rule } xa = \begin{cases} \text{penalty rule} & \text{if } PL(xa) > \gamma \\ \text{non penalty rule} & \text{otherwise} \end{cases} \qquad (2)$$
This means that large γ will reduce the number of penalty rules and increase
the number of applicable rules in the state.
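A minimal sketch of Eqs. (1)-(2) may help; the counts and threshold in the example are hypothetical, not values from the paper.

```python
# Sketch of the penalty rule rate PL(xa) = Np(xa) / N(xa) and its gamma-threshold
# classification into "penalty rule" vs. "non penalty rule".
def penalty_level(n_penalty_judgements: int, n_selections: int) -> float:
    return n_penalty_judgements / n_selections if n_selections else 0.0

def classify_rule(n_penalty_judgements: int, n_selections: int, gamma: float) -> str:
    pl = penalty_level(n_penalty_judgements, n_selections)
    return "penalty rule" if pl > gamma else "non penalty rule"

# A rule judged as a penalty rule 3 times out of 20 selections, gamma = 0.2:
print(classify_rule(3, 20, gamma=0.2))   # -> "non penalty rule" (PL = 0.15)
```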
3 Keepaway Task
3.1 Basic Setting
In the RoboCup soccer Keepaway environment [13], studied as a multiagent consecutive-task benchmark, a keeper agent tries to keep a ball as long as possible without it being stolen by the opponent agent, called a taker; this requires cooperative behavior among the keeper agents.
Though there are several studies [2,15] that use the Keepaway simulator provided with the paper [13], we execute an experiment in a real-world environment using small robots. We use three keepers and one taker. This 3-vs-1 Keepaway task occupies a 230 [cm] × 230 [cm] playing field (Fig. 2). The ball-holding keeper is K1, and the other keepers are labeled K2 and K3 according to whichever is closer to K1. The three keepers are initially located as shown in Fig. 2, where the positions of K1, K2 and K3 are (0,-150), (-100,-150) and (100,150). The taker T1 and the ball are initially located at (0,150) and (0,-100). Only the keepers are learning agents and learn their policies.
Fig. 4. Agent action roulette
3.3 Actions
The following macro actions are defined:
Stop(): stay at the current location.
Dribble(α): dribble in the direction α.
Kick(α): kick in the direction α.
Go Ball(): turn to the ball and move one step.
Go Left(): turn by 45 degrees to the left and move one step.
Go Right(): turn by 45 degrees to the right and move one step.
α is selected from the set {ahead, left, right}, where "ahead" means the front direction and "left" or "right" is 45 degrees to the left or right. Dribble() and Kick() are achieved by pushing the ball forward, but the pushing strength of Dribble() is one third of that of Kick().
K1, the ball keeper, selects its action in two stages as shown in Fig. 4. K1 first selects an action from the set { Stop(), Dribble(), Kick() }. It is decided with Roulette1, where the maximum weights of { Dribble(ahead), Dribble(Left), Dribble(Right) } and { Kick(ahead), Kick(Left), Kick(Right) } are assigned to the weights of Dribble() and Kick(), respectively. If the selected action is Dribble() or Kick(), the detailed action is decided with another roulette (Roulette2).
The position of the robot is at the center of gravity of two markers : (Xrobot,
Yrobot) in Fig.7(a). The position of a ball is at the center of ball marker (orange):
(Xball, Yball) in Fig.7(b).
The orientation of the robot is calculated from the centers of gravity of the two markers (Fig.8). Let the coordinates of the center of gravity of the center marker be (Xcenter, Ycenter) and those of the other marker be (Xoutside, Youtside); then the angle θ is obtained by θ = atan2(dy, dx) (−180° < θ < 180°), where dx = xoutside − xcenter and dy = youtside − ycenter.
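For illustration, the orientation computation can be written in a few lines; the marker coordinates in the example are arbitrary.

```python
import math

# Sketch of the heading computation from the two marker centers described above.
def robot_heading(x_center, y_center, x_outside, y_outside):
    """Angle theta in degrees, in (-180, 180], derived from the two markers."""
    dx, dy = x_outside - x_center, y_outside - y_center
    return math.degrees(math.atan2(dy, dx))

print(robot_heading(0.0, 0.0, 1.0, 1.0))   # 45.0
```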
5.3 Results
Fig.9 shows the number of successful passes every 10 trials for 5 experiments, plotted against the number of trials. Fig.10 is the average number of successful passes over the 5 experiments of Fig.9.
In case (1), K1 learned "Kick(right)", which is the best action as described later, within 30 trials. In (2), the agent learned "Kick(right)" and "Dribble(right)" at the same time; this resulted in slightly inferior performance. In (3), K1 learned "Kick(left)" at first, and the agent learned "Kick(right)" after "Kick(left)" was classified as a penalty rule. In (4), K1 could not learn "Kick(right)" within 150 trials since the agent failed to get enough reward. In (5), K1 could learn "Kick(right)" within 30 trials, but "Kick(right)" was classified as a penalty rule since the action often failed due to the uncertainty.
5.4 Discussion
We can recognize the effect of learning with a penalty from Fig.10. Namely, since
“Kick(ahead)“ is more likely to fail, it was regarded as a penalty rule in the early
trials and the rule was removed from the candidates of the action to accelerate
the learning speed.
In all experiments, "Kick(right)", "Kick(left)" and "Dribble(ahead)" were learned with almost the same probability in the initial state. "Kick(right)" is better than "Kick(left)" since there is a slope in our environment such that "Kick(right)" is preferred. On the other hand, "Dribble(ahead)" is helpful to avoid the taker. Therefore, learning "Kick(right)" in the early trials is very profitable for the agents.
Case (1) is the best result. Cases (2) and (3) are reasonable results, too. But cases (4) and (5) failed to learn within 150 trials. Case (4) occurred since it was difficult for the agent to get enough reward in 150 trials. On the other hand, if we used a larger γ, we would be able to avoid case (5).
PARP required more trials than Improved PARP in our previous computer simulations [15,5]. It could not improve the number of successful passes within 150 trials on this task either.
6 Conclusion
References
1. Abbeel, P., Ng, A.Y.: Exploration and apprenticeship learning in reinforcement
learning. In: Proc. of the 22nd International Conference on Machine Learning,
pp. 1–8 (2005)
2. Arai, S., Tanaka, N.: Experimental Analysis of Reward Design for Continuing Task
in Multiagent Domains – RoboCup Soccer Keepaway. Transactions of the Japanese
Society for Artificial Intelligence 21(6), 537–546 (2006) (in Japanese)
3. Kimura, H., Kobayashi, S.: An analysis of actor/critic algorithm using eligibility
traces: reinforcement learning with imperfect value function. In: Proc. of the 15th
Int. Conf. on Machine Learning, pp. 278–286 (1998)
4. Hong, T., Wu, C.: An Improved Weighted Clustering Algorithm for Determination
of Application Nodes in Heterogeneous Sensor Networks. J. of Information Hiding
and Multimedia Signal Processing. 2(2), 173–184 (2011)
5. Kuroda, S., Miyazaki, K., Kobayashi, H.: Introduction of Fixed Mode States into
Online Profit Sharing and Its Application to Waist Trajectory Generation of Biped
Robot. In: European Workshop on Reinforcement Learning 9 (2011)
6. Lin, T.C., Huang, H.C., Liao, B.Y., Pan, J.S.: An Optimized Approach on Applying
Genetic Algorithm to Adaptive Cluster Validity Index. International Journal of
Computer Sciences and Engineering Systems 1(4), 253–257 (2007)
7. Miyazaki, K., Yamamura, M., Kobayashi, S.: On the Rationality of Profit Sharing
in Reinforcement Learning. In: Proc. of the 3rd Int. Conf. on Fuzzy Logic, Neural
Nets and Soft Computing, pp. 285–288 (1994)
8. Miyazaki, K., Kobayashi, S.: Learning Deterministic Policies in Partially Observ-
able Markov Decision Processes. In: Proc. of 5th Int. Conf. on Intelligent Au-
tonomous System, pp. 250–257 (1998)
9. Miyazaki, K., Kobayashi, S.: Reinforcement Learning for Penalty Avoiding Policy
Making. In: Proc. of the 2000 IEEE Int. Conf. on Systems, Man and Cybernetics,
pp. 206–211 (2000)
10. Miyazaki, K., Kobayashi, S.: A Reinforcement Learning System for Penalty Avoid-
ing in Continuous State Spaces. J. of Advanced Computational Intelligence and
Intelligent Informatics 11(6), 668–676 (2007)
11. Miyazaki, K., Kobayashi, S.: Exploitation-Oriented Learning PS-r# . J. of Advanced
Computational Intelligence and Intelligent Informatics 13(6), 624–630 (2009)
12. Ng, A.Y., Russell, S.J.: Algorithms for Inverse Reinforcement Learning. In: Proc.
of the 17th Int. Conf. on Machine Learning, pp. 663–670 (2000)
13. Stone, P., Sutton, R.S., Kuhlmann, G.: Reinforcement Learning for RoboCup
Soccer Keepaway. Adaptive Behavior 13(3), 165–188 (2005)
14. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. A Bradford
Book. MIT Press (1998)
15. Watanabe, T., Miyazaki, K., Kobayashi, H.: A New Improved Penalty Avoiding
Rational Policy Making Algorithm for Keepaway with Continuous State Spaces. J.
of Advanced Computational Intelligence and Intelligent Informatics. 13(6), 675–682
(2009)
Similarity Search in Streaming Time Series
Based on MP_C Dimensionality Reduction Method
Abstract. The similarity search problem in streaming time series has become a
hot research topic since such data arise in so many applications of various areas.
In this problem, the fact that data streams are updated continuously as new data
arrive in real time is a challenge due to expensive dimensionality reduction
recomputation and index update costs. In this paper, adopting the ideas of a delayed update policy and an incremental computation from the IDC-Index (Incremental Discrete Fourier Transform (DFT) Computation – Index), we propose a new approach for similarity search in streaming time series by using MP_C as the dimensionality reduction method, with the support of Skyline index.
Our experiments show that our proposed approach for similarity search in
streaming time series is more efficient than the IDC-Index in terms of pruning
power, normalized CPU cost and recomputation and update time.
1 Introduction
A streaming time series (STS) is a real-valued sequence C = c1, c2, …, where new values are continuously appended at the end of the sequence C as time progresses. Because an STS includes a great number of values, similarity is measured over the last W values of the streams (W is the length of a sliding window).
The similarity search problem in STS has become a hot research topic due to its
importance in many applications of various areas such as earthquake forecast, internet
traffic examination, moving object examination, financial market analysis, and
anomaly detection ([2],[5],[6]). The challenge in these applications is that STS comes
continuously in real time, i.e., time series data are frequently updated. Methods that
are used to perform similarity search on archived time series in the past may not work
efficiently in streaming scenarios because the update and recomputation costs in STS
are significant. Therefore one needs an efficient and effective method for similarity
search in this time series data type.
In [5] and [6], Kontaki et al. proposed an index structure, the IDC-Index (Incremental DFT Computation – Index), which can be used for similarity search in STS. The IDC-Index is based on a multi-dimensional index structure, the R*-tree, improved with a delayed update policy and an incremental calculation of DFT. This approach is used in order to reduce update and re-computation costs in STS similarity search. However, the effectiveness of the IDC-Index is limited by its use of the R*-tree as index structure and DFT as dimensionality reduction method. In order to enhance the efficiency of similarity search
in STS, we propose a new approach which uses MP_C as the dimensionality reduction method and Skyline index as the multidimensional index structure.
In the proposed approach, while using the same idea of a delayed update strategy
and an incremental calculation for feature extraction, we can show that our new
method for time series dimensionality reduction, MP_C, with the support of Skyline
index can provide a more efficient similarity search in STS than IDC-Index in terms
of pruning power, normalized CPU cost and recomputation and update time.
2 Preliminaries
2.1 Index Structures
The popular multidimensional index structures are the R-tree and its variants ([1], [3]). In a multidimensional index structure (e.g., R-tree or R*-tree), each node is associated with a minimum bounding rectangle (MBR). An MBR at a node is the minimum bounding box of the MBRs of its child nodes. A potential weakness of the MBR-based method is that MBRs in index nodes can overlap, and overlapping rectangles can have a negative effect on search performance. Another problem of the MBR-based method is that, by summarizing data in MBRs, the sequential nature of time series is not captured.
Skyline Index, another elegant paradigm for indexing time series data which uses a different kind of minimum bounding regions, was proposed by Li et al., 2004 [9]. Skyline Index adopts Skyline Bounding Regions (SBRs) to approximate and represent a group of time series data according to their collective shape. An SBR is defined in the same time-value space in which time series data are defined. Therefore, SBRs can capture the sequential nature of time series. SBRs allow us to define a distance function that tightly lower-bounds the distance between a query and a group of time series data. SBRs are free of internal overlaps. Hence, using the same amount of space in an index node, an SBR defines a tighter bounding region. For k-nearest-neighbor (KNN) queries, the Skyline index approach can be coupled with a well-known dimensionality reduction technique such as APCA and improves its performance by up to a factor of 3 [9].
Many solution approaches have been proposed for similarity queries in streaming time series lately. In [4], Gao and Wang, 2002, proposed a method based on prediction for similarity search in STS. In this case, the time series data are static and the query changes over time. The authors solve the problem by using the Fast Fourier Transform (FFT) to find the cross correlations between the query and the time series. The Euclidean distance between the query and each time series is calculated based on the predicted values. When the actual query arrives, the prediction error and the predicted distances are used to discard false alarms.
In [8], Liu and Ferhatosmanoglu, 2003, proposed a model for processing STS based on an index structure that can adapt to changes in the length of data objects. In this work, the VA-stream and VA+-stream index structures are used to query k-nearest neighbors. These methods partition the data space into 2^b cells, where b is defined by the user. They distribute a different number of bits to each dimension so that the total of these bits is b.
2.3 IDC-Index
In this approach, the DFT method is used for extracting features of streaming time series, and the R*-tree, based on MBRs, is used as a multi-dimensional index structure for efficient similarity search. To overcome the difficulties incurred by the streaming environment, the authors used an incremental calculation of DFT in order to avoid re-calculation every time a new value arrives, and a delayed update policy in order to avoid updating the R*-tree continuously.
The incremental calculation of DFT is computed by a formula given in [6].
3 MP_C Representation
MP_C – Compression and Clipping
Given a time series C and a query Q, without loss of generality, we assume C and Q
are n units long. C is divided into segments. Some points in each segment are chosen.
To reduce space consumption, the chosen points are transformed into a sequence of
bits, where 1 represents above the segment average and 0 represents below, i.e., if μ is the mean of segment C, then

$$c_t = \begin{cases} 1 & \text{if } c_t > \mu \\ 0 & \text{otherwise} \end{cases}$$
The mean of each segment and the bit sequence are recorded as segment features. We can choose some points in each segment with different algorithms in time order. For example, in order to choose l points we can extract the first or the last l points of each segment, and so on. For simplicity and for the ability to record the approximate shape of the sequence, in our method we use the following simple algorithm: (1) divide each segment into sub-segments, and (2) choose the middle point of each sub-segment. Fig. 1 shows the intuition behind this technique, with l = 6. In this case, the bit sequence 010111 and the μ value are recorded.
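As an illustration of the MP_C feature extraction just described, the following is a minimal sketch (not the authors' implementation); the window, the segment count, and the number of middle points per segment are arbitrary choices.

```python
import numpy as np

# Sketch of MP_C features for one window: each of the N segments keeps its mean
# and a clipped bit per sub-segment middle point (1 if that point lies above the
# segment mean, 0 otherwise).
def mpc_features(window, n_segments, points_per_segment):
    window = np.asarray(window, dtype=float)
    means, bits = [], []
    for seg in np.array_split(window, n_segments):
        mu = seg.mean()
        means.append(mu)
        for sub in np.array_split(seg, points_per_segment):
            middle = sub[len(sub) // 2]          # middle point of the sub-segment
            bits.append(1 if middle > mu else 0)
    return np.array(means), np.array(bits)

means, bits = mpc_features(np.sin(np.linspace(0, 6.28, 64)),
                           n_segments=4, points_per_segment=6)
print(means, bits)
```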
$$D^2(Q, C') = \sum_{j=1}^{N}\sum_{i=1}^{l} \big(d(q_i, bc_i)\big)^2 \qquad (3)$$

where $\bar{q}_i$ is the mean value of the i-th segment in Q, $\bar{c}_i$ is the mean value of the i-th segment in C, and $bc_i$ is the binary representation of $c_i$. $d(q_i, bc_i)$ is computed by the following formula:
Lemma 1. If D(Q, C) is the Euclidean distance between query Q and time series C,
then DMP_C(Q, C’) ≤ D(Q, C).
The proof of Lemma 1 can be seen in our previous paper [10].
refer to an original time series in the database. The MP_C_BR associated with a non-leaf node is the smallest bounding region that spatially contains the MP_C_BRs associated with its immediate children.
Fig. 2. An example of MP_C_BR. (a)Two time series C1, C2 and their approximate MP_C
representations in four dimensional space. (b) The MP_C_BR of two MP_C sequences C’1 and
C’2. C’max = {c’11, c’21, c’32, c’42} and C’min = {c’12, c’22, c’31, c’41}
Two searching problems which we apply in our experiments are ε-range search and
KNN search algorithms.
In STS, new data values arrive continuously and the number of STS in the database may be very large. When a new value arrives, the MP_C approximation for this time series must be recomputed using the last W values of the time series. Therefore, the cost of recomputing the feature extraction may be high. Besides, the multi-dimensional index structure must be updated every time a new value of a streaming time series arrives. This may lead to a high overhead due to continuous deletions and insertions in the index structure. In order to reduce these costs, we apply an incremental computation of the MP_C method and a delayed update strategy.
• The Incremental Computation of MP_C Method
Let C = (c0, c1, …, cn-1) is the last sequence of length n of a streaming time series.
Suppose C is divided into N segments. The N segment mean values and the middle
points of each segment which are transformed into a bit sequence are recorded as
features of sequence. Let C’ = (c’0, c’1, …, c’N-1) and a bit sequence bc are the
representation of C in MP_C space. When a new value cn arrives we get a new
sequence S = (s1,s2, …,sn), where ci = si, i = 1,.., n-1 and sn (sn = cn) is a new value.
The MP_C sequence S’= (s’0, s’1, …, s’N-1) of S is computed by the previous MP_C
approximation of C. This incremental calculation of MP_C method is calculated by
the following formula:
N N
s 'i = c'i − cn + cn
n N i n N (i +1)
The middle point values $mp_i$ of the segments are extracted at positions

$$\left\lfloor \frac{n}{N}\, i + \frac{n}{2N} \right\rfloor, \qquad i = 0, \ldots, N-1.$$

The chosen middle points are transformed into a sequence of bits, where 1 represents above the average of the corresponding segment and 0 represents below, i.e., if $\mu_i$ is the mean of segment $C_i$, then

$$bc_i = \begin{cases} 1 & \text{if } mp_i > \mu_i \\ 0 & \text{otherwise} \end{cases}$$
$$s'_i = c'_i - \frac{N}{n}\, c_{\frac{n}{N} i} + \frac{N}{n}\, c_{\frac{n}{N}(i+1)} = \frac{N}{n} \sum_{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}\, i} s_j$$
In order to perform the incremental calculation of the MP_C method, the last computed MP_C sequences of all streaming time series must be stored.
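The incremental mean update can be sketched as follows. This is an illustration under stated assumptions (0-based indexing, n divisible by N), not the authors' code.

```python
import numpy as np

# Sketch of the incremental segment-mean update: when the window slides by one,
# segment i loses the old value at position (n/N)*i of the previous window and
# gains the value at position (n/N)*(i+1); the last segment gains the new value.
def incremental_mpc_means(prev_means, prev_window, new_value, n_segments):
    n = len(prev_window)
    seg_len = n // n_segments                                    # assumes n divisible by N
    entering = np.append(prev_window[seg_len::seg_len], new_value)
    leaving = prev_window[0::seg_len][:n_segments]
    return prev_means + (entering - leaving) / seg_len

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w, N, new_val = rng.normal(size=16), 4, 0.7
    prev_means = np.array([s.mean() for s in np.array_split(w, N)])
    inc = incremental_mpc_means(prev_means, w, new_val, N)
    direct = np.array([s.mean() for s in np.array_split(np.append(w[1:], new_val), N)])
    print(np.allclose(inc, direct))   # True: incremental update matches recomputation
```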
• The Delayed Update Strategy in Skyline Index
According to the delayed update strategy, to prevent a continuous update of an STS in Skyline index when a new value arrives, an update threshold T is used to control the number of updates. To mitigate the cost of an update, there is an additional link from each time series to its corresponding leaf node in Skyline index. Thanks to these "stream to leaf" links, the update can be performed faster, from a leaf node to the root.
Suppose C is an STS and C1 = (cn-w+1, …, cn) is the last sequence of length w of C, where cn is the last value of C. Let C'1 be the MP_C representation of C1. When a new value, cn+1, arrives, we get a new sequence C2 = (cn-w+2, …, cn+1), and C'2, the MP_C representation of C2, is calculated by the incremental computation based on C'1. Suppose that the MP_C sequence C'1 is the last sequence that was updated into Skyline index, corresponding to C1. If the distance between the new MP_C representation C'2 and the old MP_C approximation C'1 is less than or equal to the update threshold T (DMP_C(C'1, C'2) ≤ T), C'2 is not updated into Skyline index but is recorded as the most recent MP_C approximation of the streaming time series C, which is used to incrementally compute a new MP_C representation when a new value arrives.
Let C3 = (cn-w+3, …, cn+2) be a new sequence when another new value, cn+2, arrives, and let C'3, the MP_C representation of C3, be calculated incrementally based on C'2. C'3 is recorded as the most recent MP_C approximation instead of C'2. If the distance DMP_C(C'1, C'3) > T, Skyline index is updated by replacing C'1 with C'3 in the leaf node, and all MP_C_BRs on the path from this leaf node to the root are recomputed. Otherwise, no action is taken in the index structure.
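The decision rule of the delayed update strategy can be sketched as below; `index.replace` and `mpc_distance` are hypothetical placeholders for the Skyline index update and the D_MP_C distance, not APIs from the paper.

```python
# Sketch of the delayed update policy: push the newly computed MP_C sequence
# into the index only when its distance from the last indexed sequence exceeds T.
def maybe_update_index(index, stream_id, last_indexed, most_recent, threshold, mpc_distance):
    """Return the MP_C sequence that the index should hold for this stream."""
    if mpc_distance(last_indexed, most_recent) > threshold:
        index.replace(stream_id, most_recent)   # replace leaf entry; MP_C_BRs up to the root are recomputed
        return most_recent                       # becomes the new "last indexed" sequence
    return last_indexed                          # index untouched; most_recent is only cached
```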
In brief, we need both the previously calculated MP_C sequence and the last recorded MP_C sequence. The former is used for the incremental calculation of the new MP_C representation, and the latter is used for deciding whether an MP_C sequence is updated in the index structure or not.
6 Experimental Results
the normalized CPU cost, which is the ratio of the average CPU time to perform a query using the index to the average CPU time required to perform a sequential search. The normalized CPU cost of a sequential search is 1.0.
Fig. 3. The pruning powers on Consumer data, tested over a range of reduction ratios (8-128)
and query lengths (1024 (a), 512 (b))
The experiments have been performed over a range of query lengths (256-1024),
values of reduction ratios (8-128) and a range of data sizes (10000-30000 sequences).
For brevity, we show just two typical results. Figure 4 shows the experiment results
with a fixed query length 1024.
Between the two competing techniques, in similarity search the MP_C + Skyline
index performs faster than IDC-Index based on R*-tree.
Fig. 4. CPU cost of MP_C using Skyline index and IDC-Index on Consumer data, tested over
(a) a range of reduction ratios. (b) a range of data sizes.
Fig. 5. (a) Index building time and (b) Incremental computing and updating time of MP_C +
Skyline and IDC-Index on Consumer data
290 T.-S. Nguyen and T.-A. Duong
Building and Updating the Index. We also compare MP_C + Skyline to the IDC-Index in terms of the time taken to build the index and the time taken to perform the incremental computation and the delayed update strategy. The experimental results in Figure 5 show that the index building time and the incremental computation plus update time of MP_C + Skyline are lower than those of the IDC-Index.
7 Conclusions
We showed that our proposed technique, MP_C with the support of Skyline index, can be used for similarity search in streaming time series data. Our approach adopts the same ideas as the IDC-Index, which is based on an incremental computation and a delayed update policy. Experimental results demonstrate that our MP_C method with the support of Skyline index is better than the IDC-Index in terms of pruning power and normalized CPU cost. Besides, the index building time along with the incremental computation and update time of MP_C using Skyline index can be lower than those of the IDC-Index based on the R*-tree. The limitation of our experiments is that we have not yet adapted the index update rate to application requirements.
In the future, we plan to investigate how to keep the threshold T up-to-date as streams evolve with time, according to the desired update frequency. We expect that such a threshold could control the number of updates performed on the index so as to guarantee efficiency.
References
1. Beckmann, N., Kriegel, H.P., Schneider, R., Seeger, B.: The R*-tree: An Efficient and
Robust Access Method for Points and Rectangles. In: Proc. of 1990 ACM-SIGMOD
Conf., Atlantic City, NJ, pp. 322–331 (May 1990)
2. Babu, S., Widom, J.: Continuous queries over data streams. ACM SIGMOD Record 30(3),
109–120 (2001)
3. Guttman, A.: R-trees: a Dynamic Index Structure for Spatial Searching. In: Proc. of the
ACM SIGMOD Int. Conf. on Management of Data, June 18-21, pp. 47–57 (1984)
4. Gao, L., Wang, X.: Continually Evaluating Similarity-Based Pattern Queries on a
Streaming Time Series. In: Proc. ACM SIGMOD (2002)
5. Kontaki, M., Papadopoulos, A.N., Manolopoulos, Y.: Efficient similarity search in
streaming time sequences. In: Proceedings of the 16th International Conference on
Scientific and Statistical Database Management (SSDBM 2004), Santorini, Greece (2004)
6. Kontaki, M., Papadopoulos, A.N., Manolopoulos, Y.: Adaptive similarity search in streaming
time series with sliding windows. Data & Knowledge Engineering 16(6), 478–502 (2007)
7. Lian, X., Chen, L., Yu, J.X., Wang, G.: Similarity Match over High Speed Time Series
Streams. In: Proc. IEEE 23rd International Conference (2007)
8. Liu, X., Ferhatosmanoglu, H.: Efficient k-NN Search on Streaming Data Series. In:
Hadzilacos, T., Manolopoulos, Y., Roddick, J., Theodoridis, Y. (eds.) SSTD 2003. LNCS,
vol. 2750, pp. 83–101. Springer, Heidelberg (2003)
9. Li, Q., Lopez, I.F.V., Moon, B.: Skyline Index for Time Series Data. IEEE Trans. on
Knowledge and Data Engineering 16(6) (2004)
10. Son, N.T., Anh, D.T.: Time Series Similarity Search based on Middle Points and Clipping.
In: Proceedings of the 3rd Conference on Data Mining and Optimization (DMO 2011),
Putrajaya, Malaysia, June 28-29, pp. 13–19 (2011)
DRFLogitBoost: A Double Randomized Decision
Forest Incorporated
with LogitBoosted Decision Stumps
1 Introduction
pattern recognition [15] seriously explore the theory and the use of ensemble
methodology.
In this paper, a hybrid ensemble method is proposed, which is composed of two popular classifier ensemble schemes: bagging [1] and adaboost [9], the most popular member of the Boosting family [21]. The proposed hybrid decision forest ensemble method is motivated by the main ideas of the double bagging ensemble [13]. In the novel hybrid decision forest, predictions from simple decision trees and a boosting-type ensemble are aggregated. In bootstrapping, approximately 1/3 of the observations of the original training set are left out (defined as the out-of-bag sample (OOBS) by Breiman [2]). In this ensemble, these OOBS are utilized as a separate training set for the adaboost ensemble. The real adaboost [10] is efficient in discarding the coarse information generated in the adaboost.M1 algorithm [9]. In real adaboost, the predicted labels are transformed into class posterior probabilities of the outcomes on a real-valued scale. We have incorporated the binomial log-likelihood loss function, or logit loss [10], instead of the exponential loss function of the original adaboost algorithm. The class probabilities of real logitboost with decision stumps are utilized to enlarge the feature space of the component base decision tree classifiers of the decision forest. In this way, the decision forest is composed of the usual component decision trees but trained with features from the real logitboost module.
The underlying motivations behind the proposed ensemble are described briefly in the following: (a) the double randomization will increase the sparsity of the common instances in each resample used to train the decision tree and the logitboosted decision stumps. In this way the diversity of the ensemble will increase, and after aggregation the variance of the decision trees will also be reduced. (b) adaboost is highly preferred for 2-class problems, so we have utilized it as the additional classifier model. The exponential loss function of adaboost is non-robust, so the logit loss function (logitboosting) is used, which is robust against noisy problems. In addition, it will reduce the bias associated with the construction of the forest, as this type of decision forest incurs bias when constructed. The rest of the paper is organized as follows: in Section 2, we briefly discuss the constitutional steps of the new decision forest, emphasizing the real logitboost. In Section 3, we describe the two real-world problems handled in this paper, with a description of the experiments and a discussion of the results of both problems. This is followed by the conclusion of the paper.
Input:
– L: training set
– X: the predictors in the training dataset
– B: number of classifiers in the ensemble
– {ω1, …, ωc}: the set of class labels
– ρ: small resampling ratio
– x: a data point to be classified
1. For b = 1, …, B
   (a) L(b) ← resample of size ρ from L.
   (b) X(b) ← the matrix of predictors x1(b), …, xN(b) from L(b).
   (c) L(2b) ← resample from the out-of-bag sample L(−b) (of size 1−ρ).
   (d) RealLBoost(b) ← a real LogitBoost model constructed on L(2b).
   (e) CP(b) ← a matrix whose columns are the class probabilities of the classes, obtained by applying RealLBoost(b) to L(b).
   (f) Ccomb(b) ← (L(b) ∪ CP(b)): construct the combined classifier.
   (g) TCP(b) ← x's class posterior probabilities generated by RealLBoost(b).
   (h) cbj(x, TCP(b)) ← the probability assigned by the classifier Ccomb(b) that x comes from the class ωj.
   EndFor
2. Calculate the confidence for each class ωj by the "average" combination rule:

$$\mu_j(x) = \frac{1}{B} \sum_{b=1}^{B} c_{bj}\big(x, TCP^{(b)}\big), \qquad j = 1, \ldots, c.$$
The "Double Bagging" method was proposed by Hothorn and Lausen [13] to construct ensemble classifiers. In the double bagging framework the out-of-bag sample is used to train an additional classifier model, whose outputs are integrated with the base learning model.
Bagging-type ensemble methods reduce the prediction variance by a smoothing operation but without much effect on the bias of the base model. In other words, a bagging-type ensemble method with smaller resamples will have little effect in reducing the bias, but will reduce the variance more than the usual bootstrap sample (of the same size as the original training set). Though the constitutional steps of this new decision forest are similar to bagging, it is trained on an enlarged feature space. This entails an enhanced representational power for each base decision tree, which will lower the bias of the decision forest when combined. Moreover, the real logitboost module will produce low-biased estimates of class probabilities, which will in a sense increase
the efficiency of the additional features for each base tree classifier and hence will increase the prediction accuracy of the decision forest altogether.
In our framework each OOBS is randomized once more to construct a real logitboost, and then that real logitboost is applied back to the bootstrap sample to extract the CPP; all the CPPs of the real logitboost are stored in the matrix CPP and are used as additional features with the original features X, as [X CPP], which is an r × p(c + 1) matrix, where p is the number of features and c is the number of classes in Dl. The detailed steps of the proposed method for constructing a decision forest ensemble are described in Fig. 1.
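A compact sketch of this construction is given below. It is an approximation, not the authors' implementation: scikit-learn's GradientBoostingClassifier with depth-1 trees stands in for the real LogitBoost-with-decision-stumps module, binary classification and numpy-array inputs are assumed, and the resampling ratio and ensemble size are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Sketch of the double-randomized forest: each tree is trained on a small
# bootstrap resample whose feature space is enlarged with class probabilities
# produced by a boosting model fitted on the corresponding out-of-bag sample.
def fit_drf(X, y, n_trees=25, rho=0.5, random_state=0):
    rng = np.random.default_rng(random_state)
    n, ensemble = len(X), []
    for _ in range(n_trees):
        boot = rng.choice(n, size=int(rho * n), replace=True)   # small resample L(b)
        oob = np.setdiff1d(np.arange(n), boot)                   # out-of-bag sample L(-b)
        booster = GradientBoostingClassifier(max_depth=1, n_estimators=50,
                                             random_state=random_state).fit(X[oob], y[oob])
        cp = booster.predict_proba(X[boot])                      # CP(b): additional features
        tree = DecisionTreeClassifier(random_state=random_state)
        tree.fit(np.hstack([X[boot], cp]), y[boot])              # combined classifier C_comb(b)
        ensemble.append((booster, tree))
    return ensemble

def predict_proba_drf(ensemble, X):
    # "average" combination rule over the trees' class probabilities
    probs = [tree.predict_proba(np.hstack([X, booster.predict_proba(X)]))
             for booster, tree in ensemble]
    return np.mean(probs, axis=0)
```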
– Input:
  a. X: training set of size N.
  b. C: a classifier, here a decision stump.
  c. T: number of classifiers to construct.
  d. w: vector of weights for the observations in X.
  e. x: test instance for classification.
Transform the class labels to 0 and 1. Define this as y*.
1. Initialization
   Initialize the weights for each observation in the training set as 1/N.
   Also set the initial probability p0 = 0.5.
   Let the committee function F(x) = 0.
2. Additive Regression Modeling
   Step 1a. Compute the working response and weights using the probability estimate and class labels as

$$z_t(i) = \frac{y^* - \eta\!\left(p_{t-1}(X_i)\right)}{\eta\!\left(p_{t-1}(X_i)\right)\left(1 - \eta\!\left(p_{t-1}(X_i)\right)\right)}$$
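Step 1a can be sketched numerically as below, in the standard two-class LogitBoost form of Friedman et al.; whether η(·) in the figure denotes anything other than the probability estimate itself is not clear from the extracted text, so plain p is assumed, and a small clip is added for numerical stability.

```python
import numpy as np

# Sketch of the LogitBoost working response z and observation weights w
# computed from the current probability estimates p and 0/1 labels y*.
def working_response(y_star, p, eps=1e-6):
    p = np.clip(p, eps, 1 - eps)
    w = p * (1 - p)            # observation weights for the stump fit
    z = (y_star - p) / w       # working response to be regressed on X
    return z, w

z, w = working_response(np.array([1, 0, 1]), np.array([0.5, 0.5, 0.5]))
print(z, w)   # [ 2. -2.  2.] [0.25 0.25 0.25]
```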
The dataset description is given in Table 1. All these datasets are available at
[8].
In each table, for each row (dataset) the best performing method is marked in bold. For each table we have compared the proposed ensemble method with the other classifier methods by a two-tailed Wilcoxon signed-rank test. The significance level is 95%. In all the tables, a method which is significantly worse than the proposed method is marked by a "•" and a method which is significantly better than the proposed method is marked by a "◦".
Discussion of the Results. In Table 2, we give the AUC values of the ensemble methods stated above. The highest AUC values are marked in bold for each dataset. We see from the table that the new decision forest outperformed most of the methods in two of the datasets (the Australian and Japanese credit data), whereas in the German credit data it is outperformed by the RSM-NN ensemble method.
In Table 3, the type-I error values of the ensemble methods are presented. The lowest values are marked in bold for each dataset. It is apparent that the new ensemble method is not the best performer in terms of type-I error. The new decision forest produced lower type-I values than some of the methods in all the datasets. It should be noted that the type-I values of the other methods, unlike those of the new ensemble method, are not consistent. The new ensemble method has lower type-I error in two datasets and the lowest type-I error in one dataset. Considering this, it can be argued that the new ensemble method is better than the other methods.
Table 4 presents the type-II error values of the ensemble methods stated above. The best (lowest) values are marked in bold for each dataset. We see from the table the same pattern of performance of the methods; that is, the new decision forest has lower type-II values in two datasets and the lowest in one dataset, and no single method produces lower type-II error values for all the datasets. Considering this, the new decision forest can be regarded as a better method than the other methods.
event forecast. In addition to these, we have also computed the AUC (area under the ROC curve) of the models for both problems. A higher AUC is desirable for binary prediction problems, and this measure is now used more frequently than accuracy in binary prediction problems.
4 Conclusion
In this paper, a new hybrid decision forest is proposed. The decision forest combines small-bootstrap-sample aggregation with real logitboosted decision stumps. The decision forest benefits from the enlarged feature space produced by the logitboosted decision stumps for each decision tree during the small bootstrap aggregation. To examine the performance of the new decision forest, it is applied to two real-world problems: credit score classification and extreme rainfall forecast. For the performance check, it is compared with some very well known machine learning algorithms. In the credit classification problem, it is the most accurate in classifying the credits and consistent in differentiating good credits as good and bad credits as bad. The results of the extreme rainfall prediction suggest that the new decision forest is able to perform the categorical rainfall forecasting task efficiently.
References
1. Breiman, L.: Bagging Predictors. Machine Learning 24(2), 123–140 (1996)
2. Breiman, L.: Out-of-bag estimation. Tech. Rep. 2 (1996)
1 Introduction
A growing interest in structured output prediction has been observed recently. Machine learning algorithms dealing with structured output prediction problems are able to generalize from a set of training input-output pairs, but the input or the output (sometimes both of them) are more complex in comparison with traditional data types. Usually structured prediction works with an output space containing complex structures like sequences, trees or graphs. The combined applicability and generality of learning in complex spaces result in a number of significant theoretical and practical challenges.
The standard classification or regression problems discover a mapping to an output space that is either real-valued or a value from a small, unstructured set of labels. On the other hand, structured output prediction deals with an output space that might contain structure rich in information, which can be utilized in the learning process. Due to the profile of structured output prediction, it requires in some cases an optimization or search mechanism over the complete output space, which makes the problem non-trivial.
There exists a variety of structured output prediction problems, e.g. protein
function classification, semantic classification of images or text categorization.
2 Related Work
The most obvious and directly arising method for structured output prediction is a probabilistic model jointly considering the input and output variables. There exist many examples of such probabilistic models for a variety of inputs and outputs, e.g. probabilistic graphical models or stochastic context-free grammars. Given an input, the predicted output might be determined as the result of maximization of the posterior probability, namely the technique of maximum a posteriori estimation. In such approaches, learning amounts to modelling the joint input and output data distribution. However, it is well known that this approach of first modelling the distribution and subsequently using maximum a posteriori estimation for prediction is indirect and might be suboptimal. Therefore, the other, direct discriminative approach might be more appropriate. Such discriminative learning algorithms perform prediction on the basis of scoring function optimization over the output space. Recently presented and studied discriminative learning algorithms include Max-margin Markov Nets, which consider the structured output prediction problem as a quadratic programming problem [12]; a somewhat more flexible approach that is an alternative adjustment of logistic regression to structured outputs called Conditional Random Fields [10]; the Structured Perceptron [1], which has minimal requirements on the output space shape and is easy to implement; the Support Vector Machine for Interdependent and Structured Outputs (SVMstruct) [14], which applies a variety of loss functions; and an example of adaptation of the popular back-propagation algorithm, BPMLL [15], where a new error function takes multiple targets into account.
Another approach to structured prediction, LaSO, presented in [2], proposes a framework for predicting structured outputs by learning as search optimization. LaSO makes it possible to relax the requirement, appearing in likelihood- or margin-based algorithms, that the output structure needs to be computed from the set of all possible structures. As an extension of that approach, the SEARN algorithm [3] assumes transformation of structured prediction problems into binary prediction problems to which a standard binary classifier can be applied.
Based on a similar assumption, another example of a structured output algorithm is an extension of the original AdaBoost algorithm to structured prediction. It is the AdaBoostSeq algorithm proposed by the authors in [7] and utilized in this paper within the experimental studies.
Summarizing, both presented approaches, generative modelling and discriminative learning algorithms, make use of some scoring function to score each element of the output space. In the case of methods that work in the space of
partial outputs, the order of score function calculation over output space elements is important and may determine the overall accuracy. Therefore, an appropriate output space learning order is required in order to provide the best possible generalization.
where: $F^{\mu}(x)$ is the combined, final meta classifier for the μth sequence item (structure element); $\Phi(x, \Theta_k^{\mu})$ represents the kth base classifier, performing according to its $\Theta_k^{\mu}$ parameters and returning a binary class label for each instance x; $\alpha_k^{\mu}$ is the weight associated with the kth classifier.
Values of the unknown parameters ($\alpha_k^{\mu}$ and $\Theta_k^{\mu}$) are obtained by minimization of the prediction error for each μth sequence element for all K classifiers using a stage-wise suboptimal method [13]. The actual sequence-loss balancing cost function J is defined as:

$$J(\alpha^{\mu}, \Theta^{\mu}) = \sum_{i=1}^{N} \exp\!\Big(-y_i^{\mu}\big(\xi F_{m-1}^{\mu}(x_i) + (1-\xi)\, y_i^{\mu}\, \hat{R}_m^{\mu}(x_i) + \alpha^{\mu}\, \Phi(x_i, \Theta^{\mu})\big)\Big) \qquad (2)$$

where: $\hat{R}_m^{\mu}(x_i)$ is an impact function denoting the influence on the prediction according to the quality of the preceding sequence label predictions; ξ is the parameter that controls the influence of the impact function in the weight composition, ξ ∈ [0, 1].
The proposed impact function $\hat{R}_m^{\mu}(x_i)$ is composed as follows:

$$\hat{R}_m^{\mu}(x_i) = \sum_{j=1}^{m-1} \alpha_j^{\mu}\, R^{\mu}(x_i) \qquad (3)$$
Fig. 1. An example of search tree for structure of four elements composing the learning
order for the AdaBoostSeq algorithm
The main objective of the performed experiments was to evaluate the classification accuracy of the AdaBoostSeq algorithm driven by various learning orderings of the output elements. The method was examined according to Hamming Loss and
Classification Accuracy for six distinct datasets. Some standard evaluation measures from previous work have been used in the experiments. The utilized measures are calculated based on the differences between the actual and the predicted sets of labels (classes) over all cases $x_i$ in the test set. The first measure is the Hamming Loss HL, which was proposed in [11] and is defined as:

$$HL = \frac{1}{N} \sum_{i=1}^{N} \frac{|Y_i \,\triangle\, F(x_i)|}{|Y_i|} \qquad (5)$$

where: N is the total number of cases $x_i$ in the test set; $Y_i$ denotes the actual (real) labels (classes) in the sequence, i.e. the entire structure corresponding to instance $x_i$; $F(x_i)$ is the sequence of labels predicted by the classifier; and $\triangle$ stands for the symmetric difference of two vectors, which is the vector-theoretic equivalent of the exclusive disjunction in Boolean logic.
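A minimal sketch of Eq. (5) for 0/1 structure vectors follows; it reads |Y_i| as the number of elements in the structure, which is an interpretation of the notation above rather than code from the paper.

```python
import numpy as np

# Sketch of Hamming Loss for binary structured outputs: the size of the symmetric
# difference between actual and predicted label vectors, normalized per case.
def hamming_loss(Y_true, Y_pred):
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    per_case = np.sum(Y_true != Y_pred, axis=1) / Y_true.shape[1]
    return per_case.mean()

print(hamming_loss([[1, 0, 1, 1]], [[1, 1, 1, 0]]))   # 2 mismatches over 4 elements -> 0.5
```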
Fig. 2. Hamming Loss error of examined ordering methods (Best - ordering with the
smallest Hamming Loss error, H1, H2, H3 - proposed heuristics, Random - random
ordering)
The performance of the analysed methods was evaluated using 10-fold cross-
validation and the evaluation measures from Equation 5 and Equation 6.
The experiments were carried out on six distinct, real datasets from the same
application domain of debt portfolio pattern recognition [6]. Datasets represent
the problem of aggregated prediction of sequential repayment values over time for
a set of claims. The structured output for each debt (case) is composed of a vector
of binary indicator denoting whether at a certain period of time it was repaid at
a certain level. The output to be predicted is provided for all consecutive periods
of time the case was repaid. For the purpose of the experiment, the output was
limited to only six elements. The number of cases in the datasets varied from
1,703 to 6,818, while the number of initial, numeric input attributes was the
same: 25. Note that the number of input attributes refers only classification
for the first element in the sequence; for the others, outputs of the preceding
elements are added to their input.
Table 1. Results obtained in the experiments, where orderings denote: Best - ordering
with the best classification accuracy, H1, H2, H3 - proposed heuristics, Random -
random ordering; HL - Hamming Loss, CA - Classification Accuracy
Ordering method
Best H1 H2 H3 Random
Dataset HL CA HL CA HL CA HL CA HL CA
1 0.046 0.801 0.051 0.772 0.054 0.7692 0.0547 0.7608 0.054 0.767
2 0.083 0.725 0.097 0.680 0.0995 0.6762 0.1024 0.6806 0.097 0.686
3 0.080 0.776 0.091 0.733 0.0942 0.7375 0.0907 0.7535 0.096 0.734
4 0.123 0.607 0.135 0.576 0.1447 0.5327 0.1356 0.5791 0.139 0.562
5 0.121 0.500 0.135 0.451 0.1301 0.4653 0.131 0.4648 0.133 0.461
6 0.124 0.583 0.147 0.556 0.1481 0.5205 0.1536 0.4912 0.151 0.519
Average 0.096 0.665 0.109 0.628 0.112 0.617 0.111 0.622 0.112 0.621
Table 2. Average relative errors in comparison with the best ordering, in percentage (0 denotes the same accuracy as obtained with the best ordering); the orderings are: Best - ordering with the best classification accuracy, H1, H2, H3 - proposed heuristics, Random - random ordering; HL - Hamming Loss, CA - Classification Accuracy
Ordering method
H1 H2 H3 Random
Dataset HL CA HL CA HL CA HL CA
1 10,66% 3,54% 15,47% 3,93% 16,77% 4,97% 16,10% 4,25%
2 14,16% 6,27% 16,47% 6,76% 19,39% 6,16% 13,60% 5,47%
3 10,90% 5,47% 14,83% 4,93% 11,11% 2,87% 16,72% 5,34%
4 8,60% 5,15% 15,09% 12,28% 8,80% 4,64% 11,26% 7,38%
5 10,40% 9,78% 6,71% 6,94% 7,40% 7,04% 8,68% 7,86%
6 15,22% 4,69% 16,09% 10,75% 19,81% 15,77% 18,02% 11,04%
Average 11,72% 5,61% 13,88% 7,28% 13,50% 6,56% 13,75% 6,60%
6 Conclusions
The problem considered in the paper concerns discovering an appropriate learning order in structured output prediction. It was based on the AdaBoostSeq algorithm, which assumes that labels of the already classified output elements are used as additional input features for the next elements. Since the elements in the sequence may be correlated, the order of learning may influence the accuracy of the entire classification. According to the experimental results, the margin between the worst and the best order may reach even several dozen percent for the Hamming Loss measure and for Classification Accuracy. Moreover, the three proposed heuristics, each of distinct complexity, showed the ability to produce a better-than-random
learning order. Overall, the H1 heuristic method proposed in the paper appears
to be a reasonable direction to find the learning order providing better results
than the other simple orders. It is at the same time much less computationally
expensive than checking all possible orders to find the best one.
References
1. Collins, M.: Discriminative training methods for hidden Markov models: Theory
and experiments with perceptron algorithms. In: Conference on Empirical Methods
in Natural Language Processing, vol. 10, pp. 1–8 (2002)
2. Daume, H., Marcu, D.: Learning as Search Optimization: Approximate Large Mar-
gin Methods for Structured Prediction. In: International Conference on Machine
Learning, ICML 2005 (2005)
3. Daume, H., Langford, J., Marcu, D.: Search-based structured prediction. Machine
Learning 75, 297–325 (2009)
4. Freund, Y., Schapire, R.: A decision-theoretic generalization of on-line learning and
an application to boosting. Journal of Computer and System Sciences 55, 119–139
(1997)
5. Ghamrawi, N., McCallum, A.: Collective multi-label classification. In: Proceed-
ings of the 2005 ACM Conference on Information and Knowledge Management,
pp. 195–200 (2005)
6. Kajdanowicz, T., Kazienko, P.: Prediction of Sequential Values for Debt Recovery.
In: Bayro-Corrochano, E., Eklundh, J.-O. (eds.) CIARP 2009. LNCS, vol. 5856,
pp. 337–344. Springer, Heidelberg (2009)
7. Kajdanowicz, T., Kazienko, P.: Boosting-based Sequence Prediction. New Gener-
ation Computing 29(3), 293–307 (2011)
8. Kazienko, P., Kajdanowicz, T.: Base Classifiers in Boosting-based Classification of
Sequential Structures. Neural Network World 20, 839–851 (2010)
9. Kajdanowicz, T., Kazienko, P.: Structured Output Element Ordering in Boosting-
Based Classification. In: Corchado, E., Kurzyński, M., Woźniak, M. (eds.) HAIS 2011,
Part II. LNCS (LNAI), vol. 6679, pp. 221–228. Springer, Heidelberg (2011)
10. Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic
models for segmenting and labeling sequence data. In: International Conference on
Machine Learning, ICML 2001, pp. 282–289 (2001)
11. Schapire, R.E., Singer, Y.: Boostexter: a boosting-based system for text catego-
rization. Machine Learning 39, 135–168 (2000)
12. Taskar, B., Guestrin, C., Koller, D.: Max-margin Markov networks. In: Advances in
Neural Information Processing Systems, vol. 16, pp. 25–32. MIT Press, Cambridge
(2004)
13. Theodoridis, S., Koutroumbas, K.: Pattern Recognition. Elsevier (2009)
14. Tsochantaridis, I., Hofmann, T., Thorsten, J., Altun, Y.: Large margin methods
for structured and interdependent output variables. Journal of Machine Learning
Research 6, 1453–1484 (2005)
15. Zhang, M.L., Zhou, Z.H.: Multi-label neural networks with applications to func-
tional genomics and text categorization. IEEE Transactions on Knowledge and
Data Engineering 18, 1338–1351 (2006)
Evaluating the Effectiveness of Intelligent Tutoring
System Offering Personalized Learning Scenario*
Adrianna Kozierkiewicz-Hetmańska
1 Introduction
Intelligent tutoring systems, also called e-learning systems or systems for distance education, are very popular among teachers and students. Those systems allow a teacher to save time, because once-prepared educational materials can be used many times by many students, and prepared tests do not need laborious assessment. Moreover, it is possible to
work out learning material in an interesting way by applying various multimedia tech-
niques. Students are allowed to learn at any convenient time and place using e-learning
systems. Furthermore, intelligent tutoring systems increase the learning effectiveness:
users achieve better learning results in a shorter time.
In order to keep intelligent tutoring systems interesting, they require further development and detailed analysis. It is also necessary to conduct research on the influence of the applied didactic methods on the effectiveness of the learning process. Moreover, the implemented and prepared systems should be evaluated by users. It is one of the most important steps of system design, yet it is very often omitted. So far, there are not enough works containing information about the quality of designed e-learning systems confirmed by experimental research, or it is not possible to generalize conclusions because they concern only the systems which have been tested.
In this work we want to introduce the results of experiments which prove the hypo-
thesis that it is possible to increase the effectiveness of intelligent tutoring systems by
taking into consideration the user profile in determination of the learning scenario.
*
This research was financially supported by Polish Ministry of Science and Higher Education
under the grant N N519 407437.
By the learning scenario we mean the order and the presentation form of an educa-
tional material proposed to a student by the e-learning system. The formal definition
of the learning scenario can be found in [8], [11], [12]. It is intuitively true that students are more interested and motivated if they learn using intelligent tutoring systems where the learning materials and tests are suited to students' preferences, needs, interests or current knowledge level. Our research confirms the efficiency of
e-learning systems with personalization of the learning process. In addition, we check
the influence of personalization of the learning process according to age.
We designed and implemented a prototype of an e-learning system which was used
to conduct an experiment. The prototype of the e-learning system is a simplification
of an intelligent tutoring system proposed in the author's previous work. The author designed a model of the intelligent tutoring system which offers an individual learning process at each step. After the registration process the student is classified into a group of similar users based on a set of attributes selected by an expert as a criterion
of the classification. The opening learning scenario for a new learner is determined
based on finished scenarios of students who belong to the same class as the new one.
For this task the algorithms using the consensus theory are worked out [8], [12], [13].
After each lesson the user has to pass a computer adaptive test where each question is
selected based on the current student's knowledge level [10]. If the student achieves a
sufficient test score (more than 50%) he continues learning according to the previous-
ly selected learning scenario. Otherwise, it is a signal for the system that the opening
learning scenario should be modified. The author proposed two methods which have
been described in [9], [11], [12] and [13]. The learning process is finished if all
lessons from the learning scenario are taught.
In the next section, an overview of previous research concerning the assessment of methods applied in various intelligent tutoring systems is presented. Section 3 de-
scribes the concept of an experiment with a short description of the implemented
e-learning system prototype. Section 4 shows the results of an experiment with
appropriate conclusions. Finally, general conclusions and future work are described.
2 Related Works
The evaluation of intelligent tutoring systems is an important but often neglected
stage of e-learning system development. In many works it is possible to find formal system models for distance education without an experimental proof of the efficiency of those systems. Sometimes the research is limited to the system which has been tested, and conclusions are not general or the results are not statistically significant.
System ELM-ART [18] is one of the first adaptive web-based systems in which various methods are applied. In ELM-ART knowledge is represented by means of a multi-layered overlay model which supports adaptation in the system. The system offers individual curriculum sequencing and adaptive annotation of links. Additionally, the system supports adaptive testing and stores student's preferences, which are used to create a learning environment suited to the user's needs. System ELM-ART was evaluated only in comparison to the previous version of the system, called ELM-PE.
Moreover, the research was conducted on a small population (less than 30). Tests
showed that system ELM-ART is better than ELM-PE.
EFit [6] is a system which provides highly individualized instructions and teaching
environment adapted to learner’s knowledge and learning capabilities. Authors
presented a study carried out with 194 children at a German lower secondary school.
The results showed that, compared to a non-treatment control group, children im-
proved their arithmetic performance if they learned using eFit.
In Shikshak [4] knowledge is represented as a tree or a topic dependency graph.
Student model stores the performance of a learner represented by a fuzzified value
and some additional information. Shikshak plans the individual teaching method
based on user’s performance level and selects the type of material (such as explana-
tion based, visual based, instructional) as suited to the learning style of a particular
student using fuzzy rules. The study based on the Shikshak system was conducted on a small population (only thirty-three students). The results demonstrated an overall 4%
improvement in performance while teaching using Shikshak in comparison with the
performance of teaching in traditional classrooms. Additionally, authors presented
some inherent features of the system. The obtained results concerned only the tested system.
SEDHI [17] classifies the students according to profiles (defined based on a statis-
tical study of the course user and usage data) and adapts the navigation using the
techniques of link hiding and link annotation. The evaluation of the prototype allowed only a demonstration of the appropriateness of the applied methods, but the results were not conclusive because only five students took part in the experiment.
DEPTHS [7] is another intelligent tutoring system which generates teaching plans
and adaptive presentation of the teaching material. DEPTHS also supports adaptive
presentation and adaptive navigation. In this system knowledge is organized in a de-
pendency graph. The student model stores and updates data about user’s performance
within a specific subject domain. The efficiency of DEPTHS was evaluated by 42 students. The first assessment relied on measuring the learner's satisfaction level. Students reported that the system had helped them to learn and provided much useful information, feedback messages and advice for further work. The authors demonstrated
that students who learned with DEPTHS performed better than students who learned
in the traditional way. Finally, the experiment found that the proposed student model did not reflect the students' real knowledge.
System WELSA [16] tries to identify students' learning preferences and, based on worked-out adaptation rules, offers students individualized courses. The authors of WELSA involved 64 undergraduate students in an experiment. Users could follow two
courses (in the Computer Science area): one adaptive and one non-adaptive. The ob-
tained results show that the matched adaptation approach increased the efficiency of
the learning process with a lower amount of time needed for studying and a lower
number of hits on learning resources.
More general studies are presented in [3]. System AnimalWatch [3] offers prob-
lems and tasks customized to student’s proficiency level and adaptively focuses on
the areas that each student needs to practice most. The authors of this system present research involving pre- and post-test comparison. Results indicated that students im-
proved from pre to post test after working with AnimalWatch. Improvement was
greatest for students with the weakest initial math skills who were also most likely to
use the multimedia lessons and worked examples. Authors studied over 400 students
attending schools in Los Angeles who used AnimalWatch.
Pillay and Wilss [15] pointed out that students achieve better results if the material is matched to individuals' preferred cognitive style. Students who received matched instruction tended to perform better (66%) than those who received mismatched instruction (62%). The small
sample size (only 26 students) and the uneven distribution over the cognitive styles
prevented the computation of any statistically significant analyses.
Another study was carried out by Kwok and Jones [14]. Authors found that stu-
dents at the extremes of the learning style spectrum needed guidance in selecting an
appropriate navigation method and it helped raise their interest in the material.
In paper [1] the results of 3 experiments were introduced. The studies demonstrated that both introverts and extroverts benefited from adaptive e-learning systems, but extroverted learners would benefit even more from using those systems. The sample size
of the experiments (less than 40) could not guarantee the validity of the interpretations.
The experiment conducted as a part of this study is most similar to the experiment described in paper [2], where the authors used the Felder-Solomon Learning Style Questionnaire to measure the learning style preferences of students. The obtained results show that all students achieved significantly higher scores when the browsing session was matched to their learning style. However, in our research we examined a bigger population (more than 21 people) and the learning material was prepared for all learning styles (not only global and sequential). Additionally, we investigated the influence of personalization of the learning process according to age.
Global students were proposed the learning scenario with an additional presentation at the beginning, which contains the big picture, and the learning material is organized from general ideas to details. Sequential students were presented the material logically ordered
from details to general ideas. The learning scenario for visual students contained more
pictures, flows, charts etc. Verbal students got more out of words. Active learners
were offered more tests and practical tasks [5]. The author implemented, within the
prototype of e-learning system, the rules for tailoring the learning scenario to the
student’s characteristic.
For all users the learning material is about intersections, roadway signs related to in-
tersections and right-of-way laws. After learning, the user has to pass a test within 10
minutes. The test consists of 10 questions chosen randomly from a question bank consist-
ing of 30 items. After solving the test, the learner is presented with a test score as a percentage. If the student fails the test (more than 50% of wrong answers) for the first time, he is offered the same learning scenario once again. After the second failure, the system chooses the learning scenario of the students achieving the best results. If the test score is still
unsatisfactory the experiment is finished without a successful graduation.
Hypothesis 1. The mean test scores of the experimental and control groups for students without driving licenses are tested with the null hypothesis H_0: μ_exs = μ_cont against the alternative hypothesis H_1: μ_exs > μ_cont.
Proof. Firstly, we check the normality of the distribution of the analyzed features using the Shapiro-Wilk test. For both groups (experimental and control) we can conclude the normality of the distribution of the analyzed features (W_exp = 0.9516 > W_(0.05,55) = 0.951 and W_cont = 0.962 > W_(0.05,67) = 0.956), and next we use a parametric test. There is no information about the standard deviations, and the sizes of the groups are different but large (n_exs + n_cont ≥ 120); that is why, for testing the null hypothesis, a statistic following the normal distribution N(0, 1) is assumed:

u = (μ_exs − μ_cont) / √(S_exs²/n_exs + S_cont²/n_cont)    (1)
where: μ_exs - average score value for the experimental group, μ_cont - average score value for the control group, S_exs - estimated standard deviation of the sample for the experimental group, S_cont - estimated standard deviation of the sample for the control group, n_exs - size of the experimental group, n_cont - size of the control group. The tested statistical value equals:

u = (59.818 − 52.239) / √((23.08)²/55 + (22.248)²/67) = 1.834
The significance level of 0.05 is assumed. The confidence interval equals [1.64,+∞ ) .
The tested statistical value belongs to the confidence interval u ∈ [1.64,+∞ ) , that is
why the null hypothesis is rejected and the alternative hypothesis is assumed.
We can estimate the difference between the average test scores of experimental and
control group. Let us determine the following confidence interval:
((μ_exs − μ_cont) − u_0.95/√(S_exs²/n_exs + S_cont²/n_cont); (μ_exs − μ_cont) + u_0.95/√(S_exs²/n_exs + S_cont²/n_cont))    (2)

We obtain: (7.182; 7.976).
Conclusion 1. The mean test score of the experimental group is greater than the mean score of the control group in the case of students without driving licenses. Students who were offered the personalized learning scenario achieve learning results better by more than 7.182% and less than 7.976% than students who were proposed the universal learning scenario.
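For reference, the statistic (1) can be reproduced with a few lines of Python using the values reported above; the function name is illustrative.

```python
import math

def two_sample_u(mean_exp, mean_ctrl, s_exp, s_ctrl, n_exp, n_ctrl):
    """Test statistic u of Eq. (1) for two large independent samples."""
    se = math.sqrt(s_exp**2 / n_exp + s_ctrl**2 / n_ctrl)
    return (mean_exp - mean_ctrl) / se

# Hypothesis 1 (students without driving licenses), values from the text:
u = two_sample_u(59.818, 52.239, 23.08, 22.248, 55, 67)
print(u)   # approximately 1.834; u >= 1.64, so H0 is rejected at the 0.05 level
```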
Hypothesis 2. The mean test scores of the experimental and control groups for students with driving licenses are tested with the null hypothesis H_0: μ_exs = μ_cont against the alternative hypothesis H_1: μ_exs > μ_cont.
Proof. For both groups (experimental and control) we cannot reject the hypothesis about the normality of the distribution of the analyzed features (the Shapiro-Wilk test: W_exp = 0.9713 > W_(0.05,92) = 0.963 and W_cont = 0.965 > W_(0.05,82) = 0.961). As was done before, we calculate the statistic (1):

u = (79.674 − 78.781) / √((16.65)²/92 + (17.49)²/82) = 0.34
The significance level of 0.05 is assumed. We have u ∉ [1.64,+∞) , that is why the
null hypothesis cannot be rejected.
Conclusion 2. The mean test scores of the experimental and control groups are not significantly different in the case of students who have a driving license.
Hypothesis 3. The mean test scores of the experimental and control groups for users under the age of 18 are tested with the null hypothesis H_0: μ_exs = μ_cont against the alternative hypothesis H_1: μ_exs > μ_cont.
t = (μ_exs − μ_cont) / √( ((n_exs − 1)·S_exs² + S_cont²·(n_cont − 1)) / (n_exs + n_cont − 2) · (1/n_exs + 1/n_cont) )    (3)

t = (48.75 − 35.417) / √( ((16 − 1)·(20.879)² + (16.578)²·(24 − 1)) / (16 + 24 − 2) · (1/16 + 1/24) ) = 2.245
The significance level of 0.05 is assumed. The tested statistical value belongs to the
confidence interval t ∈ [1.686,+∞ ) , that is why the null hypothesis is rejected and the
alternative hypothesis is assumed.
Conclusion 3. The mean test score of the experimental group is greater than the mean score of the control group in the case of underage users.
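The pooled-variance statistic (3) can be checked the same way; the values are those reported above and the function name is illustrative.

```python
import math

def pooled_t(mean_exp, mean_ctrl, s_exp, s_ctrl, n_exp, n_ctrl):
    """Test statistic t of Eq. (3) with a pooled variance estimate."""
    pooled = ((n_exp - 1) * s_exp**2 + (n_ctrl - 1) * s_ctrl**2) / (n_exp + n_ctrl - 2)
    return (mean_exp - mean_ctrl) / math.sqrt(pooled * (1.0 / n_exp + 1.0 / n_ctrl))

# Hypothesis 3 (users under 18), values from the text:
t = pooled_t(48.75, 35.417, 20.879, 16.578, 16, 24)
print(t)   # approximately 2.245; t >= 1.686, so H0 is rejected at the 0.05 level
```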
Hypothesis 4. The mean test scores of the experimental and control groups for users over the age of 18 are tested with the null hypothesis H_0: μ_exs = μ_cont against the alternative hypothesis H_1: μ_exs > μ_cont.
Proof. In both groups (experimental and control) we cannot reject the hypothesis about the normality of the distribution of the analyzed features (the Shapiro-Wilk test: W_exp = 0.9438 > W_(0.05,39) = 0.939 and W_cont = 0.949 > W_(0.05,43) = 0.943). As was done before, we check the equality of the standard deviations using the F-Snedecor test. We obtain F = 1.349. For α = 0.05 the tested statistical value does not belong to the confidence interval [1.88, +∞) (two-tailed test), so we cannot reject the hypothesis that the standard deviations of the two samples are equal. Next, we calculate the statistic (3):
t = (64.359 − 61.628) / √( ((39 − 1)·(22.395)² + (19.281)²·(43 − 1)) / (39 + 41 − 2) · (1/39 + 1/41) ) = 0.579
For the significance level equal to 0.05 we obtain the following confidence interval:
[1.664,+∞ ) . The tested statistical value does not belong to the confidence interval
t ∉ [1.664,+∞) , that is why the null hypothesis cannot be rejected.
Conclusion 4. The mean test scores of the experimental and control groups are not significantly different in the case of adult students.
References
1. Al-Dujaily, A., Ryu, H.: A Relationship between e-Learning Performance and Perso-
nality. In: Proceedings of the Sixth IEEE International Conference on Advanced Learning
Technologies, Kerkrade, The Netherlands, pp. 84–86 (2006)
2. Bajraktarevic, N., Hall, W., Fullick, P.: Incorporating learning styles in hypermedia envi-
ronment: Empirical evaluation. In: Proceedings of AH 2003, at the 12th World Wide Web
Conference, Budapest, Hungary, pp. 41–52 (2003)
3. Beal, C.R., Arroyo, I.M., Cohen, P.R., Woolf, B.P.: Evaluation of AnimalWatch: An intel-
ligent tutoring system for arithmetic and fractions. Journal of Interactive Online Learn-
ing 9, 64–77 (2010)
4. Chakraborty, S., Roy, D., Basu, A.: Shikshak: An Architecture for an Intelligent Tutoring
System. In: Proc. 16th International Conference on Computers in Education, Taipei,
Taiwan, pp. 24–31 (2008)
5. Felder, R.M., Silverman, L.K.: Learning and Teaching Styles in Engineering Education.
Engineering Education 78(7), 674–681 (1988)
6. Graff, M., Mayer, P., Lebens, M.: Evaluating a web based intelligent tutoring system for
mathematics at German lower secondary schools. Education and Information Technolo-
gies 13, 221–230 (2008)
7. Jeremic, Z., Jovanovic, J., Gaševic, D.: Evaluating an Intelligent Tutoring System for De-
sign Patterns: the DEPTHS Experience. Journal of Educational Technology & Socie-
ty 12(2), 111–130 (2009)
8. Kozierkiewicz, A.: Determination of Opening Learning Scenarios in Intelligent Tutoring
Systems. In: Zgrzywa, A., Choroś, K., Siemiński, A. (eds.) New Trend in Multimedia and
Network Information Systems, pp. 204–213. IOS Press (2008)
9. Kozierkiewicz-Hetmańska, A.: A Conception for Modification of Learning Scenario in an
Intelligent E-learning System. In: Nguyen, N.T., Kowalczyk, R., Chen, S.-M. (eds.)
ICCCI 2009. LNCS (LNAI), vol. 5796, pp. 87–96. Springer, Heidelberg (2009)
10. Kozierkiewicz-Hetmańska, A., Nguyen, N.T.: A Computer Adaptive Testing Method for
Intelligent Tutoring Systems. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds.)
KES 2010, Part I. LNCS, vol. 6276, pp. 281–289. Springer, Heidelberg (2010)
11. Kozierkiewicz-Hetmańska, A., Nguyen, N.T.: A Method for Scenario Modification in In-
telligent E-Learning Systems Using Graph-Based Structure of Knowledge. In: Nguyen,
N.T., Katarzyniak, R., Chen, S.-M. (eds.) Advances in Intelligent Information and Data-
base Systems. SCI, vol. 283, pp. 169–179. Springer, Heidelberg (2010)
12. Kozierkiewicz-Hetmańska, A., Nguyen, N.T.: A method for learning scenario determina-
tion and modification in intelligent tutoring systems. Applied Mathematics and Computer
Science 21(1), 69–82 (2011)
13. Kozierkiewicz-Hetmańska, A.: A Method for Scenario Recommendation in Intelligent
E-Learning Systems. Cybernetics and Systems 42(2), 82–99 (2011)
14. Kwok, M., Jones, C.: Catering for different learning styles. Association for Learning
Technology Journal (ALT-J) 3(1), 5–11 (1985)
15. Pillay, H., Wilss, L.: Computer assisted instruction and individual cognitive style prefe-
rences in learning: Does it matter? Australian Educational Computing 11(2), 28–33 (1996)
16. Popescu, E.: Adaptation provisioning with respect to learning styles in a Web-based edu-
cational system: an experimental study. Journal of Computer Assisted Learning 26(4),
243–257 (2010)
17. da Silva, G.T., Rosatelli, M.C.: Adaptation in Educational Hypermedia Based on the Clas-
sification of the User Profile. In: Ikeda, M., Ashley, K.D., Chan, T.-W. (eds.) ITS 2006.
LNCS, vol. 4053, pp. 268–277. Springer, Heidelberg (2006)
18. Weber, G., Brusilovsky, P.: ELM-ART: An Adaptive Versatile System for Web-
based Instruction. International Journal of Artificial Intelligence in Education 12, 351–384
(2001)
Knowledge Discovery by an Intelligent Approach
Using Complex Fuzzy Sets
Abstract. In the age of rapidly increasing volumes of data, human experts face the urgent need to extract useful information from huge amounts of data. Knowledge discovery in databases has attracted much attention for research and applications in business and in science. In this paper, we present
a neuro-fuzzy approach using complex fuzzy sets (CFSs) for the problem of
knowledge discovery. A CFS is an advanced fuzzy set, whose membership is
complex-valued and characterized by an amplitude function and a phase
function. The application of CFSs to the proposed complex neuro-fuzzy system
(CNFS) can increase the functional mapping ability to find missing data for
knowledge discovery. Moreover, we devise a hybrid learning algorithm to
evolve the CNFS for modeling accuracy, combining the artificial bee colony
algorithm and the recursive least squares estimator method. The proposed
approach to knowledge discovery is tested through experimentation, whose
results are compared with those by other approaches. The experimental results
indicate that the proposed approach outperforms the compared approaches.
1 Introduction
In the era of the modern information world, data have been accumulating
exponentially in various forms. We now have been facing the urgent need to develop
new theories and tools to extract useful information from data automatically and
effectively. If it is understandable and interpretable by human beings, information can
be turned into knowledge. For knowledge discovery and system modeling, several
artificial intelligence (AI) based approaches have been presented, where fuzzy logic,
neural networks, and neuro-fuzzy systems (NFSs) have been playing significantly
important roles for modeling and applications. In general, fuzzy-system-based
approaches have the advantage of representing knowledge in the form of If-Then
rules, which are transparent to human beings. In contrast, neural-network-based approaches have the merit of adaptability. The hybrid of fuzzy and neural models has been gaining popularity in the research community, although each of
the AI-based approaches still has its own important features from the research perspective.
Zhang et al. [1] used granular neural networks for data fusion and knowledge
discovery. Fayyad et al. [2] presented a good overview for knowledge discovery,
where the various methods of data mining provide just a single step for the whole
process of information discovery. Castellano et al. [3] presented a neuro-fuzzy
modeling framework for knowledge discovery. Qin et al. [4] proposed a kernel-based
imputation to deal with missing data in knowledge discovery. In [5], Zhang et al.
presented a fuzzy modeling approach for training data selection and multi-object
optimization. In [6], Rezaee and Zarandi developed a data-driven TSK fuzzy
approach for fuzzy modeling. Juang [7] presented the design of a recurrent neural network using a hybrid method, which combines a genetic algorithm and particle swarm optimization, for modeling. Kurban and Beşdok [8] investigated several training methods for an RBF neural network in the application of terrain classification.
Boskovitz and Guterman [9] used a neuro-fuzzy system for image segmentation and
edge detection. Cpałka [10] designed a neuro-fuzzy classification system. Jang [11]
presented the famous adaptive neural-network-based fuzzy inference system
(ANFIS), which molds the hybrid of a neural network and a fuzzy inference system,
for system modeling and forecasting. Scherer in [12] used a neuro-fuzzy system for
nonlinear modeling. In general, neural fuzzy systems [13] are excellent tools for
modeling and for knowledge discovery. Qin and Yang [14] studied a neuro-fuzzy
method for image noise cancellation. In [15], Zounemat-Kermani and Teshnehlab
presented a neuro-fuzzy approach to time series forecasting.
In this paper, we present a framework called the complex neuro-fuzzy system (CNFS)
for knowledge discovery. A CNFS is a neuro-fuzzy based system whose kernel is
composed of fuzzy If-Then rules which are characterized by complex fuzzy sets (CFSs).
Ramot et al. [16] proposed the theory of CFSs. A CFS is an advanced fuzzy set whose membership degree is complex-valued and characterized by an amplitude function and a phase function in the unit disc of the complex plane. The complex membership state of a CFS indeed makes it different from a traditional type-1 fuzzy set, whose membership degree is normally defined in the real-valued unit interval [0, 1].
The application of CFSs [17]-[19] to the design of adaptive systems can increase their
adaptability for learning, so that the functional mapping capability of the adaptive
systems can be augmented. Motivated by this, we propose the CNFS approach using
CFSs for the process of knowledge discovery. For knowledge discovery, the goal of the
study is that with the proposed CNFS approach we try to represent numerical-linguistic
records [1] by rules. In the perspective of If-Then rules, we apply the proposed CNFS
models to create a framework of knowledge discovery, which can transform these
numerical-linguistic records into fuzzy If-Then rules. Following the principle of divide-and-conquer, the parameters of the CNFS are conceptually separated into two groups: the
premise parameters and the consequent parameters. For the parameter learning of CNFS,
we devise a hybrid ABC-RLSE learning method, which integrates the famous artificial
bee colony (ABC) algorithm [20]-[22] and the well-known recursive least squares
estimator (RLSE) method [13]. The ABC-RLSE algorithm is applied to evolve the
parameters of CNFS, in the way that the ABC is used to update the premise parameters
and the RLSE is used to update the consequent parameters. The ABC-RLSE can achieve
fast training for the proposed approach to the application of knowledge discovery.
We arrange the rest of the paper in the following. In Section 2, we present the
knowledge discovery framework. In Section 3, the proposed CNFS and the ABC-RLSE
learning method for knowledge discovery are specified. In Section 4, Experimentation is
conducted to test the proposed approach for knowledge discovery. The experimental
results are compared to other approaches. Finally, the paper is concluded.
Fig. 1. The framework of knowledge discovery: feature extraction of the linguistic records x and y; CNFS modeling with four models (CNFS1–CNFS4), each trained by the ABC-RLSE learning algorithm; and the linguistic record output z with its feature vector (a, b, c, d)
discovery. To optimize these CNFS models, the ABC-RLSE hybrid learning method
is applied. The optimization is based on the measure of root mean square error
(RMSE). Note that the details for theory of CNFS and the ABC-RLSE learning
method are specified later in the following section.
(C) Linguistic Record Output. The block of linguistic record output has two subparts.
One is to generate the linguistic record z, and the other is to produce the feature
vector, denoted as (a,b,c,d), for z.
3 Methodology
μ_GCFS(h; m, σ) = r(h; m, σ) exp(j ω(h; m, σ)) = Re(μ_GCFS) + j Im(μ_GCFS) = r cos ω + j r sin ω    (1)

where j = √−1; m and σ are the mean and the spread of the GCFS, respectively.
Suppose that we have a CNFS that consists of K first-order Takagi-Sugeno (T-S)
fuzzy rules, given as follows.
for i =1,2,...,K, where i is the index for the ith fuzzy rule; M is the number of inputs; xj is
the jth input linguistic variable for j=1,2,…,M; hj is the jth input base variable;
is the jth premise, which is defined by a GCFS in (2); is the nominal output; { ,
j=0,1,…,M} are consequent parameters. The complex fuzzy inference process can be cast
into a neural-net structure to become a CNFS with six layers, specified below.
Layer 0: This layer receives the inputs and transmits them to Layer 1. The input
vector at time t is given as follows.
H(t) = [h_1(t), h_2(t), …, h_M(t)]^T    (4)
Layer 1: This layer is called the complex fuzzy-set layer, which is used to calculate the complex membership degrees of the GCFSs. Layer 2: This layer is called the firing-strength layer. The firing strength of the ith rule is calculated and defined as follows:

β^i(t) = ∏_{j=1}^{M} μ_j^i(h_j(t)) = r^i(t) exp(j ω^i(t))    (5)

where r^i(t) and ω^i(t) are the amplitude function and the phase function of the firing strength of the ith rule. Layer 3: This layer is used for the normalization of the firing strengths. The normalized firing strength for the ith rule is given as follows:

λ^i(t) = β^i(t) / Σ_{k=1}^{K} β^k(t)    (6)
Layer 4: This layer is called the consequent layer. The normalized consequent of the ith rule is represented as follows:

λ^i(t) z^i(t) = λ^i(t) (a_0^i + Σ_{j=1}^{M} a_j^i h_j(t))    (7)
Layer 5: This layer is called the output layer. The normalized consequents from Layer 4 are aggregated in this layer to produce the CNFS output, given as follows:

ξ(t) = Σ_{i=1}^{K} λ^i(t) z^i(t) = ξ_Re(t) + j ξ_Im(t)    (8)
where ξ_Re(t) is the real part of the output and ξ_Im(t) is the imaginary part. Based on (8), the CNFS can be viewed as a complex-valued function, expressed as follows:

ξ(t) = F(H(t), W) = F_Re(H(t), W) + j F_Im(H(t), W)    (10)

where F_Re(·) is the real part of the CNFS output; F_Im(·) is the imaginary part; H(t) is the input vector to the CNFS; W denotes the parameter set of the CNFS, including the premise parameters and the consequent parameters, denoted as W_If and W_Then, respectively:

W = {W_If, W_Then}    (11)
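The six-layer inference of Eqs. (5)-(8) can be summarized in the following Python sketch. It assumes first-order T-S consequents and a Gaussian amplitude function for each GCFS; the phase function used here is only a placeholder assumption, since its exact form is not reproduced in this extract, and all identifiers are illustrative.

```python
import numpy as np

def gcfs_membership(h, m, sigma):
    """Complex membership of a GCFS: Gaussian amplitude times exp(j * phase).
    The phase below reuses the Gaussian exponent as a placeholder assumption."""
    r = np.exp(-0.5 * ((h - m) / sigma) ** 2)      # amplitude function
    omega = -0.5 * ((h - m) / sigma) ** 2          # assumed phase function
    return r * np.exp(1j * omega)

def cnfs_forward(h, premise, consequent):
    """One forward pass of a K-rule, M-input CNFS (cf. Eqs. (5)-(8)).

    h          : input vector of length M
    premise    : array (K, M, 2) holding (mean, spread) of each GCFS
    consequent : array (K, M+1) holding a_0, ..., a_M of each rule
    """
    h = np.asarray(h, dtype=float)
    K, M, _ = premise.shape
    # Layers 1-2: complex memberships and complex firing strengths (Eq. (5))
    beta = np.array([
        np.prod([gcfs_membership(h[j], premise[i, j, 0], premise[i, j, 1])
                 for j in range(M)])
        for i in range(K)
    ])
    lam = beta / beta.sum()                        # Layer 3: normalization (Eq. (6))
    z = consequent[:, 0] + consequent[:, 1:] @ h   # Layer 4: T-S consequents (Eq. (7))
    return np.sum(lam * z)                         # Layer 5: complex output (Eq. (8))
```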
For the training of the proposed CNFS, we apply the ABC-RLSE hybrid learning
method, where WIf and WThen are updated by the ABC and RLSE, respectively, in a
hybrid way.
The artificial bee colony (ABC) algorithm is an optimization method which
simulates the nectar-searching behavior by bees. Basically, the bees of a bee colony
can be separated into three groups: the employed bees (EBs), the onlooker bees
(OBs), and the scout bees (SBs). The EBs perform their job by flying to the food
sources and bringing back food to the colony. Moreover, these EBs reveal the
locations of food sources to the OBs by dancing. With the dancing information, each
of the OBs selects a food source to go. The selection of a food source is dependent on
the nectar amount of each food source. For the SBs, they fly randomly to look for new
food sources. Note that the nectar amount of each food source corresponds to a fitness
value for a specific optimization problem and the food source location represents a
candidate solution to the optimization problem. For the ABC algorithm, assume there
are S food sources. The location of the ith food source is expressed as Xi=[ xi1, xi2, …,
xiQ] for i=1,2,…,S. In the ABC algorithm, the location of the ith food source is
updated by the following equation.
1 , (12)
for i=1,2,…,S and j=1,2,…,Q, where xij(t) is the jth dimensional coordinate of the ith
food source at iteration t, k is a random integer in the set of {1,2,…,S} with the
constraint of i ≠ k, and φ_ij is a random value in [−1, 1]. An onlooker bee goes to the
vicinity of Xi with the probability given below.
p_i = fit(X_i) / Σ_{k=1}^{S} fit(X_k)    (13)

where fit(·) is the fitness function. In the ABC, if the fitness of a food source is
not improved further through a predetermined number of cycles, called limit, then that
food source is assumed to be abandoned. The operation of the ABC is specified in
steps. Step 1: initialize the locations of bees. Step 2: send employed bees to food
sources and compute their fitness values. Step 3: send onlooker bees to the food
sources, according to (13), and then compute their fitness values. Step 4: send scout bees to other food sources to help find better food sources if possible. Step 5: update
the locations of food sources and save the best one so far. Step 6: if any termination
condition is met, stop; otherwise increase the iteration index and go back to Step 2 to
continue the procedure.
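A compact Python sketch of Steps 1-6 is given below. It assumes the fitness values are non-negative (so that the selection probabilities of Eq. (13) are well defined); parameter names and default values are illustrative.

```python
import numpy as np

def abc_optimize(fitness, dim, n_sources=20, limit=30, max_iter=100,
                 lower=-1.0, upper=1.0, seed=0):
    """Minimal artificial bee colony loop; fitness must return non-negative values."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lower, upper, size=(n_sources, dim))        # Step 1: initialize
    fit = np.array([fitness(x) for x in X])
    trials = np.zeros(n_sources, dtype=int)
    best, best_fit = X[np.argmax(fit)].copy(), fit.max()

    def try_neighbour(i):
        # Eq. (12): perturb one coordinate of X_i toward/away from a random X_k
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.integers(dim)
        cand = X[i].copy()
        cand[j] = X[i, j] + rng.uniform(-1, 1) * (X[i, j] - X[k, j])
        f = fitness(cand)
        if f > fit[i]:
            X[i], fit[i], trials[i] = cand, f, 0
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(n_sources):                              # Step 2: employed bees
            try_neighbour(i)
        probs = fit / fit.sum()                                 # Eq. (13)
        for i in rng.choice(n_sources, size=n_sources, p=probs):   # Step 3: onlookers
            try_neighbour(i)
        for i in np.where(trials > limit)[0]:                   # Step 4: scout bees
            X[i] = rng.uniform(lower, upper, size=dim)
            fit[i], trials[i] = fitness(X[i]), 0
        if fit.max() > best_fit:                                # Step 5: keep the best
            best, best_fit = X[np.argmax(fit)].copy(), fit.max()
    return best                                                 # Step 6: stop when done
```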
The recursive least squares estimator (RLSE) is a recursive method for the problem
of least squares estimation (LSE). The model of LSE is given below:

y = θ_1 f_1(u) + θ_2 f_2(u) + … + θ_m f_m(u) + ε    (14)

where y is the target; u is the input to the model; {f_i(u), i = 1, 2, …, m} are known functions of
u; {θi , i=1,2,…,m} are unknown parameters to be estimated; ε is the model error.
Note that the parameters {θi, i=1,2,…,m} can be viewed as the consequent parameters
of the proposed CNFS. The observed samples are collected to be used as training data
for the proposed CNFS. The training data (TD) is denoted as follows.
TD = {(u_i, y_i), i = 1, 2, …, N}    (15)

where (u_i, y_i) is the ith data pair in the form of (input, target). With (15), the LSE model can be expressed in matrix form, y = Aθ + ε, where y = [y_1 y_2 … y_N]^T; θ = [θ_1 θ_2 … θ_m]^T; ε = [ε_1 ε_2 … ε_N]^T; A is the N×m matrix formed by the known functions {f_i(u_j), i = 1, 2, …, m} for j = 1, 2, …, N.
The optimal estimation for θ can be calculated using the following RLSE equations
recursively.
P_{k+1} = P_k − (P_k b_{k+1} b_{k+1}^T P_k) / (1 + b_{k+1}^T P_k b_{k+1})    (16)

θ_{k+1} = θ_k + P_{k+1} b_{k+1} (y_{k+1} − b_{k+1}^T θ_k)    (17)

for k = 0, 1, 2, …, (N−1), where θ_k is the estimator at the kth iteration and [b_{k+1}^T, y_{k+1}] is the (k+1)th row of [A, y]. To implement the RLSE, we initialize θ_0 with the zero vector and P_0 = αI, where α is a large positive number and I is the identity matrix.
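The recursion of Eqs. (16)-(17) translates directly into the following Python sketch; the small check at the end is illustrative only.

```python
import numpy as np

def rlse(A, y, alpha=1e6):
    """Recursive least squares estimate of theta in y = A theta + eps,
    with theta_0 = 0 and P_0 = alpha * I as described above."""
    N, m = A.shape
    theta = np.zeros(m)
    P = alpha * np.eye(m)
    for k in range(N):
        b = A[k]                                          # (k+1)-th row of A
        Pb = P @ b
        P = P - np.outer(Pb, Pb) / (1.0 + b @ Pb)         # Eq. (16)
        theta = theta + P @ b * (y[k] - b @ theta)        # Eq. (17)
    return theta

# Illustrative check on noiseless data: the estimate recovers theta closely.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
print(rlse(A, A @ np.array([2.0, -3.0])))   # approximately [ 2. -3.]
```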
4 Experimentation
The target of this experiment is to find the unknown records and to represent them by If-Then rules. We use the trapezoidal feature vector to represent
a linguistic record [1]. Suppose we have a data set for (x, y, z) that includes numerical
and linguistic data, as shown in Table 1, in which x and y are the two variables for
inputs; z is the variable for the output linguistic record; d̃ denotes an input linguistic record, meaning it is around the value d; "?" denotes that the output linguistic record
is missing. For the linguistic record of x, there is a feature vector by (x(a1), x(b1),
x(c1), x(d1)). Similarly, for the linguistic record of y, there is a feature vector by
(y(a2), y(b2), y(c2), y(d2)). For the record of z, a multiplication z= xy is implied,
whose feature vector for linguistic representation is represented by (a, b, c, d). There
are nine fuzzy If-Then rules to represent these known records, shown in the Table 2.
Table 1. The data set of known and unknown linguistic records for z

           y = 1    y = 1.5    y = 2    y = 2.5    y = 3
  x = 1      1        ?          2        ?          3
  x = 1.5    ?        ?          ?        ?          ?
  x = 2      2        ?          4        ?          6
  x = 2.5    ?        ?          ?        ?          ?
  x = 3      3        ?          6        ?          9

d̃ : linguistic record around a numerical value d.
? : unknown record.
x, (x(a1), x(b1), x(c1), x(d1)) y, (y(a2), y(b2), y(c2), y(d2)) z = xy, (a, b, c, d)
around 1, (1, 0.1, 0.5, 0.5) around 1, (1, 0.1, 0.5, 0.5) around 1, (1, 0.1, 0.5, 0.5)
around 1, (1, 0.1, 0.5, 0.5) around 2, (2, 0.2, 1.0, 1.0) around 2, (2, 0.2, 1.0, 1.0)
around 1, (1, 0.1, 0.5, 0.5) around 3, (3, 0.3, 1.5, 1.5) around 3, (3, 0.3, 1.5, 1.5)
around 2, (2, 0.2, 1.0, 1.0) around 1, (1, 0.1, 0.5, 0.5) around 2, (2, 0.2, 1.0, 1.0)
around 2, (2, 0.2, 1.0, 1.0) around 2, (2, 0.2, 1.0, 1.0) around 4, (4, 0.4, 2.0, 2.0)
around 2, (2, 0.2, 1.0, 1.0) around 3, (3, 0.3, 1.5, 1.5) around 6, (6, 0.6, 3.0, 3.0)
around 3, (3, 0.3, 1.5, 1.5) around 1, (1, 0.1, 0.5, 0.5) around 3, (3, 0.3, 1.5, 1.5)
around 3, (3, 0.3, 1.5, 1.5) around 2, (2, 0.2, 1.0, 1.0) around 6, (6, 0.6, 3.0, 3.0)
around 3, (3, 0.3, 1.5, 1.5) around 3, (3, 0.3, 1.5, 1.5) around 9, (9, 0.9, 4.5, 4.5)
In order to discover the 16 unknown linguistic records for z in Table 1, we use four
CNFS models to discover their feature vectors. The framework of knowledge
discovery is shown in Fig. 1. The feature vectors of the known (x, y, z) linguistic
record pairs are used as the training data for the four CNFS models, for which the
ABC-RLSE method is applied for the training purpose. The settings for the ABC-
RLSE method are shown in Table 3.
After learning, the CNFS models can generate feature vectors for the sixteen
unknown linguistic records, respectively, as shown in Table 4. The discovered
linguistic records are represented by If-Then rules, as shown in Table 5. The results
are compared to other approaches. Performance comparison in the measure of average
absolute error (ABE) [1] is given in Table 6.
           y = 1      y = 1.5    y = 2      y = 2.5    y = 3
  x = 1      1          1.4940     2          2.5067     3
  x = 1.5    1.4888     2.2241     2.9770     3.7311     4.4649
  x = 2      2          2.9880     4          5.0136     6
  x = 2.5    2.5114     3.7523     5.0235     6.2969     7.5359
  x = 3      3          4.4818     6          7.5207     9
x y z, (a, b, c, d)
around 1 around 1.5 around 1.4940, (1.4940, 0.1499, 0.7498, 0.7542)
around 1 2.5 around 2.5067, (2.5067, 0.2500, 1.2501, 1.2409)
1.5 around 1 around 1.4888, (1.4888, 0.1499, 0.7489, 0.7482)
1.5 around 1.5 around 2.2241, (2.2241, 0.2249, 1.1231, 1.1288)
1.5 around 2 around 2.9970, (2.9970, 0.3000, 1.4978, 1.4965)
1.5 2.5 around 3.7311, (3.7311, 0.3750, 1.8724, 1.8570)
1.5 around 3 around 4.4649, (4.4649, 0.4500, 2.2467, 2.2448)
around 2 around 1.5 around 2.9880, (2.9880, 0.2999, 1.4997, 1.5085)
around 2 2.5 around 5.0136, (5.0136, 0.5000, 2.5002, 2.4819)
around 2.5 around 1 around 2.5114, (2.5114, 0.2500, 1.2511, 1.2521)
around 2.5 around 1.5 around 3.7523, (3.7523, 0.3750, 1.8764, 1.8888)
around 2.5 around 2 around 5.0235, (5.0235, 0.4999, 2.5023, 2.5043)
around 2.5 2.5 around 6.2969, (6.2969, 0.6249, 3.1282, 3.1079)
around 2.5 around 3 around 7.5359, (7.5359, 0.7499, 3.7535, 3.7564)
around 3 around 1.5 around 4.4818, (4.4818, 0.4500, 2.2496, 2.2626)
around 3 2.5 around 7.5207, (7.5207, 0.7499, 3.7503, 3.7231)
Method       ABE (training phase)    ABE (testing phase)
CGNN [1]     0.0001                  0.303
FGNN [1]     0.0001                  0.224
Proposed     5.58×10^−8              0.0194
5 Conclusion
The complex neuro-fuzzy system (CNFS) based approach using complex fuzzy sets
for knowledge discovery has been presented. The ABC-RLSE hybrid learning method
has been devised for training the proposed CNFS models in the framework of
knowledge discovery. Through experimentation, the proposed approach has shown
excellent performance in finding missing data for knowledge discovery. The experimental results indicate that the proposed approach outperforms the compared methods.
References
1. Zhang, Y.Q., Fraser, M.D., Gagliano, R.A., Kandel, A.: Granular neural networks for
numerical-linguistic data fusion and knowledge discovery. IEEE Transactions on Neural
Networks 11, 658–667 (2000)
2. Fayyad, U., Piatetsky-Shapiro, G., Smyth, P.: The KDD process for extracting useful
knowledge from volumes of data. Commun. ACM 39(11), 27–34 (1996)
3. Castellano, G., Castiello, C., Fanelli, A.M., Mencar, C.: Knowledge discovery by a neuro-
fuzzy modeling framework. Fuzzy Sets and Systems 149, 187–207 (2005)
4. Qin, Y., Zhang, S., Zhu, X., Zhang, J., Zhang, C.: POP algorithm: Kernel-based imputation
to treat missing values in knowledge discovery from databases. Expert Systems with
Applications 36, 2794–2804 (2009)
5. Zhang, Q., Mahfouf, M.: A hierarchical Mamdani-type fuzzy modelling approach with
new training data selection and multi-objective optimisation mechanisms: A special
application for the prediction of mechanical properties of alloy steels. Applied Soft
Computing 11, 2419–2443 (2011)
6. Rezaee, B., Zarandi, M.H.F.: Data-driven fuzzy modeling for Takagi–Sugeno–Kang fuzzy
system. Information Sciences 180, 241–255 (2010)
7. Juang, C.F.: A hybrid of genetic algorithm and particle swarm optimization for recurrent
network design. IEEE Transactions on Systems, Man, and Cybernetics 34, 997–1006
(2004)
8. Kurban, T., Beşdok, E.: A comparison of RBF neural network training algorithms for
inertial sensor based terrain classification. Sensors, 6312–6329 (2009)
9. Boskovitz, V., Guterman, H.: An adaptive neuro-fuzzy system for automatic image
segmentation and edge detection. IEEE Transactions on Fuzzy Systems 10, 247–262
(2002)
10. Cpałka, K.: A new method for design and reduction of neuro-fuzzy classification systems.
IEEE Transactions on Neural Networks 20, 701–714 (2009)
11. Jang, J.S.R.: ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on
Systems, Man, and Cybernetics 23, 665–685 (1993)
12. Scherer, R.: Neuro-fuzzy relational systems for nonlinear approximation and prediction.
Nonlinear Analysis: Theory, Methods & Applications 71, 1420–1425 (2009)
13. Jang, J.S.R., Sun, C.T., Mizutani, E.: Neuro-fuzzy and soft computing. Prentice-Hall,
Englewood Cliffs (1997)
14. Qin, H., Yang, S.X.: Adaptive neuro-fuzzy inference systems based approach to nonlinear
noise cancellation for images. Fuzzy Sets and Systems 158, 1036–1063 (2007)
15. Zounemat-Kermani, M., Teshnehlab, M.: Using adaptive neuro-fuzzy inference system for
hydrological time series prediction. Applied Soft Computing 8, 928–936 (2008)
16. Ramot, D., Milo, R., Friedman, M., Kandel, A.: Complex fuzzy sets. IEEE Transactions on
Fuzzy Systems 10, 171–186 (2002)
17. Chen, Z., Aghakhani, S., Man, J., Dick, S.: ANCFIS: A neurofuzzy architecture employing
complex fuzzy sets. IEEE Transactions on Fuzzy Systems 19, 305–322 (2011)
18. Dick, S.: Toward complex fuzzy logic. IEEE Transactions on Fuzzy Systems 13, 405–414
(2005)
19. Aghakhani, S., Dick, S.: An on-line learning algorithm for complex fuzzy logic. In: IEEE
International Conference on Fuzzy Systems (FUZZ), pp. 1–7 (2010)
20. Irani, R., Nasimi, R.: Application of artificial bee colony-based neural network in bottom
hole pressure prediction in underbalanced drilling. J. Petroleum Science and
Engineering 78, 6–12 (2011)
21. Karaboga, D., Basturk, B.: A powerful and efficient algorithm for numerical function
optimization: artificial bee colony (ABC) algorithm. J. Global Optimization 39, 171–459
(2007)
22. Ozturk, C., Karaboga, D.: Hybrid artificial bee colony algorithm for neural network
training. In: IEEE Congress on Evolutionary Computation (CEC), pp. 84–88 (2011)
Integration of Multiple Fuzzy FP-trees
Abstract. In the past, the MFFP-tree algorithm was proposed to handle the
quantitative database for efficiently mining the complete fuzzy frequent
itemsets. In this paper, we propose an integrated MFFP (called iMFFP)-tree
algorithm for merging several individual MFFP trees into an integrated one. It
can help derive global fuzzy rules among distributed databases, thus allowing
managers to make more sophisticated decisions. Experimental results also
showed the performance of the proposed approach.
1 Introduction
Depending on the variety of knowledge desired, data mining approaches can be
divided into association rules [1, 3], classification [10, 18], clustering [12, 13], and
sequential patterns [2, 17], among others. Among them, mining association rules from
databases is especially common in data mining research [5, 6, 16]. Many algorithms
were used for processing the whole database to find the desired information. In real-
world applications, however, a parent company may own multiple branches, and each
branch has its own local database. The manager in a parent company needs to make a
decision for the entire company from the collected databases in different branches.
Thus, it is important to efficiently integrate many different databases to make a usable
decision.
In the past, Lan and Qiu proposed a novel parallel algorithm called the PFPT algorithm [14]. It merges several FP trees into an integrated one, so that the FP-growth mining approach is kept from generating a huge number of intermediate results. In this paper, we
extend the PFPT algorithm and propose a MFFP-tree merging algorithm for
*
Corresponding author.
integrating different databases into one, forming an integrated MFFP (called iMFFP)
tree. The iMFFP tree inherits the property of a MFFP tree for handling quantitative databases in fuzzy data mining.
The remainder of this paper is organized as follows. The related work is described
in section 2. The proposed algorithm for integrating multiple MFFP trees is stated in
section 3. An example to illustrate the proposed algorithm is given in section 4.
Experimental results are given in section 5. Conclusions are provided in section 6.
In this section, some related research is briefly reviewed. It covers fuzzy data mining approaches and the frequent pattern tree.
In recent years, fuzzy set theory [19] has been used frequently in intelligent systems
because of its simplicity and similarity to human reasoning. Kuok et al. proposed a
fuzzy mining approach for handling numerical data in databases and deriving fuzzy
association rules [11]. Hong et al. proposed fuzzy mining algorithms for discovering
fuzzy rules from quantitative transaction data [8]. Papadimitriou et al. proposed an
approach based on FP-trees for finding fuzzy association rules [15]. In addition, Hong
et al. also proposed the multiple fuzzy frequent pattern tree algorithm [9] to
efficiently mine complete fuzzy frequent itemsets from quantitative databases, which
extended the FP-tree mining process for constructing its suitable tree structures.
Frequent pattern mining is one of the most important research issues in data mining.
The initial algorithm for mining association rules was given by Agrawal et al. [1] in
the form of Apriori algorithm which is based on level-wise candidate set generation
and test methodology. However, because the size of the database can be very large, it
is very costly to repeatedly scan the database to count supports for candidate itemsets.
The limitations of the Apriori algorithm are overcome by an innovative approach: the frequent pattern (FP) tree structure and the FP-growth algorithm by Han et al. [7].
Their approach can efficiently mine frequent itemsets without the generation of
candidate itemsets, and it scans the original transaction database only twice. The
mining algorithm consists of two phases; the first constructs an FP-tree structure and
the second recursively mines the frequent itemsets from the structure. After the FP-
growth algorithm is then executed, the frequent itemsets with a given support would
be derived from the FP-tree structure.
In this section, the details of the proposed MFFP-tree merging algorithm are described.
4 An Illustrative Example
Assume that there are two quantitative databases, DB1 and DB2, which are shown in Table 1, and that the minimum support threshold s is set to 30%. Each of them consists of 4 transactions and 5 items, denoted {A} to {E}.
Assume that the fuzzy membership functions are the same for all items shown in
Figure 1. In this example, amounts are represented by three fuzzy regions: {Low},
{Middle}, and {High}. Thus, three fuzzy membership values are produced for each
item in a transaction according to the predefined membership functions.
Fig. 1. The membership functions of the three fuzzy regions (Low, Middle, High), defined over the item amount with axis values 0, 1, 6 and 11
STEPs 1 to 3: The quantitative values of the items in the transactions are represented as fuzzy sets using the membership functions. The scalar cardinality of each fuzzy region in the transactions of the two databases is calculated as the count value and checked against the specified minimum count, which is (8 × 0.3) = 2.4, to find the fuzzy frequent 1-itemsets. The results are shown in Table 2.
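A minimal Python sketch of STEPs 1 to 3 is given below. The triangular membership functions with break points 1, 6 and 11 are only an assumption suggested by the axis values of Figure 1, not the paper's exact definition, and all identifiers are illustrative.

```python
from collections import defaultdict

def tri(x, a, b, c):
    """Triangular membership with peak at b and support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(amount):
    """Assumed fuzzy regions over the purchased amount."""
    return {"Low": tri(amount, 0, 1, 6),
            "Middle": tri(amount, 1, 6, 11),
            "High": min(1.0, max(0.0, (amount - 6) / 5.0))}

def fuzzy_frequent_1_itemsets(transactions, min_support):
    """transactions: list of dicts mapping an item to its purchased amount."""
    counts = defaultdict(float)
    for t in transactions:
        for item, amount in t.items():
            for region, mu in fuzzify(amount).items():
                counts[(item, region)] += mu          # scalar cardinality
    min_count = len(transactions) * min_support       # e.g. 8 * 0.3 = 2.4
    return {k: v for k, v in counts.items() if v >= min_count}
```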
STEP 4: The sub-MFFP trees for the two quantitative databases are built respectively. The two trees are shown in Figures 2 and 3, respectively.
Fig. 2. The sub-MFFP tree built for DB1

Fig. 3. The sub-MFFP tree built for DB2
STEP 5: The leaf nodes of the MFFP-tree of DB1 are then traced one by one. In this example, three branches can be derived from these leaf nodes. Figure 4 shows the three branches.
Fig. 4. The three branches derived from the leaf nodes of the sub-MFFP tree of DB1
STEP 6: Insert the three branches of the sub-MFFP tree of DB1 into the iMFFP tree.
Note that the sub-MFFP tree of DB2 is considered as the initial iMFFP-tree in this
example.
STEPs 7 & 8: After Step 6 is executed, the two sub-MFFP trees are merged together. Since no more sub-MFFP trees need to be merged, we create the Header_Table and insert node-links from the entry of a fuzzy region in the Header_Table to the first
branch of that fuzzy region. The final iMFFP-tree has thus been constructed. The final
results are shown in Figure 5.
Fig. 5. The final iMFFP-tree and its Header_Table (A.Middle: 4.2, B.Low: 3.0, C.Middle: 3.4, C.High: 3.8, D.Low: 2.8)
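A rough Python sketch of the branch insertion performed in STEPs 5 to 8 is given below; the node structure and function names are illustrative and do not reproduce the authors' implementation.

```python
class Node:
    def __init__(self, region):
        self.region = region          # a fuzzy region such as "A.Middle"
        self.count = 0.0              # accumulated fuzzy count
        self.children = {}

def insert_branch(root, branch):
    """Insert one extracted branch (a root-to-leaf list of (region, count) pairs)
    into the integrated tree, accumulating counts on shared prefixes."""
    node = root
    for region, count in branch:
        child = node.children.setdefault(region, Node(region))
        child.count += count
        node = child

def header_table(root):
    """Total fuzzy count per region over the whole tree (the Header_Table)."""
    table, stack = {}, list(root.children.values())
    while stack:
        n = stack.pop()
        table[n.region] = table.get(n.region, 0.0) + n.count
        stack.extend(n.children.values())
    return table
```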
5 Experimental Results
The experiments were performed on a real dataset called CONNECT [4]. It was
divided into 2 and 5 sub-databases for constructing 2 and 5 sub-MFFP trees. The
execution time of the proposed iMFFP-tree algorithm and the PFPTC algorithm [14] was compared under different minimum support thresholds. The results are shown in
Figure 6.
It is obvious that the proposed algorithm performed better than the PFPTC algorithm in both the two- and five-sub-database cases.
6 Conclusions
In real-world applications, the information from several branches can be integrated into efficient rules for a parent company to make correct decisions. Thus, we propose the integrated MFFP-tree (iMFFP-tree) algorithm for merging several sub-MFFP trees into one. The branches in each sub-MFFP tree are efficiently extracted and integrated with the other sub-MFFP trees in sequence, thus forming an integrated MFFP tree for
decision making. Experimental results also show that the proposed iMFFP-tree
algorithm has a better performance than the PFPTC algorithm.
References
1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases.
In: The International Conference on Very Large Data Bases, pp. 487–499 (1994)
2. Agrawal, R., Srikant, R.: Mining sequential patterns. In: The International Conference on
Data Engineering, pp. 3–14 (1995)
3. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in
large databases. In: The International Conference on Management of Data, pp. 207–216
(1993)
4. Bayardo, R.: UCI repository of machine learning databases,
http://fimi.ua.ac.be/data/connect.dat
5. Berzal, F., Cubero, J.C., Marín, N., Serrano, J.M.: Tbar: An efficient method for association
rule mining in relational databases. Data and Knowledge Engineering 37, 47–64 (2001)
6. Chen, M.S., Han, J., Yu, P.S.: Data mining: An overview from a database perspective.
IEEE Transactions on Knowledge and Data Engineering 8, 866–883 (1996)
7. Han, J., Pei, J., Yin, Y., Mao, R.: Mining frequent patterns without candidate generation: A
frequent-pattern tree approach. Data Mining and Knowledge Discovery 8, 53–87 (2004)
8. Hong, T.P., Kuo, C.S., Wang, S.L.: A fuzzy AprioriTid mining algorithm with reduced computational time. Applied Soft Computing 5, 1–10 (2004)
9. Hong, T.P., Lin, C.W., Lin, T.C.: Mining complete fuzzy frequent itemsets by tree structures.
In: IEEE International Conference on Systems, Man, and Cybernetics, pp. 563–567 (2010)
10. Hu, K., Lu, Y., Zhou, L., Shi, C.: Integrating classification and association rule mining: A
concept lattice framework. In: The International Workshop on New Directions in Rough
Sets, Data Mining, and Granular-Soft Computing, pp. 443–447 (1999)
11. Kuok, C.M., Fu, A., Wong, M.H.: Mining fuzzy association rules in databases. SIGMOD
Record 27, 41–46 (1998)
12. Lent, B., Swami, A., Widom, J.: Clustering association rules. In: The International
Conference on Data Engineering, pp. 220–231 (1997)
13. Liu, F., Lu, Z., Lu, S.: Mining association rules using clustering. Intelligent Data
Analysis 5, 309–326 (2001)
14. Lan, Y.J., Qiu, Y.: Parallel frequent itemsets mining algorithm without intermediate result.
In: The International Conference on Machine Learning and Cybernetics, pp. 2102–2107
(2005)
15. Papadimitriou, S., Mavroudi, S.: The fuzzy frequent pattern tree. In: The WSEAS
International Conference on Computers, pp. 1–7 (2005)
16. Park, J.S., Chen, M.S., Yu, P.S.: Using a hash-based method with transaction trimming for
mining association rules. IEEE Transactions on Knowledge and Data Engineering 9, 813–825
(1997)
17. Srikant, R., Agrawal, R.: Mining sequential patterns: Generalizations and performance
improvements. In: The International Conference on Extending Database Technology:
Advances in Database Technology, pp. 3–17 (1996)
18. Sucahyo, Y., Gopalan, R.: Building a More Accurate Classifier Based on Strong Frequent
Patterns. In: Webb, G.I., Yu, X. (eds.) AI 2004. LNCS (LNAI), vol. 3339, pp. 1036–1042.
Springer, Heidelberg (2004)
19. Zadeh, L.A.: Fuzzy sets. Information and Control, 338–353 (1965)
A Quadratic Algorithm
for Testing of Omega-Codes
Nguyen Dinh Han1 , Phan Trung Huy2 , and Dang Quyet Thang3
1 Hung Yen University of Technology and Education, Vietnam
[email protected]
2 Hanoi University of Science and Technology, Vietnam
[email protected]
3 Nam Dinh University of Technology and Education, Vietnam
[email protected]
1 Preliminaries
The notion of infinitary codes has been considered in [10,9]. In the class of codes
of finite words, a subclass of ω-codes has been studied in many works [7,6,4,1], which showed its role in the development of languages of finite and infinite words.
Until now, as shown in a deep survey on tests for ω-codes conducted by Augros
and Litovsky [1], the best testing algorithm for ω-codes has time complexity
O(n3 ), where n is the size of the transition monoid of the minimal automaton
recognizing the input language. In Section 2 of our paper, a new technique based
on finite graphs to establish a test for ω-codes is presented. As a consequence,
we obtain an effective testing algorithm for ω-codes with time complexity O(n2 )
which is the main result of the paper and it is presented in Section 3.
In the following, we recall some notions (for more details, we refer to [3,5]).
Let A be a finite alphabet. As usual, A∗ is the free monoid generated by A whose
elements are called finite words, and Aω is the set of all infinite words. An infinitary language can include both finite and infinite words, the largest one being A∞ = A∗ ∪ Aω.
The empty word is denoted by ε and A+ = A∗ − {ε}. The length |w| of the word
w = a1 a2 · · · an with ai ∈ A is n. By convention |ε| = 0. A subset of A∗ is called
a language while a subset of Aω is an ω-language. For any language X ⊆ A∗ , we
denote by X ∗ the submonoid of A∗ generated by X, X + = X ∗ − {ε} and X ω
the ω-language
X ω = {w ∈ Aω | w = u1 u2 · · · with ui ∈ X}.
Since X is a code if and only if X ∗ is a free monoid, the code properties of X can
be verified on the submonoid generated by X. The following basic properties, which can be checked by definition, provide us with tools to establish an effective algorithm to test whether a regular language of finite words is an ω-code or not.
Then, we have the following theorem that gives a criterion for ω-codes.
Remark 1. The correctness of the test is deduced from Theorem 1, but we remark
that the proof of Theorem 1 in [1] should be improved with the use of lemmata
and propositions.
for x ∈ M do
  for f ∈ M do
    if e ≠ f and T(f, x) = e then
      add f −→ e into E;
Proof. Indeed, we can assume X is given by a tuple (h, K, P) with h−1(K) = X.
Step 1 (constructing G).
From P, K, we can construct S = K∗ and T = K+ with time complexity O(n2) by using Lemma 4.
Using S, T, K, P, we define a graph G = (V, E), with V = P × I where I = {1, 2}, and E is constructed by the following procedure:
for e ∈ P do
  for s ∈ P do
    if s ∈ S (or equivalently flagS(s) == 1) then
      add the red arrow ((e, 1) −→ (e.s, 2)) to E;
for m ∈ P do
  for e ∈ P do
    if m.e ∈ K (or equivalently flagK(m.e) == 1) then
      add the blue arrow ((m, 2) −→ (e, 1)) to E;
    if m.e ∈ K and m ∈ T and e ≠ 1P then
      start(e, 1) = 1;
Note that we set start(e, 1) = 1 for the condition e ∈ T −1 K − L, where start is an array of n elements indexed by P.
Hence, constructing G requires time O(n2 ).
boolean containsCycle(Graph g)
  for each vertex v in g do
    v.mark = WHITE;
  for each vertex v in g do
    if v.mark == WHITE and start(v, 1) == 1 then
      if visit(g, v) then
        return TRUE;
  return FALSE;
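The `visit` routine is not reproduced in this excerpt. The Python sketch below shows one standard way to realize `containsCycle` with a depth-first search and WHITE/GRAY/BLACK colouring restricted to the flagged start vertices; it is an illustrative reconstruction under our own assumptions (adjacency-list `graph`, a `starts` set for the vertices with start(v, 1) = 1), not the authors' implementation.

```python
WHITE, GRAY, BLACK = 0, 1, 2

def contains_cycle(graph, starts):
    """Return True iff a cycle is reachable from any start vertex.

    `graph` maps each vertex to the list of its successors (the arrows of G);
    `starts` is the set of vertices flagged by start(v, 1) == 1.
    """
    color = {v: WHITE for v in graph}

    def visit(v):
        color[v] = GRAY                       # v is on the current DFS path
        for w in graph.get(v, []):
            if color.get(w, WHITE) == GRAY:   # back edge -> cycle found
                return True
            if color.get(w, WHITE) == WHITE and visit(w):
                return True
        color[v] = BLACK                      # all paths from v explored
        return False

    return any(color[v] == WHITE and visit(v) for v in starts if v in color)

# Toy usage: a graph with a cycle reachable from vertex 'a'.
g = {'a': ['b'], 'b': ['c'], 'c': ['a'], 'd': []}
print(contains_cycle(g, {'a'}))   # True
```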
References
1. Augros, X., Litovsky, I.: Algorithm to test rational ω-codes. In: Proceedings of
the Conference of The Mathematical Foundation of Informatics, pp. 23–37. World
Scientific (October 1999)
2. Berman, K.A., Paul, J.L.: Algorithms - Sequential, parallel, and distributed. Thom-
son Learning, Inc., USA (2005)
3. Berstel, J., Perrin, D., Reutenauer, C.: Theory of Codes. Academic Press Inc.,
New York (1985)
4. Devolder, J., Latteux, M., Litovsky, I., Staiger, L.: Codes and infinite words. Acta
Cybernetica 11(4), 241–256 (1994)
5. Lallement, G.: Semigroups and Combinational Applications. John Wiley and Sons,
Inc. (1979)
6. Lam, N.H., Van, D.L.: On strict codes. Acta Cybernetica 10(1-2), 25–34 (1991)
7. Mateescu, A., Mateescu, G.D., Rozenberg, G., Salomaa, A.: Shuffle-like operations
on ω-words. In: Păun, G., Salomaa, A. (eds.) New Trends in Formal Languages.
LNCS, vol. 1218, pp. 395–411. Springer, Heidelberg (1997)
8. Sedgewick, R.: Algorithms in C++, Part 5: Graph algorithms. Addison-Wesley, Pearson Education, Inc., USA (2002)
9. Staiger, L.: On infinitary finite length codes. Informatique Théorique et Applica-
tions 20(4), 483–494 (1986)
10. Van, D.L.: Contribution to Combinatorics on Words. Ph.D. thesis, Humboldt
University, Berlin (1985)
A Study on the Modified Attribute Oriented Induction
Algorithm of Mining the Multi-value Attribute Data
Abstract. The Attribute Oriented Induction method, AOI for short, is one of the most important methods of data mining. The input of AOI consists of a relational data table and attribute-related concept hierarchies. The output is a general feature induced from the related data. Although the traditional AOI method is useful for finding general features, it can only mine features from single-valued attribute data; if the data has multi-valued attributes, the traditional AOI method cannot find general knowledge from it. In addition, the AOI algorithm relies on established concept hierarchies to conduct its induction. Different principles of classification or different category values produce different concept trees, thereby affecting the inductive conclusion. Based on these issues, this paper proposes a modified AOI algorithm combined with a simplified Boolean-bit Karnaugh map. It does not need to establish concept trees, and it can handle multi-valued data and find the general features implied within the attributes.
1 Introduction
Data mining extracts implicit, previously unknown and potentially useful information
from databases. Many approaches have been proposed to extract information.
According to the classification scheme proposed in recent surveys (Chen et al., 1996;
Han and Kamber, 2001), one of the most important ones is the attribute-oriented
induction (AOI) method. This approach was first introduced in Cai et al. (1990), Han et al. (1992), and Han et al. (1993).
The AOI method was developed for knowledge discovery in relational databases.
The input of the method includes a relation table and a set of concept trees (concept
hierarchies) associated with the attributes (columns) of the table. The table stores the
task-relevant data, and the concept trees represent the background knowledge. The
core of the AOI method is on-line data generalization, which is performed by first
examining the data distribution for each attribute in the set of relevant data,
calculating the corresponding abstraction level that the data in each attribute should
be generalized to, and then replacing each data tuple with its corresponding
generalized tuple. The major generalization techniques used in the process include
attribute-removal, concept-tree climbing, attribute-threshold control, propagation of
counts and other aggregate values, etc. Finally, the generalized data is expressed in
the form of a generalized relation from which many kinds of rules can be discovered,
such as characteristic rules and discrimination rules. For more details, please refer to
the original papers (Cai et al., 1990; Han et al., 1992; Han et al., 1993).
Undoubtedly, the AOI method has achieved a great success. Because of its success,
extensions have been proposed in the following directions: (1) extensions and
applications based on the basic AOI approach (Han et al., 1993; Han et al., 1998; Lu et
al., 1993), (2) more efficient methods of AOI (Carter and Hamilton, 1995; Carter and
Hamilton, 1998; Cheung et al., 2000), (3) more general background knowledge
(Hamilton et al., 1996; McClean et al., 2000), (4) integrating AOI with other information
reduction methods (Hu and Cercone, 1996; Shan et al., 1995), (5) proposing new variants of generalized rules (Tsumoto, 2000), and (6) finding generalized knowledge from an ordered list of data with a dynamic programming algorithm based on AOI techniques (Chen and Shen, 2005).
AOI related algorithms conduct data induction with the help of concept hierarchies,
which are needed for each inducted attribute and taken as a prerequisite to apply AOI.
Concept hierarchies are the main characteristic of AOI and the primary reason that AOI can conduct induction. However, this major characteristic has also become the major barrier to AOI applications. Two problems are rooted in the hierarchies. The first one is the scarce availability of credible concept hierarchies. In many cases, users who need to summarize data in huge tables find the application of AOI unrealistic simply because the targeted attributes do not have sensible concept hierarchies. The second problem is that concept hierarchies and their associated attributes can only hold a single value. Unfortunately, much census data, which would otherwise make very good applications of AOI, stores data in multiple-value formats.
For example, Table 1 shows a census data for several areas which are crime hot spots.
The table contains the attributes of Area, Marital status, Gender and Education. Except
Area, the other three attributes store data in set oriented multi-valued format. Each value
in the set is a pair of <ordinal value, count>, where the ordinal value denotes a banded or
categorical value which are totally ordered in their sorts; the count is the population in the
area that are categorized into the corresponding group. For example, {<g1, 30> <g2, 70>} in Area1 means that 30 persons in the area have gender g1 and 70 of them have gender g2.
However, existing AOI methods still share a number of common shortcomings: (1) they require concept trees, and since the way concept trees are set up varies from person to person, the rules finally summarized will also differ; (2) they can only deal with single-valued attributes, whereas much real-world information, such as census data, has multi-valued attributes. In response to these problems, especially for multi-valued attribute data, this paper presents a new AOI method that first converts each attribute value into a two-element (binary) representation and then uses Boolean
In view of these weaknesses, this paper proposes a new AOI method. In the algorithm, a translation procedure is deployed to translate each multi-value into a sequence of binary digits. The binary sequences are merged and then inducted with a Karnaugh map. Finally, the attribute values of the same group of variables can be summarized by rules of the following form:
{<g1,L><g2,H>}{<a1,L><a3,H>}{<e1,L><e3,H><e4,H>} Y%
That is: Y% of the law-and-order hot spots have the characteristics of more females, more elderly residents, and more residents with tertiary education.
[Karnaugh map cell layout (minterms r0–r15) and the Karnaugh map of the Education attribute]
Based on the Karnaugh map rule, one variable can be eliminated from two adjacent minterms. For the Education attribute in Table 2, there are two pairs of adjacent minterms, (r3, r7) and (r5, r7), and the minterm r12 stands alone. The resulting expression is as follows.
F(e1, e2, e3, e4) = Σ m(r3, r5, r7, r12)
                 = (r3 + r7) + r5 + r12                               (2)
                 = (0011 + 0111) + 0101 + 1100
                 = 0_11 + 0101 + 1100
Equation (2) reveals that the values ‘0011’ and ‘0111’ are combined into ‘0_11’.
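This combination step can be mimicked with a few lines of code. The sketch below is a simplified, hypothetical illustration only: two binary strings are merged into one pattern with a don't-care ‘_’ whenever they differ in exactly one bit, which is how ‘0011’ and ‘0111’ become ‘0_11’ in Equation (2). The function name `combine` is our own.

```python
def combine(a, b):
    """Merge two equal-length binary strings that differ in exactly one bit.

    Returns the merged pattern with '_' at the differing position, or None
    when the two minterms are not adjacent in the Karnaugh-map sense.
    """
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) != 1:
        return None
    i = diff[0]
    return a[:i] + "_" + a[i + 1:]

print(combine("0011", "0111"))   # '0_11', the combination used in Equation (2)
print(combine("0011", "0101"))   # None: these minterms differ in two bits
```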
Following a similar procedure, the ‘age’ attribute has 3 values, whose Karnaugh map is shown in Fig. 3.
[Fig. 3. Karnaugh map for the ‘age’ attribute]
After a similar simplification process, the result for the ‘age’ attribute is described in Equation (3).
F(a1, a2, a3) = Σ m(r1, r2, r3, r5)
             = (r1 + r3) + r2 + r5                                    (3)
             = (001 + 011) + 010 + 101
             = 0_1 + 010 + 101
Equation (3) reveals that the values ‘001’ and ‘011’ are combined into ‘0_1’.
It is not necessary to simplify the ‘gender’ attribute, since it has only 2 values.
Table 3. The Boolean values of 10 crime hot spots (DB) has been converted into DB’
The Karnaugh map is the most efficient tool for simplifying Boolean algebra: it provides a graphical (box-based) view of the relationships among terms, so that groups of 2, 4, 8 or 16 adjacent cells can be circled and combined to simplify the expression.
However, when the same cells are circled repeatedly in different groups, the attribute values are double-counted when they are summed up, so the grouping rule must be amended: when circled groups overlap, the group with the larger combined count is kept. A four-variable Karnaugh map is used as an example, as shown in Fig. 6.
[Fig. 6. Example four-variable Karnaugh maps (a) and (b) illustrating the modified grouping rule]
(1) In Fig. 6(a), the original Karnaugh map simplification rules would form the groups {r1, r3, r5, r7}, {r5, r13} and {r7, r6}, in which r5 and r7 are selected repeatedly. The combined counts are (3+2+6+5) = 16 for {r1, r3, r5, r7}, (6+4) = 10 for {r5, r13} and (5+2) = 7 for {r7, r6}. To avoid repeated counting in the induction, the group {r1, r3, r5, r7} is selected, and r6 and r13 are given up.
(2) In Fig. 6(b), the original rules would circle {r1, r3, r5, r7} as one group and {r5, r7, r13, r15} as another, but r5 and r7 would then be selected twice. The group with the larger count, {r5, r7, r13, r15} with (4+7+6+3) = 20, is therefore selected, and {r1, r3} forms the other group.
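One possible reading of this overlap rule can be sketched as a greedy selection: candidate Karnaugh groups are ranked by their total cell count, and cells already covered are not counted again. The code below is our own illustrative reading, not the authors' procedure; the per-cell counts are partly taken from the Fig. 6(b) discussion and partly hypothetical.

```python
def select_groups(candidates, counts):
    """Greedily keep candidate groups (tuples of cells) so that no cell is counted twice.

    `candidates` is a list of cell-name tuples, `counts` maps a cell to its count.
    Groups with larger total counts are preferred, mirroring the Fig. 6(b) choice.
    """
    ranked = sorted(candidates, key=lambda g: sum(counts[c] for c in g), reverse=True)
    covered, kept = set(), []
    for group in ranked:
        new_cells = [c for c in group if c not in covered]
        if new_cells:                      # keep only the not-yet-covered part
            kept.append(tuple(new_cells))
            covered.update(new_cells)
    return kept

# Hypothetical counts; the group totals 20 and the final grouping match Fig. 6(b).
counts = {"r1": 1, "r3": 2, "r5": 4, "r7": 7, "r13": 6, "r15": 3}
groups = [("r1", "r3", "r5", "r7"), ("r5", "r7", "r13", "r15")]
print(select_groups(groups, counts))   # [('r5','r7','r13','r15'), ('r1','r3')]
```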
3 Performance Evaluation
3.1 Experimental Environment
performance of the algorithm. This paper uses a real case of 2,500 records from the MovieLens data sets. First, the meaningful attributes are filtered; the descriptions of the attributes are shown in Table 5. Then, using the proposed algorithm, the features of users giving ratings of 4–5 points are inducted in zip-code units, where the MovieLens data sets were collected by the GroupLens Research Project at the University of Minnesota.
The data was collected through the MovieLens web site (movielens.umn.edu) during a seven-month period from September 19th, 1997 to April 22nd, 1998. The data has been cleaned, which means that users who had fewer than 20 ratings or incomplete demographic information were removed from the data set.
Rule 1: {<a2, H><a3, L>} {<o1, H><o4, L>} {<m1, H><m2, L>} 62.72%
Interpretation of the rule: among the users giving ratings of 4–5 points, 62.72% are between 21 and 60 years of age, and fewer are older than 61. Most of them are engineers in terms of the nature of their work, while students are in the minority. They favor romantic movies more and comedies less.
(2) Stability
This paper is an empirical study. The data is divided into units of 300 records each. The experiment is run successively and cumulatively to investigate the stability of the inductive ability of the proposed algorithm. The result shows that the algorithm has stable inductive ability regardless of the data size, as shown in Fig. 8.
[Fig. 8. Stability of Rules 1–11 as the data size grows from 300 to 2,500 records]
4 Conclusions
This paper proposes a modified AOI algorithm combining the simplified binary digits
with Karnaugh Map. It is capable of dealing with data with multi-valued attributes
without establishing the concept trees and extracting the general features implicit in
the attributes.
This research makes three contributions according to the empirical results. First, it solves the bottleneck problem of the traditional AOI method: there is no need to establish concept hierarchies and concept trees during the inductive process, thus avoiding the heavy workload. Second, the traditional AOI method cannot deal with data with multi-valued attributes, while the method proposed in this paper can. Third, the data induction shows very good inductive ability and stability.
References
1. Cai, Y., Cercone, N., Han, J.: An attribute-oriented approach for learning classification
rules from relational databases. In: Proceedings of Sixth International Conference on Data
Engineering, pp. 281–288 (1990)
2. Carter, C.L., Hamilton, H.J.: Performance evaluation of attribute-oriented algorithms for
knowledge discovery from databases. In: Proceedings of Seventh International Conference
on Tools with Artificial Intelligence, pp. 486–489 (1995)
3. Carter, C.L., Hamilton, H.J.: Efficient attribute-oriented generalization for knowledge
discovery from large databases. IEEE Transactions on Knowledge and Data
Engineering 10(2), 193–208 (1998)
4. Chen, M.S., Han, J., Yu, P.S.: Data mining: An overview from a database perspective.
IEEE Transactions on Knowledge and Data Engineering 8(6), 866–883 (1996)
5. Cheung, D.W., Hwang, H.Y., Fu, A.W., Han, J.: Efficient rule-based attribute-oriented
induction for data mining. Journal of Intelligent Information Systems 15(2), 175–200 (2000)
6. Hamilton, H.J., Hilderman, R.J., Cercone, N.: Attribute-oriented induction using domain
generalization graphs. In: Proceedings of Eighth IEEE International Conference on Tools
with Artificial Intelligence, pp. 246–252 (1996)
7. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Academic Press, New York
(2001)
8. Han, J., Cai, Y., Cercone, N.: Knowledge discovery in databases: An attribute-oriented
approach. In: Proceedings of International Conference on Very Large Data Bases, pp. 547–
559 (1992)
9. Han, J., Cai, Y., Cercone, N.: Data-driven discovery of quantitative rules in relational
databases. IEEE Transactions on Knowledge and Data Engineering 5(1), 29–40 (1993)
10. Han, J., Nishio, S., Kawano, H., Wang, W.: Generalization-based data mining in object-
oriented databases using an object-cube model. Data and Knowledge Engineering 25, 55–97
(1998)
11. Lu, W., Han, J., Ooi, B.C.: Discovery of general knowledge in large spatial databases. In:
Proceedings of 1993 Far East Workshop on Geographic Information Systems (FEGIS 1993),
pp. 275–289 (1993)
12. Marcovitz, A.: Introduction to Logic Design. McGraw Hill, New York (2005)
13. Tokhenim, R.: Digital Electronics: Principles and Applications. McGraw Hill, New York
(2005)
14. McClean, S., Scotney, B., Shapcott, M.: Incorporating domain knowledge into attribute-
oriented data mining. International Journal of Intelligent Systems 15(6), 535–548 (2000)
15. Chen, Y.L., Shen, C.C.: Mining generalized knowledge from ordered data through
attribute-oriented induction techniques. European Journal of Operational Research 166,
221–245 (2005)
A Hybrid Evolutionary Imperialist Competitive
Algorithm (HEICA)
1 Introduction
Recently there has been a considerable amount of attention devoted to bio-inspiration and bio-mimicry for solving computational problems and constructing intelligent
systems. In the scope of computational intelligence it seems there are at least six main
domains of intelligence in biological systems and wild life: Swarming,
Communication and Collaboration, Reproduction and Colonization, Learning and
Experience, Competition and Evolution [1].
Evolutionary algorithms, such as the Genetic Algorithm (GA) [2], Simulated Annealing (SA), Particle Swarm Optimization (PSO) [3-4] and Ant Colony Optimization (ACO) [5], are computer simulations of natural processes such as natural evolution and the annealing of materials.
The Imperialist Competitive Algorithm (ICA) was proposed by Atashpaz, in 2007
[6]. ICA is the mathematical model and the computer simulation of human social
2 Background
3 Related Works
Abdechiri and Meybodi [7] proposed two algorithms for solving SAT problems: first, a new algorithm that combines ICA and LR, and second, a hybrid Hopfield network (HNN)–Imperialist Competitive Algorithm (ICA). The proposed algorithm (HNNICA) has a good performance in solving SAT problems.
Khorani, Razavi and Ghoncheh [8] proposed R-ICA-GA (Recursive-ICA-GA), based on the combination of ICA and GA. The new method improves the convergence speed and the accuracy of the optimization results. They run ICA and GA consecutively; the results show that a fast decrease occurs when the proposed algorithm switches from ICA to GA.
Jain and Nigam [9] proposed a hybrid approach by combining the evolutionary
optimization based GA and socio-political process based colonial competitive
algorithm (CCA). They used CCA–GA algorithm to tune a PID controller for a real
time ball and beam system.
Razavi and others [10] studied the ability of the evolutionary Imperialist Competitive Algorithm (ICA) to coordinate overcurrent relays. The ICA was compared to the GA.
The algorithms were compared in terms of the mean convergence speed, mean
convergence time, convergence reliability, and the tolerance of convergence speed in
obtaining the absolute optimum point.
In this section, we combine the two algorithms to obtain a novel hybrid algorithm. The pseudo-code of HEICA is presented below:
Procedure HEICA
  Step 1: Initialization;
    Generate some random people;
    Randomly allocate the remaining people to the other countries;
    Select the more powerful leaders as the empires;
  Step 2: Evolutionary Algorithm
    Roulette Wheel Selection;
    Crossover;
    Mutation;
    Replacement;
  Step 3: Imperialist Competitive Algorithm
    People Assimilation; Move the people of each country toward their relevant leaders.
    People Revolution;
    Countries Assimilation; Move the leaders of each country toward their empires and move the people of each country in the same way as their leaders.
    Countries Revolution;
    Imperialistic Competition; Pick the weakest country from the weakest empire and give it to the empire that has the most likelihood to possess it.
    Elimination; Eliminate the powerless empires.
  Step 4: Terminating Criterion Control; Repeat Steps 2–3 until a terminating criterion is satisfied.
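The pseudo-code can be read as an alternation between one generation of evolutionary operators and one round of ICA operators. The Python skeleton below is a highly simplified, hypothetical sketch of that control flow on real-valued vectors; the operator bodies are deliberately crude stand-ins and do not reproduce the authors' exact operators, parameters, or country structure.

```python
import random

def heica(cost, dim, n_people=60, n_countries=6, generations=200):
    """Minimal sketch of the HEICA control flow: an EA step followed by an ICA step."""
    # Step 1: initialization -- random people; after sorting, the best ones act as leaders.
    people = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_people)]
    people.sort(key=cost)

    for _ in range(generations):
        # Step 2: simplified evolutionary operators (selection, crossover, mutation, replacement).
        parents = random.sample(people[:n_people // 2], 2)
        child = [(a + b) / 2 + random.gauss(0, 0.1) for a, b in zip(*parents)]
        worst = max(range(n_people), key=lambda i: cost(people[i]))
        if cost(child) < cost(people[worst]):
            people[worst] = child

        # Step 3: simplified ICA operators -- assimilation of people toward country leaders.
        people.sort(key=cost)
        leaders = people[:n_countries]
        for i in range(n_countries, n_people):
            leader = leaders[i % n_countries]
            beta = random.uniform(0, 2)
            people[i] = [p + beta * (l - p) for p, l in zip(people[i], leader)]

    return min(people, key=cost)

# Toy usage on the sphere function (minimum 0 at the origin).
best = heica(lambda x: sum(v * v for v in x), dim=5)
print(round(sum(v * v for v in best), 4))
```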
4.1 Population
This algorithm starts by generating a set of candidate random solutions in the search space of the optimization problem. The generated random points are called the initial population, which consists of persons. Persons in this algorithm are the counterpart of chromosomes in GA and particles in PSO; each person is an array representing a candidate solution. In human society, groups of people form communities and are involved in community development.
• Autocracy: This type of community does not have a leader. The people are free, there is no force, and they do what they want.
• Monarchy: A monarchy has a king or queen, who sometimes has absolute power, and power is passed along through the family. This type of community has a monarch whom the people must follow. The most powerful person is selected as the monarch in each country. Different monarchy countries exist in the empires; the best monarch among these countries is selected as the empire.
4.3 Initialization
The algorithm starts with an initial random population of persons. Some of the best persons in the population are selected to be the leaders, and the rest form the people of the countries. Each country has an equal population.
The total power of a country depends on both the power of its leader and the power of its people. This fact is modeled by defining the total power of a country as the power of the leader of the country plus a percentage of the mean power of its people, so that the power of the people affects the total power of that country:
T.C._i = Cost(leader_i) + ξ · mean{Cost(people of country i)},   (1)
where T.C._i is the total power of the i-th country and ξ is the percentage coefficient.
Based on the power of the monarchy countries, some of the best initial countries (the countries with the least cost function value) become imperialists and start taking control of other countries, forming the initial empires. The best leaders of these countries become the empires. A number of the most powerful countries are selected to form the empires. To divide the countries among the imperialists proportionally, we use the normalized cost of an imperialist, defined by [6]:
C_i = c_i − max_j {c_j},   (2)
where c_i is the cost of the i-th imperialist and C_i is its normalized cost. Having the
normalized cost of all imperialists, the normalized power of each imperialist is
defined by [6]:
p_i = | C_i / Σ_j C_j |.   (3)
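Under the reconstruction of Equations (1)–(3) given above (the notation in the original may differ), these quantities are straightforward to compute. The following Python sketch is illustrative only; `xi` stands for the percentage coefficient of Equation (1), and all numeric values are made up.

```python
def total_power(leader_cost, people_costs, xi=0.1):
    """Equation (1): leader cost plus a percentage of the mean cost of the people."""
    return leader_cost + xi * sum(people_costs) / len(people_costs)

def normalized_powers(imperialist_costs):
    """Equations (2)-(3): C_i = c_i - max_j c_j and p_i = |C_i / sum_j C_j|."""
    c_max = max(imperialist_costs)
    normalized = [c - c_max for c in imperialist_costs]
    total = sum(normalized)                      # assumes not all costs are equal
    return [abs(c / total) for c in normalized]

# Toy usage: three imperialists; the shares indicate how many colonies each receives.
print(normalized_powers([1.0, 2.5, 4.0]))        # approx. [0.667, 0.333, 0.0]
print(total_power(1.0, [2.0, 3.0, 4.0]))         # 1.3
```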
• External: External operations take place among countries. Assimilation occurs only among the monarchy countries of each imperialist; they move toward their empires. When a country moves toward the imperialist, all the people of that country move in the same way, toward the empire. Only monarchy countries have an empire, therefore assimilation is toward the empire. Revolution occurs in all countries, and all the people of one country move in the same direction: in monarchy countries the revolution is against the empire, while in the other countries the people try to improve their country and therefore move together.
• Internal: Internal operations take place among the people of each country. Assimilation occurs in all countries; the people move toward their leaders. Revolution also occurs in all countries; people try to take the leader's position.
To evaluate the performance of the proposed algorithm, the simulation results are compared with the results of EA, ICA, PSO and ABC [11]. M is the mean of the best results. Performance is calculated from Equation (5), as follows, for minimization problems:
100% 1 (5)
The common benchmark functions used are listed in Table 1. The comparison results for F1–F10 are shown in Table 2. The best stability and convergence diagrams of different functions are shown in Fig. 1 and Fig. 2. In Table 3, the performance of the HEICA algorithm is compared with PSO and ABC on high-dimensional problems. In Table 4, the proposed HEICA algorithm is tested on the benchmark functions provided by the CEC2010 Special Session on Large Scale Global Optimization [12]. HEICA performs better than SDENS [13]; HEICA performs better at 1.2E5 and 6E5 FEs and jDElsgo [14] performs better at 3E6 FEs. To show the efficiency of HEICA in solving different functions, a logarithmic-scale diagram is used. A logarithmic scale is a scale of measurement using the logarithm of a physical quantity instead of the quantity itself. Since the values in this study cover a wide range, the logarithmic scale makes it easy to compare them.
5.2 Discussion
This paper proposes a novel hybrid approach consisting of EA and ICA, and its performance is evaluated using various test functions. The performance measure shows that HEICA achieves better optimization than EA, ICA, PSO and ABC on all test functions, which indicates that the hybrid algorithm performs better optimization.
[Table 1. The benchmark functions used, including F6, F7 (Schaffer), F9 (SumSquares) and F10 (Ackley), with their formulas and search ranges]
F F1 F2 F3 F4 F5
D. 10 50 10 30 10 50 10 50 10 50
Gen. 1000 2000 2000 4000 1000 2000 1000 3000 1000 2000
Pop. 150 500 250 600 150 400 450 750 150 300
EA 9.74E-03 5.15E+0 6.50E+00 7.42E+01 1.27E+00 7.81E+01 2.41E-02 1.06E-02 8.52E+00 1.58E+03
ICA 6.04E-04 8.18E-03 8.49E-01 1.11E+01 1.23E-01 2.46E+00 1.77E-03 2.15E-02 9.57E-01 9.59E+01
PSO 2.30E-35 2.14E-12 1.33E+00 3.32E+01 1.69E+00 6.72E+01 5.68E-02 1.00E-02 6.88E+02 7.76E+03
ABC 7.30E-17 1.14E-15 2.60E-03 5.79E+00 0 5.79E-12 1.67E-17 2.22E-17 2.90E-02 3.19E+02
HEICA 1.36E-39 8.88E-23 3.23E-03 7.06E+00 0 0 0 3.33E-16 1.27E-04 6.36E-04
PEA(%) 100 100 100 99 100 100 100 100 100 100
PPSO(%) 100 100 100 97 100 100 100 100 100 100
PABC(%) 100 100 85 85 - 100 100 -823 100 100
F F6 F7 F8 F9 F10
D. 10 50 2 10 50 10 50 10 50
Gen. 1000 2500 1000 1000 1500 1000 2000 1000 2000
Pop. 150 450 150 150 200 150 500 150 200
EA 3.63E+05 1.55E+45 2.09E-02 1.09E+01 3.73E+04 1.01E-02 1.12E+02 1.52E+00 6.03E+00
ICA 1.78E+01 1.20E+28 1.34E-08 1.00E+00 1.19E+02 4.79E-04 9.53E-02 2.74E-01 7.15E-01
PSO 8.18E-16 2.16E+11 3.45E-04 - - 2.61E-35 7.73E-11 4.26E-15 1.12E-03
ABC 6.32E-17 3.19E+02 6.88E-06 - - 7.26E-17 1.57E-15 6.93E-15 2.02E-07
HEICA 5.80E-37 1.11E-01 0 2.79E-39 9.06E-12 5.76E-39 8.77E-23 4.00E-15 6.51E-11
PEA(%) 100 100 100 100 100 100 100 100 100
PPSO(%) 100 100 100 - - 100 100 6 100
PABC(%) 100 100 100 - - 100 100 42 100
Fig. 1. Convergence diagram of HEICA: (a) F1 (D = 10) (b) F10 (D = 50)
Fig. 2. Best stability diagram of HEICA: (a) F1 (D = 10) (b) F10 (D = 50)
References
1. Hajimirsadeghi, H., Lucas, C.: A Hybrid IWO/PSO Algorithm for Fast and Global
Optimization. In: International IEEE Conference Devoted to 150 Anniversary of
Alexander Popov (EUROCON 2009), Saint Petersburg, Russia, pp. 1964–1971 (2009)
2. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan, Ann Arbor (1975)
3. Kennedy, J., Eberhart, R.C.: Particle Swarm Optimization. In: Proceedings of IEEE
International Conference on Neural Networks, Australia, pp. 1942–1948 (1995)
4. Eberhart, R., Kennedy, J.: A New Optimizer Using Particle Swarm Theory. In:
Proceedings of the Sixth International Symposium on Micro Machine and Human Science
(MHS 1995), pp. 39–43 (1995)
5. Dorigo, M., Maniezzo, V., Colorni, A.: Ant System: Optimization by a Colony of
Cooperating Agents. IEEE Transactions on Systems, Man and Cybernetics, 1–13 (1996)
6. Atashpaz-Gargari, E., Lucas, C.: Imperialist Competitive Algorithm: An Algorithm for
Optimization Inspired By Imperialistic Competition. In: Proceedings of the IEEE Congress
on Evolutionary Computation, Singapore, pp. 4661–4667 (2007)
7. Abdechiri, M., Meybodi, M.R.: A Hybrid Hopfield Network-Imperialist Competitive
Algorithm for Solving the SAT Problem. In: 3rd International Conference on Signal
Acquisition and Processing (ICSAP 2011), pp. 37–41 (2011)
8. Khorani, V., Razavi, F., Ghoncheh, A.: A new Hybrid Evolutionary Algorithm
Based on ICA and GA: Recursive-ICA-GA. In: The 2010 International Conference on
Artificial Intelligence, pp. 131–140 (2010)
9. Jain, T., Nigam, M.J.: Synergy of Evolutionary Algorithm and Socio-Political Process for
Global Optimization. Expert Systems with Applications 37, 3706–3713 (2010)
10. Razavi, F., Khorani, V., Ghoncheh, A., Abdollahi, H., Askarian, I.: Using Evolutionary
Imperialist Competitive Algorithm (ICA) to Coordinate Overcurrent Relays. In: The 2011
International Conference on Genetic and Evolutionary Methods, GEM 2011 (2011)
11. Karaboga, D., Basturk, B.: Artificial Bee Colony (ABC) Optimization Algorithm for
Solving Constrained Optimization Problems. In: Melin, P., Castillo, O., Aguilar, L.T.,
Kacprzyk, J., Pedrycz, W. (eds.) IFSA 2007. LNCS (LNAI), vol. 4529, pp. 789–798.
Springer, Heidelberg (2007)
12. Tang, K., et al.: Benchmark Functions for the CEC 2010 Special Session and Competition
on Large-Scale Global Optimization. s.l.: Nature Inspired Computation and Applications
Laboratory, Technical report (2010), http://nical.ustc.edu.cn/cec10ss.php
13. Wang, H., Wu, Z., Rahnamayan, S., Jiang, D.: Sequential DE Enhanced by Neighborhood
Search for Large Scale Global Optimization. In: WCCI 2010 IEEE World Congress on
Computational Intelligence, pp. 4056–4062 (2010)
14. Brest, J., Zamuda, A., Fister, I., Maucec, M.S.: Large Scale Global Optimization using
Self-adaptive Differential Evolution Algorithm. In: WCCI 2010 IEEE World Congress on
Computational Intelligence, pp. 3097–3104 (2010)
Decoding Cognitive States from Brain fMRIs:
The “Most Differentiating Voxels” Way
Abstract. Since the early 1990s, fMRI has come to dominate the brain
mapping field due to its relatively low invasiveness, absence of radiation
exposure, and relatively wide availability. It is widely used to get a 3-
D map of brain activity, with a spatial resolution of a few millimeters.
We try to employ various machine learning techniques to decode the
cognitive states of a person, based on his brain fMRIs. This is particu-
larly challenging because of the complex nature of brain and numerous
interdependencies in the brain activity. We trained multiple classifiers
for decoding cognitive states and analyzed the results. We also intro-
duced a technique for considerably reducing the large dimensions of the
fMRI data, thereby increasing the classification accuracy. We have com-
pared our results with current state-of-the-art implementations, and a
significant improvement in the performance was observed. We got 90%
accuracy, which is significantly better than the state-of-the-art imple-
mentation. We ran our algorithm on a heterogeneous dataset containing
fMRI scans from multiple persons, and still got an accuracy of 83%, which
is significant since it shows that our classifiers were able to identify some
basic, subject-independent underlying neural activity corresponding to
each cognitive state.
1 Introduction
Since the introduction of brain fMRIs, the study of the human brain and its functioning has received a tremendous boost. fMRI was significantly better than its predecessors considering its high resolution, absence of radiation exposure and relatively wide availability. A single fMRI scan gives a brain image of about 15000 voxels in size. This rich data can be used for several studies, including efforts to decode the cognitive state of a person.
In this study, we used various machine learning tools to build several classifiers that can be used to decode a subject's cognitive state in a particular time interval. Different stimuli can result in the activation of different brain regions, i.e., the
brain regions getting activated while reading a book would be different from the
regions that get activated while watching a movie. Once we learn these mappings, we can use them in several fields; for example, they can potentially be used to diagnose medical conditions like Alzheimer's disease.
2 Related Works
fMRI is a fairly new technology, developed in the 1990s, and is heavily used in medical diagnostics. Comparatively little study has been done on using fMRI scans to train artificially intelligent agents to computationally model the human brain. Almost all the major work in the field of decoding cognitive states from brain fMRIs has
been done in the past seven years, by Tom Mitchell and his colleagues at Carnegie
Mellon University [2][3][4]. He and his team collected the star-plus dataset[5],
which is widely used in studies involving fMRI data. We also used this dataset
for training/testing of our classifiers, and used Mitchell’s results as a benchmark
to evaluate our algorithm. Our work is based on the paper by Mitchell et al.[2],
in which they investigated the possibility of building computational models,
that can be used to predict whether a person is seeing a picture or reading a
sentence given his/her brain fMRI scan. They used a variety of dimensionality
reduction techniques based on some simple heuristics, and trained a series of
classifiers including Naive Gaussian Classifier, Support Vector Machines and
Nearest Neighbour classifier. Despite the high dimensionality of the data and the relatively basic classification techniques, they managed to get impressive results, considering how complex and interdependent brain functions really are. They also trained cross-subject classifiers and still got decent performance.
Apart from Mitchell et al., there have been some efforts to employ machine learning techniques on data collected using other, similar devices that record brain activity. For example, Blankertz et al. trained classifiers on EEG data
collected from a single subject [6]. Friston et al. dealt with the exact opposite problem to ours, where they tried to predict the future values of a voxel based on its previous states.
The work we present here is based on Mitchell's works [2][3][4], augmented with a new technique to reduce the dimensionality, resulting in improved accuracy.
3.1 Procedure
As explained before, the experiment consists of a set of trials, and the data was
partitioned into trials. For some of these intervals, the subject simply rested, or
gazed at a fixation point on the screen. For other trials, the subject was shown
a picture and a sentence, and instructed to press a button to indicate whether
the sentence correctly described the picture. For these trials, the sentence and
picture were presented in sequence, with the picture presented first on half of
the trials, and the sentence presented first on the other half of the trials. Forty
such trials were available for each subject. The timing within each such trial is
as follows:
– The first stimulus (sentence or picture) was presented at the beginning of the trial (1st image).
– Four seconds later (9th image) the stimulus was removed, replaced by a
blank screen.
– Four seconds later (17th image) the second stimulus was presented. This
remained on the screen for four seconds, or until the subject pressed the
mouse button, whichever came first.
– A rest period of 15 seconds (30 images) was added after the second stimulus
was removed from the screen. Thus, each trial lasted a total of approximately
27 seconds (approximately 54 images). For further information, please refer
[5].
4 Our Approach
Every classification problem has two phases: Extracting the useful features, and
training classifiers to do the actual classification. The first phase, i.e. feature selection/dimensionality reduction, has a very important role in this particular project, as we are dealing with data in extremely high dimensions (∼150000), prone to noise. Hence, the accuracy of the classifier is heavily dependent on the method one uses to reduce the dimension of the data. This is the area where our approach differs from that of T. M. Mitchell and his team. We were able to come up with a better algorithm that reduces the dimension of the data considerably, without sacrificing the separability of the data distribution.
We used a two-level mechanism to reduce the dimension. While the first step reduces the effect of noisy voxels, the second step chooses the voxels that are best suited to distinguish between sentences and pictures.
voxels would respond equally to both pictures and sentences, making them active voxels, but still a less attractive option for distinguishing pictures from sentences. So, in this step, we discard these voxels, further reducing the number of voxels selected by step 1. Here also we use a variant of t-testing. As the first step, we record the values of each voxel when a picture is shown and when a sentence is given to read. Then, for each voxel, we again create two distributions, one corresponding to its activity when a picture is shown and another corresponding to its activity when a sentence is read.
The difference between the two distributions of a voxel shows how efficient that voxel is at differentiating pictures and sentences. If the intersected area is large, the two distributions are similar, making that voxel less suitable for differentiating pictures and sentences. On the other hand, if the common area is small, the voxel exhibits a different response when seeing a picture than when reading a sentence, making it an excellent choice for classification. Now, assuming that the distributions are normal, the intersected area is directly proportional to their variances and inversely proportional to the difference of the means of the distributions. So our objective is to discard those voxels whose t values are very small, where
t = |μ1 − μ2| / (σ1 + σ2)   (1)
So, after this step, we are left with n voxels, which are best suited to differentiate
between pictures and sentences.
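As a concrete illustration of this selection step, the numpy sketch below computes the score of Equation (1) for every voxel from two sets of trials (picture versus sentence) and keeps the n highest-scoring voxels. The array shapes, the helper name `select_voxels`, and the synthetic data are our own assumptions, not the authors' code.

```python
import numpy as np

def select_voxels(picture_trials, sentence_trials, n):
    """Rank voxels by t = |mu1 - mu2| / (sigma1 + sigma2) and keep the top n.

    Both inputs have shape (num_trials, num_voxels); the returned indices
    identify the voxels best suited to separate the two conditions.
    """
    mu1, mu2 = picture_trials.mean(axis=0), sentence_trials.mean(axis=0)
    s1, s2 = picture_trials.std(axis=0), sentence_trials.std(axis=0)
    t = np.abs(mu1 - mu2) / (s1 + s2 + 1e-12)   # small epsilon avoids division by zero
    return np.argsort(t)[::-1][:n]

# Toy usage: 20 trials of each condition over 1000 synthetic voxels,
# with the first 5 voxels made genuinely discriminative.
rng = np.random.default_rng(0)
pics = rng.normal(0.0, 1.0, size=(20, 1000))
sents = rng.normal(0.0, 1.0, size=(20, 1000))
sents[:, :5] += 3.0
print(sorted(select_voxels(pics, sents, 5)))    # typically [0, 1, 2, 3, 4]
```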
4.2 Classification
To do the actual classification, we built three classifiers, Gaussian Naive Bayes (GNB), Support Vector Machine (SVM) and Nearest Neighbor (KNN) [7], and trained them using the reduced-dimension data obtained in the previous steps.
For the support vector machine we used a linear kernel, and for the k-nearest neighbor classifier we used the Euclidean distance as the distance metric with k = 1, 3, 5, 7 and 9.
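The authors' implementation was done in MATLAB (see the next section); purely as an illustration, the sketch below sets up the same three classifier families in Python with scikit-learn on synthetic stand-in data. The library choice, the data, and the cross-validation setup are our assumptions, not the paper's code.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# X: trials x selected voxels, y: 0 = picture, 1 = sentence (synthetic stand-ins here).
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 250))
y = np.repeat([0, 1], 40)
X[y == 1, :10] += 1.0                      # make a few features informative

classifiers = {
    "GNB": GaussianNB(),
    "SVM (linear kernel)": SVC(kernel="linear"),
    "1-NN": KNeighborsClassifier(n_neighbors=1, metric="euclidean"),
    "5-NN": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```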
5 Experimental Results
We implemented the system using MATLAB 7.10 and compared previously published results [2][3][4] with the results obtained by our approach. We found that our method significantly improves the accuracy over the existing methods.
Table 1. The table shows that augmenting our feature selection method improved the
accuracies of all the classifiers that we trained for
The table above compares the results obtained by Mitchell et al.'s method with the results we got after enhancing the algorithm with our new dimensionality reduction technique. From the presented data, we can easily observe that our approach gives significantly better performance than theirs.
Fig. 1. Accuracy vs Number of features: Our observations shows that limiting the
number of features to 250 gives the best results for all the classifiers
also tried to build a cross-subject classifier. The idea was to train a single model
using multiple subject’s fMRI data, and test the trained model using a new
subject’s dataset. However, there were some practical problems associated with
this approach. Since we don’t have a unified spacial coordinate system to identify
the voxels, it would be extremely difficult to identify which one of the voxels in
subject A corresponds to a given voxel in subject B. Due to the huge resolution
of the fMRI scans, it’s almost impossible to calibrate fMRI scanners to produce
scans in a unique coordinate system.
Classifier Accuracy
GNB 77.08
SVM 79.86
1NN 81.94
3NN 77.08
5NN 82.64
One workaround for this problem is to mix all the subjects' data together and then randomly partition them into testing and training sets, so that both sets hopefully contain data from all the subjects. Even though this does not fully solve the problem, since both the test and training sets contain data from all the subjects, slight spatial inconsistencies of the voxels get averaged out. This is the closest we can get to cross-subject testing. Even with this heuristic, we were able to get impressive results using our algorithm
on cross-subject datasets. Table 2 summarizes the results we observed.
6 Conclusion
In this paper, we investigated how well machine learning techniques can be used to decode cognitive states from brain fMRIs. We studied past research in the field and were able to augment the existing system with a new feature selection technique, which identifies the most relevant voxels for distinguishing the given cognitive states. We built three different classifiers (GNB, SVM, KNN) and compared the results with those of the current state-of-the-art systems. The best accuracy we got for the single-subject dataset was 90%, using SVMs. Our algorithm showed surprisingly accurate performance on the multi-subject dataset, implying that our feature selection algorithm indeed implicitly managed to identify the parts of the brain that are accountable for the given cognitive states.
Since our algorithm is quite general in nature, it can be used to differentiate between any two cognitive states. For example, it can be used on the star-plus dataset to analyze “ambiguity” in sentences. In the future, apart from detecting the cognitive state, this study can be generalized to identifying the parts of the brain that are responsible for specific cognitive states.
References
1. Functional magnetic resonance imaging,
http://en.wikipedia.org/wiki/Functional_magnetic_resonance_imaging
2. Mitchell, T.M., Hutchinson, R., Niculescu, R.S., Pereira, F., Wang, X., Just, M.,
Newman, S.: Learning to decode cognitive states from brain images. Machine Learn-
ing 57(1), 145–175 (2004)
3. Wang, X., Hutchinson, R., Mitchell, T.M.: Training fmri classifiers to detect cogni-
tive states across multiple human subjects. In: Proceedings of the 2003 Conference
on Neural Information Processing Systems, Citeseer (2003)
4. Wang, X., Mitchell, T.M., Hutchinson, R.: Using machine learning to detect cogni-
tive states across multiple subjects. CALD KDD Project Paper (2003)
5. Cmu’s starplus fmri dataset archive,
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-81/www/
6. Blankertz, B., Curio, G., Müller, K.R.: Classifying single trial EEG: Towards brain computer interfacing. Advances in Neural Information Processing Systems 14(1), 157–164 (2002)
7. Duda, R.O., Hart, P.E., Stork, D.G., et al.: Pattern classification, vol. 2. Wiley,
New York (2001)
A Semi-Supervised Method
for Discriminative Motif Finding
and Its Application to Hepatitis C Virus Study
1 Introduction
2 Background
2.1 Discriminative Motif Discovery
A protein motif is a particular amino acid sequence that is characteristic of a specific structure or function of the molecule, and thus appears more frequently than expected. The identification of such motifs from protein sequences plays an important role in controlling cellular localization, and can highlight interactive regions shared among proteins, or regions that are characteristic of a specific function or structure of the molecule [3].
Traditionally, the motif finding problem has been dominated by generative models that use only one sequence class to produce descriptive motifs of that class. Recently, many studies have focused on the discovery of discriminative motifs that can be used to distinguish sequences belonging to two different classes [1–3, 9–11]. Discriminative motifs are those that occur more frequently in one set of sequences and do not occur in another set. These motifs can help to classify a sequence into a certain class or to describe the characteristics of a class.
DMOPS is one of the motif models categorized by [2], based on counting the total number of occurrences of a motif in the sequences. Because the significant regions in a sequence are generally better preserved, our previous work focused on DMOPS motifs to detect the relationship between the HCV NS5A protein and the effect of IFN/RBV therapy. These DMOPS are promising as they reveal many patterns that were not known previously.
Given two sets of positive and negative sequences, we find a minimal set of DMOPS motifs satisfying two conditions: (1) Complete: each sequence in the positive class, denoted by Pos, contains at least one found motif; (2) Consistent: found motifs occur in sequences of the positive class but do not occur in sequences of the negative class, denoted by Neg.
A subsequence will be a DMOPS motif when it satisfies both α-coverage and
β-discriminant thresholds. Given parameters α (0 < α < 1) and β (0 < β < 1)
a subsequence P is an α-coverage for class positive if |coverPos(P)| / |Pos| ≥ α, and is a β-discriminant for class positive if |coverPos(P)| / |coverS(P)| ≥ β, where coverPos(P) is the set of sequences of class positive that contain the given subsequence P, and coverS(P) = coverPos(P) ∪ coverNeg(P).
If P is both α-coverage and β-discriminant for class positive, we will say P is α,
β-strong for class positive. Similar concepts can be defined for class negative.
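The two thresholds can be checked directly from the coverage sets. The short sketch below is our own illustration of the α-coverage and β-discriminant tests for a candidate subsequence P; sequence containment is taken as plain substring matching, which is an assumption rather than the paper's definition.

```python
def cover(sequences, p):
    """Set of indices of sequences that contain the subsequence P."""
    return {i for i, s in enumerate(sequences) if p in s}

def is_strong(p, pos, neg, alpha, beta):
    """Return True if P is alpha-coverage and beta-discriminant for the positive class."""
    cov_pos = cover(pos, p)
    cov_all = cov_pos | {len(pos) + i for i in cover(neg, p)}   # disjoint index spaces
    coverage = len(cov_pos) / len(pos)
    discriminance = len(cov_pos) / len(cov_all) if cov_all else 0.0
    return coverage >= alpha and discriminance >= beta

# Toy usage on amino-acid-like strings.
pos = ["MKTLLV", "AKTLGG", "PKTLAV"]
neg = ["MAAGGV", "PLLKKT"]
print(is_strong("KTL", pos, neg, alpha=0.6, beta=0.8))   # True: covers all positives, no negatives
```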
Because the numbers of sequences in the positive and negative classes in [9] are small, the DMOPS motifs found from the training set are not enough to separate the dataset, and hence the average accuracy achieved on the test dataset is only 68.8%. This result suggests using unlabeled sequences to search for more general DMOPS motifs that can help distinguish the two sets of positive and negative sequences and improve the predictive accuracy.
In the next section, we will present our semi-supervised learning framework and the method used to enlarge the number of sequences for a small labeled dataset.
1: MotifSet = φ
2: α, β ← Initialize(Pos, mina, minb)
3: while Pos ≠ φ & (α, β) ≠ (mina, minb) do
4:   NewMotif ← Motif(Pos, Neg, UL, α, β, γ)
5:   if NewMotif ≠ φ then
6:     Pos ← Pos \ Cover+(NewMotif)
7:     MotifSet ← MotifSet ∪ NewMotif
8:   else
9:     Reduce(α, β)
10:  end if
11:  MotifSet ← PostProcess(MotifSet)
12: end while
13: return(MotifSet)
4.2 Experiment
This experiment focuses on validating and comparing the accuracy of the DMOPS motif finding algorithm before and after enlarging the labeled dataset. Therefore we perform 3-fold cross validation five times, with the parameters set as follows:
1: U1+ = φ, U1− = φ
2: for each motif ∈ motif+ / motif− do
3:   if motif matches a sequence ∈ U then
4:     if sequence ∈ U1+ / U1− then
5:       increase rank(sequence) by 1
6:     else
7:       U1+ / U1− ← sequence
8:       rank(sequence) ← 1
9:     end if
10:  end if
11: end for
12: checkConsistency(U1+, U1−)
13: chooseHighestRank(U1+, U1−)
14: return(U1+, U1−)
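The ranking step of the listing above can be sketched as counting, for every unlabeled sequence, how many positive (or negative) motifs match it, keeping only sequences matched by motifs of a single class, and choosing those with the highest rank. The Python code below is an illustrative reconstruction with simplified consistency handling and substring matching, not the authors' implementation.

```python
def rank_candidates(unlabeled, pos_motifs, neg_motifs, top_k=5):
    """Rank unlabeled sequences by how many class motifs they contain.

    Returns two lists of (sequence, rank): candidates for the positive and the
    negative class. Sequences matched by motifs of both classes are dropped
    as inconsistent.
    """
    def ranks(motifs):
        r = {}
        for m in motifs:
            for s in unlabeled:
                if m in s:
                    r[s] = r.get(s, 0) + 1
        return r

    pos_rank, neg_rank = ranks(pos_motifs), ranks(neg_motifs)
    consistent_pos = {s: r for s, r in pos_rank.items() if s not in neg_rank}
    consistent_neg = {s: r for s, r in neg_rank.items() if s not in pos_rank}
    best = lambda d: sorted(d.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    return best(consistent_pos), best(consistent_neg)

# Toy usage.
u = ["MKTLLV", "PLLKKT", "AKTLGG"]
print(rank_candidates(u, pos_motifs=["KTL"], neg_motifs=["KKT"], top_k=2))
```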
5 Conclusion
We have explored the use of self-training-based semi-supervised learning to enlarge the training set for the discriminative motif finding problem when the amount of labeled data is small. The proposed method works in an iterative procedure to choose the best-matching sequence candidates among the unlabeled sequences. Experimental results show that with more data in the training set, the DMOPS motif finding algorithm can obtain higher accuracy.
In this work, our method was developed under the assumption that sequences of the same class contain the same discriminative motif. However, in many cases a sequence may not contain a discriminative motif and yet still belong to the class of that motif. There is reason to think that such a sequence is similar in some respect to the sequences containing the discriminative motif, for example in terms of the genetic distance among sequences.
References
[1] Lin, T., Murphy, R.F., Bar-Joseph, Z.: Discriminative motif finding for predict-
ing protein subcellular localization. IEEE/ACM Transactions on Computational
Biology and Bioinformatics 8(2) (2011)
[2] Kim, J.K., Choi, S.: Probabilistic models for semi-supervised discriminative motif
discovery in DNA sequences. IEEE/ACM Transactions on Computational Biology
and Bioinformatics 8(5) (2011)
[3] Vens, C., Rosso, M.N., Danchin, E.G.J.: Identifying discriminative classifcation-
based motifs in biological sequences. Bioinformatics 27(9), 1231–1238 (2011)
[4] Gao, M., Nettles, R.E., et al.: Chemical genetics strategy identifies an HCV NS5A
inhibitor with a potent clinical effect. Nature 465, 953–960 (2010)
[5] Guilou-Guillemette, H.L., Vallet, S., Gaudy-Graffin, C., Payan, C., Pivert, A.,
Goudeau, A., Lunel-Fabiani, F.: Genetic diversity of the hepatitis C virus: Impact
and issues in the antiviral therapy. World Journal of Gastroenterology 13(17),
2416–2426 (2007)
[6] Los Alamos National Laboratory, http://hcv.lanl.gov/
Robert Burduk
1 Introduction
The problem of cost-sensitive classification is broadly discussed in literature. The
major costs are the costs of feature measurements (costs of tests) and the costs of
classification errors. The costs of feature measurements are discussed in [10], [11],
[12], [14] other works included the cost of classification errors [1], [8]. In a more
realistic setup, there are good reasons for considering both the costs of feature
measurements and the costs of classification errors. For example, there should
be a balance between the cost of measuring each feature and the contribution of
the test to accurate classification. It often happens that the benefits of further
classification are not worth the costs of feature measurements. This means that
a cost must be assigned to both the tests and the classification errors. In the
works [3], [17], [13], [15] both feature costs and misclassification costs are taken
into account.
The class of fuzzy-valued loss functions is definitely much wider than the class of real-valued ones [6]. This fact reflects the richness of the fuzzy expected loss approach in describing the consequences of incorrect classification, in contrast with the real-valued approach. For this reason, several studies have previously developed decision problems in which the values assigned to the consequences of decisions are assumed to be fuzzy [4], [7], [16].
The synthesis of the hierarchical classifier is a complex problem. It involves
specifying the following components:
– decision logic, i.e. hierarchical ordering of classes,
– feature used at each stage of decision,
– decision rules (strategy) of performing the classification.
Let us
xi ∈ Xi ⊆ R^di,  di ≤ d,  i ∈ M,   (3)
denote the vector of features used at the i-th node, which have been selected from the vector x.
Ψi : Xi → Mi,   i ∈ M.   (4)
Formula (4) is a decision rule (recognition algorithm) used at the i-th node, which maps the observation subspace to the set of immediate descendant nodes of the i-th node. Equivalently, decision rule (4) partitions the observation subspace Xi into disjoint decision regions D^k_{xi}, k ∈ Mi, so that observation xi is allocated to node k if xi ∈ D^k_{xi}, namely:
Let L(i_N, j_N) denote the loss incurred when the object of class j_N is assigned
to class i_N (i_N, j_N ∈ M(N)). Our aim is to minimize the mean risk, that is, the
mean value of the loss function:

R*(π_N*) = min_{Ψ_{i_n}, Ψ_{i_{N-1}}} R(π_N) = min_{Ψ_{i_n}, Ψ_{i_{N-1}}} E[L(I_N, J_N)].   (6)

We will refer to the π_N* strategy as the globally optimal N-stage recognition strategy.
L(i_N, j_N) = I(i_N ≠ j_N).   (7)

This loss function assigns a loss equal to 0 for correct classification and equal to 1
for incorrect classification.
By putting (7) into (6), we obtain the decision rules for the two-stage decision
tree:
Ψ*_{i_n}(x_{i_n}) = i_{n+1},

Σ_{j_N ∈ M_{i_{n+1}}} q*(j_N / i_{n+1}, j_N) p(j_N) f_{j_N}(x_{i_n}) =
    max_{k ∈ M_{i_n}} Σ_{j_N ∈ M_k} q*(j_N / k, j_N) p(j_N) f_{j_N}(x_{i_n}),   (8)

where q*(j_N / i_{n+1}, j_N) denotes the probability of accurate classification of the
object of class j_N at the second stage using the π_N* strategy rules, on condition that
at the first stage the decision i_{n+1} has been made.
where w is the first common predecessor of nodes iN and jN . The fuzzy loss
function defined as above means that the loss depends on the node of the decision
tree at which misclassification has been made. The interpretation of this loss
function is presented in Fig. 1.
Fig. 1. Two-stage decision tree with fuzzy losses L̃0, L̃5 and L̃6 assigned to the root node 0 and internal nodes 5 and 6; terminal nodes 1–4
By putting (9) into (6), we obtain the optimal (Bayes) strategy whose decision
rules at the first stage are as follows:
4 Cost Model
In the costs model we specify two costs. There are the feature acquisition costs
and the misclassification costs. We assume that the feature acquisition cost for
each internal node is known. It means that i-th node has associated feature
acquisition costs F AC(i). In this case, each feature has an independent cost and
the cost of a set of features is just an additive cost.
Each path s(k), k ∈ M represents a sequence of ordered feature values and
the final classification. Each path has an associated feature acquisition cost. It
can be computed as follows:
FAC(s(k)) = Σ_{i ∈ s(k), i ∈ M} FAC(i).   (11)
Now we present the misclassification cost. Each path has an associated expected
misclassification cost. It can be computed as follows:
EMC(s(k)) = Σ_{i_N ∈ M(N)} L(i_N, k) Π_{i_n ∈ s(k)−{0}} q(i_n / m_{i_n}, k).   (12)
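To make the cost model concrete, the following minimal Python sketch uses purely hypothetical node labels, feature costs and an assumed EMC value (none of them taken from the paper) to accumulate the feature acquisition cost of a path as in (11) and add an expected misclassification cost.
--------------------------------------------------------------
# Minimal sketch of the path-cost model; all numbers are hypothetical.
feature_cost = {0: 1.0, 5: 0.5, 6: 0.8}   # FAC(i) for the internal nodes 0, 5 and 6

def path_feature_cost(path):
    """FAC(s(k)): sum of feature acquisition costs over the internal nodes of a path."""
    return sum(feature_cost[i] for i in path if i in feature_cost)

def path_total_cost(path, expected_misclassification_cost):
    """Total cost of a path: FAC(s(k)) plus its expected misclassification cost EMC(s(k))."""
    return path_feature_cost(path) + expected_misclassification_cost

# Path 0 -> 6 -> 3 ending in class 3, with an assumed EMC of 0.35.
print(path_total_cost([0, 6, 3], expected_misclassification_cost=0.35))   # 1.0 + 0.8 + 0.35
--------------------------------------------------------------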
The total cost of a path is the sum of the feature acquisition cost and the
expected misclassification cost:
For the node-dependent loss function we can assume that the loss function in the
root node (first stage of the decision tree) is larger than the loss in each node
of the second stage, i.e. L̃(r) = L̃(m_k) + a(m_k), where a(m_k) represents the additional
value of loss in node m_k. Such an assumption is quite natural; it means that an error
made at an earlier stage of the classification is more severe. Under this assumption the
expected misclassification cost for the node-dependent fuzzy loss function can be presented
as follows:
(2.98, ∞) for λ = 0 and λ = 1, respectively. For the present example the difference
between the expected total cost for the zero-one and the node-dependent loss
function (18) is equal to 0.23 and 0.22 for λ = 0 and λ = 1, respectively. For these
calculations the center of gravity defuzzification method was used [9].
5 Conclusion
In this paper, we have concentrated on the costs of the two-stage binary hier-
archical classifier. In this study we assume that the decision tree is known, that
is, the work does not generate its structure. The study considered two types
of costs, the feature acquisition costs and the misclassification costs. For the
zero-one and the node-dependent fuzzy loss function the expected total cost of
the globally optimal strategy was presented. The work focuses on the difference
between the expected total cost for these two cases of the loss function.
In future work we can consider various defuzzification and fuzzy number
comparison methods in order to obtain the final result for the two discussed
types of loss functions.
References
1. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and regression
trees. Wadsworth, California (1984)
2. Burduk, R.: Classification error in Bayes multistage recognition task with fuzzy
observations. Pattern Analysis and Applications 13(1), 85–91 (2010)
3. Burduk, R.: Costs-Sensitive Classification in Multistage Classifier with Fuzzy Ob-
servations of Object Features. In: Corchado, E., Kurzyński, M., Woźniak, M. (eds.)
HAIS 2011, Part II. LNCS (LNAI), vol. 6679, pp. 245–252. Springer, Heidelberg
(2011)
4. Burduk, R., Kurzynski, M.: Two-stage binary classifier with fuzzy-valued loss func-
tion. Pattern Analysis and Applications 9(4), 353–358 (2006)
5. Campos, L., González, A.: A Subjective Approach for Ranking Fuzzy Numbers.
Fuzzy Sets and Systems 29, 145–153 (1989)
6. Domingos, P.: MetaCost: A General Method for Making Classifiers Cost-Sensitive.
In: Proceedings of the Fifth International Conference on Knowledge Discovery and
Data Mining (KDD 1999), pp. 155–164 (1999)
7. Jain, R.: Decision-Making in the Presence of Fuzzy Variables. IEEE Trans. Systems
Man and Cybernetics 6, 698–703 (1976)
8. Knoll, U., Nakhaeizadeh, G., Tausend, B.: Cost-Sensitive Pruning of Decision Trees.
In: Bergadano, F., De Raedt, L. (eds.) ECML 1994. LNCS, vol. 784, pp. 383–386.
Springer, Heidelberg (1994)
9. Van Leekwijck, W., Kerre, E.: Defuzzification: criteria and classification. Fuzzy
Sets and Systems 108(2), 159–178 (1999)
10. Núñez, M.: The use of background knowledge in decision tree induction. Machine
Learning 6(3), 231–250 (1991)
11. Penar, W., Woźniak, M.: Experiments on classifiers obtained via decision tree in-
duction methods with different attribute acquisition cost limit. Advances in Soft
Computing 45, 371–377 (2007)
12. Penar, W., Woźniak, M.: Cost-sensitive methods of constructing hierarchical clas-
sifiers. Expert Systems 27(3), 146–155 (2010)
13. Saar-Tsechansky, M., Melville, P., Provost, F.: Active feature-value acquisition.
Management Science 55(4), 664–684 (2009)
14. Tan, M.: Cost-sensitive learning of classification knowledge and its applications in
robotics. Machine Learning 13, 7–33 (1993)
15. Turney, P.: Cost-sensitive classification: Empirical evaluation of a hybrid genetic
decision tree induction algorithm. Journal of Artificial Intelligence Research 2,
369–409 (1995)
16. Viertl, R.: Statistical Methods for Non-Precise Data. CRC Press, Boca Raton
(1996)
17. Yang, Q., Ling, C., Chai, X., Pan, R.: Test-cost sensitive classification on data
with missing values. IEEE Transactions on Knowledge and Data Engineering 18(5),
626–638 (2006)
Investigation of Rotation Forest Ensemble Method
Using Genetic Fuzzy Systems for a Regression Problem
Abstract. The rotation forest ensemble method using a genetic fuzzy rule-based
system as a base learning algorithm was developed in the Matlab environment. The
method was applied to the real-world regression problem of predicting the
prices of residential premises based on historical data of sales/purchase
transactions. Computationally intensive experiments were conducted, aimed
at comparing the accuracy of ensembles generated by our proposed method with
bagging, repeated holdout, and repeated cross-validation models. The statistical
analysis of results was made employing nonparametric Friedman and Wilcoxon
statistical tests.
1 Introduction
Ensemble machine learning models have been attracting the attention of many researchers
due to their ability to reduce bias and/or variance when compared to their single model
counterparts. The ensemble learning methods combine the output of machine learning
systems, called "weak learners" in the literature due to their performance, in order to get
smaller prediction errors (in regression) or lower error rates (in classification). The
individual estimator must provide different patterns of generalization, thus the
diversity plays a crucial role in the training process. To the most popular methods
belong bagging [1], boosting [19], and stacking [20]. In this paper we focus on
bagging family of techniques.
Bagging, which stands for bootstrap aggregating, devised by Breiman [1] is one of
the most intuitive and simplest ensemble algorithms providing good performance.
Diversity of learners is obtained by using bootstrapped replicas of the training data.
That is, different training data subsets are randomly drawn with replacement from the
original training set. The training data subsets obtained in this way, also called bags, are then used
parallel to the feature axes, even a small rotation of the axes may lead to a very
different tree [15]. In this paper we explore the rotation forest technique using
a genetic fuzzy system as a base learning method applied to the real-world regression
problem of predicting the prices of residential premises based on historical data of
sales/purchase transactions obtained from a cadastral system. To the best of our
knowledge there are not yet any published results of ensemble models created using
rotation forest with genetic fuzzy systems, probably due to their considerable
computational load. The research was conducted with our newly developed
experimental system in Matlab environment to test multiple models using different
resampling methods. So far, we have investigated genetic fuzzy systems and genetic
fuzzy networks applied to construct regression ensemble models to assist with real
estate appraisal using our system [11], [16], [17].
Our study consisted in the application of the rotation forest (RF) method using
a genetic fuzzy system (GFS) as a base learning algorithm to a real-world regression
problem of predicting the prices of residential premises based on historical data of
sales/purchase transactions obtained from a cadastral system. We compared in terms
of accuracy the models produced by RF with the ones generated employing other
resampling techniques such as repeated bootstrap, i.e. bagging (BA), repeated holdout
(HO), and repeated cross-validation (CV).
In our Matlab experimental system we developed a data driven fuzzy system of
Mamdani type, where for each input and output variables triangular and trapezoidal
membership functions were automatically determined by the symmetric division of
the individual attribute domains. The evolutionary optimization process utilized
a genetic algorithm of Pittsburgh type and combined both learning the rule base and
tuning the membership functions using real-coded chromosomes. Similar designs are
described in [5], [6], [14]. The rotation forest method was implemented in the Matlab
environment based on ideas described by Rodríguez et al. [18] and Zhang et al. [23].
The pseudo code of our rotation forest ensemble method employing the genetic fuzzy
system as a base learning algorithm is presented in Fig. 1.
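The construction of the rotation matrix used in the training phase can be sketched as follows in Python/NumPy. This is only an illustrative reading of the procedure detailed in Fig. 1 (random split of the feature set, PCA on a bootstrap sample of each subset, block-diagonal assembly), not the authors' Matlab implementation, and the function and parameter names are assumptions.
--------------------------------------------------------------
import numpy as np

def rotation_matrix(X, M, bootstrap_fraction=0.9, seed=None):
    """Build one rotation matrix R_i^a for a training matrix X of shape (N, n);
    M is the number of attributes per feature subset (a sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    perm = rng.permutation(n)                                  # random split of the feature set F
    subsets = [perm[j:j + M] for j in range(0, n, M)]
    R = np.zeros((n, n))
    for cols in subsets:
        Xij = X[:, cols]
        size = max(len(cols), int(bootstrap_fraction * Xij.shape[0]))
        rows = rng.choice(Xij.shape[0], size=size, replace=True)   # bootstrap sample X'_ij
        Xb = Xij[rows] - Xij[rows].mean(axis=0)
        _, _, Vt = np.linalg.svd(Xb, full_matrices=False)      # PCA via SVD
        D = Vt.T                                               # columns = principal component coefficients
        # Placing D at the original column positions builds the block-diagonal matrix
        # already rearranged to the attribute order of F.
        R[np.ix_(cols, cols)] = D
    return R

# Usage: rotate the training data and fit any base learner (a GFS in the paper).
# X_rotated = X @ rotation_matrix(X, M=2)
--------------------------------------------------------------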
3 Plan of Experiments
Four following approaches to create ensemble models using genetic fuzzy systems as
a base learning algorithm were employed in our experiments: rotation forest (RF),
bagging (BA), repeated holdout (HO), and repeated cross-validation (CV). In each
case an ensemble comprised 50 component models. They were tested with following
parameters:
RF: M=2, 3, and 4 – rotation forest with three different numbers of input attributes
in individual feature subset, as described in Section 2.
RF: B90, B70, B50 – rotation forest with three different sizes of bootstrap samples
drawn from Xij, equal to 90%, 70%, and 50%, respectively, as described in Section 2.
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Given:
• GFS: genetic fuzzy system used as a base learning algorithm
• L: number of fuzzy models (fuzzy inference systems - FIS) that make up an ensemble
• n: number of input attributes in a base training data set
• N: number of instances in a base training data set
• X: N × n matrix of input attribute values for individual instances
• Y: N × 1 vector of output attribute values
• T=(X, Y): base training dataset as the concatenation of X and Y
• F: feature set, Fij – j-th attribute subset used for training i-th FISi
• M: number of input attributes in individual feature subset
• Xij: data corresponding to Fij selected from matrix X
• X′ij: bootstrap sample drawn from Xij
• Dij: M × M matrix to store the coefficients of principal components computed by PCA
• Ri: block diagonal matrix built of Dij matrices
• R_i^a: rotation matrix used to obtain the training set for the i-th FIS
Training Phase
For i = 1, 2, …, L
• Calculate rotation matrix R_i^a for the i-th FIS
1. Randomly split F into K subsets Fij (j=1,..K) for M attributes each (last subset may
contain less than M attributes)
2. For j=1,2,…,K
a. Select columns of X that correspond to the attributes of Fij to compose a
new matrix Xij
b. Draw a bootstrap sample X′ij from Xij, with sample size smaller than
that of Xij, e.g. with size (B) equal to 50%, 70%, or 90% of Xij
c. Apply PCA to X′ij to obtain a matrix Dij whose p-th column comprises
the coefficients of the p-th principal component
3. EndFor
4. Build a block diagonal matrix Ri of matrices Dij (j=1,2,…,K)
5. Construct the resulting rotation matrix R_i^a by rearranging rows of Ri to match the
order of attributes in F
• Compute (X R_i^a, Y) and use it as the input of GFS (training dataset) to obtain the i-th FISi
EndFor
Predicting Phase
• For any instance xt from the test dataset, let FISi(xt R_i^a) be the value predicted by the i-th
fuzzy model; then the predicted output value ytp can be computed as

    ytp = (1/L) Σ_{i=1}^{L} FISi(xt R_i^a)
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Fig. 1. Pseudo code of rotation forest ensemble method employing the genetic fuzzy system
BA: B100, B80 – m-out-of-n bagging with replacement with different sizes of
samples. The numbers in the codes indicate what percentage of the training set was
drawn.
HO: H100, H80 – repeated holdout, i.e., m-out-of-n bagging without replacement
with different sizes of samples. The numbers in the codes indicate what percentage of
the training set was drawn.
CV: 5x10cv, 10x5cv – repeated cross-validation, in which 10-fold and 5-fold cross-
validation splits were repeated 5 and 10 times, respectively.
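The only sampling difference between the bagging and repeated holdout variants listed above is drawing with or without replacement, which the short Python sketch below illustrates (the sizes are arbitrary examples):
--------------------------------------------------------------
import numpy as np

def draw_sample(n_instances, fraction, with_replacement, rng):
    """Indices of one training sample: bagging draws with replacement, holdout without."""
    m = int(fraction * n_instances)
    return rng.choice(n_instances, size=m, replace=with_replacement)

rng = np.random.default_rng(0)
b80 = draw_sample(1000, 0.8, True, rng)    # B80: m-out-of-n bagging with replacement
h80 = draw_sample(1000, 0.8, False, rng)   # H80: repeated holdout, no replacement
--------------------------------------------------------------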
Real-world data used in the experiments was drawn from an unrefined dataset
containing above 50 000 records referring to residential premises transactions
accomplished in one big Polish city with a population of 640 000 within eleven
years from 1998 to 2008. In this period most transactions were made at non-market
prices when the council was selling flats to their current tenants on preferential terms.
First of all, transactional records referring to residential premises sold at market prices
were selected. Then, the dataset was confined to sales transaction data of apartments
built before 1997 and where the land was leased on terms of perpetual usufruct.
Hence, the final dataset counted 5303 records. The following five attributes were pointed
out as price drivers by professional appraisers: usable area of a flat (Area), age of the
building (Age), number of storeys in the building (Storeys), number of
rooms in the flat including a kitchen (Rooms), and the distance of the building from the
city centre (Centre); the price of premises (Price), in turn, was the output variable.
Due to the fact that the prices of premises change substantially in the course of
time, the whole 11-year dataset cannot be used to create data-driven models. In order
to obtain comparable prices it was split into subsets covering individual years. Then
the prices of premises were updated according to the trends of the value changes over
11 years. Starting from the beginning of 1998 the prices were updated for the last day
of subsequent years. The trends were modelled by polynomials of degree three. The
chart illustrating the change trend of average transactional prices per square metre is
given in Fig. 2. We might assume that one-year datasets differed from each other and
might constitute different observation points to compare the accuracy of ensemble
models in our study. The sizes of one-year datasets are given in Table 1.
Fig. 2. Change trend of average transactional prices per square metre over time
Table 1. Sizes of one-year datasets
Year  1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008
Size   269  477  329  463  530  653  546  580  677  575  204
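The price-updating step can be illustrated with a small NumPy sketch. The time stamps, prices and trend below are invented for the example; the paper only states that the trends were modelled by polynomials of degree three.
--------------------------------------------------------------
import numpy as np

# Hypothetical time stamps (years since 1998) and average prices per square metre.
t = np.linspace(0, 11, 133)
avg_price = 1500 + 30 * t + 8 * t ** 2 + np.random.default_rng(1).normal(0, 50, t.size)

trend = np.polynomial.Polynomial.fit(t, avg_price, deg=3)   # degree-3 trend of value changes

def update_price(price, t_sale, t_ref):
    """Scale a transaction price from its sale date to a common reference date using the trend."""
    return price * trend(t_ref) / trend(t_sale)

print(update_price(price=120000.0, t_sale=2.5, t_ref=3.0))
--------------------------------------------------------------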
As a performance function the root mean square error (RMSE) was used, and as
aggregation functions of ensembles arithmetic averages were employed. Each input
and output attribute in individual dataset was normalized using the min-max
approach. The parameters of the architecture of fuzzy systems as well as genetic
algorithms are listed in Table 2.
In order to ensure the same experimental conditions for all ensemble learning
methods, before employing a given method each one-year dataset was split into
training and test sets in the proportion of 80% to 20% instances using stratified
approach. Namely, each dataset was divided into several clusters with k-means
algorithm. The number of clusters was determined experimentally according to the
best value of Dunn’s index. Then, 80% of instances from each cluster were randomly
selected to the training set and remaining 20% went to the test set. After that, each
learning method was applied to the training set and 50 models were generated. The
performance of each model was determined using test set. And finally, the average
value of accuracy measure over all component models constituted the resulting
indicator of the performance of a given method over the one-year dataset. A schema
illustrating the flow of experiments is shown in Fig. 3. In turn, having averaged values
of RMSE for all one-year datasets we were able to make non-parametric Friedman
and Wilcoxon statistical tests to compare the performance of individual methods with
different parameters as well as all considered methods.
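A minimal Python/scikit-learn sketch of the stratified split described above is given below. The number of clusters is fixed here instead of being selected by Dunn's index, and the code is an illustration rather than the authors' Matlab implementation.
--------------------------------------------------------------
import numpy as np
from sklearn.cluster import KMeans

def stratified_split(X, n_clusters=5, train_fraction=0.8, seed=0):
    """Split every k-means cluster 80/20 into training and test indices."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        rng.shuffle(members)
        cut = int(train_fraction * len(members))
        train_idx.extend(members[:cut])          # 80% of the cluster goes to the training set
        test_idx.extend(members[cut:])           # the remaining 20% goes to the test set
    return np.array(train_idx), np.array(test_idx)
--------------------------------------------------------------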
Fig. 3. Outline of experiment with training and test sets created using stratified approach
4 Results of Experiments
The performance of RF for different values of parameters M and B in terms of RMSE
is illustrated graphically in Figures 4 and 5 respectively. In turn, the results provided
by the BA, HO, and CV models created using genetic fuzzy systems (GFS) are shown
in Figures 6, 7, and 8 respectively. The comparison of the best selected variants of
individual methods is given in Fig. 9. The statistical analysis of the results was carried
out with non-parametric Friedman and Wilcoxon tests performed in respect of RMSE
values of all models built over 11 one-year datasets. The tests were made using
Statistica package. Average ranks of individual models provided by Friedman tests
are shown in Table 3, where the lower the rank value, the better the model, and asterisks
indicate statistical significance at the level α=0.05.
Fig. 4. Performance of Rotation Forest ensembles with different values of parameter M and B70
Fig. 5. Performance of Rotation Forest ensembles with different values of B and M=2
Fig. 6. Performance comparison of Repeated Holdout ensembles for H100 and H80
Fig. 9. Performance comparison of the best of RF, BA, HO, and CV ensembles
Table 3. Average rank positions of models determined during Friedman test (* significant)
With our specific datasets and experimental setup the best results were produced by RF
with M=2 and B90; however, the differences in accuracy among RF with different
parameters were statistically insignificant. For the final comparison, ensembles created
using RF with M=2 and B90 were chosen because parameter M=2 provides the
highest diversity of component models and B90 ensured the best accuracy when
compared to HO and BA. As for the individual methods, H100 and CV10x5 revealed
significantly better performance than H80 and CV5x10, respectively. No significant
differences were observed between B100 and B80.
The final comparison comprised the best selected ensembles produced by
individual methods, namely RF(M=2,B90), BA80, H100, and CV10x5. Friedman test
indicated significant differences in performance among the methods. According to
the Wilcoxon test, CV significantly outperformed all other techniques. RF, BA, and HO
performed equivalently, but among them RF gained the best rank value. When
each one-year data point was considered separately, RF revealed better performance than
BA and HO over 5 one-year datasets, and for the dataset of 2005 it outperformed all
other methods.
References
1. Breiman, L.: Bagging Predictors. Machine Learning 24(2), 123–140 (1996)
2. Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001)
3. Bryll, R.: Attribute bagging: improving accuracy of classifier ensembles by using random
feature subsets. Pattern Recognition 20(6), 1291–1302 (2003)
4. Bühlmann, P., Yu, B.: Analyzing bagging. Annals of Statistics 30, 927–961 (2002)
5. Cordón, O., Gomide, F., Herrera, F., Hoffmann, F., Magdalena, L.: Ten years of genetic
fuzzy systems: current framework and new trends. Fuzzy Sets and Systems 141, 5–31
(2004)
6. Cordón, O., Herrera, F.: A Two-Stage Evolutionary Process for Designing TSK Fuzzy
Rule-Based Systems. IEEE Trans. Sys., Man, and Cyb.-Part B 29(6), 703–715 (1999)
7. Fumera, G., Roli, F., Serrau, A.: A theoretical analysis of bagging as a linear combination
of classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(7),
1293–1299 (2008)
8. Gashler, M., Giraud-Carrier, C., Martinez, T.: Decision Tree Ensemble: Small
Heterogeneous Is Better Than Large Homogeneous. In: Seventh International Conference
on Machine Learning and Applications, ICMLA 2008, pp. 900–905 (2008)
9. Ho, T.K.: The Random Subspace Method for Constructing Decision Forests. IEEE
Transactions on Pattern Analysis and Machine Intelligence 20(8), 832–844 (1998)
10. Jędrzejowicz, J., Jędrzejowicz, P.: Rotation Forest with GEP-Induced Expression Trees.
In: O’Shea, J., Nguyen, N.T., Crockett, K., Howlett, R.J., Jain, L.C. (eds.) KES-AMSTA
2011. LNCS (LNAI), vol. 6682, pp. 495–503. Springer, Heidelberg (2011)
11. Kempa, O., Lasota, T., Telec, Z., Trawiński, B.: Investigation of Bagging Ensembles of
Genetic Neural Networks and Fuzzy Systems for Real Estate Appraisal. In: Nguyen, N.T.,
Kim, C.-G., Janiak, A. (eds.) ACIIDS 2011, Part II. LNAI (LNCS), vol. 6592, pp. 323–332.
Springer, Heidelberg (2011)
12. Kotsiantis, S.: Combining bagging, boosting, rotation forest and random subspace
methods. Artificial Intelligence Review 35(3), 223–240 (2011)
13. Kotsiantis, S.B., Pintelas, P.E.: Local Rotation Forest of Decision Stumps for Regression
Problems. In: 2nd IEEE International Conference on Computer Science and Information
Technology, ICCSIT 2009, pp. 170–174 (2009)
14. Król, D., Lasota, T., Trawiński, B., Trawiński, K.: Investigation of Evolutionary
Optimization Methods of TSK Fuzzy Model for Real Estate Appraisal. International
Journal of Hybrid Intelligent Systems 5(3), 111–128 (2008)
15. Kuncheva, L.I., Rodríguez, J.J.: An Experimental Study on Rotation Forest Ensembles. In:
Haindl, M., Kittler, J., Roli, F. (eds.) MCS 2007. LNCS, vol. 4472, pp. 459–468. Springer,
Heidelberg (2007)
16. Lasota, T., Telec, Z., Trawiński, G., Trawiński, B.: Empirical Comparison of Resampling
Methods Using Genetic Fuzzy Systems for a Regression Problem. In: Yin, H., Wang, W.,
Rayward-Smith, V. (eds.) IDEAL 2011. LNCS, vol. 6936, pp. 17–24. Springer, Heidelberg
(2011)
17. Lasota, T., Telec, Z., Trawiński, G., Trawiński, B.: Empirical Comparison of Resampling
Methods Using Genetic Neural Networks for a Regression Problem. In: Corchado, E.,
Kurzyński, M., Woźniak, M. (eds.) HAIS 2011, Part II. LNCS, vol. 6679, pp. 213–220.
Springer, Heidelberg (2011)
18. Rodríguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: A new classifier ensemble
method. IEEE Trans. on Pattern Analysis and Mach. Intel. 28(10), 1619–1630 (2006)
19. Schapire, R.E.: The strength of weak learnability. Mach. Learning 5(2), 197–227 (1990)
20. Wolpert, D.H.: Stacked Generalization. Neural Networks 5(2), 241–259 (1992)
21. Zhang, C.-X., Zhang, J.-S.: A variant of Rotation Forest for constructing ensemble
classifiers. Pattern Analysis & Applications 13(1), 59–77 (2010)
22. Zhang, C.-X., Zhang, J.-S.: RotBoost: A technique for combining Rotation Forest and
AdaBoost. Pattern Recognition Letters 29(10), 1524–1536 (2008)
23. Zhang, C.-X., Zhang, J.-S., Wang, G.-W.: An empirical study of using Rotation Forest to
improve regressors. Applied Mathematics and Computation 195(2), 618–629 (2008)
Data with Shifting Concept Classification
Using Simulated Recurrence
1 Introduction
Concept shift is a hidden change in the data model, which may be caused, e.g., by a lack
of knowledge about the characteristics of data distributions, and which may influence the
classification accuracy [14]. Let's consider the classification task of naming given 3-D
objects on the basis of their 2-D images captured from various perspectives. In such
scenario the typical classification system might make wrong decisions e.g., if a 3-D
cylinder is photographed from different angles, then it can produce different 2-D
images, e.g. once being represented by a circle and once by a rectangle. These 2-D
figures may also indicate other 3-D objects (a circle may be an image of a sphere or a
cone and a rectangle - of a cube or a cuboid). This problem is called "concept drift"
and is often categorized as one of the two types: gradual and sudden (also called:
concept drift and concept shift, respectively).
Gradual concept drift appears in the data stream as a sequence of minor changes in the
hidden features of data, which follow a certain trend. E.g., the data stream is a set of
the 2-D images of a certain 3-D object, which starts to rotate. The classification
system decides whether the following 2-D images are representing a new 3-D object
or they are still the pictures of the same object, but taken from different perspectives.
By analyzing a group (called "window") of samples, the classification system may
decide that the incoming 2-D images do not indicate that the classified 3-D object has
been replaced by a new one. The fact that the changes are minor and follow an
organized trend makes the classification task relatively easy by the means of detection
and adaptation. When the class changes (i.e., the classified 3-D object is instantly
replaced with a different one), then the stream of the 2-D images changes more
drastically, therefore the gradual hidden changes in the feature vectors describing the
consequent data samples are unlikely to be mistaken with the actual class change.
Let us consider the same example, but with the sudden concept drift. As the concept
shift usually does not follow an organized trend or sustain stable impetuosity, the
changes in the rotation angle of the 3-D object are more abrupt and less organized,
causing the following 2-D images to differ more from each other. Also, if the order of
the 3-D objects presented to the classification system is chaotic (the classes change
randomly), then the constantly changing 2-D images may indicate either a change of
the class or a hidden change in the data characteristics. The classification system then
has to decide whether the incoming 2-D images are representing a new 3-D object or
the same object, however seen from a different perspective.
Popular approaches to classify data with concept shift may be described in general
by the following procedure [6]:
Fig. 1. Pseudo-code of the algorithm classifying data stream with concept shift
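The content of Fig. 1 is not reproduced in the text. A minimal Python sketch of the generic procedure it refers to, following the usual scheme of detecting a shift and then retraining the classifier, could look as follows; the detector and learner interfaces are hypothetical.
--------------------------------------------------------------
def classify_stream(windows, train, detect_shift, initial_data):
    """Generic concept-shift handling: retrain the classifier whenever a shift is detected."""
    classifier = train(initial_data)
    for window in windows:
        if detect_shift(classifier, window):      # 1. a concept shift is detected
            classifier = train(window)            # 2. a new classifier is trained on the new
                                                  #    concept (requires its true class labels)
        yield [classifier(x) for x in window]     # 3. the incoming samples are classified
--------------------------------------------------------------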
Recurring concept shift is a special case of concept shift. In this model the previous
concepts may reappear in the data stream. In the example with 2-D images of 3-D
objects it would mean that the rotation angles of some 3-D objects could repeat.
Recurring concept shift was introduced by Widmer and Kubat [19]. The authors proposed
to deal with this phenomenon on the basis of a repository of known concepts with
the corresponding classifiers, which are kept in memory to be used for the
classification of future data if the concept recurs [12][5][9].
The pseudo-code of algorithm which deals with the recurring concept shift is
presented in Fig.2.
Fig. 2. Pseudo-code of classification algorithms for the recurring concept shift data streams
The only difference between the procedure presented above and the one described
in Fig.1 is in point 2, in which the classification system reuses the already trained
classifier to classify the new data after the shift.
An important point in both procedures is the first step, which requires a drift
detection mechanism. Most algorithms described in literature detect the concept drift
in data stream on the basis of classification error-rate fluctuations [15][10], which can
only be acquired when the knowledge of the true class labels of data samples is
available. Such an assumption does not adhere to the definition of intelligent
classification systems, which should be self-sufficient and as independent from the
knowledge of the data labels as possible. Although the drift detection mechanism is
important for the classification systems, it can be considered as a separate research
area and it is not implemented in the presented solution.
2 Simulated Recurrence
Recurrence of concept allows the classification system to perform the classification
with the use of the classifiers previously trained on the recurring concept, lowering the
costs of wrong decisions and minimizing the need for access to true class labels, which
is normally required to train a new classifier with the new concept data. We aim to
implement this procedure in the non-recurring concept shift scenario by simulating the
recurrence of the concepts.
Assuming that the degree of the concept shift is limited i.e., changes of expected
values of conditional probability density functions are finite in size, the system trains
the classifiers on the data artificially generated from each simulated concept within the
allowed concepts area and keeps them stored in the classifier repository for later use.
The classifiers are grouped into ensembles, from which the most appropriate one is
chosen to perform the classification of the new data window drawn from the data
stream.
If the base classifier, trained on the real data is denoted as Ψ, then the ‘artificial
classifiers’, trained on the data samples generated from simulated concepts are
described as Ψ(1), Ψ(2), ..., Ψ(n), where n is the number of simulated concepts.
The pseudo-code of the proposed solution is presented in Fig. 3.
Fig. 3. Pseudo-code of the data stream classification with the simulated recurrence approach
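The pseudo-code of Fig. 3 is likewise not reproduced here. The Python sketch below illustrates the simulated-recurrence idea with hypothetical interfaces; it simplifies the paper's approach by selecting a single best-fitting repository classifier per window instead of a genetically optimized ensemble.
--------------------------------------------------------------
def build_repository(X, y, train, simulate_shift, n_concepts):
    """Pre-train one classifier per simulated concept within the allowed shift area."""
    repository = [train(X, y)]                     # base classifier Psi on the real data
    for k in range(1, n_concepts + 1):
        Xs, ys = simulate_shift(X, y, k)           # artificial data of the k-th simulated concept
        repository.append(train(Xs, ys))           # artificial classifier Psi(k)
    return repository

def classify_window(window, repository, fit_score):
    """Use the repository classifier that fits the unlabeled window best."""
    best = max(repository, key=lambda clf: fit_score(clf, window))
    return [best(x) for x in window]
--------------------------------------------------------------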
The simulated recurrence method [17] can be explained more clearly on the basis of
the previously used example with the 3-D objects represented by the 2-D images. As
before, let’s assume that the current concept of a 3-D object is represented by the 2-D
images of the certain perspective of this object. Shift impetuosity is limited by the
rotation of the 3-D object, e.g. at most 30 degrees in each direction, creating an area of
possible concepts. Simulated recurrence is implemented by artificially generating the
2-D images of various perspectives of the 3-D objects within the allowed concepts area
and with each perspective a separate classifier is trained and stored in the repository.
Both the shift impetuosity limit and the number of 2-D images generated this way are
defined as the preliminary parameters for the algorithm.
Let us assume that we have n classifiers, Ψ(1), Ψ(2), ..., Ψ(n) at our disposal. Each
ensemble is encoded by a binary vector

Ws = [ws(1), ws(2), ..., ws(n)],   (1)

where ws(i) = 1 if classifier Ψ(i) is included in the ensemble and ws(i) = 0 otherwise.
The combined classifier ΨS makes decisions on the basis of the majority voting rule,
denoted as

ΨS(x) = arg max_j Σ_{i=1}^{n} ws(i) I(Ψ(i)(x) = j),   (2)

where I(·) equals 1 when its argument is true and 0 otherwise.
A pairwise quantity is further defined for every pair of classifiers Ψ(l) and Ψ(h),
l, h = 1, ..., n.   (3)
Next, a set of m observations x1, ..., xm is divided into sets X1 and X0, representing
observations for which the measured pair of classifiers Ψ(l) and Ψ(h) agrees or disagrees
on the class affiliations, respectively. Two parameters are calculated on the basis of the
observations belonging to sets X1 and X0:
(5)
The measure of diversity between a pair of classifiers Ψ(l) and Ψ(h) is then calculated as
follows:
(6)
(7)
The crossover operator generates two offspring from two parents according to the
traditional crossover rule, namely by partially swapping the Ws vectors between the
parents. The crossover model assumes that, if the crossover takes place for a group of
individuals, then a pair is chosen randomly as parents and 1/3 of the bits in the Ws vector
of one parent are randomly selected and exchanged with the corresponding bits of the Ws
vector of the second parent, creating two new Ws vectors, namely the children.
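The crossover rule described above can be written down directly; the Python sketch below swaps a randomly selected third of the bit positions between two parent Ws vectors (vector length and seed are arbitrary):
--------------------------------------------------------------
import numpy as np

def crossover(ws_a, ws_b, rng):
    """Swap a random third of the bits between two parent Ws vectors, producing two children."""
    n = len(ws_a)
    swap = rng.choice(n, size=n // 3, replace=False)    # randomly chosen 1/3 of the bit positions
    child_a, child_b = ws_a.copy(), ws_b.copy()
    child_a[swap], child_b[swap] = ws_b[swap], ws_a[swap]
    return child_a, child_b

rng = np.random.default_rng(0)
parent_a = rng.integers(0, 2, size=12)   # binary ensemble-membership vectors
parent_b = rng.integers(0, 2, size=12)
print(crossover(parent_a, parent_b, rng))
--------------------------------------------------------------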
Selection of the individuals for the new population is performed from the descendant
population merged with the set of individuals created by the mutation and
crossover operators. The probability of selecting a particular individual is proportional
to its fitness function value. The number of members in the new
population stays the same as in the previous population, including the elite which was
previously promoted.
The optimization process ends when the results obtained by the best member
deteriorate in the course of a given number of the subsequent learning cycles.
4 Experiments
The aim of experiments is to evaluate the influence of the number of the classifiers in
the ensemble and the number of classifiers in the repository on the performance of the
classification system with simulated recurrence.
4.1 Setup
The evaluation takes place on 100 windows of 40 samples drawn randomly from each
of 78 possible concepts and the accuracy of the system is measured on each window by
dividing the number of correctly classified samples by the total number of samples in
the window. Parameters of the learning algorithm are set as follows:
- upper limit for the number of learning algorithm cycles: 30,
- population quantity: 20,
- elite fraction size: 50%,
- probability of crossover: 50%,
- probability of mutation: 20%.
The analysis of results is performed by comparing the mean accuracies achieved by the
proposed classification system for 10, 20 and 50 classifiers in the repository and 4, 6
and 8 classifiers in the ensemble. The efficiency obtained by the single classifier
trained only with the original data is also presented for reference.
A simple linear quadratic classifier is used as a base classifier in the experiments.
The experimental scenario is based on the benchmark dataset from the UCI Machine
Learning Repository [1], the Wine Dataset. This dataset consists of 178 samples
distributed non-uniformly between 3 classes (59, 71 and 48 samples for each class
respectively), which are described by the vectors of 13 real numeric features. During
the experiments, all the original dataset is used as an available training and reference
data for the system. The artificial datasets are generated on the basis of the original
dataset by simulating the possible shifts in the original concept and the classifiers
repository is created by training classifiers on each of the artificial datasets.
4.2 Results
To validate the statistical significance of the results, the vectors of mean accuracies
obtained by each ensemble for all possible concepts have been tested with a paired
t-test to reject the null hypothesis with 5% probability. The cumulative results over all
possible concepts are presented in the table below:
Each possible concept is evaluated separately and the numbers of concepts with
corresponding mean accuracies are grouped into 10 equal bins as the accuracy
histograms – below the worst and the best scenario, respectively:
Inspired by the recurring concept drift, the method of implementing a simulated
recurrence into a static classification system has proven to extend its ability to classify
data with shifting concept, without an additional concept drift detection mechanism.
Even with only 13% of all possible concepts simulated as recurring, the classifier
ensemble algorithm performs 25% better overall than the single classifier trained solely
on the reference dataset and achieves an impressive mean accuracy of above 80% over
all the possible concepts if more concepts are simulated.
Fig. 4. Histograms representing the mean accuracies over all possible concepts of two system
configurations (single classifier and an ensemble of 4 classifiers with the repository of 50
simulated concepts)
A surprising result is the lower accuracy for larger ensembles. The reason is that,
when there are more than 4 classifiers in the ensemble, situations may arise in which
more than one class label gets the most votes (e.g. 3-3-2 votes from 8 or 4-4-4 from
12 classifiers in the committee), in which case the label is chosen randomly.
5 Conclusions
A model of a trained ensemble able to deal with the concept shift problem was
presented. The proposed method exploits the strength of the recurring concept drift model and
adapts it to the problem of concept shift, where the changes of the chosen probability
characteristics are limited in each drift step. Our proposition was evaluated by
computer experiments carried out on a chosen dataset from the UCI repository. The
obtained results are promising, but we realize that their scope was limited; therefore
our future research will be focused on three areas, namely:
1. Extend the efficiency tests of the simulated recurrence ensemble classification
algorithm by evaluating the method on other reference datasets and implementing
different shift simulating rules,
2. Develop an efficient concept drift detection system based on the simulated
recurrence, able to be implemented into an existing classification system as an
autonomous detector.
3. Form a fair statistical measure to compare the algorithms used for classification
of streaming data with concept drift.
Acknowledgment. This work is supported in part by The Ministry of Science and
Higher Education under a grant realized in the period 2010-2013.
References
1. Frank, A., Asuncion, A.: UCI Machine Learning Repository. University of California,
School of Information and Computer Science, Irvine, CA (2010),
http://archive.ics.uci.edu/ml
2. Demsar, J.: Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of
Machine Learning Research 7, 1–30 (2006)
3. Dietterich, T.G.: Ensemble Methods in Machine Learning. In: Kittler, J., Roli, F. (eds.)
MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)
4. Fan, T.G., Zhu, Y., Chen, J.M.: A new measure of classifier diversity in multiple classifier
system. ICMLC 1, 18–21 (2008)
5. Gama, J., Kosina, P.: Tracking Recurring Concepts with Meta-learners. In: Lopes, L.S.,
Lau, N., Mariano, P., Rocha, L.M. (eds.) EPIA 2009. LNCS, vol. 5816, pp. 423–434.
Springer, Heidelberg (2009)
6. Gama, J., Medas, P., Castillo, G., Rodrigues, P.P.: Learning with Drift Detection. In:
Bazzan, A.L.C., Labidi, S. (eds.) SBIA 2004. LNCS (LNAI), vol. 3171, pp. 286–295.
Springer, Heidelberg (2004)
7. Ho, T.K., Hull, J.J., Srihari, S.N.: Decision Combination in Multiple Classifier Systems.
IEEE Transactions on Pattern Analysis and Machine Intelligence 16(1), 66–75 (1994)
8. Jackowski, K., Wozniak, M.: Algorithm of designing compound recognition system on the
basis of combining classifiers with simultaneous splitting feature space into competence
areas. Pattern Analysis and Applications 12, 415–425 (2009)
9. Katakis, I., Tsoumakas, G., Vlahavas, I.P.: An Ensemble of Classifiers for coping with
Recurring Contexts in Data Streams. In: 18th European Conf. on Artificial Intelligence,
Greece, pp. 763–764 (2008)
10. Klinkenberg, R., Joachims, T.: Detecting Concept Drift with Support Vector Machines. In:
Proceedings of the Seventeenth International Conference on Machine Learning (ICML),
pp. 487–494. Morgan Kaufmann, San Francisco (2000)
11. Kuncheva, L.I., Whitaker, C.J.: Measures of Diversity in Classifier Ensembles and Their
Relationship with the Ensemble Accuracy. Machine Learning 51(2), 181–207 (2003)
12. Li, P., Wu, X., Hu, X.: Mining Recurring Concept Drifts with Limited Labeled Streaming
Data. Journal of Machine Learning Research - Proceedings Track, 241–252 (2010)
13. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer,
Heidelberg (1996)
14. Narasimhamurthy, A.M., Kuncheva, L.I.: A framework for generating data to simulate
changing environments. In: Proc. IASTED, Artificial Intelligence and Applications,
Innsbruck, Austria, pp. 415–420 (2007)
15. Nishida, K., Yamauchi, K.: Learning, detecting, understanding, and predicting concept
changes. In: Proc. of IJCNN, pp. 2280–2287 (2009)
16. Pechenizkiy, M., Bakker, J., Zliobaite, I., Ivannikov, A., Kärkkäinen, T.: Online mass flow
prediction in CFB boilers with explicit detection of sudden concept drift. SIGKDD
Explorations 11(2), 109–116 (2009)
17. Sobolewski, P., Woźniak, M.: Artificial Recurrence for Classification of Streaming Data
with Concept Shift. In: Bouchachia, A. (ed.) ICAIS 2011. LNCS, vol. 6943, pp. 76–87.
Springer, Heidelberg (2011)
18. Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble
classifiers. In: Proc. of the 9th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, pp. 226–235. ACM Press, New York (2003)
19. Widmer, G., Kubat, M.: Learning in the Presence of Concept Drift and Hidden Contexts.
Machine Learning 23(1), 69–101 (1996)
20. Windeatt, T.: Diversity measures for multiple classifier system analysis and design.
Information Fusion 6(1), 21–36 (2005)
Neighborhood Selection and Eigenvalues
for Embedding Data Complex in Low Dimension
1 Introduction
LLE [1] is a well-known approach for representing the structure of high-dimen-
sional data within low-dimensional embeddings. The LLE algorithm contains 3
steps, as follows:
The first step of the LLE algorithm is to find the neighborhoods of every point.
Traditionally, the k-nearest neighbors approach is the most widely used one. This
approach has many advantages: it is easy to implement, suitable for most
cases, fast enough, and can be parallelized and further accelerated [2]. But for
some other types of datasets, the k-nn approach faces difficulties. For example,
for non-uniformly sampled datasets or datasets containing complex structure, the
selection of k may be a serious problem. For these kinds of problems, the ε-distance
approach is introduced.
There have already been attempts to use other neighborhood func-
tions, such as weighted k-nn [3][4][5][6], clustering approaches [7], or
k-means [7][8]. But these alternatives are still based and analyzed on the original
k-nn method. In this paper, the ε-distance approach will be taken into main considera-
tion as a conceptually different method from k-nn for dealing with more complex
datasets.
The last step of LLE requires computing the smallest eigenvalues and
corresponding eigenvectors of a positive semi-definite matrix. Since the working
precision of a computer is always limited, directly searching for the eigenvalues nearest to
zero will return eigenvalues near machine epsilon which would be exactly zero if we had
infinite precision. The details of this problem are discussed later.
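In practice this is commonly handled by discarding eigenvalues below a small numerical threshold before choosing the embedding coordinates, as in the generic NumPy sketch below; the thresholds 10^-14 and 10^-12 mentioned later in the paper play exactly this role.
--------------------------------------------------------------
import numpy as np

def smallest_nonzero_eigenvectors(M, d, tol=1e-14):
    """Return the d eigenvectors of a symmetric PSD matrix whose eigenvalues are the
    smallest ones above a numerical-zero threshold."""
    eigvals, eigvecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    keep = np.flatnonzero(eigvals > tol)          # drop the numerically-zero eigenvalues
    return eigvals[keep[:d]], eigvecs[:, keep[:d]]
--------------------------------------------------------------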
Isomap [9] is another popular approach for embedding data in low dimensions.
The Isomap algorithm also contains 3 steps, as below:
Since the k-nn and ε-distance approaches for neighborhood selection are already
built into the Isomap approach, the focus of this paper is on the comparison of eigen-
values obtained from LLE and Isomap. Although there is some brief
analysis in [10] of the eigenvalues obtained from LLE, that analysis focused on
choosing the correct number of embedding dimensions. In this paper, we only
map data to 2 embedding dimensions, so the choice only occurs in finding an
appropriate k for k-nn and ε for ε-distance.
2 Method
2.1 Neighborhood Selection
For neighborhood selection, we will use two approaches. The first method is
k-nearest neighbors. The k-nearest neighbors method finds the k nearest data
points for each data point in the dataset as its neighbors. For the ε-distance
approach, a point is a neighbor of a certain point p if the Euclidean distance
from the point to p is no more than a certain distance ε.
A neighborhood selection example for k-nn, as 8-nn, is shown in Fig. 1(a),
while an example for ε-distance within radius ε = √0.05 is shown in Fig. 1(b).
Fig. 1. (a) 8-nn selection. (b) ε = √0.05 selection.
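Both selection rules can be sketched compactly in NumPy; this is a generic illustration of the two definitions above, not the authors' implementation.
--------------------------------------------------------------
import numpy as np

def knn_neighbors(X, k):
    """Indices of the k nearest neighbors of every point (the point itself excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    return np.argsort(dist, axis=1)[:, :k]

def eps_neighbors(X, eps):
    """For every point, indices of all points within Euclidean distance eps."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    return [np.flatnonzero(row <= eps) for row in dist]
--------------------------------------------------------------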
3 Experiment
3.1 Datasets
Some artificially generated datasets are used to test the ability of different ap-
proaches. The first dataset is from the swiss roll dataset with a hole to make
sure all approaches are usable. The swiss roll dataset contains 2000 points as
in Fig. 2(a). The second dataset is sampled from a dual circular tube dataset.
One tube is sampled with 700 points and the other with 800 points, so that the
total sample size is 1500 points. The dataset can be
shown in Fig. 2(b).
The third experimental data is sampled non-uniformly from a knot tube
dataset. The sample size is 2000. The sampled data can be shown in Fig. 2(c).
The fourth dataset is a database of 8-bit grey-level face images from the same
person with different angles and moods which can be considered as true data.
The number of images is 1965 and the resolution of images is 28 × 20.
Fig. 2. (a) The swiss roll dataset. (b) The dual tube dataset. (c) The knot dataset.
3.2 Parameters
The LLE parameters can be separated into regularization parameters, eigenvalue-
solving parameters, and k-nn or ε-distance parameters. The regularization for
all methods introduced in the previous section is the default value of 10^-3 for the
simulation datasets. For the true dataset, the data dimension is 560, which should
be larger than the effective parameter range of the k-nn approach, so the regularization
for the true dataset is set to zero. For ε-distance on the true dataset, the regulariza-
tion is set to 10^-3 for searching in a big enough range to ensure that no data is
isolated. The eigenvalue-solving parameter is only effective for ε-distance and
indicates that the minimum non-epsilon eigenvalue is 10^-14 for the simulation datasets
and 10^-12 for the true data. For Isomap, only the k in k-nn and the ε in ε-distance
are considered as effective parameters. The details of solving the Isomap matrix will
not be discussed in this paper.
3.3 Result
For the swiss roll with hole dataset, the corresponding eigenvalues solved from
LLE for the effective range of k or ε are shown in Fig. 3(a) to Fig. 3(c). For LLE,
the third eigenvalue in the figures, shown as a red line for reference, should
always be near zero, and the corresponding eigenvector will never be used for
the final embedding result. Reasonable embedding results for k-nn can be
obtained when k is from 6 to 16. For k values between 11 and 13, since the second
eigenvalue goes up, the embedding results enclose the hole in the swiss roll and
are considered decent embedding results. The selected embedding results for
k-nn can be seen in Fig. 4. For the fractional nearest neighbors, we can obtain
enhanced resolution for the eigenvalue trends, while the corresponding eigenvalue
changes are similar to the original k-nn. The corresponding eigenvalue trends
can be seen in Fig. 3(b).
For the ε-distance approach, the eigenvalues for different parameters are
shown in Fig. 3(c). The effective ε range for the swiss roll with hole dataset is from
√11 to √38, excluding √22 to √25, according to the figure. The embedding
results using ε within √22 to √25 are twisted but still reasonable,
although still worse than the embedding results from other reasonable ε values.
Some selected embedding results for ε-distance are shown in Fig. 5.
Since LLE selects the d eigenvectors with the smallest positive eigenvalues (excluding
zeros), the corresponding eigenvalues will generally increase with increasing
parameters. The sharper change points indicate major changes of the embedding
results, such as from too little information to reasonable embeddings, or from
reasonable embeddings to a complex structure which is no longer understandable.
The range between two change points will give similar embedding results.
Fig. 3. Eigenvalues solved from LLE on the swiss roll with hole dataset for (a) k-nn, (b) fractional k-nn and (c) ε-distance (Eigenvalue vs. number of nearest neighbors / Distance²)
Fig. 4. Embedding results from LLE using k-nn on the swiss roll with hole dataset
(a) ε = √10.5  (b) ε = √24  (c) ε = √32  (d) ε = √40
Fig. 5. Embedding results from LLE using ε-distance on the swiss roll with hole dataset
For Isomap applied to the swiss roll with hole dataset, the corresponding
eigenvalue changes for the k-nn and ε-distance methods are shown in Fig. 6(a)
and Fig. 6(b). For the k-nn approach, the dramatic eigenvalue change from 8-nn
to 10-nn only has an implicit meaning for the embedding process, which becomes
clearly visible after 12-nn, and the embedding result starts to be folded after
Fig. 6. Eigenvalues solved by Isomap on the swiss roll with hole dataset for (a) k-nn and (b) ε-distance (Eigenvalue vs. number of nearest neighbors / Distance)
Fig. 7. Embedding results from Isomap using k-nn on the swiss roll with hole dataset
Fig. 8. Embedding results from Isomap using ε-distance on the swiss roll with hole dataset
For the dual tubes dataset, the eigenvalues solved from LLE with different pa-
rameters using k-nn, fractional k-nn and ε-distance are shown in Fig. 9.
The effective range for k-nn is 10 ≤ k ≤ 14. If k > 14, the two tubes will
be intersected. The effective range for ε-distance is √0.05 ≤ ε ≤ √0.07. The
significant change of the second eigenvalue determines the effective range for this
dataset.
Fig. 9. Eigenvalues solved from LLE on the dual tube dataset for k-nn, fractional k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance²)
The eigenvalues solved by Isomap using k-nn and ε-distance are shown in
Fig. 10. The red and blue lines show the eigenvalues of the second component
extracted from Isomap. Since the dual tube is not intersected within the 3-D space,
the separated dual tube result is sometimes more desired. The effective range
for k-nn of Isomap is k < 19, while the effective range for ε-distance is ε < 0.35.
The different levels of eigenvalues map to different embedding results, from two
separate tubes, to dual tubes in the embedding space, to twisted dual tubes.
Fig. 10. Eigenvalues solved by Isomap on the dual tube dataset for k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance)
For the knot dataset, the eigenvalues solved from LLE with different parameters
using k-nn, fractional k-nn and ε-distance are shown in Fig. 11. Only 7-nn
can successfully unfold the knot, while the effective ε range is from √0.36 to √0.54,
excluding √0.46 to √0.48. The first eigenvalue change is more important in this
dataset using LLE.
Fig. 11. Eigenvalues solved from LLE on the knot dataset for k-nn, fractional k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance²)
The eigenvalues solved by Isomap using k-nn and ε-distance are shown in
Fig. 12. The effective k is 4 ≤ k ≤ 8, while 6 ≤ k ≤ 8 can only obtain a decent
twisted tube which is not intersected. The effective ε is 0.525 ≤ ε ≤ 0.735.
Fig. 12. Eigenvalues solved by Isomap on the knot dataset for k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance)
For the face dataset, the eigenvalues solved by LLE using the three approaches
are shown in Fig. 13. The true data is much more complex, so the strategy
for finding a suitable embedding is no longer applicable to the true data. For the ε-
distance approach, the trade-off between including enough points and not
including too many connections becomes a serious decision, because some data
are far away from other data and will almost always be excluded from the final
embedding result.
The eigenvalues solved by Isomap using k-nn and ε-distance are shown
in Fig. 14. For Isomap, similar embeddings can be obtained from k-nn and ε-
distance in the corresponding value regions. The change points for Isomap using
ε-distance show more changes across different distance ranges, while using k-nn we
cannot find any significant change point. The first jump, for a distance close to
1.45, indicates that the information goes from insufficient to barely enough to build up an
embedding. The second jump means including another isolated large component,
so that the first eigenvalue jumps up again, just as happened in the dual
tube dataset. Distances within a flat area will just give similar embedding
results.
Fig. 13. Eigenvalues solved by LLE on the face dataset for k-nn, fractional k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance²)
Fig. 14. Eigenvalues solved by Isomap on the face dataset for k-nn and ε-distance (Eigenvalue vs. number of nearest neighbors / Distance)
4 Conclusion
From the eigenvalues obtained from LLE and Isomap, we can observe the evolu-
tion of unfolding the internal data structure across different parameters. Using too
small a distance or number of neighbors will result in insufficient connections, so
that the global view of the data is incomplete. Using too large a
distance or number of neighbors will force LLE and Isomap to preserve too much
information to be represented within the corresponding number of embedding dimen-
sions. The eigenvalues obtained from different parameters are shown to be a
possible criterion for choosing a relatively good distance or number of neighbors for
good enough embeddings within the required number of embedding dimensions.
But for real-world datasets, the search range for the parameters is data depen-
dent and should be large enough to ensure that it contains better embedding results,
although too large a search range will need too much time to finish. The reasonable
search range for the k-nn or ε-distance approach for embedding a dataset in a fixed
number of embedding dimensions remains a future research issue.
References
1. Roweis, S.T., Saul, L.K.: Nonlinear Dimensionality Reduction by Locally Linear Em-
bedding. Science 290(5500), 2323–2326 (2000)
2. Yeh, T.T., Chen, T.-Y., Chen, Y.-C., Shih, W.-K.: Efficient Parallel Algorithm for
Nonlinear Dimensionality Reduction on GPU. In: IEEE International Conference
on Granular Computing, pp. 592–597. IEEE Computer Society (2010)
3. Chang, H., Yeung, D.-Y.: Robust Locally Linear Embedding. Pattern Recogni-
tion 39, 1053–1065 (2006)
4. Pan, Y., Ge, S.S., Mamun, A.A.: Weighted Locally Linear Embedding for Dimen-
sion Reduction. Pattern Recognition 42, 798–811 (2009)
5. Wen, G., Jiang, L., Wen, J.: Local Relative Transformation with Application to
Isometric Embedding. Pattern Recognition Letters 30, 203–211 (2009)
6. Zuo, W., Zhang, D., Wang, K.: On Kernel Difference-weighted K-nearest Neighbor
Classification. Pattern Analysis and Applications 11, 247–257 (2008)
7. Wen, G., Jiang, L.-J., Wen, J., Shadbolt, N.R.: Clustering-Based Nonlinear Di-
mensionality Reduction on Manifold. In: Yang, Q., Webb, G. (eds.) PRICAI 2006.
LNCS (LNAI), vol. 4099, pp. 444–453. Springer, Heidelberg (2006)
8. Wei, L., Zeng, W., Wang, H.: K-means Clustering with Manifold. In: Seventh In-
ternational Conference on Fuzzy Systems and Knowledge Discovery, pp. 2095–2099.
IEEE Xplore Digital Library and EI Compendex (2010)
9. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A Global Geometric Framework for Nonlinear
Dimensionality Reduction. Science 290(5500), 2319–2323 (2000)
10. Saul, L.K., Roweis, S.T.: Think Globally, Fit Locally: Unsupervised Learning of Low
Dimensional Manifolds. Journal of Machine Learning Research 4, 119–155 (2003)
A Bit-Chain Based Algorithm
for Problem of Attribute Reduction
1 Introduction
Most recent research on attribute reduction falls into four categories: reduction based on the discernibility matrix, reduction based on the positive region, reduction based on a random strategy, and reduction on special information systems.
The discernibility matrix is used to compute reductions and the core of a dataset [20]. Simplifying the discernibility function retrieved from the discernibility matrix is the most basic method of removing superfluous attributes. Research on the discernibility function ranges from simple approaches such as Johnson's algorithm and the Genetic Reducer [3] to complex ones such as the Universal reduction [19]. Besides, some scholars focus on modifying the discernibility matrix [9][12][17].
The positive region approach measures the significance of attributes based on coefficients and removes bad attributes. There is a lot of research based on this technique; a Greedy Reduction Algorithm based on Consistency [18] and Distance Measure Assisted Rough Set Attribute Reduction [11] are examples. In another study, researchers divided the entire dataset into subsets using topology and computed a measure of the significance of attributes to remove redundant attributes in information systems where a large amount of data has to be processed [4]. A theoretic framework based on rough set theory, called positive approximation, was also introduced [6]. In addition, many papers have proposed coefficients reflecting the degree of importance of each attribute [10][15][16], especially entropy and fuzzy entropy.
The random strategy improves time and space efficiency, the hard problems of attribute reduction algorithms. The Genetic Algorithm is the typical algorithm of the random strategy, and many reduction techniques are based on it [2][3][13]. Moreover, dynamic programming is a useful technique for the attribute reduction problem; it permits exploring the optimal sets of significant attributes and reduces the process complexity [7].
In reality, most information systems are incomplete and incorrect. As a consequence, maximal tolerance classification was invented to classify incomplete information systems and remove superfluous attributes [1]. Another special kind of information system used to handle incomplete and incorrect data is the object-oriented information system. Besides, researchers have used an alternative approach based on AND and OR operations on multi-soft sets [5]. As a further treatment of the problem, fuzzy-rough reduction, or fuzzy-rough feature selection, was created and is also a noticeable concept. Fuzzy entropy is applied to reduce attributes in such fuzzy-rough sets [15].
In addition, there are still some special methods to reduce the number of attributes, such as Binary Conversion [8] or the concept lattice, an efficient tool for knowledge representation and knowledge discovery [14].
All contributions are welcome, however small. From that point of view, this paper introduces a new approach to simplifying the discernibility function. A mathematical model on binary strings is presented, and the maximal random prior set of binary strings is defined as a basis for finding a reduced form of the discernibility function.
2 Formulation Model
Definition 1 (bit-chain): < a1a2 … am > (for ai ∈ {0,1}) is an m-bit-chain. The zero chain is a bit-chain in which every bit equals 0.
Definition 2 (intersection operation ∩ ): The intersection operation ∩ is a dyadic
operation in bit-chains space.
< a1a2 … am > ∩ < b1b2 … bm > = < c1c2 … cm >, ai, bi ∈ {0,1}, ci = min(ai,bi)
Definition 3 (cover operation): A bit-chain A is said to cover a bit-chain B if and only if, for every position where bit 1 is turned on in B, A has the corresponding bit 1 turned on.
Let A = < a1a2 ... am >, B = < b1b2 ... bm >; (∀i = 1..m: (bi = 1) → (ai = 1)) ⇒ A covers B
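As a small illustration of Definitions 2 and 3 (a sketch under our own assumptions, not code from the paper), bit-chains can be encoded as integer bit masks, for which the intersection is a bitwise AND and the cover test is a mask comparison:

def intersection(a: int, b: int) -> int:
    # ci = min(ai, bi) on bits is exactly bitwise AND
    return a & b

def covers(a: int, b: int) -> bool:
    # A covers B iff every bit turned on in B is also turned on in A
    return a & b == b

A = 0b1101   # bit-chain <1101>
B = 0b0101   # bit-chain <0101>
print(bin(intersection(A, B)))   # 0b101
print(covers(A, B))              # True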
3.1 Idea
FIND_MaximalRandomPriorSet
Input: m-bit-chains set S
Output: maximal random prior set P
1. P = ∅;
2. for each s in S do
3. flag = False;
4. for each p in P do
5. temp = s ∩ p;
6. if temp <> 0 then//temp differs from zero chain
7. replace p in P by temp;
8. flag = True;
9. break;
10. end if;
11. end for;
12. if flag = False then
13. P = P ∪ {s};//s becomes ending element of P
14. end if;
15. end for;
16. return P;
together with pi, creates a new maximal random prior form, we terminate the intersection operations between sk+1 and the remaining elements in P (line 9).
• If all intersection operations between sk+1 and each element in P return the zero chain, it means that sk+1 does not cover any element in P. Thus, the element sk+1 is form δ – {sk+1}, and sk+1 is inserted into P (line 13).
In both cases, we receive a set P satisfying the properties of the maximal random prior set of S. So, Theorem 2 is correct when S has k + 1 elements.
In conclusion, the FIND_MaximalRandomPriorSet algorithm can find the maximal random prior set P of a bit-chain set S with a given order.
The maximal random prior set P is useful in solving and reducing Boolean algebra functions. One of the most important applications of the set P is finding a solution of the attribute reduction problem in rough sets.
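To make the pseudocode above concrete, here is a minimal sketch (assumed, not the authors' C#.NET implementation) that encodes bit-chains as integer bit masks; the function name and the example set S are placeholders.

from typing import Iterable, List

def find_maximal_random_prior_set(chains: Iterable[int]) -> List[int]:
    # Follows FIND_MaximalRandomPriorSet: the order of the input chains matters.
    P: List[int] = []
    for s in chains:
        replaced = False
        for i, p in enumerate(P):
            temp = s & p                 # intersection of two bit-chains
            if temp != 0:                # temp differs from the zero chain
                P[i] = temp              # replace p in P by temp
                replaced = True
                break                    # stop checking the remaining elements of P
        if not replaced:
            P.append(s)                  # s becomes the ending element of P
    return P

S = [0b1101, 0b0110, 0b0011]             # placeholder bit-chain set
print([bin(p) for p in find_maximal_random_prior_set(S)])   # ['0b100', '0b11']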
In rough set theory, an information system is a pair (U, A), where U is a non-empty finite set of objects and A is a non-empty finite set of attributes. A decision system is any information system of the form (U, A ∪ {d}), where d ∉ A is the decision attribute.
      x1     x2      x3     x4
x1    ∅      ∅       ∅      ∅
x2    b,d    ∅       ∅      ∅
x3    a,d    ∅       ∅      ∅
x4    ∅      a,b,c   b,c    ∅
4.2 The Maximal Random Prior Set and Attribute Reduction Problem
5 Trial Installation
The FIND_MaximalRandomPriorSet algorithm was developed and tested on a personal computer with the following specification: Windows 7 Ultimate 32-bit (Service Pack 1); 4096 MB RAM; Intel(R) Core(TM)2 Duo E7400, 2.80 GHz; 300 GB HDD. The programming language is C#.NET on Visual Studio 2008. The results of some test patterns:
References
1. Yang, F., Guan, Y., Li, S., Du, L.: Attributes reduct and decision rules optimization based
on maximal tolerance classification in incomplete information systems with fuzzy
decisions. Journal of Systems Engineering and Electronics 21(6), 995–999 (2010)
2. Ravi Shankar, N., Srikanth, T., Ravi Kumar, B., Ananda Rao, G.: Genetic Algorithm for
Object Oriented Reducts Using Rough Set Theory. International Journal of Algebra 4(17),
827–842 (2010)
3. Sakr, N., Alsulaiman, F.A., Valdés, J.J., Saddik, A.E., Georganas, N.D.: Feature Selection
in Haptic-based Handwritten Signatures Using Rough Sets. In: IEEE International
Conference on Fuzzy Systems (2010)
4. JansiRani, P.G., Bhaskaran, R.: Computation of Reducts Using Topology and Measure of Significance of Attributes. Journal of Computing 2(3) (2010) ISSN 2151-9617
5. Herawan, T., Ghazali, R., Deris, M.M.: Soft Set Theoretic Approach for Dimensionality
Reduction. International Journal of Database Theory and Application 3(2) (June 2010)
6. Qian, Y., Liang, J., Pedrycz, W., Dang, C.: Positive approximation: An accelerator for
attribute reduction in rough set theory. Artificial Intelligence 174(9-10) (June 2010)
7. Moudani, W., Shahin, A., Chakik, F., Mora-Camino, F.: Optimistic Rough Sets Attribute
Reduction using Dynamic Programming. International Journal of Computer Science &
Engineering Technology 1(2) (2010)
8. Chang, F.M.: Data Attribute Reduction using Binary Conversion. WSEAS Transactions
On Computers 8(7) (July 2009)
9. Wu, H.: A New Discernibility Matrix Based on Distribution Reduction. In: Proceedings of
the International Symposium on Intelligent Information Systems and Applications
(IISA 2009), Qingdao, P. R. China, October 28-30, pp. 390–393 (2009)
10. Lee, M.-C.: An Enterprise Financial Evaluation Model Based on Rough Set theory with
Information Entropy. International Journal of Digital Content Technology and its
Applications 3(1) (March 2009)
11. Parthaláin, N.M., Shen, Q., Jensen, R.: A Distance Measure Approach to Exploring the
Rough Set Boundary Region for Attribute Reduction. IEEE Transactions On Knowledge
And Data Engineering 22 (2009)
12. Yao, Y., Zhao, Y.: Discernibility Matrix Simplification for Constructing Attribute
Reducts. Information Sciences 179(5), 867–882 (2009)
13. Liu, H., Abraham, A., Li, Y.: Nature Inspired Population-based Heuristics for Rough Set
Reduction. In: Abraham, A., Falcón, R., Bello, R. (eds.) Rough Set Theory. SCI, vol. 174,
pp. 261–278. Springer, Heidelberg (2008)
14. Liu, J., Mi, J.-S.: A Novel Approach to Attribute Reduction in Formal Concept Lattices.
In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT
2008. LNCS (LNAI), vol. 5009, pp. 426–433. Springer, Heidelberg (2008)
15. Parthaláin, N.M., Shen, Q., Jensen, R.: Finding Fuzzy-Rough Reducts with Fuzzy Entropy.
In: IEEE International Conference on Fuzzy Systems (2008)
16. Sun, X., Tang, X., Zeng, H., Zhou, S.: A Heuristic Algorithm Based on Attribute
Importance for Feature Selection. In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D.,
Skowron, A., Yao, Y. (eds.) RSKT 2008. LNCS (LNAI), vol. 5009, pp. 189–196.
Springer, Heidelberg (2008)
17. Yang, M., Chen, S., Yang, X.: A novel approach of rough set-based attribute reduction
using fuzzy discernibility matrix. In: Proceedings of the Fourth International Conference
on Fuzzy Systems and Knowledge Discovery, vol. 03 (2007)
18. Hu, Q.-H., Zhao, H., Xie, Z.-X., Yu, D.-R.: Consistency Based Attribute Reduction. In:
Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 96–107.
Springer, Heidelberg (2007)
19. Tan, S., Cheng, X., Xu, H.: An Efficient Global Optimization Approach for Rough Set
Based Dimensionality Reduction. In: ICIC International, pp. 725–736 (2007)
20. Pawlak, Z.: Rough Sets. In: The Tarragona University Seminar on Formal Languages and
Rough Sets (August 2003)
SMART Logistics Chain
Arkadiusz Kawa
Abstract. Modern logistics companies today rely on advanced ICT solutions for
information processing and sharing. Access to data and information about the
demand for logistics services and supply opportunities is becoming a key
competitive factor. Unfortunately, only the largest companies can afford
advanced systems. Small and medium logistics companies have limited or no
IT-competence. Tools are therefore needed to facilitate cooperation between
smaller logistics companies, which in turn will reduce transaction costs. The
paper proposes an idea of the SMART model, which is based on agent
technology and cloud computing. It will allow easier collection and flow of
information as well as better and cheaper access to logistics management
systems.
1 Logistics Industry
Reliable and fast transportation, as well as efficient logistics services play a more and
more important role in the activities of many companies. Logistics is not only a
source of competitive advantage, but can also decide whether the firm will exist in the
market at all. Therefore, a company that is focused on its core business and does not
have adequate resources and experience in the logistics field is beginning to use the
services of an external logistics company.
Logistics service providers are companies belonging to the so-called transport,
forwarding and logistics industry. This sector covers the activities of companies of
different sizes, service ranges and geographical scope. It includes very large but also small firms, offering a range of services - from simple transport services, through forwarding, warehousing, palletizing, packing and packaging, to full supply chain service. Their range of activities may comprise a region (e.g. province),
country, continent or the whole world [2].
The term “transport, forwarding and logistics industry” itself points to a combination of activities that were more or less distinct in the past. Operators in the industry, by developing new skills, enter the areas of competence of their more or less close competitors (carriers, freight forwarders, logistics providers). On the other hand, by working with clients, they extend their offers with additional services, sometimes taking over their competitors’ sales and marketing functions.
A great problem of the logistics industry is the lack of formal semantics, which prevents automated data integration. There are no universal solutions that let smaller companies work together in changing conditions and access resources, software and information provided to computers and other devices on demand.
3 SMART Model
In relation to the above questions, one should look for ways to help tackle the
problem. A proposal for such a solution is the SMART model - Specialized, Market-
driven, Applicable, Reactive and Technologically-oriented Logistics Chain.
The idea of this model starts from the assumption that a single company’s intelligence does not necessarily imply the system’s intelligence. In fact, through cooperation companies should rationalize their logistics processes, obtain cost savings and reduce empty shipments. At the moment, companies do not engage in such collaboration, as they are traditionally managed like “family enterprises”. This limits their ability to exploit the potential opportunities offered by collaboration with other actors operating in the market.
That is why there is a need to create an electronic platform which will enable SMEs to cooperate, especially to gain access to data about logistics services and supply capacities. One of the solutions to these problems is cloud computing with semantic web services based on the Internet (see fig. 1). It avoids capital expenditure on hardware, software, and services, because a third-party provider is paid only for what SMEs need to use.
Since obtaining and processing appropriate data within cloud computing is a
complex and laborious process for individual actors, this research proposal suggests
making use of the very promising agent technology. A software agent is a piece of
software that acts on behalf of a user or another program. Agents are autonomous, thus the user
can activate and disconnect them from the network, provided the agent mission is
well-defined. They are capable of modifying the way in which they achieve their
objectives and are, therefore, called smart agents. They are able to gather information
and transform it into useful knowledge. Such agents know the data processing
methods and improve them in the course of the learning process, i.e. in the course of
the system operation. Smart agents are robust (they accomplish the group’s objective
even if some members of the group are unsuccessful), self-organized (activities are
neither centrally controlled nor locally supervised), and adaptive (they respond to the
dynamically changing environment). The agent community may evolve; weaker
agents are eliminated and replaced by those better adapted to the market conditions.
The model proposed is based on a semantic web concept including such elements
as XML, RDF, and ontologies which allow efficient automatic data collection about
logistics service providers and their resources. A special logistics ontology has been created for the needs of the project (definitions of the core elements of logistics ontologies). A standard description of the content in the cloud has been introduced in order to let software agents process data and information appropriately to their purpose and meaning. The main feature of this model is its interoperability, which will enable different information systems to cooperate, safely exchange data with a predefined structure, and mutually use this data further in order to create information. Importantly, access to such cloud computing does not require any specialized IT systems.
Companies are provided with an electronic platform with unlimited availability and safe use. Relevant information is collected from various entities (e.g. providers of logistics services), filtered and aggregated on the server. It is then made available in a suitable form to customers.
Thanks to the use of cloud computing, better and cheaper access to the systems of
global logistics networks (such as DHL, UPS, FedEx) and other suppliers in this market
(e.g. insurance companies, petrol stations, suppliers of car parts) will be possible.
The members of the platform will be able to optimize their cost of transport, for
instance by using common transport, e.g. rail, instead of private road transport.
submits a bid to another agent ("eMarket" for example, representing the electronic stock
exchange). The buyer asks the electronic exchange environment to send proposals for
sale of up to 500 tonnes of plastic for the price of less than 75 units per tonne. The
presented proposal has the ID "bid09", which facilitates communication between agents, as other messages can refer to it. The message also references the ontology which contains the specification of terms and their meanings in the field of plastics suppliers.
(cfp
:sender (agent-identifier :name Buyer)
:receiver (set (agent-identifier :name eMarket))
:content
"((action (agent-identifier :name eMarket)
(sell plastic max 500))
(any ?x (and (= (price plastic) ?x) (< ?x 75))))"
:ontology plastic-suppliers
:reply-with bid09
:language fipa-sl)
The recipient may accept the proposed offer or reject it, e.g. because the price is too low. In the first case, further communication is aimed at carrying out the proposed action; in the second case, the agent continues the search and sends a CFP message to other agents of the eMarket type. In the next step, the Buyer agent, who bought the product on a commodity market, would like a third-party logistics provider (3PL) to carry out the dispatch. For this purpose, it forwards the data on the container numbers and the shipping route from London to Paris to the agent representing the cloud computing server (CC).
(request
:sender (agent-identifier :name Buyer)
:receiver (set (agent-identifier :name CC))
:content
"((action (agent-identifier :name 3PL)
(deliver containers000093-001956 from LON to PAR)))"
:protocol fipa-request
:language fipa-sl
:reply-with order00678)
In the next step, in the cloud, a suitable 3PL that can fulfill the given task is determined. The 3PL agent may agree to the task that the CC agent asks for or refuse it.
(agree
:sender (agent-identifier :name 3PL)
:receiver (set (agent-identifier :name CC))
:content
"((action (agent-identifier :name 3PL)
(deliver containers000093-001956 (loc 49°31.607 'N 22°12.096'E)))
(priority order00678 high))"
:in-reply-to order00678
:protocol fipa-request
:language fipa-sl)
(query-ref
:sender (agent-identifier :name CC)
:receiver (set (agent-identifier :name 3PL))
:content
"((all ?x (available-service j ?x)))"
:in-reply-to order00678
:protocol fipa-query
:language fipa-sl)
In response to the above request, the 3PL agent informs that it offers transport,
storage and packing services.
(inform
:sender (agent-identifier :name 3PL)
:receiver (set (agent-identifier :name CC))
:content
"((= (all ?x (available-service 3PL ?x))
(set (forwarding service)
(warehousing service)
(co-packing service))))"
:ontology logistic-services
:in-reply-to order00678
:protocol fipa-query-ref
:language fipa-sl)
Such messages sent between the agents are numerous. However, due to their similar structure, this paper is limited to presenting the most important ones.
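For illustration only, the structure shared by these messages could be modeled programmatically as in the following sketch; this is a hypothetical helper written for this discussion, not part of the FIPA specification or of any agent platform.

from dataclasses import dataclass, field
from typing import Dict

@dataclass
class ACLMessage:
    performative: str                 # e.g. "cfp", "request", "inform"
    sender: str
    receiver: str
    content: str
    params: Dict[str, str] = field(default_factory=dict)   # :ontology, :protocol, ...

    def render(self) -> str:
        # Render in the FIPA-SL-like layout used in the examples above.
        lines = [f"({self.performative}",
                 f" :sender (agent-identifier :name {self.sender})",
                 f" :receiver (set (agent-identifier :name {self.receiver}))",
                 f" :content \"{self.content}\""]
        lines += [f" :{key} {value}" for key, value in self.params.items()]
        return "\n".join(lines) + ")"

cfp = ACLMessage("cfp", "Buyer", "eMarket",
                 "((action (agent-identifier :name eMarket) (sell plastic max 500)))",
                 {"ontology": "plastic-suppliers", "reply-with": "bid09",
                  "language": "fipa-sl"})
print(cfp.render())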
Acknowledgements. The paper was written with financial support from the
Foundation for Polish Science [Fundacja na rzecz Nauki Polskiej].
References
1. Ciesielski, M.: Rynek usług logistycznych, Difin, Warsaw (2005)
2. Jeszka, A.M.: Problem redefinicji branży na przykładzie przesyłek ekspresowych.
Gospodarka Materiałowa i Logistyka (7) (2003)
3. http://www.businessmonitor.com
4. http://www.fipa.org/specs/fipa00037/SC00037J.html
5. http://www.kep.pl
6. http://research.microsoft.com/en-us/people/sriram/rehof-cloud-mysore.pdf
7. http://www.transportintelligence.com
8. Kawa, A., Pawlewski, P., Golinska, P., Hajdul, M.: Cooperative Purchasing of Logistics
Services among Manufacturing Companies Based on Semantic Web and Multi-agents
System. In: Demazeau, Y., et al. (eds.) Trends in PAAMS. AISC, vol. 71, pp. 249–256.
Springer, Heidelberg (2010)
The Analysis of the Effectiveness of Computer Assistance
in the Transformation of Explicit and Tacit Knowledge
in the Course of Supplier Selection Process
1 Introduction
2 Knowledge Transformation
Due to all of the above, knowledge is a resource that cannot be easily represented in a simulation process, and it is difficult to assist its transformation with the use of IT tools.
These four groups are considered as knowledge assets, which are created, developed and transformed one by one in the spiral SECI process that consists of socialization, externalization, combination and internalization [7]. According to M. Warner and M. Witzel, the transformation of knowledge is the process which is the essence of management: through it, knowledge resources, i.e. gathered data, analyses and inspirations, gain their ultimate value.
The conceptual model of simulation was constructed on the basis of the methodology of building a process flow model [8], which uses elements of the IDEF0 methodology. However, the main component of the prepared conceptual model is the so-called process object [9]. A process object (i.e. the aim of a process) can be a material product, a document (information product) or a decision (information product). The suggested solution ascribes a process to an object. Therefore, the process can be unambiguously recognized and the state of the process can be clearly determined, so that one can precisely identify the current stage of the process. It is very important to take time into account here, i.e. to consider the process in time. In order to identify the current stage of a process, it is necessary to identify the place where, at the given time, the so-called process object is present. In order to identify the process of knowledge transformation, the so-called knowledge unit was introduced as a process object. Further in the article, it is denoted as KU. A knowledge unit is an abstract concept which can be identified by defining the so-called evaluation set:
KU ∈ {KU1, KU2, KU3, …, KUn}, where n ∈ N.
The developed conceptual model consists of:
• a process map
• a process chart based on IDEF0 methodology
• a table describing the evaluation set KU
• components enabling and showing development directions of the model.
A process map is shown in Figure 1. The map illustrates activities in the process and
shows the flow of KU through the process.
The model of mass support was selected for the implementation of a given activity. In the mass support system a demand occurs (in this case, a KU) along with a demand for support. The mass support system reacts by meeting the requirement if possible; otherwise it retains the demand until the right moment and, consequently, queues appear in front of the operating stations. The queues are formed by various tasks. At a given moment, they are represented by one specific operation which is to be performed at the given station. The queue is a set of specific individual operations that belong to various tasks waiting in front of an operating station. The characteristics that describe the mass support models refer to the following:
• stream of demands
• service process
• the way a queue is serviced.
The stream of demands is described with the use of the time interval between consecutive demands. The interval may be constant. If the demands are random, the interval is also a random variable and its probability function should be determined. Cumulative probability functions, which determine the probability that a time interval is larger than a certain time value, are commonly used. In the analyzed case, the stream of demands is passive, which should be interpreted as an infinite queue of demands which are selected for the first operation as soon as possible.
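A minimal sketch (our own illustration, with placeholder parameters not taken from the paper) of such a mass support system: a single operating station serving knowledge units from a FIFO queue, with random interarrival and service times.

import random

random.seed(1)

def simulate(n_units=1000, mean_interarrival=5.0, mean_service=4.0):
    # Single station, FIFO order: each KU starts when it has arrived
    # and the station has finished the previous KU.
    arrival = 0.0
    station_free_at = 0.0
    waits = []
    for _ in range(n_units):
        arrival += random.expovariate(1.0 / mean_interarrival)   # next demand
        start = max(arrival, station_free_at)
        waits.append(start - arrival)                            # time spent in the buffer
        station_free_at = start + random.expovariate(1.0 / mean_service)
    return sum(waits) / len(waits)

print("average wait in buffer:", round(simulate(), 2))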
4 Simulation Experiments
The simulation referred to the process of supplier selection on the basis of the available knowledge about the offered products. The simulation was restricted to 1600 man-hours, so as to reflect the effective annual standard hours of a single worker. The simulation was an experiment, and that is why the analyses were restricted to studying the process efficiency. In order to thoroughly analyze the whole process, it would be necessary to imitate the functional division of labor in the investigated company.
Knowledge units (KU) were generated only at the beginning of the process. While analyzing the time of the process steps needed, e.g., to obtain missing data, it turned out that it is relatively long and may become a bottleneck. In reality, there may be several operating stations that implement the same tasks. Then, the multi-streaming character of
the process will prevent it from becoming a bottleneck. The FIFO rule (first in, first out) is kept for the input and output of the process, because the analysis of plan implementation was not the objective of the investigation. Before each input and output, a queue of KU was formed, and an automatic machine then took the first KU, i.e. the one that had been waiting the longest.
The experiments were carried out for the version with a traditional procedure and for
the one that is computer-assisted. For comparison purposes, for both versions the same
streams of random numbers were used. They were changed when the experiment
number was changed.
[Figure 2 plots, for each knowledge unit (y-axis: KU number; x-axis: LT [min]), the total time in buffers and the total lead time (buffer and action) along the paths D0-D18, D0-D17.2, D0-D15 and D0-D19, with and without computer support.]
Fig. 2. Comprehensive analysis of residence time in the buffers, lead time of operations and time of residence in the system with and without computer assistance (ON), for all knowledge units
6 Conclusions
References
[1] Boiral, O.: Tacit Knowledge and Environmental Management. Long Range Planning 35,
296 (2002)
[2] Cempel, C.Z.: Nowoczesne Zagadnienia Metodologii i Filozofii Badań, ITE Radom,
Poznań, p. 62 (2005)
[3] Fertsch, M., Pawlewski, P.: Comparison of process simulation software technics,
Modelling of modern enterprises logistics. In: Fertsch, M., Grzybowska, K., Stachowiak,
A. (eds.), Monograph, p. 29, 39. Publishing House of Poznań University of Technology,
Poznań (2009)
[4] Grudzewski, W.M., Hejduk, I.K.: Zarządzanie wiedzą w przedsiębiorstwach, Difin,
Warszawa, p. 22 (2004)
[5] Law, A.M., Kelton, W.D.: Simulation Modeling and Analysis, p. 13. McGraw-Hill, New
York (2000)
[6] Mikuła, B.: Zarządzanie wiedzą w organizacji, Podstawy zarządzania przedsiębiorstwami
w gospodarce opartej na wiedzy, Difin, Warszawa, p. 113 (2007)
[7] Mikuła, B., Pietruszka-Ortyl, A., Potocki, A.: Zarządzanie przedsiębiorstwem XXI
wieku, Difin, Warszawa, p. 69–72, 129 (2002)
[8] Pawlewski, P.: Budowa modelu przepływu procesu, Instytut Inżynierii Zarządzania.
Politechnika Poznańska, pp. 1–10 (2006), http://www.wbc.poznan.pl
[9] Pawlewski, P.: Metodyka Modelowania Dynamicznych Zmian Struktury Zasobowej
Procesu Produkcyjnego w Przemyśle Budowy Maszyn. Rozprawa habilitacyjna –
maszynopis, Poznań, p. 97 (2011)
[10] Rojek, M.: Wspomaganie procesów podejmowania decyzji i starowania w systemach o
różnej skali złożoności z udziałem metod sztucznej inteligencji, Wyd. Uniwersytetu im.
Kazimierza Wielkiego, Bydgoszcz, p.4 (2010)
[11] Tzung-Pei, H., Cheng-Hsi, W.: An Improved Weighted Clustering Algorithm for
Determination of Application Nodes in Heterogeneous Sensor Networks. Journal of
Information Hiding and Multimedia Signal Processing 2(2), 173–184 (2011)
[12] Wang, T., Liou, M., Hung, H.: Application of Grey Theory on Forecasting the Exchange
Rate between TWD and USD. In: International Conference on Business and Information,
Academy of Taiwan Information System Research and Hong Kong Baptist University,
Hong Kong, July 14-15 (2005)
Virtual Logistics Clusters – IT Support for Integration
The paper presents the creation of a comprehensive tool supporting the consolidation of transport processes for a group of independent companies. The aim of the cooperation is to optimize transport costs through economies of scale achieved by the sustainable exploitation of resources within the group. A further benefit of the cooperation is reducing the staff costs associated with the organization of transport by transferring some responsibilities to a transport coordinator.
The main task of the reference model is to stimulate cooperation between a group of manufacturing enterprises (users) and transport enterprises (providers). The concept is based on the idea of sustainable development. Figure 1 presents the reference model.
Cooperating entities shall exchange information electronically via dedicated electronic
platforms. In the reference model three groups of players are defined (see fig. 1):
Fig. 1. Reference model for transport process performed by virtual cluster [3]
• Users of transport services – companies that are engaged in the production and/or selling of products. Logistics is not their core business. These companies may have their own means of transport and logistics infrastructure, or they cooperate with providers of logistics services. They issue the demand for transportation and, in most cases, order the execution of logistics operations by issuing orders to service providers.
• Providers of transport services – companies whose core business is the provision of logistics services. Their task is the coordination of orders issued by logistics users. Where one of the cooperating firms has its own facilities, e.g. means of transport, and is able to provide transport services for the other transport users, it also performs the role of a transport services provider.
• Coordinator – represents the users and deals with the coordination of logistics
processes (e.g. analysis of the possible aggregation of the transport orders issued
by the different users, price negotiations, and the choice of modes of transport).
The developed reference model takes into consideration [4]:
• assumptions of European transport policy development,
• co-modality of transport processes,
• one common standard of transport & logistics data exchange, which has been developed by several EU research projects such as FREIGHTWISE, e-Freight, INTEGRITY, Smart-CM, SMARTFREIGHT, DiSCwise, COMCIS, iCargo,
• correlation between the logistics system of the enterprise and the regional/national/continental transport systems,
• traceability issues, e.g. GS1 [5],
• possibility of using particular computing tools and information exchange techniques.
Within the framework of the model the following instruments stimulating cooperation
between enterprises need to be addressed during implementation phase:
• methodology enabling estimation of potential savings and profits that derive from
cooperation among enterprises in logistics processes organization,
• legal framework for cooperation between shippers, logistics service providers and
co-ordinator (orchestrator),
• methodology that assures common planning of logistics processes in a group of enterprises, taking into consideration the trade-off between transport, inventory management and warehousing processes,
• methodology that assures common planning of transport processes in a group of enterprises in compliance with the trade-off between the micro scale (enterprise) and the macro scale (region),
• exploitation of existing e-platforms (T-Scale, Logit 4SEE) that support interoperability and the harmonization of logistics processes in a group of enterprises and, as a result, allow the joint organization of haulages. Furthermore, the e-platforms are to support a standardized data exchange process, thereby enabling cooperation between users and transport service providers for all modes.
The pilot survey was conducted among a group of 40 companies. The companies represent both transport users and transport providers, including logistics operators and forwarders. The first group of questions (see fig. 2) concerns the present situation regarding the application of IT tools as well as the willingness to implement the new platform.
Fig. 2. Positive answers regarding the IT support of transport processes
In the analyzed group, 54% of the companies have IT systems which support their everyday operations in the planning and organization of transport processes, but the companies still see the need to use an integration platform for cooperation among transport providers and transport users (80% positive answers). Over 90% of transport providers (including logistics operators) were interested in selling their services via an integration platform. In the open question regarding the factors influencing the willingness to use the platform, the most popular answers were:
• cost reduction,
• better resource utilization,
• flexibility,
• reduction of the time needed for communication,
• fewer errors occurring due to lack of information or inappropriate information exchange.
Additional interviews were conducted with 18 transport users. The answers are presented in figure 3.
At present the analysed companies do not often use common communication standards (30% of positive answers). Most of the companies use electronic data exchange. In the open question the respondents identified the main barriers that appear during the implementation of communication standards:
Most transport users have declared that common communication standards are needed. The main advantages of implementing communication standards defined by the users are:
Additional interviews were conducted with 22 transport providers. The answers are presented in figure 4.
At present the analysed companies do not often use common communication standards (30% of positive answers). Most of the analysed transport providers use electronic data exchange (ca. 60%). In the open question the respondents identified the main barriers that appear during the implementation of communication standards. From the transport providers’ point of view the typical barriers are:
• high cost of implementation,
• long time needed for implementation,
• the human factor (difficulties with staff training),
• technical difficulties (system errors) in the initial stage of implementation.
Most transport providers (94%) have declared that common communication standards are needed. The main advantages of implementing communication standards defined by the providers are:
• reduction of errors in order fulfilment,
• cost reduction,
• reduction of the time needed for communication,
• transparency of information flow,
• access to information in real time.
In the next section the authors describe the integration tool called T-Scale, which helps the companies to form virtual transport clusters.
3 Integration Tool
T-Scale is an IT tool which enables the exchange of information and plans in real time between the actors involved in the carriage (the transport user, the provider of transport services and the coordinator). Cooperation of independent companies is also a way to increase the availability of loading capacities in trucks. Another advantage is the reduction of traffic, thus reducing the negative impact of road transport on the environment.
Unlike existing transport e-markets, the tool gives the opportunity for coordination and consolidation of orders, thus optimizing transport costs arising from the achieved economies of scale. It increases trust and security resulting from cooperation in a closed group of companies and from the monitoring of cooperation by an independent entity (the virtual cluster). Figure 5 presents the simplified information flow within the virtual cluster.
The aim of the tool is to manage the transport fleet within the cluster in order to eliminate empty routes on the way back and/or consolidate transport demand to a joint location.
The initial test of the proposed tool was carried out in the pharmacy and FMCG (fast moving consumer goods) sectors within the DiSCwise project, funded by DG Enterprise, European Commission, in 2010-2011. The project aims to Develop, Demonstrate and Deploy a Reference Architecture for Interoperability in the Transport and Logistics Sector in an effort to achieve:
• integration of small and medium sized transport service providers into efficient door-to-door supply chains at a cost affordable to them,
• facilitation of more sustainable transport by enabling users to select environment-friendly alternatives.
The initial test was performed in a virtual pharmacy cluster in Poland. The aim was to support cooperation between suppliers, wholesalers and logistics service providers in
order to reduce transport costs through the reduction of empty runs and the aggregation of deliveries of independent companies. One of the major constraints was not to decrease the customer service level.
The goal was the reduction of transport costs by increasing the level of exploitation of the load space, while maintaining the required frequency of supply. Combining the volumes of deliveries carried out by only two companies on exactly the same days, for exactly the same location, enabled cost savings at the level of ca. 18% of the costs incurred so far. The performed analysis showed that the greatest potential for reducing costs by means of a common organisation of supply lies in:
• aggregating volumes of deliveries regionally,
• defining common customers’ locations by region, connecting the supply to the common location within a few-day period, and negotiating conditions for the supply of customers in joint locations.
The first tests were carried out on a small sample representing a group of four distributors. Before the implementation of the new organisational solution, the companies did not cooperate at any level, but had deliveries to the same clients or to clients located very close to each other. In the analysed case, illustrating an average day, each of the chemists ordered a selected amount of products, which were delivered in special boxes (ca. 30), dedicated and standardized for the pharmacy sector. Thanks to the application of IT support for forming a virtual transport cluster, the number of deliveries was limited while the required customer service level was secured. It is worth noting that the number of trucks is reduced, which decreases road traffic congestion. Therefore, the solution eliminates the disadvantages of the traditional method of transport process organization.
4 Conclusions
The cooperative business model for a virtual logistics cluster presented in the paper requires sharing knowledge and information along the supply chain in a common and easy-to-understand language: standards for data exchange. To achieve this, the information and communication systems used for managing transport and logistics operations must be interoperable, and the actors need to be able to share that information according to their own business rules. One of the major challenges in the development of new virtual logistics clusters is how to create a solution able to change this situation. Luckily, almost unlimited access to the Internet makes cooperation in the area of transport processes possible not only for big companies but also for SMEs. The lack of consistency of the business processes performed by particular entities and the variety of IT systems used by companies cause problems with automatic partner networking. The approach presented by the authors provides a framework for the planning and coordination of transport according to the concept of co-modality. The tool helps to form virtual clusters in order to reach the common goal of cost effectiveness as well as sustainable development goals (congestion reduction).
References
1. Porter, M.: Clusters and the New Economics of Competition. Harvard Business Review,
77–90 (November-December 1998)
2. Pedersen, T.J., Paganelli, P., Knoors, F.: One Common Framework for Information and
Communication Systems in Transport and Logistics. DiSCwise Project Deliverable,
Brussels (2010)
3. Golinska, P., Hajdul, M.: Multi-agent Coordination Mechanism of Virtual Supply Chain. In:
O’Shea, J., Nguyen, N.T., Crockett, K., Howlett, R.J., Jain, L.C. (eds.) KES-AMSTA 2011.
LNCS, vol. 6682, pp. 620–629. Springer, Heidelberg (2011)
4. Hajdul, M.: Model of coordination of transport processes according to the concept of
sustainable development. LogForum 3(21), 45–55 (2010)
5. GS1 standards in transport and logistics, GS1 Global Office, Brussels (2010)
Supply Chain Configuration in High-Tech Networks
Abstract. The supply chain configuration has recently been one of the key
elements of supply chain management. The complexity of the relations and
variety of the aims of their particular members cause it to be very difficult to
build a supply chain effectively. Therefore, finding a feasible configuration in
which both the business network and the company can achieve the highest
possible level of performance constitutes a problem. The authors propose the SCtechNet model, based on graph theory, the business network concept and a competitiveness indicator, which helps to solve this problem by the dynamic configuration of supply chains. Simulation results based on the proposed model are presented and discussed.
1 Introduction
In the face of globalization and the development of the knowledge-based economy,
the question arises if changes in the global market environment contribute to the
creation of new potential determinants of company competitiveness. Such new
determinants may be network relationships and business networks. Firms may find
participation in network relationships and business networks an essential determinant
for developing competitiveness and the ability to create and use such relationships
may become a necessity that will ensure their success.
According to the network approach consistent with the main IMP Group research
[12] thread, a business network constitutes a collection of long-term formal and
informal relationships (direct and indirect) which exist between two or more entities
[8]. Within the considered framework, a system of links often is characterized as
being decentralized and informal.
The principle of strategic equality of business entities diverges a great deal from
economic reality. Often, it is possible to identify a firm(s) which plays a dominant
role within the framework of linked entities in this respect. Firms, with increasing
frequency, consciously create business networks concentrated around them. These
types of relationships illustrate the strategic approach of the development of network
links. The strategic approach [3, 9] stresses the active and conscious development of a
network of relations and the presence of one main entity (flagship company)
intentionally building a strategic network. The main characteristic of relations
between the partners of a network is the asymmetric and strategic control exercised by
2 Previous Work
This article continues the research into the development of competitive advantage and competitiveness among high-tech firms as a result of network relationships presented in [11]. An empirical analysis of 74 Polish firms was carried out during the research. The results confirmed that vertical links (with buyers and suppliers) are developed more often among Polish high-tech firms than horizontal links (with competitors or research and educational institutions). Generally, firms cooperate with each other within the framework of the supply chain increasingly often. At the same time, high-tech firms with an identified competitive advantage (the better firms) utilized the relations established within the supply chain more intensively.
A firm’s competitive advantage is treated as a relative measure of the quality of a
firm’s operations and is defined through the prism of the relative differences in
financial and non-financial performance (performance differentials) with respect to
the achievements of the closest competitors. Competitive advantage is studied based
on a consolidated formula including total income, total sales, return on investment
(ROI), and market share. Due to difficulties in comparing companies of various sizes
or operating in different markets, a subjective method of assessing their activities in
comparison with their closest competitors that is based on a relative assessment of the
enterprises themselves was adopted. The 5-point Likert scale was used for the
assessment. The respondents, by answering the questions posed in the questionnaire
relating to four of the effects of performance (total income, total sale, ROI, market
share) were to provide their own self-assessment in relation to the closest competitors
(5 – considerably worse, 4 – worse, 3 – almost the same, 2 – better, 1 – considerably
better)1. The application of such an evaluation method facilitates the comparison of
results with those of other firms with different business characteristics. The adaptation
of this evaluation method is based upon the earlier experiences of Fonfara [5].
The analysis of firms’ performance in four areas with respect to the closest
competition served to construct a competitiveness indicator (CI). This indicator was
defined as the average of the four results – overall profit, market share, sales volume
and ROI. In the subsequently presented deliberations, the lower the positive deviation
of the indicator from the value of 3 (remembering that in the adopted scale, the value
of 3 indicated “almost the same”) the greater the advantage.
The statistical analysis which was carried out in two stages confirmed the adopted
approach and the justification for linking the four defined elements of a firm’s results into one indicator.
3 SCtechNet Model
The choice of contractors for a transaction is usually guided by the criteria of resource availability and price competitiveness. Typically, an enterprise sends inquiries to several parties individually and then chooses the most attractive offer. Unfortunately, this is a very time-consuming task which requires efficient data interchange and analysis. Therefore, there is a need for solutions which make it possible to browse, analyze and choose, in a short time, the best of all available offers meeting predefined requirements. One such solution is the SCtechNet model, which allows a data flow within the whole business network.
The SCtechNet model (Supply Chain Configuration in High-Tech Networks) is a development of the DyConSC model [7, 10], extended here with the business network concept and the competitiveness indicator. It is mainly aimed at building dynamic and flexible temporary supply chains within the business network. SCtechNet enables each entity of the supply chain to independently adjust its plans in such a way that they become optimal both within one high-tech enterprise and within the whole supply chain.
This model is based on a service-oriented architecture (which grants system interoperability) and graph theory. All IT systems of the enterprises are interconnected and enable software agents to access their content. Autonomous agents are capable of identifying which tasks and customers’ requirements can be satisfied. Agents representing enterprises from particular tiers cooperate with one another, coordinate and negotiate conditions to achieve the common goal, whereas every agent
1 For the purposes of this article, the scale has been reversed.
Among the subsequent tiers a flow of goods and information about them takes place. All supplies are conducted sequentially, so no tier can be omitted. A flow (edges, according to graph theory) of goods in certain quantities takes place between
the entities (nodes) in the business network. The criteria for the choice of the preceding entity comprise the price of products or services and the competitiveness indicator (CI). The first criterion is a precondition. If the price meets the requirements, CI is compared in the next step. CI facilitates finding the best flow (of the lowest values) with an appropriate capacity. This boils down to the minimum cost flow problem, known from the aforementioned graph theory and described below.
Let us assume that in network G (enterprise network) a flow of value θ from s
(source, i.e. factory) to t (sink, i.e. FC) is sought. Value θ represents the total demand
of FC. The costs dij of sending a flow unit along an edge (i, j) and maximum
capacities cij of particular edges are defined, too. Additionally, let us denote by fij the flow over edge (i, j). The minimum cost flow problem may be formulated in the following way [4]:
Minimize the sum ∑(i,j) dij fij,
subject to the following assumptions:
1. For each edge (i, j) in network G: 0 ≤ fij ≤ cij.
2. For node (entity) s: ∑j fsj − ∑j fjs = θ.
3. For node (entity) t: ∑j fjt − ∑j ftj = θ.
4. For the remaining nodes (entities) j: ∑i fij − ∑k fjk = 0,
where ∑i fij sums the flow over all edges entering node j and ∑k fjk sums the flow over all edges going out of node j.
Although the model presented above is a linear programming task, solving it by general linear programming methods is ineffective due to its network structure. In this case the Busacker-Gowen (BG) algorithm, presented in [4], is helpful. The method consists in increasing the flow along consecutive augmenting paths by as much as their capacity allows. The order in which paths are chosen depends on their length which, in this case, is determined by the unit costs. If the flow has reached value θ, the computation finishes. Otherwise, the network is modified in such a way that the flow found so far is taken into account. In the residual network (G*) the cheapest path from s to t is found and the greatest possible number of units is sent along it. These two stages are repeated until a flow of the predefined value θ is accomplished or until the residual network no longer contains a path from s to t.
In order to find the cheapest chain from the source to the sink, a shortest-path algorithm must be applied. The SCtechNet model uses the BMEP (Bellman, Moore, d’Escopo, Pape) algorithm (see more in [4]).
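To make the procedure concrete, the sketch below (a simplified illustration under our own assumptions, not the paper's implementation) augments flow along the cheapest residual path, found with a Bellman-Ford style search in the spirit of BMEP, until demand θ is met or no path from s to t remains; the example network, node names, costs and capacities are placeholders.

from collections import defaultdict

def min_cost_flow(edges, s, t, theta):
    # edges: list of (u, v, capacity, unit_cost); returns (flow_sent, total_cost).
    cap = defaultdict(int)
    cost = defaultdict(int)
    nodes = set()
    for u, v, c, d in edges:
        cap[(u, v)] += c
        cost[(u, v)] = d
        cost[(v, u)] = -d               # residual (backward) edge cancels the cost
        nodes.update((u, v))
    flow, total_cost = 0, 0
    while flow < theta:
        # Bellman-Ford style search: cheapest path from s in the residual network
        dist = {n: float('inf') for n in nodes}
        prev = {}
        dist[s] = 0
        for _ in range(len(nodes) - 1):
            for (u, v), c in cap.items():
                if c > 0 and dist[u] + cost[(u, v)] < dist[v]:
                    dist[v] = dist[u] + cost[(u, v)]
                    prev[v] = u
        if dist[t] == float('inf'):
            break                        # no augmenting path left: demand unsatisfied
        # send as many units as the path capacity (and remaining demand) allows
        path, v = [], t
        while v != s:
            path.append((prev[v], v))
            v = prev[v]
        push = min(theta - flow, min(cap[e] for e in path))
        for (u, v) in path:
            cap[(u, v)] -= push
            cap[(v, u)] += push
        flow += push
        total_cost += push * dist[t]
    return flow, total_cost

# Placeholder network: source s (factory), two intermediate nodes, sink t (FC)
edges = [('s', 'a', 10, 2), ('s', 'b', 5, 1), ('a', 't', 8, 3), ('b', 't', 7, 4)]
print(min_cost_flow(edges, 's', 't', theta=12))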
The arrangement of the network is randomly generated. In the first step, the
requirements for the price are checked, i.e. the level of price cannot be higher than 30
euro per unit. If nodes do not have any corresponding connections with their predecessors and/or subsequent constituents (because the condition is not met), they are eliminated from the network. In this case the link between them is not created. However, it does not mean that those nodes will not be taken into account in other business networks.
There is only one flagship company in each network. It is worth emphasizing that different FCs may compete with FCs from other business networks. The number of entities in the second, third and fourth tiers is assumed to range from 10 to 100. This number can be increased or decreased with a slider (nodes number). The entities are represented by agents which interact and communicate, e.g. request the current production capability, product price, product quantity, etc.
Other parameters (also increased or decreased with sliders) are as follows:
• SCD (supply chain demand) – the demand of the flagship company, which equals the whole supply chain demand (it changes from 1 000 to 10 000 units);
• SI (supply indicator) – the factor of supply changeability of particular entities excluding the flagship company (it changes from 20% to 100%).
The properties of the link agents between constituents were chosen randomly as a pair of the competitiveness indicator (CI) and the capacity (CA). The first one is between 1.0 and 5.0; this indicator has been described above (cf. Section 2). The CA is established according to the following procedure: SCD * SI + random(SCD * SI + 1). For example, if SCD = 50 and SI = 0.5, then CA amounts to not less than 25 and not more than 50.
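A small sketch (our own restatement, not the simulation code) of the CA procedure quoted above:

import random

def capacity(scd: float, si: float) -> int:
    # CA = SCD * SI + random(SCD * SI + 1), i.e. an integer in [SCD*SI, 2*SCD*SI]
    base = int(scd * si)
    return base + random.randrange(base + 1)

print(capacity(50, 0.5))   # between 25 and 50, matching the example in the text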
The FC demand can be completely or partially satisfied. It depends on the quantity supplied by the preceding constituents. Similarly, the demand of enterprises from the 2nd tier can be fulfilled completely or partially and depends on the supply of the entities from the 3rd tier, and so on.
In order to find the best supply chain (so the shortest path in the graph), the BG and
BMEP algorithms are activated (cf. Section 3). In the network, there can be at least
one such chain. Their number depends on the supply and demand changeability.
The main aim of the simulations carried out was to study how the changes of the
number of nodes and the supply indicator influence the supply chains number and the
average level of the competitiveness indicator along the business network.
In the first simulations, the number of entities in a particular tier was changing
from 10 to 100, incrementing each time by 10 (simultaneously, this number in other
tiers was stable and equaled 20) on the assumption that SCD = 10 000 and SI = 20%.
They were run 1 000 times for each case which gives a total of 10 000 times.
The findings of the simulations show that as the number of nodes grows, the average number of supply chains changes (see fig. 2). For example, if the number of assemblers increases 10 times (from 10 to 100 nodes), the average number of supply chains decreases only by 12%, while in the case of factories it rises by 3%. It may, then, be concluded that the average number of supply chains is not considerably influenced by the change of the number of nodes in particular tiers.
[Plot: the average number of supply chains (about 5.4-6.2) against the number of nodes (10-100) in the assemblers, suppliers and factories tiers.]
Fig. 2. Influence of the change of the number of enterprises in particular tiers on the number of supply chains in the network
It is different, though, in the case of the average level of the whole competitiveness indicator (CI), where the changes are more visible (see Fig. 3). When the number of nodes increases, CI decreases by 15% on average, and for assemblers the fall is greatest and reaches 24%. Thus, the growth of the number of nodes in particular tiers causes the average CI along a supply chain to diminish. This can be explained by the following dependence: the more suppliers there are in a given tier, the higher the competitiveness among them and the better the conditions become for the final customers. The CI of each tier tends towards 1.0, so the CI of the whole supply chain tends towards 3.0. Figure 3 shows one more correlation – the greater the number of nodes in the vicinity of the flagship company (assemblers, suppliers and factories, successively), the lower the CI.
[Figure: series Assemblers, Suppliers, Factories; x-axis: 10–100 nodes per tier; y-axis: average competitiveness indicator, approx. 3.3–5.0]
Fig. 3. Influence of enterprises number change in particular tiers on the average competitiveness indicator in the business network
Next, the impact of the variation of the supply indicator (i.e. the capacity of enterprises) on the number of supply chains and the competitiveness indicator was studied, on the assumption that the number of nodes is stable and amounts to 20 and that SCD = 10 000. In the simulations, the supply indicator was shifted in steps of 10 percentage points, from 10% to 100%. The simulations carried out show that the number of supply chains falls sharply from 13 to 1, i.e. by about 92% (see Fig. 4). It is interesting that doubling the supply indicator (i.e. from 10% to 20%) causes the average number of supply chains which can satisfy the FC demand more quickly to drop from 13 to 6, i.e. by 53%. As a result of the increase of the supply indicator from 10% to 100%, CI declines by 22% (see the red line in Fig. 5). The greatest drop (about 11%) takes place in the first step (from 10% to 20%). On the basis of the analysis of the data from this simulation experiment, it may be stated that it is more profitable to collaborate with enterprises which have greater capacities and can offer greater supply (assuming that the number of nodes is stable). This also reduces the number of supply chains.
[Figure: series sc-num; x-axis: supply indicator; y-axis: number of supply chains, approx. 0–14]
Fig. 4. Influence of factor of supply changeability on the supply chains number in the business network
[Fig. 5: average competitiveness indicator vs. supply indicator (10%–100%) for a stable nodes number of 10, 50 and 100 per tier; series avgCI-num10, avgCI-num50, avgCI-num100; y-axis approx. 3.0–4.9]
In the last part of the simulations, the impact of the variation of the supply indicator on CI was investigated, assuming that the number of nodes is stable and amounts to 50 (see the blue line in Fig. 5), and then to 100 (the black line). Here the correlations are the same as above – the growth of the number of nodes in the tiers causes the average CI along a supply chain to diminish.
5 Conclusion
Acknowledgements. The paper was written with financial support from the
Foundation for Polish Science [Fundacja na rzecz Nauki Polskiej].
Identification and Estimation of Factors Influencing
Logistic Process Safety in a Network Context
with the Use of Grey System Theory
factor's value causes the growth of another factor's value, then the character of such an interaction is marked with a "+" symbol and called a positive interaction. In the opposite case the interaction is marked with a "–" symbol and called a negative interaction.
It is essential to know the direction of the factors' interaction and its character. However, this is not all the information necessary for a profound knowledge of the properties of the constructed network. Firstly, knowledge about the strength with which the factors interact with one another is also needed. Secondly, the questions "In what way can we influence the factors?" and "Are the factors susceptible to this influence (which can be described by the term controllability)?" should be answered. To this aim, experts' opinions need to be consulted, because of the variety of existing factors and the difficulty of obtaining such information given the costs and measurement obstacles. Therefore, the authors propose to use a special algorithm to determine the experts' common opinion using Grey Relational Analysis, which is one of the techniques of Grey Systems Theory.
Grey Systems Theory was proposed for the first time in 1982 by the Chinese scientist Julong Deng to model phenomena in which data are uncertain and incomplete and the mechanisms which rule the modelled phenomena are only partially known. The theory remained little known for a long period of time because its first systematic contribution in English appeared only in 1989 [4] and the first English-language manual was published in Europe only in 2005. Despite the difficulties in the popularisation of Grey Systems Theory, it has eventually found many interesting applications, especially in the technical and economic sciences [1], [2], [3], [5], [13].
A study of the strength of relations between factors with the use of experts' estimations and the procedure of Grey Relational Analysis starts from defining a scale on which the estimation will be done. The proposed solution is to accept a five-point scale, where 1 means a very small strength of relation between factors whereas 5 means a very large and significant strength of interaction. After accepting the scale, each of the experts performs the estimation for every identified pair of factors between which a relation resulting from the graphic representation of the network takes place. Mathematically, this can be written in the following way
$x_{ij} = \big(x_{ij}(1),\, x_{ij}(2),\, \ldots,\, x_{ij}(N)\big), \qquad i, j = \overline{1, k}$   (1)
where:
$N$ – the number of experts,
$k$ – the number of identified factors,
$i, j$ – numerical designations of the factors which undergo estimation
where:
$x'_{ij}(n) = \dfrac{x_{ij}(n)}{\tfrac{1}{N}\sum_{n=1}^{N} x_{ij}(n)}, \qquad n = \overline{1, N}$   (4)
$x'_{ij}(n)$ – the value of the nth expert's estimation after applying the averaging operator,
$x_{ij}(n)$ – the estimation given by the nth expert,
$\tfrac{1}{N}\sum_{n=1}^{N} x_{ij}(n)$ – the average of the experts' estimations.
$\Delta_{ij} = \big(\Delta_{ij}(1),\, \Delta_{ij}(2),\, \ldots,\, \Delta_{ij}(N)\big)$   (5)
where
$\Delta_{ij}(n) = \big|\,1 - x'_{ij}(n)\,\big|, \qquad n = \overline{1, N}$   (6)
In the succeeding step, M and m are reckoned according to formulas (9) and (10).
$M = \max_{i,j}\,\max_{n}\, \Delta_{ij}(n)$   (9)
$m = \min_{i,j}\,\min_{n}\, \Delta_{ij}(n)$   (10)
The values received earlier allow creating a vector of grey relation grades for the pair of factors (i, j) in the following form
$\gamma_{ij} = \big(\gamma_{ij}(1),\, \gamma_{ij}(2),\, \ldots,\, \gamma_{ij}(N)\big)$   (11)
where
$\gamma_{ij}(n) = \dfrac{m + \zeta M}{\Delta_{ij}(n) + \zeta M}, \qquad n = \overline{1, N}, \quad \zeta \in (0, 1)$   (12)
Having the grey relation grades calculated, we can compute an average grey relation grade that reflects the strength of the relation between factors i and j, according to the formula presented below
$\Gamma_{ij} = \dfrac{1}{N} \sum_{n=1}^{N} \gamma_{ij}(n)$   (13)
Calculating $\Gamma_{ij}$ for each pair of factors, and assuming that $\Gamma_{ij} = 0$ for those factors between which, according to the constructed network, there is no interaction, we can construct a matrix of the following form
$A = \big[\Gamma_{ij}\big]_{k \times k}$   (14)
Having the matrix A at our disposal, we calculate for it two values, $x_i$ and $y_i$, which are the averages of the values appearing in the matrix's rows and columns respectively, as expressed by formulas (15) and (16).
$x_i = \dfrac{1}{k} \sum_{j=1}^{k} \Gamma_{ij}$   (15)
$y_i = \dfrac{1}{k} \sum_{j=1}^{k} \Gamma_{ji}$   (16)
The values $x_i$ and $y_i$ allow determining the factors' character, and on their basis four groups of factors can be distinguished, namely:
• active factors, when $x_i \ge 0.5$ and $y_i < 0.5$. These are factors which have influence on other factors; however, they cannot be influenced themselves.
• lazy factors, when $x_i < 0.5$ and $y_i \ge 0.5$. These are factors which do not influence the other factors but can be influenced themselves.
• passive factors, when $x_i < 0.5$ and $y_i < 0.5$. These are factors which do not undergo the influence of other factors and cannot influence other factors themselves.
• crucial factors, when $x_i \ge 0.5$ and $y_i \ge 0.5$. These are factors which have a strong influence on the other factors but, simultaneously, can be strongly influenced by those factors.
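A minimal Python sketch of the procedure in formulas (4)–(16) as reconstructed above is given below. The array layout, the distinguishing coefficient ζ = 0.5 and the symbol names follow the standard Deng formulation of Grey Relational Analysis and are assumptions rather than the paper's exact notation:

```python
import numpy as np

def grey_relational_grades(estimations, zeta=0.5):
    """estimations[p] = the N expert marks (1..5) for the p-th factor pair.
    Returns the average grey relational grade Gamma for every pair."""
    x = np.asarray(estimations, dtype=float)        # shape: (pairs, N experts)
    x_prime = x / x.mean(axis=1, keepdims=True)     # (4): averaging operator
    delta = np.abs(1.0 - x_prime)                   # (5)-(6): deviations from consensus
    M, m = delta.max(), delta.min()                 # (9)-(10): global extrema
    gamma = (m + zeta * M) / (delta + zeta * M)     # (11)-(12): grey relational coefficients
    return gamma.mean(axis=1)                       # (13): average grade per pair

def classify_factors(A):
    """A[i, j] = Gamma for the influence of factor i on factor j (0 where no link).
    Returns row averages x, column averages y and the group of each factor."""
    k = A.shape[0]
    x = A.sum(axis=1) / k                           # (15): row averages
    y = A.sum(axis=0) / k                           # (16): column averages
    groups = []
    for i in range(k):
        if x[i] >= 0.5 and y[i] < 0.5:
            groups.append("active")
        elif x[i] < 0.5 and y[i] >= 0.5:
            groups.append("lazy")
        elif x[i] < 0.5 and y[i] < 0.5:
            groups.append("passive")
        else:
            groups.append("crucial")
    return x, y, groups
```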
A very important issue in the analysis of the factors included in a given network is their estimation according to the criterion of susceptibility to control. Accepting this criterion, two groups of factors can be distinguished, namely manageable factors and non-manageable factors. The former are the factors which managers have influence on; the latter cannot be changed by managers. The selection procedure for these groups is similar to the procedure of determining active, passive, crucial and lazy factors. The difference lies in the fact that the experts do not estimate the strength of relations but the possibility of controlling the factors. This estimation is done on a ten-point scale. In this way, a vector of estimations of a given factor's controllability is received. On this basis, using formulas (3) to (13) with small formal corrections to the symbols in superscripts, we determine the $\Gamma$ grade. Given this grade, we can claim that a factor is non-manageable if $\Gamma < 0.5$, whereas a factor is considered to be manageable when $\Gamma \ge 0.5$.
Table 1. Stakeholders in a buying decision process and their requirements concerning safety of
a process realisation
i→j   1   2   3   4   5   6   7   8   9   10   11   12   13   14   15   16   17   18   19   $x_i$
1 0,000 1,000 0,579 1,000 0,000 0,000 0,073 0,000 0,108 0,000 0,000 0,125 0,656 0,219 0,230 0,243 0,198 0,036 0,248 0,248
2 0,603 0,000 0,864 0,412 0,000 0,000 0,283 0,000 0,097 0,000 0,000 0,000 0,305 0,250 0,271 0,206 0,251 0,195 0,213 0,208
3 0,310 0,603 0,000 0,564 0,000 0,144 0,223 0,259 0,000 0,097 0,097 0,000 0,207 0,310 0,259 0,355 0,000 0,304 0,428 0,219
4 0,549 0,838 0,549 0,000 0,000 0,149 0,187 0,093 0,219 0,073 0,093 0,115 0,472 0,149 0,187 0,149 0,304 0,110 0,149 0,231
5 0,422 0,420 0,259 0,223 0,000 0,000 0,036 0,000 0,000 0,072 0,000 0,097 0,206 0,097 0,073 0,097 0,373 0,073 0,149 0,137
6 0,149 0,125 0,163 0,125 0,000 0,000 0,000 0,093 0,149 0,093 0,093 0,073 0,073 0,219 0,110 0,073 0,115 0,373 0,562 0,136
7 0,334 0,000 0,213 0,223 0,000 0,219 0,000 0,000 0,149 0,310 0,109 0,213 0,149 0,115 0,093 0,213 0,149 0,149 0,324 0,156
8 0,072 0,000 0,000 0,097 0,000 0,213 0,110 0,000 0,133 0,305 0,213 0,334 0,149 0,320 0,562 0,472 0,364 0,250 0,585 0,220
9 0,187 0,206 0,304 0,310 0,000 0,320 0,000 0,406 0,000 0,223 0,213 0,656 0,223 0,320 0,756 0,792 0,364 0,213 0,320 0,306
10 0,320 0,125 0,000 0,109 0,000 0,258 0,000 0,437 0,576 0,000 0,149 0,072 0,223 0,223 0,187 0,201 0,201 0,149 0,093 0,175
11 0,000 0,125 0,097 0,115 0,000 0,297 0,000 0,223 0,292 0,250 0,000 0,097 0,292 0,473 0,373 0,370 0,334 0,342 0,173 0,203
12 0,249 0,249 0,394 0,405 0,000 0,133 0,144 0,440 0,400 0,000 0,267 0,000 0,109 0,324 0,428 0,533 0,036 0,206 0,149 0,235
13 0,459 0,187 0,355 0,403 0,000 0,110 0,149 0,115 0,149 0,036 0,198 0,036 0,000 0,206 0,206 0,423 0,576 0,125 0,405 0,218
14 0,310 0,247 0,125 0,247 0,000 0,279 0,093 0,149 0,267 0,133 0,213 0,097 0,324 0,000 0,556 0,267 0,231 0,267 0,183 0,210
15 0,389 0,219 0,267 0,462 0,000 0,324 0,231 0,283 0,567 0,072 0,198 0,206 0,230 0,423 0,000 0,373 0,251 0,198 0,223 0,259
16 0,223 0,187 0,110 0,403 0,000 0,206 0,173 0,223 0,355 0,198 0,251 0,244 0,428 0,109 0,423 0,000 0,173 0,230 0,133 0,214
17 0,545 0,187 0,247 0,163 0,000 0,320 0,000 0,576 0,279 0,149 0,198 0,405 0,316 0,355 0,254 0,109 0,000 0,283 0,310 0,247
18 0,223 0,097 0,230 0,163 0,000 0,864 0,110 0,093 0,223 0,036 0,230 0,036 0,036 0,036 0,036 0,036 0,279 0,000 0,549 0,173
19 0,269 0,198 0,163 0,223 0,000 0,403 0,230 0,223 0,267 0,115 0,110 0,072 0,149 0,109 0,206 0,206 0,247 0,355 0,000 0,187
$y_i$ 0,296 0,264 0,259 0,297 0,000 0,223 0,108 0,190 0,223 0,114 0,139 0,152 0,239 0,224 0,274 0,269 0,234 0,203 0,273
What is more, a relevant factor which complicates the use of the network thinking methodology in the proposed version is the vagueness of the expressions used and a lack of terminological coherence. In the authors' opinion, this problem can be eliminated by the use of a quality approach to the network's description. The quality approach is developed by Polish scientists within the scope of a general theory called qualitology. Particularly useful in this area might be the works of Mantura [8] and Szafrański [10], which remain little known due to the lack of English translations.
References
[1] Akay, D., Atak, M.: Grey prediction with rolling mechanism for electricity demand
forecasting of Turkey, Energy, 32 (2007)
[2] Cempel, C.Z.: Zastosowanie teorii szarych systemów do modelowania i prognozowania
w diagnostyce maszyn (The use of Grey Systems Theory to model and forecast in
machinery diagnostics). Diagnostyka 2(42) (2007)
[3] Chen, C., Ting, S.: A study using the grey system theory to evaluate the importance of
various service quality factors. International Journal of Quality and Reliability
Management 19(7) (2002)
[4] Deng, J.: Introduction to grey system theory. The Journal of Grey System 1(I), 1–24
(1989)
[5] Hsu, C., Wen, Y.: Improved Grey Prediction Models for Trans-Pacific Air Passenger Market. Transportation Planning and Technology 22 (1998)
[6] Krawczyk, S.: Logistyka w zarządzaniu marketingiem, p. 38. AE Publishing, Wrocław
(1998)
[7] Lin, T.C., Huang, H.C., Liao, B.Y., Pan, J.S.: An Optimized Approach on Applying
Genetic Algorithm to Adaptive Cluster Validity Index. International Journal of Computer
Sciences and Engineering Systems 1(4), 253–257 (2007)
[8] Mantura, W.: Elementy kwalitologii. Wydawnictwo Politechniki Poznańskiej. Poznań
(2011)
[9] Nowosielski, S. (ed.): Procesy i projekty logistyczne, p. 4. UE Publishing, Wrocław
(2008)
[10] Szafrański, M.: Skuteczność działań w systemach zarządzania jakością. Wydawnictwo
Politechniki Poznańskiej. Poznań (2006)
[11] Szymonik, A.: Bezpieczeństwo w logistyce, Difin, Warszawa, p. 11 (2010)
[12] Tzung-Pei, H., Cheng-Hsi, W.: An Improved Weighted Clustering Algorithm for
Determination of Application Nodes in Heterogeneous Sensor Networks. Journal of
Information Hiding and Multimedia Signal Processing 2(2), 173–184 (2011)
[13] Wang, T., Liou, M., Hung, H.: Application of Grey Theory on Forecasting the Exchange
Rate between TWD and USD. In: International Conference on Business and Information,
Academy of Taiwan Information System Research and Hong Kong Baptist University,
Hong Kong, July 14-15 (2005)
[14] Zimniewicz, K.: Współczesne koncepcje i metody zarządzania, PWE, Warszawa (1999)
Implementation of Vendor Managed Inventory Concept
by Multi-Dimensionally Versioned Software Agents
1 Introduction
Inventory management has always been an issue which companies have had to cope with. Nowadays firms constantly search for new ways of decreasing their operational costs, especially those resulting from unnecessary supplies. They try to generate more precise forecasts, establish a better information flow between business partners or optimize their internal processes to ensure a more effective way of managing their stock levels. Moreover, firms quickly realized the importance of collaboration within a supply chain and its impact on operational costs. This led to the development of many distribution and inventory control methodologies in management. Those methodologies have been constantly evolving from relatively static to more dynamic models. Good inventory levels are now a measure of business competitiveness [1].
Those concepts were quickly improved by utilizing the potential of information and communication technology (ICT). Simple inventory management applications evolved throughout the years into complex Enterprise Resource Planning (ERP) and Supply Chain Management (SCM) systems. Those complex giants, however, have some significant drawbacks which limit their usability for enterprises – they are expensive and their implementation process is extremely time-consuming. These factors were the key element which inspired the authors to propose a new concept in inventory management systems.
In this paper, two main goals were achieved. The first goal was to propose an alternative technology to be used in inventory management. The key concept is the usage of agent technologies in the process of inventory management. The main goal of the presented solution is to implement a classic inventory management technique – VMI – but in an improved, more effective and broadly available version. The second goal was the demonstration of a new software agent model which is both intelligent and highly mobile. The proposed agent model is essential to implement the proposed inventory management solution, which places high requirements on the agents used to implement it.
The structure of this paper is as follows.
Section 2 is a brief description of the traditional VMI concept in inventory management. It also contains a proposition of how to utilize intelligent software agents (the agent-VMI concept) to make this concept familiar to a broader spectrum of companies, especially in the sector of small and medium enterprises (SME).
Section 3 presents the MDV agent model. It is based on the concepts of multi-dimensional versioning and code segmentation. The roles of the subsequent agent parts are described in detail. Moreover, the processes of agent migration, environmental adaptation, agent knowledge and functionality management, as well as code discarding are discussed.
The fourth section deals with a functionality algorithm of the proposed agent model and describes the whole procedure which takes place when software agents manage the inventory levels in supply chains. It contains a step-by-step description of the proposed approach.
Section 5 concludes the paper, emphasizing the advantages of utilizing intelligent software agents as the communication platform in the VMI model, especially those designed according to the MDV model.
− The first step is to verify the data as accurate and meaningful. Depending on
the software, much of this verification is automated.
− On a scheduled basis, the software calculates a reorder point for each item
based on the movement data and any overrides contributed by the customer
or supplier. These overrides might include information such as promotions,
projects, seasonality, new items and so forth.
− The VMI software compares the quantity available at the customer with the
reorder point for each item at each location. This determines if an order is
needed.
− The order quantities are then calculated. Typically calculation of order
quantities takes into account such issues as carton quantities and transaction
costs [9].
This procedure is based on the simple concept of minimal inventory levels and does not
take into account any random events such as changing demand, weather conditions (for
example for transportation purposes) etc.
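Purely as an illustration of the reorder logic from the list above, the Python sketch below checks the reorder point and rounds the order quantity to full cartons; the specific reorder-point and order-quantity formulas are assumptions made for the example, since neither the text nor [9] prescribes exact formulas here:

```python
from dataclasses import dataclass
import math

@dataclass
class Item:
    on_hand: int          # quantity available at the customer
    daily_usage: float    # derived from verified movement data
    lead_time_days: float
    safety_stock: int     # may encode overrides (promotions, seasonality, ...)
    carton_size: int
    target_days: float    # coverage the order should restore

def reorder_point(item: Item) -> float:
    # Step 2: reorder point from movement data plus overrides.
    return item.daily_usage * item.lead_time_days + item.safety_stock

def order_quantity(item: Item) -> int:
    # Steps 3-4: order only if available stock fell below the reorder point,
    # then round the needed quantity up to full cartons.
    if item.on_hand >= reorder_point(item):
        return 0
    needed = item.daily_usage * item.target_days + item.safety_stock - item.on_hand
    return math.ceil(needed / item.carton_size) * item.carton_size

print(order_quantity(Item(on_hand=40, daily_usage=12, lead_time_days=5,
                          safety_stock=20, carton_size=24, target_days=14)))  # 168
```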
The benefits which can occur from successfully implemented VMI are the following:
− lower customer inventories – caused by a better organization of inventory management due to the inventory items being controlled by the vendor;
− lower supplier inventory – under VMI the vendor knows the exact demand which he must meet, therefore he can easily lower his inventory levels to a minimum;
− lower stock levels throughout the whole supply chain [10, 11] – a key issue in supply chain management is to satisfy the customer needs. Traditionally this issue has been solved by maintaining a high stock level to meet any demand fluctuation. Taking into account the two arguments stated above, VMI makes it possible to lower the stock levels of certain entities in the supply chain, thus lowering the overall inventory costs of the whole supply chain;
− lower administrative costs – this benefit accrues to both parties engaged in VMI and is caused by the usage of automated systems instead of regular employees;
− increased sales – caused by the fact that the inventory levels are set to an optimal level, therefore decreasing the chances of stock-out situations.
There are, however, some issues that must be taken into consideration when planning to implement the described concept. In the business environment, VMI is generally implemented by companies which satisfy the following two conditions:
− The company is a strong player, big enough to force the usage of VMI in its supply chain, thereby imposing its usage on its suppliers or clients (depending on the company's profile). This leads to situations where business partners have to adjust their IT systems to meet VMI requirements.
− The enterprise has enough free resources to invest in reliably working electronic communication systems, which enable an undisturbed and constant information flow within the supply chain, which is generally essential for VMI to work.
Moreover, to make VMI work, the parties have to agree on the following terms:
For the sake of platform independence and security, we assume that the bootstrap agent is interpreted by the visited environment. In other words, the bootstrap agent is source code rather than binary code.
There are four basic functions provided by the stationary proxy agent:
− It serves as a communication channel between the user and bootstrap agent:
it knows the current location of the bootstrap agent and can influence the
path of its movement, as well as tasks performed by the bootstrap agent.
− It encompasses all variants of the agent’s code that model the variability of
agent behavior. Depending on what environment is currently visited by the
bootstrap agent, and according to the bootstrap agent’s demands, the proxy
agent sends a relevant variant of agent code directly to the bootstrap agent,
thus enriching it with the skills required in this new environment and, as a
consequence, enabling it to continue the global agent mission. Notice that code transmission redundancy is avoided, since the unnecessary code (the non-relevant agent variants) remains with the stationary component.
− It assembles data items that are sent to it directly by the bootstrap agent (extended by a proper agent variant) which are not useful for the mobile code but could be of interest to the user. The assembled data is stored in a so-called knowledge repository.
− Whenever required, the proxy agent responds to the user, who can ask about mission results and the data already collected (e.g. the percentage of the data initially required). If the user is satisfied with the amount of data already available, the proxy agent presents the data to the user and finishes its execution (together with the mobile component).
To summarize the above discussion, one can easily determine the behavior and functions of the moving component (code). When it migrates to the next network node, only the bootstrap agent is transmitted, while the agent variant is simply removed from the previous node. There is no need to carry it together with the bootstrap agent, since it is highly probable that the new environment requires a different agent variant. When migration ends, the bootstrap agent checks the environment's specificity and sends a request for the transmission of a corresponding agent variant directly to the proxy agent. When the code is complete, the mobile agent component restarts its execution.
During the inspection of consecutive network nodes, only the information that enriches the intelligence of the moving agent is integrated with the agent bootstrap (thus it can grow slightly over time). Pieces of information that do not increase the agent's intelligence are sent by the mobile component of the agent directly to the stationary part, in order to be stored in the data repository managed by it. This agent feature, namely getting rid of unnecessary data, is called self-slimming.
Now we focus on the possibilities of the MDV agent's versioning, i.e. on the content of the proxy agent, which is always ready to select a relevant piece of agent code according to the demand of the bootstrap agent. We distinguish three orthogonal dimensions of agent versioning:
− agent segmentation,
− environmental versioning,
− platform versioning.
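These three dimensions can be pictured as a code repository on the proxy-agent side, keyed by segment, environment and platform. The Python sketch below is only an illustration of that idea; the class and method names are assumptions, not the paper's API:

```python
class ProxyAgent:
    """Stationary part of an MDV agent: it keeps all code variants and the
    knowledge repository, and ships only the variant the bootstrap agent asks for."""

    def __init__(self):
        self.variants = {}    # (segment, environment, platform) -> code variant
        self.knowledge = []   # data items sent back by the mobile part (self-slimming)

    def register(self, segment, environment, platform, code):
        self.variants[(segment, environment, platform)] = code

    def request_variant(self, segment, environment, platform):
        # Called by the bootstrap agent after inspecting the visited node.
        return self.variants.get((segment, environment, platform))

    def store(self, data):
        # Self-slimming: the mobile part off-loads data it no longer needs.
        self.knowledge.append(data)

proxy = ProxyAgent()
proxy.register("offer_browsing", "e-marketplace", "win7-x86-64", "<code variant>")
print(proxy.request_variant("offer_browsing", "e-marketplace", "win7-x86-64"))
```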
Now let us illustrate the behavior of an MDV agent by an example which shows a typical step-by-step scenario of the migration of code and data over the network.
1. The user creates an MDV agent on his/her computer using a relevant application, defines its mission and the environments to be visited, assigns his/her certificate, and finally disconnects or shuts down the application.
2. The MDV agent starts its execution on the origin computer. It is composed
of the bootstrap agent, the proxy agent and the knowledge repository.
3. The bootstrap agent migrates through the network to the first environment
under investigation.
4. The bootstrap agent is interpreted by the visited environment; it checks its specificity and platform details. Assume that it is an e-marketplace implemented as a web site, running on a computer equipped with an Intel Core i7 processor managed by the MS Windows 7 operating system; the environment allows for the execution of binary agents.
5. The bootstrap agent contacts the proxy agent through the network, asking for the first agent segment to be sent, in a version relevant to the e-marketplace being visited and in a variant matching the hardware and software parameters already recognized.
6. The first segment of the agent migrates to the visited environment and starts its execution – let us say offer browsing. The data collected by the segment which are not required in further execution, e.g. offers that could be interesting in the future, are sent back directly to the knowledge repository, thus self-slimming the agent. All offers that should be negotiated are marked by the agent segment, which then finishes its execution (its code is deleted).
7. The bootstrap agent contacts the proxy agent again, asking for the second agent segment to be sent, for example a segment responsible for negotiation.
8. The second segment arrives and starts its execution. For the sake of
simplicity we assume that negotiation fails with all respective agents related
to the offers marked by the previous segment.
9. The bootstrap agent is informed about negotiation results and the second
segment is deleted.
10. The bootstrap agent migrates through the network to the second environment
for further investigation. Steps 4 – 10 are repeated in this environment.
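The scenario above can be condensed into a single bootstrap-agent loop. The sketch below reuses the illustrative ProxyAgent from the previous section and is again only a sketch; the environment methods (inspect_platform, execute) and attributes are hypothetical names introduced for the example:

```python
def bootstrap_agent(proxy, itinerary, mission_segments):
    """Visit each environment, pull only the code variants needed there,
    run them, self-slim, and move on (cf. steps 3-10 of the scenario)."""
    for environment in itinerary:                         # steps 3 and 10: migration
        platform = environment.inspect_platform()         # step 4: check specificity
        for segment in mission_segments:                   # steps 5-9
            code = proxy.request_variant(segment, environment.kind, platform)
            if code is None:
                continue                                   # no variant for this node
            result = environment.execute(code)
            proxy.store(result)                            # self-slimming: off-load data
            # the executed variant is discarded when this iteration ends
```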
4 Implementation Model
the existence of software representatives (agents) of the client, with whom it can establish a negotiation process. There are two possible outcomes of the test carried out – the agent either finds such an agent (as illustrated in lines 11–27 of the pseudo-code) or it does not (lines 29–47).
In the first case the negotiation process can start; therefore the bootstrap agent requests from its server an appropriate part of the agent code responsible for negotiations. It then asks the client's agent for the current inventory levels and matches them with the data it possesses about the client (the required inventory levels, etc.). If the inventory levels are optimal, the bootstrap agent deletes its downloaded parts and migrates to another location (lines 24–27). Otherwise it requests another segment, this time responsible for external factor analysis (for example, it could check weather forecasts for the next days and calculate the impact of this factor on the whole re-order process). At this moment the agent is ready to begin its negotiations with the client, taking into account various factors such as transportation delays resulting from bad weather conditions. The vendor agent offers a re-order based on previously defined conditions corrected by the external factors. If the client's agent agrees to the suggested terms, an order is sent to the vendor's headquarters. Otherwise negotiations continue (the while loop in lines 16–22) until the client's agent accepts the offer or totally rejects the conditions (in which case both parties – the vendor and the client – are informed about the negotiation failure). In case of a total rejection, the vendor agent performs its slimming procedure and migrates to another client.
In the second case, where the agent was not able to find any other agent to negotiate with, it searches for ready-made data which it can analyze. This data can come in various forms; it can be, for example, an XML file available to the agent. If no data is provided, the agent informs both parties about the analysis failure and migrates. If the agent finds an appropriate piece of data, it requests the subsequent agent parts responsible for data analysis. It then checks the inventory levels of its client and matches them against the previously defined conditions. If a re-order is needed, the agent checks whether it has permission to make a decision on its own, or whether it needs human approval from either party (the vendor or the client). If the agent possesses all the necessary permissions, it decides how much and when to order; otherwise it prepares a ready order and sends it for approval to its supervisor (the party responsible for accepting or rejecting the offer).
To clarify the described procedure, it is presented below in the form of pseudo-code representing the overall algorithm according to which the vendor agent takes its actions.
1 BEGIN
2 DEFINE FUNCTION place_order_procedure()
3 conditions = check_prerequisites()
4 #check weather, transport availability etc.
5 place_order(conditions)
6 delete_agent_body()
7 migrate()
9 IF(environment IS handlable)
10 request_appropriate_agent_version()
11 IF(agent_to_negotiate_exists())
12 request negotiation module for customer
13 check customer inventory levels
14 IF(inventory below customer norms)
49 ELSE
50 migrate()
51 END
52 END
The procedure illustrated above concludes the presentation of the agent-VMI model.
References
1. University of Puerto Rico, http://repositorio.upr.edu:8080/jspui/
2. 12Manage, http://www.12manage.com/methods_vendor_managed_inventory.html
3. Skjott-Larsen, T., Jespersen, B.D.: Supply Chain Management. In Theory and Practice,
Copenhagen (2005)
4. Tempelmeier, H.: Inventory management in supply networks: problems, models, solutions,
Nordestedt (2006)
5. Vendor Managed Inventory, http://vendormanagedinventory.com/
6. Diedrichs, M.: Collaborative Planning, Forecasting, and Replenishment (CPFR), Muenchen
(2009)
7. Li, L.: Supply Chain Management: Concepts, Techniques and Practices Enhancing Value
Through Collaboration, Singapore (2007)
8. Gourdin, K.: Global Logistics Management, Oxford (2006)
9. Datalliance consulting, http://www.datalliance.com/vmi.pdf
10. Waller, M., Johnson, M.E., Davis, T.: Vendor-Managed Inventory in the Retail Supply
Chain. Journal of Business Logistics 20, 183–203 (1999)
11. Enarsson, L.: Future Logistics Challenges, Copenhagen (2006)
12. NC State University, http://scm.ncsu.edu/public/lessons/less030305.html
13. FIPA, http://www.fipa.org/specs/fipaSC00001L/
14. FIPA, http://www.fipa.org/specs/fipa00061/
An Improved Data Warehouse Model
for RFID Data in Supply Chain
1 Introduction
In the past few years, automatic identification techniques have been applied in supply chains more than ever before. In fact, this technology provides the possibility of tracing, gathering and managing information about items which move through a supply chain. Radio frequency identification (RFID) has grown mostly in the automated business area.
One of the typical ways of saving information in a data warehouse is to exploit the dimensional method based on a data cube and a fact table, but this method cannot be used for RFID data in a supply chain. To make this clearer, it is assumed that we have a fact table with the following dimensions:
(EPC, location, time_in, time_out: measure)
The data cube calculates all possible groupings in the fact table by gathering the records with the same values among all possible combinations of dimensions. For instance, we can
obtain the number of goods which have been in a particular location at the same time; but the problem with this model is that it ignores the relationships among records. So it is difficult to find the number of goods of type P which were sent from distribution center L to store V. There have been some P goods everywhere, but we do not know how many of them have been sent to location V. This calls for a stronger path-driven model for data aggregation. The main challenge of this research is the optimal management of the large volume of data generated by an RFID application, which sometimes reaches several terabytes per day. Therefore, the method of saving and categorizing data is not of primary importance; in fact the most crucial and difficult issue is supporting high-level queries in a simple as well as productive way. Some of the queries require a complete scan of the imported RFID database. In order to achieve such a goal, the researchers propose a framework for managing and saving data based on a compression method which results in efficient processing of RFID data queries. The model provides the possibility of keeping a path for each item by utilizing path coding methods in an XML tree as well as the properties of prime numbers. The model not only exploits the bulky movement of items through a supply chain but also provides the possibility of data aggregation according to criteria other than their movements in the supply chain. Furthermore, a new method for incremental aggregation of data according to various combinations of RFID data queries is proposed, which analyses data based on different dimensions in addition to the path dimension. Finally, the structure of the tables in the data warehouse and the final architecture of the proposed model are provided.
Most work in the field focuses on data model definitions and compression methods which guarantee effective query processing of RFID data. This research is divided into two main categories: the first category concentrates on on-line processing of RFID data and data stream processing [4] [11], while the second group focuses on offline details and effective methods of saving data [3] [1] [5]. Most of these works have assumed that in a supply chain goods start moving in large groups at first and then, along the paths, divide into smaller groups. Based on this idea, data compression depends only on the nodes of the supply chain, and the movement of objects is usually not considered. Consequently, if any change occurs in the supply chain scenario or the movement pattern, the compression will no longer be effective and the volume of the tables will not be well reduced. Besides, in this data aggregation method, an increase in data leads to extra cost for joining tables. In addition, in those cases where various levels of granularity and complexity are required, most of these approaches show less flexibility with regard to dimensional queries. A new storage model has been suggested by Gonzalez [5] in order to provide effective support for path-oriented aggregate queries. In his model, a table called "stay" is presented, which is structured as follows:
Stay – table (GID, Loc, Start-time, End-time, Count)
The table is intended to store RFID data in an effective manner. In most RFID applications, goods usually move in large groups at the earlier stages of the supply chain. These groups divide into smaller ones later along the path. Consequently, each row of the stay table represents goods which have moved from the same location simultaneously. In this method, a GID points to smaller GIDs or to a list of EPCs. As a result of applying this method, the volume of data can be decreased considerably. But if goods do not move in large groups, the volume of the basic table does not decrease. Besides, in this model, a GID is designed to point at a list of fields which have string format. In other words, each GID includes some GIDs and EPCs which are strings. Hence, in order to join tables, string comparison is needed, which requires much more time than the numerical case. The other shortcoming of this model is that it takes longer to process queries of the first group, i.e. tracking queries: because the method keeps the location of each group of goods (GID) in the stay table, finding a movement path requires repeated joins of the stay and map tables. One of the other important problems which can be observed in most of the models presented for RFID data is their low flexibility with respect to queries which are expressed over various data dimensions and sub-dimensions. Moreover, these models are not able to change the levels of data granularity. In other words, we need a model which not only categorizes data according to the common movement of items but is also able to categorize data according to various combinations of its attributes.
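For illustration, the Gonzalez-style compression into a stay table described above can be sketched in Python as follows; the grouping key and the use of plain integer GIDs are simplifications for the example ([5] uses hierarchical GIDs that reference smaller GIDs or EPC lists):

```python
from collections import defaultdict

def build_stay_table(readings):
    """readings: (epc, loc, time_in, time_out) tuples from the RFID readers.
    EPCs that stayed at the same location over the same interval are collapsed
    into one stay record: (GID, Loc, Start-time, End-time, Count)."""
    groups = defaultdict(list)
    for epc, loc, t_in, t_out in readings:
        groups[(loc, t_in, t_out)].append(epc)
    stay, gid_members = [], {}
    for gid, ((loc, t_in, t_out), epcs) in enumerate(groups.items()):
        stay.append((gid, loc, t_in, t_out, len(epcs)))
        gid_members[gid] = epcs
    return stay, gid_members

stay, members = build_stay_table([
    ("e1", "L", 1, 5), ("e2", "L", 1, 5), ("e3", "L", 1, 5),   # moved together
    ("e1", "V", 6, 9), ("e2", "V", 6, 9), ("e3", "W", 6, 9),   # the group splits
])
print(stay)   # three stay rows instead of six raw readings
```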
4 Proposed Model
In their article [11], Wu, Lee and Hsu have applied the uniqueness of prime numbers as a means of showing the relationships between the elements of an XML tree, which in our model is utilized for coding the nodes of the path tree of RFID tags. Reference [6] provides more information regarding the application of prime numbers to the coding of the nodes of the path tree.
There are various methods for coding the path [7] [8] [9] but, compared to [11], our applied method provides a more effective way of processing RFID queries. Moreover, concerning the large amount of RFID data, we have tried to apply a method which, firstly, does not require storing a lot of information in order to retrieve the path and, secondly, makes it possible to retrieve the path by running simple queries without recurrent joins of the path table [6].
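As an illustration of the general idea of prime-number coding of a path tree following [11] (the concrete encoding used in the proposed model is detailed in [6] and may differ), each node can be assigned a distinct prime and its path code becomes the product of the primes along the path from the root; checking whether one location lies on the path of another then reduces to a divisibility test:

```python
def prime_gen():
    """Yield consecutive primes 2, 3, 5, ..."""
    n = 1
    while True:
        n += 1
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            yield n

class PathTree:
    def __init__(self):
        self._primes = prime_gen()
        self.code = {"root": 1}   # node -> product of the primes along its path

    def add(self, node, parent="root"):
        self.code[node] = self.code[parent] * next(self._primes)

    def on_path(self, ancestor, node):
        # ancestor lies on the path of node iff its code divides the node's code
        return self.code[node] % self.code[ancestor] == 0

t = PathTree()
t.add("factory")
t.add("distribution_center_L", parent="factory")
t.add("store_V", parent="distribution_center_L")
print(t.on_path("distribution_center_L", "store_V"))   # True
```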
Path information can be kept according to the path coding, but saving the related time for each location has not yet been addressed. In order to achieve that, another tree is created, called the time tree, in which each node keeps a location, the time-in and the time-out of that location. In the time tree, two nodes are considered similar when they have the same path (the same sequence of locations up to the node) as well as the same time-in and time-out. The way of building this tree is to some extent similar to the way of building the path tree; however, nodes with the same location but different time-in and time-out information are considered distinct. For efficient retrieval of time information, the values of each node are computed using the Region Numbering Scheme [12].
In order to keep the path and time information of the tags, two tables, the path-table and the time-table, are created according to the above-mentioned methods; the path-table keeps the path information and the time-table saves the information of the time tree. These tables are structured as follows:
− Path-table: (PathID, Element-Enc, Order-Enc)
− Time-table: (Start, End, Location, Start-time, End-time)
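A minimal sketch of how the time tree could be numbered with the Region Numbering Scheme [12] and flattened into time-table rows is given below; the exact field layout and the DFS-based numbering are assumptions based on the description above:

```python
def region_numbers(tree, root):
    """tree: node -> list of children. Assign (start, end) intervals by DFS so
    that a descendant's interval is contained in its ancestor's interval."""
    regions, counter = {}, [0]

    def visit(node):
        counter[0] += 1
        start = counter[0]
        for child in tree.get(node, []):
            visit(child)
        counter[0] += 1
        regions[node] = (start, counter[0])

    visit(root)
    return regions

# Each time-tree node carries (location, time-in, time-out); values are illustrative.
node_info = {"r": ("factory", "08:00", "09:30"),
             "a": ("distribution_center_L", "10:00", "12:00"),
             "b": ("store_V", "14:00", "18:00")}
tree = {"r": ["a"], "a": ["b"]}
time_table = [(s, e, *node_info[n]) for n, (s, e) in region_numbers(tree, "r").items()]
print(sorted(time_table))   # rows: (Start, End, Location, Start-time, End-time)
```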
One of the most crucial problems in most of the data models which have been proposed for RFID data is their low flexibility with respect to queries posed over different combinations of data attributes and dimensions. In other words, these models are not able to change the levels of granularity; as a result, the response time of aggregate queries escalates. In order to support this group of queries, a model is required which has the ability to compress data not only based on path or time elements, but also on each and every attribute, as needed. In order to obtain this flexibility in data aggregation, the concept of aggregation factors has been used [4]. To understand the concept better, take the location dimension for instance: data can be aggregated on the city, region and country levels. Likewise, goods can be categorized according to their brand, type and price. Imagine a set S which contains all the attributes of the input data:
S = {A1, A2, …, An}
In other words, all attributes are grouped in the set S according to various dimensions such as location, time and items. In addition, for each dimension, a relation is defined as follows:
By combining the result tables of the first part of our proposed model (path coding) and of the second part (data aggregation), the final structure can be obtained as below (Fig. 2). In this structure, which includes seven tables, all of the fields and their relations are determined.
It should be noted that the stock table is the result of applying an aggregation factor to the middle table, which is called stay and will be discussed in more detail later.
In the following step, three query categories are defined based on [6] in order to examine and compare the response times of these databases¹ for important RFID queries. Among these three categories, the first one includes tracking queries (chart 1), the second one includes path-oriented retrieval queries (charts 2 and 4) and the third one includes path-oriented aggregation queries (chart 3). The fourth chart presents location-oriented aggregation queries.
¹ Professor Roberto De Virgilio from Roma Tre University kindly provided us with 1 million RFID records which describe the movement of goods through a supply chain.
6 Conclusions
As shown in charts 1, 2 and 3, the proposed model performs considerably better than the Gonzalez dimensional model with regard to tracking queries, path-oriented retrieval queries and path-oriented aggregation queries. On the other hand, the Gonzalez dimensional model has a better performance than our proposed model for location-oriented queries.
Generally, dimensional data warehouses – of which the RFID-Cuboid model is one kind – return proper times for SQL aggregation queries. But in comparison with our proposed model, the RFID-Cuboid model performs better only in those aggregation queries which involve location conditions alone.
7 Further Suggestions
Utilizing more advanced lossless compression techniques in order to reduce the required storage is suggested. RFID data requires an efficient framework in order to answer high-level queries. We are especially interested in dynamic scenarios where the supply chain can change in terms of object transitions and topology. This requires a flexible and dynamic aggregation mechanism, and also a run-time query engine which is able to update tracking information. In this paper we proposed a data warehouse model for RFID data which may also be applicable to other types of sensor or spatial data; this aspect therefore needs to be investigated as well.
References
[1] Agrawal, R., Cheung, A., Schonauer, S.: Toward traceability across sovereign distributed
RFID databases. In: 10th International Database Engineering and Application
Symposium, IDEAS 2006, Delhi, India (2006)
[2] Bai, Y., Wang, F., Liu, P., Zaniolo, C., Liu, S.: RFID data processing with a data stream
query language. In: The 23rd International Conference on Data Engineering, ICDE 2007,
Istanbul, Turkey (2007)
An Improved Data Warehouse Model for RFID Data in Supply Chain 497
[3] Ban, C., Hong, B.-H., Kim, D.: Time Parameterized Interval R-Tree for Tracing Tags in
RFID Systems. In: Andersen, K.V., Debenham, J., Wagner, R. (eds.) DEXA 2005.
LNCS, vol. 3588, pp. 503–513. Springer, Heidelberg (2005)
[4] De Virgilio, R., Sugamiele, P., Torlone, R.: Incremental Aggregation of RFID Data.
ACM, IDEAS, Calabria, Italy (2009)
[5] Gonzalez, H., Han, J., Li, H., Klabjan, D.: Warehousing and Analyzing Massive RFID
Data Sets. In: The 22nd International Conference on Data Engineering, ICDE 2006,
Atlanta, USA (2006)
[6] Moghaddam, S.K.: An Improved Data Warehousing Model for RFID Data of Supply
Chain. Master thesis. Islamic Azad University Of Mashhad, Iran (2011)
[7] Lee, C., Chung, C.: Efficient storage scheme and query processing for supply chain
management using RFID. In: The 34th International Conference on Management of Data,
SIGMOD 2008, Vancouver, Canada (2008)
[8] Min, J., Park, M., Chung, J.: A queriable compression for XML data. In: SIGMOD (2003)
[9] Rao, P., Moon, M.: Indexing and querying XML using Prüfer sequences. In: ICDE (2004)
[10] Wang, F., Liu, P.: Temporal management of RFID data. In: VLDB 2005, pp. 1128–1139
(2005)
[11] Wu, X., Lee, M., Hsu, W.: A prime number tagging scheme for dynamic ordered XML trees. In: ICDE (2004)
[12] Zhang, C., Naughton, J., DeWitt, D., Luo, Q., Lohman, G.: On supporting containment
queries in relational database management systems. In: SIGMOD (2001)
Author Index