Type-II Fuzzy Possibilistic C-Mean Clustering
M.H. Fazel Zarandi, M. Zarinbal, I.B. Turksen
1 Department of Industrial Engineering, Amirkabir University of Technology, P.O. Box 15875-4413, Tehran, Iran
2 Department of Industrial Engineering, TOBB Economy and Technology University, Sogutozo Cad. No:43, Sogutozo,
Ankara 06560, Turkey
3 Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Ont., Canada M5S 3G8
Email: [email protected], [email protected], [email protected]
IFSA-EUSFLAT 2009, ISBN: 978-989-95079-6-8

Abstract— Fuzzy clustering is well known as a robust and efficient way to reduce computation cost and obtain better results. Many robust fuzzy clustering models have been presented in the literature, such as Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM); these are Type-I fuzzy clustering methods. Type-II fuzzy sets, on the other hand, can provide better performance than Type-I fuzzy sets, especially when many uncertainties are present in real data. The focus of this paper is to design a new Type-II fuzzy clustering method based on the Krishnapuram and Keller PCM. The proposed method can cluster Type-II fuzzy data and can obtain a better number of clusters (c) and degree of fuzziness (m) by using a Type-II Kwon validity index. In the proposed method, two kinds of distance measures, Euclidean and Mahalanobis, are examined. The results show that the proposed model, which uses the Mahalanobis distance based on the Gustafson and Kessel approach, is more accurate and can efficiently handle uncertainties.

Keywords— Type-II Fuzzy Logic; Possibilistic C-Mean (PCM); Mahalanobis Distance; Cluster Validity Index

1. Introduction

Clustering is an important method in data mining, decision-making, image segmentation, pattern classification, etc. Fuzzy clustering can obtain not only the belonging status of objects but also how much the objects belong to the clusters. In the last 30 years, many fuzzy clustering models for crisp data have been presented, such as Fuzzy K-Means and Fuzzy C-Means (FCM) [1]. FCM is a popular clustering method, but its memberships do not always correspond well to the degrees of belonging, and it may be inaccurate in a noisy environment [2]. To relieve these weaknesses, Krishnapuram and Keller presented a Possibilistic C-Means (PCM) approach [3]. In addition, real data contain many uncertainties and vaguenesses that Type-I fuzzy sets are not able to model directly, as their membership functions are crisp. Type-II membership functions, on the other hand, are themselves fuzzy, so they are able to model uncertainties more appropriately [2, 4]. Therefore, Type-II fuzzy logic systems have the potential to provide better performance than Type-I [5].

By combining Type-II fuzzy logic with clustering methods, data can be clustered more appropriately and more accurately. The focus of this paper is to design a new Type-II fuzzy clustering method based on the Krishnapuram and Keller PCM. The proposed method can cluster Type-II fuzzy data and can obtain a better number of clusters (c) and degree of fuzziness (m) by using a Type-II Kwon validity index.

The rest of this paper is organized as follows: clustering methods are reviewed in Section 2. Section 3 presents a historical review of Type-II fuzzy logic. Section 4 is dedicated to the proposed method, and Section 5 presents the experimental results. Finally, conclusions are presented in Section 6.

2. Clustering Methods

The general philosophy of clustering is to divide the initial set into homogeneous groups [6] and to reduce the data [1]. Clustering is useful in exploratory decision-making, machine learning, data mining, image segmentation, and pattern classification [7]. In the literature, most clustering methods can be classified into two types: crisp clustering and fuzzy clustering. Crisp clustering assigns each data point to a single cluster and ignores the possibility that it may also belong to other clusters [8]. However, as the boundaries between clusters cannot be defined precisely, some of the data could belong to more than one cluster with different positive degrees of membership [6]. Fuzzy clustering considers each cluster as a fuzzy set, and the membership function measures the degree of belonging of each feature to a cluster; thus, each feature may be assigned to multiple clusters with some degree of belonging [8]. Two important applied models of fuzzy clustering, Fuzzy C-Means and Possibilistic C-Means, are described as follows.

Fuzzy C-Means (FCM): The Fuzzy C-Means clustering model can be defined as follows [9]:

\min \Big\{ J(x,\mu,c) = \sum_{i=1}^{c} \sum_{j=1}^{N} \mu_{ij}^{m} d_{ij}^{2} \Big\}   (1)

subject to:

0 < \sum_{j=1}^{N} \mu_{ij} < N, \quad i = 1, 2, \dots, c   (2)

\sum_{i=1}^{c} \mu_{ij} = 1, \quad j = 1, 2, \dots, N   (3)

where μ_ij is the degree of belonging of the jth data point to the ith cluster, d_ij is the distance between the jth data point and the ith cluster center, m is the degree of fuzziness, c is the number of clusters, and N is the number of data.

Although FCM is a very good clustering method, it has some disadvantages: the obtained solution may not be a desirable one, and FCM performance might be inadequate, especially when the data set is contaminated by noise. In addition, the membership values express probabilities of sharing [10] and depend not only on the distance of a point to its own cluster center but also on the distances from the other cluster centers [11]. Moreover, when the norm used in FCM differs from the Euclidean norm, it is necessary to introduce restrictions; e.g., Gustafson and Kessel [12] and Windham limit the volume of the groups using fuzzy covariance and scatter matrices, respectively [13].

Possibilistic C-Means (PCM): To improve on the FCM weaknesses, Krishnapuram and Keller created a possibilistic approach, which uses a possibilistic type of membership function to describe the degree of belonging. It is desirable that the memberships of representative feature points be as high as possible, while unrepresentative points have low membership. The objective function that satisfies these requirements is formulated as follows [3]:

\min \Big\{ J_m(x,\mu,c) = \sum_{i=1}^{c} \sum_{j=1}^{N} \mu_{ij}^{m} d_{ij}^{2} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{N} (1 - \mu_{ij})^{m} \Big\}   (4)

where d_ij is the distance between the jth data point and the ith cluster center, μ_ij is the degree of belonging of the jth data point to the ith cluster, m is the degree of fuzziness, η_i is a suitable positive number, c is the number of clusters, and N is the number of data. μ_ij can be obtained using (5) [3]:

\mu_{ij} = \frac{1}{1 + \big( d_{ij}^{2}/\eta_i \big)^{1/(m-1)}}   (5)

The value of η_i determines the distance at which the membership value of a point in a cluster becomes 0.5. In practice, (6) is used to obtain the η_i values. The value of η_i can be kept fixed or updated in each iteration as μ_ij and d_ij change, but care must be exercised, since updating may lead to instabilities [3]:

\eta_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{m} d_{ij}^{2}}{\sum_{j=1}^{N} \mu_{ij}^{m}}   (6)

PCM is more robust in the presence of noise, in finding valid clusters, and in giving a robust estimate of the centers [14].

Updating the membership values depends on the distance measure [11]. The Euclidean and Mahalanobis distances are two common choices. The Euclidean distance works well when a data set is compact or isolated [7], while the Mahalanobis distance takes the correlation in the data into account by using the inverse of the variance-covariance matrix of the data set. It can be defined as follows [15]:

D = \sum_{i,j=1}^{p} A_{ij} (x_i - y_i)(x_j - y_j)   (7)

A_{ij} = \rho_{ij} \, \sigma_i \sigma_j   (8)

where x_i and y_i are the mean values of two different sets of parameters, X and Y, σ_i² are the respective variances, and ρ_ij is the coefficient of correlation between the ith and jth variates. Gustafson and Kessel proposed a new approach based on the Mahalanobis distance, which enables the detection of ellipsoidal clusters; their approach focuses on the case where the matrix A is different for each cluster [12].

Satisfying the underlying assumptions, such as cluster shape and number, is another important issue in clustering methods; it can be addressed with validation indices. Xie and Beni's (XB) index and the Kwon index are two common validity indices [1]. Xie and Beni defined a cluster validity index, (9), which quantifies the ratio of the total variation within clusters to the separation of the clusters [1]:

XB(c) = \frac{\sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^{m} \| x_j - v_i \|^{2}}{N \min_{i \neq j} \| v_i - v_j \|^{2}}   (9)

where μ_ij is the degree of belonging of the jth data point to the ith cluster, v_i is the center of the ith cluster, m is the degree of fuzziness, c is the number of clusters, and N is the number of data. The optimal number of clusters should minimize the value of the index [1]. In practice, however, XB → 0 as c → N, so in that limit the index usually does not identify appropriate clusters. The index V_k(c), (10), was proposed by Kwon as an improvement of the XB index [16]:

V_k(c) = \frac{\sum_{i=1}^{c} \sum_{j=1}^{N} (\mu_{ij})^{m} \| x_j - v_i \|^{2} + \frac{1}{c} \sum_{i=1}^{c} \| v_i - \bar{v} \|^{2}}{\min_{i \neq j} \| v_i - v_j \|^{2}}   (10)

where μ_ij is the degree of belonging of the jth data point to the ith cluster, v_i is the center of the ith cluster, v̄ is the mean of the cluster centers, m is the degree of fuzziness, c is the number of clusters, and N is the number of data. The smaller V_k, the better the clustering performance [16].

All of the clustering methods and validation indices mentioned above are based on Type-I fuzzy sets. In the real world, however, there exist many uncertainties that Type-I fuzzy sets cannot model. Type-II fuzzy sets, on the other hand, are able to model these uncertainties successfully [4].

3. Type-II Fuzzy Clustering

The concept of a Type-II fuzzy set was introduced by Zadeh as an extension of the Type-I fuzzy set [17]. A Type-II fuzzy set is characterized by a fuzzy membership function, i.e., the membership grade of each element of the set is itself a fuzzy set in the interval [0,1]. Such sets can be used in situations where there is uncertainty about the membership values [18]. Type-II fuzzy logic has been applied in many clustering methods, e.g., [19, 20, 21, 22, 23, 24, 25, 26, 27, 28]. There are essentially two kinds of Type-II fuzziness: Interval-Valued Type-II and generalized Type-II. An Interval-Valued Type-II fuzzy set is a special Type-II fuzzy set in which the upper and lower bounds of membership are crisp and the spread of the membership distribution is ignored, under the assumption that the membership values between the upper and lower bounds are
uniformly distributed or scattered, with a membership value of 1 on the μ(μ(x)) axis (Figure 1.a). Generalized Type-II fuzziness identifies the upper and lower membership values as well as the spread of membership values between these bounds, either probabilistically or fuzzily; that is, there is a probabilistic or possibilistic distribution of membership values between the upper and lower bounds on the μ(μ(x)) axis (Figure 1.b) [29].

Figure 1: (a) Interval-Valued Type-II; (b) Generalized Type-II

4. Proposed Type-II PCM Method

Considering the growing application areas of Type-II fuzzy logic, designing a Type-II clustering method is essential. Several researchers have designed Type-II fuzzy clustering methods based on FCM, but FCM itself has weaknesses that make some of the developed methods ineffective when the data set is contaminated by noise, when the norm used differs from the Euclidean, or when the pixels of the input data are highly correlated. PCM can overcome these weaknesses.

The proposed method is an extension of the Krishnapuram and Keller Possibilistic C-Means (PCM). Here, the membership functions are Type-II fuzzy, the distance is taken to be Euclidean or Mahalanobis, and a Type-II Kwon validity index is used to find the optimal degree of fuzziness (m) and number of clusters (c). The proposed Type-II PCM model is as follows:

\min \Big\{ J_m(x,\mu,c) = \sum_{i=1}^{c} \sum_{j=1}^{N} \mu_{ij}^{m} D_{ij} + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{N} (1 - \mu_{ij})^{m} \Big\}   (11)

subject to:

0 < \sum_{j=1}^{N} \mu_{ij} < N   (12)

\mu_{ij} \in [0,1], \quad \forall i, j   (13)

\max_i \mu_{ij} > 0, \quad \forall j   (14)

where μ_ij is the Type-II membership of the jth data point in the ith cluster, D_ij is the Mahalanobis distance of the jth data point to the ith cluster center, η_i is a positive number, c is the number of clusters, and N is the number of input data. The first term makes the distances to the cluster centers as small as possible, while the second term makes the membership values in a cluster as large as possible. The membership values of the data in each cluster must lie in the interval [0,1], and their sum is restricted to be smaller than the number of input data, as shown in (12), (13), and (14).

Minimizing J_m(x,μ,c) with respect to μ_ij is equivalent to minimizing the individual objective function defined in (15) with respect to μ_ij (provided that the resulting solution lies in the interval [0,1]):

J_{m,ij}(x,\mu,c) = \mu_{ij}^{m} D_{ij} + \eta_i (1 - \mu_{ij})^{m}   (15)

Differentiating (15) with respect to μ_ij and setting the derivative to 0 leads to (16), which satisfies (12), (13), and (14):

\mu_{ij} = \frac{1}{1 + \big( D_{ij}/\eta_i \big)^{1/(m-1)}}, \quad i = 1, \dots, c   (16)

μ_ij is updated in each iteration and depends on D_ij and η_i. As mentioned in [3], the value of η_i determines the distance at which the membership value of a point in a cluster becomes 0.5. In general, it is desirable that η_i relate to the ith cluster and be of the order of D_ij [3]:

\eta_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{m} D_{ij}}{\sum_{j=1}^{N} \mu_{ij}^{m}}, \quad i = 1, \dots, c   (17)

where D_ij is the distance measure, and the number of clusters (c) and the degree of fuzziness (m) are unknown. Since the parameter η_i is independent of the relative locations of the clusters, the membership value μ_ij depends only on the distance of a point to the cluster center. Hence, the membership of a point in a cluster is determined solely by how far the point is from the center and is not coupled with its location with respect to the other clusters [11].

The clustering method needs a validation index to determine the number of clusters (c) and the degree of fuzziness (m) used in (15), (16), and (17). Therefore, a Type-II Kwon index based on the Kwon index is designed, represented by (18):

\tilde{V}_k(c) = \frac{\sum_{i=1}^{c} \sum_{j=1}^{N} \mu_{ij}^{m} \| x_j - v_i \|^{2} + \frac{1}{c} \sum_{i=1}^{c} \| v_i - \bar{v} \|^{2}}{\min_{i \neq j} \| v_i - v_j \|^{2}}   (18)

where μ_ij is the Type-II possibilistic membership value of the jth data point in the ith cluster, v_i is the ith cluster center, v̄ is the mean of the centers, N is the number of input data, c is the number of clusters, and m is the degree of fuzziness. The first term in the numerator measures compactness as the sum of squared distances within clusters, and the second term measures the separation between clusters, while the denominator measures the minimum separation between clusters; thus, the smaller \tilde{V}_k(c), the better the performance.

In sum, the steps of the proposed clustering method are described below and shown in Figure 2.

Step 1: Define the initial parameters, including:
- the maximum number of iterations of the method (R)
- the number of clusters (c = 2 is the initial value)
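The core update loop behind these steps, the Krishnapuram and Keller PCM iteration of (16) and (17), can be sketched in code. This is a minimal Type-I-style sketch under simplifying assumptions: squared Euclidean distances stand in for the Mahalanobis D_ij of the proposed method, and the function name `pcm` and the farthest-point seeding are illustrative choices, not from the paper.

```python
import numpy as np

def pcm(X, c, m=2.0, iters=50):
    """Krishnapuram-Keller PCM sketch: typicality update (16) and
    eta update (17), here with squared Euclidean distances."""
    X = np.asarray(X, dtype=float)
    # deterministic farthest-point seeding of the c centers (illustrative choice)
    V = [X[0]]
    for _ in range(c - 1):
        d = np.min(((X[:, None, :] - np.asarray(V)[None, :, :]) ** 2).sum(-1), axis=1)
        V.append(X[int(np.argmax(d))])
    V = np.asarray(V)
    D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # D2[i, j] = ||x_j - v_i||^2
    eta = np.maximum(D2.mean(axis=1), 1e-12)                 # crude initial eta_i
    for _ in range(iters):
        U = 1.0 / (1.0 + (D2 / eta[:, None]) ** (1.0 / (m - 1.0)))  # eq. (16)
        Um = U ** m
        eta = (Um * D2).sum(axis=1) / (Um.sum(axis=1) + 1e-12)      # eq. (17)
        eta = np.maximum(eta, 1e-12)                                # guard collapse
        V = (Um @ X) / (Um.sum(axis=1, keepdims=True) + 1e-12)      # center update
        D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    U = 1.0 / (1.0 + (D2 / eta[:, None]) ** (1.0 / (m - 1.0)))
    return U, V
```

With m and c supplied externally, a grid of candidate (m, c) pairs can then be scored by the validity index the paper describes.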
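Where D_ij is a Mahalanobis-type distance, a per-cluster computation in the Gustafson and Kessel spirit of (7), (8), and [12] can be sketched as follows: each cluster gets its own fuzzy covariance matrix, scaled so the induced norm has unit volume. The name `gk_distances`, the regularization term, and the unit-volume choice are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gk_distances(X, V, U, m=2.0, reg=1e-6):
    """Per-cluster Mahalanobis-type squared distances in the Gustafson-Kessel
    style: a fuzzy covariance per cluster, volume-normalized so det(A_i) = 1."""
    X, V, U = np.asarray(X, float), np.asarray(V, float), np.asarray(U, float)
    c, p = V.shape
    D2 = np.empty((c, X.shape[0]))
    Um = U ** m
    for i in range(c):
        diff = X - V[i]                                   # residuals to center i
        F = (Um[i, :, None, None] * (diff[:, :, None] * diff[:, None, :])).sum(axis=0)
        F = F / (Um[i].sum() + 1e-12) + reg * np.eye(p)   # regularized fuzzy covariance
        A = np.linalg.det(F) ** (1.0 / p) * np.linalg.inv(F)  # unit-volume norm matrix
        D2[i] = np.einsum('np,pq,nq->n', diff, A, diff)   # (x_j - v_i)^T A_i (x_j - v_i)
    return D2
```

Feeding these squared distances into the membership update in place of squared Euclidean distances is what allows ellipsoidal clusters to be captured.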
… clear for the Type-II PCM case. Therefore, in Type-I PCM, under all conditions the optimum (m, c) pairs are (1.5, 2), (1.5, 3), (1.5, 4), (1.5, 5), which may not be good results. However, in Type-II PCM the optimum (m, c) pairs are (2.9, 2), (3.7, 3), (3.3, 4), (2.5, 5), which seem to be good results. Figures 3 and 4 show the Kwon index results for c = 2 measured by the Mahalanobis and Euclidean distances.

Figure 3: Kwon index results for c = 2 and the Mahalanobis distance (based on Tables 3 and 4). [Plot: Kwon index vs. m from 1.5 to 5.1; series: Type I, Type II]

Figure 4: Kwon index results for c = 2 and the Euclidean distance (based on Tables 3 and 4). [Plot: Kwon index vs. m from 1.5 to 3.3; series: Type I, Type II]

- For the same type of fuzzy logic (Type-I or Type-II), the two distance functions did not show many differences in the Kwon index results for m < 3.1. However, for m > 3.1, the Kwon index can be calculated with the Mahalanobis distance but is not defined for the Euclidean distance, as shown in Figure 5.
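The Kwon-style index of (10) and (18), which drives the selection of (m, c) in these experiments, can be computed directly from its definition. A sketch follows, using crisp Type-I memberships and illustrative names; it is not the paper's implementation.

```python
import numpy as np

def kwon_index(X, V, U, m=2.0):
    """Kwon-style validity index, eq. (10): within-cluster compactness plus a
    center-dispersion term, divided by the minimum squared center separation."""
    X, V, U = np.asarray(X, float), np.asarray(V, float), np.asarray(U, float)
    c = V.shape[0]
    D2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)    # (c, N) squared distances
    compactness = ((U ** m) * D2).sum()
    dispersion = ((V - V.mean(axis=0)) ** 2).sum() / c
    separation = min(((V[i] - V[j]) ** 2).sum()
                     for i in range(c) for j in range(c) if i != j)
    return (compactness + dispersion) / separation
```

Scanning candidate (m, c) pairs and keeping the minimizer of this index mirrors how the tabulated results are summarized in the figures.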
Figure 5: Kwon index results for Type-II fuzzy clustering with c = 3. [Plot: Kwon index; series include Euclidean]

6. Conclusions

This paper has presented a Type-II Possibilistic C-Means (PCM) method for clustering purposes. The results of the proposed method are compared with those of Type-I PCM, using an image as the input data and two kinds of distance functions, Euclidean and Mahalanobis. The results show that Type-II PCM using the Mahalanobis distance provides better values for the degree of fuzziness and the number of clusters, both of which are used in calculating the membership functions. Therefore, the proposed clustering method is more accurate, provides better performance, and can efficiently handle the uncertainties that exist in the data.

References

[1] J.V. Oliveira and W. Pedrycz, Advances in Fuzzy Clustering and Its Applications. John Wiley & Sons Ltd., 2007.
[2] J.M. Mendel and R. John, Type-2 Fuzzy Sets Made Simple. IEEE Transactions on Fuzzy Systems, 10(2):117-127, 2002.
[3] R. Krishnapuram and J.M. Keller, A Possibilistic Approach to Clustering. IEEE Transactions on Fuzzy Systems, 1(2):98-110, 1993.
[4] R. Seising (Ed.), Views on Fuzzy Sets and Systems from Different Perspectives. Springer-Verlag, 2009.
[5] J.M. Mendel et al., Interval Type-2 Fuzzy Logic Systems Made Simple. IEEE Transactions on Fuzzy Systems, 14(6):808-821, 2006.
[6] E. Nasibov and G. Ulutagay, A New Unsupervised Approach for Fuzzy Clustering. Fuzzy Sets and Systems, Article in Press.
[7] A.K. Jain et al., Data Clustering: A Review. ACM Computing Surveys, 31(3):264-323, 1999.
[8] M. Menard and M. Eboueya, Extreme Physical Information and Objective Function in Fuzzy Clustering. Fuzzy Sets and Systems, 128(3):285-303, 2002.
[9] I.B. Turksen, An Ontological and Epistemological Perspective of Fuzzy Set Theory. Elsevier Inc., 2006.
[10] H. Frigui and R. Krishnapuram, A Robust Competitive Clustering Algorithm with Applications in Computer Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5):450-465, 1999.
[11] K.P. Detroja et al., A Possibilistic Clustering Approach to Novel Fault Detection and Isolation. Journal of Process Control, 16(10):1055-1073, 2006.
[12] D.E. Gustafson and W.C. Kessel, Fuzzy Clustering with a Fuzzy Covariance Matrix. Proc. IEEE CDC, San Diego, CA, 761-766, 1979.
[13] A. Flores-Sintas et al., A Local Geometrical Properties Application to Fuzzy Clustering. Fuzzy Sets and Systems, 100(3):245-256, 1998.
[14] O. Nasraoui and R. Krishnapuram, Crisp Interpretations of Fuzzy and Possibilistic Clustering Algorithms. In Proceedings of the 3rd European Congress on Intelligent Techniques and Soft Computing, 1312-1318, 1995.
[15] P.C. Mahalanobis, On the generalized distance in statistics.
[20] B.I. Choi and F.C. Rhee, Interval Type-2 Fuzzy Membership Function Generation Methods for Pattern Recognition. Information Sciences, Article in Press, 2008.
[21] C. Hwang and F.C. Rhee, An Interval Type-2 Fuzzy C-Spherical Shells Algorithm. IEEE International Conference on Fuzzy Systems, 2:1117-1122, 2004.
[22] C. Hwang and F.C. Rhee, Uncertain Fuzzy Clustering: Interval Type-2 Fuzzy Approach to C-Means. IEEE Transactions on Fuzzy Systems, 15(1):107-120, 2007.
[23] D.C. Lin and M.S. Yang, A Similarity Measure between Type-2 Fuzzy Sets with its Application to Clustering. 4th International Conference on Fuzzy Systems and Knowledge Discovery, 1:726-731, 2007.
[24] F.C. Rhee, Uncertain Fuzzy Clustering: Insights and Recommendations. IEEE Computational Intelligence Magazine, 2(1):44-56, 2007.
[25] F.C. Rhee and C. Hwang, A Type-2 Fuzzy C-Means Clustering Algorithm. Annual Conference of the North American Fuzzy Information Processing Society, 4:1926-1929, 2001.
[26] O. Uncu and I.B. Turksen, Discrete Interval Type-2 Fuzzy System Models Using Uncertainty in Learning Parameters. IEEE Transactions on Fuzzy Systems, 15(1):90-106, 2006.
[27] W.B. Zhang et al., Rules Extraction of Interval Type-2 Fuzzy Logic System Based on Fuzzy C-Means Clustering. 4th International Conference on Fuzzy Systems and Knowledge Discovery, 2:256-260, 2007.
[28] W.B. Zhang and W.J. Liu, IFCM: Fuzzy Clustering for Rule Extraction of Interval Type-2 Fuzzy Logic System. Proceedings of the 46th IEEE Conference on Decision and Control, 5318-5322, 2007.
[29] I.B. Turksen, Type 2 Representation and Reasoning for CWW. Fuzzy Sets and Systems, 127(1):17-36, 2002.