Fuzzy C-Means Clustering
Course Project Presentation
Mahdi Amiri
June 2003
Sharif University of Technology
Presentation Outline
Motivation and Goals
Fuzzy C-Means Clustering (FCM)
Possibilistic C-Means Clustering (PCM)
Fuzzy-Possibilistic C-Means (FPCM)
Comparison of FCM, PCM and FPCM
Conclusions and Future Works
Page 1 of 30 Fuzzy C-Means Clustering
Motivation and Goals
Sample Applications
Image segmentation
– Medical imaging
• X-ray Computer Tomography (CT)
• Magnetic Resonance Imaging (MRI)
• Position Emission Tomography (PET)
Image and speech enhancement
Edge detection
Video shot change detection
Page 2 of 30 Fuzzy C-Means Clustering
Motivation and Goals
Pattern Recognition
Definition: Search for structure in data
Elements of Numerical Pattern Recognition
– Process Description
• Feature Nomination, Test Data, Design Data
– Feature Analysis
• Preprocessing, Extraction, Selection, …
– Cluster Analysis
• Labeling, Validity, …
We are here
– Classifier Design
• Classification, Estimation, Prediction, Control, …
Page 3 of 30 Fuzzy C-Means Clustering
Motivation and Goals
Fuzzy Clustering
Useful in Fuzzy Modeling
– Identification of the fuzzy rules needed to
describe a “black box” system, on the basis
of observed vectors of inputs and outputs
History
– FCM: Bezdek, 1981
– PCM: Krishnapuram - Keller, 1993
– FPCM: N. Pal - K. Pal - Bezdek, 1997
Prof. Bezdek
Page 4 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Input, Output
Input: Unlabeled data set X x1 , x 2 ,K , x n
n is the number of data point in X
x k p p is the number of features in each vector
Main Output
A c-partition of X, which is c n matrix U
Common Additional Output
Set of vectors V v1 , v 2 ,K , vc p
v i is called “cluster center”
Page 5 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Sample Illustration
X
n 188
p2
Rows of U
(Membership Functions)
U and V
c4
Page 6 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
(FCM), Objective Function
Optimization of an “objective function” or
“performance index”
c n
2
min J m (U, V) uik Dik
m
( U ,V )
i 1 k 1
c
Constraint u
i 1
ik 1 ,k
A-norm
2
Distance D xk vi
2
ik A
x A
x, x A
xT Ax
Degree of
Fuzzification
m 1
Page 7 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Minimizing Objective Function
Zeroing the gradient of J mwith respect to V
1
c 2
Dik m 1
uik , i, k : Ut F (Vt 1 )
j 1 D jk
Zeroing the gradient of J mwith respect to U
n m n
v i uik x k u , i : Vt G ( U t 1 )
m
ik
k 1 k 1 Note: It is the Center of Gravity
Page 8 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Pick
Initial Choices
– Number of clusters 1 c n
– Maximum number of iterations (Typ.: 100) T
– Weighting exponent (Fuzziness degree) m
• m=1: crisp
• m=2: Typical
– Termination measure Et Vt Vt 1 1-norm
– Termination threshold (Typ. 0.01) 0
Page 9 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Guess, Iterate
Guess Initial Cluster Centers V0 ( v1,0 K , v c ,0 )
cp
Alternating Optimization (AO)
– t 0
– REPEAT
– t t 1
– Ut F (Vt 1 )
– Vt G ( U t 1 )
– UNTIL ( t T or Vt Vt 1 )
– (U, V) (Ut , Vt )
Page 10 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Sample Termination Measure Plot
Termination Measure Values m 2.0
Final
Membership Degrees
Page 11 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Implementation Notes
Process could be shifted one half cycle
– Initialization is done on U0
– Iterates become Ut 1 Vt Ut
– Termination criterion U t U t 1
The convergence theory is the same in either case
Initializing and terminating on V is advantageous
– Convenience
– Speed
– Storage
Page 12 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Pros and Cons
Advantages
– Unsupervised
– Always converges
Disadvantages
– Long computational time
– Sensitivity to the initial guess (speed, local minima)
– Sensitivity to noise
• One expects low (or even no) membership degree
for outliers (noisy points)
Page 13 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Optimal Number of Clusters
Performance Index
c n
min P(c) uikm ( x k v i
2 2
vi x )
(c)
i 1 k 1
1 n
Average of all feature vectors x
n k 1
xk
c n c n
2
ik k i
2 m
u m
( x v ) uik ( v i x )
i 1 k 1 i 1 k 1
Sum of the Sum of the
within fuzzy cluster fluctuations between fuzzy cluster fluctuations
(small value for optimal c) (big value for optimal c)
Page 14 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Optimal Cluster No. (Example)
Performance index for optimal clusters c=2 c=3
(is minimum for c = 4)
c=4 c=5
Page 15 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Outliers, Disadvantage of FCM
FCM on FCM on u1,6 0.5 u2,6 0.5
X 11 u1,6 0.5 u2,6 0.5 X 12 u1,12 0.5 u2,12 0.5
x12
x6
x6
x12 is an outlier but has the same membership
degrees as x 6
Page 16 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
(PCM), Objective Function
Objective function
c n c n
m
min Pm (T, V; w ) tik Dik w i (1 tik )
m 2
( T,V )
i 1 k 1 i 1 k 1
Typicality or Possibility
c
tik
– No constraint like u
i 1
ik 1 ,k
Cluster weights
w ( w1 , w2 ,K , wc )T wi
Page 17 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Terms of Objective Function
Unconstrained optimization of first term will
lead to the trivial solution tik 0 ,i, k
c n
First term ik ik J m
t m
D 2
i 1 k 1
The second term acts as a penalty which tries to
bring typicality values towards 1.
c n
Second term i ik
w
i 1
(1 t ) m
k 1
Page 18 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Minimizing Objective Function (OF)
Rows and columns of OF are independent
ik-th term of OF Pmik (T, V ) tikm Dik2 w i (1 tik ) m
First order necessary conditions for
Typicality values Cluster centers (Same as FCM)
1
tik ,i, k n m n
1
vi tik xk t , i
m
ik
Dik2 m 1
k 1
1
k 1
wi
Page 19 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Alternating Optimization, Again
Similar to FCM-AO algorithm (Replace
equations of necessary conditions)
Terminal outputs of FCM-AO recommended as
a good way to initialize PCM-AO
– Cluster centers: Final cluster centers of FCM-AO
– Weights: n
ik ik
u m
D 2
is proportional to the average
within cluster fluctuation
wi K k 1
n
,K 0
ik
u m
k 1
Typ. K = 1
Page 20 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Identify Outliers
FCM on u1,6 0.5 u2,6 0.5 PCM on t1,6 0.63 t2,6 0.63
X 12 u1,12 0.5 u2,12 0.5 X 12 t1,12 0.07 t2,12 0.07
x12 x12
x6 x6
x12 is recognized as an outlier by PCM m 2.0
w1 w2 7.88
Page 21 of 30 Fuzzy C-Means Clustering
Possibililstic C-Means Clustering
Pros and Cons
Advantage
– Clustering noisy data samples
Disadvantage
– Very sensitive to good initialization
– Coincident clusters may result
• Because the columns and rows of the typicality
matrix are independent of each other
• Sometimes this could be advantageous (start with
a large value of c and get less distinct clusters)
Page 22 of 30 Fuzzy C-Means Clustering
Fuzzy-Possibililstic C-Means
Idea
uik is a function of x k and all c centroids
tik is a function of x k and v i alone
Both are important
– To classify a data point, cluster centroid has
to be closest to the data point Membership
– For Estimating the centroids Typicality for
alleviating the undesirable effect of outliers
Page 23 of 30 Fuzzy C-Means Clustering
Fuzzy-Possibililstic C-Means
(FPCM), OF and Constraints
Objective function
c n
2
min J m, (U, T, V ) (uik tik ) Dik
m
( U ,T, V )
i 1 k 1
Constraints c
– Membership u ik 1 ,k n
– Typicality
i 1
t
k 1
ik 1 ,i
• Because of this constraint, typicality of a data point to a
cluster, will be normalized with respect to the distance of all
n data points from that cluster next slide
Page 24 of 30 Fuzzy C-Means Clustering
Fuzzy-Possibililstic C-Means
Minimizing OF
1
Membership values c 2
Dik m 1
– Same as FCM, but uik , i, k
j 1 D jk
resulted values may
be different 1
n 2
Typicality values Dik 1
tik , i, k
– Depends on all data
j 1 Dij
Cluster centers
Typical
n n
vi (uikm tikm )xk (u t ) , i
m
ik
m
ik
in the interval
[3,5]
k 1 k 1
Page 25 of 30 Fuzzy C-Means Clustering
Fuzzy-Possibililstic C-Means
FPCM on X-12
u1,6 0.5 u2,6 0.5 t1,6 0.023 t2,6 0.023
U values u1,12 0.5 u2,12 0.5 T values t1,12 0.002 t2,12 0.002
x12
x12
x6 x6
Initial parameters m 2.0 2.0 0.00001
Page 26 of 30 Fuzzy C-Means Clustering
Comparison of FCM, PCM and FPCM
IRIS Data Samples
Iris plants database Iris
setosa
– 4-dimensional data set containing
50 samples each of three types Petal
of IRIS flowers
Iris
– n = 150, p = 4, c = 3 versicolor
– Features
• Sepal length, sepal width,
petal length, petal width
Iris
– Classes virginica
• Setosa, Versicolor, Virginica
Page 27 of 30 Fuzzy C-Means Clustering
Comparison of FCM, PCM and FPCM
IRIS Data Clustering
Initial parameters: [PalPB97]
Resubstitution errors based on the hardened Us and Ts
m Iter- Iter- Iter- Err- Err- Err-U- Err-T-
FCM PCM FPCM FCM PCM FPCM FPCM
1.5 1.5 26 (24) 57, 80 26 (13) 17 (17) 100, 10 17 (17) 19 (17)
2.0 2.0 28 (26) 39, 80 28 (12) 16 (16) 100, 10 16 (16) 15 (16)
1.5 3.0 26 (24) 57, 57 26 (13) 17 (17) 100, 10 17 (17) 16 (17)
3.0 3.0 29 (25) 31, 40 29 (13) 15 (15) 100, 11 15 (15) 15 (14)
5.0 2.0 37 (31) 49, 30 44 (16) 15 (15) 150, 20 15 (15) 12 (14)
2.0 5.0 28 (26) 39, 57 28 (12) 16 (16) 100, 9 16 (16) 16 (16)
5.0 5.0 37 (31) 49, 30 37 (15) 15 (15) 150, 20 15 (15) 12 (15)
My Implementation [PalPB97] Auto weights Tuned weights
Page 28 of 30 Fuzzy C-Means Clustering
Conclusions and Future Works
Err-T-FPCM <= Err-U-FPCM <= Err-FCM
– Could be considered true in general
Mismatch
– Number of iterations required for FPCM in general
is not half of that for FCM as mentioned at
[PalPB97]; Is there any mistake in my
implementation?
Comparison of algorithms using other “noisy”
data sets
Page 29 of 30 Fuzzy C-Means Clustering
Fuzzy C-Means Clustering
Course Project Presentation
Thank You
FIND OUT MORE AT...
1. http://ce.sharif.edu/~m_amiri/
2. http://yashil.20m.com/
References
[Bez81] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function
Algorithms, Plenum, NY, 1981.
[BezKKP99] James C. Bezdek, James Keller, Raghu Krishnapuram and Nikhil
R. Pal, Fuzzy Models and Algorithms for Pattern Recognition and
Image Processing, Kluwer Academic Publishers, TA 1650.F89, 1999.
[KriK93] R. Krishnapuram and J. M. Keller, “A possibilistic approach to
clustering,” IEEE Transactions on Fuzzy Systems, Vol. 1, No. 2, pp.
98-110, May 1993.
[PalPB97] N. R. Pal, K. Pal and J. C. Bezdek, “A mixed c-means clustering
model,” Proceedings of the Sixth IEEE International Conference on
Fuzzy Systems, Vol. 1, pp. 11-21, Jul. 1997.
[YanRP94] Jun Yan, Michael Ryan and James Power, Using fuzzy logic Towards
intelligent systems, Prentice Hall, 1994.
Page 31 of 30 Fuzzy C-Means Clustering
Part Title
…
Part Title
…
…
…
…
Page 32 of 30 Fuzzy C-Means Clustering