Community Detection
Alexandre Vilcek
Abstract
Identifying communities, or clusters, in graphs is a task of great importance when analyzing network structures. A telecommunications provider, for instance, would like to identify communities of customers that place a large number of calls to each other, in order to create more effective, directed marketing campaigns. Another example would be a financial institution trying to identify and understand communities of customers that have a high volume of financial transactions with each other. Community detection is also widely applicable in life sciences research: for example, when studying protein-protein interaction networks.
In this project we will create a new processing pipeline for non-overlapping community detection in network structures based entirely on K-Means. We will show that this approach is similar to a traditional Deep Learning auto-encoder in its ability to learn useful representations of the original data in a lower-dimensional space, making the data clustering task easier to accomplish. We will then test its applicability to the specific challenges of community detection in networks and compare its performance with the traditional Spectral Clustering approach.
1.4 K-Means
In the fields of Machine Learning and Data Mining, K-Means is perhaps the best-known and most studied method for clustering analysis [12].
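As background, the method can be stated compactly. The following is a minimal sketch of Lloyd's algorithm, the standard iterative formulation of K-Means (the seeding strategy and toy data here are illustrative choices, not taken from the text):

```python
import math
import random

def kmeans(points, k, iters=25, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return labels, centroids

# Two well-separated point groups: the algorithm recovers them as clusters.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, _ = kmeans(points, k=2)
```

Because the objective is non-convex and the result depends on the initialization, practical runs use several random restarts, which is exactly the "number of random initializations" parameter varied in the experiments below.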
2. Previous Work
There is a large body of previous research investigating Spectral Clustering or derived approaches, applied both to general clustering problems and specifically to community detection in networks [8], [14], [18], [20].
3. Proposed Approach
3.1 Motivation
As the last step, we apply standard K-Means to the resulting lower-dimensional representation to find the desired communities of the graph g.
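As an illustration only, one plausible reading of this pipeline can be sketched end to end: build a node-similarity representation of the graph (the experiments below name Jaccard similarity) and then apply standard K-Means to it. The helper names, the deterministic seeding, and the toy graph are assumptions made for the sketch, not part of the paper's definition:

```python
import math

def jaccard_rows(adj):
    """Feature vector for each node: Jaccard similarity to every other node."""
    nbrs = {u: set(vs) | {u} for u, vs in adj.items()}  # closed neighborhoods
    nodes = sorted(adj)
    return nodes, [
        [len(nbrs[u] & nbrs[v]) / len(nbrs[u] | nbrs[v]) for v in nodes]
        for u in nodes
    ]

def kmeans(rows, k, iters=25):
    # Naive deterministic init (evenly spaced rows), enough for this sketch.
    step = max(1, (len(rows) - 1) // max(1, k - 1))
    cents = [list(rows[min(i * step, len(rows) - 1)]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[min(range(k), key=lambda c: math.dist(r, cents[c]))].append(r)
        for j, g in enumerate(groups):
            if g:
                cents[j] = [sum(col) / len(g) for col in zip(*g)]
    return [min(range(k), key=lambda c: math.dist(r, cents[c])) for r in rows]

# Two triangles joined by one edge: expected communities {0,1,2} and {3,4,5}.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
nodes, rows = jaccard_rows(adj)
labels = kmeans(rows, k=2)
```

The similarity rows play the role of the learned representation: nodes inside the same dense subgraph have nearly identical rows, so the final K-Means step separates the communities easily.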
3.3 Experiments
We performed a series of experiments to evaluate the accuracy,
computational time complexity, and scalability of Deep K-Means,
comparing it with an implementation of Spectral Clustering as
defined in [14], on the task of finding communities in network
structures.
For those experiments, we analyzed network data with ground-truth communities, both synthetic and real-world data.
3.3.1 Experiment #1
In this experiment we ran both Deep K-Means and Spectral
Clustering on the synthetic networks described earlier. The goal
of this experiment is to investigate whether the proposed algorithm can consistently provide better clustering performance than Spectral Clustering.
We measure accuracy as the normalized mutual information (NMI) between the detected communities A and the ground-truth communities B, where I(A, B) is the mutual information between A and B, which represents the amount of shared information between A and B and is given by:

I(A, B) = Σ_a Σ_b P(a, b) log( P(a, b) / ( P(a) P(b) ) )

For the Deep K-Means runs we used the following parameters:
Number of layers: 3
Maximum number of iterations for K-Means on each
layer: 25
Number of random initializations for K-Means on each
layer: 10
Similarity function: Jaccard
3.3.2 Experiment #2
And H(A) is the entropy of A, which represents the information contained in A and is given by:

H(A) = -Σ_a P(a) log P(a)
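These quantities can be computed directly from the empirical joint distribution of two labelings. A minimal sketch follows; note that the final normalization shown, 2·I/(H+H), is one common NMI variant and is an assumption here, since this copy of the text does not preserve which normalization the authors used:

```python
import math
from collections import Counter

def entropy(labels):
    """H(A) = -sum_a P(a) log P(a), from empirical label frequencies."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """I(A, B) = sum_ab P(a, b) log( P(a, b) / (P(a) P(b)) )."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log((c / n) / ((pa[x] / n) * (pb[y] / n)))
        for (x, y), c in pab.items()
    )

def nmi(a, b):
    """One common normalization: 2 I(A, B) / (H(A) + H(B))."""
    return 2 * mutual_information(a, b) / (entropy(a) + entropy(b))
```

Identical labelings score 1 and independent labelings score 0, which is what makes NMI convenient for comparing detected communities against ground truth.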
And the second part of the test with the following parameters:
3.3.5 Experiment #5
In this experiment we analyze the performance of Deep K-Means
and Spectral Clustering using the Football dataset [22].
For the auto-encoder pipeline of Deep K-Means we used a single layer without random restarts.
3.3.3 Experiment #3
In this experiment we empirically analyze the time complexity of
Deep K-Means compared to Spectral Clustering.
In Spectral Clustering we expect the time complexity to be dominated by the following steps, which occur sequentially. Here we define n as the number of nodes in the graph and k as the number of communities:
We ran Deep K-Means using a 1-layer auto-encoder and one K-Means initialization. We ran both Deep K-Means and Spectral Clustering 25 times and measured the NMI and execution time for each run.
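A measurement loop of this shape can be sketched as follows; the clustering method and NMI scorer are passed in as callables because their concrete implementations are not part of this snippet:

```python
import time
import statistics

def benchmark(method, score, graph, truth, runs=25):
    """Run `method` repeatedly, recording accuracy and wall-clock time per run."""
    nmis, times = [], []
    for _ in range(runs):
        start = time.perf_counter()
        labels = method(graph)                 # returns a community labeling
        times.append(time.perf_counter() - start)
        nmis.append(score(labels, truth))      # e.g. NMI against ground truth
    return statistics.mean(nmis), statistics.mean(times)
```

Averaging over repeated runs, as the experiment does, smooths out both the randomness of K-Means initialization and timing noise.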
3.4 Results
Number of layers: 1
Maximum number of iterations for K-Means on each
layer: 25
Number of random initializations for K-Means on each
layer: 1, 2, 4, 8, 32
Similarity function: Dice
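The Jaccard and Dice similarity functions named in these configurations differ only in how the overlap of two sets is normalized. A small illustration, using hypothetical neighbor sets for two nodes:

```python
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|"""
    return len(a & b) / len(a | b)

def dice(a, b):
    """2 |A ∩ B| / (|A| + |B|)"""
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical neighbor sets of two nodes.
u = {1, 2, 3, 4}
v = {3, 4, 5}
# jaccard(u, v) = 2/5 = 0.4; dice(u, v) = 4/7 ≈ 0.571
```

Dice always scores at least as high as Jaccard for the same pair of sets, so switching between them rescales the similarity matrix without changing which neighborhoods look most alike.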
For each part of the test described above, and for each corresponding parameter configuration, we ran Deep K-Means 10 times for graphs g1 through g14 and measured the corresponding average accuracy and average execution time.
For each similarity function described above we ran Deep K-Means 10 times for graphs g1 through g14 and measured the corresponding average accuracy as the normalized mutual information compared to the ground-truth communities.
Fig. 9a: NMI accuracy for Deep K-Means for 1, 2, 3, and 4 auto-encoder layers; dashed lines represent the accuracy average across all graphs for the corresponding configuration
Fig. 9b: Execution time for Deep K-Means for graphs g1 through g10
Fig. 9c: Execution time for Deep K-Means for graphs g1 through g14, shown in logarithmic scale
Fig. 9e: Execution time for Deep K-Means for graphs g1 through g10
Fig. 9f: Execution time for Deep K-Means for graphs g1 through g14, shown in logarithmic scale
Fig. 9g: NMI accuracy for Spectral Clustering and Deep K-Means with 4 layers and 1 random restart
In Fig. 10a and Fig. 10b below we see the accuracy and
execution time, respectively, when running both Deep K-Means
and Spectral Clustering against the Football dataset [22]. As this
is a fairly simple and small network, running Deep K-Means in a
1-layer configuration yields good results. As with the synthetic
datasets in the previous experiments, we see Deep K-Means
outperforming Spectral Clustering.
References
[1] Adamic, Lada A., and Eytan Adar. "Friends and neighbors on the
web." Social networks 25.3 (2003): 211-230.
[2] Bengio, Yoshua. "Learning deep architectures for AI." Foundations
and trends in Machine Learning 2.1 (2009): 1-127.
4. Conclusion
In this work we proposed a new algorithm for non-overlapping
network community detection that leverages ideas from Deep
Learning pipelines for data embedding in lower-dimensional
spaces, which eases the task of clustering the data into
communities.
[14] Ng, Andrew Y., Michael I. Jordan, and Yair Weiss. "On spectral
clustering: Analysis and an algorithm." Advances in neural information
processing systems 2 (2002): 849-856.