
Connections Between Stochastic Block Model and Label Propagation

This document discusses the connections between the stochastic block model (SBM) and the label propagation algorithm (LP). SBM is a random graph model that generates graphs with community structure by planting communities. LP is a semi-supervised algorithm that assigns labels to unlabeled nodes based on their neighbors' labels. Yamaguchi and Hayashi proved that a partially supervised version of SBM called PLSBM shares properties with a discrete version of LP called DLP. The objective is to understand whether concentration inequality results for SBM can be extended to PLSBM and DLP to provide a better analysis of the LP algorithm.

Uploaded by

Manjish Pal

CONNECTIONS BETWEEN STOCHASTIC BLOCK MODEL AND LABEL PROPAGATION ALGORITHM
INTRODUCTION
• Stochastic Block Model (SBM)
• This is a random graph model that captures the community structure in a graph, i.e. it generates graphs that contain communities.
• It differs substantially from the well-known random graph model of Erdős and Rényi, in which every pair of nodes is joined with the same probability.
• The intention is to plant a community structure in a graph. The simplest case is the planted bisection model.
STOCHASTIC BLOCK MODEL (CONT.)
• Basic model: the model is specified by the number of communities k and a k x k probability matrix whose entry p_ij is the probability that there is an edge between a node in community i and a node in community j. The community sizes are governed by a probability vector p_i, where i ranges from 1 to k. This model is useful
  • to model real-life networks,
  • to study the average-case complexity of NP-hard problems,
  • to establish benchmarks for clustering algorithms.
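The basic model described on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not code from the cited papers: the function name `sample_sbm`, its parameters, and the choice to draw each node's community independently from the size vector are all assumptions made for the example.

```python
import random

def sample_sbm(n, size_probs, edge_probs, seed=0):
    """Sample an undirected graph from a stochastic block model (illustrative sketch).

    n          -- number of nodes
    size_probs -- probability vector p_i over the k communities
    edge_probs -- k x k symmetric matrix; edge_probs[i][j] is p_ij
    Returns (labels, edges): the hidden community of each node and the edge list.
    """
    rng = random.Random(seed)
    k = len(size_probs)
    # Assumption: each node picks its community independently according to p_i.
    labels = rng.choices(range(k), weights=size_probs, k=n)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # An edge appears with probability p_ij for communities i, j.
            if rng.random() < edge_probs[labels[u]][labels[v]]:
                edges.append((u, v))
    return labels, edges

# Planted bisection: two communities, dense inside (0.8), sparse across (0.05).
labels, edges = sample_sbm(30, [0.5, 0.5], [[0.8, 0.05], [0.05, 0.8]], seed=1)
```

With k = 2 and only two distinct probabilities (one within communities, one across), this reduces to the planted bisection model mentioned in the introduction.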


RESULTS REGARDING SBM
• The main objectives in the SBM are
  • Weak recovery: find a partition of the nodes that is positively correlated with the hidden partition.
  • Partial recovery: what fraction of the nodes can be recovered correctly?
  • Exact recovery: when can the entire clusters be recovered?
• These questions have been studied extensively for the past three decades.
• Several results, which are mathematically very challenging, have been obtained so far using tools from high-dimensional geometry. Abbe et al. and Bandeira et al. present the current state of the art.
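One way to make the three recovery notions concrete is to measure how well an estimated partition agrees with the hidden one, up to renaming of the community labels. The sketch below is an illustration under that assumption; the helper name `agreement` is hypothetical, and the brute-force search over label permutations is only practical for small k.

```python
from itertools import permutations

def agreement(true_labels, est_labels, k):
    """Fraction of nodes labeled correctly, maximized over relabelings of the k communities."""
    n = len(true_labels)
    best = 0
    for perm in permutations(range(k)):
        # perm[e] is the true community that estimated community e is mapped to.
        matches = sum(1 for t, e in zip(true_labels, est_labels) if t == perm[e])
        best = max(best, matches)
    return best / n

print(agreement([0, 0, 1, 1], [1, 1, 0, 0], 2))  # same partition, labels swapped
print(agreement([0, 0, 1, 1], [0, 1, 1, 1], 2))  # one node misplaced
```

Under this measure, exact recovery corresponds to an agreement of 1, partial recovery asks how large the agreement can be made, and weak recovery asks only for an agreement better than what guessing would achieve.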


LABEL PROPAGATION ALGORITHM
• Label Propagation (LP) is a semi-supervised machine learning algorithm that assigns labels to the unlabeled nodes in a graph based on the labels of their neighbors and on a cost function.
• There are different choices of cost function and different labeling strategies.
• Label Propagation also refers to a class of graph clustering algorithms.
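As a rough illustration of one possible labeling strategy, the sketch below propagates labels by majority vote among already-labeled neighbors while keeping the seed labels fixed. It is an assumption-laden example, not the DLP algorithm of Yamaguchi and Hayashi: the function name `propagate_labels`, the synchronous update order, and the tie-breaking rule (keep the current label if it is among the winners, otherwise take the smallest) are choices made here for simplicity.

```python
from collections import Counter

def propagate_labels(adj, seeds, max_iters=100):
    """Majority-vote label propagation with clamped seed labels (illustrative sketch).

    adj   -- dict mapping each node to a list of its neighbors
    seeds -- dict mapping the initially labeled nodes to their labels
    """
    labels = dict(seeds)
    for _ in range(max_iters):
        updates = {}
        for node, nbrs in adj.items():
            if node in seeds:
                continue  # seed labels are never overwritten
            votes = Counter(labels[v] for v in nbrs if v in labels)
            if not votes:
                continue  # no labeled neighbor yet
            top = max(votes.values())
            winners = sorted(lab for lab, c in votes.items() if c == top)
            current = labels.get(node)
            # Tie-break: keep the current label if it is tied for first place.
            best = current if current in winners else winners[0]
            if best != current:
                updates[node] = best
        if not updates:
            break  # fixed point reached
        labels.update(updates)  # synchronous update after a full pass
    return labels

# Path graph 0-1-2-3-4-5 with one seed at each end.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(propagate_labels(path, {0: "a", 5: "b"}))
```

On the path graph, each seed's label spreads inward one hop per pass until the two fronts meet in the middle.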


OBJECTIVES

Can LP be viewed as a network generative model, as SBM is?

Yamaguchi and Hayashi prove that LP and SBM share the same goal. They prove that a modified, partially supervised version of SBM called PLSBM shares the same properties as a discrete version of Label Propagation called DLP.

Our objective: since Yamaguchi and Hayashi have established this connection, we want to understand whether it can be used to apply the results of Bandeira et al., which are based on concentration inequalities, in the LP scenario.
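For readers unfamiliar with the term, a concentration inequality bounds the probability that a random quantity deviates far from its expectation. A standard textbook example, given here only as an illustration and not as the specific inequality used by Bandeira et al., is Hoeffding's inequality for independent random variables X_1, ..., X_m taking values in [0, 1] (for instance, edge indicators in a random graph):

```latex
\Pr\left[\,\left|\sum_{i=1}^{m} X_i - \mathbb{E}\sum_{i=1}^{m} X_i\right| \ge t\,\right]
\;\le\; 2\exp\!\left(-\frac{2t^{2}}{m}\right), \qquad t > 0 .
```

Bounds of this kind show that quantities such as the number of edges within or across communities stay close to their expected values with high probability.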
PROBLEM DEFINITION
• Can the concentration-based results be extended to the PLSBM setting, and hence to the DLP algorithm?
• This objective has been met only to a small extent in Yamaguchi and Hayashi's paper.
CONTRIBUTIONS
• So far we have carried out a literature survey of the results obtained in research on the SBM and on Label Propagation algorithms.
• This involved understanding the basic mathematical machinery behind these algorithms.
REMAINING TASKS
• Extend the results of Bandeira et al. and Abbe et al. to the PLSBM setting.
• More formally, can the concentration-based results be extended to the PLSBM setting, and hence to the DLP algorithm?
• Yamaguchi and Hayashi establish a weak connection; can this connection be strengthened?
• Does the application of these results lead to a better understanding of the LP algorithm?
REFERENCES
• Abbe, Emmanuel, and Colin Sandon. "Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery." 2015 IEEE 56th Annual Symposium on Foundations of Computer Science. IEEE, 2015.
• Abbe, Emmanuel, Afonso S. Bandeira, and Georgina Hall. "Exact recovery in the stochastic block model." IEEE Transactions on Information Theory 62.1 (2016): 471-487.
• Yamaguchi, Yuto, and Kohei Hayashi. "When Does Label Propagation Fail? A View from a Network Generative Model." IJCAI. 2017.
