DRP Proposal - 210103
Jennah Gosciak
January 3, 2021
1 Introduction
The goal of the belief propagation algorithm is to compute the marginal probability of a random variable, conditional on observed characteristics [8]. This marginal probability corresponds to a node in a graphical model, either a Bayesian network or a Markov Random Field.
A Bayesian network is a directed acyclic graph that represents the conditional dependencies
of each node through arrows. Often the Bayes network tells a story with a clear direction, e.g., a person's attitude to health affects their smoking and drinking habits, which in turn affect the amount of tar in their lungs. A Markov Random Field is an undirected graph, cyclic or acyclic, that represents conditional dependencies satisfying the following Markov Property for every node $v \in V = \{$set of all vertices$\}$:
$$P(X_v \mid X_{V \setminus \{v\}}) = P(X_v \mid X_{N(v)}),$$
where $N(v)$ denotes the set of neighbors of $v$.
2 Background
Belief propagation is an algorithm that allows for efficient approximation of marginal distri-
butions. The output of this algorithm is a belief function that is an approximation of the
marginal distribution. Consider the following equation, in which $\psi$ and $\phi$ are cost functions:
$$P(X = x \mid Y = y) = \frac{1}{Z} \prod_{(i,j)} \psi_{ij}(x_i, x_j) \prod_i \phi_i(x_i, y_i) \tag{1}$$
While the relationships between $X_i$ and $X_j$ imply the structure and dependence relationships of a pairwise Markov Random Field, the steps of the algorithm are generalizable to other graphical models.
Step 1: Compute the messages passed from node $j$ to node $i$, denoted $m_{ji}(x_i)$. Here $N(j) = \{$the set of all vertices that share an edge with $j\}$:
$$m_{ji}(x_i) \propto \sum_{x_j} \phi_j(x_j)\, \psi_{ji}(x_j, x_i) \prod_{k \in N(j) \setminus i} m_{kj}(x_j) \tag{2}$$
Step 2: For each node, we calculate the belief associated with it. The belief at $X_i$ is proportional to the evidence $Y_i$ associated with it; we denote this relationship $\phi_i(x_i)$. For a belief at node $i$:
$$b_i(x_i) = k\, \phi_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i), \tag{3}$$
where $k$ is a normalization constant.
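To make the two steps concrete, here is a minimal Python sketch of updates (2) and (3) for a discrete pairwise MRF; the data structures (dictionaries of potentials and messages, keyed by nodes and directed edges) are our own illustrative choices rather than anything prescribed in the references.

```python
import numpy as np

def update_message(j, i, phi, psi, messages, neighbors):
    """Eq. (2): m_{ji}(x_i) propto sum_{x_j} phi_j(x_j) psi(x_j, x_i)
    times the product of messages into j from all neighbors except i."""
    prod = phi[j].copy()
    for k in neighbors[j]:
        if k != i:
            prod *= messages[(k, j)]
    m = psi[(j, i)].T @ prod      # psi[(j, i)][x_j, x_i]; the @ sums over x_j
    return m / m.sum()            # normalize to keep the recursion stable

def belief(i, phi, messages, neighbors):
    """Eq. (3): b_i(x_i) = k * phi_i(x_i) * product of incoming messages."""
    b = phi[i].copy()
    for j in neighbors[i]:
        b *= messages[(j, i)]
    return b / b.sum()            # k is absorbed by normalizing to sum to 1
```

On a tree, sweeping update_message from the leaves inward computes each message exactly once.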
It is not difficult to show that for a tree graph, the beliefs exactly represent the marginal distribution of some node $x_i$. As an example, we will prove this result for a path graph with 3 nodes.
Proposition 1. Consider a path graph with three nodes and the corresponding probability
distribution, conditional on observed characteristics.
$$P(X \mid Y) = \frac{1}{Z} \prod_{i \in V} \phi_i(x_i) \prod_{(i,j) \in E} \psi_{ij}(x_i, x_j)$$
Proof. The messages are
$$m_{12}(x_2) = \sum_{x_1} \phi_1(x_1)\,\psi(x_1, x_2), \qquad m_{21}(x_1) = \sum_{x_2} \phi_2(x_2)\,\psi(x_2, x_1) \sum_{x_3} \phi_3(x_3)\,\psi(x_3, x_2),$$
$$m_{32}(x_2) = \sum_{x_3} \phi_3(x_3)\,\psi(x_3, x_2), \qquad m_{23}(x_3) = \sum_{x_2} \phi_2(x_2)\,\psi(x_2, x_3) \sum_{x_1} \phi_1(x_1)\,\psi(x_1, x_2).$$
Figure 2: Here is an example of an unwrapped network [7].
The corresponding beliefs are
$$b_2(x_2) = k\,\phi_2(x_2)\,m_{32}(x_2)\,m_{12}(x_2) = k\,\phi_2(x_2) \sum_{x_3} \phi_3(x_3)\,\psi(x_3, x_2) \sum_{x_1} \phi_1(x_1)\,\psi(x_1, x_2) = P(x_2 \mid y),$$
$$b_3(x_3) = k\,\phi_3(x_3)\,m_{23}(x_3) = k\,\phi_3(x_3) \sum_{x_2} \phi_2(x_2)\,\psi(x_2, x_3) \sum_{x_1} \phi_1(x_1)\,\psi(x_1, x_2) = P(x_3 \mid y).$$
By the definition of marginalization, the beliefs are equivalent to the marginal probabilities.
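Proposition 1 is also easy to check numerically. The sketch below uses random potentials on the path $1$-$2$-$3$ (the state count and variable names are arbitrary choices) and compares the BP beliefs to brute-force marginals.

```python
import numpy as np

rng = np.random.default_rng(0)
S = 4                                          # states per node (arbitrary)
phi = {i: rng.random(S) for i in (1, 2, 3)}
psi = {(1, 2): rng.random((S, S)),             # psi[(i, j)][x_i, x_j]
       (2, 3): rng.random((S, S))}

# Brute-force joint P(x1, x2, x3 | y), normalized by Z.
joint = np.einsum('a,b,c,ab,bc->abc',
                  phi[1], phi[2], phi[3], psi[(1, 2)], psi[(2, 3)])
joint /= joint.sum()

# The messages from the proof above.
m12 = phi[1] @ psi[(1, 2)]                     # sum_{x1} phi_1(x1) psi(x1, x2)
m32 = psi[(2, 3)] @ phi[3]                     # sum_{x3} phi_3(x3) psi(x2, x3)
m23 = (phi[2] * m12) @ psi[(2, 3)]             # message 2 -> 3 absorbs m12

b2 = phi[2] * m12 * m32
b2 /= b2.sum()
b3 = phi[3] * m23
b3 /= b3.sum()
print(np.allclose(b2, joint.sum(axis=(0, 2))))   # True: b2 equals the x2 marginal
print(np.allclose(b3, joint.sum(axis=(0, 1))))   # True: b3 equals the x3 marginal
```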
This is true for any tree graph, i.e. a graph where any two vertices are only connected by
one path [8]. For non-tree graphs, such as loopy or cyclic graphs where at least one pair
of vertices is connected by multiple paths, the BP algorithm does not compute the exact
marginal probabilities. It is possible to run the BP algorithm on a loopy graph, but it does
not always converge. One outcome is that the messages circulate indefinitely. However,
there are also cases of loopy graphs where the BP algorithm works well. It is not always
clear why BP converges. Empirical evidence shows that when BP converges, the beliefs are
a good approximation for the marginals [4].
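As an illustration of that observation, the following sketch runs loopy BP with parallel updates on the smallest cycle, $C_3$, with random potentials; the sweep budget and tolerance are arbitrary choices, and on this example the converged beliefs typically land close to, but not exactly on, the brute-force marginals.

```python
import numpy as np

rng = np.random.default_rng(1)
S, nodes = 3, [0, 1, 2]
neighbors = {0: [1, 2], 1: [2, 0], 2: [0, 1]}
phi = {i: rng.random(S) for i in nodes}
psi = {}
for a, b in [(0, 1), (1, 2), (2, 0)]:
    psi[(a, b)] = rng.random((S, S))          # psi[(a, b)][x_a, x_b]
    psi[(b, a)] = psi[(a, b)].T

msg = {(j, i): np.full(S, 1 / S) for j in nodes for i in neighbors[j]}
for sweep in range(500):                      # parallel ("flooding") updates
    new = {}
    for (j, i) in msg:
        prod = phi[j].copy()
        for k in neighbors[j]:
            if k != i:
                prod *= msg[(k, j)]
        m = psi[(j, i)].T @ prod              # sum over x_j
        new[(j, i)] = m / m.sum()
    converged = all(np.allclose(new[e], msg[e], atol=1e-12) for e in msg)
    msg = new
    if converged:
        break

# Brute-force marginal of node 0 for comparison.
joint = np.einsum('a,b,c,ab,bc,ca->abc', phi[0], phi[1], phi[2],
                  psi[(0, 1)], psi[(1, 2)], psi[(2, 0)])
joint /= joint.sum()
b0 = phi[0] * msg[(1, 0)] * msg[(2, 0)]
print(b0 / b0.sum())           # loopy BP belief at node 0
print(joint.sum(axis=(1, 2)))  # exact marginal: close, but not identical
```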
One way to understand why BP converges on loopy graphs is to consider an unwrapped graph. Let us look at the example $G = C_3$. Without loss of generality, let the root be $B$ (see Fig. 2). At step $t = 0$, the unwrapped graph is the 3-node path $A$-$B$-$C$. At step $t = 1$, it is the 5-node path $C''$-$A$-$B$-$C$-$A'$. At step $t = 2$, it is the 7-node path $B''$-$C''$-$A$-$B$-$C$-$A'$-$B'$.
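This unwrapping is mechanical enough to script. The toy sketch below reproduces the paths listed above for any number of steps (the node names and the single/double priming convention simply follow the example):

```python
def unwrap_c3(t):
    """Unwrapped path for C3 rooted at B after t steps: left-hand copies
    get a double prime, right-hand copies a single prime, as above."""
    left = {"A": "C", "B": "A", "C": "B"}     # predecessor on the cycle A->B->C->A
    right = {"A": "B", "B": "C", "C": "A"}    # successor on the cycle
    path, lend, rend = ["A", "B", "C"], "A", "C"
    for _ in range(t):
        lend, rend = left[lend], right[rend]
        path = [lend + "''"] + path + [rend + "'"]
    return path

print(unwrap_c3(2))   # ["B''", "C''", 'A', 'B', 'C', "A'", "B'"]
```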
Definition 1. Let $\tilde{b}_1^{\,\tau}$ be the belief of the root of the unwrapped graph at step $\tau$, and let $b_1^{\,t}$ be the belief of $X_1 \in C_3$ after $t$ iterations of loopy BP. Then after $\tau$ iterations of BP, $b_1^{\,\tau} = \tilde{b}_1^{\,\tau}$.
Since the unwrapped graph is acyclic and singly connected, BP on a loopy graph is equivalent to BP on the unwrapped graph. If the BP algorithm converges after $N$ iterations, then the beliefs are the marginal probabilities of the unwrapped graph of length $N$. In
comparing the probability distribution of the unwrapped graph to the original problem, we find that the algorithm will give us the most likely sequence of states, but that the marginal distributions for each node have no relationship to the marginal distributions of $G = C_3$ [7]. For example, consider the cyclic graph with joint distribution
$$P(X \mid Y) = \frac{1}{Z} \exp\!\left(-\frac{1}{T} E(X)\right),$$
where $E(X) = \sum_i g_i(x_i) + \sum_{(i,j)} h_{ij}(x_i, x_j)$ is an energy function. If we want to maximize the probability distribution to find the most likely sequence of states for random variables $X_i$, then we
want to minimize $E(X)$. Computation over the unwrapped graph as opposed to the cyclic graph changes the value of $T$, but if we're interested in the sequence that maximizes the joint distribution, the new temperature $T'$ doesn't change the result: the maximizer of $\exp(-E(X)/T)$ is the minimizer of $E(X)$ for any $T > 0$. This scaling, however,
does not translate to finding the marginal distribution. The beliefs for the unwrapped graph
give us the marginal distribution for the unwrapped graph, but not for the original cyclic
graph.
Convergence occurs when the center nodes in an unwrapped graph are independent of the nodes on the boundary: after a finite number of iterations $n$, additional iterations produce no change in the probabilities of the original center nodes [7].
3 Proposed Methodology
We will begin by learning how to implement the BP algorithm for simple graphs, such as a
path graph. We will first program the BP algorithm for the 3-node path graph, and then we will program the BP algorithm for a 3-node cyclic graph, to experiment with the optimal
parameters for convergence [8]. We will explore damping, which has been used to improve
BP when it doesn't converge or converges slowly. Damping turns the messages sent between nodes into a weighted average of the old and new estimates [6], as in the sketch below.
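A minimal sketch of that rule, with a hypothetical damping weight alpha (alpha = 1 recovers plain, undamped BP):

```python
import numpy as np

def damped(m_old: np.ndarray, m_new: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Damped BP: replace each freshly computed message with a weighted
    average of the new estimate and the previous one, then renormalize."""
    m = alpha * m_new + (1.0 - alpha) * m_old
    return m / m.sum()
```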
A significant amount of time will be spent on applications of BP, specifically in image restoration. Once we've coded the BP algorithm for simple graphs, we will test it out on an example, such as a blurry image. We can best model image restoration using a pairwise Markov Random Field (MRF). In the structure of a grid, we assign each pixel or group of pixels a position $i$ corresponding to a random variable $X_i$. Each $X_i$ has evidence $Y_i$ that is observed. We can model the relationship between unobserved and observed qualities with $\phi_i(x_i)$, with the assumption that $x_i$ should be similar to $y_i$ [8]. It is also reasonable to expect that neighboring pixels will be similar. This gives us the function $\psi_{ij}(x_i, x_j)$, which represents the dependence between $x_i$ and $x_j$. If we return to (1), the joint probability of the image conditional on observed qualities is
$$P(X = x \mid Y = y) = \frac{1}{Z} \prod_{(i,j)} \psi_{ij}(x_i, x_j) \prod_i \phi_i(x_i, y_i).$$
Conventionally, $\phi_i$ and $\psi_{ij}$ are defined through energy functions, where $\phi_i(x_i) = e^{-g(x_i, y_i)}$ and $\psi_{ij} = e^{-h(x_i, x_j)}$. Since we want to maximize the probability distribution, we want to minimize $g(x_i, y_i)$ and $h(x_i, x_j)$. To do so, it is useful to represent them as cost functions, most commonly with the $\ell_1$ or the $\ell_2$ norm. For our programming purposes, $g(x_i, y_i) = \|x_i - y_i\|_1$ and $h(x_i, x_j) = \|x_i - x_j\|_1$. We will experiment with different cost functions and norms, to see if this changes the convergence rate of BP on the pairwise MRF.
References
[1] C. Furtlehner, J.-M. Lasgouttes, and A. de La Fortelle, A belief propagation
approach to traffic prediction using probe vehicles, in 2007 IEEE Intelligent Transporta-
tion Systems Conference, IEEE, 2007, pp. 1022–1027.
[2] F. Lafarge and C. Mallet, Building large urban environments from unstructured
point data, in 2011 International Conference on Computer Vision, IEEE, 2011, pp. 1068–
1075.
[3] M. Mézard and A. Montanari, Information, physics, and computation, Oxford University Press, 2009.
[4] K. P. Murphy, Y. Weiss, and M. I. Jordan, Loopy belief propagation for approximate inference: An empirical study, in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc., 1999, pp. 467-475.
[6] P. Som and A. Chockalingam, Damped belief propagation based near-optimal equalization of severely delay-spread UWB MIMO-ISI channels, in 2010 IEEE International Conference on Communications, IEEE, 2010, pp. 1-5.
[7] Y. Weiss, Belief propagation and revision in networks with loops, (1997).