Bayesian Belief Network 2

bbn

Uploaded by

Parshva Maniar
BAYESIAN NETWORKS
Bayesian Network Motivation
• We want a representation and reasoning system that is based on conditional independence
  • Compact yet expressive representation
  • Efficient reasoning procedures
• Bayesian networks are such a representation
  • Named after Thomas Bayes (ca. 1702–1761)
  • Term coined in 1985 by Judea Pearl (1936– )
• Their invention shifted the focus of AI from logic to probability!
Bayesian Networks
• A Bayesian network specifies a joint distribution in a structured form
• Represent dependence/independence via a directed graph
  • Nodes = random variables
  • Edges = direct dependence
• Structure of the graph ⇔ conditional independence relations
• Requires that the graph is acyclic (no directed cycles)
• Two components to a Bayesian network
  • The graph structure (conditional independence assumptions)
  • The numerical probabilities (for each variable given its parents)
Bayesian Networks
• General form:

$P(X_1, X_2, \ldots, X_N) = \prod_i P(X_i \mid \mathrm{parents}(X_i))$

Left-hand side: the full joint distribution. Right-hand side: the graph-structured approximation.
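A minimal sketch of the general form: the joint probability of a full assignment is the product of one CPT entry per node. The `parents` and `cpt` dictionaries, the function name `joint_probability`, and every number below are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical three-node network: A and B are parents of C.
parents = {"A": (), "B": (), "C": ("A", "B")}

# cpt[node][parent_values] = P(node = True | parents = parent_values).
# All numbers are made up for illustration.
cpt = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(True, True): 0.9, (True, False): 0.5,
          (False, True): 0.4, (False, False): 0.1},
}

def joint_probability(assignment):
    """Product over nodes of P(X_i | parents(X_i)) for one full assignment."""
    prob = 1.0
    for node, pars in parents.items():
        p_true = cpt[node][tuple(assignment[p] for p in pars)]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob

# The factored joint is a proper distribution: its 2^3 entries sum to 1.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
```

Storing only P(node = True | parents) is a design choice that works because every variable here is binary; the complement gives the other entry.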


Example of a simple Bayesian network
$P(X_1, X_2, \ldots, X_N) = \prod_i P(X_i \mid \mathrm{parents}(X_i))$
[Figure: graph with edges A → C and B → C]
$P(A, B, C) = P(C \mid A, B)\, P(A)\, P(B)$
• Probability model has simple factored form
• Directed edges => direct dependence
• Absence of an edge => conditional independence
• Also known as belief networks, graphical models, causal networks
• Other formulations, e.g., undirected graphical models
Examples of 3-way Bayesian Networks
[Figure: three nodes A, B, C with no edges]
Absolute independence:
p(A,B,C) = p(A) p(B) p(C)
Examples of 3-way Bayesian Networks
[Figure: graph with edges A → B and A → C]
Conditionally independent effects:
p(A,B,C) = p(B|A) p(C|A) p(A)
• B and C are conditionally independent given A
• e.g., A is a disease, and we model B and C as conditionally independent symptoms given A
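The claim can be checked numerically. This is a sketch with hypothetical CPT values (none of the numbers come from the slides): given A, the joint of B and C factorises, but without A it does not.

```python
from itertools import product

# Hypothetical CPTs for the common-cause network B <- A -> C.
p_a = 0.1                               # P(A = True)
p_b_given_a = {True: 0.8, False: 0.2}   # P(B = True | A)
p_c_given_a = {True: 0.7, False: 0.1}   # P(C = True | A)

def bern(p_true, value):
    """P(X = value) for a binary X with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(a, b, c):
    """p(A,B,C) = p(B|A) p(C|A) p(A)."""
    return bern(p_b_given_a[a], b) * bern(p_c_given_a[a], c) * bern(p_a, a)

def prob(**fixed):
    """Probability of a partial assignment, by summing the joint."""
    total = 0.0
    for a, b, c in product([True, False], repeat=3):
        values = {"a": a, "b": b, "c": c}
        if all(values[k] == v for k, v in fixed.items()):
            total += joint(a, b, c)
    return total

# Given A: P(B, C | A) equals P(B | A) P(C | A).
lhs_given_a = prob(a=True, b=True, c=True) / prob(a=True)
rhs_given_a = (prob(a=True, b=True) / prob(a=True)) * (prob(a=True, c=True) / prob(a=True))

# Without A: P(B, C) differs from P(B) P(C) for these numbers.
lhs_marginal = prob(b=True, c=True)
rhs_marginal = prob(b=True) * prob(c=True)
```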
Examples of 3-way Bayesian Networks
[Figure: graph with edges A → C and B → C]
• Independent causes:
p(A,B,C) = p(C|A,B) p(A) p(B)
• “Explaining away” effect:
  • A and B are independent but become dependent once C is known!
  • (we’ll come back to this later)
Examples of 3-way Bayesian Networks
[Figure: graph with edges A → B and B → C]
Markov dependence:
p(A,B,C) = p(C|B) p(B|A) p(A)
The Alarm Example
• You have a new burglar alarm installed
• It is reliable at detecting burglary, but also responds to minor earthquakes
• Two neighbors (John, Mary) promise to call you at work when they hear the alarm
  • John always calls when he hears the alarm, but sometimes confuses the phone ringing with the alarm (and calls then also)
  • Mary likes loud music and sometimes misses the alarm!
• Given evidence about who has and hasn’t called, estimate the probability of a burglary
The Alarm Example
• Represent the problem using 5 binary variables:
  • B = a burglary occurs at your house
  • E = an earthquake occurs at your house
  • A = the alarm goes off
  • J = John calls to report the alarm
  • M = Mary calls to report the alarm
• What is P(B | M, J)?
• We can use the full joint distribution to answer this question
  ◼ Requires $2^5 = 32$ probabilities
• Can we use prior domain knowledge to come up with a Bayesian network that requires fewer probabilities?
Constructing a Bayesian Network: Step 1
• Order the variables in terms of causality (may be a partial order)
  • e.g., {E, B} -> {A} -> {J, M}
• Use these assumptions to create the graph structure of the Bayesian network
The Resulting Bayesian Network
Network topology reflects causal knowledge
Constructing a Bayesian Network: Step 2
• Fill in conditional probability tables (CPTs)
  • One for each node
  • $2^p$ entries, where $p$ is the number of parents
• Where do these probabilities come from?
  • Expert knowledge
  • From data (relative frequency estimates)
  • Or a combination of both
The Bayesian network
Shouldn’t these add up to 1?
No. Each row adds up to 1, and we’re using this to let us show only half of the table. For example,
$P(\neg A \mid B, E) = 1 - P(A \mid B, E) = 1 - 0.95 = 0.05$
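A sketch of the "half table" idea in code: store only P(A = true | B, E) and derive the other half by complement. Only the 0.95 entry appears in the text above; the other three values are assumed from the standard textbook version of the alarm example, and the names `p_alarm_true` and `p_alarm` are hypothetical.

```python
# P(A = true | B, E) for each combination of parent values (B, E).
p_alarm_true = {
    (True, True): 0.95,     # quoted in the text above
    (True, False): 0.94,    # assumed
    (False, True): 0.29,    # assumed
    (False, False): 0.001,  # assumed
}

def p_alarm(alarm, burglary, earthquake):
    """Look up P(A = alarm | B = burglary, E = earthquake)."""
    p_true = p_alarm_true[(burglary, earthquake)]
    return p_true if alarm else 1.0 - p_true

# The example from the slide: P(not A | B, E) = 1 - 0.95.
p_not_alarm = p_alarm(False, True, True)
```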
The Bayesian network
What is P(j ∧ m ∧ a ∧ b ∧ e)?
P(j | a) P(m | a) P(a | b, e) P(b) P(e)
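A sketch of this product in code, and of how the same joint answers the earlier question P(B | j, m) by summing over the unobserved variables. The CPT figure is not reproduced in this text, so the values below are assumed from the standard textbook version of the alarm example (only 0.95 is quoted above); all names are hypothetical.

```python
from itertools import product

# Assumed CPT values (standard textbook alarm example).
P_B = 0.001                                          # P(b)
P_E = 0.002                                          # P(e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(j | A)
P_M = {True: 0.70, False: 0.01}                      # P(m | A)

def bern(p_true, value):
    """P(X = value) for a binary X with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(J|A) P(M|A) P(A|B,E) P(B) P(E) for one full assignment."""
    return (bern(P_J[a], j) * bern(P_M[a], m) * bern(P_A[(b, e)], a)
            * bern(P_B, b) * bern(P_E, e))

# The entry asked about above: every variable true.
p_all_true = joint(True, True, True, True, True)

def p_b_and_calls(b):
    """P(B = b, j, m): sum the joint over the unobserved E and A."""
    return sum(joint(b, e, a, True, True)
               for e, a in product([True, False], repeat=2))

# P(b | j, m) by normalising over the two values of B.
posterior = p_b_and_calls(True) / (p_b_and_calls(True) + p_b_and_calls(False))
```

With these assumed numbers the posterior comes out near 0.28: both neighbors calling raises the probability of a burglary far above its prior of 0.001, but a false alarm remains the more likely explanation.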


Number of Probabilities in Bayesian Networks
(i.e., why Bayesian networks are effective)
• Consider n binary variables
• Unconstrained joint distribution requires $O(2^n)$ probabilities
• If we have a Bayesian network with a maximum of k parents for any node, then we need $O(n \cdot 2^k)$ probabilities
• 16 binary variables
  • Full joint distribution is $2^{16} = 65{,}536$
  • How many probability values are required for the Bayes net?
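The answer depends on the network's structure, which the slide does not give. As a rough sketch, the bound n · 2^k can be tabulated for a few hypothetical values of k and compared with the full joint; the function names are illustrative only.

```python
def full_joint_size(n):
    """Number of entries in the joint table over n binary variables."""
    return 2 ** n

def bayes_net_bound(n, k):
    """Upper bound n * 2^k on CPT entries when no node has more than k parents.

    Counts one entry, P(X = true | parents), per combination of parent values.
    """
    return n * 2 ** k

n = 16
# Hypothetical choices of the maximum number of parents k.
sizes = {k: bayes_net_bound(n, k) for k in (1, 2, 3, 4)}
joint_size = full_joint_size(n)
```

Even with up to 4 parents per node the bound is 256 values, against 65,536 for the unconstrained joint.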
Bayesian Networks from a Different Variable Ordering
Example for BN construction: Fire Diagnosis
You want to diagnose whether there is a fire in a building
• You receive a noisy report about whether everyone is leaving the building
• If everyone is leaving, this may have been caused by a fire alarm
• If there is a fire alarm, it may have been caused by a fire or by tampering
• If there is a fire, there may be smoke
Example for BN construction: Fire Diagnosis
First you choose the variables. In this case, all are Boolean:
• Tampering is true when the alarm has been tampered with
• Fire is true when there is a fire
• Alarm is true when there is an alarm
• Smoke is true when there is smoke
• Leaving is true if there are lots of people leaving the building
• Report is true if the sensor reports that lots of people are leaving the building
• Let’s construct the Bayesian network for this
• First, you choose a total ordering of the variables, let’s say: Fire; Tampering; Alarm; Smoke; Leaving; Report.
Example for BN construction: Fire Diagnosis
• Using the total ordering of variables:
  • Let’s say Fire; Tampering; Alarm; Smoke; Leaving; Report.
• Now choose the parents for each variable by evaluating conditional independencies
  • Fire is the first variable in the ordering. It does not have parents.
  • Tampering is independent of Fire (learning that one is true would not change your beliefs about the probability of the other)
  • Alarm depends on both Fire and Tampering: it could be caused by either or both
  • Smoke is caused by Fire, and so is independent of Tampering and Alarm given whether there is a Fire
  • Leaving is caused by Alarm, and thus is independent of the other variables given Alarm
  • Report is caused by Leaving, and thus is independent of the other variables given Leaving
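The parent choices above can be written down as a small data structure. In this sketch (the names `parents`, `respects_ordering`, and `edges` are hypothetical), a check confirms that every parent precedes its child in the chosen ordering, which is what guarantees the resulting graph has no directed cycles.

```python
ordering = ["Fire", "Tampering", "Alarm", "Smoke", "Leaving", "Report"]

# Parents chosen in the steps above.
parents = {
    "Fire": [],
    "Tampering": [],
    "Alarm": ["Fire", "Tampering"],
    "Smoke": ["Fire"],
    "Leaving": ["Alarm"],
    "Report": ["Leaving"],
}

def respects_ordering(ordering, parents):
    """True if every node's parents appear before it in the ordering."""
    position = {name: i for i, name in enumerate(ordering)}
    return all(position[p] < position[child]
               for child, pars in parents.items() for p in pars)

# Directed edges (parent, child) of the resulting network.
edges = [(p, child) for child in ordering for p in parents[child]]
ok = respects_ordering(ordering, parents)
```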
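Example for BN construction: Fire Diagnosis
• How many probabilities do we need to specify for this Bayesian network?
• Each node needs $2^p$ values for its $p$ parents, so in the order Fire, Tampering, Alarm, Smoke, Leaving, Report:
• 1+1+4+2+2+2 = 12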
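The same count as a short sketch: one value P(X = true | parents) per combination of parent values, summed over the nodes. The parent counts follow the construction above; the variable names are hypothetical.

```python
# Number of parents of each node in the fire diagnosis network.
num_parents = {"Fire": 0, "Tampering": 0, "Alarm": 2,
               "Smoke": 1, "Leaving": 1, "Report": 1}

# One value P(X = true | parents) per combination of parent values.
cpt_sizes = {node: 2 ** p for node, p in num_parents.items()}
total = sum(cpt_sizes.values())

# Full joint table over the same 6 binary variables, for comparison.
full_joint = 2 ** len(num_parents)
```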
Independence
• Let us define the symbol ⊥ to indicate independence of two variables.
[Figure: three nodes A, B, C with no edges]
A ⊥ B
A ⊥ C
B ⊥ C
Independence
True or False?
R ⊥ A
R ⊥ A | L
L ⊥ S
L ⊥ S | F
S ⊥ A | F
R ⊥ S | L
• General rule of thumb:
  • A known variable makes everything below that variable independent from everything above that variable.
Another (tricky) Example
True or False?
B ⊥ M
B ⊥ M | E
B ⊥ M | A
B ⊥ E
B ⊥ E | A
Explaining Away
• Earth doesn’t care whether your house is currently being burgled
• While you are on vacation, one of your neighbors calls and tells you your home’s burglar alarm is ringing.
• But now suppose you learn that there was a medium-sized earthquake in your neighborhood. Oh, whew! Probably not a burglar after all.
• Earthquake “explains away” the hypothetical burglar, so knowing about both the Alarm and the Earthquake affects your estimate of Burglary.
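A numeric sketch of explaining away. To keep it short it conditions directly on the alarm ringing rather than on a neighbor's call, and it uses CPT values assumed from the standard textbook alarm example (they are not given in this text). All names are hypothetical.

```python
# Assumed CPT values (standard textbook alarm example).
P_B = 0.001                                          # P(b)
P_E = 0.002                                          # P(e)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a | B, E)

def bern(p_true, value):
    """P(X = value) for a binary X with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint_bea(b, e, a):
    """P(B, E, A) = P(A | B, E) P(B) P(E)."""
    return bern(P_A[(b, e)], a) * bern(P_B, b) * bern(P_E, e)

# P(b | a): the alarm rang, earthquake unknown (sum over E).
num = sum(joint_bea(True, e, True) for e in (True, False))
den = sum(joint_bea(b, e, True) for b in (True, False) for e in (True, False))
p_b_given_a = num / den

# P(b | a, e): the alarm rang and an earthquake is known to have occurred.
p_b_given_a_e = joint_bea(True, True, True) / sum(
    joint_bea(b, True, True) for b in (True, False))
```

Under these assumed numbers, learning about the earthquake drops the burglary estimate from roughly 0.37 to well under 0.01, even though Burglary and Earthquake are independent before the alarm is observed.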
Independence
• Is there a principled way to determine all these dependencies?
• Yes! It’s called D-separation – 3 specific rules.
  ◼ Some say D-separation rules are easy
  ◼ Our book: “rather complicated… we omit it”
  ◼ The truth: a mix of both… easy-to-state rules that can be tricky to apply. Talk to me if you want to know more.
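For readers who want to experiment, here is a sketch of a d-separation test. It does not use the three path-based rules mentioned above; it swaps in an equivalent graph criterion (the moralised ancestral graph), which is easier to code. The function name `d_separated` and the `parents` dictionary format are assumptions, and the query nodes are assumed not to be in the conditioning set.

```python
def d_separated(parents, x, y, given=()):
    """Test whether x and y are d-separated given `given` in a DAG.

    Moralised ancestral graph criterion:
    1. keep only x, y, the conditioning nodes, and their ancestors;
    2. link every pair of parents that share a child, drop edge directions;
    3. delete the conditioning nodes;
    4. x and y are d-separated iff no path connects them.
    """
    given = set(given)

    # Step 1: ancestral set of x, y and the conditioning nodes.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])

    # Step 2: undirected, moralised adjacency.
    adj = {n: set() for n in keep}
    for child in keep:
        pars = parents[child]
        for p in pars:
            adj[p].add(child)
            adj[child].add(p)
            for q in pars:
                if p != q:
                    adj[p].add(q)

    # Steps 3 and 4: search from x, never entering a conditioning node.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt == y:
                return False
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                stack.append(nxt)
    return True

# Alarm network from the earlier slides: B -> A <- E, A -> J, A -> M.
alarm = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
```

For example, `d_separated(alarm, "B", "E")` returns `True`, while `d_separated(alarm, "B", "E", ["A"])` returns `False`, which is the explaining-away pattern.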
Next class…
 Inference using Bayes Nets
