Bayesian Networks: Independencies and Inference
Scott Davies and Andrew Moore
Note to other teachers and users of these slides. Andrew and Scott would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
• This implies
$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))$$
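A minimal sketch of this factorization in Python. The three-node chain Z → Y → X and its CPT numbers are illustrative assumptions, not from the slides:

```python
# Joint probability as a product of per-node conditionals:
# P(X1, ..., Xn) = prod_i P(Xi | parents(Xi))
# Hypothetical binary chain Z -> Y -> X.

parents = {"Z": (), "Y": ("Z",), "X": ("Y",)}
cpts = {
    "Z": {(): {0: 0.7, 1: 0.3}},                             # P(Z)
    "Y": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.2, 1: 0.8}},   # P(Y | Z)
    "X": {(0,): {0: 0.6, 1: 0.4}, (1,): {0: 0.1, 1: 0.9}},   # P(X | Y)
}

def joint(assignment):
    """P(assignment), computed as the product of each node's CPT entry."""
    p = 1.0
    for node, pa in parents.items():
        pa_vals = tuple(assignment[v] for v in pa)
        p *= cpts[node][pa_vals][assignment[node]]
    return p

print(joint({"Z": 1, "Y": 0, "X": 1}))  # 0.3 * 0.2 * 0.4 = 0.024
```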
What Independencies does a Bayes Net Model?
• Example: [Figure: a chain Z → Y → X]
Given Y, does learning the value of Z tell us nothing new about X? Equivalently, is P(Z | X, Y) = P(Z | Y)?
$$P(Z \mid X, Y) = \frac{P(Y \mid Z)\, P(X \mid Y)\, P(Z)}{P(X \mid Y)\, P(Y)} \quad \text{(by assumption, i.e., the network’s factorization)}$$
$$= \frac{P(Y \mid Z)\, P(Z)}{P(Y)} = P(Z \mid Y) \quad \text{(Bayes’ rule)}$$
So yes: once Y is known, Z carries no further information about X.
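A brute-force numeric check of that independence, using the same illustrative chain and CPT numbers as in the sketch above:

```python
# P(X=x | Y=y, Z=z) should not depend on z in the chain Z -> Y -> X.
PZ = {0: 0.7, 1: 0.3}
PY_Z = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}  # (z, y) -> P(Y=y | Z=z)
PX_Y = {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.1, (1, 1): 0.9}  # (y, x) -> P(X=x | Y=y)

def p_x_given_yz(x, y, z):
    """P(X=x | Y=y, Z=z) = P(x, y, z) / sum over x' of P(x', y, z)."""
    num = PZ[z] * PY_Z[(z, y)] * PX_Y[(y, x)]
    den = sum(PZ[z] * PY_Z[(z, y)] * PX_Y[(y, xx)] for xx in (0, 1))
    return num / den

print(p_x_given_yz(1, 1, 0), p_x_given_yz(1, 1, 1))  # identical: 0.9 and 0.9
```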
What Independencies does a Bayes Net Model?
[Figure: a network over X, U, V, and Z in which X and Z are connected through both U and V]
• I<X, {U}, Z>? No.
• I<X, {U,V}, Z>? Yes.
• Maybe I<X, S, Z> iff S acts as a cutset between X and Z in an undirected version of the graph…?
Things get a little more confusing
[Figure: Burglar → Alarm ← Earthquake, with Alarm → Phone Call]
• Your house has a twitchy burglar alarm that is also sometimes triggered by earthquakes.
• The Earth arguably doesn’t care whether your house is currently being burgled.
• While you are on vacation, one of your neighbors calls and tells you your home’s burglar alarm is ringing. Uh oh!
Things get a lot more confusing
[Figure: the same Burglar/Earthquake/Alarm/Phone Call network]
• Now suppose you also learn there was a medium-sized earthquake in your neighborhood. The earthquake “explains away” the alarm, and the burglar hypothesis becomes much less likely.
• So Burglar and Earthquake are independent a priori, yet become dependent once we observe the Phone Call: evidence about a common effect couples its causes.
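A numeric illustration of explaining away. The CPT numbers are made up for this sketch; only the network structure comes from the slide:

```python
from itertools import product

# Hypothetical CPTs for Burglar -> Alarm <- Earthquake, Alarm -> PhoneCall.
P_B, P_E = 0.01, 0.02                       # P(Burglar=1), P(Earthquake=1)
P_A = {(0, 0): 0.001, (0, 1): 0.3,          # P(Alarm=1 | Burglar, Earthquake)
       (1, 0): 0.9,   (1, 1): 0.95}
P_C = {0: 0.0, 1: 0.8}                      # P(PhoneCall=1 | Alarm)

def joint(b, e, a, c):
    """P(B=b, E=e, A=a, C=c) via the network's factorization."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pc = P_C[a] if c else 1 - P_C[a]
    return pb * pe * pa * pc

def p_burglar(fixed):
    """P(Burglar=1 | evidence) by enumeration; `fixed` maps variable index -> value."""
    worlds = [w for w in product((0, 1), repeat=4)
              if all(w[i] == v for i, v in fixed.items())]
    return sum(joint(*w) for w in worlds if w[0] == 1) / sum(joint(*w) for w in worlds)

# Indices: 0=Burglar, 1=Earthquake, 2=Alarm, 3=PhoneCall
print(p_burglar({3: 1}))        # ~0.57: the call makes a burglary plausible
print(p_burglar({3: 1, 1: 1}))  # ~0.03: the earthquake explains the alarm away
```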
A path is “blocked” when...
• There exists a variable V on the path such that it is in the evidence set E and the arcs meet tail-to-tail at V (V is a common cause: ← V →),
• Or, there exists a variable V on the path such that it is in the evidence set E and the arcs meet tail-to-head at V (V is on a chain: → V →),
• Or, there exists a variable V on the path such that the arcs meet head-to-head at V (V is a common effect: → V ←), and neither V nor any of its descendants is in the evidence set E.
d-separation to the rescue, cont’d
• X and Z are d-separated by a set of evidence variables E iff every undirected path from X to Z is “blocked” in the sense above.
• If X and Z are d-separated by E, then I<X, E, Z>.

d-separation example
[Figure: an example network over nodes A, B, …, I, J for working through d-separation queries]
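The blocking rules can be turned into a reachability test over the graph (a “Bayes-ball” style check). This is a sketch, not the slides’ own code; the alarm network at the bottom is reused from earlier:

```python
from collections import deque

def d_separated(edges, x, z, evidence):
    """True iff every undirected path from x to z is blocked by `evidence`."""
    children, parents = {}, {}
    for u, v in edges:                      # edges point parent -> child
        children.setdefault(u, set()).add(v)
        parents.setdefault(v, set()).add(u)
    # Evidence nodes and their ancestors (needed for the head-to-head rule).
    anc, stack = set(evidence), list(evidence)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in anc:
                anc.add(p)
                stack.append(p)
    # Explore (node, direction) states; "up" = entered from a child, "down" = from a parent.
    frontier, visited = deque([(x, "up")]), set()
    while frontier:
        node, direction = frontier.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node == z:
            return False                    # reached z along an active path
        if direction == "up" and node not in evidence:
            # Tail-to-tail and tail-to-head connections stay active when node is unobserved.
            frontier.extend((p, "up") for p in parents.get(node, ()))
            frontier.extend((c, "down") for c in children.get(node, ()))
        elif direction == "down":
            if node not in evidence:        # chain: pass straight through
                frontier.extend((c, "down") for c in children.get(node, ()))
            if node in anc:                 # head-to-head: active iff node or a descendant is observed
                frontier.extend((p, "up") for p in parents.get(node, ()))
    return True

edges = [("Burglar", "Alarm"), ("Earthquake", "Alarm"), ("Alarm", "PhoneCall")]
print(d_separated(edges, "Burglar", "Earthquake", set()))          # True: I<B, {}, E>
print(d_separated(edges, "Burglar", "Earthquake", {"PhoneCall"}))  # False: explaining away
```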
Bayesian Network Inference
• In general, exact inference in Bayesian networks is intractable: counting satisfying assignments of a Boolean formula (“How many satisfying assignments?”) reduces to it.
[Figure: inputs I1 I2 I3 I4 I5, each an independent fair coin, feed a deterministic circuit for the formula, whose output node is O]
• P(O) must be (#sat. assign.) × (0.5^#inputs), so computing P(O) exactly answers the counting problem.
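Checking the reduction on a tiny example. The formula below is an arbitrary choice for illustration:

```python
from itertools import product

def phi(i1, i2, i3):
    """An arbitrary 3-input formula: (I1 or I2) and (not I2 or I3)."""
    return bool((i1 or i2) and ((not i2) or i3))

n = 3
sat = sum(phi(*bits) for bits in product((0, 1), repeat=n))
# With each input an independent fair coin and O the deterministic output,
# P(O=1) = (#satisfying assignments) * 0.5^n.
p_o = sum(0.5 ** n for bits in product((0, 1), repeat=n) if phi(*bits))
print(sat, p_o)  # 4 satisfying assignments -> P(O=1) = 0.5
```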
Decomposing the probabilities
• Suppose we want P(Xi | E), where E is the evidence. Split E into Ei+, the evidence connected to Xi through its parents, and Ei−, the evidence connected to Xi through its children.
[Figure: node Xi with Ei+ above it and Ei− below it]
$$P(X_i \mid E) = P(X_i \mid E_i^-, E_i^+)$$
Decomposing the probabilities, cont’d
$$P(X_i \mid E) = P(X_i \mid E_i^-, E_i^+)$$
$$= \frac{P(E_i^- \mid X_i, E_i^+)\, P(X_i \mid E_i^+)}{P(E_i^- \mid E_i^+)} \quad \text{(Bayes’ rule)}$$
$$= \frac{P(E_i^- \mid X_i)\, P(X_i \mid E_i^+)}{P(E_i^- \mid E_i^+)} \quad (E_i^- \text{ and } E_i^+ \text{ are independent given } X_i)$$
Decomposing the probabilities, cont’d
$$P(X_i \mid E) = \frac{P(E_i^- \mid X_i)\, P(X_i \mid E_i^+)}{P(E_i^- \mid E_i^+)} = \alpha\, \pi(X_i)\, \lambda(X_i)$$
Where:
• α is a constant independent of Xi
• π(Xi) = P(Xi | Ei+)
• λ(Xi) = P(Ei− | Xi)
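In code, this final step is a pointwise product and a normalization; the two message vectors below are placeholder numbers for a three-valued Xi:

```python
import numpy as np

pi_xi = np.array([0.5, 0.3, 0.2])   # pi(Xi)     = P(Xi | Ei+), assumed values
lam_xi = np.array([0.9, 0.1, 0.4])  # lambda(Xi) = P(Ei- | Xi), assumed values

posterior = pi_xi * lam_xi          # alpha * pi * lambda, up to the constant
posterior /= posterior.sum()        # normalizing plays the role of alpha
print(posterior)                    # P(Xi | E)
```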
Quick aside: “Virtual evidence”
• Evidence about Xi is sometimes uncertain (e.g., a noisy sensor). Such “virtual evidence” can be folded into the same algorithm by attaching a dummy child to Xi whose λ message encodes the likelihood of the observation under each value of Xi.
Calculating λ(Xi) for non-leaves
• Suppose Xi has one child, Xc. Then:
$$\lambda(X_i) = P(E_i^- \mid X_i) = \sum_j P(E_i^-, X_c = j \mid X_i)$$
$$= \sum_j P(X_c = j \mid X_i)\, P(E_i^- \mid X_i, X_c = j)$$
Calculating λ(Xi) for non-leaves
$$\lambda(X_i) = P(E_i^- \mid X_i) = \sum_j P(E_i^-, X_c = j \mid X_i)$$
$$= \sum_j P(X_c = j \mid X_i)\, P(E_i^- \mid X_i, X_c = j)$$
$$= \sum_j P(X_c = j \mid X_i)\, P(E_i^- \mid X_c = j)$$
$$= \sum_j P(X_c = j \mid X_i)\, \lambda(X_c = j)$$
• For a set of children C:
$$\lambda(X_i) = \prod_{X_j \in C} \lambda_j(X_i) = \prod_{X_j \in C} \sum_{x_j} P(X_j = x_j \mid X_i)\, \lambda(X_j = x_j)$$
where λj(Xi) is the contribution to P(Ei− | Xi) of the part of the evidence lying in the subtree rooted at one of Xi’s children Xj.
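A sketch of that product-of-sums as a function; the CPTs and child messages are illustrative placeholders (each CPT row is indexed by Xi’s value):

```python
import numpy as np

def lambda_message(child_cpts, child_lambdas):
    """lambda(Xi) = prod over children Xj of sum_xj P(Xj=xj | Xi) * lambda(Xj=xj)."""
    lam = None
    for cpt, child_lam in zip(child_cpts, child_lambdas):
        contrib = cpt @ child_lam            # lambda_j(Xi), one factor per child
        lam = contrib if lam is None else lam * contrib
    return lam

cpt1 = np.array([[0.7, 0.3], [0.2, 0.8]])    # P(X1 | Xi); Xi and X1 binary
cpt2 = np.array([[0.5, 0.5], [0.1, 0.9]])    # P(X2 | Xi)
lam1 = np.array([1.0, 0.0])                  # X1 observed to be 0
lam2 = np.array([0.6, 0.4])                  # soft evidence in X2's subtree
print(lambda_message([cpt1, cpt2], [lam1, lam2]))  # [0.35, 0.084]
```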
We are now λ-happy
[Figure: λ messages passed upward from the leaves through the tree]
• Computing λ’s bottom-up from the leaves gives λ(Xi) for every node. Now we need π(Xi) = P(Xi | Ei+).
Computing π(Xi)
• Suppose Xi has one parent, Xp. Then:
$$\pi(X_i) = P(X_i \mid E_i^+) = \sum_j P(X_i, X_p = j \mid E_i^+)$$
$$= \sum_j P(X_i \mid X_p = j, E_i^+)\, P(X_p = j \mid E_i^+)$$
$$= \sum_j P(X_i \mid X_p = j)\, P(X_p = j \mid E_i^+)$$
$$= \sum_j P(X_i \mid X_p = j)\, \frac{P(X_p = j \mid E)}{\lambda_i(X_p = j)}$$
$$= \sum_j P(X_i \mid X_p = j)\, \pi_i(X_p = j)$$
Where πi(Xp) is defined as P(Xp | E) / λi(Xp).
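The same step as a function; the CPT, parent posterior, and earlier λ message are placeholder numbers (the result is unnormalized, which the constant α absorbs later):

```python
import numpy as np

def pi_message(cpt, parent_posterior, lam_i_to_parent):
    """pi(Xi) = sum_j P(Xi | Xp=j) * pi_i(Xp=j), with pi_i(Xp) = P(Xp | E) / lambda_i(Xp)."""
    pi_i = parent_posterior / lam_i_to_parent  # divide Xi's own contribution back out
    return pi_i @ cpt                          # weight each CPT row by pi_i(Xp=j)

cpt = np.array([[0.9, 0.1], [0.4, 0.6]])  # P(Xi | Xp); rows indexed by Xp's value
parent_posterior = np.array([0.3, 0.7])   # P(Xp | E)
lam_i = np.array([0.5, 1.0])              # lambda_i(Xp): the message Xi sent up earlier
print(pi_message(cpt, parent_posterior, lam_i))  # unnormalized pi(Xi)
```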
We’re done. Yay!
[Figure: λ messages flow up the tree while π messages flow down]
• Passing λ’s up from the leaves and π’s down from the root computes P(Xi | E) at every node in two sweeps.

Conjunctive queries
• For P(Xi, Xj | E), apply the chain rule: P(Xi, Xj | E) = P(Xi | E) P(Xj | Xi, E), computing the second factor by adding Xi to the evidence and re-running the algorithm.
Polytrees
• The algorithm above works only on polytrees (singly connected networks). Two ways to handle loops, as sketched after this list:
[Figure: a loop broken two ways: B and C merged into one compound node BC; and the network solved twice by conditioning, once with the cut node set to 0 and once set to 1]
• Clustering: merge the offending nodes (e.g., B and C) into a single compound variable so the network becomes singly connected.
• Conditioning: instantiate a loop-cutset variable to each of its values, run the polytree algorithm for each instantiation, and combine the answers weighted by the probabilities of those values.
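A sketch of cutset conditioning as the weighted combination it amounts to. Both helpers passed in are hypothetical stand-ins: `polytree_posterior(c)` would be the message-passing algorithm above run with the cut node clamped to c, and `cutset_weight(c)` the probability of that clamping given the evidence:

```python
def conditioned_posterior(cutset_values, polytree_posterior, cutset_weight):
    """P(X | E) = sum_c P(X | E, C=c) * P(C=c | E), one polytree run per c."""
    total = None
    for c in cutset_values:               # e.g., the cut node set to 0, then to 1
        weighted = [cutset_weight(c) * p for p in polytree_posterior(c)]
        total = weighted if total is None else [t + w for t, w in zip(total, weighted)]
    return total

# Toy usage with stand-in functions:
print(conditioned_posterior(
    [0, 1],
    polytree_posterior=lambda c: [0.8, 0.2] if c == 0 else [0.3, 0.7],
    cutset_weight=lambda c: 0.6 if c == 0 else 0.4,
))  # 0.6*[0.8, 0.2] + 0.4*[0.3, 0.7] = [0.6, 0.4]
```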
Join trees
[Figure: a multiply connected network over A through G converted into a join tree whose nodes are variable clusters such as ABC, BCD, and DF]
• An arbitrary network can be compiled into a join tree of variable clusters, on which a similar message-passing algorithm runs.
• In the worst case the join tree’s nodes must take on exponentially many combinations of values, but the method often works well in practice.