Lecture 4: Bayesian Networks
Farhana Shahid, Brac University
Summer 2020
Probabilistic Models
• Naïve Bayes
• Used for classification based on features
• Bayesian Belief Networks
• A compact graphical representation used for causal inference and to represent statistical relationships
• Probability + Graph Theory
• Bayesian networks are graphs; in fact, directed acyclic graphs (DAGs)
Directed Acyclic Graph
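To make the DAG idea concrete, here is a short sketch (not from the lecture; the node names follow the sprinkler example used later in these slides): a directed graph stored as an adjacency dict, plus a depth-first check that it contains no cycle.

```python
def is_acyclic(graph):
    """graph: dict mapping node -> list of children. True iff no directed cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited / on the current DFS path / finished
    color = {n: WHITE for n in graph}

    def dfs(n):
        color[n] = GRAY
        for child in graph.get(n, []):
            state = color.get(child, WHITE)
            if state == GRAY:                      # back edge: a cycle exists
                return False
            if state == WHITE and not dfs(child):
                return False
        color[n] = BLACK
        return True

    return all(dfs(n) for n in graph if color[n] == WHITE)

# Edges of the sprinkler network used later in the lecture
sprinkler_net = {
    "Cloudy": ["Sprinkler", "Rain"],
    "Sprinkler": ["WetGrass"],
    "Rain": ["WetGrass"],
    "WetGrass": [],
}
print(is_acyclic(sprinkler_net))                   # True: a valid BN structure
print(is_acyclic({"A": ["B"], "B": ["A"]}))        # False: A -> B -> A is a cycle
```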
Learning the Network
• Given the data, we would like to learn the network that fits the data well
• Heuristic algorithms exist to do this; finding the optimal network is NP-complete
• If many models fit the data well, we need a scoring measure to choose among them
• Scoring function: Bayesian likelihood
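As a toy illustration of likelihood-based scoring (the data and variable names below are invented for illustration, not from the lecture), this sketch scores two candidate structures over binary variables A and B by the log-likelihood of a small dataset under maximum-likelihood CPT estimates.

```python
import math
from collections import Counter

# Hypothetical dataset of (A, B) observations
data = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 0), (1, 1), (0, 1)]
n = len(data)
cA = Counter(a for a, b in data)     # marginal counts of A
cB = Counter(b for a, b in data)     # marginal counts of B
cAB = Counter(data)                  # joint counts of (A, B)

# Candidate 1: A and B independent, P(A) P(B)
ll_indep = sum(math.log(cA[a] / n) + math.log(cB[b] / n) for a, b in data)

# Candidate 2: edge A -> B, P(A) P(B|A), with P(B|A) estimated from counts
ll_edge = sum(math.log(cA[a] / n) + math.log(cAB[(a, b)] / cA[a]) for a, b in data)

print(ll_indep, ll_edge)   # the A -> B structure fits this dependent data better
```

In this toy data A=1 tends to co-occur with B=1, so the structure with the edge scores higher; a real scoring function (e.g. Bayesian likelihood) also penalizes model complexity.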
Bayes Probability: Rules of Inference
Bayes Logic
• Given that an event may have been the result of two or more causes, what is the probability that it occurred as a result of a particular cause?
• We would like to predict the unobserved using our knowledge, i.e. our assumptions about how things are related
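A minimal numeric sketch of this idea (the probabilities below are assumed for illustration, not from the lecture): two possible causes of wet grass, rain and a sprinkler, with known priors and likelihoods, and Bayes' rule recovering P(cause | effect).

```python
# Assumed priors over the two causes and likelihoods of the effect
p_rain, p_sprinkler = 0.3, 0.7
p_wet_given_rain = 0.9
p_wet_given_sprinkler = 0.8

# Total probability of the observed effect (law of total probability)
p_wet = p_rain * p_wet_given_rain + p_sprinkler * p_wet_given_sprinkler

# Bayes' rule: posterior probability that rain was the cause, given wet grass
p_rain_given_wet = p_rain * p_wet_given_rain / p_wet
print(round(p_rain_given_wet, 3))   # 0.325
```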
Conditional Probabilities
• P(A | B) = P(A, B) / P(B), for P(B) > 0
• Equivalently: P(A, B) = P(A | B) P(B)
Joint Probabilities
• Chain rule: P(A, B, C) = P(A) P(B | A) P(C | A, B)
Conditional Independencies
• A is conditionally independent of B given C if P(A, B | C) = P(A | C) P(B | C)
Markov Assumption
• Each variable is conditionally independent of its non-descendants, given its parents
• This lets the joint distribution factor as a product of P(variable | parents)
Example: The Sprinkler Network
• Nodes: Cloudy (C), Sprinkler (S), Rain (R), Wet grass (W)
• Edges: C → S, C → R, S → W, R → W

P(C)
  P(C=T) = 0.5, P(C=F) = 0.5

P(S|C)
  C   P(S=F)   P(S=T)
  F   0.5      0.5
  T   0.9      0.1

P(R|C)
  C   P(R=F)   P(R=T)
  F   0.8      0.2
  T   0.2      0.8

P(W|S,R)
  S   R   P(W=F)   P(W=T)
  F   F   1        0
  F   T   0.1      0.9
  T   F   0.1      0.9
  T   T   0.01     0.99

P(W, R, S, C) = P(C) P(S|C) P(R|C, S) P(W|C, R, S)
              = P(C) P(S|C) P(R|C) P(W|S, R)
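The CPTs and the factorized joint above can be transcribed directly into Python (a sketch; the dict layout is my own choice, with 0/1 standing for False/True):

```python
P_C = {0: 0.5, 1: 0.5}
P_S_given_C = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}   # P_S_given_C[c][s]
P_R_given_C = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}   # P_R_given_C[c][r]
P_W_given_SR = {(0, 0): {0: 1.0, 1: 0.0}, (0, 1): {0: 0.1, 1: 0.9},
                (1, 0): {0: 0.1, 1: 0.9}, (1, 1): {0: 0.01, 1: 0.99}}

def joint(c, s, r, w):
    """P(C=c, S=s, R=r, W=w) via the network factorization."""
    return P_C[c] * P_S_given_C[c][s] * P_R_given_C[c][r] * P_W_given_SR[(s, r)][w]

# e.g. P(C=1, S=0, R=1, W=1) = 0.5 * 0.9 * 0.8 * 0.9
print(round(joint(1, 0, 1, 1), 3))   # 0.324
```

Summing `joint` over all 16 assignments gives 1, a quick sanity check that the CPTs are consistent.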
Inference
• Example query: P(S=1, W=1), obtained by summing the joint over the unobserved variables C and R

P(S=1, W=1) = Σ_{C,R} P(C) P(S=1|C) P(R|C) P(W=1|S=1, R)

  C   R   P(C) × P(S=1|C) × P(R|C) × P(W=1|S=1,R)
  0   0   0.5 × 0.5 × 0.8 × 0.9  = 0.18
  0   1   0.5 × 0.5 × 0.2 × 0.99 = 0.0495
  1   0   0.5 × 0.1 × 0.2 × 0.9  = 0.009
  1   1   0.5 × 0.1 × 0.8 × 0.99 = 0.0396
                                Σ = 0.2781

(CPT values as on the previous slide.)
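The sum above can be checked with a few lines of Python (a sketch; the CPT dicts mirror the example slide, keeping only the entries this query needs):

```python
from itertools import product

P_C = {0: 0.5, 1: 0.5}
P_S1 = {0: 0.5, 1: 0.1}                             # P(S=1 | C)
P_R = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}    # P(R | C)
P_W1 = {(1, 0): 0.9, (1, 1): 0.99}                  # P(W=1 | S=1, R)

# Sum the factored joint over the unobserved variables C and R
total = sum(P_C[c] * P_S1[c] * P_R[c][r] * P_W1[(1, r)]
            for c, r in product([0, 1], repeat=2))
print(round(total, 4))   # 0.2781
```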
DIY
• P(R=1 | W=1) = ?
• Compare P(S=1 | W=1) with P(R=1 | W=1)
• If you find the grass on your lawn wet, infer whether it is more likely that the sprinkler was on or that it rained in the morning
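One way to check your DIY answers (a sketch, not the expected solution method) is brute-force enumeration of all 2^4 assignments, using the CPTs from the example slide:

```python
from itertools import product

P_C = {0: 0.5, 1: 0.5}
P_S = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}    # P(S | C)
P_R = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}    # P(R | C)
P_W = {(0, 0): {0: 1.0, 1: 0.0}, (0, 1): {0: 0.1, 1: 0.9},
       (1, 0): {0: 0.1, 1: 0.9}, (1, 1): {0: 0.01, 1: 0.99}}   # P(W | S, R)

def joint(c, s, r, w):
    return P_C[c] * P_S[c][s] * P_R[c][r] * P_W[(s, r)][w]

# Marginals needed for the two posteriors, all with W=1 observed
p_w = sum(joint(c, s, r, 1) for c, s, r in product([0, 1], repeat=3))
p_r_w = sum(joint(c, s, 1, 1) for c, s in product([0, 1], repeat=2))
p_s_w = sum(joint(c, 1, r, 1) for c, r in product([0, 1], repeat=2))

print(round(p_r_w / p_w, 3))   # P(R=1 | W=1) ≈ 0.708
print(round(p_s_w / p_w, 3))   # P(S=1 | W=1) ≈ 0.43
```

Since P(R=1 | W=1) exceeds P(S=1 | W=1), rain is the more likely explanation for the wet grass under this model.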