Lecture Note #7_PEC-CS701E
1. Statistical-Based Methods
• Regression
• Bayesian Classifier
2. Distance-Based Classification
• K-Nearest Neighbours
Bayesian Classifier
• A statistical classifier
• Performs probabilistic prediction, i.e., predicts class membership probabilities
• Foundation: based on Bayes’ Theorem.
• Assumptions
1. The classes are mutually exclusive and exhaustive.
2. The attributes are independent given the class.
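Formally, assumption 2 (conditional independence) says that for an attribute vector $X = (X_1, X_2, \ldots, X_n)$ and a class variable $C$,

$$P(X_1, X_2, \ldots, X_n \mid C) = \prod_{i=1}^{n} P(X_i \mid C)$$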
• In many applications, the relationship between the attribute set and the class variable is non-deterministic.
• In other words, a test instance cannot be assigned a class label with certainty.
• Before discussing the Bayesian classifier, we take a quick look at the theory of probability and then Bayes’ Theorem.
Bayes’ Theorem of Probability
Simple Probability
• Classical definition: if a random experiment has n equally likely outcomes, m of which are favourable to an event A, then P(A) = m/n.
• Suppose A and B are any two events and P(A), P(B) denote the probabilities that the events A and B will occur, respectively.
• Independent events: two events are independent if the occurrence of one does not alter the probability of occurrence of the other.
Joint Probability
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

where $P(A \cap B)$ is the joint probability that both A and B occur.
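For example, drawing one card from a standard 52-card deck, with A = "the card is a heart" and B = "the card is a king":

$$P(A \cup B) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} \approx 0.31$$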
Conditional Probability
Suppose A and B are two events associated with a random experiment. The probability of A under the condition that B has already occurred, where $P(B) \neq 0$, is given by

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Corollary: Conditional Probability

$$P(A \cap B) = P(A) \cdot P(B \mid A), \quad \text{if } P(A) \neq 0$$
$$\text{or } P(A \cap B) = P(B) \cdot P(A \mid B), \quad \text{if } P(B) \neq 0$$
$$P(A \cap B \cap C) = P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B)$$

For n events $A_1, A_2, \ldots, A_n$, if all events are mutually independent of each other,

$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n)$$

Note:
• $P(A \mid B) = 0$ if the events are mutually exclusive
• $P(A \mid B) = P(A)$ if A and B are independent
• $P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$ otherwise, since $P(A \cap B) = P(B \cap A)$
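A quick way to check these identities is to enumerate a small sample space. The sketch below (an illustration added here, using two fair dice) verifies the corollary $P(A \cap B) = P(A) \cdot P(B \mid A)$ by counting outcomes:

```python
from fractions import Fraction

# Sample space for two fair six-sided dice: 36 equally likely outcomes.
space = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

def prob(event, outcomes):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == 6           # event A: the first die shows 6
B = lambda o: o[0] + o[1] >= 10   # event B: the sum is at least 10

p_A = prob(A, space)                               # 1/6
p_A_and_B = prob(lambda o: A(o) and B(o), space)   # 3/36 = 1/12
# P(B | A): probability of B within the reduced sample space where A holds
p_B_given_A = prob(B, [o for o in space if A(o)])  # 1/2

assert p_A_and_B == p_A * p_B_given_A              # corollary verified
print(p_A_and_B, p_B_given_A)                      # 1/12 1/2
```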
• Generalization of Conditional Probability:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \cap A)}{P(B)} = \frac{P(B \mid A) \cdot P(A)}{P(B)}, \quad \because P(A \cap B) = P(B \mid A) \cdot P(A) = P(A \mid B) \cdot P(B)$$

By the law of total probability, $P(B) = P\big((B \cap A) \cup (B \cap \bar{A})\big)$, where $\bar{A}$ denotes the complement of event A. Thus,

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P\big((B \cap A) \cup (B \cap \bar{A})\big)} = \frac{P(B \mid A) \cdot P(A)}{P(B \mid A) \cdot P(A) + P(B \mid \bar{A}) \cdot P(\bar{A})}$$
In general, if A, B and C are mutually exclusive and exhaustive events,

$$P(A \mid D) = \frac{P(A) \cdot P(D \mid A)}{P(A) \cdot P(D \mid A) + P(B) \cdot P(D \mid B) + P(C) \cdot P(D \mid C)}$$
Total Probability
Definition: Total Probability
If $E_1, E_2, \ldots, E_n$ are n mutually exclusive and exhaustive events associated with a random experiment and A is any event, then

$$P(A) = P(E_1) \cdot P(A \mid E_1) + P(E_2) \cdot P(A \mid E_2) + \cdots + P(E_n) \cdot P(A \mid E_n)$$
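For example, suppose one of two bags is chosen at random ($E_1$ = bag 1, $E_2$ = bag 2, each with probability 1/2), where bag 1 contains 3 red and 1 blue ball and bag 2 contains 1 red and 3 blue balls. If A is the event of drawing a red ball, then

$$P(A) = P(E_1) \cdot P(A \mid E_1) + P(E_2) \cdot P(A \mid E_2) = \frac{1}{2} \cdot \frac{3}{4} + \frac{1}{2} \cdot \frac{1}{4} = \frac{1}{2}$$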
Bayes’ Theorem
If $E_1, E_2, \ldots, E_n$ are mutually exclusive and exhaustive events with $P(E_i) \neq 0$ and A is any event, then

$$P(E_i \mid A) = \frac{P(E_i) \cdot P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j) \cdot P(A \mid E_j)}$$
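As a minimal sketch (the function name and structure are illustrative assumptions, not from the notes), Bayes’ theorem is straightforward to compute once the priors $P(E_i)$ and likelihoods $P(A \mid E_i)$ are known:

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Bayes' theorem: return P(Ei | A) for each i.

    The denominator is exactly the total probability P(A).
    """
    total = sum(p * l for p, l in zip(priors, likelihoods))  # P(A)
    return [p * l / total for p, l in zip(priors, likelihoods)]

# The two-bag example above: P(E1 | red) = (1/2)(3/4) / (1/2) = 3/4
print(posteriors([Fraction(1, 2), Fraction(1, 2)],
                 [Fraction(3, 4), Fraction(1, 4)]))  # [3/4, 1/4]
```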
Probability Basics
• Prior, conditional and joint probability
– Prior probability: $P(X)$
– Conditional probability: $P(X_1 \mid X_2)$, $P(X_2 \mid X_1)$
– Joint probability: $X = (X_1, X_2)$, $P(X) = P(X_1, X_2)$
– Relationship: $P(X_1, X_2) = P(X_2 \mid X_1)P(X_1) = P(X_1 \mid X_2)P(X_2)$
– Independence: $P(X_2 \mid X_1) = P(X_2)$, $P(X_1 \mid X_2) = P(X_1)$, $P(X_1, X_2) = P(X_1)P(X_2)$
• Bayesian Rule:

$$P(C \mid X) = \frac{P(X \mid C) \cdot P(C)}{P(X)}, \qquad \text{i.e., posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$
Prior and Posterior Probabilities
• P(A) and P(B) are called prior probabilities.
• P(A|B) and P(B|A) are called posterior probabilities.

Example 8.6: Prior versus Posterior Probabilities
• Consider a sample of 10 observations (X, Y), where the event Y has two outcomes, A and B, which depend on another event X with outcomes $x_1$, $x_2$ and $x_3$. [Table of the 10 (X, Y) observations omitted; the recoverable rows are $(x_1, A)$, $(x_2, A)$, $(x_3, B)$, $(x_3, A)$, $(x_2, B)$, $(x_1, A)$, $(x_1, B)$, $(x_3, B)$.]
• Case 1: Suppose we have no information about the event X. Then, from the given sample space, we can calculate $P(Y = A) = \frac{5}{10} = 0.5$.
• Case 2: Now suppose we want to calculate $P(X = x_2 \mid Y = A) = \frac{2}{5} = 0.4$.
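The two cases can be reproduced with a few lines of Python. The data below is a hypothetical sample: two of the ten table rows are not recoverable from the notes, so those rows are assumed, chosen to be consistent with the stated probabilities 0.5 and 0.4:

```python
# Ten (X, Y) observations; the last two rows are assumed, not from the notes.
sample = [("x1", "A"), ("x2", "A"), ("x3", "B"), ("x3", "A"), ("x2", "B"),
          ("x1", "A"), ("x1", "B"), ("x3", "B"), ("x2", "A"), ("x2", "B")]

# Case 1 -- prior: P(Y = A) as a relative frequency over the whole sample.
p_A = sum(1 for _, y in sample if y == "A") / len(sample)

# Case 2 -- posterior: P(X = x2 | Y = A), counting only rows where Y = A.
rows_A = [x for x, y in sample if y == "A"]
p_x2_given_A = rows_A.count("x2") / len(rows_A)

print(p_A, p_x2_given_A)  # 0.5 0.4
```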
Naïve Bayesian Classifier
• The naïve Bayesian classifier calculates this posterior probability using Bayes’ theorem:

$$P(Y \mid X) = \frac{P(X \mid Y) \cdot P(Y)}{P(X)}$$

• Consider the posterior probabilities for any two classes, $P(Y = y_i \mid X = x)$ and $P(Y = y_j \mid X = x)$.
• If $P(Y = y_i \mid X = x) > P(Y = y_j \mid X = x)$, then we say that the class $y_i$ is stronger than $y_j$ for the instance $X = x$.
Example: Play Tennis
[The worked Play Tennis example (data table and probability calculations) is not reproduced here.]
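Since the worked slides are not reproduced, here is a minimal sketch of how the naïve Bayesian classifier handles this example. It assumes the standard 14-record Play Tennis dataset from the literature (attributes Outlook, Temperature, Humidity, Wind; class Play) and the usual test instance; the variable names are illustrative:

```python
from collections import Counter

# The standard Play Tennis dataset: (Outlook, Temperature, Humidity, Wind, Play).
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]

class_counts = Counter(row[-1] for row in data)  # {"Yes": 9, "No": 5}
# cond_counts[(attribute_index, value, class)] = frequency in the training data
cond_counts = Counter((j, row[j], row[-1]) for row in data for j in range(4))

def score(x, c):
    """Unnormalised posterior value p_c = P(c) * prod_j P(x_j | c)."""
    p = class_counts[c] / len(data)                    # prior P(c)
    for j, v in enumerate(x):
        p *= cond_counts[(j, v, c)] / class_counts[c]  # likelihood P(x_j | c)
    return p

x = ("Sunny", "Cool", "High", "Strong")
scores = {c: score(x, c) for c in class_counts}
print(scores)                       # {'No': ~0.0206, 'Yes': ~0.0053}
print(max(scores, key=scores.get))  # -> 'No'
```

Note that the two scores do not sum to 1 (they are only proportional to the posteriors), which is exactly the point of the note in the algorithm below; in practice, zero counts are usually handled with Laplace smoothing.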
Naïve Bayesian Classifier
Algorithm: Naïve Bayesian Classification
Input: A set of k mutually exclusive and exhaustive classes C = $\{c_1, c_2, \ldots, c_k\}$ with prior probabilities $P(c_1), P(c_2), \ldots, P(c_k)$, and a test instance $x = (a_1, a_2, \ldots, a_n)$ of n attribute values.
Step: For each class $c_i$, compute the value $p_i = P(c_i) \cdot \prod_{j=1}^{n} P(a_j \mid c_i)$, and assign x to the class with the largest $p_i$.
Note: $\sum p_i \neq 1$, because the $p_i$ are not probabilities but values proportional to the posterior probabilities.
Pros and Cons
• The naïve Bayes approach is very popular and often works well.