Naïve Bayes Classifier
Generative vs. Discriminative Classifiers
Training classifiers involves estimating f: X → Y, or P(Y|X)
Discriminative classifiers:
1. Assume some functional form for P(Y|X)
2. Estimate parameters of P(Y|X) directly from training data
Generative classifiers (also called 'informative' by Rubinstein & Hastie):
1. Assume some functional form for P(X|Y), P(Y)
2. Estimate parameters of P(X|Y), P(Y) directly from training data
3. Use Bayes rule to calculate P(Y|X = x_i)
Bayes Formula
[Figure: Bayes rule relating posterior, likelihood, prior, and evidence]
Generative Model
[Figure: a generative model over the features Color, Size, Texture, Weight, ...]
Discriminative Model
[Figure: a discriminative model (e.g., Logistic Regression) mapping the features Color, Size, Texture, Weight, ... to the class]
Comparison
• Generative models
  – Assume some functional form for P(X|Y), P(Y)
  – Estimate parameters of P(X|Y), P(Y) directly from training data
  – Use Bayes rule to calculate P(Y|X = x)
• Discriminative models
  – Directly assume some functional form for P(Y|X)
  – Estimate parameters of P(Y|X) directly from training data
Probability Basics
• Prior, conditional and joint probability for random variables
  – Prior probability: P(X)
  – Conditional probability: P(X_1|X_2), P(X_2|X_1)
  – Joint probability: X = (X_1, X_2), P(X) = P(X_1, X_2)
  – Relationship: P(X_1, X_2) = P(X_2|X_1) P(X_1) = P(X_1|X_2) P(X_2)
  – Independence: P(X_2|X_1) = P(X_2), P(X_1|X_2) = P(X_1), P(X_1, X_2) = P(X_1) P(X_2)
• Bayesian Rule
  P(C|X) = P(X|C) P(C) / P(X)
  (Posterior = Likelihood × Prior / Evidence)
Probability Basics
• Quiz: We have two fair six-sided dice. When they are rolled, the following events may occur: (A) die 1 lands on side "3", (B) die 2 lands on side "1", and (C) the two dice sum to eight. Answer the following questions:
  1) P(A) = ?
  2) P(B) = ?
  3) P(C) = ?
  4) P(A|B) = ?
  5) P(C|A) = ?
  6) P(A, B) = ?
  7) P(A, C) = ?
  8) Is P(A, C) equal to P(A) P(C)?
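A quick enumeration over the 36 equally likely outcomes of the two dice confirms the answers (a small Python sketch; the helper names are just illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (die1, die2) of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == 3            # die 1 lands on "3"
B = lambda o: o[1] == 1            # die 2 lands on "1"
C = lambda o: o[0] + o[1] == 8     # the two dice sum to eight

P_A, P_B, P_C = prob(A), prob(B), prob(C)
P_AB = prob(lambda o: A(o) and B(o))
P_AC = prob(lambda o: A(o) and C(o))

print(P_A, P_B, P_C)               # 1/6, 1/6, 5/36
print(P_AB / P_B)                  # P(A|B) = 1/6
print(P_AC / P_A)                  # P(C|A) = 1/6
print(P_AB, P_AC)                  # 1/36, 1/36
print(P_AC == P_A * P_C)           # False: A and C are not independent
```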
Probabilistic Classification
• Establishing a probabilistic model for classification
  – Discriminative model: P(C|X), where C = c_1, ..., c_L and X = (X_1, ..., X_n)
    [Figure: a single discriminative probabilistic classifier maps an input x = (x_1, x_2, ..., x_n) to the posteriors P(c_1|x), P(c_2|x), ..., P(c_L|x)]
Probabilistic Classification
• Establishing a probabilistic model for classification (cont.)
  – Generative model: P(X|C), where C = c_1, ..., c_L and X = (X_1, ..., X_n)
    [Figure: one generative probabilistic model per class (Class 1, Class 2, ..., Class L); each maps an input x = (x_1, x_2, ..., x_n) to its likelihood P(x|c_1), P(x|c_2), ..., P(x|c_L)]
Probabilistic Classification
• MAP classification rule
  – MAP: Maximum A Posteriori
  – Assign x to c* if
    P(C = c*|X = x) > P(C = c|X = x) for all c ≠ c*, c = c_1, ..., c_L
• Generative classification with the MAP rule
  – Apply Bayes rule to convert the likelihoods into posterior probabilities:
    P(C = c_i|X = x) = P(X = x|C = c_i) P(C = c_i) / P(X = x)
                     ∝ P(X = x|C = c_i) P(C = c_i), for i = 1, 2, ..., L
  – Then apply the MAP rule (see the sketch below)
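As a minimal sketch of generative classification with the MAP rule (the `prior` and `likelihood` arguments are hypothetical stand-ins for whatever estimates the generative model provides):

```python
def map_classify(x, classes, prior, likelihood):
    """Assign x to the class c* maximizing P(X=x|C=c) * P(C=c).

    prior[c] is the estimate of P(C=c); likelihood(x, c) returns P(X=x|C=c).
    The evidence P(X=x) is the same for every class, so it can be dropped.
    """
    return max(classes, key=lambda c: likelihood(x, c) * prior[c])
```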
Naïve Bayes
• Bayes classification
  P(C|X) ∝ P(X|C) P(C) = P(X_1, ..., X_n|C) P(C)
  Difficulty: learning the joint probability P(X_1, ..., X_n|C)
• Naïve Bayes classification
  – Assumption: all input attributes are conditionally independent given the class!
    P(X_1, X_2, ..., X_n|C) = P(X_1|X_2, ..., X_n; C) P(X_2, ..., X_n|C)
                            = P(X_1|C) P(X_2, ..., X_n|C)
                            = P(X_1|C) P(X_2|C) ... P(X_n|C)
  – MAP classification rule: for x = (x_1, x_2, ..., x_n), assign the label c* if
    [P(x_1|c*) ... P(x_n|c*)] P(c*) > [P(x_1|c) ... P(x_n|c)] P(c), for all c ≠ c*, c = c_1, ..., c_L
Naïve Bayes
• Naïve Bayes Algorithm (for discrete input attributes)
  – Learning Phase: Given a training set S,
      For each target value c_i (c_i = c_1, ..., c_L)
        P̂(C = c_i) ← estimate P(C = c_i) with examples in S;
        For every attribute value x_jk of each attribute X_j (j = 1, ..., n; k = 1, ..., N_j)
          P̂(X_j = x_jk|C = c_i) ← estimate P(X_j = x_jk|C = c_i) with examples in S;
      Output: conditional probability tables; for each X_j, N_j × L elements
  – Test Phase: Given an unknown instance X' = (a_1, ..., a_n),
      look up the tables to assign the label c* to X' if
      [P̂(a_1|c*) ... P̂(a_n|c*)] P̂(c*) > [P̂(a_1|c) ... P̂(a_n|c)] P̂(c), for all c ≠ c*, c = c_1, ..., c_L
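A minimal sketch of the learning and test phases for discrete attributes, assuming simple counting estimates and no smoothing (class and method names are illustrative only):

```python
from collections import Counter, defaultdict

class DiscreteNaiveBayes:
    def fit(self, X, y):
        """X: list of attribute tuples, y: list of class labels (the training set S)."""
        n = len(y)
        self.class_counts = Counter(y)
        # P^(C = c_i): relative frequency of each class in S
        self.prior = {c: cnt / n for c, cnt in self.class_counts.items()}
        # P^(X_j = x_jk | C = c_i): per-class counts of each attribute value
        self.cond = defaultdict(Counter)
        for xs, c in zip(X, y):
            for j, v in enumerate(xs):
                self.cond[(c, j)][v] += 1
        return self

    def predict(self, xs):
        """MAP rule: pick the class maximizing P^(a_1|c) ... P^(a_n|c) * P^(c)."""
        def score(c):
            s = self.prior[c]
            for j, v in enumerate(xs):
                s *= self.cond[(c, j)][v] / self.class_counts[c]
            return s
        return max(self.prior, key=score)
```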
Example
• Example: Play Tennis (training set of 14 examples; data table not shown)
Example
• Learning Phase
  Outlook    Play=Yes  Play=No      Temperature  Play=Yes  Play=No
  Sunny      2/9       3/5          Hot          2/9       2/5
  Overcast   4/9       0/5          Mild         4/9       2/5
  Rain       3/9       2/5          Cool         3/9       1/5

  Humidity   Play=Yes  Play=No      Wind         Play=Yes  Play=No
  High       3/9       4/5          Strong       3/9       3/5
  Normal     6/9       1/5          Weak         6/9       2/5

  P(Play=Yes) = 9/14
  P(Play=No) = 5/14
Example
• Test Phase
  – Given a new instance,
    x' = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
  – Look up the tables:
    P(Outlook=Sunny|Play=Yes) = 2/9        P(Outlook=Sunny|Play=No) = 3/5
    P(Temperature=Cool|Play=Yes) = 3/9     P(Temperature=Cool|Play=No) = 1/5
    P(Humidity=High|Play=Yes) = 3/9        P(Humidity=High|Play=No) = 4/5
    P(Wind=Strong|Play=Yes) = 3/9          P(Wind=Strong|Play=No) = 3/5
    P(Play=Yes) = 9/14                     P(Play=No) = 5/14
  – MAP rule
    P(Yes|x') ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) = 0.0053
    P(No|x') ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) = 0.0206
    Given the fact P(Yes|x') < P(No|x'), we label x' to be "No".
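Plugging the looked-up probabilities into the MAP rule reproduces the two scores (a small check in Python):

```python
from fractions import Fraction as F

# Unnormalized posteriors for x' = (Sunny, Cool, High, Strong)
score_yes = F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9) * F(9, 14)
score_no  = F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5) * F(5, 14)

print(float(score_yes))  # ~0.0053
print(float(score_no))   # ~0.0206
print("No" if score_no > score_yes else "Yes")  # "No"
```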
Relevant Issues
• Violation of the Independence Assumption
  – For many real-world tasks, P(X_1, ..., X_n|C) ≠ P(X_1|C) ... P(X_n|C)
  – Nevertheless, naïve Bayes works surprisingly well anyway!
• Zero Conditional Probability Problem
  – If no training example contains the attribute value X_j = a_jk, then P̂(X_j = a_jk|C = c_i) = 0
  – In this circumstance, P̂(x_1|c_i) ... P̂(a_jk|c_i) ... P̂(x_n|c_i) = 0 during test
  – As a remedy, conditional probabilities are estimated with the m-estimate:
    P̂(X_j = a_jk|C = c_i) = (n_c + m p) / (n + m)
    n_c: number of training examples for which X_j = a_jk and C = c_i
    n: number of training examples for which C = c_i
    p: prior estimate (usually, p = 1/t for t possible values of X_j)
    m: weight given to the prior (number of "virtual" examples, m ≥ 1)
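A small sketch of this m-estimate as a function (the function name and the example numbers are illustrative):

```python
def m_estimate(n_c, n, p, m):
    """Smoothed estimate of P(X_j = a_jk | C = c_i).

    n_c: count of training examples with X_j = a_jk and C = c_i
    n:   count of training examples with C = c_i
    p:   prior estimate, e.g. 1/t for t possible values of X_j
    m:   weight given to the prior ("virtual" examples, m >= 1)
    """
    return (n_c + m * p) / (n + m)

# Example: an unseen value (n_c = 0) no longer forces the whole product to zero.
print(m_estimate(0, 9, 1/3, 3))  # 0.0833..., instead of 0
```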
Relevant Issues
• Continuous-valued Input Attributes
  – An attribute can take on a continuum of values
  – The conditional probability is modeled with the normal distribution:
    P̂(X_j|C = c_i) = 1/(√(2π) σ_ji) · exp(−(X_j − μ_ji)² / (2 σ_ji²))
    μ_ji: mean (average) of the attribute values X_j of the examples for which C = c_i
    σ_ji: standard deviation of the attribute values X_j of the examples for which C = c_i
  – Learning Phase: for X = (X_1, ..., X_n), C = c_1, ..., c_L
    Output: n × L normal distributions and P(C = c_i), i = 1, ..., L
  – Test Phase: for X' = (x_1, ..., x_n)
    • Calculate conditional probabilities with all the normal distributions
    • Apply the MAP rule to make a decision
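A minimal Gaussian naïve Bayes sketch along these lines, assuming per-class means and standard deviations are estimated from the training data (all names are illustrative):

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate P(C=c_i) and (mu_ji, sigma_ji) for every attribute X_j and class c_i."""
    by_class = defaultdict(list)
    for xs, c in zip(X, y):
        by_class[c].append(xs)
    prior, params = {}, {}
    for c, rows in by_class.items():
        prior[c] = len(rows) / len(y)
        params[c] = []
        for col in zip(*rows):  # values of one attribute for class c
            mu = sum(col) / len(col)
            sigma = (sum((v - mu) ** 2 for v in col) / len(col)) ** 0.5
            params[c].append((mu, sigma))  # assumes sigma > 0 for every attribute/class
    return prior, params

def gaussian(x, mu, sigma):
    """Normal density evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def predict_gaussian_nb(xs, prior, params):
    """MAP rule with Gaussian class-conditional densities."""
    return max(prior, key=lambda c: prior[c] *
               math.prod(gaussian(x, mu, s) for x, (mu, s) in zip(xs, params[c])))
```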
Conclusions
• Naïve Bayes is based on the independence assumption
  – Training is very easy and fast; it only requires considering each attribute in each class separately
  – Testing is straightforward; it only requires looking up tables or calculating conditional probabilities with normal distributions
• A popular generative model
  – Performance is competitive with most state-of-the-art classifiers even when the independence assumption is violated
  – Many successful applications, e.g., spam mail filtering
  – A good candidate as a base learner in ensemble learning
  – Apart from classification, naïve Bayes can do more...
Extra Slides
Naïve Bayes (1)
• Revisit: P(C|X) = P(X|C) P(C) / P(X)
• Which is equal to P(X_1, ..., X_n|C) P(C) / P(X)
• Naïve Bayes assumes conditional independence: P(X_1, ..., X_n|C) = P(X_1|C) ... P(X_n|C)
• Then the inference of the posterior is P(C|X) ∝ P(C) P(X_1|C) ... P(X_n|C)
Naïve Bayes (2)
• Training: the observations are multinomial; learning is supervised, with label information
  – Maximum Likelihood Estimation (MLE)
  – Maximum a Posteriori (MAP): put a Dirichlet prior on the multinomial parameters
• Classification: apply the MAP rule with the estimated probabilities
Naïve Bayes (3)
• What if we have continuous X_i?
• Generative training: estimate a class-conditional density for each X_i (e.g., the per-class normal distributions above)
• Prediction: apply the MAP rule with the estimated densities
Naïve Bayes (4)
• Problems
  – Features may overlap
  – Features may not be independent
    • e.g., the size and weight of a tiger
  – A joint distribution estimate (P(X|Y), P(Y)) is used to solve a conditional problem (P(Y|X = x))
• Can we train discriminatively? (see the sketch below)
  – Logistic regression
  – Regularization
  – Gradient ascent
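For contrast with the generative training above, a tiny sketch of discriminative training: binary logistic regression for P(Y|X), fit by gradient ascent on an L2-regularized log-likelihood (all names, the learning rate, and the regularization strength are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.1, lam=0.01, epochs=200):
    """Gradient ascent on sum_i [y_i log p_i + (1 - y_i) log(1 - p_i)] - lam/2 * ||w||^2."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [-lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xs, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xs)) + b)
            for j, xj in enumerate(xs):
                grad_w[j] += (yi - p) * xj  # d(log-likelihood)/dw_j
            grad_b += yi - p
        w = [wj + lr * g for wj, g in zip(w, grad_w)]
        b += lr * grad_b
    return w, b

def predict_prob(xs, w, b):
    """P(Y=1 | X=xs) modeled directly, with no model of P(X|Y)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xs)) + b)
```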