Naive Bayes Classifier in Machine Learning

The document discusses the Naive Bayes classifier algorithm. It is a supervised learning algorithm based on Bayes' theorem used for classification. The document explains how Naive Bayes works, provides examples of its applications, advantages and disadvantages, and includes code to implement Naive Bayes in Python.

Uploaded by

sambhvathan

Naïve Bayes Classifier Algorithm

Naïve Bayes is a supervised learning algorithm that is based on Bayes' theorem and is
used for solving classification problems.

It is mainly used in text classification, where the training dataset is high-dimensional.

The Naïve Bayes classifier is one of the simplest and most effective classification
algorithms. It helps build fast machine learning models that can make quick
predictions.

It is a probabilistic classifier, which means it predicts on the basis of the probability
of an object belonging to each class.

Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment
analysis, and classifying articles.

Why is it called Naïve Bayes?

The name Naïve Bayes combines two words, Naïve and Bayes, which can be
described as:

Naïve: It is called naïve because it assumes that the occurrence of a certain feature is
independent of the occurrence of other features. For example, if a fruit is identified on the
basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an
apple. Each feature contributes individually to identifying it as an apple, without
depending on the others.

Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
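The independence assumption described under "Naïve" can be written compactly. This is a sketch in which y denotes the class and x_1, ..., x_n the features; these symbols are introduced here for illustration and do not appear in the original text:

```latex
P(x_1, x_2, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)
\quad\Longrightarrow\quad
P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The classifier then picks the class y for which the right-hand side is largest.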

Bayes' Theorem:
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine
the probability of a hypothesis given prior knowledge. It depends on conditional
probability.

The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) * P(A) / P(B)

Where,

P(A|B) is Posterior probability: Probability of hypothesis A given the observed event B.

P(B|A) is Likelihood probability: Probability of the evidence given that the
hypothesis is true.

P(A) is Prior Probability: Probability of hypothesis before observing the evidence.

P(B) is Marginal Probability: Probability of Evidence.
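The four quantities defined above can be combined in a few lines of Python. The sketch below is illustrative only: the function names and the spam-filter numbers are hypothetical, chosen to show the arithmetic, and the marginal P(B) is obtained from the law of total probability over two mutually exclusive hypotheses.

```python
def posterior(likelihood, prior, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


def evidence_from_hypotheses(likelihoods, priors):
    """Return P(B) as the sum over hypotheses of P(B|A_i) * P(A_i)."""
    return sum(l * p for l, p in zip(likelihoods, priors))


# Hypothetical numbers, chosen only to illustrate the arithmetic:
# A = "message is spam", B = "message contains the word 'offer'".
p_spam, p_ham = 0.2, 0.8              # priors P(A) and P(not A)
p_word_spam, p_word_ham = 0.5, 0.05   # likelihoods P(B|A) and P(B|not A)

p_word = evidence_from_hypotheses([p_word_spam, p_word_ham], [p_spam, p_ham])
p_spam_given_word = posterior(p_word_spam, p_spam, p_word)
print(round(p_word, 2), round(p_spam_given_word, 3))
```

Even though only 20% of the hypothetical messages are spam, observing the word raises the posterior well above the prior, because the word is far more likely under the spam hypothesis.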

Working of Naïve Bayes' Classifier:

The working of the Naïve Bayes classifier can be understood with the help of the example below:

Suppose we have a dataset of weather conditions and a corresponding target variable "Play".
Using this dataset, we need to decide whether we should play on a particular day
according to the weather conditions. To solve this problem, we follow the steps
below:

1. Convert the given dataset into frequency tables.

2. Generate Likelihood table by finding the probabilities of given features.

3. Now, use Bayes theorem to calculate the posterior probability.

Problem: If the weather is sunny, should the player play or not?

Solution: To solve this, first consider the dataset below:

     Outlook    Play
0    Rainy      Yes
1    Sunny      Yes
2    Overcast   Yes
3    Overcast   Yes
4    Sunny      No
5    Rainy      Yes
6    Sunny      Yes
7    Overcast   Yes
8    Rainy      No
9    Sunny      No
10   Sunny      Yes
11   Rainy      No
12   Overcast   Yes
13   Overcast   Yes

Frequency table for the weather conditions:

Weather     Yes   No
Overcast    5     0
Rainy       2     2
Sunny       3     2
Total       10    4
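Step 1 (converting the dataset into a frequency table) can be sketched with the standard library alone. The variable names below are illustrative; the rows are the 14 (Outlook, Play) pairs listed in the dataset, and the totals are obtained by summing each column.

```python
from collections import Counter

# The 14 (Outlook, Play) rows from the dataset above.
rows = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

# Tally how often each (outlook, play) pair occurs.
pair_counts = Counter(rows)

frequency = {
    outlook: {"Yes": pair_counts[(outlook, "Yes")], "No": pair_counts[(outlook, "No")]}
    for outlook in ("Overcast", "Rainy", "Sunny")
}
totals = {
    label: sum(counts[label] for counts in frequency.values())
    for label in ("Yes", "No")
}

# Print the table in the same row order as the text.
for outlook, counts in frequency.items():
    print(f"{outlook:<9}{counts['Yes']:>4}{counts['No']:>4}")
print(f"{'Total':<9}{totals['Yes']:>4}{totals['No']:>4}")
```

Tallying the rows gives 10 "Yes" and 4 "No" outcomes, which add up to the 14 observations.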

Likelihood table for the weather conditions (decimals rounded to two places):

Weather     No            Yes             P(Weather)
Overcast    0             5               5/14 = 0.36
Rainy       2             2               4/14 = 0.29
Sunny       2             3               5/14 = 0.36
All         4/14 = 0.29   10/14 = 0.71

Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes)*P(Yes)/P(Sunny)

P(Sunny|Yes) = 3/10 = 0.3

P(Sunny) = 5/14 = 0.36

P(Yes) = 10/14 = 0.71

So P(Yes|Sunny) = (3/10)*(10/14)/(5/14) = 3/5 = 0.60

P(No|Sunny) = P(Sunny|No)*P(No)/P(Sunny)

P(Sunny|No) = 2/4 = 0.5

P(No) = 4/14 = 0.29

P(Sunny) = 5/14 = 0.36

So P(No|Sunny) = (2/4)*(4/14)/(5/14) = 2/5 = 0.40

The calculation shows that P(Yes|Sunny) > P(No|Sunny).

Hence on a sunny day, the player can play the game.
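The same calculation can be reproduced directly from the raw rows. The sketch below is one possible way to do it, with a hypothetical helper named posterior; it uses exact fractions, which gives 3/5 and 2/5 so that the two posteriors sum to 1, whereas rounding intermediate values to two decimals can shift the second figure slightly.

```python
from collections import Counter
from fractions import Fraction

# The 14 (Outlook, Play) rows from the dataset above.
rows = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]


def posterior(rows, outlook):
    """Return {label: P(label | outlook)} using Bayes' theorem on raw counts."""
    n = len(rows)
    label_counts = Counter(play for _, play in rows)
    outlook_count = sum(1 for o, _ in rows if o == outlook)
    p_outlook = Fraction(outlook_count, n)             # e.g. P(Sunny)
    result = {}
    for label, label_count in label_counts.items():
        joint = sum(1 for o, play in rows if o == outlook and play == label)
        likelihood = Fraction(joint, label_count)      # e.g. P(Sunny | Yes)
        prior = Fraction(label_count, n)               # e.g. P(Yes)
        result[label] = likelihood * prior / p_outlook
    return result


sunny = posterior(rows, "Sunny")
decision = max(sunny, key=sunny.get)
print({label: float(p) for label, p in sunny.items()}, "->", decision)
```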

Advantages of Naïve Bayes Classifier:

Naïve Bayes is one of the fastest and simplest ML algorithms for predicting the class of a data point.

It can be used for binary as well as multi-class classification.

It performs well in multi-class prediction compared with many other algorithms.

It is a popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:

Naive Bayes assumes that all features are independent or unrelated, so it cannot learn
the relationship between features.

Applications of Naïve Bayes Classifier:

It is used for Credit Scoring.

It is used in medical data classification.

It can be used in real-time predictions because the Naïve Bayes classifier is an eager
learner.

It is used in Text classification such as Spam filtering and Sentiment analysis.

Types of Naïve Bayes Model:

There are three types of Naive Bayes Model, which are given below:

Gaussian: The Gaussian model assumes that features follow a normal distribution. This
means if predictors take continuous values instead of discrete, then the model assumes
that these values are sampled from the Gaussian distribution.

Multinomial: The Multinomial Naïve Bayes classifier is used when the data follows a
multinomial distribution. It is primarily used for document classification problems, that is,
deciding which category a particular document belongs to, such as Sports, Politics,
or Education.
The classifier uses the frequency of words as the predictors.

Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the
predictor variables are independent Boolean variables, such as whether a particular word is
present in a document or not. This model is also well known for document classification
tasks.
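The three model types map onto three classes in scikit-learn's naive_bayes module. The sketch below fits each one on a tiny, made-up dataset purely to show which kind of input each variant expects; the feature values, the four-word vocabulary, and the topic labels are all hypothetical.

```python
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Gaussian: one continuous feature per sample (hypothetical measurements).
x_cont = [[1.0], [1.2], [0.8], [5.0], [5.2], [4.8]]
y_cont = [0, 0, 0, 1, 1, 1]
gaussian_pred = GaussianNB().fit(x_cont, y_cont).predict([[1.1], [5.1]])

# Multinomial: word counts per document for the hypothetical vocabulary
# ["goal", "match", "vote", "election"].
x_counts = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 1, 2, 3]]
y_topic = ["sports", "sports", "politics", "politics"]
multinomial_pred = MultinomialNB().fit(x_counts, y_topic).predict([[2, 1, 0, 0]])

# Bernoulli: the same documents, reduced to word present (1) or absent (0).
x_binary = [[int(c > 0) for c in row] for row in x_counts]
bernoulli_pred = BernoulliNB().fit(x_binary, y_topic).predict([[1, 1, 0, 0]])

print(gaussian_pred, multinomial_pred, bernoulli_pred)
```

All three classes share the same fit and predict interface, so switching variants mostly comes down to preparing the features in the form each model assumes.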

Python Implementation of the Naïve Bayes algorithm:

Now we will implement the Naive Bayes algorithm using Python. For this, we will use the
"user_data" dataset, which we have used in our other classification models, so that we can
easily compare the Naive Bayes model with the other models.

Steps to implement:

Data Pre-processing step

Fitting Naive Bayes to the Training set

Predicting the test result

Testing the accuracy of the result (creation of the confusion matrix)

Visualizing the test set result.

1) Data Pre-processing step:

In this step, we will pre-process the data so that we can use it efficiently in our code. It
is similar to what we did in data pre-processing. The code for this is given below:

# Importing the libraries

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset

dataset = pd.read_csv('user_data.csv')
x = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

In the above code, we have loaded the dataset into our program using
"dataset = pd.read_csv('user_data.csv')". The loaded dataset is divided into training and
test sets, and then we have scaled the feature variables.
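The scaling step calls fit_transform on the training set but only transform on the test set. A minimal pure-Python sketch of the idea follows: the mean and standard deviation are learned from the training data and then reused for the test data. The helper names and the age values are hypothetical; the population standard deviation is used because that is what StandardScaler computes.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Return the mean and population standard deviation of one feature."""
    return fmean(column), pstdev(column)


def transform(column, mean, std):
    """Standardize values using statistics that were learned elsewhere."""
    return [(value - mean) / std for value in column]


# Hypothetical ages, used only to show the mechanics.
train_age = [20.0, 30.0, 40.0, 50.0]
test_age = [35.0, 60.0]

mean, std = fit_scaler(train_age)               # learned from the training set only
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)    # reuses the training statistics
print(mean, round(std, 3), [round(v, 3) for v in test_scaled])
```

Reusing the training statistics keeps information about the test set out of the fitted model, which is why the scaler is not refitted on x_test.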

The output for the dataset is given as:

2) Fitting Naive Bayes to the Training Set:

After the pre-processing step, now we will fit the Naive Bayes model to the Training set. Below
is the code for it:

# Fitting Naive Bayes to the Training set

from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)

In the above code, we have used the GaussianNB classifier to fit the training dataset. We
can also use other classifiers as per our requirement.
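To make the fitting step less of a black box, here is a simplified from-scratch sketch of what a Gaussian Naive Bayes model estimates: a prior for each class, plus a mean and variance for each feature within each class. It is not scikit-learn's implementation. The function names and training values are hypothetical, and the fixed eps term is a simplification that only plays a role comparable to var_smoothing.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(x, y, eps=1e-9):
    """Estimate, for each class, its prior and per-feature mean and variance."""
    rows_by_class = defaultdict(list)
    for features, label in zip(x, y):
        rows_by_class[label].append(features)

    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + eps  # eps avoids a zero variance
            for col, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(x), means, variances)
    return model


def predict_gaussian_nb(model, features):
    """Return the class with the highest log prior plus summed log likelihoods."""
    def log_score(prior, means, variances):
        score = math.log(prior)
        for v, m, var in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score

    return max(model, key=lambda label: log_score(*model[label]))


# Hypothetical scaled (age, salary) pairs with labels 0 and 1.
x_demo = [[-1.0, -0.8], [-0.6, -1.1], [-0.9, -0.4], [0.9, 1.0], [1.2, 0.7], [0.7, 1.3]]
y_demo = [0, 0, 0, 1, 1, 1]

model = fit_gaussian_nb(x_demo, y_demo)
print(predict_gaussian_nb(model, [-0.8, -0.9]), predict_gaussian_nb(model, [1.0, 0.9]))
```

Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow when there are many features.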

Output:

Out[6]: GaussianNB(priors=None, var_smoothing=1e-09)

3) Prediction of the test set result:

Now we will predict the test set result. For this, we will create a new predictor variable y_pred,
and will use the predict function to make the predictions.

# Predicting the Test set results

y_pred = classifier.predict(x_test)

Output:
The output shows the prediction vector y_pred alongside the real vector y_test. We can
see that some predictions differ from the real values; these are the incorrect
predictions.
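Comparing the two vectors by eye becomes impractical for larger test sets. A small sketch of how the mismatches could be located programmatically is shown below; the two short lists are hypothetical stand-ins for y_test and y_pred, not values from the user_data dataset.

```python
# Hypothetical vectors standing in for y_test and y_pred.
y_true_demo = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred_demo = [0, 1, 1, 1, 0, 0, 0, 1]

# Positions where the predicted label differs from the true label.
wrong = [
    i
    for i, (actual, predicted) in enumerate(zip(y_true_demo, y_pred_demo))
    if actual != predicted
]
print(f"{len(wrong)} of {len(y_true_demo)} predictions differ, at positions {wrong}")
```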

4) Creating Confusion Matrix:

Now we will check the accuracy of the Naive Bayes classifier using the Confusion matrix. Below
is the code for it:

# Making the Confusion Matrix

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

Output:
As we can see in the above confusion matrix output, there are 7+3= 10 incorrect predictions,
and 65+25=90 correct predictions.
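The accuracy follows directly from these counts. In the sketch below, the diagonal holds the 65 and 25 correct predictions reported in the text. Which off-diagonal cell holds the 7 and which holds the 3 is an assumption made here for illustration, and accuracy does not depend on that placement.

```python
# Counts as reported in the text: 65 and 25 on the diagonal, 7 and 3 off it.
# The placement of 7 and 3 within the off-diagonal cells is assumed.
cm_demo = [[65, 3],
           [7, 25]]

correct = sum(cm_demo[i][i] for i in range(len(cm_demo)))   # diagonal entries
total = sum(sum(row) for row in cm_demo)                    # all test samples
accuracy = correct / total
error_rate = 1 - accuracy
print(f"accuracy = {accuracy:.2f}, error rate = {error_rate:.2f}")
```

With 90 of 100 test samples classified correctly, the accuracy is 0.90.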

5) Visualizing the training set result:

Next we will visualize the training set result using Naïve Bayes Classifier. Below is the code for it:

# Visualising the Training set results

from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
# Grid covering the feature range; the first step value is cut off in the source
# and is assumed here to be 0.01, matching the second axis
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
# Overlay the actual data points, one colour per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

In the above output we can see that the Naïve Bayes classifier has separated the data points
with a smooth boundary. The boundary is curved rather than straight because the GaussianNB
classifier models each feature with a separate Gaussian distribution for each class.
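The plotting code relies on evaluating the classifier at every point of a dense grid and reshaping the predictions back into the grid's shape. The sketch below shows that mechanism on a deliberately tiny grid. It imports NumPy as np rather than nm, and a simple threshold rule is a hypothetical stand-in for classifier.predict.

```python
import numpy as np

# A deliberately coarse grid so the shapes are easy to read.
xs = np.arange(0.0, 3.0, 1.0)    # 3 values along the first feature
ys = np.arange(0.0, 2.0, 1.0)    # 2 values along the second feature
X1, X2 = np.meshgrid(xs, ys)     # both arrays have shape (2, 3)

# One row per grid point, one column per feature: the layout predict() expects.
grid_points = np.array([X1.ravel(), X2.ravel()]).T   # shape (6, 2)

# Stand-in for classifier.predict: label 1 where the features sum to more than 2.
labels = (grid_points.sum(axis=1) > 2).astype(int)

# Reshape back to the grid so that contourf can colour each cell.
Z = labels.reshape(X1.shape)
print(grid_points.shape, Z.shape)
```

A step of 0.01 in the tutorial code simply makes this grid fine enough for the coloured regions to look continuous.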

6) Visualizing the Test set result:

# Visualising the Test set results

from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
# Grid covering the feature range; the first step value is cut off in the source
# and is assumed here to be 0.01, matching the second axis
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
# Colour each grid point by its predicted class
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
# Overlay the actual data points, one colour per class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

The above output is the final output for the test set data. As we can see, the classifier has
drawn a curved boundary to separate the "purchased" and "not purchased" classes. There are some
wrong predictions, which we counted in the confusion matrix, but it is still a fairly good classifier.
