
Mizan Tepi University

School of Computing and Informatics

Department of Information Systems


Assignment of Machine Learning

Prepared by:

No  Name              ID
01  Henok Tadesse     NSR/0932/13
02  Melkamu Yitayih   NSR/1179/13
03  Adise Adane       NSR/0124/12
04  Birhan Ayenew     NSR/0393/13
05  Tizazu Mekuant    NSR/1641/13

Submitted to: Mr. Gebreyes G.

Submission date: 09/01/2024


Table of Contents

1. Introduction to Bayes Theorem in Machine Learning
   What is Bayes Theorem?
   Prerequisites for Bayes Theorem
      1. Experiment
      2. Sample Space
      3. Event
      4. Random Variable
      5. Exhaustive Event
      6. Independent Event
      7. Conditional Probability
      8. Marginal Probability
2. What Is the Naive Bayes Algorithm?
   How Do Naive Bayes Algorithms Work?

1. Introduction to Bayes Theorem in Machine Learning

Bayes' theorem was formulated by an English statistician, philosopher, and Presbyterian minister named Thomas Bayes in the 18th century. Bayes developed his ideas in decision theory, a field that makes extensive use of probability. Bayes' theorem is also widely used in machine learning, where we need to predict classes precisely and accurately. The Bayesian method derived from Bayes' theorem is used to calculate conditional probability in machine learning applications such as classification tasks. Further, a simplified version of Bayes' theorem (Naive Bayes classification) is also used to reduce computation time and project cost.

Bayes' theorem is also known by other names such as Bayes' rule or Bayes' law. It helps determine the probability of an event under incomplete knowledge: it is used to calculate the probability of one event occurring given that another event has already occurred, and it relates conditional probability to marginal probability.

In simple words, we can say that Bayes' theorem helps produce more accurate results.

Bayes' theorem is used to estimate the precision of values and provides a method for calculating conditional probability. Although it is a deceptively simple calculation, it can be used to compute the conditional probability of events where intuition often fails. Some data scientists assume that Bayes' theorem is used mostly in the financial industry, but that is not the case: beyond finance, it is also applied extensively in health and medicine, research and survey work, the aeronautical sector, and elsewhere.

What is Bayes Theorem?

Bayes' theorem is one of the most popular machine learning concepts. It helps calculate the probability of one event occurring, under uncertain knowledge, given that another event has already occurred.

Bayes' theorem can be derived using the product rule and the conditional probability of event X given a known event Y.

According to the product rule, the probability of event X together with a known event Y can be expressed as follows:

P(X ∩ Y) = P(X|Y) P(Y)    (equation 1)

Further, the probability of event Y together with a known event X is:

P(X ∩ Y) = P(Y|X) P(X)    (equation 2)

Mathematically, Bayes' theorem is obtained by equating the right-hand sides of the two equations and dividing by P(Y):

P(X|Y) = P(Y|X) P(X) / P(Y)

Note that X and Y need not be independent here; the theorem relates the two conditional probabilities for any pair of events with P(Y) > 0.

The above equation is called Bayes' rule or Bayes' theorem.

P(X|Y) is called the posterior, which is what we need to calculate. It is the updated probability after considering the evidence.

P(Y|X) is called the likelihood. It is the probability of the evidence given that the hypothesis is true.

P(X) is called the prior probability: the probability of the hypothesis before considering the evidence.

P(Y) is called the marginal probability: the probability of the evidence under any consideration.

Hence, Bayes Theorem can be written as:

posterior = likelihood * prior / evidence
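
As a quick sanity check, here is a minimal Python sketch of this formula. The function name and the example numbers are invented purely for illustration:

    # Minimal sketch of Bayes' rule: posterior = likelihood * prior / evidence.
    def bayes_posterior(prior, likelihood, evidence):
        """Return P(X|Y) = P(Y|X) * P(X) / P(Y)."""
        return likelihood * prior / evidence

    # Made-up numbers: P(X) = 0.3, P(Y|X) = 0.8, P(Y) = 0.5
    print(bayes_posterior(prior=0.3, likelihood=0.8, evidence=0.5))  # 0.48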

Prerequisites for Bayes Theorem


While studying Bayes' theorem, we need to understand a few important concepts. These are as follows:

1. Experiment

An experiment is a planned operation carried out under controlled conditions, such as tossing a coin, drawing a card, or rolling a die.

2. Sample Space

The results we get during an experiment are called outcomes, and the set of all possible outcomes of an experiment is known as the sample space. For example, if we are rolling a die, the sample space will be:

S1 = {1, 2, 3, 4, 5, 6}

Similarly, if our experiment consists of tossing a coin and recording the outcome, the sample space will be:

S2 = {Head, Tail}

3. Event

An event is defined as a subset of the sample space of an experiment; it is also called a set of outcomes.

Assume that in our experiment of rolling a die there are two events A and B such that:

A = event that an even number is obtained = {2, 4, 6}

B = event that a number greater than 4 is obtained = {5, 6}

Probability of event A: P(A) = number of favourable outcomes / total number of possible outcomes
P(A) = 3/6 = 1/2 = 0.5

Similarly, probability of event B: P(B) = number of favourable outcomes / total number of possible outcomes
P(B) = 2/6 = 1/3 ≈ 0.333

Union of events A and B:
A ∪ B = {2, 4, 5, 6}

Intersection of events A and B:
A ∩ B = {6}

Disjoint events: if the intersection of events A and B is the empty set, the events are known as disjoint, or mutually exclusive, events.
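
These set operations are easy to check in code. The following Python sketch mirrors the dice example above (the variable names are illustrative):

    # The die experiment above, modelled with Python sets.
    sample_space = {1, 2, 3, 4, 5, 6}
    A = {2, 4, 6}   # even number obtained
    B = {5, 6}      # number greater than 4

    print(A | B)                        # union: {2, 4, 5, 6}
    print(A & B)                        # intersection: {6}
    print(len(A) / len(sample_space))   # P(A) = 0.5
    print(len(B) / len(sample_space))   # P(B) = 0.333...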

4. Random Variable

A random variable is a real-valued function that maps the sample space of an experiment onto the real line. A random variable takes on values, each with some probability. Strictly speaking, it is neither random nor a variable; it behaves as a function, and it can be discrete, continuous, or a combination of both.

5. Exhaustive Event

As the name suggests, a set of events is called exhaustive for an experiment if at least one of the events must occur each time the experiment is performed.

Thus, two events A and B are exhaustive if either A or B definitely occurs; if they also cannot occur together, they are mutually exclusive as well. For example, when tossing a coin, the result will be either a Head or a Tail.

6. Independent Event

Two events are said to be independent when the occurrence of one event does not affect the occurrence of the other. In simple words, the probability of the outcome of one event does not depend on the other.

Mathematically, two events A and B are independent if:

P(A ∩ B) = P(AB) = P(A) * P(B)
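
A quick numeric check of this definition, using the die experiment from above and a second event C = {1, 2} chosen for illustration:

    # Independence check for one roll of a die: A = even number, C = {1, 2}.
    from fractions import Fraction

    S = {1, 2, 3, 4, 5, 6}
    A = {2, 4, 6}
    C = {1, 2}

    def p(E):
        return Fraction(len(E), len(S))

    print(p(A & C) == p(A) * p(C))  # True: P(A ∩ C) = 1/6 = P(A) * P(C)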

7. Conditional Probability

Conditional probability is defined as the probability of an event A given that another event B has already occurred (i.e., A given B). It is denoted P(A|B) and defined as:

P(A|B) = P(A ∩ B) / P(B)
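
Continuing the die example, the sketch below computes P(A|B) for the events A (even number) and B (number greater than 4) defined earlier:

    # P(A|B) = P(A ∩ B) / P(B); with equally likely outcomes this
    # reduces to counting elements of the sets.
    A = {2, 4, 6}
    B = {5, 6}

    p_A_given_B = len(A & B) / len(B)  # (1/6) / (2/6)
    print(p_A_given_B)  # 0.5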

8. Marginal Probability

Marginal probability is defined as the probability of an event A irrespective of the outcome of another event B. It is also viewed as the probability of the evidence under any consideration:

P(A) = P(A|B) * P(B) + P(A|~B) * P(~B)

Here ~B represents the event that B does not occur.
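
A small numerical sketch of this identity; the probabilities are made up for illustration:

    # Law of total probability with illustrative numbers.
    p_B = 0.4             # P(B)
    p_A_given_B = 0.7     # P(A|B)
    p_A_given_notB = 0.2  # P(A|~B)

    p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)
    print(p_A)  # 0.7*0.4 + 0.2*0.6 = 0.4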

2. What Is the Naive Bayes Algorithm?

Naive Bayes is a classification technique based on Bayes' theorem with an independence assumption among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

The Naive Bayes classifier is a popular supervised machine learning algorithm used for classification tasks such as text classification. It belongs to the family of generative learning algorithms, which means that it models the distribution of inputs for a given class or category. This approach is based on the assumption that the features of the input data are conditionally independent given the class, allowing the algorithm to make predictions quickly and accurately.

In statistics, Naive Bayes classifiers are considered simple probabilistic classifiers that apply Bayes' theorem, which gives the probability of a hypothesis from the data and some prior knowledge. The Naive Bayes classifier assumes that all features in the input data are independent of each other, which is often not true in real-world scenarios. Despite this simplifying assumption, however, the Naive Bayes classifier is widely used because of its efficiency and good performance in many real-world applications.

Moreover, it is worth noting that Naive Bayes classifiers are among the simplest Bayesian network models, yet they can achieve high accuracy when coupled with kernel density estimation. This technique uses a kernel function to estimate the probability density function of the input data, allowing the classifier to improve its performance in complex scenarios where the data distribution is not well defined. As a result, the Naive Bayes classifier is a powerful tool in machine learning, particularly in text classification, spam filtering, and sentiment analysis, among other applications.
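
To make the text-classification use case concrete, here is a hedged sketch using scikit-learn's multinomial Naive Bayes; the tiny corpus and its spam/ham labels are invented for illustration:

    # Tiny spam-filtering sketch with scikit-learn's multinomial Naive Bayes.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    texts = ["win a free prize now", "meeting agenda attached",
             "claim your free money", "project status report"]
    labels = ["spam", "ham", "spam", "ham"]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(texts)  # word-count features

    clf = MultinomialNB()
    clf.fit(X, labels)

    print(clf.predict(vectorizer.transform(["free money prize"])))  # ['spam']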

For example, a fruit may be considered an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or on the existence of the other features, all of these properties independently contribute to the probability that this fruit is an apple, and that is why the method is known as 'Naive'.

A Naive Bayes model is easy to build and particularly useful for very large data sets. Along with its simplicity, Naive Bayes can outperform even highly sophisticated classification methods.

Bayes' theorem provides a way of computing the posterior probability P(c|x) from P(c), P(x), and P(x|c):

P(c|x) = P(x|c) P(c) / P(x)

Here:

P(c|x) is the posterior probability of the class (c, target) given the predictor (x, attributes).

P(c) is the prior probability of the class.

P(x|c) is the likelihood, the probability of the predictor given the class.

P(x) is the prior probability of the predictor.

How Do Naive Bayes Algorithms Work?

Let's understand it using an example. Below we have a training data set of weather conditions and the corresponding target variable 'Play' (suggesting the possibility of playing). Now we need to classify whether players will play or not based on the weather condition. Let's follow the steps below.

Convert the data set into a frequency table

In this first step, the data set is converted into a frequency table. Counting the 14 training records by weather condition and outcome gives:

Weather    | No | Yes | Total
Sunny      |  2 |  3  |   5
Overcast   |  0 |  4  |   4
Rainy      |  3 |  2  |   5
Total      |  5 |  9  |  14

Create a likelihood table by finding the probabilities

From the frequency table, compute the likelihoods: for example, the probability of Overcast is 4/14 = 0.29 and the probability of playing is 9/14 = 0.64. Likewise, P(Sunny) = 5/14 = 0.36, P(Rainy) = 5/14 = 0.36, and P(No) = 5/14 = 0.36.

Use the Naive Bayes equation to calculate the posterior probability

Now, use the Naive Bayes equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.

Problem: Players will play if the weather is sunny. Is this statement correct?

We can solve it using the method of posterior probability discussed above:

P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)

Here P(Sunny|Yes) * P(Yes) is the numerator, and P(Sunny) is the denominator.

From the likelihood table, P(Sunny|Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.

So P(Yes|Sunny) = 0.33 * 0.64 / 0.36 = 0.60.

For the other class:

P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)

with P(Sunny|No) = 2/5 = 0.40 and P(No) = 5/14 = 0.36, so

P(No|Sunny) = 0.40 * 0.36 / 0.36 = 0.40.

As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny). Hence, on a sunny day, the players can play the game.
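
As a check, the following Python sketch reproduces these numbers by counting frequencies directly. The 14-row weather/play data set written out below is the standard example assumed above, since the original table image is not reproduced in the text:

    # Assumed 14-record weather data set (Sunny: 3 Yes / 2 No,
    # Overcast: 4 Yes / 0 No, Rainy: 2 Yes / 3 No).
    weather = ["Sunny", "Overcast", "Rainy", "Sunny", "Sunny", "Overcast",
               "Rainy", "Rainy", "Sunny", "Rainy", "Sunny", "Overcast",
               "Overcast", "Rainy"]
    play = ["No", "Yes", "Yes", "Yes", "Yes", "Yes", "No",
            "No", "Yes", "Yes", "No", "Yes", "Yes", "No"]

    n = len(weather)
    p_yes = play.count("Yes") / n         # 9/14 = 0.64
    p_no = play.count("No") / n           # 5/14 = 0.36
    p_sunny = weather.count("Sunny") / n  # 5/14 = 0.36

    # Likelihoods P(Sunny|Yes) and P(Sunny|No) from the frequency counts.
    sunny_yes = sum(w == "Sunny" and p == "Yes" for w, p in zip(weather, play))
    sunny_no = sum(w == "Sunny" and p == "No" for w, p in zip(weather, play))
    p_sunny_given_yes = sunny_yes / play.count("Yes")  # 3/9 = 0.33
    p_sunny_given_no = sunny_no / play.count("No")     # 2/5 = 0.40

    # Posteriors via Bayes' theorem.
    print(round(p_sunny_given_yes * p_yes / p_sunny, 2))  # P(Yes|Sunny) = 0.6
    print(round(p_sunny_given_no * p_no / p_sunny, 2))    # P(No|Sunny) = 0.4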

Naive Bayes uses a similar method to predict the probabilities of different classes based on various attributes. The algorithm is mostly used in text classification and in problems with multiple classes.

