Logistic Regression and Naive Bayes
o Logistic Regression is one of the most popular Machine Learning algorithms, which comes under the Supervised
Learning technique. It is used for predicting the categorical dependent variable using a given set of independent
variables.
o Logistic Regression predicts the output of a categorical dependent variable. Therefore, the outcome must be a
categorical or discrete value, such as Yes or No, 0 or 1, True or False. But instead of giving the exact values
0 and 1, it gives probabilistic values which lie between 0 and 1.
o Logistic Regression is very similar to Linear Regression except in how the two are used: Linear Regression is
used for solving regression problems, whereas Logistic Regression is used for solving classification
problems.
o In Logistic Regression, instead of fitting a straight regression line, we fit an "S"-shaped logistic function,
whose output is bounded between the two extreme values 0 and 1.
o The curve from the logistic function indicates the likelihood of something, such as whether cells are cancerous
or not, or whether a mouse is obese or not based on its weight.
o Logistic Regression is a significant machine learning algorithm because it has the ability to provide probabilities
and to classify new data using both continuous and discrete datasets.
o Logistic Regression can be used to classify observations using different types of data and can easily determine
the most effective variables for the classification.
Note: Logistic Regression uses the concept of predictive modeling, as regression does; therefore, it is called Logistic
Regression. But because it is used to classify samples, it falls under the classification algorithms.
o The sigmoid function is a mathematical function used to map the predicted values to probabilities.
o It maps any real value into another value within a range of 0 and 1.
o The output of Logistic Regression must be between 0 and 1 and cannot go beyond this limit, so it forms a
curve like the "S" form. This S-shaped curve is called the sigmoid function or the logistic function.
o In Logistic Regression, we use the concept of a threshold value, which defines the cutoff between class 0 and
class 1: values above the threshold tend to 1, and values below the threshold tend to 0.
o The linear equation gives y = b0 + b1x1 + b2x2 + ... + bnxn, but in Logistic Regression y can be between 0 and 1
only, so we form the odds by dividing y by (1 - y): y / (1 - y).
o But we need a range between -infinity and +infinity, so we take the logarithm of the equation, and it becomes:
log[y / (1 - y)] = b0 + b1x1 + b2x2 + ... + bnxn
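The mapping between the log-odds and a probability can be sketched in a few lines. This is a minimal illustration of the sigmoid and its inverse (the log-odds), together with the threshold rule described above; the specific input values are arbitrary examples.

```python
import math

def sigmoid(z):
    """Map any real value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse of the sigmoid: map a probability in (0, 1) back to (-inf, +inf)."""
    return math.log(p / (1.0 - p))

# The sigmoid squashes the whole real line into (0, 1):
print(sigmoid(0))            # exactly 0.5, the midpoint of the S-curve
print(sigmoid(-10), sigmoid(10))  # values very close to 0 and 1

# A threshold (commonly 0.5) turns the probability into a class label.
label = 1 if sigmoid(2.0) >= 0.5 else 0
print(label)
```

Note that `log_odds(sigmoid(z))` recovers `z`, which is exactly the "take the logarithm of the odds" step in the derivation above.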
On the basis of the categories, Logistic Regression can be classified into three types:
o Binomial: In binomial Logistic Regression, there can be only two possible types of the dependent variable, such
as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial Logistic Regression, there can be 3 or more possible unordered types of the
dependent variable, such as "cat", "dog", or "sheep".
o Ordinal: In ordinal Logistic Regression, there can be 3 or more possible ordered types of the dependent variable,
such as "Low", "Medium", or "High".
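The binomial case can be sketched end to end without any libraries. The following is a minimal illustration, not a production implementation: it fits b0 and b1 by stochastic gradient descent on a small, entirely hypothetical dataset (hours studied vs. pass/fail), then classifies with the 0.5 threshold.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical binomial data: hours studied (x) vs. pass (1) / fail (0).
X = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 5.0]
y = [0,   0,   0,   0,   1,   1,   1,   1]

# Fit y ≈ sigmoid(b0 + b1*x) by gradient descent on the log-loss.
b0, b1, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    for xi, yi in zip(X, y):
        p = sigmoid(b0 + b1 * xi)       # predicted probability
        b0 -= lr * (p - yi)             # gradient of log-loss w.r.t. b0
        b1 -= lr * (p - yi) * xi        # gradient of log-loss w.r.t. b1

def predict(x):
    # Threshold the probability at 0.5 to get a class label.
    return 1 if sigmoid(b0 + b1 * x) >= 0.5 else 0

print(predict(1.0), predict(4.0))
```

The multinomial and ordinal variants generalize this idea to more than two classes but follow the same fit-then-threshold pattern.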
Naïve Bayes Classifier Algorithm
o The Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes' theorem and is used
for solving classification problems.
o It is mainly used in text classification that includes a high-dimensional training dataset.
o The Naïve Bayes Classifier is one of the simplest and most effective classification algorithms, which helps in
building fast machine learning models that can make quick predictions.
o It is a probabilistic classifier, which means it predicts on the basis of the probability of an object belonging to
each class.
o Some popular examples of the Naïve Bayes algorithm are spam filtration, sentiment analysis, and classifying
articles.
The Naïve Bayes algorithm is comprised of two words, Naïve and Bayes, which can be described as:
o Naïve: It is called Naïve because it assumes that the occurrence of a certain feature is independent of the
occurrence of other features. For example, if a fruit is identified on the basis of colour, shape, and taste, then a
red, spherical, and sweet fruit is recognized as an apple. Hence each feature individually contributes to
identifying it as an apple, without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
Bayes' Theorem:
o Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine the probability of a
hypothesis with prior knowledge. It depends on the conditional probability.
o The formula for Bayes' theorem is given as:
P(A|B) = P(B|A) * P(A) / P(B)
Where,
P(A|B) is Posterior probability: the probability of hypothesis A given the observed evidence B.
P(B|A) is Likelihood probability: the probability of the evidence given that the hypothesis is true.
P(A) is Prior probability: the probability of the hypothesis before observing the evidence.
P(B) is Marginal probability: the probability of the evidence.
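Bayes' theorem is easy to check with a small worked calculation. The numbers below are hypothetical, chosen only to illustrate the formula: a prior P(spam), a likelihood P("free" | spam), and the marginal P("free") computed by total probability.

```python
# Hypothetical probabilities for a spam-filtering example.
p_spam = 0.2             # P(A): prior probability a message is spam
p_free_given_spam = 0.6  # P(B|A): likelihood "free" appears given spam
p_free_given_ham = 0.05  # P(B|not A): likelihood "free" appears given not spam

# P(B): marginal probability of the evidence, by the law of total probability.
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(p_spam_given_free)
```

Observing the word "free" raises the probability of spam from the prior 0.2 to 0.75, which is exactly the "updating a hypothesis with prior knowledge" described above.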
Working of the Naïve Bayes' Classifier can be understood with the help of the below example:
Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this dataset, we need
to decide whether we should play or not on a particular day according to the weather conditions. To solve this problem,
we need to follow the below steps:
1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Use Bayes' theorem to calculate the posterior probability of each class and pick the class with the higher posterior.
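The steps above can be sketched on a hypothetical version of the weather dataset (the 14 rows below are illustrative, using only the Outlook feature). The code builds the frequency tables, derives likelihoods from them, and compares the Bayes numerators for the two classes; the denominator P(outlook) is the same for both classes, so it can be omitted when comparing.

```python
from collections import Counter

# Hypothetical weather dataset: (outlook, play) pairs.
data = [
    ("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
    ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "No"),
    ("Sunny", "Yes"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Overcast", "Yes"), ("Rainy", "No"),
]

# Step 1: frequency tables.
play_counts = Counter(play for _, play in data)   # counts of Yes / No
joint_counts = Counter(data)                      # counts of (outlook, play)

def posterior(outlook, play):
    # Step 2: likelihood P(outlook | play) from the frequency tables.
    # Step 3: Bayes numerator P(outlook | play) * P(play); the common
    # denominator P(outlook) is dropped since we only compare classes.
    prior = play_counts[play] / len(data)
    likelihood = joint_counts[(outlook, play)] / play_counts[play]
    return likelihood * prior

# Decide whether to play on a Sunny day.
decision = "Yes" if posterior("Sunny", "Yes") > posterior("Sunny", "No") else "No"
print(decision)
```

With this sample data, Sunny days are mostly "No", so the classifier decides not to play on a Sunny day.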
There are three types of Naive Bayes Model, which are given below:
o Gaussian: The Gaussian model assumes that the features follow a normal distribution. This means that if the
predictors take continuous values instead of discrete ones, the model assumes these values are sampled from a
Gaussian distribution.
o Multinomial: The Multinomial Naïve Bayes classifier is used when the data is multinomially distributed. It is
primarily used for document classification problems, i.e., deciding which category a particular document belongs
to, such as Sports, Politics, or Education.
The classifier uses the frequency of words as the predictors.
o Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor variables are
independent Boolean variables, such as whether a particular word is present or not in a document. This model is
also well known for document classification tasks.
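The Bernoulli variant can be sketched from scratch. This is a minimal illustration on a tiny, entirely hypothetical corpus: each document is reduced to the set of vocabulary words it contains, and, unlike the Multinomial model, the score includes a term for every word that is *absent* as well as present. Laplace smoothing keeps unseen combinations from producing zero probabilities.

```python
import math

# Hypothetical toy corpus: word-presence sets over a tiny vocabulary.
vocab = ["goal", "match", "vote", "election"]
docs = [
    ({"goal", "match"}, "Sports"),
    ({"match", "goal"}, "Sports"),
    ({"vote", "election"}, "Politics"),
    ({"election"}, "Politics"),
]

def train(docs, vocab):
    """Estimate the prior and per-word presence probability for each class."""
    model = {}
    for c in {label for _, label in docs}:
        c_docs = [words for words, label in docs if label == c]
        prior = len(c_docs) / len(docs)
        # Laplace smoothing: add 1 to the presence count, 2 to the total.
        probs = {w: (sum(w in d for d in c_docs) + 1) / (len(c_docs) + 2)
                 for w in vocab}
        model[c] = (prior, probs)
    return model

def predict(model, words, vocab):
    def log_post(c):
        prior, probs = model[c]
        score = math.log(prior)
        for w in vocab:  # Bernoulli NB scores absence as well as presence
            p = probs[w]
            score += math.log(p) if w in words else math.log(1 - p)
        return score
    return max(model, key=log_post)

model = train(docs, vocab)
print(predict(model, {"goal"}, vocab))
```

Swapping the presence/absence term for word-frequency counts would turn this sketch into the Multinomial variant described above.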