ML-Unit-2
Machine Learning?
An older, informal definition: "The field of study that gives computers the ability to learn without being explicitly programmed."
A more formal definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Machine learning algorithms help computer systems learn without being explicitly programmed. These algorithms are broadly categorized as supervised, unsupervised or reinforcement learning. Let us now see a few algorithms −
Supervised learning is the most commonly used kind of machine learning. It is called supervised because the process of the algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. In this kind of ML algorithm, the possible outcomes are already known and the training data is labeled with the correct answers. It can be understood as follows −
Suppose we have input variables x and an output variable y, and we apply an algorithm to learn the mapping function from the input to the output, such as −
y = f(x)
Now, the main goal is to approximate the mapping function so well that when we have new input data x, we can predict the output variable y for that data.
Supervised learning problems can mainly be divided into the following two kinds of problems −
Classification − A problem is called a classification problem when the output is a category, such as "spam" or "not spam".
Regression − A problem is called a regression problem when the output is a real value, such as "distance" or "weight in kilograms".
Decision tree, random forest, KNN and logistic regression are examples of supervised machine learning algorithms.
As the name suggests, these kinds of machine learning algorithms do not have any supervisor to provide any sort of guidance. That is why unsupervised machine learning algorithms are closely aligned with what some call true artificial intelligence. It can be understood as follows −
Suppose we have an input variable x; then there will be no corresponding output variable, as there is in supervised learning algorithms.
In simple words, we can say that in unsupervised learning there is no correct answer and no teacher for guidance. The algorithms help to discover interesting patterns in the data.
Unsupervised learning problems can be divided into the following two kinds of problems −
Clustering − grouping data points so that points in the same group are more similar to each other than to points in other groups.
Association − discovering rules that describe large portions of the data, such as items that are frequently bought together.
K-means for clustering and the Apriori algorithm for association are examples of unsupervised machine learning algorithms.
These kinds of machine learning algorithms are used less frequently. These algorithms train systems to make specific decisions. Basically, the machine is exposed to an environment where it trains itself continually using trial and error. These algorithms learn from past experience and try to capture the best possible knowledge to make accurate decisions. The Markov Decision Process is an example of a framework used in reinforcement machine learning.
1. Linear Regression
2. Logistic Regression
3. Decision Tree
4. Support Vector Machine (SVM)
5. Naïve Bayes
6. K-nearest neighbor (KNN)
7. K-means clustering
8. Random Forest
1. Linear Regression
Basic concept − linear regression is a linear model that assumes a linear relationship between the input variables, say x, and the single output variable, say y. In other words, we can say that y can be calculated from a linear combination of the input variables x. The relationship between the variables is established by fitting a best-fit line.
Linear regression is mainly used to estimate real values based on continuous variable(s). For example, the total sales of a shop in a day can be estimated by linear regression from continuous input values.
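As a rough illustration (not from the original notes), the following minimal scikit-learn sketch fits a best line on made-up daily-sales figures; the numbers and their meanings are invented purely for demonstration.

```python
# Minimal linear regression sketch with scikit-learn (illustrative, made-up data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: number of customers in a day (x) vs. total sale of the shop (y).
X = np.array([[50], [80], [110], [140], [170]])   # input variable x
y = np.array([500, 760, 1050, 1330, 1600])        # output variable y

model = LinearRegression()
model.fit(X, y)                                   # fit the best line y = a*x + b

print("slope a:", model.coef_[0], "intercept b:", model.intercept_)
print("predicted sale for 200 customers:", model.predict([[200]])[0])
```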
2. Logistic Regression
Logistic regression is mainly a classification algorithm that is used to estimate discrete values such as 0 or 1, true or false, yes or no, based on a given set of independent variables. Basically, it predicts a probability, so its output lies between 0 and 1.
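A minimal sketch of this idea with scikit-learn, using an invented hours-studied vs. pass/fail dataset, shows how the predicted probability lies between 0 and 1:

```python
# Minimal logistic regression sketch with scikit-learn (illustrative, made-up data).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied (independent variable) vs. pass/fail label (0 or 1).
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns a probability between 0 and 1 for each class
print(clf.predict_proba([[3.5]]))   # [[P(fail), P(pass)]]
print(clf.predict([[3.5]]))         # discrete output: 0 or 1
```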
3. NAIVE BAYES CLASSIFIER
1. Introduction
Naive Bayes is a probabilistic machine learning algorithm that can be used in a wide variety of classification tasks. Typical applications include filtering spam, classifying documents and sentiment prediction.
The name naive is used because it assumes the features that go into the model are independent of each other. That is, changing the value of one feature does not directly influence or change the value of any of the other features used in the algorithm.
2. The Bayes Rule
The Bayes Rule is a way of going from P(X|Y), known from the training dataset, to find P(Y|X). In its general form, for two events A and B,
P(A|B) = P(B|A) * P(A) / P(B)
To apply it here, we replace A and B in the formula with the feature X and the response Y.
For observations in test or scoring data, X is known while Y is unknown. So for each row of the test dataset, we compute the probability of Y given that X has already happened.
3. The Naive Bayes
The Bayes Rule provides the formula for the probability of Y given X. But in real-world problems, we typically have multiple X variables.
When the features are independent, we can extend the Bayes Rule to what is called Naive Bayes:
P(Y=c | X1, ..., Xn) = [P(X1|Y=c) * P(X2|Y=c) * ... * P(Xn|Y=c)] * P(Y=c) / P(X1, ..., Xn)
It is called 'Naive' because of the naive assumption that the X's are independent of each other. Regardless of its name, it is a powerful formula.
In technical jargon, the left-hand side (LHS) of the equation is understood as the posterior probability, or simply the posterior.
The first term is called the 'Likelihood of Evidence'. It is nothing but the conditional probability of each X given that Y is of a particular class 'c'.
Since all the X's are assumed to be independent of each other, we can simply multiply the 'likelihoods' of all the X's and call the product the 'Probability of likelihood of evidence'. This is known from the training dataset by filtering records where Y=c.
The second term is called the prior, which is the overall probability of Y=c, where c is a class of Y. In simpler terms, Prior = count(Y=c) / n_records.
Example
Let's say we have data on 1000 pieces of fruit. Each fruit is a Banana, an Orange or some Other fruit, and we know 3 features of each fruit: whether it is Long or not, Sweet or not, and Yellow or not, as summarized below.
Out of 500 bananas, 400 (0.8) are Long, 350 (0.7) are Sweet and 450 (0.9) are Yellow.
Out of 300 oranges, 0 are Long, 150 (0.5) are Sweet and 300 (1.0) are Yellow.
Out of the remaining 200 fruits, 100 (0.5) are Long, 150 (0.75) are Sweet and 50 (0.25) are Yellow.
This should provide enough evidence to predict the class of another fruit as it is introduced.
So let's say we are given the features of a piece of fruit and we need to predict its class. If we are told that the additional fruit is Long, Sweet and Yellow, we can classify it by computing, for each class,
P(Class | Long, Sweet, Yellow) ∝ P(Long | Class) * P(Sweet | Class) * P(Yellow | Class) * P(Class)
The class with the highest probability (score) is the winner.
Banana: P(Banana | Long, Sweet, Yellow) ∝ 0.8 * 0.7 * 0.9 * 0.5 = 0.252
Orange: P(Orange | Long, Sweet, Yellow) ∝ 0 * 0.5 * 1.0 * 0.3 = 0
Other Fruit: P(Other | Long, Sweet, Yellow) ∝ 0.5 * 0.75 * 0.25 * 0.2 = 0.01875
In this case, based on the highest score (0.252 for Banana), we can conclude that this Long, Sweet and Yellow fruit is, in fact, a Banana.
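As a quick check, the same scores can be reproduced in a few lines of Python; the class priors 0.5, 0.3 and 0.2 come directly from the 500/300/200 split above.

```python
# Reproducing the fruit example by hand (all numbers taken from the counts above).
priors = {"Banana": 500 / 1000, "Orange": 300 / 1000, "Other": 200 / 1000}
likelihoods = {
    "Banana": {"Long": 0.8, "Sweet": 0.7,  "Yellow": 0.9},
    "Orange": {"Long": 0.0, "Sweet": 0.5,  "Yellow": 1.0},
    "Other":  {"Long": 0.5, "Sweet": 0.75, "Yellow": 0.25},
}

# Score each class for a fruit that is Long, Sweet and Yellow
scores = {}
for fruit in priors:
    score = priors[fruit]
    for feature in ("Long", "Sweet", "Yellow"):
        score *= likelihoods[fruit][feature]
    scores[fruit] = score

print(scores)  # Banana has the highest score (approximately 0.252), so Banana wins
```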
4. Decision Tree
Decision Trees are a class of very powerful machine learning models capable of achieving high accuracy in many tasks while being highly interpretable. What makes decision trees special in the realm of ML models is their clarity of information representation. The "knowledge" learned by a decision tree through training is directly formulated into a hierarchical structure. This structure holds and displays the knowledge in such a way that it can easily be understood, even by non-experts.
The decision tree algorithm falls under the category of supervised learning. Decision trees can be used to solve both regression and classification problems.
A decision tree uses a tree representation to solve the problem, in which each leaf node corresponds to a class label and attributes are represented on the internal nodes of the tree.
We can represent any boolean function on discrete attributes using a decision tree.
1. Iterative Dichotomiser 3 (ID3): This algorithm uses Information Gain to decide which attribute is to be used to classify the current subset of the data. For each level of the tree, information gain is calculated for the remaining data recursively.
2. C4.5: This algorithm is the successor of the ID3 algorithm. This algorithm uses either
Information gain or Gain ratio to decide upon the classifying attribute. It is a direct improvement
from the ID3 algorithm as it can handle both continuous and missing attribute values.
3. Classification and Regression Tree (CART): It is a dynamic learning algorithm which can
produce a regression tree as well as a classification tree depending upon the dependent variable.
ID3 Algorithm:
ID3 is a greedy algorithm that grows the tree top-down, at each node selecting the attribute that
best classifies the local training examples. This process continues until the tree perfectly
classifies the training examples or until all attributes have been used.
1. Entropy
2. Information Gain
For a training set S containing positive and negative examples, entropy is defined as
Entropy(S) = - p(+) log2 p(+) - p(-) log2 p(-)
where p(+) and p(-) are the proportions of positive and negative examples in S. The Information Gain of an attribute A relative to S is
Gain(S, A) = Entropy(S) - Σ_v (|Sv| / |S|) * Entropy(Sv)
where the sum runs over the values v of attribute A and Sv is the subset of S for which A has value v.
a. Entropy is 0 if all the members of S belong to the same class.
b. Entropy is 1 when the sample contains an equal number of positive and negative examples.
c. If the sample contains an unequal number of positive and negative examples, the entropy is between 0 and 1.
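The two quantities can be written as short Python helpers; this is only a minimal sketch (labels as a list, attributes addressed by column index, helper names entropy and information_gain chosen here for illustration), not a full ID3 implementation.

```python
# Minimal sketch of the two ID3 quantities: entropy and information gain.
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the classes present in S."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Gain(S, A) = Entropy(S) - sum(|Sv|/|S| * Entropy(Sv)) over values v of A."""
    n = len(labels)
    total = entropy(labels)
    # group the labels by the value the attribute takes in each row
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute_index], []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return total - remainder

# Tiny illustrative check: a perfectly mixed sample has entropy 1
print(entropy(["+", "+", "-", "-"]))                          # 1.0
print(information_gain([["sunny"], ["sunny"], ["rain"], ["rain"]],
                       ["+", "+", "-", "-"], 0))              # 1.0: attribute separates classes fully
```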
5. K NEAREST NEIGHBOR
The K-nearest neighbor classifier is one of the introductory supervised classifiers that every data science learner should be aware of.
For simplicity, this classifier is usually called the KNN classifier; k-nearest neighbor is mostly written as KNN, even in many research papers. KNN addresses pattern recognition problems and is also one of the best choices for some classification-related tasks.
The simple version of the K-nearest neighbor classifier algorithm is to predict the target label by finding the nearest neighbor class. The closest class is identified using distance measures like the Euclidean distance.
Before diving into the k-nearest neighbor classification process, let's understand an application-oriented example where we can use the KNN algorithm.
1. Calculate d(x, xi) for i = 1, 2, ..., n, where d denotes the Euclidean distance between the points.
2. Arrange the calculated n Euclidean distances in non-decreasing order.
3. Let k be a positive integer; take the first k distances from this sorted list.
4. Find the k points corresponding to these k distances.
5. Let ki denote the number of points belonging to the ith class among the k points, i.e. ki ≥ 0.
6. If ki > kj for all i ≠ j, then put x in class i.
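A minimal Python sketch of these six steps, with a made-up two-class 2-D dataset, might look as follows (the helper name knn_predict is purely illustrative):

```python
# Minimal sketch of the k-NN procedure above: Euclidean distance plus majority vote.
import math
from collections import Counter

def knn_predict(training_points, training_labels, x, k=3):
    # Steps 1-2: compute d(x, xi) for every training point and sort in non-decreasing order
    distances = sorted(
        (math.dist(x, xi), label) for xi, label in zip(training_points, training_labels)
    )
    # Steps 3-4: take the first k distances and the corresponding points
    k_nearest = [label for _, label in distances[:k]]
    # Steps 5-6: count points per class among the k neighbours and pick the majority class
    return Counter(k_nearest).most_common(1)[0][0]

# Illustrative (made-up) data: two classes in 2-D
X_train = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
y_train = ["white", "white", "white", "orange", "orange", "orange"]
print(knn_predict(X_train, y_train, (2, 2), k=3))  # -> "white"
```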
K-Nearest neighbor algorithm example
Let's consider that we have two different target classes, shown as white and orange circles, with 26 training samples in total. Now we would like to predict the target class for a new point, the blue circle. Taking the k value as three, we need to calculate the similarity using distance measures like the Euclidean distance.
A smaller distance means the points are more similar, i.e. closer. We calculate the distances and place the three circles closest to the blue circle inside the big circle.
Let's consider a setup with n training samples, where xi is a training data point. The training data points are categorized into c classes. Using KNN, we want to predict the class for a new data point. So, the first step is to calculate the (Euclidean) distance between the new data point and all the training data points.
The next step is to arrange all the distances in non-decreasing order. Assuming a positive value of K, we filter the K least values from the sorted list, giving us the K top distances. Let ki denote the number of points belonging to the ith class among these k points. If ki > kj for all i ≠ j, then put x in class i.
Nearest neighbor is a special case of the k-nearest neighbor classifier where the k value is 1 (k = 1). In this case, the new data point is assigned the target class of its single closest neighbor.
Selecting the value of K in K-nearest neighbor is the most critical problem. A small value of K means that noise will have a higher influence on the result, i.e., the probability of overfitting is very high. A large value of K makes it computationally expensive and defeats the basic idea behind KNN (that points that are near are likely to have similar classes). A simple rule of thumb for selecting k is k = n^(1/2), i.e. the square root of the number of training samples.
To optimize the results, we can use Cross Validation. Using the cross-validation technique, we
can test KNN algorithm with different values of K. The model which gives good accuracy can be
considered to be an optimal choice.
It depends on individual cases; at times, the best process is to run through each possible value of k and test our results.
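For instance, a hedged sketch of this cross-validation search with scikit-learn (using the Iris dataset purely as a stand-in) could look like this:

```python
# Sketch of choosing K by cross-validation with scikit-learn (illustrative data).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Test the KNN classifier with different values of K using 5-fold cross-validation
for k in (1, 3, 5, 7, 9, 11):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k:2d}  mean accuracy={scores.mean():.3f}")
# Pick the k with the best (and most stable) cross-validated accuracy.
```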
Most of the classification techniques can be classified into the following three groups:
1. Parametric
2. Semi parametric
3. Non-Parametric
Parametric and semi-parametric classifiers need specific information about the structure of the data in the training set. It is difficult to fulfil this requirement in many cases. So, non-parametric classifiers like KNN are considered.
6. Support Vector Machine (SVM)
A support vector machine (SVM) is a supervised learning technique from the field of machine learning, applicable to both classification and regression.
Rooted in the Statistical Learning Theory developed by Vladimir Vapnik and co-workers at AT&T Bell Laboratories in 1995, SVMs are based on the principle of Structural Risk Minimization.
Support Vector Machines were originally worked out for linear two-class classification with margin, where margin means the minimal distance from the separating hyperplane to the closest data points. The SVM learning machine seeks an optimal separating hyperplane, i.e. one where the margin is maximal. An important and unique feature of this approach is that the solution is based only on those data points which are at the margin. These points are called support vectors.
Figure 1
Support Vector Machines (SVM) are a statistical learning technique that can be seen as a new method for training classifiers based on polynomial functions, radial basis functions, neural networks, splines or other functions.
Support Vector Machines use a linear separating hyperplane to create a classifier. For problems that cannot be linearly separated in the input space, this machine offers a possibility to find a solution by making a non-linear transformation of the original input space into a high-dimensional feature space, where an optimal separating hyperplane can be found.
SVM differs from the other classification algorithms in the way that it chooses the decision boundary: it maximizes the distance from the nearest data points of all the classes. An SVM doesn't merely find a decision boundary; it finds the most optimal decision boundary.
The most optimal decision boundary is the one which has the maximum margin from the nearest points of all the classes. The nearest points to the decision boundary, which determine this maximum distance, are called support vectors, as seen in Figure 1. The decision boundary in the case of support vector machines is called the maximum margin classifier, or the maximum margin hyperplane.
The main objective in SVM is to find the optimal hyperplane to correctly classify between data points of different classes (Figure 2). The hyperplane dimensionality is equal to the number of input features minus one (e.g. when working with three features, the hyperplane will be a two-dimensional plane).
Figure 2
Data points on one side of the hyperplane will be classified into a certain class, while data points on the other side of the hyperplane will be classified into a different class (e.g. green and red as in Figure 2). The distance between the hyperplane and the first point (for each of the different classes) on either side of the hyperplane is a measure of how sure the algorithm is about its classification decision. The bigger the distance, the more confident we can be that the SVM is making the right decision.
The data points closest to the hyperplane are called Support Vectors. Support Vectors determine the orientation and position of the hyperplane, so as to maximise the classifier margin (and therefore the classification score). The number of Support Vectors the SVM algorithm ends up using depends on the application and on the algorithm's parameters.
There are two main types of classification SVM algorithms, Hard Margin and Soft Margin:
Hard Margin: aims to find the best hyperplane without tolerating any form of misclassification.
Soft Margin: we add a degree of tolerance in SVM. In this way we allow the model to voluntarily misclassify a few data points if that can lead to identifying a hyperplane able to generalise better to unseen data.
Soft Margin SVM can be implemented in Scikit-Learn by adding a C penalty term in svm.SVC. The bigger C is, the more penalty the algorithm gets when making a misclassification.
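A small illustrative sketch of passing the C parameter to svm.SVC follows; the blob data is generated only for demonstration, and the specific C values are arbitrary.

```python
# Sketch of the C penalty in scikit-learn's svm.SVC (illustrative generated data).
from sklearn import svm
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

soft = svm.SVC(kernel="linear", C=0.1)    # small C: wide margin, tolerates misclassifications
hard = svm.SVC(kernel="linear", C=1000)   # large C: heavy penalty, behaves close to hard margin
soft.fit(X, y)
hard.fit(X, y)
print("support vectors (soft, hard):", len(soft.support_vectors_), len(hard.support_vectors_))
```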
Kernel Trick
If the data we are working with is not linearly separable (therefore leading to poor linear SVM classification results), it is possible to apply a technique known as the Kernel Trick. This method is able to map our non-linearly separable data into a higher-dimensional space, making our data linearly separable. In this new dimensional space, SVM can then be trained to find a linear separating hyperplane.
Figure 3
There are many different types of Kernels which can be used to create this higher dimensional
space; some examples are linear, polynomial, Sigmoid and Radial Basis Function (RBF). In
Scikit-Learn a Kernel function can be specified by adding a kernel parameter in svm.SVC. An
additional parameter called gamma can be included to specify the influence of the kernel on the
model.
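As an illustration, the following sketch compares a linear kernel with an RBF kernel on generated circular (non-linear) data; the gamma and C values are arbitrary demonstration choices.

```python
# Sketch of the kernel and gamma parameters in svm.SVC (illustrative non-linear data).
from sklearn import svm
from sklearn.datasets import make_circles

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_clf = svm.SVC(kernel="linear").fit(X, y)              # struggles: data is not linearly separable
rbf_clf = svm.SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)  # RBF kernel maps data to a higher-dimensional space

print("linear kernel accuracy:", linear_clf.score(X, y))
print("RBF kernel accuracy:   ", rbf_clf.score(X, y))
```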
Feature Selection
Reducing the number of features in Machine Learning plays a really important role especially
when working with large datasets. This can in fact: speed up training, avoid overfitting and
ultimately lead to better classification results thanks to the reduced noise in the data.
Here, maximizing the distance between the nearest data points (of either class) and the hyper-plane will help us to decide the right hyper-plane. This distance is called the Margin.
In the corresponding figure, we can see that the margin for hyper-plane C is high as compared to both A and B. Hence, we name the right hyper-plane as C. Another important reason for selecting the hyper-plane with the higher margin is robustness: if we select a hyper-plane having a low margin, then there is a high chance of misclassification.
Some of you may have selected hyper-plane B, as it has a higher margin compared to A. But here is the catch: SVM selects the hyper-plane which classifies the classes accurately prior to maximizing the margin. Here, hyper-plane B has a classification error and A has classified all points correctly. Therefore, the right hyper-plane is A.
Can we classify two classes (Scenario 4)? In this scenario, one star at the other end is like an outlier for the star class. SVM has a feature to ignore outliers and find the hyper-plane that has the maximum margin. Hence, we can say that SVM is robust to outliers.
SVM can solve this problem easily! It solves this problem by introducing an additional feature. Here, we will add a new feature z = x^2 + y^2. Now, let's plot the data points on the x and z axes.
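A tiny NumPy sketch of this manual transformation (with made-up inner and outer points) shows why the new z axis makes the problem linearly separable:

```python
# Tiny sketch of the manual feature z = x^2 + y^2 mentioned above (made-up points).
import numpy as np

# Hypothetical points: one class near the origin, one class on an outer ring
inner = np.array([[0.5, 0.2], [-0.3, 0.4], [0.1, -0.6]])
outer = np.array([[2.0, 1.5], [-1.8, 2.1], [2.2, -1.9]])

z_inner = (inner ** 2).sum(axis=1)   # small z values
z_outer = (outer ** 2).sum(axis=1)   # large z values
print(z_inner, z_outer)
# On the (x, z) axes a simple threshold on z (a linear hyper-plane) now separates the two classes.
```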
In SVM, it is now easy to have a linear hyper-plane between these two classes. But another burning question which arises is: do we need to add this feature manually to have a hyper-plane? No, SVM has a technique called the kernel trick. Kernels are functions which take a low-dimensional input space and transform it into a higher-dimensional space, i.e. they convert a non-separable problem into a separable problem. They are mostly useful in non-linear separation problems. Simply put, the kernel does some extremely complex data transformations and then finds out the process to separate the data based on the labels or outputs we have defined.
When we look at the hyper-plane in the original input space, it looks like a circle.
Pros:
o It works really well with a clear margin of separation.
o It is effective in high-dimensional spaces.
o It is effective in cases where the number of dimensions is greater than the number of samples.
o It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
Cons:
o It doesn't perform well when we have a large data set, because the required training time is higher.
o It also doesn't perform very well when the data set has more noise, i.e. the target classes are overlapping.
o SVM doesn't directly provide probability estimates; these are calculated using an expensive five-fold cross-validation. This is related to the SVC method of the Python scikit-learn library.
7. Random Forest
Introduction
Random forest is a supervised learning algorithm which is used for both classification and regression. However, it is mainly used for classification problems. Just as a forest is made up of trees, and more trees mean a more robust forest, the random forest algorithm creates decision trees on data samples, gets a prediction from each of them, and finally selects the best solution by means of voting. It is an ensemble method which is better than a single decision tree because it reduces over-fitting by averaging the results.
We can understand the working of the Random Forest algorithm with the help of the following steps −
Step 1 − First, start with the selection of random samples from a given dataset.
Step 2 − Next, this algorithm will construct a decision tree for every sample. Then it will
get the prediction result from every decision tree.
Step 3 − In this step, voting will be performed for every predicted result.
Step 4 − At last, select the most voted prediction result as the final prediction result.
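These four steps correspond roughly to what scikit-learn's RandomForestClassifier does internally; a minimal usage sketch (the Iris dataset is used here only as a placeholder) is:

```python
# Sketch of the four steps above using scikit-learn's RandomForestClassifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators decision trees are built on random (bootstrap) samples;
# the final prediction is obtained by voting across the trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```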
Overfitting
Overfitting is a practical problem while building a decision tree model. The model is considered to have an overfitting issue when the algorithm continues to go deeper and deeper in the tree to reduce the training set error but ends up with an increased test set error, i.e., the accuracy of prediction for our model goes down. It generally happens when the tree builds many branches due to outliers and irregularities in the data. Two pruning approaches are used to avoid this:
Pre-Pruning
Post-Pruning
Pre-Pruning
In pre-pruning, the tree construction is stopped a bit early. It is preferred not to split a node if its goodness measure is below a threshold value, but it is difficult to choose an appropriate stopping point.
Post-Pruning
In post-pruning, the algorithm first goes deeper and deeper in the tree to build a complete tree. If the tree shows the overfitting problem, then pruning is done as a post-pruning step. We use cross-validation data to check the effect of our pruning: using the cross-validation data, we test whether expanding a node will make an improvement or not.
If it shows an improvement, then we can continue by expanding that node. But if it shows a reduction in accuracy, then the node should not be expanded, i.e., the node should be converted to a leaf node.
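As a hedged sketch, scikit-learn's DecisionTreeClassifier exposes both ideas: depth and split-size thresholds act as pre-pruning, while cost-complexity pruning (the ccp_alpha parameter) acts as post-pruning. The dataset and parameter values below are arbitrary demonstration choices.

```python
# Sketch of pre- and post-pruning with scikit-learn's DecisionTreeClassifier.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: stop growing early with depth / split-size thresholds
pre = DecisionTreeClassifier(max_depth=4, min_samples_split=20, random_state=0).fit(X_train, y_train)

# Post-pruning: grow the full tree, then prune with cost-complexity pruning (ccp_alpha),
# where alpha would normally be chosen by checking accuracy on held-out data
post = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

print("pre-pruned accuracy: ", pre.score(X_test, y_test))
print("post-pruned accuracy:", post.score(X_test, y_test))
```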
There are two stages in the Random Forest algorithm: one is random forest creation, and the other is making a prediction from the random forest classifier created in the first stage.
The random forest creation pseudocode is:
1. Randomly select "K" features from the total "m" features, where K << m
2. Among the "K" features, calculate the node "d" using the best split point
3. Split the node into daughter nodes using the best split
4. Repeat steps 1 to 3 until "l" number of nodes has been reached
5. Build the forest by repeating steps 1 to 4 "n" times to create "n" trees
For the application in banking, the Random Forest algorithm is used to find loyal customers, meaning customers who take out plenty of loans and pay interest to the bank properly, and fraudulent customers, meaning customers who have bad records such as failing to pay back a loan on time or other risky behaviour.
For the application in medicine, the Random Forest algorithm can be used both to identify the correct combination of components in a medicine, and to identify diseases by analyzing the patient's medical records.
For the application in the stock market, the Random Forest algorithm can be used to identify a stock's behaviour and the expected loss or profit.
For the application in e-commerce, the Random Forest algorithm can be used to predict whether the customer will like the recommended products, based on the experience of similar customers.
Compared with other classification techniques, the Random Forest algorithm has three advantages:
1. For applications in classification problems, the Random Forest algorithm avoids the overfitting problem.
2. For both classification and regression tasks, the same Random Forest algorithm can be used.