Unit III: Supervised Learning Techniques

3.1 Decision Trees

What is a Decision Tree?

A Decision Tree is a supervised learning model with a tree-like structure. It makes decisions or predictions by
repeatedly splitting the data into smaller and smaller groups. It can be used both for predicting categories such as
"yes/no" answers (classification) and for predicting numbers (regression).

How it Looks and Works:

• Nodes: Each "circle" or "box" in the tree is a "node."

o Internal Node: This is where a question is asked about a feature (like "Is age > 30?").

o Leaf Node: These are the very end points of the tree. They give the final answer or prediction (like
"Buy computer" or "Don't buy").

o Root Node: This is the starting node at the very top of the tree.

• Branches: The lines coming out of a node are "branches." They represent the possible answers to the question
(like "Yes" or "No").

• The tree keeps asking questions and splitting the data until it reaches a final answer. The goal is to make the
purest groups possible at the end.
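
Below is a minimal sketch of this idea in Python. The use of scikit-learn and the tiny "age and income vs. buys
computer" dataset are assumptions made only for illustration; the notes do not prescribe a specific library or data.

from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: [age, income] -> buys computer (1) or not (0)
X = [[25, 30000], [45, 80000], [35, 60000], [50, 40000], [23, 20000], [40, 90000]]
y = [0, 1, 1, 0, 0, 1]

# The tree repeatedly asks questions such as "Is income > 50000?" to split the data
tree = DecisionTreeClassifier(max_depth=2)
tree.fit(X, y)

# A leaf node gives the final answer for a new person
print(tree.predict([[30, 70000]]))  # e.g. [1] meaning "buy computer"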

Key Ideas:

• A tree-like model for making decisions.

• Used for classification (categories) and regression (numbers).

• Made of nodes, branches, and leaves.

• Works by splitting data based on questions about features.

3.2 Naive Bayes Classification

What is Naive Bayes?

Naive Bayes is a classification algorithm that works on probability, using a rule called Bayes' Theorem. It is
"naive" because it assumes that all features (like a person's age, income, and job) are independent of each other,
given the class being predicted (like whether the person will buy a product). In other words, it assumes that one
feature does not affect another.

How it Works:

Even with this simple assumption, Naive Bayes often performs surprisingly well, especially with a lot of data. It's very
fast and efficient. It calculates the chance that a certain input belongs to each possible category, and then picks the
category with the highest chance.
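
As a rough sketch of this "pick the most probable class" idea, here is how it could look in Python using
scikit-learn's GaussianNB; the library choice and the small dataset are assumptions for illustration only.

from sklearn.naive_bayes import GaussianNB

# Hypothetical features: [age, income]; class: buys the product (1) or not (0)
X = [[25, 30000], [45, 80000], [35, 60000], [50, 40000], [23, 20000], [40, 90000]]
y = [0, 1, 1, 0, 0, 1]

model = GaussianNB()   # treats each feature as independent, given the class
model.fit(X, y)

# The model computes a probability for each class and picks the highest one
print(model.predict_proba([[30, 70000]]))
print(model.predict([[30, 70000]]))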

Key Ideas:

• It's a classification algorithm (predicts categories).

• Based on probability (Bayes' Theorem).

• Assumes features are independent (which is why it's "naive").

• Good for tasks like spam detection or sentiment analysis (deciding whether a piece of text is positive or negative).

• Simple, fast, and efficient.


3.3 Classification (General Concepts)

What is Classification?

Classification is a type of supervised learning where the computer learns to put things into predefined groups or
categories. The answer it gives is always one of these specific labels.

What Makes Classification Distinct:

• Output: The main thing about classification is that its output is always a category (like "yes" or "no", "cat" or
"dog").

• Learning: The model learns from data that already has these categories marked (labeled data).

Examples:

• Is this email spam or not spam?

• Is this picture a cat, a dog, or a bird?

• Does this person have a disease or not?

Key Points:

• Predicts discrete categories (labels).

• Learns from labeled data.

• Different from regression (which predicts numbers).

3.4 Support Vector Machines (SVMs)

What is a Support Vector Machine (SVM)?

An SVM is a powerful supervised learning algorithm used for both classification and regression, but most commonly
for classification. Its main goal is to find the best way to separate different groups of data.

How it Works (The "Hyperplane"):

Imagine you have data points scattered on a graph, and you want to draw a line to separate two different types of
points (like circles and squares).

• Hyperplane: The SVM tries to find the "best" line (or plane, if you have more features) that separates these
groups. This line is called a hyperplane.

• Maximum Margin: The "best" line is the one that has the largest possible gap (or "margin") between it and
the closest data points from each group.

• Support Vectors: The data points that are closest to this separating line are called "support vectors." These
are the critical points that "support" or define the position of the hyperplane. If you move these points, the
hyperplane might change.
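
A minimal sketch of these ideas, assuming Python and scikit-learn's SVC (the points below are invented for
illustration):

from sklearn.svm import SVC

# Two small hypothetical groups of 2-D points ("circles" = 0, "squares" = 1)
X = [[1, 1], [2, 1], [1, 2], [6, 5], [7, 7], [6, 6]]
y = [0, 0, 0, 1, 1, 1]

# A linear kernel looks for the straight line with the widest possible margin
clf = SVC(kernel="linear")
clf.fit(X, y)

# The points closest to the separating line are the support vectors
print(clf.support_vectors_)
print(clf.predict([[3, 2]]))  # which side of the hyperplane does this point fall on?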

Key Points:

• A supervised learning algorithm for classification (and regression).

• Goal: To divide datasets into classes.

• Finds a "hyperplane" (a line or plane) that maximizes the margin between classes.

• Support vectors are the data points closest to the hyperplane that influence its position.
3.5 Random Forest

What is a Random Forest?

A Random Forest is a very popular and powerful ML algorithm for both classification and regression. It's based on an
idea called "Ensemble Learning," which means combining many simpler models to get a better, more robust result.
As its name suggests, it builds a "forest" of decision trees.

How it Works:

• Instead of relying on just one decision tree, a Random Forest builds many decision trees using different
random subsets of the training data.

• Each tree makes its own prediction.

• For classification problems, the Random Forest then takes a "vote" from all the trees and chooses the
prediction that the majority of trees agreed on.

• For regression problems, it averages all the trees' predictions.

• Having many trees helps improve accuracy and prevents overfitting (where a single tree might be too specific
to the training data).
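
A minimal sketch of the voting idea, again assuming Python and scikit-learn (the data is invented for illustration):

from sklearn.ensemble import RandomForestClassifier

# Hypothetical labeled data: [age, income] -> class 0 or 1
X = [[25, 30000], [45, 80000], [35, 60000], [50, 40000], [23, 20000], [40, 90000]]
y = [0, 1, 1, 0, 0, 1]

# Build a "forest" of 100 trees, each trained on a random sample of the data
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# For classification, the prediction is the majority vote of all 100 trees
print(forest.predict([[30, 70000]]))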

Key Ideas:

• Combines multiple decision trees.

• Uses Ensemble Learning to improve performance.

• Takes majority vote for classification, averages for regression.

• Helps achieve higher accuracy and prevents overfitting.

3.6 Linear Regression for Regression Problems

What is Linear Regression?

Linear Regression is a fundamental statistical method used for regression problems. Its goal is to find the best straight
line that describes the relationship between an input feature (or features) and a continuous numerical output.

How it Works (Finding the "Best Fit Line"):

Imagine you have data points on a graph (like house size vs. house price). Linear regression tries to draw a straight
line that comes closest to all these points.

• Minimizing Errors: It does this by minimizing the "residuals" or "errors," which are the distances between
each actual data point and the line. Specifically, it tries to minimize the sum of the squared differences
between the actual values and the values predicted by the line. This is called the "Ordinary Least Squares
(OLS)" method.

• Equation of the Line: The final line has an equation like Y = mX + c (for one input), or more generally,
Y = β0 + β1X1 + β2X2 + ... + e for multiple inputs. Here, Y is the output, the X values are the inputs, the β values
are the coefficients (how much each input affects the output), and e is the error term.
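
A minimal sketch of fitting such a line in Python with scikit-learn's LinearRegression; the "house size vs. price"
numbers are invented for illustration.

from sklearn.linear_model import LinearRegression

# Hypothetical data: house size (square metres) -> price
X = [[50], [70], [90], [110], [130]]
y = [150000, 200000, 260000, 310000, 370000]

model = LinearRegression()   # fits Y = mX + c by ordinary least squares
model.fit(X, y)

print(model.coef_, model.intercept_)   # m (slope) and c (intercept)
print(model.predict([[100]]))          # predicted price for a 100 m^2 house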

Assumptions (Things that should be true for it to work well):

Before using Linear Regression, ideally, certain things should be true about your data:

• Linear Relationship: There should be a straight-line relationship between inputs and outputs.
• Independence of Errors: The errors (differences between predicted and actual values) should not be related
to each other.

• Constant Variance of Errors: The spread of errors should be roughly the same across all input values.

• Normal Distribution of Errors: The errors should follow a bell-shaped curve (normal distribution).

• No Multicollinearity: Input features should not be too highly correlated with each other.

Key Points:

• Used for regression problems (predicting numbers).

• Finds the best-fitting straight line through data points.

• Minimizes the sum of squared errors (Ordinary Least Squares).

• Has assumptions about the data for best results.

3.7 Ordinary Least Squares (OLS) Regression

What is OLS?

Ordinary Least Squares (OLS) is the most common method used in Linear Regression. It's how the "best-fitting line" is
actually found.

How it Works:

The main idea of OLS is to make the differences between the actual data points and the line as small as possible.

• Errors/Residuals: These are the vertical distances from each data point to the line.

• Sum of Squared Residuals: OLS doesn't just add up the errors (because positive and negative errors would
cancel out). Instead, it squares each error and then adds them up. This way, larger errors get penalized more.

• Minimizing This Sum: The OLS method finds the line (by picking the right slope and intercept) that makes
this "sum of squared residuals" as small as possible. This line is called the "Regression Line."

Key Points:

• A linear regression technique.

• Estimates the unknown parameters (coefficients) of the model.

• Relies on minimizing the sum of squared differences between actual and predicted values.

• The resulting line is the Regression Line.

3.8 Logistic Regression

What is Logistic Regression?

Despite "regression" in its name, Logistic Regression is primarily a classification algorithm. It's used when the output
you want to predict is a binary category (like "yes/no," "true/false," "spam/not spam"). It predicts the probability that
an input belongs to a certain class.

How it Works:

• Instead of fitting a straight line to the data (like linear regression), Logistic Regression uses a special S-shaped
curve called the "sigmoid function."

• This curve squashes any input value into a probability between 0 and 1.
• If the probability is above a certain threshold (e.g., 0.5), the input is assigned to one class; otherwise it is
assigned to the other.
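
A minimal sketch of the sigmoid and the 0.5 threshold in Python; the use of scikit-learn and the "hours studied vs.
passed" data are assumptions for illustration only.

import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    # The S-shaped curve: squashes any number into a probability between 0 and 1
    return 1 / (1 + np.exp(-z))

# Hypothetical data: hours of study -> passed the exam (1) or not (0)
X = [[1], [2], [3], [4], [5], [6]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

p = model.predict_proba([[3.5]])[0, 1]      # probability of the "pass" class
print(p, "pass" if p > 0.5 else "fail")     # apply the 0.5 threshold

# The same probability, computed by hand through the sigmoid
z = model.coef_[0][0] * 3.5 + model.intercept_[0]
print(sigmoid(z))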

Key Points:

• A classification algorithm, not for predicting numbers directly.

• Used for binary outcomes (e.g., yes/no).

• Predicts the probability of belonging to a class.

• Uses the sigmoid (S-shaped) function.
