
UNIT III

SEMI-PARAMETRIC METHODS

Code: U18CST7002
Presented by: Nivetha R
Department: CSE
Contents

• Introduction to the Linear Model
• Generalizing the Linear Model
• Geometry of the Linear Discriminant
• Pairwise Separation
• Gradient Descent
Linear Model

• The linear model is one of the most straightforward models in machine learning. It is the building block for many complex machine learning algorithms, including deep neural networks.
• Linear models predict the target variable using a linear function of the input features. Two crucial linear models in machine learning are linear regression and logistic regression.
• Linear regression is used for regression tasks, whereas logistic regression is a classification algorithm.
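
As a minimal sketch (using scikit-learn; the synthetic dataset and its coefficients are purely illustrative), the two models share the same linear form but solve different tasks:

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # two input features

# Regression target: a noisy linear function of the features
y_reg = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
reg = LinearRegression().fit(X, y_reg)
print(reg.coef_, reg.intercept_)         # recovers roughly [3, -2] and 0

# Classification target: the sign of the same linear function
y_clf = (3.0 * X[:, 0] - 2.0 * X[:, 1] > 0).astype(int)
clf = LogisticRegression().fit(X, y_clf)
print(clf.predict(X[:5]))                # class labels, not real values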
Likelihood vs. Discriminant-based Classification

• Likelihood-based: assume a model for p(x|Ci), then use Bayes' rule to calculate P(Ci|x).
• Discriminant-based: assume a model for gi(x|Φi); no density estimation.
• Estimating the boundaries is enough; there is no need to accurately estimate the densities inside the boundaries.

This approach is useful because it focuses on the boundaries between classes rather than trying to estimate the probability distribution within each class.
Discriminant-based Classification
A discriminant function is a mathematical function used
in classification tasks to determine which class a given data
point belongs to. It assigns a score to each class based on
the input features of the data point. The class with the
highest score is chosen as the predicted class for that data
point.
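
A minimal sketch of this idea in NumPy (the weights and biases below are made-up values for illustration): compute one linear score per class and predict the argmax.

import numpy as np

# One weight vector and bias per class -- illustrative values only
W = np.array([[ 1.0, -0.5],
              [-0.3,  0.8],
              [ 0.2,  0.2]])             # 3 classes, 2 features
b = np.array([0.1, -0.2, 0.0])

def predict(x):
    scores = W @ x + b                   # score g_i(x) for each class i
    return int(np.argmax(scores))        # class with the highest score

print(predict(np.array([2.0, 1.0])))     # index of the winning class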
Linear Discriminant

• In a linear discriminant, the final output is a weighted sum of the input attributes.
• The magnitude of a weight shows the importance of its attribute.
• The sign of a weight indicates whether the attribute's effect is positive or negative.
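
For example, with a hypothetical discriminant g(x) = 2.5·x1 - 0.4·x2 + 1.0, attribute x1 matters far more than x2 (weight magnitude 2.5 versus 0.4), x1's effect on the score is positive, and x2's effect is negative.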
Generalizing the Linear Model
This section discusses the generalized linear model in the context of classification, focusing on how we can extend linear models to handle more complex relationships between features.
Nonlinear basis functions help transform the original features into new, more complex forms, making it easier to apply linear models to data that has a non-linear relationship between input and output.
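
A minimal sketch of the idea in NumPy (the quadratic basis and the synthetic data are illustrative choices): expand each input x into basis functions φ(x), then fit an ordinary linear model on the expanded features.

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=200)
y = x**2 + rng.normal(scale=0.3, size=200)   # target is non-linear in x

# Basis expansion: phi(x) = [1, x, x^2]
Phi = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares on the expanded features
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(w)   # roughly [0, 0, 1]: the model is linear in w, non-linear in x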
Summary: Generalizing the Linear Model
Geometry of Linear Discriminant
Linear Discriminant Analysis (LDA) is one of the most popular dimensionality reduction techniques for supervised classification problems in machine learning. It is also used as a pre-processing step for modeling class differences in ML and in pattern classification applications.

LDA is considered the most common technique for solving such classification problems. For example, suppose we have two classes with multiple features and need to separate them efficiently: if we classify them using a single feature, the classes may overlap.
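
A minimal sketch with scikit-learn (the two-class Gaussian dataset is synthetic): LDA projects the features onto the single axis along which the classes separate best.

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two Gaussian classes that overlap on either feature taken alone
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

lda = LinearDiscriminantAnalysis(n_components=1)
z = lda.fit_transform(X, y)              # 1-D projection of the 2-D data
print(lda.score(X, y))                   # accuracy of the LDA classifier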
Derivations for Geometry of Linear Discriminant
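
As background for these derivations, the standard geometry of a linear discriminant g(x) = w^T x + w0 is as follows: the decision boundary g(x) = 0 defines a hyperplane whose orientation is given by the weight vector w (w is normal to the hyperplane); the signed distance from any point x to the hyperplane is g(x)/||w||; and the distance of the hyperplane from the origin is |w0|/||w||. Thus w determines the orientation of the decision boundary and w0 its location.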
Pairwise Separation

• Pairwise separation is a method used when the classes in a dataset are not linearly separable. It breaks the problem down into smaller, more manageable parts by focusing on separating each pair of classes individually, as in the sketch below.
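
A minimal sketch using scikit-learn's one-vs-one wrapper around a linear classifier (the blob dataset is synthetic): one linear discriminant is trained per pair of classes, and their votes are combined at prediction time.

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# K classes -> K*(K-1)/2 pairwise linear discriminants, combined by voting
ovo = OneVsOneClassifier(LogisticRegression()).fit(X, y)
print(len(ovo.estimators_))   # 3 pairwise classifiers for 3 classes
print(ovo.predict(X[:5]))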
Gradient Descent
• Gradient descent is one of the most commonly used optimization algorithms for training machine learning models; it minimizes the error between actual and predicted results. It is also used to train neural networks.
• In mathematical terminology, an optimization algorithm performs the task of minimizing or maximizing an objective function f(x) parameterized by x. Similarly, in machine learning, optimization is the task of minimizing the cost function parameterized by the model's parameters.
• The main objective of gradient descent is to minimize a convex function through iterative parameter updates. Once optimized, these machine learning models can be used as powerful tools for artificial intelligence and various computer science applications.
What is Gradient Descent or Steepest Descent?

• Gradient descent is one of the most commonly used iterative optimization algorithms in machine learning, used to train machine learning and deep learning models. It helps find a local minimum of a function.
• The local minimum and local maximum of a function relate to the gradient as follows:
• Moving against the gradient (in the direction of the negative gradient) at the current point leads toward a local minimum of the function.
• Moving with the gradient (in the direction of the positive gradient) at the current point leads toward a local maximum of the function.
What is Gradient Descent or Steepest Descent?

The main objective of using a gradient descent algorithm is to minimize the cost function through iteration. To achieve this goal, it performs two steps iteratively, described on the next slide.
Gradient Descent
• The first-order derivative of the function is used to compute its gradient, or slope, at the current point.
• The parameters are then updated by moving in the direction opposite to the gradient.
• The size of each step is scaled by a factor alpha (α), the learning rate: a crucial tuning parameter that determines the step size taken during the optimization process.
• The learning rate controls how quickly or slowly the algorithm converges to the optimal solution.
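
As a worked sketch in Python (the quadratic objective f(w) = (w - 3)^2 and the learning rate are made-up choices for illustration), the two steps are one line each:

# Minimal gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3
w = 0.0          # initial parameter value
alpha = 0.1      # learning rate

for _ in range(100):
    grad = 2 * (w - 3)   # step 1: first-order derivative of f at w
    w -= alpha * grad    # step 2: move opposite to the gradient
print(w)                 # approximately 3.0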
What is a Cost Function?

• The cost function measures the difference, or error, between the actual values and the predicted values at the current position, expressed as a single real number.
• It helps improve machine learning efficiency by providing feedback to the model so that it can minimize the error and find the local or global minimum.
• The algorithm iterates along the direction of the negative gradient until the cost function approaches its minimum.
• At that point, the model stops learning further.
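
For instance, a common cost function for regression is the mean squared error (a minimal NumPy sketch; the tiny dataset is illustrative):

import numpy as np

def mse_cost(w, b, X, y):
    # Mean squared error of the linear model y_hat = X @ w + b
    y_hat = X @ w + b
    return np.mean((y - y_hat) ** 2)   # a single real number

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
print(mse_cost(np.array([2.0]), 0.0, X, y))   # 0.0: perfect fit
print(mse_cost(np.array([1.0]), 0.0, X, y))   # positive: error remains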
How does Gradient Descent work?
The main objective of gradient descent is to minimize the cost function, i.e., the error between the expected and actual values.

Learning Rate:

• The learning rate is the step size taken to reach the minimum, or lowest point. It is typically a small value that is evaluated and updated based on the behavior of the cost function. A high learning rate results in larger steps but risks overshooting the minimum; a low learning rate takes small steps, which compromises overall efficiency but gives the advantage of more precision.
Types of Gradient Descent
1. Batch Gradient Descent:

Batch gradient descent (BGD) computes the error for each point in the training set and updates the model only after evaluating all training examples. One such full pass is known as a training epoch. In simple words, it is a greedy approach where we need to sum over all examples for each update.

Advantages of batch gradient descent:

• It produces less noise in comparison to other types of gradient descent.
• It produces stable gradient descent convergence.
• It is computationally efficient, as all resources are used to process all training samples together.
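
A minimal sketch of batch gradient descent for linear regression in NumPy (the synthetic data, true weights [1.5, -0.7], and learning rate are illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -0.7]) + rng.normal(scale=0.1, size=200)

w = np.zeros(2)
alpha = 0.1
for epoch in range(100):
    grad = -2 / len(X) * X.T @ (y - X @ w)   # gradient over ALL examples
    w -= alpha * grad                         # one update per epoch
print(w)   # close to [1.5, -0.7]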
Types of Gradient Descent
2. Stochastic Gradient Descent:

Stochastic gradient descent (SGD) is a type of gradient descent that processes one training example per iteration: it updates the parameters after each individual example rather than after the whole dataset. Because it requires only one training example at a time, it is easy to store in the allocated memory. However, its frequent updates cost it some computational efficiency per epoch compared to batch gradient descent, and because each update is based on a single example, it is also described as having a noisy gradient.

Advantages of stochastic gradient descent:

• It is easier to fit in the allocated memory.
• It is relatively fast to compute compared to batch gradient descent.
• It is more efficient for large datasets.
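
The same regression problem with stochastic updates (a minimal sketch that reuses the synthetic X, y, rng, and alpha from the batch example above):

# Stochastic gradient descent: one update per training example
w = np.zeros(2)
for epoch in range(20):
    for i in rng.permutation(len(X)):          # visit examples in random order
        xi, yi = X[i], y[i]
        grad = -2 * xi * (yi - xi @ w)         # gradient from ONE example
        w -= alpha * grad                      # noisy but frequent update
print(w)   # also close to [1.5, -0.7], reached via many small steps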


Types of Gradient Descent
3. Mini-Batch Gradient Descent:

Mini-batch gradient descent combines batch gradient descent and stochastic gradient descent. It divides the training dataset into small batches and performs an update on each batch separately. Splitting the training dataset into smaller batches strikes a balance between the computational efficiency of batch gradient descent and the speed of stochastic gradient descent. The result is a variant of gradient descent with high computational efficiency and a less noisy gradient.

Advantages of mini-batch gradient descent:

• It is easier to fit in the allocated memory.
• It is computationally efficient.
• It produces stable gradient descent convergence.
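
The mini-batch variant (again reusing X, y, rng, and alpha from the batch example; the batch size of 32 is an illustrative choice):

# Mini-batch gradient descent: one update per batch of 32 examples
w = np.zeros(2)
batch_size = 32
for epoch in range(50):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = -2 / len(Xb) * Xb.T @ (yb - Xb @ w)   # gradient over one batch
        w -= alpha * grad
print(w)   # close to [1.5, -0.7], with smoother steps than pure SGD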
