Ridge, Lasso & Regression Evaluation

Machine learning chapter


Overfitting and Underfitting
• Overfitting occurs when a machine learning model learns the training data
too well, capturing noise and fluctuations in the data rather than the
underlying patterns. As a result, the model performs very well on the
training set but fails to generalize effectively to new, unseen data.
(Figure: overfitting illustration, Y plotted against X.)
Overfitting and Underfitting
• Underfitting occurs when a machine learning model is too simple to
capture the underlying patterns in the training data. The model fails to
learn the relevant relationships and performs poorly on both the training
set and new data.
(Figure: underfitting illustration, Y plotted against X.)
Overfitting and Underfitting
(Figure: fitted curves plotted against X.)
Bias and Variance
• If we have a model that is very accurate, the error of our model will be low, meaning low bias and low variance: all the data points fit within the bulls-eye. Similarly, if the variance increases, the spread of our data points increases, which results in less accurate predictions. And as the bias increases, the error between our predicted values and the observed values increases.
Bias and Variance
• As we add more and more parameters to our model, its complexity increases, which results in increasing variance and decreasing bias, i.e., overfitting. So we need to find the optimum point in our model where the decrease in bias is balanced by the increase in variance. In practice, there is no analytical way to find this point. So how do we deal with high variance or high bias? The sketch below illustrates the trade-off.
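A minimal sketch of this trade-off on synthetic data (the dataset, degrees, and values are all illustrative): a low-degree polynomial underfits (high bias), while a high-degree one overfits (high variance), visible as a widening gap between training and test error.

```python
# Complexity trade-off on synthetic data (all values illustrative):
# low degree underfits (high bias), high degree overfits (high variance).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)  # noisy signal

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):  # underfit, reasonable, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```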
Multicollinearity
• Multicollinearity is the occurrence of high intercorrelations among two or more independent variables in a multiple regression model. Multicollinearity can lead to skewed or misleading results when a researcher or analyst attempts to determine how well each independent variable predicts or explains the dependent variable in a statistical model.
(Figure: scatter of two highly correlated predictors, $x_1$ vs. $x_2$.)
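One common diagnostic for multicollinearity is the variance inflation factor (VIF). Below is a minimal sketch using statsmodels; the data and feature names are hypothetical, and the VIF > 10 threshold is only a rule of thumb.

```python
# Detecting multicollinearity with variance inflation factors (VIF);
# the feature names and data are hypothetical.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + rng.normal(scale=0.1, size=100)  # nearly collinear with x1
x3 = rng.normal(size=100)                         # independent
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# A common rule of thumb flags VIF > 10 as problematic collinearity.
for i, col in enumerate(X.columns):
    print(col, variance_inflation_factor(X.values, i))
```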
Regularization?
• Regularization is a technique used in machine learning to prevent overfitting by adding a penalty term to the objective function that the model is trying to optimize. The goal of regularization is to discourage the model from fitting the training data too closely and, instead, to encourage it to learn the underlying patterns that generalize well to new, unseen data.
• There are two types of regularization:
✓ L1 Regularization (Lasso)
✓ L2 Regularization (Ridge)
Ridge Regression
• Ridge regression is a model-tuning method used to analyse data that suffers from multicollinearity. This method performs L2 regularization. When multicollinearity occurs, least-squares estimates are unbiased but their variances are large, which results in predicted values being far from the actual values.

$$\text{Ridge Objective Function} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$
• λ (lambda) is the regularization parameter, a non-negative hyperparameter
that controls the strength of the regularization. As λ increases, the impact
of the regularization term on the objective function increases.
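A minimal sketch of ridge regression with scikit-learn on deliberately collinear synthetic data; note that `alpha` is scikit-learn's name for the λ above, and all values here are illustrative.

```python
# Ridge vs. ordinary least squares on deliberately collinear synthetic
# data; alpha is scikit-learn's name for the λ in the objective above.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 1] = X[:, 0] + rng.normal(scale=0.01, size=100)  # x1 ≈ x0
y = 3.0 * X[:, 0] + rng.normal(size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # larger alpha => stronger shrinkage

print("OLS coefficients:  ", ols.coef_)    # unstable under collinearity
print("Ridge coefficients:", ridge.coef_)  # shrunk, far more stable
```

With collinear predictors, OLS splits the shared signal between the two near-duplicate columns almost arbitrarily, while ridge shrinks both toward a stable compromise.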
Lasso Regression
• LASSO regression, also known as L1 regularization, is a popular technique
used in statistical modeling and machine learning to estimate the
relationships between variables and make predictions. LASSO stands for
Least Absolute Shrinkage and Selection Operator.

$$\text{Lasso Objective Function} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$
• λ (lambda) is the regularization parameter, a non-negative hyperparameter that controls the strength of the regularization.
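A minimal sketch of lasso's feature-selection effect with scikit-learn: with an L1 penalty, coefficients of irrelevant features are driven exactly to zero (synthetic data, illustrative `alpha`).

```python
# Lasso driving irrelevant coefficients exactly to zero on synthetic
# data; alpha again corresponds to λ in the objective above.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 4.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=200)  # 2 real features

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)  # most entries come out exactly 0.0
```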
Regression Evaluation
• Model evaluation in machine learning is a crucial step that serves various purposes. It enables the assessment of a model's performance on a specific task using appropriate metrics; for regression, these include MAE, MSE, RMSE, and R².
• Evaluating on a separate test set ensures the
model's ability to make accurate predictions
on new, unseen data, indicating its
generalization capabilities. Additionally,
model evaluation facilitates the comparison
of different models, aiding in the selection of
the most effective one for a given task. It
plays a key role in hyperparameter tuning,
helping to find optimal settings and improve
overall performance.
Mean Absolute Error (MAE)
• MAE is a metric used to measure the average magnitude of errors between
predicted and actual values in a regression task. It is calculated by taking the
average of the absolute differences between the predicted and true values.
MAE provides a straightforward way to assess the accuracy of a predictive
model, with lower MAE values indicating better performance.

• MAE formula:
$$MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|$$

• where $N$ is the number of data points, $y_i$ is the true value for the $i$-th data point, and $\hat{y}_i$ is the predicted value for the $i$-th data point.
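As a quick check, here is a minimal sketch computing MAE both directly from the formula and with scikit-learn, on illustrative arrays:

```python
# MAE computed directly from the formula and via scikit-learn;
# the arrays are illustrative.
import numpy as np
from sklearn.metrics import mean_absolute_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mae = np.mean(np.abs(y_true - y_pred))           # formula above
print(mae, mean_absolute_error(y_true, y_pred))  # both 0.75
```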
Mean Squared Error (MSE)
• MSE is one of the most widely used metrics and is a small variation on mean absolute error: it measures the squared difference between actual and predicted values.
• It represents the squared distance between actual and predicted values. We square the differences to avoid the cancellation of negative terms, which is a benefit of MSE.
• MSE formula:
$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$

• where $N$ is the number of data points, $y_i$ is the true value for the $i$-th data point, and $\hat{y}_i$ is the predicted value for the $i$-th data point.
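A minimal sketch computing MSE the same two ways, on the same illustrative arrays:

```python
# MSE computed directly from the formula and via scikit-learn,
# on the same illustrative arrays.
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)           # formula above
print(mse, mean_squared_error(y_true, y_pred))  # both 0.875
```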
Root Mean Squared Error (RMSE)
• As the name itself makes clear, RMSE is simply the square root of the mean squared error.

• RMSE formula:
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$$

• where $N$ is the number of data points, $y_i$ is the true value for the $i$-th data point, and $\hat{y}_i$ is the predicted value for the $i$-th data point.
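A minimal sketch of RMSE as the square root of MSE, on the same illustrative arrays:

```python
# RMSE as the square root of MSE, on the same illustrative arrays.
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(rmse)  # sqrt(0.875) ≈ 0.935
```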
R Squared ($R^2$)
• R-squared (R²) is a statistical measure that represents the proportion of the
variance in the dependent variable that is explained by the independent
variables in a regression model. It is also known as the coefficient of
determination. R-squared values range from 0 to 1, where:
• $R^2 = 0$: the model does not explain any of the variability in the dependent variable.
• $R^2 = 1$: the model explains all the variability in the dependent variable.

• $R^2$ formula:
$$R^2 = 1 - \frac{\text{sum of squares of residuals } (SSR)}{\text{total sum of squares } (SST)}$$
• where $SSR = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, $SST = \sum_{i=1}^{N} (y_i - \bar{y})^2$, and $\bar{y}$ is the mean of the dependent variable.
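A minimal sketch computing $R^2$ from SSR and SST and checking it against scikit-learn's `r2_score`, on the same illustrative arrays:

```python
# R² from the SSR/SST formula above, checked against scikit-learn;
# the arrays are illustrative.
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

ssr = np.sum((y_true - y_pred) ** 2)         # sum of squared residuals
sst = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
print(1 - ssr / sst, r2_score(y_true, y_pred))  # both ≈ 0.724
```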


Session Finished

Thank You!
MACHINFY EDUCATION TEAM
