
International Journal of Computer Applications (0975 – 8887)

Volume 186 – No.9, February 2024

Implementation of Linear Regression using Least Squares and Gradient Descent in Python

Ahmad Farhan AlShammari


Department of Computer and Information Systems
College of Business Studies, PAAET
Kuwait

ABSTRACT
The goal of this research is to develop a linear regression program using least squares and gradient descent in Python. Linear regression helps to find the line that best fits the data points. The linear regression model is based on a linear polynomial of slope (m) and intercept (c). Least squares is used to minimize the error between the observed and predicted points. Gradient descent is used to find the optimal solution that provides the minimum value of the error function.

The basic steps of linear regression using least squares and gradient descent are explained: preparing observed points, initializing slope and intercept, computing predicted points, computing partial derivatives, updating slope and intercept, computing error function, making equation of line, and plotting predicted line.

The developed program was tested on an experimental dataset from Kaggle. The program successfully performed the basic steps of linear regression and provided the required results.

Keywords
Artificial Intelligence, Machine Learning, Linear Regression, Least Squares, Mean Squared Error, Gradient Descent, Python, Programming.

1. INTRODUCTION
In recent years, machine learning has played a major role in the development of computer systems. Machine learning (ML) is a branch of Artificial Intelligence (AI) focused on the study of computer algorithms that improve the performance of computer programs [1-10].

Linear regression is one of the important applications of machine learning. It is a common area that shares knowledge between various fields such as machine learning, programming, data science, mathematics, statistics, and numerical methods [11-17].

[Figure] Fig 1: Field of Linear Regression (linear regression at the intersection of machine learning, programming, data science, mathematics, statistics, and numerical methods)

In this paper, linear regression is applied using least squares and gradient descent to find the optimal line that best fits the data points using a linear polynomial. Linear regression is extensively used in data analysis. It has a wide range of applications in many fields such as engineering, industry, business, education, medicine, public health, agriculture, environment, and climate change.

2. LITERATURE REVIEW
The review of literature revealed the major contributions in the field of linear regression using least squares and gradient descent [18-30].

Linear regression is a well-known method in mathematics. It was first performed by the great mathematicians Legendre (1805) and Gauss (1809) to predict the motion of planets [31]. It was used to model the relationship between two variables.

The linear regression model is based on a linear polynomial of the following form:

y = mx + c

where (m) is the slope of the line, and (c) is the intercept with the y-axis. The linear function is explained in the following diagram:

[Figure] Fig 2: Equation of Line (a line y = mx + c, with slope m = Δy/Δx and intercept c on the y-axis)

A line can have different shapes depending on the values of slope and intercept, as shown in the following diagram:

[Figure] Fig 3: Different Shapes of Line (lines with different slopes and lines with different intercepts)
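To make Fig 3 concrete, the following minimal sketch draws lines with different slopes and different intercepts. It uses the "matplotlib" library (also used for plotting later in the paper); the slope and intercept values are chosen arbitrarily for illustration:

import matplotlib.pyplot as plt

x = [i / 10 for i in range(101)]    # x values from 0 to 10

# lines with different slopes (same intercept c = 1)
for m in [0.5, 1.0, 2.0]:
    plt.plot(x, [m * xi + 1 for xi in x], label="y = %.1f x + 1" % m)

# lines with different intercepts (same slope m = 1)
for c in [0, 2, 4]:
    plt.plot(x, [xi + c for xi in x], linestyle="--", label="y = x + %d" % c)

plt.legend()
plt.show()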


The fundamental concepts of linear regression using least squares and gradient descent are explained in the following section.

Linear Regression:
Linear regression is an algorithm used to find the line that best fits the data points. It is used to model the relationship between the independent variable (x) and the dependent variable (y) using a linear polynomial of slope (m) and intercept (c). Linear regression is widely used in data analysis because of its simplicity and efficiency.

[Figure] Fig 4: Explanation of Linear Regression (observed points scattered around the predicted line)

To find the optimal line that best fits the data points, the concept of least squares is used to minimize the variance between the observed and predicted points. The concept of least squares is explained in the following section.

Least Squares:
Least squares is a mathematical method used to minimize the error between the observed and predicted points. The error function is defined by the Mean Squared Error (MSE). It is computed by the following formula:

$$\mathrm{MSE} = \frac{1}{n} \sum (y - y_p)^2 = \frac{1}{n}\left[(y_0 - y_{p0})^2 + (y_1 - y_{p1})^2 + \dots + (y_n - y_{pn})^2\right] \tag{1}$$

The differences between the observed and predicted points are shown in the following diagram:

[Figure] Fig 5: Explanation of Least Squares (differences Δ1, Δ2, Δ3 between the observed points and the predicted line)
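As a toy illustration of formula (1), the MSE of three observed/predicted pairs can be computed directly (the numbers here are hypothetical, not taken from the paper's dataset):

Y = [2.0, 4.0, 6.0]      # hypothetical observed y-values
Yp = [2.5, 3.5, 6.5]     # hypothetical predicted y-values

n = len(Y)
MSE = sum((Y[i] - Yp[i]) ** 2 for i in range(n)) / n
print(MSE)               # (0.25 + 0.25 + 0.25) / 3 = 0.25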

Gradient Descent:
Gradient descent is an optimization method used to find the optimal solution that provides the minimum value of the error function.

In general, the gradient descent method starts by giving an initial value to the required parameter (p). Then, the partial derivative of the error function (E) with respect to the parameter is used to update the value of the parameter. The iterative process continues until the optimal solution, which provides the minimum value of the error function, is reached.

The concept of gradient descent is explained in the following diagram:

[Figure] Fig 6: Explanation of Gradient Descent (the error function E(p) descends from an initial value of p toward the optimal value at the minimum)

The parameter (p) is updated by the following formula:

$$p_{new} = p_{old} - \alpha \frac{\partial E}{\partial p} \tag{2}$$

where (p) is the required parameter, (α) is the learning rate, and ∂E/∂p is the partial derivative of the error function (E) with respect to the parameter (p).

Now, applying that to the error function (MSE): the partial derivative of MSE with respect to the slope (m) is given by the following formula:

$$\frac{\partial \mathrm{MSE}}{\partial m} = -\frac{1}{n} \sum (y - y_p)\,x \tag{3}$$

And the partial derivative of MSE with respect to the intercept (c) is given by the following formula:

$$\frac{\partial \mathrm{MSE}}{\partial c} = -\frac{1}{n} \sum (y - y_p) \tag{4}$$

By using formula (2), the slope (m) is updated by the following formula:

$$m_{new} = m_{old} - \alpha \frac{\partial \mathrm{MSE}}{\partial m} \tag{5}$$

And the intercept (c) is updated by the following formula:

$$c_{new} = c_{old} - \alpha \frac{\partial \mathrm{MSE}}{\partial c} \tag{6}$$

The steps of the gradient descent method are explained in the following algorithm:

Algorithm 1: Gradient Descent Method

    # initialize slope
    m = 0
    # initialize intercept
    c = 0
    # learning rate
    α = 0.0001
    # number of iterations
    Nt = 1000

    for t = 0 to Nt do
        # compute predicted points
        yp = m * x + c
        # compute partial derivative w.r.t. slope
        dm = (-1/n) * Σ (y - yp) * x
        # compute partial derivative w.r.t. intercept
        dc = (-1/n) * Σ (y - yp)
        # update slope
        m = m - α * dm
        # update intercept
        c = c - α * dc
        # compute error function
        MSE = (1/n) * Σ (y - yp)^2
    end for
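A direct Python translation of Algorithm 1 might look as follows. This is a sketch that mirrors the pseudocode; the observed lists X and Y are assumed to be already prepared, as in Section 3:

def gradient_descent(X, Y, alpha=0.0001, Nt=1000):
    # fit y = m*x + c to the observed points (X, Y) by gradient descent
    n = len(X)
    m, c = 0.0, 0.0                 # initialize slope and intercept
    for t in range(Nt):
        # compute predicted points for the current m and c
        Yp = [m * X[i] + c for i in range(n)]
        # partial derivatives of MSE w.r.t. slope and intercept, formulas (3) and (4)
        dm = (-1 / n) * sum((Y[i] - Yp[i]) * X[i] for i in range(n))
        dc = (-1 / n) * sum(Y[i] - Yp[i] for i in range(n))
        # update slope and intercept, formulas (5) and (6)
        m = m - alpha * dm
        c = c - alpha * dc
    return m, c

Keeping the learning rate small, as in the pseudocode, avoids overshooting the minimum; too large a value of α can make the updates diverge instead of converge.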


R-Squared:
R-Squared (R2) is a statistical measure used to evaluate the linear regression model. It is computed by the following formula:

$$R^2 = 1 - \frac{\sum (y - y_p)^2}{\sum (y - \bar{y})^2} \tag{7}$$

where (y) is the observed y-value, (yp) is the predicted y-value, and (ȳ) is the average of the observed y-values.

R2 can take values in the range [0, 1], where (0) indicates that the line does not fit the data points and (1) indicates that the line fits the data points.
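Formula (7) can be computed with a short helper such as the following sketch (the function name is illustrative):

def r_squared(Y, Yp):
    # R2 = 1 - (residual sum of squares / total sum of squares)
    n = len(Y)
    y_mean = sum(Y) / n
    ss_res = sum((Y[i] - Yp[i]) ** 2 for i in range(n))
    ss_tot = sum((y - y_mean) ** 2 for y in Y)
    return 1 - ss_res / ss_tot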

Linear Regression System:
In the linear regression system:

Input: Observed points (X, Y).
Output: Predicted line.
Processing: The observed points are obtained and prepared for processing. First, the slope and intercept are initialized to zero. Then, the predicted points are computed for the current values of slope and intercept. After that, the partial derivatives of the error function are used to update the values of slope and intercept. During that, the error function is computed to make sure that the model is converging well to the optimal solution. Finally, the optimal values of slope and intercept are obtained to form the equation of the line and plot the predicted line.

[Figure] Fig 7: Diagram of Linear Regression System (observed points (X, Y) enter the linear regression system, which outputs the predicted line y = mx + c)

Python:
Python [32] is a general high-level programming language. It is simple, easy to learn, and powerful. It is the programming language most preferred by developers of machine learning applications. Python provides many useful additional libraries such as Numpy [33], Pandas [34], Matplotlib [35], NLTK [36], SciPy [37], and SK Learn [38].

In this research, the standard functions of Python are applied without using any additional library.

3. RESEARCH METHODOLOGY
The basic steps of linear regression are: (1) preparing observed points, (2) initializing slope and intercept, (3) computing predicted points, (4) computing partial derivatives, (5) updating slope and intercept, (6) computing error function, (7) making equation of line, and (8) plotting predicted line.

[Figure] Fig 8: Steps of Linear Regression (1. Preparing Observed Points (X, Y); 2. Initializing Slope and Intercept; 3. Computing Predicted Points (X, Yp); 4. Computing Partial Derivatives; 5. Updating Slope and Intercept; 6. Computing Error Function; 7. Making Equation of Line; 8. Plotting Predicted Line)

[Figure] Fig 9: Flowchart of Linear Regression (the observed data (X, Y) initialize the slope and intercept; gradient descent then repeatedly computes the predicted points yp = m x + c, the partial derivatives ∂MSE/∂m and ∂MSE/∂c, the updated slope and intercept, and the error function; finally, the equation of the line is made and the predicted line is plotted)

The steps of linear regression using least squares and gradient descent are explained in the following section.

1. Preparing Observed Points:
The observed points (X, Y) are obtained from the original source and converted into lists of the following form:

X = [x0, x1, x2, ..., xm]
Y = [y0, y1, y2, ..., ym]
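For example, if the original source were a two-column CSV file (the file name below is hypothetical), the lists could be prepared with the standard library only, in keeping with the paper's no-additional-libraries approach:

import csv

X, Y = [], []
with open("data.csv") as f:      # hypothetical file name
    reader = csv.reader(f)
    next(reader)                 # skip the header row, if present
    for row in reader:
        X.append(float(row[0]))
        Y.append(float(row[1]))
n = len(X)                       # number of observed points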

2. Initializing Slope and Intercept:
The slope (m) and intercept (c) are initialized to zero as shown in the following code:

m = 0
c = 0

3. Computing Predicted Points:
The predicted points (X, Yp) are computed for the current values of slope (m) and intercept (c). It is done by the following code:

# build the list of predicted y-values for the current slope and intercept
Yp = [0.0] * n
for i in range(n):
    Yp[i] = m * X[i] + c

4. Computing Partial Derivatives:
The partial derivative of the error function with respect to the slope (dm) is computed by the following code:

def compute_dm(X, Y, Yp):
    # partial derivative of MSE with respect to the slope, formula (3)
    n = len(X)
    total = 0
    for i in range(n):
        total += (Y[i] - Yp[i]) * X[i]
    return (-1 / n) * total

The partial derivative of the error function with respect to the intercept (dc) is computed by the following code:

def compute_dc(Y, Yp):
    # partial derivative of MSE with respect to the intercept, formula (4)
    n = len(Y)
    total = 0
    for i in range(n):
        total += (Y[i] - Yp[i])
    return (-1 / n) * total

5. Updating Slope and Intercept:
The values of slope (m) and intercept (c) are updated by the following code:

m = m - alpha * dm
c = c - alpha * dc

6. Computing Error Function:
The error function (MSE) is computed by the following code:

def compute_MSE(Y, Yp):
    # mean squared error between observed and predicted points, formula (1)
    n = len(Y)
    total = 0
    for i in range(n):
        total += (Y[i] - Yp[i]) ** 2
    return total / n

7. Making Equation of Line:
The equation of the line is obtained in the following form:

y = m * x + c

8. Plotting Predicted Line:
The predicted line is plotted using the "matplotlib" library. It is done by the following code:

import matplotlib.pyplot as plt
plt.scatter(X, Y)                # observed points
plt.plot(X, Yp, color="red")     # predicted line
plt.show()
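Putting steps 2-8 together, a minimal driver loop might look as follows. This is a sketch assuming the helper functions compute_dm, compute_dc, and compute_MSE defined above, the observed lists X and Y, and n = len(X):

m, c = 0.0, 0.0                  # step 2: initialize slope and intercept
alpha, Nt = 0.0001, 1000         # learning rate and number of iterations

for t in range(Nt):
    Yp = [m * X[i] + c for i in range(n)]   # step 3: predicted points
    dm = compute_dm(X, Y, Yp)               # step 4: partial derivatives
    dc = compute_dc(Y, Yp)
    m = m - alpha * dm                      # step 5: update slope and intercept
    c = c - alpha * dc
    MSE = compute_MSE(Y, Yp)                # step 6: error function

print("y = %.6f x + %.6f" % (m, c))         # step 7: equation of line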
4. RESULTS AND DISCUSSION
The developed program was tested on an experimental dataset from Kaggle [39]. The program performed the basic steps of linear regression using least squares and gradient descent and provided the required results. The program output is explained in the following section.

Observed Points:
The observed points (X, Y) are printed as shown in the following view:

      X        Y
---------------------------------
0   25.128   53.454
1   31.588   50.393
2   32.502   31.707
3   32.669   45.571
4   32.94    67.171
5   33.094   50.72
6   33.645   69.9
7   33.864   52.725
8   34.333   55.723
9   35.568   41.413
10  35.678   52.722
...

Processing Gradient Descent:
The gradient descent method is processed for (1,000) iterations. For each iteration, the slope (m), intercept (c), and error function (MSE) are printed as shown in the following view:

t     m          c          MSE
----------------------------------------
0     0.368535   0.007274   5565.107834
100   1.478861   0.032102   112.648861
200   1.478802   0.035105   112.647057
300   1.478743   0.038107   112.645254
400   1.478684   0.041108   112.643452
500   1.478625   0.044107   112.641652
600   1.478566   0.047106   112.639853
700   1.478507   0.050103   112.638055
800   1.478448   0.053099   112.636259
900   1.47839    0.056095   112.634464

A summary of the results is shown in the following view:

Nt  = 1000
m   = 1.478331327454368
c   = 0.0590585566568512
MSE = 112.63268870038829
R2  = 0.59001

Predicted Points:
The final values of the predicted points (X, Yp) are printed as shown in the following view:

      X        Yp
---------------------------------
0   25.128   37.207
1   31.588   46.757
2   32.502   48.108
3   32.669   48.355
4   32.94    48.756


5   33.094   48.983
6   33.645   49.797
7   33.864   50.122
8   34.333   50.815
9   35.568   52.64
10  35.678   52.803
...

Error Function Plot:
The error function (MSE) is plotted as shown in the following chart:

[Figure] Fig 10: Error Function Plot (MSE versus iteration)

The error function plot shows that the MSE decreases with iterations, which indicates that the linear regression model is converging well to the optimal solution.

Equation of Line:
The equation of the line is formed and printed as shown in the following view:

y = 1.478331 x + 0.059059

Predicted Line:
The predicted line is plotted as shown in the following chart:

[Figure] Fig 11: Linear Regression Model (observed points and the predicted line)

The R2 value is (0.59001), which indicates that the predicted line fits the observed points.

Improving the Results:
To improve the results of the program, the number of iterations is increased 10 times to (10,000) iterations. The results of run (2) are shown in the following view:

Nt  = 10000
m   = 1.4731251028822563
c   = 0.3239430786286303
MSE = 112.47669301199636
R2  = 0.590577

The equation of the line is shown in the following view:

y = 1.473125 x + 0.323943

The R2 value is (0.590577), which indicates that the resulting line fits the data points better than in the previous run.

To improve the results further, the number of iterations is increased 100 times to (100,000) iterations. The results of run (3) are shown in the following view:

Nt  = 100000
m   = 1.4297282427708158
c   = 2.531907100632597
MSE = 111.38251276556622
R2  = 0.59456

The equation of the line is shown in the following view:

y = 1.429728 x + 2.531907

The R2 value is (0.59456), which indicates that the resulting line fits the data points better than in the previous runs.

A comparison of R2 between the three runs of the linear regression model is shown in the following table:

Table 1: Comparison of R2 between the Three Runs

          Linear Regression Model
          Run 1      Run 2       Run 3
R2        0.59001    0.590577    0.59456

In summary, the program output clearly shows that the program has successfully performed the basic steps of linear regression using least squares and gradient descent and provided the required results.

5. CONCLUSION
Machine learning is playing a major role in the development of computer systems. Linear regression is an important application of machine learning. It helps to find the line that best fits the data points. Linear regression is used to model the relationship between the independent variable (x) and the dependent variable (y) using a linear polynomial with slope (m) and intercept (c). Least squares is used to minimize the error between the observed and predicted points. Gradient descent is used to find the optimal solution that provides the minimum value of the error function.

In this research, the author developed a program to perform linear regression using least squares and gradient descent in Python. The developed program performed the basic steps of linear regression: preparing observed points, initializing slope and intercept, computing predicted points, computing partial derivatives, updating slope and intercept, computing error function, making equation of line, and plotting predicted line.

The program was tested on an experimental dataset from Kaggle and provided the required results: computed slope and intercept, computed error function, predicted points, equation of line, and predicted line.


In future work, more research is needed to improve and develop the current methods of linear regression using least squares and gradient descent. In addition, these methods should be investigated further across different fields, domains, and datasets.

6. REFERENCES
[1] Sammut, C., & Webb, G. I. (2011). "Encyclopedia of Machine Learning". Springer Science & Business Media.
[2] Jung, A. (2022). "Machine Learning: The Basics". Singapore: Springer.
[3] Kubat, M. (2021). "An Introduction to Machine Learning". Cham, Switzerland: Springer International Publishing.
[4] Dey, A. (2016). "Machine Learning Algorithms: A Review". International Journal of Computer Science and Information Technologies, 7(3), 1174-1179.
[5] Jordan, M. I., & Mitchell, T. M. (2015). "Machine Learning: Trends, Perspectives, and Prospects". Science, 349(6245), 255-260.
[6] Forsyth, D. (2019). "Applied Machine Learning". Cham: Springer International Publishing.
[7] Chopra, D., & Khurana, R. (2023). "Introduction to Machine Learning with Python". Bentham Science Publishers.
[8] Sarker, I. H. (2021). "Machine Learning: Algorithms, Real-world Applications and Research Directions". SN Computer Science, 2(3), 160.
[9] Das, S., Dey, A., Pal, A., & Roy, N. (2015). "Applications of Artificial Intelligence in Machine Learning: Review and Prospect". International Journal of Computer Applications, 115(9), 31-41.
[10] Dhall, D., Kaur, R., & Juneja, M. (2020). "Machine Learning: A Review of the Algorithms and its Applications". Proceedings of ICRIC 2019: Recent Innovations in Computing, 47-63.
[11] Raschka, S. (2015). "Python Machine Learning". Packt Publishing.
[12] Müller, A. C., & Guido, S. (2016). "Introduction to Machine Learning with Python: A Guide for Data Scientists". O'Reilly Media.
[13] Swamynathan, M. (2019). "Mastering Machine Learning with Python in Six Steps: A Practical Implementation Guide to Predictive Data Analytics using Python". Apress.
[14] Brandt, S. (2014). "Data Analysis: Statistical and Computational Methods for Scientists and Engineers". Springer.
[15] VanderPlas, J. (2017). "Python Data Science Handbook: Essential Tools for Working with Data". O'Reilly Media.
[16] Atkinson, K. (1989). "An Introduction to Numerical Analysis". John Wiley & Sons.
[17] Chapra, S. C. (2010). "Numerical Methods for Engineers". McGraw-Hill.
[18] Gray, J. B. (2002). "Introduction to Linear Regression Analysis". Technometrics, 44, 191-192.
[19] Groß, J. (2003). "Linear Regression". Springer Science & Business Media.
[20] Olive, D. J. (2017). "Linear Regression". Berlin: Springer International Publishing.
[21] Yan, X., & Su, X. (2009). "Linear Regression Analysis: Theory and Computing". World Scientific.
[22] Su, X., Yan, X., & Tsai, C. L. (2012). "Linear Regression". Wiley Interdisciplinary Reviews: Computational Statistics, 4(3), 275-294.
[23] Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). "Introduction to Linear Regression Analysis". Wiley Series in Probability and Statistics: John Wiley & Sons.
[24] Kutner, M. H., Nachtsheim, C., & Neter, J. (2004). "Applied Linear Regression Models". McGraw-Hill/Irwin Series: Operations and Decision Sciences.
[25] Leemis, L. M. (1991). "Applied Linear Regression Models". Journal of Quality Technology, 23, 76-77.
[26] Seber, G. A., & Lee, A. J. (2003). "Linear Regression Analysis". John Wiley & Sons.
[27] Neter, J., Wasserman, W., & Kutner, M. H. (1983). "Applied Linear Regression Models". Irwin.
[28] Weisberg, S. (2005). "Applied Linear Regression". John Wiley & Sons.
[29] Malik, M. B. (2005). "Applied Linear Regression". Technometrics, 47, 371-372.
[30] Maulud, D., & Abdulazeez, A. M. (2020). "A Review on Linear Regression Comprehensive in Machine Learning". Journal of Applied Science and Technology Trends, 1(4), 140-147.
[31] Stigler, S. M. (1981). "Gauss and the Invention of Least Squares". The Annals of Statistics, 9(3), 465-474.
[32] Python: https://www.python.org
[33] Numpy: https://www.numpy.org
[34] Pandas: https://pandas.pydata.org
[35] Matplotlib: https://matplotlib.org
[36] NLTK: https://www.nltk.org
[37] SciPy: https://scipy.org
[38] SK Learn: https://scikit-learn.org
[39] Kaggle: https://www.kaggle.com

