LinearRegression_FoundationalMathofAI_S24

The document provides an overview of linear regression, a statistical technique for modeling relationships between dependent and independent variables, emphasizing its simplicity and applicability across various fields. It introduces simple linear regression, which models the relationship between a single independent variable and a dependent variable, detailing the linear equation and the objective of minimizing errors. The document also presents formulas for calculating the line of best fit, including the y-intercept and slope.

Uploaded by

Tej Grover
Linear Regression

Yashil Sukurdeep
June 28, 2024

Contents
1 Linear Regression: Fundamentals
2 Simple Linear Regression

1 Linear Regression: Fundamentals
Linear regression is a statistical technique used to model and analyze the
relationship between a dependent variable and one or more independent variables.
By fitting a linear equation to the observed data, linear regression lets us
predict the value of the dependent variable from the values of the independent
variables. The method is widely used in fields such as economics, biology,
engineering, and the social sciences because of its simplicity and
interpretability. The primary objective of linear regression is to determine the
line of best fit: the line that minimizes the discrepancy (typically the sum of
squared differences) between the observed values and the predicted values. This
allows researchers and analysts to understand the strength and direction of the
relationship between variables, make predictions, and uncover trends. Linear
regression is fundamental to both statistical analysis and machine learning, and
serves as a foundation for more complex modeling techniques.

2 Simple Linear Regression


Simple linear regression is a method used to model the relationship between a
single independent variable $x$ and a dependent variable $y$, where both $x$ and
$y$ take on a continuous range of values. Let's assume that we have a dataset
$(x_1, y_1), \dots, (x_n, y_n)$, where $x_i \in \mathbb{R}$ and
$y_i \in \mathbb{R}$ for all $i = 1, \dots, n$, such as the one in Figure 1.
For instance, the $x_i$'s could represent the number of hours you spend on your
final project, and the $y_i$'s could be your score on the final project.

Figure 1: Scatter plot showing a positive linear trend between two variables.

It seems reasonable to suspect that there is a positive linear relationship between
the dependent variable $y$ and the independent variable $x$, which we can model
through the following linear model:

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad \text{for } i = 1, \dots, n$$
where:
• $y_i$ is the dependent variable for the $i$-th observation,
• $x_i$ is the independent variable for the $i$-th observation,
• $\beta_0$ is the y-intercept of the regression line,
• $\beta_1$ is the slope of the regression line,
• $\epsilon_i$ is the error term for the $i$-th observation, which we will assume to be a
random variable that follows a standard normal distribution, $\epsilon_i \sim \mathcal{N}(0, 1)$.
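
To make the model concrete, here is a minimal Python sketch that generates synthetic data from it. The function name `simulate_linear_data`, the parameter values (intercept 50, slope 4), and the grid of hours are illustrative assumptions, not values from these notes; the noise is drawn from a standard normal distribution, matching the assumption on the error term.

```python
import random


def simulate_linear_data(beta0, beta1, xs, seed=0):
    """Return y_i = beta0 + beta1 * x_i + eps_i, with eps_i drawn from N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in xs]


# Hypothetical inputs: hours spent on the project, from 0 to 10 in steps of 0.5.
xs = [0.5 * k for k in range(21)]
# Hypothetical "true" parameters, chosen only for illustration.
ys = simulate_linear_data(beta0=50.0, beta1=4.0, xs=xs)
```

Each simulated $y_i$ sits near the line $50 + 4 x_i$ but not exactly on it; the gap is the error term $\epsilon_i$.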
The objective in simple linear regression is to find the line of best fit

$$y = \hat{\beta}_0 + \hat{\beta}_1 x \tag{1}$$

whose y-intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$ are calculated such that they minimize the
sum of the squared errors between the observed values $y_i$ and the predicted
values $\hat{y}_i \doteq \hat{\beta}_0 + \hat{\beta}_1 x_i$:

$$\hat{\beta}_0, \hat{\beta}_1 = \operatorname*{argmin}_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \tag{2}$$
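
The quantity being minimized can be evaluated directly for any candidate line. The sketch below does so in Python; the function name `sum_squared_errors` and the four data points are made up for illustration. It compares two candidate lines to show that a line closer to the data gives a smaller sum of squared errors.

```python
def sum_squared_errors(beta0, beta1, xs, ys):
    """Sum over i of (y_i - beta0 - beta1 * x_i)^2 for a candidate line."""
    return sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(xs, ys))


# Small made-up dataset lying close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]

sse_good = sum_squared_errors(0.0, 2.0, xs, ys)  # candidate line y = 2x
sse_bad = sum_squared_errors(1.0, 1.0, xs, ys)   # candidate line y = 1 + x
```

Here `sse_good` is about 0.02 while `sse_bad` is about 13.82, so the first candidate fits these points far better. The line of best fit is the candidate with the smallest such value over all possible intercepts and slopes.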

Figure 2: Line of best fit (in red) for the data from Figure 1.

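Using some calculus and algebra, we can find explicit formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$:

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{3}$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \tag{4}$$

In the above, $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$ denote the sample averages of
the independent and dependent variables respectively.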
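
The closed-form expressions for the intercept and slope translate almost line by line into code. The following is a minimal sketch in plain Python; the function name `fit_simple_linear_regression` and the sample data are illustrative assumptions. The slope is computed first because the intercept formula depends on it.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta0_hat, beta1_hat) from the closed-form least-squares formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Sample averages of the independent and dependent variables.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("slope is undefined when all x values are equal")

    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


# Made-up data lying exactly on y = 1 + 2x, so the fit should recover that line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
beta0_hat, beta1_hat = fit_simple_linear_regression(xs, ys)
```

For this noiseless dataset the estimates come out as an intercept of 1 and a slope of 2 (up to floating-point rounding). The guard on `s_xx` reflects that the slope formula divides by the spread of the $x_i$, which is zero when every observation has the same $x$ value.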
