WINSEM2024-25 CSE3506 ELA CH2024250502181 Reference Material III 21-12-2024 21NEW3

Uploaded by Harish.A.S

Input Data

Below is the sample data representing the observations −

# Values of height
151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Values of weight.
63, 81, 56, 91, 47, 57, 76, 72, 62, 48

lm() Function
This function creates the relationship model between the predictor and the
response variable.

Syntax
The basic syntax for the lm() function in linear regression is −

lm(formula, data)

Following is the description of the parameters used −

 formula is a symbolic description of the relation between x and y, e.g. y ~ x.

 data is the data frame containing the variables used in the formula.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y ~ x)
print(relation)
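As a sketch of what lm() computes under the hood for simple regression: the least-squares slope equals cov(x, y)/var(x) and the intercept is mean(y) minus slope times mean(x). We can check these closed-form estimates against the coefficients reported by lm():

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Closed-form least-squares estimates for simple regression.
slope <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)

# They agree with the coefficients estimated by lm().
relation <- lm(y ~ x)
print(coef(relation))
```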

predict() Function

Syntax
The basic syntax for predict() in linear regression is −

predict(object, newdata)

Following is the description of the parameters used −

 object is the model already created using the lm() function.
 newdata is a data frame containing the new values for the predictor variable.
---------------------------------------------------------------------------
# The predictor vector.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)

# The response vector.
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Apply the lm() function.
relation <- lm(y ~ x)

# Find weight of a person with height 170.
a <- data.frame(x = 170)
result <- predict(relation, a)
print(result)

When we execute the above code, it produces the following result −

1
76.22869
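predict() also accepts several new values in one call; newdata simply needs one row per prediction. The heights 160, 170 and 180 below are illustrative:

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y ~ x)

# Predict weights for three new heights at once.
new_heights <- data.frame(x = c(160, 170, 180))
predictions <- predict(relation, new_heights)
print(predictions)
```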

Visualize the Regression Graphically


# Create the predictor and response variable.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)

# Give the chart file a name.
png(file = "linearregression.png")

# Plot the chart.
plot(y, x, col = "blue", main = "Height & Weight Regression",
     abline(lm(x ~ y)), cex = 1.3, pch = 16,
     xlab = "Weight in Kg", ylab = "Height in cm")

# Save the file.
dev.off()
Note:
model <- lm(y ~ x1 + x2)
summary(model)

 Fit the Model: The lm() function is used to fit a linear regression model.
 Summary: The summary() function is used to view detailed results of the regression
model, including coefficients, standard errors, t-values, p-values, R-squared, etc.
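A minimal sketch of this two-step workflow, using simulated predictors (x1, x2 and the coefficients below are made-up data, not from the course example):

```r
set.seed(42)

# Simulated example data (purely illustrative).
x1 <- runif(50, 0, 10)
x2 <- runif(50, 0, 10)
y  <- 3 + 2 * x1 - 1.5 * x2 + rnorm(50, sd = 0.5)

# Fit the model and inspect the detailed results.
model <- lm(y ~ x1 + x2)
s <- summary(model)
print(s$coefficients)  # estimates, std. errors, t-values, p-values
print(s$r.squared)
```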

Linear regression equation

y = β0 + β1·x + β2·z + ε

where:

 y is the response (also called outcome, or dependent variable)
 x and z are the predictors (also called features, or independent variables)
2. boxplot(model$residuals)

Note:
Normality

To check whether the dependent variable follows a normal distribution, use the hist() function.

3. hist(income.data$happiness)
-------------------------------------------------------------
Linearity
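The income.data set is not included in this material, so a sketch with simulated data shows the idea (the variable and its distribution here are invented for illustration):

```r
set.seed(1)

# Simulated stand-in for income.data$happiness (illustrative only).
happiness <- rnorm(100, mean = 5, sd = 1)

# Histogram to eyeball whether the distribution looks roughly normal.
h <- hist(happiness, main = "Distribution of happiness")
```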
The relationship between the independent and dependent variable must be linear.
We can test this visually with a scatter plot to see if the distribution of data points
could be described with a straight line.

plot(happiness ~ income, data = income.data)
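Again, since income.data is not supplied here, the check can be sketched with simulated data (the linear relationship below is invented for illustration); adding the fitted line makes the visual check easier:

```r
set.seed(7)

# Simulated stand-in for income.data (illustrative only).
income.data <- data.frame(income = runif(100, 1, 7))
income.data$happiness <- 0.7 * income.data$income + rnorm(100, sd = 0.5)

# Scatter plot: the points should fall roughly along a straight line.
plot(happiness ~ income, data = income.data)
abline(lm(happiness ~ income, data = income.data), col = "red")
```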


----------------------------------------------------------------------
Multiple regression

1. Independence of observations

Use the cor() function to test the relationship between your independent variables
and make sure they aren’t too highly correlated.

cor(heart.data$biking, heart.data$smoking)
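The heart.data set is not included in this material, so the check can be sketched with simulated predictors (the variable names and values are invented for illustration):

```r
set.seed(3)

# Simulated stand-in for heart.data (illustrative only).
heart.data <- data.frame(biking  = runif(100, 1, 75),
                         smoking = runif(100, 0.5, 30))

# Correlation between the two candidate predictors.
r <- cor(heart.data$biking, heart.data$smoking)
print(r)

# A common rule of thumb: |r| well below ~0.7 suggests the predictors
# are not too highly correlated to use in the same model.
```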
---------------------------------------------------------------
You can use the following methods to extract regression coefficients from the lm() function in R:

Method 1: Extract Regression Coefficients Only


model$coefficients
---------------------------------------------------------------------------
Method 2: Extract Regression Coefficients with Statistical Values

summary(model)$coefficients

The following example shows how to use these methods in practice.

Example: Extract Regression Coefficients from lm() in R

Suppose we fit the following multiple linear regression model in R:
#create data frame
df <- data.frame(rating = c(67, 75, 79, 85, 90, 96, 97),
                 points = c(8, 12, 16, 15, 22, 28, 24),
                 assists = c(4, 6, 6, 5, 3, 8, 7),
                 rebounds = c(1, 4, 3, 3, 2, 6, 7))

#fit multiple linear regression model
model <- lm(rating ~ points + assists + rebounds, data=df)

We can use the summary() function to view the entire summary of the regression model:
#view model summary
summary(model)

Call:
lm(formula = rating ~ points + assists + rebounds, data = df)

Residuals:
1 2 3 4 5 6 7
-1.5902 -1.7181 0.2413 4.8597 -1.0201 -0.6082 -0.1644

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 66.4355 6.6932 9.926 0.00218 **
points 1.2152 0.2788 4.359 0.02232 *
assists -2.5968 1.6263 -1.597 0.20860
rebounds 2.8202 1.6118 1.750 0.17847
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.193 on 3 degrees of freedom
Multiple R-squared: 0.9589, Adjusted R-squared: 0.9179
F-statistic: 23.35 on 3 and 3 DF, p-value: 0.01396

To view the regression coefficients only, we can use model$coefficients as follows:
#view only regression coefficients of model
model$coefficients

(Intercept)      points     assists    rebounds
  66.435519    1.215203   -2.596789    2.820224

We can use these coefficients to write the following fitted regression equation:

Rating = 66.43551 + 1.21520(points) – 2.59678(assists) + 2.82022(rebounds)
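We can sanity-check this fitted equation by plugging one row of the data into it by hand and comparing the result with the model's own fitted value:

```r
df <- data.frame(rating = c(67, 75, 79, 85, 90, 96, 97),
                 points = c(8, 12, 16, 15, 22, 28, 24),
                 assists = c(4, 6, 6, 5, 3, 8, 7),
                 rebounds = c(1, 4, 3, 3, 2, 6, 7))
model <- lm(rating ~ points + assists + rebounds, data = df)

# Plug the first row into the fitted equation by hand.
b <- coef(model)
by_hand <- b["(Intercept)"] + b["points"] * df$points[1] +
  b["assists"] * df$assists[1] + b["rebounds"] * df$rebounds[1]

# Same value as the model's fitted value for that row.
print(unname(by_hand))
print(unname(fitted(model)[1]))
```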

To view the regression coefficients along with their standard errors, t-statistics, and p-values, we can use summary(model)$coefficients as follows:

#view regression coefficients with standard errors, t-statistics, and p-values
summary(model)$coefficients

              Estimate Std. Error   t value    Pr(>|t|)
(Intercept) 66.435519  6.6931808  9.925852 0.002175313
points 1.215203 0.2787838 4.358942 0.022315418
assists -2.596789 1.6262899 -1.596757 0.208600183
rebounds 2.820224 1.6117911 1.749745 0.178471275

We can also access specific values in this output.

For example, we can use the following code to access the p-value for the points variable:
#view p-value for points variable
summary(model)$coefficients["points", "Pr(>|t|)"]

[1] 0.02231542

Or we could use the following code to access the p-value for each of the regression coefficients:
#view p-value for all variables
summary(model)$coefficients[, "Pr(>|t|)"]

(Intercept)      points     assists    rebounds
0.002175313 0.022315418 0.208600183 0.178471275

The p-values are shown for each regression coefficient in the model.

You can use similar syntax to access any of the values in the
regression output.
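For instance, the same `$` and bracket indexing reaches other standard components of the summary.lm object, such as r.squared, adj.r.squared, and sigma (the residual standard error):

```r
df <- data.frame(rating = c(67, 75, 79, 85, 90, 96, 97),
                 points = c(8, 12, 16, 15, 22, 28, 24),
                 assists = c(4, 6, 6, 5, 3, 8, 7),
                 rebounds = c(1, 4, 3, 3, 2, 6, 7))
model <- lm(rating ~ points + assists + rebounds, data = df)
s <- summary(model)

# R-squared, adjusted R-squared, and residual standard error.
print(s$r.squared)
print(s$adj.r.squared)
print(s$sigma)

# Standard error for a single coefficient.
print(s$coefficients["points", "Std. Error"])
```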
