3.6 Lab: Linear Regression


3.6.1 Importing packages
We import our standard libraries at this top level.
In [1]: import numpy as np
import pandas as pd
from matplotlib.pyplot import subplots

New imports
Throughout this lab we will introduce new functions and libraries. However,
we will import them here to emphasize these are the new code objects in
this lab. Keeping imports near the top of a notebook makes the code more
readable, since scanning the first few lines tells us what libraries are used.
In [2]: import statsmodels.api as sm

We will provide relevant details about the functions below as they are
needed.
Besides importing whole modules, it is also possible to import only a
few items from a given module. This will help keep the namespace clean.
We will use a few specific objects from the statsmodels package which we
import here.
In [3]: from statsmodels.stats.outliers_influence \
import variance_inflation_factor as VIF
from statsmodels.stats.anova import anova_lm

As one of the import statements above is quite a long line, we inserted a
line break \ to ease readability.
We will also use some functions written for the labs in this book in the
ISLP package.

In [4]: from ISLP import load_data
from ISLP.models import (ModelSpec as MS,
                         summarize,
                         poly)

Inspecting Objects and Namespaces


The function dir() provides a list of objects in a namespace.
In [5]: dir()

Out[5]: ['In',
'MS',
'_',
'__',
'___',
'__builtin__',
'__builtins__',
...
'poly',
'quit',
'sm',
'summarize']

This shows you everything that Python can find at the top level. There
are certain objects like __builtins__ that contain references to built-in
functions like print().
Every Python object has its own notion of namespace, also accessible
with dir(). This will include both the attributes of the object as well as
any methods associated with it. For instance, we see 'sum' in the listing
for an array.
In [6]: A = np.array([3, 5, 11])
dir(A)

Out[6]: ...
'strides',
'sum',
'swapaxes',
...

This indicates that the object A.sum exists. In this case it is a method that
can be used to compute the sum of the array A as can be seen by typing
A.sum?.

In [7]: A.sum()

Out[7]: 19

3.6.2 Simple Linear Regression


In this section we will construct model matrices (also called design matri-
ces) using the ModelSpec() transform from ISLP.models.
We will use the Boston housing data set, which is contained in the ISLP
package. The Boston dataset records medv (median house value) for 506
neighborhoods around Boston. We will build a regression model to pre-
dict medv using 13 predictors such as rm (average number of rooms per
house), age (proportion of owner-occupied units built prior to 1940), and
lstat (percent of households with low socioeconomic status). We will use
statsmodels for this task, a Python package that implements several com-
monly used regression methods.
We have included a simple loading function load_data() in the ISLP package:
In [8]: Boston = load_data("Boston")
Boston.columns

Out[8]: Index(['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis',
'rad', 'tax', 'ptratio', 'black', 'lstat', 'medv'],
dtype='object')

Type Boston? to find out more about these data.


We start by using the sm.OLS() function to fit a simple linear regression
model. Our response will be medv and lstat will be the single predictor.
For this model, we can create the model matrix by hand.
In [9]: X = pd.DataFrame({'intercept': np.ones(Boston.shape[0]),
                          'lstat': Boston['lstat']})
X[:4]

Out[9]: intercept lstat


0 1.0 4.98
1 1.0 9.14
2 1.0 4.03
3 1.0 2.94

We extract the response, and fit the model.


In [10]: y = Boston['medv']
model = sm.OLS(y, X)
results = model.fit()

Note that sm.OLS() does not fit the model; it specifies the model, and then
model.fit() does the actual fitting.
Our ISLP function summarize() produces a simple table of the parameter
estimates, their standard errors, t-statistics and p-values. The function
takes a single argument, such as the object results returned here by the
fit method, and returns such a summary.
In [11]: summarize(results)

Out[11]: coef std err t P>|t|


intercept 34.5538 0.563 61.415 0.0
lstat -0.9500 0.039 -24.528 0.0

Before we describe other methods for working with fitted models, we
outline a more useful and general framework for constructing a model
matrix X.

Using Transformations: Fit and Transform


Our model above has a single predictor, and constructing X was straight-
forward. In practice we often fit models with more than one predictor,
typically selected from an array or data frame. We may wish to introduce
transformations to the variables before fitting the model, specify interac-
tions between variables, and expand some particular variables into sets of
variables (e.g. polynomials). The sklearn package has a particular notion
for this type of task: a transform. A transform is an object that is created
with some parameters as arguments. The object has two main methods:
fit() and transform().
We provide a general approach for specifying models and constructing
the model matrix through the transform ModelSpec() in the ISLP library.
ModelSpec() (renamed MS() in the preamble) creates a transform object,
and then a pair of methods transform() and fit() are used to construct a
corresponding model matrix.

We first describe this process for our simple regression model using a
single predictor lstat in the Boston data frame, but will use it repeatedly
in more complex tasks in this and other labs in this book. In our case the
transform is created by the expression design = MS(['lstat']).
The fit() method takes the original array and may do some initial com-
putations on it, as specified in the transform object. For example, it may
compute means and standard deviations for centering and scaling. The
transform() method applies the fitted transformation to the array of data,
and produces the model matrix.
In [12]: design = MS(['lstat'])
design = design.fit(Boston)
X = design.transform(Boston)
X[:4]

Out[12]: intercept lstat


0 1.0 4.98
1 1.0 9.14
2 1.0 4.03
3 1.0 2.94

In this simple case, the fit() method does very little; it simply checks that
the variable 'lstat' specified in design exists in Boston. Then transform()
constructs the model matrix with two columns: an intercept and the vari-
able lstat.
These two operations can be combined with the fit_transform() method.
In [13]: design = MS(['lstat'])
X = design.fit_transform(Boston)
X[:4]

Out[13]: intercept lstat


0 1.0 4.98
1 1.0 9.14
2 1.0 4.03
3 1.0 2.94

Note that, as in the previous code chunk when the two steps were done
separately, the design object is changed as a result of the fit() operation.
The power of this pipeline will become clearer when we fit more complex
models that involve interactions and transformations.
Let’s return to our fitted regression model. The object results has several
methods that can be used for inference. We already presented a function
summarize() for showing the essentials of the fit. For a full and somewhat
exhaustive summary of the fit, we can use the summary() method (output
not shown).
In [14]: results.summary()

The fitted coefficients can also be retrieved as the params attribute of
results.

In [15]: results.params

Out[15]: intercept 34.553841


lstat -0.950049
dtype: float64

The get_prediction() method can be used to obtain predictions, and
produce confidence intervals and prediction intervals for the prediction of
medv for given values of lstat.
We first create a new data frame, in this case containing only the variable
lstat, with the values for this variable at which we wish to make
predictions. We then use the transform() method of design to create the
corresponding model matrix.

In [16]: new_df = pd.DataFrame({'lstat': [5, 10, 15]})
newX = design.transform(new_df)
newX

Out[16]: intercept lstat


0 1.0 5
1 1.0 10
2 1.0 15

Next we compute the predictions at newX, and view them by extracting
the predicted_mean attribute.

In [17]: new_predictions = results.get_prediction(newX);
new_predictions.predicted_mean

Out[17]: array([29.80359411, 25.05334734, 20.30310057])

We can produce confidence intervals for the predicted values.

In [18]: new_predictions.conf_int(alpha=0.05)

Out[18]: array([[29.00741194, 30.59977628],
[24.47413202, 25.63256267],
[19.73158815, 20.87461299]])

Prediction intervals are computed by setting obs=True:

In [19]: new_predictions.conf_int(obs=True, alpha=0.05)

Out[19]: array([[17.56567478, 42.04151344],
[12.82762635, 37.27906833],
[ 8.0777421 , 32.52845905]])

For instance, the 95% confidence interval associated with an lstat value of
10 is (24.47, 25.63), and the 95% prediction interval is (12.82, 37.28). As
expected, the confidence and prediction intervals are centered around the
same point (a predicted value of 25.05 for medv when lstat equals 10), but
the latter are substantially wider.
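As an aside not shown in the original lab, the prediction object returned by get_prediction() in statsmodels also offers a summary_frame() method that collects the point predictions, confidence intervals and prediction intervals in a single data frame; a minimal sketch, assuming the new_predictions object from above:

# Not part of the original lab: gather predictions and both interval types in
# one table; 'mean_ci_*' columns are confidence bounds, 'obs_ci_*' are
# prediction bounds (statsmodels naming).
pred_frame = new_predictions.summary_frame(alpha=0.05)
pred_frame[['mean', 'mean_ci_lower', 'mean_ci_upper',
            'obs_ci_lower', 'obs_ci_upper']]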
Next we will plot medv and lstat using DataFrame.plot.scatter(), and
wish to add the regression line to the resulting plot.

Defining Functions
While there is a function within the ISLP package that adds a line to an
existing plot, we take this opportunity to define our first function to do so.
In [20]: def abline(ax, b, m):
    "Add a line with slope m and intercept b to ax"
    xlim = ax.get_xlim()
    ylim = [m * xlim[0] + b, m * xlim[1] + b]
    ax.plot(xlim, ylim)

A few things are illustrated above. First we see the syntax for defining a
function: def funcname(...). The function has arguments ax, b, m where
ax is an axis object for an existing plot, b is the intercept and m is the slope
of the desired line. Other plotting options can be passed on to ax.plot by
including additional optional arguments as follows:
In [21]: def abline(ax, b, m, *args, **kwargs):
    "Add a line with slope m and intercept b to ax"
    xlim = ax.get_xlim()
    ylim = [m * xlim[0] + b, m * xlim[1] + b]
    ax.plot(xlim, ylim, *args, **kwargs)

The addition of *args allows any number of non-named arguments to
abline, while **kwargs allows any number of named arguments (such as
linewidth=3) to abline. In our function, we pass these arguments verbatim
to ax.plot above. Readers interested in learning more about functions are
referred to the section on defining functions in docs.python.org/tutorial.
Let’s use our new function to add this regression line to a plot of medv
vs. lstat.
In [22]: ax = Boston.plot.scatter('lstat', 'medv')
abline(ax,
       results.params[0],
       results.params[1],
       'r--',
       linewidth=3)

Thus, the final call to ax.plot() is ax.plot(xlim, ylim, 'r--', linewidth=3).
We have used the argument 'r--' to produce a red dashed line, and added
an argument to make it of width 3. There is some evidence for non-linearity
in the relationship between lstat and medv. We will explore this issue later
in this lab.
As mentioned above, there is an existing function to add a line to a plot
— ax.axline() — but knowing how to write such functions empowers us
to create more expressive displays.
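For comparison, a minimal sketch of what the built-in alternative might look like, assuming the same ax and results objects as above (the styling arguments here are illustrative, not from the original lab):

# ax.axline() draws an infinite line through a point with a given slope;
# here the point (0, intercept) and the slope come from the fitted coefficients.
ax.axline((0, results.params[0]), slope=results.params[1],
          color='r', linestyle='--', linewidth=3)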

Next we examine some diagnostic plots, several of which were discussed
in Section 3.3.3. We can find the fitted values and residuals of the fit as
attributes of the results object. Various influence measures describing the
regression model are computed with the get_influence() method. As we
will not use the fig component returned as the first value from subplots(),
we simply capture the second returned value in ax below.
In [23]: ax = subplots(figsize=(8, 8))[1]

ax.scatter(results.fittedvalues, results.resid)
ax.set_xlabel('Fitted value')
ax.set_ylabel('Residual')
ax.axhline(0, c='k', ls='--');

We add a horizontal line at 0 for reference using the ax.axhline() method,
indicating it should be black (c='k') and have a dashed linestyle (ls='--').
On the basis of the residual plot (not shown), there is some evidence
of non-linearity. Leverage statistics can be computed for any number of
predictors using the hat_matrix_diag attribute of the value returned by the
get_influence() method.

In [24]: infl = results.get_influence()
ax = subplots(figsize=(8, 8))[1]
ax.scatter(np.arange(X.shape[0]), infl.hat_matrix_diag)
ax.set_xlabel('Index')
ax.set_ylabel('Leverage')
np.argmax(infl.hat_matrix_diag)

Out[24]: 374

The np.argmax() function identifies the index of the largest element of an
array, optionally computed over an axis of the array. In this case, we
maximized over the entire array to determine which observation has the
largest leverage statistic.
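As a brief illustration beyond the original lab text, one might then look up the leverage value and the corresponding row of the data, assuming the infl and Boston objects defined above:

idx = np.argmax(infl.hat_matrix_diag)   # position of the largest leverage value
infl.hat_matrix_diag[idx]               # the leverage statistic itself
Boston.iloc[idx]                        # the corresponding observation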

3.6.3 Multiple Linear Regression


In order to fit a multiple linear regression model using least squares, we
again use the ModelSpec() transform to construct the required model matrix
and response. The arguments to ModelSpec() can be quite general, but in
this case a list of column names suffices. We consider a fit here with the two
variables lstat and age.
In [25]: X = MS(['lstat', 'age']).fit_transform(Boston)
model1 = sm.OLS(y, X)
results1 = model1.fit()
summarize(results1)

Out[25]: coef std err t P>|t|


intercept 33.2228 0.731 45.458 0.000
lstat -1.0321 0.048 -21.416 0.000
age 0.0345 0.012 2.826 0.005

Notice how we have compacted the first line into a succinct expression
describing the construction of X.
The Boston data set contains 12 variables, and so it would be cumbersome
to have to type all of these in order to perform a regression using all of the
predictors. Instead, we can use the following short-hand:
In [26]: terms = Boston.columns.drop('medv')
terms

Out[26]: Index(['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis',
'rad', 'tax', 'ptratio', 'lstat'],
dtype='object')

We can now fit the model with all the variables in terms using the same
model matrix builder.
In [27]: X = MS(terms).fit_transform(Boston)
model = sm.OLS(y, X)
results = model.fit()
summarize(results)

Out[27]: coef std err t P>|t|


intercept 41.6173 4.936 8.431 0.000
crim -0.1214 0.033 -3.678 0.000
zn 0.0470 0.014 3.384 0.001
indus 0.0135 0.062 0.217 0.829
chas 2.8400 0.870 3.264 0.001
nox -18.7580 3.851 -4.870 0.000
rm 3.6581 0.420 8.705 0.000
age 0.0036 0.013 0.271 0.787
dis -1.4908 0.202 -7.394 0.000
rad 0.2894 0.067 4.325 0.000
tax -0.0127 0.004 -3.337 0.001
ptratio -0.9375 0.132 -7.091 0.000
lstat -0.5520 0.051 -10.897 0.000

What if we would like to perform a regression using all of the variables but
one? For example, in the above regression output, age has a high p-value.
So we may wish to run a regression excluding this predictor. The following
syntax results in a regression using all predictors except age (output not
shown).
In [28]: minus_age = Boston.columns.drop(['medv', 'age'])
Xma = MS(minus_age).fit_transform(Boston)
model1 = sm.OLS(y, Xma)
summarize(model1.fit())

3.6.4 Multivariate Goodness of Fit


We can access the individual components of results by name (dir(results)
shows us what is available). Hence results.rsquared gives us the R², and
np.sqrt(results.scale) gives us the RSE.
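For instance, a short snippet along these lines, using the full-model results object fitted above, would display both quantities:

# R^2 of the fit and the residual standard error (RSE)
print(results.rsquared)
print(np.sqrt(results.scale))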
Variance inflation factors (Section 3.3.3) are sometimes useful to assess
the effect of collinearity in the model matrix of a regression model. We will
compute the VIFs in our multiple regression fit, and use the opportunity
to introduce the idea of list comprehension.
List Comprehension
Often we encounter a sequence of objects which we would like to transform
for some other task. Below, we compute the VIF for each feature in our X
matrix and produce a data frame whose index agrees with the columns of
X. The notion of list comprehension can often make such a task easier.
