1.1 Goals
- Extend our regression model routines to support multiple features
- Extend data structures to support multiple features
- Rewrite prediction, cost and gradient routines to support multiple features
- Utilize NumPy `np.dot` to vectorize their implementations for speed and simplicity
1.2 Tools
In this lab, we will make use of:
- NumPy, a popular library for scientific computing
- Matplotlib, a popular library for plotting data
import copy, math
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('./deeplearning.mplstyle')
np.set_printoptions(precision=2) # reduced display precision on numpy arrays
1.3 Notation
Here is a summary of some of the notation you will encounter, updated for multiple features.
Notation | Description | Python (if applicable) |
---|---|---|
General | | |
$a$ | scalar, non-bold | |
$\mathbf{a}$ | vector, bold | |
$\mathbf{A}$ | matrix, bold capital | |
Regression | | |
$\mathbf{X}$ | training example matrix | `X_train` |
$\mathbf{y}$ | training example targets | `y_train` |
$\mathbf{x}^{(i)}$, $y^{(i)}$ | $i$th training example | `X[i]`, `y[i]` |
$m$ | number of training examples | `m` |
$n$ | number of features in each example | `n` |
$\mathbf{w}$ | parameter: weight | `w` |
$b$ | parameter: bias | `b` |
$f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ | The result of the model evaluation at $\mathbf{x}^{(i)}$ parameterized by $\mathbf{w}, b$: $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b$ | `f_wb` |
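As a minimal sketch of the last row of the notation table, the snippet below evaluates the model for a single example, once with an explicit loop over the features and once with `np.dot`. The values of `w` and `b` are hypothetical and chosen only for illustration; they are not fitted parameters from this lab.

```python
import numpy as np

# Hypothetical values for illustration only (not fitted parameters).
x_i = np.array([1200.0, 3.0, 1.0, 40.0])  # one example with n = 4 features
w = np.array([0.25, 4.0, 10.0, -2.0])     # assumed weight vector, shape (n,)
b = 50.0                                  # assumed scalar bias

# Loop version: accumulate w_j * x_j over all n features, then add the bias.
f_wb_loop = 0.0
for j in range(x_i.shape[0]):
    f_wb_loop += w[j] * x_i[j]
f_wb_loop += b

# Vectorized version: the dot product replaces the loop.
f_wb = np.dot(w, x_i) + b

print(f"loop: {f_wb_loop}, vectorized: {f_wb}")
```

Both versions compute the same quantity; the vectorized form is shorter and lets NumPy handle the iteration internally.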
2 Problem Statement
You will use the motivating example of housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age) shown in the table below. Note that, unlike the earlier labs, size is in sqft rather than 1000 sqft. This causes an issue, which you will solve in the next lab!
Size (sqft) | Number of Bedrooms | Number of floors | Age of Home | Price (1000s dollars) |
---|---|---|---|---|
2104 | 5 | 1 | 45 | 460 |
1416 | 3 | 2 | 40 | 232 |
852 | 2 | 1 | 35 | 178 |
You will build a linear regression model using these values so you can then predict the price for other houses, for example a 40-year-old house with 1200 sqft, 3 bedrooms, and 1 floor.
Please run the following code cell to create your `X_train` and `y_train` variables.
X_train = np.array([[2104, 5, 1,