4. Multivariate Linear Regression
Linear Regression with multiple variables
Multiple features
• Different names
• Number of dependent & independent variables
• Regression and nature of dependent variable
• Nature of the model assuming y=f(x1, x2, ..., xn)
• House Price Prediction Example (Next Slide)
Multiple features (variables).

Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178
…              …                    …                  …                     …
Notation:
n = number of features
m = number of training examples
x^(i) = input (features) of the i-th training example
x_j^(i) = value of feature j in the i-th training example
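As a concrete reading of this notation for the table above, x^(2) = (1416, 3, 2, 40) and x_3^(2) = 2 (number of floors of the second training example). A minimal Octave sketch of the same indexing (the matrix name is illustrative):

% Rows are training examples, columns are features.
data  = [2104 5 1 45; 1416 3 2 40; 1534 3 2 30; 852 2 1 36];
x_2   = data(2, :);     % x^(2): all features of the 2nd training example
x_3_2 = data(2, 3);     % x_3^(2): 3rd feature (number of floors) of that example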
Hypothesis:
Previously (one feature): h_θ(x) = θ_0 + θ_1 x
Now (n features): h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ... + θ_n x_n
For convenience of notation, define x_0 = 1. Then x = [x_0, x_1, ..., x_n]^T, θ = [θ_0, θ_1, ..., θ_n]^T, and the hypothesis becomes h_θ(x) = θ^T x.
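Expressed this way, the hypothesis can be evaluated for all training examples with a single matrix product. A minimal Octave sketch using the table above (variable names are illustrative):

% Vectorized hypothesis h_theta(x) = theta' * x for all m examples at once.
% X is m x (n+1) with a leading column of ones (x_0 = 1); theta is (n+1) x 1.
X = [2104 5 1 45; 1416 3 2 40; 1534 3 2 30; 852 2 1 36];
X = [ones(size(X, 1), 1), X];    % prepend x_0 = 1 to every example
theta = zeros(size(X, 2), 1);    % initial parameters theta_0 ... theta_n
predictions = X * theta;         % m x 1 vector of h_theta(x^(i))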
Gradient descent:
Repeat {
    θ_j := θ_j - α ∂/∂θ_j J(θ_0, ..., θ_n)
} (simultaneously update θ_j for every j = 0, ..., n)

Gradient descent for multiple variables:
Repeat {
    θ_j := θ_j - α (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i)) x_j^(i)
} (simultaneously update θ_j for every j = 0, ..., n)
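A minimal Octave sketch of this update in vectorized form, assuming X is the m x (n+1) design matrix with x_0 = 1, y is the m-vector of targets, and alpha is the learning rate (the function name is illustrative):

% Batch gradient descent for linear regression.
% All theta_j are updated simultaneously from the same gradient.
function theta = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  for iter = 1:num_iters
    grad  = (1 / m) * X' * (X * theta - y);   % (n+1) x 1 gradient of J(theta)
    theta = theta - alpha * grad;             % simultaneous update of every theta_j
  end
end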
Linear Regression with multiple variables
Gradient descent in practice I: Feature Scaling

Feature Scaling
Idea: make sure features are on a similar scale. Get every feature into approximately a -1 ≤ x_j ≤ 1 range.
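One common way to do this is mean normalization: replace each feature x_j (other than x_0) by (x_j - μ_j) / s_j, where μ_j is the feature's mean and s_j its standard deviation (or range). A minimal Octave sketch, applied before the column of ones is added:

% Mean normalization: center each feature at 0 and bring it into a comparable range.
function [X_norm, mu, sigma] = featureNormalize(X)
  mu     = mean(X);              % per-feature means (row vector)
  sigma  = std(X);               % per-feature standard deviations
  X_norm = (X - mu) ./ sigma;    % element-wise, broadcast across rows
end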
Linear Regression with multiple variables
Gradient descent in practice II: Learning rate

Gradient descent: θ_j := θ_j - α ∂/∂θ_j J(θ). How to choose the learning rate α?
Making sure gradient descent is working correctly.
[Figure: J(θ) plotted against the number of iterations (0-400); J(θ) should decrease after every iteration.]
Example automatic convergence test: declare convergence if J(θ) decreases by less than 10^-3 in one iteration.
Making sure gradient descent is working correctly.
[Figure: J(θ) increasing or oscillating with the number of iterations, a sign that gradient descent is not working; use a smaller α.]
To choose α, try a range of values such as ..., 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, ...
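A minimal sketch of monitoring J(θ) per iteration and applying the automatic convergence test, continuing from the gradient-descent sketch above (X, y, theta, alpha, num_iters, and m as defined there):

% Record J(theta) after every iteration so it can be plotted against the
% iteration number; stop once the decrease falls below 10^-3.
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
  theta = theta - (alpha / m) * X' * (X * theta - y);
  J_history(iter) = (1 / (2 * m)) * sum((X * theta - y) .^ 2);   % cost J(theta)
  if iter > 1 && (J_history(iter - 1) - J_history(iter)) < 1e-3
    break;   % automatic convergence test
  end
end
plot(1:iter, J_history(1:iter));
xlabel('No. of iterations'); ylabel('J(theta)');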
Linear Regression with multiple variables
Features and polynomial regression

Polynomial regression
[Figure: price (y) vs. size (x).]
Choice of features
[Figure: price (y) vs. size (x).]
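A minimal sketch of the idea behind both slides: polynomial terms (or other engineered features, such as the square root of size) are simply extra columns of the design matrix, after which ordinary linear regression applies. The powers take very different ranges, so feature scaling matters even more here. This reuses the illustrative featureNormalize function from the feature-scaling sketch:

% Polynomial regression as linear regression on engineered features.
sz = [2104; 1416; 1534; 852];            % original single feature: size
X_poly = [sz, sz .^ 2, sz .^ 3];         % x, x^2, x^3 as three features
[X_poly, mu, sigma] = featureNormalize(X_poly);
X_poly = [ones(size(X_poly, 1), 1), X_poly];   % prepend x_0 = 1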
Linear Regression with multiple variables
Normal equation

Gradient descent finds θ iteratively; the normal equation solves for the optimal θ analytically in one step.
Intuition: In the 1D case (θ ∈ ℝ), J(θ) is a quadratic function of θ; minimize it by setting dJ/dθ = 0 and solving for θ.
For θ ∈ ℝ^(n+1): set ∂J/∂θ_j = 0 (for every j) and solve for θ_0, θ_1, ..., θ_n.
Examples: m = 4 training examples.

x_0   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1     2104           5                    1                   45                    460
1     1416           3                    2                   40                    232
1     1534           3                    2                   30                    315
1     852            2                    1                   36                    178

X is the m × (n+1) design matrix (one row per example, including x_0 = 1); y is the m-vector of prices.
Normal equation: θ = (X^T X)^(-1) X^T y
(The derivation for this equation is shown on the board.)
(X^T X)^(-1) is the inverse of the matrix X^T X.
Octave: pinv(X' * X) * X' * y
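Setting the gradient of the cost J(θ) to zero gives X^T X θ = X^T y, which the formula above solves for θ. A minimal Octave sketch applying it to the four examples in the table:

% Normal equation: closed-form theta, no iterations and no learning rate.
X = [1 2104 5 1 45;
     1 1416 3 2 40;
     1 1534 3 2 30;
     1  852 2 1 36];            % design matrix with x_0 = 1
y = [460; 232; 315; 178];       % prices in $1000
theta = pinv(X' * X) * X' * y;  % theta = (X^T X)^(-1) X^T y

With only four examples and five columns, X^T X is actually singular here, which is one reason to use pinv rather than inv (see the non-invertibility discussion below).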
m training examples, n features.

Gradient Descent:
• Need to choose α.
• Needs many iterations.
• Works well even when n is large.

Normal Equation:
• No need to choose α.
• No need to iterate.
• Need to compute (X^T X)^(-1).
• Slow if n is very large.
Linear Regression with multiple variables
Normal equation and non-invertibility (optional)

Normal equation: θ = (X^T X)^(-1) X^T y
What if X^T X is non-invertible?
• Redundant features (linearly dependent), e.g. x_1 = size in feet², x_2 = size in m².
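A minimal sketch of the redundant-feature case. Octave's pinv computes the pseudo-inverse, so it still returns a usable θ even though X^T X is singular, whereas inv would not:

% x_2 is just a rescaling of x_1 (feet^2 vs. m^2), so X' * X is singular.
sz_ft = [2104; 1416; 1534; 852];
sz_m  = sz_ft / (3.28 ^ 2);        % same information in different units
X = [ones(4, 1), sz_ft, sz_m];
y = [460; 232; 315; 178];
theta = pinv(X' * X) * X' * y;     % pseudo-inverse: still works
% inv(X' * X) would warn that the matrix is singular and give useless values.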
Error Measures
Mean Absolute Percentage Error:  MAPE = (1/d) Σ_{i=1}^{d} |y_i - y_i'| / y_i
Root Mean Square Error:  RMSE = sqrt( (1/d) Σ_{i=1}^{d} (y_i - y_i')² )
Normalized RMSE:  NRMSE = RMSE / (y_max - y_min)
where y_i is the actual value, y_i' the predicted value, and d the number of examples evaluated.
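A minimal Octave sketch computing these measures for a vector of actual values y and predictions y_pred (the function name is illustrative):

% Error measures over d evaluated examples.
function [mape, rmse, nrmse] = errorMeasures(y, y_pred)
  d     = length(y);
  mape  = (1 / d) * sum(abs(y - y_pred) ./ y);      % Mean Absolute Percentage Error
  rmse  = sqrt((1 / d) * sum((y - y_pred) .^ 2));   % Root Mean Square Error
  nrmse = rmse / (max(y) - min(y));                 % RMSE normalized by the range of y
end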
Evaluating Predictor
• Training + test set (holdout set)
• Cross-validation
• Training + validation + test set
• Training + test set, with cross-validation on the training set for model selection
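A minimal sketch of the simplest option, a random holdout split (70/30 is an illustrative ratio), reusing the errorMeasures function sketched above; X and y are the design matrix and targets:

% Holdout evaluation: shuffle the examples, fit on one part, score on the other.
m        = size(X, 1);
idx      = randperm(m);                 % random permutation of example indices
n_train  = round(0.7 * m);
train_id = idx(1:n_train);
test_id  = idx(n_train + 1:end);
Xtr = X(train_id, :);  ytr = y(train_id);
theta  = pinv(Xtr' * Xtr) * Xtr' * ytr;   % fit on the training part only
y_pred = X(test_id, :) * theta;           % predict on the held-out part
[mape, rmse, nrmse] = errorMeasures(y(test_id), y_pred);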
References
➢ Andrew Ng's slides on Multiple Linear Regression from his Machine Learning course on Coursera.
➢ J. Han and M. Kamber, Data Mining: Concepts and Techniques.
Disclaimer
➢ The content of this presentation is not original; it has been prepared from various sources for teaching purposes.