Lecture 9: Linear Regression
Roadmap

1 When Can Machines Learn?
2 Why Can Machines Learn?
    Lecture 8: Noise and Error
    learning can happen with target distribution P(y|x) and low E_in w.r.t. err
3 How Can Machines Learn?
    Lecture 9: Linear Regression
        Linear Regression Problem
        Linear Regression Algorithm
        Generalization Issue
        Linear Regression for Binary Classification
4 How Can Machines Learn Better?
Linear Regression Problem
Credit Limit Problem

customer application example:
    age:               23 years
    gender:            female
    annual salary:     33,000 USD
    year in residence: 1 year
    year in job:       0.5 year
    current debt:      20,000
    credit limit?      5,000

[diagram: the learning setup]
    unknown target function f: X → Y (ideal credit limit formula)
    training examples D: (x_1, y_1), ..., (x_N, y_N) (historical records in bank)
    learning algorithm A
    hypothesis set H (set of candidate formulas)
    final hypothesis g ≈ f (learned formula to be used)

Y = R: regression
Linear Regression Hypothesis
customer features x:
    age:           23 years
    annual salary: 33,000 USD
    year in job:   0.5 year
    current debt:  20,000

For x = (x_0, x_1, x_2, ..., x_d), the features of a customer,
approximate the desired credit limit with a weighted sum:

    y \approx \sum_{i=0}^{d} w_i x_i

linear regression hypothesis: h(x) = w^T x

h(x): like the perceptron, but without the sign
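For concreteness, a minimal numpy sketch of this weighted sum; the feature encoding and the weight values are made up purely for illustration, with x_0 = 1 as the usual constant coordinate:

    import numpy as np

    # feature vector x = (x0, x1, ..., xd): x0 = 1 is the constant coordinate,
    # followed by age, salary (thousands), year in job, debt (thousands)
    x = np.array([1.0, 23.0, 33.0, 0.5, 20.0])

    # hypothetical weight vector w, one weight per coordinate (illustrative only)
    w = np.array([2.0, 0.1, 0.3, 1.0, -0.2])

    # linear regression hypothesis: h(x) = w^T x, a weighted sum of the features
    h_x = w @ x
    print(h_x)  # the approximated credit limit for this customer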
Illustration of Linear Regression
[figure: left, one-dimensional input x = (x) ∈ R with data points fit by a line;
 right, two-dimensional input x = (x_1, x_2) ∈ R^2 with data points fit by a
 hyperplane; the point-to-line/hyperplane distances are the residuals]

linear regression:
find lines/hyperplanes with small residuals
The Error Measure
popular/historical error measure:
squared error err(\hat{y}, y) = (\hat{y} - y)^2

in-sample:

    E_in(w) = \frac{1}{N} \sum_{n=1}^{N} (\underbrace{h(x_n)}_{w^T x_n} - y_n)^2

out-of-sample:

    E_out(w) = \mathop{E}_{(x,y) \sim P} (w^T x - y)^2
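A minimal sketch of the in-sample error on made-up stand-in data (X, y, and w here are arbitrary, not the credit data); note that E_out is an expectation over the unknown P and can only be estimated on fresh data:

    import numpy as np

    # toy stand-ins: X holds N examples with x0 = 1 prepended, y the targets,
    # w an arbitrary (not yet optimized) weight vector
    rng = np.random.default_rng(0)
    N, d = 100, 3
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    y = rng.normal(size=N)
    w = rng.normal(size=d + 1)

    # in-sample squared error: E_in(w) = (1/N) * sum_n (w^T x_n - y_n)^2
    E_in = np.mean((X @ w - y) ** 2)
    print(E_in)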
next: how to minimize E_in(w)?
Fun Time
Consider using the linear regression hypothesis h(x) = w^T x to
predict the credit limit of customers x. Which feature below shall
have a positive weight in a good hypothesis for the task?

1 birth month
2 monthly income
3 current debt
4 number of credit cards owned

Reference Answer: 2

Customers with higher monthly income should naturally be given a
higher credit limit, which is captured by the positive weight on the
monthly income feature.
Linear Regression Algorithm
Matrix Form of E_in(w)

    E_in(w) = \frac{1}{N} \sum_{n=1}^{N} (w^T x_n - y_n)^2
            = \frac{1}{N} \sum_{n=1}^{N} (x_n^T w - y_n)^2

            = \frac{1}{N} \left\| \begin{bmatrix} x_1^T w - y_1 \\ x_2^T w - y_2 \\ \vdots \\ x_N^T w - y_N \end{bmatrix} \right\|^2

            = \frac{1}{N} \left\| \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix} w - \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} \right\|^2

            = \frac{1}{N} \| \underbrace{X}_{N \times (d+1)} \underbrace{w}_{(d+1) \times 1} - \underbrace{y}_{N \times 1} \|^2
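A quick numeric check of this identity on made-up data; the per-example sum and the matrix-norm form should agree to floating-point precision:

    import numpy as np

    rng = np.random.default_rng(1)
    N, d = 50, 4
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    y = rng.normal(size=N)
    w = rng.normal(size=d + 1)

    # summation form: (1/N) * sum_n (x_n^T w - y_n)^2
    E_sum = np.mean([(X[n] @ w - y[n]) ** 2 for n in range(N)])

    # matrix form: (1/N) * ||X w - y||^2
    E_mat = np.linalg.norm(X @ w - y) ** 2 / N

    print(np.isclose(E_sum, E_mat))  # True: the two forms are identical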
    \min_w E_in(w) = \frac{1}{N} \| Xw - y \|^2

E_in(w): continuous, differentiable, convex

necessary condition of the best w: the gradient must vanish,

    \nabla E_in(w) = \begin{bmatrix} \frac{\partial E_in}{\partial w_0}(w) \\ \frac{\partial E_in}{\partial w_1}(w) \\ \vdots \\ \frac{\partial E_in}{\partial w_d}(w) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}

task: find w_LIN such that ∇E_in(w_LIN) = 0
The Gradient ∇E_in(w)

    E_in(w) = \frac{1}{N} \|Xw - y\|^2
            = \frac{1}{N} \big( w^T \underbrace{X^T X}_{A} w - 2 w^T \underbrace{X^T y}_{b} + \underbrace{y^T y}_{c} \big)

one w only:
    E_in(w) = \frac{1}{N} (a w^2 - 2 b w + c)
    \nabla E_in(w) = \frac{1}{N} (2 a w - 2 b)
    simple! :-)

vector w:
    E_in(w) = \frac{1}{N} (w^T A w - 2 w^T b + c)
    \nabla E_in(w) = \frac{1}{N} (2 A w - 2 b)
    similar (derived by definition)

    \nabla E_in(w) = \frac{2}{N} \big( X^T X w - X^T y \big)
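A sanity check of the final gradient formula against central finite differences, again on made-up data; this is only a sketch of the derivation's conclusion, not part of the algorithm:

    import numpy as np

    rng = np.random.default_rng(2)
    N, d = 50, 4
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    y = rng.normal(size=N)
    w = rng.normal(size=d + 1)

    def E_in(w):
        return np.linalg.norm(X @ w - y) ** 2 / N

    # analytic gradient: (2/N) * (X^T X w - X^T y)
    grad = 2.0 / N * (X.T @ X @ w - X.T @ y)

    # numerical gradient by central differences, one coordinate at a time
    eps = 1e-6
    num_grad = np.array([
        (E_in(w + eps * e) - E_in(w - eps * e)) / (2 * eps)
        for e in np.eye(d + 1)
    ])

    print(np.allclose(grad, num_grad))  # True up to finite-difference error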
Optimal Linear Regression Weights

task: find w_LIN such that \frac{2}{N} (X^T X w - X^T y) = \nabla E_in(w) = 0

invertible X^T X:
    easy! unique solution

        w_LIN = \underbrace{(X^T X)^{-1} X^T}_{\text{pseudo-inverse } X^\dagger} \, y

    often the case because N ≫ d+1

singular X^T X:
    many optimal solutions
    one of the solutions

        w_LIN = X^\dagger y

    by defining X† in other ways

practical suggestion: use a well-implemented routine for X†
instead of (X^T X)^{-1} X^T, for numerical stability when X^T X is almost singular
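A sketch of both cases in numpy, where np.linalg.pinv plays the role of the well-implemented X† routine; the duplicated column in the second half makes X^T X singular on purpose:

    import numpy as np

    rng = np.random.default_rng(3)
    N = 100
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, 2))])
    y = rng.normal(size=N)

    # invertible X^T X: the explicit normal-equation solution exists...
    w_explicit = np.linalg.inv(X.T @ X) @ X.T @ y
    # ...but a dedicated pseudo-inverse routine is preferred numerically
    w_pinv = np.linalg.pinv(X) @ y
    print(np.allclose(w_explicit, w_pinv))  # True: X^T X is well-conditioned here

    # singular X^T X: duplicating a column makes X^T X non-invertible,
    # yet pinv still returns one of the many optimal solutions
    X_sing = np.hstack([X, X[:, [1]]])
    w_sing = np.linalg.pinv(X_sing) @ y
    print(np.linalg.norm(X_sing @ w_sing - y) ** 2 / N)  # minimal E_in attained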
Linear Regression Algorithm

1 from D, construct input matrix X and output vector y by

    X = \underbrace{\begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_N^T \end{bmatrix}}_{N \times (d+1)}
    \qquad
    y = \underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}}_{N \times 1}

2 calculate pseudo-inverse \underbrace{X^\dagger}_{(d+1) \times N}

3 return \underbrace{w_{LIN}}_{(d+1) \times 1} = X^\dagger y

simple and efficient with a good X† routine
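Putting the three steps together, a minimal sketch of the algorithm; np.linalg.lstsq is used as one such good routine, since it solves the least-squares problem directly rather than forming X† explicitly (the data below is made up):

    import numpy as np

    def linear_regression(X_raw, y):
        """Return w_LIN for raw inputs X_raw (N x d) and targets y (N,)."""
        N = X_raw.shape[0]
        # step 1: construct X by prepending the constant coordinate x0 = 1
        X = np.hstack([np.ones((N, 1)), X_raw])
        # steps 2-3: w_LIN = X^dagger y, via a stable least-squares routine
        w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)
        return w_lin

    # usage on made-up data
    rng = np.random.default_rng(4)
    X_raw = rng.normal(size=(200, 5))
    y = X_raw @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    w_lin = linear_regression(X_raw, y)
    print(w_lin.shape)  # (d+1,) = (6,)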
Fun Time

After getting w_LIN, we can calculate the predictions \hat{y}_n = w_LIN^T x_n.
If all \hat{y}_n are collected in a vector \hat{y}, similar to how we form y,
what is the matrix formula of \hat{y}?

1 X X^T y
2 X^T X y
3 X X† y
4 X† X y

Reference Answer: 3

Note that \hat{y} = X w_LIN. Then, a simple substitution of w_LIN reveals
the answer.
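A numeric confirmation on made-up data that \hat{y} = X w_LIN and X X† y coincide:

    import numpy as np

    rng = np.random.default_rng(5)
    X = np.hstack([np.ones((50, 1)), rng.normal(size=(50, 3))])
    y = rng.normal(size=50)

    X_dagger = np.linalg.pinv(X)
    w_lin = X_dagger @ y

    y_hat_via_w = X @ w_lin          # y_hat = X w_LIN
    y_hat_via_H = X @ X_dagger @ y   # y_hat = X X^dagger y (choice 3)
    print(np.allclose(y_hat_via_w, y_hat_via_H))  # True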
Linear Regression for Binary Classification
Linear Classification vs. Linear Regression

                  Linear Classification                  Linear Regression
    output        Y = {-1, +1}                           Y = R
    hypothesis    h(x) = sign(w^T x)                     h(x) = w^T x
    error         err(\hat{y}, y) = [[\hat{y} ≠ y]]      err(\hat{y}, y) = (\hat{y} - y)^2
    solving       NP-hard to solve in general            efficient analytic solution

{-1, +1} ⊂ R: linear regression for classification?

1 run LinReg on binary classification data D (efficient)
2 return g(x) = sign(w_LIN^T x)

but explanation of this heuristic?
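Before the explanation, a minimal sketch of the heuristic itself; the binary labels here are synthetic, and lstsq again stands in for the X† routine:

    import numpy as np

    rng = np.random.default_rng(6)
    N, d = 200, 3
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    y = np.sign(X @ rng.normal(size=d + 1))   # made-up labels in {-1, +1}

    # step 1: run linear regression on the binary classification data
    w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)

    # step 2: classify with g(x) = sign(w_LIN^T x)
    g = np.sign(X @ w_lin)
    print(np.mean(g != y))  # in-sample 0/1 error of the heuristic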
Relation of Two Errors

    err_{0/1} = [[ sign(w^T x) ≠ y ]]
    err_{sqr} = (w^T x - y)^2

[figure: err versus w^T x for desired y = +1 (left) and desired y = -1 (right);
 in both panels the squared-error curve lies on or above the 0/1-error step]

    err_{0/1} ≤ err_{sqr}
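A quick numeric check of this inequality over a grid of scores s = w^T x, for both desired labels; a sketch only:

    import numpy as np

    s = np.linspace(-3, 3, 601)                    # scores s = w^T x
    for y in (+1, -1):
        err01 = (np.sign(s) != y).astype(float)    # 0/1 error
        errsqr = (s - y) ** 2                      # squared error
        assert np.all(err01 <= errsqr)             # err_0/1 <= err_sqr everywhere
    print("err_0/1 <= err_sqr holds on the grid")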
Linear Regression for Binary Classification

err_{0/1} ≤ err_{sqr}, so by the VC bound,

    classification E_out(w) ≤ classification E_in(w) + \sqrt{......}    (VC)
                            ≤ regression E_in(w) + \sqrt{......}

(loose) upper bound: use err_{sqr} as \hat{err} to approximate err_{0/1}
trade bound tightness for efficiency

w_LIN: useful baseline classifier,
or as initial PLA/pocket vector (sketched below)
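One way to use w_LIN as a warm start, sketched with a plain PLA-style correction loop on synthetic separable data; the loop details are assumptions for illustration, not a prescribed implementation:

    import numpy as np

    rng = np.random.default_rng(7)
    N, d = 200, 3
    X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
    y = np.sign(X @ rng.normal(size=d + 1))   # separable synthetic labels

    # baseline classifier: w_LIN from linear regression on the +/-1 labels
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    # refine with PLA-style updates, starting from w_LIN instead of zero
    for _ in range(1000):
        wrong = np.nonzero(np.sign(X @ w) != y)[0]
        if len(wrong) == 0:
            break                       # all examples classified correctly
        n = rng.choice(wrong)
        w = w + y[n] * X[n]             # correct one mistake: w <- w + y_n x_n
    print(np.mean(np.sign(X @ w) != y))  # remaining 0/1 error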
Summary
1 When Can Machines Learn?
2 Why Can Machines Learn?
    Lecture 8: Noise and Error
3 How Can Machines Learn?
    Lecture 9: Linear Regression
        Linear Regression Problem: use hyperplanes to approximate real values
        Linear Regression Algorithm: analytic solution with pseudo-inverse
        Generalization Issue: E_out - E_in ≈ 2(d+1)/N on average
        Linear Regression for Binary Classification: 0/1 error ≤ squared error
    next: binary classification, regression, and then?
4 How Can Machines Learn Better?