07. Linear Regression
Linear Regression
Matt Gormley
Lecture 7
Feb. 5, 2020
1
Reminders
• Homework 2: Decision Trees
– Out: Wed, Jan. 22
– Due: Wed, Feb. 05 at 11:59pm
• Homework 3: KNN, Perceptron, Lin.Reg.
– Out: Wed, Feb. 05 (+ 1 day)
– Due: Wed, Feb. 12 at 11:59pm
• Today’s In-Class Poll
– http://p7.mlcourse.org
5
THE PERCEPTRON ALGORITHM
6
Intercept Term
Q: Why do we need an intercept term?

Q: Why do we add / subtract 1.0 to the intercept term during Perceptron training?

A: Two cases
1. Increasing b shifts the decision boundary towards the negative side
2. Decreasing b shifts the decision boundary towards the positive side

[Figure: decision boundaries for b < 0, b = 0, and b > 0]
7
Perceptron Inductive Bias
1. Decision boundary should be linear
2. Most recent mistakes are most important
(and should be corrected)
8
Background: Hyperplanes
Notation Trick: fold the bias b and the weights w into a single vector θ by prepending a constant 1 to x, increasing the dimensionality by one, to get x' (see the sketch below).

Hyperplane (Definition 1):
H = {x : w^T x = b}

Hyperplane (Definition 2):
H = {x' : θ^T x' = 0}, where x' = [1, x_1, …, x_M]^T and θ = [−b, w_1, …, w_M]^T, so that θ^T x' = w^T x − b

Half-spaces:
H+ = {x' : θ^T x' > 0} and H− = {x' : θ^T x' < 0}
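A minimal numpy sketch (names are my own, not from the course) of this notation trick, folding the bias of the activation w · x + b into a single vector θ by prepending a constant 1 to x:

```python
import numpy as np

def add_bias_column(X):
    """Prepend a constant 1 feature to each row of X (N x M -> N x (M+1))."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

# Folding the bias of the activation w.x + b into theta = [b, w], x' = [1, x]:
w, b = np.array([2.0, -1.0]), 0.5
theta = np.concatenate([[b], w])

X = np.array([[1.0, 3.0], [0.0, -2.0]])
assert np.allclose(X @ w + b, add_bias_column(X) @ theta)
```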
(Online) Perceptron Algorithm
Data: Inputs are continuous vectors of length M. Outputs
are discrete.
11
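A minimal sketch of the online perceptron update under the usual assumptions (labels in {−1, +1}, bias already folded into x as above); this is an illustration, not the course's reference pseudocode:

```python
import numpy as np

def online_perceptron(stream):
    """Online perceptron: one pass over a stream of (x, y) pairs, y in {-1, +1}.

    Each x is assumed to already include the constant-1 bias feature.
    """
    theta = None
    for x, y in stream:
        if theta is None:
            theta = np.zeros_like(x, dtype=float)
        y_hat = 1.0 if theta @ x >= 0 else -1.0
        if y_hat != y:          # mistake: add / subtract the features
            theta += y * x
    return theta
```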
(Batch) Perceptron Algorithm
Learning for Perceptron also works if we have a fixed training
dataset, D. We call this the “batch” setting in contrast to the “online”
setting that we’ve discussed so far.
Discussion:
The Batch Perceptron Algorithm can be derived in two ways.
1. By extending the online Perceptron algorithm to the batch
setting (as mentioned above)
2. By applying Stochastic Gradient Descent (SGD) to minimize a
so-called Hinge Loss on a linear separator
13
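A sketch of derivation 2: stochastic (sub)gradient descent on the hinge-style loss max(0, −y (θ · x)) over the fixed dataset D. With step size 1, each nonzero step is exactly the perceptron update, so the batch algorithm reduces to mistake-driven updates cycled over the data. Function names and the epoch count are illustrative:

```python
import numpy as np

def batch_perceptron_sgd(X, y, epochs=10, lr=1.0):
    """SGD on the loss max(0, -y * (theta . x)) over a fixed dataset.

    With lr=1.0 each nonzero subgradient step is exactly the perceptron
    update theta += y * x, taken only on misclassified points.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in np.random.permutation(len(y)):
            if y[i] * (theta @ X[i]) <= 0:   # subgradient is nonzero only on mistakes
                theta += lr * y[i] * X[i]
    return theta
```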
Extensions of Perceptron
• Voted Perceptron
– generalizes better than (standard) perceptron
– memory intensive (keeps around every weight vector seen during
training, so each one can vote)
• Averaged Perceptron
– empirically similar performance to voted perceptron
– can be implemented in a memory efficient way
(running averages are efficient)
• Kernel Perceptron
– Choose a kernel K(x’, x)
– Apply the kernel trick to Perceptron
– Resulting algorithm is still very simple
• Structured Perceptron
– Basic idea can also be applied when y ranges over an exponentially
large set
– Mistake bound does not depend on the size of that set
14
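A sketch of the memory-efficient averaged perceptron mentioned above: rather than storing every intermediate weight vector (as the voted perceptron does), keep a running sum of the weights and predict with their average. Details such as the epoch count are my own choices:

```python
import numpy as np

def averaged_perceptron(X, y, epochs=5):
    """Averaged perceptron: predict with the average of theta over all steps."""
    theta = np.zeros(X.shape[1])
    theta_sum = np.zeros(X.shape[1])
    steps = 0
    for _ in range(epochs):
        for i in range(len(y)):
            if y[i] * (theta @ X[i]) <= 0:   # mistake-driven update
                theta += y[i] * X[i]
            theta_sum += theta               # accumulate after every example
            steps += 1
    return theta_sum / steps                 # averaged weights used for prediction
```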
Perceptron Exercises
Question:
The parameter vector w learned by the
Perceptron algorithm can be written as
a linear combination of the feature
vectors x(1), x(2),…, x(N).
16
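One way to see why the statement is true (when θ is initialized to zero): every update adds y^(i) x^(i) to the parameters, so the learned vector is always a linear combination of the training feature vectors. A small sketch (assuming labels in {−1, +1}) that tracks the combination coefficients α explicitly:

```python
import numpy as np

def perceptron_with_alphas(X, y, epochs=5):
    """Track counts alpha so that theta == sum_i alpha[i] * y[i] * X[i]."""
    N, M = X.shape
    alpha = np.zeros(N)
    theta = np.zeros(M)
    for _ in range(epochs):
        for i in range(N):
            if y[i] * (theta @ X[i]) <= 0:
                theta += y[i] * X[i]
                alpha[i] += 1
    assert np.allclose(theta, (alpha * y) @ X)   # theta is a linear combination of the x(i)
    return theta, alpha
```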
Geometric Margin
Definition: The margin of example x w.r.t. a linear separator w is the distance from x to the plane w · x = 0 (or the negative of that distance if x is on the wrong side).

[Figure: positive and negative examples, with the margins of two example points shown relative to the separator w]
Slide from Nina Balcan
Geometric Margin
Definition: The margin of example x w.r.t. a linear separator w is the distance from x to the plane w · x = 0 (or the negative of that distance if x is on the wrong side).

Definition: The margin γ_w of a set of examples S w.r.t. a linear separator w is the smallest margin over points x ∈ S.

Definition: The margin γ of a set of examples S is the maximum γ_w over all linear separators w.

[Figure: the margin γ on both sides of the separator w]
Slide from Nina Balcan
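A sketch of these definitions for a fixed separator w through the origin (assuming labels y in {−1, +1} mark the correct side, so the value comes out negative on the wrong side):

```python
import numpy as np

def margin_of_example(w, x, y):
    """Signed distance of x to the plane w.x = 0; negative if x is on the wrong side."""
    return y * (w @ x) / np.linalg.norm(w)

def margin_of_dataset(w, X, y):
    """gamma_w: the smallest margin over all examples (x, y) in the set."""
    return min(margin_of_example(w, x, yi) for x, yi in zip(X, y))
```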
Linear Separability
Def: For a binary classification problem, a set of examples S is linearly separable if there exists a linear decision boundary that can separate the points.

[Examples: configurations of + and − points illustrating separable and non-separable cases]
20
Analysis: Perceptron
Perceptron Mistake Bound
Guarantee: If the data has margin γ and all points lie inside a ball of radius R, then Perceptron makes at most (R/γ)² mistakes.

(Normalized margin: multiplying all points by 100, or dividing all points by 100, doesn't change the number of mistakes; the algorithm is invariant to scaling.)

[Figure: separable points inside a ball of radius R with margin γ on each side of the separator]
Slide adapted from Nina Balcan
21
Analysis: Perceptron
Perceptron Mistake Bound
Guarantee: If the data has margin γ and all points lie inside a ball of radius R, then Perceptron makes at most (R/γ)² mistakes.

(Normalized margin: multiplying all points by 100, or dividing all points by 100, doesn't change the number of mistakes; the algorithm is invariant to scaling.)

Def: We say that the (batch) perceptron algorithm has converged if it stops making mistakes on the training data (perfectly classifies the training data).

Main Takeaway: For linearly separable data, if the perceptron algorithm cycles repeatedly through the data, it will converge in a finite # of steps.

Slide adapted from Nina Balcan
22
Analysis: Perceptron
Perceptron Mistake Bound
Theorem 0.1 (Block (1962), Novikoff (1962)).
Given dataset: D = {(x^(i), y^(i))}_{i=1}^N.
Suppose:
1. Finite size inputs: ||x^(i)|| ≤ R
2. Linearly separable data: ∃ θ* s.t. ||θ*|| = 1 and y^(i) (θ* · x^(i)) ≥ γ, ∀i
Then: The number of mistakes made by the Perceptron algorithm on this dataset is
k ≤ (R/γ)²

[Figure: separable points inside a ball of radius R with margin γ]
Figure from Nina Balcan
23
Analysis: Perceptron
Perceptron Mistake Bound

Common Misunderstanding: The radius R is centered at the origin, not at the center of the points.

Theorem 0.1 (Block (1962), Novikoff (1962)).
Given dataset: D = {(x^(i), y^(i))}_{i=1}^N.
Suppose:
1. Finite size inputs: ||x^(i)|| ≤ R
2. Linearly separable data: ∃ θ* s.t. ||θ*|| = 1 and y^(i) (θ* · x^(i)) ≥ γ, ∀i
Then: The number of mistakes made by the Perceptron algorithm on this dataset is
k ≤ (R/γ)²

[Figure: separable points inside a ball of radius R with margin γ]
Figure from Nina Balcan
24
Analysis: Perceptron
Proof of Perceptron Mistake Bound:
We will show that there exist constants A and B such that
A k ≤ ||θ^(k+1)|| ≤ B √k
25
Analysis: Perceptron
Theorem 0.1 (Block (1962), Novikoff (1962)).
Given dataset: D = {(x^(i), y^(i))}_{i=1}^N.
Suppose:
1. Finite size inputs: ||x^(i)|| ≤ R
2. Linearly separable data: ∃ θ* s.t. ||θ*|| = 1 and y^(i) (θ* · x^(i)) ≥ γ, ∀i
Then: k ≤ (R/γ)²
(Cauchy-Schwarz inequality)

[Figure: separable points inside a ball of radius R with margin γ]
28
Analysis: Perceptron
Proof of Perceptron Mistake Bound:
Part 2: for some B, ||θ^(k+1)|| ≤ B √k

||θ^(k+1)||² = ||θ^(k) + y^(i) x^(i)||²
    (by the Perceptron algorithm update)
  = ||θ^(k)||² + (y^(i))² ||x^(i)||² + 2 y^(i) (θ^(k) · x^(i))
  ≤ ||θ^(k)||² + (y^(i))² ||x^(i)||²
    (since the kth mistake means y^(i) (θ^(k) · x^(i)) ≤ 0)
  ≤ ||θ^(k)||² + R²
    (since (y^(i))² ||x^(i)||² = ||x^(i)||² ≤ R² by assumption and (y^(i))² = 1)
⟹ ||θ^(k+1)||² ≤ k R²
    (by induction on k, since ||θ^(1)||² = 0)
⟹ ||θ^(k+1)|| ≤ √k R
29
Analysis: Perceptron
Proof of Perceptron Mistake Bound:
Part 3: Combining the bounds finishes the proof.
kγ ≤ ||θ^(k+1)|| ≤ √k R
⟹ k ≤ (R/γ)²
30

Analysis: Perceptron
(Excerpt shown on the slide, including the linearly inseparable case:)
Combining gives
√k R ≥ ||v_{k+1}|| ≥ v_{k+1} · u ≥ kγ
which implies k ≤ (R/γ)², proving the theorem. □

For the inseparable case, let d_i = max{0, γ − y_i (u · x_i)} and define D = sqrt(Σ_{i=1}^m d_i²). Then the number of mistakes of the online perceptron algorithm on this sequence is bounded by ((R + D)/γ)².
Proof: The case D = 0 follows from Theorem 1, so we can assume that D > 0.
31
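A quick numerical sanity check of the separable-case bound, not from the lecture: generate separable 2-D data, run the cycling perceptron, and compare the mistake count to (R/γ)². The data generator and the minimum-margin threshold are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Separable 2-D data: label by a fixed unit vector u, keep points with margin >= 0.05
u = np.array([0.6, 0.8])                      # ||u|| = 1
X = rng.uniform(-1, 1, size=(500, 2))
y = np.sign(X @ u)
keep = np.abs(X @ u) >= 0.05                  # enforce a minimum margin
X, y = X[keep], y[keep]

R = np.max(np.linalg.norm(X, axis=1))         # radius of a ball containing the data
gamma = np.min(y * (X @ u))                   # margin of the data w.r.t. u

theta, mistakes, converged = np.zeros(2), 0, False
while not converged:                          # cycle until a full pass with no mistakes
    converged = True
    for x_i, y_i in zip(X, y):
        if y_i * (theta @ x_i) <= 0:
            theta += y_i * x_i
            mistakes += 1
            converged = False

print(mistakes, "<=", (R / gamma) ** 2)       # mistakes should not exceed (R/gamma)^2
```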
Perceptron Exercises
Question:
Unlike Decision Trees and K-
Nearest Neighbors, the Perceptron
algorithm does not suffer from
overfitting because it does not
have any hyperparameters that
could be over-tuned on the
training data.
A. True
B. False
C. True and False
32
Summary: Perceptron
• Perceptron is a linear classifier
• Simple learning algorithm: when a mistake is
made, add / subtract the features
• Perceptron will converge if the data are linearly
separable; it will not converge if the data are
linearly inseparable
• For linearly separable and inseparable data, we
can bound the number of mistakes (geometric
argument)
• Extensions support nonlinear separators and
structured prediction
33
Perceptron Learning Objectives
You should be able to…
• Explain the difference between online learning and
batch learning
• Implement the perceptron algorithm for binary
classification [CIML]
• Determine whether the perceptron algorithm will
converge based on properties of the dataset, and
the limitations of the convergence guarantees
• Describe the inductive bias of perceptron and the
limitations of linear models
• Draw the decision boundary of a linear model
• Identify whether a dataset is linearly separable or not
• Defend the use of a bias term in perceptron
34
REGRESSION
39
Regression
Goal:
– Given a training dataset of pairs
(x,y) where
• x is a vector
• y is a scalar
– Learn a function (aka. curve or line)
y’ = h(x) that best fits the training
data
Example Applications:
– Stock price prediction
– Forecasting epidemics
– Speech synthesis
– Generation of images (e.g. Deep
Dream)
– Predicting the number of tourists
on Machu Picchu on a given day
Fig 2. 2013–2014 national forecast, retrospectively, using the final revisions of wILI values, using revised wILI data through epidemiological weeks (A) 47, (B) 51, (C) 1, and (D) 7.
Figure from Brooks et al. (2015), doi:10.1371/journal.pcbi.1004382.g002
41
Regression
Example: Dataset with only one feature x and one scalar output y.
Q: What is the function that best fits these points?
43
k-NN Regression
Example: Dataset with only one feature x and one scalar output y.
k=1 Nearest Neighbor Regression:
• Train: store all (x, y) pairs
• Predict: pick the nearest x in the training data and return its y
44
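A minimal sketch of the k = 1 nearest-neighbor regressor described above (numpy; class and method names are my own):

```python
import numpy as np

class OneNNRegressor:
    """k=1 nearest neighbor regression: memorize the data, predict the y of the closest x."""

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y, dtype=float)
        return self

    def predict(self, X_new):
        X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
        # index of the closest stored x for each query point
        dists = np.linalg.norm(self.X[None, :, :] - X_new[:, None, :], axis=2)
        return self.y[np.argmin(dists, axis=1)]

# Usage on a 1-D toy dataset
reg = OneNNRegressor().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 4.0])
print(reg.predict([[1.4]]))   # -> [1.] (the closest stored x is 1.0)
```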
LINEAR REGRESSION
45
Regression Problems
Chalkboard
– Definition of Regression
– Linear functions
– Residuals
– Notation trick: fold in the intercept
46
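To anticipate the chalkboard items, here is a small sketch (my own, using numpy's least-squares solver rather than any method prescribed by the course) of a linear function with the intercept folded in, its residuals, and a least-squares fit:

```python
import numpy as np

def fold_intercept(X):
    """Prepend a constant 1 so the intercept becomes just another weight."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_linear_regression(X, y):
    """Least-squares fit: minimize ||X' theta - y||^2 over theta."""
    theta, *_ = np.linalg.lstsq(fold_intercept(X), y, rcond=None)
    return theta

def residuals(theta, X, y):
    """Residuals e_i = y_i - h(x_i) for the linear function h(x) = theta . [1, x]."""
    return y - fold_intercept(X) @ theta

# Toy 1-D example
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.1, 3.9, 6.2])
theta = fit_linear_regression(X, y)
print(theta, residuals(theta, X, y))
```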