
Lecture 3

Evaluating a Learning Algorithm

Perceptron Algorithm

Ghada Khoriba
[email protected]
Linear classifiers

[Figure: diagram of a linear classifier, labeled with its inputs and parameters.]
Evaluation criteria

• The quality of predictions from a learned model is often expressed in terms of a loss function. A loss function L(g, a) tells you how much you will be penalized for making a guess g when the answer is actually a.
• There are many possible loss functions. Here are some frequently used examples:
• 0−1 loss applies to predictions drawn from finite domains; it is written out below.
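For a guess g and an answer a drawn from a finite set of labels, the 0−1 loss is

    L(g, a) = 0 if g = a
              1 otherwise

so every mistake costs the same, no matter which wrong label was guessed.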
Evaluation criteria

• Squared loss. Why does gradient descent work with the sum of squared errors? Gradient descent uses the derivative of the function being minimized, and squaring the differences makes the error function differentiable, so we can compute its derivative easily.
• Linear loss.
• Asymmetric loss. Consider a situation in which you are trying to predict whether someone is having a heart attack. It might be much worse to predict "no" when the answer is really "yes" than the other way around.

The formulas below summarize these losses.
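A minimal summary in formula form (the penalty values 10 and 1 in the asymmetric example are illustrative assumptions, not values from the lecture):

    Squared loss:     L(g, a) = (g − a)^2,  with derivative dL/dg = 2(g − a)
    Linear loss:      L(g, a) = |g − a|
    Asymmetric loss:  e.g. L(g, a) = 10 if g = "no" and a = "yes",
                                      1 if g = "yes" and a = "no",
                                      0 if g = a

The smooth derivative of the squared loss is exactly what gradient descent needs to take well-defined steps toward a minimum.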
Evaluation metrics: how good is our model?

[Figure: a scatter plot split by a classifier's boundary; the blue points are considered positives and the red points are considered negatives. The four regions of the plot contain 6, 1, 2, and 5 points.]

Quiz: How many true positives, true negatives, false positives, and false negatives are in the model above? (Think of label pairs such as sick/healthy or spam/not spam.)
Ref: udacity/machine-learning
Accuracy: out of all the data, how many points did you classify correctly?

    Accuracy = correctly classified points / all points

                Predicted Sick    Predicted Healthy
    Sick        1000              200
    Healthy     800               8000

Quiz: what is the accuracy of this model, as a percentage?

    Accuracy = (1000 + 8000) / 10000 = 0.9 = 90%

When will accuracy not work?
Ref: udacity/machine-learning
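As a quick check of this arithmetic, here is a small Python sketch (the variable names are my own) that computes accuracy from the four confusion-matrix counts:

    # Confusion-matrix counts from the slide (sick = positive class).
    tp = 1000   # sick, predicted sick
    fn = 200    # sick, predicted healthy
    fp = 800    # healthy, predicted sick
    tn = 8000   # healthy, predicted healthy

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    print(f"Accuracy: {accuracy:.0%}")  # -> Accuracy: 90%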
When will accuracy not work?

When the classes are imbalanced. Suppose a data set of 200,000 images contains 199,567 cats and 433 dogs.

A model that classifies everything as a cat:

    Accuracy = 199,567 / 200,000 = 0.9978 = 99.8%

A model that classifies everything as a dog:

    Accuracy = 433 / 200,000 = 0.0022 = 0.2%

The first model looks excellent by accuracy alone, yet it never finds a single dog.
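A minimal sketch of why this happens, assuming the 199,567 / 433 split from the slide:

    # Imbalanced labels: 199,567 cats (label 0) and 433 dogs (label 1).
    labels = [0] * 199_567 + [1] * 433

    # A trivial majority-class model that predicts "cat" for every input.
    predictions = [0] * len(labels)

    correct = sum(p == y for p, y in zip(predictions, labels))
    print(f"Accuracy: {correct / len(labels):.2%}")   # -> Accuracy: 99.78%

    # Yet it never finds a single dog:
    dogs_found = sum(p == y == 1 for p, y in zip(predictions, labels))
    print(f"Dogs found: {dogs_found} of 433")         # -> Dogs found: 0 of 433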
False positive, or false negative?

In the medical example, what is worse: a false positive, or a false negative?

                Predicted Sick            Predicted Healthy
    Sick        1000                      200   (false negatives)
    Healthy     800   (false positives)   8000

In the spam-detector example, what is worse: a false positive, or a false negative?

                Sent to Spam              Sent to Inbox
    Spam        1000                      200   (false negatives)
    Not Spam    800   (false positives)   8000
Evaluating a learning algorithm

• How should we evaluate the performance of a classifier h? The best method is to measure test error on data that was not used to train it.
• How should we evaluate the performance of a learning algorithm? This is trickier: there are many potential sources of variability in the test error of a learned hypothesis h:
  • which particular training examples occurred in D_n
  • which particular testing examples occurred in D_n′
  • randomization inside the learning algorithm itself
• Generally, we would like to execute the following process multiple times:
  • train on a new training set
  • evaluate the resulting h on a testing set that does not overlap the training set
Evaluating a learning algorithm

• Doing this multiple times controls for possibly poor choices of training set or unfortunate randomization inside the algorithm itself.
• In many applications, however, data is expensive or difficult to acquire, so we would like to re-use it. Cross-validation lets us do that; a sketch follows below.
• Note that cross-validation neither delivers nor evaluates a single particular hypothesis h; it evaluates the algorithm that produces hypotheses.
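A minimal Python sketch of k-fold cross-validation in the spirit of this slide (train_fn and error_fn are stand-ins for whatever learning algorithm and loss are being evaluated):

    import random

    def cross_validate(data, train_fn, error_fn, k=5, seed=0):
        # data:     list of (x, y) examples
        # train_fn: learning algorithm mapping a training set to a hypothesis h
        # error_fn: average loss of a hypothesis on a data set
        data = data[:]                      # copy before shuffling
        random.Random(seed).shuffle(data)
        fold_size = len(data) // k
        errors = []
        for j in range(k):
            # Fold j is the test set; the other k-1 folds form the training set.
            test = data[j * fold_size:(j + 1) * fold_size]
            train = data[:j * fold_size] + data[(j + 1) * fold_size:]
            h = train_fn(train)
            errors.append(error_fn(h, test))
        return sum(errors) / k              # average test error across the k folds

Each example is used for testing exactly once, and the returned number estimates the algorithm's expected test error, not the error of any single hypothesis h.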
Perceptron

The name comes from perception, the process of experiencing the world through the senses.
Recall: Classifiers

• A linear classifier:

    h(x; θ, θ_0) = sign(θ^T x + θ_0) = +1 if θ^T x + θ_0 > 0
                                       −1 if θ^T x + θ_0 ≤ 0

  [Figure: the hyperplane θ^T x + θ_0 = 0 in the (x_1, x_2) plane, with the positive side θ^T x + θ_0 > 0 and the negative side θ^T x + θ_0 < 0.]

• Hypothesis class H of all linear classifiers.

• 0−1 loss:

    L(g, a) = 0 if g = a
              1 otherwise

• Training error:

    E_n(h) = (1/n) Σ_{i=1..n} L(h(x^(i)), y^(i))

• Example learning algorithm (given hypotheses h^(j)).
Perceptron

The perceptron learning rule: for each observation passed through the network, the fitted value is compared to the actual value; if they do not match, the weights are updated until the error, computed as (actual − fitted), is 0.

    Perceptron(D_n):
        θ = 0, θ_0 = 0                           # initial values
        repeat:
            changed = False
            for i = 1 to n:
                if y^(i) (θ^T x^(i) + θ_0) ≤ 0:  # mistake, or point on the boundary
                    θ = θ + y^(i) x^(i)
                    θ_0 = θ_0 + y^(i)
                    changed = True
        until not changed
        return θ, θ_0

Quiz: when does the update condition y^(i) (θ^T x^(i) + θ_0) ≤ 0 fire?
  A. the point is not on the line and the prediction is wrong (the product is negative)
  B. the point is on the line (the product is 0)
  C. at the initial step, where θ = 0 and θ_0 = 0 (the product is 0)

What does an update do? After the update, the quantity y^(i) (θ^T x^(i) + θ_0) for this point becomes

    y^(i) ((θ + y^(i) x^(i))^T x^(i) + (θ_0 + y^(i)))
        = y^(i) (θ^T x^(i) + θ_0) + (y^(i))^2 ((x^(i))^T x^(i) + 1)
        = y^(i) (θ^T x^(i) + θ_0) + (‖x^(i)‖^2 + 1)     since (y^(i))^2 = 1

so each update strictly increases this quantity, pushing the classifier toward classifying the point correctly.
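A runnable Python sketch of the algorithm above (plain lists, no external libraries; the two toy points at the bottom are my own illustration):

    def perceptron(xs, ys, max_epochs=1000):
        # Returns (theta, theta_0) separating the data, if the data are
        # linearly separable within max_epochs passes.
        theta = [0.0] * len(xs[0])
        theta_0 = 0.0
        for _ in range(max_epochs):
            changed = False
            for x, y in zip(xs, ys):
                margin = y * (sum(t * xi for t, xi in zip(theta, x)) + theta_0)
                if margin <= 0:  # mistake, or point exactly on the boundary
                    theta = [t + y * xi for t, xi in zip(theta, x)]
                    theta_0 += y
                    changed = True
            if not changed:      # a full pass with no mistakes: converged
                break
        return theta, theta_0

    theta, theta_0 = perceptron([[1.0, 2.0], [-1.0, -1.0]], [1, -1])
    print(theta, theta_0)  # -> [1.0, 2.0] 1.0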
Example: Perceptron (multiclass)

• Current weight vectors: θ_1 = (1, 2, −2), θ_2 = (3, −2, −1), θ_3 = (−1, 2, 4)
• Next training data point: f(x) = (1, −0.5, 3), with true label y = 2

Score each class:

    θ_1 · f(x) = 1(1) + 2(−0.5) + (−2)(3) = −6
    θ_2 · f(x) = 3(1) + (−2)(−0.5) + (−1)(3) = 1
    θ_3 · f(x) = (−1)(1) + 2(−0.5) + 4(3) = 10

The highest score belongs to class ŷ = 3, but the true label is y = 2, so the prediction is wrong. The multiclass perceptron update promotes the correct class and demotes the predicted one:

    θ_2 ← θ_2 + f(x):  (3, −2, −1) + (1, −0.5, 3) = (4, −2.5, 2)
    θ_3 ← θ_3 − f(x):  (−1, 2, 4) − (1, −0.5, 3) = (−2, 2.5, 1)

After the update, the scores on this point are θ_2 · f(x) = 11.25 and θ_3 · f(x) = −0.25, so class 2 now wins.
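A short Python check of this arithmetic, using a generic multiclass-perceptron update (the variable names are my own):

    # Weight vectors per class and the training point from the example.
    thetas = {1: [1, 2, -2], 2: [3, -2, -1], 3: [-1, 2, 4]}
    fx, y = [1, -0.5, 3], 2

    def score(theta, fx):
        return sum(t * v for t, v in zip(theta, fx))

    y_hat = max(thetas, key=lambda c: score(thetas[c], fx))  # predicted class: 3
    if y_hat != y:  # wrong guess: promote the true class, demote the guess
        thetas[y] = [t + v for t, v in zip(thetas[y], fx)]
        thetas[y_hat] = [t - v for t, v in zip(thetas[y_hat], fx)]

    print(thetas[2], score(thetas[2], fx))  # -> [4, -2.5, 2] 11.25
    print(thetas[3], score(thetas[3], fx))  # -> [-2, 2.5, 1] -0.25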
Classifier Quality

[Slides 15–19 consist of figures illustrating classifier quality; the plots were not preserved in this text extraction.]
Theorem: Perceptron Performance

[Slide 20: the theorem statement on this slide was not preserved in the text extraction.]
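For reference, the standard perceptron convergence theorem (which I assume is the result this slide presents, since only the title survived extraction) says, in the through-origin case:

If there exists a unit vector θ* and a margin γ > 0 such that y^(i) ((θ*)^T x^(i)) ≥ γ for every training point, and ‖x^(i)‖ ≤ R for every point, then the perceptron algorithm makes at most (R/γ)^2 mistakes, and therefore terminates with a separator.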
