ML Lecture#3

This document provides an overview of single and multilayer perceptrons. It begins by introducing artificial neural networks and the perceptron, a single artificial neuron that can learn linear decision boundaries. The document explains how a perceptron's inputs are weighted and summed and how, if the activation level exceeds a threshold, the perceptron fires an output signal, and it works through an example of calculating a perceptron's activation. It then discusses how perceptrons learn using the perceptron learning algorithm and their limitation of only being able to represent linearly separable problems. Finally, it compares perceptrons to support vector machines, noting how SVMs aim to maximize the margin between classes and can handle non-linearly-separable data using kernels.

Machine Learning

Lecture # 3
Single & Multilayer Perceptron

Artificial Neural Network - Perceptron

A (linear) decision boundary represented by one artificial neuron, called a "Perceptron"
• Low space complexity
• Low time complexity

• Input signals are sent from other neurons
• If sufficient signals accumulate, the neuron fires a signal
• Connection strengths determine how the signals are accumulated
What is ANN?
• The nucleus sums all these weighted input values, which gives us the activation
• For n inputs and n weights, each weight is multiplied by its input and the products are summed

a = x1w1 + x2w2 + x3w3 + ... + xnwn
Perceptron
• input signals ‘x’ and weights ‘w’ are multiplied
• weights correspond to connection strengths
• signals are added up – if they are enough, FIRE!

[Figure: a perceptron. Incoming signals x1, x2, x3 arrive over connections of strengths w1, w2, w3 and are added into the activation level a]

a = Σ(i=1..M) xi·wi

if (a > t) output signal = 1, else output signal = 0
Calculation…

a = Σ(i=1..M) xi·wi

Sum notation is just like a loop from 1 to M: multiply corresponding elements of x and w and add them up to get the activation a.

double[] x = ...
double[] w = ...
if (activation > threshold) FIRE !
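The double[] x / double[] w fragment above can be fleshed out into a complete method. This is a minimal Java sketch; the names (Perceptron, activation, fires) are illustrative assumptions, not from the slides:

// Minimal sketch of the perceptron activation and firing rule (illustrative names).
public class Perceptron {

    // a = x1*w1 + x2*w2 + ... + xM*wM  -- the loop form of the sum notation
    static double activation(double[] x, double[] w) {
        double a = 0.0;
        for (int i = 0; i < x.length; i++) {
            a += x[i] * w[i];       // multiply corresponding elements and add them up
        }
        return a;
    }

    // if (activation > threshold) FIRE!
    static int fires(double[] x, double[] w, double t) {
        return activation(x, w) > t ? 1 : 0;
    }
}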
Perceptron Decision Rule

if Σ(i=1..M) xi·wi > t then output = 1, else output = 0

[Figure: the decision boundary in input space, with output = 0 on one side and output = 1 on the other]
Is this a good decision boundary?

if Σ(i=1..M) xi·wi > t then output = 1, else output = 0

  w1 = 1.0    w2 = 0.2    t = 0.05
  w1 = 2.1    w2 = 0.2    t = 0.05
  w1 = 1.9    w2 = 0.02   t = 0.05
  w1 = -0.8   w2 = 0.03   t = 0.05

Changing the weights/threshold makes the decision boundary move.
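With two inputs the rule above defines the boundary line w1·x1 + w2·x2 = t, i.e. x2 = (t − w1·x1) / w2 when w2 ≠ 0. The short Java sketch below (hypothetical class name; the settings are copied from the slides) prints that line for each of the four settings, which is one way to see the boundary move:

// For a 2-input perceptron the decision boundary is the line w1*x1 + w2*x2 = t,
// i.e. x2 = (t - w1*x1) / w2. Printing it for the four weight settings above
// shows how changing the weights/threshold moves the boundary.
public class BoundaryDemo {
    public static void main(String[] args) {
        double[][] settings = {
            {1.0, 0.2, 0.05}, {2.1, 0.2, 0.05}, {1.9, 0.02, 0.05}, {-0.8, 0.03, 0.05}
        };
        for (double[] s : settings) {
            double w1 = s[0], w2 = s[1], t = s[2];   // assumes w2 != 0
            System.out.printf("w1=%.2f w2=%.2f t=%.2f  ->  boundary: x2 = %.3f %+.3f*x1%n",
                              w1, w2, t, t / w2, -w1 / w2);
        }
    }
}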
x = [ 1.0, 0.5, 2.0 ]
w = [ 0.2, 0.5, 0.5 ]
t = 1.0

a = Σ(i=1..M) xi·wi

Q1. What is the activation, a, of the neuron?
Q2. Does the neuron fire?
Q3. What if we set the threshold at 0.5 and weight #3 to zero?

Q1. What is the activation, a, of the neuron?

a = Σ xi·wi = (1.0 × 0.2) + (0.5 × 0.5) + (2.0 × 0.5) = 1.45

Q2. Does the neuron fire?

if (activation > threshold) output=1 else output=0
1.45 > 1.0, so yes, it fires: output = 1

Q3. What if we set the threshold at 0.5 and weight #3 to zero?

a = Σ xi·wi = (1.0 × 0.2) + (0.5 × 0.5) + (2.0 × 0.0) = 0.45

if (activation > threshold) output=1 else output=0
…. So no, it does not fire: 0.45 < 0.5, output = 0
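The three answers can be checked with a small, self-contained Java program; the class name and the printout format are illustrative assumptions:

// Re-computes Q1-Q3 from the worked example above (illustrative class name).
public class WorkedExample {
    public static void main(String[] args) {
        double[] x = {1.0, 0.5, 2.0};
        double[] w = {0.2, 0.5, 0.5};
        double t  = 1.0;

        // Q1: a = (1.0*0.2) + (0.5*0.5) + (2.0*0.5) = 1.45
        double a = 0.0;
        for (int i = 0; i < x.length; i++) a += x[i] * w[i];
        System.out.printf("Q1: activation a = %.2f%n", a);

        // Q2: 1.45 > 1.0, so the neuron fires
        System.out.printf("Q2: output = %d%n", a > t ? 1 : 0);

        // Q3: threshold 0.5 and weight #3 set to zero -> a = 0.45, does not fire
        w[2] = 0.0;
        t = 0.5;
        a = 0.0;
        for (int i = 0; i < x.length; i++) a += x[i] * w[i];
        System.out.printf("Q3: activation a = %.2f, output = %d%n", a, a > t ? 1 : 0);
    }
}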
We can rearrange the decision rule….

if Σ(i=1..M) xi·wi > t              then output = 1, else output = 0
if Σ(i=1..M) xi·wi − t > 0          then output = 1, else output = 0
if Σ(i=1..M) xi·wi + (−1 · t) > 0   then output = 1, else output = 0
if Σ(i=1..M) xi·wi + (x0 · w0) > 0  then output = 1, else output = 0
if Σ(i=0..M) xi·wi > 0              then output = 1, else output = 0

We now treat the threshold like any other weight: the bias, w0 = t, with a permanent "false input" of x0 = −1.
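A minimal Java sketch of this bias trick, with illustrative names (BiasTrick, withBiasInput): the input vector is extended with a constant x0 = −1, the threshold is stored as w0, and firing reduces to checking whether the extended sum is greater than zero.

// Bias trick: prepend a constant "false input" x0 = -1 and store the threshold as w0.
// Then "sum > t" becomes "extended sum > 0".
public class BiasTrick {

    static double[] withBiasInput(double[] x) {
        double[] ext = new double[x.length + 1];
        ext[0] = -1.0;                       // permanent input of -1
        System.arraycopy(x, 0, ext, 1, x.length);
        return ext;
    }

    static int output(double[] xExt, double[] wExt) {
        double a = 0.0;
        for (int i = 0; i < xExt.length; i++) a += xExt[i] * wExt[i]; // includes -1 * w0
        return a > 0 ? 1 : 0;                // threshold folded into w0
    }

    public static void main(String[] args) {
        // Worked example from the earlier slides: extended sum is 0.45 > 0,
        // agreeing with the original rule's 1.45 > 1.0.
        double[] x = {1.0, 0.5, 2.0};
        double[] wExt = {1.0, 0.2, 0.5, 0.5};            // w0 = t = 1.0
        System.out.println(output(withBiasInput(x), wExt));
    }
}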
Perceptron Learning Algorithm

initialise weights (w)
Repeat until all points are correctly classified
    Repeat for each point
        Calculate margin yi·(w·Xi) for point i
        If margin > 0, point is correctly classified
        Else change the weights to increase the margin, such that
            Δw = η·yi·Xi and wnew = wold + Δw
    end
end
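A minimal runnable Java sketch of the algorithm above, using the bias-trick representation from the previous slide; the toy data set, learning rate, and epoch cap are illustrative assumptions, not part of the slides:

import java.util.Arrays;

// Perceptron learning rule: for a misclassified point (margin <= 0),
// update w_new = w_old + eta * y_i * X_i, and repeat until all points are classified.
public class PerceptronTraining {
    public static void main(String[] args) {
        // Toy linearly separable data with a leading -1 bias input (illustrative).
        double[][] X = { {-1, 0, 0}, {-1, 0, 1}, {-1, 1, 0}, {-1, 1, 1} };
        double[]   y = {       -1,        -1,        -1,         1 };   // labels in {-1, +1}
        double[]   w = new double[X[0].length];   // initialise weights
        double eta = 0.1;                          // learning rate

        boolean allCorrect = false;
        for (int epoch = 0; epoch < 1000 && !allCorrect; epoch++) {    // safety cap
            allCorrect = true;
            for (int i = 0; i < X.length; i++) {
                double dot = 0.0;
                for (int j = 0; j < w.length; j++) dot += w[j] * X[i][j];
                double margin = y[i] * dot;            // y_i * (w . X_i)
                if (margin <= 0) {                     // misclassified: update
                    for (int j = 0; j < w.length; j++) w[j] += eta * y[i] * X[i][j];
                    allCorrect = false;
                }
            }
        }
        System.out.println("weights: " + Arrays.toString(w));
    }
}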

Perceptron convergence theorem:
If the data is linearly separable, then application of the Perceptron learning rule will find a separating decision boundary within a finite number of iterations.
Decision Boundary Using Perceptron
Multiple Outputs
Criticism of the Perceptron
• Minsky and Papert criticised the perceptron (1969)
• Minsky and Papert's criticism was partly right
• We have no a priori reason to think that problems should be linearly separable
• However, it turns out that the world is full of problems that are at least close to linearly separable
Can a Perceptron solve this problem?
Can a Perceptron solve this problem? ….. NO.

Perceptrons only solve LINEARLY SEPARABLE problems.
With a perceptron, the decision boundary is LINEAR.

[Figure: a two-class problem (classes A and B) that no single linear boundary can separate]
Overview of SVM w.r.t. Perceptron
Perceptron
Perceptron VS SVM
• The Perceptron does not try to optimize the separation "distance". As long as it finds a hyperplane that separates the two sets, it is good. The SVM, on the other hand, tries to maximize the margin, i.e. the distance from the separating hyperplane to the closest sample points on either side (the "support vectors"); a small sketch of this margin computation follows this list.
• The SVM typically uses a "kernel function" to project the sample points into a higher-dimensional space to make them linearly separable, while the perceptron assumes the sample points are already linearly separable.
• The SVM requires more parameters than the perceptron:
  – choice of kernel
  – selection of kernel parameters
  – selection of the value of the margin parameter
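To make the "distance" in the first bullet concrete, here is a small hypothetical Java sketch that computes the geometric margin of a given hyperplane (w, b), i.e. the smallest value of yi·(w·xi + b)/||w|| over the samples. An SVM chooses (w, b) to maximize this quantity, while the perceptron stops at any (w, b) for which it is positive. The data and the two hyperplanes are made-up examples:

// Geometric margin of a hyperplane (w, b) on labelled points:
// margin = min_i  y_i * (w . x_i + b) / ||w||.
// SVM maximizes this over (w, b); the perceptron only needs it to be positive.
public class MarginDemo {
    static double geometricMargin(double[][] X, double[] y, double[] w, double b) {
        double norm = 0.0;
        for (double wj : w) norm += wj * wj;
        norm = Math.sqrt(norm);
        double min = Double.POSITIVE_INFINITY;
        for (int i = 0; i < X.length; i++) {
            double dot = b;
            for (int j = 0; j < w.length; j++) dot += w[j] * X[i][j];
            min = Math.min(min, y[i] * dot / norm);
        }
        return min;
    }

    public static void main(String[] args) {
        double[][] X = { {0, 0}, {0, 1}, {2, 2}, {3, 2} };   // illustrative data
        double[]   y = {    -1,     -1,      1,      1  };
        // Two hyperplanes that both separate the classes, but with different margins.
        System.out.println(geometricMargin(X, y, new double[]{1, 1}, -2.5)); // larger margin
        System.out.println(geometricMargin(X, y, new double[]{1, 0}, -1.9)); // smaller margin
    }
}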
SVM and Margins
SVM for Nonlinear Data
Acknowledgements
Material in these slides has been taken from the following resources:
• Introduction to Machine Learning – E. Alpaydin
• Statistical Pattern Recognition: A Review – A.K. Jain et al., PAMI (22), 2000
• Pattern Recognition and Analysis Course – A.K. Jain, MSU
• "Pattern Classification" – Duda et al., John Wiley & Sons
