CSE465: Pattern Recognition and Neural Network
Lecture 2: Perceptron
Section: 3
Faculty: Silvia Ahmed (SvA)
Spring 2025
Today’s Topic
1. Perceptron:
• What is a Perceptron?
• Perceptron vs Neuron
• Geometric Intuition
• How to train a Perceptron?



What is a Perceptron?
• The fundamental building block of artificial neural networks (ANNs).
• It is an algorithm used for supervised machine learning.
• A perceptron is a simple type of artificial neural network developed by Frank Rosenblatt in 1957.
• It is the basic unit of a neural network, taking multiple binary inputs and producing a single binary output.
• It computes a weighted sum of its inputs, applies an activation function, and produces an output.

[Figure: perceptron with inputs x1, x2 weighted by w1, w2, a bias b on a constant input 1, a summation node Σ, and an activation function A]



Different parts of Perceptron

[Figure: perceptron diagram with the parts labeled as below]

• Input features: x1, x2
• Weights: w1, w2
• Bias: b (attached to a constant input of 1)
• Summation function Σ, which works as a dot product:
  z = w1·x1 + w2·x2 + b
• Activation function A, e.g.:
  • Signum function
  • Sigmoid
  • ReLU
  • tanh
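Putting these parts together, a minimal Python sketch of the forward pass, assuming a signum-style step activation (the function names are illustrative, not from the slide):

def step(z):
    # Signum-style step activation: 1 if z >= 0, else 0
    return 1 if z >= 0 else 0

def perceptron(x1, x2, w1, w2, b):
    # Weighted sum (dot product) plus bias, then activation
    z = w1 * x1 + w2 * x2 + b
    return step(z)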



Example use of a Perceptron

IQ (x1)   CGPA (x2)   Job Placement
78        7.8         1
69        5.1         0
…         …           …

1) Training: the main job is to learn the values of the weights and the bias from the training samples, e.g. w1 = 1, w2 = 2, b = 3.

2) Prediction: for a new sample where IQ = 100 and CGPA = 5.1:
z = 100 × 1 + 5.1 × 2 + 3 = 113.2 ≥ 0, so Job Placement = 1.
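Reusing the perceptron sketch defined earlier, this prediction can be reproduced directly (the weights are the example values from the slide):

w1, w2, b = 1, 2, 3                      # example learned parameters
print(perceptron(100, 5.1, w1, w2, b))   # z = 113.2 >= 0, so prints 1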



• Question: what if there are more than 2 features?

IQ (x1)   CGPA (x2)   State (x3)   Job Placement
78        7.8         Dhaka        1
69        5.1         Khulna       0
…         …           …            …

[Figure: perceptron with three inputs x1, x2, x3, weights w1, w2, w3, and bias b]

z = w1·x1 + w2·x2 + w3·x3 + b
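With n features the weighted sum becomes a dot product. A small NumPy sketch (illustrative; a categorical feature such as State would first need a numeric encoding, e.g. one-hot, which the slide leaves implicit):

import numpy as np

def perceptron_n(x, w, b):
    # z = w . x + b for any number of features
    z = np.dot(w, x) + b
    return 1 if z >= 0 else 0

# Hypothetical sample with an already-encoded third feature
print(perceptron_n(np.array([100, 5.1, 0.7]), np.array([1.0, 2.0, 0.5]), 3.0))  # 1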



Perceptron vs Neuron
• Deep learning is inspired by the biological nervous system.

Figure: Perceptron vs Neuron [2]



Interpretation

IQ (x1)   CGPA (x2)   Job Placement
78        7.8         1
69        5.1         0
…         …           …

[Figure: perceptron with w1 = 2, w2 = 4, b = 1]

• Weights depict the strength of each input connection.
• Weights mostly reflect feature importance: here w2 = 4 > w1 = 2, so CGPA influences the output more than IQ.



Geometric Intuition

z = w1·x1 + w2·x2 + b

y = f(z) = 1 if z ≥ 0
           0 if z < 0

Rename w1 => A, w2 => B, b => c and x1 => x, x2 => y. The decision boundary z = 0 then becomes Ax + By + c = 0, the equation of a line: samples with Ax + By + c ≥ 0 fall on one side of the line, samples with Ax + By + c < 0 on the other.

[Figure: IQ vs CGPA plane split by the line Ax + By + c = 0 into a region where Ax + By + c ≥ 0 and a region where Ax + By + c < 0]

• A perceptron is a "line" and its main functionality is to create "regions":
  2D -> line
  3D -> plane
  ≥4D -> hyperplane
• A perceptron is a binary classifier.
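A short sketch of this region view (names illustrative): the sign of Ax + By + c decides which side of the line a point lies on, which is exactly the perceptron's prediction.

def region(A, B, c, x, y):
    # 1 if the point (x, y) lies in the "positive" region of Ax + By + c = 0, else 0
    return 1 if A * x + B * y + c >= 0 else 0

# Using the line 2x + 3y + 5 = 0 that appears later in the lecture:
print(region(2, 3, 5, 4, 5))    # 2*4 + 3*5 + 5 = 28 >= 0 -> 1
print(region(2, 3, 5, -2, -2))  # 2*(-2) + 3*(-2) + 5 = -5 < 0 -> 0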



Logic AND

input 1   input 2   output
1         1         1
1         0         0
0         1         0
0         0         0
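AND is linearly separable, so a single perceptron can realize it. One valid choice of parameters (chosen for illustration, not given on the slide) is w1 = 1, w2 = 1, b = -1.5:

def step(z):
    return 1 if z >= 0 else 0

# AND gate: fires only when both inputs are 1 (1 + 1 - 1.5 = 0.5 >= 0)
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, step(1 * x1 + 1 * x2 - 1.5))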



Logic OR

input 1   input 2   output
1         1         1
1         0         1
0         1         1
0         0         0
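OR is also linearly separable; the same unit works with a less negative bias, e.g. w1 = 1, w2 = 1, b = -0.5 (again an illustrative choice):

def step(z):
    return 1 if z >= 0 else 0

# OR gate: fires when at least one input is 1 (e.g. 1 + 0 - 0.5 = 0.5 >= 0)
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, step(1 * x1 + 1 * x2 - 0.5))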



Logic XOR

input 1   input 2   output
1         1         0
1         0         1
0         1         1
0         0         0
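XOR's positive outputs at (1, 0) and (0, 1) cannot be split from (0, 0) and (1, 1) by one line. A brute-force sketch (illustrative, not from the slide) searches a coarse weight grid and finds no single step unit that reproduces the table:

import itertools

def step(z):
    return 1 if z >= 0 else 0

xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
grid = [x / 2 for x in range(-6, 7)]  # coarse grid from -3 to 3

solutions = [
    (w1, w2, b)
    for w1, w2, b in itertools.product(grid, repeat=3)
    if all(step(w1 * x1 + w2 * x2 + b) == y for (x1, x2), y in xor.items())
]
print(solutions)  # [] - XOR is not linearly separable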



Limitation
• Works only with linear or "sort-of" linear data.

• TensorFlow Playground: [Link]

Settings to try:

Dataset type    Noise   Learning rate   Activation
Gaussian        15-20   0.01            Sigmoid
Exclusive OR    15-20   0.01            Sigmoid



Perceptron Trick

• The main target is to get the decision boundary in the form:

  w0·x0 + w1·x1 + … + wn·xn = Σ (i = 0 to n) wi·xi = 0

  (with x0 = 1, so that w0 plays the role of the bias b)



Steps - 1
• Initialize: A = 1, B = 1, C = 0

• Randomly select one sample



Steps - 2
• Updated coefficients: A = 2, B = 1.5, C = 0.4

• Randomly select one sample



Steps - 3
• Updated coefficients: A = 4, B = 1.5, C = 0.4

• Randomly select one sample



Line Transformation
• Shown in [Link]/calculator
• General form: Ax + By + C = 0

Main equation: 2x + 3y + 5 = 0

Change        Examples                             Effect
Change in C   2x + 3y + 10 = 0,  2x + 3y + 0 = 0   shifts the line parallel to itself
Change in A   4x + 3y + 5 = 0,   x + 3y + 5 = 0    changes the slope (rotates the line)
Change in B   2x + 6y + 5 = 0,   2x + y + 5 = 0    changes the slope (rotates the line)



How much to transform?

Line: 2x + 3y + 5 = 0, i.e. coefficient vector (2, 3, 5).

• Minus operation to bring a wrongly "positive" point to the correct "negative" zone. For the point (4, 5), augmented to (4, 5, 1):
  (2, 3, 5) − (4, 5, 1) = (−2, −2, 4)

• Plus operation to bring a wrongly "negative" point to the correct "positive" zone. For the point (1, 3), augmented to (1, 3, 1):
  (2, 3, 5) + (1, 3, 1) = (3, 6, 6)
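A sketch of these updates (vectors as plain lists; the full addition/subtraction shown here gets scaled down by a learning rate η, as the next slides explain):

# coefficients (A, B, C) of the line 2x + 3y + 5 = 0
w = [2, 3, 5]

# wrongly "positive" point (4, 5), augmented with a constant 1
p = [4, 5, 1]
w_minus = [wi - pi for wi, pi in zip(w, p)]   # [-2, -2, 4]

# wrongly "negative" point (1, 3), augmented with a constant 1
q = [1, 3, 1]
w_plus = [wi + qi for wi, qi in zip(w, q)]    # [3, 6, 6]
print(w_minus, w_plus)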



Live Desmos demonstration
• Line: 2x + 3y + 5 = 0
• Sample points: (5, 2) and (−3, −2)



Learning rate
• The learning rate is a small number that controls how fast or slow a machine learning or deep learning model updates its internal parameters (such as weights) during training.
• "It's like the step size your model takes while learning. Too big, and it may trip and fall. Too small, and it may take forever to learn."
• In the perceptron trick: new coef = old coef ± η × (corresponding input value), i.e. instead of adding or subtracting the whole point, the step is scaled down by the learning rate.
• Why it's important:
  • If the learning rate is too high → the model may skip over the best solution and never settle.
  • If the learning rate is too low → the model will learn very slowly, taking a long time to improve (or getting stuck).



Algorithm
• epoch = 1000, η = 0.01

Case-based form:

for e in range(epoch):
    randomly select a point x_i
    if x_i ∈ N and Σ (j = 0 to 2) w_j·x_ij ≥ 0:
        w_new = w_old − η·x_i
    if x_i ∈ P and Σ (j = 0 to 2) w_j·x_ij < 0:
        w_new = w_old + η·x_i

Unified form (one update covers both cases):

for e in range(epoch):
    randomly select a point x_i
    w_new = w_old + η·(y_i − ŷ_i)·x_i

y_i   ŷ_i   y_i − ŷ_i
1     1     0
0     0     0
1     0     1
0     1     -1
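A runnable sketch of the unified update rule (NumPy; the dataset and names are illustrative, with x0 = 1 folded into each sample so that w0 acts as the bias):

import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data: columns are (x0=1, x1, x2); labels are 0/1.
X = np.array([[1, 2.0, 3.0], [1, 3.0, 4.0], [1, -1.0, -2.0], [1, -2.0, -1.0]])
y = np.array([1, 1, 0, 0])

w = np.zeros(3)            # (w0=bias, w1, w2)
eta, epochs = 0.01, 1000

for _ in range(epochs):
    i = rng.integers(len(X))               # randomly select a point
    y_hat = 1 if np.dot(w, X[i]) >= 0 else 0
    w = w + eta * (y[i] - y_hat) * X[i]    # unified perceptron update

print(w)  # learned (bias, w1, w2)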



Problem with Perceptron Trick
• Which decision boundary is better? The trick gives no way to compare candidates.
• It does not quantify the result with a numeric score.
• Convergence is not guaranteed, particularly when the data is not linearly separable.



Loss Function
• An error function (also called a loss function) measures how far off a
machine learning or deep learning model's predictions are from the actual
target values.
• It gives the model a numeric value that reflects its performance—lower
values mean better predictions.
• The error function guides the learning process by telling the optimizer
how to adjust the model’s parameters (like weights in a neural network)
during training.
• The loss is a function of the model's parameters: L = f(w1, w2, b)



Perceptron Loss Function
• Candidate 1: number of misclassified points.
• Candidate 2: (perpendicular) distance of the misclassified points from the boundary.
• In practice: take each misclassified point and substitute it into the line equation. The magnitude of the result is proportional to the perpendicular distance, but the mathematics is much simpler than calculating the actual distance.

Example with the line 2x + 3y + 5 = 0:
• Point (4, 5): 2(4) + 3(5) + 5 = 28
• Point (−2, −2): |2(−2) + 3(−2) + 5| = |−5| = 5
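A sketch of this loss in code (assumptions: signed labels y ∈ {−1, +1} to make the misclassification test compact, and names that are illustrative rather than from the slide). It reproduces the |−5| = 5 term for the misclassified point:

import numpy as np

def perceptron_loss(w, b, X, y_signed):
    # Sum of |w.x + b| over misclassified points only.
    # A point is misclassified when the sign of (w.x + b) disagrees with its label.
    z = X @ w + b
    misclassified = y_signed * z < 0
    return np.abs(z[misclassified]).sum()

# Line 2x + 3y + 5 = 0; (-2, -2) labeled +1 is misclassified (z = -5)
X = np.array([[4.0, 5.0], [-2.0, -2.0]])
y_signed = np.array([1, 1])
print(perceptron_loss(np.array([2.0, 3.0]), 5.0, X, y_signed))  # 5.0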



More Loss Functions
• If the activation function is sigmoid:
  • Loss: binary cross-entropy (used in logistic regression)
  • So when the activation function is sigmoid, the perceptron is basically logistic regression.
• Multi-class classification:
  • Activation: softmax
  • Loss: categorical cross-entropy
• Regression:
  • Activation: linear (no activation)
  • Loss: MSE (mean squared error)
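For reference, the binary cross-entropy for a sigmoid output ŷ ∈ (0, 1) and target y ∈ {0, 1} (standard formula, not shown on the slide):

L = −[ y·log(ŷ) + (1 − y)·log(1 − ŷ) ]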



Reference and further reading
1. Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
2. Pramoditha, Rukshan. "The Concept of Artificial Neurons (Perceptrons) in Neural Networks." Medium, Towards Data Science, 29 Dec. 2021, [Link]/the-concept-of-artificial-neurons-perceptrons-in-neural-networks-fab22249cbfc. Accessed 21 Jan. 2025.

