Unit 2

The document discusses McCulloch-Pitts neurons, which were the first mathematical model of biological neurons. It describes the McCulloch-Pitts neuron model as having inputs that can be excitatory or inhibitory, and firing if the sum of active inputs exceeds a threshold. It then provides examples of how different logic functions like AND, OR, and NOT can be represented using McCulloch-Pitts neurons. Finally, it introduces perceptrons as a more advanced neural network model built upon McCulloch-Pitts neurons.

Uploaded by

Manish Sontakke

UNIT-2

McCulloch-Pitts Neuron — Mankind’s First Mathematical Model of a Biological Neuron

The very first step towards the perceptron we use today was taken in 1943 by McCulloch and Pitts, who mimicked the functionality of a biological neuron.

The M-P neuron is a simplified computational model of the biological neuron. It has two parts:

g: aggregates the inputs

f: takes a decision based on this aggregation

The inputs can be excitatory or inhibitory.

Inhibitory: if an inhibitory input is on, the output is always 0 — the neuron can never fire.

Excitatory: an excitatory input will not cause the neuron to fire on its own, but it combines with the other inputs, and together they can cause the neuron to fire.

g counts the number of inputs which are on (i.e. equal to 1). If this sum is ≥ the threshold θ, the neuron fires. θ is called the thresholding parameter, and this is called thresholding logic.

M-P Neuron: A Concise Representation

This representation denotes that, for the Boolean inputs x_1, x_2, and x_3, if g(x), i.e. the sum of the inputs, is ≥ θ, the neuron will fire; otherwise, it won’t.
Sr.No  Name  Facts

2  AND Function: an AND function neuron would only fire when ALL the inputs are ON, i.e. g(x) ≥ 3 here (for three inputs).

3  OR Function: this is self-explanatory — an OR function neuron would fire if ANY of the inputs is ON, i.e. g(x) ≥ 1 here.

4  A Function With An Inhibitory Input: this might look like a tricky one, but it’s really not. Here we have an inhibitory input, x_2, so whenever x_2 is 1 the output will be 0. Keeping that in mind, we know that x_1 AND !x_2 would output 1 only when x_1 is 1 and x_2 is 0, so the threshold parameter should be 1.

Let’s verify that: g(x), i.e. x_1 + x_2, would be ≥ 1 in only 3 cases:

Case 1: x_1 is 1 and x_2 is 0
Case 2: x_1 is 1 and x_2 is 1
Case 3: x_1 is 0 and x_2 is 1

x_1 AND !x_2 outputs 1 only for Case 1, and in both Case 2 and Case 3 the output will be 0 because x_2 is 1 in both of them, thanks to the inhibition — so our thresholding parameter holds good for the given function.

5  NOR Function: for a NOR neuron to fire, we want ALL the inputs to be 0, so the thresholding parameter should also be 0, and we take all inputs as inhibitory.

6  NOT Function: for a NOT neuron, input 1 outputs 0 and input 0 outputs 1. So we take the input as an inhibitory input and set the thresholding parameter to 0. It works!

Can any Boolean function be represented using the M-P neuron? Before you answer that, let’s understand what the M-P neuron is doing geometrically.
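The gate constructions above can be sketched in code. This is a minimal illustration of the thresholding logic, not part of the original notes: the `mp_neuron` helper and the gate definitions are illustrative, and the AND gate here uses two inputs with θ = 2 (the table’s example uses three inputs with θ = 3).

```python
# A minimal McCulloch-Pitts neuron. An active inhibitory input forces
# the output to 0; otherwise the neuron fires when the count of active
# excitatory inputs reaches the threshold theta.

def mp_neuron(inputs, theta, inhibitory=()):
    """inputs: tuple of 0/1 values; inhibitory: indices of inhibitory inputs."""
    if any(inputs[i] == 1 for i in inhibitory):
        return 0
    excitatory_sum = sum(x for i, x in enumerate(inputs) if i not in inhibitory)
    return 1 if excitatory_sum >= theta else 0

# Gates from the table (two-input versions):
AND    = lambda x1, x2: mp_neuron((x1, x2), theta=2)                    # all inputs ON
OR     = lambda x1, x2: mp_neuron((x1, x2), theta=1)                    # any input ON
ANDNOT = lambda x1, x2: mp_neuron((x1, x2), theta=1, inhibitory=(1,))   # x1 AND !x2
NOR    = lambda x1, x2: mp_neuron((x1, x2), theta=0, inhibitory=(0, 1)) # all inhibitory
NOT    = lambda x: mp_neuron((x,), theta=0, inhibitory=(0,))            # inhibitory input
```

For example, `ANDNOT(1, 1)` returns 0 because the inhibitory input x_2 is on, exactly as in the case analysis above.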

Limitations Of M-P Neuron


What about non-boolean (say, real) inputs?
Do we always need to hand code the threshold?
Are all inputs equal? What if we want to assign more importance to some inputs?
What about functions which are not linearly separable? Say XOR function.

Perceptron
The Perceptron is a building block of an Artificial Neural Network. Frank Rosenblatt invented the Perceptron for performing certain calculations to detect capabilities or business intelligence in input data.

How does Perceptron work?


In Machine Learning, the Perceptron is considered a single-layer neural network that consists of four main parameters: input values (input nodes), weights and bias, net sum, and an activation function. The perceptron model begins by multiplying all input values with their weights, then adds these values together to create the weighted sum. This weighted sum is then applied to the activation function 'f' to obtain the desired output. This activation function is also known as the step function.
The step function, or activation function, plays a vital role in ensuring that the output is mapped between the required values, (0, 1) or (-1, 1). It is important to note that the weight of an input is indicative of the strength of a node. Similarly, an input's bias value gives the ability to shift the activation function curve up or down.

Perceptron model works in two important steps as follows:

Step-1: In the first step first, multiply all input values with corresponding weight values and
then add them to determine the weighted sum. Mathematically, we can calculate the
weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + … + xn*wn

Add a special term called bias 'b' to this weighted sum to improve the model's performance.

∑wi*xi + b

Step-2: In the second step, an activation function is applied with the above-mentioned
weighted sum, which gives us output either in binary form or a continuous value as follows:

Y = f(∑wi*xi + b)
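The two steps above can be written directly as code. This is a minimal sketch; the weights, bias, and inputs are illustrative values, not from the notes.

```python
# Step 1: weighted sum of inputs plus bias.
def weighted_sum(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Step 2: the step activation function f maps the sum to a binary output.
def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(weighted_sum(x, w, b))

# With illustrative weights [0.6, 0.6] and bias -1.0, the perceptron
# computes a two-input AND: only x = [1, 1] pushes the sum to >= 0.
y = perceptron([1, 0], w=[0.6, 0.6], b=-1.0)
```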

Types of Perceptron Models

Based on the layers, Perceptron models are divided into two types. These are as follows:

Single-layer Perceptron Model

Multi-layer Perceptron model

Single Layer Perceptron Model:


This is one of the simplest types of Artificial Neural Networks (ANN). A single-layered perceptron model consists of a feed-forward network and also includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to classify linearly separable objects with binary outcomes.

[Figure: single-layer perceptron — inputs X1, X2, …, Xn with weights W1, W2, …, Wn feed into a summation ∑ together with the bias, followed by the activation function Ψ(.), which produces the output o/p.]

Steps to be followed:

Summation = sum(weight_i * X_i) + Bias

Prediction = 1.0 if Summation >= 0 else 0.0

W = W + Learning_rate * (Expected - Predicted) * X

where:

Learning_rate = 0.01 (a value you must configure)

(Expected - Predicted) is the error for the training example

X = input
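The three steps above can be sketched as a small training loop. This is a minimal sketch: training on the AND truth table and the zero initialization are illustrative choices; the learning rate 0.01 and the update rule are as stated in the steps, with the bias updated by the same rule (its input is fixed at 1).

```python
# Prediction = 1.0 if Summation >= 0 else 0.0
def predict(x, w, b):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 if s >= 0 else 0.0

# W = W + Learning_rate * (Expected - Predicted) * X, applied per example.
def train(data, lr=0.01, epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, expected in data:
            error = expected - predict(x, w, b)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Illustrative linearly separable task: the AND truth table.
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(and_data)
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop finds weights that classify all four rows correctly.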

Multi-Layered Perceptron Model:


Like a single-layer perceptron model, a multi-layer perceptron model also has the same
model structure but has a greater number of hidden layers.

The multi-layer perceptron model is trained with the backpropagation algorithm, which executes in two stages as follows:

Forward Stage: activations propagate from the input layer through the hidden layers and terminate at the output layer.

Backward Stage: weight and bias values are modified as per the model's requirement. In this stage, the error between the actual and the desired output is propagated backward, starting at the output layer and ending at the input layer.

Hence, a multi-layered perceptron model can be considered as multiple artificial neural network layers in which the activation function does not remain linear, unlike in a single-layer perceptron model. Instead, the activation function can be sigmoid, TanH, ReLU, etc.

A multi-layer perceptron model has greater processing power and can process linear and
non-linear patterns. Further, it can also implement logic gates such as AND, OR, XOR, NAND,
NOT, XNOR, NOR.

Advantages of Multi-Layer Perceptron:


A multi-layered perceptron model can be used to solve complex non-linear problems.

It works well with both small and large input data.

It helps us to obtain quick predictions after the training.

It helps to obtain the same accuracy ratio with large as well as small data.

Disadvantages of Multi-Layer Perceptron:


In Multi-layer perceptron, computations are difficult and time-consuming.

In a multi-layer Perceptron, it is difficult to predict how much each independent variable affects the dependent variable.

The model functioning depends on the quality of the training.

The perceptron model has the following characteristics.


Perceptron is a machine learning algorithm for supervised learning of binary classifiers.

In Perceptron, the weight coefficient is automatically learned.

Initially, weights are multiplied with the input features, and a decision is made whether the neuron fires or not.

The activation function applies a step rule to check whether the weighted sum is greater than zero.

The linear decision boundary is drawn, enabling the distinction between the two linearly
separable classes +1 and -1.

If the added sum of all input values exceeds the threshold value, the neuron produces an output signal; otherwise, no output is produced.

Limitations of Perceptron Model


A perceptron model has limitations as follows:
The output of a perceptron can only be a binary number (0 or 1) due to the hard limit
transfer function.

Perceptron can only be used to classify the linearly separable sets of input vectors. If input
vectors are non-linear, it is not easy to classify them properly.

Multi-layer Perceptron in TensorFlow


The Multi-Layer perceptron defines the most complex architecture of artificial neural networks. It is substantially formed from multiple layers of perceptrons. TensorFlow is a very popular deep learning framework released by Google, and it can be used to build such a neural network. If we want to understand what a multi-layer perceptron really is, it helps to develop one from scratch using NumPy.

The pictorial representation of multi-layer perceptron learning is as shown below.

MLP networks are used in a supervised learning format. A typical learning algorithm for MLP networks is the backpropagation algorithm.

A multilayer perceptron (MLP) is a feed-forward artificial neural network that generates a set of outputs from a set of inputs. An MLP is characterized by several layers of nodes connected as a directed graph between the input and output layers. The MLP uses backpropagation for training the network. The MLP is a deep learning method.
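As suggested above, a multi-layer perceptron forward pass can be developed from scratch with NumPy. This is a minimal sketch under stated assumptions: the layer sizes (2 inputs → 3 hidden units → 1 output), the sigmoid activation, and the random initialization are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weight matrices and biases for a 2 -> 3 -> 1 network.
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

def forward(x):
    h = sigmoid(W1 @ x + b1)      # hidden layer activations
    return sigmoid(W2 @ h + b2)   # output layer activation

y = forward(np.array([1.0, 0.0]))   # a value in (0, 1)
```

Training these weights would add the backward stage described above (backpropagation); the forward pass alone already shows the layered, non-linear structure of the MLP.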

Representation Power of a Network of a Perceptron


For decisions, we will assume True = +1 and False = -1.

We consider two inputs and 4 perceptrons in the hidden layer.

[Figure: a network with 2 inputs, 4 hidden perceptrons h1, h2, h3, h4, and one output perceptron with weights w1, w2, w3, w4. A red line shows a weight value of -1, whereas a blue line represents +1.]

 The bias of each hidden perceptron is -2.

 A hidden perceptron will fire only if the weighted sum of its inputs is >= 2.

 The network has an input layer, a hidden layer, and an output layer.

 With the ±1 weights chosen as in the figure, each of the 4 perceptrons h1, h2, h3, h4 in the hidden layer fires for exactly one of the four input combinations.

 Layer-1 weights connect the inputs to the hidden layer; layer-2 weights w1, w2, w3, w4 connect the hidden layer to the output perceptron.

Claim: this network can be used to implement any Boolean function
(linearly separable or not) by adjusting the layer-2 weights.

XOR Function:

Let w0 be the bias of the output neuron, and h1, h2, h3, h4 the outputs of the four hidden perceptrons.

x1  x2  XOR  h1  h2  h3  h4
0   0   0    1   0   0   0
0   1   1    0   1   0   0
1   0   1    0   0   1   0
1   1   0    0   0   0   1

Since exactly one hidden perceptron fires per row, the output neuron's condition w0 + ∑ wi·hi >= 0 gives:

w0 + w1 <= 0
w0 + w2 >= 0
w0 + w3 >= 0
w0 + w4 <= 0

If we take w0 = 0, then we can adjust the weights w1, w2, w3, w4 (for example w1 = -1, w2 = 1, w3 = 1, w4 = -1) so that the network implements the XOR function.
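This two-layer construction can be sketched in code. It follows the setup above (inputs encoded as True = +1, False = -1; hidden biases of -2, so a hidden unit fires when its weighted sum is >= 2; ±1 hidden weights); the particular output weights (-1, 1, 1, -1) with w0 = 0 are one illustrative choice satisfying the inequalities.

```python
def fires(s, threshold=2):
    """A perceptron with bias -threshold: fires when the sum reaches it."""
    return 1 if s >= threshold else 0

# Each hidden unit's +/-1 weights make it fire for exactly one pattern.
HIDDEN_WEIGHTS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # detect 00, 01, 10, 11
OUTPUT_WEIGHTS = [-1, 1, 1, -1]                        # w1..w4, with w0 = 0

def xor(b1, b2):
    # Encode Boolean inputs as +1 (True) / -1 (False).
    x1, x2 = (1 if b1 else -1), (1 if b2 else -1)
    h = [fires(w1 * x1 + w2 * x2) for w1, w2 in HIDDEN_WEIGHTS]
    s = sum(w * hi for w, hi in zip(OUTPUT_WEIGHTS, h))   # plus w0 = 0
    return 1 if s >= 0 else 0
```

Running all four input combinations reproduces the XOR truth table, even though XOR is not linearly separable — exactly the claim about the representation power of this network.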

Sigmoid Neuron:

The building block of the deep neural networks is called the sigmoid neuron. Sigmoid
neurons are similar to perceptron, but they are slightly modified such that the output from
the sigmoid neuron is much smoother than the step functional output from perceptron.

The perceptron model takes several real-valued inputs and gives a single binary output. From its mathematical representation, we might say that the thresholding logic used by the perceptron is very harsh.

e.g. deciding whether we will like or dislike a movie based on only one input, the critics' rating.

If we set the threshold at 0.5, what would be the decision for a movie with a critics' rating of 0.51 (like) versus 0.49 (dislike)?

This logic is harsh: it seems harsh that we would like a movie with rating 0.51 but not one with rating 0.49.

Introducing sigmoid neurons where the output function is much smoother than the step
function. In the sigmoid neuron, a small change in the input only causes a small change in
the output as opposed to the stepped output. There are many functions with the
characteristic of an “S” shaped curve known as sigmoid functions. The most commonly used
function is the logistic function.

y = 1 / (1 + e^-(w^T x + b))

where w^T x = ∑ wi·xi.

We no longer see a sharp transition at the threshold b. The output from the sigmoid neuron is not 0 or 1. Instead, it is a real value between 0 and 1 which can be interpreted as a probability.
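The logistic function above can be applied directly to the movie example. This is a minimal sketch: the weight w = 1 and bias b = -0.5 (placing the transition at rating 0.5) are illustrative assumptions.

```python
import math

# Logistic sigmoid neuron: y = 1 / (1 + e^-(w*x + b)).
def sigmoid_neuron(x, w=1.0, b=-0.5):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Ratings just below and just above the 0.5 threshold now produce
# nearby outputs instead of a hard 0/1 jump.
low, high = sigmoid_neuron(0.49), sigmoid_neuron(0.51)
```

Both outputs sit close to 0.5, showing the smooth transition: a small change in the input causes only a small change in the output.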

Gradient descent.
Gradient Descent is defined as one of the most commonly used iterative optimization
algorithms of deep learning to train the models. It helps in finding the local minimum of a
function.

The main objective of using a gradient descent algorithm is to minimize the cost function
using iteration.To achieve this goal, it performs two steps iteratively:

 Calculates the first-order derivative of the function to compute the gradient, or slope, of that function.
 Moves in the direction opposite to the gradient, i.e. away from the direction in which the slope increases from the current point, by alpha times the gradient, where alpha is defined as the learning rate. It is a tuning parameter in the optimization process which helps to decide the length of the steps.
 Vector of parameters, randomly initialized: θ = [w, b]
 Change in w and b: Δθ = [Δw, Δb]
 Start with a random guess; then make changes to w and b so that we land in a better situation, i.e. the error is less.
 Add a small change, which is also a vector: θ_new = θ + Δθ, moving in the direction of Δθ.
 Let us be a bit conservative and move only by a small amount η: θ_new = θ + η·Δθ
 What should Δθ be? Δθ should be in the direction opposite to the gradient; from the Taylor series, the best choice is Δθ = u = -∇L(θ).
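The update steps above can be sketched for the simplest case: a squared-error loss for y = w·x + b fit to a single point. The data point, learning rate, and iteration count here are illustrative assumptions, not from the notes.

```python
x, y = 2.0, 5.0          # single training point
w, b = 0.0, 0.0          # initial guess (zeros for clarity)
eta = 0.05               # learning rate

for _ in range(500):
    y_hat = w * x + b
    # Gradient of L = (y_hat - y)^2 with respect to w and b.
    grad_w = 2 * (y_hat - y) * x
    grad_b = 2 * (y_hat - y)
    # Move opposite to the gradient: theta_new = theta - eta * grad.
    w, b = w - eta * grad_w, b - eta * grad_b

# After the loop, w*x + b closely approximates y.
```

Each iteration shrinks the error by a constant factor here, so the parameters converge to a point where the loss, and hence the gradient, is essentially zero — the point of convergence discussed below.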
What is Cost-function?

The cost function is defined as the measurement of the difference, or error, between actual values and expected values at the current position, expressed in the form of a single real number.

How does Gradient Descent work?

Before starting the working principle of gradient descent, we should know some basic
concepts to find out the slope of a line from linear regression. The equation for simple linear
regression is given as:Y=MX+C

The starting point (shown in the figure above) is just an arbitrary point used to evaluate performance. At this starting point, we derive the first derivative, or slope, and use a tangent line to measure the steepness of this slope. This slope then informs the updates to the parameters (weights and bias).

The slope is steeper at the starting (arbitrary) point, but as new parameters are generated, the steepness gradually reduces until it approaches the lowest point, which is called the point of convergence.

The main objective of gradient descent is to minimize the cost function or the error
between expected and actual. To minimize the cost function, two data points are required:

 Direction & Learning Rate

Learning Rate:
It is defined as the step size taken to reach the minimum or lowest point. This is typically a
small value that is evaluated and updated based on the behavior of the cost function. If the
learning rate is high, it results in larger steps but also risks overshooting the minimum. At the same time, a low learning rate gives small step sizes, which compromises overall efficiency but offers the advantage of more precision.
We assume there is only one point to fit (x , y)
Prof.U.A.S.Gani
(Subject Teacher)
