
Introduction to Neural Networks Basics

The document provides an introduction to neural networks, detailing the structure and function of biological and artificial neurons, including perceptrons and their learning algorithms. It discusses key concepts such as activation functions, weight and bias significance, and the gradient descent optimization method. Additionally, it addresses the limitations of single-layer perceptrons and the necessity of multi-layer networks for solving non-linearly separable problems.

Uploaded by Senthilselvi A

Introduction to Neural Network

[Link]
PROFESSOR
CSE - AIML
SRM IST, Ramapuram

[Link] Prof/CSE - AIML 1
Unit-1 Introduction to Neural Network
Biological neuron, Motivation from biological neuron, McCulloch Pitts Neuron,
Perceptron, Perceptron learning Algorithm, Representation power of a network of
perceptrons, Activation functions - Sigmoid, tanh, ReLU, leaky ReLU, Sigmoid neuron,
Gradient descent learning Algorithm, Representation power of multilayer Network of
Sigmoid Neurons, Representation power of function: Complex functions in real world
examples, Feedforward Neural Networks, Learning parameters, output and loss
functions of FFN Networks, Backpropagation learning Algorithm, Applying chain rule
across a neural network, Computing partial derivatives with respect to a weight



Biological Neuron
⦿Neurons are the basic functional units of the nervous system. They generate
electrical signals called action potentials, which allow them to quickly
transmit information over long distances. Almost all neurons have three basic
functions essential for the normal functioning of all the cells in the body.
⦿These are to:
1. Receive signals (or information) from outside.
2. Process the incoming signals and determine whether or not the
information should be passed along.
3. Communicate signals to target cells which might be other neurons or
muscles or glands.

Biological Neuron

Main parts of biological neuron
⦿ Dendrite
Dendrites are responsible for receiving incoming signals from outside.
Incoming signals can be either excitatory, which means they tend to make the
neuron fire (generate an electrical impulse), or inhibitory, which means they tend to
keep the neuron from firing.
⦿ Soma
Soma is the cell body, responsible for processing the input signals and deciding whether
the neuron should fire an output signal.
⦿ Axon
Axon is responsible for carrying the processed signals from the neuron to the relevant cells.
⦿ Synapse
Synapse is the connection between an axon and the dendrites of other neurons.



Artificial neuron

• An artificial neuron, also known as a perceptron, is the basic unit of a
neural network. In simple terms, it is a mathematical function based on
a model of biological neurons.
• It can also be seen as a simple logic gate with binary outputs.

Main Functions of Artificial neuron

• Takes inputs from the input layer
• Weighs them separately and sums them up
• Passes this sum through a nonlinear function to produce the output.
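The three functions above can be sketched in a few lines of Python; the step activation and the example weights and bias below are illustrative assumptions, not values from the slides:

```python
def step(z):
    """Step activation: the neuron fires (1) when the sum is non-negative."""
    return 1 if z >= 0 else 0

def neuron(inputs, weights, bias):
    # Weigh each input separately and sum them up, then add the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Pass this sum through a nonlinear function to produce the output.
    return step(z)
```

With weights [0.6, 0.6] and bias -0.5, the neuron fires whenever at least one input is 1.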

Biological Neuron Vs
Artificial Neuron

McCulloch-Pitts
Neuron Model
Binary neuron model (1943):
o Takes binary inputs (0 or 1).
o Applies weighted sum and threshold.
o Output is 1 if sum ≥ threshold, else 0.
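A minimal sketch of the 1943 unit, keeping the later comparison slide in mind (inputs are binary and unweighted, compared against a hard threshold):

```python
def mp_neuron(inputs, threshold):
    # McCulloch-Pitts: sum the binary inputs and apply a hard threshold.
    return 1 if sum(inputs) >= threshold else 0
```

With two inputs, a threshold of 2 behaves like AND and a threshold of 1 like OR.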

Perceptron

Parts of Perceptron
⦿ Input layer

⦿Weights and Bias

⦿Activation Function

⦿Output Layer



Comparison between MP Neuron
Model and Perceptron Model
• Both the MP Neuron Model and the Perceptron Model work on linearly
separable data.
• The MP Neuron Model only accepts boolean inputs, whereas the Perceptron
Model can process any real input.
• Inputs are not weighted in the MP Neuron Model, which makes it less
flexible. The Perceptron Model, on the other hand, can take weights with
respect to the inputs provided.
• In both models we can adjust the threshold to make the model fit the
dataset.

Perceptron Learning Algorithm
1. First, multiply each input value by its corresponding weight and add the products to
determine the weighted sum: ∑wi*xi = w1*x1 + w2*x2 + … + wn*xn. Another essential term
called the bias 'b' is added to the weighted sum to improve model performance: ∑wi*xi + b.
2. Next, an activation function is applied to this weighted sum, producing a binary or a
continuous-valued output: Y = f(∑wi*xi + b).
3. Next, the difference between this output and the actual target value is computed to get the
error term E, generally as a squared error: E = (Y − Yactual)². The steps up to this point form
the forward-propagation part of the algorithm.
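Steps 1-3 can be sketched as follows; the sigmoid activation and the sample values in the tests are assumptions for illustration:

```python
import math

def forward(x, w, b):
    # Step 1: weighted sum of the inputs plus the bias.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Step 2: activation function (a sigmoid, giving a continuous output).
    return 1 / (1 + math.exp(-z))

def squared_error(y, y_actual):
    # Step 3: error term E = (Y - Yactual)^2.
    return (y - y_actual) ** 2
```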

Perceptron Learning Algorithm
4. Finally, optimize this error (loss function) using an optimization algorithm. Generally, some
form of gradient descent is used to find the optimal values of the parameters, i.e. the weights
and biases; the learning rate is a hyperparameter that controls the step size. This step forms
the backward-propagation part of the algorithm.

Importance of Weight and Bias
• Weight increases the steepness of the activation function: the weight decides how fast the
activation function will trigger, whereas the bias is used to delay its triggering.
• The weight shows the effectiveness of a particular input: the larger the weight of an input,
the greater its impact on the network.
• The bias, on the other hand, is like the intercept added in a linear equation. It is an
additional parameter in the neural network used to adjust the output along with the weighted
sum of the inputs to the neuron.
• The bias is therefore a constant that helps the model fit the given data as well as
possible.



Importance of Weight and Bias

• y = mx + c,
where m = weight and c = bias.
• If c were absent, the graph would look like the one in the figure.
• In the absence of a bias, the model could only fit lines passing through
the origin, which does not match real-world scenarios.
• With the introduction of a bias, the model becomes more flexible.





Importance of Weight and Bias -
Example
Change in weight:
• Weight W1 changed from 1.0 to 4.0
• Weight W2 changed from -0.5 to 1.5
• On increasing the weight, the steepness of the activation curve increases.
• It can therefore be inferred that the larger the weight, the earlier the
activation function will trigger.



Importance of Weight and Bias -
Example
Bias changed from -1.0 to -5.0:
• The change in bias increases the input value at which the activation
function triggers.
• It can therefore be inferred from the graph that the bias helps control
the value at which the activation function will trigger.



Example

output = sum (weights * inputs) + bias

y = f(∑ xi*wi + b)
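A quick numeric check of this formula; the inputs, weights, and bias below are made up for illustration:

```python
inputs  = [1.0, 2.0]
weights = [0.4, 0.3]
bias    = 0.1

# output = sum(weights * inputs) + bias
# here: 0.4*1.0 + 0.3*2.0 + 0.1 = 1.1
output = sum(w * x for w, x in zip(weights, inputs)) + bias
```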



Example



Activation Function
• An activation function is a function added to an artificial neural network in order to
help the network learn complex patterns in the data.
• Compared with the neuron-based model in our brains, the activation function is what
finally decides what is to be fired to the next neuron.
• That is exactly what an activation function does in an ANN as well: it takes the output
signal from the previous cell and converts it into some form that can be taken as input to
the next cell.
1. Sigmoid Function
2. Softmax
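The activation functions named in this unit (sigmoid, tanh, ReLU, leaky ReLU) can be sketched as follows; the leaky-ReLU slope of 0.01 is a common default, assumed here:

```python
import math

def sigmoid(z):
    # Squashes any real input into the range (0, 1).
    return 1 / (1 + math.exp(-z))

def tanh(z):
    # Squashes any real input into the range (-1, 1).
    return math.tanh(z)

def relu(z):
    # Zero for negative inputs, identity otherwise.
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Like ReLU, but with a small slope instead of a hard zero.
    return z if z > 0 else alpha * z
```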

Activation Function



Sigmoid Neuron

• Similar to perceptron but with sigmoid activation.


• Continuous output between 0 and 1.
• Useful for probabilistic interpretation.



Softmax Vs Sigmoid



Softmax
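A minimal softmax sketch: unlike the sigmoid, which squashes each value independently into (0, 1), softmax normalises a whole vector of scores so they sum to 1 across classes:

```python
import math

def softmax(zs):
    # Subtracting the max is a standard numerical-stability trick.
    exps = [math.exp(z - max(zs)) for z in zs]
    total = sum(exps)
    # Exponentiate and normalise so the outputs sum to 1.
    return [e / total for e in exps]
```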



Single Layer Feed Forward
[Figure: a single-layer feed-forward network. Input neurons X1, X2, X3 are
connected to output neurons Y1, Y2, Y3, Y4 through weights w11 ... w34; each
output neuron Yj computes a net input yj_in and produces an output yj_out.]



Multi Layer Feed Forward



Simple Classification Problem



XOR Problem
• Most real-life classification problems are not linearly separable.
• A perceptron cannot learn to compute even a 2-bit XOR, as it is not
linearly separable.
• There is no single straight line that separates the patterns producing 1s
{(0,1), (1,0)} from the patterns producing 0s {(0,0), (1,1)}.
• How can this limitation be overcome?
1. Draw a curved decision surface. But a perceptron cannot model any
curved surface.
2. Employ two decision lines (a multi-layered perceptron).
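Option 2 can be sketched with two decision lines feeding a second layer; the weights below are hand-picked for illustration, not learned:

```python
def step(z):
    return 1 if z >= 0 else 0

def xor(x1, x2):
    # Hidden layer: two decision lines (hand-picked weights).
    h1 = step(x1 + x2 - 0.5)     # OR-like line: fires unless both inputs are 0
    h2 = step(-x1 - x2 + 1.5)    # NAND-like line: fires unless both inputs are 1
    # Output layer: fires only when both half-planes agree (AND of h1, h2).
    return step(h1 + h2 - 1.5)
```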



Perceptron Error
• In the Perceptron Learning
Rule, the predicted output is
compared with the known
output. If they do not match,
the error is propagated
backward to allow the weights
to be adjusted.



NEURAL NETWORK IMPLEMENTATION
FROM SCRATCH



WHAT IS LOGICAL OR GATE?
• Straightforwardly, when at least one of the inputs is 1, the output of the
OR gate is 1. This means the output is 0 only when both inputs are 0.



TRUTH-TABLE FOR OR GATE:

A | B | A OR B
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 1



PERCEPTRON FOR THE OR GATE:
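The slide's diagram is not reproduced here, but a perceptron for the OR gate can be sketched with the perceptron learning rule; the learning rate, epoch count, and zero initialisation below are assumptions:

```python
def train_or_gate(epochs=10, lr=0.1):
    # OR truth table: ((x1, x2), target)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            # Predict with a step activation on the weighted sum.
            y = 1 if w1 * x1 + w2 * x2 + b >= 0 else 0
            # Perceptron learning rule: move weights by lr * error * input.
            err = target - y
            w1 += lr * err * x1
            w2 += lr * err * x2
            b  += lr * err
    return w1, w2, b
```

After a few epochs the learned line separates (0,0) from the other three points.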



ERROR CALCULATION:



WHAT IS GRADIENT DESCENT?

• Gradient Descent is an optimization algorithm used in machine learning
models to find the minimum value of a cost function.
• It does this by taking small steps in the direction opposite to the
gradient of the cost function until it reaches a local minimum.
• The learning rate determines the size of each step and can be adjusted to
balance convergence speed and accuracy.
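These ideas fit in a few lines; the cost function f(x) = (x - 3)², its gradient 2(x - 3), and the learning rate below are just an illustration:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    # Take small steps opposite to the gradient until (near) a minimum.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimise f(x) = (x - 3)^2, whose gradient is 2(x - 3); the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```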



WHAT IS GRADIENT DESCENT?

• For updating the weight values, we are going to use the gradient
descent algorithm.
• Gradient Descent is a machine learning algorithm that operates
iteratively to find the optimal values for the model's parameters, taking
into account the user-defined learning rate and the initial parameter
values.



WHAT IS GRADIENT DESCENT?



GRADIENT DESCENT WORKING

Working (iterative):
• 1. Start with initial parameter values.
• 2. Calculate the cost.
• 3. Update the values using the update function.
• 4. Repeat until the cost function is minimized.



WHY DO WE NEED IT?

• Generally, we find a closed-form formula that gives us the optimal values
for our parameters. In this algorithm, however, the optimal values are
found iteratively.

Formula for the Gradient Descent algorithm



Learning Rate



DERIVATION OF THE FORMULA USED IN A
NEURAL NETWORK
• What we want to find is how a particular weight value affects the error.
To find that, we apply the chain rule.



CALCULATING DERIVATIVES:



• In our case:
• Output = 0.68997
Target = 1



FINDING THE SECOND PART OF THE
DERIVATIVE:



FINDING THE THIRD PART OF THE DERIVATIVE

Putting it all together:



• Putting it in our main equation:

w2 = 0.3 - (0.05)*(-0.06631)
w2 = 0.3033
Notice that the value of the weight has increased here. We could calculate
all the values in this way, but as we can see, it is a lengthy process. So
now we are going to implement all the steps in Python.
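One such chain-rule update can be checked in code; the input value, initial weight, and learning rate below are illustrative assumptions, not the slide's exact numbers:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

x = 0.5            # input attached to this weight (assumed)
w = 0.3            # current weight value
target = 1.0
lr = 0.05

z = w * x          # weighted sum (bias omitted for brevity)
y = sigmoid(z)     # predicted output

# Chain rule: dE/dw = dE/dy * dy/dz * dz/dw, with E = (y - target)^2
dE_dy = 2 * (y - target)
dy_dz = y * (1 - y)          # derivative of the sigmoid
dz_dw = x
grad = dE_dy * dy_dz * dz_dw

w_new = w - lr * grad        # gradient descent step
```

Since the output is below the target, the gradient is negative and the weight increases, matching the observation on this slide.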



SUMMARY OF THE MANUAL
IMPLEMENTATION OF A NEURAL
NETWORK:

a. Input for perceptron:

b. Applying sigmoid function for predicted output :

c. Calculate the error:

d. Changing the weight value based on gradient descent formula:

e. Calculating the derivative:

f. Individual derivatives:

g. After then we run the same code with updated weight values.
IMPLEMENTATION OF A NEURAL
NETWORK IN PYTHON:

10.1 Import Required libraries:

10.2 Assign Input values:



10.3 Target Output:

10.4 Assign the Weights:



10.5 Adding Bias Values and Assigning a Learning Rate:

10.6 Applying a Sigmoid Function:

10.7 Derivative of the sigmoid function:



10.8 The main logic for predicting output and updating the weight values:
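The slide images with the actual code are not reproduced here, so the steps above can be sketched as follows; the OR-gate data, initial weights, bias value, learning rate, and iteration count are all assumptions:

```python
import numpy as np

# Assign input values and the target output (OR gate, as in the earlier slides).
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
target = np.array([[0], [1], [1], [1]], dtype=float)

# Assign the weights, a bias value, and a learning rate.
weights = np.array([[0.1], [0.2]])
bias = 0.3
lr = 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_der(y):
    # Derivative of the sigmoid, expressed in terms of its output.
    return y * (1 - y)

# Main logic: predict the output and update the weight values.
for _ in range(5000):
    out = sigmoid(inputs @ weights + bias)   # forward pass
    error = out - target                     # gradient of 0.5*(out - target)^2
    grad = error * sigmoid_der(out)          # chain rule through the sigmoid
    weights -= lr * (inputs.T @ grad)        # gradient descent weight update
    bias -= lr * grad.sum()                  # gradient descent bias update
```

After training, thresholding the network's output at 0.5 reproduces the OR truth table.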

