Artificial Neural Networks - Lect - 3

This document discusses multilayer neural networks and the backpropagation algorithm for training neural networks. It begins by introducing multilayer neural networks and feed-forward networks. It then describes the notations used for multilayer networks and provides details on training neural networks with backpropagation, including computing gradients and error propagation. Finally, it works through a step-by-step example of applying backpropagation to update weights in a simple neural network.


Multilayer neural networks

Objectives
• Multilayer neural networks

• Training a Neural Network with Backpropagation

• Backpropagation learning algorithm

• Solved example

Multilayer Neural Networks
• Multilayer neural network: a neural network containing more than one computational layer; the intermediate layers are called hidden layers.
• This specific architecture of multilayer neural networks is called a feed-forward network.
• The default feed-forward architecture is fully connected: all nodes in one layer are connected to all nodes in the next layer.
• A bias neuron can be used in the hidden layers or in the output layer.
• A fully connected architecture performs well in many settings, but better performance is often achieved by pruning many of the connections or sharing them in an insightful way.

Notations of Multilayer NNs
We can describe a multilayer network in two notations (sketched below):
- Vector notation
- Scalar notation
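As an illustrative sketch (the slide's figures did not survive extraction; the symbols here are assumptions: \Phi is the activation function, W_p the weight matrix feeding layer p), the same layer computation written both ways, in LaTeX:

% Vector notation: one matrix-vector product per layer
\bar{h}_{p+1} = \Phi\left(W_{p+1}\,\bar{h}_p\right), \qquad \bar{h}_1 = \Phi\left(W_1\,\bar{x}\right)

% Scalar notation: one weighted sum per unit j
h_j = \Phi\Big(\sum_i w_{ij}\, h_i\Big)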

Training NN with Backpropagation
• In a multilayer NN, the loss (i.e., the error) is a complicated composition function of the weights in earlier layers.
• The loss tells us how poorly the model is performing at the current instant.
• We need to use this loss to train our network so that it performs better.
• By minimizing the loss, our model is going to perform better.
• How do we minimize the loss?
By using an optimization algorithm (types of optimizers will be discussed later).
• A popular optimization algorithm is called "Gradient Descent" (its update rule is sketched below).
• The gradient of a composition function is computed by the "Backpropagation" algorithm.
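For concreteness (this equation does not appear on the slide, but it is the standard gradient-descent rule), each weight w is moved against its gradient, with learning rate \eta:

w \;\leftarrow\; w - \eta\,\frac{\partial L}{\partial w}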

Gradient
• Gradient: a slope representing the relationship between a network's weights and its error.
• We can view the gradient as a measure of how much the output of a function changes if you change the inputs a little bit (a numeric illustration follows below).
• Our aim is to get to the bottom of the graph of cost versus weights (plotted as a figure in the original slides), i.e., to reach a point (a local minimum, where the cost function takes a least value) from which we can no longer move downhill.
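A minimal numeric sketch of this "change the input a little bit" view (the function f and all values here are hypothetical, not from the slides); a central finite difference approximates the derivative:

def f(w):
    return w * w  # hypothetical cost function, so f'(w) = 2w

w, eps = 3.0, 1e-6
grad = (f(w + eps) - f(w - eps)) / (2 * eps)  # central difference
print(grad)  # prints approximately 6.0, i.e. 2*w at w = 3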
Backpropagation
• Backpropagation: computes the error gradients as a summation, over the various paths from a node to the output node, of the products of the local gradients along each path.
• Backpropagation can be computed efficiently by dynamic programming.
• It contains two phases (a code sketch follows this list):
1- Forward phase: compute the output values and the local derivatives (of the loss function) at the various nodes.
2- Backward phase: accumulate the products of these local derivatives over the paths from each node to the output node, to obtain the gradients of the loss function with respect to the weights.
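A minimal sketch of the two phases in Python, assuming a hypothetical 1-1-1 chain network x -> h -> o with sigmoid activations and squared-error loss (all names and values are illustrative, not from the slides):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, t = 0.5, 0.2     # input and target (illustrative values)
w1, w2 = 0.4, -0.3  # weights (illustrative values)

# Forward phase: compute output values and cache local derivatives.
h = sigmoid(w1 * x)
o = sigmoid(w2 * h)
dh = h * (1 - h)    # local sigmoid derivative at the hidden node
do = o * (1 - o)    # local sigmoid derivative at the output node

# Backward phase: accumulate products of local values along the path.
dL_do = o - t                      # derivative of the loss 0.5*(o - t)^2
dL_dw2 = dL_do * do * h            # gradient w.r.t. the later weight
dL_dw1 = dL_do * do * w2 * dh * x  # gradient w.r.t. the earlier weight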

Major steps of backpropagation learning algo.
• The network is first initialized by setting all its weights to small random numbers – say between –1 and +1.
• The outputs at the various nodes are then calculated (this is the forward pass).
• The calculated output is completely different from what you want (the Target), since all the weights are random.
• We then calculate the Error of each output neuron, which is essentially: Error = Target – Actual output.
• This error is then used mathematically to change the weights in such a way that the error gets smaller.
• In other words, the Output of each neuron gets closer to its Target (this part is the reverse pass).
• The process is repeated until the error is minimal.
• "The next slides show the mathematical equations that control the process of updating the weights"; a compact code sketch of the whole loop follows below.
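A compact sketch of these steps in Python, assuming the 2-2-1 sigmoid network of the solved example, a hypothetical learning rate of 0.1, and the common textbook convention of using the pre-update weights for the hidden deltas:

import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 1: initialize all weights to small random numbers in [-1, 1].
w = [random.uniform(-1.0, 1.0) for _ in range(6)]
eta = 0.1                        # learning rate (hypothetical value)
x1, x2, target = 0.35, 0.9, 0.5  # one training pattern (from the solved example)

for epoch in range(10000):
    # Step 2: forward pass - outputs of the hidden and output neurons.
    h1 = sigmoid(w[2] * x1 + w[3] * x2)
    h2 = sigmoid(w[4] * x1 + w[5] * x2)
    o = sigmoid(w[0] * h1 + w[1] * h2)
    # Step 3: error of the output neuron.
    error = target - o
    if abs(error) < 1e-4:        # repeat until the error is minimal
        break
    # Step 4: reverse pass - deltas, then weight updates.
    d_o = error * o * (1 - o)
    d_h1 = d_o * w[0] * h1 * (1 - h1)
    d_h2 = d_o * w[1] * h2 * (1 - h2)
    w[0] += eta * d_o * h1
    w[1] += eta * d_o * h2
    w[2] += eta * d_h1 * x1
    w[3] += eta * d_h1 * x2
    w[4] += eta * d_h2 * x1
    w[5] += eta * d_h2 * x2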
Backpropagation (Cont.)

Eq. 3.1
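The equation itself appeared only as an image in the original slides. A plausible reconstruction, matching the pathwise description on the previous slide (notation assumed: w_{(h_{r-1},h_r)} is the weight on the edge from node h_{r-1} to node h_r, and the sum ranges over all paths \mathcal{P} from h_r to the output o):

\frac{\partial L}{\partial w_{(h_{r-1},h_r)}}
  = \frac{\partial L}{\partial o}
    \cdot \left[ \sum_{[h_r, h_{r+1}, \ldots, h_k, o] \in \mathcal{P}}
      \frac{\partial o}{\partial h_k} \prod_{i=r}^{k-1} \frac{\partial h_{i+1}}{\partial h_i} \right]
    \cdot \frac{\partial h_r}{\partial w_{(h_{r-1},h_r)}}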

Backpropagation (Cont.)
• Consider a sequence of hidden layers (shown as a figure in the original slides), e.g. h_1 => h_2 => ... => h_k => o.
• The partial derivative of the loss function with respect to a weight (given as an image in the original slides) can be written in terms of a "delta" value at each node.
• Based on the type of activation, we can define delta as follows:

Eq. 3.2
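The equations on this slide were likewise images. A plausible reconstruction in standard delta notation (assumed symbols: \Phi is the activation function, a_h the pre-activation of node h, and h_{r-1} the value feeding the weight):

\frac{\partial L}{\partial w_{(h_{r-1},h_r)}} = \delta(h_r,o)\cdot h_{r-1},
\qquad
\delta(h_r,o) = \Phi'(a_{h_r}) \sum_{h:\,h_r \Rightarrow h} w_{(h_r,h)}\,\delta(h,o)

with the base case at the output node: \delta(o,o) = \Phi'(a_o)\,\frac{\partial L}{\partial o}.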

Backpropagation (Cont.)
• For more clarification, consider the following network (shown as a figure in the original slides).
• Based on the sigmoid activation, we can rewrite the final equations of backpropagation as follows:
Eq. 3.3
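Eq. 3.3 was also an image in the original slides. A reconstruction consistent with the solved example on the following slides (t is the target, o an output value, o_h a hidden output, \eta the learning rate; the example effectively uses \eta = 1):

\delta_o = o\,(1-o)\,(t-o)
\qquad
\delta_h = o_h\,(1-o_h) \sum_k w_{hk}\,\delta_k
\qquad
w^{+} = w + \eta\,\delta \cdot x_w

where x_w denotes the input feeding weight w.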

Solved example
• It is important for you to execute and practice the backpropagation algorithm step by step.
• You can follow Equations 3.1 or 3.3.
• Consider the following network (given as a figure in the original slides): two inputs x1 = 0.35 and x2 = 0.9, two hidden neurons, and one output neuron. As used in the answer below, the initial weights are w3 = 0.1 (x1 to top hidden), w4 = 0.8 (x2 to top hidden), w5 = 0.4 (x1 to bottom hidden), w6 = 0.6 (x2 to bottom hidden), w1 = 0.3 (top hidden to output), and w2 = 0.9 (bottom hidden to output).

Assume that the neurons have a sigmoid activation function.

(Optionally, ignore the learning rate or set it to 0.1; the answer below uses a learning rate of 1.)
(i) Perform a forward pass on the network.
(ii) Perform a reverse pass (training) once (Target = 0.5).
(iii) Perform a further forward pass.
Solved example (Cont.)
Answer (i):
• Input to top neuron = (0.35*0.1)+(0.9*0.8) = 0.755.
Out = f(net) = 1/(1 + e^(-0.755)) = 0.68 (by sigmoid)
• Input to bottom neuron = (0.9*0.6)+(0.35*0.4) = 0.68.
Out = f(net) = 1/(1 + e^(-0.68)) = 0.6637
• Input to final neuron = (0.3*0.68)+(0.9*0.6637) = 0.80133.
Out = 1/(1 + e^(-0.801)) = 0.69
E = d - o = 0.5 - 0.69 = -0.19.
Solved example (Cont.)
Answer (ii):
1. Output error: δ = (t-o)(1-o)o = (0.5 - 0.69)(1 - 0.69)*0.69 = -0.0406
2. New weights for the output layer:
• w1+ = w1 + (δ * input) = 0.3 + (-0.0406*0.68) = 0.272392.
• w2+ = w2 + (δ * input) = 0.9 + (-0.0406*0.6637) = 0.87305.
3. Errors for the hidden layer (note that the updated weights w1+ and w2+ are used, and (1-o)o is evaluated at each hidden neuron's own output):
• δ1 = (δ * w1+) * (1-o1)o1 = -0.0406 * 0.272392 * (1-0.68)*0.68 = -2.406 * 10^-3
• δ2 = (δ * w2+) * (1-o2)o2 = -0.0406 * 0.87305 * (1-0.6637)*0.6637 = -7.916 * 10^-3
4. New hidden-layer weights:
• w3+ = 0.1 + (-2.406 * 10^-3 * 0.35) = 0.09916.
• w4+ = 0.8 + (-2.406 * 10^-3 * 0.9) = 0.7978.
• w5+ = 0.4 + (-7.916 * 10^-3 * 0.35) = 0.3972.
• w6+ = 0.6 + (-7.916 * 10^-3 * 0.9) = 0.5928
Solved example (Cont.)
Answer (iii):
• The old error was -0.19.
• When you execute the forward pass again using the new weights, the new error = -0.18205.
• "Try a further forward pass by yourself." A verification sketch follows.
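As a check (a minimal Python sketch, not part of the original slides), the following reproduces the numbers above; it assumes the weight naming given in the problem statement, a learning rate of 1, and the example's convention of using the updated output weights when computing the hidden deltas:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Inputs, target, and initial weights as in the example.
x1, x2, t = 0.35, 0.9, 0.5
w1, w2 = 0.3, 0.9  # hidden -> output
w3, w4 = 0.1, 0.8  # x1, x2 -> top hidden neuron
w5, w6 = 0.4, 0.6  # x1, x2 -> bottom hidden neuron

# (i) Forward pass.
o1 = sigmoid(w3 * x1 + w4 * x2)    # about 0.680
o2 = sigmoid(w5 * x1 + w6 * x2)    # about 0.6637
o = sigmoid(w1 * o1 + w2 * o2)     # about 0.690
print("error:", t - o)             # about -0.19

# (ii) Reverse pass, learning rate 1.
d = (t - o) * (1 - o) * o          # about -0.0406
w1, w2 = w1 + d * o1, w2 + d * o2  # about 0.272392, 0.87305 (updated first, as in the slides)
d1 = d * w1 * o1 * (1 - o1)        # about -2.406e-3 (uses updated w1)
d2 = d * w2 * o2 * (1 - o2)        # about -7.916e-3 (uses updated w2)
w3, w4 = w3 + d1 * x1, w4 + d1 * x2
w5, w6 = w5 + d2 * x1, w6 + d2 * x2

# (iii) Further forward pass with the new weights.
o1 = sigmoid(w3 * x1 + w4 * x2)
o2 = sigmoid(w5 * x1 + w6 * x2)
o = sigmoid(w1 * o1 + w2 * o2)
print("new error:", t - o)         # about -0.18205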
