
Multi Layer Perceptron


The multilayer perceptron (MLP) belongs to the class of feedforward networks: information flows through the network nodes exclusively in the forward direction, from the input layer toward the output layer, with no feedback connections.

Figure: Multilayer perceptron with an input layer, three hidden layers, and an output layer.
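As an illustration of this feedforward flow (not part of the original slides), a minimal NumPy sketch of a forward pass is given below; the layer sizes and the use of the logistic sigmoid activation are assumptions chosen to match the discussion that follows.

import numpy as np

def sigmoid(x):
    # Logistic sigmoid activation, used again in the backpropagation section.
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Propagate input x through a feedforward MLP.

    weights[l] has shape (n_units_l, n_units_{l-1}); information only
    moves forward, layer by layer, as described above.
    """
    o = x
    for W, b in zip(weights, biases):
        tot = W @ o + b          # net input of the current layer
        o = sigmoid(tot)         # layer output
    return o

# Hypothetical layer sizes: 3 inputs, three hidden layers, 2 outputs.
rng = np.random.default_rng(0)
sizes = [3, 5, 5, 5, 2]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
print(forward(np.array([0.1, 0.5, -0.2]), weights, biases))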
Backpropagation learning algorithm
The algorithm is based on the gradient descent technique for solving an optimization problem: the minimization of the network cumulative error E_c,

E_c = \sum_{k=1}^{n} E(k),   with   E(k) = \sum_{i=1}^{q} (t_i(k) - o_i(k))^2 = \| t(k) - o(k) \|^2,

where the index i represents the i-th neuron of the output layer, composed of a total number of q neurons, and n is the number of training patterns presented to the network for learning purposes. Each term E(k) is the square of the Euclidean norm of the vectorial difference between the k-th target output vector t(k) and the k-th actual output vector o(k) of the network.
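As a small sketch of how this cumulative error could be computed, assuming the targets and outputs are stored as NumPy arrays of shape (n, q):

import numpy as np

def cumulative_error(targets, outputs):
    """Cumulative error E_c = sum_k ||t(k) - o(k)||^2.

    targets, outputs: arrays of shape (n_patterns, q), where q is the
    number of output-layer neurons and n_patterns the number of patterns.
    """
    diff = targets - outputs
    return float(np.sum(diff ** 2))

# Hypothetical example with n = 3 patterns and q = 2 output neurons.
t = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
o = np.array([[0.9, 0.2], [0.1, 0.8], [0.7, 0.9]])
print(cumulative_error(t, o))  # 0.01+0.04 + 0.01+0.04 + 0.09+0.01 -> 0.2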

The algorithm is designed to update the weights in the direction of gradient descent of the cumulative error with respect to the weight vector:

\Delta w^{(l)} = -\eta \, \frac{\partial E_c}{\partial w^{(l)}},

taken with respect to the vector w^{(l)} of all interconnection weights between layer (l) and the preceding layer (l − 1); η denotes the learning rate. The update can be carried out off-line (all the training patterns are presented to the system at once and the weights are changed on the basis of the cumulative error) or on-line (training is made pattern by pattern).
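To make the distinction between the two modes concrete, the following sketch contrasts a single off-line (batch) update with on-line updates; a single sigmoid layer is used purely to keep the gradient expression short, and the data, layer shape, and learning rate are hypothetical.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad_single_pattern(W, x, t):
    # Gradient of ||t - o||^2 for one pattern, for a single sigmoid layer o = f(Wx).
    o = sigmoid(W @ x)
    delta = 2.0 * (t - o) * o * (1.0 - o)   # error signal at the output
    return -np.outer(delta, x)              # dE / dW

eta = 0.5                                    # learning rate (illustrative value)
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))              # n = 4 hypothetical patterns, 3 inputs
T = rng.uniform(size=(4, 2))                 # 2 output neurons
W = rng.standard_normal((2, 3))

# Off-line (batch) mode: accumulate the gradient over all patterns, then update once.
grad = sum(grad_single_pattern(W, x, t) for x, t in zip(X, T))
W_batch = W - eta * grad

# On-line mode: update the weights after every individual pattern.
W_online = W.copy()
for x, t in zip(X, T):
    W_online -= eta * grad_single_pattern(W_online, x, t)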

The signal tot_i^{(l)} represents the sum of all signals reaching node (i) at hidden layer (l) coming from the previous layer (l − 1):

tot_i^{(l)} = \sum_{j} w_{ij}^{(l)} \, o_j^{(l-1)},

where o_j^{(l-1)} is the output of node (j) of layer (l − 1) and w_{ij}^{(l)} is the weight of the connection from that node to node (i).
• Using chain-rule differentiation we obtain:

\Delta w_{ij}^{(l)} = -\eta \, \frac{\partial E}{\partial w_{ij}^{(l)}} = -\eta \, \frac{\partial E}{\partial tot_i^{(l)}} \, \frac{\partial tot_i^{(l)}}{\partial w_{ij}^{(l)}} = \eta \, \delta_i^{(l)} \, o_j^{(l-1)},

where \delta_i^{(l)} = -\partial E / \partial tot_i^{(l)} is the error signal of node (i) in layer (l).

• For the case where layer (l) is the output layer (L), the above equation can be expressed as:

\delta_i^{(L)} = (t_i - o_i) \, f'(tot_i^{(L)})

(the constant factor arising from differentiating the squared error is absorbed into the learning rate η).

• Considering the case where f is the sigmoid function, f(x) = 1/(1 + e^{-x}), its derivative is f'(x) = f(x)\,(1 - f(x)).

• The error signal at the output layer then becomes:

\delta_i^{(L)} = (t_i - o_i) \, o_i \, (1 - o_i).

• Propagating the error backward now, and for the case where (l) represents a hidden layer (l < L), the expression of \Delta w_{ij}^{(l)} keeps the same form:

\Delta w_{ij}^{(l)} = \eta \, \delta_i^{(l)} \, o_j^{(l-1)},

• where the error signal \delta_i^{(l)} is now expressed in terms of the error signals already computed for the following layer (l + 1):

\delta_i^{(l)} = f'(tot_i^{(l)}) \sum_{k} \delta_k^{(l+1)} \, w_{ki}^{(l+1)}.
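The update equations above translate almost directly into code. The following is a minimal sketch of one on-line backpropagation step for a sigmoid MLP; the omission of bias terms, the network shape, and the learning rate are simplifying assumptions, not part of the original derivation.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(weights, x, t, eta=0.5):
    """One on-line backpropagation update for a sigmoid MLP without biases.

    weights: list of arrays, weights[l] has shape (n_{l+1}, n_l).
    x, t: one input pattern and its target output vector.
    """
    # Forward pass: store the output o^(l) of every layer (outputs[0] = x).
    outputs = [x]
    for W in weights:
        outputs.append(sigmoid(W @ outputs[-1]))

    # Error signal at the output layer: delta_L = (t - o) * o * (1 - o).
    o_L = outputs[-1]
    delta = (t - o_L) * o_L * (1.0 - o_L)

    # Backward pass, from the last weight matrix to the first.
    new_weights = [None] * len(weights)
    for l in range(len(weights) - 1, -1, -1):
        # weights[l] connects layer l (output outputs[l]) to layer l+1.
        # Update: w_new = w + eta * outer(delta_{l+1}, o_l)
        new_weights[l] = weights[l] + eta * np.outer(delta, outputs[l])
        if l > 0:
            # Error signal of layer l: delta_l = f'(tot_l) * (W^T delta_{l+1}),
            # with f'(tot_l) written as o_l * (1 - o_l) for the sigmoid.
            o_prev = outputs[l]
            delta = (weights[l].T @ delta) * o_prev * (1.0 - o_prev)
    return new_weights

# Hypothetical 2-3-1 network updated on a single pattern.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 2)), rng.standard_normal((1, 3))]
weights = backprop_step(weights, np.array([0.0, 1.0]), np.array([1.0]))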
To illustrate this powerful algorithm, we apply it to the training of the network shown in the figure. The following three training pattern pairs are used, with x and t being the input and the target output data, respectively:
Momentum
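In the momentum variant of backpropagation, a fraction α of the previous weight change is added to the current update, \Delta w_{ij}^{(l)}(m+1) = \eta \, \delta_i^{(l)} \, o_j^{(l-1)} + \alpha \, \Delta w_{ij}^{(l)}(m), which damps oscillations in the weight trajectory and can speed up convergence. A minimal sketch follows, with a single sigmoid layer and example values of η and α chosen purely for illustration.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad(W, x, t):
    # Gradient of ||t - o||^2 for a single sigmoid layer o = f(Wx) (illustrative model).
    o = sigmoid(W @ x)
    return -np.outer(2.0 * (t - o) * o * (1.0 - o), x)

eta, alpha = 0.5, 0.9           # learning rate and momentum coefficient (example values)
rng = np.random.default_rng(2)
x, t = np.array([1.0, -1.0]), np.array([1.0])
W = rng.standard_normal((1, 2))

delta_W = np.zeros_like(W)      # previous weight change, Δw(m)
for _ in range(100):
    # Δw(m+1) = -eta * dE/dW + alpha * Δw(m)
    delta_W = -eta * grad(W, x, t) + alpha * delta_W
    W = W + delta_W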
Effect of Hidden Nodes on Function Approximation
Effect of Training Patterns on Function Approximation
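These two effects can be explored with a small experiment along the following lines; it fits a 1-D function with scikit-learn's MLPRegressor while sweeping the number of hidden nodes and the number of training patterns (the target function, network sizes, and sample counts are arbitrary illustrative choices, not taken from the slides).

import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_and_score(n_hidden, n_patterns, rng):
    # Training set: n_patterns samples of a simple 1-D target function.
    X = rng.uniform(-np.pi, np.pi, size=(n_patterns, 1))
    y = np.sin(X).ravel()
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="logistic",
                       solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(X, y)
    # Evaluate on a dense grid to see how well the function is approximated.
    X_test = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    return np.mean((net.predict(X_test) - np.sin(X_test).ravel()) ** 2)

rng = np.random.default_rng(0)
# Effect of hidden nodes: too few typically underfit; more give a closer fit.
for n_hidden in (1, 3, 10):
    print("hidden =", n_hidden, "mse =", fit_and_score(n_hidden, 50, rng))
# Effect of training patterns: sparse data leaves the function poorly constrained.
for n_patterns in (5, 20, 100):
    print("patterns =", n_patterns, "mse =", fit_and_score(10, n_patterns, rng))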
