Unit -4 Artificial Neural Networks
Forward Propagation
Back Propagation
Forward Propagation
Activation functions
Sigmoid function
Tanh function
ReLU function
Backpropagation
Gradient Descent
Adam optimizer
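The formulas behind these headings are not reproduced above, so here is a minimal NumPy sketch, purely for reference, of the three activation functions listed (sigmoid, tanh, ReLU):

import numpy as np

def sigmoid(x):
    # squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes any real value into (-1, 1), zero-centred
    return np.tanh(x)

def relu(x):
    # keeps positive values, clips negatives to 0
    return np.maximum(0.0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z), tanh(z), relu(z), sep="\n")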
Terminologies:
Metrics
Epoch
Hyperparameters:
Number of layers
Initialization of weights
Loss function
Metric
Optimizer
Number of epochs
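A minimal Keras sketch, assuming a small tabular binary-classification task, showing where each of the hyperparameters listed above is set. The layer sizes, initializer, loss, metric, optimizer and epoch count are example choices, not recommendations.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# toy data: 100 samples, 8 features, binary target (placeholder values)
X = np.random.rand(100, 8)
y = np.random.randint(0, 2, size=100)

model = keras.Sequential([
    layers.Dense(16, activation="relu",             # number of layers / units per layer
                 kernel_initializer="he_normal",    # initialization of weights
                 input_shape=(8,)),
    layers.Dense(8, activation="relu", kernel_initializer="he_normal"),
    layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",               # optimizer
    loss="binary_crossentropy",     # loss function
    metrics=["accuracy"],           # metric
)

model.fit(X, y, epochs=20, batch_size=16, verbose=0)   # number of epochs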
MICE Imputation
Log transformation
Ordinal Encoding
Target Encoding
Z-Score Normalization
For the detailed implementation of the above-mentioned steps, refer to my notebook on data preprocessing:
Notebook Link
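For quick reference, a minimal scikit-learn sketch of the listed steps follows. The DataFrame, column names and values are hypothetical, and TargetEncoder assumes scikit-learn 1.3 or newer (the category_encoders package offers an alternative).

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import OrdinalEncoder, TargetEncoder, StandardScaler

df = pd.DataFrame({
    "income": [52000, np.nan, 61000, 48000, 75000, 58000, np.nan, 67000, 43000, 71000],
    "age":    [25, 32, np.nan, 41, 29, 36, 45, np.nan, 23, 38],
    "size":   ["S", "M", "L", "M", "S", "L", "M", "S", "L", "M"],
    "city":   ["NY", "SF", "NY", "LA", "SF", "LA", "NY", "SF", "LA", "NY"],
    "target": [0, 1, 0, 1, 1, 0, 1, 0, 1, 0],
})

# 1. MICE imputation: model each numeric feature with missing values on the others
num_cols = ["income", "age"]
df[num_cols] = IterativeImputer(random_state=0).fit_transform(df[num_cols])

# 2. Log transformation to reduce right skew (log1p also handles zeros safely)
df["income"] = np.log1p(df["income"])

# 3. Ordinal encoding: map an ordered category to integers
df["size"] = OrdinalEncoder(categories=[["S", "M", "L"]]).fit_transform(df[["size"]]).ravel()

# 4. Target encoding: replace each category with a smoothed mean of the target
df["city"] = TargetEncoder().fit_transform(df[["city"]], df["target"]).ravel()

# 5. Z-score normalization: zero mean, unit variance per column
df[num_cols] = StandardScaler().fit_transform(df[num_cols])

print(df)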
Neural Architecture
Multilayer Perceptron
The Multilayer Perceptron was developed to tackle the key limitation of the single-layer perceptron, which cannot model non-linear relationships. It is a neural network where the mapping between inputs and output is non-linear.
Each layer feeds the next one with the result of its computation, its internal representation of the data. This continues through the hidden layers all the way to the output layer.
Backpropagation
Backpropagation is the learning mechanism that allows the
Multilayer Perceptron to iteratively adjust the weights in
the network, with the goal of minimizing the cost function.
A typical task for such a network is, for example, classifying whether a guest review expresses a positive or a negative sentiment.
BACKPROPAGATION ALGORITHM
Input layer
The input layer simply holds the input vector x = [x₁, x₂, ..., xₙ]; no computation is performed here.
Hidden layers (l = 2 and l = 3)
With f denoting the activation function, the hidden layers compute
z² = W¹x + b¹,  a² = f(z²)
z³ = W²a² + b²,  a³ = f(z³)
Looking carefully, you can see that all of x, z², a², z³, a³,
W¹, W², b¹ and b² are missing their subscripts presented in
the 4-layer network illustration above. The reason is that
we have combined all parameter values in matrices,
grouped by layers. This is the standard way of working
with neural networks and one should be comfortable with
the calculations. However, I will go over the equations to
clear up any confusion.
For example, picking layer 2 and writing its computation out with explicit subscripts, the first neuron is
z₁² = w₁₁¹x₁ + w₁₂¹x₂ + ... + w₁ₙ¹xₙ + b₁¹
Collecting every neuron of the layer in this way leads to the same equation for z² and proves that the matrix representations for z², a², z³ and a³ are correct.
Output layer
The output layer produces the network's predicted value from the last hidden activation: s = W³a³.
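To make the matrix notation above concrete, here is a minimal NumPy sketch of a single forward pass through the 4-layer network. The layer sizes and the use of a sigmoid as f are arbitrary illustrative choices.

import numpy as np

def f(z):
    # activation function; sigmoid is used here only as an example
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_h2, n_h3, n_out = 4, 5, 3, 1        # arbitrary layer sizes

x = rng.normal(size=(n_in, 1))              # input vector

W1 = rng.normal(size=(n_h2, n_in)); b1 = np.zeros((n_h2, 1))
W2 = rng.normal(size=(n_h3, n_h2)); b2 = np.zeros((n_h3, 1))
W3 = rng.normal(size=(n_out, n_h3))

z2 = W1 @ x + b1        # z² = W¹x + b¹
a2 = f(z2)              # a² = f(z²)
z3 = W2 @ a2 + b2       # z³ = W²a² + b²
a3 = f(z3)              # a³ = f(z³)
s = W3 @ a3             # output: s = W³a³

print(s)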
Multilayer perceptrons (MLPs) are a type of artificial neural network that is commonly
used for classification and regression tasks. However, MLPs have some limitations
that can make them difficult to use in certain applications.
Overfitting: MLPs can be prone to overfitting, which means that they can learn
the training data too well and not generalize well to new data. This can be a
problem if the training data is not large or representative enough.
Vanishing gradients: MLPs with multiple hidden layers can suffer from
vanishing gradients, which means that the weights of the connections
between neurons can become very small. This can make it difficult for the
network to learn, as the updates to the weights become very small.
Interpretability: MLPs can be difficult to interpret, as the learned weights between neurons do not correspond to human-readable rules. This makes it hard to explain how the network arrives at its predictions.
Computational complexity: MLPs can be computationally expensive to train,
especially if the training data is large or the network has many hidden layers.
Despite these limitations, MLPs are a powerful tool for machine learning and can be
used to solve a variety of problems. However, it is important to be aware of the
limitations of MLPs and to take steps to mitigate them.
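As one example of mitigating the overfitting limitation above, the following minimal Keras sketch adds dropout layers and early stopping; the layer sizes, dropout rate and patience value are illustrative assumptions.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(200, 8)                  # placeholder data
y = np.random.randint(0, 2, size=200)

model = keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(8,)),
    layers.Dropout(0.3),                    # randomly drops units during training
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# early stopping halts training once the validation loss stops improving
stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                     restore_best_weights=True)
model.fit(X, y, validation_split=0.2, epochs=100, callbacks=[stop], verbose=0)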
The generalized delta rule can be used to train neural networks with multiple layers.
It works by adjusting the weights of the connections between neurons so that the
network's output is closer to the desired output.
The generalized delta rule is a gradient descent algorithm, which means that it
updates the weights of the connections in the direction of the steepest descent of the
error function. The error function is a measure of how far the network's output is from
the desired output.
The generalized delta rule is a powerful learning rule that can be used to train neural
networks to solve a variety of problems. However, it can be computationally
expensive to train neural networks with many layers using the generalized delta rule.
In its basic form, the rule updates each weight by
Δw_ji = η · δ_j · o_i
where:
η is the learning rate,
δ_j is the error term of neuron j (for an output neuron, the difference between target and actual output times the derivative of the activation; for a hidden neuron, the weighted sum of the error terms of the next layer times the derivative of the activation),
o_i is the output of the neuron i that feeds the connection.
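A minimal NumPy sketch of the generalized delta rule on a single-hidden-layer network; the sigmoid activation, squared-error cost and tiny XOR-style dataset are assumptions made purely for illustration.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# toy XOR-style dataset, purely illustrative
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)   # hidden -> output
eta = 0.5                                                    # learning rate

for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # error terms (deltas): output layer first, then hidden layer
    delta_out = (t - y) * y * (1 - y)              # δ_k = (t_k - y_k) f'(net_k)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # δ_j = f'(net_j) Σ_k δ_k w_jk

    # generalized delta rule: Δw = η · δ · (output feeding the connection)
    W2 += eta * h.T @ delta_out; b2 += eta * delta_out.sum(axis=0)
    W1 += eta * X.T @ delta_hid; b1 += eta * delta_hid.sum(axis=0)

print(np.round(y, 2))   # predictions after training; a successful run approaches [0, 1, 1, 0]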