Artificial Neural Networks
Recurrent neural networks
A recurrent neural network is a model used for sequential data or time-series prediction. Recurrent NNs have feedback
connections that model the temporal characteristics of the problem being learned. For example, a recurrent neural network
can make stock market predictions by estimating what is likely to happen in the future based on what happened in the
past. You can also use a recurrent neural network for tasks like translation, where the order of words changes from one
language to another, for example a noun appearing before or after an adjective.
In addition to the architecture found in the feedforward neural network, a recurrent network uses loops to circle the data
back through the hidden layers before returning an output. Sometimes, recurrent neural networks include specialized
hidden layers called context layers, which provide feedback to the neural network and help it become more accurate.
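As a rough illustration of this feedback loop, here is a minimal sketch of a single recurrent step in Python/NumPy (the weight names W_x, W_h and the sizes are illustrative placeholders, not part of the course material):

```python
import numpy as np

# One recurrent step: the new hidden state depends on the current input x_t and on the
# previous hidden state h_prev, which carries the context from earlier time steps.
def rnn_step(x_t, h_prev, W_x, W_h, b):
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
W_x = rng.normal(size=(hidden_size, input_size))
W_h = rng.normal(size=(hidden_size, hidden_size))
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)                      # initial context
for x_t in rng.normal(size=(5, input_size)):   # a toy sequence of 5 time steps
    h = rnn_step(x_t, h, W_x, W_h, b)          # each step feeds the previous state back in
print(h)                                       # final hidden state after seeing the whole sequence
```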
Convolutional neural networks
Convolutional neural networks are particularly skilled at recognizing patterns in images, which makes them important for AI
technology like computer vision, among other uses. For example, a CNN can be used to recognize handwritten script and to
classify images based on their features. Convolutional neural networks differ from other networks in their architecture and
in the fact that CNN nodes have shared weights and bias values, unlike the nodes of feedforward or recurrent neural networks.
How does a convolutional neural network architecture work?
In addition to input and output layers, convolutional neural networks contain two main types of hidden layers: convolutional
and pooling. Convolutional layers filter the input, typically an image, to extract various features. This data then feeds into a
pooling layer, which downsamples the feature maps, reducing the number of parameters while keeping the important information.
The process repeats many times, sometimes with other layers in between, such as a multilayer perceptron or a rectified linear unit (ReLU) used for activation.
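As an illustration of this layer stack, here is a minimal sketch in Python, assuming TensorFlow/Keras is available; the filter counts and layer sizes are arbitrary placeholders, not values from the slides:

```python
import tensorflow as tf

# Sketch of a small CNN: convolution -> pooling repeated, then fully connected layers.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),                        # RGB input image
    tf.keras.layers.Conv2D(16, (3, 3), activation="relu"),    # convolutional layer extracts features
    tf.keras.layers.MaxPooling2D((2, 2)),                     # pooling layer downsamples
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),             # multilayer-perceptron-style hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),          # e.g. 10 output classes
])
model.summary()  # prints the output shape and parameter count of each layer
```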
Convolutional neural networks
A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm that can take in an input image, assign
importance (learnable weights and biases) to various aspects/objects in the image, and be able to differentiate one from
the other. The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While
in primitive methods filters are hand-engineered, with enough training, ConvNets have the ability to learn these
filters/characteristics.
The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the Human Brain and was
inspired by the organization of the Visual Cortex. Individual neurons respond to stimuli only in a restricted region of the
visual field known as the Receptive Field. A collection of such fields overlap to cover the entire visual area.
A ConvNet is able to successfully capture the Spatial and Temporal dependencies in an image through the application of
relevant filters. The architecture performs a better fitting to the image dataset due to the reduction in the number of parameters
involved and the reusability of weights. In other words, the network can be trained to understand the sophistication of the
image better.
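To make the idea of reusable filter weights concrete, here is a minimal single-channel convolution sketch in plain Python/NumPy (stride 1, no padding); the vertical-edge filter values are only an illustrative hand-crafted example of the kind of filter a ConvNet would learn:

```python
import numpy as np

# "Valid" convolution of one image channel with one filter: the same small set of
# weights is reused at every spatial position of the image.
def conv2d_valid(image, kernel):
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)  # same weights at every location
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
edge_filter = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])   # hand-crafted vertical-edge detector

print(conv2d_valid(image, edge_filter).shape)  # (4, 4): a 6x6 image and a 3x3 filter give a 4x4 map
```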
Students may refer to the links below for a more comprehensive understanding of CNNs:
https://youtu.be/zfiSAzpy9NM?feature=shared
https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
CNN - Stride:
...
Computation of the size of the convolved matrix
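For an n × n input convolved with an f × f filter using padding p and stride s, the size of the convolved (output) matrix is ⌊(n + 2p − f)/s⌋ + 1 along each dimension. A small Python helper (an illustrative sketch, not code from the course) applies this formula:

```python
# Output size of a convolution along one dimension:
# out = floor((n + 2*p - f) / s) + 1
def conv_output_size(n, f, p=0, s=1):
    return (n + 2 * p - f) // s + 1

print(conv_output_size(3, 2, p=0, s=1))   # 2  -> Numerical 1 below: 3x3 image, 2x2 filter
print(conv_output_size(64, 5, p=0, s=1))  # 60 -> Numerical 2: first convolutional layer
print(conv_output_size(60, 5, p=1, s=3))  # 20 -> Numerical 2: second convolutional layer
```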
Numericals:
1. Consider an input image of size 3 × 3 applied to a CNN model with a convolution layer having a filter of size 2 × 2.
Convolve the image with the given filter using a stride of 1 and without padding the input image.
Solution:
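The exact entries of the convolved output depend on the image and filter values shown on the slide, but the output dimensions follow directly from the size formula above:

Output size = (3 − 2)/1 + 1 = 2, so the convolved feature map is 2 × 2.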
2. Consider a CNN that consists of:
Input Image Size: 64 × 64 × 3 (RGB)
Convolutional Layer 1: 20 filters of size 5 × 5, stride 1, no padding
Convolutional Layer 2: 15 filters of size 5 × 5, stride 3, padding of 1
Max Pooling Layer: 2 × 2, stride 2
Fully Connected Layers: two hidden layers with 128 and 32 neurons respectively
a. Calculate the output dimensions after each layer.
b. Compute the total number of parameters at each layer (excluding pooling layers).
Solution:

Convolutional Layer 1 (20 filters of size 5 × 5, stride 1, no padding):
Output size: (64 − 5)/1 + 1 = 60, so the output is 60 × 60 × 20.
Parameters: each filter has 5 × 5 × 3 = 75 weights, giving a total of 75 × 20 = 1,500 parameters (bias terms not counted here).

Convolutional Layer 2 (15 filters of size 5 × 5, stride 3, padding 1):
Output size: (60 + 2 × 1 − 5)/3 + 1 = 57/3 + 1 = 20, so the output is 20 × 20 × 15.
Parameters: 5 × 5 × 20 × 15 = 7,500 (again excluding biases).

Max Pooling Layer (2 × 2, stride 2):
Output size: (20 − 2)/2 + 1 = 10, so the output is 10 × 10 × 15.
Flattened size: 10 × 10 × 15 = 1,500 values; the pooling layer has no parameters.

Fully Connected Layer 1 (128 neurons):
Parameters: (1,500 × 128) + 128 = 192,128.

Fully Connected Layer 2 (32 neurons):
Parameters: (128 × 32) + 32 = 4,128.
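These dimensions and parameter counts can be checked with a short Python sketch (illustrative only; bias terms are omitted for the convolutional layers to match the counts above):

```python
# Verify the output sizes and parameter counts for Numerical 2.
def conv_out(n, f, p, s):
    return (n + 2 * p - f) // s + 1

h = w = 64; c = 3                         # input: 64 x 64 x 3

h = w = conv_out(h, 5, 0, 1)              # Conv1: 20 filters 5x5, stride 1, no padding
conv1_params = 5 * 5 * c * 20             # weights only (no bias), as in the solution above
c = 20
print("Conv1:", (h, w, c), conv1_params)  # (60, 60, 20) 1500

h = w = conv_out(h, 5, 1, 3)              # Conv2: 15 filters 5x5, stride 3, padding 1
conv2_params = 5 * 5 * c * 15
c = 15
print("Conv2:", (h, w, c), conv2_params)  # (20, 20, 15) 7500

h = w = conv_out(h, 2, 0, 2)              # Max pooling 2x2, stride 2 (no parameters)
print("Pool:", (h, w, c))                 # (10, 10, 15)

flat = h * w * c                          # 1500 flattened values
fc1_params = flat * 128 + 128             # weights + biases
fc2_params = 128 * 32 + 32
print("FC1:", fc1_params, "FC2:", fc2_params)  # 192128 4128
```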
Challenges with convolution:
There are primarily two disadvantages here:
• Every time we apply a convolution operation, the size of the image shrinks.
• Pixels at the corners and edges of the image are used far fewer times during convolution than the central pixels, so the
corners receive little attention and information from them can be lost.
To overcome these problems, we can pad the image with an additional border, i.e., add one pixel all around the edges. A
6 × 6 input then becomes 8 × 8, and convolving it with a 3 × 3 filter produces a 6 × 6 output, which is the same as the
original shape of the image. This is where padding comes into the picture.
Padding: the convolution operation reduces the size of the image, i.e., the spatial dimension decreases, leading to
information loss. As we keep stacking convolutional layers, the size of the volume (feature map) shrinks faster and faster.
Zero padding allows us to control the size of the feature map, and is commonly used to make the output size the same as
the input size.
Padding amount = the number of rows and columns inserted at the top, bottom, left, and right of the image. After applying
a padding of p, an n × n input becomes (n + 2p) × (n + 2p); a convolution with an f × f filter and stride 1 then produces an
output of size (n + 2p − f + 1) × (n + 2p − f + 1), so choosing p = (f − 1)/2 keeps the output the same size as the input.
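A minimal sketch of the 6 × 6 example above (illustrative values, stride 1, assuming SciPy is available), showing that one pixel of zero padding keeps the output the same size as the input:

```python
import numpy as np
from scipy.signal import convolve2d

image = np.random.rand(6, 6)
kernel = np.random.rand(3, 3)

# Without padding ("valid" mode): the 6x6 image shrinks to 4x4.
print(convolve2d(image, kernel, mode="valid").shape)   # (4, 4)

# With one pixel of zero padding all around: 8x8 input, output back to 6x6.
padded = np.pad(image, pad_width=1)                    # zeros on every side
print(convolve2d(padded, kernel, mode="valid").shape)  # (6, 6) -> same size as the original image
```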
Convolution Layer:
Pooling Layer:
Advantages of Pooling Layer:
1. Dimensionality reduction: The main advantage of pooling layers is that they help in reducing the spatial dimensions of
the feature maps. This reduces the computational cost and also helps in avoiding overfitting by reducing the number of
parameters in the model.
2. Translation invariance: Pooling layers are also useful in achieving translation invariance in the feature maps. This means
that the position of an object in the image does not affect the classification result, as the same features are detected regardless
of the position of the object.
3. Feature selection: Pooling layers can also help in selecting the most important features from the input, as max pooling
selects the most salient features and average pooling preserves more information.
Disadvantages of Pooling Layer:
1. Information loss: One of the main disadvantages of pooling layers is that they discard some information from the input
feature maps, which can be important for the final classification or regression task.
2. Over-smoothing: Pooling layers can also over-smooth the feature maps, blurring fine-grained details that the final
classification or regression task may depend on.
3. Hyperparameter tuning: Pooling layers also introduce hyperparameters such as the size of the pooling regions and the
stride, which need to be tuned in order to achieve optimal performance. This can be time-consuming and requires some
expertise in model building.
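A minimal NumPy sketch of 2 × 2 max pooling with stride 2 (illustrative, not tied to any particular framework), showing the dimensionality reduction described above:

```python
import numpy as np

def max_pool2d(feature_map, size=2, stride=2):
    # Slide a size x size window with the given stride and keep the maximum in each window.
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max()
    return out

fm = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(fm))
# [[ 5.  7.]
#  [13. 15.]]  -> a 4x4 map reduced to 2x2, keeping the strongest activation in each region
```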
Numerical: Homework
Find the output after applying convolution followed by pooling.