0% found this document useful (0 votes)

34 views

AITools Unit-4

The document provides information about deep learning including: 1) It describes the motivation for deep learning as overcoming obstacles that simple machine learning algorithms have not succeeded at such as object and speech recognition. 2) It defines deep learning as simulating human brain pattern recognition by passing input through various layers of a neural network and has been applied to fields like computer vision and natural language processing. 3) It gives a brief historical background of deep learning including the first drawing of a neuron in 1899 and how artificial neurons were inspired by biological neurons to receive and pass signals between layers.

Uploaded by

Siva

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

34 views

AITools Unit-4

Uploaded by

Siva

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 25

UNIT-4

Deep Learning: Basics of Deep Learning, Machine Learning Vs Deep

Learning, Fundamental Deep Learning Algorithm-Convolution Neural
Network (CNN).

Q) Describe the Motivation for Deep Learning.

The simple machine learning algorithms work very well on a wide variety of
important problems. However, they have not succeeded in solving the
central problems in AI, such as recognizing speech or recognizing
objects.Deep learning was designed to overcome these and other obstacles.

Q) Define Deep Learning(DL).

Deep learning is an aspect of artificial intelligence (AI) that is to simulate
the activity of the human brain specifically, pattern recognition by passing
input through various layers of the neural network.
Deep-learning architectures such as deep neural networks, deep belief
networks, recurrent neural networks and convolutional neural
networks have been applied to fields including computer vision, machine
vision, speech recognition, natural language processing, audio recognition,
social network filtering, machine translation, bioinformatics, drug
design, medical image analysis, material inspection and board
game programs.

Q)Give brief historical background of Deep Learning.

All the algorithms that are used in deep learning are largely inspired by the
way neurons and neural networks function and process data in the brain.
This image is one of the very first pictures of aneuron. It was drawn by
Santiago Ramon y Cajal, back in 1899 based on what he saw after placing a
pigeon's brain under the microscope. He is now known as the fatherof
modern neuroscience.

Fig. Human Neural Functioning

It is possible to mimic certain parts of neurons, such as dendrites, cell bodies
and axons using simplified mathematical models of what limited knowledge
we have on their inner workings: signals can be received from dendrites, and
sent down the axon once enough signals were received. This outgoing signal
can then be used as another input for other neurons, repeating the process.
Some signals are more important than others and can trigger some neurons
to fire easier. Connections can become stronger or weaker, new connections
can appear while others can cease to exist.

Fig. Biological Neuron

An artificial neuron behaves in the same way as a biological neuron. So it
consists of a soma(cell body for processing information), dendrites(input),
and an axon terminal to pass on the output of this neuron to other
neurons. The end of the axon can branch off to connect to many other
neurons.

Q) Differentiate a Biological neuron and and Artificial neuron.

Biological neuron Artificial
neuron
dendrites inputs
synapses weight or inter connection
axon output
cell body (Soma) summation and threshold

Q) Define Artificial Neuron(AN). Explain the computation/processing of

AN with an example.

Neural Networks are networks used in Deep Learning that work similar to
the human nervous system.
An artificial neuron is a mathematical function conceived as a model
of biological neurons, a neural network. Artificial neurons are elementary
units in an artificial neural network.
The artificial neuron receives one or more inputs and sums them to produce
an output by applying some activation function.
Fig. Artificial Neuron

For the above general model of artificial neural network, the net input can
be calculated as follows:
yin= (x1.w1+x2.w2+x3.w3…xm.wm) + bias
i.e., yin= ∑imxi.wi+ bias
where, Xi is set of features and Wi is set of weights.
Bias is the information which can impact output without being dependent
on any feature.
The output can be calculated by applying the activation function over the
net input.
Y=F(yin)
Each AN has an internal state, which is called an activation signal. Output
signals, which are produced after combining the input signals and activation
rule, may be sent to other units.
Eg.
Q) Construct a single layer neural network for implementing OR, AND,
NOT gates.

Let us take the activation function as:

The AND function can be implemented as:

The output of this neuron is:a = f( -1.51 + x11 + x2*1 )

Calculation for summation:
X1 = 0, x2 = 0 => f(-1.5 + 0 + 0) = f(-1.5) = 0
0 1 =>f(-1.5 + 0 + 1) = f(-0.5) = 0
1 0 =>f(-1.5 + 1 + 0) =f(-0.5) = 0
1 1 =>f(-1.5 + 1 +1) = f(+0.5) = 1
The truth table for this implementation is:

The OR function can be implemented as:

The output of this neuron is:a = f( -0.5 + x1 + x2 )

The truth table for this implementation is:
The NOT function can be implemented as:

The output of this neuron is:a = f( 1 – 2*x1 )

The truth table for this implementation is:

Q) Explain the need for multi-layered neural network with an example.

1. XOR:

XOR(A,B) = (A+B)*(AB)|
This sort of a relationship cannot be modeled using a single neuron. Thus
we will use a multi-layer network.
The idea behind using multiple layers is that complex relations can be
broken into simpler functions and combined.
2. XNOR function looks like:

Lets break down the XNOR function.

X1 XNOR X2 = NOT ( X1 XOR X2 )
= NOT [ (A+B).(A'+B') ]
= (A+B)' + (A'+B')'
= (A'.B') + (A.B)
a neuron to model A’.B’:

The output of this neuron is:a = f( 0.5 – x1 – x2 )

The truth table for this function is:

The different outputs represent different units:

a1: implements A‟.B‟
a2: implements A.B
a3: implements OR which works on a1 and a2, thus effectively (A‟.B‟ + A.B)
The functionality can be verified using the truth table:
Q) Define activation function. Explain different types of activation
functions.

 Activation Functions are extremely important feature of the Artificial

Neural Network. They basically decide whether a neuron should be
activated or not. It limits the output signal to a finite value.
 Activation Function does the non-linear transformation to the
input making it capable to learn more complex relation between input
and output. It make the network capable of learning more complex
pattern.
 Without an activation function, the neural network is just a linear
regression model as it performs only summation of product of input
and weights.
Eg. In the below image 2 requires a complex relation which is curve unlike a
simple linear relation in image 1.

Fig. Illustrating the need of Activation Function for a complex problem.

Activation function must be efficient and it should reduce the computation

time because the neural network sometimes trained on millions of data
points.

Types of AF:
The Activation Functions can be basically divided into 3 types-
1. Binary step Activation Function
2. Linear Activation Function
3. Non-linear Activation Functions

1. Binary Step Function

A binary step function is a threshold-based activation function. If the input
value is above or below a certain threshold, the neuron is activated and
sends exactly the same signal to the next layer.We decide some threshold
value to decide output that neuron should be activated or deactivated.It is
very simple and useful to classify binary problems or classifier.
Eg.f(x) = 1 if x > 0 else 0 if x <= 0

2. Linear or Identity Activation Function

As you can see the function is a line or linear. Therefore, the output of the
functions will not be confined between any range.

Fig: Linear Activation Function

Equation: f(x) = x
Range : (-infinity to infinity)
It doesn‟t help with the complexity or various parameters of usual data that
is fed to the neural networks
3. Non-linear Activation Function
The Nonlinear Activation Functions are the most used activation functions.
Nonlinearity helps to makes the graph look something like this.
Fig: Non-linear Activation Function
The main terminologies needed to understand for nonlinear functions are:
Derivative or Differential: Change in y-axis w.r.t. change in x-axis.It is also
known as slope.
Monotonic function: A function which is either entirely non-increasing or
non-decreasing.

The Nonlinear Activation Functions are mainly divided on the basis of

their range or curves-

Advantage of Non-linear function over the Linear function :

Differential is possible in all the non -linear function.
Stacking of network is possible, which helps us in creating deep neural nets.
It makes it easy for the model to generalize

3.1 Sigmoid(Logistic AF)(σ):

The main reason why we use sigmoid function is it exists between 0 to 1.
It is especially used for models where we have to predict the probability as
output. Since probability of anything exists only between the range of 0 and
1, sigmoid is the right choice.

Fig: Sigmoid Function (S-shaped Curve)

The function is differentiable and monotonic. But function derivative is
not monotonic.
The logistic sigmoid function can cause a neural network to get stuck at the
training time.
Advantages
1. Easy to understand and apply
2. Easy to train on small dataset
3. Smooth gradient, preventing “jumps” in output values.
4. Output values bound between 0 and 1, normalizing the output of each
neuron.
Disadvantages:
 Vanishing gradient—for very high or very low values of X, there is
almost no change to the prediction, causing a vanishing gradient
problem. This can result in the network refusing to learn further, or
being too slow to reach an accurate prediction.
 Outputs not zero centered.
 Computationally expensive

3.2 TanH(Hyperbolic Tangent AF):

TanH is also like logistic sigmoid but in better way. The range of the
TanHfunction is from -1 to +1.

TanH is often preferred over the sigmoid neuron because it is zero centred.
The advantage is that the negative inputs will be mapped strongly negative
and the zero inputs will be mapped near zero in tanh graph.

tanh(x) = 2 * sigmoid(2x) - 1

Fig. Sigmoid Vs Tanh

The function is differentiable and monotonic. But function derivative is

not monotonic.
Advantages
 Zero centered—making it easier to model inputs that have strongly
negative, neutral, and strongly positive values.
Disadvantages
 Like the Sigmoid function is also suffers from vanishing gradient
problem
 hard to train on small datasets

3.3 ReLU(Rectified Linear Unit):

The ReLU is the most used activation function. It is used in almost all
convolution neural networks in hidden layers only.
The ReLU is half rectified(from bottom). f(z) = 0, if z < 0
= z, otherwise
R(z) = max(0,z)
The range is 0 to inf.

Advantages
 Avoids vanishing gradient problem.
 Computationally efficient—allows the network to converge very
quickly
 Non-linear—although it looks like a linear function, ReLU has a
derivative function and allows for backpropagation

Disadvantages
 Can only be used with a hidden layer
 hard to train on small datasets and need much data for learning non-
linear behavior.
 The Dying ReLU problem—when inputs approach zero, or are
negative, the gradient of the function becomes zero, the network
cannot perform backpropagation and cannot learn.

The function and its derivative both are monotonic.

All the negative values are converted into zero, and this conversion rate is so
fast that neither it can map nor fit into data properly which creates a
problem.
Leaky ReLU Activation Function

We needed the Leaky ReLU activation function to solve the „Dying ReLU‟
problem.
Leaky ReLU we do not make all negative inputs to zero but to a value near
to zero which solves the major issue of ReLU activation function.

R(z) = max(0.1*z,z)

Advantages
 Prevents dying ReLU problem—this variation of ReLU has a small
positive slope in the negative area, so it does enable backpropagation,
even for negative input values
 Otherwise like ReLU
Disadvantages
 Results not consistent—leaky ReLU does not provide consistent
predictions for negative input values.

3.4 Softmax:

 Sigmoid able to handle more than two cases(class label).

 Softmax can handle multiple cases. Softmax function squeeze the
output for each class between 0 and 1 with sum of them is 1.
 It is ideally used in the final output layer of the classifier, where we
are actually trying to attain the probabilities.
 Softmax produces multiple outputs for an input array. For this
reason, we can build neural network models that can classify more
than 2 classes instead of binary class solution.

sigma = softmax
zi = input vector
e^{zi}} = standard exponential function for input vector
K = number of classes in the multi-class classifier
e^{zj} = standard exponential function for output vector
e^{zj} = standard exponential function for output vector
Advantages
Able to handle multiple classes only one class in other activation
functions—normalizes the outputs for each class between 0 and 1with the
sum of the probabilities been equal to 1, and divides by their sum, giving the
probability of the input value being in a specific class.
Useful for output neurons—typically Softmax is used only for the output
layer, for neural networks that need to classify inputs into multiple
categories.

Q) Explain about Deep feedforward networks or feedforward neural

networks or multilayer perceptron (MLP).
A deep neural network is a neural network with atleast two hidden layers.
Deep neural networks use sophisticated mathematical modeling to process
data in different ways.Traditional machine learning algorithms are linear,
deep learning algorithms are stacked in a hierarchy.

Fig. Deep Feedforward Network

Deep learning creates many layers of neurons, attempting to learn
structured representation, layer by layer.

The goal of a feedforward network is to approximate some function f ∗. For

example,for a classifier, y = f ∗(x) maps an input x to a category y.

A feedforward network defines a mapping y = f (x; θ) and learns the value of

the parameters θ that result in the best function approximation.

These models are called feedforward because information flows through the
function being evaluated from x, through the intermediate computations
used to define f, and finally to the output y. There are no feedback
connections in which outputs of the model are fed back into itself.

When feedforward neural networks are extended to include feedback

connections, they are called recurrent neural networks.

Feedforward networks are of extreme importance to machine learning

practitioners.They form the basis of many important commercial
applications. Forexample, the convolutional networks used for object
recognition from photos are aspecialized kind of feedforward network.

Feedforward neural networks are called networks because they are typically
represented by composing together many different functions. The model is
associated with a directed acyclic graph describing how the functions are
composed together.

For example, we might have three functions f (1), f (2), and f (3) connected in a
chain, to form f(x) = f(3)(f (2)(f(1) (x ))). This chain structure is most commonly
used structure of neural networks. In this case, f (1) is called the first layer of
the network called input layer used to feed the input into the network; f (2)
is called the second layer called hidden layer used to train the neural
network, and so on. The final layer of a feedforward network is called the
output layer that provides the output of the network. The overall length of
the chain gives the depth of the model and width of the model is number of
neurons in the input layer. It is from this terminology that the name “deep
learning” arises.

Q) Differentiate ML & DL.

1. Data dependencies for Performance:
When the data is small, deep learning algorithms don‟t perform that well.
This is because deep learning algorithms need a large amount of data to
understand it perfectly. On the other hand, traditional machine learning
algorithms with their handcrafted rules prevail in this scenario.
2. Hardware dependencies
Deep learning algorithms heavily depend on high-end machines, contrary to
traditional machine learning algorithms, which can work on low-end
machines. Deep learning algorithms inherently do a large amount of matrix
multiplication operations. These operations can be efficiently optimized
using a GPU.

3. Feature engineering:
Feature engineering is the process of transforming raw data into features
that better represent the underlying problem to the predictive models,
resulting in improved model accuracy on unseen data.Feature engineering
turn your inputs into things the algorithm can understand.

In Machine learning, most of the applied features need to be identified

by an expert and then hand-coded as per the domain and data type.
Features can be pixel values, shape, textures, position and orientation. The
performance of most of the Machine Learning algorithm depends on how
accurately the features are identified and extracted.

Deep learning algorithms try to learn high-level features from data.

Deep learning reduces the task of developing new feature extractor for every
problem. Like, Convolutional NN will try to learn low-level features such as
edges and lines in early layers then parts of faces of people and then high-
level representation of a face.

4. Problem Solving approach

When solving a problem using traditional machine learning algorithm, it is
generally recommended to break the problem down into different parts,
solve them individually and combine them to get the result. Deep learning in
contrast advocates to solve the problem end-to-end.
Eg. Suppose you have a task of multiple object detection. The task is to
identify what is the object and where is it present in the image.
In a typical ML approach, you would divide the problem into two steps,
object detection and object recognition
On the contrary, in deep learning approach, you would do the process
end-to-end.

5. Execution time
Usually, a deep learning algorithm takes a long time to train. This is
because there are so many parameters in a deep learning algorithm that
training them takes longer than usual. Whereas machine learning
comparatively takes much less time to train, ranging from a few seconds to
a few hours.
This is turn is completely reversed on testing time. At test time, deep
learning algorithm takes much less time to run. Whereas, if you compare it
with k-nearest neighbors (ML algorithm), test time increases on increasing
the size of data. Although this is not applicable on all machine learning
algorithms, as some of them have small testing times too.

6. Interpretability:
Suppose we use deep learning to give automated scoring to essays. The
performance it gives in scoring is quite excellent and is near human
performance. But there‟s is an issue. It does not reveal why it has given that
score. Indeed mathematically you can find out which nodes of a deep neural
network were activated, but we don‟t know what there neurons were
supposed to model and what these layers of neurons were doing collectively.
So we fail to interpret the results.
On the other hand, machine learning algorithms like decision trees
give us crisp rules as to why it chose what it chose, so it is particularly easy
to interpret the reasoning behind it. Therefore, algorithms like decision trees
and linear/logistic regression are primarily used in industry for
interpretability.

Characteristic ML DL
requires less amount of requires large amount
Data dependencies for data for identifying of data for better
Performance rules performance
work on low-end heavily depend on high-
Hardware dependencies machines end machines
Deep learning
algorithms try to learn
high-level features from
features need to be data.
identified by an expert Deep learning reduces
and then hand-coded as the task of developing
per the domain and new feature extractor
Feature engineering data type for every problem.
Break the problem into
Problem Solving parts, finds and Solves the problem end-
approach combines the solution to-end
Takes much small time
for training but may
Execution time Takes more time for take more time for
training and less time testing depending on
for testing the algorithm like KNN
Fails to interpret the Easy to interpret the
Interpretability results results
Q) Explain various applications of Deep Learning.

There are various interesting applications for Deep Learning that made
impossible things before a decade into reality. Some of them are:
1. Color restoration, where a given image in greyscale is automatically
turned into a colored one.
2. Recognizing hand written message.
3. Adding sound to a silent video that matches with the scene taking
place.
4. Self-driving cars
5. Computer Vision: for applications like vehicle number plate
identification and facial recognition.
6. Information Retrieval: for applications like search engines, both text
search, and image search.
7. Marketing: for applications like automated email marketing, target
identification
8. Medical Diagnosis: for applications like cancer identification, anomaly
detection
9. Natural Language Processing: for applications like sentiment analysis,
photo tagging
10. Online Advertising, etc

Q) Briefly explain about loss function in neural networks.

Neural Network uses optimising strategies to minimize the error in the

algorithm. The way we actually compute this error is by using a Loss
Function. It is used to quantify how good or bad the model is performing.
These are divided into two categories i.e. Regression loss and Classification
Loss.

1. Regression Loss Function

Regression Loss is used when we are predicting continuous values like the
price of a house or sales of a company.
Eg. Mean Squared Error
Mean Squared Error is the mean of squared differences between the actual
and predicted value. If the difference is large the model will penalize it as we
are computing the squared difference.
2. Binary Classification Loss Function
Suppose we are dealing with a Yes/No situation like “a person has diabetes
or not”, in this kind of scenario Binary Classification Loss Function is used.
Eg. Binary Cross Entropy Loss
It gives the probability value between 0 and 1 for a classification task.
Cross-Entropy calculates the average difference between the predicted and
actual probabilities.
3. Multi-Class Classification Loss Function
If we take a dataset like Iris where we need to predict the three-class labels:
Setosa, Versicolor and Virginia, in such cases where the target variable has
more than two classes Multi-Class Classification Loss function is used.
Eg. Categorical Cross Entropy Loss:
These are similar to binary classification cross-entropy, used for multi-class
classification problems.

Q) Explain briefly about gradient descent algorithm.

A deep learning neural network learns to map a set of inputs to a set of
outputs from training data. We cannot calculate the perfect weights for a
neural network.
Gradient descent is an iterative optimization algorithm for finding the
minimum of a function.
To find the minimum of a function using gradient descent, one takes
steps proportional to the negative of the gradient of the function at the
current point.
The “gradient” in gradient descent refers to an error gradient. The
model with a given set of weights is used to make predictions and the error
for those predictions is calculated.
Eg.

Fig. Gradient Descent

The gradient is given by the slope of the tangent at w = 0.2, and then the
magnitude of the step is controlled by a parameter called the learning rate.
The larger the learning rate, the bigger the step we take, and the smaller the
learning rate, the smaller the step we take. Then we take the step and we
move to w1.
Now when choosing the learning rate, we have to be very careful as a large
learning rate can lead to big steps and eventually missing the minimum.
On the other hand, a small learning rate can result in very small steps and
therefore causing the algorithm to take a long time to find the minimum
point.

Q) Explain about Back propagation algorithm.

Back-propagation is the essence of neural net training. It is the method of

fine-tuning the weights of a neural net based on the error rate obtained in
the previous epoch (i.e., iteration). Proper tuning of the weights allows you to
reduce error rates and to make the model reliable by increasing its
generalization.

The algorithm is used to effectively train a neural network through a method

called chain rule. In simple terms, after each forward pass through a
network, back propagation performs a backward pass while adjusting the
model‟s parameters (weights and biases).

Algorithm:
1. Initialize the weights and biases.
2. Iteratively repeat the following steps until defined number of times or
threshold value is reached:
i. Calculate network output using forward propagation.
ii. Calculate error between actual and predicted values.
iii. Propagate the error back into the network and update weights
and biases using the equations:

Fig. illustrating BP
Example:
Forward Propagation:

Therefore,
z1 = 0.415 a1 = 0.6023 z2 = 0.9210 a2 = 0.7153

Let us consider,
epochs = 1000 threshold = 0.001
learning rate = 0.4 T = 0.25

4.
E = 1/2(T-a2)2 = 0.1083
Eqn # 1: 𝑧1 = 𝑥1 ∙ 𝑤1 + 𝑏1
Eqn # 2: 𝑎1 = (𝑧1) = 1/( 1+ 𝑒 −𝑧1 )
Eqn # 3: 𝑧2 = 𝑎1 ∙ 𝑤2 + 𝑏2
Eqn # 4: 𝑎2 = (𝑧2) = 1 /(1+ 𝑒 −𝑧2)
Eqn # 5: 𝐸 = 1 /2 (𝑇 − 𝑎2)2
Updating w2:

= 0.45 - 0.4(-(0.25-0.7153))*(0.7153(1-0.7153))*(0.6023)
= 0.45 - 0.4*0.05706
= 0.427
Updating b2:
= 0.65 - 0.4*(-(0.25-0.7153))*(0.7153(1-0.7153))*1
= 0.65 - 0.4*0.0948
= 0.612
Updating w1:

= 0.15 - 0.4(-(0.25-0.7153))(0.7153(1 -0.7153))0.450.6023(1-

0.6023)*0.1
= 0.15 - 0.4*0.001021
= 0.1496
Updating w2:

= 0.40 - 0.4*(-(0.25-0.7153))*(0.7153(1-0.7153))*0.45*0.6023(1-
0.6023)*1
= 0.40-0.4*0.01021
= 0.3959

Therefore we continue next iteration(feedforward) with the update

values of w1,b1,w2 and b2.
w1 = 0.1496 b1 = 0.3959 w2 = 0.427 b2 = 0.612
x1 = 0.1.

Q) What is Vanishing Gradient problem?

As more layers using certain activation functions are added to neural
networks, the gradients of the loss function approaches zero, making the
network hard to train.

Eg. In the below problem the derivatives with respect to weights are very
small.
So when we do back propagation, we keep multiplying the factors that are
less than 1 by each other and gradients tend to smaller and smaller by
moving backward in the network.
This means the neurons in the earlier layers learn very slowly. The result is
a training process that takes too long and prediction accuracy is
compromised.

Q) Explain indetail about CNN model.

MLP‟s use one perceptron for each input (e.g. pixel in an image, multiplied
by 3 in RGB case). The amount of weights rapidly becomes unmanageable
for large images. For a 224 x 224 pixel image with 3 color channels there are
around 150,528 weights that must be trained! As a result, difficulties arise
whilst training and overfitting can occur.

A Convolutional neural network (CNN) is a neural network that has one or

more convolutional layers and is used mainly for image processing,
classification, segmentation.

Fig. CNN Architecture

Input layer:
The input to a cnn, is mostly an image(nxmx1-gray scale image and nxmx3-
colored image)

Fig. RGB image as input

Convolution layer:
Here, we basically define filters and we compute the convolution between the
defined filters and each of the 3 images.

Fig. convolution operation

In the same way we apply to remaining (above is for red image, then we do
same for green and blue) images. We can apply more than one filter. More
filters we use, we can preserve spatial dimensions better.

We use convolution instead of considering flatten image as input as we will

end up with a massive number of parameters that will need to be optimized
and computationally expensive.
Eg. We require 25 weights if we take 5x5x1 image with out convolution.
We require 16 weights(n-f+1 x n-f+1) if we take 5x5x1 image with 2x2
convolution filter.

By using convolution we can prevent overfitting of the model.

It is worth to have ReLU activation function in convolution layer which

passed only positive values and make negative values to zeros.

Pooling layer:
Pooling layer objective is to reduce the spatial dimensions of the data
propagating through the network.
1. Max Pooling is the most common, for each section of the
image we scan and keep the highest value.

Fig. Max Pooling with stride = 2

Max. pooling provides spatial variance which enables the neural network to
recognize objects in an image even if the object does not exactly resemble
the original object.

2 Average Pooling: Here, we take average of area we scan.

Fig. Average Pooling with stride = 2

Fully Connected Layer:

Here, we flatten the output of the last convolutional layer
and connect every node of the current layer with every
other node of the next layer.

This layer basically takes output of the preceding layer,

whether it is a convolutional layer, ReLU or Pooling layer
and outputs an n-dimensional vector, where n is
number of classes pertaining to the problem.

Fig. Fully Connected Layer

Q) Differentiate Shallow NN and Deep NN.

Shallow Neural Network Deep Neural Network

It consists of one hidden layer It consists of more than one hidden
layer
It takes input as vectors only It takes raw data like images and
text as input.

Big Data Analytics (CS443) IV B.Tech (IT) 2018-19 I Semester
No ratings yet
Big Data Analytics (CS443) IV B.Tech (IT) 2018-19 I Semester
72 pages
Deep Learning - Chorale Prelude
No ratings yet
Deep Learning - Chorale Prelude
2 pages
Question Bank
No ratings yet
Question Bank
26 pages
@vtucode - in Module 5 AI 2021 Scheme 5th Sem
No ratings yet
@vtucode - in Module 5 AI 2021 Scheme 5th Sem
66 pages
Deep learning notes
No ratings yet
Deep learning notes
47 pages
Module 5 AIML Notes
No ratings yet
Module 5 AIML Notes
77 pages
Unit I Architecture of Neural Network
No ratings yet
Unit I Architecture of Neural Network
74 pages
ASC unit 1
No ratings yet
ASC unit 1
14 pages
Unit 5
No ratings yet
Unit 5
75 pages
Neural Network (1)
No ratings yet
Neural Network (1)
98 pages
Lecture 2
No ratings yet
Lecture 2
35 pages
Background Theory of Deep Learning: 2.1 Origin
No ratings yet
Background Theory of Deep Learning: 2.1 Origin
18 pages
Networks. These Are Formed From Trillions of Neurons (Nerve
No ratings yet
Networks. These Are Formed From Trillions of Neurons (Nerve
14 pages
Nndl Umit 1 Important Questions
No ratings yet
Nndl Umit 1 Important Questions
8 pages
Module - 3 AAI
No ratings yet
Module - 3 AAI
119 pages
M5 FULL - ANN, CLUSTERING ALGORITHM - 21CS54 notes
No ratings yet
M5 FULL - ANN, CLUSTERING ALGORITHM - 21CS54 notes
71 pages
Function of Single Biological Neuron and Modelling of Artificial Neuron From It
No ratings yet
Function of Single Biological Neuron and Modelling of Artificial Neuron From It
33 pages
Artificial Neural Network
No ratings yet
Artificial Neural Network
29 pages
7 Neural Networks - Lecture Slides
No ratings yet
7 Neural Networks - Lecture Slides
74 pages
ANN (SPPU AI&DS Insem Solved Question Paper 2019 Pattern)
No ratings yet
ANN (SPPU AI&DS Insem Solved Question Paper 2019 Pattern)
26 pages
Artificial Neural Networks
No ratings yet
Artificial Neural Networks
8 pages
Machine Learning Unit 2
No ratings yet
Machine Learning Unit 2
33 pages
Unit 1: 1. Introduction To Artificial Neural Network
No ratings yet
Unit 1: 1. Introduction To Artificial Neural Network
17 pages
Unit-5 DR - HCV
No ratings yet
Unit-5 DR - HCV
34 pages
Unit-5
No ratings yet
Unit-5
59 pages
Aditya Jain NN Assignment
No ratings yet
Aditya Jain NN Assignment
13 pages
CL7204-Soft Computing Techniques
No ratings yet
CL7204-Soft Computing Techniques
13 pages
ANN Unit 1 Ass PDF
No ratings yet
ANN Unit 1 Ass PDF
11 pages
Lecture8,9-Neural Networks
No ratings yet
Lecture8,9-Neural Networks
65 pages
CHAPTER-5
No ratings yet
CHAPTER-5
10 pages
NN unit_1
No ratings yet
NN unit_1
27 pages
Neural Networks
No ratings yet
Neural Networks
61 pages
New Microsoft Word Document
No ratings yet
New Microsoft Word Document
14 pages
5 Basic Neural Networks
No ratings yet
5 Basic Neural Networks
40 pages
Lec 19
No ratings yet
Lec 19
16 pages
ANN_Module_1
No ratings yet
ANN_Module_1
39 pages
Research Proposal Presentation
No ratings yet
Research Proposal Presentation
20 pages
Unit 6
No ratings yet
Unit 6
70 pages
Lecture 08 On Neural Networks 1
No ratings yet
Lecture 08 On Neural Networks 1
15 pages
UNIT - 1 DLNN
No ratings yet
UNIT - 1 DLNN
36 pages
Neural Networks
No ratings yet
Neural Networks
21 pages
Biological Neuron and Memory: Understanding The Basics of Neural Function and Memory Mechanisms
No ratings yet
Biological Neuron and Memory: Understanding The Basics of Neural Function and Memory Mechanisms
455 pages
Artificial Neural Networks
No ratings yet
Artificial Neural Networks
48 pages
Week 4
No ratings yet
Week 4
5 pages
AI Unit5 Neural Network 1c2c9166 c1b7 47a3 8ce1 e914f1ab6afb
No ratings yet
AI Unit5 Neural Network 1c2c9166 c1b7 47a3 8ce1 e914f1ab6afb
52 pages
Artificial Neural Networks
No ratings yet
Artificial Neural Networks
66 pages
2.neural Network
No ratings yet
2.neural Network
19 pages
Must Know Questions Deep Learning
No ratings yet
Must Know Questions Deep Learning
22 pages
deep_learning
No ratings yet
deep_learning
5 pages
Artificial intelligence basics
No ratings yet
Artificial intelligence basics
13 pages
CS 522 Selected Topics in CS: Lecture 07 - Artificial Neural Network
No ratings yet
CS 522 Selected Topics in CS: Lecture 07 - Artificial Neural Network
52 pages
WINSEM2023-24 BITE410L TH VL2023240503970 2024-03-11 Reference-Material-I
No ratings yet
WINSEM2023-24 BITE410L TH VL2023240503970 2024-03-11 Reference-Material-I
40 pages
Deep Learning
No ratings yet
Deep Learning
189 pages
Module-2
100% (1)
Module-2
62 pages
SC ESE Cae 1
No ratings yet
SC ESE Cae 1
25 pages
Artificial Intelligence in Robotics-21-50
No ratings yet
Artificial Intelligence in Robotics-21-50
30 pages
Unit-5
No ratings yet
Unit-5
58 pages
An Ingression Into Deep Learning - FP
No ratings yet
An Ingression Into Deep Learning - FP
17 pages
Ann 2022
No ratings yet
Ann 2022
28 pages
EAI14
No ratings yet
EAI14
16 pages
Artificial Neural Network
No ratings yet
Artificial Neural Network
14 pages
Neural Networks
From Everand
Neural Networks
Sasha Kurzweil
No ratings yet
I MTechMPharmacy
No ratings yet
I MTechMPharmacy
1 page
Revised IV B. Tech
No ratings yet
Revised IV B. Tech
1 page
Universal Human Values II - Understanding Harmony 20EGM03
No ratings yet
Universal Human Values II - Understanding Harmony 20EGM03
2 pages
IMBAMCA
No ratings yet
IMBAMCA
1 page
PennStateSchool08 LecNotes
No ratings yet
PennStateSchool08 LecNotes
529 pages
R Cet Vacancy 202218032023
No ratings yet
R Cet Vacancy 202218032023
2 pages
MC5403 Adbdm Unit I Notes
No ratings yet
MC5403 Adbdm Unit I Notes
95 pages
Banking UNIT IV
No ratings yet
Banking UNIT IV
17 pages
427 16sacaob3 2020051805192483
No ratings yet
427 16sacaob3 2020051805192483
66 pages
MC5403 Adbdm Unit Ii Notes
No ratings yet
MC5403 Adbdm Unit Ii Notes
59 pages
Img - 31 5 13
No ratings yet
Img - 31 5 13
5 pages
WT R19 Unit 3
No ratings yet
WT R19 Unit 3
18 pages
D2-S1 B Self-Exploration J Happiness and Prosperity July 26
No ratings yet
D2-S1 B Self-Exploration J Happiness and Prosperity July 26
17 pages
Eps Unit 2
No ratings yet
Eps Unit 2
81 pages
Eps Unit 1
No ratings yet
Eps Unit 1
11 pages
Hydrostatics (4A)
No ratings yet
Hydrostatics (4A)
4 pages
Big Data NOTES and QB
No ratings yet
Big Data NOTES and QB
92 pages
249 Enterpreneurship Lesson 14
No ratings yet
249 Enterpreneurship Lesson 14
11 pages
Fineartseventsyouthfestivaldec 2022
No ratings yet
Fineartseventsyouthfestivaldec 2022
1 page
Syllabus EC 003
No ratings yet
Syllabus EC 003
2 pages
NN DL
No ratings yet
NN DL
54 pages
Big Data - Hadoop Questions Answers
No ratings yet
Big Data - Hadoop Questions Answers
18 pages
Organizational Behaviour: Alagappa University
No ratings yet
Organizational Behaviour: Alagappa University
208 pages
M.Tech - CTM II Year Syllabus
No ratings yet
M.Tech - CTM II Year Syllabus
17 pages
B. Techacforgovt
No ratings yet
B. Techacforgovt
4 pages
Bcse332p Deep-Learning-Lab Lo 1.0 0 Bcse332p
No ratings yet
Bcse332p Deep-Learning-Lab Lo 1.0 0 Bcse332p
2 pages
Artificial Neural Networks - MiniProject
100% (1)
Artificial Neural Networks - MiniProject
16 pages
Deep Vs Shallow Neural Networks
No ratings yet
Deep Vs Shallow Neural Networks
13 pages
Deep Learning
No ratings yet
Deep Learning
299 pages
DL Unit 2
No ratings yet
DL Unit 2
107 pages
Image Caption Generator Final Report
No ratings yet
Image Caption Generator Final Report
28 pages
Fundamentals of ANN
No ratings yet
Fundamentals of ANN
213 pages
Machine Learning: Lecture 4: Artificial Neural Networks (Based On Chapter 4 of Mitchell T.., Machine Learning, 1997)
No ratings yet
Machine Learning: Lecture 4: Artificial Neural Networks (Based On Chapter 4 of Mitchell T.., Machine Learning, 1997)
14 pages
CS445 - Neural Networks and Deep Learning - Lecture Notes
No ratings yet
CS445 - Neural Networks and Deep Learning - Lecture Notes
5 pages
Exam 2004
No ratings yet
Exam 2004
20 pages
A Survey of Convolutional Neural Networks - Analysis-Applications-Prospects
No ratings yet
A Survey of Convolutional Neural Networks - Analysis-Applications-Prospects
21 pages
Full Deep Learning With Python Develop Deep Learning Models On Theano and TensorFLow Using Keras Jason Brownlee Ebook All Chapters
100% (3)
Full Deep Learning With Python Develop Deep Learning Models On Theano and TensorFLow Using Keras Jason Brownlee Ebook All Chapters
62 pages
DL CS05
No ratings yet
DL CS05
22 pages
ssw9 PS2-13 Wu
No ratings yet
ssw9 PS2-13 Wu
6 pages
Week 1
No ratings yet
Week 1
24 pages
Learning
No ratings yet
Learning
12 pages
Deep Learning (MODULE-5)
No ratings yet
Deep Learning (MODULE-5)
71 pages
4-Neural Networks and Activation Function
No ratings yet
4-Neural Networks and Activation Function
28 pages
cst414- Deep learning
No ratings yet
cst414- Deep learning
34 pages
Neural Networks:: Basics Using MATLAB
No ratings yet
Neural Networks:: Basics Using MATLAB
54 pages
Keras and Tensorflow
No ratings yet
Keras and Tensorflow
11 pages
lec_12_generative_adversarial_networks
No ratings yet
lec_12_generative_adversarial_networks
85 pages
Transformer
No ratings yet
Transformer
33 pages
REG NO. 18MIS7099 Machine Learning - Lab - 10 Name: Dana Vamsi Krishna
No ratings yet
REG NO. 18MIS7099 Machine Learning - Lab - 10 Name: Dana Vamsi Krishna
5 pages
LSTM.pptx
No ratings yet
LSTM.pptx
11 pages
Csa4020 Deep-Learning LP 1.0 22 Csa4020 Deep-Learning LP 1.0 1 Deep Learning
No ratings yet
Csa4020 Deep-Learning LP 1.0 22 Csa4020 Deep-Learning LP 1.0 1 Deep Learning
2 pages
DL - Assignment 8 Solution
100% (2)
DL - Assignment 8 Solution
6 pages
ccs355 Lab Manual
No ratings yet
ccs355 Lab Manual
24 pages
G5Baim Artificial Intelligence Methods: Graham Kendall
No ratings yet
G5Baim Artificial Intelligence Methods: Graham Kendall
47 pages

AITools Unit-4

Uploaded by

AITools Unit-4

Uploaded by

UNIT-4

Deep Learning: Basics of Deep Learning, Machine Learning Vs Deep

Q) Describe the Motivation for Deep Learning.

Q) Define Deep Learning(DL).

Q)Give brief historical background of Deep Learning.

Fig. Human Neural Functioning

Fig. Biological Neuron

Q) Differentiate a Biological neuron and and Artificial neuron.

Q) Define Artificial Neuron(AN). Explain the computation/processing of

Let us take the activation function as:

The AND function can be implemented as:

The output of this neuron is:a = f( -1.5*1 + x1*1 + x2*1 )

The OR function can be implemented as:

The output of this neuron is:a = f( -0.5 + x1 + x2 )

The output of this neuron is:a = f( 1 – 2*x1 )

Q) Explain the need for multi-layered neural network with an example.

Lets break down the XNOR function.

The output of this neuron is:a = f( 0.5 – x1 – x2 )

The different outputs represent different units:

 Activation Functions are extremely important feature of the Artificial

Fig. Illustrating the need of Activation Function for a complex problem.

Activation function must be efficient and it should reduce the computation

1. Binary Step Function

2. Linear or Identity Activation Function

Fig: Linear Activation Function

The Nonlinear Activation Functions are mainly divided on the basis of

Advantage of Non-linear function over the Linear function :

3.1 Sigmoid(Logistic AF)(σ):

Fig: Sigmoid Function (S-shaped Curve)

3.2 TanH(Hyperbolic Tangent AF):

Fig. Sigmoid Vs Tanh

The function is differentiable and monotonic. But function derivative is

3.3 ReLU(Rectified Linear Unit):

The function and its derivative both are monotonic.

 Sigmoid able to handle more than two cases(class label).

Q) Explain about Deep feedforward networks or feedforward neural

Fig. Deep Feedforward Network

The goal of a feedforward network is to approximate some function f ∗. For

A feedforward network defines a mapping y = f (x; θ) and learns the value of

When feedforward neural networks are extended to include feedback

Feedforward networks are of extreme importance to machine learning

Q) Differentiate ML & DL.

In Machine learning, most of the applied features need to be identified

Deep learning algorithms try to learn high-level features from data.

4. Problem Solving approach

Q) Briefly explain about loss function in neural networks.

Neural Network uses optimising strategies to minimize the error in the

1. Regression Loss Function

Q) Explain briefly about gradient descent algorithm.

Fig. Gradient Descent

Q) Explain about Back propagation algorithm.

Back-propagation is the essence of neural net training. It is the method of

The algorithm is used to effectively train a neural network through a method

= 0.15 - 0.4*(-(0.25-0.7153))*(0.7153(1 -0.7153))*0.45*0.6023(1-

Therefore we continue next iteration(feedforward) with the update

Q) What is Vanishing Gradient problem?

Q) Explain indetail about CNN model.

A Convolutional neural network (CNN) is a neural network that has one or

Fig. CNN Architecture

Fig. RGB image as input

Fig. convolution operation

We use convolution instead of considering flatten image as input as we will

By using convolution we can prevent overfitting of the model.

It is worth to have ReLU activation function in convolution layer which

Fig. Max Pooling with stride = 2

2 Average Pooling: Here, we take average of area we scan.

Fig. Average Pooling with stride = 2

Fully Connected Layer:

This layer basically takes output of the preceding layer,

Fig. Fully Connected Layer

Q) Differentiate Shallow NN and Deep NN.

Shallow Neural Network Deep Neural Network

You might also like

The output of this neuron is:a = f( -1.51 + x11 + x2*1 )

= 0.15 - 0.4(-(0.25-0.7153))(0.7153(1 -0.7153))0.450.6023(1-