
UNIT – 1

INTRODUCTION TO NEURAL NETWORK

Final Year BTECH
Subject : Deep Learning (PE4)

Unit I : Contents

Introduction To Neural Network

Introduction, The architecture of an artificial neural network, Types of ANN architecture, Advantages and disadvantages of ANN, Perceptron, Sigmoid Neurons, Activation Functions, Loss Function.

Introduction, The Architecture of an ANN, Types of ANN Architecture, Advantages and Disadvantages of ANN

Sources:

• https://www.javatpoint.com/artificial-neural-network
• Jacek M. Zurada, Introduction to Artificial Neural Systems
• B. Yegnanarayana, Artificial Neural Networks, PHI Learning

What is Machine Learning?

⮚ Artificial Intelligence (AI) systems learn by extracting patterns from input and output data.
⮚ Machine Learning (ML) relies on learning patterns from sample data. Programs learn from labeled data (supervised learning), unlabeled data (unsupervised learning), or a combination of both (semi-supervised learning).
⮚ Artificial Intelligence (AI) emerged in the mid-1900s, when scientists first tried to envision intelligent machines. Machine Learning evolved in the late 1900s and allowed scientists to train machines for AI.
⮚ In the early 2000s, breakthroughs in multi-layered neural networks facilitated the advent of Deep Learning.
What is Artificial Neural Network?

The term "Artificial Neural Network" is derived from Biological neural


networks that develop the structure of a human brain. Similar to the human
brain that has neurons interconnected to one another, artificial neural
networks also have neurons that are interconnected to one another in
various layers of the networks. These neurons are known as nodes.

The figure below illustrates a typical biological neural network.

https://www.javatpoint.com/artificial-neural-network
What is Artificial Neural Network?

⮚ An ANN is a system in the field of AI that attempts to mimic the network of neurons that makes up the human brain, so that computers can understand things and make decisions in a human-like manner.
⮚ It is designed by programming computers to behave simply like interconnected brain cells.
⮚ There are around 1000 billion neurons in the human brain. Each neuron has between 1,000 and 100,000 association points.
⮚ In the human brain, data is stored in a distributed manner, and we can extract more than one piece of this data from memory in parallel when necessary.
⮚ The human brain is made up of incredibly powerful parallel processors.
What is Artificial Neural Network?

⮚ ANN example:
Consider a digital logic gate that takes an input and gives an output, e.g., an "OR" gate with two inputs. If one or both of the inputs are "On," the output is "On." If both inputs are "Off," the output is "Off." Here the output depends only on the input. Our brain does not perform tasks this way: the relationship between outputs and inputs keeps changing, because the neurons in our brain are "learning."
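As a sketch (the weights and threshold below are hand-picked for illustration, not from the slides), the fixed input-output behavior of an OR gate can be written as a single threshold unit:

```python
# OR gate as a fixed threshold neuron: fires if the weighted sum exceeds 0.5.
# The weights and threshold are hand-picked assumptions for illustration.
def or_gate(x1, x2, w1=1.0, w2=1.0, threshold=0.5):
    return 1 if (w1 * x1 + w2 * x2) > threshold else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", or_gate(x1, x2))  # output is 0 only for (0, 0)
```

Unlike the brain, nothing in this unit changes with experience: the weights are fixed, so the input-output relationship never adapts.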
Relationship between biological neural network and artificial neural network

Biological Neural Network    Artificial Neural Network
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

Table 1: Neural network (biological and artificial)


Typical Artificial Neural Network

The typical Artificial Neural Network looks something like the figure below.

Fig 1. Neuron function

Dendrites from the biological neural network represent inputs in artificial neural networks, the cell nucleus represents nodes, synapses represent weights, and the axon represents the output.
The architecture of an artificial neural network

To understand the architecture of an artificial neural network, we first have to understand what a neural network consists of: a large number of artificial neurons, termed units, arranged in a sequence of layers. Let us look at the various types of layers available in an artificial neural network.

Fig 2: Different neural network architectures


Layers of Artificial Neural Network

Input Layer:
◻ As the name suggests, it accepts inputs in several different formats provided by the programmer.

Hidden Layer:
◻ The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

Output Layer:
◻ The input goes through a series of transformations in the hidden layers, finally resulting in the output that is conveyed through this layer.
Need of Bias

Layers of Artificial Neural Network

⮚ The artificial neural network takes the inputs, computes their weighted sum, and includes a bias. This computation is represented in the form of a transfer function.
⮚ The weighted total is then passed as an input to an activation function to produce the output. Activation functions decide whether a node should fire or not. Only those nodes which fire make it to the output layer.
⮚ There are distinctive activation functions available that can be applied depending on the sort of task being performed, as sketched below.
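A minimal sketch of this computation, assuming a step activation and illustrative numbers:

```python
import numpy as np

def step(z):
    # The node fires (outputs 1) only when the weighted total is positive.
    return 1 if z > 0 else 0

def node_output(x, w, b, activation=step):
    z = np.dot(w, x) + b     # weighted sum of the inputs plus a bias
    return activation(z)     # the activation function decides whether to fire

# Illustrative inputs, weights, and bias (assumed values).
print(node_output(np.array([0.5, 1.0]), np.array([0.4, -0.2]), b=0.1))  # -> 1
```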
Types/Models of ANN Architecture

Feedforward Network

Feedback Network

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 37)
Types/Models of ANN Architecture
Feedforward Network

Figure 2.8(b) shows the block diagram of the feedforward network. As can be seen, the generic feedforward network is characterized by the lack of feedback. This type of network can be connected in cascade to create a multilayer network. In such a network, the output of a layer is the input to the following layer. Even though the feedforward network has no explicit feedback connection when x(t) is mapped into o(t), the output values are often compared with the "teacher's" information, which provides the desired output value, and the resulting error signal can be employed for adapting the network's weights.

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 37)
Types/Models of ANN Architecture
Feedback Network

A feedback network can be obtained from the feedforward network shown in Figure 2.8(a) by connecting the neurons' outputs to their inputs. The result is depicted in Figure 2.10(a).

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 42)
Advantages of Artificial Neural Network (ANN)

⮚ Parallel processing capability:
Artificial neural networks can perform more than one task simultaneously.

⮚ Storing data on the entire network:
Unlike traditional programming, where data is stored in a database, the data here is stored on the whole network. The disappearance of a couple of pieces of data in one place doesn't prevent the network from working.

⮚ Capability to work with incomplete knowledge:
After training, an ANN may produce output even with inadequate data. The loss of performance here depends on the significance of the missing data.

⮚ Having a memory distribution:
For an ANN to be able to adapt, it is important to determine suitable examples and to train the network according to the desired output by demonstrating these examples to it. The success of the network is directly proportional to the chosen instances, and if the event cannot be shown to the network in all its aspects, the network can produce false output.

⮚ Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output, and this feature makes the network fault-tolerant.
Disadvantages of Artificial Neural Network (ANN)

⮚ Assurance of proper network structure:
There is no particular guideline for determining the structure of artificial neural networks. The appropriate network structure is accomplished through experience and trial and error.

⮚ Unrecognized behavior of the network:
This is the most significant issue with ANNs. When an ANN produces a solution, it does not provide insight into why and how, which decreases trust in the network.

⮚ Hardware dependence:
Artificial neural networks need processors with parallel processing power, as per their structure. The realization of the network is therefore dependent on suitable hardware.

⮚ Difficulty of showing the issue to the network:
ANNs can work only with numerical data. Problems must be converted into numerical values before being introduced to the ANN. The representation mechanism chosen here directly impacts the performance of the network, and it relies on the user's abilities.

⮚ The duration of the network is unknown:
Training is stopped when the error is reduced to a specific value, but this value does not guarantee optimum results.
1.1 Neural Computation (With Example)

Let us try to inspect the performance of a simple classifier.

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 3-8)
1.1 Neural Computation (With Example)

Assume that a set of eight points, P0, P1, ..., P7, in three-dimensional space is available. The set consists of all vertices of a three-dimensional cube, with each point Pi(x1, x2, x3) having as coordinates the binary digits of its index i, from P0 = (0, 0, 0) through P7 = (1, 1, 1).

Elements of this set need to be classified into two categories. The first category is defined as containing the points with two or more positive ones (coordinates equal to 1); the second category contains all the remaining points that do not belong to the first category. Accordingly, points P3, P5, P6, and P7 belong to the first category, and the remaining points to the second category.

Classification of points P3, P5, P6, and P7 can be based on the summation of the coordinate values of each point evaluated for category membership. Notice that for each point Pi(x1, x2, x3), where i = 0, ..., 7, the membership in the first category can be established by the following calculation:

o = sgn(x1 + x2 + x3 - 1.5)

This describes the decision function of the classifier, designed by inspection of the set that needs to be partitioned, as sketched below.

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 3-8)
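A sketch of this classifier, assuming unit-cube coordinates with each Pi's coordinates taken as the binary digits of i:

```python
# Vertices of the unit cube: P_i's coordinates (x1, x2, x3) are the bits of i.
points = {i: [(i >> k) & 1 for k in (2, 1, 0)] for i in range(8)}

def classify(p):
    # Decision by inspection: +1 if two or more coordinates are 1, else -1,
    # implemented as sgn(x1 + x2 + x3 - 1.5).
    x1, x2, x3 = p
    return 1 if (x1 + x2 + x3 - 1.5) > 0 else -1

for i, p in points.items():
    print(f"P{i} {p} -> category {classify(p)}")  # P3, P5, P6, P7 -> +1
```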
1.1 Neural Computation (With Example)

The unit from Figure 1.1(a) maps the entire three-dimensional space into just two points, 1 and -1. A question arises as to whether a unit with a "squashed" sgn function rather than a regular sgn function could prove more advantageous. Assuming that the "squashed" sgn function has the shape shown in Figure 1.2, notice that the outputs now take values in the range (-1, 1) and are generally more discernible than in the previous case. Using units with continuous characteristics offers tremendous opportunities for new tasks that can be performed by neural networks. Specifically, the fine granularity of the output provides more information than the binary ±1 output of the thresholding element.

Jacek M. Zurada, Introduction to Artificial Neural Systems; B. Yegnanarayana, Artificial Neural Networks, PHI Learning. (Page no. 3-8)
Perceptron, Sigmoid Neurons

Sources:

1. Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015 (Module I - Perceptron, Sigmoid Neurons)
2. https://www.simplilearn.com/what-is-perceptron-tutorial
Perceptron

⮚ A perceptron is a neural network unit (an artificial neuron) that does certain computations to detect features or business intelligence in the input data.
⮚ The perceptron was introduced by Frank Rosenblatt in 1957. He proposed a perceptron learning rule based on the original MCP (McCulloch-Pitts) neuron.
⮚ A perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables neurons to learn and process elements in the training set one at a time.

Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015 (Module I - Perceptron, Sigmoid Neurons)
How do perceptrons work?

• A perceptron takes several binary inputs, x1, x2, ..., and produces a single binary output.
• In the example shown, the perceptron has three inputs, x1, x2, x3. In general it could have more or fewer inputs.
• Rosenblatt proposed a simple rule to compute the output. He introduced weights, w1, w2, ..., real numbers expressing the importance of the respective inputs to the output.
• The neuron's output, 0 or 1, is determined by whether the weighted sum ∑j wj xj is less than or greater than some threshold value.
• Just like the weights, the threshold is a real number which is a parameter of the neuron. To put it in more precise algebraic terms:

output = 0 if ∑j wj xj ≤ threshold
output = 1 if ∑j wj xj > threshold
Perceptron

⮚ The first column of perceptrons (the first layer of perceptrons) makes three very simple decisions by weighing the input evidence.
⮚ Each perceptron in the second layer makes a decision by weighing up the results from the first layer of decision-making.
⮚ A perceptron in the second layer can therefore make a decision at a more complex and more abstract level than perceptrons in the first layer.
⮚ Even more complex decisions can be made by the perceptrons in the third layer.
⮚ In this way, a many-layer network of perceptrons can engage in sophisticated decision making.
Perceptron

⮚ There are two types of perceptrons: single layer and multilayer.
⮚ Single layer perceptrons can learn only linearly separable patterns.
⮚ Multilayer perceptrons, or feedforward neural networks with two or more layers, have greater processing power.
⮚ The perceptron algorithm learns the weights for the input signals in order to draw a linear decision boundary.
⮚ This enables you to distinguish between the two linearly separable classes +1 and -1.
⮚ Supervised learning is a type of machine learning used to learn models from labeled training data. It enables output prediction for future or unseen data.

https://www.simplilearn.com/what-is-perceptron-tutorial
Perceptron - Single Layer

◻ It is a feed-forward network that depends on a threshold transfer function in its model.
◻ It is the simplest type of ANN and is able to analyze only linearly separable objects with binary outcomes (targets), i.e., 1 and 0.
Perceptron - Single Layer

⮚ In the single-layered perceptron model, the algorithm has no prior information.
⮚ Initially, weights are allocated randomly; then the algorithm adds up all the weighted inputs, and if the sum is more than some pre-determined value (the threshold), the single-layered perceptron is said to be activated and delivers an output of +1.
⮚ Multiple input values are fed to the perceptron model, the model executes with the input values, and if the estimated value is the same as the required output, the model's performance is considered satisfactory, so the weights demand no changes. If the model doesn't meet the required result, a few changes are made to the weights to minimize the error.
Perceptron - Multi Layer

⮚ It has a structure similar to the single-layered perceptron model but with a greater number of hidden layers.
⮚ It is also termed a backpropagation algorithm. It executes in two stages: the forward stage and the backward stage.
Perceptron - Multilayer

⮚ In the forward stage, activations propagate from the input layer to the output layer.
⮚ In the backward stage, the error between the actual observed value and the desired value is propagated backward from the output layer to modify the weights and bias values.
⮚ In simple terms, the multi-layered perceptron can be treated as a network of numerous artificial neurons arranged over varied layers; the activation function is no longer linear; instead, non-linear activation functions such as the Sigmoid, TanH, and ReLU activation functions are deployed for execution.
Perceptron Learning Rule

⮚ The Perceptron Learning Rule states that the algorithm automatically learns the optimal weight coefficients. The input features are then multiplied with these weights to determine whether a neuron fires or not, as in the sketch below.
⮚ The perceptron receives multiple input signals, and if the sum of the input signals exceeds a certain threshold, it either outputs a signal or does not return an output. In the context of supervised learning and classification, this can then be used to predict the class of a sample.
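A minimal sketch of the rule (the learning rate, epoch count, and OR-function training data are illustrative assumptions):

```python
import numpy as np

# Training data for the (linearly separable) OR function, targets in {0, 1}.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

w = np.zeros(2)   # weight coefficients, learned automatically
b = 0.0           # bias (threshold brought to the left-hand side)
lr = 0.1          # learning rate (assumed value)

for epoch in range(10):
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b > 0 else 0
        error = target - pred       # zero when the prediction matches
        w += lr * error * xi        # adjust weights toward the target
        b += lr * error

print(w, b)  # weights drawing a linear decision boundary for OR
```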
Perceptron Function

⮚ A perceptron is a function that maps its input "x," which is multiplied with the learned weight coefficient, to an output value "f(x)":

f(x) = 1 if w · x + b > 0 (where w · x = ∑ᵢ wᵢxᵢ over the m inputs), and 0 otherwise

⮚ In the equation given above:
"w" = vector of real-valued weights
"b" = bias (an element that adjusts the boundary away from the origin without any dependence on the input value)
"x" = vector of input x values
"m" = number of inputs to the perceptron

The output can be represented as "1" or "0." It can also be represented as "1" or "-1," depending on which activation function is used.
Inputs of a Perceptron

⮚ A perceptron accepts inputs, moderates them with certain weight values, then applies the transformation function to output the final result. The figure below shows a perceptron with a Boolean output.
⮚ A Boolean output is based on inputs such as salaried, married, age, past credit profile, etc. It has only two values: Yes and No, or True and False. The summation function "∑" multiplies all inputs "x" by their weights "w" and then adds them up as follows:

∑ wᵢxᵢ = w1 x1 + w2 x2 + ... + wn xn
Activation Functions of Perceptron

⮚ The activation function applies a step rule (converting the numerical output into +1 or -1) to check whether the output of the weighting function is greater than zero or not.
Activation Functions of Perceptron

⮚ E.g.:
If ∑ wᵢxᵢ > 0, then the final output "o" = 1 (issue bank loan);
else, the final output "o" = -1 (deny bank loan).

⮚ The step function gets triggered above a certain value of the neuron output; otherwise it outputs zero. The sign function outputs +1 or -1 depending on whether the neuron output is greater than zero or not. The sigmoid is the S-curve and outputs a value between 0 and 1, as in the sketch below.
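A sketch of the three rules applied to a neuron output z = ∑ wᵢxᵢ (the value of z and the loan interpretation are illustrative):

```python
import math

def step(z):          # triggered above 0, else outputs 0
    return 1 if z > 0 else 0

def sign(z):          # outputs +1 (issue loan) or -1 (deny loan)
    return 1 if z > 0 else -1

def sigmoid(z):       # S-curve, value between 0 and 1
    return 1 / (1 + math.exp(-z))

z = 0.7  # an assumed weighted sum of the applicant's features
print(step(z), sign(z), round(sigmoid(z), 3))  # 1, 1, 0.668
```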
Output of Perceptron

⮚ Perceptron with a Boolean output:
⮚ Inputs: x1 ... xn
⮚ Output: o(x1 ... xn)
⮚ Weights: wi => contribution of input xi to the perceptron output
⮚ w0 => bias or threshold
⮚ If ∑ w · x > 0, the output is +1, else -1. The neuron gets triggered only when the weighted input reaches a certain threshold value.
Output of Perceptron

⮚ An output of +1 specifies that the neuron is triggered. An output of -1 specifies that the neuron did not get triggered.
⮚ "sgn" stands for the sign function, with output +1 or -1.
Error in Perceptron

◻ In the Perceptron Learning Rule, the predicted output is compared with the known output. If they do not match, the error is propagated backward to allow weight adjustment to happen.
Perceptron: Decision Function

◻ A decision function φ(z) of the perceptron is defined to take a linear combination of the x and w vectors.
◻ The value z in the decision function is given by:

z = w1 x1 + w2 x2 + ... + wm xm = wᵀx

◻ The decision function is +1 if z is greater than a threshold θ, and -1 otherwise:

φ(z) = +1 if z ≥ θ, and -1 otherwise
Perceptron: Decision Function

◻ Bias Unit
◻ For simplicity, the threshold θ can be brought to the left-hand side and represented as w0 x0, where w0 = -θ and x0 = 1.
◻ The value w0 is called the bias unit.
◻ The decision function then becomes (see the sketch below):

φ(z) = +1 if z ≥ 0, and -1 otherwise, where z = w0 x0 + w1 x1 + ... + wm xm = wᵀx
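A small sketch of this reformulation (the numbers are illustrative assumptions): prepending x0 = 1 with weight w0 = -θ gives the same decision while comparing wᵀx against zero.

```python
import numpy as np

theta = 0.5                  # original threshold
w = np.array([0.3, -0.2])    # learned weights (assumed)
x = np.array([2.0, 1.0])     # input sample (assumed)

# Decision with an explicit threshold: +1 if z >= theta, else -1.
z = np.dot(w, x)
print(1 if z >= theta else -1)

# Same decision with a bias unit: w0 = -theta, x0 = 1, compared against 0.
w_aug = np.concatenate(([-theta], w))
x_aug = np.concatenate(([1.0], x))
print(1 if np.dot(w_aug, x_aug) >= 0 else -1)  # identical result
```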
Perceptron: Decision Function

⮚ Output
⮚ The figure shows how the decision function squashes wᵀx to either +1 or -1, and how it can be used to discriminate between two linearly separable classes.
Perceptron

⮚ The perceptron has the following characteristics:
⮚ The perceptron is an algorithm for supervised learning of a single-layer binary linear classifier.
⮚ Optimal weight coefficients are automatically learned.
⮚ Weights are multiplied with the input features, and a decision is made as to whether the neuron fires or not.
⮚ The activation function applies a step rule to check whether the output of the weighting function is greater than zero.
⮚ A linear decision boundary is drawn, enabling the distinction between the two linearly separable classes +1 and -1.
⮚ If the sum of the input signals exceeds a certain threshold, it outputs a signal; otherwise, there is no output.
⮚ Types of activation functions include the sign, step, and sigmoid functions.
Summary Perceptron

⮚ The activation function to be used is a subjective decision based on the problem statement and the form of the desired results.
⮚ If the learning process is slow or has vanishing or exploding gradients, change the activation function to see if these problems can be resolved.
⮚ An artificial neuron is a mathematical function conceived as a model of biological neurons, that is, a neural network.
⮚ A perceptron is a neural network unit that does certain computations to detect features or business intelligence in the input data. It is a function that maps its input "x," which is multiplied by the learned weight coefficient, to an output value "f(x)."
⮚ The Perceptron Learning Rule states that the algorithm automatically learns the optimal weight coefficients.
⮚ Single layer perceptrons can learn only linearly separable patterns.
⮚ Multilayer perceptrons, or feedforward neural networks with two or more layers, have greater processing power and can process non-linear patterns as well.
⮚ Perceptrons can implement logic gates like AND and OR; XOR, however, is not linearly separable and therefore requires a multilayer perceptron rather than a single perceptron.
Sigmoid Neuron

⮚ Sigmoid neurons are the building blocks of deep neural networks.
⮚ Sigmoid neurons are similar to perceptrons, but they are slightly modified so that the output from the sigmoid neuron is much smoother than the step-function output from the perceptron.

https://towardsdatascience.com/sigmoid-neuron-deep-neural-networks-a4cd35b629d7
Why Sigmoid Neuron

⮚ The perceptron model takes several real-valued inputs and gives a single binary output.
⮚ In the perceptron model, every input xi has a weight wi associated with it.
⮚ The weights indicate the importance of the input in the decision-making process.
⮚ The model output is decided by a threshold w₀: if the weighted sum of the inputs is greater than the threshold w₀, the output will be 1, else the output will be 0.
⮚ In other words, the model will fire if the weighted sum is greater than the threshold.
Why Sigmoid Neuron

⮚ From the mathematical representation, we might say that the thresholding logic used by the perceptron is very harsh.
Why Sigmoid Neuron

⮚ Consider the decision-making process of a person deciding whether to purchase a car or not, based on only one input, X₁ (salary), with the threshold b (w₀) = -10 and the weight w₁ = 0.2.
⮚ The output from the perceptron model will look like the figure shown below.
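A sketch of this example with the stated parameters w₁ = 0.2 and b = -10 (the salary is assumed to be in thousands): the output flips abruptly from 0 to 1 once the salary crosses 50, illustrating how harsh the thresholding is.

```python
def perceptron_buys_car(salary, w1=0.2, b=-10):
    # Fires only when w1*salary + b crosses 0, i.e. salary > 50 (thousands).
    return 1 if w1 * salary + b > 0 else 0

for salary in (49, 50, 51):
    print(salary, "->", perceptron_buys_car(salary))  # 0, 0, 1
```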
Sigmoid Neuron

⮚ In sigmoid neurons, the output function is much smoother than the step function.
⮚ In the sigmoid neuron, a small change in the input causes only a small change in the output, as opposed to the stepped output.
⮚ There are many functions with the characteristic of an "S"-shaped curve, known as sigmoid functions. The most commonly used is the logistic function, σ(z) = 1 / (1 + e⁻ᶻ).
Sigmoid Neuron

⮚ The inputs to the sigmoid neuron can be real numbers, unlike the Boolean inputs of the MP (McCulloch-Pitts) neuron, and the output will also be a real number between 0 and 1.
⮚ In the sigmoid neuron, we are trying to regress the relationship between X and Y in terms of probability.
⮚ Even though the output is between 0 and 1, we can still use the sigmoid function for binary classification tasks by choosing some threshold.
Sigmoid Neuron

⮚ Learning Algorithm
⮚ The parameters w and b of the sigmoid neuron model are learned using the gradient descent algorithm.
⮚ The objective of the learning algorithm is to determine the best possible values for the parameters, such that the overall loss (squared error loss) of the model is minimized as much as possible. The learning algorithm goes as follows.
Sigmoid Neuron

⮚ Initialize w and b randomly, then iterate over all the observations in the data; for each observation, find the corresponding predicted outcome using the sigmoid function and compute the squared error loss.
⮚ Based on the loss value, update the weights such that the overall loss of the model at the new parameters is less than the current loss of the model.
⮚ Loss optimization: a minimal sketch of this loop is given below.
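A minimal sketch of this learning loop (the toy one-dimensional data, initial parameters, and learning rate are assumptions in the spirit of the cited article):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = [0.5, 2.5]            # toy 1-D inputs (assumed)
Y = [0.2, 0.9]            # targets between 0 and 1 (assumed)
w, b, lr = -2.0, -2.0, 1.0  # initial parameters and learning rate (assumed)

for epoch in range(1000):
    dw, db = 0.0, 0.0
    for x, y in zip(X, Y):
        y_pred = sigmoid(w * x + b)
        # Gradient of the squared error loss (with a 1/2 factor) for the
        # sigmoid neuron: (y_pred - y) * y_pred * (1 - y_pred) * input.
        dw += (y_pred - y) * y_pred * (1 - y_pred) * x
        db += (y_pred - y) * y_pred * (1 - y_pred)
    w -= lr * dw          # step so the new loss is lower than the current one
    b -= lr * db

print(w, b)
```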
Activation Functions, Loss Function

Sources:

1. https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
2. https://medium.com/@abhigoku10/activation-functions-and-its-types-in-artifical-neural-network-14511f3080a8
3. https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de
4. https://towardsdatascience.com/deep-learning-which-loss-and-activation-functions-should-i-use-ac02f1c56aa8
A Simple Neural Network

A Neuron [2]:
• (x1, x2, ..., xn) - input signal vector
• (w1, w2, ..., wn) - weights
• accumulation (i.e., summation + addition of bias b)
• an activation function f is applied to this sum

https://medium.com/@abhigoku10/activation-functions-and-its-types-in-artifical-neural-network-14511f3080a8
Activation Function

⮚ An activation function is a very important feature of an artificial neural network; it enables the network to learn and understand complex patterns
⮚ It is a mathematical equation that determines the output of a node
⮚ It helps to normalize the output of each neuron to a range between 0 and 1 or between -1 and 1
⮚ It is also known as a Transfer Function
Activation Function

⮚ The function
⮚ is attached to each neuron in the network,
⮚ determines whether the neuron should be activated ("fired") or not,
⮚ based on whether each neuron's input is relevant for the model's prediction
⮚ It must be computationally efficient (it is calculated across thousands/millions of neurons for each data sample)
⮚ The need for speed has led to the development of new functions such as ReLU
Activation Function (Types)

⮚ The activation functions [1] can basically be divided into 2 types:
⮚ Linear Activation Function
⮚ Non-linear Activation Functions

https://medium.com/@abhigoku10/activation-functions-and-its-types-in-artifical-neural-network-14511f3080a8
Activation Function (Linear)

⮚ Linear Activation Function:
⮚ Equation: f(x) = x
⮚ Range: (-infinity to infinity)
⮚ It doesn't help with complex data
Activation Function (Non-Linear)

⮚ Non-linear Activation Functions:
⮚ The model generalizes or adapts to a variety of data (images, video, audio, and data with high dimensionality)
⮚ They allow backpropagation (having a derivative function that is related to the inputs)
⮚ They allow the creation of deep neural networks ("stacking" multiple layers of neurons)
Activation Function (Non-Linear)

◻ The non-linear activation functions are mainly divided on the basis of their range or curves
◻ Sigmoid or Logistic Activation Function
🞑 An S-shaped curve
🞑 Predicts probability
🞑 Range (0, 1)
🞑 The network can get stuck during training
Activation Function (Non-Linear)

⮚ Tanh or Hyperbolic Tangent Activation Function
⮚ tanh is also sigmoidal (S-shaped), with range -1 to 1
⮚ negative inputs are mapped to negative outputs
⮚ Mostly, tanh and the logistic sigmoid are used in feed-forward networks
Activation Function (Non-Linear)

⮚ ReLU (Rectified Linear Unit) Activation Function
⮚ Widely used function (DL and CNN)
⮚ ReLU is half rectified (from the bottom)
⮚ f(z) is zero when z is less than zero
⮚ f(z) is equal to z when z is above or equal to zero, i.e., f(z) = max(0, z)
Activation Function (Non-Linear)

⮚ ReLU Advantages
⮚ Computationally efficient: allows the network to converge very quickly
⮚ Non-linear: although it looks like a linear function, ReLU has a derivative function and allows for backpropagation

⮚ Disadvantages
⮚ The dying ReLU problem: when inputs approach zero, or are negative, the gradient of the function becomes zero; the network cannot perform backpropagation and cannot learn
Activation Function (Non-Linear)

⮚ Leaky ReLU
⮚ An attempt to solve the dying ReLU problem (a small positive slope in the negative area enables backpropagation)
⮚ The leak helps to increase the range of the ReLU function
⮚ f(x) = ax for x < 0 and f(x) = x for x ≥ 0
⮚ Range: (-infinity to infinity)
⮚ When a is not 0.01, it is called Randomized ReLU. A sketch of these functions follows.
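As a sketch, the activation functions discussed above can be written with NumPy as follows (the leak coefficient a = 0.01 matches the slide; the sample inputs are illustrative):

```python
import numpy as np

def linear(x):
    return x                              # f(x) = x, range (-inf, inf)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # S-curve, range (0, 1)

def tanh(x):
    return np.tanh(x)                     # S-curve, range (-1, 1)

def relu(x):
    return np.maximum(0.0, x)             # zero for x < 0, x otherwise

def leaky_relu(x, a=0.01):
    return np.where(x < 0, a * x, x)      # small slope a in the negative area

z = np.array([-2.0, 0.0, 2.0])
print(relu(z), leaky_relu(z))             # [0. 0. 2.] [-0.02  0.    2.  ]
```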
Activation Function (Non-Linear)

Src: Sze, Vivienne, Chen, Yu-Hsin, Yang, Tien-Ju, & Emer, Joel (2017). Efficient Processing of Deep Neural Networks: A Tutorial and Survey. Proceedings of the IEEE, 105.
Activation Function

⮚ Heuristics for choosing an activation function:
⮚ Sigmoid functions and their combinations generally work better in the case of classification problems
⮚ Sigmoid and tanh functions are sometimes avoided due to the vanishing gradient problem
⮚ Tanh is avoided much of the time because saturated units effectively stop learning
⮚ The ReLU activation function is widely used, as it yields better results
⮚ In case of dead neurons in the network, the leaky ReLU function is the best choice
⮚ The ReLU function should only be used in the hidden layers
Summary of Activation Functions

⮚ Various activation functions that can be used with the perceptron are shown here.

https://www.simplilearn.com/what-is-perceptron-tutorial
Loss Function

⮚ In a supervised deep learning context, the loss function measures the quality of a particular set of parameters based on how well the output of the network agrees with the ground-truth labels in the training data
⮚ A loss function is a method of evaluating "how well the algorithm models the dataset"
Nomenclature

loss function = cost function = objective function = error function
Loss function

How good is the network with the training data? [Figure: a deep network maps an input to an output; the output is compared with the ground-truth labels, and the resulting error is used to adapt the parameters (weights, biases).]
Loss function

⮚ If the predictions are totally off, the loss function will output a higher number
⮚ If the predictions are pretty good, the loss function will output a lower number
⮚ As you tune the algorithm to try and improve the model, the loss function will tell you whether it is improving or not
⮚ The "loss" helps us understand how much the predicted value differs from the actual value

https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de
Types of Loss Function

⮚ Regression Loss Function:
⮚ Regression models deal with predicting a continuous value, e.g., predicting the price of a room given its floor area, number of rooms, and size of rooms
⮚ The loss function used in a regression problem is called a "Regression Loss Function"
Types of Loss Function

⮚ Binary Classification Loss Functions:
⮚ Binary classification is a prediction problem where the output can be either 0 or 1
⮚ The output of binary classification algorithms is (mostly) a prediction score
⮚ The classification happens based on a threshold value (the default value is 0.5):
⮚ if the prediction score > threshold, then 1, else 0
Types of Loss Function

⮚ Multi-class Classification Loss Functions:
⮚ Multi-class classification covers those predictive modeling problems where there are more than two target classes
⮚ It is simply the extension of the binary classification problem
Loss Function

⮚ Regression: predicting a numerical value
⮚ E.g., predicting the price of a product
⮚ The final layer of the neural network will have one neuron, and the value it returns is a continuous numerical value
⮚ Compare the true value with the predicted value

https://towardsdatascience.com/deep-learning-which-loss-and-activation-functions-should-i-use-ac02f1c56aa8
Loss Function

⮚ Mean Squared Error (MSE)
⮚ The average squared difference between the predicted value and the true value:

MSE = (1/n) ∑ᵢ (yᵢ - ŷᵢ)²
Loss Function

⮚ Root Mean Square Error (RMSE)
⮚ Root Mean Square Error is the extension of MSE
⮚ It is the square root of the average of the squared differences between predictions and actual observations:

RMSE = √( (1/n) ∑ᵢ (yᵢ - ŷᵢ)² ) = √MSE
Loss Function

⮚ Binary Cross-Entropy Loss Function
⮚ Binary cross-entropy measures how far away from the true value (which is either 0 or 1) the prediction is for each of the classes
⮚ It averages these class-wise errors to obtain the final loss:

BCE = -(1/n) ∑ᵢ [ yᵢ log(ŷᵢ) + (1 - yᵢ) log(1 - ŷᵢ) ]

⮚ Cross-entropy is the difference between two probability distributions p and q, where p is the true output and q is our estimate of this true output
⮚ This difference is applied to neural networks
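A sketch of the three losses described above with NumPy (the sample arrays are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    # Average squared difference between predicted and true values.
    return np.mean((y_true - y_pred) ** 2)

def rmse(y_true, y_pred):
    # Square root of the MSE.
    return np.sqrt(mse(y_true, y_pred))

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true in {0, 1}; y_pred is the predicted probability of class 1
    # (thresholding y_pred at 0.5 would give the hard class label).
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) +
                    (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.6])
print(mse(y_true, y_pred), rmse(y_true, y_pred),
      binary_cross_entropy(y_true, y_pred))
```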
References

• https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
• https://medium.com/@abhigoku10/activation-functions-and-its-types-in-artifical-neural-network-14511f3080a8
• https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de
• https://towardsdatascience.com/deep-learning-which-loss-and-activation-functions-should-i-use-ac02f1c56aa8
• https://www.simplilearn.com/what-is-perceptron-tutorial
• https://towardsdatascience.com/sigmoid-neuron-deep-neural-networks-a4cd35b629d7
• https://www.javatpoint.com/artificial-neural-network
