ASC Unit I
Application of Soft Computing
KCS 056
Neural Networks – I
This artificial neural network tutorial covers all the main aspects of artificial neural networks. In this unit, we will discuss ANNs, Adaptive Resonance Theory, the Kohonen self-organizing map, building blocks, unsupervised learning, genetic algorithms, etc.
A typical artificial neural network looks something like the following figure.
Dendrites from the biological neural network represent inputs in artificial neural networks, the cell nucleus represents nodes, synapses represent weights, and the axon represents the output:
Dendrites -> Inputs
Cell nucleus -> Nodes
Synapse -> Weights
Axon -> Output
There are on the order of 100 billion neurons in the human brain (recent estimates put the count at about 86 billion). Each neuron has somewhere between 1,000 and 100,000 association points. In the human brain, data is stored in a distributed manner, and we can extract more than one piece of this data from memory in parallel when necessary. We can say that the human brain is made up of incredibly powerful parallel processors.
Input Layer:
The input layer accepts the inputs, in several different formats, that are provided to the network.
Hidden Layer:
The hidden layer sits between the input and output layers. It performs all the calculations needed to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations using the hidden layer, which finally results in the output conveyed by this layer.
The artificial neural network takes the inputs, computes the weighted sum of those inputs, and adds a bias. This computation is represented in the form of a transfer function.
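In standard notation, with inputs x1 ... xn, weights w1 ... wn, and bias b, this weighted sum is:
Equation: net = w1x1 + w2x2 + ... + wnxn + b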
This weighted total is passed as input to an activation function, which produces the output. Activation functions decide whether a node should fire or not; only the nodes that fire pass their signal on to the output layer. There are distinctive activation functions available, chosen according to the sort of task we are performing.
Advantages of Artificial Neural Networks
Parallel processing capability: an artificial neural network can perform more than one task simultaneously.
Storing data on the entire network: unlike traditional programming, where data is stored in a database, the information in an ANN is stored on the whole network. The disappearance of a couple of pieces of data in one place does not prevent the network from working.
Capability to work with incomplete knowledge: after training, an ANN may produce output even with inadequate data. The loss of performance here depends on the significance of the missing data.
Fault tolerance: corruption of one or more cells of an ANN does not prevent it from generating output, and this feature makes the network fault-tolerant.
Disadvantages of Artificial Neural Networks
Unrecognized behaviour of the network: this is the most significant issue with ANNs. When an ANN produces a solution, it does not provide insight into why or how it arrived at it, which decreases trust in the network.
Hardware dependence:
Artificial neural networks need processors with parallel processing power, in keeping with their structure; their realization therefore depends on suitable equipment.
Difficulty of showing the problem to the network: ANNs can work only with numerical data, so problems must be converted into numerical values before being introduced to the ANN. The representation mechanism chosen here directly impacts the performance of the network and relies on the user's abilities.
Unknown duration of training: the network is trained until the error is reduced to a specific value, and this value does not guarantee optimum results.
How do artificial neural networks work?
If the weighted sum is equal to zero, a bias is added to make the output non-zero, or the bias is used otherwise to scale up the system's response. The bias can be viewed as an extra input with a fixed value of 1 and its own weight. The total of the weighted inputs can lie anywhere in the range from 0 to positive infinity, so to keep the response within the desired limits, a certain maximum value is benchmarked, and the total of the weighted inputs is passed through the activation function.
The activation function refers to the set of transfer functions used to achieve the desired output. There are different kinds of activation functions, but they fall primarily into linear and non-linear sets of functions. Some of the commonly used activation functions are the binary step, linear, and tan hyperbolic (sigmoidal) activation functions.
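As a concrete illustration, here is a minimal Python sketch of this neuron computation; the inputs, weights, bias, and the choice of the sigmoid as activation are illustrative assumptions, not taken from the notes:

    import math

    # Minimal sketch of one artificial neuron: weighted sum of inputs plus a
    # bias, passed through an activation function (here the logistic sigmoid).
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def neuron(inputs, weights, bias):
        net = sum(w * x for w, x in zip(weights, inputs)) + bias  # net input
        return sigmoid(net)                                       # activation

    print(neuron([0.5, -1.0, 2.0], [0.4, 0.7, 0.2], bias=0.1))   # prints 0.5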
There are various types of artificial neural networks, modelled on the way neurons and networks in the human brain function; an artificial neural network performs its tasks in a similar way. The majority of artificial neural networks share some similarities with their more complex biological counterpart and are very effective at their intended tasks, for example segmentation or classification.
Feedback ANN:
In this type of ANN, the output is returned into the network to achieve the best-evolved results internally. As described by the University of Massachusetts Lowell Centre for Atmospheric Research, feedback networks feed information back into themselves and are well suited to solving optimization problems. Internal system error corrections utilize feedback ANNs.
Feed-Forward ANN:
In a feed-forward network, information moves in only one direction, from the input layer through any hidden layers to the output layer; there are no feedback loops.
Activation Functions
An activation function is simply a function that you use to get the output of a node. It is also known as a Transfer Function.
It is used to determine the output of a neural network, such as yes or no. It maps the resulting values into a range such as 0 to 1 or -1 to 1 (depending upon the function).
Linear or Identity Activation Function
As you can see, the function is a line, i.e. linear. Therefore, the output of the function is not confined to any range.
Fig: Linear Activation Function
Equation: f(x) = x
It doesn't help with the complexity of the usual data that is fed to neural networks, because a stack of purely linear layers still computes only a linear function.
The non-linear activation functions are the most used activation functions. Non-linearity helps to make the graph look something like this:
Fig: Non-linear Activation Function
It makes it easy for the model to generalize or adapt to a variety of data and to differentiate between outputs.
The non-linear activation functions are mainly divided on the basis of their range or curve:
1. Sigmoid or Logistic Activation Function
The main reason we use the sigmoid function is that its output lies between 0 and 1. It is therefore especially used in models where we have to predict a probability as the output: since the probability of anything exists only between 0 and 1, sigmoid is the right choice.
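Its standard definition, not written out in the notes, is:
Equation: f(x) = 1 / (1 + e^(-x))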
The function is differentiable, which means we can find the slope of the sigmoid curve at any point.
The logistic sigmoid function can, however, cause a neural network to get stuck during training, because its gradient becomes very small for large positive or negative inputs.
The softmax function is a more generalized logistic activation function which is
used for multiclass classification.
2. Tanh or Hyperbolic Tangent Activation Function
tanh is also like the logistic sigmoid, but better: the range of the tanh function is (-1, 1), and tanh is also sigmoidal (S-shaped).
The advantage is that negative inputs are mapped strongly negative and zero inputs are mapped near zero in the tanh graph.
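Its standard definition is:
Equation: f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))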
Both tanh and logistic sigmoid activation functions are used in feed-forward nets.
3. ReLU (Rectified Linear Unit) Activation Function
The ReLU is the most used activation function in the world right now, since it is used in almost all convolutional neural networks and deep learning models.
As you can see, the ReLU is half rectified (from the bottom): f(z) is zero when z is less than zero, and f(z) is equal to z when z is greater than or equal to zero.
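In equation form:
Equation: f(z) = max(0, z)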
Range: [ 0 to infinity)
But the issue is that all the negative values become zero immediately, which decreases the ability of the model to fit or train on the data properly. Any negative input given to the ReLU activation function turns the value into zero immediately in the graph, which in turn affects the resulting graph by not mapping the negative values appropriately.
4. Leaky ReLU
The leak helps to increase the range of the ReLU function. Usually, the value of the slope a is 0.01 or so.
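In equation form:
Equation: f(z) = z for z >= 0, and f(z) = az for z < 0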
Both the Leaky and Randomized ReLU functions are monotonic in nature, and their derivatives are also monotonic.
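The following minimal Python sketch implements the four activation functions discussed above; the sample inputs are arbitrary and only for illustration:

    import numpy as np

    # Minimal sketch of the activation functions discussed above.
    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))       # output in (0, 1)

    def tanh(x):
        return np.tanh(x)                      # output in (-1, 1)

    def relu(x):
        return np.maximum(0, x)                # zero for all negative inputs

    def leaky_relu(x, a=0.01):
        return np.where(x >= 0, x, a * x)      # small slope a for negatives

    x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    for f in (sigmoid, tanh, relu, leaky_relu):
        print(f.__name__, f(x))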
Recurrent Neural Network
An RNN has a "memory" that retains information about what has been calculated so far. It uses the same parameters for each input, as it performs the same task on all the inputs or hidden states to produce the output. This reduces the number of parameters, unlike other neural networks.
An RNN works through the following steps:
1. A single time step of the input is supplied to the network.
2. The current state is calculated from the current input and the previous state.
3. The current state becomes the previous state for the next time step.
4. One can go through as many time steps as the problem requires and join the information from all the previous states.
5. Once all the time steps are completed, the final current state is used to calculate the output.
6. The output is then compared with the actual output, i.e. the target output, and the error is generated.
7. The error is then back-propagated through the network to update the weights, and hence the network (RNN) is trained; a minimal sketch of the forward pass follows this list.
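Here is a minimal Python sketch of the forward pass described in steps 1-5; the layer sizes, random weights, and variable names are illustrative assumptions, not from the notes:

    import numpy as np

    # Minimal sketch of an RNN forward pass:
    # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h), y = W_hy h_T + b_y
    rng = np.random.default_rng(0)
    input_size, hidden_size, output_size = 3, 4, 2
    W_xh = 0.1 * rng.normal(size=(hidden_size, input_size))
    W_hh = 0.1 * rng.normal(size=(hidden_size, hidden_size))
    W_hy = 0.1 * rng.normal(size=(output_size, hidden_size))
    b_h = np.zeros(hidden_size)
    b_y = np.zeros(output_size)

    def rnn_forward(xs):
        h = np.zeros(hidden_size)                   # initial state
        for x in xs:                                # steps 1-4: one step at a
            h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # time, reusing parameters
        return W_hy @ h + b_y                       # step 5: final output

    xs = [rng.normal(size=input_size) for _ in range(5)]
    print(rnn_forward(xs))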
Discrete Hopfield Network: It is a fully interconnected neural network where
each unit is connected to every other unit. It behaves in a discrete manner, i.e. it
gives finite distinct output, generally of two types:
Binary (0/1)
Bipolar (-1/1)
The weights associated with this network are symmetric in nature and have the following properties:
1. wij = wji (the weight matrix is symmetric)
2. wii = 0 (no unit has a connection with itself)
Fig 1: Discrete Hopfield Network Architecture
[ x1 , x2 , ... , xn ] -> Input to the n given neurons.
[ y1 , y2 , ... , yn ] -> Output obtained from the n given neurons
Wij -> weight associated with the connection between the ith and the jth neuron.
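A minimal Python sketch of a discrete bipolar Hopfield network follows; it assumes the standard Hebbian outer-product storage rule, and the patterns and probe are illustrative:

    import numpy as np

    # Minimal sketch of a discrete (bipolar) Hopfield network.
    def train_hopfield(patterns):
        n = patterns.shape[1]
        W = np.zeros((n, n))
        for p in patterns:
            W += np.outer(p, p)            # Hebbian outer-product storage
        np.fill_diagonal(W, 0)             # wii = 0: no self-connections
        return W                           # W is symmetric: wij = wji

    def recall(W, x, sweeps=5):
        y = x.copy()
        for _ in range(sweeps):
            for i in range(len(y)):        # asynchronous unit-by-unit updates
                y[i] = 1 if W[i] @ y >= 0 else -1
        return y

    patterns = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                         [1, -1, 1, -1, 1, -1, 1, -1]])
    W = train_hopfield(patterns)
    probe = np.array([1, 1, 1, -1, -1, -1, -1, -1])  # pattern 1, one bit flipped
    print(recall(W, probe))                          # recovers the first pattern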
Boltzmann Machine
A Boltzmann machine is a stochastic recurrent neural network with symmetrically connected units that make probabilistic decisions about whether to be on or off.
Associative Memory
These kinds of neural networks work on the basis of pattern association, which means they can store different patterns, and at the time of giving an output they can produce one of the stored patterns by matching it with the given input pattern. These types of memories are also called Content-Addressable Memory (CAM). Associative memory makes a parallel search through the stored patterns.
Auto Associative Memory
This is a single-layer neural network in which the input training vectors and the output target vectors are the same. The weights are determined so that the network stores a set of patterns.
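The notes do not write the rule out, but the standard Hebbian storage rule sets each weight from the stored patterns s(p):
Equation: wij = Σp si(p) sj(p), summed over all stored patterns p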
Architecture
The architecture of the Auto Associative Memory network has 'n' input training vectors and a matching 'n' output target vectors.
Hetero Associative Memory
Similar to the Auto Associative Memory network, this is also a single-layer neural network. However, in this network the input training vectors and the output target vectors are not the same. The weights are determined so that the network stores a set of patterns. A hetero associative network is static in nature; hence, there are no non-linear or delay operations.
Architecture
As shown in the following figure, the architecture of Hetero Associative Memory
network has ‘n’ number of input training vectors and ‘m’ number of output target
vectors.
Hebb's Law
The Hebbian Learning Rule, also known as the Hebb Learning Rule, was proposed by Donald O. Hebb. It is one of the first and also the simplest learning rules for neural networks, and it is used for pattern classification. The network is a single-layer neural network, i.e. it has one input layer and one output layer. The input layer can have many units, say n, while the output layer has only one unit. The Hebbian rule works by updating the weights between neurons in the neural network for each training sample, as the sketch below illustrates.
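The update rule is w(new) = w(old) + x·y for each weight, with the bias treated as a weight on a constant input of 1. Here is a minimal Python sketch applying the rule to the AND gate with bipolar inputs, the classic textbook example (also the subject of the GeeksforGeeks reference listed at the end):

    import numpy as np

    # Minimal sketch: Hebbian learning of the AND gate with bipolar values.
    X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])   # inputs
    T = np.array([1, -1, -1, -1])                        # AND targets

    w = np.zeros(2)
    b = 0.0
    for x, t in zip(X, T):
        w += x * t        # Hebb rule: w_new = w_old + x * y
        b += t            # bias update, input fixed at 1

    for x in X:
        y = 1 if w @ x + b >= 0 else -1
        print(x, '->', y)  # reproduces the AND truth table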
Gradient Descent is an optimization algorithm used for minimizing the cost function in various machine learning algorithms. It is basically used for updating the parameters of the learning model, as sketched below.
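The standard update rule, not written out in the notes, is w := w - α · dJ/dw, where α is the learning rate and J is the cost function. A minimal Python sketch on a one-parameter cost (the cost function, starting point, and learning rate are illustrative):

    # Minimal sketch: gradient descent on J(w) = (w - 3)^2, with dJ/dw = 2(w - 3).
    w = 0.0             # initial parameter
    alpha = 0.1         # learning rate
    for _ in range(100):
        grad = 2 * (w - 3)
        w -= alpha * grad            # update rule: w := w - alpha * dJ/dw
    print(w)            # converges toward the minimum at w = 3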
Variants such as stochastic and mini-batch gradient descent do not process all the training examples in one go; thus, they work for larger training sets, and with a lesser number of iterations per update.
The Perceptron Convergence Theorem states that a classifier for two linearly separable classes of patterns can always be trained in a finite number of training steps.
In other words, after a finite number of iterations, the algorithm yields a vector w that classifies all the examples perfectly.
Note: Although the number of iterations is finite, it is usually larger than the size
of the training set, because each example needs to be processed more than once.
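A minimal Python sketch of the perceptron training loop on a small linearly separable toy set; the data and names are illustrative, and the theorem guarantees that the loop terminates:

    import numpy as np

    # Minimal sketch of perceptron training on linearly separable data.
    X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, 1.0]])
    T = np.array([1, 1, -1, -1])                  # the two class labels

    w = np.zeros(2)
    b = 0.0
    converged = False
    while not converged:                          # finite, by the theorem
        converged = True
        for x, t in zip(X, T):
            if t * (w @ x + b) <= 0:              # misclassified example
                w += t * x                        # perceptron update rule
                b += t
                converged = False                 # must re-check every example
    print(w, b)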
Types of Learning in Neural Networks
Based on the methods and way of learning, machine learning is divided into mainly four types, which are:
o Supervised Learning
o Unsupervised Learning
o Semi-Supervised Learning
o Reinforcement Learning
Let's understand supervised learning with an example. Suppose we have an input dataset of cat and dog images. First, we train the machine to recognize the images, using features such as the shape and size of the tail of a cat and a dog, the shape of the eyes, colour, and height (dogs are taller, cats are smaller). After completing the training, we input a picture of a cat and ask the machine to identify the object and predict the output. Since the machine is now well trained, it will check all the features of the object, such as height, shape, colour, eyes, ears, tail, etc., and conclude that it is a cat. So it will put it in the cat category. This is how the machine identifies objects in supervised learning.
The main goal of the supervised learning technique is to map the input variable (x) to the output variable (y). Some real-world applications of supervised learning are risk assessment, fraud detection, spam filtering, etc.
Supervised machine learning can be classified into two types of problems, which
are given below:
o Classification
o Regression
a) Classification
Classification algorithms are used to solve classification problems in which the output variable is categorical, such as "Yes" or "No", or "Spam" or "Not Spam".
b) Regression
Regression algorithms are used to solve regression problems in which there is a relationship between the input and output variables and the output is a continuous value. They are used to predict continuous output variables, such as market trends and the weather.
Some common applications of supervised learning are:
o Image Segmentation
o Medical Diagnosis
o Fraud Detection
o Spam detection
o Speech Recognition
In unsupervised learning, the models are trained with data that is neither classified nor labelled, and the model acts on that data without any supervision. The objects in the data are totally unknown to the model, and the task of the machine is to find the patterns and categories of the objects.
The machine then discovers patterns and differences on its own, such as differences in colour and shape, and predicts the output when it is tested with the test dataset.
Unsupervised Learning can be further classified into two types, which are given
below:
o Clustering
o Association
1) Clustering
The clustering technique is used when we want to find the inherent groups within the data. It is a way to group objects into clusters such that the objects with the most similarities remain in one group and have few or no similarities with the objects of other groups. An example of a clustering algorithm is grouping customers by their purchasing behaviour, as in the sketch below.
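A minimal Python sketch of that customer-grouping example using k-means clustering; the data, features (annual spend and visit frequency), and cluster count are illustrative assumptions:

    import numpy as np
    from sklearn.cluster import KMeans

    # Minimal sketch: cluster customers by purchasing behaviour.
    # Each row is one customer: [annual spend, visits per month].
    X = np.array([[500, 4], [520, 5], [80, 1], [90, 2], [300, 12], [310, 11]])

    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    print(labels)   # customers with similar behaviour share a cluster label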
2) Association
Association rule learning is an unsupervised learning technique that finds interesting relations, or dependencies, among variables within a large dataset, for example, items that tend to be bought together.
Some popular association rule learning algorithms are the Apriori algorithm, Eclat, and FP-Growth.
Reinforcement Learning
In reinforcement learning, an agent learns by interacting with its environment: it receives rewards for good actions and penalties for bad ones, and it aims to maximize the total reward.
References:
https://www.javatpoint.com/artificial-neural-network
https://www.geeksforgeeks.org/hopfield-neural-network/
https://www.geeksforgeeks.org/hebbian-learning-rule-with-implementation-of-and-gate/
https://web.mit.edu/course/other/i2course/www/vision_and_learning/perceptron_notes.pdf
https://www.javatpoint.com/types-of-machine-learning