ASC Unit I

The document provides an overview of artificial neural networks (ANNs). It defines ANNs as computational networks inspired by biological neural networks in the human brain. ANNs contain interconnected nodes similar to neurons, organized in layers. The document discusses the basic components and architecture of ANNs, including input, hidden, and output layers. It also covers how ANNs work by taking weighted inputs, applying activation functions, and producing outputs. The document outlines some advantages like parallel processing and fault tolerance, as well as disadvantages like difficulty ensuring the proper network structure. Finally, it briefly introduces different types of ANNs.
Application of Soft Computing
KCS 056

UNIT – I (Neural Networks – I)

Artificial Neural Network Tutorial

Artificial Neural Network Tutorial provides basic and advanced concepts of
ANNs. Our Artificial Neural Network tutorial is developed for beginners as well
as professionals.

The term "Artificial neural network" refers to a biologically inspired sub-field of
artificial intelligence modeled after the brain. An artificial neural network is
usually a computational network based on the biological neural networks that
make up the structure of the human brain. Just as the human brain has neurons
interconnected with each other, artificial neural networks have neurons that are
linked to each other in various layers of the network. These neurons are known
as nodes.

This artificial neural network tutorial covers all the aspects related to the
artificial neural network. In this tutorial, we will discuss ANNs, Adaptive
resonance theory, Kohonen self-organizing map, Building blocks, unsupervised
learning, Genetic algorithm, etc.

What is Artificial Neural Network?

The term "Artificial Neural Network" is derived from the biological neural
networks that form the structure of the human brain. Just as the human brain
has neurons interconnected with one another, artificial neural networks have
neurons that are interconnected with one another in various layers of the
network. These neurons are known as nodes.

The given figure illustrates the typical diagram of a biological neural network.

The typical artificial neural network looks something like the given figure.

Dendrites in a biological neural network correspond to inputs in an artificial
neural network, the cell nucleus corresponds to nodes, synapses correspond to
weights, and the axon corresponds to the output.

Relationship between biological neural network and artificial neural network:

Biological Neural Network    Artificial Neural Network
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output

An artificial neural network is an attempt, within the field of artificial
intelligence, to mimic the network of neurons that makes up the human brain, so
that computers can understand things and make decisions in a human-like manner.
An artificial neural network is built by programming computers to behave like
simple interconnected brain cells.

There are around 86 billion neurons in the human brain (the figure is often
rounded to 100 billion). Each neuron has somewhere in the range of 1,000 to
100,000 connection points. In the human brain, data is stored in a distributed
manner, and we can retrieve more than one piece of this data from memory in
parallel when necessary. We can say that the human brain is made up of
remarkably powerful parallel processors.

We can understand the artificial neural network with an example. Consider a
digital logic gate that takes inputs and gives an output, such as an "OR" gate,
which takes two inputs. If one or both of the inputs are "On," then the output
is "On." If both inputs are "Off," then the output is "Off." Here the output
depends only on the inputs, in a fixed way. Our brain does not work like this:
its input-to-output relationship keeps changing, because the neurons in our
brain are "learning."

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to
understand what a neural network consists of. A neural network consists of a
large number of artificial neurons, termed units, arranged in a sequence of
layers. Let us look at the various types of layers available in an artificial
neural network.

An artificial neural network primarily consists of three layers:

Input Layer:

As the name suggests, it accepts inputs in several different formats provided by
the programmer.

Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the
calculations to find hidden features and patterns.

Output Layer:

The input goes through a series of transformations using the hidden layer, which
finally results in output that is conveyed using this layer.

The artificial neural network takes the inputs, computes their weighted sum, and
adds a bias. This computation is represented in the form of a transfer function:

z = w1*x1 + w2*x2 + ... + wn*xn + b

The weighted total z is then passed as input to an activation function to
produce the output. Activation functions decide whether a node should fire or
not. Only the nodes that fire pass their signal on to the output layer.
Different activation functions are available, and the choice depends on the sort
of task we are performing.
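
The computation described above (weighted sum, bias, then activation) can be
sketched for a single node as follows. This is a minimal illustration in Python,
not code from the tutorial; the names `step` and `neuron_output` and the example
weights are assumptions, chosen so that the node behaves like the OR gate
mentioned earlier.

```python
def step(z):
    """Binary step activation: the node fires (1) only if z is positive."""
    return 1 if z > 0 else 0


def neuron_output(inputs, weights, bias, activation=step):
    """Weighted sum of the inputs plus the bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical weights and bias that make the node act as an OR gate.
OR_WEIGHTS = [1.0, 1.0]
OR_BIAS = -0.5

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", neuron_output([x1, x2], OR_WEIGHTS, OR_BIAS))
```

With these hand-picked values the node stays off only when both inputs are off.
In a trained network the weights and bias would be learned rather than set by
hand.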

Advantages of Artificial Neural Network (ANN)

Parallel processing capability:

Because their computation is spread over many simple units, artificial neural
networks can perform more than one task simultaneously.

Storing data on the entire network:

Unlike traditional programming, where data is stored in a database, an ANN
stores information across the whole network. The loss of a few pieces of
information in one place doesn't prevent the network from working.

Capability to work with incomplete knowledge:

After training, an ANN may produce output even with incomplete data. The loss
of performance here depends on the significance of the missing data.

Having a memory distribution:

For an ANN to be able to learn, it is important to choose the examples and to
train the network toward the desired output by showing these examples to the
network. The success of the network depends directly on the chosen examples,
and if the problem cannot be shown to the network in all its aspects, the
network can produce false output.

Having fault tolerance:

Corruption of one or more cells of an ANN does not prevent it from generating
output, and this feature makes the network fault-tolerant.

Disadvantages of Artificial Neural Network:

Assurance of proper network structure:

There is no particular guideline for determining the structure of artificial
neural networks. The appropriate network structure is found through experience
and trial and error.

Unrecognized behavior of the network:

It is the most significant issue of ANN. When an ANN produces a solution, it
gives no insight into why and how it reached it. This decreases trust in the
network.

Hardware dependence:

Artificial neural networks need processors with parallel processing power, as
per their structure. Their realization therefore depends on suitable hardware.

Difficulty of showing the issue to the network:

ANNs work with numerical data. Problems must be converted into numerical values
before being introduced to the ANN. The representation chosen here directly
affects the performance of the network, and it depends on the user's abilities.

The duration of the network is unknown:

Training stops once the error has been reduced to a specific value, and reaching
this value does not guarantee optimum results.

How do artificial neural networks work?

An artificial neural network can best be represented as a weighted directed
graph, where the artificial neurons form the nodes. The connections between
neuron outputs and neuron inputs can be viewed as directed edges with weights.
The artificial neural network receives the input signal from an external source,
such as a pattern or an image, in the form of a vector. These inputs are then
denoted mathematically by x(n) for each of the n inputs.

Afterward, each input is multiplied by its corresponding weight (these weights
are the information the artificial neural network uses to solve a specific
problem). In general terms, the weights represent the strength of the
interconnections between neurons inside the artificial neural network. All the
weighted inputs are summed inside the computing unit.

If the weighted sum is equal to zero, a bias is added to make the output
non-zero, or otherwise to scale up the system's response. The bias can be
treated as an extra input that is always equal to 1 and has its own weight. The
total of the weighted inputs is not bounded and can grow toward infinity. To
keep the response within the limits of the desired value, a certain maximum
value is set as a benchmark, and the total of the weighted inputs is passed
through the activation function.

The activation function refers to the set of transfer functions used to achieve
the desired output. There are different kinds of activation function, but they
are primarily either linear or non-linear. Some commonly used activation
functions are the binary, linear, and tan hyperbolic sigmoidal activation
functions.

Types of Artificial Neural Network:

There are various types of artificial neural networks (ANNs). They perform
tasks in ways modeled on the neurons and network functions of the human brain.
Most artificial neural networks have some similarities with their more complex
biological counterpart and are very effective at their intended tasks, for
example segmentation or classification.

Feedback ANN:

In this type of ANN, the output is fed back into the network to reach the
best-evolved results internally. According to the University of Massachusetts
Lowell Centre for Atmospheric Research, feedback networks feed information back
into themselves and are well suited to solving optimization problems. Internal
system error corrections use feedback ANNs.

Feed-Forward ANN:

A feed-forward network is a basic neural network comprising an input layer, an
output layer, and at least one layer of neurons. By assessing its output
against its input, the strength of the network can be observed from the group
behavior of the connected neurons, and the output is decided. The primary
advantage of this network is that it learns to evaluate and recognize input
patterns.
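
As a rough illustration of a feed-forward pass, the sketch below pushes an input
through one hidden layer and one output layer, with information flowing in one
direction only. The parameters are hand-picked assumptions (nothing is learned
here), and the names `layer_forward` and `feed_forward` are hypothetical. The
values are chosen so that the network computes XOR, a pattern a single node
cannot represent.

```python
def step(z):
    """Binary step activation."""
    return 1 if z > 0 else 0


def layer_forward(inputs, weights, biases):
    """One layer: each row of `weights` holds the incoming weights of one neuron."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (hypothetical) parameters for a 2-2-1 network.
HIDDEN_W = [[1.0, 1.0],   # hidden neuron 1 behaves like OR
            [1.0, 1.0]]   # hidden neuron 2 behaves like AND
HIDDEN_B = [-0.5, -1.5]
OUTPUT_W = [[1.0, -1.0]]  # fires when OR is on and AND is off
OUTPUT_B = [-0.5]


def feed_forward(inputs):
    """Input layer -> hidden layer -> output layer, with no feedback."""
    hidden = layer_forward(inputs, HIDDEN_W, HIDDEN_B)
    return layer_forward(hidden, OUTPUT_W, OUTPUT_B)[0]


for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, "->", feed_forward(pattern))
```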

What is Activation Function?

It is simply a function that you use to get the output of a node. It is also
known as a Transfer Function.

Why do we use Activation functions with Neural Networks?

It is used to determine the output of a neural network, such as yes or no. It
maps the resulting values into a range such as 0 to 1 or -1 to 1 (depending
upon the function).

The Activation Functions can be basically divided into 2 types-

1. Linear Activation Function

2. Non-linear Activation Functions

Linear or Identity Activation Function

As you can see, the function is a straight line, i.e. linear. Therefore, the
output of the function is not confined to any range.

Fig: Linear Activation Function

Equation : f(x) = x

Range : (-infinity to infinity)

It does not help with the complexity or the varied parameters of the typical
data that is fed to neural networks.
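
One way to see why a purely linear activation adds no modelling power: stacking
two linear layers gives exactly the same result as one linear layer. The sketch
below checks this for scalar layers; the weights and the function names are
arbitrary assumptions chosen for illustration.

```python
def linear_layer(x, weight, bias):
    """A layer with the identity activation f(x) = x."""
    return weight * x + bias


def two_linear_layers(x):
    """Two stacked linear layers with hypothetical parameters."""
    hidden = linear_layer(x, 2.0, 1.0)
    return linear_layer(hidden, -3.0, 0.5)


def single_equivalent_layer(x):
    """-3 * (2x + 1) + 0.5 simplifies to -6x - 2.5, a single linear layer."""
    return linear_layer(x, -6.0, -2.5)


for x in (-2.0, 0.0, 1.5):
    print(x, two_linear_layers(x), single_equivalent_layer(x))
```

However many linear layers are stacked, the result is still a straight line,
which is why non-linear activation functions are used between layers.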

Non-linear Activation Function

Nonlinear activation functions are the most widely used activation functions.
Nonlinearity makes the graph look something like this:

Fig: Non-linear Activation Function

It makes it easy for the model to generalize or adapt to a variety of data and
to differentiate between outputs.

The main terms needed to understand nonlinear functions are:

Derivative or Differential: Change in y-axis w.r.t. change in x-axis. It is also
known as slope.

Monotonic function: A function which is either entirely non-increasing or
entirely non-decreasing.

The nonlinear activation functions are mainly divided on the basis of their
range or curves:

1. Sigmoid or Logistic Activation Function

The Sigmoid Function curve looks like an S-shape.

Fig: Sigmoid Function

The main reason why we use the sigmoid function is that its output lies between
0 and 1. Therefore, it is especially used for models where we have to predict a
probability as the output. Since the probability of anything exists only in the
range of 0 to 1, sigmoid is the right choice.

The function is differentiable. That means we can find the slope of the sigmoid
curve at any point.

The function is monotonic, but its derivative is not.

The logistic sigmoid function can cause a neural network to get stuck during
training.

The softmax function is a more generalized logistic activation function which is
used for multiclass classification.
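
The properties above can be checked with a small sketch. It assumes the standard
definitions sigmoid(x) = 1 / (1 + e^(-x)), with derivative
sigmoid(x) * (1 - sigmoid(x)), and softmax_i = e^(z_i) / sum_j e^(z_j); the
function names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_derivative(x):
    """Slope of the sigmoid: largest at x = 0 and close to zero far from it."""
    s = sigmoid(x)
    return s * (1.0 - s)


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0), sigmoid_derivative(0.0))
print(softmax([1.0, 2.0, 3.0]))
```

The derivative rises up to x = 0 and falls afterwards, which is why it is not
monotonic. It is also very small for inputs far from zero, which is one reason
training can stall. With two classes and scores [z, 0], the first softmax
output equals sigmoid(z), which is the sense in which softmax generalizes the
logistic function.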

2. Tanh or hyperbolic tangent Activation Function

tanh is similar to the logistic sigmoid, but often works better. The range of
the tanh function is (-1, 1). tanh is also sigmoidal (S-shaped).

Fig: tanh v/s Logistic Sigmoid

The advantage is that the negative inputs will be mapped strongly negative and
the zero inputs will be mapped near zero in the tanh graph.

The function is differentiable.

The function is monotonic while its derivative is not monotonic.

The tanh function is mainly used for classification between two classes.

Both tanh and logistic sigmoid activation functions are used in feed-forward nets.
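
The close relationship between the two functions can be made concrete: tanh is a
rescaled and shifted logistic sigmoid, tanh(x) = 2 * sigmoid(2x) - 1. The short
check below is a sketch; `tanh_via_sigmoid` is a name introduced here for
illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_via_sigmoid(x):
    """tanh rebuilt from the sigmoid: stretch the range to (-1, 1), centre at 0."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, tanh_via_sigmoid(x), math.tanh(x))
```

This shows where the zero-centred output comes from: sigmoid(0) is 0.5, while
tanh(0) is 0.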

3. ReLU (Rectified Linear Unit) Activation Function

ReLU is currently the most widely used activation function, since it is used in
almost all convolutional neural networks and deep learning models.

Fig: ReLU v/s Logistic Sigmoid

As you can see, the ReLU is half rectified (from bottom). f(z) is zero when z is
less than zero and f(z) is equal to z when z is above or equal to zero.

Range: [ 0 to infinity)

The function and its derivative both are monotonic.

The issue is that all negative values become zero immediately, which decreases
the ability of the model to fit or train on the data properly. That means any
negative input given to the ReLU activation function is turned into zero
immediately in the graph, which in turn affects the resulting graph by not
mapping the negative values appropriately.

4. Leaky ReLU

It is an attempt to solve the dying ReLU problem, in which a unit that keeps
receiving negative inputs outputs zero and stops learning.

Fig : ReLU v/s Leaky ReLU

Can you see the Leak? 😆

The leak helps to increase the range of the ReLU function. Here a is the small
slope applied to negative inputs, so f(z) = a*z when z is less than zero.
Usually, the value of a is 0.01 or so.

When a is chosen at random rather than fixed, the function is called Randomized
ReLU.

Therefore the range of the Leaky ReLU is (-infinity to infinity).

Both Leaky and Randomized ReLU functions are monotonic in nature. Their
derivatives are also monotonic in nature.
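
The difference between ReLU and Leaky ReLU is easiest to see side by side. The
sketch below assumes the usual definitions (ReLU returns z for z >= 0 and 0
otherwise; Leaky ReLU returns a*z for negative z). Taking the derivative at
exactly z = 0 from the negative side is a convention adopted here, not something
the text specifies.

```python
def relu(z):
    """Rectified linear unit: negative inputs are clipped to zero."""
    return z if z >= 0 else 0.0


def relu_derivative(z):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise (convention at z = 0)."""
    return 1.0 if z > 0 else 0.0


def leaky_relu(z, a=0.01):
    """Leaky ReLU: negative inputs keep a small slope `a` instead of zero."""
    return z if z >= 0 else a * z


def leaky_relu_derivative(z, a=0.01):
    """Slope of Leaky ReLU: 1 for positive inputs, `a` otherwise."""
    return 1.0 if z > 0 else a


for z in (-2.0, -0.5, 0.0, 3.0):
    print(z, relu(z), leaky_relu(z), relu_derivative(z), leaky_relu_derivative(z))
```

For negative inputs the ReLU derivative is exactly zero, so no error signal
flows back through the unit. The small slope `a` keeps a non-zero gradient,
which is the point of the leak.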

Recurrent Neural Network

A Recurrent Neural Network (RNN) is a type of neural network where the output
from the previous step is fed as input to the current step. In traditional
neural networks, all the inputs and outputs are independent of each other, but
in cases such as predicting the next word of a sentence, the previous words are
required, and hence there is a need to remember them. RNNs came into existence
to solve this issue with the help of a hidden layer. The main and most
important feature of an RNN is its hidden state, which remembers some
information about a sequence.

An RNN has a “memory” which retains information about what has been calculated
so far. It uses the same parameters for each input, as it performs the same
task on all the inputs or hidden layers to produce the output. This reduces the
number of parameters, unlike other neural networks.

Training through RNN

1. A single time step of the input is provided to the network.
2. The current state is then calculated from the current input and the previous
   state.
3. The current state h(t) becomes h(t-1) for the next time step.
4. One can go through as many time steps as the problem requires and join the
   information from all the previous states.
5. Once all the time steps are completed, the final current state is used to
   calculate the output.
6. The output is then compared to the actual output, i.e. the target output, and
   the error is generated.
7. The error is then back-propagated through the network to update the weights,
   and hence the network (RNN) is trained.
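
Steps 1 to 6 can be sketched with a single recurrent unit. The sketch assumes a
scalar hidden state, a tanh activation, and a squared-error measure, none of
which the text prescribes; the names `w_x`, `w_h`, `w_y` and `b` are
hypothetical parameters. Back-propagation (step 7) is left out.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, w_y, h0=0.0):
    """Run one recurrent unit over a sequence and return (output, all states)."""
    h = h0
    states = []
    for x_t in inputs:
        # Steps 1-3: the new state depends on the current input and previous state.
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    # Step 5: the final state is used to calculate the output.
    output = w_y * h
    return output, states


def squared_error(output, target):
    """Step 6: compare the output with the target output."""
    return 0.5 * (output - target) ** 2


output, states = rnn_forward([1.0, 0.5, -1.0], w_x=0.8, w_h=0.5, b=0.1, w_y=2.0)
print(states)
print(output, squared_error(output, target=1.0))
```

The same `w_x`, `w_h` and `b` are reused at every time step, which is the
parameter sharing described above. Feeding the same values in a different order
gives a different final state, because each state depends on the one before it.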

Advantages of Recurrent Neural Network

1. An RNN carries information forward through time. It is useful in time series
   prediction precisely because of this ability to remember previous inputs.
   Long Short-Term Memory (LSTM) networks are an RNN variant built around this
   idea.
2. Recurrent neural networks are even used with convolutional layers to extend
   the effective pixel neighbourhood.

Disadvantages of Recurrent Neural Network

1. Gradient vanishing and exploding problems.
2. Training an RNN is a very difficult task.
3. It cannot process very long sequences if using tanh or ReLU as an activation
   function.

Hopfield Neural Networks

The Hopfield neural network, invented by Dr John J. Hopfield, consists of one
layer of ‘n’ fully connected recurrent neurons. It is generally used for
auto-association and optimization tasks. Its output is calculated using a
converging iterative process, and it generates a different response than our
normal neural nets.

Discrete Hopfield Network: It is a fully interconnected neural network where
each unit is connected to every other unit. It behaves in a discrete manner,
i.e. it gives finite distinct output, generally of two types:
• Binary (0/1)
• Bipolar (-1/1)
The weights associated with this network are symmetric in nature and have the
following properties:
• wij = wji (the weight from unit i to unit j equals the weight from unit j to
  unit i)
• wii = 0 (no unit is connected to itself)

Structure & Architecture

• Each neuron has an inverting and a non-inverting output.
• Being fully connected, the output of each neuron is an input to all other
  neurons but not to itself.

Fig 1 shows a sample representation of a Discrete Hopfield Neural Network
architecture having the following elements.

Fig 1: Discrete Hopfield Network Architecture
[ x1 , x2 , ... , xn ] -> Input to the n given neurons.
[ y1 , y2 , ... , yn ] -> Output obtained from the n given neurons.
Wij -> Weight associated with the connection between the ith and the jth neuron.
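
A small sketch can show how such a network stores and recalls a bipolar pattern.
The text does not say how the weights are set, so the storage rule used here
(summing the products p[i] * p[j] over the stored patterns, a Hebbian
outer-product rule) is an assumption, as are the function names. The weights
come out symmetric with a zero diagonal, matching the structure described above.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix with zero diagonal from bipolar patterns."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no neuron is connected to itself
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            net = sum(weights[i][j] * state[j] for j in range(n))
            new_value = state[i] if net == 0 else (1 if net > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])
noisy = [-1, -1, 1, -1, 1, -1]  # the first element has been flipped
print(recall(weights, noisy))
```

Starting from the corrupted input, the iterative updates settle back on the
stored pattern, which is the auto-association behaviour mentioned above.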

Boltzmann Machines

A Boltzmann Machine is an unsupervised DL model in which every node is
connected to every other node. That is, unlike ANNs, CNNs, RNNs and SOMs,
Boltzmann Machines are undirected (the connections are bidirectional). A
Boltzmann Machine is not a deterministic DL model but a stochastic, generative
DL model. It is rather a representation of a certain system. There are two
types of nodes in the Boltzmann Machine: visible nodes, which we can and do
measure, and hidden nodes, which we cannot or do not measure. Although the node
types are different, the Boltzmann Machine treats them as the same, and
everything works as one single system. The training data is fed into the
Boltzmann Machine and the weights of the system are adjusted accordingly.
Boltzmann Machines help us understand abnormalities by learning about the
working of the system in normal conditions.

Fig: Boltzmann Machine

Associative memory networks work on the basis of pattern association, which
means they can store different patterns and, when giving an output, produce one
of the stored patterns by matching it with the given input pattern. These types
of memories are also called Content-Addressable Memory (CAM). Associative
memory makes a parallel search through the stored patterns as data files.

Following are the two types of associative memories we can observe:

• Auto Associative Memory
• Hetero Associative Memory

Auto Associative Memory

This is a single layer neural network in which the input training vector and the
output target vectors are the same. The weights are determined so that the network
stores a set of patterns.

Architecture

As shown in the following figure, the architecture of the Auto Associative
Memory network has ‘n’ input training vectors and a matching ‘n’ output target
vectors.

Hetero Associative memory

Similar to Auto Associative Memory network, this is also a single layer neural
network. However, in this network the input training vector and the output target
vectors are not the same. The weights are determined so that the network stores a
set of patterns. Hetero associative network is static in nature, hence, there would
be no non-linear and delay operations.

Architecture

As shown in the following figure, the architecture of the Hetero Associative
Memory network has ‘n’ input training vectors and ‘m’ output target vectors.

Hebb’s law
The Hebbian Learning Rule, also known as the Hebb Learning Rule, was proposed by
Donald O. Hebb. It is one of the first and also easiest learning rules for
neural networks. It is used for pattern classification. It applies to a single
layer neural network, i.e. one with one input layer and one output layer. The
input layer can have many units, say n. The output layer has only one unit. The
Hebbian rule works by updating the weights between neurons in the neural
network for each training sample.

Hebbian Learning Rule Algorithm :


1. Set all weights to zero, wi = 0 for i = 1 to n, and bias to zero.
2. For each input vector and target output pair, s : t, repeat steps 3-5.
3. Set activations for input units with the input vector, xi = si for i = 1 to n.
4. Set the corresponding output value to the output neuron, i.e. y = t.
5. Update weight and bias by applying the Hebb rule for all i = 1 to n:
   wi(new) = wi(old) + xi * y
   b(new) = b(old) + y
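
The five steps can be traced on the AND gate. The sketch below assumes the
commonly stated form of the Hebb update, w_i(new) = w_i(old) + x_i * y and
b(new) = b(old) + y, and assumes bipolar (-1/+1) encoding for inputs and
targets. The names `hebb_train` and `predict` are illustrative.

```python
def hebb_train(samples):
    """One pass of the Hebb rule over (input vector, target) pairs."""
    n = len(samples[0][0])
    weights = [0] * n          # step 1: all weights start at zero
    bias = 0                   # step 1: the bias starts at zero
    for x, t in samples:       # step 2: visit each training pair once
        y = t                  # steps 3-4: inputs are x, the output is set to t
        for i in range(n):     # step 5: Hebb update
            weights[i] += x[i] * y
        bias += y
    return weights, bias


def predict(x, weights, bias):
    """Bipolar output of the trained unit."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net > 0 else -1


# AND gate in bipolar form (an assumption about the encoding).
AND_SAMPLES = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]

weights, bias = hebb_train(AND_SAMPLES)
print(weights, bias)  # [2, 2] -2
for x, t in AND_SAMPLES:
    print(x, "->", predict(x, weights, bias), "target", t)
```

After a single pass the unit reproduces all four targets. With 0/1 encoding the
same rule would leave the weights unchanged for every sample whose target is 0,
which is why the bipolar form is used in this sketch.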

Gradient Descent is an optimization algorithm used for minimizing the cost
function in various machine learning algorithms. It is basically used for updating
the parameters of the learning model.

Types of gradient Descent:


1. Batch Gradient Descent: This type of gradient descent processes all the
   training examples in each iteration. If the number of training examples is
   large, batch gradient descent is computationally very expensive and is
   therefore not preferred. Instead, we prefer to use stochastic gradient
   descent or mini-batch gradient descent.
2. Stochastic Gradient Descent: This type of gradient descent processes 1
   training example per iteration. Hence, the parameters are updated after
   every iteration, in which only a single example has been processed. Each
   iteration is therefore much faster than in batch gradient descent. But when
   the number of training examples is large, processing only one example at a
   time can be additional overhead for the system, as the number of iterations
   will be quite large.
3. Mini Batch Gradient Descent: This type of gradient descent often works
   faster than both batch gradient descent and stochastic gradient descent.
   Here b examples are processed per iteration, where b < m and m is the number
   of training examples. So even if the number of training examples is large,
   they are processed in batches of b training examples in one go. Thus, it
   works for larger training sets, and with fewer iterations than stochastic
   gradient descent.
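
The three variants differ only in how many examples feed each parameter update,
so one routine with a `batch_size` argument can cover all of them. The sketch
below fits a line y = w*x + b to a small made-up dataset. The data, learning
rate, epoch count, and function name are all assumptions chosen for
illustration.

```python
import random


def gradient_descent(xs, ys, batch_size, learning_rate=0.1, epochs=2000, seed=0):
    """Fit y = w*x + b by minimizing half the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [(w * xs[i] + b) - ys[i] for i in batch]
            # Average the gradient over the examples in this batch only.
            grad_w = sum(e * xs[i] for e, i in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Made-up, noise-free data generated from y = 2x + 1.
XS = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
YS = [2.0 * x + 1.0 for x in XS]
m = len(XS)

for name, size in (("batch", m), ("stochastic", 1), ("mini-batch", 4)):
    w, b = gradient_descent(XS, YS, batch_size=size)
    print(name, round(w, 3), round(b, 3))
```

Setting `batch_size` to m gives batch gradient descent (one update per pass over
the data), 1 gives stochastic gradient descent (m updates per pass), and a value
b with 1 < b < m gives mini-batch gradient descent.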

Perceptron Convergence Theorem:

Perceptron Convergence theorem states that a classifier for two linearly separable
classes of patterns is always trainable in a finite number of training steps.

(Or)

Perceptron Convergence Theorem: For any finite set of linearly separable
labelled examples, the Perceptron Learning Algorithm will halt after a finite
number of iterations.

In other words, after a finite number of iterations, the algorithm yields a vector w
that classifies perfectly all the examples.

Note: Although the number of iterations is finite, it is usually larger than the size
of the training set, because each example needs to be processed more than once.
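
The theorem can be observed on a tiny example. The sketch below uses the usual
perceptron update, w = w + learning_rate * (target - output) * x, which the text
does not spell out and is assumed here, and stops as soon as a full pass makes
no mistakes. The OR data is linearly separable. XOR is included as a contrast
because it is not.

```python
def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Return (weights, bias, epochs_used, converged)."""
    n = len(samples[0][0])
    weights = [0.0] * n
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, target in samples:
            net = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if net > 0 else 0
            error = target - output
            if error != 0:
                # Move the decision boundary toward the misclassified example.
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: the algorithm halts
            return weights, bias, epoch, True
    return weights, bias, max_epochs, False


OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(train_perceptron(OR_DATA))
print(train_perceptron(XOR_DATA)[3])
```

On the OR data the loop halts after a few passes over the four examples, in
line with the note that each example usually has to be processed more than
once. On XOR no weight vector classifies every example, so the loop only stops
at the `max_epochs` cap.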

Types of Learning in Neural Networks

Based on the methods and way of learning, machine learning is mainly divided
into four types, which are:

1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning

Supervised Machine Learning

As its name suggests, supervised machine learning is based on supervision. In
the supervised learning technique, we train the machines using a "labelled"
dataset, and based on the training, the machine predicts the output. Here,
labelled data means that the inputs in the training set are already mapped to
their outputs. More precisely, we first train the machine with inputs and their
corresponding outputs, and then we ask the machine to predict the output using
the test dataset.

Let's understand supervised learning with an example. Suppose we have an input
dataset of cat and dog images. First, we train the machine to understand the
images, using features such as the shape and size of the tail, the shape of the
eyes, colour, and height (dogs are taller, cats are smaller). After training is
complete, we input a picture of a cat and ask the machine to identify the object
and predict the output. Now that the machine is well trained, it will check all
the features of the object, such as height, shape, colour, eyes, ears, and tail,
and find that it's a cat. So, it will put it in the Cat category. This is how
the machine identifies objects in supervised learning.

The main goal of the supervised learning technique is to map the input
variable(x) with the output variable(y). Some real-world applications of
supervised learning are Risk Assessment, Fraud Detection, Spam filtering, etc.

Categories of Supervised Machine Learning

Supervised machine learning can be classified into two types of problems, which
are given below:

o Classification
o Regression

a) Classification

Classification algorithms are used to solve classification problems, in which
the output variable is categorical, such as "Yes" or "No", Male or Female, Red
or Blue, etc. The classification algorithms predict the categories present in
the dataset. Some real-world examples of classification algorithms are Spam
Detection, Email filtering, etc.

b) Regression

Regression algorithms are used to solve regression problems, in which the
output variable is continuous and depends on the input variables (a linear
relationship in the simplest case). They are used to predict continuous output
variables, such as market trends, weather prediction, etc.

Applications of Supervised Learning

Some common applications of Supervised Learning are given below:

o Image Segmentation
o Medical Diagnosis
o Fraud Detection
o Spam detection
o Speech Recognition

Unsupervised Machine Learning

Unsupervised learning is different from the supervised learning technique; as
its name suggests, there is no need for supervision. In unsupervised machine
learning, the machine is trained using an unlabeled dataset, and the machine
predicts the output without any supervision.

In unsupervised learning, the models are trained with the data that is neither
classified nor labelled, and the model acts on that data without any supervision.

The main aim of the unsupervised learning algorithm is to group or categorize
the unsorted dataset according to similarities, patterns, and differences.
Machines are instructed to find the hidden patterns in the input dataset.

Let's take an example to understand it more precisely. Suppose there is a
basket of fruit images, and we input it into the machine learning model. The
images are totally unknown to the model, and the task of the machine is to find
the patterns and categories of the objects.

So, the machine will discover patterns and differences on its own, such as
differences in colour and shape, and predict the output when it is tested with
the test dataset.

Categories of Unsupervised Machine Learning

Unsupervised Learning can be further classified into two types, which are given
below:

o Clustering
o Association

1) Clustering

The clustering technique is used when we want to find the inherent groups from
the data. It is a way to group the objects into a cluster such that the objects with
the most similarities remain in one group and have fewer or no similarities with
the objects of other groups. An example of the clustering algorithm is grouping
the customers by their purchasing behaviour.

2) Association

Association rule learning is an unsupervised learning technique which finds
interesting relations among variables within a large dataset. The main aim of
this learning algorithm is to find the dependency of one data item on another
data item and map those variables accordingly, so that the findings can be
used, for example, to maximize profit. This algorithm is mainly applied in
Market Basket analysis, Web usage mining, continuous production, etc.

Some popular algorithms for association rule learning are the Apriori
algorithm, Eclat, and the FP-growth algorithm.

Reinforcement Learning

Reinforcement learning works on a feedback-based process, in which an AI agent
(a software component) automatically explores its surroundings by trial and
error: taking actions, learning from experience, and improving its performance.
The agent gets rewarded for each good action and punished for each bad action;
hence the goal of a reinforcement learning agent is to maximize the rewards.

References:

• https://www.javatpoint.com/artificial-neural-network
• https://www.geeksforgeeks.org/hopfield-neural-network/
• https://www.geeksforgeeks.org/hebbian-learning-rule-with-implementation-of-and-gate/
• Internet
• https://web.mit.edu/course/other/i2course/www/vision_and_learning/perceptron_notes.pdf
• https://www.javatpoint.com/types-of-machine-learning
