Module - 3 AAI
DEPT. OF AIML
• Artificial intelligence (AI) is the simulation of human intelligence in machines that are programmed to think and act like humans. Learning, reasoning, problem-solving, perception, and language comprehension are all examples of the cognitive abilities such machines aim to reproduce.
ARTIFICIAL NEURAL NETWORKS
• Feedback ANN
RECURRENT NETWORK
• A recurrent network is a feedback network with a closed loop.
• This type of network can be a single-layer or a multilayer network.
• In a single layer network with a feedback connection,
a processing element’s output can be directed back to
itself or to another processing element or to both.
• When the feedback is directed back to the hidden
layers it forms a multilayer recurrent network.
• In addition, a processing element output can be
directed back to itself and to other processing
elements in the same layer.
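To make the feedback idea concrete, here is a minimal sketch (not from the slides) of one update step of a single-layer recurrent network, in which each processing element sees the new input plus the fed-back outputs of the layer itself; the weight matrices and the tanh activation are illustrative assumptions.

```python
import numpy as np

def recurrent_step(x, h_prev, W_in, W_fb):
    """One update of a single-layer feedback (recurrent) network.

    x      : current input vector
    h_prev : the layer's own output from the previous step (the feedback)
    W_in   : input-to-layer weights
    W_fb   : feedback weights (layer output directed back to the same layer)
    """
    # Each processing element sees the new input plus the fed-back
    # outputs of itself and the other elements in the same layer.
    return np.tanh(W_in @ x + W_fb @ h_prev)

# Example: 3 inputs feeding 2 processing elements
rng = np.random.default_rng(0)
W_in, W_fb = rng.normal(size=(2, 3)), rng.normal(size=(2, 2))
h = np.zeros(2)
for x in [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])]:
    h = recurrent_step(x, h, W_in, W_fb)
    print(h)
```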
LEARNING
• Flexible Output: ANNs can predict single or multiple outputs, which could be categories (e.g., yes/no), continuous values (e.g., temperature), or a mix of both. For example, a network can predict both steering direction and acceleration in a self-driving car.
• Handling Errors in Data: ANNs can learn effectively even if the training data contains mistakes or noise, making them robust to imperfect data.
• Long Training Times: Training ANNs can take longer than simpler models, ranging from seconds to hours, but this time is necessary for learning complex patterns in data.
• Fast Prediction: Once trained, ANNs can make predictions on new data quickly. For instance, in real-time applications like autonomous driving, they can update decisions several times per second.
• Not Easily Interpreted: The internal workings of ANNs (weights and connections) are hard for humans to understand, unlike simpler models that can be explained more clearly.
PERCEPTRON
• A perceptron is a single-layer neural network that consists of four main components: input values (input nodes), weights and bias, a net sum, and an activation function.
• The perceptron model begins with the
multiplication of all input values and their
weights, then adds these values together to create
the weighted sum.
• This weighted sum is then passed to the activation function 'f' to obtain the output.
• In the perceptron, this activation function is a step (threshold) function.
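As a minimal sketch of this computation (the weights, bias, and inputs below are made-up values), the perceptron forms the weighted sum and passes it through the step function f:

```python
def perceptron(inputs, weights, bias):
    # Weighted sum: multiply each input by its weight, add them up, add the bias
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step (threshold) activation function 'f'
    return 1 if weighted_sum > 0 else 0

# Illustrative values only
print(perceptron(inputs=[1.0, 0.5], weights=[0.4, -0.2], bias=0.1))  # -> 1
```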
WORKING OF PERCEPTRON
One type of ANN system is based on a unit called a perceptron. A perceptron takes a vector of real-valued inputs, calculates a linear combination of those inputs, and outputs 1 if the result is greater than some threshold and 0 (or −1, depending on the convention) otherwise.
REPRESENTATIONAL POWER OF PERCEPTRON
A single perceptron can represent any linearly separable function, that is, any function whose positive and negative examples can be divided by a hyperplane in the input space; AND, OR, and NOT are examples, while XOR is not.
PERCEPTRON FOR BOOLEAN FUNCTIONS
NOT gate: NOT(x) is a one-variable function, meaning the perceptron receives a single input at a time (N = 1). It is also a logical function, so both the input and the output have only two possible states: 0 and 1.
AND LOGICAL FUNCTION
OR LOGICAL FUNCTION
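The output tables for these gates appeared as figures on the original slides. As a hedged sketch, the weights and biases below are one standard choice that realizes NOT, AND, and OR with the step-threshold perceptron described above; any weights that place the decision boundary correctly would work equally well.

```python
def step(s):
    return 1 if s > 0 else 0

def NOT(x):            # weight = -1, bias = 0.5
    return step(-1 * x + 0.5)

def AND(x1, x2):       # w1 = w2 = 1, bias = -1.5
    return step(x1 + x2 - 1.5)

def OR(x1, x2):        # w1 = w2 = 1, bias = -0.5
    return step(x1 + x2 - 0.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "AND:", AND(a, b), "OR:", OR(a, b), "NOT(a):", NOT(a))
```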
XOR — ALL (PERCEPTRONS) FOR ONE (LOGICAL FUNCTION)
XOR is not linearly separable, so no single perceptron can represent it; it requires a two-layer network that combines several perceptrons (for example, OR and NAND feeding into AND).
PERCEPTRON TRAINING ALGORITHM
Initial Setup
Training Data
Cat example: Ear size = 3, Tail length = 5, Target output = 0 (representing "cat").
Dog example: Ear size = 8, Tail length = 12, Target output = 1 (representing "dog").
Initial Prediction
Example: if the target output is 0 (cat) but the perceptron predicts 1, the error is target − output = 0 − 1 = −1.
Updating the Weights
Each weight is adjusted in proportion to the error using the perceptron training rule: w_i ← w_i + η (t − o) x_i, where η is the learning rate, t the target output, and o the actual output.
Repeat the Process
• After adjusting the weights, the perceptron tries again on new examples, and
over time, it makes fewer errors.
• For example, after training on many images of cats and dogs, the perceptron
learns to predict accurately whether the input is a cat or a dog based on ear size
and tail length.
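A sketch of this training loop on the slide's cat/dog data, using the perceptron training rule w_i ← w_i + η (t − o) x_i; the learning rate, zero initial weights, and epoch cap are arbitrary illustrative choices.

```python
# Training data from the slide: (ear size, tail length) -> 0 = cat, 1 = dog
data = [([3, 5], 0), ([8, 12], 1)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1

def predict(x):
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

for epoch in range(1000):
    mistakes = 0
    for x, target in data:
        error = target - predict(x)        # e.g. 0 - 1 = -1, as above
        if error != 0:
            mistakes += 1
            # Perceptron training rule: w_i <- w_i + lr * error * x_i
            for i in range(len(weights)):
                weights[i] += lr * error * x[i]
            bias += lr * error
    if mistakes == 0:                      # stop once every example is correct
        break

print([predict(x) for x, _ in data])       # -> [0, 1]
```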
GRADIENT DESCENT AND DELTA RULE
The delta rule trains an unthresholded linear unit, o = w · x, by searching for the weights that minimize the squared training error E(w) = ½ Σ_d (t_d − o_d)²; unlike the perceptron rule, it converges toward a best-fit approximation even when the training data are not linearly separable.
VISUALIZING THE HYPOTHESIS SPACE
DERIVATION OF DELTA RULE
HOW TO CALCULATE THE DIRECTION OF STEEPEST DESCENT ALONG THE ERROR SURFACE?
The direction of steepest descent is given by the gradient of E with respect to the weights, ∇E(w) = [∂E/∂w_0, …, ∂E/∂w_n]; training moves the weight vector in the opposite (negative gradient) direction.
TRAINING RULE
Finally, taking the derivative of E gives the gradient descent training rule for each weight: Δw_i = η Σ_d (t_d − o_d) x_id, where η is the learning rate, t_d and o_d are the target and actual outputs for training example d, and x_id is the i-th input of that example.
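A minimal batch implementation of this rule; the synthetic linear dataset, learning rate, and step count below are illustrative assumptions.

```python
import numpy as np

# Illustrative training set: targets come from a noisy linear function
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))                                    # inputs x_d
t = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.05 * rng.normal(size=50)  # targets t_d

w = np.zeros(2)      # weights of the unthresholded linear unit o = w . x
eta = 0.01           # learning rate

for step in range(200):
    o = X @ w        # outputs o_d for every training example
    # The gradient of E(w) = 1/2 sum_d (t_d - o_d)^2 is -sum_d (t_d - o_d) x_d,
    # so stepping against it gives the delta rule: w += eta * sum_d (t_d - o_d) x_d
    w += eta * X.T @ (t - o)

print(w)  # approaches the generating weights [2.0, -1.0]
```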
MULTILAYER NETWORKS AND THE
BACKPROPAGATION ALGORITHM
A multilayer network is a type of neural network where information moves through layers
of interconnected nodes (neurons). It consists of an input layer, one or more hidden layers,
and an output layer. These networks can solve complex problems because the hidden
layers learn to capture intricate patterns in data.
MLP networks are used in a supervised learning setting. The typical learning algorithm for MLP networks is the backpropagation algorithm.
BACKPROPAGATION ALGORITHM
The backpropagation algorithm is a method used to train multilayer networks.
Here's how it works:
1. Data is fed into the input layer and passed forward through the network.
2. The network makes a prediction at the output layer.
3. If the prediction is wrong, the backpropagation algorithm calculates the error and
sends it backward through the network.
4. The network adjusts its weights (connections between neurons) to reduce the error.
Through repeated training, the network gets better at making accurate predictions.
DIFFERENTIABLE THRESHOLD UNIT
• A differentiable threshold unit (like a neuron with a sigmoid or softmax
activation function) helps the network learn by producing smooth outputs that
can change gradually, rather than hard "yes/no" decisions. For instance, instead
of deciding "yes" or "no," it outputs a value between 0 and 1 (or between −1 and +1) that indicates its confidence in the prediction.
A DIFFERENTIABLE THRESHOLD UNIT (SIGMOID UNIT)
• Sigmoid unit: a unit very much like a perceptron, but based on a smoothed, differentiable threshold function.
• The sigmoid unit first computes a linear combination of its inputs, then applies a threshold to the result; unlike the perceptron, the thresholded output is a continuous function of its input.
• More precisely, the sigmoid unit computes its output o as o = σ(w · x), where σ(y) = 1 / (1 + e^(−y)). A useful property of this function is that its derivative is easily expressed in terms of its output: dσ(y)/dy = σ(y)(1 − σ(y)).
THE BACKPROPAGATION
ALGORITHM
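The update equations on the original slide appeared as images; below is a minimal sketch of stochastic gradient backpropagation for a 2-2-1 sigmoid network, using the standard error terms δ_k = o_k(1 − o_k)(t_k − o_k) for output units and δ_h = o_h(1 − o_h) Σ_k w_kh δ_k for hidden units. The XOR training data, network size, learning rate, and random seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda y: 1.0 / (1.0 + np.exp(-y))

# XOR: the function a single perceptron cannot represent
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])

# 2 inputs -> 2 hidden sigmoid units -> 1 output sigmoid unit (plus biases)
W1, b1 = rng.normal(scale=0.5, size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(scale=0.5, size=2), 0.0
eta = 0.5

for epoch in range(20000):
    for x, t in zip(X, T):
        # Steps 1-2: forward pass, input -> hidden -> output prediction
        h = sigmoid(W1 @ x + b1)
        o = sigmoid(W2 @ h + b2)
        # Step 3: propagate the error backward through the network
        delta_o = o * (1 - o) * (t - o)          # output unit error term
        delta_h = h * (1 - h) * (W2 * delta_o)   # hidden unit error terms
        # Step 4: adjust the weights to reduce the error
        W2 += eta * delta_o * h;  b2 += eta * delta_o
        W1 += eta * np.outer(delta_h, x);  b1 += eta * delta_h

print(np.round(sigmoid(W2 @ sigmoid(X @ W1.T + b1).T + b2), 2))
```

With most random initializations this reaches outputs close to [0, 1, 1, 0]; occasionally training stalls in a local minimum, a point taken up in the remarks below.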
ADDING MOMENTUM
A common variation is to make the weight update on iteration n depend partly on the update of the previous iteration: Δw_ji(n) = η δ_j x_ji + α Δw_ji(n − 1), where 0 ≤ α < 1 is the momentum. Momentum can carry the search across flat regions of the error surface and through small local minima.
LEARNING IN ARBITRARY ACYCLIC NETWORKS
DERIVATION OF THE BACKPROPAGATION
RULE
REMARKS ON THE BACKPROPAGATION
ALGORITHM
• Convergence and Local Minima
• Representational Power of Feedforward Networks
• Hypothesis Space Search and Inductive Bias
• Hidden Layer Representations
• Generalization, Overfitting, and Stopping Criterion
CONVERGENCE AND LOCAL MINIMA
Because the error surface of a multilayer network can contain many distinct local minima, gradient descent is guaranteed to converge only toward some local minimum of the error E, not necessarily the global minimum; in practice, backpropagation nonetheless often produces good results.
GENERALIZATION:
Generalization refers to the model’s ability to perform well on new, unseen data
after being trained on a specific dataset. The goal of training a neural network is
to create a model that accurately predicts outcomes for examples it has never
encountered. If a model has good generalization, it will make reliable predictions
not only on the training data but also on data from the real world.
OVERFITTING:
In backpropagation, overfitting happens during later stages of training when the
model starts to adapt to specific details in the training data that do not generalize
well to new data. The error on the training set continues to decrease, but the error
on a separate validation set starts to rise, signaling overfitting.
STOPPING CRITERION
• The stopping criterion is the rule used to determine when to stop the training
process to avoid overfitting. A common approach in backpropagation is to use
early stopping, where training is halted when the model’s performance on a
validation set starts to degrade.
• The network is trained until the error on the validation set reaches its
minimum, after which training is stopped to avoid fitting noise or unnecessary
details in the training data.
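A sketch of early stopping as described above; `train_one_epoch`, `validation_error`, and the `copy_weights`/`restore_weights` methods are hypothetical names standing in for whatever the surrounding training code provides.

```python
def train_with_early_stopping(model, train_one_epoch, validation_error,
                              patience=10, max_epochs=1000):
    """Stop when validation error has not improved for `patience` epochs,
    and return the weights that achieved the minimum validation error."""
    best_error, best_weights, epochs_since_best = float("inf"), None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)              # one pass of backpropagation
        err = validation_error(model)       # error on the held-out set
        if err < best_error:
            best_error, best_weights = err, model.copy_weights()
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:   # validation error is rising:
                break                           # stop to avoid fitting noise
    model.restore_weights(best_weights)     # keep the minimum-error weights
    return model
```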
GENETIC ALGORITHM IN MACHINE
LEARNING
• A genetic algorithm is an adaptive heuristic search algorithm inspired by Darwin's theory of evolution in nature.
• Proposed by John Holland.
• It is used to solve optimization problems in machine learning.
• It is one of the important algorithms as it helps solve complex problems that
would take a long time to solve.
GENETIC ALGORITHM
• Genetic algorithms simulate the process of natural selection: those species that can adapt to changes in their environment survive, reproduce, and pass their traits on to the next generation.
• In simple words, they simulate “survival of the fittest” among individuals of
consecutive generations to solve a problem.
• Each generation consists of a population of individuals and each individual represents
a point in search space and possible solution.
• Each individual is represented as a string of character/integer/float/bits. This string is
analogous to the Chromosome.
GENETIC ALGORITHM
Genetic algorithms are based on an analogy with the genetic structure and behavior of chromosomes of the population. In this analogy, each chromosome (the encoded string) decodes to a phenotype, the candidate solution it represents; for example, four chromosomes might decode to the phenotypes 13, 24, 8, and 19.
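A minimal sketch of a GA along these lines, with individuals encoded as 5-bit strings whose phenotypes start at 13, 24, 8, and 19 as in the example above; the toy fitness function (maximize the decoded value), the selection scheme, and the mutation rate are illustrative assumptions.

```python
import random
random.seed(0)

BITS = 5                                  # each chromosome is a 5-bit string

def decode(chrom):                        # phenotype = decoded integer value
    return int(chrom, 2)

def fitness(chrom):                       # toy objective: maximize the phenotype
    return decode(chrom)

def select(pop):                          # fitness-proportionate selection
    # +1 keeps the total weight positive even if every fitness is 0
    return random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=1)[0]

def crossover(p1, p2):                    # single-point crossover
    point = random.randrange(1, BITS)
    return p1[:point] + p2[point:]

def mutate(chrom, rate=0.05):             # flip each bit with small probability
    return "".join("10"[int(b)] if random.random() < rate else b for b in chrom)

# Initial population whose phenotypes are 13, 24, 8, 19, as in the example above
pop = [format(v, "05b") for v in (13, 24, 8, 19)]

for generation in range(20):              # "survival of the fittest" each round
    pop = [mutate(crossover(select(pop), select(pop))) for _ in pop]

print(pop, [decode(c) for c in pop])      # population drifts toward larger values
```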
GENETIC PROGRAMMING
• Genetic programming (GP) is a form of evolutionary computation in which the individuals in the evolving
population are computer programs rather than bit strings.
• Koza (1992) describes the basic genetic programming approach and presents a broad range of simple
programs that can be successfully learned by GP.
Crossover operation applied to two parent program trees (top). Crossover points (nodes
shown in bold at top) are chosen at random. The subtrees rooted at these crossover
points are then exchanged to create children trees (bottom).
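A sketch of the subtree crossover pictured in this figure, with program trees represented as nested Python lists (Koza's system manipulated Lisp expressions; the toy arithmetic function set here is an assumption):

```python
import random, copy
random.seed(3)

# A program tree is a list [op, child, child, ...]; leaves are plain strings.
def nodes(tree, path=()):
    """Enumerate (path, subtree) pairs for every node in the tree."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_subtree(tree, path, new):
    if not path:                          # swapping at the root replaces the tree
        return copy.deepcopy(new)
    get(tree, path[:-1])[path[-1]] = copy.deepcopy(new)
    return tree

def crossover(p1, p2):
    """Pick a random crossover point in each parent, then swap the subtrees."""
    c1, c2 = copy.deepcopy(p1), copy.deepcopy(p2)
    path1, _ = random.choice(list(nodes(c1)))
    path2, _ = random.choice(list(nodes(c2)))
    sub1, sub2 = get(c1, path1), get(c2, path2)
    return set_subtree(c1, path1, sub2), set_subtree(c2, path2, sub1)

parent1 = ["+", ["*", "x", "y"], "1"]     # (x * y) + 1
parent2 = ["-", "x", ["+", "y", "2"]]     # x - (y + 2)
child1, child2 = crossover(parent1, parent2)
print(child1, child2, sep="\n")
```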
EXAMPLE
A block-stacking problem. The task for GP is to discover a program that can transform an arbitrary
initial configuration of blocks into a stack that spells the word "universal." A set of 166 such initial
configurations was provided to evaluate fitness of candidate programs (after Koza 1992).
In one such configuration, the nine blocks, which can be arranged to spell "universal," are distributed between the table (read from left to right) and the stack (read from bottom to top); the blocks carry the letters v, r, u, s, l, e, a, n, and i.
Representation (How GP sees the problem):
• CS (Current Stack): This tells the system what block is currently at the top of the stack. If there's no
stack, it returns F (False).
• TB (Top Correct Block): This refers to the topmost block that is in the correct order on the stack. For
example, if you've correctly stacked the blocks up to "U", this will return "U".
• NN (Next Necessary Block): This refers to the next block that needs to be stacked to correctly spell the
word. For example, after "U", "N" is the next necessary block.
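A sketch of how these three terminals could be computed, with the stack held as a bottom-to-top list of letters; the Python encoding and helper logic are illustrative (Koza's system expressed them in Lisp):

```python
GOAL = "universal"   # the word the stack must spell, bottom to top

def CS(stack):
    """Current Stack: the block on top of the stack, or False if no stack."""
    return stack[-1] if stack else False

def correct_prefix_length(stack):
    """How many blocks, counted from the bottom, already match GOAL."""
    n = 0
    for block, goal in zip(stack, GOAL):
        if block != goal:
            break
        n += 1
    return n

def TB(stack):
    """Top Correct Block: topmost block that is in the correct order."""
    n = correct_prefix_length(stack)
    return stack[n - 1] if n > 0 else False

def NN(stack):
    """Next Necessary Block: the next block needed to go on spelling GOAL."""
    n = correct_prefix_length(stack)
    return GOAL[n] if n < len(GOAL) else False

stack = ["u", "n", "i", "v"]              # correctly stacked so far
print(CS(stack), TB(stack), NN(stack))    # -> v v e
```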
Applications of GP:
While the earlier block-stacking example is simple, Koza et al. (1996) have applied GP to more
complex problems, such as:
• Designing electronic filter circuits
• Classifying segments of protein molecules
Electronic Filter Circuit Design:
• A challenging problem where GP evolves programs to transform a seed circuit into a final design.
• Primitive functions in GP edit circuits by adding/deleting components and wiring.
• Circuit fitness is calculated using the SPICE circuit simulator by comparing the designed circuit's output with the desired output across 101 input frequencies.
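As a sketch of this fitness computation, the candidate circuit's simulated response is compared with the desired response at each of the 101 probe frequencies; `simulate_circuit`, the frequency range, and the ideal low-pass target are all hypothetical stand-ins for the SPICE-based evaluation:

```python
import numpy as np

def desired_response(freqs):
    """Hypothetical target: pass below 1 kHz, block above (ideal low-pass)."""
    return np.where(freqs < 1_000.0, 1.0, 0.0)

def circuit_fitness(circuit, simulate_circuit):
    """Smaller is better: total deviation from the desired output across the
    101 input frequencies used to evaluate each candidate circuit.
    `simulate_circuit(circuit, freqs)` stands in for the SPICE simulation."""
    freqs = np.logspace(1, 5, 101)             # 101 frequencies, 10 Hz - 100 kHz
    actual = simulate_circuit(circuit, freqs)  # simulated output per frequency
    return float(np.sum(np.abs(actual - desired_response(freqs))))
```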