Soft Computing: Unit 2

The Perceptron Learning Algorithm is a supervised learning method used to classify data into two categories by finding a linear boundary. It adjusts weights during training to improve accuracy and is effective for linearly separable data. The document also discusses Multi-Layered Perceptrons (MLP), Adaline, Madaline, and error backpropagation, highlighting their architectures, algorithms, advantages, and limitations.


PERCEPTRON TRAINING ALGORITHM

Perceptron Learning Algorithm


1. Definition

The Perceptron Learning Algorithm is a supervised learning technique used in Soft Computing to classify
data into two categories. It is the simplest type of artificial neural network, designed to find a linear boundary
(called a hyperplane) that separates two classes of data.

It is based on the perceptron, a computational model that makes predictions using a weighted combination of
input features and adjusts these weights during training to improve its accuracy.

2. Why We Need It

In real-world scenarios, we often need to classify objects or events into distinct categories. Examples include:

 Deciding whether an email is spam or not spam.


 Identifying whether an image contains a cat or a dog.

The Perceptron Learning Algorithm helps us automate this decision-making process by "learning" from
examples and finding the best decision boundary to separate two classes. It is especially useful when the data
is linearly separable (can be separated by a straight line).

3. What It Does

The perceptron takes labeled input data, processes it, and creates a decision boundary. Once trained:

 It classifies new inputs into one of two categories.


 It adjusts its internal weights during training to improve its predictions.

4. How It Does It

The perceptron works by:

1. Taking input data in the form of numerical features.


2. Multiplying each input feature by a weight (a number that represents its importance).
3. Summing up these weighted inputs to produce an output.
4. Comparing the output to a threshold:
o If the output is greater than the threshold, it predicts one class (e.g., +1).
o If it is less, it predicts the other class (e.g., -1).

If the prediction is wrong, the algorithm adjusts the weights to improve its future predictions.
5. Algorithm: Step by Step

Step 1: Initialize Weights and Threshold

 Start with random weights for each feature.


 Set a small positive constant (η) called the learning rate to control weight updates.

Step 2: Input Training Data

 Prepare a set of examples where each example has:


o Input features (x1,x2,…,xn)
o Target output (+1 for one class, -1 for the other).

Step 3: Make Predictions

 For each input vector, calculate:

y = w1·x1 + w2·x2 + ⋯ + wn·xn

Compare y to the threshold:

o If y > 0, predict +1.


o If y ≤ 0, predict -1.

Step 4: Update Weights

 If the prediction is correct, no changes are made.


 If the prediction is wrong, adjust the weights: w(new) = w(old) + η⋅(actual output − predicted output)⋅x

Step 5: Repeat

 Go through all training examples multiple times until all predictions are correct or a maximum number of
iterations is reached.
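A minimal Python sketch of Steps 1 to 5 is given below. The function name, the use of NumPy, and the explicit bias term are illustrative choices rather than part of the algorithm as stated above.

import numpy as np

def train_perceptron(X, t, eta=0.1, max_epochs=100):
    """X: array of shape (n_samples, n_features); t: targets in {+1, -1}."""
    w = np.random.uniform(-0.5, 0.5, X.shape[1])     # Step 1: small random weights
    b = 0.0                                          # bias shifts the threshold
    for _ in range(max_epochs):                      # Step 5: repeat over the data
        mistakes = 0
        for x_i, t_i in zip(X, t):                   # Step 2: each training example
            y = 1 if np.dot(w, x_i) + b > 0 else -1  # Step 3: weighted sum vs. threshold
            if y != t_i:                             # Step 4: update only on a wrong prediction
                w = w + eta * (t_i - y) * x_i
                b = b + eta * (t_i - y)
                mistakes += 1
        if mistakes == 0:                            # stop once every example is classified correctly
            break
    return w, b

The loop stops early once a full pass over the training data produces no mistakes, which corresponds to the stopping condition in Step 5.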

6. Example

Scenario: Classify whether a student passes (+1) or fails (-1) based on their study hours and attendance.

Study Hours (X1) Attendance (X2) Class Label (Target Output)

5 90 +1
1 50 -1

Steps:

1. Initialize the weights (w1 = 0.5, w2 = 0.2) and set the threshold to 0.


2. For the first input (5, 90):
o Calculate output: 0.5*5 + 0.2*90 = 20.5 > 0
o Predicted class: +1 (correct).
3. For the second input (1, 50):
o Calculate output: 0.5*1 + 0.2*50 = 10.5 > 0.
o Predicted class: +1 (wrong; should be -1).
o Adjust weights using the rule from Step 4, where (actual − predicted) = (−1) − (+1) = −2: w1 = 0.5 − 2η⋅x1, w2 = 0.2 − 2η⋅x2.

After repeating for all examples, the perceptron will learn to classify correctly.
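Using the train_perceptron sketch given after Step 5 above, this worked example can be reproduced roughly as follows (exact convergence depends on the random initial weights and the learning rate):

import numpy as np

# Study hours and attendance from the table, with targets +1 (pass) and -1 (fail).
X = np.array([[5.0, 90.0],
              [1.0, 50.0]])
t = np.array([+1, -1])

w, b = train_perceptron(X, t, eta=0.01)          # function defined in the earlier sketch
for x_i in X:
    print(1 if np.dot(w, x_i) + b > 0 else -1)   # expected output: +1 then -1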

7. Advantages

 Simple to implement.
 Works well for linearly separable data.
 Fast convergence for simple problems.

8. Limitations

 Cannot handle non-linearly separable data (e.g., circles vs squares on a graph).


 Only works for binary classification problems.

Building Blocks of a Perceptron (Short Overview)

1. Input Layer: Takes numerical data as input (e.g., features like study hours or attendance).
2. Weights (w): Assign importance to each input; multiplied with input features.
3. Summation Function: Computes the weighted sum of inputs: S = w1*x1 + w2*x2 + ⋯ + wn*xn.
4. Bias (b): A constant added to the sum to adjust the decision boundary.
5. Activation Function: Decides the output class using a step function:
o Output +1: Class 1 (e.g., pass).
o Output −1: Class 0 (e.g., fail).
6. Output: Final prediction (e.g., spam or not spam).
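The six building blocks map onto code roughly as follows; the concrete numbers reuse the study-hours example above and are only illustrative.

def perceptron_output(x, w, b):
    s = sum(wi * xi for wi, xi in zip(w, x))   # summation function: w1*x1 + ... + wn*xn
    s = s + b                                  # bias shifts the decision boundary
    return 1 if s > 0 else -1                  # step activation function: +1 or -1

# Example with the study-hours data: 0.5*5 + 0.2*90 = 20.5 > 0, so the output is +1
print(perceptron_output([5, 90], [0.5, 0.2], b=0.0))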
Multi-Layered Perceptron

1. A Multilayer Perceptron (MLP) is a type of artificial neural network consisting of an input layer,
one or more hidden layers, and an output layer.
2. It processes data by passing it through layers, where each layer applies a weighted sum of inputs
followed by an activation function.
3. MLP uses non-linear activation functions (e.g., sigmoid or ReLU) in hidden and output layers,
enabling it to solve non-linear problems.
4. The hidden layers extract features and patterns from input data, making MLP suitable for complex
tasks like classification and regression.
5. It works on the principle of supervised learning, adjusting weights using algorithms like
backpropagation to minimize prediction errors.

Why Do We Use MLP?

In simple perceptrons, we can only solve basic problems where the data is linearly separable (like dividing
with a straight line).
MLP solves more complex problems because:

 It has hidden layers that can detect patterns.


 It uses non-linear activation functions, which allow it to model curved or non-linear boundaries.

Main Differences from Single-Layer Perceptron

1. MLP has hidden layers for better processing, while single-layer perceptrons don't.
2. MLP uses non-linear activation functions (like sigmoid or ReLU) for complex problems.
3. It can solve non-linear problems, unlike single-layer perceptrons that only solve linearly separable problems.

Why Use MLP?

 To handle complex tasks like image recognition or speech processing.


 Hidden layers allow the network to learn patterns in the data that cannot be captured by simple models.

MLP Diagram (Fig. 2.3)


1. Structure of MLP (Fig. 2.3):
o Input Layer: Receives the input data, such as numbers or images.
Example: I₁, I₂, I₃, … represent input neurons that simply forward data.
o Hidden Layer:
 Contains neurons (nodes) that process the input.
 These neurons calculate a weighted sum and apply non-linear activation functions (like
sigmoid or ReLU).
 Purpose: Enables the network to model complex, non-linear relationships in data.
o Output Layer: Provides the final decision or classification. For example, O₁, O₂, O₃ represent possible
outcomes (like "Cat," "Dog," or "Bird").
2. Equation Explanation:
o The equation O=N3[N2[N1[I]]] means the output (O) is calculated step-by-step:
 N1: Processing in the input layer.
 N2: Processing in the hidden layer.
 N3: Final output after the hidden layers.
3. Key Feature:
o Non-linear Activation Functions: Hidden and output layers use functions like the sigmoid (S-shaped)
curve to model non-linear relationships. Without these, MLP would behave like a simple perceptron.

Block Diagram (Fig. 2.4)

 Simplifies MLP into three blocks:


o Input Layer (N₁): Takes raw data as input.
o Hidden Layer (N₂): Processes input data with weights and activation functions.
o Output Layer (N₃): Outputs the final result or classification

Summary:

1. The input layer distributes raw data.


2. The hidden layer extracts features using weights and activation functions.
3. The output layer combines hidden layer outputs to make predictions.
4. MLP uses formulas like O = N3[N2[N1(I)]] to show step-by-step transformations through layers.
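A minimal sketch of the composition O = N3[N2[N1(I)]] is shown below; the layer sizes, the random weights, and the use of the sigmoid function are illustrative assumptions, not values from the figures.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    return sigmoid(W @ x + b)        # weighted sum of inputs followed by activation

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # N1: processes the 3 inputs I1, I2, I3
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)   # N2: hidden layer
W3, b3 = rng.normal(size=(3, 4)), np.zeros(3)   # N3: output layer giving O1, O2, O3

I = np.array([0.5, -1.2, 3.0])                  # input vector
O = layer(layer(layer(I, W1, b1), W2, b2), W3, b3)
print(O)                                        # three output values between 0 and 1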
Effect/Problem of a Linear Activation Function with MLP
If every layer of an MLP uses only linear activation functions, the composition of the layers collapses into a single linear mapping, so the network behaves like a single-layer perceptron and can only handle linearly separable problems. This is why the hidden and output layers use non-linear activation functions such as sigmoid or ReLU.
ADALINE AND MADALINE NEURAL NETWORK
1. Adaline (Adaptive Linear Neuron):
 A network with a single linear unit is called Adaline (Adaptive Linear Neuron). A unit
with a linear activation function is called a linear unit. In Adaline, there is only one
output unit and the output values are bipolar (+1, -1). The weights between the input units
and the output unit are adjustable. It uses the delta rule, i.e.

w(new) = w(old) + α⋅(t − y_in)⋅x,

where w, y_in, and t are the weight, predicted output (net input), and true value respectively.


 The learning rule aims to minimize the mean square error between the activation
and the target values. Adaline consists of trainable weights; it compares the calculated
output with the target value, and based on the error the training algorithm is applied.

Workflow:

(Fig.: Adaline workflow)

First, calculate the net input to the Adaline network, then apply the activation function
to obtain the output, and compare it with the target output. If both are equal, return the
output; otherwise send the error back to the network and update the weights according to the
error, which is calculated by the delta learning rule, i.e. w(new) = w(old) + α⋅(t − y_in)⋅x,
where w, y_in, and t are the weight, predicted output, and true value respectively.
Architecture:

(Fig.: Adaline architecture)

In Adaline, every input neuron is directly connected to the output neuron through a
weighted connection. A bias b with a fixed activation of 1 is also present.
Algorithm:
Step 1: Initialize the weights to small random values (not zero). Set the learning
rate α.
Step 2: While the stopping condition is False do steps 3 to 7.
Step 3: for each training set perform steps 4 to 6.
Step 4: Set activation of input unit xi = si for (i=1 to n).
Step 5: Compute the net input to the output unit:
y_in = b + Σ (i = 1 to n) x_i⋅w_i
Here, b is the bias and n is the total number of input neurons.


Step 6: Update the weights and bias for i = 1 to n:
w_i(new) = w_i(old) + α⋅(t − y_in)⋅x_i
b(new) = b(old) + α⋅(t − y_in)
and calculate the error (t − y_in)². When the predicted output and the true value are the
same, the weights will not change.
Step 7: Test the stopping condition. The stopping condition may be that the
weights change only at a very low rate or not at all.
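A minimal Python sketch of the Adaline training steps above is given below; the NumPy usage, the specific stopping test, and the default parameter values are illustrative assumptions.

import numpy as np

def train_adaline(X, t, alpha=0.01, max_epochs=100, tol=1e-4):
    """X: array of shape (n_samples, n_features); t: bipolar targets (+1, -1)."""
    w = np.random.uniform(-0.1, 0.1, X.shape[1])   # Step 1: small random weights
    b = np.random.uniform(-0.1, 0.1)
    for _ in range(max_epochs):                    # Step 2: loop until the stopping condition
        largest_change = 0.0
        for x_i, t_i in zip(X, t):                 # Steps 3-4: each training pair
            y_in = b + np.dot(w, x_i)              # Step 5: net input to the output unit
            delta = alpha * (t_i - y_in)           # delta rule
            w = w + delta * x_i                    # Step 6: update weights and bias
            b = b + delta
            largest_change = max(largest_change, abs(delta))
        if largest_change < tol:                   # Step 7: weights change at a very low rate
            break
    return w, b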

2. Madaline (Multiple Adaptive Linear Neuron) :

 The Madaline (supervised learning) model consists of many Adalines in parallel
with a single output unit. The Adaline layer is present between the input layer
and the Madaline (output) layer, hence the Adaline layer is a hidden layer. The weights
between the input layer and the hidden layer are adjustable, while the weights
between the hidden layer and the output layer are fixed.

 It may use the majority vote rule, in which case the output is either true or
false. Adaline and Madaline layer neurons have a bias of ‘1’ connected to them. The
use of multiple Adalines helps counter the problem of non-linear separability.

Architecture:

(Fig.: Madaline architecture)

There are three layers in Madaline. The first (input) layer contains all the input
neurons; the second (hidden) layer consists of the Adaline units, and the weights
between the input and hidden layers are adjustable; the third layer is the output
layer, and the weights between the hidden and output layers are fixed and not
adjustable.
Algorithm:
Step 1: Initialize the weights and set the learning rate α. Set
v1 = v2 = 0.5 and b = 0.5;
the other weights may be small random values.
Step 2: While the stopping condition is False do steps 3 to 9.
Step 3: for each training set perform steps 4 to 8.
Step 4: Set activation of input unit xi = si for (i=1 to n).
Step 5: compute net input of Adaline unit
zin1 = b1 + x1w11 + x2w21
zin2 = b2 + x1w12 + x2w22
Step 6: Compute the output of each Adaline (hidden) unit using the activation function given below:
f(z) = 1 if z ≥ 0, else −1
z1 = f(zin1)
z2 = f(zin2)
Step 7: Calculate the net input to the output unit:
yin = b3 + z1v1 + z2v2
Apply the activation function to get the output of the net:
y = f(yin)
Step 8: Find the error and update the weights.
If t = y, no update is made.
If t ≠ y and t = +1, update the weights on the z_j unit whose net input is closest to 0:
wij(new) = wij(old) + α(t − zinj)xi
bj(new) = bj(old) + α(t − zinj)
If t ≠ y and t = −1, update the weights (with the same rule) on all units zk that have a positive net input.
Step 9: Test the stopping condition, e.g. the weights no longer change or the maximum number of epochs is reached.
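The forward pass and error-correction update of Steps 4 to 8 can be sketched as below for a two-input, two-Adaline network; this is one reading of the rule above, and the function and variable names are illustrative.

import numpy as np

def f(z):                                    # bipolar step activation from Step 6
    return 1 if z >= 0 else -1

def madaline_step(x, t, W, b_hidden, v, b_out, alpha=0.5):
    z_in = b_hidden + W.T @ x                # Step 5: net inputs of the Adaline units
    z = np.array([f(zi) for zi in z_in])     # Step 6: hidden outputs z1, z2
    y = f(b_out + v @ z)                     # Step 7: output of the net
    if t != y:                               # Step 8: update only when there is an error
        if t == 1:                           # push the unit whose net input is closest to 0
            j = int(np.argmin(np.abs(z_in)))
            W[:, j] += alpha * (t - z_in[j]) * x
            b_hidden[j] += alpha * (t - z_in[j])
        else:                                # t == -1: correct all units with positive net input
            for k in np.where(z_in > 0)[0]:
                W[:, k] += alpha * (t - z_in[k]) * x
                b_hidden[k] += alpha * (t - z_in[k])
    return W, b_hidden

Here W[i, j] corresponds to wij, b_hidden holds b1 and b2, v holds the fixed weights v1 and v2, and alpha is the learning rate from Step 1.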
ERROR BACKPROPAGATION
Definition of Error Backpropagation in Soft Computing

Error backpropagation, or simply backpropagation, is a supervised learning algorithm used in artificial neural
networks to minimize the error between the predicted output and the actual target. It does this by iteratively
adjusting the weights of the network using the gradient descent method. The algorithm "propagates" the error
backward from the output layer to the input layer through hidden layers, which allows the network to learn
by updating weights.

Working

 Phase 1: Feedforward - Compute outputs from the input layer through hidden layers to the output layer
using weights and activation functions.
 Phase 2: Error Calculation - Calculate the difference between the target and actual output (error).
 Phase 3: Backpropagation - Distribute the error backward, layer by layer, and compute gradients.
 Phase 4: Weight Update - Update weights and biases using gradient descent to minimize error.
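A minimal sketch of the four phases for a small one-hidden-layer network with sigmoid units is shown below; the layer sizes, sample input, target, and learning rate are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)   # input (2) -> hidden (3)
W2, b2 = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)   # hidden (3) -> output (1)
x, target = np.array([0.5, 0.8]), np.array([1.0])
eta = 0.1

for _ in range(1000):
    # Phase 1: feedforward through hidden and output layers
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # Phase 2: error calculation (target minus actual output)
    error = target - y
    # Phase 3: propagate the error backward and compute gradients
    delta_out = error * y * (1 - y)                  # y(1 - y) is the sigmoid derivative
    delta_hidden = (W2.T @ delta_out) * h * (1 - h)
    # Phase 4: gradient-descent update of weights and biases
    W2 += eta * np.outer(delta_out, h)
    b2 += eta * delta_out
    W1 += eta * np.outer(delta_hidden, x)
    b1 += eta * delta_hidden

print(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2))   # close to the target of 1.0 after training

The y(1 − y) and h(1 − h) factors are the derivatives of the sigmoid, which is why backpropagation requires a smooth, differentiable activation function.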

Characteristics of Backpropagation

1. Works with labeled data (supervised learning).


2. Adjusts weights to reduce error.
3. Sends error backward through layers.
4. Needs smooth activation functions (like sigmoid).
5. Improves accuracy step by step.

Advantages

1. Efficiently trains multi-layer neural networks.


2. Handles non-linear relationships in data.
3. Works well with large datasets.
4. Can use different activation functions.
5. Adaptable to various applications (e.g., vision, NLP).

Limitations

1. Computationally expensive for deep networks.


2. Can get stuck in local minima.
3. Requires a differentiable activation function.
4. Sensitive to the learning rate selection.
5. May overfit if the network is too complex.
BACKPROPAGATION ALGORITHM

YOUTUBE LINK :- https://youtu.be/tQTThWbx1Cg?si=7Kh-F3gLwqIJ8rPN


RBNF
What is RBNF?

A Radial Basis Function Network (abbreviated RBNF in these notes) is a type of artificial neural network that uses a Radial
Basis Function (RBF) as its activation function in the hidden layer.

 It is commonly used for tasks like classification, regression, and function approximation.
 The hidden layer transforms the input data into a higher-dimensional space, making it easier for the network
to identify patterns.

Why do we need RBNF?

RBNF is needed for non-linearly separable data, where data cannot be separated using a straight line.

 Traditional neural networks may struggle with such data, but RBNF transforms the data into a higher-
dimensional space, where it becomes linearly separable.

Characteristics

1. Three Layers: Input layer, Single Hidden Layer, and Output layer.
2. Localized Activation: Each RBF neuron activates based on the distance between the input and its center.
3. Fast Learning: Due to its simple structure, the training process is efficient.
4. Non-Linear Mapping: Transforms input data into a new space for easier classification.
5. Radial Basis Function: Hidden neurons use functions like Gaussian to calculate their outputs.

Components of RBNF

1. Input Layer: Passes raw data directly to the hidden layer without any transformation.
2. Hidden Layer:
o Neurons calculate how close the input is to their center using an RBF function (e.g., Gaussian
function).
o Only neurons near the input get activated.
3. Centers: Reference points for hidden neurons, representing different regions of the input space.
4. Radius: Defines how far each center's influence extends in the input space.
5. Output Layer: Combines results from the hidden layer and generates the final output.
6. Weights: Connect the hidden neurons to the output layer, determining their influence on the final output.
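A minimal sketch of these components is given below, using Gaussian hidden units and a linear output layer fitted by least squares; the centers, radius, and XOR-style toy data are illustrative choices.

import numpy as np

def gaussian_rbf(x, center, radius):
    # hidden-unit activation depends only on the distance from the input to the center
    return np.exp(-np.linalg.norm(x - center) ** 2 / (2 * radius ** 2))

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])   # XOR-style inputs
t = np.array([0.0, 1.0, 1.0, 0.0])                               # not linearly separable

centers = X.copy()      # one center per training point (a simple common choice)
radius = 0.7

# Hidden layer: map each input into the space of RBF activations
H = np.array([[gaussian_rbf(x, c, radius) for c in centers] for x in X])

# Output layer: linear weights fitted by least squares on the transformed data
w, *_ = np.linalg.lstsq(H, t, rcond=None)

print(np.round(H @ w, 2))   # reproduces the targets 0, 1, 1, 0

After the hidden layer transforms the inputs, a purely linear output layer is enough to separate the classes, which is the point made in the "Why do we need RBNF?" section.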
Image Compression Using Artificial Neural Networks (ANN)
Need for Image Compression

 Storing images requires a lot of memory. For example, one image might need 65,536 bytes of memory.
 A large amount of digital image data is created every year (e.g., in hospitals), which needs to be stored
efficiently.
 Compression is used to reduce the memory needed for images, but this can result in some loss of quality
(distortion).
 To recover from this loss, restoration techniques are used.

Types of Compression

1. Lossless Compression:
o No data is lost; the original image can be perfectly restored.
o Compression ratio is limited (not much space saved).
2. Lossy Compression:
o Reduces file size more effectively but some image quality is lost.
o Useful for fast browsing of large image databases.
o Combining both lossless and lossy methods can be beneficial when only part of the image needs to
be preserved with high quality.

How ANN Works in Image Compression

Artificial Neural Networks (ANNs) are used to process and compress images efficiently. They exploit
relationships between image pixels to reduce image size.

1. Network Architecture:

 A 5-layer feedforward neural network is used.


 Hidden Layers: Three key layers are used:

 Combiner Layer: Focuses on correlations between pixels.


 Compressor Layer: Reduces the size of the image.
 Decombiner Layer: Reconstructs the image after compression.

 Connections: These layers are fully connected to each other, from the input to the combiner layer
and from the decombiner to the output layer.

2. Training:
o ANN is trained with large image datasets.
o The network learns patterns in pixel data (inner layer) and block data (outer layers).
o Training involves dividing the data into smaller blocks and training the ANN to minimize errors (using
the backpropagation algorithm).
3. Compression and Reconstruction:
o Compression: Data is reduced in the hidden layer (compressor layer).
o Reconstruction: Compressed data is decoded in the output layer to approximate the original image.
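A minimal sketch of the 5-layer bottleneck idea is shown below; the layer sizes, random weights, and 8x8 block size are placeholders, since in practice the network is trained with backpropagation on many image blocks.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
sizes = [64, 32, 16, 32, 64]   # input, combiner, compressor, decombiner, output layers
weights = [rng.normal(scale=0.1, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

block = rng.random(64)         # one 8x8 image block flattened to 64 pixel values

h = block
code = None
for i, W in enumerate(weights):
    h = sigmoid(W @ h)         # weighted sum followed by activation at each layer
    if i == 1:                 # output of the compressor layer
        code = h               # the 16 values that would be stored or transmitted

print(len(code), len(h))       # 16 compressed values, 64 reconstructed values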
Key Features of ANN-Based Compression

 Fully Connected Network: All layers are connected for efficient data flow.
 Hidden Layers:
o Exploit pixel and block-level correlations to reduce redundancy.
 Two-Stage Process:
o Compression: Reduces data size.
o Reconstruction: Restores data with minimal loss.

Summary

 Image compression is essential for reducing memory requirements for storage and transmission.
 ANN-based compression uses a 5-layer feedforward network to process image data effectively.
 Lossless and lossy compression methods can be combined for balance between quality and size.
 ANN’s ability to learn and exploit data patterns makes it an efficient tool for image compression, with
practical applications in fields like healthcare and digital storage.
Architecture of Neural Networks
SINGLE LAYER FFN & MULTILAYER FFN
