Bipolar Sigmoid in Neural Networks

1. The sigmoid activation function is commonly used in neural networks because the relationship between its value and derivative reduces the computational burden during training.
2. There are two types of sigmoid functions: the logistic sigmoid, which ranges from 0 to 1, and the bipolar sigmoid, which ranges from -1 to 1.
3. A perceptron is a basic unit of an artificial neural network that takes inputs, calculates a linear combination, and outputs 1 if the result is above a threshold and -1 otherwise. It is trained using the perceptron training rule to update weights based on errors.


Sigmoidal function:

It is widely used in backpropagation networks because the relationship between the value of the
function at a point and the value of its derivative at that point reduces the computational
burden during training.

It is divided into two types:

1. Binary sigmoid or logistic sigmoid or Uni-polar sigmoid:

The sigmoid function is defined as

f(x) = 1/(1 + e^(-λx))

where λ is the stiffness parameter. Its derivative can be expressed through the function value itself,

f'(x) = λ f(x)[1 - f(x)]

which is the relationship that reduces the computational burden during training.

The range of the binary sigmoid varies from 0 to 1.
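As a sketch, the binary sigmoid and its value-based derivative can be written in Python (the function names and the default λ = 1 are my own choices, not from the text):

```python
import math

def logistic_sigmoid(x, lam=1.0):
    """Binary (uni-polar) sigmoid: output in (0, 1). lam is the stiffness λ."""
    return 1.0 / (1.0 + math.exp(-lam * x))

def logistic_sigmoid_derivative(x, lam=1.0):
    """Derivative via the function value itself: f'(x) = λ f(x)(1 - f(x)),
    so no extra exponential has to be evaluated during training."""
    f = logistic_sigmoid(x, lam)
    return lam * f * (1.0 - f)

print(logistic_sigmoid(0.0))             # 0.5, the midpoint of the (0, 1) range
print(logistic_sigmoid_derivative(0.0))  # 0.25 for λ = 1
```

Reusing the already-computed f(x) in the derivative is exactly the saving the text refers to.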

2. Bi-polar sigmoid: It is defined as

f(x) = 2/(1 + e^(-λx)) - 1 = (1 - e^(-λx))/(1 + e^(-λx))

with derivative

f'(x) = (λ/2)[1 + f(x)][1 - f(x)]

The range of the bipolar sigmoid varies from -1 to 1.
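A corresponding sketch for the bi-polar sigmoid (again with hypothetical names), confirming the zero-centred midpoint and the derivative identity above:

```python
import math

def bipolar_sigmoid(x, lam=1.0):
    """Bi-polar sigmoid: f(x) = 2/(1 + e^(-λx)) - 1, output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-lam * x)) - 1.0

def bipolar_sigmoid_derivative(x, lam=1.0):
    """f'(x) = (λ/2)[1 + f(x)][1 - f(x)], again reusing the function value."""
    f = bipolar_sigmoid(x, lam)
    return (lam / 2.0) * (1.0 + f) * (1.0 - f)

print(bipolar_sigmoid(0.0))             # 0.0 -- zero-centred midpoint
print(bipolar_sigmoid_derivative(0.0))  # 0.5 for λ = 1
```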

The output of a neuron is obtained by applying the chosen sigmoid to its net input:

a) Binary sigmoidal activation function: the output is f(x) = 1/(1 + e^(-λx)) applied to the net input.

b) Bi-polar sigmoidal activation function: the output is f(x) = 2/(1 + e^(-λx)) - 1 applied to the net input.

Perceptron in Artificial Neural Network:

A perceptron is the basic building block of an artificial neural network. It takes a vector of real-valued
inputs, calculates a linear combination of these inputs, and outputs 1 if the result is greater than
some threshold value and -1 otherwise.

Inputs: x1 to xn; the output O(x1, …, xn) computed by the perceptron is

O(x1, x2, …, xn) = 1 if w0 + w1x1 + w2x2 + … + wnxn > 0

                 = -1 otherwise
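A minimal sketch of this decision rule, assuming the convention that w[0] holds the bias weight w0 and x holds x1…xn (the example weights are made up for illustration):

```python
def perceptron_output(x, w):
    """Perceptron output: 1 if w0 + w1*x1 + ... + wn*xn > 0, else -1.
    w[0] is the bias weight w0; x holds the inputs x1..xn."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else -1

# Bias of -1.5 acts as a threshold of 1.5 on the weighted input sum:
print(perceptron_output([1.0, 1.0], [-1.5, 1.0, 1.0]))  # 1  (1 + 1 > 1.5)
print(perceptron_output([1.0, 0.0], [-1.5, 1.0, 1.0]))  # -1 (1 < 1.5)
```

Writing the threshold as a bias weight w0 is what lets the comparison be made against 0.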

Updating the weights is governed by the perceptron training rule:

If the output equals the target output, do not update the weights.

If the output does not equal the target output, update the weights.

Perceptron Training Rules:

1. One way to learn an acceptable weight vector is to begin with random weights, then
   iteratively apply the perceptron to each training example, modifying the weights
   whenever it misclassifies an example.
2. This process is repeated, iterating through the training examples as many times as needed,
   until the perceptron classifies all training examples correctly.
3. Weights are modified at each step according to the perceptron training rule, which revises
the weights ‘wi’ associated with input ‘xi’ according to the rule

wi <- wi + ∆wi

where ∆wi = η (t - o) xi

η -> learning rate (e.g. 0.1)

t = target output

o = actual output

xi = input
Algorithm:

perceptron_training_rule(X, η)

    initialise w with small random values

    repeat

        for each training instance (x, tx) ∈ X

            compute the actual output ox = sign(∑ wi xi)

            if tx ≠ ox

                for each wi

                    wi <- wi + ∆wi, where ∆wi = η (tx - ox) xi

                end for

            end if

        end for

    until all training instances in X are correctly classified

    return w
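The pseudocode above can be sketched as runnable Python. The function name follows the text, while the initial weight range, the epoch cap, and the seed are my own assumptions (a cap is added because the loop only terminates on its own for linearly separable data):

```python
import random

def perceptron_training_rule(X, eta=0.1, max_epochs=1000, seed=0):
    """Train perceptron weights on X, a list of (x, t) pairs with t in {-1, 1}.
    Repeats passes over the data, applying wi <- wi + eta*(t - o)*xi on each
    misclassification, until an epoch with no errors (or max_epochs)."""
    rng = random.Random(seed)
    n_inputs = len(X[0][0])
    w = [rng.uniform(-0.05, 0.05) for _ in range(n_inputs + 1)]  # w[0] is bias
    for _ in range(max_epochs):
        errors = 0
        for x, t in X:
            s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            o = 1 if s > 0 else -1
            if o != t:  # misclassified: update bias and every weight
                errors += 1
                w[0] += eta * (t - o)
                for i, xi in enumerate(x):
                    w[i + 1] += eta * (t - o) * xi
        if errors == 0:  # a full clean pass: all instances classified
            break
    return w

# AND gate with -1/1 targets (linearly separable, so training converges)
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w = perceptron_training_rule(data, eta=0.5)
print([1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else -1
       for x, _ in data])  # [-1, -1, -1, 1]
```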

Implementation of a logical AND gate using the perceptron training rule:

Truth Table:

x1 x2 x1 ^ x2
0 0 0
0 1 0
1 0 0
1 1 1

w1 = 1.2, w2 = 0.6, threshold VT = 1

Learning rate η = 0.5

Decision rule: if ∑ wi xi ≥ 1 the output is 1; if ∑ wi xi < 1 the output is 0.

First pass through the truth table, starting with x1 = 0, x2 = 0, tx = 0:

1. wixi = w1x1 + w2x2 = (1.2 x 0) + (0.6 x 0) = 0 < 1 => ox = 0

   tx = 0 => tx = ox (no update)

2. For x1 = 0, x2 = 1, tx = 0: wixi = (1.2 x 0) + (0.6 x 1) = 0.6 < 1 => ox = 0

   tx = 0 => tx = ox (no update)

3. For x1 = 1, x2 = 0, tx = 0: wixi = (1.2 x 1) + (0.6 x 0) = 1.2 ≥ 1 => ox = 1

   tx = 0 => tx ≠ ox


so the weights must be updated using wi <- wi + η (tx - ox) xi:

w1 = 1.2 + 0.5(0 - 1)(1) = 0.7

w2 = 0.6 + 0.5(0 - 1)(0) = 0.6

New weights: w1 = 0.7, w2 = 0.6

i) x1 = 0, x2 = 0, tx = 0
   wixi = w1x1 + w2x2 = 0.7 x 0 + 0.6 x 0 = 0 < 1 => ox = 0 => tx = ox
ii) x1 = 0, x2 = 1, tx = 0
   wixi = 0.7 x 0 + 0.6 x 1 = 0.6 < 1 => ox = 0 => tx = ox
iii) x1 = 1, x2 = 0, tx = 0
   wixi = 0.7 x 1 + 0.6 x 0 = 0.7 < 1 => ox = 0 => tx = ox
iv) x1 = 1, x2 = 1, tx = 1
   wixi = 0.7 x 1 + 0.6 x 1 = 1.3 ≥ 1 => ox = 1 => tx = ox

All four patterns are now classified correctly, so training stops with w1 = 0.7 and w2 = 0.6.
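This worked example can be checked in code. The sketch below uses the 0/1 threshold convention of this section (output 1 iff ∑ wi xi ≥ VT, no bias term) with the text's values w1 = 1.2, w2 = 0.6, η = 0.5, VT = 1; the helper name is my own:

```python
def threshold_output(x, w, vt=1.0):
    """0/1 perceptron used in the worked example: output 1 iff sum(wi*xi) >= VT."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= vt else 0

# Initial weights and learning rate from the text
w, eta = [1.2, 0.6], 0.5
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]  # AND truth table

for x, t in data:
    o = threshold_output(x, w)
    if o != t:  # misclassified: apply wi <- wi + eta*(t - o)*xi
        w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]

print(w)  # [0.7, 0.6] after the single update triggered by input (1, 0)
print([threshold_output(x, w) for x, _ in data])  # [0, 0, 0, 1]
```

Only the pattern (1, 0) is misclassified on this pass, reproducing the single weight update computed above.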

Common questions


In the implementation of the AND gate using a perceptron, a threshold value of 1 dictates that the output is 1 only when the weighted sum of inputs, calculated as w1x1 + w2x2, equals or exceeds 1. This ensures that, with suitable weights, both inputs need to be high to produce a high output, aligning with the AND operation's requirement. This threshold effectively filters cases where inputs do not cumulatively provide enough evidence to meet the condition for a positive output, hence maintaining strict adherence to logical AND semantics.

Implementing a logical AND gate using a perceptron illustrates the perceptron's capacity as a linear classifier by setting specific input weights and a threshold that correspond to the linear decision boundary between 0 and 1 outputs. For instance, with initial weights w1 = 1.2, w2 = 0.6, and a threshold VT = 1, the pattern (1, 0) also reaches the threshold, which is why training updates the weights to w1 = 0.7, w2 = 0.6; with the updated weights, only the input pattern (1, 1) meets the condition. This setup exemplifies the perceptron's function in distinguishing between classes of inputs based on a linear equation, foundational to more complex neural network architectures.

The threshold value in perceptron-based logical gates determines the demarcation line between binary output states. In logical gate implementation, like the AND gate, the perceptron outputs a high value only when the weighted sum of the inputs meets or exceeds this threshold. This mechanism is crucial for representing logical conditions, as it simulates a decision boundary that reflects the logic gate's requirements. For instance, with a threshold set to 1, only certain combinations of inputs will satisfy the condition to output a 1, replicating the precise behavior of the logical AND gate.

The binary sigmoid function, with a range of 0 to 1, is beneficial in applications like binary classification due to its monotonic nature, which simplifies the computation of errors. However, its outputs are not zero-centered, which can slow down the learning process in neural networks. The bi-polar sigmoid outputs values between -1 and 1, making it zero-centered, which can accelerate learning by allowing for uniform signal processing through the network layers. However, it may introduce complexity due to negative outputs, potentially complicating error handling and optimization in certain network architectures.

The perceptron serves as a fundamental building block in artificial neural networks by taking a vector of real-valued inputs, calculating their linear combination, and producing an output of 1 or -1 based on whether the result surpasses a threshold. It is trained using the perceptron training rule, which involves beginning with random weights and updating them iteratively across training examples until all are classified correctly. Weights are updated using the formula wi <- wi + ∆wi, where ∆wi = η(t - o)xi, with 'η' as the learning rate, 't' the target output, 'o' the actual output, and 'xi' the input.

The stiffness parameter (λ) in the sigmoid function controls the steepness of the function's curve. A higher λ value results in a steeper gradient, meaning that the function transitions more abruptly from low to high values around the central point (typically x=0). This results in more pronounced activation for small changes in input, which can lead to faster convergence during training but might introduce numerical instability. Conversely, a lower λ leads to a gentler slope, providing smoother and more gradual activations that can help stabilize learning but may slow down convergence. Therefore, tuning λ is crucial for optimizing the balance between learning speed and network stability.
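A small sketch illustrating this: for the binary sigmoid the midpoint slope works out to λ/4, so raising λ steepens the transition (function name and the sample λ values are my own):

```python
import math

def logistic(x, lam):
    """Binary sigmoid with stiffness parameter λ."""
    return 1.0 / (1.0 + math.exp(-lam * x))

# Slope at the midpoint x = 0 is λ/4 (from f'(x) = λ f(x)(1 - f(x))):
# a larger λ therefore gives a steeper, more abrupt transition.
for lam in (0.5, 1.0, 5.0):
    f0 = logistic(0.0, lam)
    slope = lam * f0 * (1.0 - f0)
    print(lam, slope)  # slopes 0.125, 0.25, 1.25 respectively
```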

The learning rate 'η' significantly influences how quickly and effectively weights are modified during perceptron training. It determines the size of the step that adjustments take towards minimizing classification errors. A small learning rate may prolong training because changes in weights are minute, making convergence slower. Conversely, a large learning rate can lead to overshooting the optimal weight configurations, causing oscillation or divergence in training. Hence, choosing an appropriate learning rate is critical to balancing the speed of convergence with stability, ensuring efficient and accurate training of the perceptron.

The sigmoid function is advantageous in training artificial neural networks because it reduces the computational burden due to the relationship between the function's value and its derivative at a specific point. It allows for efficient backpropagation, a key process in training networks. The binary (or logistic or uni-polar) sigmoid outputs a range from 0 to 1, defined as f(x) = 1/(1+e^(-λx)), aiding in binary classification tasks. The bi-polar sigmoid, ranging from -1 to 1, is defined as f(x) = 2/(1+e^(-λx)) - 1 and is useful for networks requiring outputs that can handle negative values.

Using random initial weights in the perceptron training algorithm prevents the model from converging to a biased solution. Randomized starting values ensure that the learning process can explore a diverse solution space, aiding in finding a more global optimum rather than getting trapped in local minima that deterministic starting positions might encourage. This approach enhances the adaptability and generalization capacity of the perceptron by promoting comprehensive coverage of potential outcomes, subsequently improving its classification performance.

In perceptron learning, when the actual output (o) doesn't match the target output (t), weight updates rectify the discrepancy by adjusting the contribution of each input. The update ∆wi = η(t - o)xi modifies the weight such that the perceptron will be more or less sensitive to input xi, depending on whether the adjustment intends to increase or decrease the output signal. This iterative correction aligns the perceptron's decision boundary closer to the ideal one over successive training examples, progressively improving its accuracy in classifying input patterns.
