SSG303: MATHEMATICAL MODELING FOR AI SYSTEMS
INTRODUCTION TO NEURAL NETWORKS 1
A practical neural network
A practical neural network is a collection of interconnected neurons that incrementally learns from its environment (data) to capture the essential linear and nonlinear trends in complex data, so that it provides reliable predictions for new situations, even those containing noisy and partial information.
Weight, Hidden and Output Neuron
• Weights are the links connecting inputs to neurons and neurons to outputs.
• These links provide a flexible structure for learning that allows the network to follow the patterns in the data.
• The input layer transmits the input data to the hidden neurons through the input-hidden layer weights: inputs are weighted by the corresponding weights before they are received by the hidden neurons.
• The neurons in the hidden layer accumulate and process the weighted inputs, then send their outputs to the output neurons via the hidden-output layer weights, where the hidden-neuron outputs are weighted by the corresponding weights and processed to produce the final output.
• This structure is trained by repeated exposure to examples (input-output data).
• Learning involves incrementally changing the connection strengths (weights) until the network produces the correct output.
• The final weights are the optimized parameters of the network.
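The slides do not specify how the weights are changed; a standard choice, sketched here as an assumption, is a gradient-descent update that nudges each weight against the gradient of the error E:

$$w \leftarrow w - \eta \, \frac{\partial E}{\partial w}$$

where the learning rate η controls the step size of each incremental change.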
Connection weight, bias and activation function
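The figure for this slide is not reproduced here; in the standard formulation it illustrates, each neuron computes a weighted sum of its inputs plus a bias (the pre-activation), then passes it through an activation function:

$$a = \sum_i w_i x_i + b, \qquad h = g(a)$$

where the w_i are the connection weights, b is the bias, and g is the activation function.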
Different activation functions
1. Linear activation function
• It performs no input squashing.
• It is not very interesting.
• It is represented by g(a) = a.
2. Sigmoid activation function
• It squashes the neuron’s pre-activation between 0 and 1.
• It is always positive.
• It is bounded.
• It is strictly increasing.
• It is represented by g(a) = 1 / (1 + exp(−a)).
3. Hyperbolic tangent (‘tanh’) activation function
• It squashes the neuron’s pre-activation between −1 and 1.
• It can be positive or negative.
• It is bounded.
• It is strictly increasing.
• It is represented by g(a) = tanh(a) = (exp(a) − exp(−a)) / (exp(a) + exp(−a)).
4. Rectified linear (ReLU) activation function
• It is bounded below by 0: always non-negative.
• It is not upper bounded.
• It is monotonically increasing (constant at 0 for negative inputs).
• It tends to give neurons with sparse activities.
• It is represented by g(a) = max(0, a).
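A minimal NumPy sketch of the four activations listed above (the function names are mine, not from the slides):

```python
import numpy as np

def linear(a):
    return a                           # no squashing

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))    # squashes into (0, 1)

def tanh(a):
    return np.tanh(a)                  # squashes into (-1, 1)

def relu(a):
    return np.maximum(0.0, a)          # clips negatives to 0 (sparse)

a = np.array([-2.0, 0.0, 3.0])
for g in (linear, sigmoid, tanh, relu):
    print(g.__name__, g(a))
```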
Biological Neuron versus Artificial Neuron
Basic Single-Layer Network using ReLU activation function
Input: [−1 −8 5]
W: [1 −7 8]
Bias: [1 0 8]
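How these three vectors combine depends on the network diagram, which is not reproduced here. Under the simplest reading, that neuron i receives only input i (element-wise weights and biases), the forward pass would be:

```python
import numpy as np

x = np.array([-1, -8, 5])    # Input
w = np.array([1, -7, 8])     # W
b = np.array([1, 0, 8])      # Bias

# Assumed element-wise pairing: neuron i computes relu(w[i] * x[i] + b[i])
a = w * x + b                # pre-activation: [0, 56, 48]
h = np.maximum(0, a)         # ReLU output:    [0, 56, 48]
print(h)
```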
(Neural) Network
An artificial neural network (multi-layer perceptron) is organized as an input layer, one or more hidden layers, and an output layer.
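A minimal sketch of the forward pass through such a network with one hidden layer; the layer sizes, random weights, and tanh activation are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 3 inputs -> 4 hidden neurons -> 2 outputs
W1 = rng.normal(size=(4, 3))    # input-hidden weights
b1 = np.zeros(4)                # hidden biases
W2 = rng.normal(size=(2, 4))    # hidden-output weights
b2 = np.zeros(2)                # output biases

def forward(x):
    h = np.tanh(W1 @ x + b1)    # hidden layer: weighted sum + activation
    return W2 @ h + b2          # output layer (linear)

print(forward(np.array([1.0, 2.0, 4.0])))
```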
Epoch / Batch size / Iteration
An epoch is one presentation of the entire training set to the neural network: one forward and backward pass over all the training data.
Batch size is the number of training examples in one forward and backward pass.
An iteration is one forward and backward pass over a single batch, so the number of iterations per epoch equals the number of batches.
If we have 55,000 training examples and the batch size is 1,000, then we need 55 iterations to complete 1 epoch.
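The arithmetic of the example above, with a ceiling for the general case where the batch size does not divide the dataset evenly:

```python
import math

n_examples = 55_000
batch_size = 1_000

iterations_per_epoch = math.ceil(n_examples / batch_size)
print(iterations_per_epoch)   # 55
```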
Cost function
This is simply a function that we want to minimize.
There is no guarantee that minimizing it will bring us to the best solution!
It should be differentiable!
The error is the amount by which the value output by the network differs from the target value.
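The slides do not name a particular cost; a common differentiable choice is the mean squared error over N training examples:

$$E = \frac{1}{N} \sum_{n=1}^{N} (t_n - y_n)^2$$

where t_n is the target and y_n is the network output for example n.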
Problems with a Non-linear Decision Boundary
Some data cannot be separated by a straight line; hence a decision surface that captures the pink points in the figure effectively is needed.
Assignment
• Find the output of the above multi-layer perceptron using the sigmoid function.
• Given the input I = [1 2 4], the bias at the input layer (b1 = 2), and at the hidden layer (b2 = 3).
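The perceptron's weights come from the figure referenced above and are not reproduced here. As a sketch of the computation the assignment asks for, with placeholder weight matrices W1 and W2 (hypothetical values and shapes, not from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

I = np.array([1.0, 2.0, 4.0])    # given input
b1, b2 = 2.0, 3.0                # given biases

# Placeholder weights -- substitute the values from the figure
W1 = np.ones((2, 3))             # input -> hidden (hypothetical)
W2 = np.ones((1, 2))             # hidden -> output (hypothetical)

h = sigmoid(W1 @ I + b1)         # hidden-layer outputs
y = sigmoid(W2 @ h + b2)         # final output
print(y)
```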