
Introduction to Artificial Neural Networks (ANN)

Prepared by Stephanie Chua, edited by Liew SH


Outline

+ Definition
+ Human Biological Neuron
+ Artificial Neuron
+ Artificial Neural Network
+ Types of ANN
+ How do ANNs work?
+ Activation Functions
+ Applications of ANN
Definition

Neural networks, also known as artificial neural networks (ANNs) or simulated neural networks (SNNs), are a subset of machine learning and are at the heart of deep learning algorithms. Their name and structure are inspired by the human brain, mimicking the way that biological neurons signal to one another.
Human Biological Neuron

+ A biological neuron has three main components: dendrites, the soma (or cell body), and the axon.
+ Dendrites receive signals from other neurons.
+ The soma sums the incoming signals. When sufficient input is received, the cell fires; it transmits a signal over its axon to other cells.
Artificial Neuron

+ Once an input layer is determined, weights are assigned.
+ These weights help determine the importance of any given variable, with larger weights contributing more significantly to the output compared to other inputs.
+ All inputs are then multiplied by their respective weights and summed.
+ The weighted sum is then passed through an activation function, which determines the output. If that output exceeds a given threshold, it "fires" (or activates) the node, passing data to the next layer in the network.
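
A minimal sketch of this computation (NumPy assumed; the 0.5 threshold and the example weights are hypothetical, not from the lecture):

    import numpy as np

    def sigmoid(z):
        """Sigmoid activation: squashes any real number into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-z))

    def artificial_neuron(inputs, weights, threshold=0.5):
        """Multiply inputs by weights, sum, activate, then fire if over threshold."""
        z = np.dot(inputs, weights)   # weighted sum of all inputs
        activation = sigmoid(z)       # activation function determines the output
        return 1 if activation > threshold else 0

    # Example: two inputs with hypothetical weights
    print(artificial_neuron(np.array([1.0, 1.0]), np.array([0.8, 0.2])))  # prints 1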
Biological Neuron vs Artificial Neuron

Source: https://www.researchgate.net/publication/325870973_Investigating_Keystroke_Dynamics_as_a_Two-Factor_Biometric_Security/figures?lo=1
Artificial Neural Network

+ Artificial neural networks (ANNs) are composed of node layers, containing an input layer, one or more hidden layers, and an output layer.

Source: https://www.ibm.com/cloud/learn/neural-networks
Artificial Neural Network

Training a neural network:

i. Forward propagation - Apply a set of weights to the input data and calculate an output. For the first forward propagation, the set of weights is selected randomly.

ii. Backward propagation - Measure the margin of error of the output and adjust the weights accordingly to decrease the error.

The ANN repeats both forward and backward propagation until the weights are calibrated to accurately predict an output.
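
A hedged sketch of this loop, reduced to a single neuron for brevity (NumPy assumed; the names and learning rate are illustrative, not from the lecture):

    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.random(2)          # first iteration: random weights in [0, 1)
    x, target = np.array([1.0, 1.0]), 0.0
    learning_rate = 0.5

    def forward(x, w):
        """Forward propagation: apply weights to the input and compute an output."""
        return 1.0 / (1.0 + np.exp(-np.dot(x, w)))   # weighted sum + sigmoid

    for epoch in range(1000):
        output = forward(x, weights)
        error = target - output                      # margin of error
        # Backward propagation: adjust the weights to decrease the error
        weights += learning_rate * error * output * (1 - output) * x

    print(forward(x, weights))       # approaches the target as the weights calibrate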
Types of ANN – Feedforward Neural Networks

+ The flow of information through the network is unidirectional, without going through loops.
+ Single-layer networks vs multilayer networks
• The number of layers depends on the complexity of the function that needs to be performed.
• A single-layer feedforward neural network consists of only two layers of neurons, with no hidden layers in between them.
• Multilayer neural networks consist of multiple hidden layers between the input and output layers, allowing for multiple stages of information processing.
• Feedforward networks are used to perform basic pattern and image recognition.
Single Layer vs Multilayer Neural Networks

Source: https://towardsdatascience.com/multi-layer-neural-networks-with-sigmoid-function-deep-learning-for-rookies-2-bf464f09eb7f
ANN - CALCULATION

We use (1,1) => 0 to demonstrate forward propagation.

Simple Dataset (XOR)

Input | Output
------+-------
 0,0  |   0
 0,1  |   1
 1,0  |   1
 1,1  |   0
ANN - CALCULATION

1. Assign weights to all synapses.

Note: Weights are selected randomly (based on a Gaussian distribution). Since it is the first iteration, the initial weights will be between 0 and 1. However, the final weights don't need to be.
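
A minimal initialization sketch (NumPy assumed; variable names are illustrative, and uniform values in [0, 1) are used to match the first-iteration range on the slide):

    import numpy as np

    rng = np.random.default_rng(42)
    w_input_hidden = rng.random((2, 3))   # w1..w6: 2 inputs -> 3 hidden neurons
    w_hidden_output = rng.random(3)       # w7..w9: 3 hidden neurons -> 1 output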


ANN - CALCULATION

2. Sum the products of the inputs with their respective weights:

(1*0.8) + (1*0.2) = 1
(1*0.4) + (1*0.9) = 1.3
(1*0.3) + (1*0.5) = 0.8

To get the final value, apply an activation function to the hidden layer sums.
Types of Activation Function

Let's choose sigmoid.
ANN - CALCULATION

3. Apply S(x) to the three hidden layer sums:

S(1.0) = 0.7311
S(1.3) = 0.7858
S(0.8) = 0.6900

Then, sum the products of the hidden layer results with the second set of weights (also determined at random the first time around) to determine the output sum.
ANN - CALCULATION

4. Calculate the sum of the products of the hidden layer results with the second set of weights, and apply the activation function.

Output sum
= (0.73 * 0.3) + (0.79 * 0.5) + (0.69 * 0.9)
= 1.235

S(1.235) = 0.7747
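
Steps 1-4 can be reproduced with a short NumPy sketch (weights taken from the worked example; small rounding differences aside, it prints the same numbers):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([1.0, 1.0])              # the input (1, 1)
    W1 = np.array([[0.8, 0.4, 0.3],       # w1..w6: input -> hidden
                   [0.2, 0.9, 0.5]])
    W2 = np.array([0.3, 0.5, 0.9])        # w7..w9: hidden -> output

    hidden_sum = x @ W1                   # [1.0, 1.3, 0.8]
    hidden_out = sigmoid(hidden_sum)      # [0.7311, 0.7858, 0.6900]
    output_sum = hidden_out @ W2          # approx. 1.233 (the slide rounds to 1.235)
    output = sigmoid(output_sum)          # approx. 0.7747
    print(hidden_sum, hidden_out, output_sum, output)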
ANN – Complete Diagram

Question 1: Why is the calculated output off the target mark?

Answer: Because our weights were initialized randomly.

Action: We need to adjust the weights via back propagation.
Back Propagation – Adjust the Weights

+ Calculating the incremental change to these weights happens in two steps:
1. Find the margin of error of the output result to back out the necessary change in the output sum (delta output sum).
2. Extract the change in weights by multiplying the delta output sum by the hidden layer results.

Output sum margin of error = target - calculated

Back Propagation – Adjust the Weights

Output sum margin of error = target - calculated
= 0 - 0.77
= -0.77

Calculate the delta output sum: take the derivative of the activation (sigmoid) function and apply it to the output sum.

Delta output sum = S'(output sum) * (output sum margin of error)
Delta output sum = S'(1.235) * (-0.77)
Delta output sum = -0.1344 (proposed change)
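
Since the sigmoid's derivative is S'(z) = S(z) * (1 - S(z)), this step can be sketched as:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        # Derivative of the sigmoid: S'(z) = S(z) * (1 - S(z))
        s = sigmoid(z)
        return s * (1.0 - s)

    margin_of_error = 0.0 - 0.77                        # target - calculated
    delta_output_sum = sigmoid_prime(1.235) * margin_of_error
    print(delta_output_sum)                             # approx. -0.1344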
Back Propagation

hidden result 1 = 0.73
hidden result 2 = 0.79
hidden result 3 = 0.69

Delta weights = delta output sum * hidden layer results
Delta weights = -0.1344 * [0.73, 0.79, 0.69]
Delta weights = [-0.0983, -0.1056, -0.0941]

Hence:
old w7 = 0.3 -> new w7 = 0.202
old w8 = 0.5 -> new w8 = 0.394
old w9 = 0.9 -> new w9 = 0.806
Back Propagation

To determine the change in weights between the input and hidden layers, we perform similar calculations.

Delta hidden sum = delta output sum * hidden-to-output weights * S'(hidden sum)

Delta hidden sum = -0.1344 * [0.3, 0.5, 0.9] * S'([1, 1.3, 0.8])
Delta hidden sum = [-0.0403, -0.0672, -0.1209] * [0.1966, 0.1683, 0.2139]
Delta hidden sum = [-0.0079, -0.0113, -0.0259]
Back Propagation

Then multiply the results with the input data.

input 1 = 1
input 2 = 1

Delta weights = delta hidden sum * inputs
Delta weights = [-0.0079, -0.0113, -0.0259] * [1, 1]
Delta weights = [-0.0079, -0.0113, -0.0259, -0.0079, -0.0113, -0.0259]

old w1 = 0.8 -> new w1 = 0.7921
old w2 = 0.4 -> new w2 = 0.3887
old w3 = 0.3 -> new w3 = 0.2741
old w4 = 0.2 -> new w4 = 0.1921
old w5 = 0.9 -> new w5 = 0.8887
old w6 = 0.5 -> new w6 = 0.4741
Back Propagation

New weights:

old          new
------------------------
w1: 0.8      w1: 0.7921
w2: 0.4      w2: 0.3887
w3: 0.3      w3: 0.2741
w4: 0.2      w4: 0.1921
w5: 0.9      w5: 0.8887
w6: 0.5      w6: 0.4741
w7: 0.3      w7: 0.2020
w8: 0.5      w8: 0.3940
w9: 0.9      w9: 0.8060
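
The full backward pass can be sketched end-to-end in NumPy (assumptions: the update rule is w += delta, as the slides apply it, and all names are illustrative; working at full precision gives values close to, but not identical with, the slides' rounded numbers):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        s = sigmoid(z)
        return s * (1.0 - s)

    x = np.array([1.0, 1.0])
    target = 0.0
    W1 = np.array([[0.8, 0.4, 0.3],
                   [0.2, 0.9, 0.5]])    # w1..w6
    W2 = np.array([0.3, 0.5, 0.9])      # w7..w9

    # Forward pass
    hidden_sum = x @ W1
    hidden_out = sigmoid(hidden_sum)
    output_sum = hidden_out @ W2

    # Backward pass
    delta_output_sum = sigmoid_prime(output_sum) * (target - sigmoid(output_sum))
    delta_W2 = delta_output_sum * hidden_out              # changes for w7..w9
    delta_hidden_sum = delta_output_sum * W2 * sigmoid_prime(hidden_sum)
    delta_W1 = np.outer(x, delta_hidden_sum)              # changes for w1..w6

    W1 += delta_W1
    W2 += delta_W2
    print(W1)   # approx. [[0.792, 0.389, 0.274], [0.192, 0.889, 0.474]]
    print(W2)   # approx. [0.201, 0.394, 0.807], close to the slides' rounded values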
Types of ANN – Recurrent Neural Networks

+ Recurrent neural networks (RNNs), as the name suggests, involve the recurrence of operations in the form of loops (see the sketch after this list). They are much more complicated than feedforward networks and can perform more complex tasks than basic image recognition.
+ While in feedforward neural networks connections only lead from one neuron to neurons in subsequent layers without any feedback, recurrent neural networks allow connections to lead back to neurons in the same layer, allowing for a broader range of operations.
+ One of the limitations of RNNs is that they are difficult to train and have a very short-term memory, which limits their functionality.
+ To overcome the memory limitation, a newer form of RNN, known as the Long Short-Term Memory (LSTM) network, is used. LSTMs extend the memory of RNNs to enable them to perform tasks involving longer-term memory.
+ The main application areas for RNNs include natural language processing problems such as speech and text recognition, text prediction, and natural language generation.
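
A minimal sketch of that recurrence (NumPy assumed; a single vanilla RNN cell with toy sizes and random values, purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    input_size, hidden_size = 3, 4
    W_xh = rng.standard_normal((input_size, hidden_size)) * 0.1   # input -> hidden
    W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden (the loop)
    b = np.zeros(hidden_size)

    def rnn_step(x_t, h_prev):
        """One step of a vanilla RNN: the previous hidden state feeds back in."""
        return np.tanh(x_t @ W_xh + h_prev @ W_hh + b)

    h = np.zeros(hidden_size)
    for x_t in rng.standard_normal((5, input_size)):   # a sequence of 5 inputs
        h = rnn_step(x_t, h)                           # the state carries short-term memory
    print(h)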
Recurrent vs Feedforward Neural Networks

Source: https://machine-learning.paperspace.com/wiki/recurrent-neural-network-rnn

Source: https://towardsdatascience.com/implementation-of-rnn-lstm-and-gru-a4250bf6c090
Types of ANN – Convolutional Neural Networks

+ Convolutional neural networks (CNNs) are commonly associated with computer vision applications. Their architecture is specifically suited for performing complex visual analysis.
+ The convolutional neural network architecture is defined by a three-dimensional arrangement of neurons, instead of the standard two-dimensional array.
+ The first layer in such neural networks is called a convolutional layer.
+ Each neuron in the convolutional layer only processes the information from a small part of the visual field.
+ The convolutional layers are followed by rectified linear units, or ReLU, which enable the CNN to handle complicated information.
+ CNNs are mainly used in object recognition applications like machine vision and in self-driving vehicles.
Convolutional Neural Networks

Source: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53

Source: https://www.simplilearn.com/tutorials/deep-learning-tutorial/convolutional-neural-network
Four Important Layers in Convolutional Neural Network

+ Convolution layer
• This is the first step in the process of extracting valuable features from an image.
• A convolution layer has several filters that perform the convolution operation. Every image is considered as a matrix of pixel values.
+ ReLU layer
• ReLU stands for rectified linear unit.
• Once the feature maps are extracted, the next step is to move them to a ReLU layer.
• ReLU performs an element-wise operation and sets all the negative pixels to 0.
• It introduces non-linearity to the network, and the generated output is a rectified feature map.
+ Pooling layer
• Pooling is a down-sampling operation that reduces the dimensionality of the feature map.
• The rectified feature map now goes through a pooling layer to generate a pooled feature map.
+ Fully connected layer
• The next step in the process is called flattening.
• Flattening is used to convert all the resultant 2-dimensional arrays from pooled feature maps into a single long continuous linear vector.
• The flattened matrix is fed as input to the fully connected layer to classify the image (see the sketch after this list).
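
As an illustration only (NumPy, toy sizes, random values; a real CNN learns its filters), the four layers chained together might look like:

    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((6, 6))            # toy grayscale image as a matrix of pixel values
    kernel = rng.random((3, 3)) - 0.5     # one convolution filter

    # Convolution layer: slide the filter over the image to extract a feature map
    feature_map = np.array([[np.sum(image[i:i+3, j:j+3] * kernel)
                             for j in range(4)] for i in range(4)])

    # ReLU layer: element-wise, set all negative values to 0
    rectified = np.maximum(feature_map, 0)

    # Pooling layer: 2x2 max pooling down-samples the rectified feature map
    pooled = rectified.reshape(2, 2, 2, 2).max(axis=(1, 3))

    # Flattening + fully connected layer: one linear unit classifies the image
    flat = pooled.flatten()
    w, b = rng.random(flat.size), 0.1
    print(flat @ w + b)                   # the classification score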
How do ANNs work?

+ Each node, or artificial neuron, connects to another and has an associated weight and threshold. With bias b:

f(x) = 1 if Σ_j (x_j * w_j) + b ≥ 0
f(x) = 0 if Σ_j (x_j * w_j) + b < 0

Source: https://www.freecodecamp.org/news/deep-learning-neural-networks-explained-in-plain-english/
Activation Functions

+ Linear
• f(z) = z

+ Non-linear
• Rectified Linear Unit (ReLU)
  - ReLU ensures that the output is not negative: if z is greater than zero, the output remains z; if z is negative, the output is zero.
  - f(z) = max(0, z)
• Tanh
  - Hyperbolic tangent of z.
  - f(z) = tanh(z)
• Sigmoid
  - f(z) = 1 / (1 + e^(-z))
  - 1. Negate z by multiplying by -1.
  - 2. Take the exponential (e to the power) of the output in No. 1.
  - 3. Add 1 to the output in No. 2.
  - 4. Divide 1 by the output in No. 3.
• Softmax
  - Used in the output layer.
  - Calculates the probability distribution of the event over n events.
  - f(z_j) = e^(z_j) / Σ_{k=1..K} e^(z_k), for j = 1 … K
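
These functions in NumPy, as a sketch:

    import numpy as np

    def linear(z):
        return z

    def relu(z):
        return np.maximum(0, z)           # negative values become 0

    def tanh(z):
        return np.tanh(z)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # steps 1-4 from the slide in one line

    def softmax(z):
        e = np.exp(z - np.max(z))         # shift for numerical stability
        return e / e.sum()                # probabilities that sum to 1

    z = np.array([-1.0, 0.0, 2.0])
    print(relu(z), sigmoid(z), softmax(z))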
Weight Matrix

Source: https://www.researchgate.net/publication/292077006_Power-Efficient_Accelerator_Design_for_Neural_Networks_Using_Computation_Reuse/figures?lo=1
Example

Activation function - tanh
Activation function - Softmax

Since the second output node value is greater than the first, the computed output would be interpreted as the categorical value corresponding to (0, 1).

Source: https://visualstudiomagazine.com/articles/2014/11/01/use-python-with-your-neural-networks.aspx
Example – Abdominal Pain Prediction
Applications of ANN
+ Image recognition
+ Speech recognition
+ Facial recognition
+ Machine translation
+ Medical diagnosis
+ Stock market prediction
+ Fraud detection
+ Many more …
End of Lecture
