Deep Learning & Neural Network
Supervised Learning
• Computation methods used in the supervised learning process
• Regression
• Naïve Bayes
• K-Nearest Neighbors
• Support Vector Machines
• Decision Tree
• Neural Networks
Deep Learning
What is Deep Learning?
“Deep learning attempts to mimic the human brain—
albeit far from matching its ability—enabling systems
to cluster data and make predictions with incredible
accuracy.”
Part of the Bigger Picture
Deep learning is a machine learning
technique that teaches computers to do what
comes naturally to humans: learn by example.
A computer model learns to perform
classification tasks directly from images, text,
or sound.
Models are trained by using a large set of
labeled data and neural network
architectures that contain many layers.
Why now?
• There are two main reasons DL has only recently become useful:
1) Deep learning requires large amounts of labelled data.
2) Deep learning requires substantial computing power.
• Examples:
• Autonomous Driving
• Object identification/classification
• Automatic detection of cancer and other diseases
• Automated hearing and speech translation
Deep Learning Applications
• Face & Object Recognition
• Natural Language & Speech
• Bioinformatics & Drug Discovery
• Fraud Detection
Why is deep learning so popular?
This wasn’t always the case!
Since 1980s: Form of models hasn’t changed much, but lots of new tricks…
– More hidden units
– Better (online) optimization
– New nonlinear functions (ReLUs)
– Faster computers (CPUs and GPUs)
Why is deep learning so popular?
• Has won numerous pattern recognition competitions.
• Does so with minimal feature engineering.
• A lot of money is invested in it:
• DeepMind: Acquired by Google for $400 million
• DNNResearch: Three-person startup (including Geoff Hinton) acquired by Google for an undisclosed price
• Enlitic, Ersatz, MetaMind, Nervana, Skylab:
Deep Learning startups commanding millions of VC dollars
Learning Highly Non-Linear Functions
• We have a mapping y = F(x)
• F might be a non-linear function
• x is a (vector of) continuous and/or discrete variables
• y is a (vector of) continuous and/or discrete variables
XOR gate: you cannot separate the 0s (red) from the 1s (blue) with a single straight line.
Learning Highly Non-Linear Functions
• In SVMs, we solve such problems by applying a kernel function.
• An alternative approach is to use a fixed number of basis functions whose parameter values are adapted during training.
• The most successful model of this type in the context of pattern
recognition is the feedforward neural network, also known as the
multilayer perceptron.
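As an illustration, here is a minimal sketch (not from the slides) of a feedforward network with one hidden layer that computes XOR using hand-chosen weights and a step activation; all the numeric values are assumptions for demonstration only.

```python
import numpy as np

def step(x):
    # Heaviside step activation: 1 if x >= 0, else 0
    return (x >= 0).astype(int)

# Hand-chosen weights for a 2-input, 2-hidden-unit, 1-output network.
# Hidden unit 1 fires for OR(x1, x2); hidden unit 2 fires for AND(x1, x2).
W_hidden = np.array([[1.0, 1.0],   # weights into h1 (OR)
                     [1.0, 1.0]])  # weights into h2 (AND)
b_hidden = np.array([-0.5, -1.5])  # OR needs sum >= 0.5, AND needs sum >= 1.5

w_out = np.array([1.0, -2.0])      # output fires for "OR but not AND" = XOR
b_out = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W_hidden @ np.array(x) + b_hidden)
    y = step(w_out @ h + b_out)
    print(x, "->", int(y))         # prints 0, 1, 1, 0
```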
Perceptron and Neural Nets
• From biological neuron to artificial neuron (perceptron)
Artificial Neuron
Artificial Neuron
• An artificial neuron is a mathematical function based on a model of
biological neurons.
• Each neuron
• takes inputs,
• weighs them separately,
• sums them up,
• passes this sum through a
nonlinear function to produce
output.
Source: https://www.simplilearn.com/tutorials/deep-learning-tutorial/perceptron
Perceptron
• A Perceptron is an algorithm for supervised learning of binary
classifiers.
• This algorithm enables a neuron to learn by processing the elements of the training set one at a time.
Components of Perceptron
• Input Layer: the input layer consists of one or more input neurons, which receive
input signals from the external world or other layers of the neural network.
• Weights: each input neuron is associated with a weight, representing the strength
of the connection between the input and output neurons.
• Bias: a bias term is added to the input layer to give the perceptron additional
flexibility in modelling complex patterns in the input data.
• Activation Function: the activation function determines the perceptron’s output
based on the weighted sum of the inputs and the bias term. Common activation
functions used in perceptrons include the step, sigmoid, and ReLU functions.
Components of Perceptron
• Output: the output of the perceptron is a single binary value, either 0 or 1, which
indicates the class or category to which the input data belongs.
• Training Algorithm: the perceptron is typically trained using a supervised learning algorithm such as the perceptron learning algorithm or backpropagation. During training, the weights and biases of the perceptron are adjusted to minimize the error between the predicted output and the true output for a given set of training examples (a sketch of this update rule follows below).
• Overall, the perceptron is a simple yet powerful algorithm that can be used to
perform binary classification tasks and has paved the way for more complex
neural networks used in deep learning today.
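Below is a minimal sketch of the perceptron learning rule mentioned above, assuming a step activation; the toy data, learning rate, and epoch count are illustrative assumptions, not values from the slides.

```python
import numpy as np

def perceptron_train(X, y, lr=0.1, epochs=20):
    """Train a binary perceptron with the classic perceptron learning rule."""
    w = np.zeros(X.shape[1])  # weights
    b = 0.0                   # bias
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Step activation: predict 1 if the weighted sum plus bias is >= 0
            pred = 1 if xi @ w + b >= 0 else 0
            # Update the weights and bias only when the prediction is wrong
            error = target - pred
            w += lr * error * xi
            b += lr * error
    return w, b

# Toy linearly separable data (illustrative values only)
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, 0, 0])
w, b = perceptron_train(X, y)
print(w, b)
```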
Neural Networks Variations
• There are many types of artificial neural networks, each with their
unique strengths.
• Different types of neural networks are used for different data and
applications.
• They use different principles in determining their own rules.
• The different architectures of neural networks are specifically
designed to work on those particular types of data or domain.
Types of Neural Networks
• Feed Forward Neural Network
• Multilayer Perceptron
• Convolutional Neural Network
• Radial Basis Function Neural Network
• Recurrent Neural Network
• LSTM – Long Short-Term Memory
• Sequence to Sequence Models
• Modular Neural Network
• Graph Neural Network
Neural Network
Neural Networks
• Neural networks, also known as artificial
neural networks (ANNs), or Feedforward
Neural Networks (FNN), are a subset
of machine learning and are at the heart
of deep learning algorithms.
• Their name and structure are inspired by
the human brain, mimicking the way that
biological neurons signal to one another.
Network Structure
• A NN/FNN comprises several layers: an input layer, one or more hidden layers, and an output layer.
• Each node, or artificial neuron, connects to
another and has an associated weight and
threshold.
• If the output of any individual node is above the
specified threshold value, that node is activated,
sending data to the next layer of the network.
• Otherwise, no data is passed along to the next
layer of the network.
Feedforward Networks

Input layer: x_1, x_2, x_3, ..., x_M
Hidden layer: z_1, z_2, ..., z_D
Output: ŷ

$$a_j = \sum_{i=0}^{M} w_{ji}\, x_i, \qquad z_j = \sigma(a_j)$$
$$b = \sum_{j=0}^{D} \hat{w}_j\, z_j, \qquad \hat{y} = \hat{\sigma}(b)$$

where $\sigma$ and $\hat{\sigma}$ are activation functions.

Source: https://deepai.org/machine-learning-glossary-and-terms/feed-forward-neural-network
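The following is a minimal NumPy sketch of this forward pass, assuming a logistic sigmoid for both $\sigma$ and $\hat{\sigma}$; the layer sizes, weights, and input values are illustrative assumptions (bias terms are omitted for brevity).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
M, D = 3, 4                      # number of inputs and hidden units (illustrative)

W = rng.normal(size=(D, M))      # hidden-layer weights w_ji
w_hat = rng.normal(size=D)       # output weights w^_j

x = np.array([0.5, -1.0, 2.0])   # an input vector x_1..x_M

a = W @ x                        # a_j = sum_i w_ji * x_i
z = sigmoid(a)                   # z_j = sigma(a_j)
b = w_hat @ z                    # b = sum_j w^_j * z_j
y = sigmoid(b)                   # y^ = sigma^(b)
print(y)
```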
Decision Boundary
• 0 hidden layers: linear classifier (hyperplanes)
• 1 hidden layer: boundary of a convex region (open or closed)
• 2 hidden layers: combinations of convex regions
(Each case is illustrated in the slides with a small network over inputs x1, x2 and output y.)
Example 1
• Use the given inputs and weights to calculate the outputs O1 and O2.
Example 1 - Solution
(The worked calculations for h1, h2, O1, and O2 are shown in the slide figures.)
Multi-Class Output

Input layer: x_1, ..., x_M → Hidden layer: a_1, ..., a_D → Output layer: y_1, ..., y_K

Each y_i should be the probability of its category, thus:
$$0 < y_i < 1, \qquad \sum_i y_i = 1$$
Multi-Class Output

• Softmax activation function, applied to the output-layer pre-activations z_1, ..., z_K:
$$y_i = \sigma(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
so that $0 < y_i < 1$ and $\sum_i y_i = 1$.

(Network: input x_1, ..., x_M → hidden layer a_1, ..., a_K → softmax output y_1, ..., y_K.)
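A short sketch of the softmax function defined above, with the usual subtract-the-max trick for numerical stability; the input vector is an arbitrary example.

```python
import numpy as np

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
y = softmax(z)
print(y, y.sum())   # each y_i is in (0, 1) and they sum to 1
```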
Terminologies
Activation Functions
• Sigmoid or Logistic Activation
Function.
• So far, we’ve assumed that the
activation function (nonlinearity)
is always the sigmoid.
Activation Functions
• Tangent hyperbolic
function simply referred
to as tanh function
𝑧 = tanh(𝑥)
• Shifted the output range
int (-1, +1)
Activation Functions
• Intent: reduce the vanishing gradient problem.
$$\mathrm{ReLU}(x) = \max(0, x)$$
• ReLU: Rectified Linear Unit.
• The gradient of sigmoids becomes increasingly small as the absolute value of x increases. The constant gradient of ReLUs results in faster learning.
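For reference, a small NumPy sketch of the three activation functions discussed above; the sample inputs are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # output in (0, 1)

def tanh(x):
    return np.tanh(x)                  # output in (-1, 1)

def relu(x):
    return np.maximum(0.0, x)          # ReLU(x) = max(0, x)

x = np.linspace(-5, 5, 11)
print(sigmoid(x), tanh(x), relu(x), sep="\n")
```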
Dataset Split
• Training Dataset
The sample of data used to fit the model.
• Validation Dataset
The sample of data used to provide an unbiased evaluation of a model fit on
the training dataset while tuning model hyperparameters.
• Test Dataset
The sample of data used to provide an unbiased evaluation of a final model fit
on the training dataset.
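One common way to produce these three splits is sketched below with scikit-learn's train_test_split; the 70/15/15 proportions and the random data are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 10)             # illustrative feature matrix
y = np.random.randint(0, 2, size=1000)   # illustrative binary labels

# Hold out 15% for testing, then carve a validation set out of the remainder.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.15 / 0.85, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 700 / 150 / 150
```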
Overfitting and Underfitting
• Underfitting happens when
the model has a very high
bias and is unable to
capture the complex
patterns in the data.
• Overfitting is the opposite, in the sense that the model is too complex (has too high a capacity) and captures even the noise in the data.
Tensor
• A tensor is just a container for
data, typically numerical data.
• Tensors are a generalization of
matrices to any number of
dimensions.
• You may already be familiar
with matrices, which are 2D
tensors.
• Note that in the context of
tensors, a dimension is often
called an axis.
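A quick NumPy sketch of tensors of increasing rank; the values are arbitrary.

```python
import numpy as np

scalar = np.array(5)                    # 0D tensor (a single number)
vector = np.array([1, 2, 3])            # 1D tensor
matrix = np.array([[1, 2], [3, 4]])     # 2D tensor (a matrix)
cube   = np.zeros((2, 3, 4))            # 3D tensor

for t in (scalar, vector, matrix, cube):
    print(t.ndim, t.shape)              # ndim is the number of axes
```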
Gradient Descent
• Gradient descent is an optimization algorithm which is commonly
used to train machine learning models and neural networks.
• Training data helps these models learn over time, and the cost
function within gradient descent specifically acts as a barometer,
gauging its accuracy with each iteration of parameter updates.
• Until the cost function is close to or equal to zero, the model will continue to adjust its parameters to yield the smallest possible error.
Gradient Descent
• The starting point is just an arbitrary point
for us to evaluate the performance.
• From that starting point, we will find the
derivative (or slope), and from there, we
can use a tangent line to observe the
steepness of the slope.
• The slope will inform the updates to the
parameters—i.e. the weights and bias.
• The slope at the starting point will be steeper, but as new parameters are generated, the
steepness should gradually reduce until it reaches the lowest point on the curve, known as the
point of convergence.
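A minimal sketch of gradient descent on a toy one-parameter cost function, f(w) = (w − 3)², chosen purely for illustration:

```python
def cost(w):
    return (w - 3.0) ** 2          # minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # derivative (slope) of the cost

w = 10.0                           # arbitrary starting point
learning_rate = 0.1
for step in range(50):
    w -= learning_rate * grad(w)   # move against the gradient
print(w, cost(w))                  # w is now close to 3, the cost close to 0
```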
Learning Rate
• Learning rate (also referred to as step size or the alpha) is the size of
the steps that are taken to reach the minimum.
• This is typically a small value, and it is evaluated and updated based
on the behaviour of the cost function.
• High learning rates result in larger steps but risk overshooting the minimum.
• Conversely, a low learning rate has small step sizes. While it has the
advantage of more precision, the number of iterations compromises
overall efficiency as this takes more time and computations to reach
the minimum.
Learning Rate
• Learning rate is a hyper-parameter that controls how much we adjust the weights of our network with respect to the loss gradient.
Learning Rate
Find a Good Learning Rate
Momentum in Neural Network
• Gradient descent is an optimization algorithm that follows the
negative gradient of an objective function in order to locate the
minimum of the function.
• Momentum is an extension to the gradient descent optimization
algorithm that allows the search to build inertia in a direction in the
search space and overcome the oscillations of noisy gradients and
coast across flat spots of the search space.
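Below is a sketch of the momentum update, extending the plain gradient-descent loop shown earlier on the same toy cost; the momentum coefficient beta is an illustrative assumption.

```python
def grad(w):
    return 2.0 * (w - 3.0)     # gradient of the same toy cost, (w - 3)^2

w = 10.0
velocity = 0.0
learning_rate = 0.1
beta = 0.9                     # momentum coefficient (illustrative)

for step in range(200):
    # The velocity accumulates past gradients, building inertia in one direction.
    velocity = beta * velocity - learning_rate * grad(w)
    w += velocity
print(w)                       # w ends very close to the minimum at 3
```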
Example 2
• Independent variables (inputs): Age = 34, Gender = 2, Stage = 4.
• Each input feeds through a weighted connection (weights such as .6, .4, .2, .1, .3, .7, .2) into a hidden layer, and the hidden layer feeds through a second set of weights (e.g. .5, .8) into the output.
• Dependent variable (prediction): output = 0.6, the "probability of being alive".
• (The slides repeat this diagram several times, highlighting different connections and weights; one variant uses Gender = 1.)
Backpropagation: Multi-Layer
Perceptron
What is Backpropagation?
• Backpropagation in neural networks is short for “backward propagation of errors”
• Backpropagation is the method of fine-tuning the weights of a neural
network based on the error rate obtained in the previous epoch (i.e.,
iteration)
• This method helps calculate the gradient of a loss function with
respect to all the weights in the network
• Proper tuning of the weights allows you to reduce error rates and
make the model reliable by increasing its generalization
How Backpropagation Works?
• Inputs X arrive through the preconnected path
• Input is modelled using real weights W. The weights are usually randomly
selected.
• Calculate the output for every neuron from the input layer, to the hidden
layers, to the output layer.
• Calculate the error in the outputs
• Travel back from the output layer to
the hidden layer to adjust the
weights such that the error is
decreased.
• Keep repeating the process until the
desired output is achieved
Chain Rule

Consider a network in which the inputs x feed into intermediate units u, which in turn feed into outputs y, for example $y_i = \sum_j \hat{w}_j u_j$ with $u_j = \sum_k w_{jk} x_k$ (so $y_i = g(\mathbf{u})$ and $u_j = h(\mathbf{x})$).

Chain rule:
$$\frac{dy_i}{dx_k} = \sum_j \frac{dy_i}{du_j}\,\frac{du_j}{dx_k}$$
Backpropagation is just repeated application of the chain rule from Calculus.
Backpropagation Algorithm
Chain rule:
$$\frac{dy_i}{dx_k} = \sum_j \frac{dy_i}{du_j}\,\frac{du_j}{dx_k}$$
1. Instantiate the computation as a directed acyclic graph, where each
intermediate quantity is a node
2. At each node, store (a) the quantity computed in the forward pass and (b) the
partial derivative of the goal with respect to that node’s intermediate quantity.
3. Initialize all partial derivatives to 0.
4. Visit each node in reverse topological order. At each node, add its contribution
to the partial derivatives of its parents
This algorithm is also called reverse-mode automatic differentiation.
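The sketch below applies this procedure by hand to a one-hidden-layer network with sigmoid units and a squared-error loss: the forward pass stores each intermediate quantity, and the backward pass accumulates partial derivatives via the chain rule. The data, layer sizes, and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])        # one training input (illustrative)
t = 1.0                               # its target output

W = rng.normal(size=(4, 3))           # hidden weights
w_out = rng.normal(size=4)            # output weights
lr = 0.5

for epoch in range(100):
    # Forward pass: store every intermediate quantity.
    a = W @ x
    z = sigmoid(a)
    b = w_out @ z
    y = sigmoid(b)
    loss = 0.5 * (y - t) ** 2

    # Backward pass: chain rule, from the output back toward the inputs.
    dL_dy = y - t
    dL_db = dL_dy * y * (1 - y)        # sigmoid'(b) = y(1 - y)
    dL_dwout = dL_db * z               # partials w.r.t. output weights
    dL_dz = dL_db * w_out
    dL_da = dL_dz * z * (1 - z)
    dL_dW = np.outer(dL_da, x)         # partials w.r.t. hidden weights

    # Gradient descent step on all weights.
    w_out -= lr * dL_dwout
    W -= lr * dL_dW

print(loss)   # the squared error shrinks over the epochs
```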
Convolutional Neural Networks
What are CNNs?
• When it comes to image classification, the most widely used neural networks are Convolutional Neural Networks (CNNs).
• CNNs contain multiple convolution layers, which are responsible for the
extraction of important features from the image
• The earlier layers are responsible for low-level details, and the later layers
are responsible for more high-level features
Applications:
• Image processing
• Computer Vision
• Speech Recognition
• Machine translation
Pros and Cons of CNN
• Advantages:
• Can be used for deep learning with few parameters
• Fewer parameters to learn compared to a fully connected layer
• Disadvantages:
• Comparatively complex to design and maintain
• Comparatively slow [depends on the number of hidden layers]
CNN Example
Emotion Recognition from face image
Convolutional Neural Network
• How to compute the convolution of an image?
Suppose the kernel is
$$K = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$
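A minimal sketch of sliding this kernel over a small image (technically cross-correlation, as implemented in most CNN libraries), with stride 1 and no padding; the 5×5 input image is an arbitrary example.

```python
import numpy as np

K = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 0, 1]])

image = np.array([[1, 1, 1, 0, 0],     # illustrative 5x5 input image
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

kh, kw = K.shape
out_h = image.shape[0] - kh + 1
out_w = image.shape[1] - kw + 1
feature_map = np.zeros((out_h, out_w))

# Slide the kernel over the image; each output is an element-wise product + sum.
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + kh, j:j + kw]
        feature_map[i, j] = np.sum(patch * K)

print(feature_map)
```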
AlexNet
We want the model to learn the kernels.
Reported state-of-the-art results on the ImageNet LSVRC-2010 benchmark and won the ILSVRC-2012 image classification competition.
Source: A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,”
Residual Network
By adding identity (shortcut) mappings between the convolutional layers, the authors were able to train a much deeper network for image classification, as sketched below.
This model won the 1st place on the ILSVRC
2015 classification task.
K. He, X. Zhang, S. Ren, and J. Sun, “Deep
Residual Learning for Image
Recognition”
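Below is a sketch of a basic residual block with an identity skip connection, in the spirit of He et al.; the use of PyTorch, the channel count, and the exact layer arrangement are illustrative assumptions rather than the architecture from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two 3x3 convolutions whose output is added to the block's input."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)       # identity mapping added before the final ReLU

block = ResidualBlock(channels=16)
x = torch.randn(1, 16, 32, 32)       # (batch, channels, height, width)
print(block(x).shape)                # same shape as the input
```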
You Only Look Once(YOLO): Unified, Real-Time
Object Detection
A single neural network predicts bounding boxes and class probabilities directly from full images in one
evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on
detection performance.
Source: J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,”
Recurrent Neural Network
What are RNNs?
• RNNs come into the picture when there is a need for predictions using sequential data
• Sequential data can be a sequence of images, words, etc.
• RNNs have a similar structure to that of a feed-forward network, except that the layers also receive a time-delayed input of the previous instance's prediction
• This instance prediction is stored in the RNN cell, which is a second input for every prediction
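A minimal sketch of a vanilla RNN cell: at each time step the hidden state is computed from the current input and the previous hidden state (the time-delayed input described above). The sizes, weights, and input sequence are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 5        # illustrative sizes

W_xh = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # The new hidden state depends on the current input and the previous state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

sequence = rng.normal(size=(4, input_size))   # a sequence of 4 input vectors
h = np.zeros(hidden_size)
for x_t in sequence:
    h = rnn_step(x_t, h)
print(h)   # the final hidden state summarizes the whole sequence
```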
Applications of RNNs?
• Prediction problems
• Machine Translation
• Speech Recognition
• Language Modelling and Generating Text (Text processing like auto
suggest, grammar checks, etc.)
• Video tagger
• Generating image descriptions
• Sentiment Analysis
• Text Summarization.
• Call Centre Analysis.
Pros and Cons
• Advantages:
• Can model sequential data, where each sample can be assumed to depend on previous ones
• Can be used with convolution layers to extend the effective pixel neighbourhood
• Disadvantages:
• Gradient vanishing and exploding problems
• Training recurrent neural nets could be a difficult task
• Difficult to process long sequential data using ReLU as an activation function
Sequence to Sequence Models
What are Seq2Seq Models?
• A sequence-to-sequence (Seq2Seq) model consists of two Recurrent Neural Networks
• There exists an encoder that processes the input and a decoder that processes the output
• The encoder and decoder work in tandem, either using the same parameters or different ones
• This model, in contrast to a plain RNN, is particularly applicable in those cases where the length of the input data is not necessarily equal to the length of the output data
• These models are applied mainly in chatbots, machine translation, and question answering systems
Sequence to Sequence Models
• Example: Machine translation
LSTM: Long Short-Term Memory
What is LSTM?
• LSTM Neural Networks overcome the issue of Vanishing Gradient in
RNNs by adding a special memory cell that can store information for
long periods of time
• LSTM uses 3 gates to decide which information should be used or forgotten: the Input gate, the Output gate, and the Forget gate
• The Input gate controls what data should be kept in memory
• The Output gate controls the data given to the next layer
• The Forget gate controls when to dump/forget the data that is not required
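A sketch of a single LSTM step showing the three gates described above acting on the memory cell; the sizes and randomly drawn weights are illustrative assumptions (bias terms omitted for brevity).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
concat = input_size + hidden_size

# One weight matrix per gate, plus one for the candidate cell contents.
W_f, W_i, W_o, W_c = (rng.normal(size=(hidden_size, concat)) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    v = np.concatenate([x_t, h_prev])
    f = sigmoid(W_f @ v)              # forget gate: what to drop from the cell
    i = sigmoid(W_i @ v)              # input gate: what new data to store
    o = sigmoid(W_o @ v)              # output gate: what to pass to the next layer
    c_tilde = np.tanh(W_c @ v)        # candidate cell contents
    c = f * c_prev + i * c_tilde      # updated long-term memory
    h = o * np.tanh(c)                # new hidden state / output
    return h, c

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
h, c = lstm_step(rng.normal(size=input_size), h, c)
print(h, c)
```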
Applications of LSTM?
• Applications:
• Gesture recognition
• Speech recognition
• Text prediction
Long Short-Term Memory (LSTM)
• LSTMs are explicitly designed to avoid the long-term dependency problem
• Remembering information for long periods of time is practically their default
behavior, not something they struggle to learn
• They really work a lot better for most tasks!
Gated Recurrent Unit (GRU)
• Aims to solve the vanishing gradient problem that comes with a standard recurrent neural network. The GRU can also be considered a variation of the LSTM
Graph Neural Network
Graph Neural Network (GNN)
• Although counterintuitive, one can learn more about the symmetries and structure of images and text by viewing them as graphs, and build an intuition that helps in understanding other, less grid-like graph data