MACHINE LEARNING
(20BT60501)
COURSE DESCRIPTION:
Concept learning, General-to-specific ordering, Decision tree learning, Support vector
machines, Artificial neural networks, Multilayer neural networks, Bayesian learning,
Instance-based learning, Reinforcement learning.
Subject: MACHINE LEARNING (20BT60501)
Topic: Unit III – ARTIFICIAL NEURAL NETWORKS
Prepared By:
Dr.J.Avanija
Professor
Dept. of CSE
Sree Vidyanikethan Engineering College
Tirupati.
Unit III – ARTIFICIAL NEURAL NETWORKS
Neural network representations
Appropriate problems for neural network learning
Perceptrons
Multilayer networks and Backpropagation algorithm
Convergence and local minima
Representational power of feedforward networks
Hypothesis space search and inductive bias
Hidden layer representations, Generalization
Overfitting, Stopping criterion
An Example - Face Recognition.
Features of ANNs
ANNs perform well, and generally better with a larger number of hidden units; more hidden units usually produce lower error.
Determining the network topology is difficult.
Choosing a single learning rate that works well throughout training is difficult.
It is difficult to reduce training time by altering the network topology or learning parameters.
ANNs give highly accurate predictive models for a large number of different types of problems.
Ease of use and deployment is poor, since several design choices must be made:
o Connections between nodes
o Number of units
o Training level
Learning capability: the model is built one record at a time.
Features of ANNs . . .
Weaknesses
Long training time
Require a number of parameters that are typically best determined
empirically, e.g., the network topology or "structure".
Poor interpretability: it is difficult to interpret the symbolic
meaning behind the learned weights and the "hidden units"
in the network.
Strengths
High tolerance to noisy data
Ability to classify untrained patterns
Well-suited for continuous-valued inputs and outputs
Successful on a wide array of real-world data
Algorithms are inherently parallel
Techniques have recently been developed for the
extraction of rules from trained neural networks
Hypothesis Space and Inductive Bias
Every possible combination of network weights is a potential candidate.
All potential candidates form the Hypothesis Space.
The hypothesis space can be viewed as an N-dimensional Euclidean space of the N network weights.
The hypothesis space of a neural network is a continuous space.
The error E of a network is differentiable with respect to the continuous parameters of the hypothesis
space.
These two factors result in a well-defined error gradient, which leads to efficient search
strategies.
Inductive Bias – can be defined as the set of assumptions (implicit or explicit) made by learning
algorithms in order to perform induction (or generalization).
Inductive Bias of NN – “Smooth Interpolation between the data points”.
Representational Power of Feedforward NNs
Representational power specifies what set of functions (problems) can be represented by a NN.
Boolean Functions –
• Every Boolean function can be represented by a NN with 2 layers (see the sketch after this list).
• The maximum number of hidden neurons required equals the number of samples in the training data.
• In practice, the NN may be designed with fewer hidden neurons.
Continuous Functions –
• Every bounded continuous function can be approximated with arbitrarily small error E by a
NN with 2 layers.
• Hidden layer uses sigmoid function.
• Output layer uses unthresholded linear function.
Arbitrary Functions –
• Any arbitrary function can be approximated by a NN with 3 layers.
• Hidden layer uses sigmoid function.
• Output layer uses unthresholded linear function.
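As an illustration of the Boolean-function case above, here is a minimal sketch (Python/NumPy) of a 2-layer network of sigmoid units computing XOR. The weights are hand-picked for illustration, not learned, and the specific values are my assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked weights (illustrative, not learned) for XOR with
# 2 hidden sigmoid units and 1 sigmoid output unit.
W_hidden = np.array([[ 20.0,  20.0],    # unit h1 ~ OR(x1, x2)
                     [-20.0, -20.0]])   # unit h2 ~ NAND(x1, x2)
b_hidden = np.array([-10.0, 30.0])
W_out = np.array([20.0, 20.0])          # output ~ AND(h1, h2) = XOR(x1, x2)
b_out = -30.0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = sigmoid(W_hidden @ np.array(x) + b_hidden)
    y = sigmoid(W_out @ h + b_out)
    print(x, "->", round(float(y)))      # prints the XOR truth table
```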
Hidden Layer Representations
Hidden layers capture the characteristics of training data to learn the target function.
The training samples only constrain the number of input neurons and output neurons.
The representations formed in the hidden layers and hidden neurons are not explicitly specified by the human designer.
Hence, the NN has the capability to adjust its weights so as to discover internal representations that solve
the given problem with the minimum possible error E.
This ability of multilayer NNs to automatically discover useful hidden-layer representations
is a key feature of ANN learning (a small sketch follows).
The greater the number of hidden layers/neurons, the more complex the problems that can be
represented by the NN.
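A minimal sketch of this idea, assuming the classic 8x3x8 identity-mapping task (my example, not part of the slides): a network with 8 inputs, 3 sigmoid hidden units and 8 outputs is trained to reproduce its one-hot input, and the hidden activations are then inspected to see the compact code the hidden layer has discovered. The learning rate, momentum and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# 8 one-hot training patterns; the target output equals the input.
X = np.eye(8)
T = X.copy()

# 8 -> 3 -> 8 network with small random initial weights (last column = bias).
W1 = rng.uniform(-0.1, 0.1, (3, 9))
W2 = rng.uniform(-0.1, 0.1, (8, 4))

eta, alpha = 0.3, 0.9                  # learning rate and momentum (assumed values)
dW1_prev, dW2_prev = 0.0, 0.0

for _ in range(5000):
    for x, t in zip(X, T):
        xb = np.append(x, 1.0)                    # add bias input
        h = sigmoid(W1 @ xb)
        hb = np.append(h, 1.0)
        o = sigmoid(W2 @ hb)
        # Backpropagation for squared error with sigmoid units.
        delta_o = o * (1 - o) * (t - o)
        delta_h = h * (1 - h) * (W2[:, :3].T @ delta_o)
        dW2 = eta * np.outer(delta_o, hb) + alpha * dW2_prev
        dW1 = eta * np.outer(delta_h, xb) + alpha * dW1_prev
        W2 += dW2; W1 += dW1
        dW2_prev, dW1_prev = dW2, dW1

# Inspect the learned hidden representation (roughly a 3-bit code per input).
for x in X:
    h = sigmoid(W1 @ np.append(x, 1.0))
    print(x.argmax(), np.round(h, 1))
```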
Convergence and Local Minima
BPN (the Backpropagation Network) uses gradient descent to search through the hypothesis space.
The objective is to move through the hypothesis space in the direction that reduces the error
E.
Because the error surface for multilayer NNs may contain multiple local minima, the gradient
descent search may get stuck at one of these local minima.
As a result, BPN is not guaranteed to converge to the global minimum; it may converge to a local
minimum instead.
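A toy one-dimensional illustration of this point (the error function below is an assumed example, not a real network error surface): gradient descent started from different points converges to different minima, and only one of them is the global minimum.

```python
# Toy 1-D "error surface" with two minima: a local one near w = +0.96
# and the global one near w = -1.04 (illustrative function, not a real NN error).
def E(w):
    return w**4 - 2*w**2 + 0.3*w

def dE(w):
    return 4*w**3 - 4*w + 0.3

def gradient_descent(w, eta=0.01, steps=2000):
    for _ in range(steps):
        w -= eta * dE(w)          # step downhill along the gradient
    return w

for w0 in (-2.0, 2.0):
    w = gradient_descent(w0)
    print(f"start {w0:+.1f} -> converged to w = {w:+.3f}, E = {E(w):.3f}")
```

Starting from -2.0 the search reaches the global minimum; starting from +2.0 it gets stuck in the shallower local minimum.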
Convergence and Local Minima . . .
In spite of this disadvantage, BPN remains a highly popular model, even for NNs with a large number
of weights.
A large number of weights corresponds to a very high-dimensional error surface;
a local minimum with respect to one weight may not be a local minimum with respect to the other weights;
hence the extra dimensions provide escape routes, and BPN may not get stuck at a local minimum.
If the initial weights are near zero:
during early iterations, the sigmoid function behaves as a smooth, nearly linear function;
as iterations pass, the weights tend to increase in magnitude in order to reduce the error E;
this is when the NN starts representing complex/nonlinear functions;
this is also the region that may contain more local minima, where BPN may get stuck at a
local minimum;
but it may be hoped that by this time BPN has already come close enough to the global
minimum;
so it is acceptable even if BPN gets stuck in a local minimum that is close to the global minimum.
Convergence and Local Minima . . .
Regarding gradient descent over complex error surfaces, the following heuristics may be
attempted to alleviate the problem of local minima:
1) Add a momentum term to the weight-update rule –
Δw_ji(n) = η δ_j x_ji + α Δw_ji(n−1), where α (0 ≤ α < 1) is the momentum constant and the
second term is the momentum term (a short sketch follows the two effects listed below).
Momentum has two effects on the gradient descent:
It keeps the descent in the same direction through the iterations and
It keeps the descent going through local minima and flat regions.
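A minimal sketch of the momentum update written as code; the function name, the quadratic toy error and the values of η and α below are illustrative assumptions.

```python
import numpy as np

def update_with_momentum(W, grad_E, prev_delta, eta=0.05, alpha=0.9):
    """One gradient-descent step with a momentum term:
    delta_W(n) = -eta * dE/dW + alpha * delta_W(n-1)."""
    delta = -eta * grad_E + alpha * prev_delta
    return W + delta, delta

# Illustrative use on a toy quadratic error E(W) = 0.5 * ||W||^2 (gradient = W).
W = np.array([1.0, -2.0])
prev = np.zeros_like(W)
for step in range(200):
    W, prev = update_with_momentum(W, grad_E=W, prev_delta=prev)
print(W)   # W approaches the minimum at the origin
```

Because each step reuses a fraction α of the previous step, the search keeps moving in the same direction and can roll through small dips and flat regions.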
Convergence and Local Minima . . .
2) Use Stochastic Gradient Descent instead of Standard Gradient Descent
Stochastic gradient descent descends a slightly different (approximate) error surface for each
training example, and these surfaces have different local minima. Hence, it can be hoped that BPN
will not get stuck in one of these local minima.
3) Train multiple NNs with different initial weights.
Different initial weights lead to different starting points on the error surface, and hence possibly to different local minima;
the NN with the best performance can be selected as the final model.
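A sketch of this heuristic using scikit-learn's MLPClassifier as a stand-in network; the dataset, network size and number of restarts are arbitrary choices for illustration. The same data is used to train several networks that differ only in their random initial weights, and the one with the best validation accuracy is kept.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for a real training set.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

best_net, best_score = None, -1.0
for seed in range(5):                        # 5 different weight initializations
    net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000, random_state=seed)
    net.fit(X_train, y_train)
    score = net.score(X_val, y_val)          # validation accuracy
    if score > best_score:
        best_net, best_score = net, score

print("best validation accuracy:", best_score)
```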
Generalization, Overfitting
Generalization is the capability of the model to perform well on unseen data.
Overfitting – why does overfitting occur at later iterations of the learning process?
Through the iterations, weights tend to increase their values to reduce the error E.
Larger weight values increase model complexity.
This leads to overfitting.
Solutions:
Weight Decay (see the sketch below)
Validation Data
K-Fold Cross Validation
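A minimal sketch of weight decay, the first of the solutions listed above: an L2 penalty λ‖W‖² is added to the error E, which shrinks every weight a little on each update and counteracts the growth of weights that drives overfitting. The function name and constants are illustrative assumptions.

```python
import numpy as np

def update_with_weight_decay(W, grad_E, eta=0.05, lam=0.001):
    """Gradient step on E(W) + lam * ||W||^2 (L2 weight decay).

    The extra 2*lam*W term pulls every weight towards zero on each
    iteration, discouraging the large weights associated with overfitting."""
    return W - eta * (grad_E + 2.0 * lam * W)
```

The other two solutions (holding out validation data and k-fold cross validation) monitor error on data not used for training and stop or select models accordingly.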
Case Study: ALVINN
Autonomous Land Vehicle In a Neural Network - 1989
ALVINN is a neural network designed to steer an
autonomous vehicle driving at normal speeds on
public highways.
A forward-pointed camera is mounted on the
vehicle.
The camera takes images of resolution 120 x 128
pixels.
Currently ALVINN takes images from a camera and
a laser range finder as input and produces as
output the direction the vehicle should travel in
order to follow the road.
ALVINN is trained for 5 minutes to observe and
learn from human driving.
It has then been tested successfully for
autonomous driving of 90 miles at speeds of up
to 70 miles per hour on public highways (driving
in the left lane of the highway, with other
vehicles present).
Case Study: ALVINN . . .
ALVINN is a 2-layer Backpropagation NN
with
960 input neurons, 4 hidden neurons
and 30 output neurons.
Here the individual units are interconnected
in layers that form a directed acyclic graph.
It is a feedforward network.
It is a fully connected network.
The output layer is a linear representation of
the direction the vehicle should travel in order
to keep the vehicle on the road.
Case Study: ALVINN . . .
The 120 x 128 image taken by camera is
converted into a coarse-resolution image of 30
x 32.
Each coarse resolution pixel intensity is
obtained by selecting the intensity of a single
pixel at random from the appropriate region
within the high-resolution image.
This 30 x 32 coarse-resolution image is
used as input to the network.
This method significantly reduces the
computation required to produce the coarse-
resolution image from the available high-
resolution image.
This efficiency is especially important when
the network must be used to process many
images per second while autonomously driving
the vehicle.
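A minimal sketch of the random-pixel coarse encoding just described, assuming each 30 x 32 output pixel is drawn from the corresponding 4 x 4 block of the 120 x 128 image (120/30 = 128/32 = 4); the random array stands in for a real camera frame.

```python
import numpy as np

rng = np.random.default_rng(0)
high_res = rng.integers(0, 256, size=(120, 128))         # stand-in camera image

# One randomly chosen pixel per 4x4 block gives the 30x32 coarse image.
coarse = np.empty((30, 32), dtype=high_res.dtype)
for i in range(30):
    for j in range(32):
        di, dj = rng.integers(0, 4, size=2)               # random offset inside the block
        coarse[i, j] = high_res[4 * i + di, 4 * j + dj]

print(coarse.shape)   # (30, 32)
```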
The output from each output unit corresponds
to a particular steering direction, and the output
values of these units determine which steering
direction is recommended most strongly.
Case Study: ALVINN . . .
The large matrix of black and white boxes
depicts the weights from the 30 x 32 pixel
inputs into the hidden unit. Here, a white
box indicates a positive weight, a black box
a negative weight, and the size of the box
indicates the weight magnitude.
The smaller rectangular diagram directly
above the large matrix shows the weights
from this hidden unit to each of the 30
output units.
Case Study: Face Recognition
Application of a Backpropagation NN to learn the direction in which a person is facing in an image – left, right, straight,
up.
The learning task here involves classifying camera images of faces of various people in various
poses.
Case Study: Face Recognition . . . .
Data Collection
Images of 20 different people were collected, including approximately 32 images per person, varying:
• the person's expression (happy, sad, angry, neutral),
• the direction in which they were looking (left, right, straight ahead, up),
• whether or not they were wearing sunglasses,
• the background behind the person,
• the clothing worn by the person,
• the position of the person's face within the image.
In total, 624 greyscale images were collected, each with a resolution of 120 x 128, with each
image pixel described by a greyscale intensity value between 0 (black) and 255 (white).
Case Study: Face Recognition . . . .
Input Encoding
The 120 x 128 image is encoded into a coarse-resolution image of 30 x 32 pixels.
Each coarse resolution pixel intensity is calculated as the mean of the corresponding high-resolution
pixel intensities.
This 30 x 32 coarse-resolution image is used as input to the network.
Data Scaling - The pixel intensity values ranging from 0 to 255 were linearly scaled to range from
0 to 1 so that network inputs would have values in the same interval as the hidden unit and output
unit activations.
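A minimal sketch of this input encoding, assuming each coarse pixel corresponds to a 4 x 4 block of the 120 x 128 image (120/30 = 128/32 = 4): the block mean is taken and the result is linearly scaled from 0..255 to 0..1.

```python
import numpy as np

def encode_image(img_120x128):
    """Coarse 30x32 encoding: mean of each 4x4 block, scaled to [0, 1]."""
    blocks = img_120x128.reshape(30, 4, 32, 4)   # split rows/columns into 4x4 blocks
    coarse = blocks.mean(axis=(1, 3))            # mean intensity of each block
    return coarse / 255.0                        # linear scaling 0..255 -> 0..1

x = encode_image(np.random.randint(0, 256, size=(120, 128)))
print(x.shape, x.min(), x.max())                 # (30, 32), values in [0, 1]
```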
Output Encoding
1-of-n output encoding is used (one output neuron per class).
Each output neuron produces a real-valued number between 0.1 and 0.9.
The NN's prediction is the class of the output neuron with the highest value.
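A minimal sketch of this 1-of-n encoding and decoding; the class order in DIRECTIONS is an assumption, and the 0.1/0.9 target values follow the description above.

```python
import numpy as np

DIRECTIONS = ["left", "right", "straight", "up"]   # assumed class order

def encode_target(direction):
    """1-of-n target using 0.1/0.9 rather than 0/1 (sigmoid outputs never reach 0 or 1)."""
    t = np.full(len(DIRECTIONS), 0.1)
    t[DIRECTIONS.index(direction)] = 0.9
    return t

def decode_output(outputs):
    """The prediction is the class of the output unit with the highest value."""
    return DIRECTIONS[int(np.argmax(outputs))]

print(encode_target("up"))                     # [0.1 0.1 0.1 0.9]
print(decode_output([0.2, 0.7, 0.15, 0.1]))    # right
```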
Case Study: Face Recognition . . . .
Network Graph Structure
The Backpropagation network is an acyclic directed graph of sigmoid units.
It is a feedforward network.
It is a fully connected network.
A 2-layer NN with 960 input neurons and 4 output neurons (2899 weights for the 3-hidden-neuron network).
Experimentation was done with
• 3 hidden neurons – produced a model with an accuracy of 90% (less training time),
• up to 30 hidden neurons – produced models with accuracies of 91%–92% (more training
time).
Using 260 training images, the training time on a Sun Sparc5 workstation was approximately
• 5 minutes for the 3 hidden unit network,
• 1 hour for the 30 hidden unit network.
Case Study: Face Recognition . . . .
In these learning experiments the
• Learning rate η was set to 0.3,
• Momentum α was set to 0.3.
Lower values for both parameters produced roughly equivalent generalization accuracy, but longer
training times.
If these values are set too high, training fails to converge to a network with acceptable error over
the training set.
Full gradient descent was used in all these experiments (in contrast to the stochastic
approximation to gradient descent).
Input unit weights were initialized to zero.
Network weights in the output units were initialized to small random values.
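A minimal sketch, under the settings above, of one epoch of full (batch) gradient descent with momentum for a 960-3-4 sigmoid network: input-side weights start at zero, output-side weights at small random values, and the gradient is accumulated over all training examples before a single weight update. The dummy data and the exact initialization range are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_in, n_hidden, n_out = 960, 3, 4
eta, alpha = 0.3, 0.3                                   # learning rate and momentum

# Input-side weights zero, output-side weights small random (last column = bias).
W1 = np.zeros((n_hidden, n_in + 1))
W2 = rng.uniform(-0.05, 0.05, (n_out, n_hidden + 1))
dW1_prev, dW2_prev = np.zeros_like(W1), np.zeros_like(W2)

def full_gradient_epoch(X, T, W1, W2, dW1_prev, dW2_prev):
    """One epoch of full (batch) gradient descent with momentum."""
    gW1, gW2 = np.zeros_like(W1), np.zeros_like(W2)
    for x, t in zip(X, T):
        xb = np.append(x, 1.0)
        h = sigmoid(W1 @ xb)
        hb = np.append(h, 1.0)
        o = sigmoid(W2 @ hb)
        delta_o = o * (1 - o) * (t - o)
        delta_h = h * (1 - h) * (W2[:, :-1].T @ delta_o)
        gW2 += np.outer(delta_o, hb)                    # accumulate over all examples
        gW1 += np.outer(delta_h, xb)
    dW1 = eta * gW1 + alpha * dW1_prev
    dW2 = eta * gW2 + alpha * dW2_prev
    return W1 + dW1, W2 + dW2, dW1, dW2

# Dummy data standing in for encoded 30x32 images and 1-of-n targets.
X = rng.random((10, n_in))
T = np.tile([0.1, 0.1, 0.1, 0.9], (10, 1))
W1, W2, dW1_prev, dW2_prev = full_gradient_epoch(X, T, W1, W2, dW1_prev, dW2_prev)
```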
Case Study: Face Recognition . . . .
Number of training iterations was selected by partitioning the available data into a training set
and a separate validation set.
Gradient descent was used to minimize the error over the training set, and after every 50 gradient
descent steps the performance of the network was evaluated over the validation set.
The final selected network was the one with the highest accuracy over the validation set.
The final reported accuracy was measured over a separate test dataset.
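A minimal sketch of this stopping criterion; net, gradient_step and accuracy are hypothetical stand-ins for the actual network and training routines, not functions defined in the slides.

```python
import copy

def train_with_validation(net, gradient_step, accuracy, train_data, val_data,
                          total_steps=5000, check_every=50):
    """Sketch: run gradient descent, evaluate on the validation set every
    `check_every` steps, and keep the network snapshot with the best
    validation accuracy. `net`, `gradient_step` and `accuracy` are
    hypothetical stand-ins for the real network and routines."""
    best_net, best_acc = copy.deepcopy(net), accuracy(net, val_data)
    for step in range(1, total_steps + 1):
        gradient_step(net, train_data)                  # one gradient-descent step
        if step % check_every == 0:
            acc = accuracy(net, val_data)
            if acc > best_acc:
                best_net, best_acc = copy.deepcopy(net), acc
    return best_net, best_acc    # final accuracy is then measured on a held-out test set
```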
Case Study: Face Recognition . . . .
(Figure: visualization of the learned hidden-unit weights, with large positive and large negative weights highlighted.)