Training and Testing
Neural Networks
Contents
Introduction
When Is the Neural Network Trained?
Controlling the Training Process with Learning Parameters
Iterative Development Process
Avoiding Over-training
Automating the Process
Introduction (1)
Training a neural network to perform a specific processing function
Key questions:
1) What are the learning parameters?
2) How are they used to control the training process?
3) How is the training data managed during the training process?
Development Process
1) Data preparation
2) Select the neural network model & architecture
3) Train the neural network
Once trained, the neural network performs its function in the application
Introduction (2)
Learning parameters for neural networks
A disciplined approach to iterative neural network development
Introduction (3)
When Is the Neural Network Trained?
When is the network trained? It depends on:
the type of neural network
the function being performed
classification
clustering data
building a model or time-series forecasting
the acceptance criteria
Once the network meets the specified accuracy, the connection weights are locked
and cannot be adjusted
When Is the Neural Network Trained?
Classification (1)
Measure of success: the percentage of correct classifications
incorrect classifications
no classification: unknown, undecided (output falls below a threshold limit)
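The three outcomes above can be tallied with a confidence threshold. A minimal sketch, assuming the network emits one score per category (the function name and threshold value are illustrative, not from the source):

```python
# Hypothetical sketch: bucket each network output as correct, incorrect,
# or "unknown" depending on a confidence threshold.
def tally(outputs, labels, threshold=0.5):
    correct = incorrect = unknown = 0
    for scores, label in zip(outputs, labels):
        best = max(range(len(scores)), key=lambda i: scores[i])
        if scores[best] < threshold:
            unknown += 1          # below the threshold limit: no classification
        elif best == label:
            correct += 1
        else:
            incorrect += 1
    n = len(labels)
    return correct / n, incorrect / n, unknown / n
```

Raising the threshold trades incorrect classifications for "undecided" ones.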
When Is the Neural Network Trained?
Classification (2)
Confusion matrix: lists the possible output categories and the corresponding
percentage of correct and incorrect classifications

             Category A   Category B   Category C
Category A      0.60         0.25         0.15
Category B      0.25         0.45         0.30
Category C      0.15         0.30         0.55
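A confusion matrix like the one above can be built by counting (actual, predicted) pairs and normalizing each row. A small sketch (the helper name is an assumption, not from the source):

```python
# Sketch: build a row-normalized confusion matrix from actual and
# predicted category indices; row i holds the fraction of class-i
# patterns assigned to each predicted category.
def confusion_matrix(actual, predicted, n_categories):
    counts = [[0] * n_categories for _ in range(n_categories)]
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    # divide each row by its total so entries are percentages of that class
    return [[c / max(sum(row), 1) for c in row] for row in counts]
```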
When Is the Neural Network Trained?
Clustering (1)
The output of a clustering network is open to analysis by the user
The training regimen is determined by:
the number of times the data is presented to the neural network
how fast the learning rate and the neighborhood decay
Adaptive resonance theory (ART) network training:
vigilance training parameter
learning rate
When Is the Neural Network Trained?
Clustering (2)
Locking the ART network weights
disadvantage: gives up online learning
ART networks are sensitive to the order of the training data
When Is the Neural Network Trained?
Modeling (1)
Modeling or regression problems
Usual error measure:
RMS (root mean square) error
Measures of prediction accuracy:
average error
MSE (mean squared error)
RMS (root mean square) error
Expected behavior:
the RMS error falls and then stabilizes at a minimum
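The two error measures are straightforward to compute over a set of predictions. A minimal sketch:

```python
import math

# Mean squared error: average of the squared prediction errors.
def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# RMS error: square root of the MSE, so it is in the same units
# as the target variable.
def rms(predictions, targets):
    return math.sqrt(mse(predictions, targets))
```

During training, the RMS error is typically plotted per epoch and watched for the fall-then-stabilize behavior described above.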
When Is the Neural Network Trained?
Modeling (2)
When Is the Neural Network Trained?
Modeling (3)
The network falls into a local minimum
the prediction error doesn't fall, oscillating up and down
reset (randomize) the weights and start again
revisit the training parameters
the data representation
the model architecture
When Is the Neural Network Trained?
Forecasting (1)
A prediction problem
RMS (root mean square) error
Visualize: a time plot of the actual and desired network output
Time-series forecasting:
long-term trend
influenced by cyclical factors, etc.
random component
variability and uncertainty
Neural networks are excellent tools for modeling complex time-series problems
Recurrent neural networks: nonlinear dynamic systems
with no self-feedback loops & no hidden neurons
When Is the Neural Network Trained?
Forecasting (2)
Controlling the Training Process with
Learning Parameters (1)
The learning parameters depend on:
the type of learning algorithm
the type of neural network
Controlling the Training Process with
Learning Parameters (2)
- Supervised training
[Diagram: a training pattern is fed to the neural network; the network's prediction is compared with the desired output]
1) How the error is computed
2) How big a step we take when adjusting the
connection weights
Controlling the Training Process with
Learning Parameters (3)
- Supervised training
Learning rate
the magnitude of the change made to the connection weights
for the current training pattern and desired output
too large a rate: giant oscillations in the error
too small a rate: slow to learn the major features of the problem
and to generalize to new patterns
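The learning rate's role is easiest to see in a single weight update. A minimal sketch of one gradient-descent step (the function name is illustrative):

```python
# One gradient-descent weight update: the learning rate scales the
# step taken in the direction that reduces the error.
def gd_step(weight, gradient, learning_rate):
    return weight - learning_rate * gradient
```

With a large learning rate the same gradient produces a much bigger jump, which is where the oscillations come from.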
Controlling the Training Process with
Learning Parameters (4)
- Supervised training
Momentum
filters out high-frequency changes in the weight values
useful when the error is oscillating around a set of values
Error tolerance
how close is close enough
e.g. a tolerance of 0.1: for a sigmoid output to come within 0.1 of 0 or 1,
the net input must be quite large
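Momentum blends the previous update into the current one, which damps the high-frequency oscillations mentioned above. A minimal sketch (the parameter values shown are illustrative):

```python
# Momentum update for a single weight: the new step is a mix of the
# previous step ("velocity") and the current gradient step, so rapid
# sign flips in the gradient partially cancel out.
def momentum_step(weight, gradient, velocity, lr=0.1, momentum=0.9):
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity
```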
Controlling the Training Process with
Learning Parameters (5)
-Unsupervised learning
Parameters (clustering, segmentation)
the selection of the number of outputs sets the
granularity of the segmentation
learning parameters (once the architecture is set):
neighborhood parameter: Kohonen maps
vigilance parameter: ART
Controlling the Training Process with
Learning Parameters (6)
-Unsupervised learning
Neighborhood
the area around the winning unit, where the non-winning
units will also be modified
starts at roughly half the size of the maximum dimension of the
output layer
2 methods for controlling it:
square neighborhood function, with a linear decrease in the learning rate
Gaussian-shaped neighborhood, with exponential decay of the learning rate
the number-of-epochs parameter is important in keeping the
locality of the topographic maps
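The Gaussian-neighborhood variant can be sketched for a 1-D Kohonen map. This is a simplified, hypothetical implementation (scalar weights, illustrative decay constants), not the exact scheme from the source:

```python
import math

# One training step of a 1-D Kohonen map with a Gaussian neighborhood
# and exponential decay of the learning rate over epochs.
def som_step(weights, x, epoch, lr0=0.5, sigma0=2.0, decay=0.1):
    # winning unit: the one whose weight is closest to the input
    winner = min(range(len(weights)), key=lambda i: abs(weights[i] - x))
    lr = lr0 * math.exp(-decay * epoch)                 # learning-rate decay
    sigma = max(sigma0 * math.exp(-decay * epoch), 1e-6)  # shrinking neighborhood
    for i in range(len(weights)):
        # Gaussian neighborhood: units near the winner move most
        h = math.exp(-((i - winner) ** 2) / (2 * sigma ** 2))
        weights[i] += lr * h * (x - weights[i])
    return weights
```

As epochs pass, both the learning rate and the neighborhood width shrink, which is what preserves the locality of the topographic map.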
Controlling the Training Process with
Learning Parameters (7)
-Unsupervised learning
Vigilance
control how picky the neural network is going to be
when clustering data
discriminating when evaluating the differences between
two patterns
close-enough
Too-high Vigilance
use up all of the output units
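The vigilance test can be sketched as a match check between a binary input pattern and a cluster prototype. This is a simplified ART-style check (the match criterion shown is an assumption for illustration, not the full ART algorithm):

```python
# Simplified ART-style vigilance test: a pattern may join an existing
# cluster only if its match with the prototype meets the vigilance level.
def passes_vigilance(pattern, prototype, vigilance):
    # match score: fraction of the pattern's active bits also active
    # in the prototype
    overlap = sum(p & q for p, q in zip(pattern, prototype))
    active = sum(pattern) or 1
    return overlap / active >= vigilance
```

With a very high vigilance almost nothing matches, so each pattern claims its own output unit, which is how the network "uses up" all of them.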
Iterative Development Process (1)
Network convergence issues
the error falls quickly and then stays flat: the network has reached the global
minimum (or is stuck)
the error oscillates up and down: the network is trapped in a local minimum
add some random noise to the weights, or
reset the network weights and start all over again
revisit the design decisions
Iterative Development Process (2)
Iterative Development Process (3)
Model selection
an inappropriate neural network model for the function to be
performed
add hidden units or another layer of hidden units
a strong temporal or time element embedded in the data:
recurrent back propagation
radial basis function network
Data representation
a key parameter is not scaled or coded appropriately
a key parameter is missing from the training data
comes with experience
Iterative Development Process (4)
Model architecture
does not converge: the function is too complex for the architecture
a few additional hidden units: good
adding many more?
the network just memorizes the training patterns
keeping the hidden layers as thin as possible gets the
best results
Avoiding Over-training
Over-training
the network memorizes the training patterns
and cannot generalize to new patterns
remedy: switch between the training and testing data as training proceeds
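Switching between training and testing data leads naturally to early stopping: halt when the error on the held-out test data stops improving. A sketch of that stopping rule (the loop and the patience parameter are illustrative assumptions):

```python
# Early stopping on a per-epoch test-error curve: remember the best
# test error seen so far and stop after it fails to improve for
# `patience` consecutive epochs.
def train_with_early_stopping(test_errors, patience=2):
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(test_errors):
        if err < best:
            best, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break          # test error no longer improving: over-training
    return best_epoch, best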
Automating the Process
Automate the selection of the appropriate number
of hidden layers and hidden units
pruning out nodes and connections
genetic algorithms
the opposite approach to pruning: grow the network
the use of intelligent agents
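The pruning idea can be sketched with the simplest common criterion, magnitude-based pruning (the threshold criterion here is an assumption for illustration; the source does not specify one):

```python
# Magnitude-based pruning sketch: connections whose absolute weight
# falls below a threshold contribute little and are zeroed out.
def prune(weights, threshold=0.1):
    return [w if abs(w) >= threshold else 0.0 for w in weights]
```

Genetic algorithms work in the other direction, evolving candidate architectures and keeping the ones that train well.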