2012-1158. Backpropagation NN
Introduction to Backpropagation
- In 1969, a method for learning in multi-layer networks, Backpropagation, was invented by Bryson and Ho.
- The Backpropagation algorithm is a sensible approach for dividing up the contribution of each weight to the observed error.
1) The activation of the hidden unit is used instead of the activation of the input value.
2) The rule contains a term for the gradient of the activation function.
1. Compute the error term for the output units using the observed error.
2. Starting from the output layer, repeat:
   - propagate the error term back to the previous layer, and
   - update the weights between the two layers
   until the earliest hidden layer is reached.
Backpropagation Algorithm
Initialize weights (typically random!)
Keep doing epochs:
  For each example e in the training set, do a
  Forward pass to compute
    O = neural-net-output(network, e)
    miss = (T - O) at each output unit
  Backward pass to compute the deltas to the weights,
  from the hidden layer to the output layer
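The forward pass above can be sketched in Python. This is a minimal illustration: the sigmoid activation, the 2-2-1 layer sizes, and the concrete weight values are all assumed for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights_ih, weights_ho, inputs):
    # Hidden activations: one sigmoid unit per row of input-to-hidden weights.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in weights_ih]
    # Output activations are computed from the hidden activations.
    outputs = [sigmoid(sum(w * h for w, h in zip(ws, hidden)))
               for ws in weights_ho]
    return hidden, outputs

# One training example e: inputs and teacher-supplied targets T (assumed values).
inputs, targets = [1.0, 0.0], [1.0]
weights_ih = [[0.5, -0.5], [0.3, 0.8]]   # 2 hidden units, 2 inputs each
weights_ho = [[0.4, -0.2]]               # 1 output unit fed by 2 hidden units
hidden, outputs = forward(weights_ih, weights_ho, inputs)
miss = [t - o for t, o in zip(targets, outputs)]   # (T - O) at each output unit
```

The `miss` values computed here are exactly what the backward pass propagates back through the network.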
Gradient Descent
- Think of the N weights as a point in an N-dimensional space.
- Add a dimension for the observed error.
- Try to minimize your position on the error surface.
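As a minimal illustration of this picture, here is gradient descent on a hand-picked two-weight error surface, E(w1, w2) = (w1 - 1)^2 + (w2 + 2)^2. The surface and step size are assumptions for the sketch; a real network's error surface comes from its training data.

```python
# Error surface E(w1, w2) = (w1 - 1)**2 + (w2 + 2)**2, minimized at (1, -2).
w = [0.0, 0.0]      # current point in weight space
alpha = 0.1         # step size (learning rate)

for _ in range(200):
    # Gradient of E; step downhill against it.
    grad = [2 * (w[0] - 1), 2 * (w[1] + 2)]
    w = [wi - alpha * gi for wi, gi in zip(w, grad)]
```

After 200 steps the point has moved essentially to the minimum of the surface, (1, -2).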
Error Surface
Compute deltas
Gradient
We need a derivative! The activation function must be continuous, differentiable, non-decreasing, and easy to compute.
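The sigmoid g(x) = 1 / (1 + e^-x) meets all four requirements, and its derivative has the convenient closed form g'(x) = g(x) * (1 - g(x)). A quick numerical sketch checking this identity against a finite-difference approximation:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def g_prime(x):
    # Sigmoid derivative expressed in terms of the sigmoid itself.
    gx = g(x)
    return gx * (1.0 - gx)

# Compare against a central finite-difference approximation.
x, h = 0.7, 1e-6
approx = (g(x + h) - g(x - h)) / (2 * h)
assert abs(g_prime(x) - approx) < 1e-8
```

Because g'(x) is computed from g(x) alone, the backward pass can reuse the activations already computed in the forward pass.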
Updating hidden-to-output
We have teacher-supplied desired values:

  delta w_ji = alpha * a_j * (T_i - O_i) * g'(in_i)
             = alpha * a_j * (T_i - O_i) * O_i * (1 - O_i)

Here alpha is the learning rate, a_j is the activation of hidden unit j, (T_i - O_i) is the miss at output unit i, and g'(in_i) is the derivative of the activation function. The first line is the general formula with the derivative; the second uses the sigmoid, whose derivative is g'(in_i) = O_i * (1 - O_i).
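The hidden-to-output update can be sketched directly. The learning rate alpha, the hidden activations a_j, the targets T_i, and the outputs O_i are all assumed values here.

```python
alpha = 0.5                # learning rate (assumed)
a = [0.62, 0.57]           # hidden-unit activations a_j (assumed)
T = [1.0]                  # teacher-supplied desired values T_i
O = [0.53]                 # observed outputs O_i
w = [[0.4, -0.2]]          # w[i][j] connects hidden unit j to output unit i

for i in range(len(O)):
    # (T_i - O_i) * g'(in_i), with the sigmoid derivative O_i * (1 - O_i).
    delta_i = (T[i] - O[i]) * O[i] * (1 - O[i])
    for j in range(len(a)):
        w[i][j] += alpha * a[j] * delta_i
```

Since the miss (T_i - O_i) is positive here and both activations a_j are positive, both weights increase, pushing the output toward the target.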
Compute deltas
How do we pick the learning rate alpha?
1. Tuning set, or
2. Cross-validation, or
3. Make it small, for slow, conservative learning
The general Backpropagation Algorithm for updating weights in a multilayer network:

Repeat until convergent:
  Compute the error at the output units
  Update the weights into the output layer
  Compute the error in each hidden layer
  Update the weights into each hidden layer
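Putting the pieces together, here is a minimal sketch of this loop in Python, trained on logical OR. The network shape (2 hidden units, one output), the learning rate, the fixed epoch count standing in for the convergence test, and the constant bias input are all assumptions for the sketch.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(examples, n_hidden=2, alpha=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(examples[0][0])
    # Random initial weights: a hidden layer and a single output unit.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
           for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, t in examples:
            # Forward pass.
            h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_h]
            o = sigmoid(sum(w * hi for w, hi in zip(w_o, h)))
            # Backward pass: error term at the output unit...
            d_o = (t - o) * o * (1 - o)
            # ...propagated back to the hidden layer.
            d_h = [h[j] * (1 - h[j]) * w_o[j] * d_o for j in range(n_hidden)]
            # Update hidden-to-output weights, then input-to-hidden weights.
            for j in range(n_hidden):
                w_o[j] += alpha * h[j] * d_o
                for k in range(n_in):
                    w_h[j][k] += alpha * x[k] * d_h[j]
    return w_h, w_o

def predict(w_h, w_o, x):
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_h]
    return sigmoid(sum(w * hi for w, hi in zip(w_o, h)))

# Usage: learn logical OR. The first input of each example is a constant
# bias of 1.0 (an assumption; the slides do not show bias units).
examples = [([1.0, 0.0, 0.0], 0.0), ([1.0, 0.0, 1.0], 1.0),
            ([1.0, 1.0, 0.0], 1.0), ([1.0, 1.0, 1.0], 1.0)]
w_h, w_o = train(examples)
```

After training, `predict` returns values below 0.5 for the false case and above 0.5 for the true cases of OR.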
NETalk (1987)
- Maps character strings into phonemes so they can be pronounced by a computer.
- A neural network was trained to pronounce each letter in a word in a sentence, given the three letters before and the three letters after it in a window.
- The output was the correct phoneme.

Results:
- 95% accuracy on the training data
- 78% accuracy on the test set
Other Examples
- Neurogammon (Tesauro & Sejnowski, 1989): a backgammon learning program
- Speech Recognition (Waibel, 1989)
- Character Recognition (LeCun et al., 1989)
- Face Recognition (Mitchell)
ALVINN
- Steers a van down the road
- 2-layer feedforward network, using backpropagation for learning
- Learns on-the-fly
Interactive networks:
- Activation propagates forwards & backwards; propagation continues until equilibrium is reached in the network.
- We do not discuss these networks here: training is complex, and they may be unstable.
I/O pairs:
given the inputs, what should the output be? [typical learning problem]
- Developed in 1993.
- Performs driving with Neural Networks.
- An intelligent VLSI image sensor for road following.

[Network diagram: input units → hidden layer → output units]
ex2. Nestor:
- Uses the Nestor Learning System (NLS).
- Several multi-layered feed-forward neural networks.
- Intel has made such a chip, the NE1000, in VLSI technology.
- Real-time operation without the need for special computers or custom hardware DSP platforms.
- Software exists.
- How do we create neural networks in a repeatable and predictable manner?
- There is an absence of quality-assurance methods for neural network models and implementations.
How do I verify my implementation?
Repeatability
The relevant information must be captured in the problem specification and in the combinations of parameters.
Implementation
Remain unfashionable
Summary
- A neural network is a computational model that simulates some properties of the human brain.
- The connections and the nature of the units determine the behavior of a neural network.
- Perceptrons are feed-forward networks that can only represent linearly separable functions.
Summary
- Given enough hidden units, any continuous function can be approximated by multi-layer feed-forward networks.
- Backpropagation learning works on multi-layer feed-forward networks.
- Neural networks are widely used in developing artificial learning systems.
References
- Russell, S. and P. Norvig (1995). Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall.
- Sarle, W.S., ed. (1997). Neural Network FAQ, part 1 of 7: Introduction. Periodic posting to the Usenet newsgroup comp.ai.neural-nets. URL: ftp://ftp.sas.com/pub/neural/FAQ.html
Sources
Eric Wong
Eddy Li
Martin Ho
Kitty Wong