Artificial Neural Network

➲ 1 Brief Introduction
➲ 2 Backpropagation Algorithm
➲ 3 A Simple Illustration
Chapter 1 Brief Introduction
➲ 1.1 History
➲ 1.2 Review of Decision Trees
 The learning process aims to reduce the error, which can be understood as the difference between the target values and the output values produced by the learning structure.
 The ID3 algorithm can be applied only to discrete values.
 Artificial Neural Networks (ANNs) can represent arbitrary functions.
➲ 1.3 Basic Structure
 This example of ANN learning is provided by Pomerleau's (1993) system ALVINN, which uses a learned ANN to steer an autonomous vehicle driving at normal speeds. The input to the ANN is a 30x32 grid of pixel intensities obtained from a forward-facing camera mounted on the vehicle. The output is the direction in which the vehicle is steered.
 As can be seen, four units receive inputs directly from all of the 30x32 pixels from the camera in the vehicle. These are called "hidden" units because their outputs are available only to the units that follow them in the network, not as part of the global network output.
➲ 1.4 Ability
 Instances are represented by many attribute-value pairs. The target function to be learned is defined over instances that can be described by a vector of predefined features, such as the pixel values in the ALVINN example.
 The training examples may contain errors. As the following sections show, ANN learning methods are quite robust to noise in the training data.
 Long training times are acceptable. Compared to decision tree learning, network training algorithms require longer training times, depending on factors such as the number of weights in the network.
Chapter 2
Backpropagation Algorithm
➲ 2.1 Sigmoid
 Like the perceptron, the sigmoid unit first computes a linear combination of its inputs:
net = Σ_{i=0..n} w_i·x_i (1)
 Then the sigmoid unit computes its output with the following function:
o = σ(net) = 1 / (1 + e^(−net)) (2)
 Equation 2 is often referred to as the squashing function, since it maps a very large input domain onto a small range of outputs, the interval (0, 1).

 This sigmoid function has the useful property that its derivative is easily expressed in terms of its output:
dσ(net)/dnet = σ(net)·(1 − σ(net))
In the following description of backpropagation we will see that the algorithm makes use of this derivative, as sketched below.
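 The following is a minimal Python sketch of such a sigmoid unit; the weight and input values are made-up illustration values, not from the text:

import numpy as np

def sigmoid(net):
    # Equation 2, the squashing function: maps any real net input
    # into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-net))

def sigmoid_derivative(o):
    # The derivative expressed in terms of the unit's output:
    # dsigma(net)/dnet = sigma(net) * (1 - sigma(net)).
    return o * (1.0 - o)

w = np.array([0.5, -0.3, 0.8])   # made-up weights
x = np.array([1.0, 2.0, -1.0])   # made-up inputs
net = np.dot(w, x)               # equation 1: linear combination
o = sigmoid(net)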
➲ 2.2 Function
 The sigmoid is only one unit in the network; now we take a look at the whole function that the neural network computes. Referring to figure 2.2, if we consider an example (x, t), where x is the input attribute and t the target attribute, then for a network with one hidden layer each output is
o_k = σ( Σ_h w_kh · σ( Σ_i w_hi·x_i ) )
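 A minimal Python sketch of this whole network function, assuming hypothetical weight matrices W1 (input-to-hidden) and W2 (hidden-to-output):

import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def network_output(x, W1, W2):
    # Whole network function for one hidden layer:
    # o_k = sigma(sum_h w_kh * sigma(sum_i w_hi * x_i))
    h = sigmoid(W1 @ x)     # hidden-unit outputs
    return sigmoid(W2 @ h)  # output-unit values o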
➲ 2.3 Squared Error
 As mentioned above, the whole learning process aims to reduce the error. But how can one describe the error? Generally the squared-error function is used:
E(~w) = 1/2 · Σ_{d∈D} Σ_{k∈outputs} (t_kd − o_kd)² (3)
 Notice: function 3 sums the error over all of the network's output units after the whole set of training examples has been processed.
 Then the weight vector can be updated by:
~w ← ~w + Δ~w, where Δ~w = −η·∇E(~w) and η is the learning rate
 where ∇E(~w) is the gradient of E:
∇E(~w) = [ ∂E/∂w_0, ∂E/∂w_1, …, ∂E/∂w_n ]
 so each weight w_k can be updated by:
Δw_k = −η · ∂E/∂w_k

 But in practice, because function 3 sums the error over the whole set of training data, the algorithm needs more time to compute with it and can easily be trapped in a local minimum. Therefore one constructs a new function, named the stochastic squared error:
E_d(~w) = 1/2 · Σ_{k∈outputs} (t_k − o_k)²
 As can be seen, this function computes the error for only one example d. The gradient of E_d(~w) is easily derived; for a single sigmoid unit it is
∂E_d/∂w_i = −(t − o)·o·(1 − o)·x_i
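 A minimal sketch of one stochastic update for a single sigmoid unit, using this gradient (eta and the learning-rate value are assumptions):

import numpy as np

def sgd_step(w, x, t, eta=0.05):
    # One stochastic update on a single example (x, t).
    o = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    # Gradient of E_d: dE_d/dw_i = -(t - o) * o * (1 - o) * x_i
    grad = -(t - o) * o * (1.0 - o) * x
    return w - eta * grad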
➲ 2.4 Backpropagation Algorithm
 The learning problem faced by backpropagation is to search a large hypothesis space defined by all possible weight values for all the units in the network. In outline, the algorithm repeats the following for each training example: propagate the input forward through the network, propagate the errors backward, and update each weight by Δw_ji = η·δ_j·x_ji, where
δ_k = o_k·(1 − o_k)·(t_k − o_k) for each output unit k
δ_h = o_h·(1 − o_h)·Σ_k w_kh·δ_k for each hidden unit h
 Notice: the error term for hidden unit h is calculated by summing the error terms δ_k for each output unit influenced by unit h, weighting each of the δ_k's by w_kh, the weight from hidden unit h to output unit k. This weight characterizes the degree to which hidden unit h is "responsible for" the error in output unit k.
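 A minimal Python sketch of one backpropagation step for a one-hidden-layer network, implementing these update rules (biases omitted for brevity; eta = 0.3 is an assumed learning rate):

import numpy as np

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

def backprop_step(x, t, W1, W2, eta=0.3):
    # Forward pass
    h = sigmoid(W1 @ x)   # hidden-unit outputs o_h
    o = sigmoid(W2 @ h)   # output-unit values o_k
    # Error term for each output unit k:
    #   delta_k = o_k * (1 - o_k) * (t_k - o_k)
    delta_o = o * (1.0 - o) * (t - o)
    # Error term for each hidden unit h, summing the delta_k's
    # weighted by w_kh (weight from hidden unit h to output unit k):
    #   delta_h = o_h * (1 - o_h) * sum_k w_kh * delta_k
    delta_h = h * (1.0 - h) * (W2.T @ delta_o)
    # Weight updates: delta_w_ji = eta * delta_j * x_ji
    W2 += eta * np.outer(delta_o, h)
    W1 += eta * np.outer(delta_h, x)
    return W1, W2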
Chapter 3 A Simple Illustration
Now we work through an example to gain a more intuitive understanding. How does an ANN learn the simplest function, the identity function id? We construct the network shown in the figure. There are eight network input units, which are connected to three hidden units, which are in turn connected to eight output units. Because of this structure, the three hidden units are forced to represent the eight input values in some way that captures their relevant features, so that this hidden-layer representation can be used by the output units to compute the correct target values.
 This 8 x 3 x 8 network was trained to learn the identity function. After 5000 training iterations, the three hidden unit values encode the eight distinct inputs using the encoding shown in the table. Notice that if the encoded values are rounded to zero or one, the result is the standard binary encoding for eight distinct values.
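 A minimal sketch reproducing this experiment (bias weights are omitted for brevity, so the exact hidden encodings may differ from the run described in the text):

import numpy as np

rng = np.random.default_rng(0)

# 8x3x8 network: the eight inputs are the eight one-hot vectors,
# and the targets equal the inputs (the identity function).
X = np.eye(8)
W1 = rng.uniform(-0.1, 0.1, size=(3, 8))
W2 = rng.uniform(-0.1, 0.1, size=(8, 3))

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

for epoch in range(5000):
    for x in X:
        h = sigmoid(W1 @ x)
        o = sigmoid(W2 @ h)
        delta_o = o * (1 - o) * (x - o)          # target t = x
        delta_h = h * (1 - h) * (W2.T @ delta_o)
        W2 += 0.3 * np.outer(delta_o, h)
        W1 += 0.3 * np.outer(delta_h, x)

# After training, the three hidden-unit values for each input
# approximate a distinct encoding of the eight inputs.
print(np.round(sigmoid(W1 @ X.T).T, 2))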
