
Hence, the application of (1) to this case yields

$$\Delta w(n) = -2 k_1 \eta \sum_{t=1}^{n} \alpha^{n-t} \left( w(t) - w_0 \right)$$

In this case, the partial derivative $\partial E(t)/\partial w(t)$ has the same algebraic sign on consecutive iterations. Hence, with $0 < \alpha < 1$ the exponentially weighted adjustment $\Delta w(n)$ to the weight $w$ at time $n$ grows in magnitude. That is, the weight $w$ is adjusted by a large amount. The inclusion of the momentum constant $\alpha$ in the algorithm for computing the optimum weight $w^* = w_0$ tends to accelerate the downhill descent toward this optimum point.
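The accelerating effect of the momentum term can be checked numerically. The sketch below is a minimal illustration, not part of the text: it assumes the quadratic cost $E(w) = k_1 (w - w_0)^2$ implied by the gradient $2k_1(w(t) - w_0)$ in the sum above, uses arbitrary values for $k_1$, $w_0$, $\eta$, and $\alpha$, and applies the equivalent recursive form $\Delta w(n) = \alpha\,\Delta w(n-1) - \eta\,\partial E(n)/\partial w(n)$, which unrolls to the exponentially weighted sum when $\Delta w(0) = 0$.

```python
# Minimal sketch (illustrative constants): gradient descent on the
# quadratic cost E(w) = k1 * (w - w0)**2, with and without momentum.

def descend(alpha, eta=0.005, k1=1.0, w0=2.0, w=0.0, steps=50):
    """Run `steps` updates of the momentum rule
    delta_w(n) = alpha * delta_w(n-1) - eta * dE/dw, starting from w."""
    delta_w = 0.0
    for _ in range(steps):
        grad = 2.0 * k1 * (w - w0)            # dE/dw for the quadratic cost
        delta_w = alpha * delta_w - eta * grad
        w += delta_w
    return w

# Successive gradients share the same algebraic sign along this descent,
# so the exponentially weighted adjustment grows in magnitude and the
# weight approaches the optimum w* = w0 = 2 faster than without momentum.
print("alpha = 0.0:", descend(alpha=0.0))   # plain gradient descent
print("alpha = 0.9:", descend(alpha=0.9))   # momentum-accelerated
```

With these constants the momentum run lands much closer to $w_0$ after the same number of steps, illustrating the accelerated descent.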
Problem 4.5

Consider Fig. 4.14 of the text, which has an input layer, two hidden layers, and a single output neuron. We note the following:

$$y_1^{(3)} = F(A_1^{(3)}) = F(\mathbf{w}, \mathbf{x})$$

Hence, the derivative of $F(A_1^{(3)})$ with respect to the synaptic weight $w_{1k}^{(3)}$ connecting neuron $k$ in the second hidden layer to the single output neuron is

$$\frac{\partial F(A_1^{(3)})}{\partial w_{1k}^{(3)}} = \frac{\partial F(A_1^{(3)})}{\partial y_1^{(3)}} \, \frac{\partial y_1^{(3)}}{\partial v_1^{(3)}} \, \frac{\partial v_1^{(3)}}{\partial w_{1k}^{(3)}} \qquad (1)$$

where $v_1^{(3)}$ is the activation potential of the output neuron. Next, we note that

$$\frac{\partial F(A_1^{(3)})}{\partial y_1^{(3)}} = 1, \qquad y_1^{(3)} = \varphi(v_1^{(3)}), \qquad v_1^{(3)} = \sum_k w_{1k}^{(3)} y_k^{(2)} \qquad (2)$$

where $y_k^{(2)}$ is the output of neuron $k$ in layer 2. We may thus proceed further and write

$$\frac{\partial y_1^{(3)}}{\partial v_1^{(3)}} = \varphi'(v_1^{(3)}) = \varphi'(A_1^{(3)}) \qquad (3)$$
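Equations (1) through (3) can be verified numerically. The sketch below is an illustrative check, not part of the text: the layer sizes, the random weights, and the choice $\varphi = \tanh$ are arbitrary assumptions, and `forward` is a hypothetical helper computing $F(\mathbf{w}, \mathbf{x})$ for the two-hidden-layer network of Fig. 4.14. It compares the analytic derivative $\varphi'(v_1^{(3)})\, y_k^{(2)}$, obtained by combining (1) through (3), against a central finite difference.

```python
import numpy as np

# Illustrative check of (1)-(3): for the output-layer weight w3[k],
# dF/dw3[k] = phi'(v3) * y2[k]. Layer sizes, random weights, and the
# choice phi = tanh are arbitrary assumptions for this sketch.

rng = np.random.default_rng(0)
phi = np.tanh
dphi = lambda v: 1.0 - np.tanh(v) ** 2      # derivative of tanh

x = rng.standard_normal(4)                  # input vector
W1 = rng.standard_normal((5, 4))            # input -> hidden layer 1
W2 = rng.standard_normal((3, 5))            # hidden layer 1 -> hidden layer 2
w3 = rng.standard_normal(3)                 # hidden layer 2 -> single output

def forward(w3):
    """Compute the network output y1^(3) = F(w, x) for given output weights."""
    y1 = phi(W1 @ x)                        # first hidden layer output
    y2 = phi(W2 @ y1)                       # second hidden layer output
    v3 = w3 @ y2                            # activation potential of output neuron
    return phi(v3), v3, y2

y3, v3, y2 = forward(w3)
k = 1                                       # index of the weight under test

analytic = dphi(v3) * y2[k]                 # combine equations (1)-(3)

eps = 1e-6                                  # central finite difference
w_plus, w_minus = w3.copy(), w3.copy()
w_plus[k] += eps
w_minus[k] -= eps
numeric = (forward(w_plus)[0] - forward(w_minus)[0]) / (2 * eps)

print(analytic, numeric)                    # the two values should agree
```

The two printed values should agree to several decimal places, confirming the chain-rule decomposition of the derivative with respect to an output-layer weight.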