Unit 3
• This has the effect of moving the weight w_ij directly towards the current input, as sketched in the code below.
• Note that the only weights that are being updated are those of the winning unit.
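As a rough sketch (NumPy; the learning rate eta and the array names are illustrative assumptions, not from the slides), the competitive-learning update moves only the winning unit's weight vector towards the input:

    import numpy as np

    def competitive_update(weights, x, eta=0.1):
        # weights: (n_neurons, n_inputs) weight matrix, x: input vector
        activations = weights @ x           # one activation per neuron
        winner = np.argmax(activations)     # index of the winning unit
        # Delta w_ij = eta * (x_j - w_ij), applied to the winning unit only
        weights[winner] += eta * (x - weights[winner])
        return winner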
Normalization
• Suppose an input vector with values (0.2, 0.3, -0.1) is presented and it happens to
be an exact match for one of the neurons.
• Then, the activation of that neuron will be:
0.2*0.2 + 0.3*0.3 + (-0.1)*(-0.1) = 0.14
• However, consider a neuron with large weights (10, 9, 8). Its activation will be:
0.2*10 + 0.3*9 + (-0.1)*8 = 3.9
• This will be the winner. However, this second neuron and all the other neurons are
not perfect matches, so their activations should all be lower.
• Thus, we can only compare activations if we know that the weight vectors of all the
neurons have the same length. We ensure this by normalizing all the weight vectors.
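The following sketch (NumPy; the variable names are illustrative) reproduces the arithmetic above and shows that normalizing the weight vectors restores the correct winner:

    import numpy as np

    x = np.array([0.2, 0.3, -0.1])         # input vector
    w_match = np.array([0.2, 0.3, -0.1])   # exact match: activation 0.14
    w_large = np.array([10.0, 9.0, 8.0])   # poor match but large weights: activation 3.9

    print(x @ w_match, x @ w_large)        # 0.14 vs 3.9 -- the wrong neuron wins

    def normalize(w):
        # scale a weight vector to unit length
        return w / np.linalg.norm(w)

    print(x @ normalize(w_match), x @ normalize(w_large))
    # ~0.374 vs ~0.249 -- after normalization the exact match wins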
Using competitive learning for clustering
• Deciding which cluster each new datapoint belongs to is now an easy task.
• We present it to the trained network and observe which neuron (i.e. cluster
centre) is activated.
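A minimal sketch of this step (NumPy; the trained weight values are made-up illustrative data):

    import numpy as np

    # rows = weight vectors of the trained neurons, i.e. the learned cluster centres
    trained_weights = np.array([[0.0, 1.0],
                                [1.0, 0.0],
                                [0.7, 0.7]])

    def assign_cluster(x, weights):
        # the neuron with the highest activation is the cluster the point belongs to
        return int(np.argmax(weights @ x))

    print(assign_cluster(np.array([0.9, 0.1]), trained_weights))   # -> 1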
Vector Quantization
• Consider the example of Data Communication.
• We need to reduce the amount of data transmitted in order to keep the
transmission cost to a minimum.
• Instead of sending whole datapoints, we can encode them and send only their
indices in a codebook of prototype vectors.
• The codebook can be shared with the receiver so that the data can be decoded.
• The codebook will not contain every possible datapoint. Now, if we want to send
a datapoint which is not in the codebook, the index of the prototype vector which
is closest to it is sent. This is known as Vector Quantization. This same idea is used
in lossy compression.
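A minimal vector-quantization sketch (NumPy; the codebook entries are made-up values): each datapoint is encoded as the index of its nearest prototype, and the receiver decodes the index back into that prototype using the shared codebook.

    import numpy as np

    codebook = np.array([[0.0, 0.0],
                         [1.0, 0.0],
                         [0.0, 1.0],
                         [1.0, 1.0]])     # prototype vectors shared with the receiver

    def encode(x):
        # send only the index of the closest prototype (lossy)
        return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))

    def decode(index):
        # receiver looks the index up in the shared codebook
        return codebook[index]

    i = encode(np.array([0.9, 0.2]))
    print(i, decode(i))                   # 1 [1. 0.] -- close to, but not equal to, the original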
Vector Quantization (cont..)
• We need to accept that the received data will not look exactly the same as the
original data.
• In a Voronoi tessellation of space, the dots at the centre of each cell are the
prototype vectors and any datapoint that lies within a cell is represented by the
dot.
Vector Quantization (cont..)
• The question is: how do we choose the prototype vectors?
• We need to choose prototype vectors that are as close as possible to all of the
possible inputs that we might see.
• The self-organizing feature map is used to solve this problem.
Self-Organizing Feature Maps
• The SOM is a neural network in which the relative locations of the neurons in the
network matter. This property is known as feature mapping, whereby nearby
neurons correspond to similar input patterns.
• The neurons are arranged in a grid with connections between the neurons, rather
than in layers with connections only between the different layers.
• The SOM demonstrates relative ordering preservation, which is sometimes
known as topology preservation.
• The relative ordering of the inputs should be preserved by the ordering in the
neurons, so that neurons that are close together represent inputs that are close
together, while neurons that are far apart represent inputs that are far apart.
Self-Organizing Feature Maps (cont..)
• The winning neuron should pull other neurons that are close to it in the network
closer to itself in weight space, which means that we need positive connections.
• Likewise, neurons that are further away should represent different features, and
so should be a long way off in weight space, so the winning neuron ‘repels’ them,
by using negative connections to push them away.
• Neurons that are very far away in the network should already represent different
features, so we just ignore them.
• This is known as the ‘Mexican Hat’ form of lateral connections.
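One common way to write such a lateral-interaction profile is a difference of Gaussians; the widths and amplitude below are illustrative assumptions, not values from the slides:

    import numpy as np

    def mexican_hat(d, sigma_excite=1.0, sigma_inhibit=2.0):
        # lateral connection strength as a function of grid distance d from the winner:
        # positive close by (excitation), negative further out (inhibition),
        # close to zero for very distant neurons (effectively ignored)
        return np.exp(-d**2 / (2 * sigma_excite**2)) - 0.5 * np.exp(-d**2 / (2 * sigma_inhibit**2))

    for d in range(7):
        print(d, round(float(mexican_hat(d)), 3))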
Self-Organizing Feature Maps (cont..)
Neighbourhood Connections
• If we start our network off with random weights, then at the beginning of
learning, the network is unordered.
• As the weights are random, two nodes that are very close in weight space could
be on opposite sides of the map and vice versa.
• Therefore, at the beginning of the algorithm, the neighborhood size should be
large.
• Once the network has been learning for a while, the algorithm starts to fine-tune
the individual local regions of the network. At this stage, the neighborhood
should be small.
• These two phases of learning are also known as ordering and convergence.
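One standard way to realise these two phases is to shrink the neighbourhood radius over time, for example with an exponential schedule (the constants below are illustrative assumptions):

    import numpy as np

    def neighbourhood_radius(t, sigma0=5.0, tau=200.0):
        # large radius early (ordering), small radius later (convergence)
        return sigma0 * np.exp(-t / tau)

    def neighbourhood_weight(dist_to_winner, t):
        # Gaussian neighbourhood: how strongly a neuron at the given grid
        # distance from the winner is pulled towards the current input
        sigma = neighbourhood_radius(t)
        return np.exp(-dist_to_winner**2 / (2 * sigma**2))

    print(neighbourhood_weight(3, t=0))      # ~0.84: distant neurons still updated strongly
    print(neighbourhood_weight(3, t=1000))   # ~0.0:  only very close neighbours are updated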
Ordering
• Initially, similar input vectors excite neurons that are far apart, so that the
neighborhood (shown as a circle) needs to be large.
Convergence
• Later on during training the neighborhood can be smaller, because similar input
vectors excite neurons that are close together.
Self-Organization
• A particularly interesting aspect of feature mapping is that we get a global
ordering of the neurons in the network, despite the fact that the interactions are
all local, since neurons that are very far apart do not interact with each other.
• We thus get a global ordering of the space using only a set of local interactions.
This is known as self-organization.
• Consider a flock of birds flying in formation. The birds cannot possibly know
exactly where each other are, so how do they keep in formation?
• If each bird just tries to stay diagonally behind the bird to its right, and to fly at the
same speed, then they form perfect flocks, no matter how they start off and what
objects are placed in their way.
• So the global ordering of the whole flock can arise from the local interactions of
each bird looking to the one on its right (or left).
Network Dimensionality and Boundary Conditions
• The SOM algorithm is usually applied to a 2D rectangular array of neurons.
• There are cases where a line of neurons (1D) works better, or where three
dimensions are needed. It depends on the dimensionality of the inputs.
• We also need to consider the boundaries of the network. For example, if we are
arranging sounds from low pitch to high pitch, then the lowest and highest
pitches we can hear are obvious endpoints.
• However, it is not always the case that such boundary conditions are clearly
defined. In this case, we might want to remove the boundary conditions.
Circular boundary conditions
• Using circular boundary conditions in 1D turns a line into a circle.
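A small sketch of the difference (plain Python; the map size n is an arbitrary illustrative value):

    def line_distance(i, j):
        # distance between neurons i and j on a 1D line
        return abs(i - j)

    def ring_distance(i, j, n=10):
        # distance on a ring of n neurons: the line is closed into a circle,
        # so distances wrap around the boundary
        d = abs(i - j)
        return min(d, n - d)

    print(line_distance(0, 9))   # 9 -- the endpoints of the line are far apart
    print(ring_distance(0, 9))   # 1 -- on the circle they are neighbours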
End of Unit-3