Chap 1
Dendrites: Input
Cell body: Processor
Synapse: Link
Axon: Output
Once the input exceeds a critical level, the neuron discharges a spike: an electrical
pulse that travels from the cell body, down the axon, to the next neuron(s).
■ We can translate this functional understanding of neurons in
our brain into an artificial model that we can represent on a
computer.
■ In 1943, neurophysiologist Warren McCulloch and mathematician
Walter Pitts published a paper on how neurons might work. To
demonstrate how neurons in the human brain might function,
they developed a simple neural network using electrical
circuits.
■ Such a neuron takes in inputs, computes a weighted sum, and produces ‘0’ if
the sum is below a threshold and ‘1’ otherwise (a short code sketch follows the
diagram below).
Mapping from a Biological Neuron to an ANN
[Figure: inputs x0, x1, x2 arrive with weights w0, w1, w2; the processing unit ∑ computes the weighted sum w0x0 + w1x1 + w2x2 = y; y is the output]
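A minimal Python sketch of this unit (the AND-like weights and threshold are illustrative assumptions, not values from the slides):

```python
import numpy as np

def threshold_neuron(x, w, threshold=0.0):
    """McCulloch-Pitts-style unit: weighted sum, then hard threshold."""
    weighted_sum = np.dot(w, x)              # w0*x0 + w1*x1 + w2*x2
    return 1 if weighted_sum >= threshold else 0

# Illustrative example: with these weights and threshold, the unit acts like AND
x = np.array([1, 1])
w = np.array([0.5, 0.5])
print(threshold_neuron(x, w, threshold=1.0))  # -> 1
```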
Feed-Forward Neural Networks
■ Although single neurons are more powerful than linear
perceptrons, they’re not nearly expressive enough to
solve complicated learning problems.
■ The neurons in the human brain are organized in layers.
■ The human cerebral cortex (the structure responsible for
most of human intelligence) is made up of six layers.
■ Information flows from one layer to another until sensory
input is converted into conceptual understanding.
[Figure: a feed-forward neural network]
■ Hidden layers identify useful features automatically.
■ Connections only traverse from a lower layer to a
higher layer.
■ Feed-forward networks are the simplest type to analyze.
■ Hidden layers often have fewer neurons than the input
layer, forcing the network to learn compressed
representations of the input.
■ Selecting which neurons to connect to which
neurons in the next layer is an art that comes from
experience.
■ The inputs and outputs are vectorized
representations.
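As a concrete sketch of this layered structure (the layer sizes and the sigmoid nonlinearity are illustrative assumptions), a forward pass simply chains the layers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
layer_sizes = [784, 128, 64, 10]   # e.g., image-sized input, two hidden layers, 10 outputs
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

def forward(x, weights):
    """Information flows strictly from lower layers to higher layers."""
    for W in weights:
        x = sigmoid(W.T @ x)       # each layer feeds only the next one
    return x

x = rng.normal(size=784)           # vectorized input
print(forward(x, weights).shape)   # -> (10,) vectorized output
```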
Expressing a Neural Network as a Series of Vector and Matrix Operations
■ The input to the ith layer of the network is a vector x = [x1 x2 ... xn].
■ The vector produced by propagating the input through the neurons is y =
[y1 y2 ... ym].
■ The layer is described by a weight matrix W of size n × m and a bias vector b of size m.
■ The jth element of the ith column of W is the weight of the
connection pulling the jth element of the input into the ith output.
■ y = f(Wᵀx + b), where the transformation function f is applied to the
vector elementwise.
■ This reformulation will become all the more critical as we begin to
implement these networks in software.
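A minimal NumPy sketch of this reformulation (the layer sizes and the choice of sigmoid for f are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(x, W, b, f):
    """Compute y = f(W^T x + b); f is applied elementwise."""
    return f(W.T @ x + b)

n, m = 4, 3                        # n inputs, m neurons in the layer
rng = np.random.default_rng(0)
W = rng.normal(size=(n, m))        # weight matrix of size n x m
b = np.zeros(m)                    # bias vector of size m
x = rng.normal(size=n)             # input vector x = [x1 ... xn]

y = layer_forward(x, W, b, sigmoid)
print(y)                           # output vector y = [y1 ... ym]
```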
Linear Neurons and Their Limitations
■ Linear neurons are easy to compute
with, but they run into serious limitations.
■ A feed-forward neural network consisting
of only linear neurons can be expressed
as a network with no hidden layers, because a
composition of linear transformations is itself
linear (a numerical check follows this list).
■ In order to learn complex relationships,
we need to use neurons that employ
some sort of nonlinearity.
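A small numerical check of this collapse: composing two linear layers, W2ᵀ(W1ᵀx), equals the single linear layer (W1W2)ᵀx. The shapes below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
x  = rng.normal(size=5)
W1 = rng.normal(size=(5, 4))     # layer 1: 5 inputs -> 4 "hidden" linear units
W2 = rng.normal(size=(4, 3))     # layer 2: 4 hidden -> 3 outputs

two_layers = W2.T @ (W1.T @ x)   # forward pass through two linear layers
one_layer  = (W1 @ W2).T @ x     # the equivalent single linear layer

print(np.allclose(two_layers, one_layer))  # -> True
```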
Sigmoid, Tanh, and ReLU Neurons
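As a reference sketch, these three standard nonlinearities can be defined as follows:

```python
import numpy as np

def sigmoid(z):
    """Squashes inputs to (0, 1): f(z) = 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes inputs to (-1, 1): f(z) = tanh(z)."""
    return np.tanh(z)

def relu(z):
    """Rectified linear unit: f(z) = max(0, z)."""
    return np.maximum(0.0, z)

z = np.linspace(-3, 3, 7)
print(sigmoid(z), tanh(z), relu(z), sep="\n")
```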
Softmax Output Layers
■ Oftentimes, we want our output vector to be a probability
distribution over a set of mutually exclusive labels.
■ For example, if we build a neural network to recognize
handwritten digits, the output should be a probability
distribution over the 10 digit classes.
■ This is achieved by using a special output layer called a
softmax layer.
■ The output of a neuron in a softmax layer depends on the
outputs of all the other neurons in its layer.
■ Letting zi be the logit of the ith softmax neuron, we can
achieve this normalization by setting its output to:
yi = e^zi / ∑j e^zj
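A minimal sketch of this normalization (subtracting the max is a standard numerical-stability trick, not something from the slides):

```python
import numpy as np

def softmax(z):
    """Normalize logits z into a probability distribution over labels."""
    shifted = z - np.max(z)      # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs, probs.sum())        # probabilities summing to 1.0
```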