Title: Introduction to Neural Networks
1. Introduction to Neural Networks
- Resources
- Chapter 20, textbook
- Sections 20.1, 20.5
- Winston (1993), Chapter 22
- Feldman, J. A., & Ballard, D. H. (1982). Connectionist models and their properties. Cognitive Science, 6, 205-254.
- Fausett, L. (1994). Fundamentals of Neural Networks. Prentice Hall.
- Mehrotra, K., Mohan, C. K., & Ranka, S. (1997). Elements of Artificial Neural Networks. MIT Press.
2. Before We Start
- Learning with neural nets can in principle be supervised, unsupervised, and possibly even semi-supervised
- Lots of specific neural network algorithms
3. Outline
- Neuroanatomy metaphor
- Notation terms
- Computing with neural networks
- Architecture (network topology)
- Multilayer networks
- Function approximation
- Hidden layers
- Measuring performance
- Bias weights
4. Neuroanatomy Metaphor
- Neural networks (aka connectionist models, PDP, artificial neural networks, ANNs)
- Rough approximation to animal nervous systems
- See systems such as NEURON for modeling at greater levels of biological detail: http://neuron.duke.edu/
- Neuron components in brains
- Soma (cell body), dendritic tree
- Synapses
- Receive incoming signals from upstream neurons
- Connections on dendrites, cell body, axon, synapses
- Chemical (neurotransmitter) mechanisms
- Axon sends signal downstream
5. http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/N/Neurons.html
6. Example neurotransmitters: Epinephrine, Dopamine, Serotonin
http://homepage.psy.utexas.edu/HomePage/Class/Psy301/Pennebaker/
7. Neuron Firing Process
- 1) Synapses receive incoming signals (neurotransmitter-based communication), changing the electrical (ionic) potential of the cell body
- 2) When the potential of the cell body reaches some limit, the neuron fires
- An electrical signal (action potential) is sent down the axon
- 3) The axon propagates the signal downstream to other neurons
8. Cell body
http://homepage.psy.utexas.edu/HomePage/Class/Psy301/Pennebaker/
9. What is represented by a biological neuron?
- Cell body sums electrical potentials from incoming signals
- Serves as an accumulator function over time
- But as a rule many impulses must reach a neuron almost simultaneously to make it fire (p. 33, Brodal, 1992)
- Synapses have varying effects on cell potential
- Synaptic strength
10. ANNs (Artificial Neural Nets)
- Approximation of biological neural nets by ANNs
- Synaptic strength
- Approximate with connection weights (real numbers)
- Spiking of output
- Approximate with non-linear activation functions
- No direct model of the accumulator function over time
- Neural units
- Represent activation values (numbers)
- Represent inputs and outputs (numbers)
11. Graphical Notation Terms
- Circles
- Are neural units
- Metaphor for nerve cell body
- Arrows
- Represent synaptic connections from one unit to another
- These are called weights and are represented with a scalar numeric value (e.g., a real number)
12. Another Example: 8 units in each layer, fully connected network
[Figure: network diagram, inputs on the left, outputs on the right]
13. Units and Weights
- Units
- Sometimes notated with unit numbers
- Weights
- Sometimes given by symbols
- Sometimes given by numbers
- Always represent numbers
- May be integer or real valued
14. Computing with Neural Units - 1
- Need the specific connectivity of the ANN, and the numbers of input and output units
- Need specific weights
- Inputs are presented to input units
- E.g., input is (3, 1, 0, -2)
15. Computing with Neural Units - 2
- How do we generate output?
- First idea
- Summed weighted inputs
Input: (3, 1, 0, -2), weights: (0.3, -0.1, 2.1, -1.1)
Processing: 3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1) = 0.9 - 0.1 + 0 + 2.2 = 3
Output: 3
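A minimal Python sketch of this computation (the weights 0.3, -0.1, 2.1, -1.1 are the ones recovered from this example):

```python
# Summed weighted input: multiply each input by its weight, then add.
inputs = [3, 1, 0, -2]
weights = [0.3, -0.1, 2.1, -1.1]

summed = sum(w * a for w, a in zip(weights, inputs))
print(summed)  # 3.0
```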
16. Using Spreadsheets
17. Computing with Neural Units - 3
In general, for input (a1, a2, a3, a4) and weights (w1, w2, w3, w4), the summed weighted input is w1(a1) + w2(a2) + w3(a3) + w4(a4)
18. Activation Functions
- Usually, don't just use the weighted sum directly
- Apply some function to the weighted sum before use (e.g., as output)
- Call this the activation function
- A step function is one approximation of a biological neuron spiking
Step function: f(x) = 1 if x >= T, and 0 otherwise, where T is called the threshold
19. Step Function Example
Network output after passing through the step activation function
Input: (3, 1, 0, -2)
20. Step Function Example (2)
Network output after passing through the step activation function
Input: (0, 10, 0, 0)
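A minimal sketch of the step activation applied to summed weighted inputs (Python; reusing the weights from slide 15 and a threshold of 0 as illustrative assumptions, since the actual numbers for these two slides are in the images):

```python
# Step activation: output 1 when the summed weighted input reaches
# the threshold T, and 0 otherwise.
def step(x, threshold=0.0):
    return 1 if x >= threshold else 0

weights = [0.3, -0.1, 2.1, -1.1]  # assumed: weights from the slide 15 example

for inputs in [(3, 1, 0, -2), (0, 10, 0, 0)]:
    summed = sum(w * a for w, a in zip(weights, inputs))
    print(inputs, "->", step(summed))  # (3, 1, 0, -2) -> 1; (0, 10, 0, 0) -> 0
```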
22. Another Activation Function: The Sigmoidal
- Math used with some neural nets requires that the activation function be continuously differentiable
- The sigmoidal function is often used to approximate the step function
Sigmoidal: f(x) = 1 / (1 + e^(-s·x)), where s is the steepness parameter
23. Sigmoidal - 1
sigmoidal(0) = 0.5
24. Sigmoidal - 2
[Figure: sigmoidal plots showing s, the steepness parameter, plus an offset on the X-axis and an offset on the Y-axis]
25. [Figure: sigmoidal plots with offset on the X-axis and offset on the Y-axis]
26. Sigmoidal Example
Summed weighted input: 3
Network output?
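A minimal sketch of the sigmoidal (Python; steepness s = 1 assumed):

```python
import math

# Sigmoidal activation with steepness parameter s.
def sigmoidal(x, s=1.0):
    return 1.0 / (1.0 + math.exp(-s * x))

print(sigmoidal(0))  # 0.5, as on slide 23
print(sigmoidal(3))  # ~0.953, the output for a summed weighted input of 3
```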
28. Architecture Terms
- Feed forward
- When all of the arrows connecting unit to unit in a network move only from input to output
- Recurrent or feedback networks
- Arrows feed back into prior layers
- Hidden layer
- Middle layer of units
- Not the input layer and not the output layer
- Hidden units
- Units that are not directly connected to the input units, and not directly connected to the output units
- Perceptron
- A network with a single layer of weights
29. Another Example
- A two-weight-layer, feed-forward network
- Two inputs, one output, one hidden unit
Input: (3, 1)
What is the output?
30. Computing in Multilayer Networks
- Start at the leftmost layer
- Compute activations based on inputs
- Then work from left to right, using the computed activations as inputs to the next layer
- Example solution
- Activation of hidden unit
- f(0.5(3) + (-0.5)(1))
- = f(1.5 - 0.5)
- = f(1) = 0.731
- Output activation
- f(0.731(0.75))
- = f(0.548) = 0.634
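A minimal sketch of this left-to-right computation (Python; weights 0.5, -0.5, and 0.75 as in the example, with the sigmoidal as f):

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))  # sigmoidal activation

x1, x2 = 3, 1                        # network inputs
hidden = f(0.5 * x1 + (-0.5) * x2)   # f(1) = 0.731
output = f(0.75 * hidden)            # f(0.548) = 0.634
print(round(hidden, 3), round(output, 3))
```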
32. Notation
- Useful to represent weights and activations using vector and matrix notations
w_dest,src: weight (scalar) from unit src in the left layer to unit dest in the right layer
a_unit^(layer): activation value of unit unit in layer layer; layers increase in number from left to right
33. Notation for Thresholded, Weighted Sum
34. Generalizing
in_k = w_k,1(a_1) + w_k,2(a_2) + ... + w_k,n(a_n)
- n: number of units in the layer to the left
- w_k,i: weight (scalar) from unit i in the left layer to unit k in the right layer
- a_unit^(layer): activation value of unit unit in layer layer; layers increase in number from left to right
- Example shown: k = 1, l = 2
35. Can Also Use Vector Notation
Row vector of incoming weights for unit i
Column vector of activation values of the units connected (providing inputs) to unit i
Each vector has n values
(Assuming that the layer for unit i is specified in the context)
36. Example
From linear algebra: multiplying an n×r matrix with an r×m matrix produces an n×m matrix, C, where each element C_i,j of that n×m matrix is produced as the scalar product of row i of the left matrix and column j of the right matrix
37. Scalar Result: Summed Weighted Input
(1×4 row vector of weights) × (4×1 column vector of activations) = 1×1 matrix (scalar)
38. Computing New Activation Value
For the case we were considering: a_i = f(W_i · a)
In the general case: a_i = f(sum over j of w_i,j(a_j))
Where f(x) is the activation function, e.g., the sigmoidal function, and we are talking about unit i in some layer
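A minimal numpy sketch of the vector form (reusing the weights and inputs from the slide 15 example; the sigmoidal as f is an assumption):

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))  # sigmoidal, applied elementwise

W_i = np.array([[0.3, -0.1, 2.1, -1.1]])  # 1x4 row vector of incoming weights
a = np.array([[3], [1], [0], [-2]])       # 4x1 column vector of activations

summed = W_i @ a   # 1x1 matrix: the summed weighted input
a_i = f(summed)    # new activation value for unit i
print(summed.item(), round(a_i.item(), 3))  # 3.0 0.953
```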
39. Example
- Draw the corresponding ANN
- Compute the output value
40. Calculations
41. Function Approximation
- We can use ANNs to approximate functions
- g(X) = Y
- Input units (X): function inputs (vector)
- Output units (Y): function outputs (vector)
- Hidden layers/weights
- Computation of the specific function
42. Example
- Say we want to create a neural network that tests for equality of two bits, x1 and x2
- Equality with two bits can be viewed as a function
- g(x1, x2) = z
- When x1 and x2 are equal, z is 1; otherwise, z is 0
- The function we want to approximate is as follows
Inputs (x1, x2) -> goal output z
(0, 0) -> 1
(0, 1) -> 0
(1, 0) -> 0
(1, 1) -> 1
What architecture might be suitable for a neural network?
43. What about this one?
[Figure: possible network architecture with inputs x1, x2 connected by weights w1, w2 directly to one output unit]
- Can this architecture solve the problem?
- I.e., is there a pair of weights, w1 and w2, that for the inputs given would produce the required outputs?
44. We need
- f(0·w1 + 0·w2) = 1
- Best we can do is f(0) = 0.5
- f(0·w1 + 1·w2) = 0
- e.g., w2 = -10
- f(1·w1 + 0·w2) = 0
- e.g., w1 = -10
- f(1·w1 + 1·w2) = 1
- But f(-10 + -10) = f(-20) ≈ 0
45. w1 = -10, w2 = -10
Actual outputs:
(0, 0) -> f(0) = 0.5
(0, 1) -> f(-10) ≈ 0
(1, 0) -> f(-10) ≈ 0
(1, 1) -> f(-20) ≈ 0
Problem: we want the output for case 4 to be higher than the outputs for the previous two (case 2 and case 3), but it comes out lower
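A minimal sketch reproducing the four cases (Python; the sigmoidal as f is assumed):

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

w1, w2 = -10, -10
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    goal = 1 if x1 == x2 else 0
    actual = f(x1 * w1 + x2 * w2)
    print((x1, x2), "goal:", goal, "actual:", round(actual, 5))
# Case 4 (1, 1) comes out lowest of all, although its goal output is 1.
```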
46. Are we sunk?
- Have not tried other network architectures
- Recall: hidden units let us indirectly compute the output(s) on the basis of the inputs
- Can think of this as re-formulating the inputs so as to arrive at the outputs we want
- Question: what inputs would give us the outputs we want?
47. Modified Inputs and w5 = -10, w6 = 15
[Figure: table of modified inputs and the outputs they produce]
Now: can we create a network that uses the original inputs, and generates these modified inputs as outputs?
48. More specifically
- Need to create a new network
[Figure: network with inputs x1, x2 and outputs y1, y2]
Such that it produces, as outputs, the modified inputs that we want
49. [Figure: sub-network producing y1]
50. Summary: Use a Hidden Layer of Units
The hidden layer recodes the problem inputs, to make the problem easier or possible to solve. Important point: input representation is crucial!
51. Approximate Solution
[Figure: actual network results, network architecture, and weights]
http://www.cprince.com/courses/cs5541/lectures/neural-networks/equality-no-bias.xls
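A minimal sketch of a 2-2-1 network of this shape (Python). The output weights -10 and 15 are the w5, w6 from slide 47; the hidden-layer weights are illustrative assumptions, not the spreadsheet's values, chosen so that the equal cases score higher than the unequal ones:

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

# Assumed hidden-layer weights (each hidden unit sees both inputs).
h1_w = (5.0, 5.0)
h2_w = (0.1, 0.1)
out_w = (-10.0, 15.0)  # w5, w6 from the slides

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h1 = f(h1_w[0] * x1 + h1_w[1] * x2)   # recoded input 1
    h2 = f(h2_w[0] * x1 + h2_w[1] * x2)   # recoded input 2
    out = f(out_w[0] * h1 + out_w[1] * h2)
    print((x1, x2), round(out, 3))
# Equal inputs give ~0.924 and ~0.148; unequal inputs give ~0.113:
# a rough but categorical separation, with no bias weights.
```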
52. Quality Measures
- A given ANN may only approximate the desired function (e.g., equality for two bits)
- We need to measure the quality of the approximation
- I.e., how closely did the ANN approximate the desired function?
53. How well did this approximate the goal function?
- Categorically
- For inputs x1 = 0, x2 = 0 and x1 = 1, x2 = 1, the output of the network was always greater than for inputs x1 = 1, x2 = 0 and x1 = 0, x2 = 1
- Summed squared error
54. Compute the summed squared error for our example
55. Solution
Sum squared error = sum over input cases of (goal output - actual output)^2
Generally, lower values for sum squared error indicate better approximation; 0 is perfect. We also need to consider generalization -- later.
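A minimal sketch of the computation (Python; the actual outputs here are the ones from the assumed-weights sketch above, so the resulting number is illustrative rather than the slide's answer):

```python
# Summed squared error: sum of (goal - actual)^2 over the input cases.
goals = [1, 0, 0, 1]
actuals = [0.924, 0.113, 0.113, 0.148]  # outputs from the sketch above

sse = sum((g - a) ** 2 for g, a in zip(goals, actuals))
print(round(sse, 3))  # 0.757
```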
56. More Notation
- A row vector provides the weights for a single unit in the right layer
- A weight matrix can provide all the weights connecting the left layer to the right layer
- Let W be an n×r weight matrix
- Row vector i in the matrix connects the units in the left layer to unit i in the right layer
- r units in the layer to the left
- n units in the layer to the right
57. Notation
a: vector of activation values of the layer to the left; an r×1 column vector (same as before)
W·a: an n×1 column vector of summed weighted inputs for the right layer
f(W·a): an n×1 column vector of new activation values for the right layer. The activation function f is now taken as applying to the elements of a vector
58. Example: Matrix representation for one network
Updating hidden layer activation values: a_hidden = f(W1·a_input)
Updating output activation values: a_output = f(W2·a_hidden)
Draw the architecture for the connectionist model: units and arcs representing weights
59. Answer
- 2 input units
- 5 hidden layer units
- 3 output units
- Fully connected, feedforward network
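A minimal numpy sketch of this 2-5-3 feedforward pass (the weight values are random placeholders, since the slide's matrices are images not present in the transcript):

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))  # sigmoidal, applied elementwise

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 2))  # 5x2: 2 inputs into 5 hidden units
W2 = rng.normal(size=(3, 5))  # 3x5: 5 hidden units into 3 outputs

a_input = np.array([[3], [1]])   # 2x1 column vector of inputs
a_hidden = f(W1 @ a_input)       # 5x1 hidden activations
a_output = f(W2 @ a_hidden)      # 3x1 output activations
print(a_output)
```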
60. Bias Weights
- Used to provide a trainable threshold; provides an offset on the X-axis
The bias b is treated as another weight, but connected to a unit with a constant activation value
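A minimal sketch of the idea (Python; the constant activation value of 1 for the bias unit, and the bias value -2, are illustrative assumptions):

```python
import math

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

weights = [0.3, -0.1, 2.1, -1.1]  # weights from the earlier example
inputs = [3, 1, 0, -2]
b = -2.0  # bias: one more weight, attached to a unit that always outputs 1

# The bias term shifts the summed weighted input along the X-axis.
summed = sum(w * a for w, a in zip(weights, inputs)) + b * 1
print(round(f(summed), 3))  # f(3 - 2) = f(1) = 0.731
```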
61. Models of the human brain?
- Do these computer models of neurons model the human brain?