Transcript and Presenter's Notes

Title: Introducing Non-Linearities

1
Introducing Non-Linearities
  • Decision boundary: w0x0 + w1x1 + w2x2 = 0
  • This represents a linear decision boundary
  • Solving for x2 (with x0 = 1): x2 = -(w1/w2)x1 - (w0/w2)
  • How could we introduce non-linearities in the input layer so that the separation boundary is not a straight line (e.g., an elliptical boundary)?
  • Use the same training algorithm

2
Non-Linearities
  • Introduce non-linearities
  • The following equation represents an ellipse in the two-dimensional input vector space (see the feature-mapping sketch after this slide):
  • w0 + w1x1^2 + w2x1 + w3x1x2 + w4x2 + w5x2^2 = 0
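The expanded input layer can be sketched in a few lines of MATLAB. This is a minimal illustration, not part of the original slides: the feature map phi, the example weights w, and the test points are assumed values chosen so that the boundary is the unit circle x1^2 + x2^2 = 1.

    phi = @(x1, x2) [1, x1, x1.^2, x2, x2.^2, x1.*x2];  % nonlinear feature map (one possible ordering of the terms)
    w   = [-1 0 1 0 1 0];                               % illustrative weights: boundary x1^2 + x2^2 - 1 = 0
    sign(phi(0.2, 0.3) * w')                            % -1 : point inside the boundary
    sign(phi(1.5, 1.0) * w')                            %  1 : point outside the boundary

A single neuron that is linear in these six features therefore realizes a quadratic boundary in the original (x1, x2) plane, and the same training algorithm applies unchanged.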

3
Non-linear Neuron Architecture
[Diagram: inputs x0, x1, x1^2, x2, x2^2, x1x2 feed into a single summing/activation unit that produces the output y]
4
Non Linear Neuron - Exclusive OR
X = [  1  -1  -1   1   1   1
       1  -1   1   1   1  -1
       1   1  -1   1   1  -1
       1   1   1   1   1   1 ]'     Training vectors (columns x0, x1, x2, x1^2, x2^2, x1x2)
t = [ -1  1  1  -1 ]                Target values
alpha = .01                         Learning rate
5
Exclusive OR 3D
6
Exclusive OR 2D
7
Reading Assignment
  • Finish reading chapter 2 ( skip section 2.4.5 )
  • Quiz on Tuesday

8
Assignment 2 Due Thursday, January 10th
  • PART 1 of 2 Parts
  • Program the Delta Learning Rule in MATLAB (a minimal training-loop sketch follows this slide)
  • Use the following parameters (AND function):
  • X = [  1  -1  -1
           1  -1   1
           1   1  -1
           1   1   1 ]'     Training vectors (columns x0, x1, x2)
  • t = [ -1  1  1  1 ]     Target values
  • alpha = .01             Learning rate
  • Experiment with tolerance and learning rate. Does it find the correct weights every time?
  • Plot the final boundary
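A minimal sketch of how the delta-rule training loop could be written. The random initialization, stopping tolerance, and variable names are assumptions for illustration; the assignment text only fixes X, t, and alpha:

    X     = [ 1 -1 -1; 1 -1 1; 1 1 -1; 1 1 1 ]';   % one training vector per column: [x0; x1; x2]
    t     = [ -1 1 1 1 ];                           % target values
    alpha = .01;                                    % learning rate
    tol   = 1e-4;                                   % assumed stopping tolerance
    w     = rand(1, 3) - 0.5;                       % small random initial weights

    maxChange = Inf;
    while maxChange > tol
        maxChange = 0;
        for p = 1:size(X, 2)
            x  = X(:, p)';                          % current training vector (row)
            y  = x * w';                            % identity activation (original delta rule)
            dw = alpha * (t(p) - y) * x;            % delta-rule weight update
            w  = w + dw;
            maxChange = max(maxChange, max(abs(dw)));
        end
    end
    w                                               % learned weights; boundary is w(1) + w(2)*x1 + w(3)*x2 = 0

The learned weights can then be passed to a plotting script such as plotBoundary.m (next slide) to draw the boundary.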

9
Example of 2D plotting Script
% plotBoundary.m   Roger S. Gaborski, December 19, 2001
% Reads in weights and plots the 2D boundary
Wn = weights;
x1 = -2 : .5 : 2;
x2 = -1*(Wn(2)/Wn(3))*x1 - (Wn(1)/Wn(3));   % Wn indices larger than notes because
                                            % MATLAB matrices start at index 1 instead of zero
plot(x1, x2), axis([-2 2 -2 2])
grid
hold on
plot(1, 1, '*')       % marker characters were lost in the transcript; '*' assumed
plot(1, -1, '*')
plot(-1, 1, '*')
plot(-1, -1, 'o')
10
Example of AND Decision Boundary
11
Assignment 2 Due Thursday, January 10th
  • Part 2
  • Implement the Exclusive OR using nonlinearities
  • Create the 3D plot and the thresholded 2D plot shown in the previous slides

12
Assignment 2 Due Thursday, January 10th
  • Write up observations
  • Turn in hardcopy of MATLAB code
  • Email MATLAB scripts and directions to rsg_at_cs.rit.edu

13
Memory
  • Content Addressable
  • Distributed, robust, noise tolerant
  • Fast retrieval
  • Adaptive

14
Memory Model
[Diagram, learning stage: M input patterns are mapped through the memory model to M output patterns; two different input patterns map to the same output pattern]
15
Memory
  • If the input is noisy, distorted, or only partial information is available, the memory model will still respond with the correct output

16
Memory Model
[Diagram: a similar (distorted) input pattern presented to the memory model still retrieves the correct one of the M output patterns]
17
Memory Damage
[Diagram: the memory model with part of it damaged, presented with a similar input pattern, still produces one of the M output patterns]
18
Memory Damage
[Plot: recall accuracy (y-axis, 100 down to 0) versus amount of damage (x-axis)]
19
Pattern Association
  • Learning forms associations between patterns
  • A visual image associated with another visual image
  •   (recognize a person we have only seen in a photograph)
  • A visual image associated with a smell
  •   (beach scene → coconut smell (suntan oil))
  • Music: a few notes → artist → events from when the song was popular → where you lived, your job, your school

20
Pattern Association
  • Single-layer neural network
  • Stores associations
  • Retrieves information based on content rather than a computer memory address
  • Information is distributed in the weights
  •   → there is no specific storage address

21
Pattern Associations
  • How are association networks different from classification neural networks?
  • No thresholding into different classes
  • Output is usually a vector
  • Not always a single forward pass; sometimes an iterative operation is employed

22
Pattern Association
  • Each association is an input:output vector pair s:t
  • If s = t, autoassociative memory
  • If s ≠ t, heteroassociative memory
  • Not only learns the specific pairs used in training, but is also able to recall a stimulus that is similar, but NOT identical, to a training input

23
Heteroassociative Memory ( s ≠ t )
  • Each association is a pair of vectors ( s(p), t(p) ),  p = 1, 2, ..., P
  • Each vector s(p) is an n-tuple
  • Each vector t(p) is an m-tuple
  • Weights can be found using either the Hebb Rule or the Extended Delta Rule

24
Hebb Rule for Pattern Association
  • Use either binary or bipolar vectors
  • Training vector pairs s:t
  • Testing input vector x
  • Procedure:
  • Initialize all weights to 0:  wij = 0  ( i = 1, ..., n;  j = 1, ..., m )
  • For each training pair:
  •   Set activations of the input neurons to the current training input:  xi = si  ( i = 1, ..., n )
  •   Set activations of the output neurons to the current target output:  yj = tj  ( j = 1, ..., m )
  •   Update the weights:  wij(new) = wij(old) + xi yj

25
Hebb Rule using Outer Products
  • For an individual input / output pair:
  • s = ( s1, ..., si, ..., sn )   a 1×n vector
  • t = ( t1, ..., tj, ..., tm )   a 1×m vector
  • S = s'   (S is n×1 after the transpose)
  • T = t    (T is still 1×m, no transpose)
  • The outer product ST is the n×m weight matrix for this pair:

        [ s1 ]                        [ s1t1 ... s1tj ... s1tm ]
        [ .. ]   [ t1, ..., tm ]  =   [   ...              ... ]
        [ sn ]        (1×m)           [ snt1 ... sntj ... sntm ]
        (n×1)
26
Hebb Rule using Outer Products
  • For a set of associations s(p):t(p),  p = 1, ..., P
  • W = Σ (p = 1 to P)  s(p)' t(p)
  • Just sum the weight matrices for each pair (a MATLAB sketch follows)
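A minimal MATLAB sketch of this sum of outer products, using the four example pairs from the slides that follow; the variable names S, T, and W are illustrative:

    % Each row of S is one input vector s(p); each row of T is the paired target t(p)
    S = [ 1 0 0 0
          1 1 0 0
          0 0 0 1
          0 0 1 1 ];
    T = [ 1 0
          1 0
          0 1
          0 1 ];

    W = zeros(size(S, 2), size(T, 2));      % n x m weight matrix
    for p = 1:size(S, 1)
        W = W + S(p, :)' * T(p, :);         % add the outer product for pair p
    end
    W                                       % expected: [2 0; 1 0; 0 1; 0 2]; equivalently W = S' * T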
27
Heteroassociative Memory
[Diagram: input units X1 ... Xi ... Xn are fully connected by weights w11, ..., w1j, ..., w1m (in general wij) to output units Y1 ... Yj ... Ym; the output vector y is the pattern associated with the input vector x]
28
Hebb Learning for Heteroassociative Memory
  • Step 1: Initialize the weights
  • Step 2: For each input vector (a recall sketch follows this slide):
  •   Set the activations of the input layer equal to the current input vector
  •   Compute the net input to the output neurons:  y_inj = Σi xi wij
  •   Determine the activation of the output units:
  •     yj =  1  if y_inj > 0
  •           0  if y_inj = 0
  •          -1  if y_inj < 0
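A small sketch of this recall step, continuing with the weight matrix W from the earlier code block; the piecewise activation is written with logical comparisons:

    x    = [1 0 0 0];                 % test input (the first training vector)
    y_in = x * W;                     % net input to the output units, here (2, 0)
    y    = (y_in > 0) - (y_in < 0)    % 1 if y_in > 0, 0 if y_in == 0, -1 if y_in < 0; gives (1, 0)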

29
Example of Hebb Outer Product Rule for
Heteroassociative Memory - 1
Input row vectors s = ( s1, s2, s3, s4 ); output vectors t = ( t1, t2 ):

  s1 = ( 1, 0, 0, 0 )    t1 = ( 1, 0 )
  s2 = ( 1, 1, 0, 0 )    t2 = ( 1, 0 )
  s3 = ( 0, 0, 0, 1 )    t3 = ( 0, 1 )
  s4 = ( 0, 0, 1, 1 )    t4 = ( 0, 1 )

Outer products for the first two pairs:

  s1' t1 = [ 1 0        s2' t2 = [ 1 0
             0 0                   1 0
             0 0                   0 0
             0 0 ]                 0 0 ]


30
Example of Hebb Outer Product Rule for
Heteroassociative Memory - 2
Outer products for the remaining two pairs:

  s3' t3 = [ 0 0        s4' t4 = [ 0 0
             0 0                   0 0
             0 0                   0 1
             0 1 ]                 0 1 ]

The weight matrix that stores all four patterns is simply the sum of the four individual outer products:

  W = [ 2 0
        1 0
        0 1
        0 2 ]
31
Example of Hebb Outer Product Rule for
Heteroassociative Memory 3 TESTING
Test on the training data:

  W = [ 2 0
        1 0
        0 1
        0 2 ]

  x = ( 1, 0, 0, 0 )

  xW = ( 1, 0, 0, 0 ) W = ( 2, 0 ) = ( y_in1, y_in2 )

  f(2) = 1,  f(0) = 0,  so  y = ( 1, 0 )
32
Example of Hebb Outer Product Rule for
Heteroassociative Memory 4 TESTING
  ( 1, 0, 0, 0 ) W = ( 2, 0 ) → f → ( 1, 0 ),  where f is the activation function

Test on new data similar to the training data:

  ( 0, 1, 0, 0 ) W = ( 1, 0 ) → ( 1, 0 )

Is this a reasonable response?  Original data:

  s1 = ( 1, 0, 0, 0 )    t1 = ( 1, 0 )
  s2 = ( 1, 1, 0, 0 )    t2 = ( 1, 0 )
  s3 = ( 0, 0, 0, 1 )    t3 = ( 0, 1 )
  s4 = ( 0, 0, 1, 1 )    t4 = ( 0, 1 )
33
Example of Hebb Outer Product Rule for
Heteroassociative Memory 5 TESTING
Hamming distance is a measure of how different two digital words are: simply count the number of positions where the words differ.

Input codeword ( 0, 1, 0, 0 ):

  s1 = ( 1, 0, 0, 0 )    Hamming distance 2
  s2 = ( 1, 1, 0, 0 )    Hamming distance 1
  s3 = ( 0, 0, 0, 1 )    Hamming distance 2
  s4 = ( 0, 0, 1, 1 )    Hamming distance 3

The second codeword is closest to the input word, and its recall word is ( 1, 0 ). (A short distance check in MATLAB follows this slide.)
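One way these distances could be computed in MATLAB; the variable names are illustrative only:

    x = [0 1 0 0];                                % input codeword
    S = [ 1 0 0 0; 1 1 0 0; 0 0 0 1; 0 0 1 1 ];   % stored codewords s1..s4, one per row
    d = sum(S ~= repmat(x, size(S, 1), 1), 2)'    % Hamming distances to s1..s4: [2 1 2 3]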
34
Example of Hebb Outer Product Rule for
Heteroassociative Memory 6 TESTING
Consider ( 0, 1, 1, 0 ). This codeword is at least two positions away from every stored pattern (it is equally close to s2 and s4):

  s1 = ( 1, 0, 0, 0 )    Hamming distance 3
  s2 = ( 1, 1, 0, 0 )    Hamming distance 2
  s3 = ( 0, 0, 0, 1 )    Hamming distance 3
  s4 = ( 0, 0, 1, 1 )    Hamming distance 2

  ( 0, 1, 1, 0 ) W = ( 1, 1 ) → ( 1, 1 )

( 1, 1 ) is not a valid stored word, so recall FAILS.
35
Bipolar vs Binary
Bipolar data gives you the ability to represent unknown (noisy) data with a 0, and good data with +1 or -1.
36
How well does it work??
  • If the input vectors are orthogonal, the Hebb rule will produce the correct weights.
  • Testing on the training vectors will give the expected answer, scaled by the square of the norm of the input vector (the squared norm is the inner product of the vector with itself).
  • Details:
  • Recall that two vectors s(k) and s(p), k ≠ p, are orthogonal when their dot product is 0:

      s(k) · s(p) = Σ (i = 1 to n)  si(k) si(p) = 0
37
How well does it work 2 ??
  • Calculate the weight matrix:  W = Σp s(p)' t(p)
  • The net response to an input is:  y = xW
  • If the input vector is the kth training vector, x = s(k), then

      s(k) W = s(k) Σp s(p)' t(p) = ( s(k) · s(k) ) t(k) + Σ (p ≠ k) ( s(k) · s(p) ) t(p)

  • The first term is the target t(k) scaled by the square of the norm of s(k)
  • In the second term, if s(k) is orthogonal to every s(p) with p ≠ k, the sum is 0 (a numerical check follows this slide)
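A quick numerical check of this result with assumed data: four mutually orthogonal bipolar vectors are stored, and recalling one of them returns its target scaled by its squared norm (here 4):

    % Rows of S are mutually orthogonal bipolar training vectors; rows of T are their targets
    S = [ 1  1  1  1
          1 -1  1 -1
          1  1 -1 -1
          1 -1 -1  1 ];
    T = [ 1  1
          1 -1
         -1  1
         -1 -1 ];

    W = S' * T;          % Hebb outer-product weights, W = sum_p s(p)' t(p)
    y = S(2, :) * W      % recall with s(2): gives 4 * t(2) = [4 -4]; 4 = s(2) * s(2)', the squared norm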
38
Delta Rule for Pattern Association
  • Recall that Hebb learning is a one-pass learning process
  • The Delta Rule is an iterative learning process
  • It can be used for input patterns that are linearly independent but not orthogonal
  • It avoids the difficulty of cross-talk encountered with the Hebb Rule
  • The Delta Rule produces the least-squares solution when the input patterns are not linearly independent

39
Extended Delta Rule
  • The original Delta Rule used the identity function as the activation function of the output neuron, resulting in the update
  •   Δwij = α ( tj - yj ) xi
  • The Extended Delta Rule uses a differentiable activation function, resulting in
  •   ΔwIJ = α ( tJ - yJ ) xI f'( y_inJ )
  • This is the update for the weight between neurons I and J (a short sketch follows)
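A minimal sketch of one extended-delta-rule update, assuming a tanh output activation; the activation choice and the example values of x, the target, and w are illustrative, not specified in the slides:

    f      = @(a) tanh(a);             % differentiable activation function (assumed)
    fprime = @(a) 1 - tanh(a).^2;      % its derivative

    x     = [1 -1 1];                  % input vector (bias term included)
    tgt   = 1;                         % target value
    alpha = .01;                       % learning rate
    w     = [0.1 -0.2 0.05];           % current weights (illustrative)

    y_in = x * w';                     % net input to the output neuron
    y    = f(y_in);                    % output
    dw   = alpha * (tgt - y) * fprime(y_in) * x;   % Extended Delta Rule update
    w    = w + dw;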