Title: Extending SpikeProp
1. Extending SpikeProp
- Benjamin Schrauwen
- Jan Van Campenhout
- Ghent University
- Belgium
2. Overview
- Introduction
- SpikeProp
- Improvements
- Results
- Conclusions
3. Introduction
- Spiking neural networks are receiving increased attention
- Biologically more plausible
- Computationally stronger (W. Maass)
- Compact and fast hardware implementations are possible (analogue and digital)
- Have a temporal nature
- Main problem: supervised learning algorithms
4. SpikeProp
- Introduced by S. Bohte et al. in 2000
- An error-backpropagation learning algorithm
- Only for SNNs using time-to-first-spike coding
5. Architecture of SpikeProp
- Originally introduced by Natschläger and Ruf
- Every connection consists of several synaptic connections
- All 16 synaptic connections have enumerated delays (1-16 ms) and different weights; originally they all use the same synaptic filter
6. SRM neuron
- Modified Spike Response Model (Gerstner)
- The neuron reset is of no interest because only one spike per neuron is needed!
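The reduced SRM can be stated compactly; the following is a reconstruction following Bohte et al. (2002), with weights w, delays d, and a synaptic time constant τ taken from that notation:

```latex
% Membrane potential of neuron j as a weighted sum of delayed
% spike-response kernels over all presynaptic firing times t_i;
% neuron j fires at the first time t_j where x_j(t_j) reaches the
% threshold.
x_j(t) = \sum_{i \in \Gamma_j} \sum_{k=1}^{m} w_{ij}^{k}\,
         \varepsilon\bigl(t - t_i - d^{k}\bigr),
\qquad
\varepsilon(s) = \frac{s}{\tau}\, e^{\,1 - s/\tau}
\quad (s > 0,\ \text{else } 0)
```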
7. Idea behind SpikeProp
- Minimise the sum-squared error (SSE) between the actual and the desired output spike times
- Change each weight along the negative direction of the gradient
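In symbols (a reconstruction; t_j^a and t_j^d denote the actual and desired firing times of output neuron j):

```latex
% Sum-squared error over the output neurons J, and plain gradient
% descent on each synaptic weight with learning rate \eta.
E = \tfrac{1}{2} \sum_{j \in J} \bigl(t_j^{a} - t_j^{d}\bigr)^{2},
\qquad
\Delta w_{ij}^{k} = -\eta \, \frac{\partial E}{\partial w_{ij}^{k}}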
8. Math of SpikeProp
- Linearise the membrane potential around the threshold-crossing time
- Only the output-layer rule is given here
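A sketch of the output-layer rule as it appears in Bohte et al. (2002), with the shorthand y_i^k(t) = ε(t - t_i - d^k). Since the firing time t_j is only defined implicitly by the threshold crossing x_j(t_j) = ϑ, it is linearised around that crossing:

```latex
% Linearising x_j around the threshold crossing gives the derivative
% of the firing time with respect to a weight, and hence the update.
\frac{\partial t_j}{\partial w_{ij}^{k}}
  \approx - \frac{y_i^{k}(t_j^{a})}{\partial x_j(t_j^{a})/\partial t},
\qquad
\Delta w_{ij}^{k} = -\eta\, y_i^{k}(t_j^{a})\, \delta_j,
\qquad
\delta_j = \frac{t_j^{d} - t_j^{a}}
                {\sum_{i \in \Gamma_j}\sum_{l} w_{ij}^{l}\,
                 \partial y_i^{l}(t_j^{a})/\partial t}
```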
9. Problems with SpikeProp
- Overdetermined architecture
- Tendency to get stuck when a neuron stops firing
- Problems with weight initialisation
10. Solving some of the problems
- Instead of enumerating parameters, learn them:
  - Delays
  - Synaptic time constants
  - Thresholds
- We can then use a much more limited architecture
- Add a specific mechanism to keep neurons firing: decrease the threshold
11. Learn more parameters
- Quite similar to the weight update rule
- Gradient of the error with respect to each parameter
- Parameter-specific learning rate
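A sketch of the common template, with per-parameter learning rates η_p:

```latex
% The same gradient-descent template is reused for every learned
% parameter p (a delay, a synaptic time constant, or a threshold),
% each with its own learning rate.
\Delta p = -\eta_p \, \frac{\partial E}{\partial p},
\qquad
p \in \bigl\{ d_{ij}^{k},\ \tau_{ij}^{k},\ \vartheta_j \bigr\}
```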
12. Math of the improvements - delays
- The delta is the same as in the weight rule; thus, as for the weights, the delta formula differs between the output and the hidden layers
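The equations themselves are not reproduced in this transcript; the following is a reconstruction under the same linearisation as the weight rule. Because y_i^k(t) = ε(t - t_i - d_ij^k) depends on the delay only through the kernel argument, the gradient involves the kernel derivative ε':

```latex
% Delay update for the output layer, reusing \delta_j from the
% weight rule; for this kernel,
% \varepsilon'(s) = \varepsilon(s)\,(1/s - 1/\tau).
\Delta d_{ij}^{k}
  = -\eta_d \, \frac{\partial E}{\partial d_{ij}^{k}}
  = \eta_d \, w_{ij}^{k}\,
    \varepsilon'\bigl(t_j^{a} - t_i - d_{ij}^{k}\bigr)\, \delta_j
```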
13. Math of the improvements - synaptic time constants
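Similarly (again a reconstruction under the same linearisation), the time-constant rule uses the partial derivative of the kernel with respect to τ:

```latex
% With s = t_j^a - t_i - d_ij^k, the kernel derivative is
% \partial\varepsilon(s)/\partial\tau = \varepsilon(s)\,(s/\tau - 1)/\tau.
\Delta \tau_{ij}^{k}
  = -\eta_\tau \, \frac{\partial E}{\partial \tau_{ij}^{k}}
  = -\eta_\tau \, w_{ij}^{k}\,
    \frac{\partial \varepsilon(s)}{\partial \tau}\, \delta_j
```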
14. Math of the improvements - thresholds
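For the threshold (a reconstruction), differentiating the crossing condition x_j(t_j) = ϑ_j gives ∂t_j/∂ϑ_j = 1/(∂x_j/∂t), so the update collapses to the delta itself:

```latex
% Threshold update for an output neuron: since
% \partial t_j / \partial\vartheta_j = (\partial x_j/\partial t)^{-1},
% the error gradient reduces to -\delta_j.
\Delta \vartheta_j
  = -\eta_\vartheta \, \frac{\partial E}{\partial \vartheta_j}
  = \eta_\vartheta \, \delta_j
```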
15. What if training gets stuck?
- If one of the neurons in the network stops firing, the training rule stops working
- Solution: actively lower the threshold of a neuron whenever it stops firing (multiply it by 0.9), as sketched below
- This is the same as scaling all of its weights up
- Improves convergence
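A minimal sketch of this mechanism; the class and names are illustrative, not the authors' implementation:

```python
from dataclasses import dataclass

@dataclass
class Neuron:
    threshold: float
    fired: bool = False  # set during the forward pass

def rescue_silent_neurons(neurons, decay=0.9):
    """Multiply the threshold of every silent neuron by 0.9, which is
    equivalent to scaling all of its incoming weights up."""
    for n in neurons:
        if not n.fired:
            n.threshold *= decay

# usage: after a forward pass in which the second neuron stayed silent
layer = [Neuron(threshold=1.0, fired=True), Neuron(threshold=1.0)]
rescue_silent_neurons(layer)
print(layer[1].threshold)  # 0.9
```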
16. What about weight initialisation
- Weight initialisation is a difficult problem
- The original publication gives only a vague description of the process
- S. M. Moore contacted S. Bohte personally to clarify the subject for his master's thesis
- Weight initialisation is done by a complex procedure
- Moore concluded that weights should be initialised in such a way that every neuron initially fires, and that its membrane potential does not surpass the threshold too much
17. What about weight initialisation
- In this publication we chose a very simple initialisation procedure (sketched below)
- Initialise all weights randomly
- Afterwards, set a weight such that the sum of all weights equals 1.5
- Convergence rates could be increased by using a more complex initialisation procedure
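A minimal sketch of one reading of this procedure; the random range and the choice to adjust the last weight are assumptions, not the paper's exact recipe:

```python
import numpy as np

def init_weights(n_synapses, target_sum=1.5, rng=None):
    """Draw random weights, then set the last weight so that the sum
    of all weights equals target_sum (1.5, as on the slide)."""
    rng = rng or np.random.default_rng()
    # assumed range: small positive weights around target_sum / n
    w = rng.uniform(0.0, 2.0 * target_sum / n_synapses, size=n_synapses)
    w[-1] += target_sum - w.sum()  # force the exact sum
    return w

w = init_weights(16)
print(round(w.sum(), 6))  # 1.5
```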
18. Problem with large delays
- During testing of the algorithm, a problem arose when the trained delays became very large: delay learning stopped
- If the delayed input spike only arrives after the output spike, there is a problem: the kernel and its gradient are zero there, so the delay no longer receives any update
- Solved by constraining the delays, as sketched below
(Figure: the neuron's output spike preceding the delayed input spike)
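A minimal sketch of the constraint, assuming delays are simply clipped to a valid range after each gradient step (the bounds are illustrative):

```python
import numpy as np

def update_delays(delays, grads, lr=0.05, d_min=0.0, d_max=16.0):
    """One gradient step on the delays, followed by clipping, so a
    delayed input cannot drift permanently past the output spike."""
    delays = delays - lr * grads
    return np.clip(delays, d_min, d_max)
```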
19. Results
- Tested on binary XOR (MSE 1 ms)
20. Results
- Optimal learning rates (found by experiment)
- Some rates seem very high, but that is because the values we work with are times expressed in ms
- The idea that the learning rate must be approximately 0.1 is only correct when the inputs and weights are normalised!
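A one-line justification (to first order, ignoring the dependence of δ on the inputs): rescaling the inputs is equivalent to rescaling the learning rate, so the η ≈ 0.1 heuristic presupposes normalised quantities:

```latex
% Rescaling the inputs y -> c*y in a gradient step is absorbed
% into the learning rate.
\Delta w = -\eta \,(c\,y)\,\delta = -(c\,\eta)\, y\, \delta
```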
21. Conclusions
- Because the parameters can be learned, no enumeration is necessary; thus the architectures are much smaller
- For XOR:
  - 8 times fewer weights are needed
  - Learning converges faster (50% of the original)
  - No complex initialisation functions
  - Positive and negative weights can be mixed
- But convergence deteriorates when the weights are reduced further
22. Conclusions
- The technique has only been tested on a small problem; it should be tested on real-world applications
- But we are currently preparing a journal paper on a new backprop rule that
  - supports a multitude of coding hypotheses (population coding, convolution coding, ...)
  - has better convergence
  - has simpler weight initialisation
  - ...