Transcript and Presenter's Notes

Title: On the capacity of unsupervised recursive neural networks for symbol processing


1
On the capacity of unsupervised recursive neural networks for symbol processing
Prof. Dr. Barbara Hammer, Computational Intelligence Group, Institute of Computer Science, Clausthal University of Technology
Nicolas Neubauer, M.Sc., Neural Information Processing Group, Department of Electrical Engineering and Computer Science, Technische Universität Berlin
29.8.2006
2
Overview
  • Introduction: GSOMSD models and capacities
  • Main part: Implementing deterministic push-down automata in RecSOM
  • Conclusion

3
Unsupervised neural networks
  • Clustering algorithms
  • Each neuron/prototype i has a weight vector w_i
  • Inputs x are mapped to a winning neuron i such that d(x, w_i) is minimal
  • Training: adapt the weights to minimize some error function (see the sketch below)
  • Self-Organizing Maps (SOMs): neurons arranged on a lattice
  • During training, neighbours are adapted as well
  • After training, similar inputs → neighbouring neurons
  • Variants without a fixed grid (e.g. neural gas)
  • Defined for finite-dimensional input vectors
  • Question: how to adapt these algorithms to inputs of non-fixed length, like time series?
  • E.g. time windowing, statistical analysis, ...

R^n → sequences over R^n?
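As a concrete illustration (not part of the slides), a minimal Python/NumPy sketch of the winner computation and lattice-neighbourhood update just described; the Gaussian neighbourhood, learning rate and grid coordinates are illustrative choices:

```python
import numpy as np

def som_step(weights, grid, x, lr=0.1, sigma=1.0):
    """One training step of a standard SOM (illustrative sketch).

    weights: (N, n) prototype vectors w_i
    grid:    (N, 2) lattice coordinates of the N neurons
    x:       (n,)   current input vector
    """
    # winner: neuron i minimising d(x, w_i)
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))
    # Gaussian neighbourhood on the lattice around the winner
    grid_dists = np.linalg.norm(grid - grid[winner], axis=1)
    h = np.exp(-grid_dists**2 / (2 * sigma**2))
    # move the winner and its lattice neighbours towards the input
    weights += lr * h[:, None] * (x - weights)
    return winner
```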
4
Recursive processing of time series
  • Process each input of the time series separately,
  • along with a representation C of the map's response to the previous input (the context)
  • function rep: R^N → R^r (N = number of neurons)
  • C_t = rep(d_1(t-1), ..., d_N(t-1))
  • Neurons respond not only to the input but also to the context
  • Neurons require additional context weights c
  • distance of neuron i at timestep t: d_i(t) = α·d(x_t, w_i) + β·d_r(C_t, c_i) (see the sketch below)

(Diagram: the inputs x_2, ..., x_t are processed one after another; at each step rep turns the previous activations into the contexts C_2, ..., C_t.)
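A schematic Python sketch (ours, not from the slides) of this recursion; rep, d and d_r are passed in as functions, matching the general formulation on the next slide, and all names as well as the zero initial context are illustrative assumptions:

```python
import numpy as np

def process_sequence(xs, W, C_w, rep, d, d_r, alpha=1.0, beta=1.0):
    """Recursively process a time series x_1, ..., x_T (sketch).

    W:   (N, n) input weights w_i
    C_w: (N, r) context weights c_i, r = dimension of rep's output
    rep: maps the previous activations (d_1, ..., d_N) to a context in R^r
    """
    N = W.shape[0]
    context = rep(np.zeros(N))   # assumed initial context before the first input
    winners = []
    for x in xs:
        # d_i(t) = alpha * d(x_t, w_i) + beta * d_r(C_t, c_i)
        dists = np.array([alpha * d(x, W[i]) + beta * d_r(context, C_w[i])
                          for i in range(N)])
        winners.append(int(np.argmin(dists)))
        context = rep(dists)     # C_{t+1} = rep(d_1(t), ..., d_N(t))
    return winners
```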
5
Generalized SOM for Structured Data (GSOMSD)
[Hammer, Micheli, Sperduti, Strickert 04]
  • Unsupervised recursive algorithms are instances of the GSOMSD,
  • varying in
  • the context function rep and the distances d, d_r
  • and the lattice (metric, hyperbolic, neural gas, ...)
  • Example: Recursive SOM (RecSOM)
  • context stores all neurons' activations
  • r = N, rep(d_1, ..., d_N) = (exp(-d_1), ..., exp(-d_N)) (sketch below)
  • each neuron needs N context weights! (memory ~ N^2)
  • other models store
  • properties of the winning neuron
  • previous activations only for single neurons
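For RecSOM, rep keeps the full activation profile of the map, so each of the N neurons carries N context weights; a one-line sketch (our code):

```python
import numpy as np

def recsom_rep(dists):
    # RecSOM context: the activations exp(-d_i) of all N neurons
    return np.exp(-dists)

# every neuron stores one context weight per neuron, so the context
# weights alone form an N x N matrix: memory grows as N^2
```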

6
Computational capacity
  • 01111 ≠ 11111
  • Ability to keep state information: equivalence to Finite State Automata (FSA) / regular languages
  • A decaying context will eventually forget the leading 0
  • (()) ≠ (()
  • Ability to keep stack information: equivalence to Pushdown Automata (PDA) / context-free languages
  • A finite context cannot store a potentially infinite stack
  • Ability to store at least two binary stacks: Turing machine equivalence
  • → connecting context models to the Chomsky hierarchy

7
Why capacity matters
  • Explore the dynamics of the algorithms in detail
  • Distinguish the power of different models
  • different contexts within the GSOMSD: e.g., justify the huge memory cost of RecSOM compared to other models
  • other approaches: e.g., supervised recurrent networks
  • Supervised recurrent networks: Turing machine equivalence
  • in exponential time for sigmoidal activation functions [Kilian/Siegelmann 96]
  • in polynomial time for semilinear activation functions [Siegelmann/Sontag 95]

8
Various recursive models and their capacity
           TKM                    RSOM                             MSOM                    SOMSD                               RecSOM
           [Chappell/Taylor 93]   [Koskela/Varsta/Heikkonen 98]    [Hammer/Strickert 04]   [Hagenbuchner/Sperduti/Tsoi 03]     [Voegtlin 02]
context    neuron itself          neuron itself                    winner content          winner index                        exp(all activations)
encoding   input space            input space                      input space             index space                         activation space
lattice    all                    all                              all                     SOM / HSOM                          all
capacity   < FSA                  < FSA                            FSA                     FSA                                 PDA*
* for winner-takes-all, semilinear context
9
Overview
  • Introduction: GSOMSD models and capacities
  • Main part: Implementing deterministic push-down automata in RecSOM
  • Conclusion

10
Goal: Build a deterministic PDA
  • Using
  • the GSOMSD recursive equation d_i(t) = α·d(x_t, w_i) + β·d_r(C_t, c_i)
  • L1 distances for d, d_r (i.e., d(a,b) = |a - b|)
  • parameters
  • α = 1
  • β = 1/4
  • modified RecSOM context
  • instead of the original exp(-d_i): max(1 - d_i, 0)
  • similar overall shape
  • easier to handle analytically
  • additionally, winner-takes-all (see the sketch below)
  • required at one place in the proof...
  • but makes life easier overall
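A sketch (our code, following the description above) of the modified context and of the neuron distance with L1 distances, α = 1 and β = 1/4; winner-takes-all keeps only the best-matching neuron's activation:

```python
import numpy as np

def modified_wta_rep(dists):
    """Modified RecSOM context: max(1 - d_i, 0) plus winner-takes-all."""
    act = np.maximum(1.0 - dists, 0.0)   # linear cut-off instead of exp(-d_i)
    context = np.zeros_like(act)
    winner = int(np.argmin(dists))
    context[winner] = act[winner]        # only the winner keeps its activation
    return context

def neuron_distance(x, w, context, c, alpha=1.0, beta=0.25):
    # d_i(t) = alpha * |x_t - w_i| + beta * ||C_t - c_i||_1
    return alpha * abs(x - w) + beta * np.sum(np.abs(context - c))
```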

11
Three levels of abstraction
  • Layers: temporal (vertical) grouping → feed-forward architecture
  • Operators: functional grouping → defined operations
  • Phases: combining operators → automaton simulations


12
First level of abstraction: Feed-forward
  • One-dimensional input weights w
  • Encoding function enc: Σ → N^l; input symbol s_i → series of inputs (i, e, 2e, 3e, ..., (l-1)e)
  • with e >> max(i) over all symbols s_i in Σ
  • Each neuron is active for at most one component of enc,
  • resulting in l layers of neurons
  • In layer l, we know that only neurons from layer l-1 have been active, i.e. are represented > 0 in the context
  • pass on activity from layer to layer (see the sketch below)
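A small sketch (our code) of this encoding: symbol s_i is fed in as the series (i, e, 2e, ..., (l-1)e), with e much larger than any symbol code, so that the components activate disjoint groups (layers) of neurons:

```python
def enc(i, l, e):
    """Encode input symbol s_i as the input series (i, e, 2e, ..., (l-1)e)."""
    return [i] + [k * e for k in range(1, l)]

# illustrative values: enc(2, l=4, e=100) == [2, 100, 200, 300]
```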

13
Feed-forward: Generic architecture
  • Sample architecture
  • 2 states S = {a, b}
  • 2 inputs Σ = {s_0, s_1}
  • Output layer
  • Network represents state a ⟺ neuron a active ⟺ C = (0, 0, ..., 1, 0)
  • Network represents state b ⟺ neuron b active ⟺ C = (0, 0, ..., 0, 1)

Simulation of a state transition function δ: Σ × S → S
(Diagram: the input layer at the bottom receives the encoded input, hidden layers perform arbitrary intermediate computations, and the output layer holds one neuron per state a and b; the clock inputs e, ..., (l-1)e drive the layers.)
  • Input layer
  • encodes input × state
  • gets the input via the input weight w
  • gets the state via the context weight c
14
Second level of abstraction: Operators
  • We might be finished
  • In fact, we are - for the FSA case
  • However, what about the stack?
  • it looks like γ_0 γ_1 γ_1 ...
  • how to store potentially infinite symbol sequences?
  • General idea
  • Encode the stack in the winner neuron's activation
  • Then build operators to
  • read,
  • modify or
  • copy
  • the stack by changing the winner's activation

15
Encoding the stack in neurons' activations
  • To save a sequence of stack symbols within the map,
  • turn γ_0 γ_1 γ_1 into the binary sequence α = 011
  • f_4(α) (sketch below):
  • f_4(ε) = 0
  • f_4(0) = 1/4
  • f_4(1) = 3/4
  • f_4(01) = 1/4 + 3/16
  • f_4(011) = 1/4 + 3/16 + 3/64
  • push(s, γ_1) = 1/4·s + 3/4
  • pop(s, γ_0) = (s - 1/4)·4
  • Encode the stack in the activation: a = 1 - 1/4·s
  • → push(a, γ_1) = 13/16 - 1/16·s
  • pop(a, γ_0) = 5/4 - s
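A sketch (our code) of the base-4 fractional stack code f_4 and of the push/pop maps on the code s; the assertions reproduce the values listed above:

```python
def f4(alpha):
    """Base-4 fractional code of a binary stack word (top of stack first)."""
    code, scale = 0.0, 1.0
    for bit in alpha:                    # digit 0 -> 1/4, digit 1 -> 3/4
        code += scale * (0.25 if bit == '0' else 0.75)
        scale /= 4.0
    return code

def push(s, bit):
    """Push gamma_1: s -> s/4 + 3/4; push gamma_0: s -> s/4 + 1/4."""
    return 0.25 * s + (0.75 if bit == '1' else 0.25)

def pop0(s):
    """Pop gamma_0 from the top: s -> (s - 1/4) * 4."""
    return (s - 0.25) * 4.0

assert abs(f4('01') - (1/4 + 3/16)) < 1e-12
assert abs(f4('011') - (1/4 + 3/16 + 3/64)) < 1e-12
assert abs(push(f4('11'), '0') - f4('011')) < 1e-12   # pushing a 0 keeps the code consistent
assert abs(pop0(f4('011')) - f4('11')) < 1e-12        # popping the 0 undoes the push
```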

16
Operators
  • COPY
  • copy the activation into the next layer
  • TOP
  • identify the top stack symbol
  • OR
  • get the activation of the active neurons (if any)
  • PUSH
  • modify the activation for a push
  • push(a, γ_1) = 13/16 - 1/16·s
  • POP
  • modify the activation for a pop
  • pop(a, γ_0) = 5/4 - s (consistency check below)
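A small consistency check (our code) for the activation encoding a = 1 - s/4: expressed on the activation, a PUSH of γ_1 becomes a' = 13/16 - s/16 and a POP of γ_0 becomes a' = 5/4 - s, exactly the affine maps listed above:

```python
def act(s):
    # activation encoding of the stack code s: a = 1 - s/4
    return 1.0 - 0.25 * s

def push1_on_activation(s):
    # PUSH gamma_1, expressed on the activation: a' = 13/16 - s/16
    return 13/16 - s/16

def pop0_on_activation(s):
    # POP gamma_0, expressed on the activation: a' = 5/4 - s
    return 5/4 - s

s = 1/4 + 3/16                                            # stack code of '01'
assert abs(push1_on_activation(s) - act(0.25 * s + 0.75)) < 1e-12
assert abs(pop0_on_activation(s) - act((s - 0.25) * 4.0)) < 1e-12
```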

17
Third level of abstraction: Phases

Set   Content          Generic element      Examples
S     States           s                    a, b, s(uccess), f(ail)
Σ     Input alphabet   σ                    (, ), ε
Γ     Stack alphabet   γ                    (
U     Stack actions    push(γ), pop(γ)      push, pop, do nothing
Phase      Task                                                   Input → Output       Required operators
Finalize   Collect all results leading to the same state          U × S → S            OR, COPY
Execute    Manipulate the stack where needed                      U × S → U × S        PUSH, COPY / POP, COPY
Merge      Collect all states resulting in a common stack state   Σ × S × Γ → U × S    OR, COPY
Separate   Read the top stack symbol                              Σ × S → Σ × S × Γ    TOP
18
The final architecture
(Diagram of the complete network architecture.)
19
Overview
  • Introduction: GSOMSD models and capacities
  • Main part: Implementing deterministic push-down automata in RecSOM
  • Conclusion

20
Conclusions
  • RecSOM has stronger computational capacity than SOMSD/MSOM
  • Does this mean it's worth the cost?
  • The simulations are not learnable with Hebbian learning → practical relevance questionable
  • In any case, the elaborate (costly) context is rather a hindrance for the simulations
  • too much context results in a lot of noise
  • simpler, slightly enhanced models may be better
  • for example MSOM or SOMSD with a context variable indicating the last winner's activation
  • Are Turing machines also possible?
  • Storing two stacks in a real number is possible
  • Reconstructing two stacks from a real number is hard
  • particularly when using only differences
  • we may have to leave constant-size simulations
  • other representations of the stack may be required

21
Thanks
22
Aux slide: PDA definition
23
Aux slide: PDA definition suitable for map construction