1
Introduction to Artificial Neural Networks
2
Inspiration - The brain
  • Capable of remembering, recognizing patterns and
    associating. Main characteristics
  • massively parallel
  • non-linear
  • huge number of slow units highly connected
  • self-organizing and self-adapting
  • Some statistics about the brain
  • 1011 neurons
  • 1015 connections
  • and about neurons
  • 1 neuron is connected with 103 to 105 other
    neurons
  • slow 10-3 sec (silicon logic gates 10-9 sec)

3
An artificial neural network
  • An artificial neural network (ANN) is a machine
  • an assembly of artificial neurons
  • created to model the way the brain executes
    tasks, by mathematically simulating the neurons
    and their connections
  • Requirements to achieve good performance
  • a great number of neurons
  • massive interconnection among them

4
Artificial neural network characteristics
  • The power of an ANN follows from its
  • massively parallel and distributed structure
  • learning capacity
  • generalizing capacity
  • Generalization
  • the ability to produce reasonable answers to
    data never presented during the learning phase

5
Artificial neural network research
  • is an interdisciplinary field with branches in
  • neuroscience
  • psychology
  • mathematics
  • physical sciences
  • computer science
  • engineering

6
Use - Examples
  • Data compression,
  • Classification,
  • Associative memory,
  • Pattern recognition,
  • Data filtering and noise reduction
  • Data analysis,
  • Data prediction.

7
Biological elements
C. Pellegrini
8
Two pioneers
  • Santiago Ramón y Cajal (1852-1934) introduced the
    concept of neurons as the structural elements of
    the brain.
  • Camillo Golgi (1843-1926) invented a staining
    method that makes neurons visible.

C. Pellegrini
9
Biological neurons
C. Pellegrini
10
Structure of the neuron
  • diameter of the cell 0.01-0.05 mm
  • Soma (cell body)
  • various shapes (mostly spherical),
  • about 20 µm in diameter,
  • contains the nucleus,
  • bounded by a membrane about 5 nm thick.
  • Axon
  • unique to nerve cells,
  • diameter 1-25 µm (humans), up to 1 mm (squid),
  • length 1 mm to 1 m (!),
  • connects to other neurons (synapses),
  • allows transmission of information.
  • Dendrites
  • receive signals from other neurons,
  • each covered by hundreds of synapses.

[Figure: neuron structure, labeled: dendrites, soma,
axonal cone, primary axon, secondary axons]
C. Pellegrini
11
Synapse
[Figure: synapse structure, labeled: presynaptic
element, mitochondria, secretion vesicles, active
zones, membrane differentiation, synaptic gap,
postsynaptic density, synaptic vesicles, receptors,
postsynaptic dendrite]
C. Pellegrini
12
Synaptic arrangements
Synapses are usually made on the dendritic tree,
but they can also be
of axo-axonal type
of axo-somatic type
C. Pellegrini
13
Types of neurons
[Figure: a pyramidal cell and a star-shaped
(stellate) neuron]
C. Pellegrini
14
The nervous system
The human nervous system is a three-stage system:
the brain continually receives information,
perceives it and makes appropriate decisions.
C. Pellegrini
15
Organization of the brain
C. Pellegrini
16
Artificial Neural Networks
  • Models and architectures

17
History of neural networks
  • 1940's Neural network models
  • 1943, McCulloch & Pitts introduce the formal
    neuron model and publish their paper "A Logical
    Calculus of the Ideas Immanent in Nervous
    Activity"
  • 1949, Hebb publishes his book "The Organization
    of Behavior", in which he describes what are now
    known as "Hebbian cell assemblies"
  • 1950's Learning in neural networks
  • the Hebbian learning rule
  • 1954, Minsky's PhD thesis on a reinforcement
    learning machine
  • 1959, Rosenblatt's Perceptron learning procedure
  • 1960's The age of the Perceptron (a period of
    massive enthusiasm)
  • Many wild claims are made by Rosenblatt and
    others about the potential of Perceptrons as
    all-powerful learning devices

18
History of neural networks (cont.)
  • 1970's Limitations of Perceptrons are realized
    (the dark ages)
  • 1969, Minsky and Papert's book "Perceptrons" is
    published, in which it is shown that Perceptrons
    are only capable of learning a very limited class
    of functions.
  • Minsky & Papert predict that there will be no
    fruitful or interesting extensions of Perceptrons
    even if multi-layer learning procedures are
    developed
  • The flow of funding into neural networks
    temporarily ceases
  • 1980's The discovery of back-propagation (the
    Renaissance)
  • 1980, S. Grossberg, competitive learning and ART
    models
  • 1982, J. Hopfield, recurrent networks (i.e. with
    feedback) and associative memory
  • 1986, D. Rumelhart et al., back-propagation
  • Other learning procedures for multi-layer neural
    networks are invented
  • The power of neural networks begins to be
    realized and the hype cranks up again ...

19
Artificial neuron model
  • Introduced by McCulloch & Pitts (1943)

Quite simple: all signals can be 1 or -1. The
neuron calculates a weighted sum of its inputs and
compares it to a threshold. If the sum is higher
than the threshold, the output is set to 1,
otherwise to -1.
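A minimal sketch of this neuron in Python (the function name and the example weights are illustrative, not from the slides):

import numpy as np

def mcculloch_pitts(x, w, theta):
    """McCulloch & Pitts formal neuron: +1/-1 signals, hard threshold."""
    return 1 if np.dot(w, x) > theta else -1

# Example: with these (assumed) weights the neuron computes a logical AND
w, theta = np.array([1.0, 1.0]), 1.0
print(mcculloch_pitts(np.array([1, 1]), w, theta))   # 1
print(mcculloch_pitts(np.array([1, -1]), w, theta))  # -1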
20
Artificial neuron model
  • This simple neuron model has
  • A set of connections, called synapses, which
    link it to other neurons to create a network.
    Each synapse has a synaptic weight which
    represents the strength of the connection.
  • A summation unit which multiplies each incoming
    activity by the weight on its connection and adds
    together all these weighted inputs to get a total
    input.
  • An activation function that transforms the total
    input into an outgoing activity (restricting the
    output amplitude).

21
Geometrical interpretation
Consider a McCulloch & Pitts formal neuron with
two inputs x1 and x2, used to classify the input
vectors into one of two classes. Geometrically,
the neuron splits the plane (x1, x2) in two, with
the decision boundary defined by the line
w1 x1 + w2 x2 = θ.
22
Artificial neuron model (cont.)
[Figure: the modern McCulloch & Pitts neuron: input
signals, synaptic weights, summation unit,
threshold θk, activation function, output yk]
23
Artificial neuron model (cont.)
The model is mathematically described by
vk = Σj wkj xj    and    yk = φ(vk - θk)
where x1, ..., xp are the input signals, wk1, ...,
wkp the synaptic weights of neuron k, θk the
threshold, φ(.) the activation function and yk the
output signal of the neuron.
24
Artificial neuron model (cont.)
An externally applied threshold (bias) has the
effect of lowering or increasing the net input of
the activation function. It can be treated as one
more synaptic weight, wk0 = θk, attached to a
fixed input.
[Figure: the same neuron with the threshold drawn
as the weight wk0 on a fixed input]
25
Types of activation functions
The activation function defines the output of a
neuron in terms of the activity level at its
inputs. There are 3 basic types of activation
functions:
  • threshold function: φ(v) = 1 if v ≥ 0, and 0
    otherwise (or -1, for ±1 signals)
  • piecewise-linear function: linear around 0,
    saturating at the two extremes
  • sigmoid function: e.g. the logistic function
    φ(v) = 1 / (1 + exp(-a v)), or the hyperbolic
    tangent φ(v) = tanh(v)
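A quick sketch of the three families in Python (the exact piecewise-linear form is one common choice, assumed here):

import numpy as np

def threshold(v):
    """Hard threshold: 1 if v >= 0, else 0."""
    return np.where(v >= 0.0, 1.0, 0.0)

def piecewise_linear(v):
    """Saturates at 0 and 1, linear in between (one common form)."""
    return np.clip(v + 0.5, 0.0, 1.0)

def sigmoid(v, a=1.0):
    """Logistic sigmoid with slope parameter a."""
    return 1.0 / (1.0 + np.exp(-a * v))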
26
Activation functions - interpretation
  • An activation function is a decision function; it
  • defines a threshold under which the activation
    value will not fire any output,
  • makes it possible to select, linearly or not,
    among different activation values,
  • the highest output value comes from the highest
    activation value, i.e. from the greatest
    similarity between the input values and the
    synaptic weights.

27
Network architectures
The power of neural networks comes from their
collective behavior in a network where all
neurons are interconnected. The network starts
evolving: neurons continuously evaluate their
output by looking at their inputs, calculating
the weighted sum and comparing it to a threshold
to decide if they should fire. This is a highly
complex parallel process whose features cannot be
reduced to phenomena taking place in individual
neurons.
28
Network architectures
  • Neural networks are formed by an assembly of many
    artificial neurons.
  • An artificial neural network may be seen as a
    massively parallel distributed processor.
  • The basic behavior of a neural network is
    determined by learning; the memorized information
    is retained in the synaptic weights.
  • Knowledge is represented by the free parameters
    of the neural network, i.e. synaptic weights and
    thresholds.

29
Network architectures
  • There are 4 different classes of architectures
  • single-layer feedforward networks
  • one input layer of source nodes,
  • one output layer,
  • feedforward from input layer to output layer
  • multilayer feedforward networks
  • one input layer of source nodes,
  • one or more hidden layers,
  • one output layer

30
Network architectures
  • recurrent networks
  • at least one feedback loop
  • lattice structures
  • neurons are organized in arrays

31
Learning methods
  • An artificial neural network learning method is
    defined by
  • the procedure which adjusts the neural network's
    free parameters, i.e. synaptic weights and
    thresholds. Different learning methods adjust
    these parameters differently.
  • Usually, a learning method is composed of 3 steps
  • 1. The neural network receives an input stimulus.
  • 2. The free parameters are adapted according to
    the neural network output for this input and,
    possibly, according to the desired output.
  • 3. The neural network will have a different
    output for this input, since its internal
    structure has been changed.

32
Learning taxonomy
Learning procedure
  • Learning paradigms
  • supervised learning (with teaching)
  • reinforcement learning (with critics)
  • non-supervised learning (self-organizing)
  • Learning algorithms (rules)
  • Hebb rule
  • Widrow-Hoff rule
  • competitive learning
  • back-propagation of error
33
Learning
Supervised: we feed the neural network with the
inputs and their corresponding desired outputs.
The learning algorithm modifies (little by little)
the synaptic weights to adapt the obtained output
to the desired output. Only the synaptic weights
which produce an error are modified.
Non-supervised: we feed the neural network with
the inputs only. The neural network organizes
itself in order to represent the input data.
P. Palagi
34
Hebb learning rule
  • Donald Hebb (1949) proposed a simple learning
    rule for neural networks
  • "When an axon of cell A is near enough to excite
    a cell B and repeatedly or persistently takes
    part in firing it, some growth process or
    metabolic change takes place in one or both cells
    such that A's efficiency, as one of the cells
    firing B, is increased."
  • In other words
  • If two connected neurons fire synchronously and
    repeatedly, the strength of their synapse
    increases selectively.
  • If two connected neurons fire asynchronously,
    their synapse weakens or is eliminated.
  • Such synapses are called Hebbian synapses.

35
Hebb learning rule (cont.)
For two neurons j and k connected by a synaptic
weight wkj, the Hebb rule can be written
mathematically as
Δwkj(n) = η yk(n) yj(n)
where η is a positive, constant learning rate.
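A one-line sketch of this update in Python (variable names are illustrative):

def hebb_update(w_kj, y_k, y_j, eta=1.0):
    """Plain Hebb rule: the weight grows when the pre- and
    post-synaptic activities y_j and y_k have the same sign."""
    return w_kj + eta * y_k * y_j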
36
Hebb learning rule (application)
Initial conditions: η = 1, weights and threshold
are zero.
37
Hebb learning rule (application)
Initial conditions: η = 1, weights and threshold
are zero.
38
Hebb learning rule (cont.)
The Hebb rule leads to an exponential increase of
the weights, which ends up saturating the neuron.
To limit the increase, a weight-decay factor is
introduced:
Δwkj(n) = η yk(n) yj(n) - α yk(n) wkj(n)
In other words
Δwkj(n) = α yk(n) [c yj(n) - wkj(n)]
where c = η/α. If yj(n) < wkj(n)/c, the synaptic
weight wkj(n+1) will decrease proportionally to
the post-synaptic activity yk(n). If
yj(n) > wkj(n)/c, the synaptic weight will
increase proportionally to yk(n). The learning
phase is finished when the wkj variations are
small or zero.
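The decayed version as a sketch (again with illustrative names):

def hebb_decay_update(w_kj, y_k, y_j, eta=1.0, alpha=0.5):
    """Hebb rule with weight decay:
    Delta w = eta*y_k*y_j - alpha*y_k*w_kj = alpha*y_k*(c*y_j - w_kj),
    with c = eta/alpha, so w_kj is attracted towards c*y_j
    instead of growing without bound."""
    return w_kj + eta * y_k * y_j - alpha * y_k * w_kj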
39
Learning rule - interpretation
  • A weighted sum may have two interpretations
  • 1. The activation value vk is a similarity
    measure between two sets of numbers (xj and wkj):
    it is the scalar product between the input values
    and the synaptic weights. The greater their
    similarity, the greater the activation value.
  • The synaptic weight vector may be used to
    memorize a model of the input value vector.
  • 2. The activation value vk also represents the
    equation of a plane (or hyperplane) whose
    position is defined by the set of wkj values; the
    sign of the activation value changes according to
    the coordinates xj (one side or the other of the
    plane).
  • The weight vector may then be used to split the
    input space into two regions; it then operates as
    a classifier.

40
Learning rule - geometric interpretation
41
The Perceptron: learning through error correction
A single-layer Perceptron is based on the
McCulloch and Pitts model of the neuron - a linear
combiner followed by a hard limiter. The neuron
produces an output of 1 if the hard limiter input
is positive, and -1 if it is negative.
42
The Perceptron as a pattern classifier
Classify the input vectors x = (x1, x2, ..., xp)
into one of two classes, C1 or C2:
- x belongs to C1 if the perceptron output is +1,
  and to C2 if it is -1.
- in the p-dimensional space, the two classes
  correspond to two regions separated by a
  hyperplane defined by Σi wi xi - θ = 0.
In the case of two input variables x1 and x2, the
decision boundary is the line w1 x1 + w2 x2 - θ = 0.
43
The Perceptron learning rule
  • Given
  • the (p+1)-by-1 input vector
    x(n) = [x1(n), x2(n), ..., xp(n)] (augmented with
    a fixed input for the threshold)
  • the (p+1)-by-1 weight vector
    w(n) = [w1(n), w2(n), ..., wp(n)] (augmented with
    the threshold)
  • the linear combiner output is v(n) = wT(n) x(n)
  • wT(n) x(n) = 0 defines a hyperplane (in the
    p-dimensional space) as the decision boundary
    between 2 different classes of inputs.
  • X1 = {x1(1), x1(2), ...}: subset of input
    vectors x that belong to C1
  • X2 = {x2(1), x2(2), ...}: subset of input
    vectors x that belong to C2

44
The Perceptron learning rule (cont.)
  • Training set: X = X1 ∪ X2
  • Training process
  • adjustment of the weight vector w such that C1
    and C2 are separable
  • if C1 and C2 are linearly separable, then w
    exists.
  • given the subsets of training vectors X1 and
    X2, the training problem for the elementary
    perceptron is to find a weight vector w such
    that
  • wT x ≥ 0 for every x ∈ C1 (i.e. x ∈ X1)
  • wT x < 0 for every x ∈ C2 (i.e. x ∈ X2)

45
The Perceptron learning rule (cont.)
Algorithm to adapt the weight vector w - If nth
member of the training set, x(n), is correctly
classified by w(n), no correction is made,
then - w(n1) w(n) if wT(n) x(n) ? 0
and x(n) ? C1 - w(n1) w(n) if wT(n) x(n) lt
0 and x(n) ? C2 Otherwise the weight vector is
updated according to the rule - w(n1) w(n) -
?(n)x(n) if wT(n) x(n) ? 0 and x(n) ? C2 -
w(n1) w(n) ?(n)x(n) if wT(n) x(n) lt 0
and x(n) ? C1 The learning rate parameter ?(n)
controls the adjustment applied at iteration n, 0
lt ?(n) lt 1.
46
Perceptron Algorithm
  • Step 1: Initialization
  • set w(0) = 0, then perform the following for
    time n = 1, 2, 3, ...
  • Step 2: Activation
  • activate the perceptron with input x(n) and
    desired response d(n)
  • Step 3: Computation of actual response
  • compute the actual response
    y(n) = sgn[wT(n) x(n) - θ]
  • Step 4: Adaptation of weight vector
  • update the weight vector
    w(n+1) = w(n) + η [d(n) - y(n)] x(n)
  • where
  • d(n) = +1 if x(n) ∈ C1
  • d(n) = -1 if x(n) ∈ C2
  • Step 5: Increment time
  • n = n + 1 and go back to step 2.
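A compact sketch of this algorithm in Python (the dataset and hyperparameters are illustrative; the threshold is folded in as weight w[0] on a constant +1 input):

import numpy as np

def train_perceptron(X, d, eta=0.5, epochs=100):
    """Error-correction perceptron training; d holds targets in {-1, +1}."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend the fixed +1 input
    w = np.zeros(Xb.shape[1])                      # step 1: w(0) = 0
    for _ in range(epochs):
        for x, t in zip(Xb, d):                    # step 2: present x(n), d(n)
            y = 1 if w @ x >= 0 else -1            # step 3: y(n) = sgn(wT x)
            w += eta * (t - y) * x                 # step 4: w += eta (d - y) x
    return w

# Linearly separable example: logical OR on +/-1 inputs
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
d = np.array([-1, 1, 1, 1])
w = train_perceptron(X, d)
print([1 if w @ np.r_[1, x] >= 0 else -1 for x in X])  # [-1, 1, 1, 1]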

47
Perceptron Algorithm
48
The learning rate
  • The learning rate h
  • insures the stability in the learning process,
  • small values è slow learning è slower
    convergence
  • high values è learning is faster è divergence
    or instability
  • Error surface (graphics e(n) vs synaptic weight)
  • If the neural network has only linear neurons
    quadratic function with one unique minimum.
  • If the neural network has non-linear neurons
    the surface has one global minimum (rarely some),
    but many local minima.
  • Learning through error correction
  • Starts from an arbitrary point from the error
    surface
  • Progresses little by little towards a minima
    global or local.

49
Perceptron - Limitations
Minsky and Papert (1969) showed that a
single-layer perceptron cannot represent the
simple exclusive-or (XOR) function.
[Figure: the four points (-1,1), (1,1), (-1,-1),
(1,-1); OR and AND are linearly separable, XOR is
not]
How can this limitation be overcome?
50
Learning through error correction
Widrow-Hoff (Delta) rule: given the stimulus
vector x(n), if dk(n) is the desired output of
neuron k and yk(n) is the obtained output of
neuron k, then the error ek(n) is given by
ek(n) = dk(n) - yk(n)
The learning procedure has the objective of
minimizing the error function
e(n) = 1/2 Σk ek²(n)
The synaptic weights change according to the
following learning rule:
Δwkj(n) = η ek(n) xj(n)
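A sketch of one delta-rule step in Python (a linear neuron is assumed; names are illustrative):

import numpy as np

def delta_rule_step(w, x, d, eta=0.1):
    """Widrow-Hoff / LMS update: e = d - y with y = w.x,
    then Delta w_j = eta * e * x_j (gradient descent on e^2/2)."""
    e = d - w @ x
    return w + eta * e * x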
51
Back-propagation algorithm
Widrow-Hoff is only applicable to single-layer
neural networks.
Multi-layer perceptron: for neural networks with
hidden layers (between the input and output
layers), the error measured in the output layer
must be distributed backward, proportionally to
each neuron's contribution to the total error.
This uses the back-propagation algorithm, created
by P. Werbos in 1974 and rediscovered in 1986 by
D. Rumelhart and Y. Le Cun.
Principle:
1. Propagate the input values through the neural
   network and calculate the output values.
2. Compare the obtained output with the desired
   output and calculate the error.
3. Distribute the error to the neurons of the
   hidden layers according to their individual
   contributions.
4. Modify the synaptic weights backward (from the
   output layer to the input layer).
52
Notation
  • indices i, j, k refer to neurons
  • iteration n refers to the nth training pattern
    (example)
  • ej(n): error signal at the output of neuron j
    for iteration n
  • dj(n): desired output for neuron j, used to
    compute ej(n)
  • e(n): instantaneous sum of squared errors
  • eavg: average of e(n) over all n (the entire
    training set)
  • yj(n): function signal at the output of neuron j
  • wji(n): synaptic weight connecting the output of
    neuron i to the input of neuron j
  • vj(n): internal activity level of neuron j

53
Back-propagation algorithm (cont.)
  • For neuron j
  • error value: ej(n) = dj(n) - yj(n)
  • instantaneous sum of squared errors:
    e(n) = 1/2 Σj∈C ej²(n)
    (C: set of all neurons in the output layer)
  • average squared error: eavg = 1/N Σn e(n)
    (N: total number of training patterns)
  • e(n) and eavg are functions of the free
    parameters of the network (i.e. synaptic weights
    and thresholds)
  • eavg represents the cost function as a measure of
    learning performance over the training set
  • objective of the learning process: adjust the
    free parameters to minimize eavg

54
Back-propagation algorithm (cont.)
  • Two cases
  • neuron j is an output neuron
  • neuron j is a hidden neuron
  • Hidden neurons share responsibility for the
    errors at the output of the network.
  • How to penalize or reward hidden neurons?
    (credit-assignment problem)
  • Case 1: neuron j is an output neuron
  • knowing dj(n), compute ej(n), then compute the
    local gradient δj(n):
  • δj(n) = ej(n) φ'j(vj(n)) and
    Δwji(n) = η δj(n) yi(n)

55
Back-propagation algorithm (cont.)
  • Case 2: neuron j is a hidden neuron
  • redefine the local gradient for a hidden neuron
    (no error ej(n) is available)
  • need to express e(n) in terms of the errors of
    the output layer following the last hidden layer.

56
Back-propagation algorithm (cont.)
  • Putting it all together (using the definition of
    the local gradient) and going back to the local
    gradient δj(n) for hidden neuron j, we have
  • δj(n) = φ'j(vj(n)) Σk δk(n) wkj(n)
  • where the sum runs over the neurons k of the
    next layer that are connected to neuron j.
57
Summary Back-propagation algorithm
The correction Δwji(n), applied to the weight
connecting neuron i to neuron j, is defined by the
delta rule:
Δwji(n) = η δj(n) yi(n)
  • The local gradient δj(n) is computed according to
    the location of neuron j.
  • neuron j is an output neuron:
    δj(n) = ej(n) φ'j(vj(n))
  • neuron j is a hidden neuron: δj(n) equals the
    product of φ'j(vj(n)) and the weighted sum of the
    δs computed for the neurons in the next hidden or
    output layer connected to neuron j.

58
Example: back-propagation solution to the XOR
problem
Characteristics:
  • generalization (size of the training set)
  • overtraining (number of hidden neurons)
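A runnable sketch of a small multi-layer perceptron trained by back-propagation on XOR (the 2-2-1 architecture, learning rate, seed and iteration count are all illustrative choices; a different seed may need more iterations or land in a local minimum):

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)  # XOR inputs
D = np.array([[0], [1], [1], [0]], float)              # desired outputs

W1 = rng.normal(0, 1, (3, 2))  # (bias + 2 inputs) x 2 hidden neurons
W2 = rng.normal(0, 1, (3, 1))  # (bias + 2 hidden) x 1 output neuron
eta = 0.5

for _ in range(20000):
    Xb = np.hstack([np.ones((4, 1)), X])
    H = sigmoid(Xb @ W1)                # 1. propagate input to hidden layer
    Hb = np.hstack([np.ones((4, 1)), H])
    Y = sigmoid(Hb @ W2)                # ... and on to the output layer
    dY = (D - Y) * Y * (1 - Y)          # 2. output local gradients e * phi'
    dH = (dY @ W2[1:].T) * H * (1 - H)  # 3. back-propagate error to hidden layer
    W2 += eta * Hb.T @ dY               # 4. delta-rule updates, output to input
    W1 += eta * Xb.T @ dH

print(np.round(Y.ravel(), 2))  # close to [0, 1, 1, 0]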
59
Multi-layer Perceptron with learning by
back-propagation of the error - Algorithm
1. Initialize the synaptic weights to randomly
   chosen values.
2. Present an input from the training set.
3. Compute, by propagation, the output obtained
   for this input.
4. If the obtained output differs from the desired
   output, modify the synaptic weights from the
   output towards the input:
   Wij(t+1) = Wij(t) + μ · yi · xi
5. While the error is too large, go back to
   step 2.
P. Palagi
60
NetTalk learns to pronounce English text
Training example: input: a string of characters;
associated output: its phonetic transcription.
NetTalk: 309 neurons (3 layers), with 80 hidden
units. After 12 hours of training, performance is
95% on the training text (1000 words) and 90% on
new text (new words).
P. Palagi
61
Prediction of protein secondary structures
Example: Hierarchical Neural Network (Guermeur,
1997); amino acids are coded as 20-dimensional
binary vectors; input: a window sliding along the
residue sequence ...VVATLGATNPDKISACQQAG;
associated output: 3 classes - α-helix, β-sheet
and coil.
P. Palagi
62
Prediction of protein secondary structure
P. Palagi
63
Prediction of protein secondary structure
HNN:
http://pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_nn.html
P. Palagi
64
PREDICTION OF THE OCCURRENCE OF PROTEIN
MYRISTOYLATION BY NEURAL NETWORKS -
back-propagation of the error. Pierre Scorpil,
Emmanuelle Roulet, Anne-Lise Veuthey
Protein myristoylation is a post-translational
modification which consists of the covalent
attachment of a myristate (C14:0), via an amide
bond, to the glycine residue located at the
NH2-terminal end of the nascent peptide. This
reaction requires not only the presence of a
glycine in the first position, but also a
consensus pattern of 6 amino acids defined by
PROSITE, given by the following formula:
G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}. The enzyme that
catalyzes this reaction is called
N-myristoyl-transferase and acts on numerous
protein substrates. There is a wide spectrum of
myristoylated proteins, among them the G proteins
associated with transmembrane receptors such as
the acetylcholine receptor, the GABA receptor,
etc.
65
Competitive learning
  • Principle
  • Only the weights of one neuron (called the
    winner) or of a group of neurons (called the
    winning cluster) change in response to a given
    input. This is quite similar to what are called
    vector quantization algorithms.

66
Competitive learning - algorithm
  • 1. Initialization
  • initialize wj(0) with different random values for
    j 1, 2, ...N
  • 2. Sampling
  • draw a sample vector x representing the input
    signal,
  • 3. Similarity matching
  • find the winning neuron i(x) at time n using
  • 4. Updating
  • adjust the synaptic weight of the winning neuron
    using
  • 5. Continuation
  • continue with step 2 until no noticeable changes
    are observed or the total number of iterations is
    attained.

j 1,2, .., N
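A sketch of these five steps in Python (the data, neuron count and learning rate are illustrative):

import numpy as np

def competitive_learning(X, n_neurons=4, eta=0.1, iters=1000, seed=0):
    """Winner-take-all competitive learning; rows of W act as code-vectors."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_neurons, X.shape[1]))      # 1. random initialization
    for _ in range(iters):
        x = X[rng.integers(len(X))]                   # 2. sample an input vector
        i = np.argmin(np.linalg.norm(W - x, axis=1))  # 3. winner = closest weights
        W[i] += eta * (x - W[i])                      # 4. move only the winner
    return W                                          # 5. fixed iteration budget here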
67
Competitive learning - weakness
  • High sensitivity of the network to the weight
    initialization.
  • How to overcome this problem
  • initialize the weights inside the input
    population (a poor solution...)
  • frequency-sensitive competitive learning (FSCL):
    a neuron is penalized after winning too often.
  • lattice networks (example: the Kohonen neural
    network): the winner neuron's weights are updated
    as well as those of its neighbors.

68
Self-organizing feature maps (SOFM) - Basic Models
  • Principle of topographic map formation
  • The spatial location of an output neuron in the
    topographic map corresponds to a particular
    domain or feature of the input data. (T.
    Kohonen, 1990)
  • output map arranged in a one- or two-dimensional
    lattice,
  • lattice: a topology in which each neuron has a
    set of neighbors,
  • two basic models
  • Willshaw - von der Malsburg model (1976) (not
    described here),
  • Kohonen model (1982).

69
Self-organizing feature maps
- SOFMs belong to the class of competitive neural
  networks. The neurons of the single layer
  compete against each other in such a way that
  only one neuron is activated for a given input.
- They organize themselves according to the inputs
  of the database while keeping the topological
  constraints of the input space: there is a
  strict correspondence between the input space
  and the network space.
P. Palagi
70
Kohonen Model
  • also inspired by neurobiological considerations,
    but not meant to explain neurobiological details,
  • captures essential features of computational maps
    in the brain and yet remains computationally
    tractable,
  • can perform data compression (dimensionality
    reduction on the input)
  • by optimally placing code-vectors into
    higher-dimensional input space,
  • well known as self-organizing feature map (SOFM).

C. Pellegrini
71
Lateral Feedback
  • mechanism for modifying the form of excitation
    applied to a network,
  • takes place between neurons of the same layer or
    map,
  • depends on the lateral distance from its point of
    application,
  • two types of connections
  • weighted sum of input signals performs feature
    detection,
  • feedback connection has excitatory or inhibitory
    effects.

C. Pellegrini
72
Lateral Feedback
  • following biological motivation, lateral feedback
    is usually described by a Mexican hat
    function,
  • three distinct areas
  • 1. short-range lateral excitation area,
  • 2. penumbra of inhibitory action,
  • 3. area of weaker excitation (usually ignored).

C. Pellegrini
73
SOFM algorithm
  • 1. Initialization
  • initialize wj(0) with different random values for
    j 1, 2, ...N
  • 2. Sampling
  • draw a sample vector x representing the sensory
    signal,
  • 3. Similarity matching
  • find the winning neuron i(x) at time n using
  • 4. Updating
  • adjust synaptic weights of all neurons using
  • 5. Continuation
  • continue with step 2 until no noticeable changes
    are observed or the total number of iterations is
    attained.

j 1,2, .., N
otherwise
C. Pellegrini
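A 1-D lattice sketch of the SOFM algorithm in Python (linear decay of the learning rate and of the neighborhood radius is assumed here; the following slides note that the exact form is not critical):

import numpy as np

def sofm_1d(X, n_neurons=20, T=5000, eta0=0.9, seed=0):
    """Kohonen SOFM on a 1-D lattice: the winner and its lattice
    neighbors are updated; eta and the radius shrink over time."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_neurons, X.shape[1]))      # random initialization
    r0 = n_neurons // 2                               # initial radius
    for t in range(T):
        x = X[rng.integers(len(X))]                   # sampling
        i = np.argmin(np.linalg.norm(W - x, axis=1))  # similarity matching
        eta = max(eta0 * (1 - t / T), 0.01)           # decaying learning rate
        r = max(int(r0 * (1 - t / T)), 1)             # shrinking neighborhood
        for j in range(max(0, i - r), min(n_neurons, i + r + 1)):
            W[j] += eta * (x - W[j])                  # update winner + neighbors
    return W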
74
Selection of parameters
  • Learning-rate parameter η
  • should be time-varying,
  • during the ordering phase (first 1000
    iterations)
  • the phase during which the topological ordering
    takes place,
  • initial value of η close to 1, decreasing
    gradually to 0.1,
  • the exact form of the variation is not critical
    (linear, exponential or inversely proportional
    to n),
  • during the convergence phase
  • the phase during which the fine tuning of the map
    is done (thousands of iterations),
  • for good statistical accuracy, η(n) should be
    maintained at a small value (approx. 0.01 or
    less).
  • rate of decrease for η:
    η0: initial value of η; t: current training
    iteration; T: total number of iterations
C. Pellegrini
75
Selection of parameters
  • Neighborhood function Λi
  • usually a square region around the winning
    neuron,
  • begins by including all the neurons of the
    network,
  • then gradually shrinks with time,
  • during the ordering phase
  • the radius of Λi shrinks linearly to a couple of
    neighboring neurons,
  • during the convergence phase
  • the neighborhood function Λi should contain only
    the nearest neighbors of the winning neuron i,
  • rate of decrease for the "radius" of Λi:
    r0: initial value of the radius; t: current
    training iteration; T: total number of iterations
C. Pellegrini
76
Example
  • 2-dimensional lattice of 10x10 neurons,
  • 2-dimensional input vector x = (x1, x2), with x1
    and x2 uniformly distributed in -1 < x1 < +1,
    -1 < x2 < +1,
  • synaptic weights initialized to random values
    (fig. a),
  • the following figures illustrate the SOFM
    learning algorithm
  • ordering phase: the map unfolds to form a mesh,
  • after 50 iterations (fig. b),
  • after 1000 iterations (fig. c),
  • convergence phase: the map spreads out to fill
    the input space,
  • after 10'000 iterations (fig. d),
  • at the end, the statistical distribution of the
    neurons in the map approaches that of the input
    vectors (except for some edge effects).

C. Pellegrini
77
Fig. 10.12
C. Pellegrini
78
Example
  • each pattern coded as a 4-component vector
    a = (a1, a2, a3, a4),
  • 26 input patterns A - Z,
  • minimal spanning tree
C. Pellegrini
79
Example
  • output space defined as a 7x10 lattice,
  • after training the network with the SOFM
    algorithm, we get the map shown in the figure.

C. Pellegrini
81
Example (Kohonen): construction of phylogenetic
trees from protein sequences, applied to
interleukin-2
Self-organizing tree algorithm (classifies and
constructs phylogenetic trees):
http://www.cnb.uam.es/bioinfo/Software/sota/sotadocument.html
82
Summary
  • Neurons are massively connected, parallel,
    distributed processors.
  • They are able to store information, and this
    information may be retrieved.
  • Knowledge is acquired during a learning process
    and is stored in the synaptic weights of the
    neurons' connections.
  • Two phases are needed
  • learning (information acquisition)
  • test or use

P. Palagi
83
Some Web sites with examples, applications and
more
Artificial neural networks - an introduction to
connectionism:
http://avalon.epm.ornl.gov/touzetc/Book/Bouquin.htm
A short guide to prediction and classification
systems:
http://bopwww.biologie.uni-freiburg.de/bioinfo/tutorials/networks_doc.html
Prediction of the subcellular location of proteins
by neural networks:
http://predict.sanger.ac.uk/nnpsl/
Secondary structure and class protein server:
http://www.cmpharm.ucsf.edu/imc/pred2ary/index.html