Title: Kohonen's Self-Organizing Feature Maps
1. Kohonen's Self-Organizing Feature Maps
- Dr. N. Reyes
- Adapted from AIJunkie.com
2. A Simple Kohonen Network
[Figure: a 4x4 lattice of nodes, each holding a weight vector; every node is connected to the input nodes carrying the input vector]
3. SOM for Color Clustering
- Unsupervised learning
- Reduces the dimensionality of information
- Data compression (vector quantisation)
- Clustering of data
- Topological relationships between the data are maintained
- Input: 3-D; Output: 2-D
4. SOM
- A SOM does not need a target output to be specified, unlike many other types of network. Instead, where the node weights match the input vector, that area of the lattice is selectively optimized to more closely resemble the data for the class the input vector is a member of.
- From an initial distribution of random weights, and over many iterations, the SOM eventually settles into a map of stable zones. Each zone is effectively a feature classifier, so you can think of the graphical output as a type of feature map of the input space.
- If you take another look at the trained network shown in figure 1, the blocks of similar colors represent the individual zones. Any new, previously unseen input vector presented to the network will stimulate nodes in the zone with similar weight vectors.
5. Learning Algorithm
- Training occurs in several steps and over many iterations (a runnable sketch follows this list):
  1. Each node's weights are initialized.
  2. A vector is chosen at random from the set of training data and presented to the lattice.
  3. Every node is examined to calculate which one's weights are most like the input vector. The winning node is commonly known as the Best Matching Unit (BMU).
  4. The radius of the neighbourhood of the BMU is now calculated. This is a value that starts large, typically set to the 'radius' of the lattice, but diminishes each time-step. Any nodes found within this radius are deemed to be inside the BMU's neighbourhood.
  5. Each neighbouring node's weights (the nodes found in step 4) are adjusted to make them more like the input vector. The closer a node is to the BMU, the more its weights get altered.
  6. Repeat steps 2 through 5 for N iterations.
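The steps above can be assembled into one short training loop. The following is a minimal sketch in Python/NumPy, not the demo project's actual code: the 4x4 lattice, the 3-D colour inputs, the iteration count, and the 0.1 starting learning rate are illustrative assumptions, and the decay schedules anticipate Equations 2, 4, and 6 from the later slides.

    # A minimal SOM training loop in Python/NumPy (illustrative sketch only).
    import numpy as np

    width, height, dim = 4, 4, 3          # assumed lattice size and input dimensionality
    num_iterations = 1000                 # assumed training length (N)
    start_learning_rate = 0.1             # value used on slide 20

    rng = np.random.default_rng(0)
    weights = rng.random((width, height, dim))    # step 1: random weights, 0 <= w < 1
    training_data = rng.random((100, dim))        # stand-in for a set of colour vectors

    map_radius = max(width, height) / 2.0                  # sigma_0 (slide 14)
    time_constant = num_iterations / np.log(map_radius)    # lambda (slide 15)

    # Grid coordinates of every node, used for distances within the lattice.
    coords = np.stack(np.meshgrid(np.arange(width), np.arange(height),
                                  indexing="ij"), axis=-1)

    for t in range(num_iterations):
        v = training_data[rng.integers(len(training_data))]    # step 2: random input
        dists = np.linalg.norm(weights - v, axis=-1)           # step 3: Euclidean distance
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # Best Matching Unit

        sigma = map_radius * np.exp(-t / time_constant)          # step 4: shrinking radius
        lr = start_learning_rate * np.exp(-t / num_iterations)   # decaying learning rate

        d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)    # squared grid distance to BMU
        inside = d2 <= sigma ** 2                              # nodes in the neighbourhood
        theta = np.exp(-d2 / (2 * sigma ** 2))                 # Gaussian influence (slide 23)

        # step 5: pull neighbouring weights towards the input vector
        weights += (inside * theta * lr)[..., None] * (v - weights)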
6. Initializing The Weights
- Prior to training, each node's weights must be initialized. Typically these will be set to small standardized random values. The weights in the SOM demo project are initialized so that 0 < w < 1 (sketched below).
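In NumPy that initialization is a one-liner; the 4x4x3 shape is an assumption, and note that rng.random draws from the half-open interval [0, 1):

    import numpy as np

    rng = np.random.default_rng()
    # 4x4 lattice of 3-D weight vectors, each component uniform in [0, 1)
    weights = rng.random((4, 4, 3))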
7. Calculating the Best Matching Unit
- To determine the best matching unit, one
method is to iterate through all the nodes and
calculate the Euclidean distance between each
node's weight vector and the current input
vector. The node with a weight vector closest to
the input vector is tagged as the BMU.
8. Calculating the Best Matching Unit
- The Euclidean distance is given as

  $dist = \sqrt{\sum_{i=0}^{n} (V_i - W_i)^2}$   (Equation 1)

- where V is the current input vector and W is the node's weight vector.
9. Distance Calculation
- As an example, to calculate the distance between the vector for the colour red (1, 0, 0) and an arbitrary weight vector (0.1, 0.4, 0.5):

  distance = sqrt( (1 - 0.1)^2 + (0 - 0.4)^2 + (0 - 0.5)^2 )
           = sqrt( (0.9)^2 + (-0.4)^2 + (-0.5)^2 )
           = sqrt( 0.81 + 0.16 + 0.25 )
           = sqrt( 1.22 )
  distance ≈ 1.10

- (A code version of this BMU search follows.)
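A short Python rendering of the BMU search from slides 7-9; the weights array follows the sketch on slide 5, and find_bmu is a hypothetical helper name, not the demo project's.

    import numpy as np

    def find_bmu(weights: np.ndarray, v: np.ndarray):
        """Return the grid index of the node whose weight vector is
        closest to the input v under Euclidean distance (Equation 1)."""
        dists = np.linalg.norm(weights - v, axis=-1)   # distance at every node
        return np.unravel_index(np.argmin(dists), dists.shape)

    # The worked example above: red vs. one weight vector.
    red = np.array([1.0, 0.0, 0.0])
    w = np.array([0.1, 0.4, 0.5])
    print(np.linalg.norm(red - w))   # ~1.105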
10. Determining the Best Matching Unit's Local Neighborhood
- Each iteration, after the BMU has been determined, the next step is to calculate which of the other nodes are within the BMU's neighbourhood. All these nodes will have their weight vectors altered in the next step.
- First, the radius of the neighbourhood is determined; then each node's position is inspected to see whether it falls within that radial distance or not (see the sketch below).
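One way to express that membership test in Python, assuming each node's grid coordinates are held in an array as in the earlier sketch:

    import numpy as np

    def nodes_in_neighbourhood(coords: np.ndarray, bmu, radius: float) -> np.ndarray:
        """Boolean mask of the nodes whose grid position lies within
        `radius` of the BMU."""
        d2 = np.sum((coords - np.asarray(bmu)) ** 2, axis=-1)
        return d2 <= radius ** 2

    coords = np.stack(np.meshgrid(np.arange(4), np.arange(4),
                                  indexing="ij"), axis=-1)
    print(nodes_in_neighbourhood(coords, bmu=(1, 1), radius=2.0))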
11. Initial Size of a Typical Neighborhood
[Figure] The neighborhood shown is centered around the BMU (colored yellow) and encompasses most of the other nodes. The green arrow shows the radius.
12. Shrinking Neighbourhood
- A unique feature of the Kohonen learning algorithm is that the area of the neighborhood shrinks over time. This is accomplished by making the radius of the neighborhood shrink over time. An exponential decay function is used to implement this:
13. Area of Neighbourhood
- Exponential Decay Function:

  $\sigma(t) = \sigma_0 \exp\left(-\frac{t}{\lambda}\right)$   (Equation 2)

- where the Greek letter sigma, $\sigma_0$, denotes the width of the lattice at time t = 0, the Greek letter lambda, $\lambda$, denotes a time constant, and t is the current time-step (iteration of the loop).
14. MapRadius
- In code, the value $\sigma$ is represented by MapRadius, and is equal to $\sigma_0$ at the commencement of training.
- To calculate $\sigma_0$:
  MapRadius = max(LatticeWidth, LatticeHeight) / 2
15. Lambda (time constant)
- The value of $\lambda$ depends on $\sigma_0$ and the number of iterations chosen for the algorithm to run.
- In code:
  Lambda = NumOfIterations / log(MapRadius)
- NumOfIterations is the number of iterations the learning algorithm will perform.
16. Neighborhood Radius
- To calculate the neighborhood radius for each iteration of the algorithm using Equation 2 (see the sketch below):
  NeighborhoodRadius = MapRadius * exp(-IterCount / Lambda)
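Slides 14-16 combine into a short runnable schedule; here is a sketch with an assumed 4x4 lattice and 1000 iterations:

    import math

    lattice_width, lattice_height = 4, 4   # assumed demo lattice
    num_iterations = 1000                  # assumed training length

    map_radius = max(lattice_width, lattice_height) / 2    # sigma_0
    lam = num_iterations / math.log(map_radius)            # lambda

    def neighbourhood_radius(iter_count: int) -> float:
        """sigma(t) from Equation 2: starts at map_radius and decays."""
        return map_radius * math.exp(-iter_count / lam)

    for t in (0, 250, 500, 1000):
        print(t, round(neighbourhood_radius(t), 3))   # 2.0 down to 1.0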
17. Ever-Shrinking Neighborhood Radius
The neighborhood size decreases over time. (The figure is drawn assuming the neighborhood remains centered on the same node; in practice the BMU will move around according to the input vector being presented to the network.) Over time the neighborhood will shrink to the size of just one node... the BMU.
18. Weight Adjustment
- Now that we know the radius, it's a simple matter to iterate through all the nodes in the lattice to determine whether they lie within the radius or not.
- If a node is found to be within the neighborhood, then its weight vector is adjusted as follows.
19. Weight Adjustment
- Every node within the BMU's neighborhood (including the BMU) has its weight vector adjusted according to the following equation (sketched in code below):

  $W(t+1) = W(t) + L(t)\,(V(t) - W(t))$   (Equation 3)

- where t represents the time-step and L is a small variable called the learning rate, which decreases with time.
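Equation 3 is a one-liner in code; a small sketch with assumed NumPy vectors:

    import numpy as np

    def adjust_weight(w: np.ndarray, v: np.ndarray, lr: float) -> np.ndarray:
        """Equation 3: move the weight vector a fraction L(t) of the
        way towards the input vector."""
        return w + lr * (v - w)

    w = np.array([0.1, 0.4, 0.5])
    v = np.array([1.0, 0.0, 0.0])        # red
    print(adjust_weight(w, v, 0.1))      # [0.19 0.36 0.45]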
20. Learning Rate Decay
- The decay of the learning rate is calculated each iteration using the following equation:

  $L(t) = L_0 \exp\left(-\frac{t}{n}\right)$   (Equation 4)

  where $L_0$ is the starting learning rate and n is the number of iterations.
- In code (runnable version below):
  LearningRate = StartLearningRate * exp(-IterCount / NumOfIterations)
- The learning rate is initially set to 0.1, then gradually decays over time so that during the last few iterations it is close to zero.
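The same schedule as runnable Python, with the 0.1 start value from the slide and an assumed iteration count:

    import math

    start_learning_rate = 0.1    # from the slide
    num_iterations = 1000        # assumed

    def learning_rate(iter_count: int) -> float:
        """Equation 4: exponential decay of the learning rate."""
        return start_learning_rate * math.exp(-iter_count / num_iterations)

    print(learning_rate(0), learning_rate(999))   # 0.1 down to ~0.037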
21. Distance Influence
- However, not only does the learning rate have to decay over time; the effect of learning should also be proportional to the distance of a node from the BMU.
- Indeed, at the edges of the BMU's neighbourhood, the learning process should have barely any effect at all. Ideally, the amount of learning should fade over distance, similar to the Gaussian decay shown in the figure.
22. Learning with Distance Influence
- Modified Equation 3:

  $W(t+1) = W(t) + \Theta(t)\,L(t)\,(V(t) - W(t))$   (Equation 5)

- where the Greek capital letter theta, $\Theta$, represents the amount of influence a node's distance from the BMU has on its learning.
23. Theta
- $\Theta$, the amount of influence a node's distance from the BMU has on its learning, is given by

  $\Theta(t) = \exp\left(-\frac{dist^2}{2\sigma^2(t)}\right)$   (Equation 6)

- where dist is the distance a node is from the BMU and $\sigma$ is the width of the neighbourhood function as calculated by Equation 2.
- Additionally, note that $\Theta$ also decays over time (see the combined sketch below).
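Putting Equations 5 and 6 together, a brief Python sketch; dist and sigma would come from the neighbourhood calculations on the earlier slides:

    import math
    import numpy as np

    def influence(dist: float, sigma: float) -> float:
        """Equation 6: Gaussian falloff of learning with distance from the BMU."""
        return math.exp(-(dist ** 2) / (2 * sigma ** 2))

    def adjust_weight_with_influence(w, v, dist, sigma, lr):
        """Equation 5: the Equation 3 update scaled by the node's influence theta."""
        return w + influence(dist, sigma) * lr * (v - w)

    w = np.array([0.1, 0.4, 0.5])
    v = np.array([1.0, 0.0, 0.0])
    print(adjust_weight_with_influence(w, v, dist=1.0, sigma=2.0, lr=0.1))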
24. End of Presentation
- Let's see the simulation.