Title: LVQ Algorithms
1. LVQ Algorithms
Presented by:
- KRISHNAMOORTHI B.M.
- RAJKUMAR
2. Outline
- Introduction
- Codebook vectors
- LVQ1 algorithm
- Example for LVQ1
- LVQ2 and OLVQ1
- Applications of LVQ
- Issues of LVQ
- Summary and Conclusion
3. Learning Vector Quantization
- Combines competitive learning with supervision.
- Competitive learning produces clusters.
- Assign a class (or output value) to each cluster.
- Reinforce a cluster representative (a neuron) when it classifies an input into the desired class:
  - Positive reinforcement pulls the neuron's weights toward the input.
  - Negative reinforcement pushes the weights away.
4. Voronoi Region, Codeword
5.
- An input vector x is picked at random from the input space. If the class labels of the input vector x and a Voronoi vector w agree, the Voronoi vector w is moved in the direction of the input vector x. If, on the other hand, the class labels of the input vector x and the Voronoi vector w disagree, the Voronoi vector w is moved away from the input vector x.
6. The LVQ1 learning algorithm
- Inputs:
  - Let m_i be the set of untrained codebook vectors, properly distributed among the class labels.
  - Let rlen be the number of learning cycles defined by the user.
- Outputs:
  - A trained set of codebook vectors m_i that correctly classify the given instances.
7. Procedure LVQ1
- procedure LVQ1
- Let c = argmin_i ||x - m_i|| define the nearest m_i to x, denoted m_c.
- Let 0 < a(t) < 1, where a(t) is constant or decreases monotonically with time.
- Let m_i(t) represent the sequences of the m_i in the discrete-time domain.
- while learning cycles < rlen loop
-   for each input sample x(t) loop
-     if i ≠ c then
-       m_i(t+1) = m_i(t)
-     else
-       if x and m_c belong to the same class then
-         m_c(t+1) = m_c(t) + a(t)[x(t) - m_c(t)]
-       end if
-       if x and m_c belong to different classes then
-         m_c(t+1) = m_c(t) - a(t)[x(t) - m_c(t)]
-       end if
-     end if
-   end loop
- end loop
- end
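Reading the pseudocode as a whole, a minimal NumPy sketch might look as follows. The function name lvq1, the linearly decaying schedule for a(t), and the default a0 are my assumptions; the codebook M and its fixed class labels are assumed to be initialized as on slide 6.

```python
import numpy as np

def lvq1(X, y, M, labels, rlen, a0=0.1):
    """Train codebook vectors M (one row per vector) with LVQ1."""
    M = M.copy()
    for t in range(rlen):
        a = a0 * (1.0 - t / rlen)              # a(t): monotonically decreasing, 0 < a(t) < 1
        for x, cls in zip(X, y):
            c = np.argmin(np.linalg.norm(M - x, axis=1))  # nearest codebook vector m_c
            if labels[c] == cls:
                M[c] += a * (x - M[c])         # same class: pull m_c toward x(t)
            else:
                M[c] -= a * (x - M[c])         # different class: push m_c away from x(t)
    return M
```

Calling lvq1(X, y, M, labels, rlen=100) returns the trained codebook; a new point is then classified with the label of its nearest codebook vector.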
8. LVQ1 Algorithm (cont.)
- Stopping criteria:
  - Codebook vectors have stabilized, or
  - The maximum number of epochs has been reached.
9.
- LVQ establishes a number of codebook vectors in the input space to approximate various domains of the input vector by quantized values. The labeled samples are fed into a generalization module that finds a set of codebook vectors representing knowledge about the class membership of the training set.
10. Example
- 4 input vectors, 2 classes.
- Step 1: assign a target vector to each input. Target vectors should be binary: each contains only zeros except for a single 1.
- Step 2: choose how many sub-classes will make up each of the 2 classes, e.g. 2 for each class → 4 prototype vectors (competitive neurons).
- Note: typically the number of prototype vectors << the number of input vectors.
11. Example (cont.)
- W2: neurons 1 and 2 are connected to class 1; neurons 3 and 4 to class 2.
- Step 3: W1 is initialized to small random values. The weight vectors belonging to the competitive neurons that define class 1 are marked with circles; those of class 2 with squares.
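As a concrete illustration of this wiring, here is a sketch assuming the slide's 4-neuron, 2-class setup: W2 is a fixed binary matrix that routes each competitive (subclass) neuron to its class, the same W2 used in the second-layer equation a2 = W2 a1 on slide 13.

```python
import numpy as np

# Rows = classes, columns = competitive (subclass) neurons.
# Neurons 1 and 2 feed class 1; neurons 3 and 4 feed class 2.
W2 = np.array([[1, 1, 0, 0],
               [0, 0, 1, 1]])

a1 = np.array([0, 1, 0, 0])  # competitive layer output: neuron 2 wins
a2 = W2 @ a1                 # array([1, 0]) -> the input is assigned to class 1
```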
13. Example: Iteration 1, Second Layer
- a2(0) = W2 a1(0)
- This is the correct class, therefore the weight vector is moved toward the input vector (learning rate 0.5):
- 1w1(1) = 1w1(0) + a (p1 - 1w1(0)), i.e., with a = 0.5 the winning weight vector moves halfway toward p1.
14. (Figure: input vectors P1-P4 and the initial weight vectors 1W1(0)-4W1(0) plotted in the input space, with one input presented.)
15. Iteration 1 illustrated
- (Figure: the input is presented; the winning neuron is positively reinforced, moving toward the input.)
16. After several iterations
- (Figure: the converged weight vectors 1W1(inf)-4W1(inf) have settled next to the input vectors P1-P4.)
17. Stopping rule
- Neural-network algorithms often overlearn: with extensive learning, recognition accuracy first improves but may then very slowly start to decrease again. Various reasons account for this phenomenon. In the present case, especially with the limited number of training samples being recycled in learning, the codebook vectors become very specifically tuned to the training data, with the result that their ability to generalize to other data suffers.
18. Stopping rule (continued)
- Stop the learning process after some optimal number of steps: say, 30 to 50 times the total number of codebook vectors for OLVQ1, or 100 to 500 times the number of codebook vectors for LVQ1 and LVQ2 (depending on the learning rate and the type of data).
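A tiny sketch of this rule of thumb; the specific multipliers picked inside the quoted ranges are my own choice.

```python
def suggested_steps(n_codebook, algorithm="OLVQ1"):
    """Rule-of-thumb training length from the stopping rule above."""
    # 30-50 x codebook size for OLVQ1; 100-500 x for LVQ1/LVQ2.
    factor = 40 if algorithm == "OLVQ1" else 300
    return factor * n_codebook
```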
19. LVQ2
- A second, improved LVQ algorithm known as LVQ2 is sometimes preferred because it comes closer in effect to Bayesian decision theory.
20. LVQ2
- LVQ2 is applied only after LVQ1 has been run, using a small learning rate and a small number of iterations.
- It optimizes the relative distances of the codebook vectors from the class borders → the results are typically more robust.
21. LVQ2 (continued)
- The same weight-update equations are used as in standard LVQ, but they are applied only under certain conditions, namely when:
  - The input vector x is incorrectly classified by the associated Voronoi vector wI(x),
  - The next-nearest Voronoi vector wS(x) does give the correct classification, and
  - The input vector x is sufficiently close to the decision boundary (the perpendicular bisector plane) between wI(x) and wS(x).
22. LVQ2 (continued)
- In this case, both vectors wI(x) and wS(x) are updated (using the incorrect- and correct-classification update equations, respectively).
- Various other variations on this theme exist (LVQ3, etc.), and this is still a fruitful research area for building better classification systems.
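A hedged sketch of one such update step: the "sufficiently close to the boundary" test below uses the window parameter w of the standard LVQ2.1 formulation from LVQ_PAK, which is not spelled out on these slides; the function name and default values are mine.

```python
import numpy as np

def lvq2_step(x, cls, M, labels, a=0.05, w=0.3):
    """One LVQ2-style update on sample x with true class cls."""
    d = np.linalg.norm(M - x, axis=1)
    i, s = np.argsort(d)[:2]                 # nearest and next-nearest codebook vectors
    # Window test: x must lie near the midplane between m_i and m_s
    # (d[i] <= d[s], so this ratio equals min(d_i/d_s, d_s/d_i)).
    in_window = d[i] / d[s] > (1.0 - w) / (1.0 + w)
    # Update only when the nearest vector is wrong, the runner-up is
    # right, and x falls inside the window around the decision boundary.
    if labels[i] != cls and labels[s] == cls and in_window:
        M[i] -= a * (x - M[i])               # push the wrong vector away from x
        M[s] += a * (x - M[s])               # pull the correct vector toward x
    return M
```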
23. All about the learning rate
- We address the problem of whether a_i(t) can be determined optimally for the fastest possible convergence of
- m_c(t+1) = [1 - s(t) a_c(t)] m_c(t) + s(t) a_c(t) x(t)
- where s(t) = +1 if the classification is correct and s(t) = -1 if it is wrong. We first directly see that m_c(t) is statistically independent of x(t).
- It may also be obvious that the statistical accuracy of the learned codebook vector values is optimal if the effects of the corrections made at different times are of equal weight.
24.
- Notice that m_c(t+1) contains a "trace" of x(t) through the last term in the equation above, and traces of the earlier inputs x(t'), t' = 1, 2, ..., t-1, through m_c(t). The (absolute) magnitude of the last trace of x(t) is scaled down by the factor a_c(t), and, for instance, the trace of x(t-1) is scaled down by [1 - s(t) a_c(t)] a_c(t-1).
- We now stipulate that these two scalings must be identical:
- a_c(t) = [1 - s(t) a_c(t)] a_c(t-1)
25.
- Thus the optimal values of a_c(t) are determined by the recursion:
- a_c(t) = a_c(t-1) / (1 + s(t) a_c(t-1))
26. LVQ Issues
- Initialization of the codebook vectors:
  - Random initialization.
  - Dead neurons: vectors placed too far from the training examples never win the competition.
- How many codebook vectors for each class?
  - Typically set proportional to the number of input vectors in each class (see the sketch after this list).
- How many epochs?
  - Depends on the complexity of the data and the learning rate.
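One simple way to realize the proportional allocation rule from the list above; the rounding scheme and the minimum of one vector per class are my own choices.

```python
from collections import Counter

def codebook_sizes(y, n_total):
    """Allocate n_total codebook vectors proportionally to class frequency."""
    counts = Counter(y)
    n = len(y)
    return {cls: max(1, round(n_total * k / n)) for cls, k in counts.items()}
```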
27. Various applications of LVQ
- Image analysis and OCR
- Speech analysis and recognition
- Signal processing and radar
- Industrial and real-world measurements and robotics
- Mathematical problems
28. One Application
- Learning Vector Quantization in footstep identification
29. Intelligent Systems Group (ISG) research laboratory at the University of Oulu
32.
- LVQ1 was applied for 1000 iterations, and OLVQ1 was run for 100 iterations.
- The accuracy rate is 78%.
33. Summary and Conclusions
- LVQ is a supervised learning algorithm.
- It classifies input vectors using a competitive layer to find subclasses of input vectors, which are then combined into classes.
- It can classify any set of input vectors (linearly separable and non-linearly separable, convex regions and non-convex regions) if:
  - there are enough neurons in the competitive layer, and
  - each class is assigned enough sub-classes (i.e., enough competitive neurons).
34. References
- Haykin, S. Neural Networks: A Comprehensive Foundation.
- Kohonen, T., Kangas, J., Laaksonen, J., and Torkkola, K. LVQ_PAK: A program package for the correct application of Learning Vector Quantization algorithms.