Slide 1: Comparison of Object-based Classification Techniques on Multispectral Images

Università degli Studi di Cassino, Facoltà di Ingegneria delle Telecomunicazioni, Dipartimento DAEIMI

G. Cuozzo(1), C. D'Elia(1), C. De Stefano(1), F. Fontanella(2), C. Marrocco(1), M. Molinara(1), A. Scotto di Freca(1), F. Tortorella(1)
(1) DAEIMI, University of Cassino, Via Di Biasio 43, 03043 Cassino, Italy. Ph. +39 0776 2993748, Fax +39 0776 2993987. Email: {g.cuozzo, delia, destefano, c.marrocco, m.molinara, a.scotto, tortorella}@unicas.it
(2) DIS, University of Napoli, Via Claudio 21, 80300 Napoli, Italy. Email: frfontan@unina.it
Slide 2: Overview
- Scenario and motivation
- Pattern Recognition
- Classifiers
  - MLP (neural network)
  - LVQ (neural network)
  - DLVQ (neural network)
  - K-NN
  - SVM: Linear, RBF, Polynomial (kernel machine)
  - ECOC
  - Genetic Algorithms (evolutionary algorithm)
- Experimental results
- Conclusions
Slide 3: Scenario and Motivations
- Contextual
- Radiometric
- Geometrical
Slide 4: Pattern Recognition
- Given the description of an object that belongs to one of N possible classes, the system has to associate a class with each object, using base knowledge about the single classes.
- Training phase
- Test phase
Slide 5: Classifiers - Neural Networks
Slide 6: Classifiers - MLP
- Perceptron
- Several layers
- Backpropagation
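As an illustration (a minimal sketch, not the configuration used in the paper; the layer sizes and data are placeholders), an MLP with several layers trained by backpropagation in scikit-learn:

    # Minimal MLP sketch: synthetic data, two hidden layers, backpropagation.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=500, n_features=8, n_classes=3,
                               n_informative=5, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

    mlp = MLPClassifier(hidden_layer_sizes=(32, 16),  # several layers
                        max_iter=1000, random_state=0)
    mlp.fit(Xtr, ytr)                                 # trained with backpropagation
    print("test accuracy:", mlp.score(Xte, yte))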
Slide 7: Classifiers - MLP
Slide 8: Classifiers - LVQ
Slide 9: Classifiers - LVQ
(Figure: each training sample is compared with all codebook vectors using the Euclidean distance; the minimum-distance neuron is the winner, and it is updated with one of two rules depending on whether its class matches the class of the sample. A sketch of the update rule follows.)
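A minimal LVQ1-style update sketch in NumPy (illustrative; the learning rate and names are ours): the winning prototype is moved toward the sample when the classes match, away from it otherwise.

    import numpy as np

    def lvq1_step(prototypes, proto_labels, x, y, lr=0.05):
        d = np.linalg.norm(prototypes - x, axis=1)    # Euclidean distances
        w = int(np.argmin(d))                         # net winner (min distance)
        sign = 1.0 if proto_labels[w] == y else -1.0  # class match / mismatch
        prototypes[w] += sign * lr * (x - prototypes[w])
        return w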
Slide 10: Dynamic Learning Vector Quantization
Slide 11: DLVQ - How to select the neurons to split?
- We have introduced a gain functional G(n):
- G(n) = (Cm(n) - Pm(n)) / Cm(n)
- Pm(n) is the number of positive matches, i.e. the number of samples of its class for which n is the net winner
- Cm(n) is the number of class matches, i.e. the number of samples of its class for which n is the class winner (this number includes both the positive matches and the samples of its class for which n is not the net winner but is the closest neuron among those belonging to its class)
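A sketch of how G(n) could be computed for one neuron (function and variable names are ours, not the paper's code):

    import numpy as np

    def gain(prototypes, proto_labels, X, y, n):
        d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
        same = (y == proto_labels[n])                  # samples of n's class
        net_winner = (d.argmin(axis=1) == n)           # n closest among all neurons
        own = np.where(proto_labels == proto_labels[n])[0]
        class_winner = (own[d[:, own].argmin(axis=1)] == n)  # n closest within its class
        Pm = np.sum(same & net_winner)                 # positive matches
        Cm = np.sum(same & class_winner)               # class matches (includes Pm)
        return (Cm - Pm) / Cm if Cm > 0 else 0.0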
Slide 12: DLVQ
- Chooses the number of neurons per class
- Progressive learning
- Fast
Slide 13: K-NN
- The K nearest points are selected and the most frequently represented class is associated with the sample under analysis
(Figure: K = 1 → class +, K = 3 → class −)
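A K-NN sketch with scikit-learn (illustrative; Xtr, ytr, Xte, yte as in the MLP sketch above):

    from sklearn.neighbors import KNeighborsClassifier

    knn = KNeighborsClassifier(n_neighbors=3)   # K = 3: majority vote of 3 neighbors
    knn.fit(Xtr, ytr)
    print("test accuracy:", knn.score(Xte, yte))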
Slide 14: SVM
- While the classifiers described before are applicable to both binary and multiclass problems, we now introduce Support Vector Machines (SVMs), which are binary classifiers (or dichotomizers).
- In two-class classification problems, a sample can be assigned to one of two mutually exclusive classes that can be generically given the labels yi = ±1, where the sign of the label indicates the class to which the data point belongs.
Slide 15: Two classes, linearly separable
- Binary classification can be viewed as the task of separating classes in feature space:
  wᵀx + b = 0
  wᵀx + b > 0
  wᵀx + b < 0
  f(x) = sign(wᵀx + b)
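The decision rule as a one-line sketch:

    import numpy as np

    def f(x, w, b):
        return np.sign(w @ x + b)   # +1 on one side of the hyperplane, -1 on the other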
Slide 16: SVM
- Datasets that are linearly separable, even with some noise, work out great
- But what are we going to do if the dataset is just too hard?
- How about mapping the data to a higher-dimensional space?
(Figure: 1-D data that is not separable along x becomes separable after the mapping x → (x, x²).)
Slide 17: Non-linear SVMs - Feature spaces
- General idea: the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable:
  Φ: x → φ(x)
Slide 18: SVM
- The learning task can be reduced to the minimization of the primal Lagrangian
  L(w, b, α) = ½ wᵀw − Σᵢ αᵢ [yᵢ(wᵀxᵢ + b) − 1]   (1)
  where the αᵢ are Lagrange multipliers (hence αᵢ ≥ 0).
- The decision for a new sample z to be classified is based on the sign of
  f(z) = Σᵢ αᵢ yᵢ xᵢᵀz + b   (2)
Slide 19: The Kernel Trick
- As said above, the linear classifier relies on the inner product between vectors: K(xᵢ, xⱼ) = xᵢᵀxⱼ
- When every data point is mapped into a high-dimensional space via some transformation Φ: x → φ(x), the inner product becomes K(xᵢ, xⱼ) = φ(xᵢ)ᵀφ(xⱼ)
- A kernel function is a function that corresponds to an inner product in some expanded feature space.
- Mercer's theorem: every positive semi-definite symmetric function is a kernel.
Slide 20: Examples of Kernel Functions
- Linear: K(xᵢ, xⱼ) = xᵢᵀxⱼ
- Polynomial of power p: K(xᵢ, xⱼ) = (1 + xᵢᵀxⱼ)ᵖ
- Gaussian (radial-basis function network): K(xᵢ, xⱼ) = exp(−‖xᵢ − xⱼ‖² / 2σ²)
- Two-layer perceptron: K(xᵢ, xⱼ) = tanh(β₀ xᵢᵀxⱼ + β₁)
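The four kernels as plain NumPy functions (a sketch; σ, p, β₀, β₁ are free hyperparameters):

    import numpy as np

    def linear(xi, xj):
        return xi @ xj

    def polynomial(xi, xj, p=2):
        return (1 + xi @ xj) ** p

    def gaussian(xi, xj, sigma=1.0):
        return np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2))

    def two_layer_perceptron(xi, xj, beta0=1.0, beta1=0.0):
        return np.tanh(beta0 * (xi @ xj) + beta1)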
Slide 21: Multiclass (N) problems with SVM
- One vs. All: L = N binary problems
- All pairs: L = N(N−1)/2 binary problems
- ECOC: L > N binary problems (depending on the error-correcting code adopted)
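In scikit-learn terms, the three reductions look as follows (a sketch; the base SVM and its parameters are placeholders, not the paper's setup):

    from sklearn.multiclass import (OneVsOneClassifier, OneVsRestClassifier,
                                    OutputCodeClassifier)
    from sklearn.svm import SVC

    base = SVC(kernel="rbf")
    ova   = OneVsRestClassifier(base)                  # L = N binary problems
    pairs = OneVsOneClassifier(base)                   # L = N(N-1)/2 binary problems
    ecoc  = OutputCodeClassifier(base, code_size=2.0,  # L > N binary problems
                                 random_state=0)
    for clf in (ova, pairs, ecoc):
        clf.fit(Xtr, ytr)                              # Xtr, ytr as in earlier sketches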
Slide 22: OVA coding
N = 4, L = 4
Problem 1: A vs. B∪C∪D
Problem 2: B vs. A∪C∪D
Problem 3: C vs. A∪B∪D
Problem 4: D vs. A∪B∪C
Distance matrix: d = 2 (a code with minimum distance d corrects up to ⌊(d−1)/2⌋ errors)
Slide 23: All pairs coding
N = 4, L = 6
Problem 1: A∪B vs. C∪D
Problem 2: A∪C vs. B∪D
Distance matrix: d = 4
Slide 24: ECOC coding
N = 4, L = 7
Problem 1: A∪C vs. B∪D
Problem 2: B vs. A∪C∪D
…
Problem 7: A∪D vs. B∪C
Distance matrix: d = 3
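ECOC decoding sketch: each class gets a codeword of L bits, and a sample's L binary decisions are matched to the nearest codeword in Hamming distance (the coding matrix below is illustrative, not the one adopted in the paper):

    import numpy as np

    codewords = np.array([[1, 0, 1, 1, 0, 1, 1],   # class A
                          [0, 1, 0, 1, 1, 0, 1],   # class B
                          [1, 0, 0, 0, 1, 1, 0],   # class C
                          [0, 1, 1, 0, 0, 1, 0]])  # class D

    def decode(bits):
        # index of the codeword with the smallest Hamming distance
        return int(np.argmin(np.sum(codewords != bits, axis=1)))

    print(decode(np.array([1, 0, 1, 0, 0, 1, 1])))  # -> 0 (class A, one bit flipped)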
Slide 25: Evolutionary Algorithms (EAs)
- A wide class of algorithms mimicking the natural phenomenon of evolution
- Well suited for problems where the solution space is very large, multidimensional, complex, and discontinuous
- They typically work on a population of individuals, each representing a solution of the problem to be solved
Slide 26: A Typical EA
- An initial population of individuals (i.e. a set of solutions) is generated (usually randomly)
- The effectiveness of each individual in the current population is evaluated by a fitness function
- A new population is generated by:
  - selecting individuals in the current population
  - modifying the selected individuals by using some genetic operators
- The last two steps are repeated until a termination criterion is satisfied (see the sketch below)
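The loop above in schematic Python (a sketch; the fitness, selection, and genetic operators are placeholders for the components described in the next slides):

    import random

    def evolve(init, fitness, select, crossover, mutate,
               pop_size=50, generations=100):
        pop = [init() for _ in range(pop_size)]
        for _ in range(generations):                     # termination criterion
            scored = [(fitness(ind), ind) for ind in pop]
            # crossover here is assumed to return a single offspring
            pop = [mutate(crossover(select(scored), select(scored)))
                   for _ in range(pop_size)]
        return max(pop, key=fitness)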
Slide 27: EA Basic Elements
- Solution encoding
- Fitness function
- Selection mechanism
- Genetic Operators
Slide 28: Solution Encoding
- An individual is a variable-length list of real-valued vectors (genes), representing a set of reference vectors (prototypes) in the feature space
- Hence, each individual implements a complete classifier: classification is performed by assigning to an unknown sample the label of the nearest prototype in the feature space
Breeder Genetic Algorithm
Slide 29: Fitness function
- The fitness of each individual is evaluated as follows:
- each sample of the training set is assigned to the nearest prototype in the feature space (the Euclidean distance is used)
- after this step, each prototype is assigned a label corresponding to the class whose samples are most frequent in its neighborhood
- the recognition rate is computed and used as the fitness value of that individual
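The three steps as a NumPy sketch (names are ours, not the paper's code; y is assumed to hold integer class labels):

    import numpy as np

    def fitness(prototypes, X, y):
        d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
        nearest = d.argmin(axis=1)            # Euclidean nearest prototype per sample
        labels = np.full(len(prototypes), -1)
        for p in range(len(prototypes)):
            members = y[nearest == p]
            if members.size:                  # most frequent class in the neighborhood
                labels[p] = np.bincount(members).argmax()
        return np.mean(labels[nearest] == y)  # recognition rate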
Slide 30: Selection mechanism
- We have adopted a selection mechanism based on the concept of tournament
- In tournament selection, a number T of individuals is randomly chosen from the population, and the best individual in this group is selected as a parent
- Such a mechanism makes it possible to control the loss of diversity and the selection intensity
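Tournament selection in two lines (a sketch):

    import random

    def tournament(scored, T=3):
        # scored = [(fitness, individual), ...]; draw T at random, keep the best
        return max(random.sample(scored, T), key=lambda fi: fi[0])[1]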
Slide 31: Experimental results
Slide 32: Examples
(Figure: original image, segmented image, and classification map.)
Slide 33: Conclusions
- In this paper we have compared the performance of several widely adopted classification schemes on a remote sensing problem. The problem faced is very complex, with highly overlapped classes, and represents a significant test bed for the classifiers considered.
- Future development, rather than focusing on the enhancement of single classifiers, will consider the analysis of the correlation existing among the results provided by the classifiers, so as to test new classification systems made of combinations of single classifiers.
- DLVQ can be used to address the problems of progressive learning and clustering.
Slide 34: Example of Crossover (1)
Parent 1: length 4, cut point 1
Parent 2: length 3, cut point 2
Slide 35: Example of Crossover (2)
Offspring 1: final length 2
Offspring 2: final length 5
(Each offspring takes the head of one parent up to its cut point and the tail of the other parent, so lengths may change: 1 + 1 = 2 and 2 + 3 = 5. A sketch follows.)
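A sketch of the variable-length one-point crossover shown above (the gene names are placeholders):

    def crossover(p1, c1, p2, c2):
        # swap tails at the two cut points; offspring lengths may change
        return p1[:c1] + p2[c2:], p2[:c2] + p1[c1:]

    o1, o2 = crossover(["g1", "g2", "g3", "g4"], 1, ["h1", "h2", "h3"], 2)
    print(len(o1), len(o2))   # -> 2 5, as in the slides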
Slide 36: Classifiers - DLVQ
Dynamic Learning Vector Quantization (DLVQ):

    set the number k of iterations of the FSCL algorithm
    assign the same initial number of neurons to each class
    set N as the total number of assigned neurons
    while (stop condition is not satisfied)
    {
        perform k iterations of the FSCL algorithm
        select the neuron n (using the gain functional G(n) defined below)
        if (n does not match any samples of its class)
            remove n
            decrement N
        else
        {
            split the neuron n
            increment the number of neurons of the class of n
            increment N
        }
        evaluate stop condition
    }
- We have introduced a gain functional G(n):
- G(n) = (Cm(n) - Pm(n)) / Cm(n)
- Pm(n) is the number of positive matches, i.e. the number of samples of its class for which n is the net winner
- Cm(n) is the number of class matches, i.e. the number of samples of its class for which n is the class winner (this number includes both the positive matches and the samples of its class for which n is not the net winner but is the closest neuron among those belonging to its class)
Slide 37: SVM Classification Margin
- The distance from an example x to the separator is r = y(wᵀx + b) / ‖w‖
- Examples closest to the hyperplane are support vectors.
- The margin ρ of the separator is the width of separation between classes.
Slide 38: Linear SVMs Mathematically
- Then we can formulate the quadratic optimization problem:
  Find w and b such that Φ(w) = ½ wᵀw is minimized and, for all (xᵢ, yᵢ): yᵢ(wᵀxᵢ + b) ≥ 1
Slide 39: Linear SVMs Mathematically
- The classifier is a separating hyperplane.
- The most important training points are the support vectors: they define the hyperplane.
- Quadratic optimization algorithms can identify which training points xᵢ are support vectors, i.e. those with non-zero Lagrange multipliers αᵢ.
- Both in the dual formulation of the problem and in the solution, training points appear only inside inner products:
  Find α₁…α_N such that Q(α) = Σᵢ αᵢ − ½ Σᵢ Σⱼ αᵢ αⱼ yᵢ yⱼ xᵢᵀxⱼ is maximized subject to (1) Σᵢ αᵢ yᵢ = 0 and (2) 0 ≤ αᵢ ≤ C for all i
  f(x) = Σᵢ αᵢ yᵢ xᵢᵀx + b
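A closing sketch: fitting a soft-margin linear SVM and inspecting the quantities named above (Xtr, ytr as in the earlier sketches; the QP solving happens inside fit):

    from sklearn.svm import SVC

    svm = SVC(kernel="linear", C=1.0)
    svm.fit(Xtr, ytr)
    print(svm.support_vectors_.shape)   # the training points that define the hyperplane
    print(svm.dual_coef_)               # alpha_i * y_i for the support vectors
    print(svm.intercept_)               # b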