Title: Topographic processing of relational data
1Topographic processing of relational data
Barbara Hammer, TU Clausthal, Germany
Alexander Hasenfuss, TU Clausthal, Germany
Fabrice Rossi, INRIA, France
Marc Strickert, IPK Gatersleben, Germany
2- Topographic processing of Euclidean data
  - SOM
  - NG
- Topographic processing of relational data
  - Median clustering
  - Relational clustering
  - Supervision
- Experiments
  - Proteins
  - Chromosome images
  - Macroarrays
3Topographic processing of Euclidean data
5SOM and NG
- Prototype-based clustering
- prototypes are elements of the data space
- clustering by means of receptive fields
- Euclidean distance
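The receptive-field idea can be illustrated with a minimal, dependency-free sketch. The function names `sq_dist` and `receptive_fields` and the toy data are assumptions made for illustration; they do not come from the slides.

```python
def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def receptive_fields(X, W):
    """Map each prototype index to the indices of the points it wins."""
    fields = {i: [] for i in range(len(W))}
    for j, x in enumerate(X):
        # The winner is the prototype closest to x.
        winner = min(range(len(W)), key=lambda i: sq_dist(x, W[i]))
        fields[winner].append(j)
    return fields

# Toy data (assumed): two groups of points and two prototypes.
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.2]]
W = [[0.0, 0.0], [1.0, 1.0]]
fields = receptive_fields(X, W)
```

Each point lands in the receptive field of exactly one prototype, so the fields partition the data set.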
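6SOM
Self-organizing map:
- initialize $w_i$
- adapt $w_i$ in the direction $\exp(-\mathrm{nd}(I(x_j), i)/\sigma)\,(x_j - w_i)$, where $I(x_j)$ is the winner for $x_j$ and $\mathrm{nd}$ is the distance on the neuron grid
almost optimizes $E_{SOM} = \sum_{ij} \chi_j(i) \sum_k \exp(-\mathrm{nd}(i,k)/\sigma)\,\|x_j - w_k\|^2$, where $\chi_j(i)$ indicates whether neuron $i$ is the winner for $x_j$
direct visualization in Euclidean or hyperbolic space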
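A single online SOM adaptation step might look as follows. This is only a sketch under stated assumptions: the neurons are arranged on a 1-D chain (so the grid distance between neurons `i` and `k` is `|i - k|`), and a learning rate `eta` scales the step. Neither the lattice shape nor `eta` is specified on the slide, and the function names are hypothetical.

```python
import math

def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def som_online_step(x, W, sigma, eta):
    """Return the prototypes after one online SOM update for sample x."""
    # Winner: index of the prototype closest to x.
    winner = min(range(len(W)), key=lambda i: sq_dist(x, W[i]))
    updated = []
    for i, w in enumerate(W):
        # Assumed lattice: 1-D chain, grid distance = |winner - i|.
        h = math.exp(-abs(winner - i) / sigma)
        updated.append([wi + eta * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated

# Toy usage (assumed values): three prototypes on a line, one sample at 0.
W = [[0.0], [1.0], [2.0]]
W = som_online_step([0.0], W, sigma=1.0, eta=0.5)
```

Prototypes that are close to the winner on the grid move more strongly toward the sample than distant ones.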
7SOM
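almost optimizes $E_{SOM} = \sum_{ij} \chi_j(i) \sum_k \exp(-\mathrm{nd}(i,k)/\sigma)\,\|x_j - w_k\|^2$, with assignments $k_{ij} = \chi_j(i)$
Batch-SOM: repeat
- optimize $k_{ij}$ given fixed $w$
- optimize $w$ given fixed $k_{ij}$
i.e. repeat
- $k_{ij} = 1$ iff $w_i$ is the winner for $x_j$; $I(x_j)$ denotes the winner
- $w_i = \sum_j \exp(-\mathrm{nd}(I(x_j), i)/\sigma)\, x_j \,/\, \sum_j \exp(-\mathrm{nd}(I(x_j), i)/\sigma)$
converges, (quadratic) Newton scheme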
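The alternating Batch-SOM scheme can be sketched in a few lines. As an assumption for illustration, the neurons again sit on a 1-D chain, and the iteration count `n_iter` is a free parameter; the function names are hypothetical.

```python
import math

def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def batch_som(X, W, sigma, n_iter=10):
    """Alternate winner assignment and neighborhood-weighted averaging."""
    W = [list(w) for w in W]
    dim = len(X[0])
    for _ in range(n_iter):
        # Step 1: determine the winner for every point, prototypes fixed.
        winners = [min(range(len(W)), key=lambda i: sq_dist(x, W[i]))
                   for x in X]
        # Step 2: each prototype becomes a weighted mean of all points,
        # weighted by the grid distance between its index and the winner.
        new_W = []
        for i in range(len(W)):
            h = [math.exp(-abs(win - i) / sigma) for win in winners]
            total = sum(h)
            new_W.append([sum(hj * x[t] for hj, x in zip(h, X)) / total
                          for t in range(dim)])
        W = new_W
    return W

# Toy usage (assumed data): two well-separated groups on a line.
X = [[0.0], [0.1], [1.0], [1.1]]
W = batch_som(X, [[0.2], [0.8]], sigma=0.1)
```

With a small `sigma` the neighborhood weights of non-winners almost vanish, so the prototypes end up close to the two group means.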
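8Neural Gas
Neural Gas:
- initialize $w_i$
- adapt $w_i$ in the direction $\exp(-\mathrm{rk}(x_j, w_i)/\sigma)\,(x_j - w_i)$, where $\mathrm{rk}(x_j, w_i)$ is the rank of $w_i$ given $x_j$
optimizes $E_{NG} = \sum_{ij} \exp(-\mathrm{rk}(x_j, w_i)/\sigma)\,\|x_j - w_i\|^2$
neighborhood graph induced by Delaunay triangulation
visualization by means of MDS or similar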
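One online Neural Gas step can be sketched as below. The learning rate `eta` and the convention that the closest prototype has rank 0 are assumptions for illustration, and the function names are hypothetical.

```python
import math

def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def ng_online_step(x, W, sigma, eta):
    """Return the prototypes after one online Neural Gas update for x."""
    d = [sq_dist(x, w) for w in W]
    # rank[i] = number of prototypes closer to x than W[i] (0 = closest).
    order = sorted(range(len(W)), key=lambda i: d[i])
    rank = [0] * len(W)
    for r, i in enumerate(order):
        rank[i] = r
    return [[wi + eta * math.exp(-rank[i] / sigma) * (xi - wi)
             for wi, xi in zip(w, x)]
            for i, w in enumerate(W)]

# Toy usage (assumed values): prototype order in the list is irrelevant,
# only the distance ranking with respect to the sample matters.
W = ng_online_step([0.0], [[2.0], [0.5], [1.0]], sigma=1.0, eta=0.5)
```

Unlike the SOM, no lattice is involved: the neighborhood is defined purely by the ranking in data space.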
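9Neural Gas
optimizes $E_{NG} = \sum_{ij} \exp(-\mathrm{rk}(x_j, w_i)/\sigma)\,\|x_j - w_i\|^2$, with $k_{ij} = \mathrm{rk}(x_j, w_i)$
Batch-NG: repeat
- optimize $k_{ij}$ given fixed $w$
- optimize $w$ given fixed $k_{ij}$
i.e. repeat
- $k_{ij}$ := rank of prototype $w_i$ given $x_j$
- $w_i = \sum_j \exp(-k_{ij}/\sigma)\, x_j \,/\, \sum_j \exp(-k_{ij}/\sigma)$
converges, (quadratic) Newton scheme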
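The Batch-NG alternation can be sketched as follows, assuming ranks start at 0 for the closest prototype and a fixed number of iterations `n_iter`. Both are illustrative choices, and the function names are hypothetical.

```python
import math

def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def batch_ng(X, W, sigma, n_iter=10):
    """Alternate rank computation and rank-weighted averaging."""
    W = [list(w) for w in W]
    dim = len(X[0])
    for _ in range(n_iter):
        num = [[0.0] * dim for _ in W]
        den = [0.0] * len(W)
        for x in X:
            # Step 1: ranks of all prototypes for this point.
            d = [sq_dist(x, w) for w in W]
            order = sorted(range(len(W)), key=lambda i: d[i])
            # Step 2: accumulate the weighted sums for the new prototypes.
            for rank, i in enumerate(order):
                h = math.exp(-rank / sigma)
                den[i] += h
                for t in range(dim):
                    num[i][t] += h * x[t]
        W = [[v / den[i] for v in num[i]] for i in range(len(W))]
    return W

# Toy usage (assumed data): two well-separated groups on a line.
X = [[0.0], [0.1], [1.0], [1.1]]
W = batch_ng(X, [[0.2], [0.8]], sigma=0.1)
```

Every prototype is a convex combination of all data points, a property the relational variant later exploits.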
10Topographic processing of non-Euclidean data
12- data given by a dissimilarity matrix $D = (d_{ij})$
- no (explicit) Euclidean embedding
- needed: a method to represent $w_i$ and compute distances
- median clustering: restrict prototypes to data positions
- relational clustering: substitute the distance from the prototype
13Median clustering
- Median clustering
- no assumptions on $D$
- prototypes restricted to data points ⇒ only discrete values
- consecutive optimization of $k_{ij}$ and $w_i$ as beforehand
Median-NG: repeat
- $k_{ij}$ := rank of prototype $w_i$ given $x_j$
- $w_i = x_k$ with $\sum_j \exp(-k_{ij}/\sigma)\,\|x_j - x_k\|^2$ minimum
avoid identical prototypes!
converges for every $D$
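A possible Median-NG loop is sketched below. It assumes that `D[j][k]` stores the value that stands in for the squared distance between points `j` and `k`, and that prototypes are represented by data indices. The way identical prototypes are avoided here (skipping candidates already taken in the same sweep) is one simple strategy chosen for illustration, not necessarily the one used by the authors.

```python
import math

def median_ng(D, protos, sigma, n_iter=10):
    """protos: list of data indices acting as prototypes; returns the update."""
    n = len(D)
    protos = list(protos)
    for _ in range(n_iter):
        # k[i][j]: rank of prototype i for point j (0 = closest).
        k = [[0] * n for _ in protos]
        for j in range(n):
            order = sorted(range(len(protos)), key=lambda i: D[j][protos[i]])
            for r, i in enumerate(order):
                k[i][j] = r
        taken = set()
        for i in range(len(protos)):
            def cost(c):
                # Rank-weighted dissimilarity of candidate c to all points.
                return sum(math.exp(-k[i][j] / sigma) * D[j][c]
                           for j in range(n))
            # Assumed strategy: never reuse a data point as prototype.
            candidates = [c for c in range(n) if c not in taken]
            protos[i] = min(candidates, key=cost)
            taken.add(protos[i])
    return protos

# Toy usage (assumed data): squared differences of six points on a line.
pts = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
D = [[(a - b) ** 2 for b in pts] for a in pts]
protos = median_ng(D, [0, 5], sigma=0.1)
```

Only entries of `D` are accessed, so the same loop applies to any dissimilarity matrix, at the price of an exhaustive search over candidate data points.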
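14Relational clustering
- Relational clustering
- $d(x_i, x_j) = \|f(x_i) - f(x_j)\|^2$ is Euclidean, but the embedding $f$ is unknown
optimum prototypes fulfill $w_i = \sum_l a_{il}\, x_l$ where $\sum_l a_{il} = 1$ (normalized ranks)
⇒ $\|x_j - w_i\|^2 = (D a_i)_j - \tfrac{1}{2}\, a_i^t D a_i$, where $D$ has entries $d(x_l, x_{l'})$
⇒ dual cost function $E_{NG} \propto \sum_i \sum_{l l'} \exp(-\mathrm{rk}(x_l, w_i)/\sigma)\, \exp(-\mathrm{rk}(x_{l'}, w_i)/\sigma)\, d(x_l, x_{l'}) \,/\, \big(4 \sum_l \exp(-\mathrm{rk}(x_l, w_i)/\sigma)\big)$
Relational-NG: repeat
- $\|x_j - w_i\|^2 = (D a_i)_j - \tfrac{1}{2}\, a_i^t D a_i$
- $k_{ij}$ := rank of prototype $w_i$ given $x_j$
- $a_{ij} = \exp(-k_{ij}/\sigma)$, normalize
only implicit prototypes, represented by $a_{ij}$ ⇒ continuous adaptation
converges for every symmetric nonsingular $D$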
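The relational update can be sketched without ever forming explicit prototypes. The code below assumes that `D` holds squared Euclidean distances and that each row of `A` contains the coefficients of one implicit prototype, summing to 1. The function names and the fixed iteration count are hypothetical.

```python
import math

def relational_distances(D, A):
    """dist[i][j] = (D a_i)_j - 0.5 * a_i^T D a_i for every prototype i."""
    n = len(D)
    dist = []
    for a in A:
        Da = [sum(D[j][l] * a[l] for l in range(n)) for j in range(n)]
        aDa = sum(a[j] * Da[j] for j in range(n))
        dist.append([Da[j] - 0.5 * aDa for j in range(n)])
    return dist

def relational_ng(D, A, sigma, n_iter=10):
    """Batch NG on coefficient vectors; returns the updated coefficients."""
    n, m = len(D), len(A)
    A = [list(a) for a in A]
    for _ in range(n_iter):
        dist = relational_distances(D, A)
        H = [[0.0] * n for _ in range(m)]
        for j in range(n):
            # Rank of every implicit prototype for point j (0 = closest).
            order = sorted(range(m), key=lambda i: dist[i][j])
            for rank, i in enumerate(order):
                H[i][j] = math.exp(-rank / sigma)
        # Normalize each row so that the coefficients sum to 1.
        A = [[h / sum(row) for h in row] for row in H]
    return A

# Toy usage (assumed data): squared differences of six points on a line,
# prototypes initialized at the first and the last data point.
pts = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
D = [[(a - b) ** 2 for b in pts] for a in pts]
A = relational_ng(D, [[1.0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1.0]], sigma=0.1)
```

As a sanity check of the distance formula: coefficients uniform on the first three points describe the implicit prototype 0.1, and the computed squared distance to the point 1.0 is 0.81, as expected.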
15Supervision
- integrate additional label information $y_i$ for the data
- enrich prototypes by labels $Y_j$
- substitute the distance: $\|x_i - w_j\|^2 \;\to\; \beta\,\|x_i - w_j\|^2 + (1-\beta)\,\|y_i - Y_j\|^2$
- solve as beforehand
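The substituted distance is a convex combination of a data term and a label term. A minimal sketch follows, assuming labels are encoded as vectors (for example one-hot); the encoding and the function names are assumptions for illustration.

```python
def sq_dist(x, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, w))

def supervised_sq_dist(x, w, y, Y, beta):
    """beta * |x - w|^2 + (1 - beta) * |y - Y|^2."""
    return beta * sq_dist(x, w) + (1.0 - beta) * sq_dist(y, Y)

# Toy usage (assumed values): y is a one-hot label, Y a prototype label.
d = supervised_sq_dist([0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.5, 0.5], beta=0.5)
```

With `beta = 1` the label term vanishes and the unsupervised distance is recovered; smaller values give the labels more influence on ranks and winners.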
16Experiments
17Proteins
- Protein classification
- 226 points, 5 classes (HA, HB, MY, GG/GP, other)
- alignment measures evolutionary distance of globin proteins
- 29 neurons, 150 epochs, repeated cross-validation, mixing parameter 0.5
18Proteins
(Figure panels: NG MDS, HSOM)
19Chromosomes
- Copenhagen Chromosome database
- 4200 points, 22 classes, alignment distance
- difference of thickness profiles of grey images
- 85 neurons, 100 epochs, repeated cross-validation, mixing parameter 0.9
20Macroarray data
- Macroarray data
- gene expressions at 14 time points after flowering
- 4824 selected genes
- 85 epochs, 150 neurons, no supervision
- Pearson correlation