Title: The Development of a Scientific Permanent Scatterer System
1 Incremental Clustering Mariana Ciucu, Mihai
Datcu
ESA-EUSC 2004 Theory and Applications of
Knowledge driven Image Information Mining with
focus on Earth Observation - EUSC, Madrid (Spain)
- March 17-18, 2004
2 Cluster analysis is the art of finding groups in
data
3 EXAMPLE: DATA CLUSTERING
4 EXAMPLE: DATA CLUSTERING
5 EXAMPLE: DATA CLUSTERING
6 EXAMPLE: DATA CLUSTERING
7 APPLICATIONS: IMAGE ANALYSIS
- IMAGE INTERPRETATION
- classification
- sensor/data fusion
- scene understanding
8 APPLICATIONS: DATA MINING
- IMAGE INFORMATION MINING
- find certain images in an archive
- explore the image information content
- discover scene structures or dynamics by
exploring image collections
9 APPLICATIONS: INTERNET SEARCH ENGINES
- SEMANTIC WEB
- find similar documents
- find key-structures in documents
- understand the nature of documents
- document heterogeneous data
10 CHALLENGES
- how many clusters?
- what is the shape of the clusters?
- how to cluster heterogeneous data?
- how to compute it fast?
11 AutoClass
- Bayesian estimation of optimal classes
12 k-means
- fast
- imposed no. of clusters
- unbalanced clusters
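
A minimal k-means sketch may help make the "imposed no. of clusters" point concrete: the caller has to supply `k` up front. The function name `kmeans`, its parameters, and the toy data are assumptions for illustration and do not come from the slides.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means on tuples of floats (illustrative sketch)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k is imposed by the caller

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance.
        return min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
        )

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p)].append(p)
        # Recompute each center as the mean of its group; keep empty ones.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers

    labels = [nearest(p) for p in points]
    return centers, labels


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(data, k=2)
```

Asking for `k=3` on the same data would still return three clusters, whether or not the data supports them.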
13 Clustering and DataBase index systems
[Diagram labels: clustering, data, DBMS]
14 The need for incremental clustering
- DB items might be added and deleted over time.
- These changes should be reflected in the
partition generated, without significantly
affecting the current clusters.
- An incremental clustering algorithm has three
steps:
- 1. Assign the first datum to a cluster.
- 2. Read the next datum and assign it to one
of the existing clusters or generate a new
cluster. This assignment is based on some
similarity measure over the data space.
- 3. Repeat step 2 until the whole data set is
clustered.
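
The three steps can be sketched as follows. This is a minimal illustration, assuming Euclidean distance to a running cluster centroid as the similarity measure and a fixed distance `threshold` for opening a new cluster; both choices, and the name `incremental_cluster`, are assumptions and not taken from the slides.

```python
import math


def incremental_cluster(stream, threshold):
    """Assign each datum to the nearest cluster or open a new one."""
    centroids = []  # running mean of each cluster
    counts = []     # number of items in each cluster
    labels = []     # cluster index assigned to each datum, in order

    for x in stream:
        if not centroids:
            # Step 1: the first datum starts the first cluster.
            centroids.append(list(x))
            counts.append(1)
            labels.append(0)
            continue

        # Step 2: find the most similar (closest) existing cluster.
        dists = [math.dist(x, c) for c in centroids]
        j = min(range(len(dists)), key=dists.__getitem__)

        if dists[j] <= threshold:
            # Close enough: join cluster j and update its running mean.
            counts[j] += 1
            centroids[j] = [
                c + (xi - c) / counts[j] for c, xi in zip(centroids[j], x)
            ]
        else:
            # Too far from every cluster: generate a new one.
            centroids.append(list(x))
            counts.append(1)
            j = len(centroids) - 1
        labels.append(j)
        # Step 3: the loop repeats step 2 until the data are exhausted.

    return labels, centroids


labels, centroids = incremental_cluster(
    [(0, 0), (0.5, 0), (10, 10), (10, 10.5), (0, 0.5)], threshold=2.0
)
print(labels)  # [0, 0, 1, 1, 0]
```

Each datum is read once and existing assignments are never revisited, so the result depends on the order of arrival and on the chosen threshold.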
15 Incremental clustering: State of the art
- Hierarchical methods and density-based methods
- BIRCH: Balanced Iterative Reducing and
Clustering using Hierarchies
- CURE: Clustering Using Representatives
- DBSCAN: Density-Based Spatial Clustering of
Applications with Noise
- OPTICS: Ordering Points To Identify the
Clustering Structure
- DENCLUE: DENsity-based CLUstEring
16 Incremental clustering: State of the art
- Grid-based methods
- STING: STatistical INformation Grid approach
(grid-based and density-based algorithm)
- CLIQUE: Clustering high-dimensional space
(density-based and grid-based clustering)
- WaveCluster: Clustering using wavelet
transformation
17 Incremental clustering: State of the art
- Daniel Barbara, Ping Chen, Using the Fractal
Dimension to Cluster Datasets, Oct. 1999
- grid-based clustering, fractal analysis of the
data
- Fractal Clustering (FC) places points
incrementally in the cluster for which the change
in the fractal dimension after adding the point
is the least.
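
The assignment rule can be sketched as below. This is only an illustration under stated assumptions: the fractal dimension is estimated by box counting over a few fixed cell sizes (the `sizes` tuple is arbitrary), and the helper names `box_counting_dimension` and `assign_by_fractal_change` are hypothetical, not taken from the cited work.

```python
import math


def box_counting_dimension(points, sizes=(1.0, 0.5, 0.25, 0.125)):
    """Estimate the box-counting dimension as the least-squares slope
    of log N(r) against log(1/r), where N(r) is the number of grid
    cells of side r that contain at least one point."""
    xs, ys = [], []
    for r in sizes:
        occupied = {tuple(math.floor(c / r) for c in p) for p in points}
        xs.append(math.log(1.0 / r))
        ys.append(math.log(len(occupied)))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def assign_by_fractal_change(point, clusters):
    """Add `point` to the cluster whose estimated dimension changes least."""
    changes = []
    for members in clusters:
        before = box_counting_dimension(members)
        after = box_counting_dimension(members + [point])
        changes.append(abs(after - before))
    j = min(range(len(clusters)), key=changes.__getitem__)
    clusters[j].append(point)
    return j


# Two line-shaped clusters; the new point lies on the first line.
line_a = [(i * 0.1, 0.0) for i in range(20)]
line_b = [(i * 0.1, 10.0) for i in range(20)]
chosen = assign_by_fractal_change((0.95, 0.0), [line_a, line_b])
```

Here the new point falls into grid cells that the first cluster already occupies at every scale, so that cluster's estimate does not move, while adding it to the second cluster would add a new occupied cell at every scale.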
18 Incremental clustering: State of the art
- Chia H. Yeh, Chung J. Kuo, Iteration-free
clustering algorithm for nonstationary image
database, IEEE Transactions on Multimedia, vol.
5, no. 2, pp. 223-236, Jun. 2003.
- The iteration-free clustering algorithm (IFC) is
based on a Lagrangian function for adapting the
existing indexing structure without reapplying
the clustering algorithm to the database.
19 Grid based clustering
20 Adaptive grid clustering
- Similar to vector quantization but with dynamic
adaptation to the data model
- Merging / Splitting strategies
- 1. density and neighbors
- 2. density and ? densities (neighborhood)
[Figure panels: before clustering; after
clustering, if we want 4 clusters]
Neighbor searching and merging to existing cluster
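
The idea of searching neighbors and merging into an existing cluster can be sketched as follows. This is a simplified illustration on a fixed grid, not the adaptive grid of the slides: the class name `GridClusterer`, the `cell_size` and `min_density` parameters, and the rule "adjacent dense cells belong to the same cluster" are all assumptions made for the example.

```python
import math
from collections import defaultdict, deque
from itertools import product


class GridClusterer:
    """Incremental grid clustering: count points per cell, then treat
    connected groups of dense neighboring cells as clusters."""

    def __init__(self, cell_size, min_density=1):
        self.cell_size = cell_size
        self.min_density = min_density
        self.counts = defaultdict(int)  # cell index -> number of points

    def insert(self, point):
        # Map the point to its grid cell and update that cell's density.
        cell = tuple(math.floor(c / self.cell_size) for c in point)
        self.counts[cell] += 1
        return cell

    def clusters(self):
        dense = {c for c, n in self.counts.items() if n >= self.min_density}
        seen, result = set(), []
        for start in sorted(dense):
            if start in seen:
                continue
            seen.add(start)
            queue, component = deque([start]), []
            while queue:
                cell = queue.popleft()
                component.append(cell)
                # Neighbor search: every cell offset by -1, 0 or +1 per axis.
                for offset in product((-1, 0, 1), repeat=len(cell)):
                    nb = tuple(a + b for a, b in zip(cell, offset))
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            result.append(sorted(component))
        return result


grid = GridClusterer(cell_size=1.0)
for p in [(0.2, 0.3), (1.1, 0.4), (5.5, 5.5), (0.9, 0.9)]:
    grid.insert(p)
print(grid.clusters())  # [[(0, 0), (1, 0)], [(5, 5)]]
```

Inserting a further point such as `(2.5, 0.5)` lands in a cell adjacent to the first cluster, so it is merged into that cluster without touching the second one.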
21 Evaluation
Classification
22 Evaluation
Classification
23 Probabilistic search
24 SUMMARY
- Incremental clustering is a key issue for IIM
- Adaptive grid is presently a working solution
- Further work is in progress on:
- Evaluation on growing massive data sets
- Understanding data models
- Adaptation methods