Title: Primitive Feature Extraction via a Combined ICA-Wavelet Method
1Primitive Feature Extraction via a Combined
ICA-Wavelet Method
- Vijay Shah, Nick Younan, Surya Durbha, and Roger King
- Department of Electrical and Computer Engineering
- GeoResource Institute
- Mississippi State University,
- Mississippi State, MS 39762, USA
- IIM 2008, Frascati, Italy
- March 4-6, 2008
2Overview
- Background
- Data Transformation
- ICA-Wavelet Feature Extraction for Image Information Mining
- Methodology
- Coarse Image Segmentation
- Region Identification
- Experimental Results
- Summary
3Background
- Feature extraction and data transformation are important in any image information system.
- Primitive features are extracted based on color, texture, and shape within a region of interest.
- Wavelet transformation has been effectively used for feature extraction in many applications.
- It reduces the complexity and dimensionality of the extracted features to expedite image retrieval in EO data archives.
4Data Transformation
- Data transformation is used to improve the quality of knowledge discovery in an image.
- Spectral transformation can take place in many different forms
- Applying arithmetic operations (for example, subtraction) on a feature set
- Combining features that are correlated
- Transformation along the spectral axis to obtain a new set of features
- Linear transformation
- Nonlinear transformation
5Common Methods
- HSV
- RGB space is nonlinearly converted to the HSV space (Hue: color type, Saturation: color purity, Value: color intensity) to make color components perceptually independent and uniform.
- Fails to capture the complete spectral pattern, an important characteristic available in remote sensing imagery.
- HSV components are not statistically independent or uncorrelated.
- PCA
- Linear transformation of the RGB space.
- Not optimized for class separation.
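The PCA bullet above can be illustrated with a minimal sketch, assuming NumPy and a synthetic three-band image (the function name `pca_transform` and the test data are hypothetical, not taken from the presentation). It rotates the spectral axis onto the eigenvectors of the band covariance matrix, which decorrelates the bands but makes no use of class labels.

```python
import numpy as np

def pca_transform(image):
    """Rotate the spectral axis of an (H, W, B) image onto its principal components."""
    h, w, b = image.shape
    pixels = image.reshape(-1, b).astype(float)
    centered = pixels - pixels.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order; reorder to descending variance.
    eigvals, eigvecs = np.linalg.eigh(covariance)
    order = np.argsort(eigvals)[::-1]
    transformed = centered @ eigvecs[:, order]
    return transformed.reshape(h, w, b), eigvals[order]

# Synthetic "RGB" image whose first two bands are strongly correlated (illustrative only).
rng = np.random.default_rng(0)
red = rng.random((32, 32))
green = 0.8 * red + 0.05 * rng.random((32, 32))
blue = rng.random((32, 32))
rgb = np.stack([red, green, blue], axis=2)

pcs, variances = pca_transform(rgb)
```

The transformed bands are uncorrelated and ordered by variance; nothing in the procedure looks at class membership, which is the limitation the slide points out.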
6Current I3KR System (Intelligent Interactive Image Knowledge Retrieval)
- Color: LAB space
- Texture: features from the L-component
- Co-occurrence matrix: uniformity, entropy, first-order element, maximum probability, first-order inverse element
- Primitive length matrix: gray-level uniformity, long primitive emphasis, short primitive emphasis, uniformity, and primitive percentage
7Methodology
[Flowchart blocks: Raw Imagery; Data Transformation for Feature Extraction, consisting of Spectral Transformation (ICA) and Spatial Transformation (Wavelet Transform); Image Segmentation (Clustering); Region Feature Extraction]
8Feature Extraction
- Component energies and cross-correlation energies of the coarse-scale wavelet coefficients are considered robust for capturing color and texture information.
- For the l-level DWT, an n-D vector is obtained for the jth wavelet coefficient of sub-band B, where n is the number of independent components.
- An appropriate wavelet decomposition level is selected based on the frequency content of the image.
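The slide does not give the energy formulas, so the following is only a sketch under stated assumptions: the component energy is taken as the mean squared coarse-scale coefficient of one component, and the cross-correlation energy as the mean product of the coefficients of two components. A hand-written orthonormal Haar approximation stands in for the DWT; the names `haar_approximation` and `energy_features` are hypothetical.

```python
import numpy as np

def haar_approximation(band, levels):
    """Coarse-scale (LL) coefficients of an orthonormal 2-D Haar DWT."""
    a = np.asarray(band, dtype=float)
    for _ in range(levels):
        # Crop to even size, then combine each 2x2 block with the Haar low-pass filter.
        a = a[: a.shape[0] // 2 * 2, : a.shape[1] // 2 * 2]
        a = (a[0::2, 0::2] + a[0::2, 1::2] + a[1::2, 0::2] + a[1::2, 1::2]) / 2.0
    return a

def energy_features(components, levels=2):
    """Energies and cross-correlation energies for an (H, W, n) stack of components.

    Assumed definitions: E_i = mean(c_i ** 2) and E_ik = mean(c_i * c_k), i < k.
    """
    n = components.shape[2]
    coeffs = np.stack(
        [haar_approximation(components[:, :, i], levels).ravel() for i in range(n)]
    )
    gram = coeffs @ coeffs.T / coeffs.shape[1]
    energies = np.diag(gram)
    cross = gram[np.triu_indices(n, k=1)]
    return np.concatenate([energies, cross])

# Three synthetic "independent components" of a 16 x 16 region (illustrative data).
rng = np.random.default_rng(0)
components = rng.standard_normal((16, 16, 3))
features = energy_features(components, levels=2)
```

For n = 3 components this gives 3 energies plus 3 cross-correlation energies, a 6-element region descriptor.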
9Decomposition Level Selection
- Steps
- Calculate the Fourier transform of the image.
- Retain the frequencies whose energy is greater than 1% of the main peak as the frequency components of the image.
- Calculate the total energy E of the spectrum.
- Calculate the total energy Ei in sub-band i, over the frequency region of that sub-band, with i = 1, 2, ..., N.
- If Ei/E < threshold, increase i and repeat the previous step; otherwise return (i-1) as the appropriate level of decomposition for the image.
The threshold value is typically chosen to be close to 0 to guarantee that there is no loss in the frequency components of an image. Higher threshold values cause low-magnitude frequency components to be treated as insignificant to the image content.
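The steps above can be sketched as follows, assuming NumPy. The frequency region of each sub-band did not survive in the slide text, so this sketch assumes the usual dyadic band for level i, with the larger of the two normalized frequencies lying in (0.5/2^i, 0.5/2^(i-1)] cycles per sample. Removing the image mean before the transform is a further assumption, made so that the DC term is not taken as the main peak. The function name and default values are hypothetical.

```python
import numpy as np

def select_decomposition_level(image, threshold=0.01, peak_fraction=0.01):
    """Return the decomposition level suggested by the spectral energy distribution."""
    image = np.asarray(image, dtype=float)
    # Step 1: Fourier transform (mean removed, an assumption of this sketch).
    power = np.abs(np.fft.fft2(image - image.mean())) ** 2
    # Step 2: keep only frequencies above a fraction of the main peak.
    power[power <= peak_fraction * power.max()] = 0.0
    # Step 3: total energy of the retained spectrum.
    total = power.sum()
    max_level = int(np.floor(np.log2(min(image.shape))))
    if total == 0.0:
        return max_level  # flat image: no frequency content to lose
    fy = np.abs(np.fft.fftfreq(image.shape[0]))[:, None]
    fx = np.abs(np.fft.fftfreq(image.shape[1]))[None, :]
    radius = np.maximum(fy, fx)
    # Steps 4 and 5: walk from the finest sub-band toward coarser ones.
    for i in range(1, max_level + 1):
        lo, hi = 0.5 / 2 ** i, 0.5 / 2 ** (i - 1)
        band_energy = power[(radius > lo) & (radius <= hi)].sum()
        if band_energy / total >= threshold:
            return i - 1
    return max_level

# Two synthetic 64 x 64 test patterns: a slowly varying and a rapidly varying cosine.
x = np.arange(64)
low = np.tile(np.cos(2 * np.pi * 3 * x / 64), (64, 1))
high = np.tile(np.cos(2 * np.pi * 20 * x / 64), (64, 1))
level_low = select_decomposition_level(low)
level_high = select_decomposition_level(high)
```

A smooth image permits several levels of decomposition before any significant sub-band is reached, while an image dominated by fine detail permits none.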
10Independent Component Analysis
- Variant of Principal Component Analysis (PCA)
- Seeks those directions in the feature space that are statistically independent and uncorrelated
- Uses higher-order statistics to determine the mixing matrix
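The presentation does not say which ICA algorithm was used. As one possible illustration, the sketch below implements a small symmetric FastICA with a tanh nonlinearity in NumPy; the function names, the synthetic sources, and the mixing matrix are all assumptions made for the example.

```python
import numpy as np

def sym_decorrelate(w):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(w @ w.T)
    return (u * (1.0 / np.sqrt(s))) @ u.T @ w

def fast_ica(x, n_iter=200, tol=1e-6, seed=0):
    """Estimate independent components of x with shape (n_features, n_samples)."""
    rng = np.random.default_rng(seed)
    x = x - x.mean(axis=1, keepdims=True)
    # Whitening uses second-order statistics only (this is the PCA-like step).
    d, e = np.linalg.eigh(np.cov(x))
    whitening = (e * (1.0 / np.sqrt(d))) @ e.T
    z = whitening @ x
    n, m = z.shape
    w = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        # Fixed-point update; tanh brings in higher-order statistics.
        g = np.tanh(w @ z)
        g_prime = 1.0 - g ** 2
        w_new = sym_decorrelate((g @ z.T) / m - np.diag(g_prime.mean(axis=1)) @ w)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ z, w @ whitening

# Three synthetic non-Gaussian sources mixed linearly (illustrative data only).
rng = np.random.default_rng(1)
n_samples = 5000
sources = np.vstack([
    rng.uniform(-1, 1, n_samples),
    rng.laplace(size=n_samples),
    rng.choice([-1.0, 1.0], size=n_samples),
])
mixing = np.array([[1.0, 0.5, 0.3], [0.4, 1.0, 0.2], [0.3, 0.6, 1.0]])
observed = mixing @ sources
recovered, unmixing = fast_ica(observed)
```

The recovered components match the sources only up to order, sign, and scale, which is the usual ambiguity of ICA. For image data, each pixel's spectral vector would play the role of one column of `observed`.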
11Clustering and Object Identification
- Clustering: K-means and Kernel K-means
- Provide relative scalability and very efficient processing for very large datasets.
- Kernel K-means eliminates the use of cross-correlation energies as features, i.e., feature reduction during the pre-processing stage of ICA.
- Object Identification: SVM
- Proven to be successful for many applications of nonlinear classification and function estimation
- Trained to find the global optimum
- Finds the hyperplane that separates the two classes with maximum margin in the linear case
- For nonlinear cases, the feature space is mapped to a higher dimension to separate the two classes
12Segmentation Evaluation
- Index I
- Maximized when the correct number of clusters is found [equation]
- where U(X) = [u_kj] of size K x n is a partition matrix of the data
- Silhouette Coefficient
- Average SC provides information on cluster separation; for each point i it is given by [equation]
- where the terms measure the dissimilarity of point i to the other points in the clusters
- J-value
- Minimized for a good segmentation algorithm, given by [equation]
- where z = (x, y) represents the 2-D pixel position in the classified image, z ∈ Z
- Z is divided into C classes
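The equations on this slide were lost in extraction. As a hedged illustration, the sketch below uses the definitions commonly found in the literature, which are assumed (not confirmed) to be the ones intended here: the silhouette s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from point i to the other points of its own cluster and b(i) is the smallest mean distance to any other cluster, and the JSEG-style J = (S_T - S_W) / S_W computed on pixel positions, where S_T is the total scatter and S_W the within-class scatter. Function names and data are hypothetical.

```python
import numpy as np

def silhouette_coefficients(points, labels):
    """Per-point silhouette values; needs at least two clusters."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = np.unique(labels)
    s = np.zeros(len(points))
    for i in range(len(points)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton cluster: silhouette left at 0
        a = dist[i, own].sum() / (own.sum() - 1)
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

def j_value(class_map):
    """J = (S_T - S_W) / S_W over the pixel positions of a 2-D class map."""
    class_map = np.asarray(class_map)
    ys, xs = np.indices(class_map.shape)
    z = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    labels = class_map.ravel()
    s_total = ((z - z.mean(axis=0)) ** 2).sum()
    s_within = sum(
        ((z[labels == c] - z[labels == c].mean(axis=0)) ** 2).sum()
        for c in np.unique(labels)
    )
    return (s_total - s_within) / s_within

# Two tight, well-separated clusters (illustrative data).
pts = [[0, 0], [0, 1], [10, 0], [10, 1]]
sil = silhouette_coefficients(pts, [0, 0, 1, 1])

# Class maps: left/right halves versus a checkerboard.
halves = np.repeat([[0, 0, 1, 1]], 4, axis=0)
checker = np.indices((4, 4)).sum(axis=0) % 2
j_halves, j_checker = j_value(halves), j_value(checker)
```

In JSEG the quantity that is minimized is the size-weighted average of J over the segmented regions, so a function like `j_value` would be applied to each region and the results averaged.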
13Experimental Results
- Data Sets
- False-color Landsat 7 ETM+ scenes
- 512 x 512 pixels each
- 30 m x 30 m spatial resolution
- 4 classes: water bodies are generally dark-colored objects with smooth texture, agricultural land with healthy vegetation is dark pinkish-red with smoother texture, forest is dark red with coarse texture, and fallow land is yellowish-gray in color.
[Figure: Image 1, Image 2, Image 3]
14Experiment 1 - Comparison of Different Spectral
Transformation Methods
15Experiment 2 - Spectral and Spatial Transformation
Order
- The spatial-spectral transformation order does not matter for estimating the number of clusters when using the ICA spectral transformation.
- The J-value is reduced if the spectral transformation is performed after taking the 2D-DWT.
16Segmentation - Visual Comparison
Image 1
(a) ICA spectral transformation and 2D-DWT
(b) 2D-DWT and ICA spectral transformation
(c) JSEG
(d) HSV spectral transformation
(e) PCA spectral transformation
(f) No spectral transformation
17Experiment 3 - Clustering Approaches Comparison
- Experiment conducted on pixels from four classes
- Water, agricultural land, forest, and fallow land
- 100 samples from each class
- The total number of iterations is set to 100 and the number of replications is set to 10
- The number of clusters for both algorithms is varied from 2 to 5
- Gaussian RBF kernel with σ = 1 for the kernel k-means algorithm
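A minimal kernel k-means sketch with the settings listed above (100 iterations, 10 replications, Gaussian RBF kernel with σ = 1) might look as follows. The kernel form exp(-||x - y||^2 / (2σ^2)), the random-label initialization, and the function names are assumptions; the presentation does not specify them, and the two-blob data set is synthetic.

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    """Gaussian RBF kernel matrix, assumed form exp(-||x - y||^2 / (2 sigma^2))."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kernel_kmeans(kernel, k, n_iter=100, n_init=10, seed=0):
    """Cluster using only the kernel matrix; keep the best of n_init replications."""
    rng = np.random.default_rng(seed)
    n = kernel.shape[0]
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        labels = rng.integers(k, size=n)
        for _ in range(n_iter):
            # Squared feature-space distance from every sample to every cluster mean.
            dist = np.full((n, k), np.inf)
            for c in range(k):
                members = labels == c
                m = members.sum()
                if m == 0:
                    continue  # empty cluster stays unreachable
                dist[:, c] = (
                    np.diag(kernel)
                    - 2.0 * kernel[:, members].sum(axis=1) / m
                    + kernel[np.ix_(members, members)].sum() / m ** 2
                )
            new_labels = dist.argmin(axis=1)
            if np.array_equal(new_labels, labels):
                break
            labels = new_labels
        cost = dist[np.arange(n), labels].sum()
        if cost < best_cost:
            best_labels, best_cost = labels.copy(), cost
    return best_labels

# Two synthetic, well-separated groups of 20 samples each (illustrative data).
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.3, size=(20, 2))
blob_b = rng.normal(5.0, 0.3, size=(20, 2))
samples = np.vstack([blob_a, blob_b])
labels = kernel_kmeans(rbf_kernel(samples, sigma=1.0), k=2)
```

Because the algorithm only touches the kernel matrix, replacing `rbf_kernel` with the plain inner product `x @ x.T` gives ordinary k-means assignments, which makes side-by-side comparison of the two approaches straightforward.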
18Error Evaluation
- Error is calculated based on the improper cluster assignment of the sample.
- The omission error is defined as excluding a sample that should have been included in the cluster.
- The commission error is defined as including a sample in a cluster when it should have been excluded.
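The two error definitions can be made concrete with a small sketch. It assumes that each cluster has already been matched to a class name, and it normalizes the omission error by the class size and the commission error by the cluster size; both normalizations are assumptions, since the slide gives only the verbal definitions. The function name and labels are hypothetical.

```python
def cluster_errors(true_labels, assigned_labels, target):
    """Omission and commission error rates for one class/cluster pair."""
    pairs = list(zip(true_labels, assigned_labels))
    in_class = sum(1 for t, _ in pairs if t == target)
    in_cluster = sum(1 for _, a in pairs if a == target)
    # Omission: sample belongs to the class but was left out of its cluster.
    omitted = sum(1 for t, a in pairs if t == target and a != target)
    # Commission: sample was placed in the cluster but belongs elsewhere.
    committed = sum(1 for t, a in pairs if t != target and a == target)
    omission = omitted / in_class if in_class else 0.0
    commission = committed / in_cluster if in_cluster else 0.0
    return omission, commission

# Toy assignment with one water sample omitted and one forest sample committed.
truth = ["water", "water", "water", "forest", "forest"]
assigned = ["water", "water", "forest", "forest", "water"]
water_errors = cluster_errors(truth, assigned, "water")
forest_errors = cluster_errors(truth, assigned, "forest")
```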
19Results - Experiment 3
k-means
Kernel k-means
20Experiment 4 - Region Feature Extraction
Comparison
- Data
- 150 region samples of each class from the Landsat 7 ETM+ image archive
- Total of four classes: Water, Fallow Land, Agricultural Land, and Forest Area
- Used the leave-one-out method for training and testing
21Results - Experiment 4
Using the Haar Mother Wavelet
Using the rbio3.1 Mother Wavelet
22Data - Overall System Performance
- Image Archive
- 400 false-color Landsat 7 ETM+ scenes
- Images of size 128 x 128 pixels, 256 x 256 pixels, and 512 x 512 pixels
- Image resolution: 28.5 m x 28.5 m for the MS image and 14.25 m x 14.25 m for the Pan image
- Number of bands used: 3
- Four major regions: water bodies (228 images), agricultural land (227 images), fallow land (261 images), and forest (224 images)
23Results - Overall System Performance
24Summary
- Features obtained by ICA transformation provide more reliable segmentation than the other transformation approaches.
- The choice of order between the spectral and spatial transforms can quantitatively affect the image segmentation results.
- For the ICA spectral transformation, the estimated number of clusters in an image mostly remains the same regardless of the transformation order.
25Future Work
- Better Multiresolution Approaches
- Beyond wavelets
- Adaptive Filter Design for Multiresolution Approaches
- Improvement to Image Segmentation
- Redundant Approach
- Shape Feature Extraction
26Questions?