1
Independent Component Analysis: The Fast ICA
algorithm
  • Jonathan Kam
  • EE 645

2
Overview
  • The Problem
  • Definition of ICA
  • Restrictions
  • Ways to solve ICA
  • NonGaussianity
  • Mutual Information
  • Maximum Likelihood
  • Fast ICA algorithm
  • Simulations
  • Conclusion

3
The Problem
  • Cocktail Problem
  • Several Sources
  • Several Sensors
  • Ex: Humans hear a mixed signal but are able to
    unmix the signals and concentrate on a sole source
  • Recover source signals given only mixtures
  • No prior knowledge of sources or mixing matrix
  • aka Blind Source Separation (BSS)

4
Assumptions
  • Source signals are statistically independent
  • Knowing the value of one of the components does
    not give any information about the others
  • ICs have nongaussian distributions
  • Initial distributions unknown
  • At most one Gaussian source
  • Recovered sources can be permuted and scaled

5
Definition of ICA
  • Observe n linear mixtures x1, ..., xn of n
    independent components
  • xj = aj1 s1 + aj2 s2 + ... + ajn sn, for all j
  • ak is the k-th column of the mixing matrix A
  • Assume each mixture xj and each IC sk is a random
    variable
  • Time delays between mixtures are dropped
    (instantaneous mixing)
  • Independent components are latent variables
  • Cannot be directly observed

6
Definition of ICA
  • ICA mixture model: x = As
  • A is the mixing matrix; s is the matrix of source signals
  • Goal
  • Find some matrix W so that
  • s = Wx
  • W is the inverse of A
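
As an illustration of the mixture model, here is a minimal numpy sketch; the two sources, the 2x2 matrix A, and all variable names are made up for the example, not taken from the slides.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two illustrative nongaussian sources (uniform and Laplacian), 10,000 samples each
    s = np.vstack([rng.uniform(-1, 1, 10000),
                   rng.laplace(0, 1, 10000)])

    A = np.array([[1.0, 0.5],          # example mixing matrix (unknown in practice)
                  [0.3, 1.0]])
    x = A @ s                          # observed mixtures: x = As

    # If A were known, W = A^-1 would recover the sources exactly;
    # ICA estimates such a W from x alone.
    W = np.linalg.inv(A)
    s_hat = W @ x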

7
Definition: Independence
  • Two random variables y1 and y2 are independent if,
    for any functions h1 and h2,
  • E{h1(y1) h2(y2)} = E{h1(y1)} E{h2(y2)}
  • If variables are independent, they are
    uncorrelated
  • Uncorrelated variables
  • Defined by E{y1 y2} - E{y1} E{y2} = 0
  • Uncorrelatedness does not imply independence
  • Ex: (y1, y2) equal to (0,1), (0,-1), (1,0), (-1,0),
    each with probability 1/4
  • E{y1^2 y2^2} = 0 ≠ 1/4 = E{y1^2} E{y2^2}
  • So ICA needs independence, not just uncorrelatedness
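
A quick numeric check of the four-point example above (a sketch; the arrays simply enumerate the equiprobable points):

    import numpy as np

    # The four equiprobable points (y1, y2) from the example
    pts = np.array([(0, 1), (0, -1), (1, 0), (-1, 0)], dtype=float)
    y1, y2 = pts[:, 0], pts[:, 1]

    print(np.mean(y1 * y2))                  # 0.0 -> uncorrelated
    print(np.mean(y1**2 * y2**2))            # 0.0
    print(np.mean(y1**2) * np.mean(y2**2))   # 0.25 -> so not independent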

8
ICA restrictions
  • Cannot determine variances
  • s and A are unknown
  • Scalar multipliers on s could be canceled out by
    a divisor on A
  • Multiplier could even be -1
  • Cannot determine order
  • Order of terms can be changed

9
ICA restrictions
  • At most 1 Gaussian source
  • If x1 and x2 are Gaussian, uncorrelated, and of unit
    variance
  • their joint density is completely symmetric
  • It does not contain info on the direction of the
    columns of the mixing matrix A

10
ICA estimation
  • Nongaussianity is used to estimate the independent
    components
  • Estimate y = wT x
  • Let z = AT w, so y = wT x = wT As = zT s
  • y is a linear combination of the si; by the central
    limit theorem, zT s is more gaussian than any single si
  • zT s becomes least gaussian when it is equal to
    one of the si
  • Then wT x = zT s equals an independent component
  • Maximizing the nongaussianity of wT x gives us one of
    the independent components
  • Nongaussianity can be maximized by
  • Measuring nongaussianity directly (kurtosis, negentropy)
  • Minimizing mutual information
  • Maximum likelihood

11
Measuring nongaussianity
  • Kurtosis
  • Fourth-order cumulant
  • Classical measure of nongaussianity
  • kurt(y) = E{y^4} - 3 (E{y^2})^2
  • For gaussian y, the fourth moment is 3 (E{y^2})^2,
    so kurtosis is 0 for gaussian random variables
  • Con: not a robust measure of nongaussianity
  • Sensitive to outliers
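
A minimal sketch of estimating kurtosis from samples; for a finite Gaussian sample the value is only approximately 0.

    import numpy as np

    def kurt(y):
        # kurt(y) = E{y^4} - 3 (E{y^2})^2, for zero-mean y
        y = y - y.mean()
        return np.mean(y**4) - 3 * np.mean(y**2)**2

    rng = np.random.default_rng(0)
    print(kurt(rng.normal(size=100_000)))      # ~0 (gaussian)
    print(kurt(rng.uniform(-1, 1, 100_000)))   # < 0 (subgaussian)
    print(kurt(rng.laplace(size=100_000)))     # > 0 (supergaussian)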

12
Measuring nongaussianity
  • Entropy (H): the degree of information that an
    observation gives
  • A gaussian variable has the largest entropy among
    all random variables of equal variance
  • Negentropy J(y) = H(y_gauss) - H(y), where y_gauss is
    a gaussian variable with the same covariance as y
  • Based on the information-theoretic quantity of
    differential entropy
  • Computationally difficult to evaluate directly

13
Negentropy approximations
  • Classical method using higher-order moments:
    J(y) ≈ (1/12) E{y^3}^2 + (1/48) kurt(y)^2
  • Validity is limited by the nonrobustness of kurtosis

14
Negentropy approximations
  • Hyvärinen (1998b): maximum-entropy principle
  • J(y) ∝ [E{G(y)} - E{G(v)}]^2
  • G is some contrast function
  • v is a gaussian variable of zero mean and unit
    variance
  • Taking G(y) = y^4 makes the equation the kurtosis-based
    approximation

15
Negentropy approximations
  • Instead of the kurtosis function, choose a contrast
    function G that doesn't grow too fast, for example
  • G1(u) = (1/a1) log cosh(a1 u),  G2(u) = -exp(-u^2/2)
  • where 1 ≤ a1 ≤ 2
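
A sketch of the approximation J(y) ∝ [E{G(y)} - E{G(v)}]^2 using the log cosh contrast G1; the Monte Carlo estimate of E{G(v)} and all constants are illustrative choices.

    import numpy as np

    def negentropy_approx(y, a1=1.0, n_mc=1_000_000, seed=0):
        # J(y) ~ [E{G(y)} - E{G(v)}]^2 with G(u) = (1/a1) log cosh(a1 u);
        # y is assumed standardized (zero mean, unit variance), v is standard gaussian
        G = lambda u: np.log(np.cosh(a1 * u)) / a1
        v = np.random.default_rng(seed).normal(size=n_mc)
        return (np.mean(G(y)) - np.mean(G(v))) ** 2

    rng = np.random.default_rng(1)
    print(negentropy_approx(rng.normal(size=100_000)))                       # ~0 for gaussian
    print(negentropy_approx(rng.uniform(-np.sqrt(3), np.sqrt(3), 100_000)))  # larger for uniform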

16
Minimizing mutual information
  • Mutual information I is defined as
  • I(y1, ..., yn) = Σi H(yi) - H(y)
  • Measure of the dependence between random
    variables
  • I = 0 if the variables are statistically independent
  • Minimizing I is equivalent to maximizing negentropy

17
Maximum Likelihood Estimation
  • Closely related to the infomax principle
  • Infomax (Bell and Sejnowski, 1995): maximizing the
    output entropy of a neural network with nonlinear
    outputs
  • Densities of the ICs must be estimated properly
  • If the estimate is wrong, ML will give wrong results

18
Fast ICA
  • Preprocessing
  • Fast ICA algorithm
  • Maximize nongaussianity
  • Unmixing signals

19
Fast ICA Preprocessing
  • Centering
  • Subtract the mean vector m = E{x} to make x a
    zero-mean variable
  • The ICA algorithm does not need to estimate the mean
  • The mean vector of s can be recovered afterwards as
    A^-1 m, where m is the subtracted mean
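
A two-line sketch of the centering step (the toy x is made up so the snippet runs on its own):

    import numpy as np

    x = np.random.default_rng(0).normal(loc=2.0, size=(2, 1000))  # toy mixtures with nonzero mean
    m = x.mean(axis=1, keepdims=True)   # sample mean vector of the mixtures
    x_centered = x - m                  # zero-mean data passed to the rest of FastICA
    # After estimating A and s, the mean of s can be restored as inv(A) @ m.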

20
Fast ICA Preprocessing
  • Whitening
  • Transform x so that its components are
    uncorrelated and their variances equal unity
  • Use the eigenvalue decomposition (EVD) of the
    covariance matrix E{x xT} = E D ET
  • Whitened data: x~ = E D^(-1/2) ET x
  • D is the diagonal matrix of its eigenvalues
  • E is the orthogonal matrix of eigenvectors
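
A sketch of EVD-based whitening as described above (the function name and array layout are my own choices):

    import numpy as np

    def whiten(x):
        # x: (n_mixtures, n_samples). Returns whitened data z with cov(z) = I,
        # plus the whitening matrix V = E D^(-1/2) E^T.
        x = x - x.mean(axis=1, keepdims=True)   # centering (previous slide)
        d, E = np.linalg.eigh(np.cov(x))        # D = diag(d), E = eigenvector matrix
        V = E @ np.diag(d ** -0.5) @ E.T
        return V @ x, V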

21
Fast ICA Preprocessing
  • Whitening
  • Transforms the mixing matrix into a new matrix Ã
  • Makes à orthogonal
  • Lessens the number of parameters that have to be
    estimated from n^2 to n(n-1)/2
  • In large dimensions an orthogonal matrix contains
    approximately half the number of parameters

22
Fast ICA Algorithm
  • One-unit (one component) version
  • 1. Choose an initial weight vector w
  • 2. Let w+ = E{x g(wT x)} - E{g'(wT x)} w
  • g is the derivative of the contrast function G, e.g.
  • g1(u) = tanh(a1 u)
  • g2(u) = u exp(-u^2/2)
  • 3. w = w+ / ||w+||  (normalization step)
  • 4. If not converged, go back to step 2
  • Converged if ||w_new - w_old|| < ε or
    ||w_new + w_old|| < ε (w and -w define the same component)
  • ε typically around 0.0001
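
A minimal sketch of the one-unit iteration on whitened data z, using g(u) = tanh(u); the function and variable names are illustrative, not from the slides.

    import numpy as np

    def fastica_one_unit(z, tol=1e-4, max_iter=200, seed=0):
        # z: whitened data of shape (n, n_samples)
        rng = np.random.default_rng(seed)
        w = rng.normal(size=z.shape[0])
        w /= np.linalg.norm(w)                      # step 1: random initial weight vector
        for _ in range(max_iter):
            wz = w @ z
            g, g_prime = np.tanh(wz), 1.0 - np.tanh(wz) ** 2
            w_new = (z * g).mean(axis=1) - g_prime.mean() * w   # step 2
            w_new /= np.linalg.norm(w_new)          # step 3: normalize
            # step 4: w and -w define the same component, so test both signs
            if min(np.linalg.norm(w_new - w), np.linalg.norm(w_new + w)) < tol:
                return w_new
            w = w_new
        return w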

23
Fast ICA Algorithm
  • Several-unit algorithm
  • Define B as the unmixing matrix being estimated and
    B' as the matrix whose columns are the previously
    found columns of B
  • Add a projection step before step 3
  • Step 3 becomes
  • 3. Let w(k) = w(k) - B' B'T w(k), then w = w / ||w||
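
A sketch of that projection step, assuming B holds the previously found weight vectors as its columns:

    import numpy as np

    def deflate(w, B):
        # w <- w - B B^T w, then renormalize, so w stays orthogonal to
        # the components already found (the columns of B)
        w = w - B @ (B.T @ w)
        return w / np.linalg.norm(w)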

24
Simple Simulation
  • Separation of 2 components
  • Figure 1: Two independent nongaussian WAV samples

25
Simple Simulation
  • Figure 2 Mixed signals

26
Simple Simulation
  • Recovered signals vs original signals

Figure 3 Recovered signals
Figure 4 Original signals
27
Simulation Results
  • IC 1 recovered in 6 steps and IC 2 recovered in 2
    steps
  • Retested with 20000 samples
  • Requires approximately the same number of steps

28
Gaussian Simulation
Figure 5 2 wav samples and noise signal
29
Gaussian Simulation
Figure 6 3 mixed signals
30
Gaussian Simulation
  • Comparison of recovered signals vs original
    signals

Figure 7 Recovered signals
Figure 8 Original signal
31
Gaussian Simulation 2
  • Tried with 2 gaussian components
  • Components were not estimated properly because more
    than one component was Gaussian

Figure 10 Original signals
Figure 11 Recovered signals
32
Conclusion
  • Fast ICA properties
  • No step size, unlike gradient-based ICA
    algorithms
  • Finds components of any non-Gaussian
    distribution using any nonlinear contrast function g
  • Components can be estimated one by one
  • Other applications
  • Separation of artifacts in image data
  • Finding hidden factors in financial data
  • Reducing noise in natural images
  • Medical signal processing: fMRI, ECG, EEG (Makeig)

33
References
  • [1] Aapo Hyvärinen and Erkki Oja, "Independent
    Component Analysis: Algorithms and Applications."
    Neural Networks, 13(4-5): 411-430, 2000. Neural
    Networks Research Centre, Helsinki University of
    Technology.
  • [2] Aapo Hyvärinen and Erkki Oja, "A Fast
    Fixed-Point Algorithm for Independent Component
    Analysis." Neural Computation, 9: 1483-1492, 1997.
    Helsinki University of Technology, Laboratory of
    Computer and Information Science.
  • [3] Anthony J. Bell and Terrence J. Sejnowski,
    "The Independent Components of Natural Scenes are
    Edge Filters." Howard Hughes Medical Institute,
    Computational Neurobiology Laboratory.
  • [4] Te-Won Lee, Mark Girolami, and Terrence J.
    Sejnowski, "Independent Component Analysis Using an
    Extended Infomax Algorithm for Mixed Subgaussian and
    Supergaussian Sources." 1997.
  • [5] Antti Leino, "Independent Component Analysis:
    An Overview." 2004.
  • [6] Erik G. Learned-Miller and John W. Fisher III,
    "ICA Using Spacings Estimates of Entropy." Journal
    of Machine Learning Research, 4: 1271-1295, 2003.