Title: Independent Component Analysis For Time Series Separation
1 Independent Component Analysis For Time Series Separation
2 ICA
- Blind Signal Separation (BSS), or Independent Component Analysis (ICA), is the identification and separation of mixtures of sources with little prior information.
- Applications include:
- Audio Processing
- Medical data
- Finance
- Array processing (beamforming)
- Coding
- and most applications where Factor Analysis and PCA are currently used.
- While PCA seeks the directions that represent the data best in a least-squares ($\sum \|x_0 - x\|^2$) sense, ICA seeks the directions that are most independent from each other.
- We will concentrate on time series separation of multiple targets.
3 The Simple Cocktail Party Problem
[Diagram: sources s1, s2 -> mixing matrix A -> observations x1, x2]
$x = As$
n sources, m = n observations
4 Motivation
[Figure panels: Two Independent Sources; Mixture at Two Mics]
The coefficients $a_{ij}$ depend on the distances of the microphones from the speakers.
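A minimal sketch of the two-microphone mixing, assuming two made-up waveforms as stand-ins for the speakers and hypothetical values for the coefficients $a_{ij}$ (none of these numbers come from the slides):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)

# Two unrelated waveforms standing in for the independent sources (assumed).
s1 = np.sin(2 * np.pi * 5 * t)          # sine wave
s2 = 2 * ((7 * t) % 1.0) - 1            # sawtooth
S = np.vstack([s1, s2])                 # shape (2, 1000): one row per source

# Hypothetical mixing coefficients a_ij; in practice they are unknown.
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])

# Each microphone records a weighted sum of both sources: x = A s.
X = A @ S                               # shape (2, 1000): one row per microphone
```

Each row of `X` is what one microphone would record; ICA only ever sees `X`, never `S` or `A`.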
5 Motivation
Get the Independent Signals out of the Mixture
6 ICA Model (Noise Free)
- Use a statistical latent-variable system
- Random variable $s_k$ instead of a time signal
- $x_j = a_{j1} s_1 + a_{j2} s_2 + \dots + a_{jn} s_n$, for all $j$
- $x = As$
- The ICs $s$ are latent variables and are unknown, AND the mixing matrix $A$ is also unknown
- Task: estimate $A$ and $s$ using only the observable random vector $x$
- Let's assume that the number of ICs = the number of observable mixtures
- and that $A$ is square and invertible
- So after estimating $A$, we can compute $W = A^{-1}$ and hence
- $s = Wx = A^{-1}x$
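The last step of the model can be sketched as follows. This only illustrates $s = Wx$ once $A$ is known. The sources and the matrix values are assumptions chosen for the example, and a real ICA algorithm would have to estimate $A$ first:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in sources (assumed): two independent uniform random variables.
S = rng.uniform(-1.0, 1.0, size=(2, 500))

# Assumed square, invertible mixing matrix.
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])
X = A @ S                      # observed mixtures, x = A s

# With A known (or estimated), the unmixing matrix is its inverse.
W = np.linalg.inv(A)
S_hat = W @ X                  # s = W x = A^{-1} x
```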
7 Illustration
Two ICs with distribution [formula not recovered], zero mean and variance equal to 1
Mixing matrix A is [matrix not recovered]
The edges of the parallelogram are in the directions of the columns of $A$. So if we can estimate the joint pdf of $x_1$ and $x_2$ and then locate the edges, we can estimate $A$.
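The parallelogram argument can be checked numerically. The sketch below assumes uniformly distributed sources (a natural reading of the parallelogram picture, but an assumption, since the slide's formula is missing) and a hypothetical mixing matrix:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumption: uniform sources on [-sqrt(3), sqrt(3)], which have zero mean
# and unit variance.
half_width = np.sqrt(3.0)
S = rng.uniform(-half_width, half_width, size=(2, 20000))

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])     # hypothetical mixing matrix
X = A @ S                      # mixed data fills a parallelogram

# Corners of the source square, listed counter-clockwise, and their images.
corners_s = half_width * np.array([[-1.0, 1.0, 1.0, -1.0],
                                   [-1.0, -1.0, 1.0, 1.0]])
corners_x = A @ corners_s

# Edge along which only s1 varies, and edge along which only s2 varies.
edge_1 = corners_x[:, 1] - corners_x[:, 0]
edge_2 = corners_x[:, 2] - corners_x[:, 1]

def cross2(u, v):
    """2-D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

parallel_1 = cross2(edge_1, A[:, 0])   # edge_1 vs first column of A
parallel_2 = cross2(edge_2, A[:, 1])   # edge_2 vs second column of A
```

Both cross products vanish, which is the statement that the edges point along the columns of $A$.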
8 Restrictions
- The $s_i$ are statistically independent
- $p(s_1, s_2) = p(s_1)\,p(s_2)$
- Nongaussian distributions
- The joint density of unit-variance Gaussian $s_1$ and $s_2$ is rotationally symmetric, so it doesn't contain any information about the directions of the columns of the mixing matrix $A$. So $A$ cannot be estimated.
- If only one IC is Gaussian, the estimation is still possible.
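A small numerical illustration of why Gaussian sources are excluded, assuming a 45-degree rotation as the (orthogonal) mixing and using excess kurtosis as a simple non-Gaussianity check. The sample sizes and the rotation angle are arbitrary choices for the sketch:

```python
import numpy as np

def excess_kurtosis(y):
    """Normalized fourth moment minus 3; zero for a Gaussian."""
    y = y - y.mean()
    return np.mean(y**4) / np.mean(y**2) ** 2 - 3.0

rng = np.random.default_rng(3)
n = 200000

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal mixing (assumed)

gauss = rng.standard_normal((2, n))
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), (2, n))

# Gaussian sources: the rotated data is statistically indistinguishable.
k_gauss_before = excess_kurtosis(gauss[0])
k_gauss_after = excess_kurtosis((R @ gauss)[0])

# Uniform sources: the rotation visibly changes the marginal distribution.
k_unif_before = excess_kurtosis(unif[0])
k_unif_after = excess_kurtosis((R @ unif)[0])
```

For the Gaussian pair the kurtosis stays near 0 before and after mixing, so the mixing leaves no trace. For the uniform pair it moves from about -1.2 toward -0.6, which is the information ICA exploits.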
9 Ambiguities
- Can't determine the variances (energies) of the ICs
- Both $s$ and $A$ are unknown, so any scalar multiple in one of the sources can always be cancelled by dividing the corresponding column of $A$ by it.
- Fix the magnitudes of the ICs by assuming unit variance: $E\{s_i^2\} = 1$
- Only the ambiguity of sign remains
- Can't determine the order of the ICs
- The terms in the sum can be freely reordered, because both $s$ and $A$ are unknown. So we can call any IC the first one.
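Both ambiguities can be seen directly: rescaling or reordering the sources, with the matching change to the columns of $A$, produces exactly the same observations. A sketch with assumed sources, mixing matrix, and scale factors:

```python
import numpy as np

rng = np.random.default_rng(4)
S = rng.laplace(size=(2, 1000))          # stand-in non-Gaussian sources
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])               # assumed mixing matrix
X = A @ S

# Scaling (and sign) ambiguity: multiply each source by a scalar and
# divide the corresponding column of A by the same scalar.
D = np.diag([5.0, -0.5])
S_scaled = D @ S
A_scaled = A @ np.linalg.inv(D)

# Order ambiguity: permute the sources and the columns of A together.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
S_perm = P @ S
A_perm = A @ P.T
```

`A_scaled @ S_scaled` and `A_perm @ S_perm` both reproduce `X`, so the data alone cannot distinguish these decompositions.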
10 ICA Principle (Non-Gaussian is Independent)
- Key to estimating A is non-gaussianity
- The distribution of a sum of independent random
variables tends toward a Gaussian distribution.
(By CLT)
- [Figure: densities $f(s_1)$, $f(s_2)$ and $f(x_1) = f(s_1 + s_2)$]
- $y = w^T x = w^T A s = z^T s$, where $w$ is one of the rows of the matrix $W$ and $z = A^T w$.
- $y$ is a linear combination of the $s_i$, with weights given by $z_i$.
- Since a sum of two independent r.v.s is more Gaussian than either individual r.v., $z^T s$ is more Gaussian than any of the $s_i$, AND it becomes least Gaussian when it equals one of the $s_i$.
- So we could take $w$ as a vector which maximizes the non-Gaussianity of $w^T x$.
- Such a $w$ would correspond to a $z$ with only one nonzero component. So we get back one of the $s_i$.
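The principle can be tried out in two dimensions with a brute-force search. This is an illustrative sketch, not a practical algorithm: it assumes uniform sources, a hypothetical mixing matrix, whitened data, and absolute excess kurtosis as the non-Gaussianity measure, and it simply scans unit vectors $w$ over a grid of angles:

```python
import numpy as np

def excess_kurtosis(y):
    """Normalized fourth moment minus 3; zero for a Gaussian."""
    y = y - y.mean()
    return np.mean(y**4) / np.mean(y**2) ** 2 - 3.0

rng = np.random.default_rng(5)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 20000))  # assumed sources
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])                                  # assumed mixing
X = A @ S

# Center and whiten so that only a rotation remains to be found.
X = X - X.mean(axis=1, keepdims=True)
eigvals, E = np.linalg.eigh(np.cov(X))
Z = E @ np.diag(eigvals ** -0.5) @ E.T @ X

# Scan directions w = (cos a, sin a) and keep the least Gaussian projection.
angles = np.linspace(0.0, np.pi, 361)
scores = [abs(excess_kurtosis(np.cos(a) * Z[0] + np.sin(a) * Z[1]))
          for a in angles]
best = angles[int(np.argmax(scores))]
w = np.array([np.cos(best), np.sin(best)])
y = w @ Z

# Compare the recovered signal with each true source (up to sign and scale).
corr = [abs(np.corrcoef(y, S[i])[0, 1]) for i in range(2)]
```

The projection with the largest absolute kurtosis should be almost perfectly correlated with one of the two sources, which one being arbitrary.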
11 Measures of Non-Gaussianity
- We need a quantitative measure of non-Gaussianity for ICA estimation.
- Kurtosis: Gaussian = 0 (sensitive to outliers)
- Entropy: Gaussian = largest (among distributions of equal variance)
- Negentropy: Gaussian = 0 (difficult to estimate)
- Approximations: [formulas not recovered], where $v$ is a standard Gaussian random variable and [definition not recovered]
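A sketch of two such measures. The kurtosis is the standard excess kurtosis. The negentropy approximation is an assumption about what the slide showed: it uses the widely cited form $J(y) \propto (E\{G(y)\} - E\{G(v)\})^2$ with the contrast function $G(u) = \log\cosh u$, and estimates the Gaussian reference term $E\{G(v)\}$ by Monte Carlo. The function names are hypothetical:

```python
import numpy as np

def kurtosis(y):
    """Excess kurtosis of the standardized signal; zero for a Gaussian."""
    y = (y - y.mean()) / y.std()
    return np.mean(y**4) - 3.0

def negentropy_logcosh(y, rng, n_ref=200000):
    """Approximate negentropy, up to a positive constant factor.

    Assumed form: (E{G(y)} - E{G(v)})^2 with G(u) = log cosh(u) and
    v a standard Gaussian reference sample drawn from `rng`.
    """
    y = (y - y.mean()) / y.std()
    v = rng.standard_normal(n_ref)
    return (np.mean(np.log(np.cosh(y))) - np.mean(np.log(np.cosh(v)))) ** 2

rng = np.random.default_rng(6)
gauss = rng.standard_normal(100000)
unif = rng.uniform(-1.0, 1.0, 100000)      # sub-Gaussian: negative kurtosis
lap = rng.laplace(size=100000)             # super-Gaussian: positive kurtosis

k_gauss, k_unif, k_lap = kurtosis(gauss), kurtosis(unif), kurtosis(lap)
j_gauss = negentropy_logcosh(gauss, rng)
j_unif = negentropy_logcosh(unif, rng)
```

Kurtosis can be negative or positive, so its absolute value or square is used as the measure. The negentropy approximation is non-negative by construction and close to zero for the Gaussian sample.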
12 Data Centering & Whitening
- Centering
- $x \leftarrow x - E\{x\}$
- This doesn't mean that ICA cannot estimate the mean; it just simplifies the algorithm.
- The ICs are also zero mean because
- $E\{s\} = W E\{x\}$
- After ICA, add $W E\{x\}$ to the zero-mean ICs
- Whitening
- We transform $x$ linearly so that the result $\tilde{x}$ is white. This is done by eigenvalue decomposition (EVD).
- $\tilde{x} = (E D^{-1/2} E^T) x = E D^{-1/2} E^T A s = \tilde{A} s$
- where $E\{x x^T\} = E D E^T$
- So we only have to estimate the orthonormal matrix $\tilde{A}$
- An orthonormal matrix has $n(n-1)/2$ degrees of freedom, versus $n^2$ for a general matrix. So for large-dimensional $A$ we have to estimate only about half as many parameters. This greatly simplifies ICA.
- Reducing the dimension of the data (keeping only the dominant eigenvalues) while whitening also helps.
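The two preprocessing steps can be sketched with NumPy as follows. The sources, their means, and the mixing matrix are assumptions made up for the example, and expectations are replaced by sample averages:

```python
import numpy as np

rng = np.random.default_rng(7)
n_samples = 50000

# Hypothetical unit-variance uniform sources with non-zero means.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n_samples))
S = S + np.array([[1.0], [-2.0]])
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])             # assumed mixing matrix
X = A @ S

# Centering: subtract the sample estimate of E{x}.
mean_x = X.mean(axis=1, keepdims=True)
Xc = X - mean_x

# Whitening by EVD of the covariance: E{x x^T} = E D E^T.
C = Xc @ Xc.T / n_samples
d, E = np.linalg.eigh(C)
V = E @ np.diag(d ** -0.5) @ E.T       # whitening matrix E D^(-1/2) E^T
X_white = V @ Xc

# The effective mixing matrix after whitening is approximately orthonormal.
A_tilde = V @ A
cov_white = X_white @ X_white.T / n_samples
```

`cov_white` is the identity up to rounding, and `A_tilde @ A_tilde.T` is close to the identity, with the small deviation coming from the finite sample.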
13 Noisy ICA Model
- $x = As + n$
- $A$ ... $m \times n$ mixing matrix
- $s$ ... $n$-dimensional vector of ICs
- $n$ ... $m$-dimensional random noise vector
- Same assumptions as for the noise-free model, if we use measures of non-Gaussianity which are immune to Gaussian noise.
- So Gaussian moments are used as contrast functions, i.e. [formula not recovered]
- However, in pre-whitening the effect of noise must be taken into account, with $\Sigma$ the noise covariance:
- $\tilde{x} = (E\{x x^T\} - \Sigma)^{-1/2} x$
- $\tilde{x} = Bs + \tilde{n}$.
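A sketch of the noise-aware pre-whitening step, assuming the noise covariance $\Sigma$ is known exactly and using made-up sources, mixing matrix, and noise level. The inverse square root is computed by eigenvalue decomposition:

```python
import numpy as np

rng = np.random.default_rng(8)
n_samples = 200000

# Assumed unit-variance, zero-mean uniform sources and mixing matrix.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n_samples))
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])

# Assumed known Gaussian noise covariance Sigma.
Sigma = 0.1 * np.eye(2)
noise = rng.multivariate_normal(np.zeros(2), Sigma, size=n_samples).T
X = A @ S + noise                         # x = A s + n

# Quasi-whitening: use (E{x x^T} - Sigma)^(-1/2) instead of E{x x^T}^(-1/2).
C = X @ X.T / n_samples
eigvals, E = np.linalg.eigh(C - Sigma)
V = E @ np.diag(eigvals ** -0.5) @ E.T
X_tilde = V @ X

B = V @ A                                 # mixing matrix after quasi-whitening
Sigma_tilde = V @ Sigma @ V.T             # covariance of the transformed noise
```

Subtracting $\Sigma$ first means that `B` comes out approximately orthogonal, which plain whitening of the noisy data would not give. The price is that the transformed noise is no longer white in general.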
14 Simulation Results
I have used synthetic data, with and without noise, to separate the time series of DW and AAV, which are moving fairly close to each other.
15 Simulation Results