Title: Analyzing Brain Signals by Combinatorial Optimization
1Analyzing Brain Signals by Combinatorial
Optimization
Quantifying statistical interdependence of point
processes Application to spike data and EEG
- Justin Dauwels
- LIDS, MIT
- Amari Research Unit, Brain Science Institute,
RIKEN - December 1, 2008
2Topics
- Mathematical problem
- Similarity of Multiple Point Processes
- Motivation/Application
- Early diagnosis of Alzheimers disease from EEG
signals - Along the way
- Spike synchrony
-
Collaborators François Vialatte, Theophane
Weber, and Andrzej Cichocki (RIKEN, MIT)
Financial Support
3Alzheimer's disease
One disease, many symptoms
Evolution of the disease (stages)
EEG data
- 2 to 5 years before
- mild cognitive impairment (MCI)
- 6 to 25 progress to Alzheimers
memory, language, executive functions, apraxia,
apathy, agnosia, etc
- Mild (early stage)
- becomes less energetic or spontaneous
- noticeable cognitive deficits
- still independent (able to compensate)
Memory (forgetting relatives)
- Moderate (middle stage)
- Mental abilities decline
- personality changes
- become dependent on caregivers
Apathy
- Severe (late stage)
- complete deterioration of the personality
- loss of control over bodily functions
- total dependence on caregivers
Video sources Alzheimer society
- 2 to 5 of people over 65 years old
- up to 20 of people over 80
- Jeong 2004 (Nature)
GOAL Diagnosis of MCI based on EEG
- EEG is relatively simple and inexpensive
technology - Early diagnosis medication more effective, more
time to prepare future care of patient, etc.
4Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-D point processes
- Two multi-D point processes
- Multiple multi-D point processes
- Numerical Results
- Conclusion
5Alzheimer's disease
Inside glimpse abnormal EEG
EEG system inexpensive, mobile, useful for
screening
Brain slow-down
slow rhythms (0.5-8 Hz) fast rhythms (8-30 Hz)
(Babiloni et al., 2004 Besthorn et al.,
1997 Jelic et al. 1996, Jeong 2004 Dierks et
al., 1993).
focus of this project
Decrease of synchrony
- AD vs. MCI (Hogan et al. 203 Jiang et
al., 2005) - AD vs. Control (Hermann, Demilrap, 2005,
Yagyu et al. 1997 Stam et al., 2002 Babiloni et
al. 2006) - MCI vs. mildAD (Babiloni et al., 2006).
Images www.cerebromente.org.br
6Spontaneous (scalp) EEG
Time-frequency X(t,f)2 (wavelet transform)
f (Hz)
Time-frequency patterns (bumps)
Fourier X(f)2
Fourier power
t (sec)
amplitude
EEG x(t)
7Sparse representation bump model
f(Hz)
f(Hz)
Bumps Sparse representation
t (sec)
f(Hz)
t (sec)
104- 105 coefficients
t (sec)
- Assumptions
- time-frequency map is suitable representation
- oscillatory bursts (bumps) convey key
information
about 102 parameters
F. Vialatte et al. A machine learning approach
to the analysis of time-frequency maps and its
application to neural dynamics, Neural Networks
(2007).
8Similarity of bump models
How similar are n 2 bump models?
Similarity of multiple multi-dimensional point
processes
with
and
point / event
9Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-dim point processes
- Two multi-dim point processes
- Multiple multi-dim point processes
- Numerical Results
- Conclusion
10Two one-dimensional point processes
x
t
0
x
0
t
How synchronous/similar? Classical methods for
continuous time series fail e.g.,
cross-correlation
11Two aspects of synchrony
- Analogy waiting for a train
- Train may not arrive (e.g., mechanical problem)
- Event reliability
- Train may or may not be on time
- Timing precision
12Two 1-dim point processes
- Review of Spike Synchrony Measures
- Surrogate Spike Data
- Spike Trains from Morris-Lecar Neuron
- Conclusion
13Spike Synchrony Measures
- Von Rossum distance (mixed)
- Schreiber et al similarity measure (mixed)
- Hunter-Milton similarity measure (mixed)
- Victor-Purpura distance metric (event
reliability) - Event synchronization (mixed)
- Stochastic event synchrony
- (timing precision and event reliability)
14Van Rossum distance measure
- Spikes convolved with exponential or Gaussian
function - ? spike trains converted into time series s(t)
and s(t) -
- Squared distance between s(t) and s(t)
- If x x, we have DR 0
- Time constant tR
tR
van Rossum M.C.W., 2001. A novel spike distance.
Neural Computation 13, 75163.
15Schreiber et al. similarity measure
- Spikes convolved with exponential or Gaussian
function - ? spike trains converted into time series s(t)
and s(t) - Correlation between s(t) and s(t)
- If x x, we have SS 1
- Time constant tS
Schreiber S., Fellous J.M., Whitmer J.H.,
Tiesinga P.H.E., and Sejnowski T.J., 2003. A new
correlation-based measure of spike timing
reliability. Neurocomputing 52, 925931.
16Victor-Purpura distance measure
- Minimal cost DV of transforming x into x'
- Basic operations
- event insertion/deletion cost 1
- event movement cost proportional to distance
(constant CV) - If x x, we have DV 0
- Time constant tV 1/CV
DELETION
INSERTION
Victor J. D. and Purpura K. P., 1997.
Metric-space analysis of spike trains theory,
algorithms, and application. Network Comput.
Neural Systems 8(17), 127164.
17Stochastic Event Synchrony
- x and x synchronous if identical apart from
- delay
- little timing jitter
- few deletions/insertions
- based on generative statistical model
x
0
v
0
x
0
Dauwels J., Vialatte F., Rutkowski T., and
Cichocki A., 2007. Measuring neural synchrony by
message passing, NIPS 20, in press.
18Stochastic Event Synchrony
T0
v
0
T0
-dt /2
T0
dt /2
Stochastic event synchrony (SES) delay dt ,
jitter st , non-coincidence ?
Dauwels J., Vialatte F., Rutkowski T., and
Cichocki A., 2007. Measuring neural synchrony by
message passing, NIPS 20, in press.
19Stochastic Event Synchrony
T0
v
0
T0
-dt /2
T0
dt /2
Dauwels J., Vialatte F., Rutkowski T., and
Cichocki A., 2007. Measuring neural synchrony by
message passing, NIPS 20, in press.
20Probabilistic inference
PROBLEM Given 2 point processes x and x,
compute ? and ? dt , st
APPROACH (j, j,?) argmaxj,j,? log p(x,
x, j, j,?)
SOLUTION Coordinate descent (j(i1) , j(i1)
) argmaxj,j log p(x, x, j , j , ?(i))
?(i1) argmaxx log p(x, x, j(i1) ,
j(i1) , ?)
DYNAMIC PROGRAMMING
PARAMETER ESTIMATION
x6
x5
x4
x3
x2
x1
xk non-coincident
xk non-coincident
(xk xk ) coincident pair
0
0
x1
x2
x3
x4
x5
x6
21Spike Synchrony Measures
- Von Rossum distance (mixed)
- Schreiber et al similarity measure (mixed)
- Hunter-Milton similarity measure (mixed)
- Victor-Purpura distance metric (event
reliability) - Event synchronization (mixed)
- Stochastic event synchrony
- (timing precision and event reliability)
22Two 1-dim point processes
- Review of Spike Synchrony Measures
- Surrogate Spike Data
- Spike Trains from Morris-Lecar Neuron
- Conclusion
23Surrogate Data
- pd 0, 0.1, , 0.4 (deletion probability)
- dt 0, 25, and 50 ms (delay)
- st 10, 30, and 50 ms (timing jitter)
- length of hidden sequence 40/(1-pd)
- expected length of x and x 40
- ES computed over 10000 pairs
24Surrogate Data Results
dt 0
Van Rossum measure DR similar for SS ,SH ,SQ
Victor Purpura measure DV
- EDR increases with pd and st
- ? DR cannot distinguish timing dispersion from
event reliability - (likewise all measures except SES and DV)
-
- EDV increases with pd, practically
independent of st - ? DV measure for event reliability
- ONLY curves for dt 0ms, measures strongly
depend on lag
25Surrogate Data Results for SES
- Est increases with st, practically independent
of pd - ? st measure for timing dispersion
-
- E? increases with pd, practically independent
of st - ? ? measure for event reliability
- Curves for dt 0, 25, and 50 ms practically
coincident
26Two 1-dim point processes
- Review of Spike Synchrony Measures
- Surrogate Spike Data
- Spike Trains from Morris-Lecar Neuron
- Conclusion
27Morris-Lecar Neurons
- Simple neuron model
- Exhibits behavior of Type I II neurons
(saddle-node/Hopf bifurc.) - Input current baseline sinusoid Gaussian
noise - Membrane potential
Spiking threshold
Type II
Type I
5 trials
28Morris-Lecar Neurons (2)
Type II
Type I
50 trials
Low reliability Small timing dispersion
High reliability Large timing dispersion
jitter st (3ms)2, non-coincidence ? 27
jitter st (15ms)2, non-coincidence ? 3
29Morris-Lecar Neurons Results
- Small t Type II has larger similarity than type
I (dispersion in Type I) - Large t Type I has larger similarity than type
II (drop-outs in Type II) - Observation
- Similarity depends on time constant t ?
similarity FUNCTION S(t) - SES AUTOMATICALLY selects st
30Two 1-dim point processes
- Review of Spike Synchrony Measures
- Surrogate Spike Data
- Spike Trains from Morris-Lecar Neuron
- Conclusion
31Conclusion
- Similarity of pairs of spike trains timing
precision and reliability - Comparison of various spike synchrony measures
- Most measures not able to separate the two
aspect of synchrony - Exception Victor-Purpura and Stochastic Event
Synchrony - Victor-Purpura event reliability
- SES both timing precision and event reliability
- Most measures depend on time constant, to be
chosen by user - Exception Event Synchronization and SES
- Most measures sensitive to lags between the two
spike trains - Exception SES
- Future work application to neurophysiological
recordings
32Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-dim point processes
- Two multi-dim point processes
- Multiple multi-dim point processes
- Numerical Results
- Conclusion
33Similarity of two bump models...
34... by matching bumps
- Bumps in one model, but NOT in other
- ? fraction of non-coincident bumps ?
- Bumps in both models, but with offset
- ? Average time offset dt (delay)
- ? Timing jitter with variance st
- ? Average frequency offset df
- ? Frequency jitter with variance sf
Stochastic Event Synchrony (SES) (?, dt,
st, df, sf )
PROBLEM Given two bump models, compute (?, dt,
st, df, sf )
35Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number of bumps
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
- Generate two noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt /2, df /2)
- covariance diag(st/2, sf /2)
- amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
y
y
( -dt /2, -df /2)
( dt /2, df /2)
Dauwels J., Vialatte F., Rutkowski T., and
Cichocki A., 2007. Measuring neural synchrony by
message passing, NIPS 20, in press.
36Summary
PROBLEM Given two bump models, compute (?, dt,
st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
MATCHING ? max-product
ESTIMATION ? closed-form
Dauwels J., Vialatte F., Rutkowski T., and
Cichocki A., 2007. Measuring neural synchrony by
message passing, NIPS 20, in press.
37Average synchrony
3. SES for each pair of models 4. Average the SES
parameters
- Group electrodes in regions
- Bump model for each region
38Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-dim point processes
- Two multi-dim point processes
- Multiple multi-dim point processes
- Numerical Results
- Conclusion
39Beyond pairwise interactions
Multi-variate similarity
Pairwise similarity
40Similarity of multiple bump models
y2
y1
y3
y4
y5
- Models similar if
- few deletions/large clusters
- little jitter
y1
y2
y3
y4
y5
Constraint in each cluster at most one bump
from each signal
Dauwels J., Vialatte F., Weber T. and Cichocki.
Analyzing Brain Signals by Combinatorial
Optimization, Allerton 2008.
41Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number n of bumps
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
y1
y2
y3
y4
y5
- Generate M noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt,m /2, df,m /2)
- covariance diag(st,m/2, sf,m
/2) - amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
pc (i) p(cluster size i y) (i
1,2,,M)
Parameters ? dt,m , df,m , st,m , sf,m, pc
Dauwels J., Vialatte F., Weber T. and Cichocki.
Analyzing Brain Signals by Combinatorial
Optimization, Allerton 2008.
42Probabilistic inference
PROBLEM Given M bump models, compute ? dt,m ,
df,m , st,m , sf,m, pc
APPROACH (b,?) argmaxb,? log p(y, y, b,
?)
SOLUTION Coordinate descent b(i1)
argmaxc log p(y, y, b, ?(i) ) ?(i1)
argmaxx log p(y, y, b(i1) ,? )
CLUSTERING (Integer Program)
ESTIMATION OF PARAMETERS
- Integer programming methods (e.g., LP relaxation)
- IP with 10.000 variables solved in about 1s
- CPLEX commercial toolbox for solving IPs
(combines several algorithms)
Dauwels J., Vialatte F., Weber T. and Cichocki.
Analyzing Brain Signals by Combinatorial
Optimization, Allerton 2008.
43Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-dim point processes
- Two multi-dim point processes
- Multiple multi-dim point processes
- Numerical Results
- Conclusion
44EEG Data
- EEG of 22 Mild Cognitive Impairment (MCI)
patients and 38 age-matched - control subjects (CTR) recorded while in rest
with closed eyes - ? spontaneous EEG
- All 22 MCI patients suffered from Alzheimers
disease (AD) later on - Electrodes located on 21 sites according to
10-20 international system - Electrodes grouped into 5 zones (reduces number
of pairs) - 1 bump model per zone
- Band pass filtered between 4 and 30 Hz
EEG data provided by Prof. T. Musha
45Similarity measures
- Correlation and coherence
- Granger causality (linear system) DTF, ffDTF,
dDTF, PDC, PC, ... - Phase Synchrony compare instantaneous phases
(wavelet/Hilbert transform) - State space based measures
- sync likelihood, S-estimator,
S-H-N-indices, ... - Information-theoretic measures
FREQUENCY
TIME
No Phase Locking
Phase Locking
46Sensitivity (average synchrony)
Corr/Coh
Granger
Info. Theor.
State Space
Phase
SES
Significant differences for ffDTF and SES (more
unmatched bumps, but same amount of jitter)
Mann-Whitney test small p value suggests large
difference in statistics of both groups
47Classification (bi-SES)
85 correctly classified
ffDTF
- Clear separation, but not yet useful as
diagnostic tool - Additional indicators needed (fMRI, MEG, DTI,
...) - Can be used for screening population
(inexpensive, simple, fast)
48Correlations
Strong (anti-) correlations families of
sync measures
49Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Similarity of Point Processes
- Two 1-dim point processes
- Two multi-dim point processes
- Multiple multi-dim point processes
- Numerical Results
- Conclusion
50Conclusions
- Measure for similarity of point processes
- Key idea matching of events
- Applications
- Spiking synchrony (surrogate data/Morris Lecar
neuron) - EEG synchrony of MCI patients
- SES allows to distinguish event reliability from
timing precision - About 85-90 correctly classified MCI vs. healthy
subjects - perhaps useful for screening a large
population - Future work
- Combination with other modalities (MEG, fMRI,
...) - Integration of biophysical models
- Alternative inference techniques (variations on
max-product, Monte-Carlo)
51Analyzing Brain Signals by Combinatorial
Optimization
Quantifying statistical interdependence of point
processes Application to spike data and EEG
- Justin Dauwels
- LIDS, MIT
- Amari Research Unit, Brain Science Institute,
RIKEN - December 1, 2008
52References software
- References
- Quantifying Statistical Interdependence by
Message Passing on Graphs - PART I One-Dimensional Point Processes, Neural
Computation (under revision) - Quantifying Statistical Interdependence by
Message Passing on Graphs - PART II Multi-Dimensional Point Processes,
Neural Computation (under revision) - Quantifying Statistical Interdependence by
Message Passing on Graphs - PART III Multivariate Approach, Neural
Computation (in preparation) - A Comparative Study of Synchrony Measures for the
Early Diagnosis of Alzheimer's Disease - Based on EEG, NeuroImage (under revision)
- On the Early Diagnosis of Alzheimer's Disease
Based on EEG, - Current Alzheimers Research (in preparation,
invited review) - Measuring Neural Synchrony by Message Passing,
NIPS 2007
- Software
- MATLAB implementation of the synchrony measures
- MATLAB Toolbox for bump modelling
.
53Summary
Similarity of multiple multi-dimensional point
processes
Step 1 TWO ONE-dimensional point processes
Dynamic programming
Step 2 TWO MULTI-dimensional point processes
Max-product/LP relaxation/Edmund-Karp
Step 3 MULTIPLE MULTI-dimensional point processes
Integer Programming
54Estimation
Simple closed form expressions
Deltas average offset
Sigmas var of offset
artificial observations (conjugate prior)
...where
55Large-scale synchrony
Apparently, all brain regions affected...
56Alzheimer's disease
Outside glimpse the future (prevalence)
USA (Hebert et al. 2003)
- 2 to 5 of people over 65 years old
- Up to 20 of people over 80
- Jeong 2004 (Nature)
Million of sufferers
World (Wimo et al. 2003)
Million of sufferers
57Ongoing and future work
Applications
- Fluctuations of EEG synchrony
- Caused by auditory stimuli and music (T.
Rutkowski) - Caused by visual stimuli (F. Vialatte)
- Yoga professionals (F. Vialatte)
- Professional shogi players (RIKEN Fujitsu)
- Brain-Computer Interfaces (T. Rutkowski)
- Spike data from interacting monkeys (N. Fujii)
- Calcium propagation in gliacells (N. Nakata)
- Neural growth (Y. Tsukada Y. Sakumura)
- ...
Algorithms
- alternative inference techniques (e.g., MCMC,
linear programming) - time dependent (Gaussian processes)
- multivariate (T.Weber)
58Fitting bump models
?
Signal
gradient method
F. Vialatte et al. A machine learning approach
to the analysis of time-frequency maps and its
application to neural dynamics, Neural Networks
(2007).
59Boxplots
- SURPRISE!
- No increase in jitter, but significantly less
matched activity! - Physiological interpretation
- neural assemblies more localized?
- harder to establish large-scale synchrony?
60Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number n of bumps
- p(n) (1- ? S) (? S)-n
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
- Generate two noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt /2, df /2)
- covariance diag(st/2, sf /2)
- amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
y
y
( -dt /2, -df /2)
( dt /2, df /2)
Easily extendable to more than 2 observations
61Probabilistic inference
PROBLEM Given two bump models, compute (?spur,
dt, st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
MATCHING
POINT ESTIMATION
62Alzheimer's disease
Inside glimpse abnormal EEG
EEG system inexpensive, mobile, useful for
screening
Brain slow-down
slow rhythms (0.5-8 Hz) fast rhythms (8-30 Hz)
(Babiloni et al., 2004 Besthorn et al.,
1997 Jelic et al. 1996, Jeong 2004 Dierks et
al., 1993).
focus of this project
Decrease of synchrony
- AD vs. MCI (Hogan et al. 203 Jiang et
al., 2005) - AD vs. Control (Hermann, Demilrap, 2005,
Yagyu et al. 1997 Stam et al., 2002 Babiloni et
al. 2006) - MCI vs. mildAD (Babiloni et al., 2006).
Images www.cerebromente.org.br
63Comparing EEG signal rhythms ?
2 signals
PROBLEM I Signals of 3 seconds sampled at
100 Hz (? 300 samples) Time-frequency
representation of one signal about 25 000
coefficients
64Comparing EEG signal rhythms ?(2)
PROBLEM II Shifts in time-frequency!
65Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number n of bumps
- p(n) (1- ? S) (? S)-n
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
y1
y2
y3
y4
y5
- Generate M noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt,m /2, df,m /2)
- covariance diag(st,m/2, sf,m
/2) - amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
pc (i) p(cluster size i y) (i
1,2,,M)
Parameters ? dt,m , df,m , st,m , sf,m, pc
66Classification (multi-SES)
Average bump freq
85 correctly classified
Average cluster size
Average bump width
90 correctly classified
Average cluster size
ffDTF
67Similarity of bump models...
How similar or synchronous are two bump
models?
68Signatures of local synchrony
f (Hz)
Time-frequency patterns (bumps)
EEG stems from thousands of neurons bump if
neurons are phase-locked local synchrony
t (sec)
69Alzheimer's disease
Inside glimpse brain atrophy
amyloid plaques and neurofibrillary tangles
Video source Alzheimer society
Images Jannis Productions. (R. Fredenburg S.
Jannis)
Video source P. Thompson, J.Neuroscience, 2003
70Probabilistic inference
POINT ESTIMATION ?(i1) argmaxx log p(y, y,
c(i1) ,? )
Uniform prior p(?) dt, df average
offset, st, sf variance of offset Conjugate
prior p(?) still closed-form expression Other
kind of prior p(?) numerical optimization
(gradient method)
71Probabilistic inference
MATCHING c(i1) argmaxc log p(y, y, c, ?(i)
)
EQUIVALENT to (imperfect) bipartite max-weight
matching problem c(i1) argmaxc log p(y, y,
c, ?(i) ) argmaxc Skk wkk(i) ckk
s.t. Sk ckk 1 and Sk ckk 1 and ckk 2
0,1
find heaviest set of disjoint edges
not necessarily perfect
- ALGORITHMS
- Polynomial-time algorithms gives optimal
solution(s) (Edmond-Karp and Auction
algorithm) - Linear programming relaxation extreme points of
LP polytope are integral - Max-product algorithm gives optimal solution if
unique Bayati et al. (2005), Sanghavi (2007)
72Max-product algorithm
MATCHING c(i1) argmaxc log p(y, y, c, ?(i)
)
Generative model
p(y, y, c, ?) / I(c) p?(?) ?kk (N(t k tk
dt ,st,kk) N(f k fk df ,sf, kk) ß-2)ckk
73Max-product algorithm
MATCHING c(i1) argmaxc log p(y, y, c, ?(i)
)
Conditioning on ?
µ?
µ?
µ?
µ?
74Max-product algorithm (2)
- Iteratively compute messages
- At convergence, compute marginals p(ckk)
µ?(ckk) µ?(ckk) µ?(ckk)
75Algorithm
PROBLEM Given two bump models, compute (?spur,
dt, st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
MATCHING ? max-product
ESTIMATION ? closed-form
76Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number n of bumps
- p(n) (1- ? S) (? S)-n
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
- Generate two noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt /2, df /2)
- covariance diag(st/2, sf /2)
- amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
y
y
( -dt /2, -df /2)
( dt /2, df /2)
Easily extendable to more than 2 observations
77Generative model (2)
y
y
i
( -dt /2, -df /2)
i
j
( dt /2, df /2)
- Binary variables ckk
- ckk 1 if k and k are observations of
same hidden bump, else ckk 0 (e.g., cii 1
cij 0) - Constraints bk Sk ckk and bk Sk ckk
are binary (matching constraints) - Generative Model p(y, y, yhidden , c, dt , df
, st , sf ) (symmetric in y and y) - Eliminate yhidden ? offset is Gaussian RV with
mean ( dt , df ) and covariance diag (st , sf) - Probabilistic Inference
?
p(y, y, c, ?) ? p(y, y, yhidden , c, ?)
dyhidden
(c,?) argmaxc,? log p(y, y, c, ?)
78Summary
- Bumps in one model, but NOT in other
- ? fraction of spurious bumps ?spur
- Bumps in both models, but with offset
- ? Average time offset dt (delay)
- ? Timing jitter with variance st
- ? Average frequency offset df
- ? Frequency jitter with variance sf
PROBLEM Given two bump models, compute (?spur,
dt, st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
79Objective function
y
y
i
( -dt /2, -df /2)
i
j
( dt /2, df /2)
- Logarithm of model log p(y, y, c, ?) Skk
wkk ckk log I(c) log p?(?) ?
wkk -(1/st (t k tk dt)2 1/sf (f k
fk df)2 ) - 2 log ß ß pd (?/V)1/2
Euclidean distance between bump centers
- Large wkk if a) bumps are close b)
small pd c) few bumps per volume element - No need to specify pd , ?, and V, they only
appear through ß knob to control matches
80Distance measures
Scaling
wkk 1/st,kk (t k tk dt)2 1/sf,kk
(f k fk df)2 2 log ß st,kk (?tk
?tk) st sf,kk (?fk ?fk) sf
Non-Euclidean
81Generative model
p(y, y, c, ?) / I(c) p?(?) ?kk (N(t k tk
dt ,st,kk) N(f k fk df ,sf, kk) ß-2)ckk
82Prior for parameters
- Expect bumps to appear at about same frequency,
but delayed - Frequency shift requires non-linear
transformation, less likely than delay - Conjugate priors for st and sf (scaled inverse
chi-squared) - Improper prior for dt and dt p(dt) 1 p(df)
83Preliminary results for multi-variate model
linear comb of pc
CTR
MCI
84Probabilistic inference
PROBLEM Given two bump models, compute (?spur,
dt, st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
MATCHING
POINT ESTIMATION
Minx2 X, y2Y d(x,y)
X
Y
85Generative model
yhidden
- Generate bump model (hidden)
- geometric prior for number n of bumps
- p(n) (1- ? S) (? S)-n
- bumps are uniformly distributed in rectangle
- amplitude, width (in t and f) all i.i.d.
y1
y2
y3
y4
y5
- Generate M noisy observations
- offset between hidden and observed bump
- Gaussian random vector with
- mean ( dt,m /2, df,m /2)
- covariance diag(st,m/2, sf,m
/2) - amplitude, width (in t and f) all i.i.d.
- deletion with probability pd
- (other prior pc0 for cluster size)
pc (i) p(cluster size i y) (i
1,2,,M)
Parameters ? dt,m , df,m , st,m , sf,m, pc
86Role of local synchrony
Stimuli
Consolidation
Stimulus
Assembly activation
Assembly recall
Hebbian consolidation
Voice
Face
Voice
(Hebb 1949, Fuster 1997)
87Probabilistic inference
PROBLEM Given M bump models, compute ? dt,m ,
df,m , st,m , sf,m, pc
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
CLUSTERING (IP or MP)
POINT ESTIMATION
- Integer program
- Max-product algorithm (MP) on sparse graph
- Integer programming methods (e.g., LP
relaxation)
88Fourier transform
2
3
1
3
2
1
Frequency
High frequency
Low frequency
89Windowed Fourier transform
Fourier basis functions
Window function
windowed basis functions
f
Windowed Fourier Transform
t
90Overview
- Alzheimers Disease (AD)
- decrease in EEG synchrony
- Synchrony measure in time-frequency domain
- Pairs of EEG signals
- Collections of EEG signals
- Numerical Results
- Conclusion
91Average synchrony
3. SES for each pair of models 4. Average the SES
parameters
- Group electrodes in regions
- Bump model for each region
92Beyond pairwise interactions...
Multi-variate similarity
Pairwise similarity
93Similarity measures
- Correlation and coherence
- Granger causality (linear system) DTF, ffDTF,
dDTF, PDC, PC, ... - Phase Synchrony compare instantaneous phases
(wavelet/Hilbert transform) - State space based measures
- sync likelihood, S-estimator,
S-H-N-indices, ... - Information-theoretic measures
FREQUENCY
TIME
No Phase Locking
Phase Locking
94Generative model (2)
Model
Cost function
unit cost of non-coincident event
unit cost of coincident pair
95Surrogate Data Results (2)
- SS depends on dt
- likewise other S except SES
96Probabilistic inference
PROBLEM Given two bump models, compute (?, dt,
st, df, sf )
?
APPROACH (c,?) argmaxc,? log p(y, y, c,
?)
SOLUTION Coordinate descent c(i1)
argmaxc log p(y, y, c, ?(i) ) ?(i1)
argmaxx log p(y, y, c(i1) ,? )
MATCHING
POINT ESTIMATION
97Probabilistic inference (2)
MATCHING c(i1) argmaxc log p(y, y, c, ?(i)
)
EQUIVALENT to (imperfect) bipartite max-weight
matching problem c(i1) argmaxc log p(y, y,
c, ?(i) ) argmaxc Skk wkk(i) ckk
s.t. Sk ckk 1 and Sk ckk 1 and ckk 2
0,1
find heaviest set of disjoint edges
not necessarily perfect
- ALGORITHMS
- Polynomial-time algorithms gives optimal
solution(s) (Edmond-Karp and Auction
algorithm) - Linear programming relaxation gives optimal
solution if unique Sanghavi (2007) - Max-product algorithm gives optimal solution if
unique Bayati et al. (2005), Sanghavi (2007)
98Max-product algorithm
MATCHING c(i1) argmaxc log p(y, y, c, ?(i)
)
µ?
µ?
µ?
µ?
- At convergence, compute marginals p(ckk)
µ?(ckk) µ?(ckk) µ?(ckk) - Decisions ckk argmaxckk p(ckk)
(optimal if solution unique)
99Exemplar-based formulation
yhidden
y1
y2
y3
y4
y5
- Exemplars identical copies of hidden bumps
cluster center - Other bumps in cluster non-identical copies of
exemplars
- Is event an exemplar?
- If not, which exemplar is it associated with?
- Several constraints
Integer program
100Exemplar-based formulation IP
Binary Variables
Integer Program LINEAR objective
function/constraints
Equivalent to k-dim matching for k 2 in P but
for k gt 2 NP-hard!