Title: Modelling Strategies in Functional Magnetic Resonance Imaging
1Modelling Strategies in Functional Magnetic
Resonance Imaging
14th of October 2009
Kristoffer Hougaard Madsen
Danish Research Centre for Magnetic
Resonance Copenhagen University Hospital Hvidovre
DTU Informatics Technical University of Denmark
2Overview
- Part I Preliminaries
- Functional Magnetic Resonance Imaging
- Pre-processing
- Nuisance Variable Regression
- Part II Statistical Parametric Mapping
- Part III Unsupervised analysis
- Conclusion
3Magnetic Resonance Imaging
- Hydrogen has spin property
- Strong magnetic field causes partial alignment
- Perturbed by RF waves at resonance frequency
- Emits RF during the return to equilibrium
- Spatially varying gradients allow images to be
recorded
4Functional MRI
- Typically based on haemodynamics
- Indirect measure of neural activity
- Neuronal activity requires oxygen
- Brain is well perfused
- Increased activity -> local increase in blood flow
5Blood Oxygenation Level Dependent (BOLD) signal
- Increased neuronal activity gives rise to increased oxygen metabolism (CMRO2)
- This causes increased blood flow (CBF) and therefore increased blood volume (CBV)
- Presence of deoxygenated blood (dHb) causes signal loss in T2*-weighted MRI (transverse magnetization is lost more quickly)
[Diagram: coupling between CMRO2, CBF, CBV and dHb]
Buxton et al. (Balloon model), 1998 MRM
6Data acquisition
- Measure volumes repeatedly (repetition time 0.1-3 s)
7Noise
- Scanner instability
- Typically low frequency (<0.01 Hz)
- Subject movement
- Rigid body
- Spin history
- Movement by field inhomogeneity
- Physiological effects
- Cardiac cycle (0.5-2 Hz, often appears aliased)
- Respiratory cycle (0.1-0.5 Hz)
- End tidal CO2 volume / respiration volume over time
- Secondary and interaction effects
8Typical pre-processing steps
- Retrospective rigid body realignment
- Co-registering/Normalisation
- Slice-timing (time interpolation)
- Spatial smoothing
- Modelling nuisance effects
- High-pass filter
- Cardiac/respiratory regressors
- Movement (residual effects)
9Nuisance Variable Regression (NVR)
- Cardiac effects
- Effects of the cardiac cycle will appear aliased due to the low sampling rate
- Can appear at any frequency
- Measure cardiac cycle using ECG or pulse-oximeter
- Model effect by aliased sine and cosine functions representing the cardiac cycle (model any phase)
- Respiration effects
- Gross head movement
- Blood oxygenation changes
- Field changes due to movement of organs in the abdomen
- Measure respiration using pneumatic belt
- Model effect by Fourier expansion of the respiration cycle
- Residual motion
- Most prominent near edges
- Model by Volterra expansion of the movement parameters from the realignment procedure
Lund et al., Neuroimage 2006; Glover et al., MRM 2000; Friston et al., MRM 1996, 35, 346-55.
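The sine/cosine modelling of the cardiac cycle can be sketched as below. This is a minimal illustration, not the implementation used in the cited work: the function name `fourier_regressors`, the expansion order, and the assumption that the cardiac phase at each volume is already known from the pulse recording are all hypothetical choices made for the example.

```python
import numpy as np

def fourier_regressors(phase, order=2):
    """Sine/cosine regressors for a physiological phase given in radians.

    Hypothetical helper: one cosine and one sine column per harmonic,
    so that any phase offset of the effect can be fitted.
    """
    phase = np.asarray(phase, dtype=float)
    cols = []
    for m in range(1, order + 1):
        cols.append(np.cos(m * phase))
        cols.append(np.sin(m * phase))
    return np.column_stack(cols)

# Illustrative values only: a 1.1 Hz cardiac cycle sampled with TR = 2 s.
tr, n_vols, f_card = 2.0, 100, 1.1
t = np.arange(n_vols) * tr
cardiac_phase = 2 * np.pi * f_card * t   # phase at each volume acquisition
R = fourier_regressors(cardiac_phase, order=2)
```

With these illustrative numbers the sampling rate is 0.5 Hz, so the 1.1 Hz fundamental is indistinguishable from a 0.1 Hz oscillation in the sampled regressors, which is the aliasing the slide refers to.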
10Detecting fMRI signals
[Figure: model components High-pass filter, Residual movement, Paradigm]
Note: Bayesian model selection, no thresholding (Madsen and Hansen, 2008)
11Constructing the fMRI Matrix
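[Figure: one data array of 64 x 64 x 40 voxels is acquired per TR; each array is unfolded into a spatial column vector (163840 x 1), and the vectors are stacked over time to form the data matrix (X) of space x time, labelled Y^T in the figure]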
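The unfolding of volumes into a space x time matrix can be sketched as follows. The random numbers stand in for real image data and the number of volumes is an arbitrary choice for the example; only the 64 x 64 x 40 grid comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t = 5                                            # illustrative number of volumes
volumes = rng.standard_normal((n_t, 64, 64, 40))   # placeholder for acquired data

# Flatten each volume into one spatial vector, then put time along the columns.
X = volumes.reshape(n_t, -1).T                     # space x time
```

Each column of `X` is one volume, so transposing gives the time x space layout used for the unsupervised models later in the talk.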
12Overview
- Part I Basic Principles
- Part II Statistical Parametric Mapping
- Stimulation Protocol
- General Linear Model (GLM)
- Inference
- Part III Unsupervised analysis
- Part IV Experiments
- Conclusion
13Paradigm
- Control what kind of neural processing goes on (simple visual example)
- Construct reference time series of expected activation by convolving the stimulus time course with an impulse response function
- Assuming LTI system
- In the analysis compare signal to this
[Figure: reference time series plotted against time]
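A reference time series of this kind can be sketched as below. The slide does not specify the impulse response, so the double-gamma shape and its parameters, the TR of 1 s, and the helper names `gamma_pdf` and `double_gamma_hrf` are assumptions made for the illustration.

```python
import math
import numpy as np

def gamma_pdf(t, shape, scale=1.0):
    """Gamma density evaluated on an array, zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (math.gamma(shape) * scale ** shape))
    return out

def double_gamma_hrf(t):
    """Assumed impulse response: a peak minus a smaller, later undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

tr = 1.0          # illustrative repetition time in seconds
n_scans = 120
stimulus = np.zeros(n_scans)
for onset in range(10, n_scans, 30):       # 10 s pause, 10 s stimulus, 10 s pause
    stimulus[onset:onset + 10] = 1.0

hrf = double_gamma_hrf(np.arange(0, 32, tr))
# LTI assumption: expected response = stimulus convolved with impulse response.
reference = np.convolve(stimulus, hrf)[:n_scans]
```

The resulting `reference` vector is what would enter the design matrix as the paradigm column.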
14SPM and the General Linear Model (GLM)
- Model (M) (univariate - single voxel time series)
- Likelihood (multivariate Gaussian noise)
[Equation labels: Design matrix (T x K); Residual noise (model inadequacy); Parameters (estimated)]
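The equation on this slide did not survive in the transcript. A common way to write a GLM for one voxel is y = design @ beta + noise, with a T x K design matrix. The sketch below fits such a model by ordinary least squares on simulated data; the variable names, the simulated effect sizes and the white-noise assumption are choices made for the example, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
paradigm = (np.arange(T) % 40 < 20).astype(float)   # simple on/off reference
design = np.column_stack([paradigm, np.ones(T)])    # T x K with K = 2
beta_true = np.array([2.0, 10.0])                   # simulated effect and baseline
y = design @ beta_true + rng.standard_normal(T)     # single voxel time series

# Least-squares estimate of the parameters and the residual noise.
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
residual = y - design @ beta_hat
```

In practice the design matrix would also hold the high-pass filter and nuisance regressors described earlier.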
15Hypothesis testing
- Estimate the scale of the covariance from data (assuming the rest of the covariance is known)
- Null distribution
- How likely is the observed statistic under the null? (a small p-value means unlikely, so we reject the null hypothesis in favour of the alternative)
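As an illustration of the test, the sketch below computes a t-statistic for one contrast. It assumes white noise, so that only the noise variance has to be estimated; the function name `t_statistic` and the simulated data are hypothetical.

```python
import numpy as np

def t_statistic(design, y, contrast):
    """t-statistic for contrast @ beta under an assumed white-noise model."""
    T = design.shape[0]
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ beta
    dof = T - np.linalg.matrix_rank(design)
    sigma2 = residual @ residual / dof          # estimated noise scale
    var_c = sigma2 * contrast @ np.linalg.pinv(design.T @ design) @ contrast
    return contrast @ beta / np.sqrt(var_c), dof

rng = np.random.default_rng(6)
T = 200
paradigm = (np.arange(T) % 40 < 20).astype(float)
design = np.column_stack([paradigm, np.ones(T)])
contrast = np.array([1.0, 0.0])                 # test the paradigm effect
noise = rng.standard_normal(T)

t_active, dof = t_statistic(design, 2.0 * paradigm + noise, contrast)
t_null, _ = t_statistic(design, noise, contrast)
```

A p-value follows by comparing the statistic with a Student t distribution with `dof` degrees of freedom, for example with `scipy.stats.t.sf` if SciPy is available.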
16Overlaying activity
- Threshold
- Multiple comparisons
- Familywise error (random field theory)
- False discovery rate
- Overlay thresholded SPM on anatomical image
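The slide names the false discovery rate without fixing a procedure. The sketch below uses the Benjamini-Hochberg step-up rule, which is one common choice and an assumption here; the p-values are made up for the example.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of rejected hypotheses at FDR level q (step-up rule)."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    m = p.size
    thresholds = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])   # largest rank that passes
        reject[order[:k + 1]] = True
    return reject

# Made-up voxel p-values for illustration.
p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                     0.06, 0.074, 0.205, 0.212, 0.216])
mask = benjamini_hochberg(p_values, q=0.05)
```

Here only the two smallest p-values survive, whereas an uncorrected threshold of 0.05 would have kept five.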
17Overview
- Part I Basic Principles
- Part II Statistical Parametric Mapping
- Part III Unsupervised analysis
- Linear latent variable model
- Uniqueness, how many components?
- Multisubject analysis and multiway decomposition
18Decomposition as a cocktail party problem
[Figure: instantaneous mixing of sources into the observations Y with mixing coefficients A1,1, A1,2 and A2,1]
19The fMRI Party problem
[Equation labels: Mixing matrix (estimated); Residual noise (model inadequacy); Sources (estimated)]
- Likelihood (matrix-variate Gaussian noise), where the noise covariances are diagonal (residual is independent over space and time)
Assumption: data is an instantaneous mixture of temporal signatures (PCA/ICA/NMF).
Flaw: $X \approx AS = (AQ^{-1})(QS) = \hat{A}\hat{S}$
Representation not unique!
Additional information needed (independence, sparseness, non-negativity)
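The non-uniqueness can be checked numerically. In the sketch below the matrix sizes and the particular invertible matrix Q are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 3))     # time x components
S = rng.standard_normal((3, 400))    # components x space
Q = np.array([[2.0, 1.0, 0.0],       # any invertible matrix will do
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

A_alt = A @ np.linalg.inv(Q)
S_alt = Q @ S
same_fit = np.allclose(A @ S, A_alt @ S_alt)   # both factorisations fit equally
```

Both pairs reproduce the same data although the sources differ, which is why a constraint such as independence, sparseness or non-negativity is needed to pick one solution.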
20Spatial FA model without the math
[Figure: data matrix (Y), time x space, factorised into mixing matrix (A), aka component time series, and matrix of sources (S), aka spatial components]
21Temporal FA model without the math
[Figure: data matrix (Y^T), space x time, factorised into mixing matrix (A), aka voxel maps, and matrix of sources (S), aka temporal components]
22Singular Value Decomposition
- SVD
- Unique (up to permutation of components)
- Equivalent to PCA
- Convex optimization problem (one global solution, easy to find)
- Sort components according to singular values
- Truncate to obtain approximate model
- The orthogonality constraint is often not appropriate
- Spatial/temporal versions are equivalent
[Equation label: the matrix of singular values is diagonal]
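A truncated SVD can be sketched with NumPy as follows. The simulated low-rank data and the choice of k = 3 are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated time x space data: three components plus a little noise.
X = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 500))
X += 0.01 * rng.standard_normal(X.shape)

U, s, Vt = np.linalg.svd(X, full_matrices=False)   # s is sorted, largest first
k = 3
X_k = (U[:, :k] * s[:k]) @ Vt[:k]                  # rank-k approximation

rel_error = np.linalg.norm(X - X_k) / np.linalg.norm(X)
# Decomposing the transposed data gives the same singular values.
s_transposed = np.linalg.svd(X.T, compute_uv=False)
```

The last line illustrates the equivalence of the spatial and temporal versions: transposing the data only swaps the roles of `U` and `Vt`.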
23Infomax-ICA
- Start with truncated SVD solution (k sources)
- Determine unmixing matrix
- Maximize probability of sources, thereby determining the unmixing matrix
- Assumes independent sources (in this case spatially)
- Sources are assumed non-Gaussian (prior)
- Feasible because mixing causes more Gaussian distributions (central-limit theorem)
- In general a non-convex optimisation problem
- Must select non-linearity/prior for sources
Bell and Sejnowski, 1995
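The steps can be sketched on toy data as below. This is a simplified natural-gradient variant with a tanh non-linearity (corresponding to an assumed 1/cosh source prior), not necessarily the exact algorithm of the cited paper; the function name `infomax_ica`, the learning rate, the iteration count and the simulated Laplace sources are all assumptions.

```python
import numpy as np

def infomax_ica(X, n_iter=1000, lr=0.1):
    """Natural-gradient infomax sketch; X is components x samples, whitened."""
    k, n = X.shape
    W = np.eye(k)                               # unmixing matrix
    for _ in range(n_iter):
        Y = W @ X
        # tanh is the score function of an assumed 1/cosh source prior.
        grad = np.eye(k) - np.tanh(Y) @ Y.T / n
        W += lr * grad @ W
    return W

rng = np.random.default_rng(4)
S_true = rng.laplace(size=(2, 5000))            # non-Gaussian sources
A_true = np.array([[2.0, 0.5], [0.5, 0.3]])     # illustrative mixing matrix
X = A_true @ S_true

# Whitening via SVD, playing the role of the truncated SVD starting point.
X = X - X.mean(axis=1, keepdims=True)
U, s, _ = np.linalg.svd(X, full_matrices=False)
X_white = (U / s).T @ X * np.sqrt(X.shape[1])

W = infomax_ica(X_white)
S_est = W @ X_white
```

The recovered sources match the true ones only up to order, sign and scale, so they are best compared through absolute correlations.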
24Independent Component Analysis
(Example of single subject analysis)
Stimuli: full checkerboard (8 Hz); each trial consists of 10 seconds of pause, 10 seconds of stimulation and 10 seconds of pause. Data acquired at 3 Hz.
25Sparse coding
- Laplace prior for sources (S)
- Negative log posterior proportional to $\tfrac{1}{2}\lVert X - AS \rVert_F^2 + \lambda \lVert S \rVert_1$
- Avoid trivial solution where $S \to 0$ and $A \to \infty$
- Normalise A
- Gaussian prior for A
- When performing MAP estimation S tends to become sparse (many zeros for high $\lambda$)
- Estimate alternating between A and S
Olshausen and Field, Nature 1996
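The alternating estimation can be sketched as below. The sketch minimises a squared reconstruction error plus an L1 penalty on S, which is what a Laplace prior with Gaussian noise leads to. The specific update rules (soft-thresholding steps for S, least squares followed by column normalisation for A), the function names and all parameter values are assumptions and are not taken from the cited paper.

```python
import numpy as np

def soft_threshold(Z, t):
    """Shrink entries towards zero; entries with |z| <= t become zero."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def sparse_coding(X, k, lam=0.5, n_outer=30, n_inner=20, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[0], k))
    A /= np.linalg.norm(A, axis=0)
    S = np.zeros((k, X.shape[1]))
    for _ in range(n_outer):
        # S-step: proximal gradient on 0.5*||X - AS||^2 + lam*||S||_1.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
        for _ in range(n_inner):
            S = soft_threshold(S - step * A.T @ (A @ S - X), step * lam)
        # A-step: least squares, then unit-norm columns to avoid S -> 0.
        A = X @ np.linalg.pinv(S)
        norms = np.linalg.norm(A, axis=0)
        norms[norms == 0] = 1.0
        A /= norms
        S *= norms[:, None]            # keep the product A @ S unchanged
    return A, S

rng = np.random.default_rng(9)
A0 = rng.standard_normal((10, 3))
S0 = rng.laplace(size=(3, 200)) * (rng.random((3, 200)) < 0.3)   # sparse truth
X = A0 @ S0
A, S = sparse_coding(X, k=3, lam=0.5)
```

Raising `lam` drives more entries of `S` to exactly zero at the cost of a worse fit.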
26How many components?
- Look at singular values of data
- Find the corner (broken-stick method)
Heuristic: very difficult to do in practice, as the curve is often very smooth. The threshold can be regarded as a noise variance.
- Large sample approximations to the model evidence
- Laplace approximation to the model evidence
- Bayesian Information Criterion (BIC) / Minimum Description Length (MDL)
- Akaike's Information Criterion (AIC)
- Final Prediction Error (FPE)
- Automatic Relevance Determination
27Cross-validation
Split data into training and test sets, learn model parameters on the training set and use these parameters to predict the test set.
Problem facing unsupervised learning: it is difficult to select training and test sets that are independent, because the residual is correlated (i.e. the noise is correlated, so values held out from the training set are not truly missing).
28From 2-way to multi-way analysis
[Figure: three-way array with modes Space, Time and Trial/Condition/Subject]
29Multi-subject analysis
- At least four possibilities
- Pre-average data
- Separate analysis
- Data concatenation
- Tensor models
30Concatenation
- Temporal concatenation
- Common spatial map
- Separate temporal profile
- Spatial concatenation
- Common temporal profile
- Separate spatial maps
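[Figure: data matrices stacked along the space axis and along the time axis]
Spatial concatenation: subjects can have different spatial maps; same time series.
Temporal concatenation: subjects can have different time profiles. Often combined with data reduction by SVD for each subject (spatial ICA, GIFT Toolbox).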
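The two ways of concatenating can be sketched with NumPy. The number of subjects and the matrix sizes are arbitrary, and random numbers stand in for each subject's time x space data.

```python
import numpy as np

rng = np.random.default_rng(5)
n_subjects, n_time, n_vox = 3, 50, 200
# One time x space matrix per subject (placeholder data).
subjects = [rng.standard_normal((n_time, n_vox)) for _ in range(n_subjects)]

# Temporal concatenation: stack along time, so spatial maps are shared.
temporal = np.concatenate(subjects, axis=0)   # (subjects * time) x space
# Spatial concatenation: stack along space, so time courses are shared.
spatial = np.concatenate(subjects, axis=1)    # time x (subjects * space)
```

Either matrix can then be passed to the same two-way decomposition used for a single subject.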
31Multilinear modelling
Assumption: data is an instantaneous mixture of temporal signatures (PCA/ICA/NMF).
Trilinear Model
Assumption: data is an instantaneous mixture of temporal signatures that are expressed to various degrees over the trials (weighted averages over the trials); Canonical Decomposition / Parallel Factor (CP).
"A surprising fact is that the nonrotatability characteristic can hold even when the number of factors extracted is greater than every dimension of the three-way array." - Kruskal 1976
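The trilinear (CP) model can be written out as a sum of rank-one terms. The sketch below only builds a tensor from given factors and does not fit the model; the function name `cp_tensor`, the mode ordering and the sizes are assumptions for the example.

```python
import numpy as np

def cp_tensor(A, B, C):
    """X[i, j, k] = sum over d of A[i, d] * B[j, d] * C[k, d]."""
    return np.einsum('id,jd,kd->ijk', A, B, C)

rng = np.random.default_rng(7)
A = rng.standard_normal((30, 2))   # space x components
B = rng.standard_normal((20, 2))   # time x components
C = rng.random((5, 2))             # trial x components (strength per trial)
X = cp_tensor(A, B, C)
```

Each trial slice `X[:, :, k]` is the same bilinear model with its components reweighted by row `k` of `C`, which is the "expressed to various degrees over the trials" assumption.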
32Unfortunately, multi-linear models are often too restrictive
- Trilinear model can encompass
- Variability in strength over repeats
- However, other common causes of variation are
- Delay Variability
- Shape Variability
[Figure panels: Trial 1, Trial 2]
33Violation of multi-linearity causes degeneracy
[Figure: three-way array with modes Space, Time and Trial]
34Decomposition with Invariance
Shift Invariance
[Figure: mixing of sources into the observations Y in which each mixing coefficient (A1,1, A1,2, A2,1, ...) has an associated delay]
35Modelling Delay Variability
Shifted CP
36(Mørup et al., NeuroImage 2008)
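One way to picture a shifted CP model is to let the time profile of each component be delayed by a trial-specific amount before it enters the trilinear sum. The sketch below builds such a tensor under simplifying assumptions that are not taken from the cited paper: delays are whole samples, shifts are circular, and the function name `shifted_cp_tensor` and all sizes are hypothetical.

```python
import numpy as np

def shifted_cp_tensor(A, B, C, tau):
    """X[i, j, k] = sum over d of A[i, d] * B[j - tau[k, d], d] * C[k, d].

    tau holds integer delays (in samples) per trial and component;
    shifts wrap around the end of the time axis.
    """
    n_space, n_comp = A.shape
    n_time, n_trial = B.shape[0], C.shape[0]
    X = np.zeros((n_space, n_time, n_trial))
    for k in range(n_trial):
        for d in range(n_comp):
            shifted = np.roll(B[:, d], tau[k, d])   # delay the time profile
            X[:, :, k] += C[k, d] * np.outer(A[:, d], shifted)
    return X

rng = np.random.default_rng(8)
A = rng.standard_normal((30, 1))   # space x components
B = rng.standard_normal((20, 1))   # time x components
C = np.ones((3, 1))                # trial x components
tau = np.array([[0], [2], [5]])    # delay of the component in each trial
X = shifted_cp_tensor(A, B, C, tau)
```

With all delays set to zero the construction reduces to the ordinary CP model.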
37Delay modelling of fMRI data from retinotopic mapping paradigm
39Reverberation/Convolution
[Figure: convolutive mixing of sources into the observations Y]
40Modeling Delay and Shape Variability
convolutive CP
41CP, ShiftCP and ConvCP
ConvCP can model an arbitrary number of component delays within the trials and account for shape variation within the convolutional model representation.
42Convolutive Multi-linear decomposition
43Analysis of fMRI data
[Figure: decomposition results; two panels are labelled Degeneracy]
Each trial consists of a visual stimulus delivered as an annular full-field checkerboard reversing at 8 Hz.
44Shift Invariant multiway decomposition
45Overview
- Part I Basic Principles
- Part II Bayesian model comparison
- Part III Unsupervised analysis
- Part IV Experiments
- Conclusion
46Conclusions
- Functional MRI data is noisy
- Modelling nuisances can help suppress known noise sources
- Unsupervised learning is an important framework for multivariate analysis of neuroimaging data such as fMRI
- Explores patterns in data
- Identifies noise sources
- Drives new hypotheses
- Bi-linear analysis is ambiguous, requiring additional assumptions such as independence or sparsity (forming ICA and sparse coding)
47Conclusions
- Multi-linear modelling offers the ability to extract the consistent activity of neuroimaging data over repeats/subjects/conditions etc.
- However, violation of multi-linearity due to variability causes degeneracy
- Common causes of variability in neuroimaging data are delay and shape variation
- Advancing the CP model to ShiftCP and ConvCP makes it possible to address these types of variability.
- Modelling delay and shape changes is also relevant for bi-linear modelling and opens doorways to addressing latent causal relations.
48References
- Andersen, A. H. and Rayens, W. S. (2004). Structure-seeking multilinear methods for the analysis of fMRI data. NeuroImage, 22, 728-739.
- Beckmann, C. and Smith, S. (2004). Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Trans Med Imaging, 23, 137-152.
- Beckmann, C. and Smith, S. (2005). Tensorial extensions of independent component analysis for multisubject fMRI analysis. NeuroImage, 25, 294-311.
- Bell, A. J. and Sejnowski, T. J. (1995). An information maximization approach to blind source separation and blind deconvolution. Neural Computation, 7, 1129-1159.
- Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36, 287-314.
- Fox, M. D., Corbetta, M., Snyder, A. Z., Vincent, J. L., and Raichle, M. E. (2006). Spontaneous neuronal activity distinguishes human dorsal and ventral attention systems. Proc Natl Acad Sci U S A, 103(26), 10046-10051.
- Lee, D. and Seung, H. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788-91.
- Lund, T., Madsen, K., Sidaros, K., Luo, W., and Nichols, T. (2006). Non-white noise in fMRI: does modelling have an impact? NeuroImage, 29, 54-66.
- Lund, T. E. and Larsson, H. B. W. (1999). Spatial distribution of low-frequency noise in fMRI. In Proceedings of the 7th Annual Meeting of ISMRM, page 1705.
- Madsen, K. H. and Lund, T. E. (2006). Unsupervised modelling of physiological noise artifacts in fMRI data. In Proc. International Society of Magnetic Resonance in Medicine - ISMRM 2006, Seattle, Washington, USA. ISMRM.
- McKeown, M. and Sejnowski, T. (1998). Independent component analysis of fMRI data: examining the assumptions. Hum Brain Mapp, 6, 368-372.
- McKeown, M., Makeig, S., Brown, G., Jung, T., Kindermann, S., Bell, A., and Sejnowski, T. (1998a). Analysis of fMRI data by blind separation into independent spatial components. Hum Brain Mapp, 6, 160-188.
49References
- McKeown, M., Jung, T., Makeig, S., Brown, G., Kindermann, S., Lee, T., and Sejnowski, T. (1998b). Spatially independent activity patterns in functional MRI data during the Stroop color-naming task. Proc Natl Acad Sci U S A, 95, 803-810.
- McKeown, M. J., Hansen, L. K., and Sejnowski, T. J. (2003). Independent component analysis of functional MRI: what is signal and what is noise? Curr Opin Neurobiol, 13(5).
- Molgedey, L. and Schuster, H. (1994). Separation of a mixture of independent signals using time delayed correlations. Physical Review Letters, 72(34), 3634-3637.
- Mørup, M., Hansen, L., Arnfred, S., Lim, L.-K., and Madsen, K. (2008). Shift invariant multilinear decomposition of neuroimaging data. NeuroImage.
- Olshausen, B. A. and Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607-609.
- Thomas, C. G., Harshman, R. A., and Menon, R. S. (2002). Noise reduction in BOLD-based fMRI using component analysis. NeuroImage, 17(3), 1521-37.