Statistical analysis of fMRI data, bubbles data, and the connectivity between the two — PowerPoint presentation transcript (63 slides).
1
Statistical analysis of fMRI data, bubbles
data, and the connectivity between the two
  • Keith Worsley, McGill (and Chicago)
  • Nicholas Chamandy, McGill and Google
  • Jonathan Taylor, Université de Montréal and
    Stanford
  • Robert Adler, Technion
  • Philippe Schyns, Fraser Smith, Glasgow
  • Frédéric Gosselin, Université de Montréal
  • Arnaud Charil, Alan Evans, Montreal Neurological
    Institute

2
Before you start: PCA of time × space
1. Exclude first frames
2. Drift
3. Long-range correlation or anatomical effect: remove by converting to % of brain
4. Signal?
3
Bad design: 2 mins rest, 2 mins Mozart, 2 mins Eminem, 2 mins James Brown
4
Rest / Mozart / Eminem / J. Brown
[Figure: Temporal components (sd, % variance explained) over 200 frames — component 1: sd 0.41, 17%; component 2: sd 0.31, 9.5%; component 3: sd 0.24, 5.6%; with periods 5.2, 16.1, 15.6, and 11.6 seconds. Spatial components plotted against slice (0 based).]
5
Effect of stimulus on brain response: modeled by convolving the stimulus with the hemodynamic response function. The stimulus is delayed and dispersed by 6 s.
6
fMRI data, pain experiment, one slice
T = (hot − warm effect) / S.d. ~ t110 if no effect
7
How fMRI differs from other repeated measures data
  • Many reps (200 time points)
  • Few subjects (15)
  • Df within subjects is high, so not worth pooling sd across subjects
  • Df between subjects is low, so use spatial smoothing to boost df
  • Data sets are huge (4 GB), so it is not easy to use statistics packages such as R

8
FMRISTAT (Matlab) / BRAINSTAT (Python) statistical analysis strategy
  • Analyse each voxel separately
  • Borrow strength from neighbours when needed
  • Break up the analysis into stages:
  • 1st level: analyse each time series separately
  • 2nd level: combine 1st level results over runs
  • 3rd level: combine 2nd level results over subjects
  • Cut corners: do a reasonable analysis in a reasonable time (or else no one will use it!)

9
(No Transcript)
10
1st level: Linear model with AR(p) errors
  • Data: Yt = fMRI data at time t; xt = (responses, 1, t, t², t³, …) to allow for drift
  • Model: Yt = xt β + εt
  • εt = a1 εt−1 + … + ap εt−p + σF ξt, ξt ~ N(0,1) i.i.d.
  • Fit in 2 stages:
  • 1st pass: fit by least squares, find residuals, estimate AR parameters a1 … ap
  • 2nd pass: whiten the data, re-fit by least squares
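The two-pass fit just described can be sketched in a few lines. This is a minimal single-voxel illustration, assuming AR(1) errors; the function name and the exact whitening of the first observation are my own choices, not FMRISTAT's code:

```python
import numpy as np

def ar1_prewhitened_fit(Y, X):
    """Two-pass fit of Y = X b + e with AR(1) errors (sketch).

    Pass 1: ordinary least squares; estimate the AR(1) coefficient
    from the lag-1 autocorrelation of the residuals.
    Pass 2: whiten both sides with the estimated a1 and re-fit.
    """
    # Pass 1: OLS fit and residuals
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r = Y - X @ b
    a1 = (r[1:] @ r[:-1]) / (r @ r)          # lag-1 autocorrelation
    # Pass 2: whiten (first observation scaled to equalise variance)
    Yw = np.concatenate(([Y[0] * np.sqrt(1 - a1**2)], Y[1:] - a1 * Y[:-1]))
    Xw = np.vstack([X[0] * np.sqrt(1 - a1**2), X[1:] - a1 * X[:-1]])
    bw, *_ = np.linalg.lstsq(Xw, Yw, rcond=None)
    return bw, a1
```

In practice the AR coefficients would be estimated per voxel, bias-corrected and spatially smoothed, as the later slides describe.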

11
Higher levels: Mixed effects model
  • Data: Ei = effect (contrast in β) from the previous level; Si = sd of the effect from the previous level; zi = (1, treatment, group, gender, …)
  • Model: Ei = zi γ + Si εiF + σR εiR (Si has high df, so it is assumed fixed)
  • εiF ~ N(0,1) i.i.d. fixed effects error
  • εiR ~ N(0,1) i.i.d. random effects error
  • Fit by ReML; use EM for stability, 10 iterations

12
Where we use spatial information
  • 1st level: smooth the AR parameters to lower their variability and increase df
  • df defined by a Satterthwaite approximation — a surrogate for the variance of the variance parameters
  • Higher levels: smooth the random / fixed effects sd ratio to lower variability and increase df
  • Final level: use random field theory to correct for multiple comparisons

13
1st level: Autocorrelation
  • AR(1) model: εt = a1 εt−1 + σF ξt
  • Fit the linear model using least squares
  • êt = Yt − Ŷt
  • â1 = Correlation(êt, êt−1)
  • Estimating the εt changes their correlation structure slightly, so â1 is slightly biased
[Figure: raw autocorrelation; smoothed 12.4 mm; bias-corrected â1 (scale −0.05 to 0).]

14
How much smoothing?
  • Variability in â lowers df
  • Df depends on the contrast
  • Smoothing â brings the df back up:

df_â = df_residual (2 (FWHM_â² / (3/2 FWHM_data²) + 1)^(3/2) − 1)
1/df_eff = 1/df_residual + 2 acor(contrast of data)² / df_â
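As a sketch (my own helper, assuming the 3D Gaussian-smoothing form above), the formulas can be evaluated to see the effective df rise from its unsmoothed value toward the residual df as FWHM_â grows:

```python
def df_effective(fwhm_ar, fwhm_data, df_residual, acor):
    """Effective df after smoothing the AR parameters (slide formula).

    df_ar grows roughly with the volume of the smoothing kernel;
    1/df_eff adds the AR-estimation penalty to the residual df.
    """
    df_ar = df_residual * (2 * (fwhm_ar**2 / (1.5 * fwhm_data**2) + 1) ** 1.5 - 1)
    return 1 / (1 / df_residual + 2 * acor**2 / df_ar)
```

With FWHM_data = 8.79 mm, residual df = 110 and acor = 0.61, no smoothing gives df_eff ≈ 63, and roughly 12 mm of smoothing brings it near the target of 100.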

[Figure: effective df vs FWHM_â for the hot stimulus and the hot−warm contrast — FWHM_data = 8.79 mm, residual df = 110, target 100 df; the contrasts of the data have acor 0.79 and 0.61, and the target is reached at FWHM_â ≈ 10.3 mm and 12.4 mm.]
15
2nd level: 4 runs, 3 df for the random effects sd — a very noisy sd, and T > 15.96 is needed for P < 0.05 (corrected), so no response is detected.
16
Solution: Spatial smoothing of the sd ratio
  • Basic idea: increase df by spatial smoothing (local pooling) of the sd.
  • Can't smooth the random effects sd directly — too much anatomical structure.
  • Instead smooth the ratio (random effects sd / fixed effects sd), which removes the anatomical structure before smoothing:

sd = smooth(random effects sd / fixed effects sd) × fixed effects sd
17

[Figure: the random effects sd (3 df) is divided by the fixed effects sd (440 df, ≈ average Si); the smoothed ratio (random sd / fixed sd ≈ 1.3) is then multiplied back by the fixed effects sd to give the mixed effects sd (100 df). Scale 0–0.2 for the sds, 0.5–1.5 for the ratio.]
18
How much smoothing?

df_ratio = df_random (2 (FWHM_ratio² / (3/2 FWHM_data²) + 1)^(3/2) − 1)
1/df_eff = 1/df_ratio + 1/df_fixed

df_random = 3, df_fixed = 4 × 110 = 440, FWHM_data = 8 mm

[Figure: df_eff vs FWHM_ratio — rising from the random effects analysis (df_eff = 3) toward the fixed effects analysis (df_eff = 440); the target of 100 df is reached at FWHM ≈ 19 mm.]
19
Final result: with 19 mm smoothing, 100 df — a much less noisy sd, and T > 4.93 for P < 0.05 (corrected), and now we can detect a response!
20
Final level: Multiple comparisons correction
The threshold is chosen so that P(max_S Z(s) ≥ t) = 0.05.
[Figure: P-value of the maximum vs FWHM (0–10) of the smoothing filter, comparing random field theory, Bonferroni, and discrete local maxima; inset shows a sample path of Z(s).]
21
Random field theory
Z(s) = filter (with given FWHM) convolved with white noise.

P(max_S Z(s) ≥ t) ≈ Σ_{d=0..3} Resels_d(S) × EC_d(t)

where the Resels_d (resolution elements) measure the search region S in units of FWHM, and the EC_d are the EC densities.
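The expected-EC approximation can be inverted numerically for the threshold. A minimal sketch (my own helper functions; the EC densities are the standard Gaussian-field ones, in the parameterisation where the resels already absorb the FWHM):

```python
import math

def ec_densities(t):
    """EC densities of a Gaussian random field in dimensions 0-3."""
    f = math.exp(-t * t / 2)
    return [
        0.5 * math.erfc(t / math.sqrt(2)),                       # EC0 = P(Z >= t)
        math.sqrt(4 * math.log(2)) / (2 * math.pi) * f,           # EC1
        (4 * math.log(2)) / (2 * math.pi) ** 1.5 * t * f,         # EC2
        (4 * math.log(2)) ** 1.5 / (2 * math.pi) ** 2 * (t * t - 1) * f,  # EC3
    ]

def rft_threshold(resels, p=0.05, lo=1.0, hi=10.0):
    """Bisect for the threshold t whose expected EC (~ P(max Z >= t)) is p."""
    def pmax(t):
        return sum(r * e for r, e in zip(resels, ec_densities(t)))
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if pmax(mid) > p else (lo, mid)
    return (lo + hi) / 2
```

With resels (1, 0, 0, 0) — a single point — this recovers the uncorrected Z = 1.64; larger search regions push the threshold up toward the Bonferroni value.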
22
Discrete local maxima
  • Bonferroni applied to the N events "Z(s) ≥ t and Z(s) is a discrete local maximum", i.e. Z(s) ≥ t and all neighbouring Z's ≤ Z(s)
  • Conservative
  • If Z(s) is stationary, with Cor(Z(s1), Z(s2)) = ρ(s1 − s2), then the DLM P-value is
  • P(max_S Z(s) ≥ t) ≤ N × P(Z(s) ≥ t and all neighbouring Z's ≤ Z(s))
  • We only need to evaluate a (2D+1)-variate integral
[Diagram: Z(s) with its neighbours Z(s±1), Z(s±2) along the two axes.]
23
Discrete local maxima: Markovian trick
  • If ρ is separable: for s = (x,y), ρ((x,y)) = ρ((x,0)) × ρ((0,y)), e.g. the Gaussian spatial correlation function ρ((x,y)) = exp(−½(x² + y²)/w²)
  • Then Z(s) has a Markovian property: conditional on the central Z(s), the Z's on different axes are independent, Z(s1) ⊥ Z(s2) | Z(s)
  • So condition on Z(s) = z, find P(all neighbouring Z's ≤ z | Z(s) = z) = Π_d P(Z(s_d) ≤ z | Z(s) = z), then take expectations over Z(s) = z
  • This cuts the (2D+1)-variate integral down to a bivariate integral
24
(No Transcript)
25
Example single run, hot-warm
Detected by BON and DLM but not by RFT
Detected by DLM, but not by BON or RFT
26
Estimating the delay of the response
  • The delay, or latency to the peak of the HRF, is approximated by a linear combination of two optimally chosen basis functions
[Figure: the HRF, its shifted version, and the two basis functions basis1 and basis2.]
  • HRF(t + shift) ≈ basis1(t) w1(shift) + basis2(t) w2(shift)
  • Convolve the bases with the stimulus, then add them to the linear model

27
  • Fit the linear model, estimate ŵ1 and ŵ2
  • Equate ŵ2 / ŵ1 to its estimate, then solve for the shift (Henson et al., 2002)
  • To reduce bias when the magnitude is small, use shift / (1 + 1/T²), where T = ŵ1 / Sd(ŵ1) is the T statistic for the magnitude
  • This shrinks the shift to 0 where there is little evidence for a response.
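The shrinkage step can be sketched as follows (a hypothetical helper; `solve_shift` stands in for the numerical inversion of the basis-function ratio w2/w1 → shift):

```python
def shrunk_shift(w1, w2, sd_w1, solve_shift):
    """Shift estimate shrunk toward 0 when the magnitude is weak.

    T is the T statistic for the magnitude; the factor 1/(1 + 1/T^2)
    leaves a strong response almost untouched and pulls a weak one to 0.
    """
    T = w1 / sd_w1
    shift = solve_shift(w2 / w1)     # invert the ratio to get the raw shift
    return shift / (1 + 1 / T**2)
```

With an identity `solve_shift`, w1 = 2, w2 = 1 and sd = 1 give T = 2 and a raw shift of 0.5, shrunk by 1/(1 + 1/4) = 0.8 to 0.4.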
28
Shift of the hot stimulus
[Figure: maps of the T stat for magnitude, T stat for shift, shift (secs), and sd of shift (secs).]
29
Shift of the hot stimulus
[Figure: same maps thresholded — shown where T > 4 for magnitude and T ≈ 2 for shift; shift on a ±1 sec scale, sd of shift ≈ 0.5 sec.]
30
Combining shifts of the hot stimulus (contours are where the T stat for magnitude > 4)
[Figure: combined shift on a ±1 sec scale, sd ≈ 0.25 sec; contour at T = 4.]
31
Shift of the hot stimulus
[Figure: shift (secs) where the T stat for magnitude > 4.93.]
32
Functional Imaging Analysis Contest, HBM 2005
  • 15 subjects, 4 runs per subject (2 with events, 2 with blocks)
  • 4 conditions per run:
  • Same sentence, same speaker
  • Same sentence, different speaker
  • Different sentence, same speaker
  • Different sentence, different speaker
  • 3T, 191 frames, TR = 2.5 s
  • Greater BOLD response for:
  • different − same sentences (1.08 ± 0.16)
  • different − same speaker (0.47 ± 0.08)
  • Greater latency for:
  • different − same sentences (0.148 ± 0.035 secs)

33
Contrasts in the data used for effects
[Figure: 9 sec blocks with 9 sec gaps — hot (Sd 0.16), warm (Sd 0.16), hot−warm (Sd 0.19); 90 sec blocks with 90 sec gaps — hot (Sd 0.28), warm (Sd 0.43), hot−warm (Sd 0.55). The long-block contrasts only use data near the block transitions, ignoring data in the middle of the blocks. Time axis 0–350 secs.]
34
Optimum block design
[Figure: Sd of the hot-stimulus effect and of hot−warm, for magnitude (top) and delay (bottom), as functions of block length and gap (0–20 secs each); the best designs are marked X, and some regions leave not enough signal for the delay.]
35
Optimum event design
[Figure: Sd of effect (secs for delays) vs average time between events, for uniform, random, and concentrated event timings. About 12 secs between events is best for magnitudes, and about 7 secs is best for delays; very frequent events give not enough signal.]
36
How many subjects?
  • The largest portion of the variance comes from the last stage, i.e. combining over subjects:
  • sd² = sd²run / (nrun nsess nsubj) + sd²sess / (nsess nsubj) + sd²subj / nsubj
  • If you want to optimize total scanner time, take more subjects.
  • What you do at the early stages doesn't matter very much!
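A sketch of the variance formula (my own function name), showing why extra subjects beat extra runs for a fixed amount of scanning:

```python
def effect_variance(sd_run, sd_sess, sd_subj, n_run, n_sess, n_subj):
    """Variance of the final effect combined over runs, sessions, subjects."""
    return (sd_run**2 / (n_run * n_sess * n_subj)
            + sd_sess**2 / (n_sess * n_subj)
            + sd_subj**2 / n_subj)
```

With equal sd at each level and 64 total scans, 2 runs × 2 sessions × 16 subjects gives roughly half the variance of 4 runs × 2 sessions × 8 subjects — the subject-level term dominates.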

37
Features special to FMRISTAT / BRAINSTAT
  • Bias correction for AR coefficients
  • Df boosting due to smoothing
  • AR coefficients
  • random/fixed effects variance
  • P-value adjustment for
  • peaks due to small FWHM (DLM)
  • clusters due to spatially varying FWHM
  • Delays analysed the same way as magnitudes
  • Sd of effects before collecting data

38
What is bubbles?
39
Nature (2005)
40
Subject is shown one of 40 faces chosen at
random
Happy
Sad
Fearful
Neutral
41
… but the face is only revealed through random bubbles
  • First trial: sad expression
  • Subject is asked the expression: "Neutral"
  • Response: incorrect

75 random bubble centres, smoothed by a Gaussian bubble — this is what the subject sees (Sad).
42
Your turn
  • Trial 2

Subject response Fearful CORRECT
43
Your turn
  • Trial 3

Subject response Happy INCORRECT (Fearful)
44
Your turn
  • Trial 4

Subject response Happy CORRECT
45
Your turn
  • Trial 5

Subject response Fearful CORRECT
46
Your turn
  • Trial 6

Subject response Sad CORRECT
47
Your turn
  • Trial 7

Subject response Happy CORRECT
48
Your turn
  • Trial 8

Subject response Neutral CORRECT
49
Your turn
  • Trial 9

Subject response Happy CORRECT
50
Your turn
  • Trial 3000

Subject response Happy INCORRECT (Fearful)
51
Bubbles analysis
  • E.g. Fearful (3000/4750 trials)
Trials 1, 2, 3, 4, 5, 6, 7, …, 750: sum the bubble masks over the correct trials.
Threshold at a proportion of correct trials = 0.68, scale to [0,1], and use this as a bubble mask.
Proportion of correct bubbles = (sum of correct bubbles) / (sum of all bubbles)
52
Results
  • Mask × average face
  • But are these features real or just noise?
  • Need statistics

Happy, Sad, Fearful, Neutral
53
Statistical analysis
  • Correlate the bubbles with the response (correct = 1, incorrect = 0), separately for each expression
  • Equivalent to a 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful
  • Very similar to the proportion of correct bubbles

Z ~ N(0,1) statistic over trials 1, 2, 3, 4, 5, 6, 7, …, 750, with responses 0, 1, 1, 0, 1, 1, 1, …, 1.
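The 2-sample Z can be sketched directly on the trial-by-pixel bubble masks (a simplified illustration with my own function name; the talk's exact variance treatment may differ):

```python
import numpy as np

def bubbles_z(bubbles, correct):
    """Two-sample Z statistic at each pixel: mean bubble value on
    correct trials minus on incorrect trials, standardised.

    bubbles: (n_trials, n_pixels) bubble-mask values per trial
    correct: (n_trials,) 0/1 responses
    """
    c = correct.astype(bool)
    x1, x0 = bubbles[c], bubbles[~c]
    n1, n0 = len(x1), len(x0)
    # pooled-variance two-sample Z (large n, so ~N(0,1) under the null)
    sp2 = (x1.var(axis=0, ddof=1) * (n1 - 1)
           + x0.var(axis=0, ddof=1) * (n0 - 1)) / (n1 + n0 - 2)
    return (x1.mean(axis=0) - x0.mean(axis=0)) / np.sqrt(sp2 * (1/n1 + 1/n0))
```

Pixels whose bubbles genuinely drive correct responses stand out with large Z; uninformative pixels stay near 0.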
54
Results
  • Thresholded at Z = 1.64 (P = 0.05)
  • Multiple comparisons correction?
  • Need random field theory

Z ~ N(0,1) statistic: average face, Happy, Sad, Fearful, Neutral
55
Results, corrected for search
  • Random field theory threshold: Z = 3.92 (P = 0.05); 3.82, 3.80, 3.81, 3.80
  • Saddle-point approximation (Chamandy, 2007): Z = ? (P = 0.05)
  • Bonferroni: Z = 4.87 (P = 0.05) — nothing survives

Z ~ N(0,1) statistic: average face, Happy, Sad, Fearful, Neutral
56
Scale
Separate analysis of the bubbles at each scale
57
Scale space: smooth Z(s) with a range of filter widths w — a continuous wavelet transform, which adds an extra dimension to the random field: Z(s,w).
[Figure: Z(s,w) for s = −60…60 mm and w = FWHM 6.8–34 mm (log scale), with no signal and with one 15 mm signal. The 15 mm signal is best detected with a 15 mm smoothing filter.]
58
Matched Filter Theorem (= Gauss-Markov Theorem): to best detect a signal in white noise, the filter should match the signal.
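A numerical sketch of the theorem (my own toy setup): the detection SNR of a Gaussian filter against a Gaussian signal in white noise, (filter · signal) / ||filter||, peaks when the filter width equals the signal width.

```python
import numpy as np

def detection_snr(signal_fwhm, filter_fwhm, x=None):
    """SNR of a Gaussian matched filter against a Gaussian signal
    in white noise: (filter . signal) / ||filter||."""
    if x is None:
        x = np.linspace(-60, 60, 2001)
    def gauss(fwhm):
        s = fwhm / np.sqrt(8 * np.log(2))   # convert FWHM to sd
        return np.exp(-x**2 / (2 * s**2))
    f, g = gauss(filter_fwhm), gauss(signal_fwhm)
    return (f @ g) / np.linalg.norm(f)
```

Scanning filter widths for a 15 mm signal, the SNR is maximised at a 15 mm filter, exactly as the scale-space figure shows.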
[Figure: Z(s,w) for 10 mm and 23 mm signals, and for two 10 mm signals 20 mm apart (s = −60…60 mm, w = FWHM 6.8–34 mm on a log scale). But if the signals are too close together, they are detected as a single signal half way between them.]
59
Scale space can even separate two signals at the same location!
[Figure: Z(s,w) for 8 mm and 150 mm signals at the same location (s = −60…60 mm, w = FWHM 6.8–170 mm on a log scale); each signal produces its own peak at the matching filter width.]
60
Bubbles task in the fMRI scanner
  • Correlate the bubbles with BOLD at every voxel
  • Calculate Z for each pair (bubble pixel, fMRI voxel)
  • — a 5D image of Z statistics over trials 1, 2, 3, 4, 5, 6, 7, …, 3000
61
Thresholding?
  • Thresholding in advance is vital, since we cannot store all of the ~1 billion 5D Z values
  • Resels = (image resels = 146.2) × (fMRI resels = 1057.2)
  • For P = 0.05, the threshold is Z ≈ 6.22
  • Only keep 5D local maxima:
  • Z(pixel, voxel) > Z(pixel, 6 neighbours of voxel) and Z(pixel, voxel) > Z(4 neighbours of pixel, voxel)

62
Generalised linear models?
  • The random response is Y = 1 (correct) or 0 (incorrect), or Y = fMRI
  • The regressors are Xj = bubble mask at pixel j, j = 1, …, 240 × 380 = 91200 (!)
  • Logistic regression or ordinary regression:
  • logit(E(Y)) or E(Y) = b0 + X1 b1 + … + X91200 b91200
  • But there are only n = 3000 observations (trials) …
  • Instead, since the regressors are independent, fit them one at a time:
  • logit(E(Y)) or E(Y) = b0 + Xj bj
  • However the regressors (bubbles) are random with a simple known distribution, so turn the problem around and condition on Y:
  • E(Xj) = c0 + Y cj
  • Equivalent to conditional logistic regression (Cox, 1962), which gives exact inference for b1 conditional on sufficient statistics for b0
  • Cox also suggested using saddle-point approximations to improve the accuracy of the inference
  • Interactions? logit(E(Y)) or E(Y) = b0 + X1 b1 + … + X91200 b91200 + X1 X2 b1,2 + …
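The "turn the problem around and condition on Y" idea reduces, in its simplest ordinary-regression form, to a per-pixel least-squares slope — sketched here with my own minimal function (the talk's conditional-logistic machinery is more refined):

```python
import numpy as np

def per_pixel_slopes(X, y):
    """Fit E(X_j) = c0 + y * c_j one pixel at a time by least squares.

    X: (n_trials, n_pixels) bubble masks; y: (n_trials,) responses.
    Centring both sides turns every per-pixel fit into one vectorised
    slope computation: c_j = cov(y, X_j) / var(y).
    """
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    return (yc @ Xc) / (yc @ yc)      # slope c_j for every pixel at once
```

Because each pixel is fitted on its own, the 91200-regressor problem becomes 91200 trivial one-regressor problems, which is what makes the approach feasible with only 3000 trials.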