fMRI Analysis with emphasis on the General Linear Model

1
fMRI Analysis with emphasis on the General Linear
Model
Jody Culham, Brain and Mind Institute, Department
of Psychology, University of Western Ontario
http://www.fmri4newbies.com/
Last Update: January 18, 2012. Last Course:
Psychology 9223, W2010, University of Western
Ontario
2
Part 1
  • Statistical Intuitions

3
What data do we start with?

These numbers are from an obsolete scanner. With a
modern 3T, we can get 3X the slices.
  • 12 slices x 64 voxels x 64 voxels = 49,152 voxels
  • Each voxel has 136 time points (volumes)
  • Therefore, for each run, we have 6.7 million data
    points (49,152 x 136 = 6,684,672)
  • We often have several runs for each experiment

4
Why do we need stats?
  • We could, in principle, analyze data by voxel
    surfing: move the cursor over different areas and
    see if any of the time courses look interesting

5
Why do we need stats?
  • Clearly voxel surfing isn't a viable option.
    We'd have to do it 49,152 times in this data set
    and it would require a lot of subjective
    decisions about whether activation was real
  • This is why we need statistics
  • Statistics
    • tell us where to look for activation that is
      related to our paradigm
    • help us decide how likely it is that activation
      is real

The lies and damned lies come in when you write
the manuscript
6
Predicted Responses
  • fMRI is based on the Blood Oxygenation Level
    Dependent (BOLD) response
  • It takes about 5 sec for the blood to catch up
    with the brain
  • We can model the predicted activation in one of
    two ways
    • shift the boxcar by approximately 5 seconds (2
      images x 2 seconds/image = 4 sec, close enough)
    • convolve the boxcar with the hemodynamic response
      function to model the shape of the true response
      as well as the delay
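
The two approaches above can be sketched as follows. This is a minimal illustration, assuming TR = 2 s, a simplified on/off boxcar with 8-volume blocks, and a single-gamma response shape that peaks at 5 s. The gamma shape and its parameters are illustrative assumptions, not the HRF built into any particular package.

```python
import numpy as np

TR = 2.0          # seconds per volume (from the slides)
n_vols = 136      # volumes per run (from the slides)

# Simplified boxcar: 8 volumes off, 8 volumes on, repeating (16-s blocks).
boxcar = np.zeros(n_vols)
for start in range(8, n_vols, 16):
    boxcar[start:start + 8] = 1.0

# Assumed single-gamma response, h(t) ~ t^5 * exp(-t); the continuous
# function peaks at t = 5 s. Sampled once per TR over 30 s.
t = np.arange(0, 30, TR)
hrf = t ** 5 * np.exp(-t)
hrf /= hrf.sum()                      # unit area keeps the plateau near 1

# Approach 1: shift the boxcar by 2 volumes (4 s).
shifted = np.roll(boxcar, 2)
shifted[:2] = 0.0                     # discard values wrapped around by roll

# Approach 2: convolve the boxcar with the response and trim to run length.
convolved = np.convolve(boxcar, hrf)[:n_vols]
```

The shifted boxcar only captures the delay; the convolved predictor also rounds off the onsets and offsets of each block.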

[Figure panels: boxcar; predicted activation in visual area; predicted activation in object area]
7
Types of Errors
p value = probability of a Type I error (declaring a
voxel active when it is in reality NOT active),
e.g., p < .05: If a voxel is in reality NOT active,
there is less than a 5% probability that our stats
will declare it active
Slide modified from Duke course
8
Statistical Approaches in a Nutshell
  • t-tests
    • compare activation levels between two conditions
    • use a time-shift to account for hemodynamic lag
  • correlations
    • model activation and see whether any areas show a
      similar pattern
  • Fourier analysis
    • do a Fourier analysis to see if there is energy
      at your paradigm frequency

Fourier analysis images from Huettel, Song &
McCarthy, 2004, Functional Magnetic Resonance
Imaging
9
Effect of Thresholds
r = 0, 0% of variance, p < 1
10
Complications
  • Not only is it hard to determine what's real, but
    there are all sorts of statistical problems
  • Potential problems
    • data may be contaminated by artifacts (e.g., head
      motion, breathing artifacts)
    • .05 x 49,152 = 2,458 significant voxels by
      chance alone
    • many assumptions of statistics (adjacent voxels
      uncorrelated with each other; adjacent time
      points uncorrelated with one another) are false

What's wrong with these data?
r = .24, 6% of variance, p < .05
11
The General Linear Model (GLM)
  • GLM definition from Huettel et al.
    • a class of statistical tests that assume that the
      experimental data are composed of the linear
      combination of different model factors, along
      with uncorrelated noise
  • Model
    • statistical model
  • Linear
    • things add up sensibly (1 + 1 = 2)
    • note that linearity refers to the predictors in
      the model and not necessarily the BOLD signal
  • General
    • many simpler statistical procedures such as
      correlations, t-tests and ANOVAs are subsumed by
      the GLM

12
Benefits of the GLM
  • GLM is an overarching tool that can do anything
    that the simpler tests do
  • allows any combination of contrasts (e.g., intact
    - scrambled, scrambled - baseline), unlike
    simpler methods (correlations, t-tests, Fourier
    analyses)
  • allows more complex designs (e.g., factorial
    designs)
  • allows much greater flexibility for combining
    data within subjects and between subjects
  • allows comparisons between groups
  • allows counterbalancing orders within and between
    subjects
  • allows modelling of known sources of noise in the
    data (e.g., error trials, head motion)

13
Part 2
  • Composition of a Voxel Time Course

14
A Simple Experiment
  • Lateral Occipital Complex
  • responds when subject views objects

Blank Screen
Intact Objects
Scrambled Objects
TIME
One volume (12 slices) every 2 seconds for 272
seconds (4 minutes, 32 seconds). Condition
changes every 16 seconds (8 volumes).
15
What's real?
A.
C.
B.
D.
16
What's real?
  • I created each of those time courses by
    taking the predictor function and adding a
    variable amount of random noise

signal


noise
17
What's real?
Which of the data sets below is more convincing?
18
Formal Statistics
  • Formal statistics are just doing what your
    eyeball test of significance did
    • Estimate how likely it is that the signal is real
      given how noisy the data is
  • confidence: how likely is it that the results
    could occur purely due to chance?
  • p value = probability value
    • If p = .03, that means there is a .03/1 or 3%
      chance that results like these could occur
      purely due to chance
  • By convention, if the probability that a result
    could be due to chance is less than 5% (p < .05),
    we say that result is statistically significant
  • Significance depends on
    • signal (differences between conditions)
    • noise (other variability)
    • sample size (more time points are more
      convincing)

19
Let's create a time course for one LO voxel
20
We'll begin with activation
Response to Intact Objects is 4X greater than
Scrambled Objects
21
Then we'll assume that our modelled activation is
off because of a transient component
22
Our modelled activation could be off for other
reasons
  • All of the following could lead to inaccurate
    models
  • different shape of function
  • different width of function
  • different latency of function

23
Reminder: Variability of HRF
Intersubject variability of HRF in M1. Handwerker
et al., 2004, NeuroImage
24
Now let's add some variability due to head motion
25
...though really motion is more complex
  • Head motion can be quantified with 6 parameters
    given in any motion correction algorithm
    • x translation
    • y translation
    • z translation
    • xy rotation
    • xz rotation
    • yz rotation
  • For simplicity, I've only included parameter one
    in our model
  • Head motion can lead to other problems not
    predictable by these parameters

26
Now let's throw in a pinch of linear drift
  • linear drift could arise from magnet noise (e.g.,
    parts warm up) or physiological noise (e.g.,
    subject's head sinks)

27
...and then we'll add a dash of low frequency noise
  • low frequency noise can arise from magnet noise
    or physiological noise (e.g., subject's cycles of
    alertness/drowsiness)
  • low frequency noise would occur over a range of
    frequencies but for simplicity, I've only
    included one frequency (1 cycle per run) here
  • Linear drift is really just very low frequency
    noise

28
...and our last ingredient: some high frequency noise
  • high frequency noise can arise from magnet noise
    or physiological noise (e.g., subject's breathing
    rate and heart rate)

29
When we add these all together, we get a
realistic time course
30
Part 3
  • General Linear Model

31
Now let's be the experimenter
  • First, we take our time course and normalize it
    using z scores
  • z = (x - mean)/SD
  • normalization leads to data where
    • mean = zero
    • SD = 1

Alternative: You can transform the data into %
BOLD signal change. This is usually a better
approach because it's not dependent on variance
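
Both normalizations can be sketched in a few lines. The raw values below are made up for illustration, and the choice of the run mean as the baseline for % signal change is an assumption (a rest-condition mean is another common choice).

```python
from statistics import mean, pstdev

# Hypothetical raw time course for one voxel (arbitrary scanner units).
raw = [1000.0, 1004.0, 1010.0, 1006.0, 998.0, 1002.0]

m = mean(raw)
sd = pstdev(raw)          # population SD; the sample SD is also used in practice

# z scores: mean 0, SD 1, so amplitude is expressed in units of variability.
z = [(x - m) / sd for x in raw]

# % signal change relative to the run mean (assumed baseline):
# amplitude no longer depends on how noisy the voxel is.
psc = [100.0 * (x - m) / m for x in raw]
```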
32
Wake Up!!!!!
If you only pay attention to one slide in this
lecture, it should be the next one!!!
33
We create a GLM with 2 predictors
fMRI Signal = Design Matrix x Betas + Residuals
(our data = β1 x Predictor 1 + β2 x Predictor 2 + Residuals)
  • fMRI Signal: our data
  • Design Matrix: what we CAN explain
  • Betas (β1, β2): how much of it we CAN explain
  • Residuals: what we CANNOT explain
Statistical significance is basically a ratio of
explained to unexplained variance
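
As a rough sketch of this slide, the following fits a two-predictor GLM to a simulated voxel by ordinary least squares. The boxcar timing, the "true" betas (2 and 0.5), the noise level, and the use of unconvolved boxcars are all assumptions chosen for illustration; real packages add HRF convolution and other refinements.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vols = 136

# Hypothetical blank / intact / scrambled cycle, 8 volumes per condition.
intact = np.zeros(n_vols)
scrambled = np.zeros(n_vols)
for start in range(0, n_vols, 24):
    intact[start + 8:start + 16] = 1.0
    scrambled[start + 16:start + 24] = 1.0

# Design matrix: two predictors plus a constant for the mean signal level.
X = np.column_stack([intact, scrambled, np.ones(n_vols)])

# Simulated data: signal = X @ betas, plus random noise.
true_betas = np.array([2.0, 0.5, 100.0])
y = X @ true_betas + rng.normal(0.0, 1.0, n_vols)

# Least-squares estimate of the betas, and what is left over.
betas, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ betas

# Ratio of explained to unexplained variance (an F statistic for the
# two predictors of interest).
ss_resid = np.sum(residuals ** 2)
ss_total = np.sum((y - y.mean()) ** 2)
df_model, df_resid = 2, n_vols - X.shape[1]
F = ((ss_total - ss_resid) / df_model) / (ss_resid / df_resid)
```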
34
Implementation of GLM in SPM
Many thanks to Øystein Bech Gadmar for creating
this figure in SPM
↓ Time
  • SPM represents time as going down
  • SPM represents predictors within the design
    matrix as grayscale plots (where black = low,
    white = high) over time
  • GLM includes a constant to take care of the
    average activation level throughout each run
  • SPM shows this explicitly (BV may not)

35
Effect of Beta Weights
  • Adjustments to the beta weights have the effect
    of raising or lowering the height of the
    predictor while keeping the shape constant

36
Dynamic Example
37
The beta weight is NOT a correlation
  • correlations measure goodness of fit regardless
    of scale
  • beta weights are a measure of scale

small β, large r
small β, small r
large β, small r
large β, large r
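
A small simulation can make the distinction concrete. The boxcar, noise level, and scale factors below are arbitrary assumptions; the point is only that rescaling the data changes β but not r, while adding noise lowers r without changing the expected β.

```python
import numpy as np

rng = np.random.default_rng(1)
predictor = np.tile(np.r_[np.zeros(8), np.ones(8)], 8)   # 128-volume boxcar
noise = rng.normal(0.0, 1.0, predictor.size)

def beta_and_r(y, x):
    """Slope of y regressed on x (with intercept) and the Pearson r."""
    beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    r = np.corrcoef(x, y)[0, 1]
    return beta, r

y = 2.0 * predictor + noise
b1, r1 = beta_and_r(y, predictor)

# Same data multiplied by 10: beta scales by 10, r is unchanged.
b2, r2 = beta_and_r(10.0 * y, predictor)

# Same underlying effect but 10x the noise: r drops.
b3, r3 = beta_and_r(2.0 * predictor + 10.0 * noise, predictor)
```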
38
We create a GLM with 2 predictors
when β1 = 2
when β2 = 0.5
fMRI Signal = Design Matrix x Betas + Residuals
  • fMRI Signal: our data
  • Design Matrix: what we CAN explain
  • Betas: how much of it we CAN explain
  • Residuals: what we CANNOT explain
Statistical significance is basically a ratio of
explained to unexplained variance
39
Correlated Predictors
  • Where possible, avoid predictors that are highly
    correlated with one another
  • This is why we NEVER include a baseline predictor
  • baseline predictor is almost completely
    correlated with the sum of existing predictors


r = -.53

r = -.53
r = -.95
Two stimulus predictors
Baseline predictor
40
Which model accounts for this data?
x β = 1
x β = 0


OR
x β = 1
x β = 0


x β = 0
x β = -1
  • Because the predictors are highly correlated, the
    model is underdetermined (more than one set of
    betas fits the data equally well) and you can't
    tell which beta combo is best
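
The problem can be shown with a toy design matrix. This sketch assumes unconvolved boxcars and a constant term, in which case the baseline predictor is an exact linear combination of the other columns. With HRF-convolved predictors the dependence is strong rather than exact, but the same ambiguity arises. The particular beta values are illustrative only.

```python
import numpy as np

n_vols = 144
intact = np.zeros(n_vols)
scrambled = np.zeros(n_vols)
for start in range(0, n_vols, 24):
    intact[start + 8:start + 16] = 1.0
    scrambled[start + 16:start + 24] = 1.0
baseline = 1.0 - intact - scrambled          # "on" whenever nothing else is
constant = np.ones(n_vols)

X_ok = np.column_stack([intact, scrambled, constant])
X_bad = np.column_stack([intact, scrambled, baseline, constant])

rank_ok = np.linalg.matrix_rank(X_ok)        # full rank: 3 columns, rank 3
rank_bad = np.linalg.matrix_rank(X_bad)      # 4 columns but still rank 3

# Two different beta combos that predict exactly the same time course.
betas_a = np.array([1.0, 0.0, 0.0, 0.0])
betas_b = np.array([0.0, -1.0, -1.0, 1.0])
same_prediction = np.allclose(X_bad @ betas_a, X_bad @ betas_b)
```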

41
Orthogonalizing Regressors
42
Contrasts in the GLM
  • We can examine whether a single predictor is
    significant (compared to the baseline)

43
Contrasts
? balanced
  • Conjunction of contrasts
  • e.g., (1 -1 0) AND (1 0 -1)
  • (Bio motion - Nonbio motion) AND (Bio motion >
    control)
  • more rigorous than balanced contrast
  • hypothetical (but not actual) conjunction p
    value = product of independent p values
  • e.g., .01 x .01 = .0001
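
For reference, one standard way to write the test of a single contrast in an ordinary least-squares GLM is given below. The notation is introduced here and is not from the slides: c is the contrast vector, e.g. (1, -1, 0); X is the design matrix; n is the number of time points; p is the number of predictors. The formula assumes independent errors and a full-rank design matrix.

```latex
t = \frac{c^{\top}\hat{\beta}}
         {\sqrt{\hat{\sigma}^{2}\, c^{\top}\left(X^{\top}X\right)^{-1} c}},
\qquad
\hat{\sigma}^{2} = \frac{\lVert y - X\hat{\beta} \rVert^{2}}{n - p}
```

A conjunction then requires each contrast to pass threshold on its own, rather than multiplying the two p values.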

44
A Real Voxel
  • Here's the time course from a voxel that was
    significant in the Intact - Scrambled comparison

45
Maximizing Your Power
signal


noise
  • As we saw earlier, the GLM is basically comparing
    the amount of signal to the amount of noise
  • How can we improve our stats?
  • increase signal
  • decrease noise
  • increase sample size (keep subject in longer)

46
How to Reduce Noise
  • If you can't get rid of an artifact, you can
    include it as a predictor of no interest to
    soak up variance

Example: Some people include predictors from the
outcome of motion correction algorithms
Corollary: Never leave out predictors for
conditions that will affect your data (e.g.,
error trials)
This works best when the motion is uncorrelated
with your paradigm (predictors of interest)
47
Reducing Residuals
48
Part 3
  • Deconvolution of Event-Related Designs
    Using the GLM

49
Convolution of Single Trials
Neuronal Activity
BOLD Signal
Haemodynamic Function
Time
Time
Slide from Matt Brown
50
Fast fMRI Detection
Slide from Matt Brown
51
DEconvolution of Single Trials
Neuronal Activity
BOLD Signal
Haemodynamic Function
Time
Time
Slide from Matt Brown
52
Deconvolution Example
  • time course from 4 trials of two types (pink,
    blue) in a jittered design

53
Summed Activation
54
Single Stick Predictor
  • single predictor for first volume of pink trial
    type

55
Predictors for Pink Trial Type
  • set of 12 predictors for subsequent volumes of
    pink trial type
  • need enough predictors to cover unfolding of HRF
    (depends on TR)

56
Predictor Matrix
  • Diagonal filled with 1s

. . .
57
Predictors for the Blue Trial Type
  • set of 12 predictors for subsequent volumes of
    blue trial type

58
Predictor x Beta Weights for Pink Trial Type
  • sequence of beta weights for one trial type
    yields an estimate of the average activation
    (including HRF)

59
Predictor x Beta Weights for Blue Trial Type
  • height of beta weights indicates amplitude of
    response (higher betas = larger response)
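
The stick-predictor idea from the last few slides can be sketched for a single trial type. Everything numeric here is an assumption for illustration: the run length, the random jittered onsets, the noise level, and especially the "true" response shape, which is used only to generate fake data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_vols, n_lags = 300, 12

# Hypothetical jittered trial onsets (in volumes) for one trial type.
onsets = np.sort(rng.choice(np.arange(0, n_vols - n_lags), size=25, replace=False))

# One stick predictor per post-onset volume: column k is 1 at onset + k.
X = np.zeros((n_vols, n_lags))
for onset in onsets:
    for k in range(n_lags):
        X[onset + k, k] = 1.0        # the diagonal filled with 1s

# Assumed response shape, used only to simulate the voxel.
lags = np.arange(n_lags)
true_response = lags ** 2 * np.exp(-lags / 1.5)
true_response /= true_response.max()
y = X @ true_response + rng.normal(0.0, 0.1, n_vols)

# Fit all stick predictors plus a constant; the sequence of betas is the
# estimated average response, with no shape assumed in advance.
Xc = np.column_stack([X, np.ones(n_vols)])
betas, *_ = np.linalg.lstsq(Xc, y, rcond=None)
estimated_response = betas[:n_lags]
```

Overlapping trials are handled because the model assumes responses add linearly, which is the assumption that breaks down when trials are very close together.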

60
Linear Deconvolution
Miezin et al., 2000
  • Jittering ITI also preserves linear independence
    among the hemodynamic components comprising the
    BOLD signal.

61
Fast fMRI Estimation
  • Pros
    • Produces time course
    • Does not assume specific shape for hemodynamic
      function
    • Robust against trial history biases (though not
      immune to them)
    • Compound trial types possible
  • Cons
    • Complicated
    • Unrealistic assumptions about linearity if trials
      are too close in time
      • BOLD is non-linear with inter-event intervals < 6
        sec.
      • Nonlinearity becomes severe under 2 sec.
    • Sensitive to noise

62
Part 4
  • Dealing with Faulty Assumptions

63
What's this reviewer complaining about?!
  • Particularly if you do voxelwise stats, you have
    to be careful to follow the accepted standards of
    the field. In the past few years the following
    approaches have been recommended by the stats
    mavens
  • Correction for multiple comparisons
  • Random effects analyses
  • Correction for serial correlations

64
Dead Salmon
poster at Human Brain Mapping conference, 2009
  • 130,000 voxels
  • no correction for multiple comparisons

65
Fishy Headlines
66
Correction for Multiple Comparisons
With conventional probability levels (e.g., p <
.05) and a huge number of comparisons (e.g., 64 x
64 x 12 = 49,152), a lot of voxels will be
significant purely by chance, e.g., .05 x 49,152 =
2,458 voxels significant due to chance. How can
we avoid this?
  • Bonferroni correction
    • divide desired p value by number of comparisons
  • Example
    • desired p value: p < .05
    • number of voxels: 50,000
    • required p value: p < .05 / 50,000, so p <
      .000001
  • quite conservative
  • can use less stringent values
    • e.g., Brain Voyager can use the number of voxels
      in the cortical surface
    • small volume correction: use more liberal
      thresholds in areas of the brain which you
      expected to be active
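
The arithmetic on this slide can be checked directly. The family-wise error calculation at the end assumes independent tests, which, as noted elsewhere in these slides, is not true of neighbouring voxels, so treat it as a rough illustration.

```python
n_voxels = 64 * 64 * 12            # 49,152 voxels
alpha = 0.05

# Expected number of false positives if no voxel is truly active.
expected_false_positives = alpha * n_voxels        # 2457.6, i.e. about 2,458

# Bonferroni: per-voxel threshold that keeps the overall rate near alpha.
bonferroni_threshold = alpha / n_voxels            # roughly .000001

# Chance of at least one false positive anywhere (independence assumed).
fwe_uncorrected = 1 - (1 - alpha) ** n_voxels
fwe_bonferroni = 1 - (1 - bonferroni_threshold) ** n_voxels
```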

67
Correction for Multiple Comparisons
  • Gaussian random field theory
  • Fundamental to SPM
  • If data are very smooth, then the chance of noise
    points passing threshold is reduced
  • Can correct for the number of resolvable
    elements (resels) rather than number of voxels

Slide modified from Duke course
68
  • Cluster correction
  • falsely activated voxels should be randomly
    dispersed
  • set minimum cluster size to be large enough to
    make it unlikely that a cluster of that size
    would occur by chance
  • some algorithms assume that data from adjacent
    voxels are uncorrelated (not true)
  • some algorithms (e.g., Brain Voyager) estimate
    and factor in spatial smoothness of maps
  • cluster threshold may differ for different
    contrasts
  • Test-retest reliability
  • Perform statistical tests on each half of the
    data
  • The probability of a given voxel appearing in
    both purely by chance is the square of the p
    value used in each half
  • e.g., .001 x .001 = .000001
  • Alternatively, use the first half to select an
    ROI and evaluate your hypothesis in the second
    half.

69
  • False discovery rate (Genovese et al., 2002,
    NeuroImage)
    • controls the proportion of rejected hypotheses
      that are falsely rejected
    • standard p value (e.g., p < .01) means that a
      certain proportion of all voxels will be
      significant by chance (1%)
    • FDR uses q value (e.g., q < .01), meaning that a
      certain proportion of the activated (colored)
      voxels will be significant by chance (1%)
    • works in theory, though in practice, my lab
      hasn't been that satisfied
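
A minimal sketch of the step-up procedure commonly used for FDR control (Benjamini-Hochberg) is shown below. The function name and the example p values are made up for illustration, and this is not presented as the exact implementation in any fMRI package.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p value satisfies p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Everything up to that rank is declared significant.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical p values for ten voxels.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
fdr_mask = benjamini_hochberg(p_vals, q=0.05)

n_uncorrected = sum(p < 0.05 for p in p_vals)                 # 5 voxels
n_bonferroni = sum(p < 0.05 / len(p_vals) for p in p_vals)    # 1 voxel
n_fdr = sum(fdr_mask)                                         # 2 voxels
```

In this toy example FDR lands between the uncorrected threshold and Bonferroni, which is its usual behaviour.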

                                   Is the region truly active?
                                   Yes              No
Does our stat test indicate   Yes  HIT              Type I Error
that the region is active?    No   Type II Error    Correct Rejection
70
  • 6) Poor man's Bonferroni
    • Jack up the threshold till you get rid of the
      schmutz (especially in air, ventricles, white
      matter)
    • If you have a comparison where one condition is
      expected to produce much more activity than the
      other, turn on both tails of the comparison
    • Jody's rule of thumb: If ya can't trust the
      negatives, can ya trust the positives?

Example: MT localizer data. Moving rings >
stationary rings (orange); stationary rings >
moving rings (blue)
71
Correction for Temporal Correlations
Statistical methods assume that each of our time
points is independent. In the case of fMRI, this
assumption is false. Even in a "screen saver"
scan, activation in a voxel at one time is
correlated with its activation within 6
sec. This fact can artificially inflate your
statistical significance.
72
Autocorrelation function
To calculate the magnitude of the problem, we can
compute the autocorrelation function. For a voxel
or ROI, correlate its time course with itself
shifted in time. Plot these correlations by the
degree of shift.
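
As a sketch, the autocorrelation function can be computed as below. This uses the common estimator that normalizes every lag by the lag-0 sum of squares, which is close to (but not identical to) literally correlating the series with a shifted copy. The example time course is synthetic.

```python
import math

def autocorrelation(series, max_lag):
    """Autocorrelation of a series with itself shifted by 0..max_lag samples."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]

# Hypothetical slowly varying time course: one slow cycle plus a faster ripple.
series = [
    math.sin(2 * math.pi * t / 50) + 0.3 * math.sin(2 * math.pi * t / 7)
    for t in range(136)
]
acf = autocorrelation(series, max_lag=10)   # acf[0] is 1 by construction
```

If neighbouring time points were independent, the values at lag 1 and beyond would hover near zero; for smooth signals like this one they stay high.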
original
73
BV can correct for the autocorrelation to yield
revised (usually lower) p values
BEFORE
AFTER
74
BV Preprocessing Options
75
Temporal Smoothing of Data
  • We have the option in our software to temporally
    smooth our data (i.e., remove high temporal
    frequencies)
  • However, I recommended that you not use this
    option
  • Now do you understand why?

76
Clarification
  • correction for temporal correlations is NOT
    necessary with random effects analyses, only for
    fixed effects and individual subjects analysis

77
Collapsed Fixed Effects Models
  • assume that the experimental manipulation has
    same effect in each subject
  • treats all data as one concatenated set with one
    beta per predictor (collapsed across all
    subjects)
    • e.g., Intact = 2
    • Scrambled = .5
  • strong effect in one subject can lead to
    significance even when others show weak or no
    effects
  • you can say that effect was significant in your
    group of subjects but cannot generalize to other
    subjects that you didn't test

78
Separate Subjects Models
  • one beta per predictor per subject
    • e.g., JC: Intact = 2.1
    • JC: Scrambled = 0.2
    • DQ: Intact = 1.5
    • DQ: Scrambled = 1.0
    • KV: Intact = 1.2
    • KV: Scrambled = 1.3
  • weights each subject equally
  • makes data less susceptible to effects of one
    rogue subject

79
Random Effects Analysis
  • Typical fMRI stats test whether the differences
    between conditions are significant in the sample
    of subjects we have tested
  • Often, we want to be able to generalize to the
    population as a whole including all potential
    subjects, not just the ones we tested
  • Random effects analyses allow you to generalize
    to the population you tested
  • Brain Voyager recommends you don't even toy with
    random effects unless you've got 10 or more
    subjects (and 50 is best)
  • Random effects analyses can really squash your
    data, especially if you don't have many subjects.
    Sometimes we refer to the random effects button
    as the "make my activation go away" button.
  • Though standards were lower in the early days of
    fMRI, today it's virtually impossible to publish
    any group voxelwise data without random effects
    analysis
  • You don't have to worry about it if you're using
    the ROI approach because (1) presumably the ROI
    has already been well-established across multiple
    labs and (2) posthoc analyses of results in an
    ROI approach allow you to generalize to the
    population (assuming you include individual
    variance)

underpaid graduate students in need of a few
bucks!
80
Fixed vs. Random Effects GLM
Sample Data 1
Subject   Intact beta   Scram beta   Diff
1         4             3            1
2         2             3            -1
3         4             1            3
SUM       10            7            3

Sample Data 2
Subject   Intact beta   Scram beta   Diff
1         4             3            1
2         2             1            1
3         4             3            1
SUM       10            7            3
  • Fixed Effects GLM cannot tell the difference
    between these data sets because (Intact sum -
    Scram sum) is the same in both cases
  • In Random Effects GLM, Data set 2 would be more
    likely to be significant because all 3 subjects
    show a trend in the same direction (intact >
    scrambled), whereas in data set 1, only 2 of 3
    subjects show a difference in that direction
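
The contrast between the two data sets can be sketched with a one-sample t-test on the per-subject differences, which is the core of a random effects analysis. The helper name is made up, and the data sets are labelled here by their pattern of differences rather than by number.

```python
import math
from statistics import mean, stdev

def one_sample_t(diffs):
    """t statistic for H0: mean difference = 0, treating subjects as random."""
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        return math.inf        # every subject shows exactly the same difference
    return mean(diffs) / (sd / math.sqrt(n))

consistent = [1, 1, 1]         # Intact - Scrambled betas: all subjects agree
mixed = [1, -1, 3]             # same sum, but one subject goes the other way

t_consistent = one_sample_t(consistent)    # infinite: zero between-subject variance
t_mixed = one_sample_t(mixed)              # about 0.87

# Two-tailed critical t for p < .05 with df = 2.
T_CRIT_05_DF2 = 4.303
mixed_significant = abs(t_mixed) > T_CRIT_05_DF2
```

Both sets of differences sum to 3, so a comparison based only on the summed betas cannot separate them. The between-subject variability is what distinguishes them.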

81
Strategies for Exploration vs. Publication
  • Deductive approach
  • Have a specific hypothesis/contrast planned
  • Run all your subjects
  • Run the stats as planned
  • Publish
  • Inductive approach
  • Run a few subjects to see if you're on the right
    track
  • Spend a lot of time exploring the pilot data for
    interesting patterns
  • Find the story in the data
  • You may even change the experiment, run
    additional subjects, or run a follow-up
    experiment to chase the story
  • While you need to use rigorous corrections for
    publication, do not be overly conservative when
    exploring pilot data or you might miss
    interesting trends
  • Random effects analyses can be quite
    conservative so you may want to do exploratory
    analyses with fixed effects (and then run more
    subjects if needed so you can publish random
    effects)

82
Part 4
  • To Localize or Not to Localise?

83
To Localize or Not to Localise?
84
Methodological Fundamentalism
The latest review I received
85
Approach 1 Voxelwise Statistics
  1. You don't necessarily need a priori hypotheses
    (though sometimes you can use less conservative
    stats if you have them)
  2. Average all of your data together in Talairach
    space
  3. Compare two (or more) conditions using precise
    statistical procedures within every voxel of the
    brain. Any area that passes a carefully
    determined threshold is considered real.
  4. Make a list of these areas and publish it.

This is the tricky part!
86
Voxelwise Approach Example
  • Malach et al., 1995, PNAS
  • Question: Are there areas of the human brain that
    are more responsive to objects than scrambled
    objects?
  • You will recognize this as what we now call an LO
    localizer, but Malach was the first to identify LO

LO (red) responds more to objects, abstract
sculptures and faces than to textures, unlike
visual cortex (blue) which responds well to all
stimuli
LO activation is shown in red, behind MT
activation in green
87
The Danger of Voxelwise Approaches
  • This is one of two tables from a paper
  • Some papers publish tables of activation two
    pages long
  • How can anyone make sense of so many areas?

Source: Decety et al., 1994, Nature
88
Approach 2 Region of interest (ROI) analysis
  • If you are looking at a well-established area
    (such as visual cortex, motor cortex, or the
    lateral occipital complex), it's fairly easy to
    activate and identify the area
  • Do the stats and play with the threshold till you
    get something believable in the right vicinity
    based on anatomical location (e.g., sulcal
    landmarks) or functional location (e.g.,
    Talairach coordinates from prior studies)
  • Once you have found the ROI, do independent
    experiments, extract the time course information
    and determine whether activation differences
    between conditions are significant
  • Because the runs that are used to generate the
    area are independent from those used to test the
    hypothesis, liberal statistics (p < .05) can be
    used

89
Example of ROI Approach
Culham et al., 2003, Experimental Brain
Research: Does the Lateral Occipital Complex
compute object shape for grasping?
Step 1: Localize LOC
Intact Objects
Scrambled Objects
90
Example of ROI Approach
Culham et al., 2003, Experimental Brain
Research: Does the Lateral Occipital Complex
compute object shape for grasping?
Step 2: Extract LOC data from experimental runs
Grasping
Reaching
NS, p = .35
NS, p = .31
91
Example of ROI Approach
Very Simple Stats
% BOLD Signal Change, Left Hem. LOC
Subject   Grasping   Reaching
1         0.02       0.03
2         0.19       0.08
3         0.04       0.01
4         0.10       0.32
5         1.01       -0.27
6         0.16       0.09
7         0.19       0.12
Extract average peak from each subject for each
condition
Then simply do a paired t-test to see whether the
peaks are significantly different between
conditions
NS, p = .35
NS, p = .31
  • Instead of using % BOLD Signal Change, you can
    use beta weights
  • You can also do a planned contrast in Brain
    Voyager using a module called the ROI GLM
92
Utility of Doing Both Approaches
  • We also verified the result with a voxelwise
    approach

Verification of no LOC activation for grasping >
reaching even at moderate threshold (p < .001,
uncorrected)
93
Example: The Danger of ROI Approaches
  • Example 1: LOC may be a heterogeneous area with
    subdivisions; ROI analyses gloss over this
  • Example 2: Some experiments miss important areas
    (e.g., Kanwisher et al., 1997 identified one
    important face processing area -- the fusiform
    face area, FFA -- but did not report a second
    area that is a very important part of the face
    processing network -- the occipital face area,
    OFA -- because it was less robust and consistent
    than the FFA)

94
Comparing the two approaches
  • Voxelwise Analyses
  • Require no prior hypotheses about areas involved
  • Include entire brain
  • Often neglect individual differences
  • Can lose spatial resolution with intersubject
    averaging
  • Can produce meaningless laundry lists of areas
    that are difficult to interpret
  • You have to be fairly stats-savvy and include all
    the appropriate statistical corrections to be
    certain your activation is really significant
  • Popular in Europe

95
Comparing the two approaches
  • Region of Interest (ROI) Analyses
  • Extraction of ROI data can be subjected to simple
    stats (no need for multiple comparisons,
    autocorrelation or random effects corrections)
  • Gives you more statistical power (e.g., p < .05)
  • Hypothesis-driven
  • Useful when hypotheses are motivated by other
    techniques (e.g., electrophysiology) in specific
    brain regions
  • ROI is not smeared due to intersubject averaging
  • Important for discriminating abutting areas
    (e.g., V1/V2)
  • Easy to analyze and interpret
  • Neglects other areas which may play a fundamental
    role
  • If multiple ROIs need to be considered, you can
    spend a lot of scan time collecting localizer
    data (thus limiting the time available for
    experimental runs)
  • Works best for reliable and robust areas with
    unambiguous definitions
  • Popular in North America

96
A Proposed Resolution
  • There is no reason not to do BOTH ROI analyses
    and voxelwise analyses
  • ROI analyses for well-defined key regions
  • Voxelwise analyses to see if other regions are
    also involved
  • Ideally, the conclusions will not differ
  • If the conclusions do differ, there may be
    sensible reasons
  • Effect in ROI but not voxelwise
  • perhaps region is highly variable in stereotaxic
    location between subjects
  • perhaps voxelwise approach is not powerful enough
  • Effect in voxelwise but not ROI
  • perhaps ROI is not homogeneous or is
    context-specific

97
Part 5
  • The War of Non-Independence

98
Finding the Obvious
A priori probability of getting JQKA sequence =
(1/13)^4 = 1/28,561. A posteriori probability of
getting JQKA sequence = 1/1 = 100%
  • Non-independence error
    • occurs when statistical tests performed are not
      independent from the means used to select the
      brain region

Arguments from Vul & Kanwisher, book chapter in
press
99
Non-independence Error
  • Egregious example
    • Identify Area X with contrast of A > B
    • Do post hoc stats showing that A is statistically
      higher than B
    • Act surprised
  • More subtle example of selection bias
    • Identify Area X with contrast of A > B
    • Do post hoc stats showing that A is statistically
      higher than C and C is statistically greater than
      B

Arguments from Vul & Kanwisher, book chapter in
press. Figure from Kriegeskorte et al., 2009,
Nature Neuroscience
100
Double Dipping How to Avoid It
  • Kriegeskorte et al., 2009, Nature Neuroscience
  • surveyed 134 papers in prestigious journals
  • 42% showed at least one example of
    non-independence error

101
Correlations Between Individual Subjects' Brain
Activity and Behavioral Measures
  • Sample of Critiqued Papers
  • Eisenberger, Lieberman & Williams, 2003, Science
    • measured fMRI activity during social rejection
    • correlated self-reported distress with brain
      activity
    • found r = .88 in anterior cingulate cortex, an
      area implicated in physical pain perception
    • concluded "rejection hurts"

social exclusion > inclusion
102
Voodoo Correlations
The original title of the paper was not
well-received by reviewers so it was changed even
though some people still use the term
"Voodoo"
2009
  • reliability of personality and emotion measures:
    r ≈ .7
  • reliability of activation in a given voxel: r ≈
    .7
  • highest expected behavior-fMRI correlation is
    .74
  • so how can we have behavior-fMRI correlations
    of r ≈ .9?!
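
The ceiling follows from the classical attenuation relationship: an observed correlation cannot exceed the true correlation times the square root of the product of the two reliabilities. A minimal sketch follows, with a made-up function name. Reading the .74 figure as the result of reliabilities of about .8 and .7 is an assumption here, since that pairing reproduces the number; the slide itself lists roughly .7 for both.

```python
import math

def max_observable_r(reliability_x, reliability_y, true_r=1.0):
    """Upper bound on an observed correlation given the two reliabilities."""
    return true_r * math.sqrt(reliability_x * reliability_y)

bound_07_07 = max_observable_r(0.7, 0.7)    # 0.70
bound_08_07 = max_observable_r(0.8, 0.7)    # about 0.75, in line with the .74 figure
```

Either way, observed correlations near .9 sit above what measures this noisy should be able to produce.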

103
Voodoo Correlations
"Notably, 53% of the surveyed studies selected
voxels based on a correlation with the behavioral
individual-differences measure and then used
those same data to compute a correlation within
that subset of voxels."
Vul et al., 2009, Perspectives on Psychological
Science
104
Avoiding Voodoo
  • Use independent means to select region and then
    evaluate correlation
  • Do split-half reliability test
  • WARNING: This is reassuring that the result can
    be replicated in your sample but does not
    demonstrate that result generalizes to the
    population
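
A toy simulation shows why independent selection matters. Here every "voxel" is pure noise, so no true brain-behavior correlation exists; the number of subjects, number of voxels, and random seed are arbitrary assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
n_subjects, n_voxels = 10, 1000

behavior = [random.gauss(0, 1) for _ in range(n_subjects)]
# Pure-noise "activation" in two independent halves of the data.
half1 = [[random.gauss(0, 1) for _ in range(n_subjects)] for _ in range(n_voxels)]
half2 = [[random.gauss(0, 1) for _ in range(n_subjects)] for _ in range(n_voxels)]

# Non-independent: pick the best voxel and report r from the same data.
r_half1 = [pearson_r(behavior, voxel) for voxel in half1]
best = max(range(n_voxels), key=lambda i: r_half1[i])
r_circular = r_half1[best]

# Independent: select on half 1, evaluate the same voxel on half 2.
r_independent = pearson_r(behavior, half2[best])
```

The circular estimate is large even though nothing real is there, while the estimate from the held-out half is an unbiased (if noisy) estimate of the true correlation, which is zero here.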

105
Is the voodoo problem all that bad?
  • High correlations can occur in legitimately
    analyzed data
  • Did voxelwise analyses use appropriate correction
    for multiple comparisons?
  • then result is statistically significant
    regardless of specific correlation
  • Is additional data being used for
    • inference purposes?
      • if they pretend to provide independent support,
        that's bad
    • presentation purposes?
      • alternative formats can be useful in
        demonstrating that data is clean (e.g., time
        courses look sensible; correlations are not
        driven by outliers)