Title: The General Linear Model
1The General Linear Model
- A Basic Introduction
- Roger Tait (rt337_at_cam.ac.uk)
2Overview
- What is imaging data
- How is data pre-processed
- Hypothesis testing
- GLM simple linear regression
- Analysis software
- How to process results
3What is imaging data?
4Data
A stack of numbers
Functional
5Multiple Data
subjID  voxel 1      voxel 2      voxel 3      voxel 4      ..  voxel n
1       1227.308541  1472.770249  1417.745632  1701.294758  ..  1288.742729
2       1612.461523  1934.953827  1677.661927  2013.194312  ..  1465.051592
3       1466.264739  1759.517687  1559.769586  1871.723503  ..  1827.678127
4       1499.70072   1799.640864  1842.474418  2210.969302  ..  1316.392368
5       1598.121692  1917.746031  1510.850757  1813.020909  ..  1740.286976
6       1408.066243  1689.679492  1399.393815  1679.272578  ..  1534.459154
7       1555.951487  1867.141784  1588.529211  1588.529211  ..  1516.464089
8       1397.721831  1677.266197  1523.825912  1523.825912  ..  1340.814881
9       1333.659118  1600.390941  1384.217926  1384.217926  ..  1461.281399
10      1453.14966   1743.779592  1558.603977  1558.603977  ..  1406.575083
6Reorientation
Native
Reoriented
MNI152
7Basic pre-processing (fMRI)
worest.nii
obrain.nii
omprage.nii
omrest.nii
wnomrest.nii
nomrest.nii
8Basic pre-processing (structural)
gmomprage.nii
wgmomprage.nii
omprage.nii
9How does standard space data help?
10Hypothesis testing
Statistical inference is commonly done with a
test statistic (t, F, χ²) whose distribution
under the null hypothesis H0 is mathematically derived.
For example
NB this assumes that the errors are independent
and normally distributed.
11Introducing The GLM
Y = Xb + e
DATA = MODEL + ERROR
DATA = KNOWN × UNKNOWN + ERROR
- Encapsulates the t-test (paired, unpaired), F-test,
ANOVA (one-way, two-way, main effects, factorial),
MANOVA, ANCOVA, MANCOVA, simple regression,
linear regression, multiple regression and
multivariate regression
12GLM definition
Y = Xb + e
- Where Y is a matrix with a series of observed measurements
- Where X is a matrix that might be a design matrix
- Where b is a matrix containing parameters to be estimated
- And e is a matrix containing error or noise
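The definition above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are simulated, the covariate, the "true" parameters and the noise level are made up, and `np.linalg.lstsq` is used as one convenient way to estimate b.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects = 10

# X: design matrix (KNOWN). Column 0 is an intercept, column 1 a made-up covariate.
X = np.column_stack([np.ones(n_subjects), np.arange(n_subjects, dtype=float)])

# b: parameters (UNKNOWN in practice; fixed here only so that data can be simulated).
b_true = np.array([2.0, 0.5])

# e: error or noise (simulated as independent, normally distributed values).
e = rng.normal(scale=0.1, size=n_subjects)

# Y: observed measurements (DATA = KNOWN x UNKNOWN + ERROR).
Y = X @ b_true + e

# Estimate b from Y and X by least squares, then recover the residuals.
b_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
residuals = Y - X @ b_hat
```

With real data only Y and X are available; `b_hat` and `residuals` are the estimates of b and e.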
13GLM Simple Linear Regression
Y = b0 + X1b1 + e
- b0 is the Y-axis intercept
- b1 is the gradient (slope) of the fitted line
- Y is the observed data (the black circles in the plot of Y against X)
- e is the difference between predicted Y and observed Y
14GLM Simple Linear Regression
Y = b0 + X1b1 + e
- The model is fitted by choosing b0 and b1 so that the
sum of the squares of the estimated errors, Σ ei²,
is as small as possible.
- This is called the Method of Least Squares.
- Σ ei² is called the Residual Sum of Squares (RSS)
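As a minimal sketch of the Method of Least Squares for one predictor, the snippet below uses the standard closed-form estimates of b0 and b1 and computes the RSS. The five data points and the helper names `fit_simple_regression` and `rss` are made up for illustration.

```python
import numpy as np

# Made-up example data: one predictor X1 and one response Y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def fit_simple_regression(x, y):
    """Return (b0, b1) minimising the residual sum of squares."""
    x_mean, y_mean = x.mean(), y.mean()
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b0 = y_mean - b1 * x_mean
    return b0, b1

def rss(x, y, b0, b1):
    """Residual Sum of Squares: sum of squared differences between observed and predicted Y."""
    e = y - (b0 + b1 * x)
    return np.sum(e ** 2)

b0, b1 = fit_simple_regression(x, y)
best = rss(x, y, b0, b1)

# Any other choice of intercept or slope gives a larger RSS.
worse = rss(x, y, b0 + 0.5, b1)
```

Comparing `best` with `worse` shows the defining property of the least-squares fit: moving b0 or b1 away from the estimates increases the RSS.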
15GLM example
DATA = KNOWN × UNKNOWN + ERROR
mean reaction time GENDER AGE
Y = b0 + X1b1 + X2b2 + X3b3 + X4b4 + e
16Dummy Variables
- Continuous variables
- measurements on a continuous scale (age, mRT)
- (-4.01, -0.47, 6.35, -7.06, -7.69, -14.24)
- Dummy Variables
- Code for group membership (disease, gender)
- controls = 0, patients = 1
- females = 1, males = -1
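The coding schemes above can be turned into design-matrix columns as in the sketch below. The four subjects and their ages are invented, and mean-centring the continuous variable is an assumption made here for illustration, not something the slides prescribe.

```python
import numpy as np

# Made-up subject information.
group = ["control", "patient", "control", "patient"]
sex = ["female", "male", "male", "female"]
age = np.array([34.0, 29.0, 41.0, 38.0])

# Dummy variables coding group membership, using the codes from the slide.
group_code = np.array([0.0 if g == "control" else 1.0 for g in group])
sex_code = np.array([1.0 if s == "female" else -1.0 for s in sex])

# Continuous variable, mean-centred (an assumption for this example).
age_centred = age - age.mean()

# Design matrix X: intercept, group, sex, age.
X = np.column_stack([np.ones(len(group)), group_code, sex_code, age_centred])
```

Each row of `X` describes one subject and each column is one known regressor in Y = Xb + e.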
17Usage
- Hypothesis tests with the GLM can be multivariate or
several independent univariate tests
- In multivariate tests the columns of Y are tested together
- In univariate tests the columns of Y are tested
independently (multiple univariate tests with the
same design matrix)
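The univariate case, with one test per column of Y and a shared design matrix, can be sketched as follows. The data are simulated, the group effect added to the first voxel is made up, and the t statistic is computed with the usual contrast formula rather than with any particular imaging package.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_voxels = 20, 5

# Shared design matrix: intercept plus a 0/1 group dummy variable.
group = np.repeat([0.0, 1.0], n_subjects // 2)
X = np.column_stack([np.ones(n_subjects), group])

# Y: one column per voxel (simulated intensities; voxel 0 gets a made-up group effect).
Y = rng.normal(loc=1500.0, scale=100.0, size=(n_subjects, n_voxels))
Y[:, 0] += 300.0 * group

# One least-squares call fits every column of Y independently.
B, *_ = np.linalg.lstsq(X, Y, rcond=None)          # shape (2, n_voxels)
residuals = Y - X @ B

# Per-voxel error variance and t statistic for the group effect.
dof = n_subjects - np.linalg.matrix_rank(X)
sigma2 = np.sum(residuals ** 2, axis=0) / dof
c = np.array([0.0, 1.0])                           # contrast selecting the group parameter
var_c = c @ np.linalg.inv(X.T @ X) @ c
t = (c @ B) / np.sqrt(sigma2 * var_c)              # one t value per voxel
```

With this 0/1 coding the t value for each voxel equals the pooled-variance two-sample t statistic, which is one concrete instance of the GLM encapsulating the t-test.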
18fMRI model specification
silent naming task
The model
BOLD signal
19Actual retrieved data
20fMRI analysis with FSL
21Structural analysis with CamBA
sex
weight
group
22Structural analysis output
23Where are my clusters?
here is a big cluster
here is a big cluster
24Where is the cluster I am interested in?
position mouse cursor here
cluster location information shown here
25How do my clusters help me?
26Statistical Testing
- Convert cluster into a binary mask
- Overlay mask on subject data
- Extract voxel intensities
- Do some statistical analysis to get more
information from your data
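The first three steps above can be sketched with NumPy arrays standing in for image volumes. Everything here is hypothetical: the tiny 3x3x3 volumes, the statistic values, and the threshold of 3.0. Loading real .nii files would need an imaging library, which is outside this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
n_subjects = 4
shape = (3, 3, 3)

# Simulated subject data in standard space: one small volume per subject.
subject_maps = rng.normal(loc=1500.0, scale=100.0, size=(n_subjects, *shape))

# Simulated statistic map containing one two-voxel "cluster".
stat_map = np.zeros(shape)
stat_map[1, 1, 1] = 4.2
stat_map[1, 1, 2] = 3.7

# Step 1: convert the cluster into a binary mask (threshold is hypothetical).
threshold = 3.0
mask = stat_map > threshold

# Steps 2 and 3: overlay the mask on each subject and extract voxel intensities.
cluster_values = subject_maps[:, mask]       # shape (n_subjects, voxels in cluster)

# One summary value per subject, ready for further statistical analysis.
cluster_means = cluster_values.mean(axis=1)
```

`cluster_means` holds one number per subject, which is the form needed for follow-up tests such as correlation with a behavioural score.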
27Correlation with behaviour
for cluster Pos_002
p > 0.05: close, but cluster Pos_001 does not
significantly correlate with behaviour HIT1
28Other Analyses
- Mean different from 0: one-sample t-test
- Difference between means: two-sample t-test
- Linear relationship between 2 variables: simple regression
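The test statistics behind these three analyses can be sketched directly in NumPy. The function names are made up for this example, the two-sample version assumes equal variances (pooled), and p-values are left out because they need a statistics library.

```python
import numpy as np

def one_sample_t(x, mu0=0.0):
    """t statistic for testing whether the mean of x differs from mu0."""
    x = np.asarray(x, dtype=float)
    return (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(x.size))

def two_sample_t(a, b):
    """Pooled-variance t statistic for the difference between two group means."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (a.size + b.size - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var * (1.0 / a.size + 1.0 / b.size))

def pearson_r(x, y):
    """Strength of the linear relationship between two variables."""
    return np.corrcoef(x, y)[0, 1]

# Made-up values, for example cluster intensities or behavioural scores.
t_one = one_sample_t([1.0, 2.0, 3.0])
t_two = two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
r = pearson_r([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

Each of these is a special case of the GLM: an intercept-only design for the one-sample test, an intercept plus a group dummy for the two-sample test, and an intercept plus one continuous regressor for simple regression.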
29What else can I do to find out more about my data?
30Other types of analyses
- Factorial designs
- Permits analysis of multiple time data
- Shows
- Main effects of Factor 1 (time)
- Main effects of Factor 2 (group)
- Interaction between Factor 1 and Factor 2
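A two-factor design with an interaction term can be sketched as a design matrix in which the interaction column is the product of the two main-effect columns. This is a simplified illustration: the -1/+1 coding, the eight observations and the effect sizes are made up, and the sketch ignores the within-subject correlation that a real analysis of repeated (multiple time point) data has to model.

```python
import numpy as np

# Factor 1 (time): -1 = first time point, +1 = second time point.
time = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
# Factor 2 (group): -1 = one group, +1 = the other.
group = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float)

# Design matrix: intercept, main effect of time, main effect of group, interaction.
X = np.column_stack([np.ones(time.size), time, group, time * group])

# Noise-free made-up data with known effects, so the fit can be checked exactly.
b_true = np.array([10.0, 1.0, 2.0, 0.5])
y = X @ b_true

b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
main_effect_time, main_effect_group, interaction = b_hat[1], b_hat[2], b_hat[3]
```

A non-zero `interaction` estimate means the effect of time differs between the groups.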
31Useful software packages
- CamBA (Cambridge)
- http://www-bmu.psychiatry.cam.ac.uk/software/
- FSL Randomise (Oxford)
- http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Randomise
- SPM8 (UCL)
- http://www.fil.ion.ucl.ac.uk/spm/software/spm8/
32In summary
- The GLM allows us to summarize a wide variety of
research outcomes by specifying the exact
equation that best summarizes the data for a
study. If the model is wrongly specified, the
estimates of the coefficients (the beta values)
are likely to be biased (i.e. wrong) and the
resulting equation will not describe the data
accurately.
- In complex situations (e.g. cognitive fMRI
paradigms), this model specification problem can
be a serious and difficult one.
33Any questions?