Title: Nonnegative least squares for imaging data
Slide 1: Nonnegative least squares for imaging data
- Keith Worsley, McGill
- Jonathan Taylor, Stanford and Université de Montréal
- John Aston, Academia Sinica, Taipei
Slide 3: Nature (2005)
Slide 4: Subject is shown one of 40 faces chosen at random
- Expressions: Happy, Sad, Fearful, Neutral
Slide 5: ... but the face is only revealed through random bubbles
- First trial: Sad expression
- Subject is asked the expression; response: Neutral (incorrect)
- 75 random bubble centres, smoothed by a Gaussian bubble
- [Figure: what the subject sees; true expression Sad]
Slides 6-14: Your turn
- Slide 6: subject response Fearful, correct
- Slide 7: subject response Happy, incorrect (Fearful)
- Slide 8: subject response Happy, correct
- Slide 9: subject response Fearful, correct
- Slide 10: subject response Sad, correct
- Slide 11: subject response Happy, correct
- Slide 12: subject response Neutral, correct
- Slide 13: subject response Happy, correct
- Slide 14: subject response Happy, incorrect (Fearful)
Slide 15: Bubbles analysis
- E.g. Fearful (3000/4750 trials)
- [Figure: bubble images for trials 1, 2, 3, 4, 5, 6, 7, ..., 750, their sum, and the sum over correct trials]
- Threshold at the proportion of correct trials (0.68), scale to [0, 1], and use this as a bubble mask
- Proportion of correct bubbles = (sum of correct bubbles) / (sum of all bubbles)
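The bubble-mask recipe above can be sketched in a few lines; the arrays `bubbles` and `correct` below are random stand-ins for the real trial data, and the threshold `p0` plays the role of the slide's 0.68:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: per-trial smoothed bubble images (trials x pixels)
# and a 0/1 correctness indicator per trial.
n_trials, n_pixels = 750, 64
bubbles = rng.random((n_trials, n_pixels))   # smoothed bubble masks
correct = rng.integers(0, 2, n_trials)       # 1 = correct response

# Proportion of correct bubbles at each pixel:
# (sum of bubbles over correct trials) / (sum over all trials)
prop_correct = bubbles[correct == 1].sum(axis=0) / bubbles.sum(axis=0)

# Threshold at the overall proportion of correct trials and rescale to [0, 1]
# to obtain the bubble mask described on the slide.
p0 = correct.mean()
mask = np.clip((prop_correct - p0) / (1 - p0), 0, 1)
```
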
Slide 16: Results
- Mask x average face
- But are these features real or just noise? Need statistics.
- [Figure: masked average faces for Happy, Sad, Fearful, Neutral]
Slide 17: Statistical analysis
- Correlate bubbles with the response (correct = 1, incorrect = 0), separately for each expression
- Equivalent to a 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful
- Very similar to the proportion of correct bubbles
- Z ~ N(0,1) statistic
- [Figure: trials 1, 2, 3, 4, 5, 6, 7, ..., 750 with responses 0, 1, 1, 0, 1, 1, 1, ..., 1]
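The 2-sample Z-statistic at each pixel can be sketched as follows (random stand-in data; the pooled-variance form is one standard choice, not necessarily the exact variant used on the slide):

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_pixels = 750, 64
bubbles = rng.random((n_trials, n_pixels))   # stand-in bubble images
response = rng.integers(0, 2, n_trials)      # correct = 1, incorrect = 0

g1, g0 = bubbles[response == 1], bubbles[response == 0]
n1, n0 = len(g1), len(g0)

# Pooled-variance two-sample Z statistic, one value per pixel
sp2 = ((n1 - 1) * g1.var(0, ddof=1) + (n0 - 1) * g0.var(0, ddof=1)) / (n1 + n0 - 2)
Z = (g1.mean(0) - g0.mean(0)) / np.sqrt(sp2 * (1 / n1 + 1 / n0))
```

With null (unrelated) data, these Z values behave approximately as N(0,1), which is what makes the thresholding on the next slide meaningful.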
Slide 18: Results
- Thresholded at Z = 1.64 (P = 0.05, uncorrected)
- Multiple comparisons correction for 91,200 pixels? Need random field theory.
- [Figure: Z ~ N(0,1) statistic images over the average face for Happy, Sad, Fearful, Neutral]
Slide 19: Euler characteristic (EC) = blobs - holes
- Excursion set: Z > threshold, for the neutral face
- EC at increasing thresholds: 0, 0, -7, -11, 13, 14, 9, 1, 0
- Heuristic: at high thresholds t the holes disappear, EC = 1 or 0, so E(EC) ~ P(max Z > t)
- There is an exact expression for E(EC) at all thresholds, and E(EC) ~ P(max Z > t) is extremely accurate
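The EC = blobs - holes count can be computed directly from a binary excursion set. A minimal sketch (my own illustrative implementation via vertex/edge/face counting on the pixel complex, not the authors' code):

```python
import numpy as np

def euler_characteristic(mask):
    """EC = #blobs - #holes of a 2D binary excursion set, computed as
    V - E + F for the cubical complex built from the 'on' pixels."""
    m = np.pad(np.asarray(mask, bool), 1)
    F = m.sum()                                    # faces: one per on-pixel
    # A vertex is present if any of the up-to-4 pixels touching it is on.
    V = (m[:-1, :-1] | m[:-1, 1:] | m[1:, :-1] | m[1:, 1:]).sum()
    # An edge is present if either pixel sharing it is on.
    E = ((m[:-1, 1:-1] | m[1:, 1:-1]).sum()        # horizontal edges
         + (m[1:-1, :-1] | m[1:-1, 1:]).sum())     # vertical edges
    return int(V) - int(E) + int(F)
```

For example, a single pixel gives EC = 1 (one blob), while a 3x3 ring of pixels with a hollow centre gives EC = 0 (one blob minus one hole).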
Slide 20: Random field theory
- E(EC) is a sum over dimensions d of (Lipschitz-Killing curvature L_d of S) x (EC density rho_d of Z above t), where L_d(S) = Resels_d(S) x c
- [Figure: Z(s) generated by convolving white noise with a filter of given FWHM]
Slide 21: Results, corrected for search
- Random field theory threshold: Z = 3.92 (P = 0.05)
- Thresholds: 3.82, 3.80, 3.81, 3.80
- Saddle-point approximation (Rabinowitz, 1997; Chamandy, 2007)?
- Bonferroni: Z = 4.87 (P = 0.05): nothing survives
- [Figure: Z ~ N(0,1) statistic over the average face for Happy, Sad, Fearful, Neutral]
Slide 22: fMRI data
- 120 scans; 3 scans each of hot, rest, warm, rest, ...
- T = (hot - warm effect) / S.d.; T ~ t with 110 df if no effect
Slide 23: Linear model regressors
Slide 24: Linear model for fMRI time series with AR(p) errors
- Y(t) = x(t)'beta + epsilon(t), where epsilon(t) follows an AR(p) process with coefficients rho_1, ..., rho_p
- Unknown parameters: beta, rho_1, ..., rho_p, sigma
Slide 25: Unknown latency d of the HRF
- Unknown parameters: the magnitude beta of the response and the latency shift d of the HRF
Slide 26: Example
- [Figure: the stimulus and the stimulus convolved with the HRF, plotted against time t = 0-70 seconds]
- Two interesting problems:
  - Estimate the latency shift d and its standard error
  - Test for the magnitude beta of the stimulus, allowing for unknown latency
Slide 27: Test for the magnitude beta of the stimulus allowing for unknown latency
- We could do either:
  - T-test on beta1 > 0, allowing for beta2: loses sensitivity if d is far from 0
  - F-test on (beta1, beta2) != 0: wastes sensitivity on unrealistic HRFs
Slide 28: Cone alternative
- We know that the magnitude of the response is positive, and that the latency shift must lie in some restricted interval, say [-2, 2] seconds
- This implies beta >= 0 and -delta <= d <= delta, where delta = 2 seconds
- This specifies a cone alternative for (beta1, beta2) = (beta, d beta) (Friman et al., 2003)
- [Figure: cone in the (beta1, beta2) plane swept out from d = -2 through d = 0 to d = 2, with the null at 0; cone angle theta = 2 atan(delta ||x2|| / ||x1||)]
Slide 29: Non-negative least squares
- [Figure: the stimulus and the regressors x1(t) and x2(t), plotted against time t = 0-70 seconds]
Slide 30: Non-negative least squares
- [Figure: cone in the (beta1, beta2) plane spanned by x1(t) and x2(t), with the null at 0; cone angle theta = angle between x1 and x2; one edge has beta1 > 0, beta2 = 0, the interior has beta1 > 0, beta2 > 0, and the other edge has beta1 = 0, beta2 > 0]
Slide 31: Example of three extremes
- x1(t): standard; x2(t): delayed 4 seconds; x3(t): spread 4 seconds
- [Figure: 3D cone in (beta1, beta2, beta3)]
Slide 32: General non-negative least squares problem
- Minimise ||Y - sum_j x_j beta_j||^2 subject to beta_j >= 0 for all j
- Footnote: Woolrich et al. (2004) replace the hard constraints by soft constraints through a prior distribution on beta, taking a Bayesian approach.
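A sketch of this problem using SciPy's implementation of the Lawson-Hanson active-set solver (the design matrix and coefficients below are random stand-ins, not the PET or fMRI data):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 5))              # stand-in design: n = 20, p = 5
beta_true = np.array([1.0, 0.0, 2.0, 0.0, 0.0])
y = X @ beta_true + 0.01 * rng.standard_normal(20)

# Solve  min ||y - X beta||^2  subject to  beta >= 0
beta_hat, resid_norm = nnls(X, y)
```

Note that the active-set solution returns exact zeros for the inactive coefficients, which is the sparsity effect discussed on the following slides.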
Slide 35: Pick a range of, say, p = 150 plausible values of the non-linear parameter theta: theta_1, ..., theta_p
Slide 36: Fitting the NNLS model
- Simple approach: do all-subsets regression
- Throw out any model that does not satisfy the non-negativity constraints
- Among those left, pick the model with the smallest error sum of squares
- For larger models there are more efficient methods, e.g. Lawson and Hanson (1974)
- The non-negativity constraints tend to enforce sparsity, even if the regressors are highly correlated (e.g. PET)
- Why? Highly correlated regressors have huge positive and negative unconstrained coefficients; non-negativity suppresses the negative ones
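The simple all-subsets procedure described above can be written down directly. This is an illustrative sketch, only viable for small p; for larger problems the Lawson-Hanson solver gives the same optimum far more efficiently:

```python
from itertools import combinations

import numpy as np

def nnls_all_subsets(X, y):
    """All-subsets regression: discard any subset whose least-squares fit
    violates the non-negativity constraints, then return the feasible model
    with the smallest error sum of squares."""
    n, p = X.shape
    best_beta, best_rss = np.zeros(p), float(y @ y)   # start from the empty model
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            b, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            if np.all(b >= 0):                        # keep only feasible fits
                r = y - X[:, cols] @ b
                if r @ r < best_rss:
                    best_rss = float(r @ r)
                    best_beta = np.zeros(p)
                    best_beta[list(cols)] = b
    return best_beta, best_rss
```

Because the NNLS objective is convex, the best feasible subset fit coincides with the global constrained optimum, so this brute-force version agrees with Lawson-Hanson on small test problems.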
Slide 37: Example: n = 20, p = 150, but surprisingly it does not overfit
- [Figure: tracer concentration against time (0-6000): data Y, fit Yhat, and Yhat components 1 and 2]
- We tend to get sparse pairs of adjacent regressors, suggesting the best regressor is somewhere in between.
Slide 38: P-values?
- [Definition of the Beta-bar statistic; equation lost in transcription]
Slide 39: P-values for PET data at a single voxel
- p = 2; cone weights w1 = 1/2, w2 = theta / 2 pi
Slide 40: The Beta-bar random field
- From well-known EC densities of the F field
- From simulations at a single voxel
- Same linear combination!
Slide 41: Proof
- [Figure: Z1 ~ N(0,1) against Z2 ~ N(0,1), in coordinates s1, s2: the rejection region for the cone alternative above threshold t, and the corresponding excursion sets over the search region S, with the null at 0]
Slide 42: Euler characteristic heuristic again
- Excursion sets X_t over the search region S
- EC = blobs - holes: 1, 7, 6, 5, 2, 1, 1, 0
- [Figure: observed vs. expected Euler characteristic EC as a function of the threshold t]
- The expected EC is EXACT!
Slide 43: Proof
Slide 44: Steiner-Weyl Tube Formula (1930); Morse Theory Approach (1995)
- Put a tube of radius r about the search region dS
- Find its volume, expand as a power series in r, and pull off the coefficients
- This works for a Gaussian random field; for a chi-bar random field???
Slide 45: Steiner-Weyl Volume of Tubes Formula (1930)
- [Figure: Tube(dS, r) of radius r around dS]
- Lipschitz-Killing curvatures are just intrinsic volumes, or Minkowski functionals, in the (Riemannian) metric of the variance of the derivative of the process
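For reference, the classical form of the Steiner-Weyl formula for a convex region S in R^D (stated here from the standard literature, with omega_d the volume of the unit d-ball and L_d the Lipschitz-Killing curvatures; the slide's version about the boundary dS differs only in bookkeeping):

```latex
\mathrm{vol}\bigl(\mathrm{Tube}(S, r)\bigr)
  \;=\; \sum_{d=0}^{D} \omega_d \, r^{d} \, \mathcal{L}_{D-d}(S),
\qquad
\omega_d = \frac{\pi^{d/2}}{\Gamma\!\left(\tfrac{d}{2} + 1\right)} .
```

Expanding the tube volume in powers of r and reading off the coefficients is exactly the "pull off coefficients" step on the previous slide.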
Slide 46: Lipschitz-Killing curvatures of triangles
- [Figure: triangulated regions S of varying edge length; Lipschitz-Killing curvature of triangles, and of a union of triangles]
Slide 47: Non-isotropic data? Use the Riemannian metric of Var(grad Z)
- [Figure: Z ~ N(0,1) fields in coordinates s1, s2, with varying edge length; Lipschitz-Killing curvature of triangles, and of a union of triangles]
Slide 48: We need independent, identically distributed random fields, e.g. residuals from a linear model
- Lipschitz-Killing curvature of triangles, and of a union of triangles
- Taylor and Worsley, JASA (2007)
Slide 49: Beautiful symmetry
- Steiner-Weyl Tube Formula (1930); Taylor's Gaussian Tube Formula (2003)
- Put a tube of radius r about the search region dS and about the rejection region R_t
- [Figure: Tube(dS, r) around dS, and Tube(R_t, r) around the rejection region R_t in the (Z1, Z2) plane with Z1, Z2 ~ N(0,1), between thresholds t - r and t]
- Find the volume or probability, expand as a power series in r, and pull off the coefficients
Slide 50: Taylor's Gaussian Tube Formula (2003)
- [Figure: rejection region R_t and Tube(R_t, r) in the (Z1, Z2) plane, Z1, Z2 ~ N(0,1), between thresholds t - r and t]
Slide 51:
- From well-known EC densities of the F field
- From simulations at a single voxel
- Same linear combination!
Slide 52: Proof, n = 3
Slide 53: Power? S = 1000 cc brain, FWHM = 10 mm, P = 0.05
- Tests compared: T-test on beta1, Beta-bar test, F-test on (beta1, beta2); cone weights w1 = 1/2, w2 = theta / 2 pi
- Event design: cone angle theta = 78.4 degrees; block design (20 seconds): cone angle theta = 38.1 degrees
- [Figure: power of each test against the shift d of the HRF (0-3 seconds), for the event and block designs; insets show the responses x1(t) and x2(t) over time t = 0-40 seconds]
Slide 54: Bubbles task in the fMRI scanner
- Correlate bubbles with BOLD at every voxel
- Calculate Z for each pair (bubble pixel, fMRI voxel): a 5D image of Z statistics
- [Figure: bubble images for trials 1, 2, 3, 4, 5, 6, 7, ..., 3000, and the fMRI time series]
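Computing every (bubble pixel, voxel) correlation at once reduces to a single matrix product on standardized data. A sketch with small random stand-ins (the real data are 91,200 face pixels by many voxels over 3,000 trials, and the slide's Z statistic may use a different normalization than the Fisher transform shown here):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3000                                  # trials
bub = rng.random((n, 50))                 # stand-in: bubble value per (trial, face pixel)
bold = rng.standard_normal((n, 40))       # stand-in: BOLD response per (trial, voxel)

def standardize(A):
    """Zero-mean, unit-sample-variance columns."""
    return (A - A.mean(0)) / A.std(0, ddof=1)

# One matrix product gives every (face pixel, voxel) correlation at once.
R = standardize(bub).T @ standardize(bold) / (n - 1)   # shape (50, 40)

# Fisher-transform normal approximation: under the null, Z ~ N(0,1).
Z = np.sqrt(n - 3) * np.arctanh(R)
```
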
Slide 55: Thresholding? The cross-correlation random field
- Correlation between two fields at two different locations, searched over all pairs of locations, one in S, one in T
- Bubbles data: P = 0.05, n = 3000, c = 0.113, T = 6.22
- Cao and Worsley, Annals of Applied Probability (1999)
Slide 56: NNLS for bubbles?
- At the moment, we are correlating Y(t) (fMRI data at each voxel) with each of the 240 x 380 = 91,200 face pixels as regressors x1(t), ..., x91200(t) separately: Y(t) = xj(t) betaj + z(t) gamma + e(t)
- We should be doing this simultaneously: Y(t) = x1(t) beta1 + ... + x91200(t) beta91200 + z(t) gamma + e(t)
- Obviously impossible: observations (3000) << regressors (91,200)
- Maybe we can use NNLS: beta1 >= 0, ..., beta91200 >= 0
- It should enforce sparsity over beta (activation at face pixels), provided observations (3000) >> dimensions of the cone = resels of the face (146.2)
- We can threshold Beta-bar over brain voxels to P < 0.05 using the above
- The result will be a face image of isolated local maxima for each voxel
- It will tell you which brain voxels are activated, but not which face pixels
- Might be a huge computational task!
- Interactions? Y = x1 + ... + x91200 + x1 x2 + ... + z gamma
Slide 57: MS lesions and cortical thickness
- Idea: MS lesions interrupt neuronal signals, causing thinning in downstream cortex
- Data: n = 425 mild MS patients
- [Figure: scatter of average cortical thickness (mm, 1.5-5.5) against total lesion volume (cc, 0-80); correlation = -0.568, T = -14.20 (423 df)]
- Charil et al., NeuroImage (2007)
Slide 58: MS lesions and cortical thickness at all pairs of points
- Dominated by total lesions and average cortical thickness, so remove these effects as follows:
  - CT = cortical thickness, smoothed 20 mm
  - ACT = average cortical thickness
  - LD = lesion density, smoothed 10 mm
  - TLV = total lesion volume
- Find the partial correlation(LD, CT - ACT), removing TLV, via the linear model: CT - ACT = 1 + TLV + LD; test for LD
- Repeat for all voxels in 3D, nodes in 2D: 1 billion correlations, so thresholding is essential!
- Look for high negative correlations
- Threshold: P = 0.05, c = 0.300, T = 6.48
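The partial correlation removing a covariate, as in the linear-model formulation above, can be sketched generically (x, y, z here are placeholders for LD, CT - ACT, and TLV at one pair of points; random stand-in data):

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z:
    regress each on [1, z] and correlate the residuals."""
    Zd = np.column_stack([np.ones_like(z), z])
    rx = x - Zd @ np.linalg.lstsq(Zd, x, rcond=None)[0]
    ry = y - Zd @ np.linalg.lstsq(Zd, y, rcond=None)[0]
    return rx @ ry / np.sqrt((rx @ rx) * (ry @ ry))
```

This residual-based computation agrees exactly with the textbook formula (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).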
Slide 59: Cluster extent rather than peak height (Friston, 1994)
- Choose a lower threshold, e.g. t = 3.11 (P = 0.001)
- Find clusters, i.e. connected components of the excursion set
- Measure cluster extent by resels
- Distribution: fit a quadratic to the peak
- Distribution of maximum cluster extent: Bonferroni on the number of clusters N ~ E(EC)
- [Figure: field Z against s, showing one cluster above threshold t, its extent, and its peak height]
- Cao and Worsley, Advances in Applied Probability (1999)
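The cluster-finding step above can be sketched with SciPy's connected-component labelling. Extents here are in pixels; converting to resels, as the slide suggests, would additionally require the FWHM. The field below is random stand-in data with one planted cluster:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(6)
Z = rng.standard_normal((64, 64))
Z[10:14, 10:18] += 10.0                   # plant one strong 4 x 8 cluster

t = 3.11                                  # lower threshold (P = 0.001)
excursion = Z > t
labels, n_clusters = ndimage.label(excursion)     # connected components
extents = ndimage.sum(excursion, labels, index=np.arange(1, n_clusters + 1))
max_extent = extents.max()                # maximum cluster extent, in pixels
```
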